IT IDOL Technologies
Jan 05, 2026
Mathco
Completed

$25,000+
4-6 months
India
2-5
Service categories
Service Lines
Artificial Intelligence
Domain focus
Healthcare
Other
Programming language
Python
Frameworks
React.js

Challenge

AbbVie required a scalable and reliable platform to evaluate and validate multiple GenAI agents such as Text2API, Text2Doc, and Text2SQL.

The key challenges were handling large Excel files containing 500–600 questions, running lengthy AI evaluations (30–45 seconds per question), comparing source and target responses against ground-truth data, and generating accurate evaluation reports.
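
To put those figures in perspective, the short calculation below estimates how long a single file would take if every question were evaluated one after another. It uses only the numbers quoted above; the function name is illustrative and not taken from the project.

```python
def sequential_runtime_hours(num_questions: int, seconds_per_question: float) -> float:
    """Total wall-clock time, in hours, when questions run strictly one at a time."""
    return num_questions * seconds_per_question / 3600


# Bounds taken from the figures above: 500-600 questions, 30-45 s each.
best = sequential_runtime_hours(500, 30)   # 15,000 s
worst = sequential_runtime_hours(600, 45)  # 27,000 s

print(f"{best:.1f} to {worst:.1f} hours")  # prints "4.2 to 7.5 hours"
```

A run of several hours per file is why the evaluations could not run inside a normal request/response cycle.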

The system also needed to support team-based collaboration, high performance, and background processing without impacting user experience.

Solution

IT IDOL Technologies designed and developed a complete AI agent evaluation platform from the ground up. We built a scalable system architecture with Python-based backend APIs to orchestrate evaluations, manage environment configurations, and process large Excel uploads.


GenAI validators were integrated into backend workflows, and Celery-based background processing was used to manage long-running tasks efficiently. A React-based frontend enabled configuration, result previews, team collaboration, and report downloads.
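
The sketch below illustrates the general submit-and-poll pattern behind this kind of background processing. It is not the project's code: the platform used Celery, whereas this example swaps in Python's standard-library thread pool so that it runs without a message broker. The names `evaluate_question`, `run_evaluation` and `JobQueue` are hypothetical, and the agent call is simulated with a short sleep.

```python
import time
import uuid
from concurrent.futures import Future, ThreadPoolExecutor


def evaluate_question(question: str) -> dict:
    """Hypothetical stand-in for one GenAI agent call (30-45 s in the real system)."""
    time.sleep(0.01)  # simulated latency
    return {"question": question, "response": question.upper()}


def run_evaluation(questions: list[str], max_workers: int = 8) -> list[dict]:
    """Evaluate questions concurrently; threads fit because each call mostly waits on I/O."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(evaluate_question, questions))  # map keeps input order


class JobQueue:
    """In-process stand-in for a task queue: submit() returns at once with a job id."""

    def __init__(self) -> None:
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._jobs: dict[str, Future] = {}

    def submit(self, questions: list[str]) -> str:
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = self._pool.submit(run_evaluation, questions)
        return job_id

    def status(self, job_id: str) -> str:
        return "done" if self._jobs[job_id].done() else "running"

    def result(self, job_id: str) -> list[dict]:
        return self._jobs[job_id].result()  # blocks until the job has finished


queue = JobQueue()
job_id = queue.submit([f"question {i}" for i in range(20)])
results = queue.result(job_id)
```

In a web application the upload handler would return the job id straight away and the frontend would poll the status, so the user interface stays responsive while the evaluation runs.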

Results

  • Scalable platform capable of processing large datasets efficiently
  • Reliable evaluation of multiple GenAI agents using configurable pass/fail criteria
  • Improved performance through background task execution and multithreading
  • Enhanced team collaboration with group-based access and file sharing
  • Automated, downloadable evaluation reports for faster decision-making
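
As a rough illustration of configurable pass/fail criteria and a downloadable report, the sketch below scores source and target responses against ground truth and writes the outcome as CSV. The case study does not say how responses were compared, so several things here are assumptions: the `difflib` string-similarity metric, the `threshold` parameter, basing the verdict on the target score, and the column names.

```python
import csv
import io
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Assumed metric: case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def evaluate_rows(rows: list[dict], threshold: float = 0.8) -> list[dict]:
    """Score both responses against ground truth; the verdict uses the target score."""
    report = []
    for row in rows:
        source_score = similarity(row["source"], row["ground_truth"])
        target_score = similarity(row["target"], row["ground_truth"])
        report.append({
            "question": row["question"],
            "source_score": round(source_score, 2),
            "target_score": round(target_score, 2),
            "verdict": "pass" if target_score >= threshold else "fail",
        })
    return report


def to_csv(report: list[dict]) -> str:
    """Serialize the report so it can be offered as a file download."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(report[0].keys()))
    writer.writeheader()
    writer.writerows(report)
    return buffer.getvalue()


# Made-up sample rows, for illustration only.
rows = [
    {"question": "List user names", "ground_truth": "SELECT name FROM users",
     "source": "SELECT name FROM users", "target": "select name from users"},
    {"question": "What is the answer?", "ground_truth": "42",
     "source": "42", "target": "unknown"},
]
report = evaluate_rows(rows, threshold=0.8)
csv_text = to_csv(report)
```

Raising or lowering `threshold` changes how strict the pass/fail decision is without touching the scoring code.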