EvalAI
Benchmarking tool
A platform for comparing and evaluating AI and machine learning algorithms at scale; a sketch of its host-side evaluation-script interface appears after the tags below
Evaluating state of the art in AI
2k stars
54 watching
799 forks
Language: Python
Last commit: about 1 year ago
Linked from 1 awesome list
Tags: ai, ai-challenges, angular7, angularjs, artificial-intelligence, challenge, django, docker, evalai, evaluation, leaderboard, machine-learning, python, reproducibility, reproducible-research
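As a rough illustration of how challenges on the platform are scored, here is a minimal sketch of a host-side evaluation script. The `evaluate(test_annotation_file, user_submission_file, phase_codename, **kwargs)` entry point and the returned `result`/`submission_result` keys follow EvalAI's starter template; the JSON file layout, the "Accuracy" metric, and the "dev" phase codename are illustrative assumptions, not part of the platform.

```python
import json


def evaluate(test_annotation_file, user_submission_file, phase_codename, **kwargs):
    """Called by the EvalAI worker once per submission.

    test_annotation_file: path to the ground truth uploaded by the host.
    user_submission_file: path to the participant's submission file.
    phase_codename: codename of the challenge phase being evaluated.
    Returns a dict whose "result" key holds per-split metric dicts that
    are rendered on the leaderboard.
    """
    # Assumed format: both files are JSON objects mapping example ids to labels.
    with open(test_annotation_file) as f:
        truth = json.load(f)
    with open(user_submission_file) as f:
        preds = json.load(f)

    # Plain accuracy over the ids present in the ground truth.
    correct = sum(1 for key, label in truth.items() if preds.get(key) == label)
    accuracy = 100.0 * correct / len(truth) if truth else 0.0

    # "dev" is an assumed phase codename; real challenges define their own.
    split = "dev_split" if phase_codename == "dev" else "test_split"
    output = {"result": [{split: {"Accuracy": round(accuracy, 2)}}]}
    # Echoed back to the participant on their submissions page.
    output["submission_result"] = output["result"][0][split]
    return output
```

Keeping the script a pure function of the two input files makes the same evaluation runnable both in EvalAI's remote workers and locally while developing a challenge.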
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | Comparative benchmarks of various machine learning algorithms | 169 |
| | A benchmark for evaluating large language models' ability to process multimodal input | 322 |
| | A tool for benchmarking performance and accuracy of generative AI models on various AWS platforms | 210 |
| | Tools for comparing and benchmarking small code snippets | 514 |
| | An eXplainability toolbox for machine learning that enables data analysis and model evaluation to mitigate biases and improve performance | 1,135 |
| | Evaluating and improving large multimodal models through in-context learning | 21 |
| | An evaluation suite for assessing chart understanding in multimodal large language models | 85 |
| | A benchmarking suite for multimodal in-context learning models | 31 |
| | A tool for benchmarking Elixir code and comparing performance statistics | 1,422 |
| | A comprehensive resource guide to stay updated on AI, ML, DL, and CV advancements | 1,039 |
| | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 84 |
| | A collection of algorithms and data structures for artificial intelligence and machine learning in Swift | 335 |
| | Evaluates language models using standardized benchmarks and prompting techniques | 2,059 |
| | Automated machine learning protocols for cheminformatics using Python | 39 |
| | A benchmark suite for evaluating the performance of video generative models | 643 |