EvalAI

Benchmarking tool

A platform for comparing and evaluating AI and machine learning algorithms at scale

Evaluating state of the art in AI

GitHub

2k stars
54 watching
786 forks
Language: Python
last commit: 2 months ago
Linked from 1 awesome list

Topics: ai, ai-challenges, angular7, angularjs, artificial-intelligence, challenge, django, docker, evalai, evaluation, leaderboard, machine-learning, python, reproducibility, reproducible-research

Related projects:

| Repository | Description | Stars |
|---|---|---|
| catboost/benchmarks | Comparative benchmarks of various machine learning algorithms | 169 |
| ailab-cvc/seed-bench | A benchmark for evaluating large language models' ability to process multimodal input | 315 |
| aws-samples/foundation-model-benchmarking-tool | A tool for benchmarking and evaluating generative AI models on various AWS platforms | 196 |
| alco/benchfella | Tools for comparing and benchmarking small code snippets | 516 |
| ethicalml/xai | An eXplainability toolbox for machine learning that enables data analysis and model evaluation to mitigate biases and improve performance | 1,125 |
| mshukor/evalign-icl | Evaluating and improving large multimodal models through in-context learning | 20 |
| princeton-nlp/charxiv | An evaluation suite for assessing chart understanding in multimodal large language models | 75 |
| ys-zong/vl-icl | A benchmarking suite for multimodal in-context learning models | 28 |
| bencheeorg/benchee | A tool for benchmarking Elixir code and comparing performance statistics | 1,417 |
| bailool/doyouevenlearn | A comprehensive resource guide to stay updated on AI, ML, DL, and CV advancements | 1,038 |
| aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 83 |
| vlall/swift-brain | A collection of algorithms and data structures for artificial intelligence and machine learning in Swift | 335 |
| openai/simple-evals | A library for evaluating language models using standardized prompts and benchmarking tests | 1,939 |
| jvalegre/robert | Automated machine learning protocols for cheminformatics using Python | 38 |
| vchitect/vbench | A tool for evaluating and benchmarking video generative models in computer vision and artificial intelligence | 576 |