mteb

Text embedding model evaluator

A benchmarking suite for evaluating text embedding models across various NLP tasks and datasets.

MTEB: Massive Text Embedding Benchmark

GitHub

2k stars
15 watching
272 forks
Language: Jupyter Notebook
last commit: 6 days ago
Linked from 1 awesome list

benchmark, bitext-mining, clustering, information-retrieval, multilingual-nlp, neural-search, reranking, retrieval, sbert, semantic-search, sentence-transformers, sgpt, sts, text-classification, text-embedding

Related projects:

| Repository | Description | Stars |
|---|---|---|
| ermlab/polish-word-embeddings-review | An evaluation framework for Polish word embeddings prepared by various research groups using analogy tasks. | 4 |
| mnqu/pte | An implementation of the Predictive Text Embedding model for learning word representations from large-scale heterogeneous text networks. | 96 |
| kudkudak/word-embeddings-benchmarks | Provides methods for evaluating word embeddings on various benchmarks | 437 |
| nlprinceton/text_embedding | A utility class for generating and evaluating document representations using word embeddings. | 54 |
| botcenter/spanishwordembeddings | This project generates Spanish word embeddings using fastText on large corpora. | 9 |
| bheinzerling/bpemb | A collection of pre-trained subword embeddings in 275 languages, useful for natural language processing tasks. | 1,184 |
| vzhong/embeddings | Provides fast and efficient word embeddings for natural language processing. | 223 |
| aifeg/benchlmm | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 82 |
| krisselden/ember-macro-benchmark | An Ember application benchmarking tool to measure the effects of small changes on web applications. | 25 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats | 92 |
| ncbi-nlp/biosentvec | Pre-trained word and sentence embeddings for biomedical text analysis | 578 |
| rguthrie3/morphologicalpriorsforwordembeddings | A project implementing a method to incorporate morphological information into word embeddings using a neural network model | 52 |
| alexandres/lexvec | An implementation of a word embedding model that uses character n-grams and achieves state-of-the-art results in multiple NLP tasks | 803 |
| bencheeorg/benchee | A tool for benchmarking Elixir code and comparing performance statistics | 1,417 |
| binwang28/sbert-wk-sentence-embedding | A method to generate sentence embeddings from pre-trained language models | 177 |