mteb
Text Embedding Benchmark
Provides tools and benchmarks for evaluating text embedding models
MTEB: Massive Text Embedding Benchmark
2k stars
15 watching
285 forks
Language: Jupyter Notebook
last commit: 2 months ago
Linked from 1 awesome list
benchmark, bitext-mining, clustering, information-retrieval, multilingual-nlp, neural-search, reranking, retrieval, sbert, semantic-search, sentence-transformers, sgpt, sts, text-classification, text-embedding
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | An evaluation framework for Polish word embeddings prepared by various research groups using analogy tasks. | 4 |
| | An implementation of the Predictive Text Embedding model for learning word representations from large-scale heterogeneous text networks. | 96 |
| | Provides methods for evaluating word embeddings on various benchmarks | 437 |
| | A utility class for generating and evaluating document representations using word embeddings. | 54 |
| | This project generates Spanish word embeddings using fastText on large corpora. | 9 |
| | A collection of pre-trained subword embeddings in 275 languages, useful for natural language processing tasks. | 1,189 |
| | Provides fast and efficient word embeddings for natural language processing. | 223 |
| | An open-source benchmarking framework for evaluating cross-style visual capability of large multimodal models | 84 |
| | An Ember application benchmarking tool to measure the effects of small changes on web applications. | 25 |
| | A benchmark for evaluating large language models in multiple languages and formats | 93 |
| | Pre-trained word and sentence embeddings for biomedical text analysis | 578 |
| | A project implementing a method to incorporate morphological information into word embeddings using a neural network model | 52 |
| | An implementation of a word embedding model that uses character n-grams and achieves state-of-the-art results in multiple NLP tasks | 803 |
| | A tool for benchmarking Elixir code and comparing performance statistics | 1,422 |
| | A method to generate sentence embeddings from pre-trained language models | 178 |