BCEmbedding

Multilingual retriever

Provides bilingual and crosslingual retrieval models for semantic search and question answering across multiple languages.

NetEase Youdao's open-source embedding and reranker models for RAG products.

GitHub

1k stars
8 watching
99 forks
Language: Python
Last commit: 3 months ago

Related projects:

Repository Description Stars
xverse-ai/xverse-moe-a36b Develops and publishes large multilingual language models with a mixture-of-experts architecture. 36
tonianelope/multilingual-bert Investigating multilingual language models for Named Entity Recognition in German and English 14
untra/polyglot A plugin for Jekyll blogs that enables support for multiple languages and internationalization. 420
elblogbruno/notionai-mymind Enables users to collect and organize web content in a customizable Notion database using AI-powered tagging and search capabilities 260
microsoft/unicoder This repository provides pre-trained models and code for understanding and generation tasks in multiple languages. 88
tiger-ai-lab/uniir Trains and evaluates a universal multimodal retrieval model to perform various information retrieval tasks. 110
eleutherai/polyglot Large language models designed to perform well in multiple languages and address performance issues with current multilingual models. 475
microsoft/kernel-memory An AI service for efficient indexing and retrieval of data using natural language queries and semantic search 1,602
alexa/massive A collection of tools and modeling code for a large multilingual Natural Language Understanding dataset 538
emlid/ntripbrowser Tool to retrieve and display information about NTRIP caster sources 32
workday/upshot-montague Translates natural language into formal representations using Combinatory Categorial Grammar (CCG), enabling semantic parsing. 59
pleisto/yuren-baichuan-7b A multi-modal large language model that integrates natural language and visual capabilities with fine-tuning for various tasks 72
facebookresearch/spiritlm This repository provides an end-to-end language model capable of generating coherent text based on both spoken and written inputs. 777
nemotyrant/manong A curated collection of resources categorized by programming language and technology. 3,877
neulab/pangea An open-source multilingual large language model designed to understand and generate content across diverse languages and cultural contexts 91