infinity

Infinity is a high-throughput, low-latency REST API for serving text embeddings, reranking models, and CLIP.

GitHub

1k stars
18 watching
94 forks
Language: Python
Last commit: 6 days ago
Linked from 1 awesome list

bert-embeddings, llm, text-embeddings

Backlinks from these awesome lists: