text-generation-inference
Text Generation Toolkit
A toolkit for deploying and serving Large Language Models (LLMs) for high-performance text generation
Large Language Model Text Generation Inference
9k stars
104 watching
1k forks
Language: Python
Last commit: about 1 month ago
Linked from 1 awesome list
bloom, deep-learning, falcon, gpt, inference, nlp, pytorch, starcoder, transformer
Related projects:
| Repository | Description | Stars |
|---|---|---|
| huggingface/text-embeddings-inference | A toolkit for deploying and serving text embeddings models with high-performance inference capabilities. | 2,932 |
| ggerganov/llama.cpp | Enables LLM inference with minimal setup and high performance on various hardware platforms. | 69,185 |
| fminference/flexllmgen | Generates large language model outputs in high-throughput mode on single GPUs. | 9,236 |
| huggingface/tokenizers | A toolkit providing optimized tokenizers for natural language processing tasks in various programming languages. | 9,156 |
| google/big-bench | A benchmark designed to probe large language models and extrapolate their future capabilities through a diverse set of tasks. | 2,899 |
| huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 136,357 |
| brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,487 |
| meta-llama/codellama | Provides inference code and tools for fine-tuning large language models, specifically designed for code generation tasks. | 16,097 |
| modeltc/lightllm | A Python-based framework for serving large language models with low latency and high scalability. | 2,691 |
| oobabooga/text-generation-webui | A web-based interface for generating text using large language models. | 41,123 |
| qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 10,959 |
| confident-ai/deepeval | A framework for evaluating large language models. | 4,003 |
| microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms. | 3,968 |
| opennmt/ctranslate2 | A high-performance inference engine for transformer models. | 3,467 |
| ericlbuehler/mistral.rs | A high-performance LLM inference framework written in Rust. | 4,677 |