text-generation-inference

Text Generation Toolkit

A toolkit for deploying and serving Large Language Models (LLMs) for high-performance text generation

Large Language Model Text Generation Inference

9k stars
104 watching
1k forks
Language: Python
last commit: about 1 month ago
Linked from 1 awesome list

bloom, deep-learning, falcon, gpt, inference, nlp, pytorch, starcoder, transformer

Related projects:

Repository | Description | Stars
huggingface/text-embeddings-inference | A toolkit for deploying and serving text embeddings models with high-performance inference capabilities. | 2,932
ggerganov/llama.cpp | Enables LLM inference with minimal setup and high performance on various hardware platforms | 69,185
fminference/flexllmgen | Generates large language model outputs in high-throughput mode on single GPUs | 9,236
huggingface/tokenizers | A toolkit providing optimized tokenizers for natural language processing tasks in various programming languages. | 9,156
google/big-bench | A benchmark designed to probe large language models and extrapolate their future capabilities through a diverse set of tasks. | 2,899
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 136,357
brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,487
meta-llama/codellama | Provides inference code and tools for fine-tuning large language models, specifically designed for code generation tasks | 16,097
modeltc/lightllm | A Python-based framework for serving large language models with low latency and high scalability. | 2,691
oobabooga/text-generation-webui | A web-based interface for generating text using large language models | 41,123
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 10,959
confident-ai/deepeval | A framework for evaluating large language models | 4,003
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms | 3,968
opennmt/ctranslate2 | A high-performance inference engine for transformer models | 3,467
ericlbuehler/mistral.rs | A high-performance LLM inference framework written in Rust | 4,677