text-generation-inference

Language Model Toolkit

A toolkit for deploying and serving Large Language Models.

Large Language Model Text Generation Inference

9k stars
101 watching
1k forks
Language: Python
last commit: 4 days ago
Linked from 1 awesome list

Topics: bloom, deep-learning, falcon, gpt, inference, nlp, pytorch, starcoder, transformer

Related projects:

Repository | Description | Stars
huggingface/text-embeddings-inference | A blazing fast inference solution for text embeddings models. | 2,838
ggerganov/llama.cpp | Enables efficient inference of large language models using optimized C/C++ implementations and various backend frameworks. | 67,866
fminference/flexllmgen | Generates large language model outputs in high-throughput mode on single GPUs. | 9,192
huggingface/tokenizers | A toolkit providing optimized tokenizers for natural language processing tasks in various programming languages. | 9,051
google/big-bench | A benchmark designed to evaluate the capabilities of large language models by simulating various tasks and measuring their performance. | 2,868
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 135,022
brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,440
meta-llama/codellama | Provides inference code and tools for fine-tuning large language models, specifically designed for code generation tasks. | 16,039
modeltc/lightllm | An LLM inference and serving framework providing a lightweight design, scalability, and high-speed performance for large language models. | 2,609
oobabooga/text-generation-webui | A web-based interface for generating text using large language models. | 40,673
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 9,710
confident-ai/deepeval | A framework for evaluating large language models. | 3,669
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms. | 3,919
opennmt/ctranslate2 | A high-performance library for efficient inference with Transformer models on CPUs and GPUs. | 3,404
ericlbuehler/mistral.rs | A fast and flexible LLM inference platform supporting various models and devices. | 4,466