text-generation-inference
Language Model Toolkit
A toolkit for deploying and serving Large Language Models.
Large Language Model Text Generation Inference
9k stars
101 watching
1k forks
Language: Python
last commit: 4 days ago
Linked from 1 awesome list
bloom, deep-learning, falcon, gpt, inference, nlp, pytorch, starcoder, transformer
Related projects:
Repository | Description | Stars |
---|---|---|
huggingface/text-embeddings-inference | A blazing fast inference solution for text embeddings models. | 2,838 |
ggerganov/llama.cpp | Enables efficient inference of large language models using optimized C/C++ implementations and various backend frameworks | 67,866 |
fminference/flexllmgen | Generates large language model outputs in high-throughput mode on single GPUs | 9,192 |
huggingface/tokenizers | A toolkit providing optimized tokenizers for natural language processing tasks in various programming languages. | 9,051 |
google/big-bench | A benchmark designed to evaluate the capabilities of large language models by measuring their performance across a variety of tasks. | 2,868 |
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 135,022 |
brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,440 |
meta-llama/codellama | Provides inference code and tools for fine-tuning large language models, specifically designed for code generation tasks | 16,039 |
modeltc/lightllm | An LLM inference and serving framework providing a lightweight design, scalability, and high-speed performance for large language models. | 2,609 |
oobabooga/text-generation-webui | A web-based interface for generating text using large language models | 40,673 |
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 9,710 |
confident-ai/deepeval | A framework for evaluating large language models | 3,669 |
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms | 3,919 |
opennmt/ctranslate2 | A high-performance library for efficient inference with Transformer models on CPUs and GPUs. | 3,404 |
ericlbuehler/mistral.rs | A fast and flexible LLM inference platform supporting various models and devices | 4,466 |