FlexLLMGen

Batch processor

Generates large language model outputs in high-throughput mode on a single GPU.

Running large language models on a single GPU for throughput-oriented scenarios.

GitHub

9k stars
111 watching
548 forks
Language: Python
Last commit: 24 days ago
deep-learning, gpt-3, high-throughput, large-language-models, machine-learning, offloading, opt

Related projects:

Repository | Description | Stars
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 7,964
sgl-project/sglang | A framework for serving large language models and vision models with efficient runtime and flexible interface. | 6,082
huggingface/text-generation-inference | A toolkit for deploying and serving Large Language Models. | 9,106
brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,440
lyogavin/airllm | A Python library that optimizes inference memory usage for large language models on limited GPU resources. | 5,259
modeltc/lightllm | An LLM inference and serving framework providing a lightweight design, scalability, and high-speed performance for large language models. | 2,609
google/big-bench | A benchmark designed to evaluate the capabilities of large language models by simulating various tasks and measuring their performance | 2,868
optimalscale/lmflow | A toolkit for finetuning large language models and providing efficient inference capabilities | 8,273
thudm/glm-130b | An open-source implementation of a large bilingual language model pre-trained on vast amounts of text data. | 7,659
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms | 3,919
aksnzhy/xlearn | A high-performance machine learning package with linear models and factorization machines. | 3,087
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 9,710
dair-ai/ml-papers-explained | An explanation of key concepts and advancements in the field of Machine Learning | 7,315
mlabonne/llm-course | A comprehensive course and resource package on building and deploying Large Language Models (LLMs) | 39,120
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,409