FlexLLMGen
Batch processor
Generates large language model outputs in high-throughput mode on a single GPU.
Running large language models on a single GPU for throughput-oriented scenarios.
9k stars
111 watching
548 forks
Language: Python
Last commit: 24 days ago
Tags: deep-learning, gpt-3, high-throughput, large-language-models, machine-learning, offloading, opt
Related projects:
Repository | Description | Stars |
---|---|---|
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 7,964 |
sgl-project/sglang | A framework for serving large language models and vision models with efficient runtime and flexible interface. | 6,082 |
huggingface/text-generation-inference | A toolkit for deploying and serving Large Language Models. | 9,106 |
brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,440 |
lyogavin/airllm | A Python library that optimizes inference memory usage for large language models on limited GPU resources. | 5,259 |
modeltc/lightllm | An LLM inference and serving framework providing a lightweight design, scalability, and high-speed performance for large language models. | 2,609 |
google/big-bench | A benchmark that evaluates the capabilities of large language models by measuring their performance on a variety of tasks. | 2,868 |
optimalscale/lmflow | A toolkit for finetuning large language models and providing efficient inference capabilities | 8,273 |
thudm/glm-130b | An open-source implementation of a large bilingual language model pre-trained on vast amounts of text data. | 7,659 |
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms | 3,919 |
aksnzhy/xlearn | A high-performance machine learning package with linear models and factorization machines. | 3,087 |
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 9,710 |
dair-ai/ml-papers-explained | Explanations of key concepts and advances in machine learning. | 7,315 |
mlabonne/llm-course | A comprehensive course and resource package on building and deploying Large Language Models (LLMs) | 39,120 |
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,409 |