FlexLLMGen
Batch processor
Generates large language model outputs in high-throughput mode on single GPUs
Running large language models on a single GPU for throughput-oriented scenarios.
Archived
9k stars
112 watching
553 forks
Language: Python
last commit: about 2 months ago
Topics: deep-learning, gpt-3, high-throughput, large-language-models, machine-learning, offloading, opt
Related projects:
Repository | Description | Stars |
---|---|---|
sjtu-ipads/powerinfer | An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs | 8,011 |
sgl-project/sglang | A fast serving framework for large language models and vision language models. | 6,551 |
huggingface/text-generation-inference | A toolkit for deploying and serving Large Language Models (LLMs) for high-performance text generation | 9,456 |
brexhq/prompt-engineering | Guides software developers on how to effectively use and build systems around Large Language Models like GPT-4. | 8,487 |
lyogavin/airllm | Optimizes large language model inference on limited GPU resources | 5,446 |
modeltc/lightllm | A Python-based framework for serving large language models with low latency and high scalability. | 2,691 |
google/big-bench | A benchmark designed to probe large language models and extrapolate their future capabilities through a diverse set of tasks. | 2,899 |
optimalscale/lmflow | A toolkit for fine-tuning and running inference on large machine learning models | 8,312 |
thudm/glm-130b | An open-source implementation of a large bilingual language model pre-trained on vast amounts of text data. | 7,672 |
microsoft/flaml | Automates machine learning workflows and optimizes model performance using large language models and efficient algorithms | 3,968 |
aksnzhy/xlearn | A high-performance machine learning package with linear models and factorization machines. | 3,087 |
qwenlm/qwen2.5 | A large language model series with various sizes and variants for text generation and understanding. | 10,959 |
dair-ai/ml-papers-explained | An explanation of key concepts and advancements in the field of Machine Learning | 7,352 |
mlabonne/llm-course | A comprehensive course and resource package on building and deploying Large Language Models (LLMs) | 40,053 |
young-geng/easylm | A framework for training and serving large language models using JAX/Flax | 2,428 |