snowflake-arctic
LLM inference stack
A project providing optimized stacks for fine-tuning and inference of large language models, focusing on low-latency and high-throughput performance.
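A minimal sketch of what inference with the Arctic model could look like through the Hugging Face `transformers` API. The model ID `Snowflake/snowflake-arctic-instruct`, the use of `trust_remote_code=True`, and the prompt are assumptions for illustration, not details taken from this listing.

```python
# Hedged sketch: assumes the Arctic instruct checkpoint is published on the
# Hugging Face Hub as "Snowflake/snowflake-arctic-instruct" and ships custom
# modeling code (hence trust_remote_code=True).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Snowflake/snowflake-arctic-instruct"  # assumed model ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",       # spread the large model across available GPUs
    torch_dtype="auto",      # keep the checkpoint's native precision
    trust_remote_code=True,  # load the repo's custom architecture code
)

prompt = "Explain the difference between latency and throughput in LLM serving."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For low-latency, high-throughput serving, the project is typically paired with an optimized inference engine rather than plain `generate`; the snippet above only shows the basic loading and generation flow.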
525 stars
6 watching
47 forks
Language: Python
Last commit: 5 months ago

Related projects:
| Repository | Description | Stars |
|---|---|---|
| pratyushmaini/llm_dataset_inference | Detects whether a given text sequence was part of the data used to train a large language model. | 23 |
| deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models. | 1,024 |
| ediglacuq/fenics_ice | A framework for Bayesian quantification of uncertainty in large-scale ice sheet models. | 5 |
| luogen1996/lavin | An open-source implementation of a vision-language instruction-tuned large language model. | 513 |
| mlcommons/inference | Measures the performance of deep learning models in various deployment scenarios. | 1,256 |
| gmftbygmftby/science-llm | A large-scale language model for the scientific domain, trained on the RedPajama arXiv split. | 125 |
| nixtla/mlforecast | A framework for time series forecasting with machine learning models on large datasets. | 924 |
| dreadnode/rigging | A framework for leveraging language models in production code. | 216 |
| damo-nlp-sg/m3exam | A benchmark for evaluating large language models across multiple languages and formats. | 93 |
| talwalkarlab/leaf | A benchmarking framework for federated machine learning tasks across various domains and datasets. | 856 |
| kvcache-ai/ktransformers | A flexible framework for LLM inference optimizations, supporting multiple models and architectures. | 771 |
| davidmigloz/langchain_dart | Tools and components that simplify integrating large language models into Dart/Flutter applications. | 441 |
| snunez1/llama.cl | A Common Lisp port of a large language model (LLM) implementation. | 36 |
| microsoft/msrflute | A platform for high-performance federated learning simulations in Python. | 185 |
| clue-ai/promptclue | A pre-trained language model for multiple natural language processing tasks, with support for few-shot and transfer learning. | 656 |