snowflake-arctic

LLM inference stack

A project providing optimized stacks for fine-tuning and inference of large language models, focusing on low-latency and high-throughput performance.

GitHub

519 stars
6 watching
46 forks
Language: Python
Last commit: 3 months ago

Related projects:

Repository | Description | Stars
pratyushmaini/llm_dataset_inference | Detects whether a given text sequence is part of the training data used to train a large language model. | 23
deepseek-ai/deepseek-moe | A large language model with improved efficiency and performance compared to similar models. | 1,006
ediglacuq/fenics_ice | A framework for Bayesian quantification of uncertainty in large-scale ice sheet models. | 5
luogen1996/lavin | An open-source implementation of a vision-language instructed large language model. | 508
mlcommons/inference | Measures the performance of deep learning models in various deployment scenarios. | 1,236
gmftbygmftby/science-llm | A large-scale language model for the scientific domain, trained on the RedPajama arXiv split. | 122
nixtla/mlforecast | A Python library for scalable machine learning-based time series forecasting with efficient feature engineering and out-of-the-box compatibility. | 899
dreadnode/rigging | An LLM framework that simplifies interacting with language models in production code. | 209
damo-nlp-sg/m3exam | A benchmark for evaluating large language models in multiple languages and formats. | 92
talwalkarlab/leaf | A benchmarking framework for federated machine learning tasks across various domains and datasets. | 851
kvcache-ai/ktransformers | A flexible framework for LLM inference optimizations with support for multiple models and architectures. | 736
davidmigloz/langchain_dart | A set of tools and components that simplify integrating large language models into Dart/Flutter applications. | 425
snunez1/llama.cl | A Common Lisp port of a large language model (LLM) implementation. | 35
microsoft/msrflute | A platform for conducting high-performance federated learning simulations in Python. | 185
clue-ai/promptclue | A pre-trained language model for multiple natural language processing tasks with support for few-shot learning and transfer learning. | 654