skypilot

AI/batch workload manager

A framework for running AI and batch workloads on any infrastructure, offering unified execution, cost savings, and high GPU availability.

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.

GitHub

7k stars

70 watching

529 forks

Language: Python

Last commit: over 1 year ago

Linked from 2 awesome lists

cloud-computing, cloud-management, cost-management, cost-optimization, data-science, deep-learning, distributed-training, finops, gpu, hyperparameter-tuning, job-queue, job-scheduler, llm-serving, llm-training, machine-learning, ml-infrastructure, ml-platform, multicloud, spot-instances, tpu

docs.skypilot.co/

Related projects:

Repository	Description	Stars
lightning-ai/lit-llama	An implementation of a large language model using the nanoGPT architecture	6,013
opengvlab/llama-adapter	An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy	5,775
hiyouga/llama-factory	A tool for efficiently fine-tuning large language models across multiple architectures and methods	36,219
alpha-vllm/llama2-accessory	An open-source toolkit for pretraining and fine-tuning large language models	2,732
lyogavin/airllm	Optimizes large language model inference on limited GPU resources	5,446
nomic-ai/gpt4all	An open-source Python client for running Large Language Models (LLMs) locally on any device	71,176
meta-llama/llama-recipes	Provides tools and examples for fine-tuning the Meta Llama model and building applications with it	15,578
haotian-liu/llava	A system that uses large language and vision models to generate and process visual instructions	20,683
scisharp/llamasharp	An efficient C#/.NET library for running Large Language Models (LLMs) on local devices	2,750
sgl-project/sglang	A fast serving framework for large language models and vision language models	6,551
llava-vl/llava-next	Develops large multimodal models for various computer vision tasks including image and video analysis	3,099
sjtu-ipads/powerinfer	An efficient Large Language Model inference engine leveraging consumer-grade GPUs on PCs	8,011
optimalscale/lmflow	A toolkit for fine-tuning and running inference on large machine learning models	8,312
ploomber/ploomber	A platform for building and deploying data pipelines using Python, with features for caching, automation, and modularization	3,530
meta-llama/llama-stack	Provides pre-packaged building blocks for generative AI applications with standardized APIs and service-oriented design	5,164