LLaVA-NeXT

Multimodal model developer

Develops large multimodal models for various computer vision tasks including image and video analysis

3k stars

37 watching

266 forks

Language: Python

last commit: almost 2 years ago

Related projects:

Repository	Description	Stars
haotian-liu/llava	A system that uses large language and vision models to generate and process visual instructions	20,683
pku-yuangroup/video-llava	A large vision-language model that understands both images and videos through a unified visual representation	3,071
opengvlab/llama-adapter	An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy	5,775
damo-nlp-sg/video-llama	An audio-visual language model designed to understand and respond to video content with improved instruction-following capabilities	2,842
dvlab-research/mgm	An open-source framework for training large language models with vision capabilities	3,229
alpha-vllm/llama2-accessory	An open-source toolkit for pretraining and fine-tuning large language models	2,732
llava-vl/llava-interactive-demo	An all-in-one demo for interactive image processing and generation	353
llava-vl/llava-plus-codebase	A platform for training and deploying large language and vision models that can use tools to perform tasks	717
scisharp/llamasharp	An efficient C#/.NET library for running Large Language Models (LLMs) on local devices	2,750
wisconsinaivision/vip-llava	A system designed to enable large multimodal models to understand arbitrary visual prompts	302
eleutherai/lm-evaluation-harness	A unified framework for testing generative language models on various evaluation tasks	7,200
hiyouga/llama-factory	A tool for efficiently fine-tuning large language models across multiple architectures and methods	36,219
optimalscale/lmflow	A toolkit for fine-tuning large machine learning models and running inference with them	8,312
qwenlm/qwen2-vl	A multimodal large language model series developed by the Qwen team to understand and process images, videos, and text	3,613
nvidia/megatron-lm	A framework for training large language models using scalable and optimized GPU techniques	10,804