Otter
Multi-modal AI model
A multi-modal AI model developed for improved instruction-following and in-context learning, utilizing large-scale architectures and various training datasets.
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
4k stars
100 watching
242 forks
Language: Python
last commit: 9 months ago
Topics: artificial-intelligence, chatgpt, deep-learning, embodied-ai, foundation-models, gpt-4, instruction-tuning, large-scale-models, machine-learning, multi-modality, visual-language-learning
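Otter's in-context learning follows the Flamingo-style interleaved prompt format, in which `<image>` placeholder tokens alternate with text demonstrations before the final query. The `<image>` and `<|endofchunk|>` special tokens below match the format documented by OpenFlamingo; the `build_prompt` helper itself is an illustrative sketch, not part of the Otter or OpenFlamingo API:

```python
# Sketch of a Flamingo-style interleaved few-shot prompt.
# "<image>" and "<|endofchunk|>" are the special tokens OpenFlamingo documents;
# build_prompt is a hypothetical helper, not a function from Otter/OpenFlamingo.

def build_prompt(demonstrations, query):
    """Interleave caption demonstrations with <image> slots, then append
    a final <image> slot for the query the model should complete."""
    parts = [f"<image>{caption}<|endofchunk|>" for caption in demonstrations]
    parts.append(f"<image>{query}")
    return "".join(parts)

prompt = build_prompt(
    ["An image of two cats.", "An image of a bathroom sink."],
    "An image of",
)
print(prompt)
# <image>An image of two cats.<|endofchunk|><image>An image of a bathroom sink.<|endofchunk|><image>An image of
```

At inference time each `<image>` slot is paired, in order, with one preprocessed image, so the model conditions its continuation of the final query on the in-context image–text demonstrations.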
Related projects:
| Repository | Description | Stars |
|---|---|---|
| haotian-liu/llava | A system that uses large language and vision models to generate and process visual instructions. | 20,232 |
| mlfoundations/open_flamingo | A framework for training large multimodal models to generate text conditioned on images or other text. | 3,742 |
| dvlab-research/mgm | An open-source framework for training large language models with vision capabilities. | 3,211 |
| pku-yuangroup/video-llava | Enables large language models to perform visual reasoning on images and videos simultaneously by learning united visual representations before projection. | 2,990 |
| opengvlab/llama-adapter | An implementation of a method for fine-tuning language models to follow instructions with high efficiency and accuracy. | 5,754 |
| paddlepaddle/fastdeploy | A toolkit for easy and high-performance deployment of deep learning models on various hardware platforms. | 2,998 |
| eleutherai/lm-evaluation-harness | Provides a unified framework to test generative language models on various evaluation tasks. | 6,970 |
| facico/chinese-vicuna | An instruction-following Chinese LLaMA-based model project aimed at training and fine-tuning models on specific hardware configurations for efficient deployment. | 4,142 |
| modeltc/lightllm | An LLM inference and serving framework providing a lightweight design, scalability, and high-speed performance for large language models. | 2,609 |
| alpha-vllm/llama2-accessory | An open-source toolkit for pretraining and fine-tuning large language models. | 2,720 |
| open-mmlab/mmdeploy | A toolset for deploying deep learning models on various devices and platforms. | 2,774 |
| deep-floyd/if | A text-to-image synthesis model with a modular design, utilizing a frozen text encoder and cascaded pixel diffusion modules to generate photorealistic images. | 7,688 |
| ludwig-ai/ludwig | A low-code framework for building custom deep learning models and neural networks. | 11,189 |
| open-mmlab/mmcv | A foundational library for computer vision research and training deep learning models, with high-quality implementations of common CPU and CUDA ops. | 5,906 |
| openbmb/minicpm-v | A multimodal language model designed to understand image, video, and text inputs and generate high-quality text outputs. | 12,619 |