OFA
Sequence-to-sequence framework
Develops a sequence-to-sequence learning framework that unifies modalities and tasks through pretraining and fine-tuning
Official repository of OFA (ICML 2022). Paper: OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework
2k stars
21 watching
248 forks
Language: Python
last commit: 10 months ago
Topics: chinese, image-captioning, multimodal, pretrained-models, pretraining, prompt, prompt-tuning, referring-expression-comprehension, text-to-image-synthesis, vision-language, visual-question-answering
Related projects:
Repository | Description | Stars |
---|---|---|
| Transforms the OFA-Chinese model to work with the Hugging Face Transformers framework | 123 |
| A system that uses large language and vision models to generate and process visual instructions | 20,683 |
| An interactive workflow for generating high-definition images from text prompts using a human-in-the-loop approach | 2,837 |
| An implementation of a text-driven image manipulation method, in which text prompts steer the style of the image. | 4,025 |
| A tool to evaluate vision-language models by comparing their performance on various tasks such as image recognition and text generation. | 79 |
| Enabling vision-language understanding by fine-tuning large language models on visual data. | 25,490 |
| An implementation of DALL-E 2's text-to-image synthesis neural network in PyTorch | 11,184 |
| Provides a benchmarking framework and implementation for deep learning-based text recognition models | 3,769 |
| A neural network trained on image and text pairs to predict the most relevant text snippet given an image | 26,460 |
| An implementation of a multimodal LLM training paradigm to enhance truthfulness and ethics in language models | 19 |
| An implementation of multimodal chain-of-thought reasoning in language models using a decoupled training framework for rationale generation and answer inference. | 3,833 |
| An end-to-end OCR system implementing General OCR Theory towards a unified model | 6,334 |
| Unsupervised feature learning by image inpainting using Generative Adversarial Networks (GANs) | 887 |
| An implementation of a deep learning-based facial cartoonization system using TensorFlow | 3,958 |
| This project generates animated videos from open-domain images by leveraging pre-trained video diffusion priors. | 2,668 |