IdealGPT

Vision Reasoning Framework

A deep learning framework for iteratively decomposing vision and language reasoning via large language models.

Official Code of IdealGPT
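
The iterative decomposition mentioned in the description can be illustrated with a minimal sketch. It is not taken from the IdealGPT code base: the three roles (`ask`, `answer`, `reason`), the function `iterative_decompose`, the `max_rounds` parameter, and the toy stubs are all hypothetical names chosen for illustration. The sketch assumes one model proposes sub-questions, a vision-language model answers them, and a reasoning step either commits to a final answer or asks for another round.

```python
from typing import Callable, List, Tuple

# Hypothetical role signatures; these names are assumptions, not the project's API.
Evidence = List[Tuple[str, str]]                      # (sub-question, sub-answer) pairs
Questioner = Callable[[str, Evidence], List[str]]     # proposes new sub-questions
Answerer = Callable[[str], str]                       # answers one sub-question (e.g. a VLM)
Reasoner = Callable[[str, Evidence], Tuple[str, bool]]  # returns (answer, is_confident)


def iterative_decompose(question: str, ask: Questioner, answer: Answerer,
                        reason: Reasoner, max_rounds: int = 3) -> Tuple[str, Evidence]:
    """Ask sub-questions, answer them, and try to conclude; repeat while unsure."""
    evidence: Evidence = []
    final = ""
    for _ in range(max_rounds):
        # Collect answers to the sub-questions proposed for this round.
        for sub_q in ask(question, evidence):
            evidence.append((sub_q, answer(sub_q)))
        # Try to answer the main question from the evidence gathered so far.
        final, confident = reason(question, evidence)
        if confident:
            break
    return final, evidence


# Toy stand-ins for the models, so the loop can be run without any external service.
def toy_ask(question: str, evidence: Evidence) -> List[str]:
    pending = ["What objects are visible?", "What is the person holding?"]
    asked = {q for q, _ in evidence}
    return [q for q in pending if q not in asked][:1]  # one new sub-question per round


def toy_answer(sub_q: str) -> str:
    canned = {
        "What objects are visible?": "a person and a dog",
        "What is the person holding?": "a leash",
    }
    return canned[sub_q]


def toy_reason(question: str, evidence: Evidence) -> Tuple[str, bool]:
    confident = len(evidence) >= 2  # pretend two facts are enough to decide
    return ("walking the dog" if confident else "not sure", confident)


result, trace = iterative_decompose(
    "What is the person doing?", toy_ask, toy_answer, toy_reason
)
print(result)
for sub_q, sub_a in trace:
    print(f"{sub_q} -> {sub_a}")
```

With the toy stubs, the first round leaves the reasoning step unsure, so a second round gathers one more fact before the loop stops. In a real setup the stubs would be replaced by calls to actual language and vision-language models.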

32 stars

2 watching

8 forks

Language: Python

Last commit: almost 3 years ago

Related projects:

Repository	Description	Stars
shizhediao/davinci	An implementation of a unified modal learning framework for generative vision-language models	43
yuxie11/r2d2	A framework for large-scale cross-modal benchmarks and vision-language tasks in Chinese	157
fyu/dilation	A deep learning framework implementing dilated convolutions for semantic image segmentation	782
tobypde/frrn	A software framework for training and evaluating full-resolution residual networks for semantic image segmentation tasks	280
nvlabs/prismer	A deep learning framework for training multi-modal models with vision and language capabilities	1,299
jshilong/gpt4roi	Training and deploying large language models on computer vision tasks using region-of-interest inputs	517
ivaylo-popov/theano-lights	A deep learning framework built on top of Theano, providing a wide range of models and training techniques for research and development	267
wpiroboticsprojects/grip	A computer vision framework for robotics applications that simplifies the creation of vision systems and generates code in multiple programming languages	380
jy0205/lavit	A unified framework for training large language models to understand and generate visual content	544
yaodongyu/tct	An approach to train and optimize machine learning models in a decentralized setting by convexifying the optimization process	4
guopengf/auto-fedrl	A reinforcement learning-based framework for optimizing hyperparameters in distributed machine learning environments	15
sarababakn/mfcl-neurips23	An approach to mitigating catastrophic forgetting in federated class incremental learning for vision tasks using a generative model and data-free methods	15
pku-yuangroup/chat-univi	A framework for unified visual representation in image and video understanding models, enabling efficient training of large language models on multimodal data	895
jonfanlab/glonet	A software framework for training neural networks to optimize dielectric metasurfaces using physics-driven generative models and global optimization algorithms	101
tianyi-lab/hallusionbench	An image-context reasoning benchmark designed to challenge large vision-language models and help improve their accuracy	259