GroundingDINO

Open-world detector

An object detection model for open-world scenarios that detects and recognizes objects from natural-language descriptions.

[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"

GitHub

7k stars

43 watching

711 forks

Language: Python

Last commit: almost 2 years ago

Linked from 1 awesome list

object-detection, open-world, open-world-detection, vision-language, vision-language-transformer

[Screenshot: IDEA-Research/GroundingDINO website]

arxiv.org/abs/2303.05499

Backlinks from these awesome lists:

amrzv/awesome-colab-notebooks

Related projects:

Repository	Description	Stars
idea-research/dino	An implementation of a deep learning-based object detection model with improved anchor boxes for end-to-end detection tasks.	2,295
facebookresearch/dinov2	A PyTorch implementation of a self-supervised learning method for learning robust visual features without supervision.	9,425
theshadow29/zsgnet-pytorch	An implementation of a computer vision model that grounds objects in images using natural language queries.	69
jhcho99/coformer	An implementation of a deep learning model for grounding situation recognition in images	45
tencentarc/gfpgan	An algorithm for restoring damaged or obscured faces in images	36,009
amdegroot/ssd.pytorch	An implementation of a deep learning-based object detection system in PyTorch.	5,160
cszn/kair	Image restoration toolbox with training and testing codes for various deep learning-based methods	2,994
doubiiu/dynamicrafter	This project generates animated videos from open-domain images by leveraging pre-trained video diffusion priors.	2,668
junyanz/interactive-deep-colorization	A system for automatically colorizing black and white images with user interactions.	2,701
thu-mig/yolov10	Real-time object detection using a neural network architecture	10,116
huawei-noah/efficient-ai-backbones	A collection of efficient AI backbone architectures developed by Huawei Noah's Ark Lab.	4,098
layumi/person_reid_baseline_pytorch	A PyTorch implementation of an Object Re-ID baseline with various training methods and architectures	4,149
roboflow/notebooks	This repository contains tutorials and examples on using state-of-the-art computer vision models and techniques	5,678
devendrachaplot/deeprl-grounding	Trains an RL agent to execute natural language instructions in a 3D environment using a combination of A3C and gated attention mechanisms.	237
mlfoundations/open_flamingo	A framework for training large multimodal models to generate text conditioned on images or other text.	3,781