co-tracker

Point tracker

A model for tracking any point in a video, combining a transformer-based architecture with some of the benefits of optical flow

CoTracker is a model for tracking any point (pixel) in a video.

GitHub

4k stars

34 watching

262 forks

Language: Jupyter Notebook

Last commit: about 1 year ago

Topics: optical-flow, point-tracking, track-anything


co-tracker.github.io/

Related projects:

Repository	Description	Stars
visionml/pytracking	A comprehensive framework for building and training visual object tracking models using PyTorch.	3,271
doubiiu/dynamicrafter	This project generates animated videos from open-domain images by leveraging pre-trained video diffusion priors.	2,668
facebookresearch/segment-anything	This project provides code and tools for running inference with a visual segmentation model that can generate object masks from input prompts.	48,092
facebookresearch/sam2	A software framework for object segmentation in images and videos using AI models	13,054
eduardolundgren/tracking.js	A computer vision library for the web that provides various algorithms and techniques for tracking objects and features in images.	9,445
facebookresearch/slowfast	Provides a state-of-the-art video understanding codebase with efficient training methods and pre-trained models for various tasks	6,680
clementpinard/sfmlearner-pytorch	PyTorch implementation of unsupervised depth and ego-motion learning from video sequences	1,022
doubiiu/tooncrafter	Generates cartoon-style videos from two images using pre-trained diffusion models	5,447
facebookresearch/dinov2	A PyTorch implementation of a self-supervised learning method for learning robust visual features without supervision.	9,425
mkocabas/vibe	A video pose and shape estimation method that predicts body parameters for each frame of an input video.	2,911
facebookresearch/detectron2	A platform for object detection and segmentation tasks using machine learning algorithms	30,778
benaclejames/vrcfacetracking	Provides eye and lip tracking functionality for VRChat avatars using OSC protocol and custom parameter mappings	634
thudm/cogvideo	Generates videos from text and images using large language models	9,761
facebookresearch/imagebind	An AI framework that combines data from multiple sources into a single embedding space, enabling various applications such as cross-modal retrieval and generation.	8,424
martinruenz/co-fusion	Enables a robot to maintain scene descriptions at the object level by segmenting and tracking objects in real-time from RGB-D images.	502