co-tracker
Point tracker
A transformer-based model for tracking any point in a video, drawing on ideas from optical flow
CoTracker is a model for tracking any point (pixel) in a video.
4k stars
34 watching
250 forks
Language: Jupyter Notebook
last commit: 29 days ago
Topics: optical-flow, point-tracking, track-anything
Related projects:
Repository | Description | Stars |
---|---|---|
visionml/pytracking | A comprehensive framework for building and training visual object tracking models using PyTorch. | 3,247 |
doubiiu/dynamicrafter | This project generates animated videos from open-domain images by leveraging pre-trained video diffusion priors. | 2,580 |
facebookresearch/segment-anything | This project provides code and tools for running inference with a visual segmentation model that can generate object masks from input prompts. | 47,627 |
facebookresearch/sam2 | An open-source software project providing code and tools for running inference with a deep learning model designed for visual segmentation in images and videos. | 12,353 |
eduardolundgren/tracking.js | A computer vision library for the web that provides various algorithms and techniques for tracking objects and features in images. | 9,443 |
facebookresearch/slowfast | Provides a state-of-the-art video understanding codebase with efficient training methods and pre-trained models for various tasks. | 6,623 |
clementpinard/sfmlearner-pytorch | PyTorch implementation of unsupervised depth and ego-motion learning from video sequences. | 1,014 |
doubiiu/tooncrafter | Generates cartoon-style videos from two images using pre-trained diffusion models. | 5,353 |
facebookresearch/dinov2 | A PyTorch implementation of a self-supervised learning method for learning robust visual features without supervision. | 9,211 |
mkocabas/vibe | A video pose and shape estimation method that predicts body parameters for each frame of an input video. | 2,897 |
facebookresearch/detectron2 | A platform for object detection and segmentation tasks using machine learning algorithms. | 30,539 |
benaclejames/vrcfacetracking | Provides eye and lip tracking functionality for VRChat avatars using the OSC protocol and custom parameter mappings. | 627 |
thudm/cogvideo | Generates videos from text and images using large language models. | 9,156 |
facebookresearch/imagebind | An AI framework that combines data from multiple sources into a single embedding space, enabling various applications such as cross-modal retrieval and generation. | 8,362 |
martinruenz/co-fusion | Enables a robot to maintain scene descriptions at the object level by segmenting and tracking objects in real-time from RGB-D images. | 501 |