mmaction2
Video analysis toolkit
A comprehensive video understanding toolbox and benchmark with modular design, supporting various tasks such as action recognition, localization, and retrieval.
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
4k stars
42 watching
1k forks
Language: Python
last commit: 3 months ago
Linked from 3 awesome lists
Topics: action-recognition, ava, benchmark, deep-learning, i3d, non-local, openmmlab, posec3d, pytorch, slowfast, spatial-temporal-action-detection, temporal-action-localization, tsm, tsn, uniformerv2, video-classification, video-understanding, x3d
Related projects:
Repository | Description | Stars |
---|---|---|
open-mmlab/mmcv | Provides a foundational library for computer vision research and training deep learning models with high-quality implementation of common CPU and CUDA ops. | 5,906 |
open-mmlab/mmdetection | An object detection toolbox built on top of PyTorch, providing a modular framework for various tasks such as bounding box detection and instance segmentation. | 29,603 |
open-mmlab/mmdetection3d | An open-source platform for general 3D object detection with support for multiple modalities and datasets. | 5,313 |
open-mmlab/mmsegmentation | An open-source toolbox for semantic segmentation in images and medical images. | 8,285 |
open-mmlab/mmagic | A toolkit for building and experimenting with generative AI models for image and video generation, restoration, enhancement, and other tasks. | 6,945 |
open-mmlab/mmdeploy | A toolset for deploying deep learning models on various devices and platforms. | 2,774 |
open-mmlab/mmaction | An open-source toolbox for action understanding from video data using PyTorch. | 1,863 |
open-mmlab/mmpose | A comprehensive toolbox for pose estimation tasks in computer vision. | 5,846 |
openmv/openmv | A platform for machine vision development with programmable cameras and extensive image processing capabilities. | 2,438 |
open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,179 |
opengvlab/internvideo | Develops video foundation models and datasets for multimodal understanding and applications. | 1,413 |
pku-yuangroup/video-llava | Enables large language models to perform visual reasoning on images and videos simultaneously by learning united visual representations before projection. | 2,990 |
open-mmlab/mmhuman3d | Provides a modular framework and tools for working with 3D human parametric models in computer vision and graphics. | 1,240 |
openbmb/minicpm-v | A multimodal language model designed to understand images, videos, and text inputs and generate high-quality text outputs. | 12,619 |
fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 84 |