mmaction2

Video analysis toolkit

A comprehensive video understanding toolbox and benchmark with modular design, supporting various tasks such as action recognition, localization, and retrieval.

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

4k stars
42 watching
1k forks
Language: Python
Last commit: 3 months ago
Linked from 3 awesome lists

Topics: action-recognition, ava, benchmark, deep-learning, i3d, non-local, openmmlab, posec3d, pytorch, slowfast, spatial-temporal-action-detection, temporal-action-localization, tsm, tsn, uniformerv2, video-classification, video-understanding, x3d

Related projects:

Repository | Description | Stars
open-mmlab/mmcv | Provides a foundational library for computer vision research and training deep learning models, with high-quality implementations of common CPU and CUDA ops. | 5,906
open-mmlab/mmdetection | An object detection toolbox built on top of PyTorch, providing a modular framework for tasks such as bounding box detection and instance segmentation. | 29,603
open-mmlab/mmdetection3d | An open-source platform for general 3D object detection with support for multiple modalities and datasets. | 5,313
open-mmlab/mmsegmentation | An open-source toolbox for semantic segmentation of images, including medical images. | 8,285
open-mmlab/mmagic | A toolkit for building and experimenting with generative AI models for image and video generation, restoration, enhancement, and other tasks. | 6,945
open-mmlab/mmdeploy | A toolset for deploying deep learning models on various devices and platforms. | 2,774
open-mmlab/mmaction | An open-source toolbox for action understanding from video data using PyTorch. | 1,863
open-mmlab/mmpose | A comprehensive toolbox for pose estimation tasks in computer vision. | 5,846
openmv/openmv | A platform for machine vision development with programmable cameras and extensive image processing capabilities. | 2,438
open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,179
opengvlab/internvideo | Develops video foundation models and datasets for multimodal understanding and applications. | 1,413
pku-yuangroup/video-llava | Enables large language models to perform visual reasoning on images and videos simultaneously by learning united visual representations before projection. | 2,990
open-mmlab/mmhuman3d | Provides a modular framework and tools for working with 3D human parametric models in computer vision and graphics. | 1,240
openbmb/minicpm-v | A multimodal language model designed to understand image, video, and text inputs and generate high-quality text outputs. | 12,619
fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 84