mmaction2

Video analysis toolkit

A modular toolbox and benchmark for video understanding, supporting tasks such as action recognition, temporal action localization, and video retrieval.

OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark

4k stars
42 watching
1k forks
Language: Python
Last commit: 5 months ago
Linked from 3 awesome lists

action-recognition, ava, benchmark, deep-learning, i3d, non-local, openmmlab, posec3d, pytorch, slowfast, spatial-temporal-action-detection, temporal-action-localization, tsm, tsn, uniformerv2, video-classification, video-understanding, x3d

Related projects:

Repository | Description | Stars
open-mmlab/mmcv | Provides a foundational library for computer vision research and training deep learning models, with high-quality implementations of common CPU and CUDA ops. | 5,948
open-mmlab/mmdetection | An object detection toolbox built on top of PyTorch, providing a modular framework for various tasks such as bounding box detection and instance segmentation. | 29,808
open-mmlab/mmdetection3d | An open-source platform for general 3D object detection with support for multiple modalities and datasets. | 5,391
open-mmlab/mmsegmentation | An open-source toolbox for semantic segmentation of images, including medical images. | 8,406
open-mmlab/mmagic | A toolkit for building and experimenting with generative AI models for image and video generation, restoration, enhancement, and other tasks. | 6,986
open-mmlab/mmdeploy | A toolset for deploying deep learning models on various devices and platforms. | 2,797
open-mmlab/mmaction | An open-source toolbox for action understanding from video data using PyTorch. | 1,863
open-mmlab/mmpose | A comprehensive toolbox for pose estimation tasks in computer vision. | 5,952
openmv/openmv | A platform for machine vision development with programmable cameras and extensive image processing capabilities. | 2,446
open-mmlab/mmengine | Provides a flexible and configurable framework for training deep learning models with PyTorch. | 1,196
opengvlab/internvideo | Develops general video foundation models and related datasets for multimodal understanding and generation through generative and discriminative learning. | 1,467
pku-yuangroup/video-llava | A large vision-language model that learns a unified visual representation for understanding both images and videos. | 3,071
open-mmlab/mmhuman3d | Provides a modular framework and tools for working with 3D human parametric models in computer vision and graphics. | 1,253
openbmb/minicpm-v | A multimodal language model designed to understand images, videos, and text inputs and generate high-quality text outputs. | 12,870
fuxiaoliu/mmc | Develops a large-scale dataset and benchmark for training multimodal chart understanding models using large language models. | 87