Open-Sora-Dataset

Video dataset

A large video dataset collected from various open-source websites for use in computer vision and multimedia applications.

GitHub

93 stars
8 watching
6 forks
Language: Python
Last commit: 5 months ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117 |
| pku-yuangroup/magictime | A system for generating time-lapse videos from text prompts using deep learning models | 1,296 |
| pku-yuangroup/chronomagic-bench | A benchmark and dataset for evaluating text-to-video generation models' ability to generate coherent and varied metamorphic time-lapse videos | 186 |
| gsig/pyvideoresearch | A collection of video analysis methods and datasets for research and development | 533 |
| google-research/cad-estate | A large dataset of 3D object and room layout annotations on RGB videos, designed to test automatic scene understanding methods | 105 |
| openarabic/ocr_gs_data | A collection of double-checked gold standard data for training and testing OCR engines | 13 |
| jxshin/mzdata | A comprehensive dataset of Mozilla issue tracking history, providing multiple extracts and levels for analysis | 7 |
| ubisoft/ubisoft-laforge-animation-dataset | An animation dataset for studying human motion and developing computer vision algorithms | 1,026 |
| openearth/videomap | Tools for processing and exporting video map data | 2 |
| pku-yuangroup/chat-univi | A framework for unified visual representation in image and video understanding models, enabling efficient training of large language models on multimodal data | 847 |
| littleyuyu/stackoverflow-question-code-dataset | A collection of mined question-code pairs from Stack Overflow used for training and testing AI models | 165 |
| opengvlab/internvideo | Developing video foundation models and datasets for multimodal understanding and applications | 1,413 |
| nytud/hulu | A collection of linguistic datasets and benchmarks for natural language understanding tasks | 9 |
| pharo-ai/datasets | A Smalltalk library for loading and managing datasets as data frames | 9 |
| pythainlp/prachathai-67k | An article classification dataset created from news articles scraped from Prachathai.com, with multiple benchmark models for multi-label classification | 16 |