Open-Sora-Dataset
Video dataset
A large video dataset collected from various open-source websites for use in computer vision and multimedia applications.
94 stars
8 watching
6 forks
Language: Python
Last commit: 5 months ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
pku-yuangroup/video-bench | Evaluates and benchmarks large language models' video understanding capabilities | 117 |
pku-yuangroup/magictime | Tools and models for generating time-lapse videos from text prompts | 1,303 |
pku-yuangroup/chronomagic-bench | A benchmark and dataset for evaluating text-to-video generation models' ability to generate coherent and varied metamorphic time-lapse videos. | 187 |
gsig/pyvideoresearch | A collection of video analysis methods and datasets for research and development | 533 |
google-research/cad-estate | A large dataset of 3D object and room layout annotations on RGB videos, designed to test automatic scene understanding methods. | 105 |
openarabic/ocr_gs_data | A collection of double-checked gold standard data for training and testing OCR engines. | 13 |
jxshin/mzdata | A comprehensive dataset of Mozilla issue tracking history, providing multiple extracts and levels for analysis. | 7 |
ubisoft/ubisoft-laforge-animation-dataset | An animation dataset for studying human motion and developing computer vision algorithms | 1,026 |
openearth/videomap | Tools for processing and exporting video map data | 2 |
pku-yuangroup/chat-univi | A framework for unified visual representation in image and video understanding models, enabling efficient training of large language models on multimodal data. | 847 |
littleyuyu/stackoverflow-question-code-dataset | A collection of mined question-code pairs from Stack Overflow used for training and testing AI models | 165 |
opengvlab/internvideo | Developing video foundation models and datasets for multimodal understanding and applications | 1,413 |
nytud/hulu | A collection of linguistic datasets and benchmarks for natural language understanding tasks | 9 |
pharo-ai/datasets | A Smalltalk library for loading and managing datasets as data frames. | 9 |
pythainlp/prachathai-67k | An article classification dataset created from news articles scraped from Prachathai.com with multiple benchmark models for multi-label classification | 16 |