datasets
ML data loader
A tool providing efficient data manipulation and loading for machine learning models
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
19k stars
277 watching
3k forks
Language: Python
Last commit: 4 days ago
Linked from 1 awesome list
Topics: computer-vision, datasets, deep-learning, hacktoberfest, machine-learning, natural-language-processing, nlp, numpy, pandas, pytorch, speech, tensorflow
Related projects:
Repository | Description | Stars |
---|---|---|
huggingface/transformers | A collection of pre-trained machine learning models for various natural language and computer vision tasks, enabling developers to fine-tune and deploy these models on their own projects. | 136,357 |
huggingface/datatrove | A platform-agnostic data processing framework for large-scale text data pipelines. | 2,103 |
huggingface/tokenizers | A toolkit providing optimized tokenizers for natural language processing tasks in various programming languages. | 9,156 |
huggingface/diffusion-models-class | A comprehensive course teaching the theory and hands-on implementation of diffusion models in image and audio generation using PyTorch. | 3,722 |
switchablenorms/celebamask-hq | A large-scale face image dataset for training and evaluating algorithms in face parsing, recognition, generation, and editing. | 2,136 |
huggingface/transformers.js | An open-source JavaScript library for running machine learning models in the browser without a server. | 12,363 |
huggingface/lerobot | A platform providing pre-trained models, datasets, and tools for robotics with focus on imitation learning and reinforcement learning. | 7,874 |
pandas-dev/pandas | A powerful data analysis toolkit for Python that provides flexible and expressive data structures for efficient data manipulation and analysis. | 44,052 |
ayush1997/visualize_ml | A Python package for data analysis and visualization in machine learning. | 198 |
huggingface/peft | An efficient method for fine-tuning large pre-trained models by adapting only a small fraction of their parameters. | 16,699 |
iigroup/mm-celeba-hq-dataset | A large-scale dataset for training and evaluating algorithms for text-driven face generation and understanding tasks. | 223 |
huggingface/diffusers | A PyTorch-based library for training and using state-of-the-art diffusion models to generate images, audio, and 3D structures. | 26,676 |
dotnet/machinelearning-samples | A collection of samples and examples demonstrating the usage of ML.NET for machine learning tasks in .NET applications. | 4,508 |
rosejn/torch-datasets | A collection of pre-processed machine learning datasets for use with the Torch7 deep learning framework. | 37 |
huggingface/alignment-handbook | Recipes and guidelines for training language models to align with human preferences and AI goals. | 4,800 |