dirty_cat
Categorical encoder library
A Python library for handling and encoding dirty categorical data in machine learning
Machine learning on dirty tabular data (legacy clone of skrub)
17 stars
0 watching
4 forks
Language: Python
Last commit: 2 months ago
Linked from 2 awesome lists
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A Python library that provides a way to easily create and manage data models without modifying the original data | 10 |
| | Adapting clean code principles to machine learning and data science in Python | 714 |
| | Converting dirty machine learning code into clean, modular, and reusable components using the Pipe and Filter Design Pattern for Machine Learning | 18 |
| | Library that provides dynamic data cleaning and filtering capabilities for PyTorch datasets and samplers | 378 |
| | An algorithm and package for handling noisy labels in binary classification problems | 82 |
| | Provides pre-cleaned and documented audio samples for musical experimentation | 45 |
| | A Python implementation of a machine learning solution for classifying images as dogs or cats from the Kaggle competition | 66 |
| | Tool to optimize database usage in Django by identifying and suggesting the removal of unused fields | 196 |
| | Implementation of a method to improve machine learning models trained with noisy labels by selecting and collaborating with high-quality samples | 39 |
| | A Python package implementing an interpretable machine learning model for text classification with visualization tools | 336 |
| | Provides scalable, performant data loading solutions and utilities to be shared by PyTorch domain libraries | 1,149 |
| | A tool for cleaning up data in MongoDB databases | 9 |
| | A Python package for data analysis and visualization in machine learning | 198 |
| | Removes unnecessary image manifests from a Docker Registry | 56 |
| | An SVG optimizer/cleaner tool that reduces the size of vector graphics by removing unnecessary data | 785 |