bad-data-guide
Data problem solver
An exhaustive reference to problems seen in real-world data along with suggestions on how to resolve them.
4k stars
215 watching
405 forks
last commit: about 3 years ago
Linked from 1 awesome list
Tags: data, documentation, guide, qz-things
Related projects:
Repository | Description | Stars |
---|---|---|
adibro/data-science-resources | A collection of resources and cheatsheets for learning and practicing data science | 63 |
cleanlab/cleanlab | Automates data quality checks and model training with AI-driven methods to improve machine learning performance | 9,820 |
xlaszlo/datascience-fails | Collects and documents common pitfalls and failure reasons in data science projects | 460 |
meteoswiss/publication-opendata | Provides access to standardized meteorological and climatological data from MeteoSwiss | 70 |
pedrobarcha/old-books-dataset | A collection of scanned book pages with ground truth annotations for OCR research and text analysis | 12 |
hadley/stats337 | An educational resource providing discussions on applied data science topics in R, with a focus on practical applications and real-world examples | 1,617 |
cuemacro/findatapy | A unified Python API to download market data from various sources | 1,716 |
ghiggi/gpm_api | Provides a Python interface to download and analyze GPM data from NASA's Precipitation Processing System | 60 |
geocryology/globsim | Automates downloading and processing of global reanalyses to generate meteorological time series | 19 |
gdsbook/book | An interactive introduction to geospatial data analysis using Python and Jupyter Notebook | 339 |
serpro69/kotlin-faker | A library for generating realistic fake data for testing and development purposes | 475 |
capitalone/dataprofiler | A Python library to analyze and profile datasets, detecting sensitive data and generating reports | 1,442 |
dataforgoodfr/quotaclimat | A tool to quantify media coverage of climate crises by collecting and analyzing radio and TV data from the Mediatree API | 29 |
getredash/redash | Enables users to connect to various data sources, visualize and share their data, making it easy to explore insights and drive business decisions | 26,572 |
jphall663/gwu_data_mining | Materials and lecture notes for a data science and machine learning course at GWU | 237 |