ydata-profiling
Data Profiler
An exploratory data analysis tool for Pandas and Spark DataFrames
One-line data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.
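As a minimal sketch of what "one line" means in practice, the report is built by passing a DataFrame to `ProfileReport` and then written to HTML with `to_file`. The wrapper name `profile_to_html`, its parameters, and the default file name are hypothetical and chosen only for illustration. The sketch assumes the `ydata-profiling` package is installed.

```python
from pathlib import Path


def profile_to_html(df, output_path="profile.html", title="Data profile"):
    """Write an HTML profiling report for a pandas DataFrame.

    The function name and its parameters are hypothetical, used only
    for this sketch.
    """
    # Imported inside the function so the dependency is only needed when called.
    from ydata_profiling import ProfileReport

    # The "one line": build the report object from the DataFrame.
    report = ProfileReport(df, title=title)

    # Render the report to a standalone HTML file.
    report.to_file(output_path)
    return Path(output_path)
```

Calling `profile_to_html(df)` on any pandas DataFrame should leave a `profile.html` file in the working directory that can be opened in a browser.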
13k stars
152 watching
2k forks
Language: Python
last commit: 1 day ago
Linked from 6 awesome lists
Tags: big-data-analytics, data-analysis, data-exploration, data-profiling, data-quality, data-science, deep-learning, eda, exploration, exploratory-data-analysis, hacktoberfest, html-report, jupyter, jupyter-notebook, machine-learning, pandas, pandas-dataframe, pandas-profiling, python, statistics
Related projects:
Repository | Description | Stars |
---|---|---|
pandas-dev/pandas | A powerful data analysis toolkit for Python that provides flexible and expressive data structures for efficient data manipulation and analysis. | 44,052 |
pydata/pandas-datareader | Extracts data from various internet sources into a pandas DataFrame | 2,982 |
jvns/pandas-cookbook | A comprehensive guide to getting started with Python's pandas library using real-world data examples | 6,697 |
pydantic/pydantic | A Python library for validating data using type hints, with support for generating JSON Schema. | 21,677 |
wesm/pydata-book | Materials and IPython notebooks for data analysis with Python | 22,389 |
panda-re/panda | An open-source platform for analyzing and debugging complex software systems | 2,507 |
twopirllc/pandas-ta | A Python package providing an extensive collection of technical analysis indicators and utility functions for financial data analysis. | 5,545 |
adamerose/pandasgui | A GUI tool for visualizing and analyzing Pandas DataFrames | 3,204 |
kanaries/pygwalker | A Python library that enables interactive data analysis and visualization using an open-source alternative to Tableau. | 13,533 |
databricks/koalas | A Python package that implements the pandas DataFrame API on top of Apache Spark | 3,343 |
lux-org/lux | A Python library that automates data exploration by recommending visualizations and suggesting next steps based on user interest | 5,226 |
unionai-oss/pandera | A lightweight library for validating and processing statistical data in Python | 3,472 |
sparklingpandas/sparklingpandas | Enables distributed data analysis using PySpark and Pandas APIs | 362 |
ydataai/ydata-synthetic | An educational package providing generative models for synthetic data generation. | 1,456 |
sinaptik-ai/pandas-ai | Makes data analysis conversational using LLMs and natural language | 13,714 |