ydata-profiling
Data Profiler
An exploratory data analysis tool for Pandas and Spark DataFrames
One-line data quality profiling and exploratory data analysis for Pandas and Spark DataFrames.
13k stars
152 watching
2k forks
Language: Python
Last commit: 8 days ago
Linked from 6 awesome lists
big-data-analytics, data-analysis, data-exploration, data-profiling, data-quality, data-science, deep-learning, eda, exploration, exploratory-data-analysis, hacktoberfest, html-report, jupyter, jupyter-notebook, machine-learning, pandas, pandas-dataframe, pandas-profiling, python, statistics
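The "one line of code" in the description refers to building a report object directly from a DataFrame. A minimal sketch of that workflow follows, assuming the `ydata-profiling` and `pandas` packages are installed. The sample data, the helper names `build_sample_frame` and `write_profile`, and the output path are made up for illustration and do not come from the project.

```python
import pandas as pd


def build_sample_frame() -> pd.DataFrame:
    """Return a small, made-up dataset with a few missing values."""
    return pd.DataFrame(
        {
            "age": [34, 45, None, 23, 51],
            "city": ["Lisbon", "Porto", "Lisbon", None, "Faro"],
            "income": [42000.0, 58000.0, 39000.0, 31000.0, 58000.0],
        }
    )


def write_profile(df: pd.DataFrame, path: str = "report.html") -> None:
    """Write an HTML profiling report for `df` to `path` (hypothetical helper)."""
    # Imported lazily so the rest of this sketch loads without the package.
    from ydata_profiling import ProfileReport

    # The "one line": build the report from the DataFrame.
    profile = ProfileReport(df, title="Sample profile")

    # Render the report to a standalone HTML file.
    profile.to_file(path)


# Example call (commented out so importing this sketch has no side effects):
# write_profile(build_sample_frame())
```

In a Jupyter notebook the same `ProfileReport` object can typically be displayed inline instead of being written to a file.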
Related projects:
Repository | Description | Stars |
---|---|---|
pandas-dev/pandas | A powerful data analysis toolkit for Python that provides flexible and expressive data structures for efficient data manipulation and analysis. | 43,807 |
pydata/pandas-datareader | Extracts data from various internet sources into a pandas DataFrame | 2,948 |
jvns/pandas-cookbook | A comprehensive guide to getting started with Python's pandas library using real-world data examples | 6,664 |
pydantic/pydantic | A Python library for validating data using type hints and JSON Schema. | 21,145 |
wesm/pydata-book | Materials and IPython notebooks for data analysis with Python | 22,248 |
panda-re/panda | An open-source platform for analyzing and debugging complex software systems | 2,489 |
twopirllc/pandas-ta | A Python package providing an extensive collection of technical analysis indicators and utility functions for financial data analysis. | 5,432 |
adamerose/pandasgui | A GUI tool for visualizing and analyzing Pandas DataFrames | 3,194 |
kanaries/pygwalker | A Python library for interactive data analysis and visualization of DataFrames, positioned as an open-source alternative to Tableau. | 13,382 |
databricks/koalas | A Python package that implements the pandas DataFrame API on top of Apache Spark | 3,336 |
lux-org/lux | A Python library that automates data exploration by recommending visualizations and suggesting next steps based on user interest | 5,210 |
unionai-oss/pandera | A lightweight library for validating and processing statistical data in Python | 3,393 |
sparklingpandas/sparklingpandas | Enables distributed data analysis using PySpark and Pandas APIs | 361 |
ydataai/ydata-synthetic | An educational package providing generative models for synthetic data generation. | 1,441 |
sinaptik-ai/pandas-ai | Makes data analysis conversational using LLMs and natural language | 13,516 |