datacompy
Data Comparer
A tool for comparing and analyzing data in various formats, such as Pandas DataFrames and Spark DataFrames.
Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
485 stars
23 watching
129 forks
Language: Python
last commit: 6 days ago
Linked from 3 awesome lists
compare, dask, data, data-science, dataframes, fugue, numpy, pandas, polars, pyspark, python, snowflake, snowpark, spark
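
To make "DataFrame comparison" concrete, here is a minimal sketch of the general idea using plain pandas: join two frames on a key, then report rows unique to each side and values that differ. It does not use datacompy and is not its implementation; the helper `compare_frames`, the column names `acct_id` and `balance`, and the sample data are all hypothetical and chosen only for illustration.

```python
import pandas as pd


def compare_frames(left, right, key):
    """Join two DataFrames on a single key column and summarize differences.

    Illustrative helper only; this is not part of datacompy.
    """
    # An outer join keeps rows from both sides; indicator=True adds a
    # "_merge" column saying which side(s) each row came from.
    merged = left.merge(
        right,
        on=key,
        how="outer",
        suffixes=("_left", "_right"),
        indicator=True,
    )
    only_left = merged.loc[merged["_merge"] == "left_only", key].tolist()
    only_right = merged.loc[merged["_merge"] == "right_only", key].tolist()

    # For rows present on both sides, compare every shared non-key column.
    both = merged[merged["_merge"] == "both"]
    shared_cols = [c for c in left.columns if c in right.columns and c != key]
    mismatches = {}
    for col in shared_cols:
        lhs = both[f"{col}_left"]
        rhs = both[f"{col}_right"]
        # Treat two missing values as equal; anything else unequal is a mismatch.
        differs = ~((lhs == rhs) | (lhs.isna() & rhs.isna()))
        mismatches[col] = both.loc[differs, key].tolist()

    return {
        "only_left": only_left,
        "only_right": only_right,
        "mismatches": mismatches,
    }


# Hypothetical sample data.
original = pd.DataFrame({"acct_id": [1, 2, 3], "balance": [10.0, 20.0, 30.0]})
new = pd.DataFrame({"acct_id": [2, 3, 4], "balance": [20.0, 35.0, 40.0]})

result = compare_frames(original, new, key="acct_id")
print(result)
```

In this toy data, `acct_id` 1 exists only in the first frame, 4 exists only in the second, and 3 appears in both with a different `balance`. A dedicated comparison library typically adds tolerance settings, multi-column keys, and a readable report on top of this kind of join.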
Related projects:
Repository | Description | Stars |
---|---|---|
capitalone/dataprofiler | A Python library to analyze and profile datasets, detecting sensitive data and generating reports. | 1,434 |
blaylockbk/synopticpy | Converts Synoptic's Weather API data into Polars DataFrames for Python analysis. | 50 |
cedadev/cis | A Python tool for collocation, visualization, analysis, and comparison of diverse datasets used in earth sciences. | 46 |
columbia-applied-data-science/rosetta | Tools and utilities for efficient data processing with a focus on text analysis. | 206 |
opendatacube/datacube-core | A Python-based platform for integrated analysis of gridded data from decades of Earth observation satellite imagery. | 514 |
nikolaydubina/fpdecimal | Provides a precise and efficient data type for fixed-point decimals in Go. | 31 |
wswup/gridwxcomp | Compares weather station data with gridded climate datasets hosted on Google Earth Engine. | 17 |
svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets. | 262 |
rocketlaunchr/dataframe-go | A lightweight data manipulation and analysis package for Go. | 1,192 |
ncas-cms/cf-python | A Python library implementing a CF data model and providing tools for Earth Science data analysis. | 125 |
datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis. | 1,943 |
jealous/stockstats | A Python library providing a wrapper around pandas.DataFrame with support for calculating stock market statistics and indicators. | 1,303 |
pylons/colander | A library for serializing and deserializing data structures into strings, mappings, and lists while performing validation. | 451 |
laion-ai/clip_benchmark | Evaluates and compares the performance of various CLIP-like models on different tasks and datasets. | 615 |
scwilkinson/pd-replicator | A tool that allows users to easily copy data from a pandas DataFrame to the clipboard with one click. | 11 |