datacompy

Data Comparer

A tool for comparing and analyzing data in various formats, such as Pandas DataFrames and Spark DataFrames.

Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
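
To make the idea of a DataFrame comparison concrete, here is a minimal sketch of a key-based comparison using only the Python standard library. It is not datacompy's API: the function `compare_records`, its `abs_tol` parameter, and the sample `acct_id` rows are all hypothetical and exist only for illustration. It joins two sets of rows on a key, then reports rows unique to each side and the columns that differ for shared keys.

```python
from math import isclose


def compare_records(left, right, key, abs_tol=0.0):
    """Compare two lists of dict rows joined on `key` (hypothetical helper).

    Returns (keys only in left, keys only in right, per-key column mismatches).
    """
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}

    # Rows present on one side only.
    only_left = sorted(left_by_key.keys() - right_by_key.keys())
    only_right = sorted(right_by_key.keys() - left_by_key.keys())

    # For shared keys, compare every column both rows have in common.
    mismatches = {}
    for k in sorted(left_by_key.keys() & right_by_key.keys()):
        l_row, r_row = left_by_key[k], right_by_key[k]
        diffs = {}
        for col in sorted(l_row.keys() & r_row.keys()):
            a, b = l_row[col], r_row[col]
            both_numeric = all(
                isinstance(v, (int, float)) and not isinstance(v, bool)
                for v in (a, b)
            )
            # Numeric values may differ by up to abs_tol; others must be equal.
            if both_numeric:
                equal = isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
            else:
                equal = a == b
            if not equal:
                diffs[col] = (a, b)
        if diffs:
            mismatches[k] = diffs
    return only_left, only_right, mismatches


# Illustrative data, not taken from any real dataset.
original = [
    {"acct_id": 1, "balance": 100.00, "name": "Ann"},
    {"acct_id": 2, "balance": 250.50, "name": "Bob"},
    {"acct_id": 3, "balance": 75.25, "name": "Cy"},
]
new = [
    {"acct_id": 1, "balance": 100.00, "name": "Ann"},
    {"acct_id": 2, "balance": 250.75, "name": "Bob"},
    {"acct_id": 4, "balance": 10.00, "name": "Dee"},
]

only_left, only_right, mismatches = compare_records(original, new, key="acct_id")
print("Only in original:", only_left)
print("Only in new:", only_right)
print("Mismatched values:", mismatches)
```

Raising `abs_tol` makes small numeric differences count as matches, which is the usual reason to use a dedicated comparison tool instead of a strict equality check.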

GitHub

485 stars
23 watching
129 forks
Language: Python
Last commit: 6 days ago
Linked from 3 awesome lists

Tags: compare, dask, data, data-science, dataframes, fugue, numpy, pandas, polars, pyspark, python, snowflake, snowpark, spark

Related projects:

Repository | Description | Stars
capitalone/dataprofiler | A Python library to analyze and profile datasets, detecting sensitive data and generating reports. | 1,434
blaylockbk/synopticpy | Converts Synoptic's Weather API data into Polars DataFrames for Python analysis. | 50
cedadev/cis | A Python tool for collocation, visualization, analysis, and comparison of diverse datasets used in earth sciences. | 46
columbia-applied-data-science/rosetta | Tools and utilities for efficient data processing with a focus on text analysis. | 206
opendatacube/datacube-core | A Python-based platform for integrated gridded data analysis from decades of Earth observation satellite data. | 514
nikolaydubina/fpdecimal | Provides a precise and efficient data type for fixed-point decimals in Go. | 31
wswup/gridwxcomp | Compares weather station data with gridded climate datasets hosted on Google Earth Engine. | 17
svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets. | 262
rocketlaunchr/dataframe-go | A lightweight data manipulation and analysis package for Go. | 1,192
ncas-cms/cf-python | A Python library implementing a CF data model and providing tools for Earth Science data analysis. | 125
datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis. | 1,943
jealous/stockstats | A Python library providing a wrapper around pandas.DataFrame with support for stock market statistics and indicators calculation. | 1,303
pylons/colander | A library for serializing and deserializing data structures into strings, mappings, and lists while performing validation. | 451
laion-ai/clip_benchmark | Evaluates and compares the performance of various CLIP-like models on different tasks and datasets. | 615
scwilkinson/pd-replicator | A tool that allows users to easily copy data from a pandas DataFrame to the clipboard with one click. | 11