datacompy

Data comparer

A package to compare and analyze similar data structures from Pandas, Polars, Spark, and Snowpark

Pandas, Polars, Spark, and Snowpark DataFrame comparison for humans and more!
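To make concrete what a DataFrame comparison involves, here is a minimal sketch in plain pandas. It is not datacompy's API or implementation. The helper name `compare_frames`, the sample frames, and the shape of the returned dictionary are all hypothetical and chosen only for illustration. It joins two frames on a key column, then reports the rows found on only one side and the key values where a shared column disagrees.

```python
import pandas as pd


def compare_frames(left, right, key):
    """Hypothetical helper (not part of datacompy): join on `key` and summarise differences."""
    merged = left.merge(
        right,
        on=key,
        how="outer",
        suffixes=("_left", "_right"),
        indicator=True,  # adds a `_merge` column: left_only / right_only / both
    )

    # Keys present in only one of the two frames.
    only_left = merged.loc[merged["_merge"] == "left_only", key].tolist()
    only_right = merged.loc[merged["_merge"] == "right_only", key].tolist()

    # Rows present in both frames, compared column by column.
    both = merged[merged["_merge"] == "both"]
    shared_cols = [c for c in left.columns if c in right.columns and c != key]

    mismatches = {}
    for col in shared_cols:
        a = both[f"{col}_left"]
        b = both[f"{col}_right"]
        # Treat two missing values as equal; everything else must match exactly.
        differs = ~((a == b) | (a.isna() & b.isna()))
        mismatches[col] = both.loc[differs, key].tolist()

    return {"only_left": only_left, "only_right": only_right, "mismatches": mismatches}


# Illustrative data, invented for this sketch.
original = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
new = pd.DataFrame({"id": [2, 3, 4], "amount": [20.0, 31.0, 40.0]})

result = compare_frames(original, new, key="id")
print(result)
```

The sketch uses exact equality throughout. Numeric tolerances, a formatted report, and support for engines other than pandas are deliberately left out.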

GitHub

487 stars
23 watching
130 forks
Language: Python
last commit: about 1 month ago
Linked from 3 awesome lists

compare, dask, data, data-science, dataframes, fugue, numpy, pandas, polars, pyspark, python, snowflake, snowpark, spark

Related projects:

Repository | Description | Stars
capitalone/dataprofiler | A Python library to analyze and profile datasets, detecting sensitive data and generating reports. | 1,442
blaylockbk/synopticpy | A Python package that retrieves weather data from Synoptic's API and converts it to Polars DataFrames. | 55
cedadev/cis | A suite of tools for comparing and analyzing diverse datasets used in earth sciences. | 45
columbia-applied-data-science/rosetta | Tools and utilities for efficient data processing with a focus on text analysis. | 206
opendatacube/datacube-core | A Python-based platform for integrated gridded data analysis from decades of Earth observation satellite data. | 518
nikolaydubina/fpdecimal | Provides a precise and efficient data type for fixed-point decimals in Go. | 31
wswup/gridwxcomp | Compares weather station data with gridded climate datasets hosted on Google Earth Engine. | 17
svenkreiss/pysparkling | A lightweight Python implementation of Spark's RDD and DStream interfaces for improved performance on small datasets. | 262
rocketlaunchr/dataframe-go | A lightweight data manipulation and analysis package for Go. | 1,206
ncas-cms/cf-python | A Python library implementing a CF data model and providing tools for Earth Science data analysis. | 129
datastax/spark-cassandra-connector | A library that enables integration between Apache Spark and Apache Cassandra for fast data processing and analysis. | 1,944
jealous/stockstats | A Python library providing a wrapper around pandas.DataFrame with support for stock market statistics and indicators calculation. | 1,312
pylons/colander | A library for serializing and deserializing data structures into strings, mappings, and lists while performing validation. | 451
laion-ai/clip_benchmark | Evaluates and compares the performance of various CLIP-like models on different tasks and datasets. | 632
scwilkinson/pd-replicator | A tool that allows users to easily copy data from a pandas DataFrame to the clipboard with one click. | 11