DataProfiler

Data Analysis Tool

A Python library to analyze and profile datasets, detecting sensitive data and generating reports.

What's in your data? Extract schema, statistics and entities from datasets
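
To make "schema, statistics and entities" concrete, here is a minimal standard-library sketch of what a column profile might contain. It is an illustration only and does not use DataProfiler's API: `profile_csv`, `infer_type`, and the crude `EMAIL_RE` pattern are hypothetical names invented for this example, and the email check is only a stand-in for real sensitive-data detection.

```python
import csv
import io
import re
import statistics

# Crude pattern used only to illustrate entity detection (an assumption,
# not how DataProfiler detects sensitive data).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def infer_type(values):
    """Return 'int', 'float', or 'string' for a list of non-empty strings."""
    for name, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    return "string"


def profile_csv(text):
    """Build a per-column profile (type, null count, basic stats) from CSV text."""
    rows = list(csv.DictReader(io.StringIO(text)))
    report = {}
    for column in (rows[0].keys() if rows else []):
        raw = [row[column] for row in rows]
        present = [v for v in raw if v not in ("", None)]
        col_type = infer_type(present) if present else "unknown"
        entry = {
            "type": col_type,
            "null_count": len(raw) - len(present),
            "unique_count": len(set(present)),
        }
        if col_type in ("int", "float"):
            numbers = [float(v) for v in present]
            entry["min"] = min(numbers)
            entry["max"] = max(numbers)
            entry["mean"] = statistics.mean(numbers)
        else:
            # Flag the column only if every non-empty value matches the pattern.
            entry["looks_like_email"] = bool(present) and all(
                EMAIL_RE.match(v) for v in present
            )
        report[column] = entry
    return report


sample = "id,email,score\n1,a@example.com,3.5\n2,b@example.com,\n3,c@example.com,4.5\n"
report = profile_csv(sample)
print(report["score"])
```

A real profiler covers far more (file-format detection, histograms, trained entity labelers); the sketch only shows the general shape of a per-column report.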

1k stars
21 watching
163 forks
Language: Python
Last commit: about 1 month ago
Linked from 3 awesome lists

Topics: avro, csv, data-analysis, data-labels, data-science, dataprofiling, dataset, gdpr, graph-data, machine-learning, network-data, nlp, npi, pandas, pii, privacy, python, security, sensitive-data, tabular-data

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| capitalone/datacompy | A package to compare and analyze similar data structures from Pandas, Polars, Spark, and Snowpark | 487 |
| mbevilacqua/appcompatprocessor | An application compatibility data analysis tool designed to extract value beyond traditional techniques | 197 |
| dataoneorg/d1_python | A collection of Python libraries and tools for interacting with DataONE repositories | 17 |
| columbia-applied-data-science/rosetta | Tools and utilities for efficient data processing with a focus on text analysis | 206 |
| circl/circlean | A tool to securely analyze and transfer data from compromised USB keys to trusted devices | 455 |
| cgarciae/phi | A Python library for functional programming that aims to simplify the experience by providing a unified API and operator overloading for common data transformations and operations | 134 |
| johnjreiser/chupaesri | A Python tool to extract data from ArcGIS Server and import it into PostgreSQL databases | 39 |
| mcdallas/wallstreet | A Python library providing real-time stock and option data analysis tools | 1,396 |
| opendatacube/datacube-core | A Python-based platform for integrated gridded data analysis from decades of Earth observation satellite data | 518 |
| basilesimon/datajournalists-toolbox | A collection of curated tools and resources for data journalists to analyze and visualize their data | 43 |
| ypares/porcupine | A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments | 89 |
| bpsmith/tia | A toolkit providing data access and analysis tools for financial markets | 409 |
| python-bonobo/bonobo | A Python framework for parallelizing data transformations and processing | 1,589 |
| hashlookup/hashlookup-forensic-analyser | Analyze digital evidence by searching for files against a large public hash database and generating reports on findings | 126 |
| cedadev/cis | A suite of tools for comparing and analyzing diverse datasets used in earth sciences | 45 |