DataProfiler

Data Analysis Tool

A Python library to analyze and profile datasets, detecting sensitive data and generating reports.

What's in your data? Extract schema, statistics and entities from datasets
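
As a rough illustration of what "schema, statistics and entities" can mean in practice, here is a minimal standard-library sketch. It is not DataProfiler's API: `infer_type`, `profile_csv`, and `EMAIL_RE` are hypothetical names made up for this example, and the single email pattern is a crude stand-in for real sensitive-data detection.

```python
import csv
import io
import re
import statistics

# Hypothetical pattern for illustration only; real detectors are far more thorough.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")


def infer_type(values):
    """Return 'int', 'float', or 'string' for a list of raw CSV strings."""
    for name, cast in (("int", int), ("float", float)):
        try:
            for value in values:
                cast(value)
            return name
        except ValueError:
            continue
    return "string"


def profile_csv(text):
    """Build a per-column summary: inferred type, basic stats, email flag."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return {}
    report = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        kind = infer_type(values)
        summary = {"type": kind, "count": len(values)}
        if kind in ("int", "float"):
            numbers = [float(value) for value in values]
            summary["min"] = min(numbers)
            summary["max"] = max(numbers)
            summary["mean"] = statistics.mean(numbers)
        # Flag the column only if every value matches the email pattern.
        summary["looks_like_email"] = all(EMAIL_RE.match(value) for value in values)
        report[column] = summary
    return report


sample = "name,age,contact\nAda,36,ada@example.com\nLin,41,lin@example.org\n"
report = profile_csv(sample)
for column, summary in report.items():
    print(column, summary)
```

The sketch treats a column as numeric only when every value parses, and flags a column as email-like only when every value matches. A real profiler would typically tolerate missing values and report match ratios instead of all-or-nothing flags.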

GitHub

1k stars
21 watching
162 forks
Language: Python
Last commit: 10 days ago
Linked from 3 awesome lists

Topics: avro, csv, data-analysis, data-labels, data-science, dataprofiling, dataset, gdpr, graph-data, machine-learning, network-data, nlp, npi, pandas, pii, privacy, python, security, sensitive-data, tabular-data

Related projects:

Repository | Description | Stars
capitalone/datacompy | A tool for comparing and analyzing data in various formats, such as Pandas DataFrames and Spark DataFrames | 485
mbevilacqua/appcompatprocessor | An application compatibility data analysis tool designed to extract value beyond traditional techniques | 197
dataoneorg/d1_python | A collection of Python libraries and tools for interacting with DataONE repositories | 17
columbia-applied-data-science/rosetta | Tools and utilities for efficient data processing with a focus on text analysis | 206
circl/circlean | A tool to securely analyze and transfer data from compromised USB keys to trusted devices | 454
cgarciae/phi | A Python library for functional programming that aims to simplify the experience by providing a unified API and operator overloading for common data transformations and operations | 134
johnjreiser/chupaesri | A Python tool to extract data from ArcGIS Server and import it into PostgreSQL databases | 39
mcdallas/wallstreet | A Python library providing real-time stock and option data analysis tools | 1,375
opendatacube/datacube-core | A Python-based platform for integrated gridded data analysis from decades of Earth observation satellite data | 514
basilesimon/datajournalists-toolbox | A collection of curated tools and resources for data journalists to analyze and visualize their data | 43
ypares/porcupine | A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments | 89
bpsmith/tia | A toolkit providing data access and analysis tools for financial markets | 409
python-bonobo/bonobo | A Python framework for parallelizing data transformations and processing | 1,589
hashlookup/hashlookup-forensic-analyser | Analyze digital evidence by searching for files against a large public hash database and generating reports on findings | 125
cedadev/cis | A Python tool for collocation, visualization, analysis, and comparison of diverse datasets used in earth sciences | 46