vaex

Data explorer

A high-performance Python library for streaming and exploring large tabular datasets.

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

GitHub

8k stars
145 watching
591 forks
Language: Python
Last commit: 3 months ago
Linked from 3 awesome lists

bigdata, data-science, dataframe, hdf5, machine-learning, machinelearning, memory-mapped-file, pyarrow, python, tabular-data, visualization

Related projects:

Repository | Description | Stars
wesm/feather | A binary data frame storage system that enables efficient and interoperable data sharing across multiple programming languages | 2,742
pandas-dev/pandas | A powerful data analysis toolkit for Python that provides flexible and expressive data structures for efficient data manipulation and analysis | 44,052
lux-org/lux | A Python library that automates data exploration by recommending visualizations and suggesting next steps based on user interest | 5,226
blaze/blaze | A Python library that translates familiar NumPy/Pandas-like syntax into database query language | 3,185
nixtla/statsforecast | A fast and accurate Python library for forecasting time series data using various statistical and econometric models | 4,045
mwaskom/seaborn | A high-level interface for statistical data visualization | 12,669
lancedb/lance | A modern columnar data format for machine learning and large language models | 4,010
explosion/spacy | Industrial-strength NLP library for Python and Cython | 30,459
finos/perspective | A component for creating interactive analytics and data visualization applications with support for large datasets and streaming queries | 8,669
ibis-project/ibis | A portable Python library for working with dataframes across multiple backends | 5,390
apache/datafusion-ballista-python | Bindings for using Apache Arrow's query engine in Python to analyze and manipulate large datasets | 34
mckinsey/vizro | A low-code toolkit for building high-quality data visualization apps using Python | 2,736
vega/altair | A declarative statistical visualization library for Python | 9,441
vispy/vispy | A high-performance interactive 2D/3D data visualization library for Python | 3,334
holoviz/datashader | Automates the process of creating meaningful representations of large data sets | 3,336