vaex

Data explorer

A high-performance Python library for streaming and exploring large tabular datasets.

Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀

GitHub

8k stars
144 watching
590 forks
Language: Python
Last commit: about 2 months ago
Linked from 3 awesome lists

Topics: bigdata, data-science, dataframe, hdf5, machine-learning, machinelearning, memory-mapped-file, pyarrow, python, tabular-data, visualization

Related projects:

Repository | Description | Stars
wesm/feather | A binary data frame storage system that enables efficient and interoperable data sharing across multiple programming languages. | 2,742
pandas-dev/pandas | A powerful data analysis toolkit for Python that provides flexible and expressive data structures for efficient data manipulation and analysis. | 43,807
lux-org/lux | A Python library that automates data exploration by recommending visualizations and suggesting next steps based on user interest | 5,213
blaze/blaze | A Python library that translates familiar NumPy/Pandas-like syntax into database query language | 3,187
nixtla/statsforecast | An implementation of widely used time series forecasting models in Python | 3,990
mwaskom/seaborn | A high-level interface for statistical data visualization | 12,575
lancedb/lance | A modern columnar data format for machine learning and large language models. | 3,956
explosion/spacy | Industrial-strength NLP library for Python and Cython | 30,230
finos/perspective | A component for creating interactive analytics and data visualization applications with support for large datasets and streaming queries. | 8,530
ibis-project/ibis | A Python library for working with dataframes across multiple databases and backends | 5,306
apache/datafusion-ballista-python | Bindings for using Apache Arrow's query engine in Python to analyze and manipulate large datasets | 33
mckinsey/vizro | A toolkit for creating modular data visualization applications with a focus on simplicity and flexibility. | 2,707
vega/altair | A Python library for creating declarative statistical visualizations with a simple and consistent API. | 9,391
vispy/vispy | A high-performance interactive 2D/3D data visualization library for Python | 3,326
holoviz/datashader | Automates the process of creating meaningful representations of large data sets | 3,323