vaex
Data explorer
A high-performance Python library for streaming and exploring large tabular datasets.
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
8k stars
144 watching
590 forks
Language: Python
Last commit: about 2 months ago
Linked from 3 awesome lists
bigdata, data-science, dataframe, hdf5, machine-learning, machinelearning, memory-mapped-file, pyarrow, python, tabular-data, visualization
Related projects:
Repository | Description | Stars |
---|---|---|
wesm/feather | A binary data frame storage system that enables efficient and interoperable data sharing across multiple programming languages. | 2,742 |
pandas-dev/pandas | A powerful data analysis toolkit for Python that provides flexible and expressive data structures for efficient data manipulation and analysis. | 43,807 |
lux-org/lux | A Python library that automates data exploration by recommending visualizations and suggesting next steps based on user interest | 5,213 |
blaze/blaze | A Python library that translates familiar NumPy/pandas-like syntax into database queries | 3,187 |
nixtla/statsforecast | An implementation of widely used time series forecasting models in Python | 3,990 |
mwaskom/seaborn | A high-level interface for statistical data visualization | 12,575 |
lancedb/lance | A modern columnar data format for machine learning and large language models. | 3,956 |
explosion/spacy | Industrial-strength NLP library for Python and Cython | 30,230 |
finos/perspective | A component for creating interactive analytics and data visualization applications with support for large datasets and streaming queries. | 8,530 |
ibis-project/ibis | A Python library for working with dataframes across multiple databases and backends | 5,306 |
apache/datafusion-ballista-python | Bindings for using Apache Arrow's query engine in Python to analyze and manipulate large datasets | 33 |
mckinsey/vizro | A toolkit for creating modular data visualization applications with a focus on simplicity and flexibility. | 2,707 |
vega/altair | A Python library for creating declarative statistical visualizations with a simple and consistent API. | 9,391 |
vispy/vispy | A high-performance interactive 2D/3D data visualization library for Python | 3,326 |
holoviz/datashader | Automates the process of creating meaningful representations of large data sets | 3,323 |