vaex
Data explorer
A high-performance Python library for streaming and exploring large tabular datasets.
Out-of-Core hybrid Apache Arrow/NumPy DataFrame for Python, ML, visualization and exploration of big tabular data at a billion rows per second 🚀
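The "out-of-core" and "memory-mapped" ideas in the description can be illustrated with a minimal sketch. It uses only the Python standard library and is not vaex's API or implementation. The helper names `write_column` and `chunked_mean` are hypothetical, and the raw float64 file layout is an assumption chosen for simplicity. The point is that an aggregate can be computed over a file on disk while only one chunk is held in memory at a time.

```python
import array
import mmap
import os
import tempfile


def write_column(path, values):
    """Write values as raw float64 to a binary file (a single 'column')."""
    with open(path, "wb") as f:
        array.array("d", values).tofile(f)


def chunked_mean(path, chunk_rows=1024):
    """Mean of a float64 column, reading the memory-mapped file chunk by chunk.

    Assumes a non-empty file written by write_column (mmap cannot map an empty file).
    """
    itemsize = array.array("d").itemsize
    chunk_bytes = chunk_rows * itemsize
    total, count = 0.0, 0
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for start in range(0, len(mm), chunk_bytes):
            # Only this slice is copied into Python memory; the OS pages in the rest on demand.
            chunk = array.array("d")
            chunk.frombytes(mm[start:start + chunk_bytes])
            total += sum(chunk)
            count += len(chunk)
    return total / count


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "column.bin")
    write_column(demo_path, [float(i) for i in range(10_000)])
    print(chunked_mean(demo_path))
```

A real out-of-core DataFrame library adds columnar formats, lazy expressions, and parallel evaluation on top of this basic pattern; the sketch shows only the chunked, memory-mapped read.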
8k stars
145 watching
591 forks
Language: Python
last commit: 4 months ago
Linked from 3 awesome lists
Tags: bigdata, data-science, dataframe, hdf5, machine-learning, machinelearning, memory-mapped-file, pyarrow, python, tabular-data, visualization
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A binary data frame storage system that enables efficient and interoperable data sharing across multiple programming languages. | 2,742 |
| | A powerful data analysis toolkit for Python that provides flexible and expressive data structures for efficient data manipulation and analysis. | 44,052 |
| | A Python library that automates data exploration by recommending visualizations and suggesting next steps based on user interest. | 5,226 |
| | A Python library that translates familiar NumPy/Pandas-like syntax into database query language. | 3,185 |
| | A fast and accurate Python library for forecasting time series data using various statistical and econometric models. | 4,045 |
| | A high-level interface for statistical data visualization. | 12,669 |
| | A modern columnar data format for machine learning and large language models. | 4,010 |
| | Industrial-strength NLP library for Python and Cython. | 30,459 |
| | A component for creating interactive analytics and data visualization applications with support for large datasets and streaming queries. | 8,669 |
| | A portable Python library for working with dataframes across multiple backends. | 5,390 |
| | Bindings for using Apache Arrow's query engine in Python to analyze and manipulate large datasets. | 34 |
| | A low-code toolkit for building high-quality data visualization apps using Python. | 2,736 |
| | A declarative statistical visualization library for Python. | 9,441 |
| | A high-performance interactive 2D/3D data visualization library for Python. | 3,334 |
| | Automates the process of creating meaningful representations of large data sets. | 3,336 |