datafusion-ballista-python

Data analysis library

Python bindings for Ballista, the distributed query engine built on Apache Arrow DataFusion, for analyzing and manipulating large datasets

Apache Arrow Ballista Python bindings

GitHub

34 stars
35 watching
8 forks
Language: Shell
Last commit: 12 months ago
Topics: arrow, big-data, dataframe, distributed, olap, python, query-engine, rust, sql

Related projects:

Repository | Description | Stars
apache/datafusion-ballista | Distributed query engine for Apache DataFusion applications | 1,580
apache/datafusion-python | A Python library that provides a data processing and querying framework using the Apache Arrow in-memory query engine | 385
googleapis/python-bigtable | Provides a Python interface to interact with the Google Cloud Bigtable NoSQL database service | 68
paradedb/pg_analytics | Enables direct querying of data lakes from Postgres without moving data to a cloud data warehouse | 407
dialoguemd/fastapi-sqla | An extension for FastAPI that simplifies interaction with SQLAlchemy databases | 229
apache/calcite-avatica-go | An implementation of the Avatica/Phoenix SQL driver in Go | 120
osuked/elexondataportal | A Python wrapper for retrieving data from the Elexon/BMRS API | 52
apache/spark | An analytics engine designed to handle large-scale data processing and analysis | 40,170
datastax/python-driver | A Python client library for interacting with Apache Cassandra databases | 1,393
apache/arrow | A toolkit for efficient data interchange and in-memory analytics in various languages | 14,728
googleapis/python-bigquery-pandas | Provides an interface to Google BigQuery from pandas data structures | 451
apache/datafusion | A query engine that supports various data formats and allows customization of its functionality | 6,462
jwkvam/bowtie | An interactive dashboard library for Python that lets users create web-based data exploration tools without needing to know web frameworks or JavaScript | 766
apache/tez | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 482