hive

Data warehouse tool

A software project that enables data warehousing and the management of large datasets using SQL.

Apache Hive

GitHub

6k stars
327 watching
5k forks
Language: Java
Last commit: 6 days ago
Linked from 2 awesome lists

apache, big-data, database, hadoop, hive, java, sql

Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| apache/hudi | Manages large analytical datasets on distributed storage systems by enabling incremental processing and snapshot isolation | 5,429 |
| apache/shardingsphere | Enables data sharding, scaling, and encryption across multiple databases | 19,966 |
| apache/kyuubi | An Apache project providing a distributed and multi-tenant gateway to enable serverless SQL on data warehouses and lakehouses | 2,105 |
| apache/kylin | An OLAP engine designed to handle Big Data with sub-second query latency and seamless integration with BI tools | 3,653 |
| apache/cassandra | A highly scalable, partitioned row store that allows flexible data distribution and organization | 8,858 |
| apache/hbase | Provides a distributed, versioned column-oriented data storage system | 5,225 |
| apache/datafusion | A query engine that supports various data formats and allows customization of its functionality | 6,298 |
| dbeaver/dbeaver | A multi-platform tool for connecting to and managing various databases | 40,507 |
| crate/crate | A distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time | 4,119 |
| apache/ignite | A distributed, in-memory database system for high-performance computing and data processing | 4,814 |
| apache/drill | A distributed query layer for Hadoop and NoSQL data storage systems, supporting various query languages | 1,945 |
| apache/arrow | A toolkit for efficient data interchange and in-memory analytics in various languages | 14,590 |
| apache/datafusion-ballista | A distributed SQL query engine built on Apache Arrow and Rust, designed to provide efficient columnar processing and low memory usage | 1,544 |
| apache/iotdb | A time-series data management system for industrial IoT applications | 5,618 |
| apache/tez | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 479 |