pig
Data processor
A platform for processing and transforming large datasets with a high-level language, Pig Latin, whose scripts are optimized at compile time for efficient execution on distributed computing frameworks.
Mirror of Apache Pig
682 stars
79 watching
451 forks
Language: Java
last commit: 4 months ago
Linked from 1 awesome list
database, java, pig
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A map-reduce framework for Clojure that compiles to Apache Pig or Cascading without requiring prior knowledge of those systems. | 567 |
| | A distributed stream processing framework for handling high-volume data streams with fault tolerance and durability guarantees | 817 |
| | A set of Apache Pig scripts and UDFs for machine learning and natural language processing | 53 |
| | A high-performance real-time analytics database for fast queries and ingest | 13,548 |
| | A high-performance query engine designed to handle large-scale data processing and analytics | 1,164 |
| | Provides a toolkit for natural language text processing tasks using machine learning algorithms in Java. | 1,449 |
| | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 482 |
| | A distributed key/value store with robust data storage and retrieval capabilities | 1,075 |
| | An environment for efficient data processing using Haskell or R code. | 587 |
| | An analytics engine designed to handle large-scale data processing and analysis | 40,170 |
| | An environment for quickly creating scalable machine learning applications | 2,145 |
| | A Clojure-based library for working with dataframes and numerical computations using Python libraries. | 189 |
| | Provides tools and APIs for text processing and analysis on Java-based platforms. | 148 |
| | A Python library that provides a data processing and querying framework using the Apache Arrow in-memory query engine. | 385 |
| | A collection of utilities and examples for processing RDF data using various big-data technologies. | 24 |