datafu
Hadoop data processing library
A collection of libraries for working with large-scale data in Hadoop, providing incremental processing capabilities and user-defined functions.
Hadoop library for large-scale data processing, now an Apache Incubator project
584 stars
75 watching
134 forks
Language: Java
last commit: over 10 years ago

Related projects:

Repository | Description | Stars |
---|---|---|
apache/datafu | A collection of libraries for data mining and statistics in large-scale Hadoop environments | 118 |
datasalt/pangool | A Java framework that simplifies Hadoop's MapReduce API to build efficient data processing pipelines | 57 |
linkedinattic/cleo | A flexible library for enabling rapid development of typeahead search functionality | 565 |
linkedinattic/kamikaze | A utility package implementing set operations and compression algorithms for efficient document searching in search engines. | 22 |
lacuna/bifurcan | A Java library providing efficient, functional data structures with customizable equality semantics and high performance. | 967 |
apache/tez | A system that enables flexible data processing pipelines using a low-level engine for higher-level frameworks | 479 |
linkeddata/rdflib.js | A JavaScript library for working with RDF data in various formats and querying RDF stores | 566 |
dfianthdl/dfhdl | A programming language and library for describing dataflow-based digital hardware in a high-level, object-oriented way | 80 |
frappe/datatable | A JavaScript library for displaying and editing tabular data in a modern and interactive way | 1,042 |
twitter/scalding | A Scala library for specifying and executing MapReduce jobs in Hadoop | 3,500 |
rbrahul/gofp | A utility library providing common functions for working with data structures like slices and maps in Go. | 146 |
linkedin/ambry | A distributed object store designed to handle large amounts of small and large immutable objects with high availability and low latency. | 1,751 |
mhausenblas/mrlin | Maps RDF data into HBase for scalable storage and processing of Linked Data | 17 |
alangrafu/lodspeakr | A framework for creating Linked Data applications using PHP. | 32 |
intentmedia/mario | A library that enables the definition of complex data pipelines in a functional, typesafe, and efficient way using a declarative syntax | 139 |