cascalog

Data processor

A Clojure library for processing and querying large datasets without requiring Hadoop expertise.

Data processing on Hadoop without the hassle.

GitHub

1k stars
80 watching
178 forks
Language: Clojure
Last commit: over 1 year ago

Related projects:

Repository | Description | Stars
scicloj/tablecloth | A dataset manipulation library built on top of tech.ml.dataset, providing a simplified API for data processing and analysis. | 308
dkogan/vnlog | A toolkit for manipulating tabular ASCII data with normal UNIX tools. | 161
ndmitchell/cmdargs | A Haskell library for building command line applications with minimal code. | 91
techascent/tech.ml.dataset | A Clojure library for efficient tabular data processing and analysis. | 687
nysol/mcmd | A set of commands for high-speed processing of large-scale CSV data. | 33
tweag/haskellr | An environment for efficient data processing using Haskell or R code. | 585
zepgram/module-multi-threading | A module that enables parallel processing of large data sets in Magento 2 using multiple child processes. | 80
snoyberg/conduit | A framework for handling and transforming streaming data in a consistent and efficient way. | 903
kapolos/pramda | A PHP implementation of functional programming concepts to simplify data processing and analysis. | 245
netflix/pigpen | A map-reduce framework for Clojure that compiles to Apache Pig or Cascading without requiring prior knowledge of those systems. | 567
travitch/datalog | A Haskell implementation of Datalog, allowing recursive queries in a logic language. | 102
apache/samza | A distributed stream processing framework for handling high-volume data streams with fault tolerance and durability guarantees. | 817
hashrock/deno-fnparse | A parser combinator library for Deno that provides a simple way to parse CSV data. | 11
sodiumjoe/lobar | A command-line wrapper around lodash's chain method for functional data processing. | 28
reubano/meza | A lightweight toolkit for processing tabular data with a focus on functional programming and PyPy compatibility. | 417