dumbo

Hadoop tool

A Python module that provides an API for writing and running Hadoop programs more easily.
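
As a rough illustration of the kind of program this module targets, the sketch below defines a word-count mapper and reducer as plain Python generators. The `run_locally` helper is hypothetical and not part of dumbo; it only imitates the map, sort, and reduce phases in memory so the example runs without Hadoop. With dumbo installed, functions of this shape are typically handed to `dumbo.run(mapper, reducer)` instead; check the project's own documentation before relying on that.

```python
from itertools import groupby
from operator import itemgetter


def mapper(key, value):
    # key is the record's position (unused here); value is one line of text.
    for word in value.split():
        yield word, 1


def reducer(key, values):
    # values iterates over every count emitted for this word.
    yield key, sum(values)


def run_locally(lines):
    # Hypothetical stand-in for a Hadoop job: map, sort by key, then reduce.
    mapped = [pair
              for offset, line in enumerate(lines)
              for pair in mapper(offset, line)]
    mapped.sort(key=itemgetter(0))
    results = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        results.extend(reducer(key, (count for _, count in group)))
    return results


if __name__ == "__main__":
    for word, count in run_locally(["a b a", "b c"]):
        print(word, count)
```

Keeping the mapper and reducer as ordinary generators means they can be checked on small inputs before being submitted to a cluster.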

GitHub

1k stars
62 watching
146 forks
Language: Python
last commit: almost 7 years ago
Linked from 1 awesome list


Related projects:

Repository | Description | Stars
bwhite/hadoopy | A Python MapReduce library written in Cython for efficient data processing on clusters. | 243
swyxio/swyxio | A Python project focused on GitHub and DevRel, with the goal of providing resources and support for developers. | 111
damballa/parkour | A Clojure-based library for writing efficient MapReduce programs on the Hadoop platform. | 257
helgeho/hadoopconcatgz | Provides a custom input format for handling concatenated GZIP files in distributed processing systems like Hadoop. | 9
bbva/kapow | An HTTP microframework allowing developers to easily expose scripts as APIs and restrict execution. | 612
swaroopch/byte-of-python | A beginner's guide to the Python programming language. | 2,317
jhamrick/nbflow | Tool that supports reproducible workflows with Jupyter Notebooks and SCons. | 160
mzero/haskell-amuse-bouche | A collection of Haskell code examples and resources illustrating the language's features and programming techniques. | 114
pawegio/kandroid | A Kotlin library that provides useful extensions to eliminate boilerplate code in Android development. | 896
netflix-skunkworks/cloudaux | Provides a unified interface to various cloud providers. | 76
harisekhon/devops-python-tools | A collection of 80+ CLI tools for DevOps, Cloud, Big Data, and Python development. | 773
kwpolska/pkgbuilder | A command-line application for building and managing Arch Linux packages from the AUR. | 71
halcy/mastodon.py | A Python wrapper for the Mastodon API allowing developers to interact with the social media platform's public and private APIs. | 884
jedie/django-kippo | An integration layer for the kippo SSH honeypot with Django's administrative interface. | 12
clusto/clusto | Tool for managing infrastructure clusters by tracking inventory, connections, and abstracting interactions with infrastructure elements. | 291