mrjob

Job runner

Enables running MapReduce jobs on various environments

Run MapReduce jobs on Hadoop or Amazon Web Services
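
For readers new to the model, here is a minimal sketch of the map, shuffle, and reduce steps that a MapReduce word count goes through. It is plain Python and does not use mrjob's API. The names `mapper`, `reducer`, and `run_local` are hypothetical and exist only for this illustration. A real job runner would distribute these steps across a Hadoop or AWS cluster instead of running them in one process.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.split():
        yield word.lower(), 1


def reducer(word, counts):
    """Combine all counts emitted for a single word."""
    yield word, sum(counts)


def run_local(lines):
    """Run map, shuffle (group by key), and reduce in a single process."""
    # Shuffle step: collect every mapped value under its key.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            grouped[key].append(value)

    # Reduce step: process keys in sorted order, one key at a time.
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


if __name__ == "__main__":
    print(run_local(["the quick brown fox", "the lazy dog"]))
```

The mapper and reducer are kept as separate generator functions because that split is what lets a runner execute them on different machines, with the grouping step in between.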

GitHub

3k stars
109 watching
587 forks
Language: Python
Last commit: over 1 year ago
Linked from 3 awesome lists


Related projects:

Repository | Description | Stars
kcrandall/emr_spark_automation | Automates deployment of an AWS EMR cluster and execution of Spark jobs | 8
vincentclaes/datajob | Automates end-to-end machine learning pipeline deployment with AWS services | 110
spinnaker/spinnaker | A platform for managing software releases across multiple cloud environments in a safe and efficient manner | 9,341
mljar/mercury | Converts Jupyter Notebooks to interactive web applications with customizable widgets and re-execution of cells on change | 4,044
gradio-app/gradio | Enables rapid creation and deployment of web applications for machine learning models and functions using Python | 33,962
ml-tooling/opyrator | Automates conversion of machine learning code into production-ready microservices with web API and GUI | 3,102
frappe/erpnext | A comprehensive enterprise resource planning system built on top of the Frappe Framework and Python | 21,890
mrpowers-io/quinn | PySpark helper functions to maximize developer productivity | 643
mechanicalsoup/mechanicalsoup | Automates interaction with websites by simulating browser behavior and handling HTTP sessions and document navigation | 4,672
cloudquery/cloudquery | An open-source ELT framework that enables data movement between any source and destination using high-performance data ingestion and processing | 5,877
joblib/joblib-spark | Enables parallelization of machine learning tasks on a distributed Spark cluster using the joblib library | 242
aws/chalice | A Python framework for building serverless applications on AWS using Lambda and other services | 10,665
spark-jobserver/spark-jobserver | Provides a RESTful interface for managing Apache Spark jobs and contexts | 2,840
mhausenblas/mrlin | Maps RDF data into HBase for scalable storage and processing of Linked Data | 17
cloudtools/troposphere | A Python library to generate AWS CloudFormation descriptions in JSON or YAML format | 4,931