mrjob

Job runner

A Python library for running MapReduce jobs on Hadoop or Amazon Web Services

GitHub

Stars: 3k
Watching: 109
Forks: 587
Language: Python
Last commit: almost 2 years ago
Linked from 3 awesome lists


Related projects:

Repository | Description | Stars
kcrandall/emr_spark_automation | Automates deployment of an AWS EMR cluster and execution of Spark jobs | 8
vincentclaes/datajob | Automates end-to-end machine learning pipeline deployment with AWS services | 111
spinnaker/spinnaker | A platform for managing software releases across multiple cloud environments in a safe and efficient manner. | 9,366
mljar/mercury | Converts Jupyter Notebooks to interactive web applications with customizable widgets and re-execution of cells on change. | 4,071
gradio-app/gradio | Enables rapid creation and deployment of web applications for machine learning models and functions using Python | 34,557
ml-tooling/opyrator | Automates conversion of machine learning code into production-ready microservices with web API and GUI. | 3,116
frappe/erpnext | An all-in-one business management system that integrates various functions such as accounting, inventory management, and project tracking in a single platform. | 22,243
mrpowers-io/quinn | Pyspark helper functions to maximize developer productivity | 651
mechanicalsoup/mechanicalsoup | Automates interaction with websites by simulating browser behavior and handling HTTP sessions and document navigation. | 4,685
cloudquery/cloudquery | An open-source ELT framework that enables data movement between any source and destination using high-performance data ingestion and processing | 5,913
joblib/joblib-spark | Enables parallelization of machine learning tasks on a distributed Spark cluster using the joblib library. | 243
aws/chalice | A Python framework for building serverless applications on AWS using Lambda and other services | 10,690
spark-jobserver/spark-jobserver | Provides a RESTful interface for managing Apache Spark jobs and contexts | 2,839
mhausenblas/mrlin | Maps RDF data into HBase for scalable storage and processing of Linked Data | 17
cloudtools/troposphere | A Python library to generate AWS CloudFormation descriptions in JSON or YAML format | 4,933