mrjob
Job runner
A Python library for writing MapReduce jobs and running them in different environments
Run MapReduce jobs on Hadoop or Amazon Web Services
3k stars
109 watching
587 forks
Language: Python
last commit: over 1 year ago
Linked from 3 awesome lists
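For readers new to the model, the MapReduce pattern that such jobs follow can be sketched in plain Python. This is an illustration of the map, shuffle, and reduce phases only, and it does not use mrjob's API. The names `mapper`, `shuffle`, `reducer`, and `run_job` are hypothetical and chosen for this example, and the whole job runs in a single process rather than on Hadoop or AWS.

```python
import re
from collections import defaultdict

# Assumption for this sketch: a "word" is a run of letters, digits, or apostrophes.
WORD_RE = re.compile(r"[\w']+")


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in WORD_RE.findall(line):
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Reduce phase: collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle, and reduce in-process and return {word: count}."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(key, values) for key, values in sorted(grouped.items()))


if __name__ == "__main__":
    print(run_job(["the quick brown fox", "the lazy dog"]))
```

In a real cluster the shuffle step is handled by the framework and the mapper and reducer run on separate machines; the sketch keeps everything local so the data flow is easy to follow.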
Related projects:
Repository | Description | Stars |
---|---|---|
kcrandall/emr_spark_automation | Automates deployment of an AWS EMR cluster and execution of Spark jobs | 8 |
vincentclaes/datajob | Automates end-to-end machine learning pipeline deployment with AWS services | 110 |
spinnaker/spinnaker | A platform for managing software releases across multiple cloud environments in a safe and efficient manner. | 9,341 |
mljar/mercury | Converts Jupyter Notebooks to interactive web applications with customizable widgets and re-execution of cells on change. | 4,044 |
gradio-app/gradio | Enables rapid creation and deployment of web applications for machine learning models and functions using Python | 33,962 |
ml-tooling/opyrator | Automates conversion of machine learning code into production-ready microservices with web API and GUI. | 3,102 |
frappe/erpnext | A comprehensive enterprise resource planning system built on top of the Frappe Framework and Python. | 21,890 |
mrpowers-io/quinn | PySpark helper functions to maximize developer productivity | 643 |
mechanicalsoup/mechanicalsoup | Automates interaction with websites by simulating browser behavior and handling HTTP sessions and document navigation. | 4,672 |
cloudquery/cloudquery | An open-source ELT framework that enables data movement between any source and destination using high-performance data ingestion and processing | 5,877 |
joblib/joblib-spark | Enables parallelization of machine learning tasks on a distributed Spark cluster using the joblib library. | 242 |
aws/chalice | A Python framework for building serverless applications on AWS using Lambda and other services | 10,665 |
spark-jobserver/spark-jobserver | Provides a RESTful interface for managing Apache Spark jobs and contexts | 2,840 |
mhausenblas/mrlin | Maps RDF data into HBase for scalable storage and processing of Linked Data | 17 |
cloudtools/troposphere | A Python library to generate AWS CloudFormation descriptions in JSON or YAML format | 4,931 |