zr-obp
Off-policy eval
A framework for off-policy evaluation and learning in multi-armed bandit algorithms
Open Bandit Pipeline: a Python library for bandit algorithms and off-policy evaluation
648 stars
88 watching
89 forks
Language: Python
Last commit: 9 months ago
Linked from 1 awesome list
contextual-bandits, datasets, multi-armed-bandits, off-policy-evaluation, research
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A toolkit for evaluating and analyzing off-policy policy estimation methods in reinforcement learning | 61 |
| | Implementation of a conditional imitation learning policy in PyTorch for autonomous driving using the Carla dataset | 65 |
| | A Rust SDK to evaluate Open Policy Agent policies in WebAssembly format | 50 |
| | An evaluation tool for co-saliency detection tasks | 97 |
| | This library provides tools and algorithms for estimating the distribution correction in off-policy reinforcement learning problems | 99 |
| | Develops an interpretable evaluation procedure for off-policy evaluation (OPE) methods to quantify their sensitivity to hyper-parameter choices and/or evaluation policy choices | 31 |
| | A Python SDK for executing and managing Open Policy Agent policies in WebAssembly format | 10 |
| | Evaluates policies against user credentials and properties to determine access permissions | 59 |
| | Evaluates Ontology-Based Data Access systems with inference and meta knowledge benchmarking | 4 |
| | An open-source policy search algorithm for robotics that uses Gaussian processes to model robot dynamics and accounts for uncertainty | 64 |
| | A Python package for parsing and processing AWS IAM policies and statements | 427 |
| | Provides exploratory data and algorithms for offline reinforcement learning in various control domains | 105 |
| | Standardized framework for creating and sharing incident response processes in a shared language | 151 |
| | A comprehensive Python toolbox for evaluating salient object detection and camouflaged object detection tasks | 168 |
| | An implementation of a reinforcement learning algorithm using multi-branch architecture and Deep Deterministic Policy Gradients (DDPG) to control autonomous vehicles in simulation environments | 81 |