COBS

Policy evaluation toolkit

A toolkit for evaluating and analyzing off-policy policy evaluation (OPE) methods in reinforcement learning

OPE tools based on the paper "Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning".

GitHub

61 stars
3 watching
14 forks
Language: Python
Last commit: over 2 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| zzhanghub/eval-co-sod | An evaluation tool for co-saliency detection tasks | 96 |
| st-tech/zr-obp | A framework for off-policy evaluation and learning in multi-armed bandit algorithms | 645 |
| sony/pyieoe | An interpretable evaluation procedure for off-policy evaluation (OPE) methods that quantifies their sensitivity to hyper-parameter choices and/or evaluation policy choices | 31 |
| onlytailei/carla_cil_pytorch | An implementation of a conditional imitation learning policy in PyTorch for autonomous driving using the CARLA dataset | 66 |
| lartpang/pysodevaltoolkit | A comprehensive Python toolbox for evaluating salient object detection and camouflaged object detection tasks | 167 |
| cbfinn/gps | An implementation of guided policy search and LQG-based trajectory optimization for reinforcement learning | 598 |
| psecio/propauth | Evaluates policies against user credentials and properties to determine access permissions | 59 |
| albermax/innvestigate | A toolbox to help understand neural networks' predictions by providing different analysis methods and a common interface | 1,265 |
| cmlplatform/pycirk | Software to model Circular Economy policy and technological interventions in Environmentally Extended Input-Output Analysis | 20 |
| hkust-nlp/ceval | An evaluation suite providing multiple-choice questions for foundation models in various disciplines, with tools for assessing model performance | 1,636 |
| nci/scores | A comprehensive package for evaluating and optimizing forecasts and models in various scientific fields | 67 |
| gzcch/bingo | An analysis project investigating limitations of visual language models in understanding and processing images with potential biases and interference challenges | 53 |
| mhubii/ppo_libtorch | An implementation of the proximal policy optimization algorithm in PyTorch | 73 |
| princeton-nlp/charxiv | An evaluation suite for assessing chart understanding in multimodal large language models | 75 |
| krrishdholakia/betterprompt | An API for evaluating the quality of text prompts used in Large Language Models (LLMs) based on perplexity estimation | 38 |