COBS

Policy evaluation toolkit

A toolkit for evaluating and analyzing off-policy policy estimation methods in reinforcement learning

OPE tools based on the "Empirical Study of Off Policy Policy Estimation" paper.

GitHub

61 stars

3 watching

14 forks

Language: Python

Last commit: almost 3 years ago

Linked from 1 awesome list

Backlinks from these awesome lists:

hanjuku-kaso/awesome-offline-rl

Related projects:

Repository	Description	Stars
zzhanghub/eval-co-sod	An evaluation tool for co-saliency detection tasks	97
st-tech/zr-obp	A framework for off-policy evaluation and learning in multi-armed bandit algorithms	648
sony/pyieoe	Develops an interpretable evaluation procedure for off-policy evaluation (OPE) methods to quantify their sensitivity to hyper-parameter choices and/or evaluation policy choices.	31
onlytailei/carla_cil_pytorch	Implementation of a conditional imitation learning policy in PyTorch for autonomous driving using the Carla dataset.	65
lartpang/pysodevaltoolkit	A comprehensive Python toolbox for evaluating salient object detection and camouflaged object detection tasks	168
cbfinn/gps	An implementation of guided policy search and LQG-based trajectory optimization for reinforcement learning	599
psecio/propauth	Evaluates policies against user credentials and properties to determine access permissions.	59
albermax/innvestigate	A toolbox to help understand neural networks' predictions by providing different analysis methods and a common interface.	1,271
cmlplatform/pycirk	Software to model Circular Economy policy and technological interventions in Environmental Extended Input-Output Analysis	20
hkust-nlp/ceval	An evaluation suite providing multiple-choice questions for foundation models in various disciplines, with tools for assessing model performance.	1,650
nci/scores	A collection of tools and functions for evaluating and optimizing forecasts and models in various scientific fields.	85
gzcch/bingo	An analysis project investigating limitations of visual language models in understanding and processing images with potential biases and interference challenges.	53
mhubii/ppo_libtorch	An implementation of the proximal policy optimization algorithm in PyTorch.	73
princeton-nlp/charxiv	An evaluation suite for assessing chart understanding in multimodal large language models.	85
krrishdholakia/betterprompt	An API for evaluating the quality of text prompts used in Large Language Models (LLMs) based on perplexity estimation	43