intercode

Code environment framework

An interactive code environment framework for evaluating language agents through execution feedback.

[NeurIPS 2023 D&B] Code repository for the InterCode benchmark: https://arxiv.org/abs/2306.14898

GitHub

198 stars

8 watching

35 forks

Language: Python

Last commit: about 1 year ago

Linked from 1 awesome list

Website: intercode-benchmark.github.io/

Backlinks from these awesome lists:

ethicalml/awesome-production-machine-learning

Related projects:

Repository	Description	Stars
opennlg/openba	A pre-trained language model designed for various NLP tasks, including dialogue generation, code completion, and retrieval.	94
ptone/jiffylab	A web-based teaching environment that provides a standardized and managed Python/Unix development setup.	187
codefuse-ai/codefuse-devops-eval	An evaluation suite for assessing foundation models in the DevOps field.	690
turingapp/turing	An integrated development environment (IDE) for writing pseudocode and Python code.	38
proycon/python-frog	A Python binding to a C++ NLP tool for Dutch language processing tasks.	47
jorgenschaefer/elpy	An Emacs package to provide a comprehensive Python development environment.	1,901
alibaba/alicemind	A collection of pre-trained encoder-decoders and related optimization techniques for natural language processing.	1,986
datafolklabs/cement	An application framework for Python that provides a standard platform for building command line and backend applications with support for rapid development and quality.	1,254
codepothunter/fednp	A framework for non-IID federated learning via neural propagation.	6
princeton-nlp/charxiv	An evaluation suite for assessing chart understanding in multimodal large language models.	85
google/playground-elements	A set of components for creating interactive, editable coding environments with live updating previews.	552
sloria/environs	A library for parsing and managing environment variables in Python applications.	1,227
mahmoud/clastic	A Python web framework that streamlines explicit development practices without global state.	155
qirky/foxdot	A Python-driven environment for interacting with SuperCollider for live coding and audio processing.	1,047
bigcode-project/bigcode-evaluation-harness	A framework for evaluating autoregressive code generation language models in terms of their accuracy and robustness.	846