checklist
Behavior testing toolkit
A suite of tools and tests for evaluating the behavior of natural language processing models
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
2k stars
29 watching
205 forks
Language: Jupyter Notebook
Last commit: 11 months ago

Related projects:

Repository | Description | Stars |
---|---|---|
marcotcr/anchor | Provides a method to generate explanations for predictions made by any black-box classifier | 798 |
geeklearningio/testavior | A lightweight solution to develop automated tests for ASP.NET Core applications using behavior testing | 41 |
mikegu721/xiezhibenchmark | An evaluation suite to assess language models' performance on multiple-choice questions | 91 |
steinfletcher/apitest | A simple and extensible library for behavioural testing of Go web applications | 796 |
django-behave/django-behave | Provides a way to run Behavior-Driven Development tests in Django applications | 198 |
anthonylloyd/cscheck | A C# random testing library with generators and shrinkers for property-based testing of .NET code | 162 |
libcheck/check | A unit testing framework for C | 1,077 |
kimtaro/ve | A linguistic framework for natural language processing tasks | 216 |
tanprathan/owasp-testing-checklist | A comprehensive security testing checklist based on OWASP guidelines | 1,506 |
bradleyjkemp/cupaloy | Automatically checks test output for changes and fails tests if output differs from previously recorded snapshots | 309 |
jpeg729/pytorch_bits | An experimental framework for developing and testing deep learning models on time-series prediction tasks | 79 |
antonboom/testifylint | A tool that checks the usage of the testify testing framework in Go programs | 101 |
cybertk/abao | Automated testing tool for API documentation written in RAML format | 354 |
blazemeter/taurus | Automates performance and functional tests using a suite of tools | 2,019 |
chirino/mqtt-benchmark | A benchmarking tool for MQTT servers to evaluate performance under various usage scenarios | 121 |