refinery
Data annotation toolkit
A tool to help data scientists manage and annotate natural language data for training AI models
The data scientist's open-source choice to scale, assess and maintain natural language data. Treat training data like a software artifact.
1k stars
18 watching
69 forks
Language: Python
Last commit: 2 months ago
Linked from 2 awesome lists
Topics: active-learning, annotations, artificial-intelligence, data-centric-ai, data-labeling, data-science, deep-learning, human-in-the-loop, labeling, labeling-tool, machine-learning, natural-language-processing, neural-search, nlp, python, spacy, supervised-learning, text-annotation, text-classification, transformers
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A collection of natural language processing models and tools for collaboration on a joint project between BAAI and JDAI. | 254 |
| | A Python framework for building deep learning models with optimized encoding layers and batch normalization. | 2,044 |
| | An automated machine learning toolkit with visualization and feature engineering capabilities | 40 |
| | An efficient AutoML system that automates the machine learning lifecycle | 53 |
| | A web-based annotation tool for natural language processing (NLP) | 520 |
| | Provides training materials and tools for building machine learning applications | 72 |
| | A web-based annotation tool designed to facilitate intuitive and fast creation of text-bound and relational annotations. | 1,831 |
| | A collection of Go-based resources and tools for data science tasks | 879 |
| | A collection of machine learning models and tools for real-time time series data analytics and anomaly detection | 168 |
| | A guide to using pre-trained large language models in source code analysis and generation | 1,789 |
| | A question answering annotation platform with features like text input, user management and scoring | 87 |
| | A collaborative annotation toolkit for massive amounts of image data used in connectomics and neuroscience research | 188 |
| | Provides a toolbox of components to extend PyTorch Lightning for deep learning research and production | 1,700 |
| | A PyTorch-based toolkit for creating customized multimedia datasets and handling heterogeneous data for training AI models. | 346 |
| | A toolbox of AI modules written in Swift for various machine learning tasks and algorithms | 794 |