udpipe
Text parser
A trainable pipeline for tokenization, tagging, lemmatization, and parsing of annotated text data
UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
367 stars
28 watching
77 forks
Language: C++
Last commit: 3 months ago
Linked from 1 awesome list
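
For readers unfamiliar with the CoNLL-U files named in the description, here is a minimal Python sketch of reading that format. It is not UDPipe code and does not use the UDPipe API; the function name `parse_conllu` and the sample sentence are made up for illustration. The sketch assumes only the basic layout of the CoNLL-U format: ten tab-separated columns per token line, `#` comment lines, and a blank line after each sentence.

```python
# Column names defined by the CoNLL-U format (ten tab-separated fields).
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def parse_conllu(text):
    """Split CoNLL-U text into sentences, each a list of token dicts.

    Hypothetical helper for illustration; not part of UDPipe.
    """
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
        elif line.startswith("#"):
            # Comment or sentence-level metadata; skipped in this sketch.
            continue
        else:
            cols = line.split("\t")
            if len(cols) != len(CONLLU_FIELDS):
                raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
            current.append(dict(zip(CONLLU_FIELDS, cols)))
    if current:
        # Handle input that lacks a trailing blank line.
        sentences.append(current)
    return sentences


# Made-up sample sentence, used only to exercise the parser.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)

if __name__ == "__main__":
    for token in parse_conllu(SAMPLE)[0]:
        print(token["id"], token["form"], token["lemma"], token["upos"], token["deprel"])
```

Multiword-token ranges (IDs such as `1-2`) and empty nodes (IDs such as `1.1`) are kept as ordinary rows here; a fuller reader would treat them separately.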
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A tool for processing and analyzing Danish text data using a pre-trained language model. | 7 |
| | A tokeniser for natural language text that separates words from punctuation and supports basic preprocessing steps such as case changing | 66 |
| | A lightweight C++ library for building compile-time processing pipelines with customizable filters and branches. | 67 |
| | Provides a set of pre-defined data processing pipelines for pandas DataFrames. | 718 |
| | Automated machine learning system for selecting promising models or pipelines for new datasets | 82 |
| | A template-based text parsing library | 353 |
| | Parser for Logstash pipeline configuration files | 3 |
| | A non-blocking streaming codec for Unicode encoding schemes | 32 |
| | A header-only C++14 library for building expressive data pipelines using a chainable interface. | 808 |
| | Provides tools for part of speech tagging and lemmatization across multiple languages using machine learning models. | 18 |
| | An OCaml library for segmenting Unicode text into grapheme clusters, words, and sentences. | 23 |
| | A compact, incremental text parsing library for Scala that enables efficient and functional processing of structured data | 359 |
| | A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments | 89 |
| | A C/C++ library that provides secure and efficient Unicode algorithms for text processing | 285 |
| | A fast and spec-compliant URL parser written in C++ for use in various Node.js-based systems | 1,396 |