udpipe
Text parser
A trainable pipeline for tokenization, tagging, lemmatization, and dependency parsing of annotated text data
UDPipe: Trainable pipeline for tokenizing, tagging, lemmatizing and parsing Universal Treebanks and other CoNLL-U files
364 stars
28 watching
77 forks
Language: C++
Last commit: 8 days ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
itunlp/dapipe | A tool for processing and analyzing Danish text data using a pre-trained language model. | 7 |
languagemachines/ucto | A tokeniser for natural language text that separates words from punctuation and supports basic preprocessing steps such as case changing | 65 |
kenavolic/pipet | A lightweight C++ library for building compile-time processing pipelines with customizable filters and branches. | 67 |
pdpipe/pdpipe | A tool for creating and managing data pipelines with pandas DataFrames | 716 |
udellgroup/oboe | Automated machine learning system for selecting promising models or pipelines for new datasets | 82 |
dmulyalin/ttp | A template-based text parsing library | 349 |
tomaskoutek/logstash-pipeline-parser | Parser for Logstash pipeline configuration files | 3 |
dbuenzli/uutf | A non-blocking streaming codec for Unicode encoding schemes | 32 |
joboccara/pipes | A header-only C++14 library for building expressive data pipelines using a chainable interface. | 803 |
ixa-ehu/ixa-pipe-pos | Provides tools for part of speech tagging and lemmatization across multiple languages using machine learning models. | 17 |
dbuenzli/uuseg | An OCaml library for segmenting Unicode text into grapheme clusters, words, and sentences. | 23 |
tpolecat/atto | A compact, incremental text parsing library for Scala that enables efficient and functional processing of structured data | 359 |
ypares/porcupine | A tool that enables data manipulation and analysis pipelines to be flexible, reusable, and reproducible in different environments | 89 |
uni-algo/uni-algo | A C/C++ library that provides secure and efficient Unicode algorithms for text processing | 280 |
ada-url/ada | A fast and spec-compliant URL parser written in C++ | 1,358 |