colibri-core
Pattern extractor library
Tools and algorithms for efficient pattern extraction from large corpus data
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dynamic size) in a quick and memory-efficient way. At the core is the tool colibri-patternmodeller, which allows you to build, view, manipulate and query pattern models.
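To make the terms concrete, here is a minimal plain-Python sketch of what n-grams and fixed-size skipgrams look like on a tokenized sentence. It does not use Colibri Core's API and makes no claim about how the library stores or counts patterns; the function names `ngrams` and `skipgrams` and the `GAP` placeholder are assumptions chosen for illustration, and dynamic-size gaps are not covered.

```python
from collections import Counter
from itertools import combinations

# Placeholder for a skipped token position (illustrative choice, not library API).
GAP = "{*}"


def ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def skipgrams(tokens, n):
    """Return n-token patterns with one or more interior tokens replaced by GAP.

    The first and last token of each pattern are always kept, so every gap
    has a fixed size and is anchored on both sides.
    """
    interior = range(1, n - 1)  # positions that may become gaps
    patterns = []
    for gram in ngrams(tokens, n):
        for k in range(1, n - 1):  # how many positions to skip: 1 .. n-2
            for gaps in combinations(interior, k):
                patterns.append(
                    tuple(GAP if i in gaps else tok for i, tok in enumerate(gram))
                )
    return patterns


tokens = "to be or not to be".split()
bigram_counts = Counter(ngrams(tokens, 2))     # ("to", "be") occurs twice
skip_counts = Counter(skipgrams(tokens, 3))    # e.g. ("to", "{*}", "or")
```

A naive approach like this keeps every pattern as a tuple of strings, which is exactly the kind of memory cost a dedicated pattern model is meant to avoid on large corpora.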
124 stars
12 watching
20 forks
Language: C++
last commit: 2 months ago
Linked from 2 awesome lists
c-plus-plus, computational-linguistics, corpus, library, linguistics, ngram, ngrams, nlp, pattern-recognition, python, skipgram, text-processing
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A Python binding to a C++ NLP tool for Dutch language processing tasks | 47 |
| | A Python library for natural language processing tasks, including text manipulation and analysis. | 479 |
| | A DSL for building custom NLP patterns from manual language rules | 65 |
| | A morphological analyzer and generator for Russian and Ukrainian languages | 1,127 |
| | A Python package implementing an interpretable machine learning model for text classification with visualization tools | 336 |
| | A toolset for analyzing and identifying patterns in compiled code from various architectures. | 439 |
| | An implementation of a computational model for linguistic analysis based on cognitive inspiration | 1 |
| | A set of tooling plugins built on top of the LLVM/Clang compiler infrastructure to analyze and improve C/C++ code quality. | 102 |
| | A Ruby library providing an interface to Large Language Model (LLM) providers for text generation and embedding | 1,487 |
| | A collection of software tools for linguists to manage and analyze linguistic data | 13 |
| | An open-source Python package for analyzing scRNA-Seq data using PCA-based latent spaces | 39 |
| | A collection of Python scripts for analyzing and interacting with Cobalt Strike beacons. | 168 |
| | A comprehensive Python library for parsing and processing FoLiA documents used in Natural Language Processing. | 18 |
| | A Python package for processing high-dimensional data from microscopy imaging experiments | 82 |
| | Analyze and classify errors in coreference resolution output | 29 |