galeXtra

Term extractor

A multilingual term extractor that uses morphosyntactic tagging and filtering to identify multiword terms in plain-text input.
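The tag-then-filter idea described above can be illustrated with a minimal sketch. This is not galeXtra's own code (the project is written in Shell). The tag names follow the Universal POS convention, and the `PATTERNS` list, the `extract_terms` function, and the hand-tagged sample sentence are all assumptions made for illustration; a real pipeline would obtain the tags from a part-of-speech tagger.

```python
from typing import List, Tuple

TaggedToken = Tuple[str, str]  # (word, coarse part-of-speech tag)

# Hypothetical tag sequences accepted as term candidates.
# These are illustrative only, not galeXtra's actual filter set.
PATTERNS = [
    ("ADJ", "NOUN"),
    ("NOUN", "NOUN"),
    ("NOUN", "ADP", "NOUN"),
]


def extract_terms(tagged: List[TaggedToken]) -> List[str]:
    """Return every token window whose tag sequence matches a pattern."""
    terms = []
    for start in range(len(tagged)):
        for pattern in PATTERNS:
            window = tagged[start:start + len(pattern)]
            if len(window) != len(pattern):
                continue  # window runs past the end of the sentence
            if all(tag == want for (_, tag), want in zip(window, pattern)):
                terms.append(" ".join(word.lower() for word, _ in window))
    return terms


# Hand-tagged input standing in for the output of a real tagger.
sentence = [
    ("The", "DET"), ("neural", "ADJ"), ("network", "NOUN"),
    ("learns", "VERB"), ("word", "NOUN"), ("embeddings", "NOUN"),
    ("from", "ADP"), ("text", "NOUN"),
]

print(extract_terms(sentence))
```

Matching on tag sequences rather than on word lists is what lets the same filter logic be reused across languages: only the tagger and the pattern set need to change.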

Multiword extractor for Portuguese, English, Spanish, Galician, and French.

2 stars
2 watching
1 fork
Language: Shell
last commit: over 8 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| gamallo/citiussentiment | A Perl-based sentiment analysis tool for analyzing text in multiple languages. | 7 |
| dwisiswant0/galer | A tool to extract URLs from HTML attributes by crawling in and evaluating JavaScript | 253 |
| zoomio/tagify | An application that extracts keywords from text sources | 38 |
| theacharya/markersextractor | A tool and library for extracting metadata from Final Cut Pro FCPXML data export format. | 37 |
| recrm/archivetools | A collection of tools for extracting and analyzing data from web archives | 69 |
| darccio/pipar | A tool for extracting and processing data from political parties' registries | 3 |
| eset-la/lord-of-the-strings | A tool to extract and classify relevant strings from binary files | 9 |
| aymericbeaumet/squeeze | A tool to extract relevant information from text | 17 |
| limiu82214/gojmapr | A library to extract specific properties from complex JSON structures into Go structs with minimal code changes. | 22 |
| eyurtsev/kor | Extracts structured data from unstructured text using large language models | 1,629 |
| apertium/apertium-glg | A package providing linguistic data for Galician language analysis and generation. | 0 |
| gmarty/xgettext | Tools for extracting translatable strings from source code written in template languages. | 77 |
| ftramer/lm_memorization | A tool to extract memorized content from large language models like GPT-2 by analyzing their training data | 175 |
| gamallo/deppattern | A Perl-based dependency parsing system for multiple Romance languages, including grammar compiler and parser generator. | 10 |
| pxyup/fitter | A utility for extracting and processing data from various sources, including APIs, websites, and static text | 119 |