hundict
Bilingual dictionary extractor
A Python tool for extracting bilingual dictionaries from parallel corpora.
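To illustrate what extracting a dictionary from a parallel corpus can look like, here is a minimal sketch that scores word pairs by how often they co-occur in aligned sentences, using the Dice coefficient. This is not hundict's actual code or API: the function name `extract_dictionary`, the `min_score` threshold, and the toy Hungarian-English corpus are all assumptions made for the example.

```python
from collections import Counter
from itertools import product


def extract_dictionary(sentence_pairs, min_score=0.5):
    """Return {source_word: (target_word, dice_score)} for the best-scoring pairs.

    Hypothetical helper for illustration; not part of hundict.
    """
    src_counts = Counter()
    tgt_counts = Counter()
    pair_counts = Counter()

    for src_sentence, tgt_sentence in sentence_pairs:
        # Count each word at most once per sentence.
        src_words = set(src_sentence.lower().split())
        tgt_words = set(tgt_sentence.lower().split())
        src_counts.update(src_words)
        tgt_counts.update(tgt_words)
        # Every source word co-occurs with every target word in the aligned pair.
        pair_counts.update(product(src_words, tgt_words))

    best = {}
    for (src, tgt), together in pair_counts.items():
        # Dice coefficient: 2 * |A and B| / (|A| + |B|), ranging from 0 to 1.
        dice = 2 * together / (src_counts[src] + tgt_counts[tgt])
        if dice >= min_score and dice > best.get(src, ("", 0.0))[1]:
            best[src] = (tgt, dice)
    return best


# Toy aligned corpus (assumed data, Hungarian to English).
corpus = [
    ("a kutya fut", "the dog runs"),
    ("a macska fut", "the cat runs"),
    ("a kutya alszik", "the dog sleeps"),
]
dictionary = extract_dictionary(corpus)
print(dictionary["kutya"])  # ('dog', 1.0)
```

Real corpora need tokenization, frequency cutoffs, and handling of one-to-many translations, none of which this sketch attempts.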
22 stars
5 watching
2 forks
Language: Python
Last commit: over 10 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
juditacs/wikt2dict | Tool to parse and process Wiktionary translation data for dictionary creation | 53 |
eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text | 1,638 |
danieljdufour/date-extractor | A Python library that extracts dates from plain text | 65 |
szegedai/hun-date-parser | A Python package for extracting datetime intervals from Hungarian sentences and converting date objects to text | 8 |
xyntopia/pydoxtools | A Python library for extracting information from unstructured documents using AI techniques and customizable pipelines | 78 |
gamallo/galextra | A multi-language term extractor that uses morphosyntax tagging and filtering to identify multi-word terms from plain text input | 2 |
tchayintr/best2010_cooker | Extracts segmented words from the Thai BEST2010 corpus | 2 |
fox-it/dissect.target | Provides a programming API and command line tools to access various data sources inside disk images or file collections | 48 |
thunlp/thulac-python | An efficient Chinese lexical analyzer with morphological analysis capabilities | 2,032 |
belgianbiodiversityplatform/python-dwca-reader | A tool to parse and retrieve biodiversity data from archived files | 45 |
danburzo/hred | Extracts data from HTML or XML documents to JSON using a CSS selector-like query language | 70 |
zaataylor/wikiref | An extension that extracts and edits Wikipedia references with ease | 2 |
51j0/android-storage-extractor | A tool to extract local data storage of an Android application in one click | 16 |
csababarta/ntdsxtract | A Python-based tool for extracting and analyzing data from Windows domain controllers to aid in Active Directory forensic investigations | 321 |
eset-la/lord-of-the-strings | A tool to extract and classify relevant strings from binary files | 9 |