wikipron
pronunciation scraper
A tool for extracting and processing multilingual pronunciation data from Wiktionary.
Massively multilingual pronunciation mining
321 stars
18 watching
71 forks
Language: Python
last commit: 2 months ago
Linked from 1 awesome list
computational-linguistics, g2p, language, linguistics, nlp, phonetics, phonology, pronunciation, python-api, scraped-data, speech
Related projects:
Repository | Description | Stars |
---|---|---|
fielddb/lex4all | Tool for automating pronunciation lexicon creation for low-resource languages using speech recognition and machine learning algorithms. | 1 |
lex4all/lex4all | Software tool to generate pronunciation lexicons for low-resource languages using speech recognition and machine learning algorithms. | 21 |
phonologicalcorpustools/corpustools | A collection of tools and libraries for analyzing and processing phonological data in various languages | 113 |
macr0dev/audiobooks.bundle | A metadata agent that scrapes audiobook metadata from Audible.com and integrates it with Plex media servers. | 605 |
ytsvetko/str2ipa | A tool for phonetic transcription of languages with close-to-phonetic writing systems | 10 |
analyzeplatypus/translitkit | A Ruby framework for converting Hebrew text to English using phoneme maps | 7 |
ukplab/linspector | A framework to interpret multilingual NLP models and understand their word representations. | 23 |
ibm/max-chinese-phonetic-similarity-estimator | Estimates phonetic similarity between Chinese words and suggests similar-sounding candidates | 35 |
huspacy/huspacy | An industrial-strength natural language processing library for Hungarian text analysis | 155 |
vchahun/gv-crawl | Automates text extraction and alignment from Global Voices articles to create parallel corpora for low-resource languages. | 9 |
bgutter/cl-phonetic | Provides phonetic pattern matching functionality in Common Lisp to aid with natural language processing and text analysis. | 24 |
khrystyna-skopyk/ukr_spell_check | Spelling correction system for the Ukrainian language using a noisy channel model | 3 |
prosodylab/prosodylab.alignertools | A package of scripts to prepare data for use in Prosodylab-Aligner by cleaning and relabeling transcriptions and generating orthography-based dictionaries. | 12 |
eyurtsev/kor | Extracts structured data from unstructured text using large language models | 1,629 |
synyi/poplar | A web-based annotation tool for natural language processing (NLP) | 519 |