cnminlangwebcollect
Language detection and website collection tool
Detects languages of Chinese minority websites and collects them into a dataset.
Detection of the languages used on Chinese minority websites, and collection of those websites
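
As a rough illustration of the detect-then-collect idea described above, here is a minimal sketch. It is not taken from the repository: the names `SCRIPT_RANGES`, `dominant_script`, and `collect` are hypothetical, and it uses Unicode block ranges as a coarse stand-in for real language identification. Script is only a proxy for language (Arabic script, for example, is shared by Uyghur and Kazakh), so an actual tool would likely need a trained language-ID model on top of this.

```python
from collections import Counter

# Hypothetical mapping from script label to Unicode block ranges.
# These are standard Unicode blocks; the labels are illustrative only.
SCRIPT_RANGES = {
    "tibetan": [(0x0F00, 0x0FFF)],
    "mongolian": [(0x1800, 0x18AF)],
    "arabic": [(0x0600, 0x06FF)],   # script shared by several languages
    "yi": [(0xA000, 0xA48F)],
    "hangul": [(0xAC00, 0xD7AF)],
    "han": [(0x4E00, 0x9FFF)],
}


def script_of(ch):
    """Return the script label for one character, or None if unmapped."""
    cp = ord(ch)
    for name, ranges in SCRIPT_RANGES.items():
        for lo, hi in ranges:
            if lo <= cp <= hi:
                return name
    return None


def dominant_script(text):
    """Return the most frequent mapped script in text, or None."""
    counts = Counter(s for s in map(script_of, text) if s is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def collect(pages):
    """Turn (url, text) pairs into dataset rows tagged with a script label."""
    return [{"url": url, "script": dominant_script(text)} for url, text in pages]
```

Calling `collect` on pairs of URL and already-extracted page text yields one row per page; fetching and HTML stripping are deliberately left out of the sketch.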
1 star
2 watching
8 forks
Language: Python
last commit: about 4 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
pemistahl/lingua | An accurate language detection library for Java and the JVM suitable for both short and long text inputs. | 707 |
hashwin/scylla | A Ruby-based language detection tool that uses N-Gram based text categorization to identify the language of given text. | 36 |
vseloved/wiki-lang-detect | Uses Wikipedia data to identify the language of unstructured text | 31 |
unlyed/universal-language-detector | Detects and resolves the language used in user requests | 95 |
hanzhenlei767/nlp_learn | A comprehensive collection of NLP-related code snippets and notes on various models and techniques, including pre-trained language models and Chinese text processing methods. | 25 |
pemistahl/lingua-go | A library that accurately detects the language of short to long text inputs without requiring external APIs or configuration. | 1,190 |
minibikini/paasaa | Tools for detecting the language of unstructured text in Elixir applications | 115 |
jingzhang617/cod-rank-localize-and-segment | Develops a system to detect, segment, and rank camouflaged objects in images. | 74 |
greyblake/whatlang-rs | A Rust library for detecting the language of text, including script recognition and reliability estimation. | 970 |
olivomarco/lc4j | An open-source Java library implementing text categorization and language detection using N-grams. | 5 |
abadojack/whatlanggo | A library for detecting and identifying languages in text | 643 |
alvations/sugali | A system designed to identify the language of an arbitrary text string using machine learning and multiple data sources. | 2 |
hemangsk/capacitor-mlkit-language | An Android and iOS plugin using ML Kit for language identification on device | 3 |
detectlanguage/detectlanguage-go | A Go client for detecting the language of given text and interacting with the Detect Language API | 25 |
cisnlp/glotlid | A language identification model that supports over 2000 languages and can be used for various NLP tasks. | 90 |