emLam
Language model script
Preprocessing and modeling scripts for Hungarian language modeling using Python and TensorFlow.
Preprocessing scripts for Hungarian Language Modeling
3 stars
1 watching
2 forks
Language: Python
last commit: over 4 years ago
Linked from 1 awesome list
Topics: language-modeling, paper, python, tensorflow
Related projects:
Repository | Description | Stars |
---|---|---|
nytud/emtsv | A text processing system designed to handle various tasks in Hungarian language processing using Python and TSV-based data exchange. | 28 |
nytud/emmorph | A Hungarian morphological analyzer built on morphology and finite-state grammar. | 14 |
nytud/machine-translation | Machine translation models and a demo site for Hungarian language translation. | 5 |
ppke-nlpg/emmorphpy | A Python wrapper and lemmatizer for emMorph, a Hungarian morphological analyzer. | 3 |
nytud/panmorph | Harmonized tagset and annotation scheme for Hungarian morphological analysers | 4 |
nytud/hucola | A collection of 9,076 annotated sentences in Hungarian to evaluate linguistic acceptability and grammaticality | 1 |
nytud/hadifogoly-adatbazis | An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records | 23 |
nytud/hunlp-gate | A collection of Hungarian NLP tools integrated as GATE processing resources | 8 |
yfzhang114/slime | Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types. | 143 |
ermlab/politbert | Trains a language model using a RoBERTa architecture on high-quality Polish text data | 33 |
nytud/quntoken | A C++ tokenizer for Hungarian text. | 14 |
jalammar/ecco | An interactive visualization library for exploring and understanding transformer-based language models | 1,986 |
nytud/nytk-nerkor | A Hungarian language named entity annotated corpus containing 1 million tokens with morphological annotation layers and various source files. | 15 |
davidnemeskey/embert | Provides pre-trained transformer-based models and tools for natural language processing tasks | 2 |
vhellendoorn/code-lms | A guide to using pre-trained large language models in source code analysis and generation | 1,786 |