emLam

Language model script

Preprocessing and modeling scripts for Hungarian language modeling, written in Python with TensorFlow.

3 stars

1 watching

2 forks

Language: Python

last commit: over 6 years ago

Linked from 1 awesome list

Tags: language-modeling, paper, python, tensorflow

Backlinks from these awesome lists:

oroszgy/awesome-hungarian-nlp

Related projects:

Repository	Description	Stars
nytud/emtsv	A text processing system designed to handle various tasks in Hungarian language processing using Python and TSV-based data exchange.	28
nytud/emmorph	A Hungarian morphological analyzer built on a finite-state grammar.	14
nytud/machine-translation	Provides machine translation models and a demo site for Hungarian language translations	5
ppke-nlpg/emmorphpy	A Python wrapper and lemmatizer for emMorph, a Hungarian morphological analyzer.	3
nytud/panmorph	Harmonized tagset and annotation scheme for Hungarian morphological analysers	4
nytud/hucola	A collection of 9,076 annotated sentences in Hungarian to evaluate linguistic acceptability and grammaticality	1
nytud/hadifogoly-adatbazis	An attempt to transcribe Cyrillic text into Hungarian script for a large dataset of WWII prisoner-of-war records	23
nytud/hunlp-gate	A collection of Hungarian NLP tools integrated as GATE processing resources	8
yfzhang114/slime	Develops large multimodal models for high-resolution understanding and analysis of text, images, and other data types.	143
ermlab/politbert	Trains a language model using a RoBERTa architecture on high-quality Polish text data	33
nytud/quntoken	A C++ tokenizer for Hungarian text	14
jalammar/ecco	An interactive visualization library for exploring and understanding transformer-based language models	1,986
nytud/nytk-nerkor	A Hungarian named-entity-annotated corpus of 1 million tokens, with morphological annotation layers and various source files.	15
davidnemeskey/embert	Provides pre-trained transformer-based models and tools for natural language processing tasks	2
vhellendoorn/code-lms	A guide to using pre-trained large language models in source code analysis and generation	1,789