HerBERT
Polish NLP model
A BERT-based language model pre-trained on Polish corpora for Polish language understanding.
HerBERT is a BERT-based language model trained on Polish corpora using only the MLM objective with dynamic masking of whole words.
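Because HerBERT is pre-trained only with masked language modeling, the most direct way to probe it is a fill-mask query. Below is a minimal sketch using the Hugging Face `transformers` pipeline, assuming the `allegro/herbert-base-cased` checkpoint published on the Hub (swap in the large variant or a local copy as needed):

```python
# Minimal sketch: probing HerBERT's MLM head with a fill-mask query.
# Assumes the "allegro/herbert-base-cased" checkpoint from the
# Hugging Face Hub is available.
from transformers import pipeline

# The fill-mask pipeline loads the tokenizer and the MLM head together.
fill_mask = pipeline("fill-mask", model="allegro/herbert-base-cased")

# "Stolicą Polski jest <mask>." = "The capital of Poland is <mask>."
text = f"Stolicą Polski jest {fill_mask.tokenizer.mask_token}."

# Print the top predicted tokens for the masked position with scores.
for prediction in fill_mask(text):
    print(prediction["token_str"], round(prediction["score"], 3))
```

Using `fill_mask.tokenizer.mask_token` rather than a hard-coded string keeps the example correct regardless of which mask token the checkpoint's tokenizer defines.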
65 stars
7 watching
5 forks
last commit: almost 3 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| dbmdz/berts | Provides pre-trained language models for natural language processing tasks | 155 |
| kldarek/polbert | A Polish BERT-based language model trained on various corpora for natural language processing tasks | 70 |
| langboat/mengzi | Develops lightweight yet powerful pre-trained models for natural language processing tasks | 534 |
| sdadas/polish-nlp-resources | Pre-trained models and resources for natural language processing in Polish | 323 |
| ermlab/politbert | Trains a language model with a RoBERTa architecture on high-quality Polish text | 33 |
| certainlyio/nordic_bert | Provides pre-trained BERT models for Nordic languages with limited training data | 161 |
| tonianelope/multilingual-bert | Investigates multilingual language models for named entity recognition in German and English | 14 |
| turkunlp/wikibert | Provides pre-trained language models derived from Wikipedia texts for natural language processing tasks | 34 |
| laurentmazare/ocaml-bert | Implements BERT-like NLP models in OCaml using PyTorch bindings and pre-trained weights from popular sources | 23 |
| deeppavlov/slavic-bert-ner | A shared BERT model for NER tasks in Slavic languages, pre-trained on Bulgarian, Czech, Polish, and Russian texts | 73 |
| allenai/scibert | A BERT model trained on scientific text for natural language processing tasks | 1,526 |
| dfki-nlp/gevalm | Evaluates German transformer language models with syntactic agreement tests | 7 |
| zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks | 987 |
| ymcui/pert | Develops a pre-trained language model that learns semantic knowledge from permuted text without mask labels | 354 |
| tal-tech/edu-bert | A pre-trained language model designed to improve natural language processing tasks in education | 187 |