PoLitBert

Polish LLM

Trains a language model using a RoBERTa architecture on high-quality Polish text data

Polish RoBERTa model trained on Polish literature, Wikipedia, and OSCAR. The working assumption is that high-quality training text yields a good model.

GitHub

33 stars

11 watching

3 forks

Language: Python

last commit: about 4 years ago

Linked from 1 awesome list

Topics: nlp, polish, roberta, text-corpus

Backlinks from these awesome lists:

ksopyla/awesome-nlp-polish

Related projects:

Repository	Description	Stars
ermlab/pl-sentiment-analysis	A Python library providing an API for sentiment analysis of Polish text using deep learning and Word2vec models	27
german-nlp-group/german-transformer-training	Trains German transformer models to improve language understanding	23
kldarek/polbert	A Polish BERT-based language model trained on various corpora for natural language processing tasks	70
allegro/herbert	A BERT-based language model pre-trained on Polish corpora for understanding the Polish language	65
l0sg/relational-rnn-pytorch	An implementation of DeepMind's Relational Recurrent Neural Networks (Santoro et al. 2018) in PyTorch for word language modeling	245
kefirski/bytenet	A PyTorch implementation of a neural network model for machine translation	47
ncbi-nlp/bluebert	Pre-trained language models for biomedical natural language processing tasks	560
sdadas/polish-nlp-resources	Pre-trained models and resources for Natural Language Processing in Polish	329
bilibili/index-1.9b	A lightweight, multilingual language model with a long context length	920
rdspring1/pytorch_gbw_lm	Trains a large-scale PyTorch language model on the 1-Billion Word dataset	123
luogen1996/lavin	An open-source implementation of a vision-language instructed large language model	513
peremartra/large-language-model-notebooks-course	A practical course teaching large language models and their applications through hands-on projects using the OpenAI API and the Hugging Face library	1,338
brightmart/xlnet_zh	Trains a large Chinese language model on massive data and provides a pre-trained model for downstream tasks	230
balavenkatesh3322/nlp-pretrained-model	A collection of pre-trained natural language processing models	170
shawn-ieitsystems/yuan-1.0	Large-scale language model with improved performance on NLP tasks through distributed training and efficient data processing	591