wikibert

Language model library

Provides pre-trained language models derived from Wikipedia texts for natural language processing tasks

BERT models for many languages created from Wikipedia texts

GitHub

34 stars
12 watching
1 fork
last commit: over 4 years ago
Linked from 1 awesome list

Related projects:

Repository | Description | Stars
langboat/mengzi | Develops lightweight yet powerful pre-trained models for natural language processing tasks | 534
dbmdz/berts | Provides pre-trained language models for natural language processing tasks | 155
certainlyio/nordic_bert | Provides pre-trained BERT models for Nordic languages with limited training data. | 161
thunlp/openclap | A repository of pre-trained language models for natural language processing tasks in Chinese | 979
zhuiyitechnology/wobert | A pre-trained Chinese language model that uses word embeddings and is designed to process Chinese text | 458
zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks | 987
ethan-yt/guwenbert | A pre-trained language model for classical Chinese based on RoBERTa and ancient literature. | 506
baai-wudao/model | A repository of pre-trained language models for various tasks and domains. | 121
apache/opennlp-models | Provides pre-trained models for text processing in various languages | 4
ncbi-nlp/bluebert | Pre-trained language models for biomedical natural language processing tasks | 558
kldarek/polbert | A Polish BERT-based language model trained on various corpora for natural language processing tasks | 70
ibm-granite/granite-3.0-language-models | A collection of lightweight state-of-the-art language models designed to support multilinguality, coding, and reasoning tasks on constrained resources. | 214
ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks | 645
peleiden/daluke | A language model trained on Danish Wikipedia data for named entity recognition and masked language modeling | 9
nttcslab-nlp/doc_lm | This repository contains source files and training scripts for language models. | 12