wikibert

Language model library

Provides pre-trained language models derived from Wikipedia texts for natural language processing tasks

BERT models for many languages created from Wikipedia texts

GitHub

34 stars
12 watching
1 fork
last commit: over 4 years ago
Linked from 1 awesome list


Related projects:

| Repository | Description | Stars |
|---|---|---|
| langboat/mengzi | Develops lightweight yet powerful pre-trained models for natural language processing tasks | 533 |
| dbmdz/berts | Provides pre-trained language models for natural language processing tasks | 155 |
| certainlyio/nordic_bert | Provides pre-trained BERT models for Nordic languages with limited training data | 164 |
| thunlp/openclap | A repository of pre-trained language models for natural language processing tasks in Chinese | 977 |
| zhuiyitechnology/wobert | A word-based Chinese BERT model trained on large-scale text data using pre-trained models as a foundation | 460 |
| zhuiyitechnology/pretrained-models | A collection of pre-trained language models for natural language processing tasks | 989 |
| ethan-yt/guwenbert | Pre-trained language model for classical Chinese texts using RoBERTa architecture | 511 |
| baai-wudao/model | A repository of pre-trained language models for various tasks and domains | 121 |
| apache/opennlp-models | Provides pre-trained binary models for natural language text processing across multiple languages | 4 |
| ncbi-nlp/bluebert | Pre-trained language models for biomedical natural language processing tasks | 560 |
| kldarek/polbert | A Polish BERT-based language model trained on various corpora for natural language processing tasks | 70 |
| ibm-granite/granite-3.0-language-models | A collection of lightweight state-of-the-art language models designed to support multilinguality, coding, and reasoning tasks on constrained resources | 232 |
| ymcui/macbert | Improves pre-trained Chinese language models by incorporating a correction task to alleviate inconsistency issues with downstream tasks | 646 |
| peleiden/daluke | A language model trained on Danish Wikipedia data for named entity recognition and masked language modeling | 9 |
| nttcslab-nlp/doc_lm | Source files and training scripts for language models | 12 |