FakeNewsCorpus
News corpus
A large dataset of news articles with labeled categories for training fake-news recognition algorithms.
A dataset of millions of news articles scraped from a curated list of data sources.
387 stars
16 watching
97 forks
last commit: almost 5 years ago
tags: artificial-intelligence, corpus, database, dataset, fakenews, machine-learning, natural-language-processing, nlp
Related projects:
Repository | Description | Stars |
---|---|---|
cluebenchmark/cluecorpus2020 | A large-scale pre-training corpus for Chinese language models | 925 |
christos-c/bible-corpus | A multilingual parallel corpus created from translations of the Bible. | 176 |
chatopera/insuranceqa-corpus-zh | An insurance industry conversation corpus with pre-processed data for natural language processing and question answering tasks. | 1,020 |
nytud/hucopa | A dataset of Hungarian translations of English 'cause-and-effect' questions with plausible alternative answers | 1 |
poltextlab/hunempoli_corpus | A manually annotated corpus for training and testing machine learning models for Aspect-Based Sentiment Analysis (ABSA) in Hungarian. | 0 |
rifkybujana/fnd | A machine learning-based system to predict whether news articles are fake or not | 8 |
zake7749/gossiping-chinese-corpus | A collection of question-answer pairs extracted from online Chinese forums. | 238 |
rowanz/grover | A framework for defending against neural fake news through both generation and detection of fake news articles. | 917 |
certainlyio/corona_dataset | A collection of data to train chatbots on COVID-19-related questions | 11 |
blairconrad/selfinitializingfakes | A framework for creating reusable fake objects with persistent behavior after the initial setup | 11 |
bertez/corpora | A collection of Galician language data in JSON format. | 2 |
cyberboysumanjay/inshorts-news-api | An unofficial API to fetch news content from Inshorts using Flask and Python. | 226 |
ibm/max-news-text-generator | Generates English-language text similar to news articles using machine learning and natural language processing techniques. | 26 |
jbaiter/archiscribe-corpus | A repository of transcribed 19th century German texts from various sources. | 8 |
josecannete/spanish-corpora | A collection of unannotated Spanish text data, compiled from various sources and processed for natural language processing tasks. | 92 |