hungarian-text-mining-workshop

Text Mining Workshop

An introductory text mining workshop using Python and spaCy to extract insights from Hungarian text data

Materials for the Text Mining workshop held at the HuNLP meetup in June 2017

20 stars
3 watching
5 forks
Language: Jupyter Notebook
Last commit: over 2 years ago
Linked from 1 awesome list

Topics: classification, hungarian, information-extraction, keyword-extraction, machine-learning, meetup, natural-language-processing, nlp, python, scikit-learn, sentiment-analysis, spacy, spacy-models, text-mining, text-mining-workshop, textacy, tutorial, workshop

Related projects:

Repository | Description | Stars
huspacy/huspacy | An industrial-strength natural language processing library for Hungarian language text analysis | 158
sergioburdisso/pyss3 | A Python package implementing an interpretable machine learning model for text classification with visualization tools | 336
explosion/spacy-stanza | Wraps the Stanza NLP library to use Stanford models with spaCy | 726
sedthh/lara-hungarian-nlp | A lightweight Python library for natural language processing in Hungarian | 29
jsksxs360/how-to-use-transformers | A comprehensive guide to using the Transformers library for natural language processing tasks | 1,220
proger/uk4b | Develops pretraining and finetuning techniques for language models using metadata-conditioned text generation | 18
amakukha/stemmers_ukrainian | A novel stemmer for the Ukrainian language trained with AI | 28
chartbeat-labs/textacy | A Python library providing NLP tools and utilities built on top of spaCy for text processing and analysis | 2,217
eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text | 1,638
explosion/spacy-lookups-data | Provides additional data resources for spaCy's natural language processing capabilities | 98
dask/dask-ml | A Python library for scalable machine learning using Dask alongside popular ML libraries | 907
tyson925/magyarlanc_spark | A Spark-based tool for processing Hungarian text data with Magyarlanc language processing features and optional integration with ElasticSearch | 4
ermlab/politbert | Trains a language model using a RoBERTa architecture on high-quality Polish text data | 33
kefirski/bytenet | A PyTorch implementation of a neural network model for machine translation | 47
nytud/emtsv | A text processing system designed to handle various tasks in Hungarian language processing using Python and TSV-based data exchange | 28