hungarian-text-mining-workshop
Text Mining Workshop
An introductory text mining workshop using Python and spaCy to extract insights from Hungarian text data
Materials for the Text Mining workshop held at the HuNLP meetup, June 2017
20 stars
3 watching
5 forks
Language: Jupyter Notebook
last commit: over 2 years ago
Linked from 1 awesome list
Tags: classification, hungarian, information-extraction, keyword-extraction, machine-learning, meetup, natural-language-processing, nlp, python, scikit-learn, sentiment-analysis, spacy, spacy-models, text-mining, text-mining-workshop, textacy, tutorial, workshop
Related projects:
Repository | Description | Stars |
---|---|---|
huspacy/huspacy | An industrial-strength natural language processing library for Hungarian language text analysis | 158 |
sergioburdisso/pyss3 | A Python package implementing an interpretable machine learning model for text classification with visualization tools | 336 |
explosion/spacy-stanza | Wraps the Stanza NLP library to use Stanford models with spaCy | 726 |
sedthh/lara-hungarian-nlp | A lightweight Python library for natural language processing in Hungarian | 29 |
jsksxs360/how-to-use-transformers | A comprehensive guide to using the Transformers library for natural language processing tasks | 1,220 |
proger/uk4b | Develops pretraining and finetuning techniques for language models using metadata-conditioned text generation | 18 |
amakukha/stemmers_ukrainian | A novel stemmer for the Ukrainian language trained with AI | 28 |
chartbeat-labs/textacy | A Python library providing NLP tools and utilities built on top of spaCy for text processing and analysis | 2,217 |
eyurtsev/kor | An open-source wrapper around LLMs to extract structured data from text | 1,638 |
explosion/spacy-lookups-data | Provides additional data resources for spaCy's natural language processing capabilities | 98 |
dask/dask-ml | A Python library for scalable machine learning using Dask alongside popular ML libraries | 907 |
tyson925/magyarlanc_spark | A Spark-based tool for processing Hungarian text data with Magyarlanc language processing features and optional integration with Elasticsearch | 4 |
ermlab/politbert | Trains a language model using a RoBERTa architecture on high-quality Polish text data | 33 |
kefirski/bytenet | A PyTorch implementation of a neural network model for machine translation | 47 |
nytud/emtsv | A text processing system designed to handle various tasks in Hungarian language processing using Python and TSV-based data exchange | 28 |