Danish-Similarity-Dataset

Similarity data

A dataset for evaluating Danish word embeddings

Gold standard resource for evaluation of Danish word embedding models.
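
A gold standard of this kind is typically used by comparing a model's similarity scores for word pairs against the human ratings, usually with Spearman rank correlation. The sketch below illustrates that general procedure only and is not the repository's own evaluation script. The input layout (a list of `(word1, word2, score)` tuples), the function names, the toy vectors, and the toy ratings are all assumptions made up for illustration; none of them come from the dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def evaluate(embeddings, gold_pairs):
    """Return (Spearman rho, number of pairs covered by the model).

    Pairs containing a word missing from `embeddings` are skipped.
    """
    model_scores, human_scores = [], []
    for w1, w2, score in gold_pairs:
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(score)
    return spearman(model_scores, human_scores), len(human_scores)


# Toy data, invented for this example (NOT taken from the dataset).
toy_embeddings = {
    "hund": [1.0, 0.1],
    "kat": [0.9, 0.2],
    "bil": [0.1, 1.0],
    "cykel": [0.2, 0.9],
}
toy_gold = [
    ("hund", "kat", 8.0),
    ("hund", "bil", 2.0),
    ("kat", "cykel", 3.0),
    ("hund", "tog", 4.0),  # "tog" has no vector, so this pair is skipped
]

rho, covered = evaluate(toy_embeddings, toy_gold)
print(f"Spearman rho = {rho:.3f} on {covered} of {len(toy_gold)} pairs")
```

Reporting the number of covered pairs alongside the correlation matters in practice, because a model with a small vocabulary is otherwise scored on an easier subset of the data.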

8 stars
5 watching
0 forks
last commit: over 4 years ago
Linked from 1 awesome list

Topics: danish, embedding-evaluation, manual-annotations, semantic-similarity

Related projects:

| Repository | Description | Stars |
|---|---|---|
| alessandrogianfelici/danish_reviews_dataset | A dataset of Danish reviews scraped from the internet to train sentiment classification models | 2 |
| bplank/danplus | This repository provides code and data for a named entity recognition system for the Danish language, including tools for lexical normalization. | 5 |
| karthikncode/nlp-datasets | A curated list of Natural Language Processing datasets used to train and evaluate NLP models. | 919 |
| sarnikowski/danish_transformers | An open-source collection of Danish language models for natural language processing tasks | 30 |
| dcaribou/transfermarkt-datasets | Extracts, prepares and publishes football dataset from Transfermarkt website | 247 |
| doukremt/distance | Library for comparing sequences of characters with various distance metrics. | 117 |
| dsldk/danish-sentiment-lexicon | A comprehensive lexicon of Danish words with sentiment polarity annotations | 8 |
| bbc/similarity | Calculates similarity between pieces of text using TF-IDF weights | 115 |
| nytud/husst | A dataset and benchmarking kit for evaluating language understanding in Hungarian | 1 |
| lexmag/simetric | Facilities to calculate the distance and similarity between strings using various algorithms | 61 |
| steffan267/sentiment-analysis-on-danish-social-media | This project provides annotated data and guidelines for fine-grained sentiment analysis on Danish social media comments. | 7 |
| rosejn/torch-datasets | A collection of pre-processed machine learning datasets for use with the Torch7 deep learning framework. | 37 |
| kudkudak/word-embeddings-benchmarks | Provides methods for evaluating word embeddings on various benchmarks | 437 |
| cbaggers/mk-string-metrics | Provides efficient algorithms to calculate string similarity metrics | 22 |
| igobronidze/hrs_training_data | Training data for a handwritten recognition system | 20 |