ldc-word-aligner

Word aligner

The LDC Word Aligner is a Python-based tool for annotating manual word alignments (also called gold-standard alignments) in parallel texts. It requires sentence-segmented parallel texts as input.
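
To make the terms concrete, here is a minimal sketch of what sentence-segmented parallel text and a gold-standard word alignment can look like as data. It does not use the LDC Word Aligner itself, and the tool's actual file formats are not described here. The sketch assumes one sentence per entry on each side and the "i-j" link notation (source index, target index, 0-based) produced by aligners such as fast_align. The function names `parse_alignment` and `aligned_words` are hypothetical.

```python
from typing import List, Set, Tuple

# An alignment is a set of (source_index, target_index) links, 0-based.
Alignment = Set[Tuple[int, int]]


def parse_alignment(line: str) -> Alignment:
    """Parse whitespace-separated 'i-j' links into a set of index pairs.

    Assumption: 'i-j' notation with 0-based token indices; this is not
    confirmed to be the format the LDC Word Aligner reads or writes.
    """
    links: Alignment = set()
    for token in line.split():
        src, tgt = token.split("-")
        links.add((int(src), int(tgt)))
    return links


def aligned_words(source: str, target: str, alignment: Alignment) -> List[Tuple[str, str]]:
    """Return the (source word, target word) pairs named by the alignment."""
    src_tokens = source.split()
    tgt_tokens = target.split()
    return [(src_tokens[i], tgt_tokens[j]) for i, j in sorted(alignment)]


# Sentence-segmented parallel text: entry k on one side translates entry k on the other.
source_sentences = ["das Haus ist klein"]
target_sentences = ["the house is small"]

# One manually annotated alignment per sentence pair.
gold_alignments = [parse_alignment("0-0 1-1 2-2 3-3")]

for src, tgt, links in zip(source_sentences, target_sentences, gold_alignments):
    print(aligned_words(src, tgt, links))
```

Storing links as a set of index pairs keeps one-to-many and many-to-one alignments easy to express, since a token index may appear in several links.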

Stars: 2
Watching: 3
Forks: 0
Language: Python
Last commit: over 6 years ago
Linked from 1 awesome list

Related projects:

Repository | Description | Stars
clab/fast_align | A fast and simple unsupervised word aligner for generating parallel corpus alignments. | 740
ldmt-muri/alignment-with-openfst | An implementation of the CRF autoencoder framework for tasks in natural language processing and machine translation. | 21
lowerquality/gentle | A tool for aligning speech with text by analyzing audio and providing an output transcript. | 1,471
josefnpat/reflowprint | A library that enables character-by-character text alignment in real time. | 46
machinalis/yalign | Automates the process of extracting parallel sentences from comparable corpora to aid in statistical machine translation. | 127
guitarbum722/align | An application and library for aligning text with flexible formatting options. | 84
nigel2392/wagtail_text_alignment | Enhances text alignment in Wagtail richtext editors with support for block entities. | 4
thudm/longalign | A framework for training and evaluating large language models on long context inputs. | 230
moses-smt/mgiza | A C++ implementation of a word alignment tool with multi-threading and incremental training capabilities for machine translation. | 161
richardlitt/lrl | Tools and scripts to extract data from low-resource languages, focusing on language processing and machine learning applications. | 2
lowresourcelanguages/champollion | A toolkit providing ready-to-use parallel text sentence alignment tools for multiple language pairs. | 18
martinsos/edlib | A lightweight library for calculating sequence alignment using edit distance. | 517
artificiai/multilingual-latent-dirichlet-allocation-lda | An LDA-based text clustering pipeline for multiple languages. | 82
raphael-group/paste2 | Software to align and reconstruct 3D structures from overlapping spatial transcriptomics data. | 31
gao-lab/slat | A software package for aligning single-cell spatial omics data using deep learning and graph neural networks. | 83