ldc-word-aligner
Word aligner
A Python-based tool for manually annotating word alignments in parallel texts.
The LDC Word Aligner is a Python-based tool for manually annotating word alignments, also known as gold-standard alignments. It requires sentence-segmented parallel texts as input.
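To make the input and output concrete, here is a minimal sketch of how sentence-segmented parallel text and a manual word alignment might be represented. It assumes whitespace tokenization and the widely used `i-j` index-pair notation (source token index, then target token index); the page does not state which formats the LDC Word Aligner actually reads or writes, so the function names and sample sentences below are hypothetical.

```python
from typing import List, Set, Tuple

# One alignment is a set of (source_index, target_index) pairs, 0-based.
Alignment = Set[Tuple[int, int]]


def parse_alignment(line: str) -> Alignment:
    """Parse space-separated 'i-j' pairs, e.g. '0-0 1-2' (assumed notation)."""
    pairs = set()
    for token in line.split():
        src, tgt = token.split("-")
        pairs.add((int(src), int(tgt)))
    return pairs


def format_alignment(pairs: Alignment) -> str:
    """Serialize an alignment back to 'i-j' pairs in sorted order."""
    return " ".join(f"{s}-{t}" for s, t in sorted(pairs))


def aligned_words(source: str, target: str, pairs: Alignment) -> List[Tuple[str, str]]:
    """Return the aligned (source word, target word) pairs for one sentence pair."""
    src_tokens = source.split()
    tgt_tokens = target.split()
    return [(src_tokens[s], tgt_tokens[t]) for s, t in sorted(pairs)]


# Hypothetical sample data: one sentence per entry, same order in both languages.
source_sentences = ["das Haus ist klein"]
target_sentences = ["the house is small"]
gold = [parse_alignment("0-0 1-1 2-2 3-3")]

# Sentence-segmented input means the lists line up one-to-one.
assert len(source_sentences) == len(target_sentences) == len(gold)

word_pairs = aligned_words(source_sentences[0], target_sentences[0], gold[0])
```

Storing index pairs rather than word pairs keeps the alignment unambiguous when a word occurs more than once in a sentence.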
2 stars
3 watching
0 forks
Language: Python
last commit: almost 7 years ago
Linked from 1 awesome list
Related projects:
| Description | Stars |
|---|---|
| A fast and simple unsupervised word aligner for generating parallel corpus alignments. | 740 |
| An implementation of the CRF autoencoder framework for tasks in natural language processing and machine translation | 21 |
| A tool for aligning speech with text by analyzing audio and providing an output transcript | 1,471 |
| A library that enables character-by-character text alignment in real time | 46 |
| Automates the process of extracting parallel sentences from comparable corpora to aid in statistical machine translation | 127 |
| An application and library for aligning text with flexible formatting options. | 84 |
| Enhances text alignment in Wagtail richtext editors with support for block entities. | 4 |
| A framework for training and evaluating large language models on long context inputs | 230 |
| A C++ implementation of a word alignment tool with multi-threading and incremental training capabilities for machine translation. | 161 |
| Developing tools and scripts to extract data from low-resource languages, focusing on language processing and machine learning applications. | 2 |
| A toolkit providing ready-to-use parallel text sentence alignment tools for multiple language pairs. | 18 |
| A lightweight library for calculating sequence alignment using edit distance | 517 |
| An LDA-based text clustering pipeline for multiple languages | 82 |
| Software to align and reconstruct 3D structures from overlapping spatial transcriptomics data | 31 |
| A software package for aligning single-cell spatial omics data using deep learning and graph neural networks | 83 |