MultilingualCorporaExtractor
Corpora extractor
Extracts multilingual corpora from international Bibles and formats them as XML, JSON, and HTML files for analysis.
Node.io spider for extracting multilingual corpora
0 stars
3 watching
0 forks
Language: JavaScript
Last commit: over 12 years ago
Linked from 1 awesome list
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A Node.js web server implementing a lexicon API for the Drag and Drop FieldLinguistics project | 1 |
| | Provides a Chrome extension and associated server for accessing definitions from Wiktionary | 6 |
| | Demos and examples for utilizing linguistics in natural language processing with Lucene and Solr | 0 |
| | Extracts data from HTML or XML documents to JSON using a CSS selector-like query language | 70 |
| | Enables CORS requests to connect to CouchDB from other domains | 0 |
| | An app for managing and sharing text and audio data in various contexts, adaptable to users' terminology and I-Language. | 79 |
| | A .NET framework for extracting text from various document formats across multiple platforms. | 362 |
| | Tool for automating pronunciation lexicon creation for low-resource languages using speech recognition and machine learning algorithms. | 1 |
| | A web-based interface for browsing and editing lexical data in FieldDB databases | 0 |
| | Allows dependencies to be downloaded with specific files extracted and installed during package installation | 0 |
| | Identifies unused translations in Ember.js projects to help maintain consistency and accuracy of internationalization. | 48 |
| | A Node.js service that uses a morphological analyzer to generate morphemes and glosses for words | 0 |
| | A web application dashboard for tracking language learning metrics and statistics. | 0 |
| | Extracts binary relationships from English sentences at scale | 543 |
| | A CouchDB-powered wiki application with a jQuery interface. | 0 |