MultilingualCorporaExtractor
Corpora extractor
Extracts multilingual corpora from international Bibles and formats them as XML, JSON, and HTML files for analysis.
A Node.io spider for extracting multilingual corpora
0 stars
3 watching
0 forks
Language: JavaScript
Last commit: over 11 years ago
Linked from 1 awesome list
Related projects:
Repository | Description | Stars |
---|---|---|
fielddb/lexiconwebservicesample | A Node.js web server implementing a lexicon API for the Drag and Drop FieldLinguistics project | 1 |
fielddb/dictionarychromeextension | Provides a Chrome extension and associated server for accessing definitions from Wiktionary | 6 |
fielddb/lucenerevolution-2013 | Demo examples and tools for exploring linguistic features in the Lucene and Solr search engines | 0 |
danburzo/hred | Extracts data from HTML or XML documents to JSON using a CSS selector-like query language | 69 |
fielddb/corpuswebservice | Enables CORS requests to connect to CouchDB from other domains | 0 |
fielddb/fielddb | An app for managing and sharing text and audio data in various contexts, adaptable to users' terminology and I-Language. | 79 |
nissl-lab/toxy | A .NET framework for extracting text from various document formats across multiple platforms. | 359 |
fielddb/lex4all | Tool for automating pronunciation lexicon creation for low-resource languages using speech recognition and machine learning algorithms. | 1 |
fielddb/fielddblexicon | A web-based interface for browsing and editing lexical data in FieldDB databases | 0 |
lastcallmedia/composerextrafiles | Allows dependencies to be downloaded with specific files extracted and installed during package installation | 0 |
mainmatter/ember-intl-analyzer | Identifies unused translations in Ember.js projects to help maintain consistency and accuracy of internationalization. | 48 |
fielddb/lexiconwebservice | A Node.js service that uses a morphological analyzer to generate morphemes and glosses for words | 0 |
fielddb/languageclassdashboard | A web application dashboard for tracking language learning metrics and statistics. | 0 |
knowitall/reverb | Extracts binary relationships from English sentences at scale | 543 |
fielddb/octothorpe | A CouchDB-powered wiki application with a jQuery interface. | 0 |