MultilingualCorporaExtractor

Corpora extractor

Extracts multilingual corpora from international Bibles and formats them as XML, JSON, and HTML files for analysis.

A node.io spider for extracting multilingual corpora


0 stars
3 watching
0 forks
Language: JavaScript
Last commit: over 11 years ago
Linked from 1 awesome list



Related projects:

| Repository | Description | Stars |
| --- | --- | --- |
| fielddb/lexiconwebservicesample | A Node.js web server implementing a lexicon API for the Drag and Drop FieldLinguistics project | 1 |
| fielddb/dictionarychromeextension | Provides a Chrome extension and associated server for accessing definitions from Wiktionary | 6 |
| fielddb/lucenerevolution-2013 | This project provides demo examples and tools for exploring linguistic features in Lucene and Solr, two popular search engine technologies | 0 |
| danburzo/hred | Extracts data from HTML or XML documents to JSON using a CSS selector-like query language | 69 |
| fielddb/corpuswebservice | Enables CORS requests to connect to CouchDB from other domains | 0 |
| fielddb/fielddb | An app for managing and sharing text and audio data in various contexts, adaptable to users' terminology and I-Language | 79 |
| nissl-lab/toxy | A .NET framework for extracting text from various document formats across multiple platforms | 359 |
| fielddb/lex4all | Tool for automating pronunciation lexicon creation for low-resource languages using speech recognition and machine learning algorithms | 1 |
| fielddb/fielddblexicon | A web-based interface for browsing and editing lexical data in FieldDB databases | 0 |
| lastcallmedia/composerextrafiles | Allows dependencies to be downloaded with specific files extracted and installed during package installation | 0 |
| mainmatter/ember-intl-analyzer | Identifies unused translations in Ember.js projects to help maintain consistency and accuracy of internationalization | 48 |
| fielddb/lexiconwebservice | A Node.js service that uses a morphological analyzer to generate morphemes and glosses for words | 0 |
| fielddb/languageclassdashboard | A web application dashboard for tracking language learning metrics and statistics | 0 |
| knowitall/reverb | Extracts binary relationships from English sentences at scale | 543 |
| fielddb/octothorpe | A CouchDB-powered wiki application with a jQuery interface | 0 |