folia
Linguistic annotation format
A standardized format for annotating and exchanging linguistic data in XML.
FoLiA (Format for Linguistic Annotation) is a rich XML-based annotation format for representing language resources, including corpora, with linguistic annotations. It supports a wide variety of linguistic annotations, which makes it a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
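To give a rough feel for what an XML-based annotation format of this kind looks like, here is a minimal sketch that reads word tokens and part-of-speech classes from a small, hand-written FoLiA-style fragment using only the Python standard library. The fragment is simplified and is an assumption for illustration: it omits the metadata and annotation declarations a real document needs, it has not been validated against the FoLiA schemas, and the class values `INTJ` and `NOUN` are hypothetical. The helper name `extract_tokens` is also made up for this example. For real work, the dedicated FoLiA library mentioned above is the better choice.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# XML namespace assumed for FoLiA documents; verify against the official schema.
FOLIA_NS = "http://ilk.uvt.nl/folia"
# ElementTree exposes the xml:id attribute under the standard XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified, hand-written fragment (illustrative only, not a valid FoLiA document).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="example">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>Hello</t><pos class="INTJ"/></w>
      <w xml:id="example.s.1.w.2"><t>world</t><pos class="NOUN"/></w>
    </s>
  </text>
</FoLiA>"""


def extract_tokens(xml_text: str) -> List[Tuple[Optional[str], Optional[str], Optional[str]]]:
    """Return (word id, token text, POS class) for every <w> element."""
    root = ET.fromstring(xml_text)
    ns = {"f": FOLIA_NS}
    tokens = []
    for word in root.iterfind(".//f:w", ns):
        text_el = word.find("f:t", ns)   # text content of the word
        pos_el = word.find("f:pos", ns)  # part-of-speech annotation, if present
        tokens.append((
            word.get(XML_ID),
            text_el.text if text_el is not None else None,
            pos_el.get("class") if pos_el is not None else None,
        ))
    return tokens


if __name__ == "__main__":
    for word_id, text, pos in extract_tokens(SAMPLE):
        print(word_id, text, pos)
```

The sketch treats annotations as child elements of the token they describe, which is why a single pass over the `<w>` elements is enough to collect both the text and its part-of-speech class.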
61 stars
13 watching
10 forks
Language: Python
last commit: 9 months ago
Linked from 1 awesome list
Topics: computational-linguistics, corpus, file-format, folia, language, library, linguistic-annotation-framework, linguistics, nlp, python, xml
Related projects:
| Repository | Description | Stars |
|---|---|---|
| | A comprehensive Python library for parsing and processing FoLiA documents used in Natural Language Processing. | 18 |
| | A web-based tool for annotating and managing linguistic documents using the FoLiA format. | 111 |
| | A C++ library for working with linguistic annotation formats | 16 |
| | Provides an HTTP-based backend for annotating and serving FoLiA documents using the FoLiA Query Language. | 6 |
| | A Python library for natural language processing tasks, including text manipulation and analysis. | 479 |
| | A web-based annotation tool for natural language processing (NLP) | 520 |
| | A collection of software tools for linguists to manage and analyze linguistic data | 13 |
| | A Python library for converting linguistic data from various formats into unified annotation graphs. | 18 |
| | Software supporting a lightweight dependency-style annotation language. | 8 |
| | An annotation tool designed to simplify and standardize data annotation processes across various schemas and workflows. | 241 |
| | A collection of small datasets from various languages to test and evaluate NLP scripts | 3 |
| | A Python binding to a C++ NLP tool for Dutch language processing tasks | 47 |
| | A flexible data model and API for representing linguistic data in a language-independent and theory-neutral way. | 15 |
| | Tools for extracting and converting annotations from annotated PDFs and ePubs to Markdown, PDF, PNG, or SVG formats. | 359 |
| | A dictionary file for machine translation between two languages using a specific rule-based machine translation system | 1 |