folia
Linguistic annotation format
A standardized format for annotating and exchanging linguistic data in XML.
FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based annotation format for the representation of language resources (including corpora) with linguistic annotations. A wide variety of linguistic annotations are supported, making FoLiA a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
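To make the "XML-based annotation format" description concrete, here is a minimal sketch that reads tokens from a small FoLiA-style fragment using only Python's standard library. The sample document is a simplified illustration and has not been validated against the FoLiA schema. A real document would normally also carry metadata and annotation declarations. The helper name `read_tokens` and the `SAMPLE` content are hypothetical. For real work, a dedicated FoLiA library is likely the better choice.

```python
import xml.etree.ElementTree as ET

# FoLiA XML namespace.
FOLIA_NS = "http://ilk.uvt.nl/folia"
NS = {"folia": FOLIA_NS}

# Simplified, illustrative fragment (assumption: not schema-validated).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="example" version="2.0">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>Hello</t><pos class="INTJ"/><lemma class="hello"/></w>
      <w xml:id="example.s.1.w.2"><t>world</t><pos class="NOUN"/><lemma class="world"/></w>
    </s>
  </text>
</FoLiA>"""


def read_tokens(xml_text):
    """Return (text, pos_class, lemma_class) for each <w> element."""
    root = ET.fromstring(xml_text)
    tokens = []
    for word in root.iter(f"{{{FOLIA_NS}}}w"):
        # <t> holds the token text; <pos> and <lemma> carry a class attribute.
        text = word.findtext("folia:t", default="", namespaces=NS)
        pos = word.find("folia:pos", NS)
        lemma = word.find("folia:lemma", NS)
        tokens.append((
            text,
            pos.get("class") if pos is not None else None,
            lemma.get("class") if lemma is not None else None,
        ))
    return tokens


if __name__ == "__main__":
    for token in read_tokens(SAMPLE):
        print(token)
```

The sketch treats each annotation as a child element of the word it describes, which is why a missing `pos` or `lemma` simply yields `None` instead of an error.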
61 stars
13 watching
10 forks
Language: Python
last commit: 6 months ago
Linked from 1 awesome list
computational-linguistics, corpus, file-format, folia, language, library, linguistic-annotation-framework, linguistics, nlp, python, xml
Related projects:
Repository | Description | Stars |
---|---|---|
proycon/foliapy | A comprehensive Python library for parsing and processing FoLiA documents used in Natural Language Processing. | 18 |
proycon/flat | A web-based tool for annotating and managing linguistic documents using the FoLiA format. | 110 |
languagemachines/libfolia | A C++ library for working with the FoLiA linguistic annotation format. | 15 |
proycon/foliadocserve | Provides an HTTP-based backend for annotating and serving FoLiA documents using the FoLiA Query Language. | 6 |
proycon/pynlpl | A Python library for natural language processing tasks, including text manipulation and analysis. | 479 |
synyi/poplar | A web-based annotation tool for natural language processing (NLP) | 519 |
cidles/poio-analyzer | A collection of software tools for linguists to manage and analyze linguistic data | 13 |
cidles/poio-api | A Python library for converting linguistic data from various formats into unified annotation graphs. | 18 |
brendano/gfl_syntax | Software supporting a lightweight dependency-style annotation language. | 8 |
weitechen/anafora | An annotation tool designed to simplify and standardize data annotation processes across various schemas and workflows. | 241 |
cesine/corporaforfieldlinguistics | A collection of small datasets from various languages to test and evaluate NLP scripts | 3 |
proycon/python-frog | A Python binding to Frog, a C++ NLP tool for Dutch language processing tasks. | 47 |
korpling/salt | A flexible data model and API for representing linguistic data in a language-independent and theory-neutral way. | 15 |
lucasrla/remarks | Tools for extracting and converting annotations from annotated PDFs and ePubs to Markdown, PDF, PNG, or SVG formats. | 357 |
pld-linux/apertium-dict-es-gl | A dictionary for Spanish-Galician machine translation with the rule-based Apertium system. | 1 |