folia

Linguistic annotation format

A standardized format for annotating and exchanging linguistic data in XML.

FoLiA (Format for Linguistic Annotation) is a rich XML-based annotation format for representing language resources, including corpora, with linguistic annotations. It supports a wide variety of linguistic annotation types, which makes it a useful format for NLP tasks and data interchange. Note that the actual Python library for processing FoLiA is implemented as part of PyNLPl; this repository contains higher-level tools that use the library, as well as the full documentation, validation schemas, and set definitions.
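To give a rough idea of what an XML-based annotation document of this kind looks like in practice, the sketch below reads a small, hand-written fragment using only Python's standard library. The fragment is heavily simplified and has not been validated against the FoLiA schemas; the namespace URI, the element names (`s`, `w`, `t`, `pos`, `lemma`), and the class values are assumptions chosen for illustration, and `read_tokens` is a hypothetical helper, not part of any FoLiA library. For real work, the dedicated FoLiA library is the better choice.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for FoLiA documents; verify against the official schema.
FOLIA_NS = "http://ilk.uvt.nl/folia"
# Expanded name of the standard xml:id attribute as ElementTree exposes it.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, simplified fragment (illustrative only, not schema-validated).
SAMPLE = """<FoLiA xmlns="http://ilk.uvt.nl/folia" xml:id="example">
  <text xml:id="example.text">
    <s xml:id="example.s.1">
      <w xml:id="example.s.1.w.1"><t>Hello</t><pos class="INTJ"/><lemma class="hello"/></w>
      <w xml:id="example.s.1.w.2"><t>world</t><pos class="NOUN"/><lemma class="world"/></w>
    </s>
  </text>
</FoLiA>"""


def read_tokens(xml_text):
    """Return one dict per word element with its id, text, PoS class and lemma class."""
    root = ET.fromstring(xml_text)
    ns = {"f": FOLIA_NS}
    tokens = []
    for w in root.iterfind(".//f:w", ns):
        t = w.find("f:t", ns)
        pos = w.find("f:pos", ns)
        lemma = w.find("f:lemma", ns)
        tokens.append({
            "id": w.get(XML_ID),
            "text": t.text if t is not None else None,
            "pos": pos.get("class") if pos is not None else None,
            "lemma": lemma.get("class") if lemma is not None else None,
        })
    return tokens


if __name__ == "__main__":
    for token in read_tokens(SAMPLE):
        print(token)
```

Each annotation is an element nested inside the word it describes, so one pass over the word elements recovers the text together with its annotation layers.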

GitHub

61 stars
13 watching
10 forks
Language: Python
Last commit: 6 months ago
Linked from 1 awesome list

computational-linguistics, corpus, file-format, folia, language, library, linguistic-annotation-framework, linguistics, nlp, python, xml

Related projects:

Repository | Description | Stars
proycon/foliapy | A comprehensive Python library for parsing and processing FoLiA documents used in Natural Language Processing. | 18
proycon/flat | A web-based tool for annotating and managing linguistic documents using the FoLiA format. | 110
languagemachines/libfolia | A C++ library for working with linguistic annotation formats. | 15
proycon/foliadocserve | Provides an HTTP-based backend for annotating and serving FoLiA documents using the FoLiA Query Language. | 6
proycon/pynlpl | A Python library for natural language processing tasks, including text manipulation and analysis. | 479
synyi/poplar | A web-based annotation tool for natural language processing (NLP). | 519
cidles/poio-analyzer | A collection of software tools for linguists to manage and analyze linguistic data. | 13
cidles/poio-api | A Python library for converting linguistic data from various formats into unified annotation graphs. | 18
brendano/gfl_syntax | Software supporting a lightweight dependency-style annotation language. | 8
weitechen/anafora | An annotation tool designed to simplify and standardize data annotation processes across various schemas and workflows. | 241
cesine/corporaforfieldlinguistics | A collection of small datasets from various languages to test and evaluate NLP scripts. | 3
proycon/python-frog | A Python binding to a C++ NLP tool for Dutch language processing tasks. | 47
korpling/salt | A flexible data model and API for representing linguistic data in a language-independent and theory-neutral way. | 15
lucasrla/remarks | Tools for extracting and converting annotations from annotated PDFs and ePubs to Markdown, PDF, PNG, or SVG formats. | 357
pld-linux/apertium-dict-es-gl | A dictionary file for machine translation between two languages using a specific rule-based machine translation system. | 1