Dataset card
Description
FLORES+ dev and devtest sets in Asturian.
License
CC-BY-SA-4.0
Attribution
@inproceedings{wmt24-spain,
title = "Expanding the FLORES+ Multilingual Benchmark with Translations for {Aragonese, Aranese, Asturian, and Valencian}",
author = "Juan Antonio Perez-Ortiz and Felipe S{\'a}nchez-Martínez and Víctor M. S{\'a}nchez-Cartagena and Miquel Esplà-Gomis and Aaron Galiano Jimenez and Antoni Oliver and Claudi Aventín-Boya and Alejandro Pardos and Cristina Valdés and Jus{\'e}p Loís Sans Socasau and Juan Pablo Martínez",
booktitle = "Proceedings of the Ninth Conference on Machine Translation",
month = nov,
year = "2024",
address = "Miami, USA",
publisher = "Association for Computational Linguistics"
}
With the support of the R+D+i projects PID2021-127999NB-I00 (LiLowLa: Lightweight neural translation technologies for low-resource languages) and PID2021-124663OB-I00 (TAN-IBE: Neural Machine Translation for the Romance languages of the Iberian Peninsula), funded by MCIN/AEI/10.13039/501100011033/FEDER, UE.
Language codes
- ISO 639-3: ast
- ISO 15924: Latn
- Glottocode: astu1246
Additional language information
Workflow
The Asturian sentences were originally obtained by Meta, which had them translated from English by professional translators as part of its No Language Left Behind project. We had this Asturian translation reviewed by native speakers, including members of the Academia de la Llingua Asturiana, philologists, and a renowned writer, translator, and activist for the Asturian language. The revision was carried out twice, by different people. In the first round, the reviewers were given the Spanish text and the existing Asturian FLORES+ version; in the second round, they were given the Spanish FLORES+ version and the first revised version.
Additional guidelines
The guidelines provided by the Academia de la Llingua Asturiana were followed to ensure that the translation into Asturian aligned with their recommendations.