Open Language Data Initiative

Dataset card

Description

FLORES+ in Standard Moroccan Tamazight

License

CC-BY-SA-4.0

Attribution

@article{nllb-22,
    title = {No Language Left Behind: Scaling Human-Centered Machine Translation},
    author = {{NLLB Team} and Costa-jussà, Marta R. and Cross, James and Çelebi, Onur and Elbayad, Maha and Heafield, Kenneth and Heffernan, Kevin and Kalbassi, Elahe and Lam, Janice and Licht, Daniel and Maillard, Jean and Sun, Anna and Wang, Skyler and Wenzek, Guillaume and Youngblood, Al and Akula, Bapi and Barrault, Loic and Mejia-Gonzalez, Gabriel and Hansanti, Prangthip and Hoffman, John and Jarrett, Semarley and Sadagopan, Kaushik Ram and Rowe, Dirk and Spruit, Shannon and Tran, Chau and Andrews, Pierre and Ayan, Necip Fazil and Bhosale, Shruti and Edunov, Sergey and Fan, Angela and Gao, Cynthia and Goswami, Vedanuj and Guzmán, Francisco and Koehn, Philipp and Mourachko, Alexandre and Ropers, Christophe and Saleem, Safiyyah and Schwenk, Holger and Wang, Jeff},
    year = {2022},
    eprint = {arXiv:2207.04672},
}

Language codes

Additional language information

Reference dictionary: IRCAM’s Dictionnaire Général de la Langue Amazighe Informatisé.

Workflow

This data was released as part of the FLORES-200 dataset, where it was incorrectly labeled tzm_Tfng. It was relabeled as zgh_Tfng after community feedback and additional quality assessment. Please refer to the paper for further information.
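
For pipelines that still carry the old FLORES-200 label, the relabeling described above can be handled with a small lookup. The following is a minimal sketch, not part of the dataset tooling: the names `LEGACY_TO_CURRENT`, `normalize_code`, and `split_code` are hypothetical, and it assumes codes follow the `language_Script` pattern seen in `zgh_Tfng`.

```python
# Hypothetical mapping from the legacy FLORES-200 label to the corrected one.
# Only the relabeling stated in this card is included.
LEGACY_TO_CURRENT = {"tzm_Tfng": "zgh_Tfng"}


def normalize_code(code: str) -> str:
    """Return the corrected code; codes without a known relabeling pass through."""
    return LEGACY_TO_CURRENT.get(code, code)


def split_code(code: str) -> tuple[str, str]:
    """Split a 'language_Script' code into its language and script parts."""
    language, script = code.split("_", 1)
    return language, script


if __name__ == "__main__":
    current = normalize_code("tzm_Tfng")
    print(current, split_code(current))
```

Keeping the mapping in a dictionary means unaffected codes are returned unchanged, so the helper can be applied to a whole list of language codes without special-casing.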