CroDi – The Regensburg diachronic corpus of Croatian

The corpus can be accessed via the ANNIS interface.

The Regensburg Diachronic Corpus of the Croatian Language, CroDi, served as the material basis for "Subjects II" (HA 2659/1-2), a project funded by the German Research Foundation. The corpus was encoded within this project and further developed at the Institute of Slavic Studies at the University of Regensburg in collaboration with the Institute for the Croatian Language and Linguistics in Zagreb.

CroDi currently includes texts from the 16th to the 19th centuries in all varieties of Croatian (Štokavian, Čakavian, and Kajkavian). In addition to works by significant representatives of Croatian literature from this period, it includes lesser-known authors and anonymous texts. Some of the texts are manually annotated with morphosyntactic categories. The annotation principles were established by Roland Meyer, Björn Hansen, and Veronika Wald and are detailed in the following article:

Hansack, E., Hansen, B., Wald, V., Horvat, M. & Perić Gavrančić, S. (2016). Regensburški dijakronijski korpus hrvatskoga jezika – CroDi. In: Hrvatski jezik, 1-2, 2016, 1-20. (https://hrcak.srce.hr/clanak/234301)

Team:

Roland Meyer
Ernst Hansack
Björn Hansen
Marijana Horvat
Sanja Perić Gavrančić
Veronika Wald