Methods for selecting a Word of the Year (WY) vary greatly across countries. They range from selection by an expert panel or the general public to linguistically grounded, corpus-based methods. The WY initiative in Switzerland belongs to the latter group and will be introduced in this talk with a focus on data, methods, and tools.
Since 2017, the selection of the WY in Switzerland has been run by the ZHAW School of Applied Linguistics. A short list of candidates in all four national languages is compiled by applying three procedures (Perrin et al. 2021: 174), all of which emphasize that corpora are a necessary instrument for identifying salient patterns in public language use: (1) corpus-linguistic analysis of journalistic media, (2) collection of candidates suggested by the general public, and (3) collection of candidates suggested by a multilingual consortium of language professionals. Crucially, candidates brought up in (2) and (3) are also examined corpus-linguistically.
The corpus-based selection and examination of candidates relies on Swiss-AL (Applied Linguistics), a multilingual collection of corpora for the analysis of societally relevant language use in Switzerland (Krasselt et al., 2020). It is the most extensive collection of its kind (currently approx. 4.5 billion words) and includes a linguistic processing pipeline and a browser-based analysis workbench for accessing the corpora (Krasselt et al., 2021). For the selection of WY candidates, the Swiss-AL media subcorpora for German, French, Italian, and Rhaeto-Romance are used. They contain Swiss journalistic media of national and regional scope from the last five years. Most of this material is provided for academic use by the Swiss Media Database; a smaller part is collected by web crawling. All corpora are processed linguistically with the Swiss-AL pipeline (cf. Krasselt et al. 2020) and published on the Swiss-AL workbench, which is used in committee meetings to empirically validate candidates suggested by the public and by the language professionals. From the perspective of science communication, the workbench is a crucial element in making the selection process transparent to the public.
More specifically, candidates for the WY are identified and examined with the following corpus-linguistic methods:
(1) Keyword analysis: comparison of vocabulary used in the current year with vocabulary used in the previous year(s)
(2) Frequency analysis: distribution of candidates throughout the year
(3) Identification of words not used in the year(s) before.
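The three methods listed above can be illustrated with a minimal Python sketch. It is not the Swiss-AL pipeline or workbench: the function names are hypothetical, the corpora are assumed to be plain token lists (or month/token pairs), and log-likelihood is assumed as the keyness measure, which the abstract does not specify.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness (assumed measure, not confirmed by the source).

    a, b: frequency of a word in the target and reference corpus
    c, d: total token counts of the target and reference corpus
    """
    e1 = c * (a + b) / (c + d)  # expected frequency in the target corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_tokens, reference_tokens, top_n=10):
    """(1) Keyword analysis: words overrepresented in the current year."""
    target, reference = Counter(target_tokens), Counter(reference_tokens)
    n_t, n_r = sum(target.values()), sum(reference.values())
    scored = []
    for word, a in target.items():
        b = reference.get(word, 0)
        if a / n_t > b / n_r:  # keep only words that are relatively more frequent
            scored.append((word, log_likelihood(a, b, n_t, n_r)))
    # Highest keyness first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


def monthly_distribution(dated_tokens, candidate):
    """(2) Frequency analysis: relative frequency of a candidate per month.

    dated_tokens: iterable of (month, token) pairs.
    """
    totals, hits = Counter(), Counter()
    for month, token in dated_tokens:
        totals[month] += 1
        if token == candidate:
            hits[month] += 1
    return {month: hits[month] / totals[month] for month in sorted(totals)}


def new_words(target_tokens, reference_tokens, min_freq=2):
    """(3) Words attested in the current year but absent from earlier years."""
    seen_before = set(reference_tokens)
    counts = Counter(t for t in target_tokens if t not in seen_before)
    return sorted(w for w, n in counts.items() if n >= min_freq)


# Toy data, invented purely for illustration.
previous_year = "the vote on the law was held in the spring".split()
current_year = "the booster and the mask and the booster shaped the vote".split()

top_two = [word for word, _ in keywords(current_year, previous_year, top_n=2)]
fresh = new_words(current_year, previous_year)
```

In practice, such counts would be computed on lemmatized, much larger subcorpora, and a frequency threshold such as the hypothetical `min_freq` would help filter out typos and one-off coinages.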
In this talk, the Swiss-AL media subcorpora and the associated workbench will be introduced as an empirical database for the WY in Switzerland alongside the methods applied to create a short list of WY candidates.
References
Krasselt, J., Dreesen, P., Fluor, M., Mahlow, C., Rothenhäusler, K., & Runte, M. (2020). Swiss-AL: A Multilingual Swiss Web Corpus for Applied Linguistics. Proceedings of the 12th LREC, 4138-4144.
Perrin, D., Whitehouse, M., Liste Lamas, E., & Kriele, Ch. (2021). Diskursforschung im Schaufenster. Ein transdisziplinärer Ansatz zur Ermittlung und Vermittlung von Wörtern des Jahres. In: Dreesen, P. & Stücheli-Herlach, P. (eds.): Zeitschrift für Diskursforschung 8/2-3, pp. 164-189.