An innovative approach to setting standards and testing claims of CEFR alignment across multiple languages

This submission has open access

Abstract Summary

Little research has evaluated the comparability of alignment claims made for tests of different languages that target the same levels of the Common European Framework of Reference (CEFR). This presentation describes an innovative procedure for placing standard-setting results for reading tasks in English, French, Spanish, and German onto a common scale. Participants in this experimental study first carried out standard setting using a Modified Angoff method with reading tasks in English. They then split into three groups to carry out standard setting using the same method on reading tasks in one of the other three languages. The tasks used were exemplar reading comprehension tasks collected by the Council of Europe and made freely available to facilitate best practice in linking exams to the CEFR. The standard-setting judgements were then analysed in a concurrent data matrix using multi-facet Rasch model (MFRM) analysis. While this experimental study is not intended as definitive evidence for any of the test developers' claims, it offers a potentially useful method for programs that produce tests targeting the same CEFR levels in different languages to gather evidence in support of those claims.

Submission ID :

AILA1229

Submission Type

Oral Presentation

Select a Symposium

[SYMP40] Language futures: tensions and synergies in the use of standards in language assessment in a multi/plurilingual world

Argument :

The Common European Framework of Reference (CEFR) has become a key resource in language education and assessment, not only within Europe but internationally. A key goal of the CEFR was to provide "a common basis for the elaboration of language syllabuses, curriculum guidelines, examinations, textbooks, etc. across Europe" (Council of Europe, 2001). Since its launch, a great deal of work has gone into the methodology for aligning an exam to the CEFR, and some work into ensuring the comparability of claims of CEFR alignment made for different foreign language exams targeting the same language. However, little research has addressed the comparability of exams in different languages that claim alignment to the same CEFR levels. This presentation reports evidence from an innovative pilot procedure that links standard-setting results for tasks in English, French, Spanish, and German to the CEFR and compares the results with the test developers' original claims of alignment. The tasks used were exemplar reading comprehension tasks collected by the Council of Europe and made freely available to facilitate best practice in linking exams to the CEFR. Participants in this experimental study first carried out standard setting using a Modified Angoff method with reading tasks in English. They then split into three groups to carry out standard setting using the same method on reading tasks in one of the other three languages. The English tasks thus acted as an anchor set, linking all judges. The standard-setting judgement data were then pooled and analysed in a concurrent data matrix using multi-facet Rasch model (MFRM) analysis, which allowed the results to be placed onto a common scale. The common scale allows task difficulty to be compared in Rasch logits.
Thus, test tasks in different languages that are posited to be at B2 level, for example, can be compared in terms of the difficulty estimates derived on a common scale from the four standard-setting panels (English, French, Spanish, and German). The results show that the original CEFR levels posited by the different test developers generally held in practice. It is important to note that this was an experimental procedure designed to investigate the potential of this innovative methodology, and it is not intended as definitive evidence for any of the test developers' claims. It is presented as a potentially useful method for language education programs that need to ensure that the assessments they produce to target the same CEFR levels in different languages can be supported by standard-setting evidence. The methodology is feasible and sustainable for programs with access to language education experts who can participate in standard-setting panels for more than one language.
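
The role of the English anchor set in the design described above can be illustrated with a minimal sketch. It is not the study's analysis: the judge IDs, group sizes, and task labels are hypothetical, and no ratings or results are represented. It only builds the judge-by-task design of a concurrent data matrix and checks that the design is connected, which a concurrent Rasch calibration needs in order to place all judges and tasks on one scale.

```python
from collections import deque

# Hypothetical panel layout: judge IDs, group sizes, and task labels are
# illustrative assumptions, not the study's actual composition.
LANG_GROUPS = {
    "French": ["J1", "J2"],
    "Spanish": ["J3", "J4"],
    "German": ["J5", "J6"],
}
TASKS = {
    lang: [f"{lang[:2].upper()}-T{i}" for i in (1, 2)]
    for lang in ("English", "French", "Spanish", "German")
}


def build_design(include_anchor=True):
    """Return {judge: set of tasks judged} for the concurrent data matrix."""
    design = {}
    for lang, judges in LANG_GROUPS.items():
        for judge in judges:
            tasks = set(TASKS[lang])
            if include_anchor:
                # Every judge also rates the shared English tasks.
                tasks |= set(TASKS["English"])
            design[judge] = tasks
    return design


def is_connected(design):
    """Check that every judge is linked to every other via shared tasks."""
    if not design:
        return True
    task_to_judges = {}
    for judge, tasks in design.items():
        for task in tasks:
            task_to_judges.setdefault(task, set()).add(judge)
    start = next(iter(design))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search over the judge-task graph
        judge = queue.popleft()
        for task in design[judge]:
            for other in task_to_judges[task]:
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen == set(design)


if __name__ == "__main__":
    print("With English anchor:", is_connected(build_design(True)))
    print("Without anchor:", is_connected(build_design(False)))
```

Without the shared English tasks, the three language groups in this sketch form disjoint subsets of the matrix, so their difficulty estimates could not be compared directly on a single logit scale.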