From task development to quality assurance: Using an iterative model to develop and improve tasks

This submission has open access

Abstract Summary

Developing comparable tasks and tests across multiple languages poses a challenge for language testing organizations: tasks must be developed consistently across languages to ensure equity. However, the literature says little about the effectiveness of task developer training, or about how that training is perceived by the quality assurance advisors who review tasks for fidelity to task specifications both within and across specific languages and cultures.

This presentation examines the perceptions of stakeholders at the two ends of the task development process: the task developers themselves and the multilingual quality assurance advisors who review and revise such tasks for a multi-language, large-scale assessment administered to over 100,000 learners annually. First, an analysis of a short questionnaire sent to speaking and writing task developers and quality assurance advisors shows the relative importance of the task in development, review and revision. Second, short interviews with developers and quality assurance advisors were coded and reviewed. Finally, the results are compared by stakeholder group and across languages to determine how to improve task development.

The presentation will specifically focus on how to improve task development training across languages in a multi-language test development approach.

Submission ID :

AILA1169

Select a Symposium

[SYMP40] Language futures: tensions and synergies in the use of standards in language assessment in a multi/plurilingual world

Argument :

Developing a task-based language assessment (TBLTA) instrument begins by developing test specifications that result in tasks that articulate into a cohesive whole. In an assessment situation, such tasks must both individually and together present a solid picture of what the test taker can do with the language. While Long (2016), Winke (2014) and others have focused primarily on classroom teaching and formative assessment as settings for TBLTA, large-scale assessments can also mirror TBLT principles not only in task development but also throughout the process, including task review. This presentation focuses on ways an existing, large-scale, multi-language assessment incorporates principles of TBLTA from item development through quality assurance. The key to such development is ensuring a strong functional approach to test development and rating (Norris, 2016), one that focuses on both tasks and the intersection of function, authenticity, reliability and practicality in developing a large-scale TBLTA.

However, there is often a gap in what different stakeholders in the task development, rating and rating adjudication process attend to, based on their differing roles and perspectives. While some research (for example, Pill & Smart, 2020; Attali, 2016; Kuiken & Vedder, 2014) has focused on test raters and their processes, rather less research (Rossi & Brunfaut, 2018) examines the role of task developers and the quality assurance advisors who review tasks for fidelity to task specifications. Determining what task developers attend to during task development and as they revise these tasks can shed light on the effectiveness of the item development process as well as the training procedures used. Moreover, quality assurance advisors, who review and suggest revisions for tasks, provide insight into how the task developers adhere to task specifications and into the effectiveness of training.

The current study examines research conducted with both task developers and quality assurance advisors working with a multi-language, large-scale assessment administered to over 100,000 learners annually. First, an analysis of a short questionnaire sent to speaking and writing task developers (N=20) and quality assurance advisors (N=26) shows the relative importance of the task in development, review and revision. Next, short interviews with developers and quality assurance advisors (N=10) were coded and reviewed. Finally, the results are compared by stakeholder group and across languages to determine what can be improved in task development.

The presentation will specifically focus on how to improve task development training across languages in a multi-language test development approach.

Kremmel, B., Eberharter, K., Holzknecht, F., & Konrad, E. (2018). Fostering language assessment literacy through teacher involvement in high-stakes test development. In Teacher involvement in high-stakes language testing (pp. 173-194). Springer, Cham.

Long, M. H. (2016). In defense of tasks and TBLT: Nonissues and real issues. Annual Review of Applied Linguistics, 36, 5-33.

Norris, J. M. (2016). Current uses for task-based language assessment. Annual Review of Applied Linguistics, 36, 230-244.

Rossi, O., & Brunfaut, T. (2018). Test item writers. The TESOL Encyclopedia of English Language Teaching, 1-7.

Winke, P. M. (2014). Formative, task-based oral assessments in an advanced Chinese-language class. In Technology-mediated TBLT (pp. 263-294). John Benjamins.

Associated Sessions

[SYMP40] Language Futures: Tensions And Synergies In The Use Of Standards In Language Assessment In A Multi/plurilingual World

Primary Author
Co-Authors

She/Her Celia Zamora

Director, Professional Learning and Certification

ACTFL

She/Her Margaret Malone

Director of Assessment and Research

ACTFL

She/Her Caroline Favero

ACTFL
