While the number of studies on language learning with authentic audiovisual input has grown rapidly in recent years, research on pronunciation and L2 captioned video is scarce and has produced mixed findings (Montero Perez, 2022). In particular, the availability of verbatim captions may support speech segmentation and the mapping of auditory forms onto written forms, but manipulating the properties of the audiovisual input may be necessary to increase the salience of specific phonemic features and promote pronunciation learning. This paper reports on a small-scale mixed-methods study conducted within a larger project on intermediate learners' processing of authentic audiovisual input with and without pronunciation-focused input enhancement. We recorded the eye movements of 11 Spanish learners of English (age 15) as they watched two video clips from a TV series; in one of the clips, regular past verb forms were highlighted in the captions in synchrony with their auditory onset. After the viewing, participants underwent stimulated recall, i.e., they viewed their own eye-gaze behavior and were asked to recall why they had fixated or skipped certain words or other areas of the screen. We pre- and post-tested regular past -ed pronunciation accuracy through a word reading task. The X-Lex test measured participants' vocabulary size, and a questionnaire explored their English learning background and explicit knowledge of regular past pronunciation. The significantly longer fixations on enhanced words and the reduced skipping of these words pointed to noticing, although no gains were evident in word reading performance. The stimulated recall data suggest that participants consciously tried (not always successfully) to avoid reading the captions, except when they struggled to parse and understand speech or expected the speaker to deliver important information. Overall, processing efforts seemed mostly directed at making sense of content rather than at linguistic form (VanPatten, 2004).
Students did notice the enhancement of past -ed verbs, but they interpreted it in terms of grammatical function rather than pronunciation. Our results suggest that combining caption enhancement with pre- and post-viewing activities, as well as explicit teaching, may be necessary to direct learners' attention to pronunciation during exposure to L2 captioned video. Further analyses will provide a more comprehensive picture of the factors at play in learners' auditory and visual processing of authentic L2 captioned video.
References:
Montero Perez, M. (2022). Second or foreign language learning through watching audio-visual input and the role of on-screen text. Language Teaching, 55(2), 163–192.
VanPatten, B. (2004). Input processing in SLA. In B. VanPatten (Ed.), Processing instruction: Theory, research, and commentary (pp. 5–31). Mahwah, NJ: Lawrence Erlbaum.