Eyeing Empirical Evidence for Subtitling Conventions

Written by Juha Lång

Juha Lång, PhD, defended his dissertation on the reception of interlingual subtitles in February 2023 at the University of Eastern Finland, Joensuu. His current research interests are translation in academic research, as a member of the ReTra research group, and the role of machine translation in enabling democratic epistemic capacities, as a member of the DECA project.

While subtitling has a long tradition in many countries and there is a set of established rules or guidelines on what constitutes “good” subtitles, the scientific basis of these subtitling conventions has long been unclear. Instead, it seems that the conventions have mostly evolved through practice, within boundaries set by technology. Now that the use of subtitling has grown rapidly, a change propelled by the explosive expansion of internet video streaming services during the last decade, empirical examination of subtitling conventions has become perhaps more topical than ever.

In my doctoral dissertation, which I successfully defended in February, I examined the reception of interlingual (television) subtitles. With the help of eye tracking, I wanted to find out how different aspects of subtitles and subtitling conventions affect the way people watch subtitled audiovisual material.

Although eye tracking is by no means a new research method, as it has been used for decades in, for example, reading research, it is a fairly novel approach in translation studies. Eye tracking has two main benefits for research on the reception of subtitles. Firstly, it makes it possible to examine participants’ reactions to subtitles in real time while the subtitles are viewed in their natural audiovisual context. Secondly, eye movements reflect cognitive processes (the eye-mind hypothesis, coined by Just and Carpenter in 1980), which means that by analyzing eye movements we can identify which features of subtitles or other elements in an audiovisual stimulus cause processing problems.

My dissertation includes three empirical studies. The first experiment examined how contravening subtitling conventions affects viewers’ eye movements. The conventions examined included, for example, the rule that subtitles should be synchronous with the spoken audio while still being paced so that the viewer has enough time to read them. The results highlighted the importance of synchrony: whenever viewers hear spoken language, they anticipate the appearance of subtitles by moving their eyes towards the spot on the screen where subtitles normally appear. This usually happens almost immediately when a line is spoken, and when the viewer sees that no subtitles have appeared, the eyes quickly move back to the image. So, if the appearance of subtitles is delayed, the viewer’s gaze makes an unnecessary back-and-forth movement. Another way of interpreting this result is that viewers get accustomed to the rhythm of subtitles and spoken audio. Disrupting this rhythm also disrupts the natural dynamic of the reading process.

The second experiment compared subtitles with spoken audio as an information source. The results suggest that subtitles can be as effective an information channel as spoken audio. Furthermore, the results revealed no cost of following the subtitles for processing the image: participants who followed the subtitles processed the image as well as participants who did not. It should be noted that the subtitles used in this study were composed and timed in accordance with the national Finnish conventions, which favor a relaxed reading speed and text condensation over the amount of information. It can be speculated that if the subtitles had been timed faster, the processing of the image could have suffered.

In the final empirical study, my colleagues and I attempted to tease out, with the help of statistical modelling, which lexical and structural factors of subtitles have the greatest impact on the time viewers spend looking at the subtitles. The model revealed that structural length (the total number of characters in a subtitle) and temporal duration are the main factors. At first glance, this result seems quite self-evident: it makes perfect sense that the amount of text correlates with the time it takes to read it, and that the on-screen time of a subtitle affects how long a viewer can spend reading the text. Nevertheless, it is interesting that the effects of the two factors are separate: the temporal duration of the subtitle correlates with the time viewers spend looking at it even when the structural length of the subtitle stays the same. In other words, even a short subtitle consisting of one or two words holds viewers’ gaze longer if the on-screen time of the subtitle increases.

Overall, the studies provide empirical evidence for the validity and importance of subtitling conventions when the aim is to create subtitles that support the viewing experience instead of disturbing it. In particular, subtitlers should pay attention to the synchronization between the audio and the subtitles, as well as to the speed of the subtitles.

Although some studies have shown that, in certain contexts, viewers are able to process subtitles that are paced much faster than the current general European recommendations, there is convincing evidence that the amount of text on the screen correlates heavily with the time viewers spend looking at the subtitles. Text is a very effective attractor of gaze, and in an audiovisual product, subtitles are often prioritized at the expense of the image. This means that with fast-paced subtitles, viewers have very little time to inspect the image, as most of the time is spent reading the subtitles.


