Less is more: Irrelevant Information Affects Diagram Reading

October 1st, 2019

(Translation of an article in: IPN Journal No 4)

The effects of visual design features on cognitive processes in the solving of tasks with diagrams

Benjamin Strobel


Diagrams have found their way into almost all areas of daily life: school, the Internet, books and magazines, television and so on. Using eye movement analysis (eye tracking), an IPN study investigated the effects that design features have on the processing of tasks with diagrams. The ability to analyze and interpret scientific data presented in different ways, and to draw appropriate conclusions from them, is considered essential these days because increasing digitalization makes information quickly and easily accessible. Reading and understanding data and forms of presentation are therefore a necessary part of young people's education, regardless of whether they want to take up a technical or scientific profession. Today’s students must already have the ability to analyze data.

The presentation of data in diagrams is common and widespread because it offers many advantages. For example, it is more difficult to read numbers from running text and mentally contextualize them than to read them in a diagram. In my study, I investigated how the design features of a diagram affect cognitive processes in diagram reading tasks. In school, these effects play a significant role because teachers often use ready-made material that has not been evaluated for its suitability for the intended use. Diagrams contain numerous graphical elements (e.g., axes, labels, legends and a coordinate system) and can also represent a large number of data points that can be read and interpreted at several levels of information extraction. This number of elements and relations can put a great strain on students' working memory. If the data presented in a diagram have a complex structure (for example, a high number of data points, variables and relations between them), this data complexity can lead to a high cognitive load on learners.

The layout of a diagram can also make it easier or more difficult for learners to solve the task at hand. For example, the appropriate use of labels and color coding can help learners identify groupings of elements in the diagram and decrease their overall cognitive load. On the other hand, superfluous elements can increase the cognitive load, for example when data series or variables are completely irrelevant to a given task. Magazines, books and the Internet often contain diagrams supplemented by texts and illustrations to provide additional information or to make the material graphically appealing. Interesting but superfluous additional information, seductive details that are used to make material appealing, can increase the interest of learners, but usually lead to students retaining less of what they have learned.

In my study, I investigated the effects of visual design features on cognitive processes when solving problems with diagrams. I was particularly interested in the following three questions:

  1. Are people able to use the diagram best suited to a given task to complete that task?

  2. Does high data complexity affect the execution and performance of a task, even if additional data is completely irrelevant to the execution of the task?

  3. Does interesting but irrelevant additional content (seductive details) in diagrams affect the handling and performance of people working on diagram reading tasks?

In the following I will go into more detail on the second question.


Eye tracking is a method of studying the spatio-temporal distribution of eye movements. In contrast to typical performance measures such as error rates and processing times, eye movement is a (continuous) process measure. Eye movement patterns are used as indicators of underlying cognitive processes. Eye tracking is particularly suitable for studying how diagrams are read, because spatial structures in diagrams closely correspond to functional structures, so that eye movement patterns can easily be assigned to the underlying work processes. In addition, the method is non-invasive and does not place additional strain on working memory.

The effect of irrelevant data on problem solving

We know that the number of data points and data series affects the processing of tasks. A higher data complexity leads to longer processing times and in some cases to a higher error rate. This is because human memory is an information processing system with limited resources. Diagram reading can be particularly challenging when data complexity is not appropriate for the given task.

Previous studies did not distinguish whether the additional data points that increased complexity were relevant to the given tasks. If, for example, a trend is to be identified in a data series and that series is extended by further data points, the new points must of course be taken into account when examining and interpreting the trend. The inherent complexity of the task therefore inevitably increases as points are added. On the other hand, there are cases in which additional data points are irrelevant to a task, for example when two data series are shown but the task explicitly refers to only one of them. This raises the question of whether additional data increase processing time, error rate and cognitive load even if they are completely irrelevant to the task at hand. The question is particularly relevant for the use of diagrams in teaching, since teachers sometimes use ready-made material that can contain information not relevant to the actual learning objective.

The assumption was that the effects of irrelevant data points would differ from those of an irrelevant data series. When irrelevant data points are added, people must check the labels of all points to distinguish relevant from irrelevant ones, so task processing should take more time, be more error-prone and lead to a higher cognitive load. When an additional, irrelevant data series is added to an existing data series, the color coding in the legend allows the data points to be grouped into two series. People then only need to check the legend to determine which of the two data series is relevant for solving a given task. The negative effects of an irrelevant data series on error rate, processing time and cognitive load should therefore be smaller than those of task-irrelevant data points.

In my study, students (N = 60) worked on computerized tasks for column diagrams in the classical multiple-choice format. The diagrams contained either

  a. irrelevant data points,
  b. an irrelevant data series or
  c. no task-irrelevant data.

All participants received several tasks from each of the three conditions (within-subject design). To solve a task, participants had to read several data points, compare them and choose the correct answer from four alternatives. Besides the error rate, processing time and cognitive load per task, the analysis covered the total fixation times, recorded by eye tracking, for different areas of the diagram display (task material, labels, axes, relevant data and irrelevant data). Participants also assessed their own diagram reading skills in a questionnaire.
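
To illustrate what a total fixation time per display area means, the following is a minimal Python sketch, not the analysis used in the study. It assumes that each fixation is recorded as a screen position with a duration and that each display area is a non-overlapping rectangle. The screen layout, the class and function names and all fixation values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AreaOfInterest:
    """Rectangular display area in screen pixels (hypothetical layout)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        return self.left <= x < self.right and self.top <= y < self.bottom


@dataclass(frozen=True)
class Fixation:
    """One fixation: gaze position in pixels and duration in milliseconds."""
    x: float
    y: float
    duration_ms: float


def total_fixation_times(fixations, areas):
    """Sum fixation durations per area; fixations outside all areas are ignored."""
    totals = {area.name: 0.0 for area in areas}
    for fixation in fixations:
        for area in areas:
            if area.contains(fixation.x, fixation.y):
                totals[area.name] += fixation.duration_ms
                break  # areas are assumed not to overlap
    return totals


# Assumed 800 x 600 layout using the five areas named in the article.
AREAS = [
    AreaOfInterest("task material", 0, 0, 800, 100),
    AreaOfInterest("axes", 0, 100, 100, 500),
    AreaOfInterest("relevant data", 100, 100, 450, 500),
    AreaOfInterest("irrelevant data", 450, 100, 800, 500),
    AreaOfInterest("labels", 100, 500, 800, 600),
]

# Made-up fixations for illustration only, not data from the study.
FIXATIONS = [
    Fixation(400, 50, 300),   # reading the task
    Fixation(50, 300, 150),   # checking an axis
    Fixation(200, 300, 250),  # relevant data
    Fixation(600, 300, 200),  # irrelevant data
    Fixation(300, 550, 180),  # labels
    Fixation(420, 60, 220),   # returning to the task text
    Fixation(900, 700, 100),  # off-screen, ignored
]

if __name__ == "__main__":
    for name, total in total_fixation_times(FIXATIONS, AREAS).items():
        print(f"{name}: {total:.0f} ms")
```

Comparing such totals between conditions shows where additional viewing time goes, for example whether it is spent on the task text or on the irrelevant data.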

What interferes more with the processing: irrelevant data points or irrelevant data series?

Small but significant effects of irrelevant data points on processing time, error rate and perceived cognitive load were found in the diagram reading tasks. The presentation of an irrelevant data series, on the other hand, produced effects that were initially surprising: large performance losses were observed when an irrelevant data series was visible, even though it could easily have been identified by its color coding. Contrary to expectations, participants needed significantly longer to complete the task, made more mistakes and showed a higher cognitive load. The analysis of the eye movement data provided a possible explanation for these unexpected findings. When irrelevant data points were visible, participants spent more time reading the axis labels, as expected, but the same amount of time reading the task and the data. When an additional data series was visible, however, not only the time spent on the irrelevant data increased, but also the time spent on the task. This indicates that the presence of an additional data series changed the requirements of the task. People may be confused by the overabundance of information and conclude that the additional information is there for a reason. To identify the relevant data series, an additional (and error-prone) step may have been necessary, namely re-examining the task text: participants seemed to check again which data series they actually needed to solve the task.


In summary, task-irrelevant data points had only a small effect on the completion of diagram reading tasks in this study. Additional data series, however, significantly changed the processing of the task and led to substantial performance losses. It therefore seems reasonable to conclude that unnecessary cognitive load in learning materials can be avoided by limiting representations to the relevant data series or variables.