Using Eye Tracking to Advance Learning Processes
Diane Kucharczyk
Education is an integral part of our lives. It serves to prepare us for numerous aspects of life, especially careers. Education comes in many different forms, including teachings from parent to child, schooling, professional workshops, athletic training, and countless other examples. In most forms of education, learners receive feedback to help improve their performance. Eye tracking is a scientific tool used to understand eye movements and how they correlate with cognitive processes. The key to this research is to design an experiment to understand and identify types of eye movements while people learn a new task.
This research focuses on visuomotor tasks, specifically people learning to play the brainteaser game Rush Hour, by Binary Arts Corporation in Alexandria, VA. Rush Hour requires players to look at an arrangement of vehicles on a grid (Figure 2) and slide them as necessary to move a red car through the only exit from the grid, thus moving the red car out of gridlock. An experiment was designed to use an eye tracker to record the eye movements of fifteen subjects as they learned to play the game. The eye movements were compared between different feedback conditions and between skilled and unskilled players.
If eye tracking information were used effectively in similar environments and extended to other areas of learning, the impact on the rate of learning could be significantly greater than that of current popular methods. People could gain skills much more quickly and consequently learn more: they could not only learn their initial task, but also potentially have more time to learn additional tasks and skills. Throughout the school year, the goal of this research was to design an experiment that would determine whether eye movements are a reliable metric for visuomotor learning and a useful tool to increase the speed of learning.
Eye movements under laboratory conditions are well understood by researchers. Human vision allows people to perceive a full visual field with the illusion of a high-resolution scene. However, the anisotropic retina only provides a small area called the fovea for high-resolution perception, while the majority of the visual field provides only low-resolution perception. People must make over 100,000 eye movements a day to provide this illusion throughout the visual field, yet people are typically unaware of them (1). These movements are classified as saccades, smooth pursuits, VOR (vestibulo-ocular reflex), OKR (optokinetic reflex), vergence movements, and miniature eye movements.
People acquire visual information with eye movements; therefore, eye movements are important and beneficial to track. Eye movements are a good estimate of what people attend to, and are correlated with cognitive processes. Discovering how people use their eye movements to learn may contribute to efforts to improve methods of education and communication. It is useful to understand why people fixate on certain areas while ignoring others when extracting visual information. This is of great interest not only to image makers and imaging systems engineers, but also to communicators and educators. To name a few examples, learning takes place while viewing diagrams to assemble furniture, studying sketches in textbooks, reading maps, and examining images in fields such as law enforcement and medicine. Additionally, research has shown that eye movements change when their purpose is to gain information to direct another body movement such as that of the arm and hand (2). A change in the pattern of eye movements supports the hypothesis that they are a reliable metric of visuomotor learning that is task-dependent.
The general process of education is a slowly evolving field: Students sit in classrooms and most changes take effect within that environment. Learning is a process that can be described in the following manner: A student takes in and processes information from one or many sources, uses that information, and, again, intakes and processes. However, the succeeding stage of information acquisition is more of a response from that usage and/or additional information from the source(s). This cycle repeats itself as a student gains a greater understanding and/or skill level.
In order to evaluate students, many times they are asked to complete a written or oral examination, perform a particular hands-on task, or complete a paper or project proving they understand the material in question. The evaluation of a student's knowledge is often part of the cycle described above where the student uses the information. In many cases, students receive feedback in the form of a test grade or by being told where they were correct and where their mistakes were located. Students use that information to improve their knowledge and skill level.
Elizabeth Krupinski completed research that involved radiologists searching for abnormalities in medical X-rays. She eye tracked radiologists during their first look at an X-ray and gave them feedback on where their visual fixations were concentrated. With the new information the radiologists took a second look at the X-ray. A 20% increase in performance was found with this feedback versus a second look without the eye tracking information (3). Thus, eye movement feedback increased true-positive rates without increasing false-positive rates.
Hilde Haider performed research with Peter A. Frensch to explore information-reduction. They used an eye tracker to help determine whether people ignore redundant information while learning to complete a task. They constructed experiments involving a string of alphabetically ordered letters and one number. The number was inserted to replace a group of letters and represented the number of letters that were replaced. The task given to the subjects was to determine if the number was correct (Figure 1). Since the only information relevant to the task is the letters on either side of the number, all the other letters were considered irrelevant information. Haider and Frensch hypothesized that the subjects would tend to fixate on the extraneous letters for shorter durations in later trials than in early trials. Haider and Frensch's results supported their hypothesis, even when considering the subjects who continued to look at the irrelevant letters. Those subjects were expecting alphabetical ordering mistakes within the excess letters (4).
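The verification task described above can be illustrated with a short sketch. The stimulus layout used here (a bracketed number between letters, such as `A [4] F G H`) is an assumption made for illustration, since the exact format is not given in this text, and `is_string_correct` is a hypothetical helper name.

```python
import re

def is_string_correct(stimulus: str) -> bool:
    """Return True if the bracketed number equals the count of replaced letters.

    Assumed format (hypothetical): uppercase letters with one bracketed
    number, e.g. "A [4] F G H". Only the letters immediately before and
    after the number are needed; all other letters are task-irrelevant.
    """
    match = re.search(r"([A-Z])\s*\[(\d+)\]\s*([A-Z])", stimulus)
    if match is None:
        raise ValueError("expected a pattern like 'A [4] F'")
    before, claimed, after = match.group(1), int(match.group(2)), match.group(3)
    # Number of letters that lie strictly between the two neighbours.
    replaced = ord(after) - ord(before) - 1
    return replaced == claimed
```

The function inspects only the two letters flanking the number, which mirrors the point that the remaining letters carry no information for the task.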
Julie Epelboim and others constructed an experiment to examine two types of looking tasks. The first task asked subjects to tap on two, four, or six targets while the second simply asked subjects to look at the targets. It was found that while tapping, subjects tended to have shorter fixations than when only looking at targets (2).
Each of the experiments mentioned included a visual search task and mental processing based on that search. However, each experiment focused on a single element in isolation: feedback, irrelevant information, or visuomotor actions. By combining those three elements, it will be possible to understand how significant eye movements are during the learning process.
Eye movement patterns may be a strong indication of a subject's strategy to complete a task. The subjects in this experiment were instructed to perform brainteaser-type games as fast as possible. These games required people to look at objects in a confined spatial area and determine how to adjust them to solve the game. Separate groups of subjects solved the same games but were also given feedback. This was intended to show any difference in the rate of learning between the control group, which did not receive any feedback, and the groups that did receive feedback.
The hypothesis for this experiment is that eye movements are a reliable metric for learning and can be useful to increase the speed of learning. They may be utilized as a tool to understand how people learn and how to train them better either during traditional education or vocational training. As a result of understanding how people use their eyes to visually acquire information, eye movements may aid in building more effective imaging systems to convey information to people.
In order to determine if eye movements are a reliable metric for visuomotor learning and a useful tool to increase the speed of learning, an experiment was designed to collect data. The experiment was structured with three conditions that differed by the methods used to train subjects to complete a task, in this case the games of Rush Hour. The location of the subjects' gaze in the field of view was calculated and superimposed on videotape as they performed the task.
The videotapes were broken into two levels of data. "Low-level data" consists of the length, frequency, and number of gaze fixations during various stages of solving a game. It is described as low-level due to the fundamental nature of the metric, being of the eye movements themselves. High-level data is more subjective; it consists of finding patterns of eye movements. For this research, high-level data was analyzed to correlate patterns of eye movements with different cognitive stages of visuomotor problem solving. The patterns are identified by relating them to the pieces on the game board, as well as by the actual game piece movements made by the subject.
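As a rough illustration of the low-level data described above, the sketch below summarizes fixations within one stage of a game. The record format (a list of `(start_s, end_s)` tuples in seconds) and the function name `fixation_summary` are assumptions for this example; the study itself worked from videotape, and its actual coding procedure is not specified here.

```python
from statistics import mean

def fixation_summary(fixations, stage_start, stage_end):
    """Summarize fixations that begin within [stage_start, stage_end).

    `fixations` is a hypothetical list of (start_s, end_s) tuples.
    Returns the number of fixations, their mean duration in seconds,
    and their frequency in fixations per second of the stage.
    """
    if stage_end <= stage_start:
        raise ValueError("stage_end must be greater than stage_start")
    durations = [end - start for start, end in fixations
                 if stage_start <= start < stage_end]
    return {
        "count": len(durations),
        "mean_duration_s": mean(durations) if durations else 0.0,
        "fixations_per_second": len(durations) / (stage_end - stage_start),
    }
```

Computing the same three quantities separately for each stage of a game would allow the stages to be compared on equal terms.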
Solving time is a traditional metric of one's skill and can be used to determine the difficulty level of a game. By characterizing each subject, game, and condition with solving time, those levels were correlated with trends in eye movements. The idea that solving time is a reliable metric of skill is an assumption until the goal of the game is clarified. To eliminate that assumption, the subjects were given the goal to complete each game "as fast as possible without making mistakes."
Patterns of eye movements are task-dependent. Solving a game of Rush Hour is a complex task because it contains numerous sub-tasks at many levels. Some of those tasks at one level include looking over the game to see what is there, analyzing the positions of the game pieces, looking ahead to "see what will happen if...," and physically moving game pieces. Those sub-tasks can be grouped into phases of solving a game: exploratory, planning, and execution. However, since the execution phase consists mainly of eye movements that guide the hands to move a game piece, it is of little interest. The interests lie in differences in eye movement patterns that indicate exploratory and planning phases, and in how skilled and unskilled subjects use their eye movements during easy and more difficult games.
In order to test the hypothesis that eye movements are a reliable metric for visuomotor learning and a useful tool to increase the speed of learning, an experiment was built to collect data. The major emphasis of this work was on the construction of the experiment itself.
The brainteaser game Rush Hour was chosen as a game for subjects to learn to play while wearing an eye tracker. In Rush Hour, the player is given an arrangement of vehicles on a 6 box by 6 box square grid (Figure 2). Cars are 2 boxes long and trucks are 3 boxes long. The grid is enclosed, except for one opening at the right side of the right-most box in the second row from the top. The vehicles can only slide in their row or column and cannot be turned. The goal is to slide the vehicles within the grid to allow the red car to exit the grid, thus escaping gridlock. Rush Hour is a complex visuomotor activity: players look at the puzzle, make decisions regarding how to move vehicles, and then execute those decisions physically by sliding the vehicles with their hands. Each subject played the same 10 games of Rush Hour in the same order. The game board was tilted approximately 20 degrees from the horizontal toward the subject.
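The rules above can be made concrete with a minimal sketch. The board representation (a dictionary mapping a vehicle name to `(row, col, length, orientation)`, with rows and columns counted from 0 at the top left) and the function names are assumptions chosen for illustration, and the example layout is hypothetical rather than one of the ten games used in the experiment.

```python
GRID = 6      # the board is 6 boxes by 6 boxes
EXIT_ROW = 1  # second row from the top, counting from 0

def cells(vehicle):
    """Return the set of (row, col) boxes covered by a vehicle."""
    row, col, length, orient = vehicle
    if orient == "H":
        return {(row, col + i) for i in range(length)}
    return {(row + i, col) for i in range(length)}

def can_slide(board, name, steps):
    """Check whether `name` can slide `steps` boxes along its own axis.

    Positive steps move right (horizontal) or down (vertical); negative
    steps move left or up. Every intermediate position must stay on the
    grid and must not overlap another vehicle.
    """
    row, col, length, orient = board[name]
    occupied = set().union(*(cells(v) for n, v in board.items() if n != name))
    direction = 1 if steps > 0 else -1
    for k in range(1, abs(steps) + 1):
        if orient == "H":
            moved = (row, col + direction * k, length, orient)
        else:
            moved = (row + direction * k, col, length, orient)
        new_cells = cells(moved)
        on_grid = all(0 <= r < GRID and 0 <= c < GRID for r, c in new_cells)
        if not on_grid or new_cells & occupied:
            return False
    return True

def red_car_can_exit(board):
    """True if the red car sits in the exit row with a clear path right."""
    row, col, length, _ = board["red"]
    return row == EXIT_ROW and can_slide(board, "red", GRID - (col + length))

# Hypothetical layout: a vertical truck blocks the red car's row.
board = {"red": (1, 0, 2, "H"), "truck": (0, 3, 3, "V")}
blocked = red_car_can_exit(board)    # the truck covers row 1, column 3
board["truck"] = (3, 3, 3, "V")      # truck moved to the lower half
cleared = red_car_can_exit(board)
```

Because a truck is 3 boxes long and the exit is in the second row, a vertical truck in the red car's path only clears that row once it occupies the bottom three boxes of its column.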
Three conditions were implemented. Condition 1 provided no feedback to the subjects. In Condition 2, after each of the first 9 games, subjects received the optimal solutions (as provided by the manufacturer) and replayed each game following the solution. Condition 3 is similar in structure to Condition 2, except subjects watched a video of the most skilled player from Condition 1 solving the games with his eye movements overlaid on the video. Each condition was tested on five subjects.
Overall, the design of the experiment included 15 adult subjects: 7 men and 8 women. All subjects had normal vision capabilities at arm's length; one subject wore glasses and one other subject wore contact lenses. No subjects were colorblind, which eliminated any possible influence of the brightly colored vehicles on decision making. No subjects had played Rush Hour before. At the end of each session, subjects filled out a questionnaire regarding familiarity with similar games and their experience with the new game they had just learned to play.
The solving time for each subject to complete each game was computed and compared by game and condition with the other subjects. The answers on the questionnaires each subject filled out after solving all ten games were compared with the numerical results.
In order to compare the games for difficulty level, the average solving times and standard deviations across all subjects were compared. The shortest solving time and lowest standard deviation were for Game 4, with values of 28.9 seconds and 8.0 seconds, respectively. Games 2, 3, and 10 had average solving times between 80 and 140 seconds and standard deviations above 90 seconds. The next lowest standard deviation was for Game 7, which was 45.3 seconds, coupled with a solving time just below 80 seconds.
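The per-game comparison above amounts to a mean and a standard deviation of solving times across subjects. A minimal sketch follows; the function name `summarize_games` and the input layout are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev

def summarize_games(times_by_game):
    """Map each game to (mean solving time, sample standard deviation).

    `times_by_game` is a hypothetical dict of game number -> list of
    solving times in seconds, one entry per subject (at least two).
    """
    return {game: (mean(times), stdev(times))
            for game, times in times_by_game.items()}

def rank_by_mean_time(summary):
    """List games from shortest to longest average solving time."""
    return sorted(summary, key=lambda game: summary[game][0])
```

Applying the same summary to the five subjects within each condition, rather than to all fifteen, gives the between-condition comparison.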
In order to compare between conditions, the average solving times and standard deviations for each game across the five subjects in each condition were compared. Again, Game 4 had the shortest solving time in each condition. However, it can be seen that the average solving times for Games 2, 3, and 10 decreased with feedback, especially with eye movement feedback (Figures 3, 4, and 5).
Figure 3: Solving Times of Rush Hour Games for Subjects in Condition 1
Figure 4: Solving Times of Rush Hour Games for Subjects in Condition 2
Figure 5: Solving Times of Rush Hour Games for Subjects in Condition 3
By comparing the standard deviations of each game for each condition, it can be seen that the variability between subjects on Games 2, 3, and 10 also decreased with feedback, and notably with eye movement feedback (Figure 6). While there is a clear trend toward lower solving times between conditions 1, 2, and 3, the reduction was not statistically significant (0.12, 0.19, and 0.09, respectively).
Figure 6: Comparing Conditions by Standard Deviation
Standard deviations were calculated between subjects within the same condition. Games 2, 3, and 10 show that Conditions 2 and 3 decreased variation between subjects.
Analysis of the videotapes revealed three distinct types of eye movements: exploring, planning, and guiding. Exploring occurs when subjects scan the game board looking for a route to move vehicles, and is termed the exploratory phase. Planning occurs when subjects systematically move their eyes through steps in a possible plan and is called the planning phase. Throughout the game guiding eye movements occur and guide one's hand to move a vehicle on the game board. Oftentimes, guiding eye movements take place when a subject looks at a vehicle, grips the vehicle, looks at the position where the vehicle is to be moved, and finally moves the vehicle. Guiding eye movements are most noticeable during the carrying out of a decision, called the execution phase, and are also interjected in the exploratory phase as subjects move vehicles in an unplanned manner.
Analysis showed a few characteristics about both the exploratory and planning phases. While exploring, the subjects looked at various vehicles in an unplanned manner, often bouncing between vehicles that were closely related on the game board. The strategy to solve the game in this phase was trial-and-error, which was detected in both the eye-only, and eye-and-hand movements. The exploratory phase almost always occurred at the very beginning of a new game and when the subject became stuck in the game.
The planning phase was much more systematic than the exploratory phase. Once the subjects found a vehicle that might advance them in the game, their eyes would trace through a possible pathway (Figure 7). It is in this phase that decisions were made. If the subjects approved the plan, execution began. If the subjects found the plan to be unhelpful, they would either revert to an exploration or trace a different pathway.
Figure 7: A typical progression of eye movements that exhibit mental planning. The circle indicates the location of the subject's gaze and the arrow represents the direction the subject's eyes moved.
When beginning a planning phase, a subject looked at the red car and then traced through the cars that were blocking the pathway to move that car.
After approving that pathway, the subject moved those cars to new positions and found the new problem.
The subject again traced through the cars, moved the vehicles, and found a new problem.
At this point, it is easy to see how to solve the rest of the game. The yellow and blue trucks were moved downward to allow the red car to pass through the exit off the right side of the game board.
However, in the original configuration (reference image), a skilled player would have looked ahead and moved the light blue car to the left three spaces, then the yellow truck downward, and finally the first group of moves used in this example. That would have provided the optimal way to solve the game, which was also listed as the solution by the manufacturer.
After each session, subjects filled out a questionnaire. This aided in the assessment of what parts of learning the subjects were aware of versus what the eye movements revealed.
It is normal for subjects not to be aware of their eye movements while completing a task, including how often they move their eyes (saccades per second) and how long they look at certain objects (fixation duration). These are the low-level processes. All subjects were able to supply a response when asked how they solved the games, a high-level idea. However, the responses varied, and many of them conflicted with the eye movement record. For example, the most skilled subject from Condition 1 responded that the game is not strategy based and felt that there would be no improvement if the experiment were repeated. Yet the eye movement record showed that this player used the planning phase effectively, tracing through the game with his eyes and making several decisions.
On the contrary, most skilled subjects were aware that they moved vehicles "in their heads" to test out their options. As with all the subjects, the eye movement record showed where the skilled subjects looked while they were testing their pathways by tracing through the game "in their heads."
Overall, there was little consistency regarding whether the subjects felt the games increased with complexity or difficulty as they progressed from Game 1 through 10. The following comments were made once or several times on the questionnaires:
The subjects were also asked if they thought they could perform better if they played the games again. The only two people who said no were from Condition 1, both of whom were skilled subjects. Similarly, six of the fifteen subjects stated that they had played similar games, most notably puzzles to slide tiles to form an image. Of those six only one was not a slower solver of the game. Most of the other subjects solved the games more quickly than the subjects without exposure to similar games.
The numerical results showed that with feedback, especially eye movement feedback, the standard deviation between the subjects on harder games (Games 2, 3, and 10) decreased. It is important to note that the skilled players were not affected by feedback as much, because they had very little room to improve. The reduction in variability therefore suggests that it was mainly the unskilled players who improved with feedback.
The various phases of eye movements were listed as exploring, planning, and guiding. The usage of each phase changes with skill level. For example, a novice may spend most of his time exploring and using trial-and-error methods. A good player may perform more productive explorations, gain more useful information, test plans, and then execute decisions by using guiding eye movements. As one subject noted, the solution to a game is sometimes recognized immediately. An expert player might do that more quickly than others and then go directly into guiding because he does not need to explore or test any plans. However, a novice or good player might use guiding eye movements frequently to test plans physically on the game board instead of mentally by looking at the game board.
Ultimately, it is very important to understand the visuomotor task in question. By understanding the goals of the game, the various bits of knowledge one must gain to be a very skilled player, and how those bits of knowledge might be indicated by eye movement patterns, a person can be prepared to understand the eye movement record and how to use it for feedback purposes. In Rush Hour, there are a few items that subjects might learn, some of which require subjects to play very many games. Initially, the concept of the game is to slide the red car off the game board. It is easy to realize that it is necessary to move the red car to help clear the pathway and that vertical trucks must be in the lower portion of the game board to allow the red car to pass. It may be more difficult to realize that if there are two cars in the same column, they must separate to allow the red car to pass. What may be even more difficult is to realize that there are certain key vehicles that repeatedly move in a "domino effect" to not only allow the red car to pass, but to allow other vehicles to pass. Those vehicles might clear the way for even more vehicles to move that relate to a vehicle blocking the red car.
The hypothesis that eye movements are a reliable metric for learning and a useful tool to increase the speed of learning is supported by this experiment. As with most anything, learning requires practice. Eye movement information can help decrease the practice time and help to evaluate a learner, provided that the task requires the use of the human visual system (one's eyes).
The eye movements can indicate where a subject is looking, but they cannot determine what information is acquired. When looking at an object, a person can take in several bits of information including color, size, shape, location, orientation, etc. By understanding the task at hand, one can make a strong assumption as to what information is acquired, which is the relative location in this experiment.
Eye movement records take skill to understand, analyze, and interpret. Simply showing the records to a learner who is unskilled in "reading" them may not be as beneficial as using a different medium to relay eye movement information. However, since eye movements are task-dependent, each type of task may require different forms of eye movement feedback. Krupinski told radiologists where they looked the most, but that is not adequate for an unskilled person who probably isn't looking at the correct locations. For instance, there were numerous incidents when both unskilled and skilled subjects were not looking at the key vehicles to solve the games. Questions also arise concerning what information to relay and how to do so. The learner should be able to easily understand the feedback and how to take advantage of it; thus the eye movement information should be transformed to something with which the learner is familiar.
An issue with the experimental design is the structure of learning. Fifteen subjects were instructed to learn how to play Rush Hour by playing the first ten games. That is not realistic. To become an expert at the game several games must be played at escalating difficulty levels. The model used did not show a sufficient learning curve. To correct for that, one might adjust the experiment so that the same subjects return periodically to play the games several times. Obstacles to that may include finding enough reliable subjects to return at specific times, deciding how much of the learning process to eye track considering the lengths of time one might need to play the game to excel from a novice to an advanced player, and standardizing the number of games each subject plays at each session, as well as the length of the sessions. Overall, there are a great number of variables to contend with, and as few variables as possible should be tested at a time to understand how eye movements correlate with visuomotor skill.
However, several points must be considered. Foremost, since the values of deviation between the Conditions for Games 2, 3, and 10 are approaching significance, more than five subjects per condition may be needed to confirm these results. Poisson analysis should be implemented, because the solving times are never negative and the solving times lean closer toward zero as subjects learn skills. Similarly, it is necessary to perform future experiments to differentiate between a traditional video of someone solving the games and a video with eye movements overlaid on it. By implementing high-level analysis, patterns of eye movements were identified and correlated with cognitive processes. The eye movement patterns include exploring, planning, and guiding eye movements. Low-level analysis needs to be completed to find if these phases can be characterized by fixation duration, fixation density, and fixations per second to create a better metric than just the patterns themselves. In effect, the aim is to find a way to characterize the metric. Also, it is important to find more effective ways to relay eye movement data into information the learner can use.
Eye movement patterns can identify how people use their eyes to acquire information and execute decisions. For the appropriate learning situation, these patterns of eye movements can easily indicate one's skill level at a particular task, provided the task itself is well understood by those evaluating the person. That evaluation can then be used to prepare feedback for the learner, which may be more effective to help with the methods of how to solve the problem. The applications are endless and may not necessarily be limited to visuomotor activities.