Correlation Between Image Reproduction Preferences
and Viewing Patterns Measured with a Head
Mounted Eye Tracker
 
Lisa A. Markel
 

Introduction

        This experiment was an attempt to determine how people look at images when presented with an original and its reproductions.  The main emphasis was on what was looked at in the scenes and what types of patterns people used to look at the images.  It is already known what people look at when given a single image.  This test determined what was looked at when given an original and two reproductions of that original.  The scene types used in the experiment were restricted to landscapes, people in nature, and portraits.  People were asked to judge images based on two questions: "Which image do you like better, A or B?" and "Which image looks more like the original, A or B?"  The fixation patterns and other statistics were collected for 5 subjects, for both questions, over 34 scenes.



Background

        Past experimentation has provided information on eye movements.  Several categories of eye movements have been identified, including drifts, tremors, microsaccades, and saccades.  Fixations are not eye movements, but are often the result of saccades.  Drifts are irregular movements of low velocity and low amplitude that keep the image of the object of interest on the fovea.  Acting together with the drifts are tremors, which are rapid, jittery motions.  Microsaccades come about when a fixation exceeds 0.3 to 0.5 seconds or when a drift moves the image of an object too far from the fovea.  They shift the retinal image over an area larger than the fovea, preventing fading of the image.(1)  These three movements are not under conscious control.  The most frequent eye movement is the saccade.  Cumming stated that we make billions of saccades in a typical lifetime.(2)  Saccades are very sharp movements that move the eye from one point in a scene to another.  They are used to scan the visual world, bringing different areas to the fovea.(3)  Saccades are very fast movements that require 100-300 msec to plan and 30-100 msec to execute.(4)  These movements are under voluntary control but sometimes overshoot their target and result in corrective saccades.(5,6)  During a saccade no stimulus is perceived by the visual system.(7,8)  Fixations are periods of relatively stable eye position used to gather information.  During a fixation the observer is not only encoding information about the visual stimulus but also programming the subsequent saccade.(3)  The duration of the fixation will vary depending on the task the observer is performing.(9,10)  Just and Carpenter reported that fixations range from 0.7 to 1.2 seconds for cognitive tasks.(11)  Yarbus reported that when perceiving stationary objects, the eye is either fixating or making a saccade to a new fixation.(12)
 
        There have been several studies on eye movements and pictures.  Wolf stated that eye movements present an unusual opportunity for finding out the reaction of viewers to a visual stimulus.(13)  He stated that eye movements give information about where the subject is looking, how long he looks at a particular area, how often he looks at a particular object, and the type of eye movements he makes.
 
        Wolf believed that the more complex the stimuli in a scene, the more fixations the viewer would make.  When the stimulus became extremely complex, the observer would either fixate centrally on it or ignore it.  Mackworth and Morandi (14) also demonstrated that people fixated on areas of high-frequency data, which was consistent with Zusne and Michels' experiment on non-representational shapes.(15)  They found that for complex stimuli, the fixations were clustered on the outlines and the intricate portions.  Contours are also thought to be very informative.  Gould performed tests showing that contours carry information for the viewer and result in increased fixations.(16)

        Nesbit reported two basic factors that influence eye movements: observer intelligence and the nature of the stimulus.(16)  Yarbus thought that when people were asked to look at pictures they would concentrate on the areas that would give them the most information.(12)  Pictures containing humans had dominant eye fixations in the areas of the hands and face (eyes, nose, and lips), these being the most important features of the face.  Yarbus also reported that people did not fixate on light or dark areas in pictures unless they contained information.  Contradicting these results, Hughes and Cole found that many fixations fell in the sky areas of the scenes.(17)
 
        Though different people have examined the concentration of fixations in images, there is also interest in the fixation or scanning patterns used when viewing images.  Yarbus believed that the perception of pictures is composed of a series of cycles or patterns.(12)  Buswell found two types of eye movement patterns.  The first is a survey of the picture, where the eye moves quickly with short pauses over the entire picture.  The other is a set of long fixations meant to examine the image.(18)  Antes confirmed this in his testing, finding a strong relationship between the number of fixations in an area of the image and the apparent amount of information in that area.  Subjects would make quick assessments of the picture, then focus back on the detail in the image or the portions that would provide them with information.
 
        Picture content is not the only factor that influences viewing patterns.  The instructions given to an observer will also affect the pattern of fixations.  Buswell found the instructions very important when asking people to look at pictures.  He asked people to look at a picture of the Tribune Tower in Chicago.  He asked some to look at the picture normally, and he asked others to look in the windows of the building for people looking out.  He found that people made longer fixations in the window areas of the picture when asked to look for people.(18)  Yarbus agreed that instructions influence the fixation patterns when viewing pictures.(12)  Depending on the task the subject is asked to do, the eye fixation patterns will vary.
 
        In the past, several eye trackers have been used to study the movements of the eyes.  Frietman experimented with the EYE-SISTANT, a portable system that detected vertical and horizontal eyeball movements.  Both eyes were simultaneously lit with energy from two Infra Red Light Emitting Diodes (IRLEDs).  Horizontal movements were detected with one pair of silicon Photo Transistors (PTRs), which measured the difference in reflectivity at the iris-to-sclera boundary.  Both the IR sources and the PTRs were mounted in front of the eyes but out of the field of view.  The vertical eyeball movements were sensed with two pairs of PTRs that detected differences in reflectivity at the pupil-to-iris boundaries.  The problem with this detection system is the ease of misalignment of the IRLEDs relative to the PTRs.(19)  This eye tracker is not suitable for this application because it primarily tracks eye movements; it does not reference a subject's line of gaze to the scene.

        Another eye movement recording system used in the past is the Honeywell Oculometer.  Its function is based on the pupil and corneal reflection method.  The eye is illuminated by a single light source reflected from a mirror into the eye.  Some of the light is reflected off the back of the retina through the pupil, and some is reflected off the cornea.  The reflected radiation is collected with an infrared television camera, which provides an enlarged image of the eye with a bright pupil and a brighter, small image of the corneal reflection.  As the eye rotates around its center, the position of the corneal reflection moves differently with respect to the pupil due to the different radii of the two.  Shifts of the corneal reflection with respect to the pupil correspond to shifts in the direction of the eye.  The computer determines the line of sight with respect to the scene and generates x and y coordinates every 20 msec.  Though this apparatus sounds similar to the one used here, subjects must have their head immobilized by placing their chin in a chin rest and their forehead in a headband,(20) which is not conducive to long-term investigations.
 
        Another eye tracker, the SRI Dual Purkinje eye tracker, has been used to monitor eye movements.  This tracker utilizes the first and fourth Purkinje images, which are reflections off the surfaces of the eye.  It is neither mobile nor easy to use, and the subjects are immobilized through the use of a bite bar.
 
        The Applied Science Laboratory Model 5000 eye tracking system is ideal for this experiment due to its ease of use, noninvasive operation, and suitability for human subjects.  The headband is like an insert in a hard hat.  It fits snugly around the subject's entire head and allows normal head motion.  The entire system can be made mobile so that experimentation can be done outside the laboratory.
 
        Eye movements are the main topic of study in this experiment.  They were recorded while subjects viewed pictures.  Prior studies were conducted primarily to gain information about the eye movements themselves.  This experiment is more a collection of eye movements that can be analyzed and used in further work on color reproduction.  The instructions given in this project were used to determine the effect of instructions on eye movements.  All of the background information was taken into consideration when designing the methodology of the experiments.
 



Theory

        The experimental sessions needed to carry out this project were performed in the Visual Perception Laboratory at the Center for Imaging Science, Rochester Institute of Technology.  All of the equipment necessary for the project was assembled in the laboratory.  The equipment necessary included an eye tracking system, a tape recorder and controller, a printer for the images being used, computer systems (both Macintosh and PC), software, and a viewing booth.  Subjects were asked to meet in the laboratory for testing.
 
        The most important resource for this project was the Applied Science Laboratory Series 5000 Eye Tracking System, which includes the Model 501 head mounted eye tracker.  This eye tracker is designed to measure a person's line of gaze with respect to his/her head.  The hardware for the eye tracker is mounted on an adjustable head band and consists of an eye illuminator, optics, an eye camera, and a scene camera.
 

Figure 1:  ASL Model 501 Eye Tracker

The eye is illuminated with a beam of light from a near infrared source.  The optical system focuses an image of the eye onto a solid state video sensor (the eye camera).  The illuminator is positioned such that the illumination is reflected off a visor in front of the eyes.  The visor is coated with a material that is reflective in the near infrared and transmissive in the visible region of the spectrum.  The visor is angled in front of the eye such that the illumination is directed into the left eye.  The infrared light is reflected off the first surface of the eye (the corneal reflection) and off the back of the eye (the pupil image).  These images from the eye are reflected back toward the illuminator to the eye camera.  The eye camera and the illuminator are in the same imaging path but are separated with a beam splitter, as shown in Figure 2 below.
 

Figure 2:  Cameras and Eye Illuminator
 
 
 
 
 
        A second solid state camera, the scene camera, is mounted on the head band near the left ear.  It is positioned as close to the eye as possible via a boom mechanism that allows alignment.  The camera is aligned to minimize parallax between the eye and the camera, parallax being the difference between where the eye is looking and where the camera is "looking".  This mounting is not the recommended method.  The recommended method entails aligning the camera beneath the visor such that it captures the reflection of the scene off the front of the visor, but that method is not conducive to analyzing data.  The calibration was set up to minimize parallax errors.  Each of the cameras is connected to a small controller unit and from there to a Model 5000 Control Unit.  The Control Unit processes the eye camera signal and extracts the pupil and corneal reflection, then computes the diameter of the pupil and the line of gaze.  Two 13" SONY Trinitron monitors are connected to the Control Unit.  One displays the eye and includes information on the system's ability to locate the pupil and corneal reflection.  The other displays the scene and a cross hair showing where in the scene the subject is looking.  This cross hair is the eye line of gaze with respect to the head.
        Calibration, and most of the other interactions with the operator, take place through the interface PC.  In the laboratory, the interface PC is a 200 MHz Pentium Dell PC with the appropriate interface software supplied by Applied Science Laboratory.
 
Figure 3:  Computer Interface and Output Monitors

        Connected to the scene monitor is a SONY EVO-9650 Hi-8 deck tape recorder.  The tape recorder is an integral part of the experiment.  It recorded the scene image with the superimposed gaze position (the cross hairs) as a function of time.  The recorder is controlled by a SONY RM-9650 Control Unit.  The controller allows for more ease and accuracy in data analysis: it allows the recordings to be analyzed frame by frame and displays the time for each frame.  SONY P6-120HMPX tapes were used for the recordings.  These tapes are designed for multiple advances and rewinds during analysis.  Figure 4 below outlines all of the components and their configuration.  The figure was copied from the Eye Tracker Manual (Model 501).
 

Figure 4:  Components of the Eye Tracking System
 
 
        For the experiment, the Eye Camera Electronics unit and the Scene Camera Electronics unit were mounted to the back of the chair in which the subject was seated.  Because the connections from the head band to the electronics units are short, mounting the units on the chair made the subject, together with the chair, more mobile.  Otherwise the electronics units are placed on the table top and the subject is forced to stay near the table.

        In the experiment, a calibration routine must be conducted to relate the actual eye line of gaze to the system output; that is, the system must correctly identify where the subject is looking.  To accomplish this a nine point target was used.  Typically the nine points are located on a board at a fixed distance from the subject.  The nine points are entered into the computer's memory as locations in a plane in space.  The subject is asked to keep his or her head in a fixed location and look at each of the nine points when specified.  There is a problem with this method: if the subject moves his/her head, the calibration is not as precise as it could be.  To relieve the experiment of this problem, a laser was affixed to the headband.  Over the front of the laser were two perpendicularly oriented diffraction gratings.  The spacing of the gratings produced a two-dimensional diffraction pattern, with the center spot being the brightest and the orders getting dimmer as the distance from the center increased.  The first and second orders showed up best, resulting in a nine point display.  The others were not bright enough to see clearly.
 

Figure 5:  Calibration Target
The spacing of the diffraction pattern was sized to encompass the area of the light box in which the images were located, at the distance the subjects sat from the light box.  Because the laser was attached to the subject's head, the problem with head movements during calibration was also eliminated.  If the head moved, the pattern also moved, so the pattern was always in the same position with respect to the scene camera.  This allowed for a more accurate calibration.
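As a rough illustration of how a crossed-grating pattern scales with viewing distance, the sketch below applies the grating equation in each direction and projects the spots onto a flat target.  The wavelength, grating pitch, and distance are assumed values chosen only for the example; none of them are taken from this experiment.  The central spot and its nearest neighbors form a 3 x 3 grid.

```python
import math

def grid_points(wavelength_m, pitch_m, distance_m, max_order=1):
    """Spot positions (in meters) on a flat target facing the laser.

    Each grating deflects order m so that sin(theta) = m * wavelength / pitch.
    For two crossed gratings the direction cosines in x and y are set
    independently, and the spot lands at distance * (cosine / forward cosine).
    """
    points = {}
    for m in range(-max_order, max_order + 1):
        for n in range(-max_order, max_order + 1):
            a = m * wavelength_m / pitch_m   # direction cosine along x
            b = n * wavelength_m / pitch_m   # direction cosine along y
            forward_sq = 1.0 - a * a - b * b
            if forward_sq <= 0.0:
                continue                     # this order does not propagate
            forward = math.sqrt(forward_sq)
            points[(m, n)] = (distance_m * a / forward, distance_m * b / forward)
    return points

# Assumed values for illustration only: 650 nm laser, 10 micron pitch, 1 m away.
pts = grid_points(650e-9, 10e-6, 1.0)
for order in sorted(pts):
    x, y = pts[order]
    print(order, round(x * 100, 2), "cm", round(y * 100, 2), "cm")
```

Because the spot positions scale linearly with distance, the same headband-mounted pattern covers a larger area the farther the subject sits from the light box.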
 



 
Methods

        Before the experimentation could begin, the Institutional Review Board for the Protection of Human Subjects in Research at RIT had to approve it.  To get approval, a Request for Board Review and Approval was submitted along with the research proposal.  The request is located in Appendix A of this report.  The issue of most concern was the safety of the infrared illuminator on the eye.  The maximum setting on the illuminator is 1/10 the level of infrared radiation from normal daylight.  The illuminator level was set at the lowest setting, 1, which was closer to 1/100 the level of daylight.  There were also some comments from the review board that forced a couple of word changes in the Participant Informed Consent that each subject read and signed prior to testing.  The sheet is located in Appendix B along with a general questionnaire.  Once the board approved the proposed research, data collection began.
 


Participants

Seven subjects participated in the experiment, six male and one female.  Ages ranged from 18 to 40 years old.  The subjects are outlined below:
 
        Subject          Gender          Age

        Subject 1        Male            28
        Subject 2        Male            18
        Subject 3        Female          23
        Subject 4        Male            21
        Subject 5        Male            29
        Subject 6        Male            19
        Subject 8        Male            40
 
Subject 7 was eliminated from the experiment because the eye tracker could not be calibrated to his eyes.  The cause was attributed to dirty contact lenses.  Initially, 15 subjects were slated for testing, with an even mix of men and women of various ages.  The number was cut significantly due to time constraints on the project.  Only the first five subjects are used in this writeup due to lack of time.

All of the subjects that were used in the experiment had normal color vision.  None of the subjects used wore contacts or glasses.  Each was required to fill out an informed consent form and a questionnaire.  These two forms can be found in Appendix B.  The questionnaire provided information on the subject's residence, health, and experience with taking and evaluating pictures.  The group of subjects was split between novice image evaluators and more experienced viewers.


Stimuli

        Stimuli generated for this experiment consisted of thirty four 16" x 20" gray boards, each containing three 5" x 7" hard copy prints.  The three prints on each board were an original, reproduction A, and reproduction B.  The images selected for the experiment consisted of 12 images in each of two categories (landscape and people in nature) and 10 images in the portrait category.  Initially there were 35 total scenes; one was eliminated due to ownership of the picture.
 
        All of the originals originated on color negative film.  The original films were scanned at a resolution of 2k x 3k pixels on a KODAK Photo CD Film Scanner 2000.  During scanning, the film type and resolution fields were set appropriately.  The images were scanned to Photo CD and imported into Adobe Photoshop as YCC.  From the YCC color space the images were converted to the color space of the KODAK DS 8650 Thermal Printer using color management profiles that take the image data from YCC to the color management Lab space and then to the thermal printer output, which is RGB.  For more information on KODAK Color Management ICC Profiles, consult the Kodak web site at www.kodak.com.

    For the experiment, the original image for each of the 34 scenes was printed.  The print was adjusted in Adobe Photoshop until the best print was achieved for the given picture and printer.  This was a subjective decision based solely on the preference of the investigator.  This optimization was applied in the output color space.  The idea was that the profiles would provide a good quality print without user tweaks.  This was generally the case; the profiles alone were good enough to generate the original without adjustments.  Once the original was printed, the two reproductions for the scene were generated.  The digital file for the original was the starting point for the two reproductions.  In Photoshop, the edits were either global or selective, the latter meaning that only a selected part of the image was altered.  The actual edits for both reproductions for each of the 34 scenes can be found in Appendix C.  The edits were controlled such that they produced just noticeable changes in the image.  They were changes such as contrast adjustments, color balance adjustments, curve adjustments, selective color adjustments...
 
        Once the images were generated, they were mounted to 16" x 20" 18% gray boards.  These boards contained one original and two reproductions of that original.  The outline is below:
 

Figure 6:  Scene Layout

The boards were then placed in a light box simulating daylight illumination.


Procedure

    Setup

        To begin the actual experimentation, subjects were asked to come to the laboratory for an hour block.  During this hour, the eye tracker was fitted to the person's head and calibrated to his/her eyes, and data was collected as they looked at the scenes.

         Fitting the tracker to the subject's head required adjustment of the headband, eye camera, visor, scene camera, and computer software.

        The headband consists of a round strap that encircles the head at the brim and a strap that goes over the top of the head from ear to ear.  The top strap was necessary to set the correct height of the headband on the head.  The strap that went around the head was responsible for getting the eye tracker snug on the subject's head.  It was very important to adjust the headband correctly.  If the band was too loose, the eye tracker would move on the head, causing errors in tracking.  If the band was too tight, the subject suffered a headache early in the testing.

        The visor was then positioned so the subject could easily see through it.  The visor angle also dictated the view of the eye.  If the angle with the face was very steep, the eye was captured from below, as if it were being looked at from underneath.  The angle was adjusted until an optimal image of the eye appeared in the eye monitor.  The eye camera and illuminator housing was adjusted simultaneously with the visor.  The housing was adjusted left-right by a gear in a track that moved as a knob was rotated.  The entire housing rotated front-back to find the correct angle for the reflected eye image.

        The scene camera was positioned as close to the observer's eye as possible while still capturing the images they were looking at.  The subject was asked to look at the center of a target while the scene camera was adjusted to be centered on the same central location.  It was important to get the camera as close to the eye as possible to reduce parallax errors.  The aperture was also adjusted to get the best image quality for the images captured.

        Finally, the illuminator power was fixed at a setting of 1, and the corneal and pupil thresholds were adjusted.  The illuminator power simply sets the brightness of the illuminator.  One is the lowest setting, but it worked best for the conditions in which this testing was conducted.  The corneal and pupil thresholds were adjusted to optimize the detection of the corneal reflection and the pupil reflection.  To adjust the pupil threshold, the subject closed the eye and the threshold was increased until the closed eye reflected enough infrared light that the system detected it as a pupil.  The threshold was then backed down until no pupil was detected with the eye closed.  The corneal threshold was adjusted with the subject's eye open and facing straight ahead.  The threshold was decreased until the system no longer detected a corneal reflection, then increased until the corneal reflection was just detected.  Once both thresholds were set, the subject was asked to look at a variety of things to determine if there were any problems with the setup.  If there were problems, the appropriate adjustments were made.  When the entire system was optimized, the subject went through a calibration routine.

        The eye tracker must also be calibrated so that the cross hair location in the scene image matches where the subject is looking in the scene; this way the cross hairs displayed in the scene image are an accurate representation of the subject's line of gaze.  A 9 point calibration was used for this experiment.  The 9 points were generated by the laser attached to the headband, as described in the Theory section.  For the calibration, the subject was asked to sit comfortably and try not to move his/her head.  The first part of the calibration consists of entering the 9 points as references in space for the computer to use.  Once the computer has the 9 points entered, the subject should keep his or her position but can move his or her eyes.  The operator steps through the 9 points one by one, signaling the subject to look at each point and hitting return to enter the eye position into the software.  The calibration was then tested to see how accurate it was.  If the subject looks at the center number in the target and the eye tracker indicates that they are looking at something close by but not the correct position, this is an error.  The software allows the operator to remove this source of error by relocating the cross hairs on the scene image to the spot in the scene the subject is focusing on.  The calibration should be off by no more than 1 degree of visual angle.
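To give a feel for what a 1 degree tolerance means on the target, the following sketch converts between an offset on the target plane and the corresponding visual angle.  The function names and the example distances are hypothetical; the actual viewing distance used in this experiment is not assumed here.

```python
import math

def gaze_error_deg(offset, distance):
    """Angle in degrees between the true target and the reported gaze point.

    offset and distance must be in the same units (for example, centimeters),
    with offset measured on the target plane and distance from eye to target.
    """
    return math.degrees(math.atan2(offset, distance))

def max_offset(tolerance_deg, distance):
    """Largest offset on the target plane that stays within the tolerance."""
    return distance * math.tan(math.radians(tolerance_deg))

# Illustrative numbers only: at a 100 cm viewing distance, 1 degree is
# a little under 2 cm on the target plane.
print(round(max_offset(1.0, 100.0), 2))
print(round(gaze_error_deg(1.0, 57.3), 2))
```

A check like this could be applied to the offset between the center of the calibration target and the displayed cross hairs when deciding whether to accept a calibration.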
 
    Data Collection

        Once the eye tracking gear was adjusted and the subject was calibrated to the equipment and software, the experimentation was conducted.  The subject sat comfortably in front of a light box simulating daylight, positioned so that the images could be seen and captured with the scene camera.  The 35 scenes were set up in the light box.  The order of the scenes was randomized so that the scenes of one category (landscape, portrait, or people in nature) were not all together.  To randomize the scenes, numbers from one to thirty five were put into a bag.  The slips were pulled out one at a time, and this was the order of the scenes.  For each scene, the subject was asked two questions: "Which picture, A or B, do you prefer?" and "Which picture, A or B, more closely matches the original?"  The subject was forced to answer A or B, and his/her response was recorded.  Appendix D outlines the order of the images and the order of the questions asked.  For the earlier subjects, one question was asked and then the second for each image.  The last two subjects were instead asked the first question for every image; the images were then put back in order and the second question was asked.  This way the subject was not influenced on the second question by the first.  Below is a schematic of the setup:
 

Figure 7:  Setup Schematic
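The slip-drawing procedure used to order the scenes is equivalent to a random shuffle.  A minimal sketch is shown below; the function name and the optional seed are illustrative additions and were not part of the original procedure, which was done by hand.

```python
import random

def draw_scene_order(n_scenes, seed=None):
    """Return scene numbers 1..n_scenes in a random presentation order.

    Mimics pulling numbered slips out of a bag one at a time.
    The seed is optional and only makes an order reproducible.
    """
    rng = random.Random(seed)
    slips = list(range(1, n_scenes + 1))
    rng.shuffle(slips)
    return slips

order = draw_scene_order(35, seed=1)
print(order)
```

Note that a plain shuffle only makes long runs of a single category unlikely; it does not guarantee that scenes of the same category never appear next to each other.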
 
 
 

        The eye tracking system collected data on the pupil and corneal reflections and displayed a cursor in the scene image at the position of the subject's line of gaze, that is, their fixation points.  The fixations in the scene were recorded using a SONY Hi-8 deck recorder.  The taped recording of each session was the main source of information in this study.  These tapes were analyzed frame by frame to determine the fixations.  For each subject, 35 sheets, each containing one scene with the three images placed in proportion to the 16" x 20" boards, were constructed and printed.  These sheets served as templates to capture the fixations on paper.  The tapes were viewed frame by frame, and each fixation was given a number in sequential order and written on the paper template.  Below is an example of how the data was collected:

 
 Figure 8:  Data Collection
 
This type of data collection was performed on all of the images viewed by all the subjects for each question asked.   Once the numbers were on the paper, they were connected to look at viewing patterns and they were entered into a spreadsheet to gain some statistics.

        To connect the points, Microsoft PowerPoint was used.  The templates that were used to collect the numbers were put together in PowerPoint.  These templates served as the images representing the scenes viewed.  The fixation patterns were overlaid by drawing arrows in the order and at the locations of the numbers.  By plotting out the fixations, fixation patterns could be seen relatively easily.  Below is an example of the way the fixation patterns were drawn:
 

Figure 9:  Fixation Pattern
 
 
Not only were the fixations plotted, but they were also entered in a spreadsheet.  The duration time of the fixations was also stored in the spreadsheet and used to calculate fixations/second.  The number of fixations in each image was found.  Run length was also determined using the spreadsheet.  Run length refers to the number of sequential fixations in a single image.  The number of corresponding points was also recorded.  The corresponding points referred to any time a subject looked at one point in an image and then to the same point in another image on the same board.  An example of the spreadsheet can be found in Appendix E.
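The spreadsheet quantities described above can be sketched in a few lines of code.  In this sketch each fixation is written as an (image, point) pair, where the image is "O", "A", or "B" and the point label (such as "face") is a hypothetical stand-in for the manual judgment of whether two fixations fell on the same point.  The example sequence and the viewing time are invented for illustration and are not data from the experiment.

```python
from collections import Counter, defaultdict
from itertools import groupby

def summarize_fixations(fixations, viewing_time_s):
    """Summarize one scene viewed under one question.

    fixations: ordered list of (image, point) pairs.
    viewing_time_s: time from first fixation to the answer, in seconds.
    """
    images = [image for image, _ in fixations]
    counts = Counter(images)

    # Run length: number of sequential fixations within a single image.
    runs = defaultdict(list)
    for image, group in groupby(images):
        runs[image].append(sum(1 for _ in group))
    mean_run_length = {image: sum(r) / len(r) for image, r in runs.items()}

    # Corresponding points: consecutive fixations on the same point
    # in two different images on the same board.
    corresponding = sum(
        1
        for (img_a, pt_a), (img_b, pt_b) in zip(fixations, fixations[1:])
        if img_a != img_b and pt_a == pt_b
    )

    return {
        "fixations_per_image": dict(counts),
        "fixations_per_second": len(fixations) / viewing_time_s,
        "mean_run_length": mean_run_length,
        "corresponding_points": corresponding,
    }

# Invented example sequence for illustration only.
example = [("O", "face"), ("O", "hand"), ("A", "hand"), ("A", "sky"),
           ("B", "sky"), ("B", "face"), ("B", "tree"), ("O", "tree")]
summary = summarize_fixations(example, viewing_time_s=4.0)
print(summary)
```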
 

Results

        The results for this experiment were collected from the spreadsheets of each scene viewed and from the fixation patterns.  Data for only 5 of the subjects was extracted from the tapes; time did not allow for all subjects' data to be included.  The results were compiled into a final spreadsheet as averages for each subject in each of the scene categories for all of the parameters mentioned above.  The spreadsheet of the averaged data can be found in Appendix F.  It contains the values for the time spent viewing the scene before the subject answered the question, the number of fixations, fixations per second, corresponding fixations, and average sequential fixations in each of the images (original, A, and B), averaged for each scene type and each question.  In other words, the results in this spreadsheet are the averages of the parameters over all of the scenes in each category, by subject.  The spreadsheet allows plots to be made to evaluate each subject against the others and to generate an overall average for all subjects, to determine whether the scenes are evaluated differently depending on the question asked.
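The averaging step can be sketched as a simple grouping of per-scene rows by subject, scene type, and question.  The field names and the numbers below are hypothetical and serve only to show the grouping; the actual work was done in a spreadsheet.

```python
from collections import defaultdict

def average_by_group(records, value_key):
    """Average value_key over scenes for each (subject, scene_type, question)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        key = (rec["subject"], rec["scene_type"], rec["question"])
        sums[key] += rec[value_key]
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

# Hypothetical per-scene rows; the values are invented for illustration only.
records = [
    {"subject": 1, "scene_type": "landscape", "question": "prefer", "viewing_time_s": 4.0},
    {"subject": 1, "scene_type": "landscape", "question": "prefer", "viewing_time_s": 6.0},
    {"subject": 1, "scene_type": "landscape", "question": "match", "viewing_time_s": 9.0},
    {"subject": 2, "scene_type": "portrait", "question": "match", "viewing_time_s": 7.0},
]
averages = average_by_group(records, "viewing_time_s")
for key in sorted(averages):
    print(key, averages[key])
```

The same function can be reused for the other parameters (number of fixations, fixations per second, and so on) by changing the value key.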

 Spreadsheet Data

        To begin, the average viewing time for each subject for each scene type (landscape, people in nature, and portraits) and each question was plotted.  The viewing time refers to the amount of time that began with the first fixation and ended when the subject gave an answer.  The time was easy to determine because the subject was asked to look at the center of the scene until the question was asked and to look back at the center when they gave an answer.  This made the tapes much easier to analyze.  The plot below outlines the average time each of the subjects looked at the scene types for a given question.  Each subject has six bars associated with them.  The first two are for the landscape scenes, the second two are for the people in nature scenes, and the last two are for the portraits.  The first bar of each pair represents the time it took the subject to answer the question "Which image do you like better, A or B?".  The second represents the time it took the subject to answer the question "Which image looks more like the original, A or B?".  This is the same format for all the following plots.  The plot representing the average time is located below:
 

Plot 1:  Average Viewing Time by Subject for each Scene Type and Each Question
 
 
From the plot, it can be seen that all of the subjects took more time to answer the question pertaining to the original.  Most subjects were able to answer either question in less than 10 seconds.  Subjects 3 and 4 had much longer times when trying to match the original (green and white bars).  Subject 3 was the only female in the test, but it is not certain that the lengthened time is a result of gender.  From this plot, an average was taken over all the subjects.  The plot located below shows similar results.  Viewing the images and determining which one is preferred, or liked better, takes much less time than trying to match the original.  The average times for answering the 'like better' question are similar for all three scene types.  This is not the case when the subjects had to pick an image to match the original.  It appears that it takes the most time to pick the closest match for landscapes, then portraits, and the least amount of time for people in nature.  The averages may have been skewed by the large values of subjects 3 and 4 when trying to choose a match to the original.  Below is the plot outlining this data:
 
Plot 2:  Overall Average Viewing Time
        The number of fixations made for each scene was averaged and plotted for each subject.  The plot is similar to Plot 1 in its format.  The plot shows that all subjects made fewer fixations when determining which of the two reproductions they liked better.  This tracks the viewing time: the longer the viewing time, the more fixations were made.  Subjects 3 and 4 again have a very high number of fixations, which is consistent with their having the longest viewing times.  Among all the subjects, there is little difference in the average number of fixations when determining which image they like better.  Most of the subjects made their determination in fewer than 20 fixations, except for subjects 3 and 4 when picking a match to the original.  Plot 3 portrays these findings.
 
 
Plot 3:  Average Number of Fixations by Subject for Each Scene and Question
The overall fixation average shows the same results as the overall time average.  The average number of fixations was much greater for the question pertaining to the original.  There was no real difference between scene types when the subjects were asked to choose which reproduction they liked better.  The results for the original match did vary with scene type.  The landscapes had a noticeably higher average number of fixations, followed by portraits, then people in nature.  These results coincide with the results for the average time: the same order occurred for the time to view the images when asked which image looked more like the original.  Because the time was longer, it makes sense that the number of fixations increased.  Below is the plot showing the overall average for number of fixations.
 
Plot 4:  Overall Average Number Fixations
 

From the time and number of fixations, the average number of fixations per second was also calculated and plotted.  Most of the subjects had rates of less than 3.5 fixations per second.  Subject 1 had very high fixation rates, higher than what would be realistic.  Because this was the first subject, the time was harder to record; some of the bugs in data collection still had to be worked out at that point.  The scene types were evenly split as to which of the two questions produced the greater fixation rate.  None of the scene types showed a distinct trend toward greater fixation rates for a particular question.  This result is better depicted in the overall average plot of the fixation rates.  First, the plot for every subject is shown.
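The rate calculation described above is a simple ratio, sketched here in Python.  The function names and the sample numbers are hypothetical.  It is also an assumption that the average rate is the mean of the per-scene rates; the text does not say whether that or total fixations over total time was used.

```python
def fixations_per_second(n_fixations, viewing_time_s):
    """Fixation rate for one scene: fixation count divided by viewing time."""
    if viewing_time_s <= 0:
        raise ValueError("viewing time must be positive")
    return n_fixations / viewing_time_s


def mean_fixation_rate(trials):
    """Average the per-scene rates over (n_fixations, viewing_time_s) pairs.

    Assumption: the average is the mean of per-scene rates; the source does
    not say whether it used this or total fixations over total time.
    """
    rates = [fixations_per_second(n, t) for n, t in trials]
    return sum(rates) / len(rates)


# Made-up example values, not data from the study.
trials = [(12, 4.0), (18, 6.0)]
rate = mean_fixation_rate(trials)          # both scenes run at 3.0 per second
inflated = fixations_per_second(12, 3.0)   # same count, time under-recorded by 1 s
```

The second call illustrates how an under-recorded viewing time alone raises the computed rate, which may be one way timing problems such as those noted for Subject 1 could produce unrealistically high values.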
 

 Plot 5:  Average Fixations per Second by Subject for Each Scene and Question
Again, the overall average of all the subjects was plotted.  The plot, Plot 6, does not show any real differences in the fixation rate for the different scene types or for the two questions asked.  These average rates all fell between 2.7 and 3.2 fixations per second.  There may have been a slight downward trend from landscapes toward portraits.  The rate for the 'like better' question seemed to be very slightly higher than the rate for the question pertaining to the original.  This difference may not be statistically significant; that is not known at this time.  The plot is as follows:
 
 
Plot 6:  Overall Average Number of Fixations per Second
 

        The next parameter studied is what will be referred to as the average number of correlations.  In this study, a correlation refers to a pair of consecutive fixations in which the first fixation is on a particular location in one of the images in a scene and the second fixation is on the same feature in another image in the scene.  For example, if a subject were to look first at the right eye of the baby in Reproduction A of Scene 29 and then shift their focus to the same eye in Reproduction B, this would be considered a correlation.  The number of correlations did vary from subject to subject, but it was consistently greater for all subjects and every scene type when they were asked to choose which reproduction 'matched the original'.  Subjects 3 and 4 again had very high numbers for the question that asked for the closest match to the original.  This was the same question that produced their high fixation counts and viewing times.  Given this, it is understandable why the numbers are greater: more time and more fixations allow a greater probability of getting correlations.  The plot below outlines this data.
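The definition of a correlation given above can be expressed as a short counting routine.  This Python sketch assumes each fixation has already been labeled by hand with the image it landed in and the feature it landed on; the labels, the function name count_correlations, and the example sequence are hypothetical.

```python
def count_correlations(fixations):
    """Count consecutive fixation pairs that land on the same feature
    in two different images of a scene.

    `fixations` is an ordered list of (image, feature) labels, for example
    ("A", "right eye").  The labels are assumed to have been assigned by
    hand during tape analysis.
    """
    return sum(
        1
        for (img_1, feat_1), (img_2, feat_2) in zip(fixations, fixations[1:])
        if img_1 != img_2 and feat_1 == feat_2
    )


# Made-up fixation sequence for one scene, not data from the study.
sequence = [
    ("A", "right eye"),
    ("B", "right eye"),         # A -> B, same feature: correlation
    ("original", "right eye"),  # B -> original, same feature: correlation
    ("original", "nose"),       # same image: not a correlation
    ("A", "mouth"),             # different feature: not a correlation
]
n_correlations = count_correlations(sequence)
```

For this example sequence the count is 2.  Because every added fixation creates one more consecutive pair that could qualify, longer viewing times give more opportunities for correlations.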
 

Plot 7:  Average Number of Correlations by Subject for Each Scene and Question
 
 

The overall average plot of the number of correlations per scene type and question portrays the same result: a greater number of correlations when asked to match the original.  The averages for this question are proportional to those of the number of fixations and viewing time.  It is thought that the averages of subjects 3 and 4 have a great impact on the overall outcome.  This plot is labeled Plot 8 and is located below.
 

Plot 8: Overall Average Number of Correlations
 
        Along with the correlations, the style in which people view the images can be determined from the number of sequential fixations in the same image of a particular scene.  More sequential fixations in a single image could mean less bouncing back and forth between images.  This can be thought of as the run length of the fixations in each image.  The average number of sequential fixations was determined for each image in each scene for all subjects and both questions.  The average run length in the original image was not a very good basis for comparing the two questions, because people are not likely to look at the original when they are asked which reproduction they 'like better', A or B.  The plot does show the extreme difference in the number of fixations in the original as a result of the question asked.  There is no real difference between scene types: for a given question, all of the scene types have about the same number of fixations in the original.  The plot below shows the results discussed.
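The run-length idea described above can be sketched in Python as follows.  The function name average_run_lengths and the example sequence are hypothetical.  The sketch also assumes that an image that is never fixated simply has no entry; the text does not say whether such an image was recorded as zero instead.

```python
from collections import defaultdict
from itertools import groupby


def average_run_lengths(image_sequence):
    """Average number of sequential fixations (run length) per image.

    `image_sequence` lists, in viewing order, the image each fixation fell
    in, for example "original", "A" or "B".  Assumption: an image that is
    never fixated gets no entry rather than an average of zero.
    """
    runs = defaultdict(list)
    for image, group in groupby(image_sequence):
        runs[image].append(sum(1 for _ in group))
    return {image: sum(lengths) / len(lengths) for image, lengths in runs.items()}


# Made-up sequence for one scene, not data from the study.
sequence = ["A", "A", "B", "B", "B", "A", "original", "original", "A"]
run_lengths = average_run_lengths(sequence)
```

In this example reproduction A has runs of 2, 1 and 1 fixations (an average of about 1.33), reproduction B has a single run of 3, and the original has a single run of 2.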
 
 Plot 9:  Average Sequential Fixations in Original  by Subject for Each Scene and Question
 
The overall average plot of the sequential fixations in the original image shows the difference between the two questions asked.  The 'like better' question has fewer fixations in the original, resulting in a very low number of sequential fixations.  There is no difference between scene types for either question.  The plot shows that approximately 2 fixations are made sequentially in the original for the question of finding the reproduction that 'matches the original'.  The plot is below.
 
 
             Plot 10:  Average Number of Sequential Fixations in the Original Image
        The average number of sequential fixations was also found for the reproduction images, A and B.  For A, the average run length was plotted, as before, for each subject for each scene type and each question.  The results for A were far less dramatic than they were for the original.  In most cases, the question of which reproduction was liked better produced more sequential fixations.  Subjects 1, 3 and 4 all showed a trend of the landscapes having the highest run lengths, the people in nature scenes being slightly lower, and the portraits having the lowest.  The plot below depicts these results.
 
 Plot 11:    Average Sequential Fixations in A by Subject for Each Scene and Question
The overall average for all the subjects shows a trend of all the scene types having a higher average run length when subjects were asked which image they liked better.  The values were all near 2 fixations, meaning that most subjects made about two sequential fixations at a time in reproduction A while viewing the images.  Below is the plot.
 
Plot 12:  Average Number of Sequential Fixations in Reproduction A
    The final plots for this analysis show the average number of sequential fixations in image B.  The results are very similar to those for image A.  Most subjects had more sequential fixations when asked which image they 'like better'.  There was a pattern for the scene type as well.  Most subjects had the highest average number of sequential fixations in B for the portraits, then the landscapes, and finally the people in nature scenes.  The plot with each subject's average values is below:
 
Plot 13:    Average Sequential Fixations in B by Subject for Each Scene and Question
The overall average for this data shows the same effect of the question as the individual subjects' averages.  The 'like better' question had a much higher average number of sequential fixations in image B than the 'match the original' question.  The results for the scene types were not the same as in reproduction A: overall, the landscapes seemed highest, then the portraits, and finally the people in nature scenes.  Plot 14 depicts this information.
 
 
Plot 14:  Average Number of Sequential Fixations in Reproduction B

 

  Fixation Data

        The data from the spreadsheet were not the only data collected.  Data were also collected on what was actually looked at in each scene.  As the tapes were analyzed, the fixations were numbered on a piece of paper that represented the scene being viewed (see Figure 8).  The numbered fixations were then connected sequentially to see if any viewing patterns were present.  Figure 9 is an example of these connections.  The analysis provided information on what people were looking at when they viewed the images and what type of pattern they used to look at the images.

    A table was generated with the scene number and what was viewed in the scene.  The objects in the scenes that were fixated on were the same regardless of the question.  The table below gives the image and what was looked at.

    Table 2:  What was looked at for Each scene
To see what is actually being looked at, the images can be found in Appendix C.

        The information shows that in landscapes, the main emphasis of the fixations is on whatever the main subject of the photo is: mountains, rocks, or waterfalls.  The secondary fixations are primarily on sky and foliage.  In the people in nature scenes, people are primarily concerned with skin and face.  Either they concentrate completely on the face and flesh and look at nothing else, or they look at the foliage and sky as a secondary reference.  In portrait scenes, people are mostly concerned with facial features: eyes, nose, and forehead.  They then look at the background information or clothes.

        The fixation patterns were also examined.  The order in which people view the images was an important part of this study.  By connecting the fixation points, the viewing patterns were found.  Below is an example of each viewing pattern.  One represents a landscape scene being viewed to answer the question of which reproduction is liked better, and the other the question of which reproduction looks more like the original.

Figure 10:  Fixation Patterns


 

As the arrows show, there is a distinct viewing pattern depending on the question asked.  When asked to pick which reproduction is 'liked better', most subjects look at the two reproductions, making a couple of fixations in one and then jumping to the other.  They bounce back and forth between the two reproductions until they make a decision.  The question about which 'matches the original' produces a different viewing pattern.  Here subjects make one or two fixations and then move to the original or the other reproduction, looking at the same point in the scene.  There are more correlation points in this viewing condition.  The pattern is more cyclical.  The subjects may start in reproduction A, look to reproduction B, then to the original.  In each cycle, subjects are typically looking at the same point in each of the images.  These results are the same regardless of the scene type.  The patterns are dominated by the question asked.

 

Discussion and Conclusion

        In this experiment it has been found that an eye tracker can be used to determine what people look at when they are presented with an original and two reproductions of that original and asked certain questions.  This examination has shown that people look at scene types in a certain way based on these questions.  The fixation patterns were dictated by the question asked.  When subjects were asked which image they 'liked better', the fixation patterns resembled a scanning mechanism.  Subjects would scan around one of the reproductions for 2-3 fixations, then jump to the other reproduction and look at it in a similar fashion.  This was distinctly different for the question of which one looked more like the original.  These fixation patterns were more cyclical, comparing the same point in each image, and they also contained more correlations.  This makes sense: if you are asked which you 'like better', there really is no matching mechanism going on in the brain; it is more of a preference.  When you ask a subject which one 'matches the original', there are definite comparisons going on in the viewing cycle on a point by point basis.

        This experiment also provided data on what is important in a scene when people make comparisons.  Skin and flesh are definitely the main concern when they are present in the image.  People will focus their attention on faces or exposed flesh before they shift attention to any other objects in a scene.  When the face fills the image, the eyes, nose and hair are the primary focusing points.  Landscapes are a little more subjective.  The main interest depends on what is in the image.  Typically, people will focus their attention on whatever the main subject in the image is, such as mountains, waterfalls, or trees.  This is consistent for both of the questions asked.  People are concerned with the same information in the scenes regardless of what is asked.
 
        This experiment is not complete.  There were 7 subjects tested but only 5 analyzed.  There are still two more subjects with data that can be included in this study.  To make the conclusions more reliable, more subjects should be tested.  An ANOVA analysis should also be performed on the data to provide information on what is statistically significant.  This may be performed in the next couple of months.
