Jeff B. Pelz, Ph.D., Professor (Ph.D., 1995)
3112 Carlson
In order to perceive the world around us, we must move our eyes almost constantly. We typically make ~2 to 4 eye movements every second; over 100,000 every day. These eye movements are necessary because of the design of the human eye. Unlike man-made image sensors such as CCDs or photographic film, the image sensor at the back of the eye (the retina) is highly anisotropic; the resolution varies by orders of magnitude across the field. High acuity is only available in a small area at the center of the retina, so the eyes are moved to 'point to' objects or regions in the scene that require high acuity.

Eye movements are also made toward task-relevant targets even when high spatial resolution is not required. These eye movements, made without conscious intervention, can reveal attentional mechanisms and provide a window into cognition; they are the focus of our research.
By examining the eye movements of subjects as they perform complex tasks, we
are able to take advantage of this window into cognition, helping us understand
how we gather information from the environment, how we store and recover the
information, and how we use that information in planning and guiding actions.
Recent work in the Visual Perception Laboratory has focused on using the RIT
Wearable Eyetracker to monitor complex, real-life tasks in natural
environments. Recent papers describe these experiments (follow the recent publications link below).
The design of the human eye reflects competing evolutionary demands for high visual acuity and a large field of view. There is simply not enough neural real estate available in the brain to support a visual system with high resolution over the required field of view. Even if we left no room in the cortex for any other senses (not to mention housekeeping functions like breathing or keeping the heart beating), the human cortex could not support the optimal size/resolution sensor. Some animals stay within the design limits by restricting their field of view (e.g., a hawk); others give up high resolution in favor of a larger field of view (e.g., a rabbit). Rather than picking one or the other solution, humans evolved the anisotropic retina, with very high spatial resolution in the center of the visual field (the fovea) surrounded by a much lower resolution region (the peripheral retina). In the human retina, the high-resolution fovea encompasses less than 0.1% of the visual field visible at any instant, and the effective resolution falls by an order of magnitude within a few degrees of the fovea. This variable-resolution retina reduces bandwidth sufficiently, but it is not an acceptable solution on its own. Unless the point of interest at any moment happened to fall in the exact center of the visual field, the stimulus would be relegated to the low-resolution periphery. The 'foveal compromise' was made feasible by the evolution of a complementary mechanism to move the eyes. To ensure useful vision, the eyes must be moved rapidly about the scene.
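A simple way to picture the falloff of resolution with eccentricity described above is with an inverse-linear acuity model of the kind often used in the vision literature. The short sketch below is only an illustration of the idea; the function name and the half-acuity constant e2 are assumptions for the example, not values or code used in the lab.

```python
def relative_acuity(eccentricity_deg, e2=2.0):
    """Approximate acuity relative to the fovea at a given retinal eccentricity.

    Uses an inverse-linear falloff, A(E) = e2 / (e2 + E), where e2 is the
    eccentricity (in degrees) at which acuity drops to half its foveal value.
    The default e2 is an illustrative assumption.
    """
    return e2 / (e2 + eccentricity_deg)

# With e2 = 2 deg: 100% at the fovea, 50% at 2 deg, 10% at 18 deg.
for e in (0.0, 2.0, 5.0, 10.0, 18.0):
    print(f"{e:5.1f} deg -> {relative_acuity(e):.2f} of foveal acuity")
```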
The first job of an eye movement system is to move the eye quickly from the
current point of gaze to a new location. Vision is blurred during an eye
movement, so the length of time that the eye is moving must be minimized. In
order to minimize the time during which no clear image is captured on the fovea,
eye movements that move the fovea from one object/point to another are very rapid.
These saccadic eye movements are among the fastest movements the body can
make; the eyes can rotate at over 500 deg/sec, and subjects make well over one
hundred thousand of these saccades daily. These rapid eye movements are
accomplished by a set of six muscles attached to the outside of each eye. They
are arranged in three agonist-antagonist pairs: one pair rotates the eye horizontally (left-right), the second rotates the eye vertically (up-down), and the third allows 'cyclotorsion,' or rotation about the line of sight.

The second class of eye movements maintains clear vision by
stabilizing the retinal image. This stabilization assures that the image of an
object or region in the center of the field-of-view is kept over the fovea.
Sophisticated mechanisms exist to accomplish this goal in the face of eye, head,
body, and object motion. These eye movements are often grouped into four categories.
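The figure of well over one hundred thousand saccades per day follows directly from the rate of 2 to 4 eye movements per second quoted above. A quick back-of-the-envelope check (the 16-hour waking day is an assumption for the example):

```python
# Rough check of the daily saccade count quoted above.
saccades_per_second = (2, 4)      # typical range of eye movements per second
waking_seconds = 16 * 60 * 60     # assume roughly 16 waking hours per day
low, high = (rate * waking_seconds for rate in saccades_per_second)
print(f"{low:,} to {high:,} saccades per day")   # 115,200 to 230,400
```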
Eye movements in natural tasks: A paradigm for understanding the process of visual perception
My primary research focus is visual perception in everyday life. Most of what we
know about visual perception is based on carefully designed experiments performed in the laboratory, where conditions can be tightly controlled. While this has
given us a very thorough understanding of the metrics and mechanics of
vision, it tells us little about how we use vision every day.
We recently presented related work at the Symposium on Eye Movements and Vision in the Natural World in Amsterdam.
Another project in the Visual Perception Lab at RIT is being done in collaboration with Alex Jaimes of the Department of Electrical Engineering at Columbia University: "Using Human Observers' Eye Movements in Automatic Image Classifiers" [Jaimes, Pelz, Grabowski, Babcock, and Chang].
Roxanne Canosa
has completed her MS thesis "Eye Movements and Natural Tasks in an
Extended Environment," and is now beginning her doctoral research. Her focus is
to better understand how eye movements aid the process of visual perception, and to seek ways to use that understanding in the design of artificial vision systems.
Work on integrating an eyetracker into a virtual reality HMD is described in "Development of a Virtual Laboratory for the Study of Complex Human Behavior."
Much of the research on eye movements to date has focused on understanding the mechanics and dynamics of the oculomotor system. The question of how successive fixations are aligned spatially has also received much attention. Most of this research has been aimed at discovering how the visual system 'knows' where the eyes are situated for each fixation so that the individual images captured with each fixation can be correctly aligned to build the rich internal representation we experience. Evidence is emerging, however, that we may have been asking the wrong question. We are able to use regularities in the environment to maintain a stable representation without resorting to complex alignment mechanisms, and large changes in the environment may go undetected. Understanding visual perception requires us to ask a similar, but orthogonal, question about the temporal stitching of successive views. This issue has not arisen with experimental tasks in the past because task complexity was purposely restricted.
We are studying eye movements in complex tasks and natural environments so that we can better understand the process, rather than the mechanics, of visual perception.
Teaching:
I am an Associate Professor in the Chester
F. Carlson Center
for Imaging Science at R.I.T.
I teach Introduction to Imaging Science I & II, Survey of Imaging Science, and
Vision and Psychophysics, and co-teach The Visual System and Spatial Vision and Pattern Perception with Eriko Miyahara.
I have
also taught courses in optics and computer programming in the Microelectronic
Engineering and Imaging and Photographic Technology programs.
I did my dissertation on Visual Representations in a Natural Visuo-motor Task in the Department of Brain and Cognitive Sciences (BCS) at the University of Rochester's Center for Visual Science (CVS). I continue to collaborate with Mary Hayhoe (CVS and BCS) and Dana Ballard (CVS and Computer Science).
New instrumentation allows us to monitor the movements of a subject's eyes, head, and hand while they perform complex visuo-motor tasks. The photograph below shows the experimental equipment and one of the tasks that we are using in these studies. A headband-mounted eyetracker (made by Applied Science Laboratories) monitors the subject's eye position. (The crosshairs in the monitor in the background indicate the gaze position on the board.) An EM field transmitter/receiver pair (from Ascension Technology) reports the position and orientation of the head and hand. Eye and head data are combined to provide an 'eye-in-space' signal.
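In essence, the eye-in-space signal is the eye-in-head gaze direction rotated into world coordinates by the measured head pose. The sketch below is a minimal illustration of that combination; it is not the actual ASL/Ascension processing code, and the function name, coordinate conventions, and example numbers are assumptions.

```python
import numpy as np

def eye_in_space(gaze_dir_head, head_rotation, head_position):
    """Combine eye-in-head and head-in-world measurements into a gaze ray in world coordinates.

    gaze_dir_head : unit gaze vector in head coordinates (from the eyetracker)
    head_rotation : 3x3 head-to-world rotation matrix (from the EM tracker)
    head_position : head position in world coordinates (from the EM tracker)
    """
    direction = head_rotation @ np.asarray(gaze_dir_head, dtype=float)
    direction /= np.linalg.norm(direction)            # keep it a unit vector
    origin = np.asarray(head_position, dtype=float)   # gaze ray starts at the head
    return origin, direction

# Example: head turned 30 degrees to the left, eye looking straight ahead in the head.
yaw = np.radians(30.0)
head_rotation = np.array([[np.cos(yaw), -np.sin(yaw), 0.0],
                          [np.sin(yaw),  np.cos(yaw), 0.0],
                          [0.0,          0.0,         1.0]])
origin, direction = eye_in_space([1.0, 0.0, 0.0], head_rotation, [0.0, 0.0, 1.5])
print(origin, direction)   # the gaze direction follows the 30-degree head turn
```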