Research Interests

 


Research overview.

Spatial behavior is one of the most fundamental aspects of our existence, yet most people have never considered what information they use to navigate their environment.  Think about the last time you walked around the mall, toured a new city, or needed to find your gate at the airport.  Assuming clear signage, these spatial activities were likely performed with little thought or effort, but how did you do it? The answer is surprisingly difficult to articulate. These tasks are not a conscious process for most people; they are simply the product of an effective visually-guided perceptual-motor coupling.

However, when reliance on visual information is not available, the automaticity of your behavior likely changes dramatically. For instance, now imagine that you must perform the same tasks wearing a blindfold. Most people would find the prospect of navigating complex environments like malls, cities or airports without vision incredibly daunting, if not completely incomprehensible. How would you avoid obstacles, recognize landmarks, read signs, stay oriented, and build up a mental representation of space as you walked around? Research in our lab studies such questions by comparing similarities and differences of vision and non-visual modalities for learning, representing, and navigating real and virtual environments.

Multimodal spatial cognition.

The core of our basic research program is based on an agenda we call "multimodal spatial cognition," which deals with topics such as spatial learning and navigation from different inputs, the effects of multimodal and cross-modal interactions on the mental representation of space, and a comparison of spatial problem solving and spatial behavior between modalities. Our work makes comparisons between different combinations of vision, haptics, and 3-D audio, as well as non-perceptual inputs such as spatial language.

Although your phenomenology may suggest otherwise, vision is not necessary for successful spatial behavior. Good evidence comes from the superior navigational abilities of blind animals and humans alike, and also from the accurate performance of sighted people on many common tasks done without visual support, e.g., your ability to walk from your bedroom to the bathroom in the middle of the night. When one considers that much of what is perceived through vision is spatial, and that audition and touch convey many of the same spatial properties as vision (e.g., position, direction, configuration, relation, and the like), the ability to accurately perform spatial behaviors without vision is not surprising.

What remains unknown is how far the envelope can be pushed for non-visual inputs to convey information and support tasks generally performed using vision. For example, the traditional use of spatial language in the context of navigation is for direction-giving or route-guidance, which are not complex spatial behaviors.

Verbal route directions are static, based on executing a fixed sequence of procedures, e.g., “turn left at the second light, continue three blocks on Hop Meadow Street until you see Pizza Luce on your right, take a left on Eaton Street and the entrance to Highway 101 is immediately on your left”. Such descriptions are limited because they are generally given prior to travel (requiring maintenance in memory) and the information is not updated with respect to your position and heading as you traverse the route.

By contrast, verbal messages received during route navigation may include updated information, e.g., the verbal output of an in-car navigation system telling you to “turn left on Broad Street” when you are driving northbound and to “turn right on Broad Street” when you are driving southbound. Even though such systems provide real-time verbal information, they only support directed action states and convey little information about layout configuration beyond what is experienced along the route.

By comparison, vision is used to support a host of more complicated spatial operations, such as open searching (free exploration), wayfinding, cognitive mapping, and spatial inference. We ask: can verbal information also be used to perform these more complex spatial behaviors? To address this question, Giudice and colleagues have conducted a series of studies using dynamically-updated verbal displays during navigation of real and virtual environments. These displays are based on context-sensitive spatial descriptions which are updated in register with the user’s position and orientation in the environment as they move. Results from this research demonstrate that our updated verbal descriptions not only support these more complicated spatial behaviors but also that access to verbal and visual information yields nearly identical learning and wayfinding performance. A number of papers on this topic are available from our publications page.
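As a rough illustration of what "updated in register with the user's position and orientation" means, the following minimal sketch generates a short verbal description of a landmark relative to the user. It is not the software used in our studies: the function names, the flat x/y coordinate frame (meters, with +y as north), and the 30-degree and 150-degree thresholds are all assumptions made for this example.

```python
import math

def relative_bearing(user_xy, heading_deg, target_xy):
    """Signed angle (degrees) from the user's heading to the target.

    Headings are compass-style: 0 = north (+y), 90 = east (+x).
    Positive results mean the target is to the right (clockwise).
    """
    dx = target_xy[0] - user_xy[0]
    dy = target_xy[1] - user_xy[1]
    bearing = math.degrees(math.atan2(dx, dy))  # compass bearing to the target
    # Wrap the difference into the range [-180, 180)
    return (bearing - heading_deg + 180.0) % 360.0 - 180.0

def describe(user_xy, heading_deg, name, target_xy):
    """Build a user-centered verbal message (hypothetical wording and thresholds)."""
    angle = relative_bearing(user_xy, heading_deg, target_xy)
    distance = math.hypot(target_xy[0] - user_xy[0], target_xy[1] - user_xy[1])
    if abs(angle) <= 30:
        side = "ahead"
    elif abs(angle) >= 150:
        side = "behind you"
    elif angle > 0:
        side = "to your right"
    else:
        side = "to your left"
    return f"{name} is {distance:.0f} meters {side}"

# The same landmark is described differently as the user's heading changes.
print(describe((0, 0), 0, "Broad Street", (-10, 0)))    # user facing north
print(describe((0, 0), 180, "Broad Street", (-10, 0)))  # user facing south
```

Calling describe again each time the user moves or turns yields a description that stays aligned with their current position and heading, which is what distinguishes an updated display from static route directions given before travel.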

Functional equivalence.

Given our visuocentric focus, and the obvious surface differences between the senses, several interesting questions arise when considering multimodal spatial cognition. Can non-visual modalities and interface technologies provide the same underlying spatial content normally subserved by vision? Are spatial knowledge acquisition and spatial behavior from non-visual modalities fundamentally different from, or functionally equivalent to, the same tasks performed from visual input? If performance is different, what are the reasons, and can compensatory information be used to offset the deficits? If different inputs lead to highly similar performance, what is the format of the mental representation (and associated neural substrates) mediating this behavior?

Some simple examples of work we would do to address such questions include: comparing learning of object layouts from vision or touch, maps of the UMaine campus encoded from haptic exploration or verbal descriptions, or objects positioned around an office apprehended by seeing or hearing their locations. Our theoretical interest is independent of specific combinations of inputs or scenarios, focusing instead on questions of whether representations built up from any two or more modalities support the same level of spatial performance, and probing the structure of these spatial representations (e.g., whether they are amodal, multimodal, separate but equal, or recoded into a common sensory format).

A growing body of evidence from researchers studying similar scenarios, including work from our lab, demonstrates that information learned from different encoding modalities can lead to highly similar performance on a range of spatial tasks (e.g., spatial orientation, spatial updating, and wayfinding behavior), an outcome referred to as functional equivalence. The explanation is based on the hypothesis that the sensory-specific information at learning builds up into a common (likely amodal) spatial representation in memory (called the spatial image in working memory or the cognitive map in long-term memory) which functions equivalently in supporting spatial behaviors. Our research on functional equivalence and the development of common spatial representations is supported by a National Institutes of Health (NIH) grant entitled “Spatial Images from Vision, Touch and Hearing in Sighted and Blind”. This work is being done in collaboration with the leading researchers in this area: J.M. Loomis (UCSB) and R.L. Klatzky (CMU). Read here for a brief abstract of our spatial images project or check out our publications page to read more on our work in this area.

 

Information requirements for spatial learning and navigation.

Another line of research in the lab investigates the best design for real-time multimodal displays for supporting environmental learning, cognitive mapping, and wayfinding behavior in indoor and outdoor environments. The dynamically-updated verbal displays described earlier represent one such example, but our interest is more general. We compare and optimize performance from all manner of navigation interfaces: visual, auditory, haptic, language-based, and multimodal. Although we are interested in the hardware and software used in such displays, our primary focus is on the content and presentation of spatial information: determining the minimal information requirements and best delivery methods supporting the highest level of environmental learning and navigation performance. These results are critical for understanding how multimodal spatial cognition is affected by the availability of different sources of environmental information and will be used to establish specifications for the design of visual and non-visual interfaces which support a similar level of performance across a range of common spatial behaviors. This goal is far from completion, but results from our work with verbal displays and our basic research on functional equivalence speak to its eventual success. That is, a practical outcome of building up and accessing functionally equivalent representations is that, assuming provision of the appropriate information, different sensor technologies, multimodal interfaces, and spatial displays could support the same level of spatial behavior as vision. Our goal is to identify a core set of sensory-independent spatial primitives for indoor and outdoor environments that support complex spatial behaviors, irrespective of the input channel. These spatial primitives are at the heart of our design of visual and non-visual navigation interfaces.

 

Multimodal interfaces for real-time navigation systems.

 

Although we study all manner of spatial layouts, our work concentrates on indoor navigation using both real and virtual environments. Compared to outdoor travel, indoor navigation is aided by far less information from the environment, orienting cues, and external aids (such as maps or GPS). As a result, spatial learning and wayfinding in indoor spaces can pose some particularly difficult challenges. Check out the Indoor Wayfinding page for more on this surprisingly vexing issue or the following project descriptions for more about our proposed solutions.

For most outdoor locations, a couple of button presses on the average mobile device will call up information about one’s current position, maps of the surrounding area, and detailed descriptions of nearby businesses. There is a glaring hole, and a practical need, for similar functionality to support indoor travel. We have three funded projects in the lab looking at various aspects of navigation and the development of user interfaces and spatial technologies to address the challenges posed by indoor wayfinding. Two related projects involve research and development of indoor navigation systems to support travel in complex buildings for people with low vision. This work is timely, as the World Health Organization estimates that over 12 million U.S. citizens have some form of uncorrected vision loss, with these projections doubling by the year 2030 (WHO, 2004). Both projects are based on user-centered experiments in real and virtual environments. They employ multimodal interfaces and building databases with infrastructure-independent sensors to provide information about position, orientation, local geometry, and object identification. A third project addresses similar issues but employs visual displays in a real-time navigation system supporting integrated travel of indoor and outdoor spaces.

 

Systems supporting spatial learning and navigation using non-visual displays. 

One of our NSF projects, entitled “Cyber Enhancement of Spatial Cognition for the Visually Impaired,” leverages expertise from human spatial cognition, machine vision, robotics, and sensor fusion algorithms (with collaborators K. Daniilidis, UPenn; S. Roumeliotis, UMN; and R. Manduchi, UCSC). Project-related work in the VEMI lab is currently addressing the information requirements for designing speech-based displays and 3D audio interfaces to be used in a “cyber assistant”. The cyber assistant is a real-time navigation system which will provide blind and low-vision people with dynamically-updated information about their position and orientation in the building, as well as information about local geometry, identification of rooms, and indication of functional landmarks.  Read here for a brief abstract of our Cyber Assistant project.

We are also working on another navigation system, sponsored by a Phase II SBIR from NIH, with Minneapolis-based Koronis Biomedical Technologies (KBT). This project proposes an indoor solution combining multimodal interface design with new highly sensitive GPS receivers augmented with advanced dead reckoning technology. KBT is leading the engineering and technological development activities on the project. Our contribution relates to experimental design and studies to determine the best ways for presenting non-visual environmental information in real-time displays which could be implemented in a commercially viable package. Read here for a brief abstract of our Indoor GPS navigation project.  
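For readers unfamiliar with dead reckoning, the idea is to estimate the current position from a known starting point by accumulating measured headings and distances traveled, which is useful indoors where GPS signals are weak. The sketch below shows only this basic principle. It is not KBT's implementation: the function name, the flat x/y frame in meters (+y as north), and the noise-free inputs are assumptions made for illustration.

```python
import math

def dead_reckon(start_xy, segments):
    """Estimate position by summing (heading_deg, distance_m) segments.

    Headings are compass-style: 0 = north (+y), 90 = east (+x).
    Real systems must also handle sensor noise and drift, which this
    simplified example ignores.
    """
    x, y = start_xy
    for heading_deg, distance_m in segments:
        theta = math.radians(heading_deg)
        x += distance_m * math.sin(theta)  # eastward component
        y += distance_m * math.cos(theta)  # northward component
    return x, y

# Walk 10 m north, then 5 m east, starting from a known position.
x, y = dead_reckon((0.0, 0.0), [(0, 10), (90, 5)])
print(round(x, 2), round(y, 2))
```

Because each step adds to the previous estimate, small heading or distance errors accumulate over time, which is one reason dead reckoning is typically combined with another position source such as a GPS fix.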

In addition to its application to persons with low vision, development of a non-visual indoor navigation system is relevant to situations where normal vision may be impaired (e.g., firefighters or emergency response personnel), or for use in indoor navigation systems to guide tourists (similar to the verbal instructions provided by GPS-based systems for vehicle navigation). Although the explicit goals of our current projects are scoped more narrowly, our results are also germane to these broader application domains.

 

Systems supporting spatial learning and navigation using visual displays.

Thus far, we have only discussed non-visual interfaces, but there are obvious benefits of indoor navigation systems based on visual displays to support many of the same spatial activities and behaviors (e.g., indoor route-guidance, tourism, resource management, emergency operations, and provision of location-based services). To this end, we are working on an NSF project entitled “Information Integration and Human Interaction for Indoor and Outdoor Spaces” (with our SIE collaborator Mike Worboys). This project involves determining computational models and data structures for representing outdoor (O) and indoor (I) spaces and the creation of a unified O/I space model. The goal is to implement this unified model on a portable, context-aware device (e.g., a cell phone instrumented with appropriate sensors) which will support seamless navigation assistance in O/I spaces. Project-related work in the VEMI Lab addresses interface development and usability testing of an interactive platform for this device. One of our goals is to determine the minimum information requirements for interfaces based on visual and multimodal displays to support optimal learning and navigation performance. This is done by comparing learning from virtual displays rendered using different sensory cues with different levels of information content. For instance, how does seeing an entire floor plan of a building compare to only seeing a small “bubble” around your immediate position? Does the addition of auditory information to a visual display improve spatial knowledge acquisition and memory? Does the amount or type of environmental information needed for accurate learning and efficient navigation performance differ between indoor and outdoor spaces? To learn more about this work, check out our O/I project website. More about the issues of navigating in buildings can also be found on the Indoor Navigation page or from some of our publications.

 

Other research interests in the lab.

We are interested in all manner of topics that lie at the intersection of spatial cognition and multimodal input. Some of our other interests include: creative uses of multimodal virtual environment technology (MVET) for spatial knowledge acquisition and application, cross-modal brain plasticity in the blind, development and usability testing of assistive technology providing non-visual access to environmental information, universal design, lifespan spatial abilities, and the development of gerontechnology for age-related vision loss.

More about our research can be found on our Current Projects and Philosophy pages. Selected articles and accompanying comments can be downloaded from our publications page. To get additional information about the VEMI Lab, our team, or our colleagues, check out our Lab Resources, Personnel, and Collaborators pages.