Seminar Details
| Date |
6-5-2011 |
| Time |
14:30 |
| Room/Location |
DISI-Sala Conferenze III piano |
| Title |
PLACE INFORMATION SYSTEMS: TEXTUAL LOCATION IDENTIFICATION AND VISUALIZATION |
| Speaker |
Prof. Hanan Samet |
| Affiliation |
Department of Computer Science, University of Maryland, College Park, MD 20742, e-mail: hjs@cs.umd.edu |
| Link |
http://www.cs.umd.edu/ |
| Abstract |
PLACE INFORMATION SYSTEMS: TEXTUAL LOCATION IDENTIFICATION AND VISUALIZATION:
An ACM Distinguished Speaker Program Lecture
The popularity of web-based mapping services such as Google Earth/Maps
and Microsoft Virtual Earth (Bing) has led to an increasing awareness
of the importance of location data and its incorporation into both
web-based search applications and the databases that support them.
In the past, attention to location data had been primarily limited to
geographic information systems (GIS), where locations correspond to
spatial objects and are usually specified geometrically.
However, in the web-based applications, the location data often
corresponds to place names and is usually specified textually.
An advantage of such a specification is that the same specification
can be used regardless of whether the place name is to be interpreted as
a point or a region. Thus the place name acts as a polymorphic data
type in the parlance of programming languages. However, its drawback is
that it is ambiguous. In particular, a given specification may have
several interpretations, not all of which are names of places. For
example, "Jordan" may refer to either a person or a place.
Moreover, there is additional ambiguity even when the specification has a
place name interpretation. For example, "Jordan" can refer to a river
or a country, while there are a number of cities named "London".
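
The ambiguity described above can be illustrated with a minimal sketch. It assumes a tiny hand-made gazetteer and a list of personal names; the names `GAZETTEER`, `PERSON_NAMES`, and `interpretations` are hypothetical and are not taken from the systems presented in the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interpretation:
    label: str   # human-readable reading of the name
    kind: str    # "person", "country", "river", or "city"

# Toy gazetteer (an assumption for illustration, not a real data set).
GAZETTEER = {
    "Jordan": [
        Interpretation("Jordan (country)", "country"),
        Interpretation("Jordan (river)", "river"),
    ],
    "London": [
        Interpretation("London, England", "city"),
        Interpretation("London, Ontario", "city"),
    ],
}

# Toy list of names that can also denote people (also an assumption).
PERSON_NAMES = {"Jordan"}

def interpretations(name):
    """Return every reading of a textual specification, place or not."""
    readings = list(GAZETTEER.get(name, []))
    if name in PERSON_NAMES:
        readings.append(Interpretation(name + " (person)", "person"))
    return readings

for reading in interpretations("Jordan"):
    print(reading.kind, "-", reading.label)
```

A tagger would still have to choose among these readings, for instance by looking at the other places mentioned in the same document.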
In this talk we examine the extension of GIS concepts to textually
specified location data and review search engines that we have
developed to retrieve documents where the similarity criterion is not
based solely on exact match of elements of the query string but
instead also based on spatial proximity. Thus we want to take
advantage of spatial synonyms so that, for example, a query seeking a
rock concert in Nervi would be satisfied by a result finding a rock
concert in Albaro or Sampierdarena. We have applied this idea to
develop the STEWARD (Spatio-Textual Extraction on the Web Aiding
Retrieval of Documents) system for finding documents on the website of the
Department of Housing and Urban Development. This system relies on
the presence of a document tagger that automatically identifies
spatial references in text, PDF, Word, and other unstructured
documents. The thesaurus for the document tagger is a collection of
publicly available data sets forming a gazetteer containing the names
of places in the world. Search results are ranked according to the
extent to which they satisfy the query, which is determined in part by
the prevalent spatial entities that are present in the document. The
same ideas have also been adapted to collections of news articles as
well as Twitter tweets, resulting in the NewsStand and TwitterStand
systems, respectively, which will be demonstrated along with the
STEWARD system in conjunction with a discussion of some of the
underlying issues that arose and the techniques used in their
implementation. Future work involves applying these ideas to
spreadsheet data.
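
The notion of spatial synonyms can be sketched as follows. This is only an illustration under stated assumptions, not the STEWARD implementation: the coordinates are rough approximations, the documents are invented, and the names `PLACES`, `DOCS`, `haversine_km`, and `search` are hypothetical. A document matches when it contains the query keywords and its tagged place lies within a chosen radius of the query place; matches are ordered by distance.

```python
import math

# Approximate (latitude, longitude) pairs, for illustration only.
PLACES = {
    "Nervi": (44.38, 9.04),
    "Albaro": (44.40, 8.96),
    "Sampierdarena": (44.41, 8.89),
    "Rome": (41.90, 12.50),
}

# Invented documents, each already tagged with one place name.
DOCS = [
    ("Rock concert tonight", "Albaro"),
    ("Rock concert at the harbour", "Sampierdarena"),
    ("Rock concert in the stadium", "Rome"),
    ("Jazz evening", "Nervi"),
]

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def search(keywords, place, radius_km=20.0):
    """Return (text, place) pairs matching all keywords, nearest first."""
    origin = PLACES[place]
    terms = [k.lower() for k in keywords.split()]
    hits = []
    for text, tagged_place in DOCS:
        if not all(t in text.lower() for t in terms):
            continue  # textual criterion not met
        distance = haversine_km(origin, PLACES[tagged_place])
        if distance <= radius_km:
            hits.append((distance, text, tagged_place))
    return [(text, p) for _, text, p in sorted(hits)]

for text, where in search("rock concert", "Nervi"):
    print(where, "-", text)
```

With this toy data, a query for a rock concert in Nervi returns the concerts in Albaro and Sampierdarena even though neither document mentions Nervi, while the concert in Rome falls outside the radius.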
Biography
Hanan Samet (http://www.cs.umd.edu/~hjs/) is a Professor of Computer
Science at the University of Maryland, College Park and is a member of
the Institute for Computer Studies. He is also a member of the
Computer Vision Laboratory at the Center for Automation Research where
he leads a number of research projects on the use of hierarchical data
structures for database applications involving spatial data. He has a
Ph.D. from Stanford University. He is the author of the recent book
"Foundations of Multidimensional and Metric Data Structures" published
by Morgan Kaufmann, San Francisco, CA, in 2006
(http://www.mkp.com/multidimensional), an award winner in the 2006
best book in Computer and Information Science competition of the
Professional and Scholarly Publishers (PSP) Group of the Association of
American Publishers (AAP), and of the first two books on spatial
data structures titled "Design and Analysis of Spatial Data
Structures" and "Applications of Spatial Data Structures: Computer
Graphics, Image Processing and GIS" published by Addison-Wesley,
Reading, MA, 1990. He is the founding chair of ACM SIGSPATIAL, a
recipient of the 2009 UCGIS Research Award and the 2010 CMPS Board of
Visitors Award at the University of Maryland, a Fellow of the ACM, IEEE,
AAAS, and IAPR (International Association for Pattern Recognition), and
an ACM Distinguished Speaker. |