Information discovery in loosely integrated data

Heasoo Hwang, Andrey Balmin, Hamid Pirahesh, Berthold Reinwald

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

We model heterogeneous data sources with cross references, such as those crawled on the (enterprise) web, as a labeled graph with data objects as typed nodes and references or links as edges. Given the labeled data graph, we introduce flexible and efficient querying capabilities that go beyond existing capabilities by additionally discovering meaningful relationships between objects that satisfy keyword and/or structured query filters. We introduce the relationship search operator that exploits the link structure between data objects to rank objects related to the result of a filter. We implement the relationship search operator using the ObjectRank [1] algorithm that uses the random surfer model. We study several alternatives for constructing summary graphs for query results that consist of individual and aggregate nodes that are somehow linked to qualifying result nodes. Some of the summary graphs are useful for presenting query results to the user, while others could be used to evaluate subsequent queries efficiently without considering all the nodes and links in the original data graph.
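
As a rough illustration of the relationship search idea in the abstract, the following minimal sketch ranks the nodes of a small labeled graph with a random-surfer iteration that restarts at the nodes matching a filter. It is not the authors' implementation: the function name `relationship_search`, the per-label transfer rates, the damping value 0.85, and the toy graph are all assumptions made for this example.

```python
from collections import defaultdict


def relationship_search(nodes, edges, base_set, transfer_rate,
                        damping=0.85, iterations=50):
    """Rank nodes by authority flowing from the filter result (base_set).

    nodes: list of node ids
    edges: list of (source, label, target) triples
    base_set: set of node ids that satisfy the keyword/structured filter
    transfer_rate: dict mapping an edge label to an assumed transfer rate
    """
    if not base_set:
        raise ValueError("base_set must contain at least one node")

    # Count outgoing edges per (source, label) so that the rate of a label
    # is split evenly among the edges of that label leaving a node.
    out_count = defaultdict(int)
    for src, label, _ in edges:
        out_count[(src, label)] += 1

    weighted = [
        (src, dst, transfer_rate.get(label, 0.0) / out_count[(src, label)])
        for src, label, dst in edges
    ]

    # The random surfer restarts only at nodes in the base set.
    jump = {n: (1.0 / len(base_set) if n in base_set else 0.0) for n in nodes}
    score = dict(jump)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * jump[n] for n in nodes}
        for src, dst, weight in weighted:
            nxt[dst] += damping * weight * score[src]
        score = nxt

    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy graph: two documents and two people.
nodes = ["doc1", "doc2", "alice", "bob"]
edges = [
    ("doc1", "cites", "doc2"),
    ("doc1", "author", "alice"),
    ("doc2", "author", "bob"),
]
rates = {"cites": 0.7, "author": 0.2}  # illustrative values only

ranking = relationship_search(nodes, edges, base_set={"doc1"},
                              transfer_rate=rates)
```

Here `doc1` plays the role of the filter result, and the other nodes are ranked by how much authority reaches them over the typed links. A real system would also need a convergence test and a way to choose the transfer rates, both of which this sketch leaves out.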

Original language: English
Title of host publication: SIGMOD 2007
Subtitle of host publication: Proceedings of the ACM SIGMOD International Conference on Management of Data
Pages: 1147-1149
Number of pages: 3
State: Published - 2007
Event: SIGMOD 2007: ACM SIGMOD International Conference on Management of Data - Beijing, China
Duration: 12 Jun 2007 to 14 Jun 2007

Publication series

Name: Proceedings of the ACM SIGMOD International Conference on Management of Data
ISSN (Print): 0730-8078

Conference

Conference: SIGMOD 2007: ACM SIGMOD International Conference on Management of Data
Country/Territory: China
City: Beijing
Period: 12/06/07 to 14/06/07

Keywords

  • Information discovery
  • Search
  • XML
