Seminars & Colloquia
University of Georgia
"Supporting Subgraph Extraction Queries in Semantic Web Databases"
Wednesday April 11, 2007 10:00 AM
Location: 3211, EB2 NCSU Centennial Campus
Abstract: In the context of graph-structured data, Subgraph Extraction Queries are queries that, given a few points (nodes or edges) in a graph, extract a subgraph connecting those points. The purpose of such queries is to reveal the nature of the relationships between entities, i.e., to "connect the dots". This analysis paradigm is fundamental to analyzing biological networks in order to understand the mechanisms of action of diseases and to determine effective drug targets. It also has many applications in business, government, and other scientific research domains. For example, assessing the risk a passenger poses to air travel hinges on what links, if any, can be found connecting that passenger to terrorist activities and organizations. However, such queries have little to no support in mainstream databases. Further, current support for such queries is limited to graph traversals on memory-based graphs, mostly focusing on finding shortest paths. This is a limitation both for large disk-resident graph databases and for applications that need constraints more complex than shortest paths, e.g., finding paths that contain a set of mandatory nodes or edges.
This talk presents my work on supporting subgraph extraction queries in disk-based Semantic Web databases. Investigating this problem in the context of Semantic Web databases has two advantages: the machine-processable semantics enabled by their data models allow richer query constraints to be supported, and they make it possible to reason about the importance of the relationships returned by a query. Additionally, Semantic Web data models admit a general directed labeled graph model, so techniques developed in this context are easily transferable to more restricted graph models such as tree data models. In my talk, I will highlight some of the contributions made by my approach, particularly:
* The formalization of a fundamental subclass of subgraph extraction queries called Constrained Path Extraction (CPE) queries, and a query language, SPARQ2L, which extends the current W3C Semantic Web query language proposal, SPARQL, with the ability to express CPE queries.
* A novel graph encoding scheme for Semantic Web databases called the 2-Color Code. The notion of a 2-Color Code forms the basis of a novel data storage model which supports efficient evaluation of CPE queries.
* (Time permitting) An approach for ranking the results of CPE queries based on a novel relevance model called SemRank, which allows users to adjust the ranking of their query results to suit their needs.
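To make the notion of a constrained path query concrete, here is a minimal in-memory sketch in Python. It is not the approach presented in the talk (it uses neither SPARQ2L nor the 2-Color Code, and it ignores disk-based storage); it is a brute-force depth-first search over a small directed labeled graph. The graph contents, the function name `constrained_paths`, and the `max_edges` bound are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical toy graph: node -> list of (edge_label, neighbor) pairs.
Graph = Dict[str, List[Tuple[str, str]]]
Triple = Tuple[str, str, str]  # (subject, edge label, object)

toy: Graph = {
    "passenger1": [("paid_with", "card9"), ("shares_address_with", "personY")],
    "card9": [("issued_to", "personX")],
    "personX": [("member_of", "orgZ")],
    "personY": [("member_of", "orgZ")],
}


def constrained_paths(
    graph: Graph,
    source: str,
    target: str,
    mandatory: Set[str] = frozenset(),
    max_edges: int = 6,
) -> Iterator[List[Triple]]:
    """Yield simple paths from source to target that visit every mandatory node.

    Illustrative brute-force search only; it enumerates paths exhaustively
    and would not scale to large or disk-resident graphs.
    """
    path: List[Triple] = []
    visited = {source}

    def dfs(node: str) -> Iterator[List[Triple]]:
        if node == target:
            # Keep the path only if all mandatory nodes were visited.
            if mandatory <= visited:
                yield list(path)
            return
        if len(path) >= max_edges:
            return
        for label, nxt in graph.get(node, []):
            if nxt in visited:  # keep paths simple (no repeated nodes)
                continue
            visited.add(nxt)
            path.append((node, label, nxt))
            yield from dfs(nxt)
            path.pop()
            visited.remove(nxt)

    yield from dfs(source)


all_paths = list(constrained_paths(toy, "passenger1", "orgZ"))
via_x = list(constrained_paths(toy, "passenger1", "orgZ", {"personX"}))
```

With no mandatory nodes, the search returns both connections between `passenger1` and `orgZ`; requiring `personX` keeps only the path through the payment card. Each path is a list of (subject, label, object) triples, loosely mirroring the directed labeled graph model mentioned in the abstract.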
Short Bio: Kemafor is a Ph.D. candidate from the LSDIS lab at the University of Georgia. She has a strong interest in multidisciplinary research, in particular research involving other science disciplines such as Biology. Her research interests include Databases, Data Services, Semantic Web, Graph Mining, and Workflow Management. Recently, her work has focused on developing advanced querying and mining techniques for supporting knowledge discovery in "semantics-enabled databases". She has a Bachelor of Science degree in Biochemistry from the University of Nigeria and, prior to pursuing her graduate studies in Computer Science, worked for a biotechnology firm for over two years developing enzyme immunoassays.
Host: Munindar P. Singh, Computer Science, NCSU