Institute for Discrete Sciences Workshop on Associating Semantics with Graphs

April 16 - 17, 2007
The DyDAn Center at the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS), Rutgers University

Organizers:
Alex Borgida (co-chair), Rutgers University
Claire Cardie, Cornell University
Hans Chalupsky (co-chair), Information Sciences Institute, University of Southern California
Jiawei Han, University of Illinois Urbana-Champaign
Scott Kohn, Lawrence Livermore National Laboratories
Peter Patel-Schneider (co-chair), Bell Labs
Presented under the auspices of the Institute for Discrete Sciences, led by Lawrence Livermore National Laboratory, and its affiliated DHS centers of excellence based at Rutgers University, the University of Pittsburgh, the University of Illinois at Urbana-Champaign and the Information Sciences Institute of the University of Southern California.

Abstracts:

Olivier Bodenreider, U.S. National Library of Medicine

Title: UMLS: The graph behind the forest

Many biomedical terminologies are organized as trees (e.g., International Classification of Diseases) or polyhierarchical structures (e.g., Medical Subject Headings, Gene Ontology). Browsers developed for these terminologies generally provide tree-like representations. In contrast, terminology integration systems identify concepts shared by several terminologies and create a large graph our of the originally disjoint trees.

We present the Unified Medical Language System (UMLS), which integrates over one million biomedical concepts from more than one hundred terminologies. We show that the resulting graph is broader, deeper and more richly connected than any of its constituent trees. We discuss some of the limitations of the UMLS graph, including presence of cycles, underspecification of the semantics of some links, semantic inconsistency and missing links. Finally, we show how biomedical graphs can be exploited for knowledge discovery.


Hans Chalupsky, ISI

Title: Representing, Reasoning with and Querying Semantic Graphs in PowerLoom

We have developed a suite of knowledge and link discovery tools called KOJAK that perform tasks such as pattern matching, group detection, anomaly detection and graph and relationship simplification on large and complex semantic graphs. KOJAK is a hybrid system combining symbolic knowledge representation and reasoning (KR&R) technology with statistical graph algorithms from the area of data mining. We describe how PowerLoom, a general-purpose, logic-based KR&R system is used by KOJAK to represent semantic graphs, link them to an ontology and provide reasoning services such as graph normalization, group extension and general graph querying and matching.


Christos Faloutsos, CMU

Title: Finding patterns in large, real networks

How does the internet look like? Which sub-graphs of such networks deviate from the 'normal' (and are thus suspicious groups of people/nodes/routers)? How do viruses/rumors propagate in a network? How do we scale up all these algorithms, to handle graphs of millions and billions of nodes and edges?

These are the questions we address. Earlier work has shown that the topology of real networks presents some surprising patterns: the Internet topology obeys power-laws, and so does the web topology. Both, along with social networks, they all exhibit 'small world' phenomena. We present a list of such patterns that real graphs follow, as well as fast, scalable algorithms to efficiently process huge datasets, in search of patterns, anomalies and outliers.


Seth Greenblatt, Overwatch

Title: Ontologies for Graph Matching: Practice and Potential

In practice, many implementations of graph matching make use of ontologies only in a cursory manner. This talk seeks to:

  • Briefly introduce the basic concepts of graph matching,
  • Discuss how ontologies are generally used in graph matching,
  • Provide a few ideas regarding ways to make more effective use of ontologies in graph matching, and:
  • Discuss issues that need to be overcome in order to make more effective use of ontologies in graph matching
    Ralph Grishman, NYU

    Title: Information Extraction: Building Graphs from Natural Language Text

    The talk will provide an introduction to information extraction and briefly discuss the ACE hierarchy (entities, relations, events) and the problems of training new IE systems.


    David Israel, SRI

    Title: Some Thoughts on Associating..... Something Really Useful with Graphs


    Cliff Joslyn, LANL

    Title: Semantic Hierarchy Analysis Using DAGs and Posets

    Order theory is the mathematical theory of hierarchical structures, specifically lattices, partially ordered sets (posets), and other Directed Acyclic Graph (DAG) structures. While hierarchical systems are ubiquitous both in nature and in our computer systems, they play a special role in representing semantic information, specifically in ontological stores of lexical information, relational knowledge representations, non-traditional uncertainty quantifications, and multidimensional databases. In this talk we will survey some issues in our analytical perspective on semantic hierarchies, including the grounding of hierarchies in digraphs and DAGs, the use of poset metrics in ontology analysis, and some speculative thoughts about the use of order theory for OLAP and other multidimensional databases.


    Ken Murray, SRI

    Title: Representing Patterns in PHERL

    Many Link Analysis (LA) tools have been developed; each has particular strengths in helping analysts acquire patterns or use patterns to detect evidence in data. In order to provide analysts with the benefits from a range of tools, workbench systems are being created that include many pattern-based LA tools. In such systems, different tools must be able to share patterns. However, different tools represent patterns in different ways. PHERL is an interlingua that supports pattern-sharing across a community of pattern-based LA tools. It exploits a graph-based representation for patterns; the semantics of these graphs are defined by equivalent FOL rules. Experimental translation programs have translated example patterns between PHERL and the native representations of several link analysis tools.


    David Silberberg, Wayne Bethea Johns Hopkins University Applied Physics Laboratory

    Title: Ontology-Assisted Query of Graph Databases

    We present our approach to exploiting the semantics of ontologies to assist in querying data graphs. The approach is based on current efforts that define a declarative graph query language that unifies common approaches to querying data graphs. A separate ontology that represents the semantics of the problem domain can augment a graph database, enabling graph queries to be formulated using terms of the ontology as well as terms of the graph database. Preserving the separation between ontology and graph databases provides users with some of the semantic power of ontologies while maintaining the performance advantage and the query paradigm of graph database systems. The presentation will focus on the paradigm difference among ontology query, ontology-assisted query and graph query as well as our reasons to maintain the differences. Finally, the presentation will describe the challenges and future directions of our efforts to create a comprehensive declarative query language for analysis of graph data.


    Michael Witbrock, Cycorp Inc.

    Title: Adding Strength to Links

    Most large scale data graphs have relatively small numbers of link types. These restricted relational vocabularies make it easier to form graphs, since the relationships tend to be broad and therefore hold between many things, but they also tend to be relatively unproductive, inferentially, as a result. Cyc has a very large relational vocabulary, with many thousands of relationships, some of them very specific. Many of these relationships are connected to background knowledge that makes them inferentially productive, so it is useful if the more specific relationships can be identified, converting part of the graph into a grounded knowledge base. In this talk, I will discuss a preliminary results from a number of efforts at Cycorp to identify specific relationships both in graphs directly, and in the text from which graphs are often derived.


    Mike Wolverton, SRI

    Title: GEM: A Graph-Based Hierarchical Pattern Language for Imprecise Information Needs

    Many domains require data analysis techniques that provide inexact views into the data. Noisy and incomplete data sets, and analysts' imprecise or evolving understanding of their information needs are some of the problem characteristics that require a pattern language that can access data more flexibly than SQL and similar query languages. This talk will discuss GEM (for Graph Edit Model), a language with the requisite flexibility for many such problems. GEM patterns include a semantic graph that represents a prototype of the situation of interest, and a data comparison metric based on graph edit distance to represent allowable deviations from that prototype. We give a description of the constructs of the language -- including higher-order constructs such as hierarchical graphs, disjunction, and cardinality -- and we demonstrate its use through examples taken from the Link Analysis Workbench (LAW) system.



    Previous: Program
    Workshop Index
    DyDAn Homepage
    Contacting DyDAn
    Document last modified on April 13, 2007.