Abstract
The semantic discontinuity between World-Wide Web languages, e.g., XML,
XML Schema, and XPath, and Semantic Web languages, e.g., RDF, RDFS, and
DAML+OIL, forms a serious barrier to the stated goals of the
Semantic Web. This discontinuity results from a difference in
modeling foundations between XML and logics. We propose to
eliminate that discontinuity by creating a common semantic
foundation for both the World-Wide Web and the Semantic Web, taking
ideas from both. The common foundation results in essentially no
change to XML, and only minor changes to RDF. But it allows the
Semantic Web to get closer to its goal of describing the semantics
of the World-Wide Web. Other Semantic Web languages (including RDFS
and DAML+OIL) are considerably changed because of this common
foundation.
Semantic Web Vision
- Bring structure to web pages
- Permit software agents to carry out sophisticated tasks for users
- Extension of the current web
(Tim Berners-Lee, James Hendler, and Ora Lassila.
"The Semantic Web". Scientific American, May 2001.)
Requirements for Semantic Web Languages
Form:
The languages used in the Semantic Web need well-defined syntax.
- Otherwise software agents cannot determine what constructs
they are using.
Meaning:
The languages used in the Semantic Web need well-defined semantics.
- Otherwise software agents cannot determine what the
constructs they are using mean.
Semantic Web Tower
Semantic Web Tower (from Tim Berners-Lee)
Elements of the Semantic Web Tower
other elements do not yet exist
- new ontology language (OWL) under development
The Current Vision of the Semantic Web Tower
- XML is the base syntactic language for the Semantic Web.
- All Semantic Web languages can be written in XML dialects.
- The meaning of Semantic Web languages is not necessarily
related to their XML meaning.
- RDF is the base language for the Semantic Web.
- All Semantic Web languages use RDF for their syntax.
- All Semantic Web languages build on RDF for their semantics.
Rationale for the Current Vision
- XML is supposed to be the mechanism for defining all Web languages.
- XML Schema, XML ..., RDF, RDF Schema, OIL, ....
- Different languages are in different documents.
- Base data is contained in XML documents.
- XML systems can parse all Web documents, but will not
understand them.
- RDF is supposed to be the mechanism for defining all Semantic Web
languages.
- RDF Schema, ....
- Different languages are in the same document.
- All information is contained in RDF documents.
- RDF systems should be able to parse and (partially) understand
all Semantic Web documents.
Problems with the Semantic Web Vision
- Disconnects at the Foundation
- The XML meaning is not used, so data written in XML cannot be
used in the Semantic Web.
- XML Schema is not used in the Semantic Web languages.
- An Inadequate Basis
- RDF is inadequate for providing either syntax or semantics for
the entire Semantic Web.
A Disconnect at the Foundation
- The Semantic Web needs a source of data.
- Where will this come from?
- HTML?
- XML and XML Schema?
- RDF?
A Disconnect at the Foundation (po.xml extracts)
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
<shipTo country="US">
<name>Alice Smith</name>
<street>123 Maple Street</street>
<city>Mill Valley</city>
<state>CA</state>
</shipTo>
<items>
<item partNum="872-AA">
<productName>Lawnmower</productName>
<quantity>1</quantity>
<USPrice>148.95</USPrice>
</item>
...
</items>
</purchaseOrder>
A Disconnect at the Foundation (po.xsd extracts)
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="purchaseOrder" type="PurchaseOrderType"/>
<xsd:complexType name="PurchaseOrderType">
<xsd:sequence>
<xsd:element name="shipTo" type="USAddress"/>
<xsd:element name="billTo" type="USAddress"/>
<xsd:element ref="comment" minOccurs="0"/>
<xsd:element name="items" type="Items"/>
</xsd:sequence>
<xsd:attribute name="orderDate" type="xsd:date"/>
</xsd:complexType>
...
</xsd:schema>
RDF is not built on XML
- RDF cannot use most XML (and XML Schema) data
- XML data
- is totally ordered
- is tree-like
- has no distinction between objects and relationships
- cannot be given identity
- can be regulated and typed using XML Schemas
- RDF information
- is unordered
- is in the form of directed graphs
- has a distinction between objects and relationships
- can be given identity
Providing Meaning for Semantic Web Languages
Model-theoretic semantics is an excellent way of providing meaning.
- can be tailored for lots of languages
- generalizes data models
- based on
- interpretations - what the world could be like
- models of information - what interpretations are compatible
with the information
- inference is entailment - are all models of the premise also models of
the consequence?
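The entailment test above can be illustrated with a toy propositional setting. This is only a sketch under stated assumptions: truth assignments stand in for interpretations, which is far simpler than the RDF or XML model theory, and the names `interpretations`, `models`, and `entails` are hypothetical.

```python
from itertools import product

def interpretations(atoms):
    """Yield every truth assignment over the atoms (what the world could be like)."""
    atoms = sorted(atoms)
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def models(formula, atoms):
    """Return the interpretations that are compatible with the formula."""
    return [i for i in interpretations(atoms) if formula(i)]

def entails(premise, consequence, atoms):
    """True when every model of the premise is also a model of the consequence."""
    return all(consequence(i) for i in models(premise, atoms))

# Formulas are plain predicates over an interpretation (an assumption of this sketch).
atoms = {"p", "q"}
premise = lambda i: i["p"] and i["q"]
consequence = lambda i: i["p"]
```

Here `entails(premise, consequence, atoms)` holds because the single model of "p and q" also satisfies "p", while the converse check fails.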
An RDF interpretation is a node- and edge-labelled graph
- labels are identifiers (not types)
- node labels are either URIs or strings
- nodes with strings as labels have no outgoing edges
- edge labels are URIs
- there is no order in the graph
An RDF interpretation is a model of an RDF document if there is
- a mapping from the description elements in the document to
URI-labelled or unlabelled nodes in the interpretation
- if the document element has an ID, then the node has that ID
as a label
- if the document element has a name, then the node is linked
via an rdf:type labelled edge to a node with that name as label
- if the document element is a string, then the
node has that string as a label
- a mapping from the property elements in the document to edges in the
graph that relate the two description elements and that have the ID
of the element as label
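The structural conditions on an RDF interpretation can be sketched as a small data structure. This is a minimal illustration, not an RDF implementation: the class name `RDFInterpretation`, its field layout, and the example URIs under `http://example.org/` are assumptions made for this sketch.

```python
from dataclasses import dataclass

@dataclass
class RDFInterpretation:
    # node id -> ("uri", value), ("string", value), or None for an unlabelled node
    node_labels: dict
    # set of (source node id, edge URI, target node id); a set, so there is no order
    edges: set

    def is_well_formed(self):
        """Check that edges connect known nodes and that no string node has outgoing edges."""
        for src, _label, dst in self.edges:
            if src not in self.node_labels or dst not in self.node_labels:
                return False
            src_label = self.node_labels[src]
            if src_label is not None and src_label[0] == "string":
                return False
        return True

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

interp = RDFInterpretation(
    node_labels={
        "n1": ("uri", "http://example.org/alice"),
        "n2": ("uri", "http://example.org/Person"),
        "n3": ("string", "Alice Smith"),
    },
    edges={
        ("n1", RDF_TYPE, "n2"),                     # typing via an rdf:type edge
        ("n1", "http://example.org/name", "n3"),    # property edge to a string node
    },
)

# A graph that violates the rule that string nodes have no outgoing edges.
bad = RDFInterpretation(
    node_labels=interp.node_labels,
    edges={("n3", "http://example.org/name", "n1")},
)
```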
XML Model-Theoretic Semantics (abstracted)
An XML interpretation is a node-labelled tree
- node labels indicate typing information (not identification)
- node labels are either QNames or strings
- nodes with QName labels are either element nodes or attribute
nodes
- nodes with strings as labels have no outgoing edges
- there is a total order on the outgoing edges of each node
XML Model-Theoretic Semantics (abstracted)
An XML interpretation is a model of an XML document if there is
- a mapping from the elements of the document to element nodes
- the root element maps to the root of the tree
- the name of the element is the name of the node
- a mapping from the attributes of the document to attribute nodes
- the name of the attribute is the name of the node
- a mapping from attribute values to nodes labelled with strings
- the node is a child of the mapping of the attribute
- the value of the attribute is the string
- a mapping from text to nodes labelled with strings
- the text is the string
- child elements and attributes are mapped to child nodes
- child nodes later in document order are greater in the tree
order
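The mapping from an XML document to a node-labelled, ordered tree can be sketched with the Python standard library. This is a simplified illustration: the `Node` class and `to_tree` function are hypothetical names, attributes are placed before element content (an assumption, since the slides do not fix their position), and whitespace-only text is dropped.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str   # element or attribute name, or the string itself
    kind: str    # "element", "attribute", or "string"
    children: list = field(default_factory=list)  # list order is the total order

def to_tree(elem):
    """Map an element, its attributes, and its text to tree nodes."""
    node = Node(elem.tag, "element")
    for name, value in elem.attrib.items():
        # Each attribute node has one string-labelled child holding its value.
        node.children.append(Node(name, "attribute", [Node(value, "string")]))
    if elem.text and elem.text.strip():
        node.children.append(Node(elem.text.strip(), "string"))
    for child in elem:
        node.children.append(to_tree(child))
        if child.tail and child.tail.strip():
            node.children.append(Node(child.tail.strip(), "string"))
    return node

doc = '<shipTo country="US"><name>Alice Smith</name><city>Mill Valley</city></shipTo>'
tree = to_tree(ET.fromstring(doc))
```

The children of `tree` keep document order, and string-labelled nodes end up with no children, matching the conditions listed above.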
Example RDF Interpretation
- Ovals are resources.
- Rectangles are strings.
- Oval labels are identifiers.
- Edge labels are identifiers.
- No order.
Another Example RDF Interpretation
- Ovals are resources.
- Rectangles are strings.
- Oval labels are identifiers.
- Edge labels are identifiers.
- No order.
Example XML Interpretation
- Ovals are URI-labelled nodes.
- Rectangles are string-labelled nodes.
- Order is present.
A New Foundation for the Semantic Web
- Harmonize XML data and RDF-ish information model and build a model theory
- have partial order in interpretations
- allow non-tree graphs in interpretations
- labels only on nodes
- do not distinguish between objects and relationships
- allow RDF-style identifiers
- Incorporate XML Schema into the Semantic Web
- as a way of providing structure and typing for XML documents.
- Restrict the format of information in XML documents
- as a way of defining classes or types.
- Independent of any XML document
Integrated Model-Theoretic Semantics (abstracted)
An interpretation is a six-tuple,
- R, a set of resources
- E, a set of relationships
- EXT, a mapping from relationships to pairs of resources or
pairs of resources and strings
- CEXT, a mapping from resources to sets of resources
- O, provides a strict partial order on relationships
- S, a mapping from URIs to resources
XML (and RDF) documents are processed into document graphs that are like
XML document graphs with the addition of RDF identifiers.
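The six-tuple can be written down as a record, which makes the role of each component easier to see. This is a sketch under stated assumptions: the Python types chosen for each component, the helper `is_strict_partial_order`, and the example resource names and URI are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Interpretation:
    R: set      # resources
    E: set      # relationships
    EXT: dict   # relationship -> (resource, resource) or (resource, string)
    CEXT: dict  # resource -> set of resources (its class extension)
    O: set      # pairs (e1, e2) meaning relationship e1 comes before e2
    S: dict     # URI -> resource

def is_strict_partial_order(pairs):
    """Check irreflexivity and transitivity; asymmetry follows from the two."""
    irreflexive = all(a != b for a, b in pairs)
    transitive = all((a, d) in pairs
                     for a, b in pairs for c, d in pairs if b == c)
    return irreflexive and transitive

interp = Interpretation(
    R={"po1", "addr1", "PurchaseOrder"},
    E={"e1", "e2"},
    EXT={"e1": ("po1", "addr1"), "e2": ("po1", "1999-10-20")},
    CEXT={"PurchaseOrder": {"po1"}},
    O={("e1", "e2")},
    S={"http://example.org/po1": "po1"},
)
```

Representing O as a set of pairs keeps the order partial: relationships that are not related by any pair are simply unordered with respect to each other.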
Integrated Model-Theoretic Semantics (abstracted)
An interpretation is a model of a document graph
if there is a mapping N from the nodes of the graph to resources with
- for each element or attribute node, n, with identifier, u,
N(n) = S(u)
- for each element or attribute node, n, with name label, u,
N(n) in CEXT(S(u))
- for each text node, n, with label, s,
N(n) = s
- for each edge <n,m> in the graph, there is
e in E with
EXT(e) = <N(n),N(m)>
- if c and d are two children of n, then the relationships from
above are in the same O order as c and d are in document order
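The conditions above can be turned into a check for one candidate mapping N. This is a sketch under stated assumptions rather than the definitive procedure: it verifies a given N instead of searching for one, it assumes at most one relationship per pair of endpoints, and the function name `is_model`, the dictionary layout, and the example URI are hypothetical.

```python
def is_model(interp, nodes, children, N):
    """Check the listed conditions for one candidate mapping N.

    interp   : dict with keys "E", "EXT", "CEXT", "O", "S"
    nodes    : node id -> dict with optional "identifier", "name", "text"
    children : node id -> list of child node ids in document order
    N        : node id -> resource (or string, for text nodes)
    """
    S, CEXT, EXT, O = interp["S"], interp["CEXT"], interp["EXT"], interp["O"]
    for n, info in nodes.items():
        if "identifier" in info and N[n] != S.get(info["identifier"]):
            return False
        if "name" in info and N[n] not in CEXT.get(S.get(info["name"]), set()):
            return False
        if "text" in info and N[n] != info["text"]:
            return False
    for n, kids in children.items():
        rels = []
        for m in kids:
            # Find a relationship e with EXT(e) = <N(n), N(m)> for the edge <n, m>.
            e = next((e for e in interp["E"] if EXT[e] == (N[n], N[m])), None)
            if e is None:
                return False
            rels.append(e)
        # Relationships of earlier children must precede those of later children in O.
        for i in range(len(rels)):
            for j in range(i + 1, len(rels)):
                if (rels[i], rels[j]) not in O:
                    return False
    return True

nodes = {
    "po": {"name": "purchaseOrder", "identifier": "http://example.org/po1"},
    "ship": {"name": "shipTo"},
    "items": {"name": "items"},
    "txt": {"text": "Alice Smith"},
}
children = {"po": ["ship", "items"], "ship": ["txt"]}
interp = {
    "E": {"e1", "e2", "e3"},
    "EXT": {"e1": ("r_po", "r_ship"), "e2": ("r_po", "r_items"),
            "e3": ("r_ship", "Alice Smith")},
    "CEXT": {"c_PO": {"r_po"}, "c_Ship": {"r_ship"}, "c_Items": {"r_items"}},
    "O": {("e1", "e2")},
    "S": {"http://example.org/po1": "r_po", "purchaseOrder": "c_PO",
          "shipTo": "c_Ship", "items": "c_Items"},
}
N = {"po": "r_po", "ship": "r_ship", "items": "r_items", "txt": "Alice Smith"}
```

With these values the check succeeds; reversing the single pair in O makes it fail, because the relationships would no longer follow document order.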
Example Interpretation
- Ovals are Resources
- Outside labels are identifiers.
- Inside labels show class membership.
- Rectangles are strings
- Order is present.
A New Foundation for the Semantic Web
Status
- Just a start for this idea so far
- Lots of work still needed
- Integrate XML Schema
- Extend to ontology language
- This is a change to the model theory for logic!
A New Semantic Web Vision
- Not from me!
- Building more of the Semantic Web may expose more problems.
- However, the more flexible yet integrated the better!
- want information to flow between the levels
- a single syntax and a single semantics is just too restrictive