URIQA

The URI Query Agent Protocol
A Semantic Web Enabler

Nokia

Abstract

This document defines a simple and efficient protocol for knowledge discovery, both from authoritative sources and from arbitrary third-party sources. It introduces an extension to the present web architecture that tells a web server to resolve the specified URI in terms of knowledge about the resource denoted by that URI, rather than in terms of a representation of that resource.


Table of contents

  1. Introduction
  2. Concise Bounded Resource Descriptions
  3. Extending the Web Architecture
    1. Semantic Web Methods for HTTP
  4. URIQA Semantic Web Service
    1. The URIQA Semantic Web Service Interface
    2. URI References with Fragment Identifiers
    3. Descriptions as Fall-Back Representations
    4. URIQA Service Extended Parameters
    5. Named Graphs
  5. Frequently Asked Questions
  6. Notes and References

Introduction

As the Semantic Web [1] emerges and the behavior of automated software becomes increasingly directed by explicit knowledge about resources, gathered from disparate sources, the need for a standardized means of sharing authoritative knowledge about a given resource, based solely on the URI denoting that resource, becomes critical to achieving a fully open, global, scalable, and ubiquitous Semantic Web.

URIQA (URI Query Agent) is a protocol for knowledge discovery, both from authoritative sources and from arbitrary third-party sources. It introduces an extension to the present web architecture [2] that tells a web server to resolve the specified URI in terms of knowledge about the resource denoted by that URI, rather than in terms of a representation of that resource.


Concise Bounded Resource Descriptions

URIQA employs a particular form of resource description called a concise bounded description [3]. A concise bounded description of a resource is a general and broadly optimal unit of specific knowledge about that resource to be utilized by, and/or interchanged between, semantic web agents.

Concise bounded descriptions are defined in a separate specification.


Extending the Web Architecture

Semantic Web Methods for HTTP

URIQA extends the present web architecture by introducing the following new HTTP [5] methods for interacting with an authoritative semantic web enabled web server:

MGET

Return a concise bounded description of the resource denoted by the request URI. E.g.

MGET /foo HTTP/1.1
Host: example.com

I.e. Get a description of the resource denoted by <http://example.com/foo>.

MPUT

Add the statements contained in a concise bounded description of the resource, provided as input, to the (possibly empty) body of knowledge maintained about the resource denoted by the request URI. E.g.

MPUT /foo HTTP/1.1
Host: example.com

I.e. Add statements to the description of the resource denoted by <http://example.com/foo>.

MDELETE

Remove the statements contained in a concise bounded description of the resource, provided as input, from the existing knowledge maintained about the resource denoted by the request URI.

If no description is provided as input, remove all statements asserted about the specified resource. E.g.

MDELETE /foo HTTP/1.1
Host: example.com

I.e. Delete statements from the description of the resource denoted by <http://example.com/foo>.
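
As a minimal sketch (not part of the specification), the request messages shown above can be composed from an absolute http: URI as follows. The function name build_uriqa_request, the Content-Type default, and the decision to leave fragment identifiers out of the request line are illustrative assumptions.

```python
from urllib.parse import urlsplit

# The three methods introduced by URIQA.
URIQA_METHODS = ("MGET", "MPUT", "MDELETE")


def build_uriqa_request(method, uri, body=None,
                        content_type="application/rdf+xml"):
    """Compose the text of an HTTP/1.1 URIQA request for an http: URI.

    Hypothetical helper: the body, if given, is assumed to be a
    serialized description (RDF/XML by default).
    """
    if method not in URIQA_METHODS:
        raise ValueError(f"not a URIQA method: {method}")
    parts = urlsplit(uri)
    # The request line carries the path (and query); any fragment is
    # dropped in this sketch.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [f"{method} {path} HTTP/1.1", f"Host: {parts.netloc}"]
    if body is not None:
        lines.append(f"Content-Type: {content_type}")
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # A blank line ends the header section; the body (if any) follows.
    return "\r\n".join(lines) + "\r\n\r\n" + (body or "")


if __name__ == "__main__":
    print(build_uriqa_request("MGET", "http://example.com/foo"))
```

The same helper covers MPUT and MDELETE by passing the input description as body.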



URIQA Semantic Web Service

The URIQA Semantic Web Service Interface

In addition to the methods described above, the URIQA protocol also defines a simple semantic web service interface providing for access to descriptions of resources by third parties other than the web authority of the URI denoting the resource and/or for resources denoted by URIs which are not meaningful to the HTTP protocol.

All URIQA service implementations must provide support for the following parameters:

Method      Parameter   Value                     Description
GET, POST   uri         <URI>                     The URL encoded URI denoting the resource described.
POST        method      MGET, MPUT, or MDELETE    The URIQA method to be applied.

Input resource descriptions should be provided to POST requests as the request body, as appropriate.

E.g.:

GET /uriqa?uri=http%3a%2f%2fexample%2ecom%2fblargh HTTP/1.1
Host: example.com

Retrieve from example.com a description of the resource denoted by <http://example.com/blargh>.

POST /uriqa?uri=http%3a%2f%2fwidgets%2eorg%2ffoo%23bar&method=MGET HTTP/1.1
Host: example.com

Retrieve from example.com a description of the resource denoted by <http://widgets.org/foo#bar>.

Note: it is recommended that GET normally be used to retrieve resource descriptions so that bookmarking and similar operations can be employed.

POST /uriqa?uri=urn%3aissn%3a1560%2d1560&method=MPUT HTTP/1.1
Host: example.com

Add statements to the description maintained by example.com of the resource denoted by <urn:issn:1560-1560> (mandatory description provided as request body).

POST /uriqa?uri=uuid%3a438c44e9%2d6b2f%2d11d7%2d944a%2d006097b1ebc&method=MDELETE HTTP/1.1
Host: example.com

Remove statements from the description maintained by example.com of the resource denoted by <uuid:438c44e9-6b2f-11d7-944a-006097b1ebc> (optional description provided as request body).

Note that the particular name of the web service is not mandated, but for the sake of consistency, '/uriqa' is recommended.
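
The request targets in the examples above can be derived mechanically from the resource URI. The following is a sketch only; the function name uriqa_service_target is hypothetical, and the default service path follows the '/uriqa' recommendation.

```python
from urllib.parse import quote


def uriqa_service_target(uri, method=None, service_path="/uriqa"):
    """Build the request target for a URIQA service call.

    Hypothetical helper: 'uri' is the URI of the resource described and
    'method' is one of MGET, MPUT or MDELETE (for POST requests).
    """
    # safe="" forces reserved characters such as ':', '/' and '#' to be
    # percent-encoded so the whole URI survives as one parameter value.
    target = f"{service_path}?uri={quote(uri, safe='')}"
    if method is not None:
        target += f"&method={quote(method, safe='')}"
    return target


if __name__ == "__main__":
    print(uriqa_service_target("http://widgets.org/foo#bar", "MGET"))
```

Python's quote leaves unreserved characters such as '.' and '-' unescaped and emits uppercase hex digits, so the output differs textually from the examples above while decoding to the same URI.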

URI References with Fragment Identifiers

URI references which contain fragment identifiers are problematic in a semantic web environment where the precise form of the URI must be preserved during interchange to ensure correct interpretation. Many software applications, gateways, proxies, and other intermediaries may discard the fragment identifier portion of an HTTP request URI, resulting in miscommunication between two semantic web agents which are communicating with one another via HTTP.

E.g., a URIQA request such as

MGET /foo#bar HTTP/1.1
Host: example.com

can arrive at the authoritative server as

MGET /foo HTTP/1.1
Host: example.com

hence being misunderstood as a request for a description of an entirely different resource than was intended by the requesting agent.

As a workaround for this special case, it is recommended that when submitting a URIQA request where the request URI contains a fragment identifier, the full URI should be redundantly specified using the special HTTP message header URIQA-uri:

E.g.

MGET /foo#bar HTTP/1.1
Host: example.com
URIQA-uri: http://example.com/foo#bar

A URIQA enlightened server must check for the presence of the URIQA-uri: header. If a URIQA-uri: header value is specified, and if the request URI constitutes the base URI of the header specified URI, then the server should presume that the header specified URI denotes the intended target of the request and that the fragment identifier was lost from the request URI during transit, and should use the header specified URI when processing the request.

Thus, the following requests, as received by a URIQA enlightened server, are all equivalent:

MGET /foo#bar HTTP/1.1
Host: example.com

MGET /foo HTTP/1.1
Host: example.com
URIQA-uri: http://example.com/foo#bar

MGET /foo#bar HTTP/1.1
Host: example.com
URIQA-uri: http://example.com/foo#bar

All three requests should result in the server attempting to return a description of the resource denoted by the URI <http://example.com/foo#bar>.
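
The header check described in this section could be sketched on the server side roughly as follows. The function name resolve_target is hypothetical, and falling back to the request URI when the header does not match is an assumption, since the text does not say what a server should do in that case.

```python
from urllib.parse import urldefrag


def resolve_target(host, request_path, uriqa_uri=None, scheme="http"):
    """Return the URI whose description the request is asking for.

    host and request_path come from the Host: header and the request
    line; uriqa_uri is the value of the URIQA-uri: header, if present.
    """
    request_uri = f"{scheme}://{host}{request_path}"
    if uriqa_uri is None:
        return request_uri
    # Compare the two URIs with any fragment identifier removed.
    request_base, _ = urldefrag(request_uri)
    header_base, _ = urldefrag(uriqa_uri)
    if request_base == header_base:
        # The fragment was presumably lost in transit: trust the header.
        return uriqa_uri
    # Assumption: ignore a header that names an unrelated URI.
    return request_uri


if __name__ == "__main__":
    print(resolve_target("example.com", "/foo", "http://example.com/foo#bar"))
```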

Descriptions as Fall-Back Representations

A URIQA enlightened server should attempt to provide a description of a resource when a general GET request fails. This is particularly important for resources which are denoted by http: URIs but which may have no (other) web-accessible representations, such as vocabulary terms, abstract concepts, physical entities, etc.
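
A rough sketch of this fall-back behaviour, assuming two hypothetical in-memory lookups (one for ordinary representations, one for descriptions), might look like this. The result labels are illustrative; the text above does not prescribe status codes.

```python
def respond_to_get(uri, representations, descriptions):
    """Answer a general GET, falling back to a description if needed.

    representations and descriptions are hypothetical dicts mapping a
    URI to stored content.
    """
    if uri in representations:
        return ("representation", representations[uri])
    if uri in descriptions:
        # No ordinary representation exists: serve the description.
        return ("description", descriptions[uri])
    return ("not-found", None)


if __name__ == "__main__":
    reps = {"http://example.com/page": "<html>...</html>"}
    descs = {"http://example.com/terms#Widget": "<rdf:RDF>...</rdf:RDF>"}
    print(respond_to_get("http://example.com/terms#Widget", reps, descs))
```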

URIQA Service Extended Parameters

In addition to the required parameters for all URIQA semantic web services, a URIQA service will ideally also support the following additional optional parameters and their associated functionality:

Parameter   Value *       Description
format      <MIME TYPE>   Return RDF statements serialized using the specified encoding. E.g.:
                            application/rdf+xml   Serialize using RDF/XML [default]
                            text/turtle           Serialize using Turtle
                            text/n3               Serialize using N3
form        <URI>         The URL encoded URI denoting an alternative form of description to return, other than a concise bounded description.
graph       <URI>         The URL encoded URI denoting the graph from which description statements are to be extracted (see further explanation below).
inference   exclude       Only explicitly asserted statements are returned. No inference is performed when processing the request. [default]
            include       All explicitly asserted statements, as well as all inferable statements which are otherwise appropriate for the response, are returned.

* All service parameter values must be URL encoded as appropriate.
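
To illustrate how a service might read these parameters and apply the stated defaults, here is a small sketch. The function name parse_service_query is hypothetical, and rejecting unknown inference values is an assumption rather than a requirement stated above.

```python
from urllib.parse import parse_qs

# Defaults as listed in the parameter table.
DEFAULTS = {"format": "application/rdf+xml", "inference": "exclude"}
KNOWN = ("uri", "method", "format", "form", "graph", "inference")


def parse_service_query(query):
    """Decode a URIQA service query string into a dict of parameters."""
    raw = parse_qs(query)  # percent-decodes values; '+' becomes a space
    params = dict(DEFAULTS)
    for name in KNOWN:
        if name in raw:
            params[name] = raw[name][0]
    if params["inference"] not in ("exclude", "include"):
        # Assumption: treat any other value as a client error.
        raise ValueError(f"unknown inference mode: {params['inference']}")
    return params


if __name__ == "__main__":
    print(parse_service_query(
        "uri=urn%3aissn%3a1560%2d1560&format=text%2Fturtle"))
```

Because parse_qs decodes a literal '+' as a space, a MIME type such as application/rdf+xml needs its '+' sent as %2B, which is consistent with the requirement that parameter values be URL encoded.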



Named Graphs

A named graph [6] can be specified for explicit MGET, MPUT, and MDELETE requests as well as for GET, POST and DELETE requests to a URIQA service.

When specified for an explicit MGET, MPUT, or MDELETE request, the graph must be defined in the request header using URIQA-graph: [URI]. E.g.

MGET /foo HTTP/1.1
Host: example.com
URIQA-graph: http://example.com/someSpecificRDFGraph

When specified for a GET, POST, or DELETE request to a URIQA web service, the graph must be specified in the query URI using the parameter graph=[URI], as defined above.

Only one graph may be specified for a given URIQA request.

Description Requests

For MGET requests (either explicit or using method=), if a graph is specified, the description is extracted only from the specified graph, not from the entire knowledge base.

For MPUT requests (either explicit or using method=), if a graph is specified, statements are added to the knowledge base within the specified graph. If the graph does not already exist, it will be created by the system.

For MDELETE requests (either explicit or using method=), if a graph is specified, statements are removed from the knowledge base only from the specified graph.
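
The three rules above can be illustrated with a toy in-memory knowledge base. This is a sketch under several assumptions: statements are plain (subject, predicate, object) tuples, a "description" is simplified to the statements whose subject is the resource (a real concise bounded description is defined in [3]), and statements added without a graph are kept under the key None. The class name KnowledgeBase is hypothetical.

```python
class KnowledgeBase:
    """Toy store: graph name -> set of (subject, predicate, object)."""

    def __init__(self):
        self.graphs = {}

    def _selected(self, graph):
        # No graph given: operate on the entire knowledge base.
        if graph is None:
            return list(self.graphs.values())
        return [self.graphs[graph]] if graph in self.graphs else []

    def mget(self, uri, graph=None):
        """Statements about uri, restricted to one graph if given."""
        return {t for g in self._selected(graph) for t in g if t[0] == uri}

    def mput(self, statements, graph=None):
        """Add statements; the graph is created if it does not exist."""
        self.graphs.setdefault(graph, set()).update(statements)

    def mdelete(self, uri, statements=None, graph=None):
        """Remove given statements, or all statements about uri."""
        for g in self._selected(graph):
            if statements is None:
                g.difference_update({t for t in g if t[0] == uri})
            else:
                g.difference_update(statements)


if __name__ == "__main__":
    kb = KnowledgeBase()
    foo = "http://example.com/foo"
    kb.mput({(foo, "ex:title", "Foo")}, graph="http://example.com/g1")
    print(kb.mget(foo, graph="http://example.com/g1"))
```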

Named Graph Requests

For GET requests to a URIQA service where neither the method= nor the uri= parameters are specified, and where the graph= parameter is specified, all statements contained in the graph will be returned, rather than a description of a particular resource.

For POST requests to a URIQA service where neither the method= nor the uri= parameters are specified, and where the graph= parameter is specified, all statements contained in the input will be stored within the specified graph, rather than only a description of a particular resource.

For DELETE requests to a URIQA service where neither the method= nor the uri= parameters are specified, and where the graph= parameter is specified, the entire graph, including all statements within that graph, is removed from the knowledge base.
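
A service could tell description requests apart from whole-graph requests with a dispatch step along these lines. This is only a sketch: the function name classify_service_request and the returned labels are hypothetical, and treating a GET that carries uri= as an MGET follows the required-parameter table earlier in this document.

```python
URIQA_METHODS = ("MGET", "MPUT", "MDELETE")

# Whole-graph operations, keyed by HTTP method.
GRAPH_OPERATIONS = {
    "GET": "return-graph",
    "POST": "store-into-graph",
    "DELETE": "remove-graph",
}


def classify_service_request(http_method, params):
    """Decide what a request to the URIQA service is asking for.

    params is a dict of already-decoded query parameters.
    """
    has_uri = "uri" in params
    has_method = "method" in params
    if not has_uri and not has_method and "graph" in params:
        if http_method in GRAPH_OPERATIONS:
            return GRAPH_OPERATIONS[http_method]
    elif has_uri and http_method == "GET":
        return "MGET"
    elif has_uri and http_method == "POST" and has_method:
        if params["method"] in URIQA_METHODS:
            return params["method"]
    raise ValueError("unsupported combination of method and parameters")


if __name__ == "__main__":
    print(classify_service_request(
        "DELETE", {"graph": "http://example.com/someSpecificRDFGraph"}))
```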


Frequently Asked Questions

Why not use a link in the representation to provide the URI via which the description can be accessed?

First, this requires that you download the entire representation to get to that link, which is simply inefficient. Secondly, the insertion of the link would be encoding specific and not all encodings facilitate such links. Thirdly, the solution would not be consistent across all encodings, and for each new encoding, yet another way to embed/encode that link would have to be devised. Finally, it precludes processing of descriptions by agents who do not support the particular encodings yet can still operate on RDF/XML descriptions.

Why not have a standardized suffix, prefix, or other URI transformation that allows one to derive from a resource URI another URI via which the resource description can be accessed?

Such an approach violates the sanctity of the web authority's URI space, since an agent has no way of knowing whether the web authority subscribes to such a transformation. It also violates the principle of URI opacity. Finally, such a transformation would have to work with any URI scheme whatsoever, which may very well not be feasible.

Why not just use a special header to indicate that you want a description rather than some other form of representation?

If the header is not understood by the server, then it may provide a response which appears to be successful but is not a description. In the case of a resource which has a default representation encoded as RDF/XML, it may not be clear even upon examination of the response that a misunderstanding has occurred. And that response may also be exceptionally large, such as in the case where the agent asks for a description of an MPEG4 encoding of a full-length motion picture, but because the server does not understand the special header, returns the entire 1.2GB MPEG4 video stream.

Furthermore, in the case of PUT and DELETE, use of a special header to capture the same semantics as defined for MPUT and MDELETE would change the semantics of PUT and DELETE in a way that is unlikely to be considered acceptable, as MPUT and MDELETE can be used to (partially) modify a resource description and have no effect on the state (or representation of the state) of the resource itself.

Why not use a MIME type and content negotiation to request a description?

Content negotiation is designed to allow agents to select from among a set of alternate encodings. The distinction between a resource description and (other kinds of) resource representations is not based on any distinction in encoding. In fact, a given description (which is itself a resource) may have several available encodings (RDF/XML, XTM, N3, etc.). Thus, if you use content negotiation to indicate that you want a description, you can't use it to indicate the preferred encoding of the description (if/when encodings other than RDF/XML are available).

Why not first use a HEAD request to get another URI via which the description can be accessed?

Firstly, this requires an agent to make two requests to the server, rather than just one, which is inherently inefficient. Secondly, while each description is a resource in its own right and can be denoted by a distinct URI, it is seldom necessary to give descriptions distinct identity and therefore unnecessarily burdensome to require that every description of every resource be given an explicit URI simply in order to be able to access a resource's description.

Why not use DDDS (DNS) to get another URI via which the description can be accessed?

While DDDS would be more efficient than the HEAD request approach above, it adds yet another architectural layer into the mix, and it is unclear at this point whether or not that introduces additional challenges, particularly in areas of configuration, management, and security. While this approach should be explored further, at present it does not appear to offer a better solution than the present URIQA methods. It also shares a shortcoming with the HEAD request approach in that it requires all descriptions to be given an explicit identity, which in most cases is unnecessary management overhead.

Why not use OPTIONS to get the URI of a service/portal via which the description can be accessed?

This is a slightly better variant of the HEAD request approach, in that (a) the OPTIONS request is relevant to an entire web authority, and with caching, means that even if the request is being made frequently for many resources denoted by URIs grounded in that web authority, there is not as significant an efficiency hit as for the same number of HEAD requests (which even if cached may only be made once each), and (b) an agent may choose to keep track of the service URI provided by the OPTIONS request response and thus forego additional requests to the same web authority. Of all of the alternative approaches to URIQA as presently defined, this approach has the greatest merit. Nevertheless, it presumes that resource descriptions will always be centrally managed by some web service, and limits the freedom of web authorities to decide how best to provide resource descriptions. The present URIQA design imposes no implementation constraints whatsoever.

Why not use OPTIONS to get the definition of a regular transformation to apply to the resource URI in order to generate another URI via which the description can be accessed?

This is a variant of the OPTIONS approach above which does not violate the sanctity of the web authority's namespace as would a more general syntactic transformation, as each web authority is free to specify its own transformation as it sees fit. It shares, though, the shortcomings of the other OPTIONS approach in that it incurs multiple requests and requires explicit identity of all descriptions. It also has the additional drawback of placing both a greater implementation burden and a heavier runtime burden on clients, in that they will have to be able to apply the transformations locally. Such concerns are particularly relevant for mobile clients.


Notes and References

[1] W3C Semantic Web Activity, http://www.w3.org/2001/sw/

[2] Architecture of the WWW, http://www.w3.org/TR/webarch/

[3] CBD - Concise Bounded Description, http://sw.nokia.com/uriqa/CBD.html

[4] Resource Description Framework (RDF), http://www.w3.org/RDF/

[5] RFC2616 - HTTP 1.1, http://www.ietf.org/rfc/rfc2616.txt

[6] Named Graphs, Provenance and Trust, http://www.hpl.hp.com/techreports/2004/HPL-2004-57.html


Contact: Patrick Stickler, Nokia

Copyright © 2002-2010 Nokia. All rights reserved.