December 18th, 2008
Many people say that Web 3.0, the “Semantic Web”, is coming. But what is the Semantic Web all about?
In the beginning, the Internet was created as a way to share information between people in different organizations, like universities and research centers. In the early 90s the Internet opened up to the public and grew exponentially, and so did the information people shared. An automated way to process this ever-increasing amount of information was needed. Nowadays, a keyword-based web search engine alone is not enough to find information on the web. This is where the Semantic Web comes in.
The Semantic Web is the collection of standards that were created to share information not only between humans but also between machines. It starts by defining a common syntax (XML) and a common representation for the shared data (RDF). But the Semantic Web requires more than a common syntax and representation.
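To make the XML and RDF layers a bit more concrete, here is a minimal sketch in Python that reads a tiny RDF/XML document into subject-predicate-object triples. The person URI and the englishLevel value are made up for illustration, and the reader handles only this simple shape of document; it is not a full RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SW_NS = "http://southworks.net/core/testOntology"  # namespace used in this post's query

# Hypothetical document: the person URI and the property value are illustrative only.
RDF_XML = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:sw="{SW_NS}">
  <sw:Developer rdf:about="http://example.org/people/alice">
    <sw:englishLevel>high</sw:englishLevel>
  </sw:Developer>
</rdf:RDF>
"""

PREFIXES = {RDF_NS: "rdf", SW_NS: "sw"}

def short_name(tag):
    """Turn ElementTree's '{namespace}local' form into 'prefix:local'."""
    namespace, local = tag[1:].split("}")
    return f"{PREFIXES[namespace]}:{local}"

def to_triples(rdf_xml):
    """Read typed nodes with literal properties into (subject, predicate, object)."""
    triples = []
    for node in ET.fromstring(rdf_xml.strip()):
        subject = node.attrib[f"{{{RDF_NS}}}about"]
        # The element name of a typed node is its class.
        triples.append((subject, "rdf:type", short_name(node.tag)))
        for prop in node:
            triples.append((subject, short_name(prop.tag), prop.text))
    return triples

triples = to_triples(RDF_XML)
for triple in triples:
    print(triple)
```

Each XML element ends up as one or more plain statements about a resource, which is the form that the rest of the Semantic Web stack works with.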
Do computers “understand” RDF?
For machines to process the information, they must be able not only to read it but also to “understand” it. I’m referring not only to understanding that “<sw:Developer>” is an XML tag in a namespace, but also that Developer represents an entity within a context (an ontology).
To retrieve information that is stored as RDF (XML) and described by a specific ontology (OWL), a query language is needed. That is why SPARQL was created. When the data is also run through basic inference, queries can use facts that were never stated directly; for example, every “Developer” is a “Southy” (within the company context), so you will probably want to query for “all Southies” instead of Developers. A programmer can then write SPARQL queries that systems use to retrieve this interchangeable information.
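The “every Developer is a Southy” inference can be sketched in a few lines of Python. This is only an illustration of the subclass rule, not how any particular semantic engine does it; the people, the extra Designer class and the short prefixed names are hypothetical.

```python
# Hypothetical facts and ontology; all names are illustrative.
facts = {
    ("alice", "rdf:type", "sw:Developer"),
    ("bob", "rdf:type", "sw:Designer"),
}
ontology = {
    ("sw:Developer", "rdfs:subClassOf", "sw:Southy"),
    ("sw:Designer", "rdfs:subClassOf", "sw:Southy"),
}

def infer_types(facts, ontology):
    """Apply the subclass rule repeatedly until no new type facts appear."""
    superclasses = {}
    for sub, _, sup in ontology:
        superclasses.setdefault(sub, set()).add(sup)
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for subject, predicate, obj in list(known):
            if predicate != "rdf:type":
                continue
            for sup in superclasses.get(obj, ()):
                new_fact = (subject, "rdf:type", sup)
                if new_fact not in known:
                    known.add(new_fact)
                    changed = True
    return known

def instances_of(cls, triples):
    """Return the sorted subjects that have the given class as their type."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)

all_facts = infer_types(facts, ontology)
print(instances_of("sw:Southy", facts))      # stated facts only
print(instances_of("sw:Southy", all_facts))  # stated plus inferred facts
```

Nobody is stated to be a Southy, so a query over the raw facts finds no one; only after the subclass rule is applied does “all Southies” return both people.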
Queries in the Semantic Web
So what does a SPARQL query look like?
In the example below, I want to answer the following question: “How can we choose 2 people who have worked together, both of whom have a high English level and at least 1 of whom has management skills?”
PREFIX sw: <http://southworks.net/core/testOntology>
SELECT ?x ?y
WHERE {
  ?x sw:workedWith ?y ;
     sw:englishLevel "high" .
  ?y sw:englishLevel "high" .
}
This is a complex query where I’m querying not only for facts but also for information that is inferred from those facts.
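To give an idea of what an engine does with such a query, here is a rough Python sketch that matches the workedWith and englishLevel patterns against a handful of in-memory triples. The people and their data are invented, and the matcher is a naive illustration of triple-pattern matching, not a real SPARQL implementation.

```python
def match(patterns, triples, binding=None):
    """Yield variable bindings that satisfy every triple pattern."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, candidate)

# Hypothetical data set.
triples = [
    ("alice", "sw:workedWith", "bob"),
    ("alice", "sw:workedWith", "carol"),
    ("alice", "sw:englishLevel", "high"),
    ("bob", "sw:englishLevel", "high"),
    ("carol", "sw:englishLevel", "medium"),
]

# The triple patterns, written as (subject, predicate, object).
query = [
    ("?x", "sw:workedWith", "?y"),
    ("?x", "sw:englishLevel", "high"),
    ("?y", "sw:englishLevel", "high"),
]

results = [(b["?x"], b["?y"]) for b in match(query, triples)]
print(results)
```

Variables such as ?x start out unbound and get fixed by the first pattern that mentions them; every later pattern has to agree with that binding, which is why alice and carol are dropped.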
Note: For an interesting example of Advanced SPARQL queries in IMM, please see http://blogs.msdn.com/imm/archive/2008/10/28/advanced-sparql-in-imm.aspx
Evolving the Semantic Web
You don’t need to be much of a sceptic to see that humans still have to write the queries, and there is still no released commercial system that answers natural language questions by translating them into SPARQL queries (products like Cypher are still in beta). What the future brings here may depend on the evolution of these standards and of massively parallel computing.
I’ve left out an important detail though: the information made available is naively trusted. When these systems publish and consume information on the Internet, there must be a way to prove that the information is true, and to establish a trust relationship with the publisher (as we humans do). So far there are no implementations of an actual trust network, but plans for a public network based on a Public Key Infrastructure may become pillars of a future Trusted Semantic Web.
I’m planning to write specific articles on:
- Real world applications, like IMM
- Semantic engine internals