Archive for June, 2008

You know you’re about to make it big when…

Thursday, June 19th, 2008

the defense industry uses your technology. The funny thing about the semantic web conference was the vendor showcases. I went to a few, where a given semantic technology toolkit vendor would showcase how one can make use of their tool or tools. The thing that struck me was that each and every one that I attended focused on how they were helping the US Department of Defense catch terrorists or extract evidence of links to terrorism.

There was one vendor showcase in particular that I got a great laugh out of. I was sitting in the very front row, right in front of the presenter, who was about 4 feet away from me. Before I go on, I should give you some information about myself: I am a Muslim who looks very Muslim, i.e. the full beard, Middle Eastern looking, etc. I, like all the other attendees, had a name tag on with my full name written in large print, visible from several feet away. So he starts his talk about the tool that they are selling. Midway through the presentation he starts a demonstration of how the tool was used to prove that two people who claimed not to know each other did in fact know each other and that they were smuggling funds between them.

As he went through this presentation, he paused, saw me and my name and became really uncomfortable. You see, the people he was talking about had the exact same last name as me, and I don’t think he really saw me sitting there until that point. At that point he tried really, really hard to come up with another example to demonstrate the tool, but for the life of him he could not. I had to fight the urge to stir the pot to see how much more uncomfortable I could make him feel.

Regardless, you know this technology is going to take off when the US Department of Defense is interested in it. It seems that they are dumping large sums of money into its development and use.

Semantic Web Conference 2008

Thursday, June 19th, 2008

I had the chance to attend the semantic web conference that was held in San Jose from May 18th to 22nd. Before I go any further, there may be many of you who are scratching your heads thinking “What is the semantic web?” Dictionary.com defines semantics as the following:

  1. Linguistics.

    • the study of meaning.
    • the study of linguistic development by classifying and examining changes in meaning and form.
  2. Also called significs. the branch of semiotics dealing with the relations between signs and what they denote.
  3. the meaning, or an interpretation of the meaning, of a word, sign, sentence, etc.: Let’s not argue about semantics.

What we are talking about here is devising a means for a computer to understand the meaning of language. Compare that with how a search works today. When you enter a question into Google such as “What is the best way to exercise your lower abs?”, Google analyzes the phrase or sentence you entered into the search box and breaks it up into keywords. It then uses these keywords to search its repository for the web pages that best match them. In the example above, it will extract words like ‘best’, ‘exercise’, ‘lower’ and ‘abs’ and return matches that have a high likelihood of containing the information you’re looking for. Google has little to no idea what the sentence “What is the best way to exercise your lower abs?” means.
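To make the keyword approach concrete, here is a minimal sketch in Python. It is only an illustration of the general idea and not how Google actually works: the stop-word list, the function names and the scoring rule (count the shared keywords) are all assumptions made up for this example.

```python
import re

# Hypothetical stop-word list, invented for this example.
STOP_WORDS = {"what", "is", "the", "way", "to", "your", "a", "of", "how", "do", "you"}


def extract_keywords(query):
    """Lower-case the query, split it into words, and drop the stop words."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    return [w for w in words if w not in STOP_WORDS]


def score(keywords, document):
    """Count how many of the query keywords appear in the document."""
    doc_words = set(re.findall(r"[a-z0-9]+", document.lower()))
    return sum(1 for k in keywords if k in doc_words)


def search(query, documents):
    """Return the documents sharing at least one keyword, best match first."""
    keywords = extract_keywords(query)
    scored = [(score(keywords, d), d) for d in documents]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [d for s, d in ranked if s > 0]
```

Nothing in this sketch knows what the question means. It only checks which words overlap, which is exactly the limitation described above.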

What semantics hopes to achieve is to extract information from the internet such that a computer or set of computers could perform a search understanding exactly what “What is the best way to exercise your lower abs?” means and would therefore return search results matching exactly what you are looking for. In the above example it’s fairly clear what I’m looking for, but what if I entered something like “paris hilton”? What exactly am I looking for? Am I interested in the Hilton hotel located in Paris, France, or am I interested in information about the trash TV celebrity? By now I’m sure you get the idea.

Needless to say, getting a computer to understand meaning isn’t an easy task. The community that is hoping to make semantics a reality therefore holds two conferences a year on the topic to gather like minds and to showcase how the technology has progressed, along with any real world uses of it. The conference has been held for the past four years and has grown steadily each and every year. This year there were over 1100 attendees, which is remarkable given the downturn in the US economy. The remainder of this post will outline the technologies of interest and some real implementations of the technology.

Technologies and Specifications

Gleaning Resource Descriptions from Dialects of Languages (GRDDL)
GRDDL is a W3C specification to allow people to make their content semantically accessible. For example, one can have a website containing all sorts of data within it that may or may not be structured. The GRDDL specification allows one to embed a link to an XSLT transform that will take the XHTML of the webpage as an input and output a set of RDF triples representing the content of your page.
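As a rough illustration of what "XHTML in, RDF triples out" means, here is a small Python sketch. Note that it is a stand-in only: real GRDDL uses an XSLT transform linked from the page, whereas this example hard-codes the extraction in Python. The sample page, the example.org URLs, the function name and the use of the raw rel value as a predicate are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"  # Dublin Core title property

# Hypothetical sample page, invented for this example.
PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Semantic Web Conference 2008</title></head>
  <body>
    <a rel="author" href="http://example.org/people/alice">Alice</a>
  </body>
</html>"""


def page_to_triples(xhtml, page_url):
    """Return (subject, predicate, object) tuples describing the page."""
    root = ET.fromstring(xhtml)
    triples = []
    # The page title becomes a Dublin Core title statement about the page.
    title = root.find(f"{XHTML_NS}head/{XHTML_NS}title")
    if title is not None and title.text:
        triples.append((page_url, DC_TITLE, title.text))
    # Each link with a rel attribute becomes a statement about the page.
    # Simplification: the raw rel value is used as the predicate, not a full URI.
    for link in root.iter(f"{XHTML_NS}a"):
        rel, href = link.get("rel"), link.get("href")
        if rel and href:
            triples.append((page_url, rel, href))
    return triples
```

Each tuple reads as a small statement, for example "this page has the title Semantic Web Conference 2008", which is the shape of data an RDF store works with.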

Sesame
Sesame is an open source RDF framework with support for RDF Schema inferencing and querying. It can be deployed on top of a variety of storage systems (relational databases, in-memory, file systems, keyword indexers, etc.), and offers tools to developers to leverage the power of RDF and RDF Schema.

ELMO
Elmo is a toolkit for developing Semantic Web applications using Sesame. Elmo wraps Sesame, providing a dedicated API for a number of well known web ontologies including Dublin Core, RSS and FOAF. The dedicated API makes it easier to work with RDF data for the supported ontologies. Elmo also offers a set of tools related to the supported ontologies, including an RDF crawler, a FOAF smusher and a FOAF validator.

JENA
JENA is a Semantic Web framework for working with RDF, RDFS, OWL and SPARQL. Included within the framework are an RDF API, an OWL API, a SPARQL query engine and a rule-based inference engine.

Topaz
Topaz is an RDF persistence and querying service framework.

NetKernel
1060 NetKernel is a resource oriented microkernel and RESTful application server created from the convergence and unification of the powerful fundamental concepts found in the World Wide Web and Unix. NetKernel is being used by a large number of major semantic applications.

Persistent URLs
A PURL is a Persistent Uniform Resource Locator. Functionally, a PURL is a URL. However, instead of pointing directly to the location of an Internet resource, a PURL points to an intermediate resolution service. The PURL resolution service associates the PURL with the actual URL and returns that URL to the client. The client can then complete the URL transaction in the normal fashion. In Web parlance, this is a standard HTTP redirect.
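The resolution step can be sketched in a few lines of Python. This is a toy model and not the actual PURL server software: the lookup table, the paths, the example.org URLs and the function names are hypothetical, and 302 is used here simply as one common HTTP redirect status.

```python
# Hypothetical PURL table: persistent path -> current location of the resource.
PURL_TABLE = {
    "/docs/report": "http://www.example.org/2008/reports/final.html",
}


def resolve(purl_path):
    """Answer a request for a PURL with a redirect, or 404 if it is unknown."""
    target = PURL_TABLE.get(purl_path)
    if target is None:
        return 404, {}
    # The client follows the Location header to reach the actual resource.
    return 302, {"Location": target}


def move_resource(purl_path, new_url):
    """Record that the resource moved; the PURL itself stays the same."""
    PURL_TABLE[purl_path] = new_url
```

The point of the indirection is that when the resource moves, only the table entry changes. Anyone who bookmarked the PURL still ends up in the right place.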

Implementations

The following is a list of applications that have been created using semantic technologies in part or as a whole. Examples of implementations such as these provide direction as to where this technology can be used to solve real world problems.

Powerset
Powerset is a website that has sucked in the content from Wikipedia and Freebase. They have taken this content and translated it into a very, very large RDF store using NetKernel (or so I believe). Using their interface, one can therefore search the data using semantic technology. If you are doing research on a given topic and have confidence in the data contained within Wikipedia, then Powerset would be of great use. Search results and content related to a particular item are easily retrievable using Powerset. For example, if you were researching the pouring of concrete and entered “How do you pour concrete”, you would see all the content that matches the information you are looking for, with Powerset understanding as best it can the meaning of what you are asking. Upon selecting the first result, you would be given not only information about concrete but also a slew of related content.

Twine
Twine is a social network focused on sharing your interests. In other words, it creates groupings of people interested in the same stuff. It is currently in private beta but hopes to go into public beta sometime in the late summer or early fall. The concept revolves around the idea that if you are interested in a certain topic, Twine can offer information, articles, etc. that fit within that area of interest. This is where the use of semantic technology fits in. They use semantic technology to search for and filter content related to what you are discussing within your group. I was extremely disappointed that they made no attempt to showcase their technology by providing a walk-through of their application, but rather spent the entire time talking high-level about the application and showing a stupid video created about the website by some pre-teen. A real waste of an opportunity.

Garlik
Garlik is the darling of the semantic world. They are an example of where semantic technology can be used to solve real world problems. In this case, they use semantic technology to help protect your identity online. Currently they offer their services within the UK. Information is sketchy as to how exactly they do this using semantic technologies. I was really disappointed that they sent a non-techy to speak about their product, but that seemed to be the running theme for these types of companies at the conference: keep your secrets guarded, regardless of whether sharing them would further the proliferation of the technology.

Zepheira Remix
Now this tool is something that follows along the lines of Garlik in that it provides a solution to a real world problem. In this case they are focusing on sharing data. They have created a tool that significantly reduces the complexity of using semantics to share data. Remix allows you to import two files and then merge them, query them and generate content from them, with no need to understand anything about semantics or even the underlying technology. Remix features an AJAX front end that allows users to massage the data as they see fit and extract a meaningful summary of the data in HTML format. Zepheira’s goal is to advocate for the use of semantic technologies while encouraging others to reduce the barriers to entry for harnessing the power of the technology, i.e. let’s make it easier for people to use semantics through frameworks and applications.

All in all, the conference was great. I have to admit, before attending the conference I was a skeptic about the technology, but after the conference I think there are some great possibilities for this technology in the very near future. I look forward to attending next year. The only thing that I believe would improve the conference is more detail about the technology and its implementation and usage, rather than just skimming over things at a very high level.

No Fluff Just Stuff, April 2008

Tuesday, June 17th, 2008

With forecasts of April snow, our 2Paths contingent of two headed off at the crack of dawn one Friday for Bellevue, WA to attend the No Fluff Just Stuff - Pacific Northwest Software Symposium (NFJS). Having only spent 10 minutes at the border (whew!), we arrived early enough in the Seattle area to fit in some Fluff (aka shopping and dining) before getting into the Stuff.

After a quick introduction by Evangelist Scott Davis, and with hopes of winning an iPhone by the end of the weekend, I headed right into his double-bill of sessions about Groovy. The first one, entitled “Groovy, the Blue Pill: Writing Next Generation Java Code in Groovy”, was a good gentle introduction to Groovy basics for neophytes like myself. Now that my interest was piqued, I was eager to continue on with Scott’s next session, “Groovy, The Red Pill: Metaprogramming, the Groovy Way to Blow a Buttoned-Down Java Developer’s Mind”. Never mind the Blue Pill - the Red Pill is where it’s at!

Scott’s enthusiasm for Groovy was definitely contagious, and my mind started churning, trying to think of ways to start introducing Groovy into current projects or future projects at 2Paths. An allure of Groovy is that it’s a dynamic language and can be run on the JVM. Groovy is implemented in Java, so the two languages offer seamless interoperability. Groovy compiles down to Java bytecode, so Groovy code can call Java code, and vice-versa. This would make integration into our existing infrastructure much simpler.

The Blue Pill was starting to take effect, as Scott showed us the terseness of the language. No need to laboriously write getters and setters, among many other nifty language shortcuts. Who needs semicolons or parentheses anymore? Because Groovy really is Java under the hood, programmers unwilling to let go of that extra baggage can continue to use it and nothing will break. We were then shown demos of how much more quickly we could bang out unit tests in Groovy.

In the next session, the Red Pill stuff was pretty mind-blowing. It introduced method pointers, closures and the ExpandoMetaClass. Method pointers make our code more readable and understandable by allowing us to alias methods with names tailored to our own business logic. Closures are an exciting and powerful inclusion in Groovy that aren’t available in Java, allowing us to pass around executable snippets of code. And last but certainly not least, there’s the ExpandoMetaClass, which lets us either do a lot of really cool stuff or get ourselves into a heap of trouble. With the ExpandoMetaClass we can intercept method calls and inject methods into any class. Scott showed us some fun examples that got us all fired up wanting to try more!

With my introduction to Groovy setting the stage for the weekend, I went on to choose more sessions on Groovy (Design Patterns, Testing, Groovy with Spring), but also attended other sessions such as Spring Configuration, Regular Expressions, and GIS for Web Developers. I found that most of the other sessions affirmed knowledge I already had, while augmenting it with good snippets here and there that were new to me. Venkat Subramaniam was the other main Groovy Prophet, and I came home with his book “Programming Groovy” in hand, eager to start integrating Groovy at least into our testing infrastructure.

The atmosphere at NFJS was inspiring, as the speakers all had great enthusiasm for the technologies they were touting, and the audience was generally equally keen. It felt great to be in a large conference hall amongst so many other like-minded programmers, all sharing and getting fired up about what we do and our seemingly unlimited potential. I fully recommend NFJS (regardless of me not winning an iPhone), and would love the opportunity to go again!