LAK11: From taxonomies to folksonomies and back: The Semantic Web

Before social media happened to us, there were vocabularies such as the ones encountered in the Semantic Web. You could see these everywhere – from programming languages to news to business. Specifically, vocabularies such as those embedded in NewsML-G2, EDIFACT and ANSI X12 echoed a need to standardize data exchange across a wide range of industries, giving an integrated technology platform for transacting business documents across the supply chain.

SOAP and REST based Web services came along (followed by the Windows Communication Foundation) to offer an easier, XML based way to communicate the same standardized information. Various systems emerged to manage XML, including object-relational DBMSs (like Cloudscape) and Tamino, one of the first native XML databases. Many tried to marry the relational database architecture to the XML platform. Meanwhile, DTDs gave way to XSDs as the schema definition component. On the BI side, Hyperion Essbase and Cognos arrived with compelling BI facilities, accompanied by SQL Server OLAP and Oracle's data warehouse offerings.

In education, the AICC and SCORM standards (and IMS QTI, Question and Test Interoperability) set the vocabulary for communication between a course (and its assessments) and the Learning Management System. DITA (Darwin Information Typing Architecture) emerged as an XML based standard for authoring, managing and publishing content. The S1000D standard emerged for the technical documentation of equipment. Interfaces have since emerged between SCORM on the one end and DITA and S1000D on the other.

But then social media happened to us, and suddenly everything went open and “un-designed”. The term folksonomy came into vogue to describe common, non-centralized vocabularies built from user-assigned tags, on which much of the effort by search engines like Google was based. Even now, tags and their analysis form a central part of social network analysis.
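To make the idea of tag analysis a little more concrete, here is a minimal sketch in Python. The tagged items and the function name `tag_cooccurrence` are made up for illustration. It only counts how often two tags are applied to the same item, which is one simple starting point for spotting related terms in a folksonomy, not a full social network analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each item (say, a blog post) with the free-form tags users gave it.
tagged_items = {
    "post-1": {"semanticweb", "rdf", "lak11"},
    "post-2": {"rdf", "linkeddata"},
    "post-3": {"semanticweb", "rdf", "linkeddata"},
}

def tag_cooccurrence(items):
    """Count how often each unordered pair of tags appears on the same item."""
    counts = Counter()
    for tags in items.values():
        # sorted() gives every pair a stable alphabetical order,
        # so ("rdf", "semanticweb") is never also stored as ("semanticweb", "rdf").
        for pair in combinations(sorted(tags), 2):
            counts[pair] += 1
    return counts

pairs = tag_cooccurrence(tagged_items)
for (first, second), n in sorted(pairs.items()):
    print(first, second, n)
```

Pairs with high counts are candidates for terms that the community treats as related, without anyone having designed that relationship up front.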

As a result, business and education technology have had to play catch-up with social media (SoMe). Perhaps we are still waiting for maturity in the education space, but business has already shown good signs of adopting and using social media, once the transactional aspect and the commercial angle were worked out. However, SoMe fitted well into information spaces, customer relationships, sales and marketing, but not, in any widespread way, into the core aspects of transaction and operations.

The answer was the Semantic Web, which attempts to bring the social media (SoMe) quotient and the EDI quotient into some kind of alliance, and extends far beyond that to encompass search, business intelligence and big data.

Many views exist on why the Semantic Web would fail. Stephen, speaking from the SoMe side and favouring personal computing, felt intuitively that the Semantic Web would fail because “it depends on businesses working together, on them cooperating”. There was also apprehension elsewhere that this cooperation will not happen, along with the opinion that the vast majority of data on the web is unstructured (and will remain so).

In fact, even before Stephen voiced his intuition, Clay Shirky argued in 2003 that it would fail:

“However, like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world, where deductive logic is less effective and shared worldview is harder to create than we often want to admit.”

Cory Doctorow points to other problems, such as schemas not being neutral, the general user’s competence and will, inherent bias, and the fact that metrics will influence the choice of metadata.

So Tim Berners-Lee put forward a new, less rigid term: Linked Data. Tyler Bell, writing in “Where the semantic web stumbled, linked data will succeed”, states:

“Successful adoption will often entail sacrificing standardization and semantic purity for pragmatic ease-of-use; this is where the semantic web appears to have stumbled, and where linked data will most likely succeed.”

So is there a difference between Linked Data and the Semantic Web? That seems to be an ongoing debate (Nick Gall, Lorna, Design Issues and ReadWriteWeb).

Going by the tutorial on the LinkedData website and by “Linked Data – the story so far”, Linked Data does not appear to differ from the Semantic Web in thought or implementation; it shares the same technology stack (RDF, OWL and SPARQL).
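For readers who have not met RDF and SPARQL, the shared model can be sketched in a few lines of plain Python. This is only an illustration of the idea, not how any real triple store works: the `example.org` URIs and the `match` helper are invented for the example, while the FOAF namespace is the real one. Every statement is a (subject, predicate, object) triple, and a query is a pattern with some positions left open.

```python
# Each statement is a (subject, predicate, object) triple, as in RDF.
EX = "http://example.org/"           # placeholder namespace, assumed for this sketch
FOAF = "http://xmlns.com/foaf/0.1/"  # the FOAF vocabulary namespace

triples = {
    (EX + "alice", FOAF + "name", "Alice"),
    (EX + "alice", FOAF + "knows", EX + "bob"),
    (EX + "bob", FOAF + "name", "Bob"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None plays the role of a SPARQL variable."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )

# Roughly what this SPARQL query asks:
#   SELECT ?who WHERE { ex:alice foaf:knows ?who }
known = [obj for _, _, obj in match(triples, s=EX + "alice", p=FOAF + "knows")]
print(known)
```

Because subjects and predicates are URIs rather than local column names, data published by one party can point directly at data published by another, which is the "linked" part of both Linked Data and the Semantic Web.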

So what will happen to non-standard, user-generated folksonomies? With Linked Data, ideologically, there cannot be folksonomies. SKOS seems to be an initiative that provides a migration path.

SKOS—Simple Knowledge Organization System—provides a model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabulary. As an application of the Resource Description Framework (RDF), SKOS allows concepts to be composed and published on the World Wide Web, linked with data on the Web and integrated into other concept schemes.
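As a rough sketch of what such a migration path might look like, the Python below takes a handful of folksonomy tags that have been grouped by hand and writes them out as SKOS concepts in Turtle syntax. The tag data, the `to_turtle` function and the `example.org` base URI are assumptions made for illustration; the `skos:` terms (`Concept`, `prefLabel`, `altLabel`, `broader`) come from the SKOS vocabulary itself. The sketch also assumes labels contain no quote characters, so it does no escaping.

```python
# Hypothetical tags harvested from a folksonomy, grouped by hand under a
# preferred label, with variant spellings and an optional broader concept.
concepts = {
    "semanticweb": {"pref": "Semantic Web", "alt": ["semweb", "semantic-web"], "broader": None},
    "linkeddata": {"pref": "Linked Data", "alt": ["lod"], "broader": "semanticweb"},
}

def to_turtle(concepts, base="http://example.org/tags/"):
    """Serialize the concepts as SKOS in Turtle syntax (base URI is a placeholder)."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        f"@prefix ex: <{base}> .",
        "",
    ]
    for key, c in concepts.items():
        parts = ["a skos:Concept", f'skos:prefLabel "{c["pref"]}"@en']
        # Variant tags used by the community become alternative labels.
        parts += [f'skos:altLabel "{alt}"@en' for alt in c["alt"]]
        if c["broader"]:
            parts.append(f"skos:broader ex:{c['broader']}")
        lines.append(f"ex:{key} " + " ;\n    ".join(parts) + " .")
    return "\n".join(lines)

turtle = to_turtle(concepts)
print(turtle)
```

The free-form tags survive as alternative labels, while the preferred label and the broader link add the light structure that lets the vocabulary be linked to other concept schemes.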
