The semantic web is a term used to describe a web made up not just of data but of data with attributed ‘meaning’. The result of contextualising data and meaning is ultimately ‘machine-readable meaning’, i.e. the ability for a computer to understand that the word ‘Acne’ appearing on a website refers not to a skin condition but to a brand of jeans.
The term ‘semantic web’ was coined by Tim Berners-Lee, based on the following vision:
I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers. A “Semantic Web”, which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The “intelligent agents” people have touted for ages will finally materialize.
Semantic languages & Schema.org
Since 2001, when Tim Berners-Lee set forth his vision, various semantic vocabularies have been developed to enable people to mark up web pages to give them meaning. One of the leading markup vocabularies is Schema.org, created as a collaboration between leading search engine organisations including Google, Bing, Yandex and others.
Schema.org was created to encourage widespread usage of semantic markup, and is broad in its scope, incorporating 500+ ‘types’ and 800+ ‘properties’, meaning that Schema.org can be applied to pretty much anything.
An example of Schema.org mark-up structure, with its ‘types’ and ‘properties’, is as follows, using the website information of my local pub ‘The Duke of Edinburgh’:
Type:
BarOrPub / FoodEstablishment
Properties:
Name
Address
Opening hours
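As a rough illustration, the type and properties listed above could be expressed as Schema.org JSON-LD and embedded in the pub's web page. This is only a sketch: the address and opening hours are placeholder values (not the real pub's details), and `to_script_tag` is a hypothetical helper name.

```python
import json

# Sketch of the pub example as Schema.org JSON-LD.
# Address and opening hours are placeholders, not the real pub's details.
pub = {
    "@context": "https://schema.org",
    "@type": "BarOrPub",  # BarOrPub is a sub-type of FoodEstablishment
    "name": "The Duke of Edinburgh",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",  # placeholder
        "addressLocality": "Exampletown",     # placeholder
    },
    "openingHours": "Mo-Su 12:00-23:00",      # placeholder
}


def to_script_tag(data: dict) -> str:
    """Wrap a JSON-LD dict in the script tag used to embed it in a page."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(to_script_tag(pub))
```

The `@type` value is what tells a search engine that ‘The Duke of Edinburgh’ here is a pub rather than, say, a person.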
Adding this semantic markup to the pub’s website information enables Google (or other search engines and services that rely on open web data) to more easily understand the meaning of the information provided.
For example, a very important thing that it does is help Google understand that this information is related to an organisation (pub) called the Duke of Edinburgh rather than a ‘person’ called the same thing.
The result, when searching Google for ‘pub near me’, is this:
When looking at Schema.org’s application, one important area missing to date is democracy & legislation.
DML (democracy mark-up language)
In the context of Delib’s work, the idea of applying the semantic web to democratic processes (like policy creation and legislation) opens up a whole raft of exciting opportunities to enrich democracy. We might call this specific mark-up language “Democracy Mark-up Language” (DML).
Government policy & legislative documents are famously wordy and inaccessible, but at the same time are generally well-structured and part of a wider well-structured government process.
Because policy and legislation are naturally well structured, they have the potential to be made more accessible by technology; the starting point for making this government policy data more accessible is providing an easy way to mark it up and give ‘machine-readable meaning’ to policy documents.
What this might look like from a practical perspective, using Schema.org, is something like this (using this proposed policy from the Department for Environment, Food and Rural Affairs as an example: https://consult.defra.gov.uk/animal-health-and-welfare/ban-on-electronic-training-collars-cats-and-dogs/consult_view/ )
Type:
PolicyDocument / GovernmentWork
itemtype="http://schema.org/GovernmentWork"
Properties:
Name
Description
Organisation
Audience
Start Date
End Date
Geography
Contact Point / Email
Feedback Point
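To make the list above more concrete, here is a minimal sketch of how such a policy document might be described as JSON-LD. Note the assumptions: `GovernmentWork` and `feedbackPoint` are proposals from this paper and do not exist in Schema.org today, the other property names follow the tentative mapping in Appendix 1, and `policy_jsonld` is a hypothetical helper. The consultation title is inferred from the URL slug, and the dates are left out because they are not given here.

```python
import json
from typing import Optional


def policy_jsonld(
    name: str,
    url: str,
    organisation: str,
    description: Optional[str] = None,
    audience: Optional[str] = None,
    start_date: Optional[str] = None,   # ISO 8601 date string
    end_date: Optional[str] = None,     # ISO 8601 date string
    region: Optional[str] = None,
    contact_email: Optional[str] = None,
    feedback_url: Optional[str] = None,
) -> dict:
    """Build a JSON-LD description of a policy document (sketch only)."""
    doc = {
        "@context": "https://schema.org",
        "@type": "GovernmentWork",      # proposed type, not in Schema.org
        "name": name,
        "url": url,
        "description": description,
        "organisation": organisation,   # property name as proposed in Appendix 1
        "audienceType": audience,
        "startDate": start_date,
        "endDate": end_date,
        "eligibleRegion": region,
        "email": contact_email,
        "feedbackPoint": feedback_url,  # proposed property, not in Schema.org
    }
    # Leave out anything that was not supplied.
    return {key: value for key, value in doc.items() if value is not None}


doc = policy_jsonld(
    name="Ban on electronic training collars for cats and dogs",  # inferred from the URL slug
    url="https://consult.defra.gov.uk/animal-health-and-welfare/ban-on-electronic-training-collars-cats-and-dogs/consult_view/",
    organisation="Department for Environment, Food and Rural Affairs",
)
print(json.dumps(doc, indent=2))
```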
Benefits / practical uses of DML applied to policy & legislative documents
Applying DML to a series of government policy documents would then, like the pub example, enable search engines to more easily surface policy documents relevant to individuals. For example, instead of searching for ‘pubs near me’ a person might search for ‘What government policies affect my local area?’ and the results may look something like . . .
Or, more specifically, a person might search for ‘What’s the latest with the government’s HS2 policy?’ and the latest policy document would appear, along with the ability for the citizen to give feedback on it.
Schema.org + Citizen Space
The hugely exciting bit in all of this is that we’re 90%+ of the way to making DML a reality. Breaking down what’s needed to make DML work in practice at scale, there are two key parts, reliant on Schema.org and Citizen Space.
- Agreeing the DML language (via Schema.org): policy documents are very similar to other standard documents that are covered by Schema.org’s type ‘CreativeWork’ (http://schema.org/CreativeWork), so we’re 99% there with the Schema.org language (types and properties). I think there does need to be a sub-type of ‘CreativeWork’ called ‘GovernmentWork’, which includes additional properties specific to policy and legislative documents, like ‘feedback’ (relating to the ability for citizens to give feedback on, and input into, policy).
- Easy application of DML to policy documents: sure, all of this DML idea sounds interesting in theory, but given that the practical application would involve civil servants needing to specifically add code to online documents to mark them up with DML, the idea would die very quickly – as no civil servant would have the time (or realistically the technical expertise) to add DML to their policies.
That’s luckily where Citizen Space comes in, as Citizen Space is already used by a high percentage of government departments (UK and Australia) to publish policy documents. To make DML a reality, Delib would need to map the Schema.org language (i.e. DML) onto the existing structured data that Citizen Space is built around. N.B. to get a sense of how policies in Citizen Space are structured, check out the Citizen Space Aggregator.
This essentially would mean business as usual for the government departments who publish their policies via Citizen Space, but a huge potential step change in the value that government and citizens get out of the publishing of policies.
Appendix 1: Mapping Schema.org to Citizen Space structured data
The following is a breakdown of existing Schema.org language applied to policy documents listed in Citizen Space (according to the policy information structuring allowed for in Citizen Space). I’ve added some additional notes and questioned some of the mappings.
Citizen Space policy document data | Schema.org ‘type’ or ‘property’ | Notes (thoughts on appropriateness) |
Document type | GovernmentWork [type] [NEW] | This doesn’t exist at present. Only ‘CreativeWork’ exists as a ‘type’ |
Name | name | |
Overview | description | |
Area | eligibleRegion | |
Audience | audienceType | |
Interest (interest category area) | category | |
Organisation | organisation | |
Department (of organisation) | department | |
Consultation start date | startDate | |
Consultation end date | endDate | |
Contact | contactPoint (?) | Or should this be ‘accountablePerson’, which refers to the ‘legal owner’? |
Contact information (of owner) | telephone, email | |
Feedback format (online survey, .pdf, email, event) | feedbackPoint [NEW] | This is a new property and does not exist at present |
Related documents | citation (?) | May not work, may need other option. |
Language | availableLanguage | |
Published response | comment | |
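The mapping in the table could, in principle, be applied automatically when a policy is published. The sketch below assumes a Citizen Space record is available as a simple dictionary. The field names on the left of `FIELD_MAP` are hypothetical (they are not Citizen Space's real data model), the property names on the right follow the table using Schema.org's camelCase spelling, and the sample record is invented purely for illustration.

```python
import json

# Hypothetical Citizen Space field name -> property name from the table above.
FIELD_MAP = {
    "name": "name",
    "overview": "description",
    "area": "eligibleRegion",
    "audience": "audienceType",
    "interest": "category",
    "organisation": "organisation",
    "department": "department",
    "start_date": "startDate",
    "end_date": "endDate",
    "language": "availableLanguage",
}


def to_dml(record: dict) -> dict:
    """Convert a Citizen Space-style record into a DML (JSON-LD) dict."""
    doc = {
        "@context": "https://schema.org",
        "@type": "GovernmentWork",  # proposed type, not yet in Schema.org
    }
    for field, prop in FIELD_MAP.items():
        value = record.get(field)
        if value:  # skip fields that are missing or empty
            doc[prop] = value
    return doc


# Invented sample record, for illustration only.
sample = {
    "name": "Example consultation",
    "overview": "An example policy consultation.",
    "area": "",
    "start_date": "2024-01-01",
}
dml = to_dml(sample)
print(json.dumps(dml, indent=2))
```

Because the conversion happens at publishing time, the civil servant filling in the usual Citizen Space form would not need to write any mark-up themselves.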
Next steps?
DML is very much in the concept phase at the moment, and this thought paper is a first articulation of what DML could be and its benefits. If you’re interested in discussing the concept further, or in applying it (especially if you work in government policy), drop us a note: chrisq@delib.net