Models

The LongSpine spine relies on the Linked Data technology stack (see Principles > Linked Data) for technical interoperability with other Internet-based data, and on an overarching model (see Overarching Model below) for conceptual alignment. Additionally, all LongSpine Datasets use elements from a set of background models (see Background Models below), which give them conceptual alignment with many datasets outside the spine itself, such as those published by the Australian Government Linked Data Working Group.

Models sections

  • Modelling Paradigm
  • Overarching Model
  • Background Models
  • Component Dataset Models
  • Vocabularies

§ Modelling Paradigm

This project implemented a series of Semantic Web data models, all formulated according to the Web Ontology Language (OWL). This means that data conforming to the models is presented in Resource Description Framework (RDF) formats, can be stored in an RDF triplestore (a form of graph database), can be queried using the SPARQL query language and can be joined to any other data that is also presented in RDF according to any of the many available OWL models.
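
The claim that RDF data can be joined to other RDF data can be illustrated with a minimal sketch in plain Python. It uses no RDF library and is not LongSpine's implementation: statements are modelled as (subject, predicate, object) tuples, and every IRI under http://example.org/ is a hypothetical placeholder rather than a real LongSpine identifier. The rdf:type IRI and the Organization Ontology's Organization class are the only real terms used.

```python
# Minimal sketch: RDF statements as (subject, predicate, object) tuples.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ORG = "http://www.w3.org/ns/org#"   # The Organization Ontology namespace
EX = "http://example.org/"          # hypothetical namespace, for illustration only

# Two independently produced datasets that happen to mention the same IRI.
dataset_a = {
    (EX + "agency/1", RDF_TYPE, ORG + "Organization"),
    (EX + "agency/1", EX + "name", "Example Agency"),
}
dataset_b = {
    (EX + "series/9", EX + "createdBy", EX + "agency/1"),
}

# Joining RDF data is a set union: shared IRIs link the statements together.
merged = dataset_a | dataset_b


def describe(graph, iri):
    """Return every triple that mentions the IRI as subject or object."""
    return sorted(t for t in graph if iri in (t[0], t[2]))


for triple in describe(merged, EX + "agency/1"):
    print(triple)
```

In a real deployment the union and lookup would be done by a triplestore answering a SPARQL query. The sketch only shows why shared identifiers make the join possible without any schema mapping.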

OWL was chosen as the modelling system for two reasons. First, many public OWL models relevant to this project's domain are in wide use (e.g. the Organization Ontology). Second, OWL models are specialisable, extensible and technically interoperable with any other OWL models, and data made according to them can be presented online as Linked Data, allowing human and machine access across institutions.

§ Overarching Model

The structure of LongSpine's Datasets and Dataset relations (Linksets) (see Principles > Datasets and Linksets) adheres to the LocI Ontology. The content of the Datasets does not, since LocI is about spatial objects and LongSpine is not. Instead, LongSpine content adheres to the very general LongSpine Ontology, an informal outline of which is given in Figure M1.


Figure M1: The LongSpine overarching model

The LongSpine Ontology can be treated as the simplest, or fall-back, data integrator: all data from all Datasets and Linksets, regardless of their specifics, can be viewed as LongSpine Ontology data and used accordingly. The LongSpine Ontology documentation gives an example of specialised CRS Ontology data viewed as generic LongSpine Ontology data.
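
One common way to obtain such a fall-back view in OWL is to declare specialised classes as subclasses of generic ones and then follow those links upward. The sketch below assumes that mechanism, which this page does not confirm, and all class names in it are hypothetical stand-ins rather than terms from the CRS or LongSpine ontologies.

```python
# Hypothetical rdfs:subClassOf links (specific class -> more general class).
# These names are illustrative only; the real ontologies define their own.
SUBCLASS_OF = {
    "crs:Agency": "ls:Organisation",
    "ls:Organisation": "ls:Agent",
}


def types_including_generic(specific):
    """Return the specific class followed by each more general class above it."""
    chain = [specific]
    # Follow subclass links upward, stopping at the top or if a cycle appears.
    while chain[-1] in SUBCLASS_OF and SUBCLASS_OF[chain[-1]] not in chain:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


print(types_including_generic("crs:Agency"))
```

Under this assumption, a consumer that only understands the generic classes can still use the specialised data by matching on any class in the returned chain.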

§ Background Models

A series of well-known OWL models are used for concepts relevant to LongSpine. These include both technical, structural models of how to represent data elements generally and conceptual models of particular domains' concepts. The following models, with notes on their role, are used by LongSpine:

Table M1: Background models used in LongSpine

Ontology | Description | Role in LongSpine
Resource Description Framework (RDF) | The fundamental data model used for Semantic Web and Linked Data applications. It models objects and relations. | Required for any RDF-based system
RDF Schema (RDFS) | A schema on top of RDF for modelling types of things and specialisations | Required for most RDF-based systems
Web Ontology Language (OWL) | An extension to RDFS that uses set theory to describe detailed relationships between things | Allows for nuanced classes of object, like different Organisations
Dublin Core Terms | A vocabulary of basic annotation properties for things like title, description, source, created date etc. | Allows for basic annotations on many LongSpine objects
schema.org | A large, general-purpose OWL model of common classes of objects and relations | Used for basic object types like Person and properties like birthDate
Time Ontology in OWL (TIME) | An OWL ontology of temporal concepts, for describing the temporal properties of resources | Used for all LongSpine real-world temporality
Data Catalog Vocabulary (DCAT) | An OWL ontology designed to facilitate interoperability between data catalogs published on the Web | Used to describe LongSpine Datasets at the whole-of-dataset level
Vocabulary of Interlinked Datasets (VoID) | An OWL ontology for expressing metadata about RDF datasets, particularly relations between them | Used primarily for its definition of a Linkset
LocI Ontology | A profile of several ontologies implemented to govern Linked Data resources published within the LOC-I project | Used for its overall structure of a Semantic Web-based spine. The LocI Ontology extends DCAT Dataset and VoID Linkset definitions
Simple Knowledge Organization System (SKOS) | An OWL model for expressing the basic structure and content of concept schemes such as thesauri, classification schemes, subject heading lists, taxonomies and folksonomies | Used to structure the vocabularies and thesauri of government functions
The Organization Ontology (ORG) | An OWL core ontology for organizational structures | Used as the basis for LongSpine organisations modelling

The models above build on each other, with the LocI Ontology being an integrator of many of the others. Because the LocI Ontology is reused, Datasets and Linksets in LongSpine are structured identically to those in the Location Index, even though their content is very different. This means that, structurally, LongSpine is interoperable with LocI.

All of these models are loaded into the LongSpine Cache as a series of Named Graphs. This means that, like individual Datasets and Linksets, they can be selected for use, or excluded, within individual queries against the Cache.
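
Selecting Named Graphs per query can be sketched with standard SPARQL `FROM` clauses, which make the query's default graph the merge of the listed graphs. The helper below is a hypothetical illustration that only builds the query text. The graph IRIs are placeholders, not the Cache's real graph names, and whether the Cache's endpoint honours `FROM` in exactly this way is an assumption.

```python
def build_query(pattern, include_graphs):
    """Build a SPARQL SELECT restricted to the chosen Named Graphs.

    Each FROM clause adds one graph to the query's default graph, so any
    graph left out of include_graphs is excluded from matching.
    """
    from_clauses = "\n".join(f"FROM <{g}>" for g in include_graphs)
    return f"SELECT *\n{from_clauses}\nWHERE {{\n  {pattern}\n}}"


# Hypothetical graph IRIs: one dataset graph and one background-model graph.
query = build_query(
    "?s a ?type .",
    ["http://example.org/graph/dataset-a", "http://example.org/graph/model-org"],
)
print(query)
```

Leaving a model's graph out of the list is then all that is needed to run the same pattern without that model's statements.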

§ Component Dataset Models

Each LongSpine component Dataset has its own data model (an ontology) that formally describes its content in a manner that aligns with the LongSpine Overarching Model and with several existing Australian government datasets. Each Dataset's model has a namespace for all the items within it that define the structure of the dataset. The namespace, a web address, acts both as a persistent identifier for the dataset model and as a means to access the model and metadata about it. The namespace web address is called a "PID" (Persistent Identifier), and the PIDs listed below are issued by the Australian Government Linked Data Working Group, which has a system in place for this purpose. The 5 component dataset models are given in Table M2.

Table M2: LongSpine Component Dataset Models

Ontology | Description | Namespace / PID | Release Status
Administrative Arrangement Orders (AAO) Ontology | For the AAOs presented online at legislation.gov.au or from the National Archives of Australia | http://linked.data.gov.au/def/aao | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/aao
AGOR Ontology | For the AGOR dataset (Dept. Finance), some of which is delivered publicly online via directory.gov.au | http://linked.data.gov.au/def/agor | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/agor
Commonwealth Record Series (CRS) Ontology | For the CRS database at the National Archives of Australia that supports the archiving of government records and is searchable online via their website | http://linked.data.gov.au/def/crs | Beta: online at its Namespace URI
Portfolio Budget Statements (PBS) Ontology | Documents the structure of Portfolio Budget Statements that are created by multiple government agencies to be released with Federal budgets. See the 2019/2020 PBSes here: https://www.budget.gov.au/2019-20/content/pbs/index.htm | http://linked.data.gov.au/def/pbs | Placeholder only: in code repository by CSIRO online at http://test.linked.data.gov.au/def/pbs. This ontology has not been fully developed yet.
Records Disposal Authority (RDA) Ontology | Describes the structure of Records Disposal Authority documents, authored with agencies by the National Archives of Australia to categorise agency functions so that government records may be classified according to them and archived efficiently. The NAA website describes RDAs. | http://linked.data.gov.au/def/rda | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/rda

See the Data Page for links to the data of the Datasets built according to these models.
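
Because each namespace in Table M2 is a web address, a client can ask for the model itself over HTTP. The sketch below prepares such a request with Python's standard library, using the CRS Ontology PID from Table M2. It deliberately does not send the request, and it assumes that the server supports content negotiation and can return Turtle (`text/turtle`), which this page does not state.

```python
from urllib.request import Request


def pid_request(pid, media_type="text/turtle"):
    """Prepare, but do not send, an HTTP GET for a model PID.

    The Accept header asks the server for an RDF serialisation. Support for
    the requested media type is an assumption, not a documented guarantee.
    """
    return Request(pid, headers={"Accept": media_type})


req = pid_request("http://linked.data.gov.au/def/crs")
print(req.get_method(), req.full_url, req.get_header("Accept"))
# Passing req to urllib.request.urlopen would send it; that step is left out
# here so the sketch runs without network access.
```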

§ Vocabularies

In addition to the data models listed above, models of government functions used by LongSpine are presented as regular SKOS vocabularies. "Regular" here means SKOS vocabularies that use no properties beyond those of the SKOS model itself and the basic annotation properties mentioned in SKOS, such as Dublin Core Terms' source property to indicate sources for terms. The vocabularies are listed below in Table M3.

Table M3: LongSpine Vocabularies

Vocabulary | Description | Namespace / PID | Release Status
Australian Government Interactive Functions Thesaurus (AGIFT) | A vocabulary of government functions created by the NAA and released online in SKOS before the LongSpine project | https://data.naa.gov.au/def/agift | In stable release by NAA as Linked Data at its namespace location
Classification of the Functions of Government (COFOG) | An international government functions classification vocabulary issued by the UN's statistical office | http://linked.data.gov.au/def/cofog | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/cofog. Note: this vocabulary has previously been published as Linked Data by the UN and by CSIRO, but not according to SKOS in a way that LongSpine can easily use, so the version linked to here is modified for effective use by LongSpine.
Classification of the Functions of Government - Australia (COFOG-A) | An Australian version of COFOG (above) issued by the Australian Bureau of Statistics | http://linked.data.gov.au/def/cofog-a | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/cofog-a
Commonwealth Record Series Thesaurus (CRS-Th) | A government functions thesaurus used within the NAA's CRS Database (see Component Dataset Models above) | http://linked.data.gov.au/def/crs-th | Alpha: released at its namespace location online, but currently redirecting to a code repository while awaiting online delivery as Linked Data
Government Purpose Classification (GPC) | The Australian Bureau of Statistics' legacy government functions thesaurus, now mostly superseded by COFOG-A | http://linked.data.gov.au/def/gpc | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/gpc
Local Government Purpose Classification (LGPC) | The Australian Bureau of Statistics' legacy local government functions thesaurus. An extension to GPC (above). | http://linked.data.gov.au/def/lgpc | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/lgpc
Records Disposal Authority Classes Vocabulary | A vocabulary of just the 'Classes' within Records Disposal Authorities (see the component dataset listed above) | http://linked.data.gov.au/def/rda-voc | Draft: in code repository by CSIRO online at http://test.linked.data.gov.au/def/rda-voc
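
The idea of a regular SKOS vocabulary, one that uses only SKOS properties plus basic annotations, can be expressed as a simple check over a vocabulary's statements. The sketch below is illustrative only. The set of permitted namespaces (SKOS, Dublin Core Terms and rdf:type) is an assumption about what counts as "basic", and the concepts under http://example.org/ are hypothetical, not taken from any vocabulary in Table M3.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCT = "http://purl.org/dc/terms/"
EX = "http://example.org/vocab/"  # hypothetical vocabulary namespace

# Assumed allow-list: SKOS properties, Dublin Core Terms annotations, rdf:type.
ALLOWED_PREFIXES = (SKOS, DCT, RDF_TYPE)

# A tiny hypothetical vocabulary as (subject, predicate, object) tuples.
vocab = [
    (EX + "health", RDF_TYPE, SKOS + "Concept"),
    (EX + "health", SKOS + "prefLabel", "Health"),
    (EX + "hospitals", RDF_TYPE, SKOS + "Concept"),
    (EX + "hospitals", SKOS + "broader", EX + "health"),
    (EX + "hospitals", DCT + "source", "Hypothetical source document"),
]


def is_regular_skos(triples):
    """True if every predicate comes from an allowed namespace."""
    return all(p.startswith(ALLOWED_PREFIXES) for _, p, _ in triples)


print(is_regular_skos(vocab))
```

A vocabulary that introduced its own custom property would fail this check, which is the distinction the term "regular" draws.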