All of the models and data for LongSpine listed on the Models & Data pages are best loaded into a cache for easy data access. With a 'triplestore' cache, you can run either the example queries listed on the Examples page or, in fact, any SPARQL queries you wish!
A cache of RDF data is usually called a triplestore, that is, a node-and-edge graph database that stores RDF data. A cache's purpose is to allow queries across all the elements loaded into it, so you can use one to query from vocab to vocab (via Linksets), from vocab to Dataset and across Datasets, if all the component items are loaded.
Unlike other databases, e.g. relational or NoSQL, RDF triplestores require essentially no configuration to effectively index and link the disparate data items loaded into them. This is because RDF data is already fully defined - the data schema is present within the data itself - and triplestores index RDF data in predictable ways, regardless of the particular data.
Most modern RDF triplestores support the SPARQL query language. As with SQL for relational databases, you can expect to use the same queries regardless of the specific product used. Unlike SQL though, SPARQL queries also have standardised input methods and output formats, so their use is somewhat more standardised across implementations than SQL.
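As a rough illustration of those standardised input and output conventions, the sketch below builds a SPARQL Protocol GET request and reads a result in the SPARQL JSON results format, using only the Python standard library. The endpoint URL and the sample result are hypothetical placeholders, not the address or content of any actual LongSpine cache.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: substitute the SPARQL endpoint of whichever cache you use.
ENDPOINT = "http://example.org/sparql"

QUERY = """PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT * WHERE { ?c a skos:Concept . } LIMIT 5"""


def build_request(endpoint, query):
    """Build (but do not send) a SPARQL Protocol GET request for a query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def bindings_to_rows(results_json):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(results_json)
    variables = doc["head"]["vars"]
    return [
        {v: b[v]["value"] for v in variables if v in b}
        for b in doc["results"]["bindings"]
    ]


# A made-up response body, shaped like a SPARQL JSON result for the query above.
SAMPLE_RESPONSE = json.dumps({
    "head": {"vars": ["c"]},
    "results": {"bindings": [
        {"c": {"type": "uri", "value": "http://example.org/concept/1"}}
    ]},
})

request = build_request(ENDPOINT, QUERY)
rows = bindings_to_rows(SAMPLE_RESPONSE)
```

Passing `request` to `urllib.request.urlopen` would send it; because the request and response shapes are standardised, the same code should work against any conforming triplestore once the endpoint is changed.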
A cache itself is NOT the LongSpine's spine! As noted in the Principles Page's Spine section, the spine is "...a collection of datasets and methods for data presentation that act as an anchor point for other data..." thus the spine is the LongSpine outputs and the fact that they have common purpose. A cache is only a convenient way to access all of the spine's content.
Having said that, a related project, the Location Spine (LocI) has a main, supported, cache and while that system's spine, like this one, is not the cache, that project's cache may easily be conflated with the spine. In time, a well-known, supported, cache of LongSpine data may emerge but multiple caches of all or part of this spine's content may always be created.
To repeat: this spine/cache distinction must be made since it's possible to create multiple or partial caches of the LongSpine's spine for multiple purposes, and the convenience of the cache shouldn't distract from the spine's points-of-truth, which are the home (namespace) locations of the models and datasets that the Models & Data pages list.
Each ontology, vocabulary, Dataset & Linkset loaded into this cache is partitioned off from other loaded elements by placing it into its own Named Graph. These are somewhat like a relational database's schemas - namespace-separated collections of tables within a single database - in that they allow all the data in a Named Graph to be managed as a single unit (removed, updated, reloaded) without affecting other Named Graphs' content, and they also allow queries both within and across Named Graphs.
As with SQL queries across or within particular schemas, SPARQL queries can be written that query all the data in the cache as one or just the data in one or more identified Named Graphs. This query gets all skos:Concept instances from the cache, regardless of the Named Graph they are in:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#> SELECT * WHERE { ?c a skos:Concept . }
This query only gets skos:Concept instances from the COFOG-A vocabulary:
PREFIX skos: <http://www.w3.org/2004/02/skos/core#> SELECT * FROM <http://linked.data.gov.au/def/cofog-a> WHERE { ?c a skos:Concept . }
Note that since the models' data and the datasets' data are separated out into their own Named Graphs, you will need to combine appropriate models with datasets if you write queries that rely on model rules. By far the easiest thing is to just query the whole cache.
See the SPARQL query language section on Named Graphs for more.
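To combine a model with a dataset or Linkset in one query, several FROM clauses can be listed; SPARQL then treats the merge of those graphs as the query's default graph. The helper below is a minimal sketch of assembling such a query as a string. The function name is hypothetical, and the two graph URIs are taken from Table C1.

```python
# Named Graph URIs as listed in Table C1.
COFOG_A = "http://linked.data.gov.au/def/cofog-a"
AGIFT_COFOGA_LINKSET = "http://linked.data.gov.au/dataset/agiftcofoga"


def select_from_graphs(where_pattern, graph_uris):
    """Return a SELECT query restricted to the merge of the given Named Graphs.

    Hypothetical helper: one FROM clause is emitted per graph URI.
    """
    from_clauses = "\n".join(f"FROM <{uri}>" for uri in graph_uris)
    return (
        "PREFIX skos: <http://www.w3.org/2004/02/skos/core#>\n"
        "SELECT *\n"
        f"{from_clauses}\n"
        f"WHERE {{ {where_pattern} }}"
    )


query = select_from_graphs(
    "?c a skos:Concept .",
    [COFOG_A, AGIFT_COFOGA_LINKSET],
)
```

Omitting the FROM clauses altogether corresponds to the "query the whole cache" approach recommended above, although what an unqualified query sees can depend on how a given triplestore is configured.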
All items in the cache are given Named Graph IDs which are their namespace URIs. Table C1 lists all the items in the cache and gives their Named Graph IDs, which can be used in FROM clauses in SPARQL queries.
Item | Named Graph URI | Access URI |
---|---|---|
Background Ontologies | | |
Resource Description Framework (RDF) | http://www.w3.org/1999/02/22-rdf-syntax-ns | same |
RDF Schema (RDFS) | http://www.w3.org/2000/01/rdf-schema | same |
Web Ontology Language (OWL) | http://www.w3.org/2002/07/owl | same |
Time Ontology in OWL (TIME) | http://www.w3.org/2006/time | same |
The Dataset Catalogue Vocabulary (DCAT) | http://www.w3.org/ns/dcat | same |
Vocabulary of Interlinked Datasets (VoID) | http://rdfs.org/ns/void | same |
LocI Ontology | http://linked.data.gov.au/def/loci | same |
Simple Knowledge Organization System (SKOS) | http://www.w3.org/2004/02/skos/core | same |
The Organization Ontology (ORG) | http://www.w3.org/ns/org | same |
LongSpine Ontologies | | |
LongSpine Ontology | http://linked.data.gov.au/def/longspine | http://test.linked.data.gov.au/def/longspine |
Administrative Arrangement Orders (AAO) Ontology | http://linked.data.gov.au/def/longspine | http://test.linked.data.gov.au/def/aao |
AGOR Ontology | http://linked.data.gov.au/def/longspine | http://test.linked.data.gov.au/def/agor |
Commonwealth Record Series (CRS) Ontology | http://linked.data.gov.au/def/longspine | http://linked.data.gov.au/def/crs |
Portfolio Budget Statements (PBS) Ontology | http://linked.data.gov.au/def/longspine | http://test.linked.data.gov.au/def/pbs |
Records Disposal Authority (RDA) Ontology | http://linked.data.gov.au/def/longspine | http://test.linked.data.gov.au/def/rda |
LongSpine Vocabularies | | |
Australian Government Interactive Functions Thesaurus (AGIFT) | http://data.naa.gov.au/def/agift | same |
Classifications of Functions of Government (COFOG) | http://linked.data.gov.au/def/cofog | http://test.linked.data.gov.au/def/cofog |
Classifications of Functions of Government - Australia (COFOG-A) | http://linked.data.gov.au/def/cofog-a | http://test.linked.data.gov.au/def/cofog-a |
Commonwealth Record Series Thesaurus (CRS-Th) | http://linked.data.gov.au/def/crs-th | http://test.linked.data.gov.au/def/crs-th |
Government Purpose Classification (GPC) | http://linked.data.gov.au/def/gpc | http://test.linked.data.gov.au/def/gpc |
Local Government Purpose Classification (LGPC) | http://linked.data.gov.au/def/lgpc | http://test.linked.data.gov.au/def/lgpc |
Records Disposal Authority (RDA) Vocabulary | http://linked.data.gov.au/def/rda-voc | http://test.linked.data.gov.au/def/rda-voc |
LongSpine Datasets | | |
Administrative Arrangements Orders (AAO) Dataset | not loaded yet | http://test.linked.data.gov.au/dataset/aao |
Australian Government Organisations Register (AGOR) | not loaded yet | http://test.linked.data.gov.au/dataset/agor |
Commonwealth Record Series (CRS) Dataset | http://linked.data.gov.au/dataset/crs | http://test.linked.data.gov.au/dataset/crs |
Portfolio Budget Statements (PBS) Dataset | not loaded yet | http://test.linked.data.gov.au/dataset/pbs |
LongSpine Linksets | | |
AGIFT → CRS Thesaurus Linkset | http://linked.data.gov.au/dataset/agiftcrs | http://test.linked.data.gov.au/dataset/agiftcrs |
AGIFT → COFOG-A Linkset | http://linked.data.gov.au/dataset/agiftcofoga | http://test.linked.data.gov.au/dataset/agiftcofoga |
COFOG → COFOG-A Linkset | http://linked.data.gov.au/dataset/cofogcofoga | http://test.linked.data.gov.au/dataset/cofogcofoga |
LGPC → COFOG Linkset | http://linked.data.gov.au/dataset/lgpccofog | http://test.linked.data.gov.au/dataset/lgpccofog |
LGPC → GPC Linkset | http://linked.data.gov.au/dataset/lgpcgpc | http://test.linked.data.gov.au/dataset/lgpcgpc |
The Dublin Core Terms (http://purl.org/dc/terms) and schema.org background ontologies, listed on the Models page, aren't loaded into the cache as neither provides any ontological rules useful for building new data (see the Inferencing section next). Both essentially provide vocabularies of properties used for annotations only.
Triplestores, such as the one used for this LongSpine cache, are able to take reasoning rules added to them, usually in the form of ontology statements, and apply them to data. This has the effect of "building out" all the data that the rules indicate. This form of reasoning with RDF data is called inference.
For example, the ontological rule in the LongSpine ontology agor:Entity rdfs:subClassOf long:GovernmentStructuralUnit . will cause a new triple to be created for every instance of agor:Entity, typing it also as a long:GovernmentStructuralUnit since, as per the rule, everything that is an agor:Entity is also a long:GovernmentStructuralUnit.
When reasoning is performed at load time, i.e. ontological rules in the triplestore are applied to data as it's loaded or applied to stored data when an ontology is loaded, then it is called forward-chain inference.
The triplestore used for this cache applies all the ontological rules it finds in all loaded ontologies to all data, within the family of rules known as OWL-RL which are, by design, amenable to rule-based technologies. This means that while calculating all the ontology rule possibilities when forward-chaining all loaded data takes some time, it takes less time than it would with some other, more expressive, OWL profiles, which are not needed for any LongSpine ontological rules.
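The effect of forward-chaining a subclass rule can be sketched in a few lines of plain Python. This is only an illustration of the idea for one rule (class membership propagated along rdfs:subClassOf), not the triplestore's actual reasoner; triples are modelled as simple string tuples and the instance name ex:dept1 is hypothetical.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def forward_chain_subclass(triples):
    """Materialise rdf:type triples implied by rdfs:subClassOf statements.

    Repeats until no new triples appear, so chains of subclasses are followed.
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        new_triples = set()
        for s, p, o in inferred:
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                candidate = (s, RDF_TYPE, sup)
                if o == sub and candidate not in inferred:
                    new_triples.add(candidate)
        if new_triples:
            inferred |= new_triples
            changed = True
    return inferred


# The ontology rule from the text plus one hypothetical instance.
loaded = {
    ("agor:Entity", SUBCLASS_OF, "long:GovernmentStructuralUnit"),
    ("ex:dept1", RDF_TYPE, "agor:Entity"),
}
materialised = forward_chain_subclass(loaded)
```

After the call, `materialised` holds the two loaded triples plus the inferred statement that ex:dept1 is a long:GovernmentStructuralUnit, which is the kind of "built out" data a query against the cache would then see.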
All of the items listed in Table C1 are loaded into this cache in a 2-step manner using two scripts:
The Python scripts can be used to assemble all of the cache's data in one place as RDF files, and many triplestores have commands/scripts to load such files and apply inference to them.
To extend this particular cache of the LongSpine information, that information - perhaps new vocabularies of government functions or longitudinal data of government structural units - should be accepted as being part of LongSpine and listed on the Data Page. Then it's just a matter of, either manually or via a script, loading the new data into the database that this cache uses while ensuring that the new information is placed in an appropriately named Named Graph.
Currently, the governance of what is in and what is not in LongSpine hasn't been codified.
If non-LongSpine data is to be loaded into this cache, perhaps to demonstrate how others may use the LongSpine by joining their own (external) data to it, then that data can also be loaded into additional Named Graphs within this database. Individual Named Graphs can be cleared from this database by a database admin, so graphs can be added and later removed without compromising the rest of the cache.
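One generic way to add and later remove a Named Graph is via the W3C SPARQL 1.1 Graph Store HTTP Protocol and a SPARQL Update CLEAR GRAPH operation. The sketch below only constructs the HTTP requests and sends nothing. It assumes the triplestore exposes both standards, which may not be true of this cache, and the endpoint URLs, graph URI and sample data are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoints: real paths depend on the triplestore product and setup.
GRAPH_STORE_ENDPOINT = "http://example.org/rdf-graph-store"
UPDATE_ENDPOINT = "http://example.org/update"

# Hypothetical Named Graph URI for some external, non-LongSpine data.
EXTERNAL_GRAPH = "http://example.org/graph/my-external-data"


def put_graph_request(graph_uri, turtle_bytes):
    """Build a Graph Store Protocol PUT that replaces one Named Graph's content."""
    url = GRAPH_STORE_ENDPOINT + "?" + urlencode({"graph": graph_uri})
    return Request(
        url,
        data=turtle_bytes,
        method="PUT",
        headers={"Content-Type": "text/turtle"},
    )


def clear_graph_request(graph_uri):
    """Build a SPARQL Update POST that empties one Named Graph only."""
    body = f"CLEAR GRAPH <{graph_uri}>".encode("utf-8")
    return Request(
        UPDATE_ENDPOINT,
        data=body,
        method="POST",
        headers={"Content-Type": "application/sparql-update"},
    )


sample_data = b"<http://example.org/a> <http://example.org/b> <http://example.org/c> .\n"
load_request = put_graph_request(EXTERNAL_GRAPH, sample_data)
clear_request = clear_graph_request(EXTERNAL_GRAPH)
```

Because each request names exactly one graph, loading or clearing the external data in this way leaves every other Named Graph in the cache untouched.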
The scripts given above can be used to create another cache of LongSpine plus any other information. Should someone want to greatly extend the cache, or perhaps just load a portion of it, they should use or extend the scripts to do that.
The specific systems powering this cache of LongSpine are:
The linked.data.gov.au domain or subdomains of it are managed by the Linked Data WG