User-manual

From Congruence Engine Data Register

Introduction

The Congruence Engine Data Register is one of the key aspects of the minimum infrastructure for a national collection that the Congruence Engine is exploring. It will act as a place to record the existence of a dataset (from a collection or archive) that is available to be used as part of the national collection.

The Congruence Engine Data Register is powered by an experimental Wikibase cloud instance that has been set up with an industrial history focus and can be accessed at https://congruence-engine.wikibase.cloud/wiki/Main_Page

This User Manual first introduces the concepts and frameworks of Wikibase and Wikidata, then describes the CE Data Register development process, and finally provides instructions for accessing, navigating and adding data to the Congruence Engine Data Register through Wikibase. The User Manual is also available in Word format.

A gentle introduction to Wikibase and Wikidata

Wikibase is free software that stores and organises information so that it can be collaboratively edited, read by humans and by computers, translated into multiple languages and shared with the rest of the world as part of the Linked Open Data (LOD) web. It consists of a set of extensions to the MediaWiki software for storing and managing data (Wikibase Repository) and for embedding data on other wikis (Wikibase Client). Among other things, Wikibase lets you store information that is meant to be consumed by computers, connected to other information systems, collaboratively edited, and shared. Wikibase powers Wikidata and, increasingly, a wide range of other linked open data projects.

Wikidata is a free and open knowledge base that acts as a central storage for structured data to be used by Wikimedia projects and others. It is a collaborative project that allows users to contribute and edit data about various topics, such as people, places, events, and concepts, in a structured format. Wikidata's data is machine-readable and can be used by developers, researchers, and organizations to power applications, conduct analyses, and enhance the understanding of information across different domains.


How can data be modelled?

The main approach to representing data in Wikibase/Wikidata is as a triple, or statement (fig.1).


A triple is a sequence of three entities that codifies a statement about semantic data in the form of an Item > Property > Value structure (or subject–predicate–object):

  • Item is an entity from the real world (e.g. National Science and Media Museum);
  • Property is a relationship between items or between an item and a value (e.g. "part of", as in: National Science and Media Museum is part of Science Museum Group);
  • Value (or property value) is the value of a Property (e.g. Science Museum Group).

The combination of an Item, a Property, and a Property value forms an RDF Statement (known as the subject, predicate and object of a Statement) using the RDF semantic conceptual data model. Wikidata entries are made of a series of RDF statements (fig.2).
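As a rough illustration of the Item > Property > Value structure described above, a triple can be sketched as a small Python record. The class name Triple and the field name prop are illustrative choices only and are not part of Wikibase or RDF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One statement: subject (item), predicate (property), object (value)."""
    item: str
    prop: str   # "prop" is used because "property" is a Python built-in name
    value: str


def describe(triple: Triple) -> str:
    """Render the triple in the Item > Property > Value form used in this manual."""
    return f"{triple.item} > {triple.prop} > {triple.value}"


# The example from the text above.
example = Triple(
    item="National Science and Media Museum",
    prop="part of",
    value="Science Museum Group",
)
print(describe(example))
```

A Wikidata entry can then be thought of as a list of such triples that all share the same item.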


Fig.1 an example of a Wikidata triple



Fig.2 an example of a Wikidata entry


How can data be queried?

SPARQL is the standard query language and protocol for Linked Open Data on the web and for RDF triplestores. It can retrieve and manipulate data stored in Resource Description Framework (RDF) format, and it lets users run queries on the data contained in Wikidata. As of November 2018, there were at least 26 different tools that allowed the data to be queried in different ways. In 2021, Wikimedia Deutschland released the Query Builder, "a form-based query builder to allow people who don't know how to use SPARQL" to write a query.
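To give a feel for what a SPARQL query looks like, the sketch below builds a query asking Wikidata for items that are "part of" (Wikidata property P361) an item with a given English label, and then builds the request URL for the public Wikidata query endpoint. It only constructs the text and the URL and does not send anything. It assumes that the target item's English label on Wikidata is exactly the string passed in; the function names are illustrative.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_part_of_query(parent_label: str, limit: int = 10) -> str:
    """Build a SPARQL query for items that are part of (P361) the labelled item."""
    if '"' in parent_label or "\\" in parent_label:
        # Keep the sketch simple: refuse labels that would need escaping.
        raise ValueError("label must not contain quotes or backslashes")
    return (
        "SELECT ?item ?itemLabel WHERE {\n"
        f'  ?parent rdfs:label "{parent_label}"@en .\n'
        "  ?item wdt:P361 ?parent .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as a GET request URL that asks for JSON results."""
    return WIKIDATA_SPARQL_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


query = build_part_of_query("Science Museum Group")
url = build_request_url(query)
print(query)
print(url)
```

The printed query can also be pasted directly into the Wikidata Query Service web interface instead of being sent programmatically.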

The Congruence Engine Data Register and Wikibase

The Congruence Engine Data Register is a register of collections and cultural heritage data related to industrial history. It is designed to help people interested in industrial history to find, use, edit, and contribute to this type of cultural heritage. The Congruence Engine Data Register, powered by Wikibase, is a proof of concept of a collaboratively developed and curated minimal digital infrastructure for digital cultural heritage, and more specifically for industrial history. It has been developed as part of the TaNC-funded Congruence Engine project, which focused, among other things, on exploring what a UK collection of heritage materials might look like. The CE Wikibase instance does not contain or provide immediate access to the collections or datasets themselves. It contains only information about the collections or datasets, which can then be found and accessed, in physical and/or digital format, in various cultural heritage institutions across the UK. What is pioneering about the CE Wikibase is therefore that it enables connections and links among these collections and datasets (and other collaboratively edited and shared information systems) using the underlying semantic web infrastructure, thus laying the basis for a more sophisticated, interconnected infrastructure for cultural heritage collections.

Data model information

The Data Register schema, and consequently the CE Wikibase data model, have been developed using the information initially defined and gathered for the Bradford Data Register by Beatice Canelli (Register_Bradford datasets.xlsx). Having defined the initial information fields required and their relationships, we mapped them to Wikibase entities (items and properties), that is, we identified the core entity types and relations.


Example Wikibase data model for the data register

Classes of items and properties

The CE Wikibase currently hosts one class of items, that of dataset (fig.3): Dataset (ID=Q1)

Fig.3 an example of a dataset entry


Below you can find a list of the CE Data Register properties alongside their characteristics (fig.4).

Properties:

  • Label
  • Description
  • Alias
  • Instance of
  • Keywords
  • Data created
  • Held by
  • Available at URL
  • Copyright status
  • Copyright licence
  • Digital and Cultural Heritage Datasheet URL
  • Used by


Fig.4 a set of guidelines on the available CE Data Register properties, their characteristics, and instructions on how to record them (also available at https://congruence-engine.wikibase.cloud/wiki/Guidelines )

Statements are used for recording data about an item, consisting of property and value pairs. In the context of the Congruence Engine Data Register, a statement would be information about subjects and topics of a dataset, where the dataset can be accessed, the organisation/person holding the dataset, copyright status, etc.
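The data model above can be pictured as a simple record: a label, a description and aliases, plus a set of statements keyed by property name. The sketch below is an illustration only, not the way Wikibase stores data internally. The class name DatasetRecord, the helper unknown_properties and all the example values are hypothetical placeholders; only the property names are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Property names copied from the list above. Label, Description and Alias
# are modelled as plain fields of the record rather than as statements.
STATEMENT_PROPERTIES = {
    "Instance of",
    "Keywords",
    "Data created",
    "Held by",
    "Available at URL",
    "Copyright status",
    "Copyright licence",
    "Digital and Cultural Heritage Datasheet URL",
    "Used by",
}


@dataclass
class DatasetRecord:
    label: str
    description: str
    aliases: List[str] = field(default_factory=list)
    # Each property name maps to one or more values.
    statements: Dict[str, List[str]] = field(default_factory=dict)


def unknown_properties(record: DatasetRecord) -> List[str]:
    """Return the statement properties that are not in the register's list."""
    return sorted(p for p in record.statements if p not in STATEMENT_PROPERTIES)


# Placeholder values, for illustration only.
record = DatasetRecord(
    label="Example oral histories dataset",
    description="Placeholder description of an example dataset",
    aliases=["EOHD"],
    statements={
        "Instance of": ["Dataset"],
        "Keywords": ["industrial history", "oral history"],
        "Held by": ["Example Archive"],
        "Publisher": ["Example Press"],  # deliberately not a register property
    },
)
print(unknown_properties(record))  # ['Publisher']
```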

Using the Congruence Engine Data Register

Creating an account

To add dataset records to the CE Data Register, or to edit existing ones, you will need to be a registered user. Please create a user account if you don't have one already.

Registering a dataset

1. Use Search to check that your dataset is not already registered in the Data Register.

2. Log in to the Congruence Engine Data Register.

3. Select New Item from the Wikibase menu in the left sidebar, and you will be directed to the ‘Create a New Item’ page.

4. In the ‘Create a New Item’ page, enter the basic information for the item you wish to create as follows:

Create a New Item in Wikibase - Example
Language
By default, a language will be automatically selected based on your computer’s language settings. You can leave it as is or manually select another language that is appropriate to your entry.
Label
This is the name of your dataset. Please make sure that any abbreviations in the name are expanded (e.g. ‘West Yorkshire Queer Stories Bradford oral histories’ instead of ‘WYQS Bradford oral histories’) and that the name does not include versions, creation dates or project-specific details that general users might find difficult to understand.
Description
This describes what the dataset contains. Please be as descriptive as you can, including (where available) the region or historical period the dataset covers: in short, the details that you think would help others discover your dataset via search.
Aliases (optional)
These are other names your dataset may be known by. If there are multiple names, enter them as a pipe-separated string, e.g. WYQS|Queer Stories

5. Once you have entered the basic information, click the ‘Create’ button to register your dataset. You will then be redirected to the page where you can add further details about your dataset, known as statements.

Wikibase Statements - Example
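The pipe-separated format used for the Aliases field in step 4 can be illustrated with a short Python sketch that splits such a string into individual names. This is only an illustration of the input format, not the code Wikibase itself uses; the function name split_aliases is hypothetical.

```python
from typing import List


def split_aliases(raw: str) -> List[str]:
    """Split a pipe-separated alias string, dropping blanks and extra spaces."""
    return [part.strip() for part in raw.split("|") if part.strip()]


# The example from the Aliases field description above.
print(split_aliases("WYQS|Queer Stories"))  # ['WYQS', 'Queer Stories']
```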

Adding statements

Statements are used for recording data about an item, consisting of property and value pairs. In the context of the Congruence Engine Data Register, a statement would be information about subjects and topics of a dataset, where the dataset can be accessed, the organisation/person holding the dataset, copyright status, etc.

1. Click the ‘+ add statement’ link.

Add a statement

2. Start typing into the property input field and select a property you wish to add from the suggested list of properties.

Add a property to a statement

3. Enter a value for the property and click ‘save’, or continue below to add qualifiers.

Add a property value

4. If you need to add more values to a property, such as with keywords, click the ‘+ add value’ button.

Add additional property values
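The steps above use the web interface. For readers curious about how the same action might look programmatically, the sketch below builds the request parameters for the MediaWiki API module wbcreateclaim, which Wikibase provides for creating a statement. It only builds the parameters and sends nothing. It assumes a property whose values are plain strings; the entity ID Q0, the property ID P0 and the token are placeholders, since the real IDs depend on the Data Register and a real token is only issued to a logged-in session.

```python
import json
from typing import Dict


def build_create_claim_params(
    entity_id: str, property_id: str, string_value: str, csrf_token: str
) -> Dict[str, str]:
    """Build POST parameters for wbcreateclaim with a string-type value."""
    return {
        "action": "wbcreateclaim",
        "entity": entity_id,
        "property": property_id,
        "snaktype": "value",
        # The API expects the value itself to be JSON-encoded.
        "value": json.dumps(string_value),
        "token": csrf_token,
        "format": "json",
    }


# Placeholder IDs and token, for illustration only.
params = build_create_claim_params("Q0", "P0", "industrial history", "PLACEHOLDER_TOKEN")
print(params["value"])  # "industrial history" (including the double quotes)
```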

Adding qualifiers

Some statements in the Congruence Engine Data Register will require you to add qualifiers. Qualifiers, added as property and value pairs, are used to further describe or refine the value of a property in a statement. For instance, what is meant by 'West Yorkshire Archives' can be elaborated on by using the property 'described at URL' as a qualifier to link to an entry in Wikidata.org or other web page(s).

Wherever possible, we would like further descriptions to be given using the described at URL property with a link to a relevant entry in Wikidata.org. This typically applies to the held by (dataset holder), used by, copyright status and copyright licence properties.

1. If you are not editing already, click the ‘edit’ button for the statement you wish to add a qualifier to.

Edit a statement

2. Click the ‘+ add qualifier’ button.

Add a qualifier

3. Add a property and a qualifier value, and click ‘save’.

Guidance: Please make sure to check for an entry that describes the property you are qualifying at Wikidata.org, and add the link to the entry as a value. Where there is no pre-existing Wikidata entry, please create a new Wikidata entry, or alternatively, find an equivalent webpage elsewhere, e.g. the About page on the dataset holder's website.
Add property and value to a qualifier
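The relationship between a statement and its qualifiers can be sketched as follows. This is a simplified illustration rather than Wikibase's internal format: the class Statement, the helper missing_described_at_url and the example values (including the example.org URL) are hypothetical. The helper flags statements for the four properties named above that do not yet carry a 'described at URL' qualifier.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Properties for which a 'described at URL' qualifier is typically expected.
PROPERTIES_NEEDING_URL = {"held by", "used by", "copyright status", "copyright licence"}


@dataclass
class Statement:
    prop: str
    value: str
    # Qualifiers are themselves property and value pairs.
    qualifiers: Dict[str, str] = field(default_factory=dict)


def missing_described_at_url(statements: List[Statement]) -> List[str]:
    """Return the properties of statements that still lack the qualifier."""
    return [
        s.prop
        for s in statements
        if s.prop.lower() in PROPERTIES_NEEDING_URL
        and "described at URL" not in s.qualifiers
    ]


# Placeholder values, for illustration only.
statements = [
    Statement(
        "held by",
        "West Yorkshire Archives",
        {"described at URL": "https://example.org/about"},
    ),
    Statement("copyright status", "Example status"),
    Statement("keywords", "industrial history"),
]
print(missing_described_at_url(statements))  # ['copyright status']
```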