Introduction to Graph Data and Ontologies Flashcards
What is data integration?
The analysis of genomic, imaging or other
types of data allows us to investigate different
facets of human health.
But to gain a comprehensive understanding of
human health, we need to integrate such data,
i.e. combine data from different sources.
What are the challenges to data integration?
- Biomedical and healthcare datasets sit in silos.
- Linking entities between different datasets is not a trivial task.
In Scotland, we use CHI Numbers to uniquely identify patients.
But how about sharing data between different countries?
- Ambiguity around the meaning of different terms.
What are graph databases?
Graph databases use graph structures with
nodes, edges and properties to represent and
store data.
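A minimal sketch of the idea in plain Python, not tied to any particular graph database product. The node identifiers, labels, relationship name and property values are all made up for illustration.

```python
# Nodes: each has an id and a dictionary of properties (illustrative values only).
nodes = {
    "p1": {"label": "Patient", "name": "Patient A"},
    "c1": {"label": "Condition", "name": "Asthma"},
}

# Edges: (source id, relationship type, target id, edge properties).
edges = [
    ("p1", "HAS_CONDITION", "c1", {"since": 2015}),
]

def neighbours(node_id):
    """Return the ids of all nodes reachable by one outgoing edge."""
    return [target for source, _, target, _ in edges if source == node_id]

print(neighbours("p1"))
```

Here the nodes are the things being described, the edge records how two of them are related, and the dictionaries hold the properties attached to each.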
What are nodes?
T
What are edges?
T
What are properties?
T
What is the RDF graph data model?
Data is represented in the form of triples, i.e.
statements consisting of a subject, a predicate
and an object.
-Subject
-Predicate
-Object
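As a rough illustration, a triple can be modelled in Python as a three-field tuple. The `Triple` class and the `http://example.com/ns/locatedIn` predicate are assumptions made for this sketch. They are not part of any RDF library or published vocabulary.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One RDF-style statement: subject, predicate, object."""
    subject: str
    predicate: str
    object: str

# The predicate URI is hypothetical, chosen only to complete the example.
t = Triple(
    subject="http://dbpedia.org/resource/Edinburgh",
    predicate="http://example.com/ns/locatedIn",
    object="http://dbpedia.org/resource/Scotland",
)

print(t.subject)
print(t.predicate)
print(t.object)
```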
Describe RDF triple visualisation
T
What are URIs?
In RDF, we use URIs (Uniform Resource Identifiers) to uniquely identify concepts and entities.
• Examples:
http://dbpedia.org/resource/Edinburgh
http://xmlns.com/foaf/0.1/age
• URIs are used for both resources and properties.
How to use existing URIs
DBpedia (http://dbpedia.org) is a very good source of URIs.
• Every resource that is the subject of a page in Wikipedia has a
corresponding URI in DBpedia.
• URI for Edinburgh:
http://dbpedia.org/resource/Edinburgh
How to create your own URIs
If you don’t own a domain name, you can use
http://example.com/ as the base, e.g.
http://example.com/id/EwanMcGregor
Keep it simple.
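One possible way to build such URIs consistently is a small helper function. This is a sketch under assumptions: `mint_uri` is a hypothetical name, and removing whitespace from the name is just one simple convention among many.

```python
from urllib.parse import quote

BASE = "http://example.com/id/"

def mint_uri(name: str, base: str = BASE) -> str:
    """Build a URI by removing whitespace from the name and appending it to the base."""
    local_name = "".join(name.split())
    # Percent-encode any remaining characters that are not safe in a URI path segment.
    return base + quote(local_name, safe="")

print(mint_uri("Ewan McGregor"))
```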
How to merge RDF data
By uniquely identifying resources with the use
of URIs, we can easily link data about the
same resource.
Merging different RDF datasets is simply a
matter of bringing the two sets of RDF
statements together:
Dataset3 = Dataset1 + Dataset2
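The merge can be sketched in Python by treating each dataset as a set of (subject, predicate, object) tuples, so that merging is a set union. The predicate URIs under `http://example.com/ns/` are hypothetical and used only for illustration.

```python
EDINBURGH = "http://dbpedia.org/resource/Edinburgh"
SCOTLAND = "http://dbpedia.org/resource/Scotland"

# Hypothetical predicates, invented for this example.
LOCATED_IN = "http://example.com/ns/locatedIn"
NAME = "http://example.com/ns/name"

dataset1 = {
    (EDINBURGH, LOCATED_IN, SCOTLAND),
}

dataset2 = {
    (EDINBURGH, LOCATED_IN, SCOTLAND),  # same statement as in dataset1
    (EDINBURGH, NAME, "Edinburgh"),
}

# Set union: statements that appear in both datasets are kept only once.
dataset3 = dataset1 | dataset2

print(len(dataset3))
```

Because both datasets use the same URI for Edinburgh, the statements about it end up attached to the same resource without any extra matching step.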
How to write RDF statements in Turtle
Turtle (Terse RDF Triple Language): one of the
most popular syntaxes for expressing RDF.
• General form (terms separated by whitespace,
statement ended with a full stop):
subject predicate object .
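A minimal sketch of producing a Turtle statement from Python strings, with URIs in angle brackets and a full stop at the end. `to_turtle` is a hypothetical helper written for this card, not a library function, and it handles only full URIs and plain string literals. The `http://example.com/ns/` predicates are likewise made up.

```python
def to_turtle(subject: str, predicate: str, obj: str, obj_is_uri: bool = True) -> str:
    """Format one triple as a single Turtle statement."""
    if obj_is_uri:
        obj_term = f"<{obj}>"
    else:
        # Plain string literal: escape backslashes and double quotes.
        escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
        obj_term = f'"{escaped}"'
    return f"<{subject}> <{predicate}> {obj_term} ."

print(to_turtle(
    "http://dbpedia.org/resource/Edinburgh",
    "http://example.com/ns/locatedIn",
    "http://dbpedia.org/resource/Scotland",
))
print(to_turtle(
    "http://dbpedia.org/resource/Edinburgh",
    "http://example.com/ns/name",
    "Edinburgh",
    obj_is_uri=False,
))
```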
What is Turtle?
Turtle (Terse RDF Triple Language): one of the
most popular syntaxes for expressing RDF.
Whitespace and full stop
When using URIs, these should be enclosed in
angle brackets, e.g.
What is an ontology?
A formal, explicit specification of a shared
conceptualisation.
• Essentially, a way of encoding domain
knowledge.
• Something like an enhanced dictionary, where
you can look up the meaning of different
concepts and find relations between them.