RDF to OWL Flashcards
relational database
a collection of data items with pre-defined relationships between them
These items are organized as a set of tables with columns and rows
study about what kinds of things exist
encompasses a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities that substantiate one, many, or all domains of discourse
relating to meaning in language or logic.
describes the processes a computer follows when executing a program in that specific language
What is the gap between relational databases and ontologies called?
database-to-ontology mapping problem
How to attack the database-to-ontology mapping problem?
(1) propose a new life cycle for ontology learning from RDBs based
(2) describe a new method for building ontology from Relational database based on the predefined life cycle
(3) add three new semantics that can be extracted from RDB
(4) evaluation process based on two categories of metrics: (i) conceptual ontology (TBox) evaluation metrics; (ii) factual ontology (ABox) evaluation metric
What are ontologies used for?
make domain assumptions explicit (stated clearly)
Enable reuse of domain knowledge
Share Common Understanding of Info
How to develop an ontology?
(1) Define the domain & scope
(2) Define classes & the class hierarchy
(3) Define class properties, aka slots, e.g. a book has genre & author slots
(4) Define slot facets (values)
(5) Create individual instances of classes in the hierarchy
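The development steps in the card above can be sketched with plain Python data classes. This is only an illustration: the names `Slot`, `OntClass`, `Publication`, and the facet fields `value_type` and `max_values` are hypothetical and not taken from the source.

```python
from dataclasses import dataclass, field

@dataclass
class Slot:
    name: str
    value_type: type      # facet: allowed value type (assumed facet)
    max_values: int = 1   # facet: cardinality (assumed facet)

@dataclass
class OntClass:
    name: str
    parent: "OntClass | None" = None
    slots: list = field(default_factory=list)

# Classes and hierarchy: Book is a subclass of a hypothetical Publication class.
publication = OntClass("Publication")

# Slots and their facets: a book has genre and author slots.
book = OntClass("Book", parent=publication, slots=[
    Slot("genre", str),
    Slot("author", str, max_values=5),
])

# An individual instance of the Book class (placeholder values).
instance = {"class": book.name, "genre": "fiction", "author": ["A. Writer"]}

slot_names = [slot.name for slot in book.slots]
```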
What are instances in ontology?
the 'things' represented by a concept, e.g. a human cytochrome C is an instance of the concept Protein
What problem occurs between two or more info systems?
heterogeneity problem
quality or state of being diverse in character or content
Should you build an ontology from scratch?
No, use automatic or semi-automatic ontology learning for knowledge acquisition
Why use relational databases?
70% of web data is stored in them
best tech for storing & manipulating data
But they suffer from a lack of semantic meaning, which hinders interoperability among info systems
What did this paper not consider with ontology?
The quality of the resulting ontology
What is SQL-DDL?
SQL, data definition or data description language (DDL) is a syntax for creating and modifying database objects
Used to build an ontology from an RDB
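As a minimal sketch of what SQL-DDL looks like, the snippet below creates two tables in an in-memory SQLite database through Python's `sqlite3` module and reads the schema back. The `author` and `book` tables and their columns are illustrative assumptions, not taken from the paper.

```python
import sqlite3

# Hypothetical schema, for illustration only.
DDL = """
CREATE TABLE author (
    author_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE book (
    book_id   INTEGER PRIMARY KEY,
    title     TEXT NOT NULL,
    author_id INTEGER REFERENCES author(author_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# The DDL is stored in the catalog, so the schema can be read back later.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]

# PRAGMA foreign_key_list exposes the foreign keys declared in the DDL.
# Each result row is (id, seq, table, from, to, ...).
book_fks = [(row[2], row[3], row[4])
            for row in conn.execute("PRAGMA foreign_key_list(book)")]
```

Table names, keys, and foreign keys read back this way are the kind of metadata an ontology-building step would start from.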
What are the two main benefits of combining TBox and ABox?
(a) it facilitates the Semantic integration problem
(b) it allows the use of reasoning services for checking the consistency and satisfiability of the resulting ontology
From a software engineering perspective, what does the ontology development process identify?
which activities are to be performed, but not their order - the life cycle does that
What is the life cycle of learning ontologies from RDB?
1) Discovery -> Do we have enough info to build the ontology? -> 2) Preparation -> Do we have enough quality data to start building the ontology? -> 3) Development -> Is the resulting ontology robust enough to be published? -> 4) Evaluation (repeat), considering the user & domain needs
Learning ontology from relational database: life cycle
activities or phases that have to be performed for learning ontologies from relational databases
competency questions (sketch) check if the ontology includes sufficient information to answer these questions and if the answers require a particular level of detail
second phase of the LOFRDB involves data preparation
if the data sources contain enough semantics
if the RDB contains the complete space of relations and the maximum possible combinations of the primary keys and foreign keys
Clean & Normalize data
pre-development starts with data acquisition: extracting the instances from the relational database [42] and representing them in RDF triple form
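A minimal sketch of this data acquisition step, assuming a hypothetical `movie` table and a made-up base IRI `http://example.org/`. Triples are kept as plain Python tuples rather than going through an RDF library, and the IRI layout is an illustrative choice, not the paper's method.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.executescript("""
CREATE TABLE movie (movie_id INTEGER PRIMARY KEY, title TEXT, duration INTEGER);
INSERT INTO movie VALUES (1, 'Example Movie', 120);
""")

def rows_to_triples(conn, table, key):
    """Turn each row into (subject, predicate, object) triples."""
    triples = []
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        subject = f"{BASE}{table}/{row[key]}"
        # One triple states which class (table) the instance belongs to.
        triples.append((subject, "rdf:type", f"{BASE}{table}"))
        # Every non-key, non-NULL column becomes one property triple.
        for column in row.keys():
            if column != key and row[column] is not None:
                triples.append((subject, f"{BASE}{table}#{column}", row[column]))
    return triples

triples = rows_to_triples(conn, "movie", "movie_id")
```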
schema acquisition - the vocabulary of the app domain
generate the definition and the meaning of the extracted instances
foreign key
RDB exploration
verifying if the input relational databases contain the complete space of metadata and semantic characteristics for generating ontology
How to calculate the NS metric?
Give a “1” to each existing characteristic in the RDB and a “0” otherwise
Number of Semantic characteristics in the RDB
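A small sketch of the NS scoring described above. Two things are assumed here rather than stated in the cards: that NS is the sum of the 0/1 scores, and that the checklist is the seven semantic values listed later in this set.

```python
# Characteristic names follow the seven semantic values listed in these cards;
# treating them as the NS checklist is an assumption.
CHARACTERISTICS = [
    "inheritance", "transitive", "symmetric", "value restriction",
    "data range restriction", "functional", "inverse functional",
]

def ns_metric(found):
    """Score 1 for each characteristic present in the RDB, 0 otherwise, and sum."""
    scores = {name: int(name in found) for name in CHARACTERISTICS}
    return sum(scores.values()), scores

# Hypothetical RDB in which three characteristics were detected.
ns, scores = ns_metric({"inheritance", "functional", "data range restriction"})
```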
What is OWL?
Web Ontology Language
a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things
What is a domain?
represents concepts which belong to a realm of the world, such as biology or politics
What are check constraints?
conditions that validate the data in a table, e.g. a data range restriction
What does the DEFAULT constraint in RDB do?
provide a default value for a column
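Both constraints can be tried out in an in-memory SQLite database. This is a sketch with a hypothetical `movie` table; the `rating` column and its 1 to 5 range are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE movie (
    movie_id INTEGER PRIMARY KEY,
    title    TEXT NOT NULL,
    rating   INTEGER DEFAULT 3 CHECK (rating BETWEEN 1 AND 5)
)
""")

# DEFAULT fills in the column when no value is supplied.
conn.execute("INSERT INTO movie (title) VALUES ('Example Movie')")
default_rating = conn.execute("SELECT rating FROM movie").fetchone()[0]

# CHECK rejects values outside the allowed data range.
try:
    conn.execute("INSERT INTO movie (title, rating) VALUES ('Bad Row', 9)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```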
What does the owl:hasValue constraint do?
describes a class of all individuals for which the property concerned has at least one value semantically equal to the default value…at least one must be equal to the default value
What are Domains & Ranges in OWL?
‘axioms’ in reasoning.
e.g. hasProf range: Prof & domain: Student
these classes can have instances in common
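The sketch below illustrates why domains and ranges act as axioms for reasoning rather than as validation rules: a reasoner uses them to infer class membership. It reuses the `hasProf` example from the card; the individuals `alice`, `bob`, and `carol` are hypothetical, and the inference is hand-rolled in plain Python rather than done by an OWL reasoner.

```python
# Property declarations mirroring the hasProf example above.
DOMAIN = {"hasProf": "Student"}
RANGE = {"hasProf": "Prof"}

def infer_types(triples):
    """Apply domain and range axioms to infer class membership."""
    inferred = set()
    for subject, prop, obj in triples:
        if prop in DOMAIN:
            inferred.add((subject, "rdf:type", DOMAIN[prop]))
        if prop in RANGE:
            inferred.add((obj, "rdf:type", RANGE[prop]))
    return inferred

# bob appears as both subject and object of hasProf.
facts = [("alice", "hasProf", "bob"), ("bob", "hasProf", "carol")]
types = infer_types(facts)
```

Here `bob` is inferred to be both a Student and a Prof, which is what it means for the two classes to have instances in common.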
How do you generate an A-box?
Use R2RML language
language for expressing customized mappings from relational databases to RDF data sets
What is AR?
Attribute Richness
The avg. number of attributes (slots) per class. The more attributes generated from the RDB, the more knowledge conveyed to the ontology
How to calculate AR?
calculated as the number of attributes for all classes ( ATT ) divided by the number of classes (C).
AR = |ATT| / |C|
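A worked example of the AR formula, using a hypothetical TBox with three classes; the class and attribute names are made up for illustration.

```python
def attribute_richness(attributes_per_class):
    """AR = |ATT| / |C|: total number of attributes over number of classes."""
    total_attributes = sum(len(attrs) for attrs in attributes_per_class.values())
    return total_attributes / len(attributes_per_class)

# Hypothetical TBox: class name -> list of attributes (slots).
tbox = {
    "Movie": ["title", "duration", "label"],
    "Reviewer": ["name"],
    "Review": ["text", "rating"],
}

ar = attribute_richness(tbox)  # 6 attributes over 3 classes
```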
What is IR?
Inheritance Richness.
Metric represents distribution of info across diff T-BOX levels.
Indicates how well knowledge is grouped into different categories & subcategories
How to Calculate IR?
Inheritance Richness is the Avg. # of subclasses per class
H: the number of inheritance (subclass) relationships
IR = |H| / |C|
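A worked example of the IR formula, assuming H counts the subclass relationships in the TBox (the usual reading of this metric). The class names are hypothetical.

```python
def inheritance_richness(subclass_of, classes):
    """IR = |H| / |C|, with H the set of (subclass, superclass) pairs."""
    return len(subclass_of) / len(classes)

# Hypothetical TBox: four classes, two of them subclasses of Person.
classes = ["Person", "Reviewer", "Actor", "Movie"]
subclass_of = {("Reviewer", "Person"), ("Actor", "Person")}

ir = inheritance_richness(subclass_of, classes)  # 2 relationships over 4 classes
```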
What is RR?
Relationship Richness
metric reflects the diversity of the types of relations in the TBox
A TBox with only inheritance relationships has less info than one with a more diverse set: transitive, symmetric, reflexive
What is A-Box validation?
evaluation metrics can be used to check how the data is placed inside the ontology
Class richness
how instances are distributed across classes
Low CR: A-Box lacks data showing up in the T-Box
High CR: A-Box data covers most of the knowledge
knowledge base
Average population
measure is an indication of the number of instances compared to the number of classes
Useful for telling if enough instances were extracted compared to the # of classes
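The cards give no formulas for class richness or average population, so the sketch below assumes the commonly used definitions: class richness as the share of classes that have at least one instance, and average population as instances divided by classes. The A-Box contents are hypothetical.

```python
def class_richness(instances_per_class):
    """Assumed definition: populated classes / all classes."""
    populated = sum(1 for members in instances_per_class.values() if members)
    return populated / len(instances_per_class)

def average_population(instances_per_class):
    """Assumed definition: total instances / number of classes."""
    total = sum(len(members) for members in instances_per_class.values())
    return total / len(instances_per_class)

# Hypothetical A-Box: class name -> extracted instances.
abox = {
    "Movie": ["m1", "m2", "m3"],
    "Reviewer": ["r1"],
    "Review": [],
    "Genre": [],
}

cr = class_richness(abox)      # 2 populated classes out of 4
ap = average_population(abox)  # 4 instances over 4 classes
```

A low `cr` here flags T-Box classes for which the A-Box holds no data, while `ap` gives a rough sense of whether enough instances were extracted.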
What are example competency questions?
Query 1: find movie for a given set of generic features such as name and duration, etc
Query 2: retrieve basic information about a specific movie for display purposes
Query 3: find movie having a label that contains specific words
Query 4: get information about a reviewer
Query 5: find movies having a label that contains specific words
Query 6: find the text description of a given movie’s title
Query 7: find movies that are similar to a given movie
What question should be answered in the discovery phase?
Do we have enough relevant info background to start building an ontology?
What 3 things are required to create a perfect plan for learning ontology from RDB?
Requires a clear understanding of the domain area, the problem to be solved, and scoping of the data sources to be used. Knowing this helps to select the appropriate databases
What are competency questions?
Types of questions that people using the ontology want to be able to answer. A natural language sentence with certain patterns. Create CQs in Discovery to use in Development
While in the discovery phase, how should the data be analyzed?
Look at the list of data chosen to see if it has enough metadata
What are 7 semantic (column) values?
Inheritance, transitive, symmetric, value restriction, data range restriction, Functional and inverse Functional property.
What do flat ontology values close to zero mean?
Resulting Ontology has a General Knowledge of the domain
What do large vertical ontology values mean?
The resulting ontology is better than the reference ontology