Week 11 Flashcards

1
Q

What is the first star in Linked Open Data?

A

Available on the web (in any format) with an open license
Example: A government population dataset published on a website under a Creative Commons license

2
Q

What does the second star add to open data?

A

Data must be machine-readable structured data (e.g., Excel instead of PDF)

3
Q

What does the third star require?

A

Use of non-proprietary format (e.g., CSV instead of Excel)
Example: Population data in CSV:

city,population_2020,population_2021
New York,8336817,8467513
Los Angeles,3898747,3923341
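
Because CSV is an open, plain-text format, the rows above can be read with nothing more than a standard library. A minimal Python sketch, assuming the data is held in a string rather than a file (the function name `load_population` is hypothetical):

```python
import csv
import io

# The CSV from the card, inlined as a string so the sketch is self-contained.
CSV_TEXT = """city,population_2020,population_2021
New York,8336817,8467513
Los Angeles,3898747,3923341
"""

def load_population(text):
    """Return a dict mapping city -> (population_2020, population_2021)."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["city"]: (int(row["population_2020"]), int(row["population_2021"]))
        for row in reader
    }

populations = load_population(CSV_TEXT)
# Year-over-year change for one city, computed from the parsed integers.
growth = populations["New York"][1] - populations["New York"][0]
```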

4
Q

What standards are required for the fourth star?

A

Use W3C open standards (RDF and SPARQL) to identify things with URIs, so that others can point at your data
Example: Population data in RDF:
@prefix city: <http://example.org/city/> .
@prefix pop: <http://example.org/population/> .

city:NewYork pop:population2020 8336817 ;
    pop:population2021 8467513 .
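
To see how tabular values map onto the RDF version, here is a minimal Python sketch that prints Turtle from a dict. The function name `to_turtle` is hypothetical and the prefixes are the example.org placeholders from the card; a real project would more likely rely on an RDF library than on string formatting.

```python
# Prefixes copied from the card; both are example.org placeholders.
PREFIXES = {
    "city": "http://example.org/city/",
    "pop": "http://example.org/population/",
}

def to_turtle(city_name, populations):
    """Serialize one city's yearly populations as a Turtle snippet.

    populations maps a year (int) to a population count (int).
    """
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items()]
    lines.append("")
    subject = "city:" + city_name.replace(" ", "")
    pairs = [
        f"pop:population{year} {count}"
        for year, count in sorted(populations.items())
    ]
    # ';' separates predicates that share a subject; '.' ends the statement.
    lines.append(subject + " " + " ;\n    ".join(pairs) + " .")
    return "\n".join(lines)

turtle = to_turtle("New York", {2020: 8336817, 2021: 8467513})
print(turtle)
```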

5
Q

What makes data achieve five stars?

A

Link your data to other people’s data to provide context
Example: Extended RDF with links:
# Assumes the city:, pop:, owl: and geo: prefixes are declared beforehand.
city:NewYork
    pop:population2020 8336817 ;
    owl:sameAs <http://dbpedia.org/resource/New_York_City> ;
    geo:location <http://geonames.org/5128581/> .
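
What separates five stars from four is the presence of links that leave your own dataset. A rough Python sketch of spotting them, assuming triples are stored as plain tuples with full IRIs; the `geo` predicate IRI is a placeholder because the card does not give that namespace, and `external_links` is a hypothetical helper:

```python
LOCAL_BASE = "http://example.org/"

# Triples as (subject, predicate, object) tuples with full IRIs.
# The geo predicate IRI below is a placeholder assumption.
TRIPLES = [
    ("http://example.org/city/NewYork",
     "http://example.org/population/population2020",
     8336817),
    ("http://example.org/city/NewYork",
     "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/New_York_City"),
    ("http://example.org/city/NewYork",
     "http://example.org/geo/location",
     "http://geonames.org/5128581/"),
]

def external_links(triples, local_base=LOCAL_BASE):
    """Return objects that are IRIs pointing outside the local dataset."""
    return [
        obj for _, _, obj in triples
        if isinstance(obj, str)
        and obj.startswith("http")
        and not obj.startswith(local_base)
    ]

links = external_links(TRIPLES)
```

A dataset for which this list is empty has no outgoing links and would stop at four stars.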

6
Q

Explain the linked data life cycle.

A

1. GENERATE

What: Creating initial data that will become linked data
Format: JSON, XML, HTML, etc.
Example: Creating a JSON file with book information

{
  "title": "1984",
  "author": "George Orwell",
  "published": "1949"
}

2. VALIDATE

Three levels of checking:

Individual Fields

Spell checking
Data type verification
Format validation (e.g., ISBN format)

Structural

Required fields present
Proper nesting/hierarchy
Data integrity

Semantic

Logical consistency (e.g., publication date within author’s lifetime)
Domain/range checks
Relationship validity
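
The three levels can be sketched in Python against the book record from the GENERATE step. This is only an illustration: the required-field set, the four-digit-year rule, and the lifespan table are assumptions chosen for the example, and `validate` is a hypothetical function. The structural check runs first here because the other two need the fields to exist.

```python
import re

REQUIRED_FIELDS = {"title", "author", "published"}

# Lifespans used for the semantic check; a real system would look these up.
AUTHOR_LIFESPANS = {"George Orwell": (1903, 1950)}

def validate(record):
    """Return a list of problems found at the three validation levels."""
    problems = []

    # Structural: every required field must be present.
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        problems.append("structural: missing " + ", ".join(sorted(missing)))
        return problems  # later checks need all fields

    # Individual field: the year must be a four-digit string.
    if not re.fullmatch(r"\d{4}", record["published"]):
        problems.append("field: published is not a four-digit year")
        return problems

    # Semantic: publication year must fall within the author's lifetime.
    lifespan = AUTHOR_LIFESPANS.get(record["author"])
    if lifespan is not None:
        born, died = lifespan
        if not born <= int(record["published"]) <= died:
            problems.append("semantic: published outside author's lifetime")

    return problems

ok = validate({"title": "1984", "author": "George Orwell", "published": "1949"})
bad = validate({"title": "1984", "author": "George Orwell", "published": "1984"})
```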

3. PUBLISH

Convert data to RDF triples
Make available as SPARQL endpoint
Key Feature: Live, real-time queryable database
Example: Publishing book data so others can query it instantly
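
The "convert data to RDF triples" step might look like this in Python for a flat JSON record. The `http://example.org/` namespace and the `json_to_triples` helper are hypothetical; a real dataset would reuse an established vocabulary and load the triples into a store that exposes the SPARQL endpoint.

```python
import json

BOOK_JSON = '{"title": "1984", "author": "George Orwell", "published": "1949"}'

# Hypothetical namespace; a real dataset would reuse an existing vocabulary.
EX = "http://example.org/"

def json_to_triples(text, subject_iri):
    """Turn each key/value pair of a flat JSON object into one triple."""
    record = json.loads(text)
    return [(subject_iri, EX + key, value) for key, value in record.items()]

triples = json_to_triples(BOOK_JSON, EX + "book/1984")
for triple in triples:
    print(triple)
```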

4. QUERY

Ability to search and retrieve data
Uses SPARQL query language
Example Query:

# ex: is a placeholder vocabulary (assumed for this example)
PREFIX ex: <http://example.org/>
SELECT ?book WHERE {
  ?book ex:author "George Orwell" .
}
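
To make the idea of a triple pattern concrete, here is a small Python sketch that mimics the query over an in-memory list. It is not a SPARQL engine; the `match` helper and the `ex:` names are hypothetical, and `None` plays the role of a variable such as `?book`.

```python
# A tiny in-memory triple store; IRIs are shortened to prefixed names.
TRIPLES = [
    ("ex:book/1984", "ex:author", "George Orwell"),
    ("ex:book/AnimalFarm", "ex:author", "George Orwell"),
    ("ex:book/Mort", "ex:author", "Terry Pratchett"),
]

def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts like a SPARQL variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]

# Rough equivalent of: SELECT ?book WHERE { ?book ex:author "George Orwell" }
books = [s for s, _, _ in match(TRIPLES, p="ex:author", o="George Orwell")]
```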
5. ENHANCE
What:

Adding new relationships
Creating external links
Enriching with additional metadata
Example: Linking books to related works

7
Q

Remember: the characteristics of data dumps and SPARQL endpoints are complete opposites.

A

NOTE

8
Q

What is a data dump in linked data publishing and what are its characteristics?

A

Data Type: “Old” (static) data
Access Method: Download entire dataset at once
Bandwidth Usage: High (must download everything)
Availability: High (simple file download)
Client Cost: High (needs resources to process full dataset)
Server Cost: Low (just hosting files)
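
The high bandwidth and client cost can be illustrated with a short Python sketch in which the client receives the whole dump and filters it locally. The tab-separated layout and the four-book catalog are stand-ins invented for the example, and `books_by` is a hypothetical helper.

```python
# Stand-in for a downloaded dump: one tab-separated triple per line.
# The catalog is a small illustrative sample, not real published data.
DUMP = "\n".join([
    "book/1984\tauthor\tGeorge Orwell",
    "book/AnimalFarm\tauthor\tGeorge Orwell",
    "book/Mort\tauthor\tTerry Pratchett",
    "book/Emma\tauthor\tJane Austen",
])

def books_by(author, dump_text):
    """Scan the full dump on the client and keep only matching subjects."""
    matches = []
    for line in dump_text.splitlines():
        subject, predicate, obj = line.split("\t")
        if predicate == "author" and obj == author:
            matches.append(subject)
    return matches

# The whole file crosses the network, whatever the question is.
downloaded_bytes = len(DUMP.encode("utf-8"))
result = books_by("Terry Pratchett", DUMP)
# Only this much of the download was actually needed.
useful_bytes = sum(len(s.encode("utf-8")) for s in result)
```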

9
Q

What is a SPARQL endpoint and what are its characteristics?

A

A SPARQL endpoint is a live service that accepts queries and returns specific results.
Key Characteristics:

Data Type: Live (real-time) data
Access Method: Query-based (get only what you need)
Bandwidth Usage: Low (selective data retrieval)
Availability: Lower (service might be down)
Client Cost: Low (minimal processing needed)
Server Cost: High (maintaining query service)

Example:
# ex: is a placeholder vocabulary (assumed for this example)
PREFIX ex: <http://example.org/>
SELECT ?book WHERE {
  ?book ex:author "Terry Pratchett" ;
        ex:published "2000" .
}
This gets only specific books instead of downloading the entire library catalog.
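
On the client side, querying an endpoint usually means sending the query text over HTTP. The Python sketch below only builds the GET URL, with the query passed in the `query` parameter as the SPARQL Protocol describes; it does not send anything. The endpoint address, the `ex:` vocabulary, and the `build_request_url` helper are all hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; substitute the address of a real service.
ENDPOINT = "http://example.org/sparql"

QUERY = """
PREFIX ex: <http://example.org/>
SELECT ?book WHERE {
  ?book ex:author "Terry Pratchett" ;
        ex:published "2000" .
}
"""

def build_request_url(endpoint, query):
    """Build a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})

url = build_request_url(ENDPOINT, QUERY)
# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

Only the short query travels to the server and only the matching rows come back, which is where the low bandwidth and low client cost come from.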
