Lecture 5 – Data sources Flashcards
Why share data?
- working together on a project
- common needs, common resource
- data as a product
- data-based service
Opportunities from shared data
- new combinations of data
- new relationships of data
- new visualisations of data
- new understanding of data
What is open data?
data that is freely available to everyone
What is the problem with open data?
Its abundance and complexity
(many datasets, with different definitions and access points)
What is machine-readable data?
data in a format that a computer can process automatically
(e.g. JSON, XML, CSV)
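As a minimal sketch of what "processed by a computer" can mean in practice, the snippet below writes a tiny CSV file and lets standard tools pull out one column. The file name `scores.csv`, its contents, and the variable names are made up for illustration.

```shell
# Hypothetical sample data: a small CSV file with a header row.
workdir=$(mktemp -d)
cat > "$workdir/scores.csv" <<'EOF'
name,score
alice,3
bob,5
EOF

# Because every line follows the same comma-separated layout,
# a program can split it without human interpretation.
# tail skips the header row; cut keeps only the first field.
names=$(tail -n +2 "$workdir/scores.csv" | cut -d, -f1)
echo "$names"
```

The same extraction would be much harder on a scanned page or free-form text, which is the point of the distinction.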
What is a markup language?
a system for annotating a document in a way that is syntactically distinguishable from the text (e.g. HTML, XML)
What is a digital container?
file format whose specification describes how different elements of data coexist in a computer file
What is Metadata?
structured data that describes other data
can be …
descriptive (title, author, file size),
structural (relationships, chapters, elements of JSON),
administrative (version number, archiving data, createDate)
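A rough illustration, assuming an ordinary Unix file system: even a plain file carries metadata that can be read without opening its content. The file `note.txt` and the variable names are hypothetical.

```shell
# Hypothetical file created only for this example.
meta_dir=$(mktemp -d)
printf 'hello\n' > "$meta_dir/note.txt"

# Descriptive metadata: the file size in bytes (wc -c counts bytes).
# tr strips the padding that some wc implementations add.
size=$(wc -c < "$meta_dir/note.txt" | tr -d ' ')
echo "size in bytes: $size"

# ls -l prints further metadata kept by the file system,
# such as the owner and the last-modified time.
ls -l "$meta_dir/note.txt"
```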
What is Predictive Model Markup Language (PMML)?
an XML-based standard language for describing a predictive model so that the model can be exchanged between analytics applications
Unix: pwd
print the path of the current directory
Unix: cd DIRPATH
change directory to DIRPATH
Unix: ls DIRPATH
list the names of the files in DIRPATH
Unix: cp FILENAME NEWFILENAME
copy FILENAME to NEWFILENAME
Unix: mv FILENAME NEWFILENAME
move or rename FILENAME to NEWFILENAME
Unix: echo "TEXT"
output TEXT
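The commands on the cards above can be tried together in a short session. This is only a sketch: the file names are made up, the scratch directory comes from `mktemp -d`, and the `>` redirection (not covered on the cards) is used just to put some text into a file.

```shell
# Work in a throwaway directory so no real files are touched.
demo=$(mktemp -d)
cd "$demo"                  # change directory to the scratch directory
pwd                         # print the path of the current directory
echo "hello" > notes.txt    # echo outputs the text; > sends it into a file
cp notes.txt copy.txt       # copy notes.txt to copy.txt
mv copy.txt backup.txt      # rename copy.txt to backup.txt
listing=$(ls "$demo")       # capture the filenames in the directory
echo "$listing"
```

After the session the directory holds `backup.txt` and `notes.txt`; `copy.txt` is gone because `mv` renamed it rather than duplicating it.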