Lecture20 Flashcards

1
Q

Big Data is

A

Large or complex datasets that often need terabytes or petabytes of storage. They contain large amounts of information at a population, regional or local level, or span different geographic areas. Big data often involves combining data from multiple sources to explore population health outcomes.

2
Q

Volume in big data:

A

The amount of data, and the computing capacity needed to store and analyse it

3
Q

Velocity:

A

The speed at which data are created and analysed

4
Q

Variety

A

The types of data sources available

5
Q

Veracity

A

The accuracy and credibility of data

6
Q

Variability:

A

Internal consistency of your data

7
Q

Data linkage:

A

The process of matching records from different sources based on key information

8
Q

Deterministic approach to data linkage:

A

Exact matches based on personal information appearing in all datasets
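Worked example (a minimal sketch, not taken from the lecture): the records, the field names `person_id` and `dob`, and the helper `deterministic_link` are all hypothetical. It illustrates the idea that two records are linked only when every chosen key field matches exactly.

```python
# Hypothetical records; all field names and values are made up for illustration.
hospital = [
    {"person_id": "A001", "dob": "1980-01-02", "admission": "2013-05-01"},
    {"person_id": "B002", "dob": "1975-07-30", "admission": "2013-06-12"},
]
census = [
    {"person_id": "A001", "dob": "1980-01-02", "region": "Auckland"},
    {"person_id": "C003", "dob": "1990-11-15", "region": "Otago"},
]


def deterministic_link(left, right, keys):
    """Link records only when every key field matches exactly."""
    # Index the right-hand dataset by its key values for fast lookup.
    index = {}
    for rec in right:
        index.setdefault(tuple(rec[k] for k in keys), []).append(rec)

    linked = []
    for rec in left:
        for match in index.get(tuple(rec[k] for k in keys), []):
            # Merge the two records into one linked record.
            linked.append({**match, **rec})
    return linked


links = deterministic_link(hospital, census, keys=("person_id", "dob"))
print(links)
```

Only the person present in both datasets with identical key values is linked. A single typo in either key field would prevent the match, which is the main weakness of this approach.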

9
Q

Probabilistic approach to data linkage:

A

Statistical weights are used to calculate the probability that records from different sources belong to the same person
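Worked example (a minimal sketch, not taken from the lecture): one common way to build such weights is the Fellegi-Sunter style log-likelihood ratio, which is assumed here. The m and u values, the field names, the threshold and the helper `match_weight` are illustrative assumptions, not estimates from real data.

```python
import math

# Assumed per-field probabilities (illustrative values only):
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELDS = {
    "surname": (0.95, 0.01),
    "dob": (0.98, 0.003),
    "sex": (0.99, 0.5),
}

THRESHOLD = 3.0  # hypothetical cut-off for accepting a link


def match_weight(rec_a, rec_b, fields=FIELDS):
    """Sum agreement and disagreement weights across the compared fields."""
    total = 0.0
    for name, (m, u) in fields.items():
        if rec_a[name] == rec_b[name]:
            total += math.log2(m / u)              # agreement raises the score
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement lowers it
    return total


a = {"surname": "Smith", "dob": "1980-01-02", "sex": "F"}
b = {"surname": "Smyth", "dob": "1980-01-02", "sex": "F"}  # likely a spelling variant
c = {"surname": "Jones", "dob": "1991-03-04", "sex": "F"}

for label, other in (("a-b", b), ("a-c", c)):
    weight = match_weight(a, other)
    print(label, round(weight, 2), "link" if weight > THRESHOLD else "no link")
```

Unlike an exact-match rule, the pair with the misspelt surname can still be linked because the agreement on date of birth carries a large weight. A higher total weight means the two records are more likely to belong to the same person.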

10
Q

NHI:

A

National Health Index: a unique identifier that tracks your interactions with the health system

11
Q

The IDI:

A

The Integrated Data Infrastructure is a large research database containing de-identified microdata about people and households. Researchers use the IDI to answer complex questions to improve outcomes for New Zealanders.

12
Q

Benefits of IDI:

A

De-identified and linkable. Caveat: the resource is only as good as the data it contains.

13
Q

Variables included in NZDep2013 (x9)

A

Communication (no internet access), Income (people aged 18-64 receiving a means-tested benefit), Income (people living in equivalised households with income below an income threshold), Employment, Qualifications, Owned home, Support (people <65 living in a single-parent family), Living space, Transport

14
Q

What challenges do big data bring?

A

Data governance, data generation, data output

15
Q

The five safes:

A

Safe people, projects, settings, output, data

16
Q

Analyses you can do with big data:

A

Machine learning, Artificial intelligence, data mining, smart cities