Understanding Data Flashcards

1
Q

Why are statistical methods important?

A
  • Used for evidence-based research
  • Applied across fields such as:
    Social sciences
    Epidemiology
    Business and marketing
2
Q

Define data analysis.

A

The process of inspecting, cleansing, transforming, and modelling data with the aim of gaining useful
insight (or information) to support decision-making.

3
Q

What is the DIKW pyramid?

A

DIKW is a framework describing the relationship between data, information, knowledge, and wisdom: the structural ‘stages’ one must go through to gain knowledge and wisdom.

4
Q

What does DIKW stand for?

A

Data - Information - Knowledge - Wisdom

5
Q

Define data in terms of DIKW.

A

Raw facts.

6
Q

Define information in terms of DIKW.

A

Contents of a database assembled from raw facts.

7
Q

Define evidence in terms of DIKW.

A

Results of analysis of many datasets or scenarios.

8
Q

Define knowledge in terms of DIKW.

A

Personal knowledge about places and issues.

9
Q

Define wisdom in terms of DIKW.

A

Policies developed and accepted by stakeholders.

10
Q

List the three main facets statistics is composed of.

A
  • design
  • description
  • inference
11
Q

Describe the design part of statistics.

A

How to collect the data (e.g., probabilistic sampling approaches).

12
Q

Describe the description part of statistics.

A
  • Describing the way the data looks
  • Summarising the data that has been collected
13
Q

Describe the inference part of statistics.

A
  • Making predictions about the wider population or about the future
  • Specifically, statistical inference
14
Q

Define population.

A

The entire possible set of subjects we wish to study, e.g. states, individuals, businesses.

15
Q

Define sample.

A

The subset of subjects chosen for study through data collection.

16
Q

Define parameter.

A

A numerical summary about the OVERALL population.

17
Q

Define statistic.

A

A numerical summary of the sample data.

18
Q

Why do we tend to use statistics instead of parameters?

A

Because we rarely know true population parameters.
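
A minimal sketch of the parameter/statistic distinction, assuming a simulated population of heights (the values, sizes, and variable names are illustrative, not from the cards). In practice the population is usually not fully observed, so only the sample statistic can be computed.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed only so the sketch is repeatable

# Hypothetical population: heights (cm) of 10,000 people.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Parameter: a numerical summary of the whole population.
parameter = statistics.mean(population)

# Statistic: the same summary computed on a sample of 100 subjects.
sample = rng.sample(population, 100)
statistic = statistics.mean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The sample mean serves as an estimate of the population mean; the two values are typically close but not identical.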

19
Q

Which two pieces of information do statistics contain?

A
  • A measure of central tendency
  • A measure of variability
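
As an illustration of the two measures above, the following sketch uses Python's standard `statistics` module on a small made-up sample of ages (the data is hypothetical).

```python
import statistics

ages = [23, 25, 25, 31, 46]  # hypothetical sample

# Measures of central tendency
mean_age = statistics.mean(ages)      # arithmetic mean
median_age = statistics.median(ages)  # middle value of the sorted data

# Measure of variability
sd_age = statistics.stdev(ages)       # sample standard deviation (n - 1 denominator)

print(mean_age, median_age, round(sd_age, 2))
```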
20
Q

Define variable.

A

Anything that we can measure about the subjects in our sample.

21
Q

What falls under continuous variables?

A
  • Interval
  • Ratio
22
Q

What falls under categorical variables?

A
  • Nominal
  • Ordinal
23
Q

Describe the levels of measurement from lowest to highest.

A

(Lowest) Nominal < Ordinal < Interval < Ratio (Highest)

24
Q

Define discrete variables.

A

Contains data with countable items, e.g. number of crimes in London in the last month, number of students in a class.

25
Q

Define continuous variables.

A

Contains data with measurable items, e.g. age (in years: 25, 57, etc.), height (in meters).

26
Q

Define categorical variables.

A

Has categories or groups, e.g. gender, ethnicity, employment status, etc.

27
Q

List the characteristics of nominal measures.

A
  • Categorical measure
  • Discrete set of categories with no natural order
  • Used to distinguish groups with labels
  • May be referred to as a qualitative or categorical variable
  • It is the lowest level of measurement
28
Q

Give examples of nominal measures.

A

e.g. Gender:
0 = Female
1 = Male
e.g. Race:
1 = Asian
2 = Black
3 = White

29
Q

List the characteristics of ordinal measures.

A
  • Categorical measure
  • Discrete set of categories that have some natural order
  • Their categories have rankings, but the size of the difference between rankings is not known
  • Order matters!
  • It is the 2nd lowest level of measurement
30
Q

Give examples of ordinal measures.

A
  • Likert scale (strongly disagree, disagree, neutral,
    agree, strongly agree)
  • Socioeconomic status
    1 = Working class (Low)
    2 = Middle class
    3 = Upper class (High)
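
A short sketch of how ordinal data might be handled, assuming made-up Likert responses (the `RANK` mapping and `responses` list are illustrative names). Only the order of the ranks is used: the median is computed by position, while a mean is avoided because it would assume equal spacing between ranks.

```python
# Ordered categories, lowest to highest.
LIKERT = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
RANK = {label: position for position, label in enumerate(LIKERT)}

responses = ["agree", "neutral", "strongly agree", "disagree", "agree"]  # hypothetical

# Sorting by rank is meaningful because the categories have a natural order.
ordered = sorted(responses, key=RANK.__getitem__)

# Median by position (odd number of responses, so there is a single middle value).
median_response = ordered[len(ordered) // 2]

# Order comparisons are valid; the size of the gap between ranks is not known.
agree_or_higher = [r for r in responses if RANK[r] >= RANK["agree"]]

print(median_response, len(agree_or_higher))
```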
31
Q

List the characteristics of interval measures.

A
  • Continuous measure
  • Unlike ordinal variables, the differences between values are known and equal (they must be known to calculate an interval)
  • Zero is arbitrary (a measured value of zero does not mean that the quantity is nonexistent)
  • 2nd highest level of measurement
32
Q

Give examples of interval measures.

A

e.g. Temperature in degrees Celsius: the difference between 78 and 79 degrees is the SAME as the difference between 45 and 46 degrees
- A measurement of zero degrees Celsius doesn’t indicate that there is no temperature; it only means that the temperature is at the freezing point of water
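
The arbitrary zero can be illustrated with a small sketch comparing Celsius (interval) with Kelvin (which has a true zero and is therefore a ratio scale). The helper name `celsius_to_kelvin` and the example temperatures are assumptions for illustration.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

a_c, b_c = 10.0, 20.0  # hypothetical temperatures in degrees Celsius

# Differences are meaningful on an interval scale and survive the conversion.
diff_c = b_c - a_c
diff_k = celsius_to_kelvin(b_c) - celsius_to_kelvin(a_c)

# Ratios are not meaningful in Celsius: 20 degrees is not "twice as hot" as 10.
ratio_c = b_c / a_c
ratio_k = celsius_to_kelvin(b_c) / celsius_to_kelvin(a_c)

print(diff_c, round(diff_k, 2))             # same difference on both scales
print(ratio_c, round(ratio_k, 3))           # ratio changes with the choice of zero
```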

33
Q

List the characteristics of ratio measures.

A
  • Continuous measure
  • Most precise
  • Exact value
  • Unlike interval measure, a zero value means that there’s “nothing” there (not arbitrary)
34
Q

Give examples of ratio measures.

A
  • Weight
  • Height
  • Income
  • House price
35
Q

Define a dependent variable (outcome, event).

A

The variable to be explained, described or understood.

36
Q

How is the dependent variable mathematically denoted?

A

As the variable Y.

37
Q

List two characteristics of dependent variables.

A
  • Dependent variable should be dependent upon something else
  • Should NOT affect the independent variable
38
Q

Why should dependent variables vary?

A

If you have a constant DV, you will not be able to explain the effect of other variables on it.

39
Q

Define an independent variable.

A

The variable presumed to be the determinant or cause of the dependent variable, or something that impacts it.

40
Q

What other terms can be used to describe the independent variable?

A

Explanatory or predictor variables and risk factors.

41
Q

How is the independent variable mathematically denoted?

A

X

42
Q

List the 3 types of descriptive statistics.

A
  • Univariable analysis
  • Bivariable analysis
  • Multivariable analysis
43
Q

Define univariable analysis.

A

Analysis of only one variable (a single characteristic) at a time.

44
Q

Give examples of univariable analyses and describe them.

A
  • Frequency Distributions - a count or distribution of values on some single variable
  • Other descriptive statistics – some summary measure that describes the data in a way not obvious by looking at the frequency distribution
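
A frequency distribution for a single variable can be sketched with `collections.Counter` from the standard library; the employment-status values below are made up for illustration.

```python
from collections import Counter

# Hypothetical observations of one categorical variable.
status = ["employed", "student", "employed", "unemployed", "employed", "student"]

# Frequency distribution: a count of each value.
freq = Counter(status)

# Relative frequencies: each count as a share of all observations.
relative = {value: count / len(status) for value, count in freq.items()}

for value, count in freq.most_common():
    print(f"{value:<12} {count}  {relative[value]:.2f}")
```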
45
Q

Define bivariable analysis.

A

Analysis of two variables.

46
Q

Give examples of bivariable analyses.

A
  • Simple scatter plots
  • Cross-tabulations
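
A cross-tabulation counts how often each combination of two categorical variables occurs. The sketch below builds one with the standard library only; the records and variable names are hypothetical.

```python
from collections import Counter

# Hypothetical (gender, employment status) pairs, one per subject.
records = [
    ("female", "employed"),
    ("male", "student"),
    ("female", "student"),
    ("male", "employed"),
    ("female", "employed"),
]

pair_counts = Counter(records)  # count of each (row, column) combination

rows = sorted({gender for gender, _ in records})
cols = sorted({status for _, status in records})

# Nested dict: table[row][column] -> count (0 when the combination never occurs).
table = {r: {c: pair_counts[(r, c)] for c in cols} for r in rows}

for r in rows:
    print(r, table[r])
```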
47
Q

Define multivariable analysis.

A

Analysis of three or more variables.