Lecture 7 - Terminology Flashcards

1
Q

What does the type of variable used affect?

A

The tools we use to analyze it

2
Q

What are the three main types of variables?

A
  • Continuous: expressed on a continuous scale in which every value is possible
  • Discrete: can be put in a one-to-one correspondence with the counting numbers (whole numbers)
  • Categorical: restricted to one of a set of distinct categories, e.g. heads or tails
3
Q

What is binary categorical data?

A

This is the simplest data type that can arise from a categorical variable.
There are only two categories to choose from,
e.g. you are either a smoker or a non-smoker, an athlete or not an athlete.

4
Q

Why is binary categorical data sometimes called 0-1 data?

A

Because the two categories are given arbitrary numbers for representation, e.g. 1 represents the patients with the outcome and 0 represents those without it.
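
As a rough illustration of 0-1 coding, the sketch below codes a made-up list of patient outcomes. The labels and data are hypothetical and are not from the lecture.

```python
# Hypothetical outcome labels for six patients (illustrative data only).
outcomes = ["outcome", "no outcome", "outcome", "no outcome", "no outcome", "outcome"]

# Code the category of interest as 1 and the other category as 0.
coded = [1 if label == "outcome" else 0 for label in outcomes]

# With 0-1 coding, the sum counts the 1s and the mean is the proportion of 1s.
count_with_outcome = sum(coded)
proportion_with_outcome = count_with_outcome / len(coded)

print(coded)
print(count_with_outcome)
print(proportion_with_outcome)
```

The choice of which category gets the 1 is arbitrary; swapping the codes changes the counts to those of the other category but loses no information.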

5
Q

Are there cases where we need more than two categories of categorical data? What are the further divisions that occur at this level?

A

Yes, often, e.g. blood types.
The term nominal is used if there is no relevant ordering of the categories.
Ordinal is used if there is an ordering/rank.

6
Q

In what case is data not numerical even though it is recorded on a number scale?

A

For ordinal categorical data we often use numbers as a scale/ranking, e.g. pain levels. However, this doesn't mean the data is numerical. If it were numerical, we could say that someone whose score is double someone else's experienced twice the pain, which we can't do.

7
Q

How is discrete numerical data different from categorical data which uses a number scale?

A

Discrete numerical data only takes discrete (whole) values, and the spacing between the numbers is always consistent, so we can make comparisons such as "twice as many".

8
Q

Continuous data arises from…

A

taking a measurement (e.g. height), where a full scale of numbers is possible.

9
Q

True or false: continuous data is more often positive than negative.

A

True

10
Q

What is an example of making continuous data discrete (coarsening)?

A

When we say our age in full years
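
A minimal sketch of coarsening, assuming some made-up exact ages (the values are hypothetical): each continuous age is reduced to its completed whole years.

```python
import math

# Hypothetical exact ages in years (continuous measurements).
exact_ages = [23.92, 24.08, 30.5, 30.99]

# Coarsen by keeping only the completed whole years.
ages_in_full_years = [math.floor(age) for age in exact_ages]

print(ages_in_full_years)  # [23, 24, 30, 30]
```

Note that two different exact ages (30.5 and 30.99) become the same discrete value, which is the information lost by coarsening.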

11
Q

Does continuous numerical data give rise to only 1 pattern of distribution?

A

No. Because any measurement is possible, there is often a lot of variation in the patterns that can be seen when the data is graphed.

12
Q

What is an example of when multimodal data would arise?

A

Measuring height: we would get a mode for each gender.

13
Q

What is a ratio?

A

A fraction given by one quantity over another, where both quantities have the same units.

14
Q

What is a fraction?

A

One quantity expressed as a part of the whole.

15
Q

What are rates?

A

Ratios of quantities with different units
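
The three definitions above (ratio, fraction, rate) can be compared side by side in a short sketch. All counts and variable names below are hypothetical and chosen only for illustration.

```python
# Hypothetical counts for a small study (illustrative numbers only).
smokers = 20
non_smokers = 80
total_people = smokers + non_smokers

# Ratio: one quantity over another, both in the same units (people).
ratio_smokers_to_non_smokers = smokers / non_smokers

# Fraction: one quantity compared to the whole.
fraction_smokers = smokers / total_people

# Rate: a ratio of quantities with different units (events per person-year).
events = 12
person_years = 400
event_rate = events / person_years

print(ratio_smokers_to_non_smokers, fraction_smokers, event_rate)
```

The ratio and the fraction use the same numerator but different denominators, which is why they give different values (0.25 versus 0.2 here).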

16
Q

What happens when it is too hard to obtain measures of a continuous phenomenon?

A

A score is used

17
Q

What is censored data? What are the types?

A

Underlying data follows a continuous distribution but some values are not known exactly

Types:
  • Right censored - the true value is known to be larger than a recorded value
  • Left censored - the true value is known to be smaller than a recorded value
  • Interval censored - the true value is known to lie between two values

Censored data are described by two variables, e.g. for right censored
data one variable gives the last known value and another indicates whether
or not the measurement is censored.
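
A minimal sketch of the two-variable representation for right censored data, assuming hypothetical follow-up times in months (the records and names are made up for illustration):

```python
# Hypothetical follow-up records: (last known value in months, censored flag).
# censored=True means the true value is known only to exceed the recorded one.
records = [
    (12.0, False),
    (30.0, True),
    (7.5, False),
    (24.0, True),
]

# The two variables: the recorded values and the censoring indicators.
last_known_values = [value for value, _ in records]
censored_flags = [censored for _, censored in records]

# Values that were observed exactly, and a count of the censored ones.
exactly_observed = [value for value, censored in records if not censored]
number_censored = sum(censored_flags)

print(exactly_observed)
print(number_censored)
```

Keeping the flag alongside the value matters because a censored 30.0 and an exactly observed 30.0 carry different information.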

18
Q

What are parameters?

A

The numerical measure of the quantity of interest in a population. They are generally unknown but can be hypothetical

19
Q

What is a random variable?

A

An unknown quantity that varies in an unpredictable way, like the result of flipping a coin.

Once it has been observed, we refer to an observed or realized value.

20
Q

Upper case Roman letters represent…

A

Random variables

21
Q

Lower case Roman letters represent…

A

Observed or realized values

22
Q

Random variables are described in…

A

probability distributions

23
Q

What is a statistic?

A

A numerical summary of data

24
Q

What is an estimate?

A

A special kind of statistic used as an intelligent guess for a parameter

25
Q

What does adding a circumflex do?

A

Estimates are denoted by adding a circumflex (hat) over the parameter's symbol. This indicates that the value is an estimate of the parameter value.
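
As a small illustration of the notation, the sketch below uses a sample mean as an estimate of an unknown population mean. The data are hypothetical, and the name mu_hat simply spells out "mu with a circumflex".

```python
from statistics import mean

# Hypothetical sample of heights in cm (lower case x: observed values).
x = [170.0, 165.0, 180.0, 175.0]

# The sample mean is a statistic; used as an estimate of the unknown
# population mean (the parameter mu), it is written as mu-hat.
mu_hat = mean(x)

print(mu_hat)  # 172.5
```

A different sample would generally give a different value of mu_hat, while the parameter itself stays fixed.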