RS- 6 Flashcards

1
Q

population

A

The collection of items under discussion or observation.
* e.g. objects, events, hospital visits, procedures, observations or measurements; it can also be an actual population of people, animals etc.

  • Finite: it is possible to count its individuals (to get a number N)
  • Infinite: there is no end to the population, or it is uncountable
  • Real
  • Hypothetical
2
Q

Variates and variables

A

* Variate: A quantity or attribute whose value can change, e.g. for the variable ‘sex’ the variates could be male or female. Male is a variate.
* Variable: Any characteristic, number, or quantity that can be measured or counted. It can take on different values, e.g. ‘sex’ is a variable as it can be male or female.

3
Q

Cases

A

An experimental unit from which data are collected.

4
Q

observations

A

A set of one or more measurements on a single unit of observation.

5
Q

what are the types of variables?

A

Quantitative: values are numerical and arithmetic operations can be performed on them; they result from counting or measuring something.
Qualitative: non-numerical.
Constant: a quantity which can assume only one value. Constants can be mathematical constants, which do not vary, or categorical constants.

6
Q

why are these differences important?

A

The methods we employ to analyse the data depend on the level and type of data (variables) we have.

7
Q

what are the types of quantitative variables?

A
    1. Continuous: can take any value within a range. The values between the figures have meaning and the data can be broken into smaller parts, e.g. birth weight of a baby in kg and g.
    2. Discrete: specific points on a scale that change by steps or jumps. The values between have no meaning, and they are often whole numbers, e.g. the number of children a person has: it cannot be 2.4 children, it must be 1, 2, or 3 etc.
8
Q

what are the types of qualitative variables?

A
    1. Nominal: there is no natural order to the categories these variables are assigned to, e.g. degree course or hair colour.
    2. Ordinal: there is a natural order to the categories, e.g. months of the year follow an order, or a satisfaction scale from 1 to 10.
    3. Dichotomous: there are only 2 options, e.g. yes / no vote, leave / remain vote.
9
Q

why do we summarise data?

A

To make the data readable and understandable.

10
Q

what are the tables and plots? give a description of each

A

- Frequency table: groups the data into intervals
- Histogram: graphical representation of the data; shows whether the distribution is skewed
- Dot plot: linear axis, range is finite, retains all data in original form
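
As a rough illustration of the frequency table idea, the sketch below groups values into intervals and counts how many fall in each. The ages and the interval width of 10 are invented for the example, not taken from the course material.

```python
from collections import Counter

# Hypothetical ages, invented for illustration only.
ages = [23, 27, 31, 35, 38, 41, 44, 47, 52, 58]
width = 10  # assumed interval width

# Map each value to the lower bound of its interval, then count per interval.
freq = Counter((age // width) * width for age in ages)

# Print one row per interval, lowest interval first.
for lower in sorted(freq):
    print(f"{lower}-{lower + width - 1}: {freq[lower]}")
```

Plotting these counts as bars would give a histogram of the same data.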

11
Q

Talk about central tendency

A
  • One number to represent the data, the most central value: central ‘tendency’. Measured by the mean, median and mode.
    Mean: the sum of all data points divided by the number of observations. It is the balance point, with equal weighting on both sides; however, outliers strongly affect the mean.
    Median: the middle value when the data are ranked from lowest to highest; it divides the distribution into 2 halves. With an even number of observations, it is the sum of the two middle values divided by 2. Better than the mean when there are outliers.
    Mode: the data point that occurs most frequently; no ranking is needed. If there are two modes, the data are bimodal.
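
The three measures can be checked with Python's standard `statistics` module. This is only a small sketch; the scores are made up for the example.

```python
import statistics

# Hypothetical scores, invented for illustration only.
scores = [2, 3, 3, 5, 7, 8]

# Mean: sum of all data points divided by the number of observations.
mean = statistics.mean(scores)

# Median: with an even count, the two middle values (3 and 5) are averaged.
median = statistics.median(scores)

# Mode: the value that occurs most frequently.
mode = statistics.mode(scores)

# A bimodal example: two values tie for most frequent.
modes = statistics.multimode([1, 1, 2, 4, 4])

print(mean, median, mode, modes)
```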
12
Q

when to use which central tendency

A
  • For categorical data use the mode
  • For quantitative data use the median or mean
  • The mean is strongly affected by outliers
  • The median is insensitive to outliers and to skewed distributions
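
To see the outlier point in numbers, the sketch below adds one extreme value to a small data set and compares how far the mean and the median move. The lengths of stay are hypothetical.

```python
import statistics

# Hypothetical lengths of hospital stay in days, invented for illustration.
stays = [2, 3, 3, 4, 5]
with_outlier = stays + [40]  # one extreme value added

mean_before = statistics.mean(stays)
mean_after = statistics.mean(with_outlier)

median_before = statistics.median(stays)
median_after = statistics.median(with_outlier)

# The mean jumps a long way; the median barely changes.
print(mean_before, mean_after)
print(median_before, median_after)
```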
13
Q

what is the limiting factor of central tendency

A

Central tendency says nothing about spread. Even though the central tendency information would tell us that these data were very similar, we can see that there is a greater spread of the data for year 2.
* We therefore need a better way to describe the data and the spread of the data we can see.
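
The original data behind this card are not available, so the sketch below uses two invented data sets (named `year_1` and `year_2` only to echo the card) that share the same mean and median but have very different spreads.

```python
import statistics

# Invented data sets, not the ones from the original slide.
year_1 = [48, 49, 50, 51, 52]
year_2 = [10, 30, 50, 70, 90]

# Central tendency is identical for both years.
means = (statistics.mean(year_1), statistics.mean(year_2))
medians = (statistics.median(year_1), statistics.median(year_2))

# The spread (here measured by the range) is not.
range_1 = max(year_1) - min(year_1)
range_2 = max(year_2) - min(year_2)

print(means, medians, range_1, range_2)
```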

14
Q

what are the other ways to represent data that are not based on central tendency?

A
  • Range: the difference between the lowest and the highest data values. It shows the overall spread but not how variable the data are, and extreme values have an extreme effect on it.
  • Interquartile range (IQR): gives a better idea of how the data are distributed. It is a measure of where the “middle fifty %” of a data set lies, i.e. where the bulk of the values lie, which helps rule out extreme values or outliers in the data.
    The first quartile, denoted Q1, is the value in the data set that holds 25% of the values below it. The third quartile, denoted Q3, is the value in the data set that holds 25% of the values above it. The quartiles run from Q0 to Q4, and the interquartile range covers the middle two groups (Q1 to Q3).
    To find the quartiles: find the median, then the median of the lower half (Q1) and the median of the upper half (Q3). To find the interquartile range, subtract the value of the lower quartile (Q1) from the value of the upper quartile (Q3).
    Used by population scientists with large datasets. Not useful with small numbers of observations. Shown by a box and whisker plot.
  • Box and whisker plot (or box plot): a convenient way of visually displaying the data distribution through its quartiles.
    The lines extending from the boxes are known as the “whiskers”, which are used to indicate variability outside the upper and lower quartiles. Box plots can be drawn either vertically or horizontally.
    If your extreme values (high or low) are more than 1.5x IQR below Q1 or above Q3 then they can be classed as outliers. They can be plotted separately from the box and whisker plot as individual dots in line with the whiskers, or as an asterisk * or other symbol.
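
The "median, then median of each half" recipe and the 1.5x IQR rule can be sketched as below. The helper `quartiles` and the data are hypothetical, and the choice to leave the median out of both halves when the count is odd is an assumption: textbooks and software use several slightly different quartile conventions.

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method.

    Assumption: with an odd number of values, the middle value is
    excluded from both halves. Other conventions exist.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

# Hypothetical measurements with one extreme value.
data = [1, 3, 4, 5, 6, 7, 8, 9, 30]

q1, med, q3 = quartiles(data)
iqr = q3 - q1  # upper quartile minus lower quartile

# Values more than 1.5 x IQR below Q1 or above Q3 are classed as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print(q1, med, q3, iqr, outliers)
```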
15
Q

summary- descriptive statistics

A
  1. Measures of average
    – Mean: works best for mathematicians
    – Median: sometimes gives a more sensible answer when there are outliers, or a skewed distribution
  2. Measures of spread
    – Range (only tells you about smallest and largest observation)
    – Interquartile range (only useful if large number of observations)
16
Q

describe variance

A

Think of variation as a kind of average of how much each number in a group of numbers differs from the group mean.
* Several statistics are available for measuring variation. They all work the same way:
* The larger the value of the statistic, the more the numbers differ from their mean.
* The smaller the value, the less they differ.
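
One such statistic is the population variance: the average of the squared differences from the mean. The sketch below assumes that definition (the card itself does not give a formula) and uses invented numbers; the helper `population_variance` is hypothetical and is checked against `statistics.pvariance` from the standard library.

```python
import math
import statistics

def population_variance(values):
    """Average of the squared differences from the mean (assumed definition)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

# Invented data: same mean (10), different spread.
tight = [8, 9, 10, 11, 12]
wide = [0, 5, 10, 15, 20]

var_tight = population_variance(tight)
var_wide = population_variance(wide)

# The hand-written helper agrees with the standard library.
agrees = math.isclose(var_tight, statistics.pvariance(tight)) and math.isclose(
    var_wide, statistics.pvariance(wide)
)

# Larger value: the numbers differ more from their mean.
print(var_tight, var_wide, agrees)
```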