Statistics Flashcards

Question

What is continuous data?

Answer 1

Capable of being expressed as numbers, e.g. height, weight, serum bilirubin.

Answer 2

Two sets of measurements in which the same variable has been measured on the same subjects, usually at two different times or under two different conditions, e.g. before and after a treatment. In clinical trials, unpaired t-tests are used because patients are randomised into groups (e.g. type of anaesthetic used for an operation). Paired t-tests remove subject-to-subject variation.
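The effect of removing subject-to-subject variation can be sketched in a few lines of Python. This is an illustration only: the blood-pressure-style readings and the function names are hypothetical, and the unpaired statistic shown is the Welch form, which is one of several unpaired variants.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """t statistic for paired samples: mean difference / its standard error."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def unpaired_t_statistic(group_a, group_b):
    """Welch t statistic for two independent groups (b minus a)."""
    se = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (statistics.mean(group_b) - statistics.mean(group_a)) / se


# Hypothetical readings on the same five subjects, before and after treatment.
before = [140, 152, 138, 147, 160]
after = [135, 147, 134, 141, 155]

t_paired = paired_t_statistic(before, after)      # about -15.8 (4 degrees of freedom)
t_unpaired = unpaired_t_statistic(before, after)  # about -0.89
```

Every subject drops by a similar amount, so the paired statistic is large in magnitude. Treating the same numbers as two independent groups buries that drop under the variation between subjects.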

Answer 3

Have mutually exclusive classes, but there is an order between them, i.e. they can be ranked or ordered and fall between two extremes. Can be given as frequencies. The mean should be calculated with caution.

Answer 4

Also known as categorical or qualitative. Consists of classifying the observations into mutually exclusive classes, i.e. they can be put into various categories but no specific hierarchy exists, e.g. sex, colour. Can be given as frequencies. The mean cannot be calculated.

Answer 5

Central tendency (the average)
- Commonly used measures are the mean and the median; the mode is used less frequently.
- Median: the middle number when the observations are ordered.
- Mean: the arithmetic mean. Best used when observations are symmetrical (i.e. evenly distributed); not as good when there are outliers, which can skew the result.
- Mode: the most frequent number.
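A short sketch using Python's standard `statistics` module shows how a single outlier pulls the mean but not the median or mode. The data values are made up for illustration.

```python
import statistics

# Hypothetical observations; 40 is an outlier.
values = [2, 3, 3, 4, 5, 6, 40]

mean_value = statistics.mean(values)      # 9, pulled upward by the outlier
median_value = statistics.median(values)  # 4, the middle of the sorted values
mode_value = statistics.mode(values)      # 3, the most frequent value
```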

Answer 6

Series of relative frequencies

Answer 7

A left-tailed test is used when the alternative hypothesis states that the true value of the parameter specified in the null hypothesis is less than the null hypothesis claims. Critical value will be negative.

Answer 8

A right-tailed test is used when the alternative hypothesis states that the true value of the parameter specified in the null hypothesis is greater than the null hypothesis claims. The critical value (a threshold used to determine whether or not to reject the null hypothesis) will always be positive. The direction of the test is indicated in the alternative hypothesis and not in the null hypothesis.

Answer 9

Non-directional. Used when the alternative hypothesis states that the population parameter is different from the hypothesised value. Usually has 2 critical values.
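As a sketch of how the critical values differ between left-tailed, right-tailed and two-tailed tests, the snippet below looks them up for a standard normal (z) test statistic. The significance level of 0.05 is an assumption chosen for illustration; a t-test would use the t distribution instead.

```python
from statistics import NormalDist

alpha = 0.05  # assumed significance level
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Left-tailed: reject the null hypothesis if z falls below this negative value.
left_critical = standard_normal.inv_cdf(alpha)  # about -1.645

# Right-tailed: reject if z exceeds this positive value.
right_critical = standard_normal.inv_cdf(1 - alpha)  # about 1.645

# Two-tailed: alpha is split between both tails, giving two critical values.
two_tailed_critical = (
    standard_normal.inv_cdf(alpha / 2),      # about -1.960
    standard_normal.inv_cdf(1 - alpha / 2),  # about 1.960
)
```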

Answer 10

The closer the mean and median are together, the more symmetrical the distribution. We can get a crude measure of skewness by subtracting the median from the mean.
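The crude skewness measure described here is simple enough to compute directly. The helper name and the two data sets below are hypothetical.

```python
import statistics


def crude_skewness(values):
    """Mean minus median: positive suggests right skew, negative left skew."""
    return statistics.mean(values) - statistics.median(values)


symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]

skew_symmetric = crude_skewness(symmetric)        # 0: mean and median coincide
skew_right = crude_skewness(right_skewed)         # 1.75: mean pulled above median
```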

Answer 11

Explains how the observations are spread around the central measure. For parametric data, the SD describes the dispersion of values around the mean. For non-parametric data, percentiles are used to describe the spread of values around the median.

Answer 12

Interquartile range. It is defined as the difference between the 75th and 25th percentiles of the data.
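A minimal sketch of the calculation with Python's `statistics.quantiles`. Note that several conventions exist for computing percentiles; the `inclusive` method used here is one choice, and other methods can give slightly different quartiles for the same data. The data are hypothetical.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical observations

# n=4 returns the three cut points: 25th, 50th and 75th percentiles.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

iqr = q3 - q1  # 7.0 - 3.0 = 4.0
```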

Answer 13

The degree of association between 2 variables, expressed on a scale from -1 to +1: -1 = perfect negative correlation, 0 = no correlation, +1 = perfect positive correlation. The correlation coefficient is a mathematical measure that is devoid of any cause or effect implications. It is best to regard the correlation technique as a type of investigative analysis because it suggests areas for further research, rather than as testing hypotheses.
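To make the -1 to +1 scale concrete, here is a small sketch of the Pearson correlation coefficient written out by hand. The function name and data are hypothetical; it assumes both variables have non-zero variance.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return co_deviation / math.sqrt(ss_x * ss_y)


x = [1, 2, 3, 4]
r_positive = pearson_r(x, [2, 4, 6, 8])   # close to +1
r_negative = pearson_r(x, [8, 6, 4, 2])   # close to -1
r_none = pearson_r(x, [1, -1, -1, 1])     # 0: no linear association
```

The last case shows that a coefficient of 0 rules out only a linear association; the y values still follow a clear U-shaped pattern.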

Answer 14

A standard deviation that is greater than one-half of the value of the mean should raise questions about the adequacy of the standard deviation as a summary statistic.
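This rule of thumb can be written as a one-line check. The sketch below assumes positive-valued data (the comparison is not meaningful when the mean is zero or negative), and the function name and data sets are hypothetical.

```python
import statistics


def sd_exceeds_half_mean(values):
    """True when the sample SD is greater than one-half of the mean."""
    return statistics.stdev(values) > statistics.mean(values) / 2


tight = [48, 50, 52, 49, 51]  # SD about 1.6, mean 50
skewed = [1, 2, 2, 3, 20]     # SD about 8.1, mean 5.6

flag_tight = sd_exceeds_half_mean(tight)    # False
flag_skewed = sd_exceeds_half_mean(skewed)  # True: SD is a poor summary here
```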

Answer 15

Alpha = probability of a type 1 error (false positive). Beta = probability of a type 2 error (false negative).

Answer 16

Any instance that involves the simultaneous testing of more than one hypothesis.

Answer 17

Expressed by the statistic 1 - beta. Reflects the ability to reject a false null hypothesis (usually set between 70% and 90%).

Answer 18

The difference in response rates between the groups that would be of biological or clinical interest.

Answer 19

Agreement between measurements refers to the degree of concordance between two (or more) sets of measurements, i.e. measurements made by two (sometimes more than two) different observers or by two different techniques produce similar results. Statistical methods to test agreement are used to assess inter-rater variability or to decide whether one technique for measuring a variable can substitute for another. It is evaluated by tests such as Kendall’s tau.

Answer 20

Multiple comparison
- The more statistical tests you do, the more likely it is that you'll get a false positive result.
- Can use the Bonferroni correction, the Šidák correction, Holm's procedure or Tukey's procedure to correct for multiple comparisons.
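A rough sketch of why the false positive risk grows, and of the Bonferroni correction, the simplest of the methods listed. The family-wise error formula assumes the tests are independent; the p-values and function names are hypothetical.

```python
def family_wise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05
fwer_10_tests = family_wise_error_rate(alpha, 10)  # about 0.40

# Hypothetical p-values from four tests.
p_values = [0.001, 0.004, 0.03, 0.2]
threshold = bonferroni_threshold(alpha, len(p_values))  # 0.0125
significant = [p < threshold for p in p_values]
# 0.03 would pass an uncorrected 0.05 threshold but fails after correction.
```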

Answer 21

An increase in one variable is accompanied by an increase in the other.

Answer 22

An increase in one variable is accompanied by a decrease in the other.

Answer 23

Parametric: Pearson. Non-parametric: Spearman.
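One way to see the difference: Spearman's coefficient is the Pearson coefficient computed on ranks, so it captures any monotonic relationship, not just a linear one. The sketch below is illustrative, with hypothetical function names and data; tied values receive the average of their ranks.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient (assumes non-constant inputs)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_deviation / math.sqrt(ss_x * ss_y)


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

r = pearson_r(x, y)       # about 0.80
rho = spearman_rho(x, y)  # 1.0, because the ordering agrees perfectly
```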

Answer 24

The regression equation, which represents how much y changes with any given change in x, can be used to construct a regression line on a scatter diagram.
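A minimal sketch of fitting such a line by ordinary least squares, assuming a single predictor. The function name and data points are hypothetical.

```python
def fit_regression_line(x, y):
    """Least-squares slope and intercept for the line y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    co_deviation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    if ss_x == 0:
        raise ValueError("x must not be constant")
    slope = co_deviation / ss_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical scatter-diagram points.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_regression_line(x, y)  # about 1.99 and 0.05
predicted_at_6 = intercept + slope * 6        # about 11.99
```

The slope is the change in y for a one-unit change in x, which is the quantity the regression equation describes.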

Answer 25

Multiple linear regression is used to estimate the relationship between two or more independent variables and one dependent variable. You can use multiple linear regression when you want to know:
- How strong the relationship is between two or more independent variables and one dependent variable (e.g. how rainfall, temperature, and amount of fertilizer added affect crop growth).
- The value of the dependent variable at a certain value of the independent variables (e.g. the expected yield of a crop at certain levels of rainfall, temperature, and fertilizer addition).
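The idea can be sketched in plain Python by solving the least-squares normal equations. This is an illustration under stated assumptions, not a production method: the function names are hypothetical, the two predictors stand in for rainfall and temperature, and the made-up yields follow yield = 1 + 2 * rainfall + 3 * temperature exactly so the result is easy to check. Real analyses would normally use a statistics library.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - known) / aug[row][row]
    return solution


def fit_multiple_regression(predictor_rows, outcomes):
    """Return [intercept, b1, b2, ...] minimising squared error."""
    design = [[1.0] + list(row) for row in predictor_rows]  # leading 1 = intercept
    width = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(width)]
           for i in range(width)]
    xty = [sum(d[i] * out for d, out in zip(design, outcomes))
           for i in range(width)]
    return solve_linear_system(xtx, xty)


# Hypothetical (rainfall, temperature) pairs and crop yields.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
yields = [9, 8, 19, 18, 29]

intercept, b_rainfall, b_temperature = fit_multiple_regression(predictors, yields)

# Expected yield at rainfall = 2, temperature = 2 under the fitted model.
expected_yield = intercept + b_rainfall * 2 + b_temperature * 2
```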


(49 cards)