Methods Of Summarising Data Flashcards
Before data can be displayed graphically, what must happen?
The data must first be organized into tables, which summarize them in a compact and readily comprehensible form such as a frequency distribution.
Define a frequency distribution table and state its importance
A frequency distribution table is a table showing the number of observations at different values of the variable.
•The purpose is to display meaningful patterns. It can be used for all types of data, discrete or continuous.
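A frequency distribution table can be sketched in a few lines of Python. This is a minimal illustration only; the `visits` data are hypothetical and not taken from the deck.

```python
from collections import Counter

# Hypothetical data: number of clinic visits recorded for ten patients.
visits = [2, 3, 2, 1, 4, 2, 3, 1, 2, 3]

# Count how many observations fall at each value of the variable.
frequency = Counter(visits)

# Print the table in ascending order of the variable.
print("value  frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
```

The frequencies always add up to the total number of observations, which is a quick check that no observation was dropped or counted twice.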
In tabular presentations, which of these statements are true?
- The categories must be mutually exclusive and collectively exhaustive: each disease must belong to a category, and to only one category, of the table. True or false
- Avoid open-ended intervals. True or false
- Limit the number of classes to between 10 and 20. True or false
- Classes could be of equal or unequal widths. True or false
When are categories mutually exclusive and collectively exhaustive?
Collectively exhaustive: every observation in the sample must fall into one of the categories.
Mutually exclusive: an observation can fall into only one category; of two such events, only one can occur.
Both can't occur at the same time.
Frequency polygons:
•the frequency is plotted at the midpoints of the intervals, and
•for intervals of unequal widths, the heights on the y-axis must be re-adjusted.
True or false
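One common way to re-adjust the heights for unequal widths, assumed here rather than stated in the card, is to plot frequency per unit of class width (frequency density) at each class midpoint. The class limits and frequencies below are hypothetical.

```python
# Hypothetical classes: ((lower limit, upper limit), frequency).
classes = [((0, 10), 5), ((10, 20), 12), ((20, 40), 8)]

points = []
for (lower, upper), freq in classes:
    midpoint = (lower + upper) / 2   # x-coordinate of the polygon vertex
    width = upper - lower            # class width
    density = freq / width           # adjusted height: frequency per unit width
    points.append((midpoint, density))

for midpoint, density in points:
    print(midpoint, density)
```

The last class is twice as wide as the others, so its raw frequency of 8 is halved relative to a class of width 10 before it is plotted.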
For qualitative data, i.e. data not characterized by a numeric quantity, such as sex of patient or diseases seen at the out-patient department, summarization is by counting the number of observations in each category and expressing the counts as proportions or percentages.
True or false
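A minimal sketch of this counting step, using a hypothetical list of patient sexes:

```python
from collections import Counter

# Hypothetical qualitative data: sex of eight patients.
sexes = ["F", "M", "F", "F", "M", "F", "M", "F"]

counts = Counter(sexes)   # number of observations in each category
total = len(sexes)

# Express each count as a percentage of all observations.
percentages = {category: 100 * n / total for category, n in counts.items()}

for category in sorted(percentages):
    print(category, counts[category], f"{percentages[category]:.1f}%")
```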
•Sum of the individual items divided by the total number of items. It is amenable to mathematical manipulation, but is easily affected by extreme observations.
True or false
In the case of data classified into intervals, the frequencies are multiplied by the class midpoints, because it is not known exactly where the observations are located within the classes. •The class midpoint is obtained by adding the two class limits and dividing by two. True or false
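The midpoint method for grouped data can be sketched as follows. The class limits and frequencies are hypothetical, chosen only to make the arithmetic easy to follow.

```python
# Hypothetical grouped data: ((lower limit, upper limit), frequency).
classes = [((50, 59), 4), ((60, 69), 10), ((70, 79), 6)]

total_freq = sum(freq for _, freq in classes)

# Multiply each frequency by its class midpoint, then add the products.
weighted_sum = sum(((lower + upper) / 2) * freq for (lower, upper), freq in classes)

grouped_mean = weighted_sum / total_freq
print(grouped_mean)
```

Here the midpoints are 54.5, 64.5 and 74.5, so the weighted sum is 218 + 645 + 447 = 1310 over 20 observations.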
It is the middle-most ranked observation. It is less influenced by extreme values; however, it is not easily amenable to mathematical manipulation.
•It is the best measure of central tendency for skewed data.
True or false for median
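The contrast between the median and the mean under an extreme value can be checked with a small hypothetical data set (lengths of hospital stay in days, invented for illustration):

```python
import statistics

# Hypothetical lengths of stay in days.
stays = [3, 4, 4, 5, 6]
with_outlier = stays + [60]   # add one extreme observation

# The mean moves a long way; the median barely changes.
print(statistics.mean(stays), statistics.median(stays))
print(statistics.mean(with_outlier), statistics.median(with_outlier))
```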
•It is a useful summary statistic for antibody assays, microbial counts and skewed data. It is defined as the Nth root of the product of N observations.
•It is not used if any of the observations is negative.
True or false
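A minimal sketch of the Nth-root definition, using hypothetical antibody titres. `statistics.geometric_mean` (Python 3.8 and later) is shown alongside as a cross-check.

```python
import math
import statistics

# Hypothetical antibody titres (all positive).
titres = [10, 100, 1000]
n = len(titres)

# Definition from the card: Nth root of the product of N observations.
by_definition = math.prod(titres) ** (1 / n)

# Library equivalent, computed internally in a more numerically robust way.
by_library = statistics.geometric_mean(titres)

print(by_definition, by_library)
```

Both values agree with 100 up to floating-point rounding, whereas the arithmetic mean of the same titres is 370.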
Information about the variation within groups provides useful additional statistics, which could help to rate the strength of each group.
True or false
•It is the simplest measure of spread
•defined as the difference between the highest and the lowest observations.
•It tends to increase as the number of observations increases.
•It is not easily used for statistical inference.
•It uses only 2 of the observations and neglects the information about variation carried by all the others.
True or false
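The points about the range can be illustrated with a short sketch. The values are hypothetical.

```python
# Hypothetical observations.
values = [72, 59, 93, 70, 65]

# Range: highest observation minus lowest observation.
sample_range = max(values) - min(values)

# Adding observations can widen the range but can never narrow it.
extended = values + [49, 100]
extended_range = max(extended) - min(extended)

print(sample_range, extended_range)
```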
The semi inter‑quartile range is also a measure of variation and unlike the range, it does not vary with the number of observations. However, it is not a satisfactory measure of variation for small series of observations unless the number is divisible by 4.
True or false
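A minimal sketch of the semi inter-quartile range, taken here as half the distance between the first and third quartiles. Quartile conventions differ between textbooks and software, so the exact cut points from `statistics.quantiles` (default method) are an assumption and may not match a hand calculation. The data are hypothetical.

```python
import statistics

# Hypothetical observations (eight values, so the number is divisible by 4).
data = [2, 4, 6, 8, 10, 12, 14, 16]

# quantiles(..., n=4) returns the three quartile cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Semi inter-quartile range: half of (third quartile - first quartile).
semi_iqr = (q3 - q1) / 2

print(q1, q3, semi_iqr)
```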
Coefficient of variation: it is a dimensionless statistic.
•CV = standard deviation / mean
True or false
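That the coefficient of variation is dimensionless can be checked by computing it for the same hypothetical weights in two different units. The sample standard deviation (n - 1 divisor) is assumed.

```python
import statistics

# Hypothetical body weights in kilograms.
weights_kg = [60, 65, 70, 75, 80]
# The same weights expressed in grams.
weights_g = [w * 1000 for w in weights_kg]

# CV = standard deviation / mean
cv_kg = statistics.stdev(weights_kg) / statistics.mean(weights_kg)
cv_g = statistics.stdev(weights_g) / statistics.mean(weights_g)

print(cv_kg, cv_g)
```

Changing the unit multiplies both the standard deviation and the mean by the same factor, so the ratio is unchanged.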
The following data represent the number of correct responses made to the examination in statistics by 50 medical students in the Medical School selected systematically from the list of all students in the School.
72 72 93 70 59 78 74 65 73 80
57 67 72 57 83 76 74 56 68 67
74 76 79 72 61 72 73 76 67 49
71 53 67 65 100 83 69 61 72 68
65 51 75 68 75 66 77 61 64 74
a. Prepare the frequency distribution table and the frequency histogram for this data set.
b. Compute the sample mean, sample median, sample range R, and sample variance.
c. Does the data set represent a sample or a population? If it is a sample, describe the population from which it has been drawn.
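For part (b), a minimal sketch using Python's standard library is given here as one way to check hand calculations. It assumes the sample variance uses the n - 1 divisor, which is what `statistics.variance` computes.

```python
import statistics

# The 50 scores from the exercise.
scores = [
    72, 72, 93, 70, 59, 78, 74, 65, 73, 80,
    57, 67, 72, 57, 83, 76, 74, 56, 68, 67,
    74, 76, 79, 72, 61, 72, 73, 76, 67, 49,
    71, 53, 67, 65, 100, 83, 69, 61, 72, 68,
    65, 51, 75, 68, 75, 66, 77, 61, 64, 74,
]

sample_mean = statistics.mean(scores)          # sum of items / number of items
sample_median = statistics.median(scores)      # middle-most ranked observation
sample_range = max(scores) - min(scores)       # highest minus lowest
sample_variance = statistics.variance(scores)  # n - 1 divisor (assumption)

print(sample_mean, sample_median, sample_range, sample_variance)
```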
Observed difference in parameters between two groups, such as treated and control, may be due to?
•(i) Sampling variation or chance
•(ii) Inherent differences between the two groups
•(iii) Differences in the handling and evaluation of the two groups during the course of the investigation
•(iv) The true effects of the new procedure/drug.
Probability is a proportion which is non-negative and lies between 0 and 1.
•A probability distribution indicates how the total probability of '1' is distributed among the different possible outcomes.
True or false
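As an illustration, the distribution of the number of heads in two tosses of a fair coin (an example chosen here, not taken from the deck) shows the total probability of 1 being shared among the possible outcomes:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely results of tossing a fair coin twice.
tosses = list(product("HT", repeat=2))

# Count how many results give 0, 1 or 2 heads.
counts = Counter(result.count("H") for result in tosses)

# Probability of each outcome, kept exact with fractions.
distribution = {heads: Fraction(n, len(tosses)) for heads, n in counts.items()}

for heads in sorted(distribution):
    print(heads, distribution[heads])
```

Each probability lies between 0 and 1, and together they add up to exactly 1.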
The width of the confidence limits is twice the product of the standard error of the attribute and Zα
True or false
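The statement can be checked numerically. The sketch below assumes a two-sided 95% interval for a mean, so the multiplier is the 97.5th percentile of the standard normal distribution (about 1.96); the mean, standard deviation and sample size are hypothetical.

```python
import math
from statistics import NormalDist

# Hypothetical summary statistics.
mean = 70.0
sd = 10.0
n = 100

standard_error = sd / math.sqrt(n)

# Standard normal multiplier for a two-sided 95% interval (assumption).
z = NormalDist().inv_cdf(0.975)

lower = mean - z * standard_error
upper = mean + z * standard_error
width = upper - lower

print(lower, upper, width)
```

The width equals 2 × Z × standard error, so a larger sample (smaller standard error) gives narrower limits.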
What is statistical inference?
•Statistical inference is a process by which one draws a conclusion regarding a population from the results observed in a sample.
What is a test of significance?
•A test of significance is a method to rule out sampling variation as an explanation of the observed difference.
What is the probability value?
P-value
•The chance that random sampling from the population would produce a sample mean as deviant as, or more deviant than, the mean observed, assuming the null hypothesis is true.
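A minimal sketch of a P-value calculation for a sample mean. It assumes a normally distributed sample mean with a known population standard deviation (a z-test) and a two-sided alternative; none of these choices, nor the numbers, come from the deck.

```python
import math
from statistics import NormalDist

# Hypothetical values under the null hypothesis.
population_mean = 70.0
population_sd = 10.0

# Hypothetical sample.
n = 25
sample_mean = 74.0

standard_error = population_sd / math.sqrt(n)
z = (sample_mean - population_mean) / standard_error

# Two-sided P-value: chance of a sample mean at least this deviant
# in either direction if the null hypothesis is true.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(z, p_value)
```

With these numbers the sample mean lies two standard errors from the hypothesised mean, which gives a P-value a little below 0.05.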
If the result is not significant (the null hypothesis is not rejected), what inferences can be drawn from it?
•i The observed result could well arise purely by chance.
•ii No reason to doubt the validity of the null hypothesis. The data failed to provide sufficient evidence to doubt the validity of the null hypothesis.
•Not enough evidence to contradict the null hypothesis.
•iii We have to live with the null hypothesis until further evidence is obtained.
•iv The results do not provide sufficient evidence to doubt the null hypothesis.