Statistics & Data Usage Flashcards
A statistic that represents a cohort’s probability of surviving at a particular point in time.
Survival probability
A summary display of the pattern of survival probabilities over time.
Survival curve
In statistics, patients who are observed until they reach the end point of interest are called _____ cases.
Uncensored
In statistics, patients who survive beyond the end of follow up or who are lost to follow up are called _____ cases.
Censored
The horizontal axis in a graph is also called the ____ axis.
X
The vertical axis of a graph is also called the ____ axis.
Y
The average of a series of numbers is also called the _____
Mean
The number that appears most often in a series of numbers is called the _____
Mode
The middle value in a series of numbers is called the ______
Median
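A minimal sketch of the three cards above, using Python's standard `statistics` module. The values are made up for illustration and do not come from any registry.

```python
import statistics

# Hypothetical tumor sizes in mm (illustrative values only).
sizes = [12, 15, 15, 20, 38]

mean_size = statistics.mean(sizes)      # sum of the values / number of values
median_size = statistics.median(sizes)  # middle value of the sorted series
mode_size = statistics.mode(sizes)      # value that appears most often

print(mean_size, median_size, mode_size)
```

Note that the single large value (38) pulls the mean above the median, while the mode and median stay at 15.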
Which method of survival analysis calculates the proportion of patients surviving to each point that a death occurs?
The Kaplan-Meier Method
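The idea can be sketched as follows, assuming each patient is recorded as a follow-up time plus a flag for death (1) or censoring (0). The function name and the data are hypothetical; this is an illustration of the product-limit idea, not a validated implementation.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time a death occurs.

    times  -- follow-up time for each patient
    events -- 1 if the patient died at that time, 0 if censored
    """
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e == 1)
        if deaths:
            # Multiply by the proportion surviving this death time.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up in months for five patients (1 = died, 0 = censored).
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
print(curve)
```

The estimate only changes at times 2, 4, and 6, where a death occurs. Censored patients leave the at-risk count without lowering the curve.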
Which method of survival analysis is more accurate in estimating a survival curve?
A. The Life Table method.
B. The Kaplan-Meier method.
B. The Kaplan-Meier method
Which survival analysis method involves dividing the total period of observation into fixed intervals?
The Life Table method
In most cancer applications, the most important variable by which survival results should be subdivided is the ______
Stage of disease
A survival estimate based on all deaths, regardless of cause is called ______
Observed survival
A net survival measure representing cancer survival in the absence of other causes of death.
Relative survival
The purpose of a cancer registry report is to provide data for _____ and ______
Education and research
What is one of the primary ways hospital and central cancer registries become known in their communities?
Dissemination of cancer data
The reputation and usefulness of a cancer registry is often judged by the accuracy, timeliness, and clarity of its reports.
True or False?
True
Registries are not allowed to obtain copies from other registries to use as models for their own publications.
True or False?
False.
The CoC requires an annual report of a cancer program’s activities.
True or False?
False
Distributing an annual report of the cancer registry’s activities to medical and administrative staff is an excellent way to showcase the information in the registry.
True or False?
True
A log of all requests made to the registry for information is called a _____
Request log
A registry should keep a file of how it has responded to each request for information.
True or False?
True
What is the primary purpose of a narrative or technical writing that accompanies the presentation of data?
To describe how the data were collected and analyzed
Another name for the vertical (Y) axis of a graph is the ___
Ordinate
Another name for the horizontal (X) axis of a graph is the ___
Abscissa
Categories on a graph must be mutually exclusive.
True or False?
True
Data that only fits into one group is called ____
Mutually exclusive
What are the two types of qualitative data?
Nominal and ordinal
What are the two types of quantitative data?
Interval and ratio
Data that is unordered and discontinuous is ____
Nominal data
Data that has some continuity or is ranked in some type of order is called ____
Ordinal data
Stage of disease (I, II, III, IV) is what kind of data?
Ordinal
What kind of data includes numbers that begin from an arbitrary starting point, such as body temperature?
Interval data
What kind of data is based on units of measure and has a well-defined absolute 0 (where 0 means there is none, ex. weight or tumor size)?
Ratio data
What type of graph is a continuous bar graph (the bars are touching) in which the height of each bar is proportional to the number of observations?
Histogram
What type of graph uses separate, non-touching bars to depict nominal data with no presumed order?
Bar graph
What kind of graph represents a percentage of a total?
Pie chart
Three measures of central tendency are the
Mean, median, and mode
Three measures of variability (spread/dispersion) are
Range, variance, standard deviation
The difference between the highest and lowest number in a set of numbers is called the ____
Range
What is the square root of the variance called?
Standard deviation
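A short sketch of the three measures of spread, on made-up observations. It assumes the population form of the variance (divide by the number of observations); the sample form, `statistics.variance`, divides by n - 1 instead.

```python
import statistics

# Made-up observations for illustration.
values = [2, 4, 4, 4, 5, 5, 7, 9]

value_range = max(values) - min(values)  # highest minus lowest
variance = statistics.pvariance(values)  # mean squared distance from the mean
std_dev = statistics.pstdev(values)      # square root of the variance

print(value_range, variance, std_dev)
```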
A count of the number of times each value of a variable occurs (ex. a count of patients with pancreatic cancer shows 14 males, 11 females).
Frequency distribution.
In what kind of frequency distribution does the distribution equal the total number of cases? (Ex. All 25 patients in a study have the same outcome).
Absolute frequency
The ____ is obtained by dividing one frequency by another.
Ratio
If your data might be skewed because of some extreme values, what is the best measure to express your data?
A. Mean
B. Mode
C. Median
C. Median
An essential item of statistical data whose value can change
Variable
The average square of the distance of observations from the mean in a distribution is called the _____
Variance
What are the three shapes of dispersion curves?
- Bell.
- Skewed.
- Bimodal.
Which dispersion curve shape is considered normal?
Bell curve
A negatively skewed distribution curve has the tail (longer, flatter part of the curve) on the ____ side.
Left side.
What kind of dispersion curve has a symmetric distribution around the mean?
A bell curve.
Which kind of dispersion curve has more observations on one side than the other?
A skewed curve
What kind of dispersion curve has two separate peaks?
Bimodal
A measure of whether a dispersion curve is heavy-tailed or light-tailed compared to a normal distribution curve.
Kurtosis
The expected frequency with which an event will occur is called ______
Probability
The probability that something would occur by chance alone is called the ____ value
P-value
What type of risk is synonymous with incidence and refers to one’s possibility of developing a particular disease over a period of time?
A. Absolute risk
B. Attributable risk
C. Relative risk
D. None of the above
A. Absolute risk
Represents the number of new cases in a population over a period of time.
Incidence
The number of deaths in a population over a period of time is the _____ rate.
Mortality rate
The total number of existing cases in a given population at a specific time.
Prevalence
The proportion of a specific population affected over a period of time is the ______ incidence.
Cumulative incidence
The cumulative proportion of patients alive over time is called _____
Survival
What are the five measures of disease frequency?
- Incidence.
- Prevalence.
- Survival.
- Mortality rate.
- Cumulative incidence.
The Life Table method of calculating survival is also called the _____ method.
Actuarial.
This method calculates the percentage of patients alive at the end of a specified interval, and only uses patients actually at risk of dying in that interval (ex. 5-year survival rate).
Direct method.
This method uses all individuals in the study group regardless of their length of follow up (alive, dead, lost) to determine survival rate.
Actuarial (or Life Table) method.
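A rough sketch of an actuarial calculation over fixed intervals. It assumes the common convention that cases withdrawn during an interval (alive or lost to follow-up) count as at risk for half of it. That adjustment, the function name, and the cohort are assumptions for illustration and are not stated on the card.

```python
def actuarial_survival(intervals):
    """Cumulative survival at the end of each fixed interval.

    intervals -- list of (alive_at_start, deaths, withdrawn) tuples,
                 one per interval, in chronological order.
    """
    cumulative = 1.0
    result = []
    for alive_at_start, deaths, withdrawn in intervals:
        # Assumed convention: withdrawn cases are at risk for half the interval.
        effective_at_risk = alive_at_start - withdrawn / 2
        # Compound this interval's survival into the running product.
        cumulative *= 1 - deaths / effective_at_risk
        result.append(cumulative)
    return result

# Hypothetical cohort of 100 patients followed over two one-year intervals.
yearly = [(100, 10, 0), (90, 17, 10)]
survival_by_year = actuarial_survival(yearly)
print(survival_by_year)
```

Every patient contributes to the intervals they entered, which is how this method uses the whole study group regardless of length of follow-up.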
Three types of survival data include:
- Observed survival.
- Adjusted survival.
- Relative survival.
Survival data that uses all deaths regardless of cause.
Observed survival.
Survival data that considers only deaths from cancer
Adjusted survival or cause-specific survival.
The ratio of the observed survival rate to the expected rate for a demographically similar group in the general population.
Relative survival.
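As a worked illustration of that ratio, with hypothetical five-year figures (the function name and the numbers are made up):

```python
def relative_survival(observed, expected):
    """Observed survival divided by the expected survival of a comparable group."""
    return observed / expected

# Hypothetical: 60% observed in the cancer cohort,
# 80% expected in a demographically similar general population.
rel = relative_survival(0.60, 0.80)
print(round(rel, 2))
```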
What are the three types of bias?
- Selection bias.
- Measurement bias.
- Confounding bias.
What are four controls to prevent bias in a study?
- Matching.
- Randomization.
- Stratification.
- Blinding.
A systematic error resulting in over or underestimation of the strength of association.
Bias
A study in which neither the subject nor the investigator knows which group the subject is in.
A double-blind study.
The study of the distribution of a disease in the population and the factors that influence this distribution.
Epidemiology
The crude death rate is based on what portion of a population?
The total population
What death rate is based on variables such as age, sex, cause, etc.?
Specific death rate
What type of risk measures the incidence associated with a specific factor and is calculated by taking the difference between the rate of a condition in an exposed population and an unexposed population?
A. Absolute risk
B. Attributable risk
C. Relative risk
D. None of the above
B. Attributable risk
A type of risk based on the ratio of incidence in a group with a specific factor compared to a group without that factor.
Relative risk
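The two cards above differ only in the arithmetic: attributable risk is a difference, relative risk is a ratio. A small sketch with a hypothetical cohort (all counts are made up):

```python
def incidence(cases, population):
    """New cases divided by the population at risk."""
    return cases / population

# Hypothetical cohort: 30 cases among 1,000 exposed, 10 among 1,000 unexposed.
exposed = incidence(30, 1000)
unexposed = incidence(10, 1000)

attributable_risk = exposed - unexposed  # difference between the two rates
relative_risk = exposed / unexposed      # ratio of the two rates

print(round(attributable_risk, 3), round(relative_risk, 1))
```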
What term refers to the chance of rejecting the null hypothesis in a statistical test when it’s true?
A. Probability
B. P-value
C. Significance level
D. None of the above
C. Significance level
The ability of a test to give a positive result when the person has the disease.
Sensitivity
The ability of a test to have a negative result when the person does not have the disease.
Specificity
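Both measures can be read off a 2x2 table of test result against true disease status. A sketch with hypothetical counts:

```python
# Hypothetical screening results compared with the true diagnosis.
true_pos, false_neg = 90, 10   # people who have the disease
true_neg, false_pos = 160, 40  # people who do not have the disease

# Share of diseased people the test correctly flags as positive.
sensitivity = true_pos / (true_pos + false_neg)
# Share of disease-free people the test correctly reports as negative.
specificity = true_neg / (true_neg + false_pos)

print(sensitivity, specificity)
```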
What is a cohort?
A group of people who have something in common.
Another name for relative risk is ____
Risk ratio
A cross-sectional study determines:
A. Incidence.
B. Prevalence.
B. Prevalence.
A study in which an entire population is classified according to the presence or absence of a certain factor, and then observed forward over time for the development of disease is called a _____
Prospective study
What type of study looks back at the exposure history after a disease or injury has occurred?
A retrospective study
A type of ratio that measures the odds an outcome will occur given a particular exposure compared to the odds of the same outcome in the absence of that exposure.
Odds ratio (OR)
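A minimal sketch of the calculation from a 2x2 table. The function name, parameter names, and counts are hypothetical.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome with the exposure divided by the odds without it."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed

# Hypothetical table: 40 of 100 exposed and 20 of 100 unexposed have the outcome.
or_value = odds_ratio(40, 60, 20, 80)
print(round(or_value, 2))
```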
A type of bias caused by the choice of individuals to be included in a study (ex. choosing only male patients when testing heart medication).
Selection bias
The type of bias that occurs when information collected for a study is inaccurate.
Measurement bias.
A type of bias that occurs when a third variable that’s related to other variables leads to the perception that a relationship exists. (Ex. Smokers who drink coffee develop lung cancer. Most smokers drink coffee, so is the coffee really causative?)
Confounding bias
What bias control does a clinical trial typically use?
Randomization
The overall success of the registry hinges on the success of which two efforts?
- High quality data management
- Effective reporting of cancer information
What are the 7 steps to effective report writing?
- Identify the audience
- Define the purpose
- Identify the cases required
- Extract data and calculate stats
- Choose appropriate format
- Edit and proofread
- Distribute or publish the report
What type of table shows only one variable?
A one-way table
Tables showing more than ____ variables are difficult to interpret and should be avoided.
Four
What type of graph is best for showing trends and change over time?
Line graph
What type of graph is best for comparing the size or amount of variables?
Bar graph
What type of graph is best for showing frequency distributions in bar form, and for showing continuous variables like age?
Histogram.
What kind of graph is best for showing percentages or parts of a whole?
Pie chart
What type of graph is similar to a histogram and utilizes a line connecting the midpoint of the top of each bar?
A frequency polygon
What type of graphic uses symbols that are strongly associated with the message of the chart?
Pictorial chart
Comparison data adds credibility to a report.
True or False?
True
What does Section 508 of the Rehabilitation Act say in regard to electronic reports created with federal funding?
They must be accessible to persons with disabilities
Which organization specifies the format of national data transmission?
NAACCR
HIPAA rules do not apply when data is transmitted to the National Cancer Data Base.
True or False?
False
All reports should undergo a quality control review before being released.
True or False?
True
Statistics describe a population, while _____ tests evaluate whether the characteristics of two populations differ.
Statistical tests
How do you obtain the median when there is an even number of values, and therefore no middle value?
You average the two middle values.
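The averaging step can be written out by hand. This is an illustrative helper (the function name is hypothetical), not a replacement for a statistics package.

```python
def median(values):
    """Middle value of the sorted series; average of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even count: no single middle value, so average the two central ones.
    return (ordered[mid - 1] + ordered[mid]) / 2

even_median = median([7, 3, 9, 5])  # sorted: 3, 5, 7, 9
odd_median = median([3, 1, 2])      # sorted: 1, 2, 3
print(even_median, odd_median)
```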
What hypothesis asserts there is no difference between groups?
The null hypothesis
A claim or statement about a property of a population.
Hypothesis
A _____ test is used to determine whether the difference between two groups is more than would be expected by chance alone.
Statistical test
____ is used to analyze how one continuous variable varies with another continuous variable. (Ex. Do taller people tend to weigh more?)
Correlation
Tests that assume the data are distributed normally are called ____ tests.
Parametric
Tests that do not assume data are distributed normally are called ____ tests.
Nonparametric
This type of test, which compares the means from two samples to determine whether they are statistically different, is favored when the size of each sample is at least 30 observations.
T-test
In epidemiology, what are the three basic elements needed for disease to occur?
- Agent
- Host
- Environment
What kind of prevention begins before the onset of disease, and includes lifestyle and behavior modifications?
Primary prevention
What kind of prevention begins after possible onset of disease but before symptoms of the disease are present and includes screening tests such as mammograms?
Secondary prevention
What is tertiary prevention?
Measures taken after a disease is diagnosed in order to prevent the disease from becoming more severe.
Which measure gives a better understanding of the total burden of disease on a population?
A. Incidence
B. Prevalence
B. Prevalence
What term refers to the probability that subjects with a positive screening test truly have the disease?
Positive predictive value.
What is the leading cause of cancer death in men and women?
Lung cancer
The actual number of observations of an occurrence is called the ___ count.
Raw
The proportion of time that repeated observations will likely fall between the stated limits.
Confidence limit
An index of the extent to which two measured variables are associated.
Correlation coefficient
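One common index of this kind is Pearson's r. The card does not name a specific coefficient, so treating it as Pearson's r is an assumption. The sketch below computes it by hand on made-up, perfectly linear data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from each mean.
    co_dev = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sq_dev_x = sum((x - mean_x) ** 2 for x in xs)
    sq_dev_y = sum((y - mean_y) ** 2 for y in ys)
    return co_dev / sqrt(sq_dev_x * sq_dev_y)

# Hypothetical heights (cm) and weights (kg) that rise together in a straight line.
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 66, 74, 82]
r = pearson_r(heights, weights)
print(r)
```

Values near +1 or -1 indicate a strong linear association, and values near 0 indicate little or none.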
How well a test performs in measuring the property or characteristic it is intended to measure.
Analytic validity
The predictive value of a test for a given clinical outcome, determined mostly by the sensitivity and specificity.
Clinical validity
The likelihood that a test will prompt an intervention and result in an improved outcome.
Clinical utility
A percentage is obtained by multiplying a proportion by _____
100
A proportion is obtained by dividing a population into parts and dividing one of the parts by _____
The total population
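Using the 14 males and 11 females from the frequency distribution card, a quick sketch of how a ratio, a proportion, and a percentage differ:

```python
males, females = 14, 11
total = males + females

ratio_m_to_f = males / females        # one frequency divided by another
proportion_male = males / total       # one part divided by the total
percent_male = proportion_male * 100  # proportion multiplied by 100

print(round(ratio_m_to_f, 2), proportion_male, round(percent_male, 1))
```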
In what type of measurement does the numerator accumulate over time while the denominator remains static?
Rate
What kind of study starts with individuals exposed to a suspected factor and follows them to measure frequency of disease?
A cohort study
In what kind of study are groups defined on the basis of disease and then assessed for their prior exposure?
Case control study
This data collection and analysis technique separates the data into distinct groups or layers so that patterns can be seen, and forces the study sample to be representative of the population. Considered one of the seven basic quality tools.
Stratification
A survival measure that is the hypothetical probability of surviving cancer in the absence of other causes of death.
Net survival
Which type of graph is used with nominal data that has no intrinsic ordering to the category?
Bar graph
TNM stage is an example of what type of scale of measurement?
A. Interval
B. Nominal
C. Ordinal
D. Ratio
C. Ordinal
What type of graph is used to display ordinal data with categories that can be ordered from high to low?
A. Histogram
B. Bar graph
C. Both a and b
D. None of the above
A. Histogram
What type of graph best demonstrates the sum of the class and all classes below it?
A. Bar graph
B. Cumulative frequency distribution
C. Frequency polygon
D. Pie chart
B. Cumulative frequency distribution
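A small sketch of how the cumulative counts are built, using hypothetical age classes and case counts:

```python
from itertools import accumulate

# Hypothetical case counts by age class, youngest to oldest.
classes = ["<40", "40-59", "60-79", "80+"]
counts = [5, 20, 40, 10]

# Each entry is the count for its class plus all classes below it.
cumulative = list(accumulate(counts))

for label, running_total in zip(classes, cumulative):
    print(label, running_total)
```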
Gender is an example of what type of scale of measurement?
A. Interval
B. Nominal
C. Ordinal
D. Ratio
B. Nominal
What type of graph is made by joining the middle-top points of the columns of a frequency histogram?
A. Frequency polygon
B. Cumulative frequency distribution
C. Bar graph
D. None of the above
A. Frequency polygon.
What type of risk compares the possibility of developing disease over a period of time in two different groups of people?
A. Absolute risk
B. Attributable risk
C. Relative risk
D. None of the above
C. Relative risk
Pain on a scale of 1-10 is an example of what type of scale of measurement?
A. Interval
B. Nominal
C. Ordinal
D. Ratio
A. Interval
What term refers to the probability that subjects with a negative screening test truly don’t have the disease?
A. Test specificity
B. Test sensitivity
C. Negative predictive value
D. Positive predictive value
C. Negative predictive value
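Predictive values use the same 2x2 table as sensitivity and specificity, but the denominators are the people with a given test result. A sketch with hypothetical counts:

```python
# Hypothetical screening results compared with the true diagnosis.
true_pos, false_pos = 90, 40   # people who tested positive
true_neg, false_neg = 160, 10  # people who tested negative

# Share of positive tests that truly have the disease.
ppv = true_pos / (true_pos + false_pos)
# Share of negative tests that truly do not have the disease.
npv = true_neg / (true_neg + false_neg)

print(round(ppv, 3), round(npv, 3))
```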
What level of evidence is considered the strongest?
A. Evidence from multiple time series with or without intervention
B. Evidence from well designed case control analytic studies
C. Evidence from well designed controlled trials without randomization
D. Evidence from at least one randomized control trial
D. Evidence from at least one randomized control trial
In a crude death rate, the numerator represents
A. The number of cancer deaths
B. The total population at risk of dying from cancer
C. The total population at risk of dying from cancer for a specific time period
D. None of the above
A. The number of cancer deaths
What term refers to the percentage of people alive for a certain period of time after being diagnosed with cancer?
Survival rate
Which is a measure of cancer frequency?
A. Incidence rate
B. Mortality rate
C. Prevalence
D. All of the above
D. All of the above
______ is a type of bias control that forces a study sample to be representative of the population.
Stratification
A Type 1 Error occurs in hypothesis testing
A. When the null hypothesis is mistakenly rejected.
B. When the samples are different and the null hypothesis is rejected while the alternative hypothesis is believed.
C. When the samples are the same and the alternative hypothesis is rejected while the null hypothesis is believed.
D. None of the above
A. When the null hypothesis is mistakenly rejected.
What term refers to the measure of the risk of developing some new condition within a specified period of time?
A. Cumulative incidence
B. Incidence rate
C. Mortality rate
D. Prevalence
B. Incidence rate
Data from a population-based incidence registry can provide an opportunity to answer questions related to what type(s) of validity?
A. External
B. Internal
C. Both (a) and (b)
D. Neither (a) nor (b)
A. External
From where can a registrar obtain incidence rates?
A. Population-based central registries
B. National Cancer Data Base (NCDB)
C. Both (a) and (b)
D. Neither (a) nor (b)
A. Population-based central registries
What type(s) of spatial analysis involve reducing small variations in an image (a map) to reveal the overall trend by data interpolation (i.e., estimating values rather than plotting exact values)?
A. Data Aggregation and Spatial Smoothing
B. Cluster Analysis
C. Both (a) and (b)
D. Neither (a) nor (b)
C. Both (a) and (b)
What is the source of expected survival data?
A. U.S. Population Life Tables
B. North American Association of Central Cancer Registries (NAACCR) statistics
C. Surveillance, Epidemiology, and End Results (SEER) calculations
D. None of the above
A. U.S. Population Life Tables
Why can it be difficult to determine if a specific race or ethnicity is more susceptible to certain cancers in a population?
A. Only hospital-based cancer registries collect race and ethnicity information.
B. The U.S. census is performed every ten years, and race estimates made between censuses may be inaccurate.
C. Commission on Cancer (CoC) standardized collection of race and ethnicity is not currently performed by central cancer registries.
D. None of the above
B. The U.S. census is performed every ten years, and race estimates made between censuses may be inaccurate.
To which organization do all 13 Canadian provincial and territorial registries report their data annually?
A. Statistics Canada
B. Public Health Agency of Canada (PHAC)
C. Canadian Cancer Registry (CCR)
D. Canadian Partnership Against Cancer (CPAC)
C. Canadian Cancer Registry (CCR)
Who are the members of the Commission on Cancer (CoC)?
A. Government agencies
B. Statewide professional and non-profit organizations involved in a field directly related to oncology (e.g., cancer registration, patient care, patient advocacy, control and prevention, education, research).
C. Corporations
D. All of the above.
A. Government agencies
Which Commission on Cancer (CoC) program(s) does the American Cancer Society (ACS) award funding to annually?
A. Cancer Liaison Program (CLP)
B. Accreditation Program
C. Both (a) and (b)
D. Neither (a) nor (b)
A. Cancer Liaison Program (CLP)
Realizing the need for national cancer incidence rates, the U.S. Congress in 1992 established the
A. Surveillance, Epidemiology, and End Results (SEER) Program
B. National Program of Cancer Registries (NPCR)
C. North American Association of Central Cancer Registries (NAACCR)
D. None of the above
B. National Program of Cancer Registries (NPCR)
In a report you prepare from registry data, which should NOT be included
A. Site distribution
B. Sex distribution
C. Descriptive statistics
D. Statistical inferences
D. Statistical inferences
The way in which the values for a variable are distributed is called the
A. Standard deviation
B. Measure of central tendency
C. Frequency distribution
D. Absolute frequency
C. Frequency distribution
In order to visualize the ratio, a graph needs to include
A. Abscissa
B. Label
C. Zero point
D. Footnote
C. Zero point
Rates that are calculated for a total population are called ____ rates.
Crude
Rates that are calculated for subgroups of the population, such as an age group, are called _____ rates.
Specific
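A sketch contrasting the two kinds of rate on hypothetical figures. Expressing rates per 100,000 is a common reporting convention assumed here, and the helper name is made up.

```python
def rate_per_100k(events, population):
    """Events divided by population, scaled to a rate per 100,000 (assumed convention)."""
    return events / population * 100_000

# Hypothetical deaths and population by age group.
groups = {
    "0-49": {"deaths": 50, "population": 500_000},
    "50+": {"deaths": 450, "population": 250_000},
}

total_deaths = sum(g["deaths"] for g in groups.values())
total_population = sum(g["population"] for g in groups.values())

# Crude rate: one figure for the total population.
crude_rate = rate_per_100k(total_deaths, total_population)
# Specific rates: one figure per subgroup (here, per age group).
specific_rates = {
    name: rate_per_100k(g["deaths"], g["population"]) for name, g in groups.items()
}

print(round(crude_rate, 1), specific_rates)
```

The crude rate hides the large gap between the two age groups, which is why specific rates are reported alongside it.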
What type of graph is used to show frequency distributions in bar form, can show continuous variables like age, can show counts or percentages, and is useful when it’s more important to show distribution of a variable rather than absolute numbers?
Histogram
What type of graph allows several histograms to be displayed on the same chart, and uses a line to connect the midpoint of the top of each bar?
Frequency polygon
A pie chart should be limited to no more than _____ slices, and the smallest slice should be at least ____% of the whole.
6, 2%
What type of chart uses symbols that are strongly associated with the message of the chart?
Pictorial chart
The degree to which the conclusions in your study would hold for other persons in other places and at other times is called ______ validity
External
Action taken to decrease the chance of getting a disease; avoiding risk factors and increasing protective factors.
Prevention
What type of event focuses on a change in behavior that reduces the risk cancer will develop, or increasing knowledge and awareness of cancer risks?
Cancer Prevention Event
Which of the following are compliant cancer prevention events?
A. Programs held on social media, the internet, or through mail.
B. Education given in the regular course of business.
C. Education about screening or reduction of late-stage at diagnosis.
D. None of the above are compliant cancer prevention events.
D. None of the above are compliant cancer prevention events.
What are the two types of Cancer Prevention Events?
Behavioral risk reduction.
Education/risk awareness lecture.
What type of event focuses on detecting cancer at an early stage to improve likelihood of increased survival and decreased morbidity?
Cancer Screening Event
Which of the following events is compliant with the Cancer Screening Event Standard?
A. Free mobile mammography with a formal procedure for followup as necessary.
B. Screening performed in the regular course of business.
C. Education about cancer screening that does not include actual screening.
A. Free mobile mammography with a formal procedure for followup as necessary.
How often should a Cancer Screening Event and a Cancer Prevention Event be held?
Once a year
The number of new cases of a disease divided by the number of people susceptible to that disease calculates
Incidence
The number of people with a disease (new or long-standing) divided by the total number of the population calculates
Prevalence
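The two calculations above use different numerators and denominators. In the sketch below all figures are hypothetical, and it assumes for simplicity that nobody dies, recovers, or moves during the year.

```python
# Hypothetical community figures for one calendar year.
population = 10_000
existing_cases_at_start = 200  # already living with the disease
new_cases_during_year = 50

# Only people without the disease are susceptible to becoming a new case.
at_risk = population - existing_cases_at_start
incidence = new_cases_during_year / at_risk

# Prevalence counts everyone with the disease, new or long-standing.
prevalence = (existing_cases_at_start + new_cases_during_year) / population

print(round(incidence, 4), prevalence)
```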
The number of new cases of a disease divided by the number of people susceptible to the disease, further divided by a specific time period calculates
Cumulative Incidence
Prevalence is the total number of cases of disease in a given population
A. Over time
B. At a specific time
B. At a specific time
A cross sectional study determines
A. Incidence
B. Prevalence
C. Correlation of risks
B. Prevalence
A method of calculating survival is
A. Direct method
B. Actuarial method
C. Life Table method
A. Direct method
Censored patients contribute to analysis only up to the time
A. The study ends
B. They are censored
C. The follow up period ends
B. They are censored
What kind of survival is an estimate of the probability of surviving all causes of death over a specific time period?
Observed survival
What kind of survival is the hypothetical probability of surviving cancer in the absence of other causes of death?
Net survival
What is the “event of interest” in a survival study?
The outcome being measured
What are two forms of net survival?
Relative survival and cause specific survival
What type of survival measure would you use to describe the observed mortality patterns in a cohort of patients?
Overall (observed) survival
What survival method would you use if your goal is to have reliable cause of death information, for example, for clinical trials?
Cause-specific survival
What survival method would you use if you were willing to have unreliable cause of death information in order to get accurate expected other-cause mortality from the general population for the cohort?
Relative survival
The choice of survival method analysis depends on
A. The number of people being studied
B. The purpose of the study
C. The disease being studied
B. The purpose of the study
Which survival rate underestimates survival from cancer because it treats all deaths equally?
Observed survival
What is an indirect way to obtain survival rates that does not rely on cause of death information?
Relative survival
What is often a barrier to calculating cause-specific survival rates in the cancer registry?
The registry obtains cause of death info from death certificates and determining a single cause of death may be difficult.
What is the first thing you should do before submitting data to the state registry or NCDB?
Contact the registry software provider for updates.
What are two important things to do just prior to submitting your files to the NCDB?
Have your COC Datalinks User ID and password available.
Note the name and location of the file in your computer.
What survival rates are calculated using the actuarial method, compounding survival in 1-month intervals from the date of diagnosis with death from any cause as the end point?
Observed survival
What survival rate is the ratio of the observed survival rate to the expected survival rate of persons of the same age, sex, and ethnic background?
Relative survival