CH17 Biostatistics Flashcards by Jesse Rennie

______ is the use of data analysis and interpretation in health care research.

Biostatistics

______ involves the application of statistical tests to the data in order to organize, describe, summarize, and analyze it to answer a research question or test a hypothesis.

It also explains results and requires that ________ be used to explain the meaning and application of the findings, identifies possible factors that could have influenced the results, and draws inferences to the population.

Data Analysis; critical thinking

Dental hygienists should know the research process in order to understand the epidemiology of disease, practice therapies, implement programs, and practice _________ dentistry.

evidence-based

An insufficient number of subjects, too short of a duration, as well as the use of incorrect measurement instruments, incorrect procedure utilization & incorrect statistical tests are all causes of ________.

invalid research

What are some examples of Nominal scale data?

(Unordered categories)

Male/Female

Smoker/Non-smoker

(Qualitative categories)

What are some examples of Ordinal scale data?

(Ordered categories)

Mutually exclusive categories:

1, 2, 3, 4, 5

IOTN

Minimal, Moderate, Severe, Unbearable pain

(Each of the above has data that exclude all other data in the data set)

_______ data are a scale of measurements that contain all of the characteristics of the preceding scales.

This data is quantitative and has an absolute zero point (0 means there is an absence).

Some examples are height, weight, duration, and number of teeth/sealants.

Ratio Scale Data

Data that are represented by numbers would be considered _________. These data can be expressed as counts, percentages, and means of something.

Examples of this in DH are pocket depths, # of DMFT, and time spent scaling.

quantitative data

Asks the question HOW MANY

Data that reflect the quality or nature of variables and cannot be expressed numerically are called ________ data. They are expressed as outcomes or states, can be counted for reporting, and their variables can be rank ordered.

An example of this in DH is tissue color, tenacity of calculus, and what patients liked most & least about visit.

Qualitative Data

Asks the question, WHAT KIND?

What are some examples of Continuous Variables?

Height in cm

pocket depth in mm

Age

Time

(Example of age: 25 years, 10 months, 2 days, 5 hours, 4 seconds, 4 milliseconds, 8 nanoseconds, 99 picoseconds…and so on.)

What are some examples of data that are Discrete Variables?

Number of visits to the dentist

DMF

______ is a type of data that has no numeric representation; therefore, it is qualitative in nature.

Ex: male/female, freshman/sophomore/jr/sr, eye color, race

Categorical Variable Data

_______ data are categorical variable data that place subjects into ONLY two groups/categories. The variable takes on one of only two possible values when observed or measured and is qualitative in nature.

Ex: M/F, yes/no, T/F

Dichotomous Variable Data

What are the 3 Categorical Data Categories?

(Qualitative Data Categories)
*Double Check!*

Nominal
Ordinal
Dichotomous

Name the 4 Numerical Data Categories.

(Quantitative Data Categories)

Discrete
Continuous
Interval
Ratio

__________ allow raw data to be organized and summarized in a meaningful way so that a pattern can emerge.

This type of statistics always precedes ________.

Descriptive statistics; inferential statistics

(If raw data was just presented it would be hard to visualize what was being seen. By using descriptive statistics we can see data in a meaningful way.)

_______ are used when researchers want to study something but do not have access to the entire population (or total). The conclusions drawn are ________.

Because of this limitation a sample of the population is taken and studied.

Inferential statistics; generalizations

What measure of central tendency is an average used with continuous data?

It is appropriately used for ratio and interval data.

Mean

What measure of central tendency is a midpoint of data when placed in ascending or descending order?

If there are an even amount of numbers, the ____ of the two middle numbers must be taken.

Its appropriate use is for ordinal data.

Median; mean

Calculate the Mean of the following:

2 + 3 + 3 + 5 + 7 + 10 = 30

30 ÷ 6 = 5

Mean = 5
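
The same mean calculation can be checked with a short Python sketch. The variable names are illustrative only and are not part of the original card.

```python
from statistics import mean

# Scores from the card above
scores = [2, 3, 3, 5, 7, 10]

total = sum(scores)          # add every score together
count = len(scores)          # how many scores there are
manual_mean = total / count  # sum divided by count

print(total, count)          # 30 6
print(manual_mean)           # 5.0
print(mean(scores) == manual_mean)  # True: the library agrees with the hand calculation
```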

Calculate the median of the following numbers:

3, 2, 5, 10, 3, 7

In order to calculate the median the numbers must be placed in ascending order.

2, 3, 3, 5, 7, 10

(the median point is when ½ the data is above and ½ the data is below)

NO MIDPOINT!?!? (An even count has no single middle value, so take the mean of the two middle numbers.)

(3 + 5) ÷ 2 = 8 ÷ 2 = 4

Median = 4

Calculate the median of the following numbers:

7, 3, 2, 3, 5, 4, 10

In order to calculate the median the numbers must be placed in ascending order.

2, 3, 3, 4, 5, 7, 10

Median = 4

(the midpoint)
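
Both median cards (even count and odd count) follow the same steps, which can be sketched in Python. The helper name `median_by_hand` is made up for illustration; `statistics.median` is the standard-library equivalent.

```python
from statistics import median

def median_by_hand(values):
    """Sort the values, then take the middle one (or the mean of the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists
        return ordered[mid]
    # Even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

even_set = [3, 2, 5, 10, 3, 7]     # six numbers, no single midpoint
odd_set = [7, 3, 2, 3, 5, 4, 10]   # seven numbers, one midpoint

print(median_by_hand(even_set))    # 4.0
print(median_by_hand(odd_set))     # 4
print(median(even_set) == median_by_hand(even_set))  # True
```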

What measure of central tendency is concerned with the value that occurs most often? It is used in all types of data.

Its appropriate use is for nominal data.

mode

Calculate the Mode of the following numbers:

2, 3, 3, 5, 7, 10

2, 3, 3, 5, 7, 10

Mode = 3
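
As a quick check, Python's standard library can find the most frequent value. The second list (`tied_scores`) is a made-up example, not from the deck, included only to show what happens when two values tie for most frequent.

```python
from statistics import mode, multimode

scores = [2, 3, 3, 5, 7, 10]      # data from the card above
print(mode(scores))               # 3, the only value that appears twice

# Hypothetical data with two values tied for most frequent
tied_scores = [1, 1, 2, 2, 3]
print(multimode(tied_scores))     # [1, 2]
```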

What is the goal of using the **measures of central tendency**?

To take a collection of data and identify the middle of the data collected. *A **measure of central tendency** is a _single value_ that attempts to describe a set of data by identifying the _central position_ within that set of data.*

Name the 3 **Measures of Central Tendency**.

Mean, Median and Mode

What two data categories are **numerical**?

Discrete and Continuous

Define: **Discrete Variable**

A discrete variable is _counted_ and takes a _finite_ number of values.

**Descriptive Statistics** are used to summarize data in a meaningful way. There are generally two MAIN types of statistics used to describe data. Name them.

1. Measures of Central Tendency *(Mean, Median, Mode)*
2. Measures of Dispersion *(Range, Variance, Standard Deviation)*

*Though not a type of statistic, graphs, histograms, and charts are also used to **describe** and **summarize** data.*

\_\_\_\_\_\_\_ communicates how much variation is present in a group of data. In statistics, this is a way of _describing_ how spread out a set of data is. (Range, Variance, Standard Deviation)

**Measures of Dispersion** *(aka Measures of Variability)*

**Measures of dispersion** communicate how much _variation_ is present in a group of data. Name the three measures that are used to _describe_ the dispersion of a group of data.

1. Range 2. Variance 3. Standard deviation

What measure of dispersion is determined by subtracting the lowest score from the highest score? It is the simplest and least helpful measurement and is usually reported with the median.

Range

What represents the average squared distance of each score from the mean, is associated with standard deviation, and is the most common and useful measure of dispersion? It is usually reported with the mean to calculate data intervals. Its value, or the SD in relation to the mean, depicts the distribution of scores.

Variance *(It measures _how far_ each number in the set is from _the mean_ and therefore **from every other number in the set**.)* Square root of the variance = standard deviation.

When is Standard Deviation used and how is it determined?

**Standard deviation** is used when determining _how spread out_ the numbers are _around the mean_. *(Used with Quantitative Continuous Data)* Square root of the variance = standard deviation.
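
The three measures of dispersion can be worked through on the same scores used in the mean and median cards. This is a minimal sketch; it assumes the sample formulas (dividing by n − 1), since the deck does not say whether the sample or population version is intended.

```python
from math import sqrt
from statistics import mean, variance, stdev

scores = [2, 3, 3, 5, 7, 10]

# Range: highest score minus lowest score
data_range = max(scores) - min(scores)

# Variance by hand: squared distances from the mean, summed, divided by n - 1
# (assumption: sample variance; a population variance would divide by n)
m = mean(scores)
squared_distances = [(x - m) ** 2 for x in scores]
sample_variance = sum(squared_distances) / (len(scores) - 1)

# Standard deviation: square root of the variance
sample_sd = sqrt(sample_variance)

print(data_range)                    # 8
print(sample_variance)               # 9.2
print(round(sample_sd, 2))           # 3.03
print(round(variance(scores), 2), round(stdev(scores), 2))  # library versions, for comparison
```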

Define: **Standard Error of the Mean** *(I need your help with this one Jesse!)*

**Standard Error of the Mean** estimates how accurately the mean of _the sample_ represents (generalizes to) the mean of the entire _population_.
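
One common way to compute the standard error of the mean is the sample standard deviation divided by the square root of the sample size. The sketch below applies that formula to the scores used in the earlier cards; the formula is the standard textbook one, not something stated in this deck.

```python
from math import sqrt
from statistics import stdev

scores = [2, 3, 3, 5, 7, 10]

n = len(scores)
sample_sd = stdev(scores)          # sample standard deviation (n - 1 in the denominator)
sem = sample_sd / sqrt(n)          # standard error of the mean

print(round(sem, 2))               # 1.24

# A larger sample with the same spread gives a smaller standard error,
# which is why small samples generalize less reliably.
print(sample_sd / sqrt(4 * n) < sem)   # True
```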

A _________ is an asymmetrical curve distorted by a few extreme scores.

**Skewed Distribution**

A ________ shows how often something happened in a specific category. These tables may be _______ or _______. Example: how many times does the number 9 occur? 1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9.

**Frequency Distribution Table**
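
The example in the card can be tallied with a few lines of Python. The ungrouped table counts every distinct value; the grouped table uses two intervals (0–4 and 5–9) that are chosen here purely for illustration.

```python
from collections import Counter

data = [1, 2, 3, 4, 6, 9, 9, 8, 5, 1, 1, 9, 9, 0, 6, 9]

# Ungrouped frequency table: one row per distinct value
ungrouped = Counter(data)
for value in sorted(ungrouped):
    print(value, ungrouped[value])

print("The number 9 occurs", ungrouped[9], "times")   # 5 times

# Grouped frequency table: values collapsed into mutually exclusive intervals
# (the interval boundaries are an assumption made for this sketch)
grouped = Counter("0-4" if x <= 4 else "5-9" for x in data)
print(grouped["0-4"], grouped["5-9"])                  # 7 9
```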

Is this frequency distribution table grouped or ungrouped?

Grouped

Characteristics of Effective Tables

1. Accuracy 2. Simplicity 3. Clarity 4. Appearance 5. Well-Designed Structure

\_\_\_\_\_\_\_\_ is a relationship or association between variables that can be measured _mathematically_.

**Correlation**

A _______ is a relationship between _two variables_ in which both variables _move in the same direction._

Positive Correlation

A ________, or *inverse correlation*, is a relationship between _two variables_ that move _in opposite directions_.

**Negative Correlation**

"\_" signifies the correlation coefficient. Its value communicates the ______ and strength of the association.

"r"; direction

______ is a formal decision-making process of testing a hypothesis using statistical significance and inference, followed by interpreting the statistical results.

Hypothesis testing

A ________ is an initial negative statement of belief about the value of a population parameter. It rejects the research or alternative hypothesis.

Null hypothesis

Probability

Expressed as a "p" value *(AKA alpha 𝛼 level)*

A ______ is also called an alpha (α) error. It occurs when the null hypothesis is rejected but is actually true, so it should have been accepted. The probability of committing this error is the same as the alpha level. Researchers can control a type I error by setting the alpha level low. This type of error can be very costly.

Type I Error

A _____ error is also called a beta (β) error. It occurs when the null hypothesis is accepted but is actually false, so it should have been rejected. The exact probability of committing this type of error is generally unknown. These errors are caused by using too small a sample, unreliable measuring devices, or imprecise research methods.

type II

Chi-square test

External validity

Fewer than ____ subjects would make a research project invalid.

A _______ is not made up of distinct and separate units or categories; it is expressed by a large or infinite number of **measures along a continuum** and can be expressed in fractions or decimals. This type of data is considered quantitative and can be converted into nominal or ordinal scales.

continuous variable

_________ are data made up of distinct and separate units or categories that are counted only in whole numbers. These data are quantitative in nature because they are represented numerically. They can be converted to a nominal or ordinal scale.

Discrete Variable Data

What type of categorical variable data organizes its data into mutually exclusive categories that have no rank order, value, or numeric relationship between the different classifications? Ex: L/R handed, M/F, hair color

Nominal Scale

What type of categorical data organizes data into mutually exclusive categories that are rank ordered based on a criterion? In this type of data, the difference in rank is not equal in value. Ex: Poor/fair/good/excellent, shades of whiteness of teeth, calc class A-B-C-D

Ordinal Scale

What type of data has the characteristics of the ordinal scale and an equal distance between any two adjacent units of measurement? This type of data is quantitative in nature and does not have a meaningful zero point. Ex: temperature (0 degrees is colder than 90 degrees)

Interval Scale Data

Data summaries such as bar graphs, histograms, and pie charts; measures of central tendency such as mean, median, and mode; and measures of variability such as range, variance, and standard deviation are all considered _______ Statistics.

Descriptive

A mode value can be either _____ (consisting of 2 modes) or _____ (consisting of more than 2 modes).

bimodal; multimodal

The _________, also referred to as the ________, forms the theoretical foundation for comparisons and making statistical decisions. It is a symmetrical, unimodal, **bell-shaped curve** that explains why random variables tend to be normally distributed. ***The mean, median, and mode are equal in value.***

Normal Distribution; Gaussian Distribution

The ______ provides an estimation of the spread of data given the mean and the standard deviation of a data set that follows the standard normal distribution.

Empirical Rule

The Empirical rule says that \_\_\_% of data fall within one SD of the mean, \_\_\_% within two SD of the mean, and \_\_\_% within three SD of the mean.

68%; 95%; 99.7%
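
These percentages can be confirmed from the standard normal curve itself. The sketch below uses Python's built-in `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD:", round(share_within(k) * 100, 1), "%")
# 1 SD: 68.3 %
# 2 SD: 95.4 %
# 3 SD: 99.7 %
```

The rule's 68 and 95 are rounded versions of roughly 68.3 and 95.4.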

\_\_\_\_\_\_\_\_is the foundation of the \_\_\_\_\_\_\_\_.

Normal distribution; central limit theorem

What factor is most affected by a skewed distribution?

the mean

Skewed distributions can be _____ or \_\_\_\_\_\_.

positive or negative

Is this frequency distribution table grouped or ungrouped?

Ungrouped

An ________ frequency distribution table includes all the scores in the distribution and is good for fewer than 30 observations.

Ungrouped

A ____ frequency distribution table groups a set number of scores into mutually exclusive intervals, usually 5-10 intervals, which makes it easier to understand (those who got A’s, those who got B’s...).

grouped

A ____ is used to represent categorical data. Its length corresponds with the frequency of the value.

Bar graph

A _______ is similar to a bar graph, but the bars appear side by side and are touching. It is used to represent interval or ratio variables, grouped and ungrouped frequencies, and ordinal data that is treated as continuous data.

histogram

A _________ is a line graph that represents frequency data that are continuous in nature. It is drawn by connecting the midpoints of the bars of a histogram, then extending the line at both ends to imaginary midpoints at the right and left of the histogram. This graph represents grouped or ungrouped frequencies and can also represent frequency, percent, cumulative frequency, or cumulative percent.

frequency polygon

A \_\_\_\_\_\_\_is a line graph used to plot a variable over time.

Polygon

A ______ shows the relationship between two variables and how the level of one variable varies as the level of the other variable changes.

Scattergram

As it relates to correlation, the "r" value indicates the ______ of the relationship. As the value moves closer to +1 or -1, the relationship is stronger; as it moves closer to 0, the relationship is weaker. A value of +1 or -1 indicates a PERFECT relationship, while 0 indicates ZERO relationship.

strength
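
To make the direction and strength of "r" concrete, here is a minimal sketch that computes the Pearson correlation coefficient by hand. The function name and the three small data sets are invented for illustration; they are not from the deck.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two variables vary together
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable varies on its own
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

x = [1, 2, 3, 4]
same_direction = [2, 4, 6, 8]       # rises as x rises: perfect positive
opposite_direction = [8, 6, 4, 2]   # falls as x rises: perfect negative
no_pattern = [5, 1, 1, 5]           # no linear relationship with x

print(round(pearson_r(x, same_direction), 2))      # 1.0
print(round(pearson_r(x, opposite_direction), 2))  # -1.0
print(round(pearson_r(x, no_pattern), 2))          # 0.0
```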

A _________ can be used to quantify the relationship of two variables, and expresses the functional relationship between the variables. It is used to predict the score of one variable based on the score of another. Example: National board scores based on students’ GPA

regression analysis
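
The GPA example can be sketched as a simple least-squares line. All numbers below are hypothetical and deliberately chosen to fall on a perfect line so the arithmetic is easy to follow; real data would scatter around the line. The names `fit_line` and `predict` are illustrative.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: student GPA and national board score
gpa = [2.0, 2.5, 3.0, 3.5, 4.0]
board_score = [70, 75, 80, 85, 90]

slope, intercept = fit_line(gpa, board_score)

def predict(new_gpa):
    """Predicted board score for a given GPA, using the fitted line."""
    return intercept + slope * new_gpa

print(slope, intercept)        # 10.0 50.0
print(round(predict(3.2), 1))  # 82.0
```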

A _________ provides a mathematical model that gives the strength or ability of two or more variables to predict another variable. Examples: SAT scores, GPA strength

Multiple Regression Analysis

A _____ is called the alternative or positive hypothesis. It is the logical opposite of the null hypothesis and can indicate a direction of difference. Example: One brand of sealants does differ from another brand of sealants.

Research Hypothesis

The ______ is a probability value, also called alpha value or significance value. It represents the probability that the findings from the study are due to chance. In oral health research, a value equal to or smaller than 0.05 (p ≤ .05) is commonly accepted as statistically significant, so we reject the null hypothesis because we are confident that the statistical decision is correct. If this value is ______ than 0.05, the results are said to be not statistically significant, so we do not reject the null hypothesis.

p-value; larger

_________ are used for hypothesis testing when the data meet certain assumptions. The data must be classified as continuous (includes ratio, interval, and ordinal data).

Parametric Inferential Statistics

What are the types of parametric statistics?

Student t-test; Analysis of variance (ANOVA)

The ______ determines if a statistically significant difference exists between two mean scores.

T-test
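
As a rough illustration, the t statistic for two independent groups can be computed by hand. This sketch assumes the pooled-variance (Student) form of the test, and the two groups of scores are hypothetical. The Python standard library has no t distribution, so the result is compared with the tabled critical value of 2.306 (two-tailed, alpha = 0.05, 8 degrees of freedom) instead of computing a p-value.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical scores for two independent groups
group_a = [1, 2, 3, 4, 5]
group_b = [3, 4, 5, 6, 7]

n_a, n_b = len(group_a), len(group_b)
df = n_a + n_b - 2                      # degrees of freedom

# Pooled variance: weighted average of the two sample variances
pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df

# Standard error of the difference between the two means
se_diff = sqrt(pooled_var * (1 / n_a + 1 / n_b))

t_stat = (mean(group_a) - mean(group_b)) / se_diff

CRITICAL_T = 2.306                      # t table: two-tailed, alpha = 0.05, df = 8
significant = abs(t_stat) > CRITICAL_T

print(df, round(t_stat, 2))             # 8 -2.0
print(significant)                      # False: do not reject the null hypothesis
```

With these made-up numbers the means differ by 2, but the difference is not large enough relative to the spread and sample size to be statistically significant.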

\_\_\_\_\_ determines if statistically significant differences occur when comparing more than two mean scores and tells researchers that there is a difference among groups. It does not, however, specify which group is different.

ANOVA (Analysis of variance)

CH17 Biostatistics Flashcards

(81 cards)