General Definitions Flashcards

1
Q

DCOVA Framework

A

Define, Collect, Organize, Visualize, and Analyze data.

2
Q

What are data?

A

In statistics, data are “the values associated with a trait or property that help distinguish the occurrences of something”.

3
Q

Variable

A

A characteristic of an item or individual.

4
Q

Descriptive Statistics

A

Refer to methods that primarily help summarize and present data.

5
Q

Inferential Statistics

A

Refer to methods that use data collected from a small group to reach conclusions about a larger group.

6
Q

Big Data

A

Collections of data that cannot be easily browsed or analyzed using traditional methods.

7
Q

Categorical Variables

A

Take categories as their values (also known as qualitative variables).

8
Q

Numerical Variables

A

Have values that represent a counted or measured quantity (also known as quantitative variables).

9
Q

Discrete Variables

A

(Numerical variables) Have numerical values that arise from a counting process (e.g. the number of items purchased).

10
Q

Continuous Variables

A

(Numerical variables) Have numerical values that arise from a measuring process; the recorded values depend on the precision of the measuring instrument used (e.g. distance from home to store).

11
Q

Primary Data Source

A

Data collected on your own.

12
Q

Secondary Data Source

A

Data collected by someone else.

13
Q

Population

A

Consists of all the items or individuals about which you want to reach conclusions.

14
Q

Parameter

A

A measure that describes a characteristic of a population. When you analyze data from a population, you compute a parameter.

15
Q

Sample

A

A portion of a population selected for analysis.

16
Q

Statistics

A

Measures that describe a characteristic of a sample. When you analyze data from a sample, you compute statistics.

17
Q

Structured Data

A

Data that follow some organizing principle or plan, typically a repeating pattern (e.g. a table of rows and columns).

18
Q

Unstructured Data

A

Data having very little or no repeating structure or organization.

19
Q

Frame

A

A complete or partial listing of the items that make up the population from which the sample will be selected.

20
Q

Nonprobability Sample

A

Selecting items or individuals without knowing their probabilities of selection.
ADVANTAGES: convenience, speed, and low cost.
CANNOT be used for statistical inference.

21
Q

Probability Sample

A

Selecting items or individuals based on known probabilities.

22
Q

Convenience and Judgement Samples

A

Subcategories of nonprobability sample.
In a convenience sample you select items that are easy, inexpensive, or convenient to sample.
In a judgement sample you collect the opinions of preselected experts in the subject matter.

23
Q

Simple Random Sample

A

Subcategory of probability sample. It is the most elementary sampling technique. Every item from a frame has the same chance of selection as every other item, and every sample of a fixed size has the same chance of selection as every other sample of that size.
Sampling with replacement means that after you select an item, you return it to the frame, where it has the same probability of being selected again.
Sampling without replacement means that once you select an item, you cannot select it again.
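
The two modes can be sketched in Python with the standard library. This is only an illustration: the function name `simple_random_sample`, the frame of 100 numbered items, and the seed are assumptions, not part of the card.

```python
import random

def simple_random_sample(frame, n, with_replacement=False, seed=None):
    """Draw n items from the frame, each with an equal chance of selection."""
    rng = random.Random(seed)  # seeded only so the example is repeatable
    if with_replacement:
        # Each draw returns the item to the frame, so repeats are possible.
        return rng.choices(frame, k=n)
    # Once selected, an item cannot be selected again.
    return rng.sample(frame, n)

frame = list(range(1, 101))  # hypothetical frame of 100 numbered items
without = simple_random_sample(frame, 10, seed=1)
with_rep = simple_random_sample(frame, 10, with_replacement=True, seed=1)
```

Sampling without replacement always yields 10 distinct items here; sampling with replacement may repeat an item.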

24
Q

Systematic Sample

A

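Subcategory of probability sample. In a systematic sample, you partition the N items in the frame into n groups of k items, where k = N/n (rounded to the nearest integer). You select the first item at random from the first k items in the frame, then select the remaining n - 1 items by taking every kth item thereafter.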
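
A minimal sketch of systematic selection, assuming the usual procedure of a random start among the first k items followed by every kth item. The function name and the example frame are hypothetical, and k is taken as the integer part of N/n so the last index always stays inside the frame.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first k items, then every k-th item."""
    N = len(frame)
    k = N // n                  # group size (assumes n <= N)
    rng = random.Random(seed)
    start = rng.randrange(k)    # random position within the first group
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 101))     # hypothetical frame, N = 100
sample = systematic_sample(frame, 10, seed=3)  # n = 10, so k = 10
```

With N = 100 and n = 10, consecutive selected items are always exactly 10 positions apart.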

25
Q

Stratified Sample

A

Subcategory of probability sample. In a stratified sample, you first subdivide the N items in the frame into separate subpopulations, or strata. A stratum is defined by some common characteristic, such as gender or year in school. You select a simple random sample within each of the strata and combine the results from the separate simple random samples. Stratified sampling is more efficient than either simple random sampling or systematic sampling because you are ensured of the representation of items across the entire population.
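
As a rough illustration, the sketch below groups a frame into strata and draws a simple random sample inside each one. Sampling the same fraction from every stratum (proportional allocation) is an assumption made for this example, as are the function name and the student data.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_of, fraction, seed=None):
    """Simple random sample of the same fraction within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in frame:
        strata[stratum_of(item)].append(item)  # group items by stratum
    sample = []
    for members in strata.values():
        size = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, size))  # SRS inside the stratum
    return sample

# Hypothetical frame: 60 freshmen and 40 seniors, stored as (id, year).
frame = [(i, "freshman") for i in range(60)]
frame += [(i, "senior") for i in range(60, 100)]
sample = stratified_sample(frame, stratum_of=lambda s: s[1],
                           fraction=0.10, seed=7)
```

Every stratum is guaranteed to appear in the result: here 6 freshmen and 4 seniors.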

26
Q

Cluster Sample

A

Subcategory of probability sample. In a cluster sample, you divide the N items in the frame into clusters that contain several items. Clusters are often naturally occurring groups, such as counties, election districts, city blocks, households, or sales territories. You then take a random sample of one or more clusters and study all items in each selected cluster.
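
A small sketch of the same idea: choose whole clusters at random, then keep every item in the chosen clusters. The function name, the city-block labels, and the household IDs are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose clusters, then study all items in each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names
    items = [item for name in chosen for item in clusters[name]]
    return chosen, items

# Hypothetical frame: households grouped by city block.
clusters = {
    "block A": ["h1", "h2", "h3"],
    "block B": ["h4", "h5", "h6"],
    "block C": ["h7", "h8", "h9"],
    "block D": ["h10", "h11", "h12"],
}
chosen, sampled = cluster_sample(clusters, 2, seed=5)
```

Unlike a stratified sample, only the selected clusters contribute items, and each contributes all of them.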

27
Q

Coverage Error

A

Coverage error occurs if certain groups of items are excluded from the frame so that they have no chance of being selected in the sample or if items are included from outside the frame. Coverage error results in a selection bias.

28
Q

Nonresponse Error

A

Nonresponse error arises from failure to collect data on all items in the sample and results in a nonresponse bias.

29
Q

Sampling Error

A

Sampling error reflects the variation, or “chance differences,” from sample to sample, based on the probability of particular individuals or items being selected in the particular samples. The “margin of error” reported with survey or poll results is the sampling error.
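
The chance differences can be seen in a quick simulation. Everything here is an assumption made for illustration: a synthetic population of 10,000 values, samples of size 100, and 200 repeated draws.

```python
import random
import statistics

rng = random.Random(0)  # seeded only so the example is repeatable

# Synthetic population; its mean is the parameter.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.mean(population)

# Each sample mean is a statistic; it differs from mu by chance.
sample_means = [
    statistics.mean(rng.sample(population, 100)) for _ in range(200)
]
errors = [m - mu for m in sample_means]
spread = statistics.stdev(sample_means)  # typical size of the chance difference
```

The sample means scatter around the population mean even though every sample was drawn the same way; that scatter is the sampling error.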

30
Q

Measurement Error

A

Certain information is impossible or impractical to obtain directly. When surveys rely on self-reported information, the mode of data collection, the respondent to the survey, and/or the survey itself can be possible sources of measurement error.