1.3 Understanding your data set Flashcards

1
Q

Observation

A

An observation is an instance of some group of interest and is generally represented as a row of a worksheet. If you had data on individual students, for example, each student would count as a single observation. If you had data on different songs, each song would count as a single observation.

2
Q

characteristics

A

are what each column within a data set represents. In Figure 2, for instance, each observation, or student, has only a single characteristic: “Classroom.” In the second image, another characteristic is added: “Semester.” While both of these examples are quite simple, in principle, you could have any number of characteristics.
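As a rough illustration of the row/column idea, here is a minimal Python sketch. The student records and their values are made up for the example; only the characteristic names "Classroom" and "Semester" come from the card.

```python
# Each dict is one observation (a student); each key is a characteristic (a column).
# The values below are hypothetical and only serve to illustrate the layout.
students = [
    {"Classroom": "A", "Semester": "Fall"},
    {"Classroom": "B", "Semester": "Fall"},
    {"Classroom": "A", "Semester": "Spring"},
]

n_observations = len(students)             # number of rows
characteristics = list(students[0].keys())  # column names

print(n_observations, characteristics)
```

Adding another characteristic would mean adding another key to every record, which is the same as adding a column to the worksheet.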

3
Q

Discrete

A

Discrete data can only be represented through whole numbers (e.g., the number of students in a class or the number of animals in a zoo). You couldn’t have half of a student, or, say, 0.378 of a leopard (unless you’re looking at data from some kind of horror movie!).

4
Q

Continuous

A

Continuous data is measured along a scale and can take any point along that scale as its value (e.g., temperature readings such as 94.5 or 93.5).

5
Q

Nominal

A

Nominal data has categories with no order. If you were categorizing cars, for example, you could have categories for each manufacturer (e.g., Honda, Ford, Toyota, etc.). As none of these categories is greater or less than the others, there’s no implicit order to how you might organize them.
Honda, Ford, Toyota

6
Q

Ordinal

A

Data in which there exists an implicit order to the way it’s organized, for example:
Small, Medium, Large
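One way to see the difference from nominal data is that ordinal labels can be sorted meaningfully once the order is written down. The sketch below assumes the Small/Medium/Large example from this card; the `shirts` list is hypothetical.

```python
# The implicit order of the ordinal categories, made explicit.
SIZE_ORDER = ["Small", "Medium", "Large"]
rank = {label: position for position, label in enumerate(SIZE_ORDER)}

# Hypothetical observations of an ordinal characteristic.
shirts = ["Large", "Small", "Medium", "Small"]

# Sort by rank in the category order rather than alphabetically.
sorted_shirts = sorted(shirts, key=lambda label: rank[label])
print(sorted_shirts)
```

Nominal categories such as car manufacturers have no equivalent of `SIZE_ORDER`, so any ordering of them would be arbitrary.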

7
Q

Binary

A

Data that is categorized into exactly two groups (e.g., good/bad or yes/no).

8
Q

Discrete

A

Number of Students

9
Q

Continuous

A

Temperature

10
Q

Continuous

A

Temperature

11
Q

data imputation

A

involves substituting an estimated value for a missing value. There are various approaches to making the estimation: averaging the non-missing values, taking the most common of the non-missing values, or even taking a random value from the non-missing data. At the end of the day, the analyst needs to decide carefully whether to remove rows with missing data or to impute values for the missing data.
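The three estimation approaches named on this card can be sketched in a few lines of Python. This is only an illustration under stated assumptions: missing values are represented as `None`, the helper name `impute` and the `scores` data are hypothetical, and only the standard library is used.

```python
import random
import statistics

def impute(values, strategy="mean", rng=None):
    """Return a copy of `values` with each None replaced by an estimate.

    strategy: "mean"   -> average of the non-missing values
              "mode"   -> most common non-missing value
              "random" -> a randomly chosen non-missing value
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: every value is missing")
    rng = rng or random.Random()
    if strategy == "mean":
        return [statistics.mean(observed) if v is None else v for v in values]
    if strategy == "mode":
        return [statistics.mode(observed) if v is None else v for v in values]
    if strategy == "random":
        return [rng.choice(observed) if v is None else v for v in values]
    raise ValueError(f"unknown strategy: {strategy!r}")

# Hypothetical test scores with two missing entries.
scores = [80, None, 90, 90, None]
mean_filled = impute(scores, "mean")
mode_filled = impute(scores, "mode")
random_filled = impute(scores, "random", rng=random.Random(0))
```

Which strategy is appropriate depends on the data; the sketch only shows the mechanics, not a recommendation.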

12
Q

What should you do if the missing data is random?

A

Remove the rows that contain missing data.
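Dropping incomplete rows might look like the following Python sketch. It assumes missing values are stored as `None`; the `rows` data is hypothetical.

```python
# Hypothetical data set: one row (observation) per student.
rows = [
    {"student": "A", "score": 80},
    {"student": "B", "score": None},  # missing value
    {"student": "C", "score": 90},
]

# Keep only the rows in which every characteristic has a value.
complete_rows = [
    row for row in rows
    if all(value is not None for value in row.values())
]
print(complete_rows)
```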

13
Q

If the missing data isn’t random?

A

If the missing data isn’t random, however, it’s usually better to impute. This will keep you from introducing bias into your data set.

14
Q

Population

A

The domain of interest (e.g., women between 25 and 35 years of age).

15
Q

sample

A

A subset of the population of interest.

16
Q

There will always be a gap between inferences you make on the basis of your sample and what’s actually true of the population. This could be described as

A

Sampling error
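A small simulation can make the gap concrete. The sketch below builds a synthetic population of ages between 25 and 35 (made-up numbers, not real data), draws a random sample, and compares the two means. The population size, sample size, and seed are arbitrary assumptions.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic population: 10,000 ages drawn uniformly between 25 and 35.
population = [rng.uniform(25, 35) for _ in range(10_000)]

# A sample is a subset of the population, here 50 members chosen at random.
sample = rng.sample(population, 50)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

# The sampling error is the gap between the sample estimate and the true value.
sampling_error = sample_mean - population_mean
print(f"population mean: {population_mean:.3f}")
print(f"sample mean:     {sample_mean:.3f}")
print(f"sampling error:  {sampling_error:+.3f}")
```

Rerunning with a different seed or sample size changes the size of the gap, but some gap is generally expected whenever a sample stands in for the whole population.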

17
Q

Characteristics are variables that describe…

A

observations

18
Q

Qualitative data can be divided into more specific types of data. What are they?

A

Binary, nominal, and ordinal.

19
Q

Quantitative data can be measured as

A

Discrete and continuous

20
Q

Characteristic

A

Column