Task 2 the characteristic score Flashcards

1
Q

Cases

A

are the objects described by a set of data (customers, companies, subjects in a study)

2
Q

Label

A

is a special variable used in some data sets to distinguish the different cases

3
Q

Variable

A

is a characteristic of a case

→different cases can have different values of the variables

4
Q

Categorical Variable

A

A categorical variable has no numerical meaning; it simply describes a quality or characteristic of a case. Any numbers used in a categorical variable designate a category rather than a measured quantity. For example, you could code gender as 1 for males and 2 for females, but you can't calculate with these numbers.

5
Q

Quantitative Variable

A

Quantitative variables are measured and expressed numerically, have numeric meaning and can be used for calculation. Not every number is quantitative, though: zip codes, for example, are written in digits, but these digits are only labels and you can't calculate with them.

6
Q

Nominal Variable

A

Nominal variables are categorical. Their categories are equal; in other words, the categories don't differ in terms of order. You can't put them in order and you can't calculate with them, e.g. gender (male, female, transgender) or eye colour (blue, green, brown, hazel).

7
Q

Ordinal variable

A

Ordinal variables are categorical variables whose categories have a clear ordering, e.g. education status: middle school, high school, college. These can be coded 1, 2 and 3, and the codes have a clear order. Economic status: low, middle, high; again 1, 2 and 3 in a clear order.

8
Q

Interval Variable

A

Interval variables are quantitative variables whose values are equally spaced: the same difference between two values means the same amount everywhere on the scale. E.g. for three people's incomes of 15,000, 20,000 and 25,000, the step from 15,000 to 20,000 is the same size (5,000) as the step from 20,000 to 25,000.

9
Q

Distribution of variables

A

It tells us what values a variable takes and how often it takes these values

10
Q

Frequency table

A

Lists each value of a variable together with how often it occurs. You can see the peak, or two peaks (bimodal). Used for categorical variables (nominal/ordinal).
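
As a rough illustration of a frequency table, the sketch below counts how often each value of a nominal variable occurs. The eye-colour data and the variable names are made up for this example.

```python
from collections import Counter

# Hypothetical data: eye colours of eight people (a nominal variable).
eye_colours = ["blue", "brown", "brown", "green", "blue", "brown", "hazel", "brown"]

# Count how often each value occurs.
frequency_table = Counter(eye_colours)

# Print the table, most frequent category first.
for colour, count in frequency_table.most_common():
    print(f"{colour:<6} {count}")
```

The category with the highest count (here "brown") is the peak of the distribution.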

11
Q

Pie chart

A

We can see the proportions, and which category is the largest and which the smallest
Used for nominal variables

12
Q

Bar chart

A

Shows the frequency of each category as a bar. Mostly used with categorical variables: preferred for ordinal ones, but also applicable to nominal variables.

13
Q

Stem-and-leaf plot

A

We can see the peak and the outliers; it is basically just a frequency table turned on its side
Used for both kinds of quantitative variables, i.e. interval and ratio

14
Q

Distribution: Shapes

A

Trends, Peak, Outlier

15
Q

skewed distribution

A

A frequency distribution in which most scores fall in categories above or below the middle

16
Q

Histogram

A

useful for quantitative variables

17
Q

Mode (MO)

A

Simply the score that occurs most often

Tells us something about the frequency of the scores
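
A small sketch of finding the mode with Python's standard library; the scores are invented for illustration. `statistics.multimode` returns every most frequent value, so it also covers the bimodal case.

```python
from statistics import multimode

# Hypothetical scores; 7 occurs three times, more often than any other score.
scores = [2, 3, 3, 5, 7, 7, 7, 9]
modes = multimode(scores)
print(modes)  # [7]

# A bimodal example: 1 and 3 both occur twice.
bimodal_modes = multimode([1, 1, 2, 3, 3])
print(bimodal_modes)  # [1, 3]
```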

18
Q

Median (M)

A

Lies in the middle if we order the scores from highest to lowest
Take the total number of scores (N), add 1 and divide by 2 to obtain the rank of the middle score: (N + 1)/2 = middle score rank
Count up to that rank and take the value of the score there; if the rank ends in .5, take the average of the two scores on either side of it
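
The rank rule can be sketched as follows. The function name `median_by_rank` and the example scores are assumptions made for this illustration; when the rank falls halfway between two positions, the two neighbouring scores are averaged.

```python
import math

def median_by_rank(scores):
    """Median via the (N + 1) / 2 rank rule (hypothetical helper)."""
    ordered = sorted(scores)
    n = len(ordered)
    rank = (n + 1) / 2                        # 1-based rank of the middle score
    lower = ordered[math.floor(rank) - 1]     # score at the rank, rounded down
    upper = ordered[math.ceil(rank) - 1]      # score at the rank, rounded up
    return (lower + upper) / 2                # same score twice when N is odd

print(median_by_rank([7, 2, 5, 4, 8]))     # N = 5, rank 3   -> 5.0
print(median_by_rank([2, 4, 4, 5, 7, 8]))  # N = 6, rank 3.5 -> 4.5
```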

19
Q

Arithmetic mean (x̄)

A

Add up all the separate scores and divide the sum by the total number of scores (N)
x̄ = (∑ x_i) / N
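
A minimal worked example of the mean, using made-up scores:

```python
# Hypothetical scores for illustration.
scores = [2, 4, 4, 5, 7, 8]

n = len(scores)          # N = 6
mean = sum(scores) / n   # (2 + 4 + 4 + 5 + 7 + 8) / 6 = 30 / 6
print(mean)  # 5.0
```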

20
Q

Sum of squares (variation)

A

Calculate the difference of each score from the mean and square all the differences before adding them up
∑(x_i - x̄)^2
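
The sum of squares can be worked through step by step. The scores below are invented for illustration.

```python
# Hypothetical scores for illustration.
scores = [2, 4, 4, 5, 7, 8]
mean = sum(scores) / len(scores)                   # 5.0

# Difference of each score from the mean.
deviations = [x - mean for x in scores]            # [-3.0, -1.0, -1.0, 0.0, 2.0, 3.0]

# Square every difference, then add them up.
sum_of_squares = sum(d ** 2 for d in deviations)   # 9 + 1 + 1 + 0 + 4 + 9
print(sum_of_squares)  # 24.0
```

Squaring matters because the raw differences from the mean always add up to 0.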

21
Q

Variance

A

Divide the variation (sum of squares) by the total number of scores (N) minus 1
s_x^2 = (∑(x_i - x̄)^2) / (N - 1) = variance
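
As a sketch, the variance of some made-up scores can be computed by hand and compared with `statistics.variance` from Python's standard library, which uses the same N - 1 denominator.

```python
import statistics

# Hypothetical scores for illustration.
scores = [2, 4, 4, 5, 7, 8]
n = len(scores)
mean = sum(scores) / n

# Variation (sum of squares) divided by N - 1.
sum_of_squares = sum((x - mean) ** 2 for x in scores)   # 24.0
variance = sum_of_squares / (n - 1)                     # 24 / 5
print(variance)                     # 4.8
print(statistics.variance(scores))  # same result from the library
```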

22
Q

Standard deviation

A

Take the square root of the variance

s_x = √((∑(x_i - x̄)^2) / (N - 1))
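
A short sketch of the standard deviation as the square root of the variance, again on invented scores; `statistics.stdev` is used only as a cross-check.

```python
import math
import statistics

# Hypothetical scores for illustration.
scores = [2, 4, 4, 5, 7, 8]
n = len(scores)
mean = sum(scores) / n

variance = sum((x - mean) ** 2 for x in scores) / (n - 1)   # 4.8
sd = math.sqrt(variance)
print(round(sd, 3))                         # 2.191
print(round(statistics.stdev(scores), 3))   # same value from the library
```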

23
Q

Split up in quartiles (Q1, Q2, Q3)

A
first quartile (Q1): 25% of all scores lie below it and 75% above
second quartile (Q2): the median
third quartile (Q3): 75% of all scores lie below it and 25% above
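
Several conventions exist for computing quartiles. The sketch below assumes one common textbook rule that the cards themselves do not specify: Q1 is the median of the scores below the median's position and Q3 is the median of the scores above it. The helper name `quartiles` and the data are made up.

```python
from statistics import median

def quartiles(scores):
    """Return (Q1, Q2, Q3) using the median-of-halves rule (an assumed convention)."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]            # scores below the median position
    upper_half = ordered[half + n % 2:]    # scores above the median position
    return median(lower_half), median(ordered), median(upper_half)

q1, q2, q3 = quartiles([1, 3, 4, 6, 7, 9, 10, 12])
print(q1, q2, q3)  # 3.5 6.5 9.5
```

Other conventions (for example interpolation-based ones) can give slightly different quartiles for the same data.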
24
Q

IQR (interquartile range)

A

The middle half of the scores lies between Q1 and Q3

IQR = Q3 - Q1

25
Q

Five Number Summary

A

The lowest score (minimum), Q1, the median, Q3, and the highest score (maximum). You can leave out the outliers.
→can be summarised in a boxplot (each horizontal bar indicates a number from the summary)

26
Q

1,5*IQR criterion

A

Used to identify outliers:
Q1 - (1,5*IQR) = every score below the outcome is an outlier
Q3 + (1,5*IQR) = every score above the outcome is an outlier
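
The criterion can be sketched as a small function. The name `iqr_outliers` and the data are hypothetical, and Q1 and Q3 are passed in as already-computed values (here 3.5 and 9.5, which is what a median-of-halves rule gives for these scores).

```python
def iqr_outliers(scores, q1, q3):
    """Return the scores flagged as outliers by the 1.5 * IQR criterion."""
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr   # everything below this is an outlier
    upper_fence = q3 + 1.5 * iqr   # everything above this is an outlier
    return [x for x in scores if x < lower_fence or x > upper_fence]

# Hypothetical scores; IQR = 6, so the fences are -5.5 and 18.5.
scores = [1, 3, 4, 6, 7, 9, 10, 30]
outliers = iqr_outliers(scores, q1=3.5, q3=9.5)
print(outliers)  # [30]
```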

27
Q

Linear Transformation

Centering

A

Shift all the scores such that the mean of the scale becomes 0
→subtract the mean from all scores →C = X - x̄

Shape of distribution is not affected at all

28
Q

Z-scores

A

Z-scores indicate how many standard deviations a measurement lies above or below the mean
z = (x_i - x̄)/s_x

29
Q

Centring

A

Shift all the scores such that the mean of the scale becomes 0
→subtract the mean from all scores →C = X - x̄
Shape of distribution is not affected at all
Standard deviation does not change
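
A quick check of these claims on made-up scores: after centring, the mean is 0 while the standard deviation stays the same.

```python
import statistics

# Hypothetical scores for illustration.
scores = [2, 4, 4, 5, 7, 8]
mean = sum(scores) / len(scores)       # 5.0

# C = X - mean, applied to every score.
centred = [x - mean for x in scores]   # [-3.0, -1.0, -1.0, 0.0, 2.0, 3.0]

centred_mean = statistics.mean(centred)
print(centred_mean)                    # 0.0
print(statistics.stdev(scores))        # standard deviation before centring
print(statistics.stdev(centred))       # unchanged after centring
```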

30
Q

Standardising

A

Transform the scores so that the scale has a mean of 0 and a standard deviation of 1
The results are so-called z-scores
→z = (x_i - x̄)/s_x
Z-scores indicate how many standard deviations a measurement lies above or below the mean
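
Standardising can be sketched like this, using invented scores and the sample standard deviation (N - 1 denominator) from Python's `statistics` module.

```python
import statistics

# Hypothetical scores for illustration.
scores = [2, 4, 4, 5, 7, 8]
mean = statistics.mean(scores)   # 5
sd = statistics.stdev(scores)    # about 2.19

# z = (x_i - mean) / sd for every score.
z_scores = [(x - mean) / sd for x in scores]

print([round(z, 2) for z in z_scores])
print(statistics.mean(z_scores))    # approximately 0
print(statistics.stdev(z_scores))   # approximately 1
```

A score equal to the mean (here 5) gets a z-score of exactly 0.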

31
Q

Multiplying

A

• We multiply all the scores by a certain number

→e.g. for every litre sold you get $20, so you multiply your X by 20
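
To illustrate the litre example, the sketch below uses made-up sales figures. It also shows that multiplying every score by 20 multiplies both the mean and the standard deviation by 20.

```python
import statistics

# Hypothetical litres sold on three days.
litres = [2, 4, 6]

# $20 per litre: multiply every score by 20.
revenue = [20 * x for x in litres]   # [40, 80, 120]

print(statistics.mean(litres), statistics.mean(revenue))     # 4 80
print(statistics.stdev(litres), statistics.stdev(revenue))   # 2.0 40.0
```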

32
Q

order of mean median mode in view of skewing

A

Right skewed: Mean > Median > Mode

Left skewed: Mean < Median < Mode
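
A small right-skewed data set, invented for illustration, shows the ordering: a few high scores pull the mean up the most, the median less, and the mode not at all.

```python
import statistics

# Hypothetical right-skewed scores: most are low, a few are high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]

mode_value = statistics.mode(scores)       # 2   (most frequent score)
median_value = statistics.median(scores)   # 3.0 (average of 5th and 6th score)
mean_value = statistics.mean(scores)       # 4.5 (45 / 10)

print(mean_value > median_value > mode_value)  # True
```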