Task 2 the characteristic score Flashcards
Cases
are the objects described by a set of data (e.g. customers, companies, subjects in a study)
Label
is a special variable used in some data sets to distinguish the different cases
Variable
is a characteristic of a case
→ different cases can have different values of a variable
Categorical Variable
A categorical variable has no numerical meaning; it simply describes a quality or characteristic of a case. Any numbers in a categorical variable designate a category rather than a measured quantity. For example, you could code gender as 1 for male and 2 for female, but you can't calculate with these numbers.
Quantitative Variable
Quantitative variables are measured and expressed numerically, have numeric meaning, and can be used in calculations. Not everything written in digits is quantitative: zip codes, for example, are only labels, and you can't calculate with them.
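The difference between the two cards above can be sketched in Python. The data and the coding (1 = male, 2 = female) are made up for illustration: counting the categories is meaningful, averaging the codes would not be, while averaging a quantitative variable is fine.

```python
from collections import Counter
from statistics import mean

# Hypothetical data: gender is coded 1 = male, 2 = female (labels, not measurements)
gender_codes = [1, 2, 2, 1, 2]
gender_labels = {1: "male", 2: "female"}

# Hypothetical incomes: a quantitative variable with numeric meaning
incomes = [15000, 20000, 25000, 18000, 22000]

# Counting how often each category occurs is meaningful for a categorical variable
gender_counts = Counter(gender_labels[code] for code in gender_codes)

# Averaging is meaningful for a quantitative variable
mean_income = mean(incomes)

print(dict(gender_counts))
print(mean_income)
```

Taking mean(gender_codes) would run without error, but the result would have no interpretation, because the codes are only labels.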
Nominal Variable
Nominal variables are categorical. Their categories are equal, in other words they have no inherent order, so you can neither rank them nor calculate with them, e.g. gender (male, female, transgender) or eye colour (blue, green, brown, hazel).
Ordinal Variable
Ordinal variables are categorical variables whose categories have a clear order. E.g. education status: middle school, high school, college can be coded 1, 2 and 3, and the codes follow a clear order. Economic status: low, middle, high can again be coded 1, 2 and 3 in a clear order.
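As a minimal sketch, the ordering of an ordinal variable can be made explicit by mapping each category to its rank. The names EDUCATION_ORDER and rank are hypothetical, and the codes 1, 2, 3 only express order, not a measured distance between the levels.

```python
# Hypothetical ordered categories, lowest to highest
EDUCATION_ORDER = ["middle school", "high school", "college"]

# Map each category to its rank: 1, 2, 3
rank = {level: position + 1 for position, level in enumerate(EDUCATION_ORDER)}

people = ["college", "middle school", "high school", "college"]

# Sorting by rank is meaningful for ordinal data (it would not be for nominal data)
sorted_people = sorted(people, key=rank.get)

print(sorted_people)
```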
Interval Variable
Interval variables are quantitative. Their values are equally spaced, in other words the distance between neighbouring values is always the same. E.g. three people's incomes of 15,000, 20,000 and 25,000: the interval is always 5,000.
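The arithmetic in the income example can be checked with a short sketch. The helper name gaps is hypothetical; it simply computes the distance between neighbouring values.

```python
def gaps(values):
    """Return the distances between neighbouring values after sorting."""
    ordered = sorted(values)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]

incomes = [15000, 20000, 25000]
income_gaps = gaps(incomes)

# The values are equally spaced if every gap is identical
equally_spaced = len(set(income_gaps)) == 1

print(income_gaps, equally_spaced)
```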
Distribution of variables
It tells us what values a variable takes and how often it takes these values
Frequency table
Lists each value together with how often it occurs. You can see the peak, or two peaks if the distribution is bimodal. Used for categorical variables (nominal/ordinal).
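A frequency table can be built in a few lines. This is a minimal sketch using made-up eye-colour data; the function name frequency_table is hypothetical.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, count, proportion) rows, most frequent value first."""
    counts = Counter(values)
    total = len(values)
    return [(value, n, n / total) for value, n in counts.most_common()]

# Hypothetical observations of a nominal variable
eye_colours = ["blue", "brown", "brown", "green", "brown", "blue"]
table = frequency_table(eye_colours)

for value, n, share in table:
    print(f"{value:<6} {n}  {share:.0%}")
```

The first row is the peak of the distribution, i.e. the category that occurs most often.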
Pie chart
We can see the proportions, i.e. which category makes up the largest share and which the smallest. Used for nominal variables.
Bar chart
Shows the frequency of each category as a bar. Mostly used with categorical variables: preferred for ordinal ones, but also applicable to nominal variables.
Stem-and-leaf plot
We can see the peak and the outliers; it is basically a frequency table turned on its side. Used for both kinds of quantitative variable, i.e. interval and ratio.
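A stem-and-leaf plot can be sketched as follows, assuming a non-empty list of non-negative whole numbers where the tens form the stem and the ones form the leaf. The function name and the scores are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return one text line per stem (tens digit) with its sorted leaves (ones digits)."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        leaves_by_stem[stem].append(leaf)
    # Include empty stems so that gaps, and therefore outliers, stay visible
    stems = range(min(leaves_by_stem), max(leaves_by_stem) + 1)
    return [f"{stem} | {' '.join(map(str, leaves_by_stem[stem]))}" for stem in stems]

# Hypothetical scores
scores = [12, 15, 21, 23, 23, 27, 34, 58]
lines = stem_and_leaf(scores)
print("\n".join(lines))
```

The longest row marks the peak, and a value separated from the rest by an empty stem (here 58) stands out as a possible outlier.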
Distribution: Shapes
Trends, Peak, Outlier
Skewed distribution
A frequency distribution in which most scores fall in categories above or below the middle
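One rough way to spot skew is to compare the mean with the median. This is a common rule of thumb and not part of the definition above, and it can mislead for unusual distributions. The function name skew_hint and the scores are hypothetical.

```python
from statistics import mean, median

def skew_hint(scores):
    """Rough heuristic: compare the mean with the median to guess the skew direction."""
    average, middle = mean(scores), median(scores)
    if average > middle:
        return "right"      # most scores are low, a few high scores pull the mean up
    if average < middle:
        return "left"       # most scores are high, a few low scores pull the mean down
    return "symmetric"

# Hypothetical scores: most values are low, one is very high
scores = [1, 2, 2, 3, 3, 3, 4, 20]
print(skew_hint(scores))
```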