1 to 200 Math Flashcards

1
Q

Histogram Buckets (Bins)

A

The size of the bins is very important. You are generally better off with more bins than fewer: with too few bins, the view of the data is too general and not accurate enough.
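
To see why the number of bins matters, here is a minimal Python sketch. The height values and the helper name `bin_counts` are made up for illustration; the same data is counted into 2 bins and then into 5 bins.

```python
def bin_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value goes into the last bin instead of opening a new one.
        index = min(int((v - low) / width), num_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical heights in centimetres, used only for illustration.
heights = [150, 152, 153, 160, 161, 175, 176, 177, 178, 190]

few = bin_counts(heights, 2)   # coarse view of the data
many = bin_counts(heights, 5)  # finer view of the same data
print(few)
print(many)
```

With 2 bins the data looks evenly split; with 5 bins a gap and a cluster become visible.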

2
Q

Line Plot

A

Makes it easy to stack several features on the same plot

3
Q

Inclusive interval

A

need to look this one up

4
Q

Distribution Plot

A

Allows us to visualize the dispersion of data across variables. The most common method is the histogram.

5
Q

Histogram is

A

The most common distribution plot

6
Q

X axis

A

Horizontal Axis

7
Q

Y axis

A

Vertical Axis

8
Q

When do we use a scatter plot?

A

To show the relationship between two features

9
Q

When is a line plot appropriate?

A

When we know for sure there is a continuous relationship between the data points, so that joining them with a line is meaningful

10
Q

2 types of distribution plots

A

Box-and-whisker plot; KDE (kernel density estimation) plot

11
Q

Categorical Plots

A

Show a metric per category. There are many variations; the most common is the simple bar plot.

12
Q

Always keep in mind the information I want to share or the story I am trying to tell

A

How does the story help communicate that information to someone else?

13
Q

Bin Sizes

A

You can make the bins smaller or larger

14
Q

Histogram is

A

A distribution plot

15
Q

In a histogram, which axis is continuous?

A

The x axis

16
Q

y hat = mx + b

A

Linear equation

17
Q

m =

A

How steep the line is (the slope)

18
Q

x =

A

How far along the horizontal axis the point is (the input value)

19
Q

b =

A

The value of y when x = 0 (the y-intercept)

20
Q

m formula

A

Rise over run: m = (y2 - y1) / (x2 - x1)
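
As a small illustration of rise over run and y hat = mx + b, here is a Python sketch. The two points and the function names `slope` and `predict` are hypothetical, chosen only for this example.

```python
def slope(x1, y1, x2, y2):
    """Rise over run between two points on the line."""
    return (y2 - y1) / (x2 - x1)

def predict(m, b, x):
    """y hat = m * x + b."""
    return m * x + b

# Two hypothetical points on a line: (1, 5) and (3, 9).
m = slope(1, 5, 3, 9)   # rise of 4 over a run of 2
b = 5 - m * 1           # solve y = m*x + b for b using the first point
y_hat = predict(m, b, 10)
print(m, b, y_hat)
```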

21
Q

y =

A

How far up or down the vertical axis the point is (the output value)

22
Q

Ogive

A

Cumulative (accumulating) line plot

23
Q

Line plots are great when

A

We want to show the relationship between consecutive data points that have no recorded in-between points, such as daily weather readings

24
Q

y hat means

A

In the slope-intercept equation of a straight line, y hat represents the predicted value

25
Q

X Axis needs to be

A

Continuous Data

26
Q

Why do we use line plots?

A

We use them for changes over time

27
Q

Data is

A

Data is collected and observable information about something

28
Q

Discrete Data

A

Can only take on certain values; there are no in-between values, e.g. Ford, Chevy, Cadillac

29
Q

Continuous Data

A

Data that can take in-between values, e.g. a height of 175.5 cm

30
Q

Nominal Data

A

Nominal data is classified without a natural order or rank, e.g. cats, dogs, fish

31
Q

Ordinal Data

A

Can be sorted; it has an order, like 1, 2, 3 or hot, mild, cold. The order has to make logical sense.

32
Q

Structured Data

A

Highly specific and stored in a predefined format, e.g. an Excel spreadsheet. If you send the data to someone else, they will be able to work with it.

33
Q

Unstructured Data

A

Not in any particular format, e.g. audio or text files; it does not follow a predefined format. Working with it often involves deep learning, e.g. DALL-E 2.

34
Q

Population

A

the entire data set

35
Q

Sample

A

A sample is a (typically random) subset of the population

36
Q

Mean

A

Mean is the most common measure of central tendency

37
Q

Mean formula

A

Sum of all data points / number of data points
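
The formula translates directly into code. A minimal Python sketch, with made-up scores and a hypothetical helper name `mean`:

```python
def mean(values):
    """Sum of all data points divided by the number of data points."""
    return sum(values) / len(values)

scores = [4, 8, 6, 2]  # hypothetical data, for illustration only
average = mean(scores)
print(average)
```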

38
Q

Average is the

A

Arithmetic mean

39
Q

mu (μ) is

A

The population mean

40
Q

x bar

A

The mean of the sample

41
Q

Weighted mean

A

A weighted mean is a kind of average. Instead of each data point contributing equally to the final mean, some data points contribute more “weight” than others. If all the weights are equal, then the weighted mean equals the arithmetic mean (the regular “average” you’re used to). Weighted means are very common in statistics, especially when studying populations.

42
Q

Weighted mean example

A

With weights 20 and 7 on the values 8.4 and 6.1: (20 × 8.4 + 7 × 6.1) / (20 + 7) = (168 + 42.7) / 27 = 210.7 / 27 ≈ 7.80
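
A short Python sketch of this calculation, assuming the example means weights of 20 and 7 applied to the values 8.4 and 6.1. The function name `weighted_mean` is hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)

# Assumed reading of the card: values 8.4 and 6.1 with weights 20 and 7.
result = weighted_mean([8.4, 6.1], [20, 7])
print(round(result, 2))

# With equal weights the weighted mean is just the arithmetic mean.
equal = weighted_mean([8.4, 6.1], [1, 1])
print(round(equal, 2))
```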

43
Q

Truncated Mean

A

Used to handle outliers: we drop the outlier and also drop the same number of values from the other end of the data set. Example: for 9, 50, 52, 78 we take off 9 and 78, then divide the sum of the remaining values by how many remain: (50 + 52) / 2 = 51. We must note what percentage of the data set was taken off.
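
The trimming step can be sketched in Python as follows. The function name `truncated_mean` and its `trim_each_side` parameter are hypothetical; the data is the card's own example.

```python
def truncated_mean(values, trim_each_side):
    """Drop the same number of values from each end, then average the rest."""
    ordered = sorted(values)
    if trim_each_side * 2 >= len(ordered):
        raise ValueError("trimming would remove every value")
    kept = ordered[trim_each_side:len(ordered) - trim_each_side]
    return sum(kept) / len(kept)

data = [9, 50, 52, 78]
result = truncated_mean(data, 1)    # drops 9 and 78
trimmed_share = 2 * 1 / len(data)   # fraction of the data set taken off
print(result, trimmed_share)
```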

44
Q

Mode

A

The value that occurs most often
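
A minimal Python sketch of finding the mode, using the standard library's `collections.Counter`. The pet list and the helper name `mode` are made up for illustration; note that it works on non-numeric values too.

```python
from collections import Counter

def mode(values):
    """Return the value that occurs most often."""
    counts = Counter(values)
    # most_common(1) gives a one-item list of (value, count) pairs.
    return counts.most_common(1)[0][0]

pets = ["cat", "dog", "cat", "fish"]  # hypothetical data
favorite = mode(pets)
print(favorite)
```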

45
Q

Median (odd number of values)

A

The number in the middle of the sorted data

46
Q

Median (even number of values)

A

Add the two central numbers and divide by 2 (take their arithmetic mean); that is the median
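
Both median cases, odd and even, can be sketched in a few lines of Python. The helper name `median` and the sample lists are hypothetical.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_result = median([7, 1, 5])       # sorted: 1, 5, 7
even_result = median([7, 1, 5, 3])   # sorted: 1, 3, 5, 7
print(odd_result, even_result)
```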

47
Q

Discrete data: measures of central tendency to use

A

mean, median, mode

48
Q

Nominal Data

A

Mean: maybe (only if the values are numeric); median: no; use only the mode

49
Q

Ordinal Data

A

Mean: maybe; median and mode: yes

50
Q

Numeric

A

mean, median, mode

51
Q

Non Numeric

A

No mean; median (only if the data can be ordered) and mode

52
Q

Continuous

A

mean, median, mode

53
Q

Why non-numeric data has no mean

A

We would have to sum the values and divide by their count; if a value is a letter, we cannot find the sum

54
Q

Continuous

A

Height of people

55
Q

Discrete Data

A

Number of Children in a family

56
Q

Non Numeric

A

Cats, dogs, birds, fish: values that cannot be added up

57
Q

Nominal Data

A

Has no specific order, cannot be sorted

58
Q

Ordinal Data

A

Data that can be sorted, e.g. 1, 2, 3 or hot, mild, cold

59
Q

To calculate a mean

A

We need numeric data

60
Q

These categories can overlap 1

A

Nominal, numeric, non-numeric

61
Q

These categories can overlap 2

A

Ordinal, numeric, non-numeric

62
Q

Working with household data

A

We would use the median, because extreme values would distort the mean
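
A quick Python sketch of why extreme values favor the median, using the standard `statistics` module. The income figures are invented purely for illustration.

```python
import statistics

# Hypothetical household incomes; the last one is an extreme value.
incomes = [30_000, 35_000, 40_000, 45_000, 1_000_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)
print(mean_income, median_income)
```

The single extreme value pulls the mean far above what most households earn, while the median stays at the middle household.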

63
Q

Step one in choosing a measure of central tendency

A

Is it even possible to use that measure of central tendency on this data?

64
Q

Step two in choosing a measure of central tendency

A

If more than one measure is possible, decide which makes the most sense

65
Q

Measurement of Dispersion

A

are measurements of spread

66
Q

measurement of dispersion

A

It measures how the data is spread around the mean

67
Q

Mean is

A

The number that is as close as possible to all of the data points (the balancing point)

68
Q

Results of measuring spread

A

We get two things: the variance and the standard deviation. They are closely related.

69
Q

Meaning of the variance value

A

The smaller the value, the smaller the spread

70
Q

Reason for squaring the deviations

A

If a deviation is negative, squaring makes it positive; squaring also emphasizes the larger deviations

71
Q

Standard deviation is

A

The square root of the variance

72
Q

Variance is

A

Not usually reported on its own; we use the standard deviation instead, because it is in the same units as the data

73
Q

The sample variance formula uses

A

n - 1 in the denominator, to correct the bias introduced by using the sample mean in place of the population mean

74
Q

Then for standard deviation

A

you take the square root
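
The steps from the preceding cards (square the deviations, divide by n - 1, take the square root) can be sketched in Python. The data set and the function name `sample_variance` are hypothetical.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    # Squaring makes every deviation positive and emphasizes large ones.
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
variance = sample_variance(data)
std_dev = math.sqrt(variance)     # standard deviation: square root of variance
print(round(variance, 3), round(std_dev, 3))
```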

75
Q

Quartiles are

A

Cut points that split the sorted data set into four equal parts

76
Q

When talking about quartiles, we are talking about

A

The first, second, and third quartiles (Q1, Q2, Q3) of the data

77
Q

The 1st quartile

A

The median of the lower half of the data (the values below the median)

78
Q

1st quartile is

A

The value that marks off the bottom (lower) 25% of the data

79
Q

3rd quartile is

A

The value below which 75% of the data falls (the 75th percentile)

80
Q

Second Quartile

A

Is the median or 50th percentile

81
Q

First quartile data will be

A

Below the 25% mark

82
Q

Third quartile data will be

A

Below the 75% mark

83
Q

Define the range of a data set

A

Range = maximum value - minimum value

84
Q

Q1 and Q3

A

Will give us an idea of how tightly the data is clustered around its center

85
Q

How do we calculate the first and third quartiles?

A

First, check whether the data set has an even or odd number of values

86
Q

Calculate quartiles (odd number of values)

A

Use the take-away method, as for the median

87
Q

Calculate quartiles (even number of values)

A

Use the take-away method, then add the two middle numbers and divide by 2

88
Q

Quartile Spread

A

The difference between the third and first quartiles is a measure of the data's spread

89
Q

Interquartile range

A

IQR = Q3 - Q1, e.g. 71 - 68 = 3, so IQR = 3

178
Q

Interquartile range (even number of values)

A

IQR = Q3 - Q1, e.g. 71 - 68 = 3, so IQR = 3

179
Q

Interquartile range (odd number of values)

A

We exclude the median and look at each half. When a half has an even number of values, add its two middle numbers and divide by 2 (their mean) to get that quartile.
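
Here is a Python sketch of this median-of-halves approach for both even and odd data sets. The function names and the data are hypothetical; the values were chosen so that Q1 = 68 and Q3 = 71, matching the numbers used on the IQR card. Other tools may use a different quartile convention and give slightly different results.

```python
def median(ordered):
    """Median of an already sorted list."""
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of each half."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # For an odd count, leave the median itself out of the upper half.
    upper = ordered[half + 1:] if len(ordered) % 2 == 1 else ordered[half:]
    return median(lower), median(ordered), median(upper)

# Hypothetical data picked so that Q1 = 68 and Q3 = 71.
data = [66, 68, 68, 69, 70, 71, 71, 74]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)
```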

180
Q

Quartile function in Excel

A

need to look up

181
Q

Quartiles are

A

Commonly used to look for outliers in the data set

182
Q

It is common to use this formula for quartile outliers

A

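Calculate the median of each half to get Q1 and Q3, then compute the fences: lower = Q1 - 1.5 × IQR and upper = Q3 + 1.5 × IQR. With Q1 = 68 and Q3 = 71 (IQR = 3), that is 68 - 4.5 = 63.5 and 71 + 4.5 = 75.5. Values outside the fences are treated as outliers.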
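
A minimal Python sketch of the outlier check, assuming the card intends the common 1.5 × IQR rule. Q1 = 68 and Q3 = 71 come from the IQR card; the readings and the function name `outlier_fences` are made up for illustration.

```python
def outlier_fences(q1, q3, k=1.5):
    """Lower and upper fences: Q1 - k * IQR and Q3 + k * IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

low, high = outlier_fences(68, 71)

# Hypothetical readings; anything outside the fences is flagged.
readings = [60, 66, 68, 69, 70, 71, 74, 80]
outliers = [r for r in readings if r < low or r > high]
print(low, high, outliers)
```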

183
Q

Line Plots

A

Great when a continuous relationship exists

184
Q

Line Plots Use

A

We use line plots for changes over time or to show a connection between data points

185
Q

If there are no in-between points, then the data is

A

Discrete

186
Q

Ogive

A

Add value after value (a running total)
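
The running total behind an ogive can be sketched with `itertools.accumulate` from the Python standard library. The daily sales figures are hypothetical.

```python
from itertools import accumulate

daily_sales = [3, 5, 2, 6]  # hypothetical counts per day

# Each entry is the sum of all values up to and including that day.
running_total = list(accumulate(daily_sales))
print(running_total)
```

Plotting `running_total` against the days gives the cumulative line.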

187
Q

Ogive

A

Does not make sense for Temperature Data

188
Q

Ogive: you can

A

Put several on the same plot

189
Q

Distribution Plots are

A

Histograms

190
Q

Histograms

A

The x axis needs to be continuous

191
Q

Bar Charts are not

A

Continuous data

192
Q

Bar Charts have

A

Spaces between the bars

193
Q

Bucket

A

Represents storage for the numbers in a range, from one x value to another

194
Q

Bins or classes

A

Have no gaps between them, so the plot looks like a continuous set of data

195
Q

Each bin represents the

A

Number of occurrences that fall within that bin, class, or interval

196
Q

Height of a bar in a bar chart

A

Is the number of occurrences

197
Q

Histogram

A

Helps organize and clean up the data

198
Q

The bin or class width is always

A

The same for every bin; the size of the bin is very important

199
Q

Natural Numbers

A

1, 2, 3, 4, 5, ...

200
Q

Whole numbers

A

The natural numbers plus zero: 0, 1, 2, 3, ...