Research 2 Exam 1 Study Guide Flashcards

1
Q

What key lessons did you learn from the statistics readiness quiz?

A

Stats isn’t super complex math; you only need to be able to do PEMDAS (order of operations).
Read the instructions.

2
Q

What are the levels of measurement? Give your OWN examples of each.

A

Nominal - categorical, not numerical; scores represent a category. Example: Do you like pizza? Yes or no.
Ordinal - ordering/ranking things. Example: Rank these cereals: Fruity Pebbles, Rice Krispies, Chex Mix.
Interval - no absolute zero; numerical responses reflect the actual amount of the variable, and differences between units are the same across the number line. Example (Likert scale): How much do you like cats? 1, 2, 3, 4, 5, 6, 7
Ratio - there is an absolute zero; differences between units are the same; numerical answers reflect the actual amount of the variable. Example: How many naps did you take today?

3
Q

Distinguish between categorical and continuous variables. Give your OWN example of each that measures the same topic.

A

Categorical means nominal: the numbers represent categories. For continuous variables (interval and ratio), the value is numerical and can be any value, including a fraction or decimal.
Categorical: Do you like to eat sushi? (yes/no) Continuous: On a scale of 1 (not at all) to 7 (a lot), how much do you like sushi?

4
Q

Which type of variable (categorical or continuous) is typically better to use? why?

A

Continuous is better because you get more specific data and it gives you more info/context.

5
Q

What’s your favorite football team? Is that a nominal or a categorical variable?

A

My favorite football team is the Giants. That is a nominal and a categorical variable because they mean the same thing.

6
Q

Our class has a PAL. What’s her name?

A

Julia Macey

7
Q

Be able to generate survey question items based on key parameters (e.g., categorical variable with 4 levels, etc.)

A

What is your favorite brand? Nike Adidas Puma Nautica

8
Q

How do you take a screenshot?

A

windows + shift + s

9
Q

In your own words, what’s the difference between heuristics and algorithms? Give your own example of how you could use each to solve the same problem.

A

Heuristics are mental shortcuts (e.g., representativeness and availability).
Algorithms follow steps: systematic, slow, intentional.
Example: putting together Ikea furniture. The heuristic (shortcut) way would be to just look at the picture and try to figure it out to get it together quickly. The algorithm way would be to read and follow all the directions.

10
Q

What are illusory correlations? How does research and stats help address them?

A

Correlations we perceive between two things that are not actually connected. Research and stats help address this by testing the relationship with data, which can show that the correlation we assumed was not actually true.

11
Q

How would stats help you in your future career?

A

Stats will help me in my future career by making me more competitive because of my skillset, and employers may pay me more. The skillset is important because of the medical files I will have to explain to the hospitalized child. There may be graphs and numerical results that I should be able to interpret. (For example, if given a graph, I can read it and make sure it isn’t a misleading graph.)

12
Q

A group of 50 people act odd on the day of the full moon. The moon obviously causes them to do this right? Why/why not?

A

No, it does not obviously cause them to do this, because the number 50 has no context. I would need to know a lot more information, such as 50 out of how many, and how many people act odd on days other than full moons. The context is important because the answer could be 50/50, or it could be 50/1,256, or even 50/2,374,805. Additionally, if on new moons the number of people acting odd is 75/75, 345/879, or 7,809/5,683,583, it would again change the perception and significance of the number. I would also want a percentage and a measure of variability. It would be better if I knew, for example, that 1% (SD = 1.45) of people act odd on full moon days and 1% (SD = 1.51) on new moon days.

13
Q

From the video, she mentioned 3 questions to ask that help spot a bad statistic. Describe each in your own words.

A

1) Can you see uncertainty? Polls are not reliable; they cannot accurately predict who will win. Charts are misleading. Averages are misleading.
2) Can I see myself in the data? You need context. How much pee is a lot? Change the scale, zoom out. Did you know the male unemployment rate is higher than the female rate?
3) How was the data collected? Methodologies differ, such as how they operationalize a definition, and that affects replicability. Who is completing the survey: is it the right representative group of people, or did you let anyone answer, even non-Muslims about jihad? Did you ask biased people in the company how much they like it here?
NON-complicated like a grandma.

14
Q

Give your own example of how everything is counting/stats are all around.

A

When I wake up, I check the time: it is 7:20 a.m. Then I estimate how much longer I can stay in bed, about 5 minutes. When the time reaches 7:25, I get up and go to the bathroom. Then I get dressed, choosing between two different outfits. Finally, I go to the kitchen and grab a 100% apple juice bottle and a bagel. The time is now 8:15 and I have to get to class, so I get my headphones and the message tells me the battery is at about 70%. I select my playlist of 100 songs and walk across the grass to class. It takes me about 4 minutes, which is equivalent to 2 songs, so I arrive at 8:19. Now I sit and wait for the rest of the class and the professor to arrive. By 8:25 the majority of the students are here. By 8:29, the professor walks in and sets up his PowerPoint. I finish up my last song, shut off my headphones, and listen to the lecture.

15
Q

Distinguish between descriptive and inferential statistics.

A

Descriptive statistics are a branch of statistics used to summarize the basic characteristics of a sample dataset. Inferential statistics are a branch of statistics used to draw conclusions about a population based on the data from a sample. Descriptive statistics tell us what is happening in the sample but not why, and they do not let us make inferences about a population. Inferential statistics use the sample to test ideas about what is going on, and their conclusions are meant to generalize to the population.

16
Q

Distinguish between Variables, vs. Values, Vs. Scores using your own example.

A

A variable is anything that can change, e.g., How much do you like popcorn? (1-7). A value is any possible outcome; in this case the possible values are 1, 2, 3, 4, 5, 6, 7. A score is a participant’s individual result on a variable: Participant #1 answered 3.

17
Q

Be able to identify independent and dependent variables in a research question (Are people who like statistics more awesome than those who don’t?) AND know if they are categorical/nominal or continuous.

A

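Are people who like statistics more awesome than those who don’t? More/less = continuous.
DV = awesomeness = continuous (it’ll always be for our class) = ratio? Could you have 0 awesomeness? IV = liking statistics. As worded, the question compares two groups (people who like statistics vs. those who don’t), so the IV is categorical/nominal. If liking were instead measured on a rating scale (how much do you like stats?), it would be continuous, and sad but true, a person could hate stats and rate it zero.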

18
Q

When you see any kind of numbers reported, what do we need to know, or what questions do you have before making sense of these numbers?

A

From the TikTok example, I would first want to understand the question and what it is asking. What is the operationalized definition of followers? Are they physical, real-life people following another person? Because then 10 would be a whole lot. Is it followers on YouTube? Then it would be small. Additionally, I would like to know information such as out of how many, a measure of central tendency, and a measure of variability (standard deviation). If the average YouTuber has 25 followers, then 10 is pretty good. If the average is 100, then your number is small. Even smaller if 1,000. Then, if we know that the average person has 10/100 followers, or 10%, with a standard deviation of 57, that would also change how we think of the number, to being average.

19
Q

What’s better, percentages or frequency counts? Why?

A

Percentages are better than frequency counts because percentages give us more context. A frequency count only gives us a number; we would want to know more, like how many that is out of, before placing value on what the number could mean. A count like 50, on its own, means nothing. On the other hand, 50% would be pretty significant. We are given much more context. 50% isn’t perfect, because it could be out of 2 people, out of 10 people, out of 800 people, etc. It definitely is better, though.
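As a rough illustration of why the denominator matters, here is a minimal Python sketch that turns the same frequency count of 50 into a percentage under three different totals. The totals are borrowed from the full-moon card purely as illustration, and the helper name `percent` is made up for this example.

```python
def percent(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total

# The same frequency count (50) under three illustrative denominators.
for total in (50, 1_256, 2_374_805):
    print(f"50 out of {total:,} = {percent(50, total):.4f}%")
```

The count never changes, but the percentage goes from everyone to almost no one, which is the context a bare count hides.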

20
Q

a) Distinguish between unimodal and bimodal distributions. (i.e., what do these look like?) b) Distinguish between positive and negative skew.

A

A unimodal distribution has one “peak,” representing the most frequently occurring value. A normal distribution (bell curve) is one example.
A bimodal distribution has two “peaks,” indicating the two most frequently occurring scores. In a histogram, you would see two separate high points.
In a positive skew, scores pile up on the left (low) side and the tail stretches to the right, so values on the left side of the distribution are more frequent than values on the right.
In a negative skew, scores pile up on the right (high) side and the tail stretches to the left, so values on the right side of the distribution are more frequent than values on the left.

21
Q

What does it mean to cherry pick data?

A

Pull what you want: choose what supports your claim, not what goes against it. For example, show that 5% got 100% but hide that 60% got under 75%. Only focus on what you want.

22
Q

What are the common issues with misleading graphs?

A

Failure to use equal intervals and exaggeration of proportions: the axis does not start at zero; the sizes do not match the numbers (parts have to add up to the right proportions/percents); they use the wrong type of graph.

23
Q

Find a bad graph, indicate what the problem(s) are. (On the exam I could also ask you to create a bad graph and label the problems or give you a bad graph and ask you to fix it)

A

Mar 2003, Jun 2004, August 2005 (unequal intervals on the axis)
50,000, 60,000, 70,000 (axis does not start at 0)
50% + 13% + 23% + 24% (adds up to 110%, not 100%)
What’s your favorite season? fall 25%, winter 25%, spring 25%, summer 25% (should be a bar graph, not a pie chart)

24
Q

A bar chart depicts what type of data? A histogram depicts what type of data?

A

A bar chart depicts categorical data: “A chart for displaying frequencies of nominal data.” A histogram depicts continuous data: “A visual representation of data for a single variable that uses bars to chart values on the x-axis and shows frequencies on the y-axis.”

25
Q

a) Why do we create a codebook? b) On what page in the Guide to Writing can you read about how to do a codebook? c) On what page can you find an example? d) Why am I asking these last two questions? e) Do we need to label levels for categorical or continuous variables? Why? f) Why do we make some variables string in SPSS?

A

a) We create a codebook to organize and clearly label the questions. As a group, it keeps us uniform in naming. It is used later for reference, almost like a dictionary, to know what each question or label means. When necessary, you can see the whole question, not just selfesteem01, selfesteem02. b) You can find creating a codebook on hyperlink 6 = page 7. c) You can find an example on the following page, 8. d) You are asking these questions so that we know where to find information on codebooks in case we need it. This proves we know how to use the guide. e) We need to label levels for categorical data because we need to know what each category means. For example, when we need to turn words into numbers for SPSS, yes could be 1 and no could be 2, and that needs to be labeled. Continuous data uses the number given, so it does not need to be labeled. f) We need to make some variables string in SPSS to allow us to type in words, which we would want to do when we do not yet know what we are looking for or what to expect from the results. It is helpful to keep all the data together instead of leaving it out, and we cannot simply label it 1 or 2.

26
Q

In your own words, what does the mean do? What makes it theoretical?

A

The mean finds the statistical average. What makes it theoretical is that it may not match any real score (no one has 1.83 kids), and outliers can pull the mean toward them.

27
Q

What is the formula for a mean? (In general, make sure you know what symbols stand for and what the formulas are doing) Which piece provides the context?

A

The formula for the mean is M = ΣX / N. In words, you add up all the scores and then divide by the number of scores.
The piece that provides the context is N (the number of scores), the denominator that ΣX is divided by.
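The formula can be sketched in a few lines of Python. This is only an illustration: the scores are hypothetical and the function name `mean` is just a label for this example.

```python
def mean(scores):
    """M = ΣX / N: add up every score, then divide by the number of scores."""
    total = sum(scores)  # ΣX, the sum of all scores
    n = len(scores)      # N, the number of scores (the context)
    return total / n

scores = [3, 5, 4, 7, 6]  # hypothetical scores: ΣX = 25, N = 5
print(mean(scores))
```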

28
Q

How are outliers a problem when using the mean?

A

Outliers are a problem because they influence the mean and pull it towards the outliers.

29
Q

When are we more likely to use a mode, with categorical or continuous data?

A

We are more likely to use a mode with categorical data.

30
Q

What’s better at handling outliers, medians or means? why?

A

Medians are better at handling outliers because the size of the number does not matter as much as its order does. As an example, when calculating the mean, you have to add in the outlier’s score, which could be an additional 60 on its own, so it significantly changes the mean. For the median, if the outlier is 1, it simply goes to the left end of the scores ordered from least to most, and the middle position shifts by only one place, so the median barely moves.
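A small Python sketch of the same idea, using made-up scores and the standard library's `statistics` module: adding one extreme score of 60 drags the mean a long way, while the median stays where it was.

```python
from statistics import mean, median

typical = [4, 5, 5, 6, 5]      # hypothetical scores with no outlier
with_outlier = typical + [60]  # the same scores plus one extreme score

print("no outlier:   mean =", mean(typical), " median =", median(typical))
print("with outlier: mean =", round(mean(with_outlier), 2),
      " median =", median(with_outlier))
```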

31
Q

If you read that the average student has been to 3 football games. How can that number lie?

A

If the average number is 3, we need the standard deviation. That would show whether there are outliers, which is one way the number can lie. Maybe 42 students have never been to a football game, but 1 girl has been to 129 games (129/43 = 3). The mean was pulled up by the one outlier, which is why we would want the standard deviation, to be able to notice those types of things.

32
Q

A new movie gets 3 out of 5 stars. How could this be good? How could it be bad?

A

A new movie getting 3 stars (out of 5) could be good if, say, 600 people gave it 4 stars and 300 people gave it 1 star. It could be bad if 600 people gave it 2 stars and 300 gave it 5 stars. Both sets of ratings average exactly 3. The mean is easily influenced by extreme ratings, so although most people gave it a high review, a group who hated it pulled the average down. The same is true vice versa, where most people said the movie was terrible, but some people like low-budget movies, thought it was hilarious, and gave it five stars. That influences the mean. What you would want to ask for is the median, to see what the central tendency looks like regardless of extreme ratings, and a measure of variability.

33
Q

Give your own example of a “flaw of averages” or when a mean is misleading.

A

One example of a “flaw of averages,” or when a mean is misleading, is rainfall in the U.S. The average rainfall may be 3 inches, but some states like Maine may actually get 12 inches of rain while Arizona gets 0 inches. The same goes for snow: there may be a national average, but it all falls in Colorado, not NJ.

34
Q

When reporting income levels, why is it common to use medians?

A

It is common to report income levels with medians because they are not as influenced by outliers like means are.

35
Q

What measure of central tendency is most common for categorical data?

A

The measure of central tendency most common for categorical data is mode.

36
Q

(Use what you know about means and the examples we did in class to answer this) You’re thinking of going to grad school, and the school says that the average student gets $20,000 dollars of financial aid a year. Based on what you know about averages, what questions should you have based on that figure?

A

What is the median? What is the variability? How many students were surveyed? Were there any outliers?

37
Q

The book section of Becoming a Better Consumer is a nice review of much of what we discussed in class.

A

Page 94! Median does not equal mean; the flaw of averages; averages are theoretical (they do NOT reflect reality: no one has 1.83 kids). What variability does and does not tell you (it indicates how much scores vary; we know the WHAT, NOT the why). Interpreting graphs and charts from research articles (do not only read the abstract; you need to see the tables and figures: more info, more context, it is all important).

38
Q

Fun with symbols X=

A

score

39
Q

Fun with symbols N=

A

number of scores

40
Q

Fun with Symbols M=

A

mean

41
Q

Fun with symbols Σ=

A

sum of

42
Q

a) In your own words, what is variability? b) Give a real life (non-math, stat jargon) example of high and low variability.

A

Variability is how spread out the scores are around the mean. If I threw a bucketful of ping pong balls down the stairs, I could get different results. High variability would mean all the ping pong balls bounced and went everywhere, all over the floor, far apart from one another, with an outlier under the couch. Low variability would be that the balls went directly into the cups at the bottom of the stairs, all close together instead of rolling away.

43
Q

Give three numbers (M = 6) with a) low variance, b) high variance, c) no variance.

A

Low variance: 5, 6, 7; high variance: 1, 10, 7; no variance: 6, 6, 6
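To check these three sets, a quick Python sketch using `statistics.pvariance`, which divides by N and so matches the SS/N formula used on these cards. All three sets should have M = 6 but very different variances.

```python
from statistics import mean, pvariance

sets = {
    "low": [5, 6, 7],
    "high": [1, 10, 7],
    "none": [6, 6, 6],
}

for label, scores in sets.items():
    # pvariance = sum of squared deviations divided by N
    print(label, "M =", mean(scores), "variance =", round(pvariance(scores), 2))
```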

44
Q

Draw (or find) a distribution with a) low variance, b) high variance

A

houses close and stuffed together= low variance
houses miles apart in mountains = high variance

45
Q

Book: What is range? What’s the problem with range?

A

Range is a measure of variability that indicates how far apart the top score is from the bottom score: THE SPREAD. The problem is that it doesn’t tell the whole story; it is more useful to know the distribution between those endpoints. It IGNORES the VALUES in the MIDDLE.

46
Q

In your own words, what is a deviation score?

A

A deviation score is what you get when you subtract the mean from a score (X-M). It tells us how far one score is from the mean.

47
Q

What are the benefits of squaring deviations?

A

The benefits are to highlight outliers and to make everything positive

48
Q

In your own words, what is variance?

A

A measure of variability that tells us how far, on average, scores in the distribution are spread out from the mean, in squared units (it is the standard deviation squared). SS/N or Σ(X-M)^2/N

49
Q

In your own words, explain step by step how to calculate variance

A

First, take a score and subtract the mean from it (X-M). Then square that deviation, (X-M)^2. Next, get the sum of the squared deviations for all scores in the dataset, Σ(X-M)^2. Then divide by the number of scores: Σ(X-M)^2/N.
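The steps can be sketched in Python as follows. This is an illustrative walk-through rather than a required method: the function name `variance_steps` is made up, and the scores 1, 10, 7 are reused from the high-variance card.

```python
import math

def variance_steps(scores):
    """Follow the variance recipe one step at a time."""
    n = len(scores)                          # N
    m = sum(scores) / n                      # M = ΣX / N
    deviations = [x - m for x in scores]     # (X - M) for each score
    squared = [d ** 2 for d in deviations]   # (X - M)^2 for each score
    ss = sum(squared)                        # SS = Σ(X - M)^2
    variance = ss / n                        # SS / N
    sd = math.sqrt(variance)                 # SD = SQRT(SS / N)
    return m, ss, variance, sd

m, ss, variance, sd = variance_steps([1, 10, 7])
print("M =", m, "SS =", ss, "variance =", variance, "SD =", round(sd, 2))
```

Taking the square root at the end gives the standard deviation, which is back in the original units of the scores.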

50
Q

What does this tell you to do? Σ(X-M)^2

A

This tells you to subtract the mean from every score, square each of those deviations, and then sum them up (the sum of squared deviations, SS).

51
Q

Why is knowing standard deviation more useful than knowing variance?

A

Standard deviation is more useful because it is in the same units as the original scores, so it relates back to the data set. Additionally, all you need to do is square the SD to find the variance.

52
Q

You got a 60/100 on your Intro Psych test. What combination of M & SD would make you feel really good about this? What combination would make you feel bad?

A

Good = M= 10 SD=1.2 (LOW)
Bad = M=98 SD=1.2 (LOW)

53
Q

Fun with Symbols Score=

A

X

54
Q

Fun with Symbols Mean =

A

M

55
Q

Fun with Symbols Number of scores =

A

N

56
Q

Fun with Symbols Sum=

A

Σ

57
Q

Fun with Symbols (X-M)

A

deviation score

58
Q

Fun with Symbols (X-M)^2=

A

squared deviation score

59
Q

Fun with Symbols SD^2=

A

variance

60
Q

Fun with Symbols SD=

A

standard deviation

61
Q

Fun with Symbols SS=

A

sum of squared deviations

62
Q

Fun with Symbols SQRT(SD^2)=

A

standard deviation

63
Q

Fun with Symbols SQRT(SS/N)

A

standard deviation

64
Q

a) Why do we create syntax? b) Does it make sense to calculate a mean and standard deviation for a categorical variable? Why/why not? c) Know how to use Excel formulas to make calculations.

A

a) Syntax keeps a record of what we are computing so we can go back to it, check it, and rerun it.
b) No, because a categorical variable does not have a numerical value; we assign one. Therefore the mean and SD would mean nothing. If it was a yes = 1 or no = 2 question, the mean might be 1.42, which matches no actual answer and only tells us that somewhat more people put yes than no.
c)??????????
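To illustrate part b), here is a hedged Python sketch with hypothetical data: 100 answers coded yes = 1 and no = 2, with 58 yes and 42 no chosen so the mean comes out to 1.42. The mean is a number nobody actually answered, while the mode names the most common category.

```python
from statistics import mean, mode

# Hypothetical coding: 1 = yes, 2 = no (58 yes answers, 42 no answers).
answers = [1] * 58 + [2] * 42
labels = {1: "yes", 2: "no"}

print("mean of the codes:", mean(answers))     # not a real answer option
print("mode:", labels[mode(answers)])          # the most common category
```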

65
Q

a) Give your own example of something that is normally distributed. b) Why did it end up that way? (be sure to explain multiplicity of causes) c) How does this apply to your example (or I could give you one…e.g., # of texts someone gets and ask how multiplicity of causes impacts that)?

A

Something with a normal distribution is audience laughter at a stand-up comedy routine. If the act is funny, the mean decibel level of laughter is 7. Then there is going to be somebody who is completely drunk and laughing at every word the comedian says. Of course, there is also grandpa, who can’t hear and is just grumpy, so he hates the routine and refuses to laugh. Those reasons for outliers are one part of the multiplicity of causes. Multiplicity of causes states that every outcome has its own series of possibilities and events/causes. When you drop a ball in Plinko, it will go left or right, and then again and again, so there are a lot of conditions and possibilities that contribute to where a score falls. Most of the audience lands in the middle because the causes mix: the guy is pretty good and tells some funny jokes, and most people are drinking a little so they’re having a good time, but there are hecklers, some of his jokes don’t come across so well, and he is new, so people don’t know him and aren’t comfortable with him yet.
In the example of how multiplicity of causes impacts the number of texts someone gets, there is a set of reasons, such as family. Maybe in the middle, their family texts about once a day because the children have their own homes and their parents want to stay in touch, so the majority get M = 5 texts from family. But there are people all across the line: those with helicopter parents get more than most, maybe 8 texts. Even more than that are the ones who are not picking up, so their parents keep texting to make sure they are okay and they have 13 missed texts. On the left side of the mean, with fewer than average, are the ones who just went home yesterday; they were only texted 3 times. And finally there are the ones whose parents are negligent and send no messages. All these factors change how many texts each person gets from their parents. (Cell service matters too: no messages vs. fast messages.)

66
Q

In a normal distribution, why are most things average? Why are there few scores at the extreme?

A

Most things are average because of the multiplicity of causes. Every outcome has lots of influences. There are lots of ways for grass to end up in the middle: it could be the soil, water, sunlight, luck, etc. But for the grass to be one of the extreme scores, everything has to go wrong for it: it was cut that day, it could barely grow because the soil was so bad, it hasn’t been watered in months, and there is too much sun, which just cooked the grass.

67
Q

Most things most of the time are ____. Most of things you see on social and other media are ____. Why is this a problem?

A

1) average. 2) extreme (not average). 3) this is a problem because people think they can be

68
Q

a) Build your own distribution (M = 5.00, SD = 0.25), including Z scores, and the normal curve. b) Add in the empirical rule c) If a person has a score of 6, what is their Z? Score of 5? Score of 4.75? d) What % of scores are above 5.50? Below 4.75? What % falls within a SD of the mean? What % falls within 2 SD of the mean? What % is above a 5?

A

a) Draw a line and mark the mean, 5, in the middle. The next tick to the right is 1 standard deviation = 5.25, then 2 standard deviations = 5.50. Similarly, on the other side, -1 standard deviation = 4.75 and -2 standard deviations = 4.50. (Plot the scores.) Above the line, add the z scores to the ticks: 0 in the middle, 1 and 2 on the ticks to the right, and -1 and -2 on the ticks to the left. Then draw the normal curve on top.
b) The empirical rule: 68-95-99.7 (on each side of the mean: 34%, 14%, 2%).
c) If a person has a score of 6, they are more than 2 standard deviations above the mean: Z = (6 - 5)/0.25 = 4. A score of 5 means Z = 0, and a score of 4.75 has Z = -1.
d) Above 5.50 (+2 SD) is about 2%. Below 4.75 (-1 SD) is about 14% + 2% = 16%. 68% falls within 1 SD of the mean (from -1 standard deviation to +1 standard deviation). Within 2 SD is 68 + 14 + 14 = 96% using the rounded figures (about 95% exactly); from the mean to 2 (or -2) standard deviations is 48%. The % above 5 (the mean) is 50%.
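For anyone who wants to check these answers, here is a minimal Python sketch using `statistics.NormalDist` from the standard library with M = 5.00 and SD = 0.25. It assumes a perfectly normal curve, so the exact percentages differ slightly from the rounded 34/14/2 figures; the `z_score` helper is just for this example.

```python
from statistics import NormalDist

M, SD = 5.00, 0.25
dist = NormalDist(mu=M, sigma=SD)

def z_score(x):
    """z = (X - M) / SD"""
    return (x - M) / SD

for x in (6, 5, 4.75):
    print("score", x, "-> z =", z_score(x))

# cdf(x) is the proportion of scores at or below x.
above_550 = 1 - dist.cdf(5.50)             # above +2 SD
below_475 = dist.cdf(4.75)                 # below -1 SD
within_1 = dist.cdf(5.25) - dist.cdf(4.75) # between -1 SD and +1 SD
within_2 = dist.cdf(5.50) - dist.cdf(4.50) # between -2 SD and +2 SD
above_5 = 1 - dist.cdf(5.00)               # above the mean

for label, p in [("above 5.50", above_550), ("below 4.75", below_475),
                 ("within 1 SD", within_1), ("within 2 SD", within_2),
                 ("above 5.00", above_5)]:
    print(f"{label}: {100 * p:.1f}%")
```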

69
Q

M = 7.50 SD = 0.50; Raw Score = 6.48. Estimate (i.e., no calculation needed) the Z score.

A

Z score = approx. -2 (6.48 is just over 2 SDs below the mean; the exact value is -2.04)

70
Q

M = 55.78 SD = 22.11; Raw Score = 55.78 What is the Z score?

A

z=0

71
Q

Why do we use Z scores? We talk a lot about how context matters. How does context play a role in Z scores?

A

We use z scores because they are universal. Therefore, if we know a z score, we can put it into context easily. We can compare z scores from different groups, because a z score of 1 always means one standard deviation above average in whatever we are looking at.

72
Q

In your own words, explain the Z-score formula. Show it.

A

The z-score formula is how much you differ (your deviation score) divided by how much everyone differs (the standard deviation). Z = (X-M)/SD

73
Q

a) How could a Z score help us compare a 3.50 GPA across Chemistry, Art, & Psychology majors? Based on this information, who is doing best in school? b) Give your own (non-college major) example of when Z scores could help with a comparison.

A

a) z scores are universal: they show how far a 3.50 is above or below the average GPA in each major, in standard deviation units. The major where a 3.50 has the highest z score is doing best relative to their peers.
b) I want to compare lo mein noodles to chicken parmigiana to ramen soup. Those are all very different foods. The only way to compare the Chinese food to the Italian food to the Japanese food is to find the z score of each on the same question: how much ____ food do you like to eat in a month? Then I can compare and say, okay, the z score for chicken parm is +2, the z score for lo mein is -1, and the z score for ramen is +1.32.
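A hedged sketch of the GPA comparison in Python. The means and standard deviations for each major are invented for illustration (the card does not supply them), so only the method carries over, not the specific result.

```python
def z_score(x, m, sd):
    """z = (X - M) / SD"""
    return (x - m) / sd

# Hypothetical (mean GPA, SD) for each major; these numbers are made up.
majors = {
    "Chemistry": (2.75, 0.50),
    "Art": (3.50, 0.25),
    "Psychology": (3.25, 0.50),
}

gpa = 3.50
z_by_major = {name: z_score(gpa, m, sd) for name, (m, sd) in majors.items()}
best = max(z_by_major, key=z_by_major.get)

for name, z in z_by_major.items():
    print(f"{name}: z = {z:+.2f}")
print("Doing best relative to peers:", best)
```

With these made-up numbers, the same 3.50 is average in Art but well above average in Chemistry, which is the kind of context a raw GPA hides.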

74
Q

Fun with Symbols (X-M)/SD =

A

z score

75
Q

(z) (SD) + M=

A

x
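The two formulas undo each other, which a short Python sketch can show. The helper names `to_z` and `to_raw` are made up for this example, and the M, SD, and raw scores are taken from the estimation cards earlier in the deck.

```python
def to_z(x, m, sd):
    """z = (X - M) / SD"""
    return (x - m) / sd

def to_raw(z, m, sd):
    """X = (z)(SD) + M"""
    return z * sd + m

m, sd = 7.50, 0.50
z = to_z(6.48, m, sd)   # about -2.04
x = to_raw(z, m, sd)    # converts back to the raw score, about 6.48

print("z =", round(z, 2), " back to raw score =", round(x, 2))
print("score equal to the mean:", to_z(55.78, 55.78, 22.11))
```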

76
Q

a) Give your own example when using samples instead of populations occurs in everyday life. b) Why do we study samples instead of populations?

A

In my daily life, I am curious how many students passed the weekly quiz. I take a convenience sample of three friends I know in the class. I do not ask the entire population of my class because I do not know everyone and do not feel comfortable talking to them. Especially with sensitive information such as grades, I would not just walk up to someone for the first time and ask how well they did. Therefore, asking the whole population is not something I would do. It is much more practical to collect a sample than a population on account of how large a population is. Studying the population would take time, money, and effort; it would be ridiculously hard.
I could sample a perfume instead of buying the entire bottle just to find out I do not like it.
To find out how much coffee everyone drinks, I ask one friend, so my sample is 1.

77
Q

What helps a sample be more representative? (Give 3)

A

To make a sample more representative, it must 1) use a RANDOM sampling technique (not convenience sampling), 2) be LARGE, and 3) have LOW VARIABILITY.

78
Q

a) What does it mean to think probabilistically instead of dichotomously? b) Give your own example of when it is useful to think probabilistically instead of categorically (or dichotomously). c) What’s the benefit of thinking probabilistically?

A

Thinking probabilistically means thinking continuously instead of categorically, like you would in dichotomous thinking. It would be useful if you wanted to know how much you should worry about something. If you thought categorically, you would only have two options, yes or no, in terms of worry. Are you worried about spiders? Yes. It would be better if you thought probabilistically, because then you could be 51% worried, while recognizing that some other things are more concerning and others are less worrisome. The benefit of thinking probabilistically is that it makes you consider the factors that contribute to the outcome; some of them you can control, so you can change the outcome appropriately based on the percentage. There is a lot of utility.

79
Q

How can thinking probabilistically help you when considering your chances of getting in a car accident?

A

It gives you the proper amount of worry, and it gives you the chance to lower your percentage by learning defensive driving and other safety precautions. Be more careful.

80
Q

a) If p = .05, what percentage is that? b) What is that in terms of proportions? c) Can p = 0.00?

A

a) The percentage is 5%. b) 5 out of 100. c) p cannot be exactly 0; there is always a chance something could happen, it just may be very unlikely. Therefore, we say p < .001.

81
Q

In your own words, explain the subjective interpretation of probability. Give your own example.

A

The subjective interpretation of probability is that we decide what a probability means to us based on what we want, because we are biased. If someone told me that I have a 50% chance of developing heart disease, that is something bad that I do not want to happen. So I’m going to think that I am in the 50% that does not develop the condition. I think that it will never happen to me. The same applies for something positive. If Dr. L was going to choose 50% of his students to give $10 million, I am going to believe that I will be one of the 50% who get his money. In both examples, my chances are the same 50%. I have a 1 in 2 chance.
If 50% of people tell me a professor sucks, I am going to think I’m in the good 50% and he will be great and pass me; in reality, he sucks. But I had a 50% chance. I subjectively interpreted the results to make a decision because the odds were equal.

82
Q

What makes something a scale? That is, if you see 10 items, what would make them a scale? What wouldn’t?

A

It has to have multiple questions/items, and they all have to relate to each other and to the same topic. If I see 1 item, that is not a scale. If I see 10, that could be a scale if they are all related questions. For example, if I asked "How much do you like apples?" and "How much do you like pears?", they could be a scale: there are multiple items and they are related (in this case, they are both fruits). I would expect to get similar results on both questions. It would not be a scale if I just asked 10 random, unrelated questions like "How much do you like music?", "How religious are you?", and "How much do you support the right to vote?" That is not a scale.

83
Q

a) When we say a scale is reliable, what does that mean? b) Why should we always check the reliability of our scale before calculating the mean? c) If your Cronbach’s alpha is less than .60, what should you do/look for?

A

a) Reliability means the questions are all similar to one another; every time you take an exam you get similar results. That is reliable: consistent and reproducible.
b) You should always check the reliability first, because any outliers, extremes, or questions that you would want to delete for not being reliable enough would affect the mean. If everything checks out, then go ahead with the mean, but if you calculate the mean without checking reliability, your mean will be impacted.
c) If your Cronbach’s alpha is less than .60 (.70 is what you want for it to be reliable), then you should check the “scale if item deleted” section. You may see that most of the values are similar but low and one number is larger; you could delete that item from the scale to bring up your alpha. Another possibility is that there are multiple problem items; then you need to do a lot of work on your scale because it is not reliable at all. Maybe your items aren’t related.
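
The "scale if item deleted" check described above can be sketched in Python. This is only an illustration, not SPSS output: the item names q1 to q4 and the six made-up respondents are assumptions, and the formula is the standard Cronbach's alpha computed from item variances and the variance of the total score.

```python
from statistics import variance

def cronbach_alpha(items):
    """Cronbach's alpha for a list of item columns (one list of scores per item)."""
    k = len(items)
    # Each respondent's total score across all items.
    totals = [sum(row) for row in zip(*items)]
    item_variance_sum = sum(variance(col) for col in items)
    return (k / (k - 1)) * (1 - item_variance_sum / variance(totals))

def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    return [cronbach_alpha(items[:i] + items[i + 1:]) for i in range(len(items))]

# Made-up responses from 6 people; q4 does not track the other three items.
q1 = [1, 2, 3, 4, 5, 6]
q2 = [2, 2, 3, 4, 5, 6]
q3 = [1, 3, 3, 4, 6, 6]
q4 = [6, 1, 5, 2, 3, 1]
scale = [q1, q2, q3, q4]

alpha_all = cronbach_alpha(scale)
alpha_without = alpha_if_item_deleted(scale)
print(round(alpha_all, 2))
print([round(a, 2) for a in alpha_without])
```

With this made-up data, alpha for all four items falls below .60, and the largest "if deleted" value belongs to q4, which is the pattern that suggests dropping that one item.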

84
Q

Once you calculate a scale’s mean (or sum) should you use the individual items in later analyses or the scale mean/sum?

A

You should continue to use the scale mean (or sum)

85
Q

a) What shorthand do we use to designate a mean? b) What’s the difference between the mean of a variable vs. the mean of a scale?

A

a) M = mean.
b) mean of a scale is the mean of a set of related questions (self-esteem)
mean of a variable is the mean of one single item or measure across everyone in the sample (age)

86
Q

Look at this bit of SPSS syntax, what’s wrong? a) MEAN (lifesat01, lifesat02, r_lifesat02, lifesat03, lifesat04, lifesat05)

A

You included the original lifesat02 as well as the reverse scored version. Once you create the reverse score the original is dead to you. You do not use it.

87
Q

b) MEAN (stress01, stress02, stress03, age, gender)

A

This person tried to take the mean of the stress items together with age and gender (categorical); that is not appropriate. Age is just how old people are, which could be something like 20.1 years old, versus your items’ mean of 6. They do not go together. Additionally, the mean of gender is just stupid, because then it’s just 1.43. That means nothing.

88
Q

c) MEAN (happiness01)

A

This is not a scale because there is only one item

89
Q

d) MEAN (close01, close03, close04, close05) Any reason why this might be ok?

A

This is missing close02, which could be ok if it was on purpose because deleting the item would improve scale reliability.

90
Q

a) Why do we reverse-score items? b) What shorthand do we use in the variable name to designate that an item is reversed? c) When calculating reliability, do we use the original item or the reversed? d) When calculating the mean, do we use the original item or the reversed?

A

a) We reverse-score items to make sure every item on the scale points in the same direction. If someone answers high on a question like "I believe in myself," they would be expected to score low on the question "I often doubt myself." However, if we were looking at the scale as a whole, the reliability and mean of the scale would be impacted because of the difference in scores. Therefore, it is important that we reverse-score those items so that they give us the correct reliability and mean. If we had left them as is, Cronbach’s alpha would suggest low reliability for those two very related questions.
b) The shorthand we use is to put r before the variable name to signify that we reverse coded it.
c) You use the reversed item when calculating reliability. Once you reverse code something, the original is dead to you; you do not use it, but you also don’t delete it. If you use the original, it will impact your results.
d) When calculating the mean, you use the reversed item.
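
Reverse-scoring as described above can be sketched in Python. The 1 to 7 response scale and the variable names are assumptions made for the example; the rule itself is (lowest possible score + highest possible score) minus the original score.

```python
def reverse_score(score, low=1, high=7):
    """Flip a score on a low..high response scale (7 becomes 1, 6 becomes 2, ...)."""
    return (low + high) - score

# Hypothetical answers to "I often doubt myself" on a 1-7 scale.
doubt_myself = [2, 1, 6, 4]

# The r_ prefix marks the reversed version; the original list is kept, just not used.
r_doubt_myself = [reverse_score(s) for s in doubt_myself]
print(r_doubt_myself)
```

The midpoint of the scale (4 on a 1 to 7 scale) stays the same after reversing, which is a quick way to check the formula.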

91
Q

Be able to create an analysis plan for calculating a scale mean (i.e., what are the steps you need to do?)

A

1. Is it a scale? Check whether there are multiple items or just one, and whether the items are related.
2. Do any items need to be reverse coded?
3. Is the scale reliable? Do any items need to be deleted?
4. Create the scale mean (compute a new variable from the items).
5. Calculate the mean of that new scale variable.

92
Q

Analysis Plan: What Would You Do? Provide an analysis plan. What would you need to do first, second, third, etc.? Don’t run anything, just describe how you’d do it with a numbered list. 1. XX 2. XX etc.
For your thesis you have a variable for Major (1=Chemistry, 2=Social Work, 3= English) along with each person’s number of credits completed, a health score (from 0-100), favorite class, and commuter stats (1= lives on campus; 2=lives off campus).
Research Question: Among commuters, what are the means and standard deviations for key variables?
1.???
2.???
3.???

A

You would not run means and standard deviations for a categorical variable like whether a person is a commuter. The numbers do not mean anything; they are just codes so you can enter the data into SPSS.

93
Q

To get each mean, what would you need to do?
*Mean age

A

If you wanted to know the mean age, you would add up (sum) all the ages and then divide by the number of scores (how many ages there are).

94
Q

*Mean of x_sad

A

When you put an x in front of your variable name, that designates the scale mean. It was already created, so you wouldn’t create it again.
Just calculate the mean of the existing scale variable.

95
Q

*Mean of sad01, sad02, sad03, sad04, sad05
*Mean of self-esteem scale with 10 items

A

For the above two scales, you first must create a summary scale (the scale mean).
Transform -> Compute Variable -> type the desired variable name in the Target Variable box -> click Type & Label and label it "mean of" followed by the variables, then click Continue. In the Numeric Expression box, enter the items you want to include (sad01, … sad05). Importantly, MEAN is in all capitals, the variables are in parentheses, and there is a comma between each one. It should look like this: MEAN (variable, variable, variable). Then click Paste.
Then Analyze -> Descriptive Statistics -> Frequencies -> select x_sad, click Statistics -> in the Central Tendency box, click Mean, click Continue, click Paste.
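
The two steps above (create the scale mean for each person, then get the mean of that new variable) can be sketched in Python. The three respondents and their scores are made up, and the sketch assumes nobody skipped a question.

```python
# Hypothetical responses: one dictionary per person, with items sad01..sad05.
responses = [
    {"sad01": 1, "sad02": 2, "sad03": 3, "sad04": 2, "sad05": 2},
    {"sad01": 4, "sad02": 5, "sad03": 4, "sad04": 5, "sad05": 2},
    {"sad01": 3, "sad02": 3, "sad03": 3, "sad04": 3, "sad05": 3},
]
items = ["sad01", "sad02", "sad03", "sad04", "sad05"]

def scale_mean(row, item_names):
    """Average one person's answers across the listed items."""
    scores = [row[name] for name in item_names]
    return sum(scores) / len(scores)

# Step 1: create the summary variable x_sad for every person.
for row in responses:
    row["x_sad"] = scale_mean(row, items)

# Step 2: the mean of x_sad across people is the scale mean you would report.
mean_x_sad = sum(row["x_sad"] for row in responses) / len(responses)
print([row["x_sad"] for row in responses], mean_x_sad)
```

Step 1 averages across items within a person; step 2 averages across people. Keeping those two apart is the same distinction as the mean of a scale versus the mean of a variable.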

96
Q

*Mean of fun01, fun02, fun03, gratitude01, optimism01

A

These items are measuring different things, so you should not compute a mean.

97
Q

a) In your own words, what are confidence intervals? b) What’s larger, a 99% confidence interval or a 95% one? c) How do confidence intervals apply to the Price is Right?

A

a) A range of scores that likely includes the population’s true mean. Like an informed guess. The range of scores in which we’re reasonably sure we would find the population’s true mean. “We can’t pinpoint the population mean, but we are reasonably sure it’s right around here.” To be more confident, you have to make the range wider.
b) A 99% confidence interval (it is wider).
c) In the game, Range Finder, they have to guess the price within a $150 target interval. This is a smaller range, so you have to be quite precise to win.
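
To see why the 99% interval is the larger one, here is a minimal Python sketch. The scores are made up, and it uses the normal (z) approximation for the interval around a sample mean, which is an assumption of this example; a t-based interval would be slightly wider for a sample this small.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def confidence_interval(scores, level=0.95):
    """Approximate interval for the population mean: M +/- z * (SD / sqrt(N))."""
    m = mean(scores)
    standard_error = stdev(scores) / sqrt(len(scores))
    # z cutoff that leaves (1 - level) / 2 in each tail of the normal curve.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return (m - z * standard_error, m + z * standard_error)

# Hypothetical 1-7 ratings from 10 people.
scores = [4, 5, 6, 5, 7, 3, 5, 6, 4, 5]

low95, high95 = confidence_interval(scores, 0.95)
low99, high99 = confidence_interval(scores, 0.99)
print(round(low95, 2), round(high95, 2))
print(round(low99, 2), round(high99, 2))
```

Both intervals are centered on the same sample mean; the 99% one just reaches further in each direction, because being more confident means covering more ground.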

98
Q

a) In your own words, what is statistical power? b) Give your own real-life example of when you use a similar concept. c) What are the ways we can increase statistical power?

A

a) The likelihood of getting a significant result when the research hypothesis is true.
OR
The chance that we will correctly reject the null hypothesis when it is false.
b) An example of when I draw conclusions in a similar way is when I think the chance of getting eaten by a bear is low, but not zero. We never prove anything; we just give compelling evidence.
c) Make the sample size bigger, have a bigger effect size, adjust the p level (for example, use .05 instead of .01), decrease your error (less variability in the scores).
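
The ways of increasing power listed above can be checked with a rough Python sketch. It assumes a two-group comparison with equal group sizes and uses a normal approximation for a two-tailed test, so the numbers are ballpark figures rather than what a power program would report exactly. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(d, n_per_group, alpha=0.05):
    """Approximate power to detect effect size d with two equal-sized groups."""
    z = NormalDist()
    # Cutoff the test statistic must beat to be significant (two-tailed).
    critical = z.inv_cdf(1 - alpha / 2)
    # Chance the statistic lands past the cutoff when the true effect is d.
    return z.cdf(d * sqrt(n_per_group / 2) - critical)

baseline = approx_power(0.5, 64)
bigger_sample = approx_power(0.5, 128)
bigger_effect = approx_power(0.8, 64)
looser_alpha = approx_power(0.5, 64, alpha=0.10)
print(round(baseline, 2), round(bigger_sample, 2),
      round(bigger_effect, 2), round(looser_alpha, 2))
```

Each of the last three values comes out higher than the baseline, matching the list: a bigger sample, a bigger effect, or a less strict p level all raise power.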

99
Q

a)In your own words, what is an effect size?
b)Give your own real-life example of a small effect size and a large effect size.

A

a) How BIG the mean difference is: how much one distribution does NOT overlap with a second distribution. The distance between two populations. A measure of the strength of an association between two variables, or the practical, rather than statistical, importance of a research finding. (Cohen’s d is a measure of effect size. Like a z score, it is one population’s mean minus the other population’s mean, divided by the population standard deviation.)
b) Small effect size = Top Ramen chicken flavor versus Maruchan Ramen chicken flavor; medium effect size = Top Ramen chicken flavor versus Cup O Noodle chicken flavor with vegetables; large effect size = Top Ramen chicken flavor versus Top Ramen beef flavor.
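
Cohen's d as described above can be sketched in Python. The two groups of scores are made up, and because this works from samples rather than whole populations, the sketch divides by the pooled standard deviation of the two groups; that substitution is an assumption of the example.

```python
from math import sqrt
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation of the two groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)

def label_effect(d):
    """Rough label using the .2 / .5 / .8 conventions."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "smaller than small"

# Hypothetical taste ratings for two ramen flavors.
flavor_a = [2, 4, 6]
flavor_b = [1, 3, 5]
d = cohens_d(flavor_a, flavor_b)
print(d, label_effect(d))
```

Here the means differ by 1 point and the pooled standard deviation is 2, so d is .5, a medium effect by the usual conventions.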

100
Q

Draw sets of distributions that represent a) small b) medium and c) large effect sizes. Label each with the appropriate effect size convention. (i.e., .2, .5, .8).

A

101
Q

A study has an effect size of .94. What does that tell you about that study’s statistical significance?

A

Nothing. If Cohen’s d is .94, the effect size is large because it is above .8. That is a big difference between the means. But effect size by itself says nothing about statistical significance, because significance also depends on the sample size.
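
A quick Python sketch shows why the effect size alone cannot settle significance: the same d of .94 gives very different p values depending on how many people were in the study. The group sizes are made up, and the p value uses a normal approximation for two equal-sized groups, which is crude for very small samples.

```python
from math import sqrt
from statistics import NormalDist

def approx_p_value(d, n_per_group):
    """Approximate two-tailed p value for effect size d with two equal groups."""
    # The test statistic grows with sample size even when d stays the same.
    z = d * sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

p_tiny_study = approx_p_value(0.94, 4)    # 4 people per group
p_big_study = approx_p_value(0.94, 50)    # 50 people per group
print(round(p_tiny_study, 3), p_big_study)
```

With 4 people per group the large effect is not significant at .05, while with 50 per group it easily is.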

102
Q

If you’re looking for a small effect size, what should you do to your sample size?

A

INCREASE your sample size. Small effects are hard to detect, so you need a huge sample to have enough power to find them.
If I wanted to compare everyone in the psych major to everybody in the education major, those are huge groups of people, and the difference between the group averages will be small, because people tend to be average and more similar than they are different. Compare just you and me, and there might be pretty big differences. But between giant groups the difference is a lot smaller, so it takes a large sample to find a difference that really exists.
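
How much bigger the sample has to be can be sketched in Python. This uses a common normal-approximation formula for two equal-sized groups, a two-tailed .05 significance level, and 80% power; those settings and the function name are assumptions of the example, and exact t-based calculations give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate people needed in EACH group to detect effect size d."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # cutoff for a two-tailed test
    z_power = z.inv_cdf(power)           # extra distance needed for the desired power
    return ceil(2 * ((z_alpha + z_power) / d) ** 2)

needed = {d: n_per_group(d) for d in (0.2, 0.5, 0.8)}
print(needed)
```

Under these assumptions a small effect (.2) needs several hundred people per group, while a large effect (.8) needs only a couple dozen.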