Stata Concepts Flashcards
Alternative Hypothesis
Ha. In statistical hypothesis testing, the alternative hypothesis is the position that something is happening: a new theory is true instead of the old one (the null hypothesis).[1] It is usually consistent with the research hypothesis, since it is constructed from the literature review, previous studies, etc. Sometimes, however, the research hypothesis is instead consistent with the null hypothesis.
Bivariate Regression
2 variable regression, Bivariate Regression Analysis involves analysing two variables to establish the strength of the relationship between them. The two variables are frequently denoted as X and Y, with one being an independent variable (or explanatory variable), while the other is a dependent variable (or outcome variable).
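As a minimal sketch, the slope and intercept of a bivariate regression can be computed from sample means; the data below are made up for illustration:

```python
# Hypothetical data: x = independent variable, y = dependent variable.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Slope b1 = covariance(x, y) / variance(x); intercept b0 = mean(y) - b1 * mean(x).
b1 = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
     sum((xi - mean_x) ** 2 for xi in x)
b0 = mean_y - b1 * mean_x
print(b0, b1)  # fitted line: y = b0 + b1 * x
```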
Chi-Square Test
or χ2, looks at how the cases are dispersed across values of the dependent variable; computed from a cross-tabulation. Interpreting the χ2 statistic depends on the degrees of freedom and the level of significance/level of confidence chosen by the researcher. The degrees of freedom and the level of significance determine the critical value: the upper plausible boundary of random error. If the χ2 statistic is greater than the critical value, we reject the null hypothesis of no relationship between the variables in favor of the alternative that there is a relationship. Does not show the direction or magnitude of the relationship.
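A sketch of the comparison against the critical value, using a made-up 2x2 table (the critical value 3.841 is for 1 degree of freedom at the 0.05 significance level):

```python
# Hypothetical 2x2 cross-tabulation: rows = DV categories, columns = IV categories.
observed = [[30, 10],
            [20, 40]]

def chi_square(table):
    """Return the chi-square statistic and degrees of freedom for a table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

stat, df = chi_square(observed)
critical_value = 3.841  # df = 1, 0.05 significance level
print(round(stat, 3), df, stat > critical_value)  # True -> reject the null
```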
Confidence Intervals
a range of values that you are X percent confident captures the population parameter (the true value of the variable). The confidence applies to the procedure used to compute the interval, not to any single interval: we are not saying there is an X% probability that the true value falls within this particular interval. Formula: sample statistic ± (t-value × standard error of the sample statistic). For the 95 percent confidence interval, a common rule of thumb is: sample statistic ± (2 × standard error of the sample statistic).
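The "2 x standard error" rule of thumb can be sketched directly; the sample values here are hypothetical:

```python
import math
import statistics

# Hypothetical sample; any numeric sample works the same way.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(len(sample))  # standard error of the mean

# Rough 95 percent interval using the 2 x standard error rule from the card.
lower, upper = mean - 2 * se, mean + 2 * se
print(round(lower, 3), round(upper, 3))
```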
Confidence Level
A significance level (p-value threshold) of 0.05 corresponds to the 95 percent confidence level: confidence level = 1 − significance level.
Correlation Coefficient
r, ranges from -1 to +1. The sign of r gives the direction of the relationship (positive or negative); 0 means no linear relationship; the further r is from 0, the stronger the linear correlation.
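Pearson's r can be sketched from its definition (covariance scaled by the two standard deviations); the paired observations are made up:

```python
import math

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
r = cov / math.sqrt(sum((a - mean_x) ** 2 for a in x) *
                    sum((b - mean_y) ** 2 for b in y))
print(round(r, 3))  # positive sign -> positive linear relationship
```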
Cross-Tabulation
shows the distribution of cases across the values of a DV for cases that have different values of the IV (when IV = value x, how often is it paired with each value of y?). IV: columns; DV: rows. Calculate percentages within each IV category (column percentages), then compare percentages across columns at the same level of the DV, noting where they change as the IV changes. The question is how the IV affects the DV; we are not comparing DV vs. DV.
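A sketch of column percentages on a hypothetical cross-tab (variable names and counts are invented): percentages are computed within each IV column, then compared across columns at the same DV level.

```python
# Hypothetical cross-tabulation: IV = income group (columns), DV = opinion (rows).
table = {
    "agree":    {"low": 30, "high": 10},
    "disagree": {"low": 20, "high": 40},
}

# Column totals, so each IV category sums to 100 percent.
col_totals = {}
for row in table.values():
    for col, n in row.items():
        col_totals[col] = col_totals.get(col, 0) + n

col_pct = {dv: {col: 100 * n / col_totals[col] for col, n in row.items()}
           for dv, row in table.items()}
print(col_pct["agree"])  # compare "agree" percentages across IV columns
```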
Dummy Variable
has exactly 2 possible values (0 or 1); 0 = base/excluded category; a one-unit change = a change in category
Interval Variable
numeric codes indicate precise quantities and communicate exact differences b/t units of analysis and differences in value, necessary for correlation coeff.
Multicollinearity
such a strong relationship b/t IVs that it's difficult to estimate the partial effect of each IV on the DV. Another way of thinking about multicollinearity: the independent variables aren't sufficiently independent of one another. If the correlation coefficient of two IVs is 0.8 or higher, including both in the multiple regression will lead to poor estimates. You can also look at the change in the adjusted R-squared when a variable that is correlated with one of the IVs is added: if the two IVs are strongly related, the adjusted R-squared will not change much.
Multiple Regression
The big innovation of multiple regression is that it lets us isolate the effect of one independent variable on the dependent variable while controlling for the effects of the other independent variables. What we are now calculating are the partial regression coefficients, which estimate the mean change in the DV for every one unit change in the IV, controlling for the other independent variables in the model.
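A minimal sketch of how partial coefficients could be estimated, solving the normal equations directly; the data and the generating equation below are invented for illustration, and real work would use a regression package:

```python
def ols(X, y):
    """Estimate the intercept and partial coefficients by solving the
    normal equations (X'X)b = X'y with Gaussian elimination."""
    rows = [[1.0] + list(x) for x in X]           # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for col in range(k):                          # forward elimination
        pivot = max(range(col, k), key=lambda r: abs(xtx[r][col]))
        xtx[col], xtx[pivot] = xtx[pivot], xtx[col]
        xty[col], xty[pivot] = xty[pivot], xty[col]
        for r in range(col + 1, k):
            f = xtx[r][col] / xtx[col][col]
            for c in range(col, k):
                xtx[r][c] -= f * xtx[col][c]
            xty[r] -= f * xty[col]
    beta = [0.0] * k                              # back substitution
    for i in range(k - 1, -1, -1):
        beta[i] = (xty[i] - sum(xtx[i][j] * beta[j]
                                for j in range(i + 1, k))) / xtx[i][i]
    return beta

# Hypothetical data generated exactly from y = 2 + 3*x1 + 0.5*x2,
# so the partial coefficients should be recovered.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6)]
y = [6, 8.5, 13, 15.5, 20]
beta = ols(X, y)
print([round(b, 3) for b in beta])  # intercept, then partial coefficients
```

Each coefficient estimates the mean change in the DV for a one-unit change in that IV, holding the other IV in the model constant.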
Nominal Variable
numeric codes that indicate categories, not actual quantities, e.g., 1 = North, 2 = East, 3 = South, 4 = West
Null Hypothesis
Ho. In inferential statistics, the null hypothesis is a general statement or default position that nothing significantly different is happening: there is no association among groups or variables, or no relationship between two measured phenomena.
Ordinal Variable
numeric codes indicate rank and relative differences b/t units of analysis, don’t know exact values
P-Value
< 0.05: reject the null hypothesis; there is a 5% or smaller probability that our sample statistic would be observed if the null hypothesis about the population parameter were true. More formally, the p-value (probability value) is the probability of obtaining test results at least as extreme as the results actually observed, assuming the null hypothesis is correct.