Data analysis flashcards
when is linear regression used?
to go beyond correlation when measuring the association between a continuous exposure and a continuous outcome: it estimates how much the outcome changes for each unit change in the exposure
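A minimal sketch of fitting a straight line to one continuous exposure and one continuous outcome, written in Python rather than Stata. The function name `fit_line` and the numbers are made up for illustration, and the method shown is ordinary least squares, which is an assumption about what the card means by linear regression.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for exposure x and outcome y."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Illustrative values only, not real data
exposure = [1.0, 2.0, 3.0, 4.0, 5.0]
outcome = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(exposure, outcome)
print(f"outcome changes by {slope} per unit of exposure (intercept {intercept})")
```

The slope is the quantity of interest here: it is the estimated change in the outcome for a one-unit increase in the exposure.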
how can you get a more representative sample?
collect more data: a larger sample, ideally selected at random, better reflects the population
what does statistics allow for?
it allows us to take in all the data and summarise it in a way that is understandable and useful
what are the two main properties of data that we want to capture through statistics?
what the values look like (where quantitative data sits in numerical space, and which categories of categorical data are more or less common) and how the variables relate to one another
what does the analysis done depend on?
how the data is recorded, how the data is distributed, and the research question: the analysis must answer what it is meant to
how is categorical data usually recorded?
as text or labels
what is ordinal data?
categorical data whose categories have a natural order or rank
how can you present categorical data?
counts, percentages, tables and graphs
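A minimal sketch of the counts-and-percentages summary, in Python rather than Stata. The helper `summarise_categories` and the blood-group labels are hypothetical and chosen only to illustrate the idea.

```python
from collections import Counter

def summarise_categories(values):
    """Return {label: (count, percentage)} for a list of category labels."""
    counts = Counter(values)
    total = len(values)
    return {label: (n, 100 * n / total) for label, n in counts.items()}

# Illustrative labels only, not real data
blood_groups = ["A", "O", "O", "B", "A", "O", "AB", "O"]
summary = summarise_categories(blood_groups)

# Print a small frequency table: label, count, percentage
for label, (n, pct) in sorted(summary.items()):
    print(f"{label:<3}{n:>3}{pct:>7.1f}%")
```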
what alters how you present data?
who you are presenting the data to
what order do the parts of a Stata command follow?
command name, then the arguments for the command, then any further options after a comma
what are arguments?
they are the variables or values that determine how the command is run, e.g. bar
when should you add graphics to the bar chart?
only if they provide more information and help to understand the information already given
what are the methods for testing relationships?
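logistic regression, t tests and chi-squared tests - a t test compares a continuous variable between two categories, logistic regression relates a binary outcome to a continuous or categorical exposure, and a chi-squared test compares two categorical variables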
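For the case of one two-level categorical variable and one continuous variable, the t statistic can be sketched as below, in Python rather than Stata. This assumes the unequal-variance (Welch) form of the test; the function name `welch_t` and the group values are made up, and the sketch stops at the statistic without computing a p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """t statistic comparing the means of two groups (unequal-variance form)."""
    # Standard error of the difference in means, using sample variances
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

# Illustrative values only: a continuous measurement split by a two-level category
group_a = [5.0, 6.0, 7.0]
group_b = [1.0, 2.0, 3.0]

t_stat = welch_t(group_a, group_b)
print(f"t = {t_stat:.2f}")
```

A larger absolute t value means the difference between the group means is large relative to the variation within the groups.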
what is numerical data?
data recorded as numbers, whose values can be counted or measured
what is discrete?
numerical data that takes only whole-number values, such as counts
how can you summarise the size of numerical values?
mean and median
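A short sketch of both summaries using Python's standard `statistics` module. The variable name `lengths_of_stay` and its values are hypothetical, chosen so that one extreme value shows how the two measures can differ.

```python
from statistics import mean, median

# Illustrative values only: one extreme value (40) among small ones
lengths_of_stay = [2, 3, 3, 4, 5, 6, 40]

avg = mean(lengths_of_stay)    # pulled upwards by the extreme value
mid = median(lengths_of_stay)  # middle value once the data are sorted

print(f"mean = {avg}, median = {mid}")
```

The mean uses every value and so is sensitive to extremes, while the median depends only on the middle of the ordered data.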