Data Unit 1 Test Flashcards
Statistics is the process of:
- Collecting data
- Organizing data
- Interpreting data
Data:
Facts or pieces of information. A single fact is called a datum.
Raw Data:
Unprocessed information (i.e. not yet compiled into a frequency table, chart, or graph).
Aggregate Data:
Data that is organized or grouped, such as a sum over a given period of time (for example, monthly or quarterly). Data can be organized into any grouping, such as geographic area. Aggregate data does not contain individual records.
Micro Data:
Non-aggregated data about the population sampled. For surveys of individuals, microdata contains records for each individual interviewed; for surveys of organizations, the microdata contains records for each organization.
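The difference between microdata and aggregate data can be sketched in a few lines of Python. The records below are made up for illustration: each one is an individual sale (microdata), and the loop groups them into monthly sums (aggregate data).

```python
from collections import defaultdict

# Microdata: one record per individual sale (hypothetical values).
records = [("2024-01", 120), ("2024-01", 80), ("2024-02", 50)]

# Aggregate data: the same information grouped into monthly totals.
monthly_totals = defaultdict(int)
for month, amount in records:
    monthly_totals[month] += amount

print(dict(monthly_totals))  # {'2024-01': 200, '2024-02': 50}
```

Once the data is aggregated, the individual records can no longer be recovered from the totals.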
Experimental Data:
Data gathered through experimentation.
Observational Data:
Data gathered by observation of the “subject.” For example, the subject is recorded and the behaviours are noted over a period of time.
Primary Data:
Data gathered directly by the researcher in the act of conducting research or an experiment. Data can be gathered by surveys or through experimentation.
Secondary Data:
Data gathered by someone other than the researcher.
Numerical Value:
- A quantitative variable that describes a numerically measured value
- These variables can be either continuous or discrete.
Continuous Variable:
- A numeric variable which can assume an infinite number of real values
- i.e. the unit of measure can be broken down into smaller units or decimals.
- Example: age, distance, temperature, and school marks
- A histogram graph displays continuous data
Discrete Variable:
- A numeric variable that takes only separate, countable values
- i.e. can only have separate values, typically integers (no decimals)
- Example: number of people or animals; x can equal only 0, 1, 2, 3, etc.
- A bar graph displays discrete data
Categorical Data:
- Consists of data that can be grouped by specific categories (also known as qualitative variables).
- Categorical variables may have categories that are naturally ordered (ordinal variables) or have no natural order (nominal variables).
Nominal Variable:
- Type of categorical variable that describes a name, label, or category with no natural order.
- Example: subjects in school, hair colour
- Names are nominal: you can sort them alphabetically, but that order carries no rank. A is not “better” than B.
Ordinal Variable:
- Type of categorical variable that has a natural ordering of its possible values, but the distances between the values are undefined.
- Example: rating something as Excellent, Good, Fair, or Poor. Each answer is only a category, but the categories have a natural order.
Frequency Table:
A table that shows the distribution of the values of a variable.
Key Features of a Frequency Table:
3 columns: Range, Tally, Frequency
Range column
Make sure all the intervals have the same width.
Square brackets: includes the value.
Round brackets: up to but not including the value.
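As a rough illustration of the bracket notation, the sketch below builds a small frequency table in Python. The function name `frequency_table` and the marks are made up for this example; each interval is written [lower, upper), so the lower value is included and the upper value is not.

```python
def frequency_table(values, start, width, bins):
    """Count how many values fall in each equal-width interval [lower, upper)."""
    table = []
    for i in range(bins):
        lower = start + i * width
        upper = lower + width
        # Square bracket: lower is included. Round bracket: upper is excluded.
        count = sum(1 for v in values if lower <= v < upper)
        table.append((f"[{lower}, {upper})", count))
    return table

marks = [52, 67, 70, 71, 78, 80, 85, 88, 93, 99]  # hypothetical data

# Print the three columns: Range, Tally, Frequency.
for interval, freq in frequency_table(marks, start=50, width=10, bins=5):
    print(interval, "|" * freq, freq)
```

Note that a mark of 70 is counted in [70, 80), not in [60, 70), because the round bracket excludes the upper value.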
Cumulative Frequency Table:
The running total of the frequencies from the top down to the corresponding row.
- (Add up the frequencies as you go down the column.)
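A running total is easy to check with a short Python sketch. The frequencies here are made-up values listed from the top row down; `itertools.accumulate` from the standard library produces the running sum.

```python
from itertools import accumulate

frequencies = [1, 1, 3, 3, 2]  # hypothetical frequencies, top row first

# Each entry is the sum of all frequencies from the top down to that row.
cumulative = list(accumulate(frequencies))

print(cumulative)  # [1, 2, 5, 8, 10]
```

The last cumulative frequency always equals the total number of data values.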
Relative Frequency Table (%)
Shows the frequency of a range (data group) as a percentage of the whole data set. (Used for pie graphs.)
- Take each frequency and divide it by the total frequency then multiply by 100
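The calculation can be sketched in Python as follows, using made-up frequencies. The code multiplies by 100 before dividing, which gives the same result as dividing first but avoids small floating-point rounding errors.

```python
frequencies = [1, 1, 3, 3, 2]  # hypothetical frequencies
total = sum(frequencies)

# Relative frequency (%) = frequency / total frequency * 100
relative = [f * 100 / total for f in frequencies]

print(relative)  # [10.0, 10.0, 30.0, 30.0, 20.0]
```

The relative frequencies should always add up to 100%.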
Bar Graph:
Bars don’t touch
Each bar is a different colour
Categorical/discrete
Frequency =
Histogram
Relative frequency =
Pie chart
Cumulative Relative Frequency =
Ogive
Histogram:
- Coloured in with one colour
- This type of graph is best suited to show a continuous range of values; hence, the bars touch
- The area of the bars is proportional to the frequencies of the variable.
- To determine the interval ranges, take the range of the entire set and divide it by the number of bars that you want. (Range = Highest - Lowest)
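The interval calculation can be sketched in Python with made-up data. Rounding the width up to a whole number is an assumption added here for convenience; the notes only say to divide the range by the number of bars.

```python
import math

data = [52, 67, 70, 71, 78, 80, 85, 88, 93, 99]  # hypothetical data
bars = 5  # number of bars wanted

data_range = max(data) - min(data)  # Range = Highest - Lowest = 99 - 52
raw_width = data_range / bars       # 47 / 5 = 9.4

# Assumption: round up so the intervals are whole numbers and cover all the data.
width = math.ceil(raw_width)

print(data_range, raw_width, width)  # 47 9.4 10
```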
Frequency Polygon (or Line Graph)
- Can illustrate the same information as a histogram or bar graph.
- Points are plotted with the midpoints of the intervals versus the frequency. Then a line is drawn connecting the points.
- This type of graph is best suited to illustrate (changing) trends.
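The points of a frequency polygon can be worked out with a short Python sketch. The intervals and frequencies are made up; each point is (midpoint of the interval, frequency).

```python
intervals = [(50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]  # hypothetical
frequencies = [1, 1, 3, 3, 2]

# Midpoint of an interval = (lower + upper) / 2
points = [((lower + upper) / 2, freq)
          for (lower, upper), freq in zip(intervals, frequencies)]

print(points)  # [(55.0, 1), (65.0, 1), (75.0, 3), (85.0, 3), (95.0, 2)]
```

Joining these points in order with straight lines gives the frequency polygon.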
Cumulative Frequency Polygon (or Ogive)
- Illustrates the running total of the frequency from the lowest value up.
- Plot each point at the upper end of its interval (x-value) against the cumulative frequency (y-value).
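The points of an ogive can be sketched in Python in the same spirit. The intervals and frequencies are made up; each point pairs the upper end of an interval with the running total of the frequencies up to that interval.

```python
from itertools import accumulate

intervals = [(50, 60), (60, 70), (70, 80), (80, 90), (90, 100)]  # hypothetical
frequencies = [1, 1, 3, 3, 2]

upper_ends = [upper for _, upper in intervals]
running_totals = accumulate(frequencies)

# Each ogive point is (upper end of interval, cumulative frequency).
ogive_points = list(zip(upper_ends, running_totals))

print(ogive_points)  # [(60, 1), (70, 2), (80, 5), (90, 8), (100, 10)]
```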
Circle (Pie) Graph
- Best suited to illustrate categorical data relative to the whole or to each other (using relative frequencies).
- Need a protractor to draw the sections accurately.
- To get the angle of each section, multiply the relative frequency (as a fraction of the whole, not a percentage) by 360°.
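The angle calculation can be checked with a small Python sketch. The survey counts are made up; the relative frequency is used as a fraction of the whole (count / total), not as a percentage.

```python
counts = {"Soccer": 20, "Basketball": 10, "Hockey": 10}  # hypothetical survey
total = sum(counts.values())

# Angle = relative frequency (as a fraction) * 360 degrees
angles = {sport: n * 360 / total for sport, n in counts.items()}

print(angles)  # {'Soccer': 180.0, 'Basketball': 90.0, 'Hockey': 90.0}
```

The angles of all the sections should add up to 360 degrees.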
Population:
All individuals that belong to a group being studied. e.g. For a survey to find out what sport was the favourite of students of SDSS, all students are asked.
Sample:
A group of items or people selected from a population (to represent the whole population). e.g. For a survey to find out what sport was the favourite of students of SDSS, 20 random students from each grade were asked.
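The sampling example above can be sketched in Python. Everything here is hypothetical: the student names, the grade numbers, the size of each grade, and the helper `sample_per_grade` are invented for illustration. `random.Random.sample` picks students without repeats.

```python
import random

def sample_per_grade(students_by_grade, k, seed=None):
    """Pick k random students (no repeats) from each grade."""
    rng = random.Random(seed)  # a seed makes the sample reproducible
    return {grade: rng.sample(names, k)
            for grade, names in students_by_grade.items()}

# Population: every student in the school (made-up names, 100 per grade).
population = {grade: [f"G{grade}-student{i}" for i in range(100)]
              for grade in (9, 10, 11, 12)}

# Sample: 20 random students from each grade.
sample = sample_per_grade(population, k=20, seed=1)

for grade, names in sample.items():
    print(grade, len(names))
```

Asking all 400 students would be surveying the population; asking only these 80 is surveying a sample.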