Data Management Flashcards

1
Q

is the process of gathering and measuring information about the variables under study through an established systematic procedure, which then enables one to answer the relevant questions at hand and evaluate outcomes

A

Data collection

2
Q

What are the four types of data?

A

Nominal, Ordinal, Interval, Ratio

3
Q

It is sometimes referred to as the classificatory scale. This scale is used for classifying and labeling variables without quantitative value.

A

nominal

4
Q

Eye color, gender, VSU dormitories, and degree programs are examples of

A

nominal

5
Q

It possesses the characteristics of the nominal scale in that it classifies data; however, the classification has ranks. Data are shown in order of magnitude.

A

ordinal

6
Q

Educational attainment, instructor's evaluation, emotion, and organizational structure are examples of

A

ordinal

7
Q

This scale possesses the characteristics of the nominal and ordinal scales, where data are classified and ranked, and the differences between values are equal and meaningful.

A

Interval

8
Q

This scale doesn't have a true zero.

A

Interval

9
Q

This scale is a classification that describes the nature of information within the values assigned to variables.

A

Interval

10
Q

IQ, transmutation of grades, BMI and Temperature are examples of

A

interval

11
Q

This scale possesses the characteristics of nominal, ordinal, and interval scale where zero is absolute. This is the point where the quality being measured does not exist.

A

Ratio

12
Q

Age, Monthly Income, Height, Allowance are examples of

A

Ratio

13
Q

is a grouping of the data into categories showing the number of observations in each of the non-overlapping classes

A

frequency distribution

14
Q

is the data collected in its original form

A

raw data

15
Q

is the difference between the highest value and the lowest value in a distribution

A

range

16
Q

is the organization of data in a tabular form, using mutually exclusive classes showing the number of observations in each

A

frequency distribution

17
Q

are the highest and lowest values describing a class

A

class limits (apparent limits)

18
Q

are the upper and lower values of a class in a grouped frequency distribution; they have one more decimal place than the class limits and end with the digit 5.

A

class boundaries (real limits)

19
Q

is the distance between the class lower boundary and the class upper boundary and it is denoted by the symbol i.

A

interval (width)

20
Q

is the number of values in a specific class of a frequency distribution

A

frequency (f)

21
Q

is obtained by multiplying the relative frequency by 100%

A

percentage

22
Q

is the sum of the frequencies accumulated up to the upper boundary of a class in a frequency distribution

A

cumulative frequency (cf)
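
As a rough illustration of how frequency, relative frequency, percentage, and cumulative frequency relate to one another, the sketch below works through one small table. The class frequencies are made up for the example and do not come from the cards.

```python
from itertools import accumulate

# Hypothetical class frequencies for five classes (illustrative only).
frequencies = [3, 7, 10, 6, 4]
n = sum(frequencies)  # total number of observations

# Relative frequency: share of all observations falling in each class.
relative = [f / n for f in frequencies]

# Percentage: relative frequency multiplied by 100.
percentage = [r * 100 for r in relative]

# Cumulative frequency: running total of the frequencies.
cumulative = list(accumulate(frequencies))

for f, r, p, cf in zip(frequencies, relative, percentage, cumulative):
    print(f"f={f:2d}  rf={r:.3f}  %={p:5.1f}  cf={cf:2d}")
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the table.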

23
Q

is the point halfway between the class limits of each class and is representative of the data within that class

A

midpoint
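
The terms class limits, class boundaries, interval (width), and midpoint can be tied together with a small sketch. The class limits below are hypothetical and assume whole-number data, so each boundary sits half a unit beyond its limit.

```python
# Hypothetical class limits (apparent limits) for whole-number data.
class_limits = [(10, 14), (15, 19), (20, 24)]

# Gap between one class's upper limit and the next class's lower limit.
gap = class_limits[1][0] - class_limits[0][1]  # 1 for whole-number data

rows = []
for lower, upper in class_limits:
    lower_boundary = lower - gap / 2          # real lower limit, e.g. 9.5
    upper_boundary = upper + gap / 2          # real upper limit, e.g. 14.5
    width = upper_boundary - lower_boundary   # class interval i
    midpoint = (lower + upper) / 2            # halfway between the class limits
    rows.append((lower_boundary, upper_boundary, width, midpoint))
    print(lower, upper, lower_boundary, upper_boundary, width, midpoint)
```

Note that the width is measured between boundaries, not limits: the class 10 to 14 has a width of 5, not 4.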

24
Q

is used to organize nominal-level or ordinal-level data.

A

categorical frequency distribution

25
Q

gender, business type, political affiliation and others are examples of the usage of the ___________

A

categorical frequency distribution
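
A categorical frequency distribution can be sketched by counting how often each label occurs. The responses below are invented for illustration; `Counter` from the Python standard library does the tallying.

```python
from collections import Counter

# Hypothetical nominal-level responses (illustrative only).
responses = ["A", "B", "A", "O", "AB", "O", "A", "B", "O", "O"]

counts = Counter(responses)
n = len(responses)

# One row per category: frequency and percentage.
table = {cat: (f, f * 100 / n) for cat, f in counts.items()}
for cat, (f, pct) in table.items():
    print(f"{cat:>2}  f={f}  {pct:.0f}%")
```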

26
Q

is used when the range of the data set is large; the data must be grouped into classes, whether it is categorical data or interval data. For interval data, each class is more than one unit in width.

A

grouped frequency distribution

27
Q

Rule 1. To determine the number of classes, use the smallest positive integer k such that 2^k >= n, where n is the total number of observations.

A

i = range / number of classes = (HV - LV) / k
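
A minimal sketch of Rule 1, assuming a made-up data set with 50 observations, a highest value of 98, and a lowest value of 35. The helper name `number_of_classes` is hypothetical and only serves the example.

```python
def number_of_classes(n: int) -> int:
    """Return the smallest positive integer k with 2**k >= n."""
    k = 1
    while 2 ** k < n:
        k += 1
    return k

# Hypothetical data set summary (illustrative values only).
n = 50            # total number of observations
hv, lv = 98, 35   # highest and lowest values

k = number_of_classes(n)   # 2**5 = 32 < 50 but 2**6 = 64 >= 50
i = (hv - lv) / k          # class interval before any rounding
print(k, i)
```

In practice the interval is usually rounded up to a convenient number so that the classes cover the whole range.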

28
Q

Rule 2. Another way to determine the class interval is by applying the formula

A

i = range / (1 + 3.322 × (base-10 logarithm of the total frequency))
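
Rule 2 can be tried out the same way. The sketch below assumes the logarithm is base 10 and reuses hypothetical values (50 observations, highest value 98, lowest value 35); none of these numbers come from the cards.

```python
import math

n = 50            # hypothetical total frequency
hv, lv = 98, 35   # hypothetical highest and lowest values

# Denominator of the formula: the suggested number of classes.
classes = 1 + 3.322 * math.log10(n)

# Class interval before rounding to a convenient width.
i = (hv - lv) / classes
print(round(classes, 2), round(i, 2))
```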

29
Q

is a graph in which the classes are marked on the horizontal axis and the class frequencies on the vertical axis.

A

histogram

30
Q

is a graph that displays the data using points connected by lines. The frequencies are represented by the heights of the points at the midpoints of the classes. The vertical axis represents the frequency of the distribution, while the horizontal axis represents the midpoints of the frequency distribution.

A

FREQUENCY POLYGON

31
Q

is a graph that displays the cumulative frequencies for the classes in a frequency distribution. The vertical axis represents the cumulative frequency of the distribution while the horizontal axis represents the upper class boundaries of the frequency distribution.

A

CUMULATIVE FREQUENCY POLYGON (OGIVE)

32
Q

is a graph used to represent a frequency distribution for categorical (nominal-level) data; frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to lowest.

A

PARETO CHART
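
The defining feature of a Pareto chart is the highest-to-lowest ordering of the bars. A text-only sketch of that ordering, using invented categories, might look like this:

```python
from collections import Counter

# Hypothetical categorical data (illustrative only).
complaints = ["late", "damaged", "late", "wrong item", "late", "damaged"]

# most_common() returns (category, frequency) pairs sorted from highest to lowest.
ordered = Counter(complaints).most_common()

# Print one text "bar" per category, tallest first.
for category, f in ordered:
    print(f"{category:<10} {'#' * f} ({f})")
```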

33
Q

is similar to a histogram. The bases of the rectangles are arbitrary intervals whose centres are codes. The height of each rectangle represents the frequency of that category.

A

BAR CHART

34
Q

is a circle divided into portions that represent the relative frequencies (or percentages) of the data belonging to different categories.

A

PIE CHART

35
Q

represents data that occur over a specific period of time under observation. In addition, it shows a trend or pattern of increase or decrease over that period.

A

TIME SERIES GRAPH

36
Q

immediately suggests the nature of the data being shown. It is a combination of attention-getting quality and the accuracy of the bar chart. Appropriate pictures arranged in a row present the quantities for comparison.

A

PICTOGRAPH

37
Q

is used to examine possible relationships between two numerical variables. The two variables are plotted on the x-axis and the y-axis.

A

SCATTER PLOT

38
Q

is a central or typical value for a probability distribution. It may also be called a center or location of the distribution.

A

central tendency

39
Q

measures of central tendency

A

mean, median, mode

40
Q

the average of the numbers.

A

mean

41
Q

sample mean is denoted by

A

x̄ (x-bar)

42
Q

population mean is denoted by

A

μ

43
Q

is particularly useful when various classes or groups contribute differently to the total.

A

weighted mean
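
To see why the weighted mean matters when groups contribute differently, the sketch below compares it with the simple mean. The grades and unit weights are hypothetical.

```python
# Hypothetical grades and their unit weights (illustrative only).
grades = [90, 80, 70]
units = [3, 2, 5]

# Weighted mean: sum of (value * weight) divided by the sum of the weights.
weighted_mean = sum(g * w for g, w in zip(grades, units)) / sum(units)

# Simple mean for comparison: every grade counts equally.
simple_mean = sum(grades) / len(grades)
print(weighted_mean, simple_mean)
```

The weighted mean is pulled toward 70 because that grade carries the most units.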

44
Q

is the value separating the higher half of a data sample, a population, or a probability distribution, from the lower half.

A

median

45
Q

median is denoted by

A

Md

46
Q

of a set of data values is the value that appears most often.

A

Mode

47
Q

A mode may exist, but sometimes it does not (T/F)

A

True
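
The three measures of central tendency can be computed with Python's standard `statistics` module. The data sets are hypothetical, and treating "every value occurs once" as "no mode" is a convention assumed for this sketch.

```python
import statistics

# Hypothetical data sets (illustrative only).
scores = [2, 3, 3, 5, 7, 10]
all_distinct = [1, 2, 3, 4]

mean = statistics.mean(scores)        # sum divided by count
median = statistics.median(scores)    # average of the two middle values here
modes = statistics.multimode(scores)  # every value tied for most frequent

# When every value occurs exactly once, multimode returns all of them,
# which this sketch interprets as the data set having no mode.
candidates = statistics.multimode(all_distinct)
has_mode = len(candidates) < len(set(all_distinct))
print(mean, median, modes, has_mode)
```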

48
Q

Mode is denoted by

A

Mo

49
Q

Measures of dispersion, which include range, interquartile range, absolute deviation, variance, and standard deviation, are also known as

A

measures of spread or variability.

50
Q

Measures of dispersion include

A

range, variance, interquartile range, standard deviation, absolute deviation

51
Q

This is the easiest measure of dispersion. It is the difference between the highest value and the lowest value.

A

RANGE

52
Q

range is denoted as

A

R = HV - LV

53
Q

This is the expectation of the squared deviation of a random variable from its mean. It measures how far a set of numbers is spread out from its average value.

A

VARIANCE

54
Q

This is the square root of the variance. A low value indicates that the data points tend to be close to the mean.

A

Standard Deviation

55
Q

A high standard deviation indicates that the data points are spread over a wider range. (T/F)

A

True
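
The sketch below compares range, variance, and standard deviation for two hypothetical data sets that share the same mean. It uses the population formulas (`pvariance`, `pstdev`) as an assumption; for a sample, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.

```python
import statistics

# Two hypothetical data sets with the same mean of 50 (illustrative only).
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

for name, data in (("tight", tight), ("wide", wide)):
    r = max(data) - min(data)         # R = HV - LV
    var = statistics.pvariance(data)  # mean of the squared deviations
    sd = statistics.pstdev(data)      # square root of the variance
    print(name, r, var, round(sd, 2))
```

The data set whose values lie farther from the mean produces the larger value on all three measures.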

56
Q

This is the average distance of all of the elements in a data set from the mean of the same data set.

A

Absolute Deviation
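
A short worked example of the average distance from the mean, using hypothetical values:

```python
data = [2, 4, 6, 8, 10]  # hypothetical values (illustrative only)

mean = sum(data) / len(data)

# Average of the absolute distances of each element from the mean.
mean_abs_dev = sum(abs(x - mean) for x in data) / len(data)
print(mean, mean_abs_dev)
```

Taking absolute values keeps the positive and negative deviations from cancelling out; without them the sum of deviations from the mean is always zero.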

57
Q

It is sometimes referred to as a measure of location. It is considered an extension of the median. It describes the position of a value relative to the other values in the data set.

A

Measures of relative position

58
Q

Measures of relative position include

A

quartile, percentile, z-scores, box-and-whisker plot

59
Q

This measure divides the observations into four equal parts.

A

Quartile

60
Q

The lower and upper quartile values help us find a measure of dispersion in the set of observations, which is called the

A

interquartile range

61
Q

Interquartile range is denoted as

A

IQR (difference between the upper and lower quartiles): IQR = Q3 - Q1
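
Quartiles and the interquartile range can be sketched with `statistics.quantiles`. The data are hypothetical, and the `"inclusive"` method is one of several quartile conventions; textbooks that use a different convention may give slightly different Q1 and Q3 values.

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]  # hypothetical values

# quantiles(n=4) returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1
print(q1, q2, q3, iqr)
```

Passing `n=100` instead of `n=4` returns the 99 percentile cut points in the same way.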

62
Q

This divides the observations into 100 equal parts.

A

Percentile

63
Q

This indicates how many standard deviations an element is from the mean. The positive and negative signs indicate the direction of the point away from the mean.

A

Z-scores or standard scores

64
Q

z-scores are denoted as

A

Z

65
Q

A z-score less than 0 represents an element less than the mean.
A z-score greater than 0 represents an element greater than the mean.
A z-score equal to 0 represents an element equal to the mean.

A

TRUE

66
Q

A z-score equal to 1 represents an element that is 1 standard deviation greater than the mean; a z-score equal to 2, 2 standard deviations greater than the mean; etc.

A z-score equal to -1 represents an element that is 1 standard deviation less than the mean; a z-score equal to -2, 2 standard deviations less than the mean; etc.

A

TRUE
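
The sign and size of a z-score can be checked with a small example. The scores are hypothetical, and the population standard deviation is used here as an assumption.

```python
import statistics

data = [60, 70, 80, 90, 100]  # hypothetical scores (illustrative only)

mean = statistics.mean(data)
sd = statistics.pstdev(data)  # population standard deviation (an assumption)

# z = (x - mean) / standard deviation
z_scores = [(x - mean) / sd for x in data]
for x, z in zip(data, z_scores):
    print(x, round(z, 2))
```

Values below the mean get negative z-scores, the value equal to the mean gets zero, and values above the mean get positive z-scores.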

67
Q

It is a graph of a data set obtained by drawing a horizontal line from the minimum data value to the first quartile, drawing a horizontal line from the third quartile to the maximum value, and drawing a box whose vertical sides pass through Q1 and Q3, with a vertical line inside the box passing through the median (second quartile).

A

Box-and-Whisker Plot