Unit 9 Flashcards

1
Q

what is a fact when looking at visualizations?

A

what the data shows

2
Q

what is an opinion when looking at visualizations?

A

why the fact might be the case

3
Q

what should we be careful about when making assumptions?

A

correlation does not equal causation

4
Q

what is metadata?

A

data about data

5
Q

what happens to the primary data when metadata is changed?

A

nothing; metadata can be changed without impacting the primary data
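
To make this concrete, here is a minimal sketch. The `photo` record and its field names are hypothetical, made up only for illustration: the pixel values stand in for the primary data, and the title and date stand in for the metadata.

```python
# Hypothetical record: "pixels" is the primary data, "metadata" describes it.
photo = {
    "pixels": [0, 255, 128, 64],
    "metadata": {"title": "untitled", "date": "2024-01-01"},
}

# Keep a copy of the primary data so we can compare afterwards.
pixels_before = list(photo["pixels"])

# Change only the metadata.
photo["metadata"]["title"] = "Sunset at the lake"

# The primary data is exactly what it was before the metadata changed.
pixels_unchanged = photo["pixels"] == pixels_before
print(pixels_unchanged)
```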

6
Q

what can the metadata be used for?

A

finding, organizing, and managing information

7
Q

what does metadata increase?

A

the effective use of data, by providing extra information about it

8
Q

what does metadata allow the data to be?

A

structured and organized

9
Q

how to create a bar chart

A

count how many times each value in the column appears and make a bar at that height
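
The counting step above can be sketched in a few lines of Python. This is only an illustration: the `favorite_season` column is made-up data, and the bars are drawn as text instead of with a charting tool.

```python
from collections import Counter

# Hypothetical column of values (made up for illustration).
favorite_season = ["spring", "fall", "summer", "fall", "winter", "fall", "summer"]

# Count how many times each value appears in the column.
counts = Counter(favorite_season)

# Draw one text "bar" per value; the bar length equals the count.
for value, count in counts.most_common():
    print(f"{value:<8} {'#' * count} ({count})")
```

The longest bar is the most common value, the shortest bars are the least common, and the set of bar labels is the unique list of values in the column.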

10
Q

information we can get out of bar charts

A

what values are the most common in this column
what values are the least common in this column
what is the unique list of values in this column

11
Q

what happens when all the values in a column are unique?

A

a bar chart of that column is not useful, because every bar has a height of 1

12
Q

how to create a histogram

A

similar to a bar chart, but all numbers in a bucket are grouped together
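
A rough sketch of the bucketing idea, assuming a made-up `ages` column and a bucket width of 10 (both are illustrative choices, not part of the course material):

```python
from collections import Counter

# Hypothetical numeric column, made up for illustration.
ages = [3, 7, 12, 15, 18, 21, 22, 25, 34, 38, 41]
bucket_width = 10  # assumed bucket size

# Map each number to the start of its bucket, e.g. 15 -> 10 and 22 -> 20,
# then count how many numbers land in each bucket.
buckets = Counter((age // bucket_width) * bucket_width for age in ages)

# Draw one text bar per bucket, in numeric order.
for start in sorted(buckets):
    end = start + bucket_width - 1
    print(f"{start:>2}-{end:<2} {'#' * buckets[start]}")
```

A plain bar chart of the same column would have eleven bars of height 1; grouping into buckets is what makes the shape of the data visible.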

13
Q

when can histograms be created?

A

only with numeric data

14
Q

when is a histogram useful?

A

when a normal bar chart may be difficult to read, such as numeric data with many different values

15
Q

information we can get out of histograms?

A

which ranges of values are the most common in this column
which ranges of values are the least common in this column
which ranges of values do or do not appear in this column

16
Q

when does data need to be cleaned?

A

data is incomplete
data is invalid
multiple tables are combined into one

17
Q

what leads to messy data

A

users enter different types of data - “two” or 2
users use different abbreviations for some info - “February” or “Feb”
data may have different spellings - “colour” or “color”
data has inconsistent capitalization - “spring” or “Spring”
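
One possible way to clean up the kinds of mess listed above is sketched below. The lookup tables and the function names `clean_number` and `clean_word` are hypothetical, and a real data set would need its own, longer tables.

```python
# Hypothetical lookup tables; extend them for a real data set.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}
ABBREVIATIONS = {"feb": "february"}
SPELLINGS = {"colour": "color"}

def clean_number(raw):
    """Turn "two", "2", or 2 into the integer 2."""
    text = str(raw).strip().lower()
    if text in NUMBER_WORDS:
        return NUMBER_WORDS[text]
    return int(text)

def clean_word(raw):
    """Lowercase a word, then apply known abbreviation and spelling fixes."""
    text = raw.strip().lower()
    text = ABBREVIATIONS.get(text, text)
    return SPELLINGS.get(text, text)

print(clean_number("two"), clean_number(2))
print(clean_word("Feb"), clean_word("February"))
print(clean_word("colour"), clean_word("Spring"))
```

After cleaning, “two” and 2 count as the same value, as do “Feb” and “February”, so a bar chart of the column no longer splits them into separate bars.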

18
Q

what does filtering data allow for?

A

allows the user to look at a subset of the data
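
As a small illustration, filtering can be written as keeping only the rows that match a condition. The table below is made-up data, and the column names are assumptions chosen for the example.

```python
# Hypothetical table: each dict is one row.
rows = [
    {"name": "Ada", "grade": 9, "season": "fall"},
    {"name": "Ben", "grade": 10, "season": "spring"},
    {"name": "Cy", "grade": 10, "season": "fall"},
    {"name": "Di", "grade": 9, "season": "summer"},
]

# Keep only the rows where the season column equals "fall".
fall_rows = [row for row in rows if row["season"] == "fall"]

for row in fall_rows:
    print(row["name"], row["grade"])
```

The original `rows` list is left untouched; the filter only produces a smaller view of it.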

19
Q

when are bar charts and histograms useful?

A

when looking at one column of data

20
Q

ways to visualize data that look at two columns of data at the same time

A

crosstab chart
scatterplot

21
Q

crosstab chart

A

counts how many times combinations of values appear
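
A minimal sketch of that counting, using two made-up columns (`grades` and `seasons` are hypothetical names and values):

```python
from collections import Counter

# Two hypothetical columns from the same table; position i is row i.
grades = [9, 9, 10, 10, 10, 9]
seasons = ["fall", "spring", "fall", "fall", "spring", "fall"]

# Count how many times each (grade, season) combination appears.
crosstab = Counter(zip(grades, seasons))

# Print the counts as a small grid: one row per grade, one column per season.
row_values = sorted(set(grades))
col_values = sorted(set(seasons))
print("grade", *col_values)
for grade in row_values:
    print(grade, *(crosstab[(grade, season)] for season in col_values))
```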

22
Q

scatterplot

A

useful for seeing patterns and trends between two columns, especially numeric data with lots of different values
not useful for data with lots of repeated values

23
Q

what can we take away from manipulating and visualizing data?

A

develop insights and knowledge about our world by finding patterns

24
Q

what can we see when investigating two columns of data?

A

we can observe patterns in how different values move together (how they are correlated), but we cannot know the cause of the correlation
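
One common way to put a number on "moving together" is the Pearson correlation coefficient, sketched below. The cards do not name a specific measure, so this choice is an assumption, and the two columns are made-up data.

```python
from math import sqrt

# Hypothetical columns, made up for illustration.
hours_studied = [1, 2, 3, 4, 5]
test_score = [52, 60, 71, 80, 88]

def pearson(xs, ys):
    """Pearson correlation: near +1 or -1 means the columns move together."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(hours_studied, test_score)
print(round(r, 3))
```

A value of `r` close to 1 says the two columns rise together. It does not say that one causes the other; both could be driven by something that is not in the data set.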

25
Q

what is open data?

A

sharing data with others so that they can analyze it
publicly available data shared by government, organizations, and others

26
Q

how is making data open useful?

A

helps spread useful knowledge or creates opportunities for others to use it to solve problems

27
Q

what is citizen science and crowdsourcing

A

collecting data from others so you can analyze it
examples of how human capabilities can be enhanced by collaboration via computing

28
Q

crowdsourcing

A

practice of obtaining input or information from large numbers of people via the internet

29
Q

what does crowdsourcing offer?

A

new models for collaborations, such as connecting businesses or social causes with funding

30
Q

citizen science

A

research where some of the data collection is done by members of the public using their own computing devices, which helps solve scientific problems

31
Q

what is big data?

A

huge amounts of data, collected so that we can learn even more from it

32
Q

what does the size of datasets analyzed impact? as a result what happens?

A

how much information can be extracted
people are working with increasingly big data sets in many contexts like business and science

33
Q

cloud computing

A

parallel systems
used when the data gets too big to be processed on one computer, so the work of processing it is spread across many computers

34
Q

what is important to consider when working with big data?

A

scalability: you want your systems to keep working even as you use more and more data

35
Q

what are bar charts and histograms good for knowing?

A

what is in your data set

36
Q

what are crosstab charts and scatterplots good for knowing?

A

finding relationships and patterns across different columns