Module 2 - Getting Started with Data Gathering and Investigation Flashcards

1
Q

A collection of data is referred to as a what?

A

A dataset

2
Q

These may exist for the private use of an individual organization or be shared across the internet with anyone who wants to reference them. What are they?

A

Datasets

3
Q

Give an example of a private dataset.

A

An example of a private dataset is a physician’s patient dataset, which might include patient demographics, test results, diagnosis, and appointment schedules.

4
Q

Give an example of a public dataset.

A

The World Health Organization (WHO) open data repository, which contains health-related statistics for its 194 member countries and can be downloaded by anyone.

5
Q

Datasets often contain multiple what?

A

related files stored in different formats.

6
Q

Information about a dataset, including a description of what it contains and how it is formatted, is called?

A

metadata

7
Q

These are valuable tools that give analysts an understanding of the data within the dataset. What are they?

A

Metadata files

8
Q

What is the Comma-Separated Values (CSV) format?

A

One of the most common formats used to package and exchange data.
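
To make the format concrete, here is a minimal sketch of reading CSV text with Python's standard `csv` module. The sample text, its column names, and the helper `read_rows` are assumptions made up for illustration; they do not come from the course material.

```python
import csv
import io

# Hypothetical CSV text: the first line is the header (variable names),
# and each following line is one record with comma-separated values.
CSV_TEXT = """name,species,weight_kg
Rex,dog,15
Milo,cat,4
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_rows(CSV_TEXT)
for row in rows:
    # Every value is read as a string; convert numbers explicitly if needed.
    print(row["name"], row["species"], row["weight_kg"])
```

Because CSV is plain text, the same file can be opened in Excel, a text editor, or most analysis tools.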

9
Q

What are the two basic ways to perform calculations in Excel?

A

functions and formulas

10
Q

What are the basic operators used in Excel formulas?

A

+, -, *, /

11
Q

What are the five basic functions in Excel?

A

SUM, AVERAGE, COUNT, MAX, and MIN.

12
Q

This function adds the numeric values in the referenced cells

A

SUM

13
Q

This function averages the numeric values in the referenced cells

A

AVERAGE

14
Q

This function returns the highest numeric value in the set of referenced cells

A

MAX

15
Q

This function returns the lowest numeric value in the set of referenced cells

A

MIN
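
As a rough illustration of what the five basic functions compute, the sketch below mirrors them in Python. The cell values and the range `A1:A6` are hypothetical, and the code only approximates Excel's behavior for a referenced range (text and empty cells are skipped); it is not a description of how Excel is implemented.

```python
# Hypothetical worksheet cells A1:A6; None stands in for an empty cell.
cells = [12, 7.5, "n/a", None, 30, 3]

# For a referenced range, these Excel functions ignore text and empty cells,
# so keep only the numeric entries before calculating.
numbers = [c for c in cells if isinstance(c, (int, float))]

total = sum(numbers)       # like =SUM(A1:A6)
count = len(numbers)       # like =COUNT(A1:A6), counts numeric cells only
average = total / count    # like =AVERAGE(A1:A6)
highest = max(numbers)     # like =MAX(A1:A6)
lowest = min(numbers)      # like =MIN(A1:A6)

print(total, count, average, highest, lowest)
```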

16
Q

When performing any kind of data experiment or analysis, it is critical to define the key characteristics that need to be measured or observed. These characteristics to be studied are called?

A

variables

17
Q

The recordings of the values, patterns, and occurrences for a set of variables are?

A

observations

18
Q

The value or set of values for a specific observation is called what?

A

data point.

19
Q

The collection of what makes up the dataset for your analysis?

A

observations

20
Q

Observations usually have a purpose, and the variables included depend on their relevance to that purpose. For example, if you have lost your pet dog and have asked other people to help you search for it, only a small set of variables (the dog’s characteristics) is relevant to their observations. What questions might they ask?

A
  • What type of animal is your pet? It is a dog.
  • What type of dog? It is a Schnauzer.
  • What color is your Schnauzer? It is gray.
  • What size is the Schnauzer? It is a medium sized Schnauzer.
  • How much does the Schnauzer weigh? It weighs 15 kg.
21
Q

Differentiate between observations and variables.

A

In data analysis, “observation” and “variable” are two fundamental concepts that are often used to describe the components of a dataset and the process of analyzing that data.

Observation:
An observation refers to a single unit of data or a single data point within a dataset. It represents a specific instance or case that has been recorded. Observations can be people, animals, events, measurements, or any other entities that are being studied or measured. For example, if you’re analyzing data about students in a classroom, each student’s data would represent a separate observation.

Variable:
A variable is a characteristic, attribute, or quantity that can be measured or observed and can vary among different observations. Variables can take on different values for different observations. Variables are the properties you are interested in studying within your dataset. They can be of different types, such as numerical (quantitative) or categorical (qualitative).

Numerical variables: These are variables that represent quantities and can be measured on a numeric scale. Examples include age, height, weight, temperature, and test scores.

Categorical variables: These are variables that represent categories or groups. Examples include gender, color, type of car, or educational level.

In the context of data analysis, you typically work with datasets that consist of multiple observations, each having values for various variables. The goal of analysis is often to explore relationships, patterns, and trends among the variables across different observations. This helps researchers and analysts draw meaningful insights and conclusions from the data.

In summary, observations are the individual data points or cases within a dataset, while variables are the characteristics or attributes that you’re interested in analyzing within those observations.
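
The classroom example above can be sketched in Python, where each dictionary is one observation and each key is a variable. The student names and values are invented for illustration only.

```python
# Hypothetical dataset: each dictionary is one observation (one student),
# and each key is a variable recorded for that observation.
students = [
    {"name": "Ana", "age": 14, "grade_level": "9th"},
    {"name": "Ben", "age": 15, "grade_level": "10th"},
    {"name": "Cy", "age": 14, "grade_level": "9th"},
]

# The variables are the characteristics recorded for every observation.
variables = list(students[0].keys())

# One observation is a single student's record.
observation = students[1]

# A data point is the value of one variable for one observation.
data_point = observation["age"]

# Looking at one variable across all observations gives a column of values.
ages = [student["age"] for student in students]

print(variables, data_point, ages)
```

Here `age` is a numerical variable, while `grade_level` is categorical.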

22
Q

Variables will be one of which two types?

A

categorical or numerical.

23
Q

These variables indicate membership in a particular group and have a discrete or specific qualitative value.

A

Categorical variables

24
Q

What are the two types of categorical variables?

A

Nominal and Ordinal

25
Q

These are variables that consist of two or more discrete categories whose value is assigned based on the identity of the object. Examples are gender, eye color or type of animal.

A

Nominal

26
Q

These are variables that consist of two or more categories in which order matters in the value. Examples are student class rank (1st, 2nd, 3rd) or satisfaction survey scales (dissatisfied, neutral, satisfied).

A

Ordinal
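
A short sketch of the practical difference between the two categorical types: ordinal values can be sorted by rank, while nominal values can only be grouped and counted. The survey responses and eye colors below are made-up sample data.

```python
from collections import Counter

# Ordinal scale from the satisfaction example: list position encodes the order.
SATISFACTION_ORDER = ["dissatisfied", "neutral", "satisfied"]

# Hypothetical survey responses.
responses = ["satisfied", "dissatisfied", "neutral", "satisfied"]

# Ordinal values can be sorted meaningfully by their rank on the scale.
ranked = sorted(responses, key=SATISFACTION_ORDER.index)

# Nominal values have no inherent order, so counting per category is the
# typical summary; sorting them would only be alphabetical, not meaningful.
eye_colors = ["brown", "blue", "brown", "green"]
color_counts = Counter(eye_colors)

print(ranked)
print(color_counts)
```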

27
Q

What are numerical variables?

A

Variables that hold numerical (quantitative) values.

28
Q

What are the two types of numerical variables?

A

Continuous and Discrete

29
Q

These are variables that are quantitative and can be measured along a continuum or range of values.

A

Continuous

30
Q

These numerical variables are quantitative but take a specific value from a finite set of values. Examples include the number of sensors activated in a network, or the number of cars in a lot.

A

Discrete

31
Q

What are the two types of continuous variables?

A

Interval Variables and Ratio Variables

32
Q

These variables can have any value within a range of values. Examples are temperature and time.

A

Interval variables

33
Q

These variables are special interval variables where a value of zero (0) means that there is none of that variable. Examples are income and sales volume.

A

Ratio variables
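
One way to see why a true zero matters is to check whether "twice as much" survives a change of units. The sketch below goes slightly beyond the cards and uses made-up temperatures and incomes; the helper functions are written only for this illustration.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def dollars_to_cents(dollars):
    """Convert an amount from dollars to cents."""
    return dollars * 100


# Interval variable: zero degrees does not mean "no temperature", so the
# ratio between two readings changes when the unit changes.
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Ratio variable: zero income means no income, so the ratio between two
# incomes is the same in any unit.
ratio_dollars = 2000 / 1000
ratio_cents = dollars_to_cents(2000) / dollars_to_cents(1000)

print(ratio_celsius, ratio_fahrenheit)
print(ratio_dollars, ratio_cents)
```

The temperature ratios disagree between units while the income ratios match, which is why statements such as "twice as much" are only safe for ratio variables.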

34
Q
A