Course-4 Process data from dirty to clean Flashcards

Question

Dirty data

Answer 1

Data that is incomplete, incorrect, or irrelevant to the problem you're trying to solve.

Answer 2

Data that is complete, correct, and relevant to the problem you're trying to solve.

Answer 3

Transform data into a useful format for analysis and give it a reliable infrastructure

Answer 4

Develop processes and procedures to effectively store and organise data.

Answer 5

An indication that a value does not exist in a dataset.

Answer 6

A single piece of information from a row or column of a spreadsheet

Answer 7

A tool for determining how many characters can be keyed into a field

Answer 8

A tool for checking the accuracy and quality of data before adding or importing it.

Answer 9

The concept of using data integrity principles to ensure measures conform to defined business rules or constraints.

Answer 10

Data collected five years ago used technology that is not approved or supported by the business.

Answer 11

The degree of conformity of a measure to a standard or a true value.

Answer 12

Addresses in the business database are identified as incorrect when compared to the public postal service database.

Answer 13

The degree to which all required measures are known.

Answer 14

Null/missing value for the item number of employees per store.

Answer 15

The degree to which a set of measures is equivalent across systems.

Answer 16

Date of store opening stored in both MM/DD/YYYY and MM/YY formats.

Answer 17

An agreement that unites two organisations into a single new one.

Answer 18

The process of combining two or more datasets into a single dataset.

Answer 19

How well two or more datasets are able to work together.

Answer 20

- Do I have all the data I need?
- Does the data I need exist within these datasets?
- Do the datasets need to be cleaned, or are they ready for me to use?
- Are the datasets cleaned to the same standard?

Answer 21

The user converts the data from the current long format (more rows than columns) to the wide format (more columns than rows).
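
The long-to-wide conversion described above can be sketched in Python. This is a minimal illustration, assuming hypothetical store sales records; the names `long_rows` and `wide` are made up for the example.

```python
# Long format: one row per (store, year) pair. Sample data is hypothetical.
long_rows = [
    ("Store A", "2021", 100),
    ("Store A", "2022", 120),
    ("Store B", "2021", 90),
    ("Store B", "2022", 95),
]

# Wide format: one entry per store, with one "column" per year.
wide = {}
for store, year, sales in long_rows:
    wide.setdefault(store, {})[year] = sales
```

Four long rows become two wide rows, each holding a value for 2021 and for 2022.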

Answer 22

A spreadsheet tool that changes how cells appear when values meet specific conditions.

Answer 23

A tool that automatically searches for and eliminates duplicate entries from a spreadsheet.
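
Outside a spreadsheet, the same idea can be approximated in a few lines of Python. This is a sketch under the assumption that a duplicate means an identical row; the sample rows are hypothetical.

```python
# Each tuple is one spreadsheet row (hypothetical data).
rows = [
    ("Jane", "jane@example.com"),
    ("Raj", "raj@example.com"),
    ("Jane", "jane@example.com"),  # exact duplicate of the first row
]

# dict.fromkeys keeps the first occurrence of each row and preserves order.
unique_rows = list(dict.fromkeys(rows))
```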

Answer 24

A group of characters within a cell, most often composed of letters, numbers or both.

Answer 25

A tool that divides text around a specified character and puts each fragment into a new or separate cell.
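
Splitting text around a delimiter can be illustrated with Python's `str.split`. This is a minimal sketch, assuming comma-delimited cells; the sample values are hypothetical.

```python
# Each string stands for one cell holding several comma-separated values.
cells = ["Jane,Doe,Chicago", "Raj,Patel,Austin"]

# Split each cell around the comma so every fragment gets its own "column".
split_cells = [cell.split(",") for cell in cells]
```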

Answer 26

A function that joins multiple text strings into a single string

Answer 27

A set of instructions that performs a specific calculation using the data in a spreadsheet

Answer 28

A function that returns the number of cells that match a specified value

Answer 29

A predetermined structure that includes all required information and its proper placement

Answer 30

=COUNTIF(range, "value")
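
For comparison, the counting that COUNTIF performs can be sketched in Python. The helper name `countif` and the sample column are hypothetical. Note that this sketch matches text case-sensitively, whereas spreadsheet COUNTIF generally ignores case.

```python
def countif(cells, value):
    """Count how many cells are equal to `value`."""
    return sum(1 for cell in cells if cell == value)

# Hypothetical membership column with one blank cell.
membership = ["Pro", "Basic", "Pro", "", "Pro"]
pro_count = countif(membership, "Pro")
```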

Answer 31

A function that tells you the length of a text string by counting the number of characters it contains.

Answer 32

=LEN(range)
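
A length check like this is often used to spot malformed entries. The sketch below uses Python's built-in `len` and assumes, purely for illustration, that every member ID should be five characters long.

```python
# Hypothetical IDs; the five-character rule is an assumption for this example.
member_ids = ["A1234", "B5678", "C90"]

lengths = [len(member_id) for member_id in member_ids]
wrong_length = [member_id for member_id in member_ids if len(member_id) != 5]
```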

Answer 33

A function that gives you a set number of characters from the left side of the text string.

Answer 34

A function that gives you a set number of characters from the right side of a text string.

Answer 35

=LEFT(range, number of characters)

Answer 36

=RIGHT(range, number of characters)
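
LEFT and RIGHT correspond to string slicing in Python. This is a minimal sketch; the helper names `left` and `right` and the sample code are hypothetical.

```python
def left(text, count):
    """Return the first `count` characters of `text`."""
    return text[:count]

def right(text, count):
    """Return the last `count` characters of `text`."""
    # text[-0:] would return the whole string, so handle zero explicitly.
    return text[-count:] if count > 0 else ""

location_code = "NY-10027"  # hypothetical state-plus-ZIP code
state = left(location_code, 2)
zip_code = right(location_code, 5)
```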

Answer 37

A function that gives you a segment from the middle of a text string.

Answer 38

=MID(range, reference starting point, number of middle characters)
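
MID can be sketched the same way. Spreadsheet positions start at 1 while Python indexes from 0, so the hypothetical helper `mid` below shifts the starting point by one.

```python
def mid(text, start, count):
    """Return `count` characters of `text`, beginning at 1-based position `start`."""
    begin = start - 1  # convert the 1-based spreadsheet position to a 0-based index
    return text[begin:begin + count]

order_code = "NY-10027-A"  # hypothetical code with a ZIP in the middle
zip_part = mid(order_code, 4, 5)
```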

Answer 39

=CONCATENATE(item 1, item 2)

Answer 40

A function that removes leading, trailing, and repeated spaces in data.

Answer 41

=TRIM(range)
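
The effect of TRIM can be approximated in Python. This is a sketch, not the spreadsheet's implementation: the hypothetical helper `trim` removes leading and trailing spaces and collapses repeated spaces into one.

```python
def trim(text):
    """Drop leading/trailing spaces and collapse runs of spaces to a single space."""
    # Splitting on a single space yields empty strings for extra spaces; skip them.
    return " ".join(part for part in text.split(" ") if part)

cleaned_name = trim("  Jane   Doe ")
```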

Answer 42

Arranging data into a meaningful order makes it easier to understand, analyze, and visualise.

Answer 43

Showing only the data that meets specific criteria while hiding the rest.

Answer 44

A data summarization tool that is used in data processing.

Answer 45

Vertical Lookup

Answer 46

A function that searches for a particular value in a column to return a corresponding piece of information.

Answer 47

=VLOOKUP(data to look up, 'where to look'!range, column, FALSE)
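
An exact-match lookup (the FALSE case above) can be sketched in Python. The helper name `vlookup` and the price table are hypothetical. Where a spreadsheet would show an error for a missing value, this sketch returns `None`.

```python
def vlookup(lookup_value, table, column):
    """Find the first row whose first cell equals `lookup_value` and
    return the cell in the 1-based `column`; return None if no row matches."""
    for row in table:
        if row[0] == lookup_value:
            return row[column - 1]
    return None

# Hypothetical lookup table: product ID, product name, price.
prices = [
    ["P-100", "Couch", 499],
    ["P-200", "Table", 150],
]
table_price = vlookup("P-200", prices, 3)
missing = vlookup("P-999", prices, 3)
```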

Answer 48

The process of matching fields from one data source to another.

Answer 49

A way of describing how something is organised

Answer 50

- Data validation
- Conditional formatting
- COUNTIF
- Sorting
- Filtering

Answer 51

- Different data cleaning functions in spreadsheets and SQL
- How SQL can be used to clean large datasets
- Apply basic SQL functions for transforming data and cleaning strings

Answer 52

Spreadsheets
- Generated with a program
- Access to the data you input
- Stored locally
- Small datasets
- Working independently
- Built-in functionalities

SQL
- A language used to interact with database programs
- Can pull information from different sources in the database
- Stored across a database
- Larger datasets
- Tracks changes across the team
- Useful across multiple programs

Answer 53

Can be used to convert anything from one data type to another
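
A type conversion of this kind can be tried with SQL's CAST in an in-memory SQLite database driven from Python. This is a minimal sketch; the literal values are hypothetical, and other database systems may use different type names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Convert text values to a decimal number and a whole number.
price, quantity = conn.execute(
    "SELECT CAST('19.99' AS REAL), CAST('42' AS INTEGER)"
).fetchone()
conn.close()
```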

Answer 54

A number that contains a decimal

Answer 55

Converting data from one type to another

Answer 56

Adds strings together to create new text strings that can be used as unique keys
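
Building a key from two columns can be sketched with SQLite from Python. This example uses SQLite's `||` string operator as a stand-in for a concatenation function, since function availability varies between database systems. The table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product_code TEXT, product_color TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("C-01", "grey"), ("C-01", "blue")],
)

# Join code and color so rows sharing a product code still get distinct keys.
new_ids = [
    row[0]
    for row in conn.execute(
        "SELECT product_code || '-' || product_color FROM orders ORDER BY rowid"
    )
]
conn.close()
```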

Answer 57

Can be used to return non-null values in a list.
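
This describes SQL's COALESCE, which returns the first non-null value among its arguments. A minimal sketch using SQLite from Python follows; the `products` table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product TEXT, product_code TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [("Couch", "C-01"), (None, "T-02")],  # second row has no product name
)

# Fall back to the product code when the product name is null.
labels = [
    row[0]
    for row in conn.execute(
        "SELECT COALESCE(product, product_code) FROM products ORDER BY product_code"
    )
]
conn.close()
```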

Answer 58

A process to confirm that a data-cleaning effort was well-executed and the resulting data is accurate and reliable.

Answer 59

A file containing a chronologically ordered list of modifications made to a project.

Answer 60

1) Consider the business problem
2) Consider the goal
3) Consider the data

Answer 61

A tool that automatically searches for and eliminates duplicate entries from a spreadsheet.

Answer 62

A tool that looks for a specified search term in a spreadsheet and allows you to replace it with something else.

Answer 63

A function that counts the total number of values within a specified range

Answer 64

The CASE statement goes through one or more conditions and returns a value as soon as a condition is met.
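
One cleaning use of CASE is correcting a known misspelling. The sketch below runs the query against an in-memory SQLite database from Python; the `customers` table and the misspelled name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (first_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Tnoy",), ("Tony",), ("Ana",)],
)

# Replace the misspelling; every other name falls through to ELSE unchanged.
query = """
SELECT CASE
         WHEN first_name = 'Tnoy' THEN 'Tony'
         ELSE first_name
       END AS cleaned_name
FROM customers
ORDER BY rowid
"""
cleaned_names = [row[0] for row in conn.execute(query)]
conn.close()
```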

Answer 65

The process of tracking changes, additions, deletions, and errors involved in your data-cleaning effort.

Answer 66

The process of tracking changes, additions, deletions, and errors involved in your data-cleaning effort.

Answer 67

- Recover data-cleaning errors
- Inform other users of changes
- Determine the quality of data

Answer 68

Problem, Action, Result

Answer 69

Skills and qualities that can transfer from one job or industry to another

Answer 70

- Problem: Previously absent workflow procedures.
- Action: Implemented and communicated daily workflow procedures.
- Result: 15% increase in productivity.

Answer 71

Non-technical traits and behaviors that relate to how you work.

Answer 72

- Healthcare analyst
- Marketing analyst
- Business intelligence analyst
- Financial analyst

(97 cards)