Fundamentals of Data Science Flashcards

1
Q

A data scientist must find patterns within the data. Before they can find the patterns, they must organize the data in a standard format. What are the eight steps?

A
  1. Ask the right questions
  2. Explore and collect data
  3. Extract the data
  4. Clean the data
  5. Find and replace the missing values
  6. Normalize data
  7. Analyze data, find patterns and make future predictions
  8. Present the result
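
As a rough illustration of steps 4 to 6 above, here is a minimal pandas sketch. The sample values, the column names, and the choices of dropping duplicates, filling gaps with the column mean, and min-max scaling are all assumptions made for the example; they are one possible way to carry out these steps, not the only one.

```python
import pandas as pd

# Hypothetical sample data; names and values are made up for illustration.
raw = pd.DataFrame({
    "Duration": [30, 45, 45, 60, 60],
    "Average_pulse": [80.0, 85.0, None, 95.0, 95.0],
})

# Step 4: clean the data (here: drop exact duplicate rows).
clean = raw.drop_duplicates().reset_index(drop=True)

# Step 5: find and replace missing values (here: with the column mean).
filled = clean.fillna(clean.mean())

# Step 6: normalize each column to the range 0..1 (min-max scaling).
normalized = (filled - filled.min()) / (filled.max() - filled.min())

# Step 7 would start from here, for example with summary statistics.
print(normalized.describe())
```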
2
Q

What is a data frame?

A

A structured representation of data, organized in rows and columns.

3
Q

What is a variable?

A

Something that can be measured or counted.

4
Q

How do you create a data frame with pandas?

A

import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}

df = pd.DataFrame(data=d)

print(df)

5
Q

How do you find the number of columns in a pandas data frame?

A

df.shape[1]

import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}

df = pd.DataFrame(data=d)
count_column = df.shape[1]

print("Number of columns:")
print(count_column)

6
Q

How do you find the number of rows in a pandas data frame?

A

df.shape[0]

import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}

df = pd.DataFrame(data=d)
count_row = df.shape[0]

print("Number of rows:")
print(count_row)
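
Since df.shape is a (rows, columns) tuple, both counts can be read in one step. The short sketch below reuses the same sample data; the variable names rows, columns, row_count and column_count are chosen for this example only.

```python
import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11]}
df = pd.DataFrame(data=d)

# df.shape is a tuple: (number of rows, number of columns).
rows, columns = df.shape
print("Rows:", rows)
print("Columns:", columns)

# Equivalent alternatives that do not index into the tuple.
row_count = len(df)
column_count = len(df.columns)
```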

7
Q

What Python function finds the highest value in an array?

A

max()

Average_pulse_max = max(80, 85, 90, 95, 100, 105, 110, 115, 120, 125)

print(Average_pulse_max)

8
Q

What Python function finds the lowest value in an array?

A

min()

Average_pulse_min = min(80, 85, 90, 95, 100, 105, 110, 115, 120, 125)

print(Average_pulse_min)
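
The calls above pass the numbers as separate arguments. When the values are already stored in a list or a NumPy array, max() and min() also accept that single sequence, as in this small sketch (the variable names are chosen for illustration):

```python
import numpy as np

average_pulse = [80, 85, 90, 95, 100, 105, 110, 115, 120, 125]

# Built-in max() and min() accept any iterable, such as a list.
highest = max(average_pulse)
lowest = min(average_pulse)

# NumPy arrays provide the same operations as methods.
pulse_array = np.array(average_pulse)
np_highest = pulse_array.max()
np_lowest = pulse_array.min()

print(highest, lowest)
```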

9
Q

What NumPy function is used to find the average value of an array?

A

mean()

import numpy as np

Calorie_burnage = [240, 250, 260, 270, 280, 290, 300, 310, 320, 330]

Average_calorie_burnage = np.mean(Calorie_burnage)

print(Average_calorie_burnage)

10
Q

What needs to happen before the data can be analyzed?

A

It must be imported/extracted.

11
Q

How do you import data using Pandas in Python?

A

read_csv()

import pandas as pd

health_data = pd.read_csv("data.csv", header=0, sep=",")

print(health_data)
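
The example above needs a file named data.csv on disk. To try read_csv() without one, the sketch below reads the same kind of data from an in-memory string. The CSV contents and column names here are made up for illustration and are not the contents of the original data.csv.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
csv_text = (
    "Duration,Average_pulse,Calorie_burnage\n"
    "30,80,240\n"
    "45,85,250\n"
    "60,90,260\n"
)

# header=0 uses the first line as column names; sep="," sets the delimiter.
health_data = pd.read_csv(io.StringIO(csv_text), header=0, sep=",")

print(health_data.head())   # first rows of the data frame
print(health_data.shape)    # (rows, columns)
```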
