Introduction to Data Analysis using Python Flashcards

This section covers base Python, list and dict comprehensions, pandas, and basic data manipulation.

1
Q

What is data analysis in Python?

A

The process of inspecting, cleaning, and modeling data to extract insights.

Python supports this with built-in functions and with libraries such as pandas and NumPy.

2
Q

Name two built-in Python functions useful for data analysis.

A
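  • sum()
  • len()

Other useful built-ins: min(), max(), sorted(). Note that count() is not a built-in function; it is a method on lists, tuples, and strings, e.g. [1, 2, 2].count(2).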
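
As a minimal sketch of these functions working together, the snippet below summarizes a small list. The scores list is hypothetical sample data, not something from the cards.

```python
# Hypothetical sample data, used only for illustration.
scores = [72, 88, 95, 88, 61]

total = sum(scores)                      # add up all values
n = len(scores)                          # number of values
average = total / n                      # mean built from sum() and len()
lowest, highest = min(scores), max(scores)
ranked = sorted(scores, reverse=True)    # new list, highest first
times_88 = scores.count(88)              # count() is a list method, not a built-in

print(total, n, average, lowest, highest, ranked, times_88)
```

sorted() returns a new list and leaves scores unchanged, which is usually what you want when exploring data.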

3
Q

True or False:

Base Python has built-in support for data frames.

A

False.

Data frames require external libraries like pandas.

4
Q

Fill in the blank:

The function ____() returns a sequence of numbers.

A

range()

Example: list(range(5)) gives [0, 1, 2, 3, 4]. range(5) itself returns a lazy range object, not a list.
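
A short sketch of how range() behaves may help here. The variable names are illustrative only.

```python
r = range(5)                      # lazy sequence: 0, 1, 2, 3, 4
as_list = list(r)                 # materialize it to see the values
stepped = list(range(2, 11, 3))   # start=2, stop=11 (exclusive), step=3

print(as_list)   # [0, 1, 2, 3, 4]
print(stepped)   # [2, 5, 8]
print(len(r), 3 in r)
```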

5
Q

Which module in Python provides mathematical functions for data analysis?

A

math

Example: math.sqrt(25) → 5.0 (the result is always a float).

6
Q

What is list comprehension in Python?

A

A concise way to create lists using a single line of code.

Example: [x**2 for x in range(5)].
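
To make the "single line" point concrete, here is a sketch that builds the same list with a traditional loop and with a comprehension. The variable names are illustrative.

```python
# Traditional loop: create an empty list, then append one item at a time.
squares_loop = []
for x in range(5):
    squares_loop.append(x**2)

# Equivalent list comprehension.
squares = [x**2 for x in range(5)]

print(squares)   # [0, 1, 4, 9, 16]
```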

7
Q

How do you create a dictionary comprehension?

A

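{key: value for item in iterable}, where key and value are expressions. If the iterable yields pairs, unpack them: {k: v for k, v in pairs}.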

Example: {x: x**2 for x in range(3)}.
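
A minimal sketch of both forms follows. The names and ages are hypothetical sample data.

```python
squares = {x: x**2 for x in range(3)}   # {0: 0, 1: 1, 2: 4}

# Hypothetical data: pair names with ages, then keep only some entries.
names = ["Ana", "Ben", "Cy"]
ages = [34, 27, 41]
age_by_name = {name: age for name, age in zip(names, ages)}
over_30 = {name: age for name, age in age_by_name.items() if age > 30}

print(squares, age_by_name, over_30)
```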

8
Q

True or False:

List comprehensions are faster than traditional loops.

A

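Usually true.

In CPython, a comprehension typically builds a list faster than an equivalent for loop that calls .append(), though the gain depends on how much work is done per item.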

9
Q

Fill in the blank:

[x for x in range(5) if x % 2 == 0] creates a list of ____ numbers.

A

Even

Output: [0, 2, 4].

10
Q

How do you modify a list comprehension to include an else statement?

A

Put an if-else conditional expression before the for clause.

Example: [x if x % 2 == 0 else "odd" for x in range(5)].
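
The position of the if matters, so a side-by-side sketch may be useful. An if after the for clause filters items, while an if-else before it transforms every item.

```python
# Filter: "if" after the for clause drops items that fail the test.
evens = [x for x in range(5) if x % 2 == 0]

# Conditional expression: "if-else" before the for clause maps every item.
labeled = [x if x % 2 == 0 else "odd" for x in range(5)]

print(evens)     # [0, 2, 4]
print(labeled)   # [0, 'odd', 2, 'odd', 4]
```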

11
Q

What function opens a file in Python?

A

open()

Example: f = open('data.txt', 'r').

12
Q

How do you read all lines of a file at once?

A

.readlines()

Returns a list of lines, each keeping its trailing newline.

13
Q

True or False:

The with statement automatically closes a file.

A

True.

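Example: with open('file.txt') as f: followed by an indented block. The file is closed when the block exits, even if an exception is raised.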

14
Q

Fill in the blank:

To write to a file, use the mode ______.

A

w

"w" creates a new file or overwrites an existing one.

15
Q

What does file.write("Hello") do?

A

Writes "Hello" to the file.

Does not automatically add a newline.
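
The file-handling cards above (open(), mode "w", write(), readlines(), and with) can be combined in one sketch. The file name notes.txt is hypothetical, and a temporary directory is used so that nothing real is overwritten.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

with open(path, "w") as f:     # "w" creates or overwrites the file
    f.write("Hello")           # write() adds no newline by itself
    f.write("\n")
    f.write("World\n")

with open(path) as f:          # default mode is "r" (read text)
    lines = f.readlines()      # list of lines, newlines included

print(lines)      # ['Hello\n', 'World\n']
print(f.closed)   # True: the with block closed the file
```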

16
Q

Which module helps read CSV files in base Python?

A

csv

Example: import csv.

17
Q

How do you read a CSV file in Python?

A

Using csv.reader()

Example: csv.reader(file).

18
Q

True or False:

CSV files can only contain numerical data.

A

False.

They can contain text, dates, and more.

19
Q

Fill in the blank:

csv.writer() is used to ____ data to a CSV file.

A

Write

Example: writer.writerow(['Name', 'Age']).
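
A minimal round trip with csv.writer() and csv.reader() might look like the sketch below. It uses an in-memory io.StringIO buffer as a stand-in for a real file, and the record is hypothetical. With a real file, the csv documentation recommends opening it with newline="".

```python
import csv
import io

buffer = io.StringIO()              # in-memory stand-in for an open file
writer = csv.writer(buffer)
writer.writerow(["Name", "Age"])    # header row
writer.writerow(["Ana", 34])        # hypothetical record

buffer.seek(0)                      # rewind before reading
rows = list(csv.reader(buffer))     # every field comes back as a string

print(rows)   # [['Name', 'Age'], ['Ana', '34']]
```

Note that the age is read back as the string '34'. The csv module does not convert types for you.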

20
Q

Which library makes reading CSV files easier than using csv?

A

pandas

Example: df = pd.read_csv('data.csv').
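
As a sketch of why pandas is more convenient, the snippet below reads CSV text straight into a DataFrame and gets typed columns. The CSV content is hypothetical and is supplied through io.StringIO instead of a file on disk.

```python
import io

import pandas as pd

# Hypothetical CSV content; a file path such as 'data.csv' works the same way.
csv_text = "Name,Age\nAna,34\nBen,27\n"
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)           # (2, 2)
print(list(df.columns))   # ['Name', 'Age']
print(df["Age"].sum())    # Age was parsed as integers, so this is 61
```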

21
Q

What is Pandas?

A

A Python library for data manipulation and analysis.

Provides DataFrame and Series structures.

22
Q

How do you import Pandas?

A

import pandas as pd

pd is the conventional alias.

23
Q

What is a DataFrame in Pandas?

A

A 2D labeled data structure.

Similar to a table in SQL or Excel.

24
Q

True or False:

Pandas requires NumPy to function.

A

True.

Pandas is built on top of NumPy.

25
Q

Fill in the blank:

A Pandas Series is similar to a ______.

A

A column in a spreadsheet.

A Series is a 1D labeled array.
26
Q

How do you access the first five rows of a DataFrame?

A

.head()

Example: df.head(). To get a specific number of rows, pass it as an argument, e.g. df.head(7).
27
Q

What does df.loc[2] return?

A

The row whose index label is 2.

.loc[] is label-based, so this is the third row only when the DataFrame uses the default integer index.
28
Q

True or False:

.iloc[] is label-based indexing.

A

False.

.iloc[] is position-based.
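
The difference between label-based and position-based access only shows up when the index is not the default 0, 1, 2. The sketch below uses a hypothetical DataFrame with a shuffled index to make that visible.

```python
import pandas as pd

# Hypothetical data with a non-default index so labels and positions differ.
df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy"], "age": [34, 27, 41]},
    index=[2, 0, 1],
)

by_label = df.loc[2]       # row whose index label is 2 (Ana)
by_position = df.iloc[2]   # third row by position (Cy)
first_two = df.head(2)     # first two rows in stored order

print(by_label["name"], by_position["name"], len(first_two))
```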
29
Q

How do you filter rows where age > 30?

A

df[df['age'] > 30]

This is Boolean filtering.
30
Q

Fill in the blank:

df['column'] accesses a ______.

A

Series

Example: df['name'].
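
Boolean filtering and column access can be sketched in two steps: build a True/False mask from a column, then use it to select rows. The DataFrame below is hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"name": ["Ana", "Ben", "Cy"], "age": [34, 27, 41]})

names = df["name"]        # a single column is a Series
mask = df["age"] > 30     # boolean Series: True, False, True
older = df[mask]          # keep only rows where the mask is True

print(type(names).__name__)   # Series
print(list(older["name"]))    # ['Ana', 'Cy']
```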
31
Q

How do you check for missing values?

A

.isnull()

Returns an object of the same shape holding True wherever a value is missing. .isna() is an alias.
32
Q

What does .isnull().sum() do?

A

Counts the missing values in each column.

.isnull() produces True/False values, and .sum() adds them up column by column, counting True as 1.
33
Q

What does df.dropna() do?

A

Removes rows with missing values.

Default: drops any row with at least one missing value.
34
Q

True or False:

.fillna() removes NaN values.

A

False.

.fillna() replaces NaNs with a specified value.
35
Q

Fill in the blank:

.dropna(how='all') removes rows where ____ values are missing.

A

All

Keeps rows with at least one non-null value.
36
Q

How do you replace missing values with the column mean?

A

df['col'].fillna(df['col'].mean())

Useful for numerical data. fillna() returns a new Series, so assign the result back to the column to keep it.
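
The missing-value cards above can be tried together on a small example. This is a sketch with a hypothetical DataFrame; the column names are illustrative only.

```python
import pandas as pd

# Hypothetical data; None becomes NaN in the numeric column.
df = pd.DataFrame({
    "name": ["Ana", "Ben", None, None],
    "age": [30.0, None, 40.0, None],
})

missing_per_column = df.isnull().sum()   # count of missing values per column
complete_rows = df.dropna()              # drops rows with any missing value
not_empty_rows = df.dropna(how="all")    # drops only rows where every value is missing

# fillna() returns a new Series, so assign it back to keep the result.
df["age"] = df["age"].fillna(df["age"].mean())

print(missing_per_column)
print(len(complete_rows), len(not_empty_rows))
print(df["age"].tolist())
```

.mean() skips NaN by default, so the fill value here is the mean of the two known ages.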
37
Q

What does df.groupby('column') do?

A

Groups rows by a column's values.

Used for aggregating data. It returns a GroupBy object; nothing is computed until an aggregation such as .mean() or .sum() is applied.
38
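Q

Which method calculates the mean for each group?

A

.mean()

Example: df.groupby('gender').mean(). If the DataFrame also has non-numeric columns, pandas 2.0 and later need numeric_only=True or an explicit column selection such as df.groupby('gender')['age'].mean().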
39
Q

True or False:

Pivot tables are a form of aggregation.

A

True.

They reshape and summarize data.
40
Q

How do you count occurrences of unique values?

A

.value_counts()

Example: df['city'].value_counts().
41
Q

Fill in the blank:

.agg({'col': 'sum'}) applies ______.

A

A specific function to a column.

Example: df.groupby('category').agg({'sales': 'sum'}).
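
The aggregation cards above (groupby, mean, agg, value_counts, pivot tables) can be compared on one small table. This is a sketch using hypothetical sales records; pivot_table() is pandas' pivot-table method, and the column names are illustrative.

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame({
    "category": ["toys", "books", "toys", "books", "toys"],
    "city": ["Oslo", "Rome", "Oslo", "Oslo", "Rome"],
    "sales": [10, 20, 30, 40, 50],
})

mean_sales = df.groupby("category")["sales"].mean()           # mean per group
total_sales = df.groupby("category").agg({"sales": "sum"})    # sum applied to one column
city_counts = df["city"].value_counts()                       # occurrences of each value
pivot = df.pivot_table(index="category", columns="city",
                       values="sales", aggfunc="sum")         # category x city totals

print(mean_sales)
print(total_sales)
print(city_counts)
print(pivot)
```

Selecting the sales column before calling .mean() avoids trying to average the text columns.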