Introduction to Data Analysis using Python Flashcards

This section covers base Python, list and dictionary comprehensions, and basic data manipulation with pandas.

1
Q

What is data analysis in Python?

A

The process of inspecting, cleaning, and modeling data to extract insights.

Python supports this with built-in functions and with libraries such as pandas and NumPy.

2
Q

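Name two built-in Python functions useful for data analysis.

A
  • sum()
  • len()

Other useful built-in functions: min(), max(), sorted(). Note that count() is a method of lists and strings (for example, my_list.count(3)), not a built-in function.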
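
As a minimal sketch of these functions on a small list (the sales figures are made up for illustration):

```python
# Hypothetical sample data: five daily sales figures.
sales = [120, 85, 200, 85, 150]

total = sum(sales)                        # add up all values
n = len(sales)                            # number of items
average = total / n                       # mean built from sum() and len()
lowest, highest = min(sales), max(sales)  # smallest and largest value
ordered = sorted(sales)                   # new sorted list; sales is unchanged
times_85 = sales.count(85)                # count() is a list method, not a built-in

print(total, n, average, lowest, highest)  # 640 5 128.0 85 200
print(ordered)                             # [85, 85, 120, 150, 200]
print(times_85)                            # 2
```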

3
Q

True or False:

Base Python has built-in support for data frames.

A

False.

Data frames require external libraries like pandas.

4
Q

Fill in the blank:

The function ____() returns a sequence of numbers.

A

range()

Example: list(range(5)) gives [0, 1, 2, 3, 4]. In Python 3, range(5) itself is a lazy range object, not a list.
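
A short sketch of how range() behaves in Python 3; the variable names are only illustrative:

```python
r = range(5)
print(r)                         # range(0, 5): a lazy range object, not a list

as_list = list(r)                # materialize the numbers
print(as_list)                   # [0, 1, 2, 3, 4]

# range(start, stop, step): stop is exclusive.
stepped = list(range(2, 11, 2))
print(stepped)                   # [2, 4, 6, 8, 10]
```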

5
Q

Which module in Python provides mathematical functions for data analysis?

A

math

Example: math.sqrt(25) → 5.0 (the result is a float).

6
Q

What is list comprehension in Python?

A

A concise way to create lists using a single line of code.

Example: [x**2 for x in range(5)].

7
Q

How do you create a dictionary comprehension?

A

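{key_expr: value_expr for item in iterable}

Example: {x: x**2 for x in range(3)} gives {0: 0, 1: 1, 2: 4}. When the iterable yields pairs, unpack them: {key: value for key, value in pairs}.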
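
The list and dictionary comprehension forms can be sketched side by side. The prices list is a made-up example of an iterable of pairs:

```python
# List comprehension: squares of 0..4.
squares_list = [x**2 for x in range(5)]
print(squares_list)   # [0, 1, 4, 9, 16]

# Dictionary comprehension: number -> square.
squares_dict = {x: x**2 for x in range(3)}
print(squares_dict)   # {0: 0, 1: 1, 2: 4}

# Unpacking pairs from an iterable of (key, value) tuples (hypothetical data).
prices = [("apple", 1.2), ("pear", 0.8)]
price_lookup = {name: price for name, price in prices}
print(price_lookup)   # {'apple': 1.2, 'pear': 0.8}
```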

8
Q

True or False:

List comprehensions are faster than traditional loops.

A

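True, in most cases.

In CPython a comprehension avoids the repeated list.append lookups and calls of an explicit loop, so it is usually somewhat faster, though the gain is often modest.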

9
Q

Fill in the blank:

[x for x in range(5) if x % 2 == 0] creates a list of ____ numbers.

A

Even

Output: [0, 2, 4].

10
Q

How do you modify a list comprehension to include an else statement?

A

Put a conditional expression (value_if_true if condition else value_if_false) before the for clause.

Example: [x if x % 2 == 0 else "odd" for x in range(5)].
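
The position of the if decides what it does. A brief sketch contrasting the two forms:

```python
# if/else BEFORE for: every item is kept, but its value is chosen per item.
labels = [x if x % 2 == 0 else "odd" for x in range(5)]
print(labels)   # [0, 'odd', 2, 'odd', 4]

# if AFTER for: acts as a filter, so non-matching items are dropped.
evens = [x for x in range(5) if x % 2 == 0]
print(evens)    # [0, 2, 4]
```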

11
Q

What function opens a file in Python?

A

open()

Example: f = open('data.txt', 'r').

12
Q

How do you read all lines of a file at once?

A

.readlines()

Returns a list of strings, one per line, each keeping its trailing newline character.

13
Q

True or False:

The with statement automatically closes a file.

A

True.

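Example: with open('file.txt') as f: followed by an indented block. The file is closed when the block exits, even if an exception is raised.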

14
Q

Fill in the blank:

To write to a file, use the mode ______.

A

w

'w' creates the file if it does not exist and overwrites its contents if it does. Use 'a' to append instead.

15
Q

What does file.write("Hello") do?

A

Writes “Hello” to the file.

Does not automatically add a newline.
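
The file cards above (open(), 'w' mode, write(), readlines(), and with) can be sketched together. The file name demo.txt and the temporary directory are assumptions made so the example leaves no stray files behind:

```python
import os
import tempfile

# Hypothetical path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.txt")

# "w" creates the file (or overwrites it); with closes it automatically.
with open(path, "w") as f:
    f.write("Hello")          # no newline is added automatically
    f.write("World\n")        # so this continues on the same line
    f.write("second line\n")

with open(path, "r") as f:
    lines = f.readlines()     # list of lines, newlines included

print(lines)      # ['HelloWorld\n', 'second line\n']
print(f.closed)   # True: the file was closed when the with block ended
```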

16
Q

Which module helps read CSV files in base Python?

A

csv

Example: import csv.

17
Q

How do you read a CSV file in Python?

A

Using csv.reader()

Example: csv.reader(file), where file is an open file object. Iterating over the reader yields each row as a list of strings.
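
A minimal sketch of csv.reader(). To keep it self-contained, an in-memory io.StringIO object stands in for a real file opened with open(), and the rows are made up:

```python
import csv
import io

# Hypothetical CSV content; io.StringIO behaves like an open text file.
raw = "name,age\nAna,34\nBen,27\n"

with io.StringIO(raw) as file:
    reader = csv.reader(file)
    header = next(reader)     # first row: the column names
    rows = list(reader)       # remaining rows

print(header)   # ['name', 'age']
print(rows)     # [['Ana', '34'], ['Ben', '27']]
```

Every field comes back as a string, so numeric columns such as age need an explicit conversion like int(row[1]).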

18
Q

True or False:

CSV files can only contain numerical data.

A

False.

They can contain text, dates, and more.

19
Q

Fill in the blank:

csv.writer() is used to ____ data to a CSV file.

A

Write

Example: writer.writerow(['Name', 'Age']).
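
A short sketch of csv.writer(), again writing to an in-memory buffer rather than a real file (an assumption made to keep the example self-contained). With a real file, the csv documentation recommends opening it with newline="":

```python
import csv
import io

buffer = io.StringIO(newline="")    # stand-in for open("out.csv", "w", newline="")
writer = csv.writer(buffer)

writer.writerow(["Name", "Age"])                 # one row
writer.writerows([["Ana", 34], ["Ben", 27]])     # several rows at once

text = buffer.getvalue()
print(text.splitlines())   # ['Name,Age', 'Ana,34', 'Ben,27']
```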

20
Q

Which library makes reading CSV files easier than using csv?

A

pandas

Example: df = pd.read_csv('data.csv').
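
As a rough sketch, pd.read_csv() also accepts a file-like object, which is used here in place of a hypothetical data.csv so the example runs on its own:

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
raw = "name,age\nAna,34\nBen,27\n"

df = pd.read_csv(io.StringIO(raw))

print(df.shape)            # (2, 2)
print(list(df.columns))    # ['name', 'age']
print(df["age"].tolist())  # [34, 27]
```

Unlike csv.reader(), pandas uses the first row as column names by default and infers that age is numeric.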

21
Q

What is Pandas?

A

A Python library for data manipulation and analysis.

Provides DataFrame and Series structures.

22
Q

How do you import Pandas?

A

import pandas as pd

pd is the conventional alias.

23
Q

What is a DataFrame in Pandas?

A

A 2D labeled data structure.

Similar to a table in SQL or Excel.

24
Q

True or False:

Pandas requires NumPy to function.

A

True.

Pandas is built on top of NumPy.

25
Q

Fill in the blank:

A Pandas Series is similar to a ______.

A

Column in a spreadsheet

Series is a 1D labeled array.

26
Q

How do you access the first five rows of a DataFrame?

A

.head()

Example: df.head(). To get a different number of rows, pass it as an argument, e.g. df.head(7).

27
Q

What does df.loc[2] return?

A

The row whose index label is 2, returned as a Series.

.loc[] is label-based, so this is not necessarily the third row.

28
Q

True or False:

.iloc[] is label-based indexing.

A

False.

.iloc[] is position-based.
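
The difference between .loc[] and .iloc[] shows up once the index labels are not 0, 1, 2 in order. A small sketch with a made-up DataFrame and a deliberately shuffled index:

```python
import pandas as pd

# Hypothetical data; the index labels are intentionally out of order.
df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy"], "age": [34, 27, 41]},
    index=[2, 0, 1],
)

by_label = df.loc[2]       # row whose index LABEL is 2 (the first row here)
by_position = df.iloc[2]   # row at POSITION 2 (the third row)

print(by_label["name"])      # Ana
print(by_position["name"])   # Cy
```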

29
Q

How do you filter rows where age > 30?

A

df[df['age'] > 30]

Boolean filtering.
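
Boolean filtering works in two steps: the comparison builds a True/False Series, and indexing with it keeps the True rows. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {"name": ["Ana", "Ben", "Cy", "Dee"], "age": [34, 27, 41, 30]}
)

mask = df["age"] > 30    # Boolean Series, one value per row
over_30 = df[mask]       # keep only the rows where mask is True

print(mask.tolist())             # [True, False, True, False]
print(over_30["name"].tolist())  # ['Ana', 'Cy']
```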

30
Q

Fill in the blank:

df['column'] accesses a ______.

A

Series

Example: df['name'].

31
Q

How do you check for missing values?

A

.isnull()

Returns a Boolean mask of the same shape: True where a value is missing, False otherwise. .isna() is an equivalent alias.

32
Q

What does .isnull().sum() do?

A

Counts the missing values in each column.
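
A brief sketch of why this works: True counts as 1 when summed, so summing the Boolean mask from .isnull() gives a per-column count. The data is made up:

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in both columns.
df = pd.DataFrame(
    {"age": [34, np.nan, 41], "city": ["Oslo", None, None]}
)

missing_mask = df.isnull()              # True where a value is missing
missing_per_column = df.isnull().sum()  # True counts as 1, so this counts gaps

print(int(missing_per_column["age"]))   # 1
print(int(missing_per_column["city"]))  # 2
```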

33
Q

What does df.dropna() do?

A

Removes rows with missing values.

Default: drops any row with at least one missing value.

34
Q

True or False:

.fillna() removes NaN values.

A

False.

.fillna() replaces NaNs with a specified value.

35
Q

Fill in the blank:

.dropna(how='all') removes rows where ____ values are missing.

A

All

Keeps rows with at least one non-null value.

36
Q

How do you replace missing values with the column mean?

A

df['col'] = df['col'].fillna(df['col'].mean())

Useful for numerical data. fillna() returns a new Series, so assign the result back to keep it.
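
The missing-value cards above (dropna(), how='all', and fillna() with the mean) can be compared on one small made-up DataFrame:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 3 is missing everything, rows 1 and 2 one value each.
df = pd.DataFrame(
    {"a": [1.0, np.nan, 3.0, np.nan], "b": [4.0, 5.0, np.nan, np.nan]}
)

any_dropped = df.dropna()            # default how="any": only row 0 survives
all_dropped = df.dropna(how="all")   # only the fully empty row 3 is removed

filled = df.copy()
filled["a"] = filled["a"].fillna(filled["a"].mean())   # mean of 1.0 and 3.0 is 2.0

print(len(any_dropped))       # 1
print(len(all_dropped))       # 3
print(filled["a"].tolist())   # [1.0, 2.0, 3.0, 2.0]
```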

37
Q

What does df.groupby('column') do?

A

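Splits the rows into groups that share the same value in that column and returns a GroupBy object.

Nothing is computed until an aggregation such as .mean() or .sum() is applied.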

38
Q

Which method calculates the mean for each group?

A

.mean()

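Example: df.groupby('gender').mean(numeric_only=True). In pandas 2.0 and later, numeric_only=True is needed when the DataFrame also contains non-numeric columns.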
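
A minimal sketch of a grouped mean on made-up data. Selecting the column first, or passing numeric_only=True, keeps the text column name out of the calculation:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "gender": ["F", "M", "F", "M"],
        "name": ["Ana", "Ben", "Cy", "Dan"],
        "age": [30, 40, 50, 20],
    }
)

mean_age = df.groupby("gender")["age"].mean()          # one column
means = df.groupby("gender").mean(numeric_only=True)   # all numeric columns

print(mean_age["F"], mean_age["M"])   # 40.0 30.0
print(list(means.columns))            # ['age']
```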

39
Q

True or False:

Pivot tables are a form of aggregation.

A

True.

They reshape and summarize data.
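
As a rough sketch of that reshaping, pd.pivot_table() can turn one column's values into rows, another's into columns, and aggregate a third. The column names and figures are made up:

```python
import pandas as pd

# Hypothetical sales records.
df = pd.DataFrame(
    {
        "region": ["N", "N", "S", "S", "N"],
        "year": [2023, 2024, 2023, 2024, 2023],
        "sales": [10, 20, 30, 40, 5],
    }
)

# Rows: region, columns: year, cell values: sum of sales.
pivot = pd.pivot_table(
    df, values="sales", index="region", columns="year", aggfunc="sum"
)

print(pivot.loc["N", 2023])   # 15 (10 + 5)
print(pivot.loc["S", 2024])   # 40
```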

40
Q

How do you count occurrences of unique values?

A

.value_counts()

Example: df['city'].value_counts().
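
A short sketch of .value_counts() on a made-up city column. The result is a Series sorted from most to least frequent:

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({"city": ["Oslo", "Rome", "Oslo", "Lima", "Oslo", "Rome"]})

counts = df["city"].value_counts()                 # occurrences per unique value
shares = df["city"].value_counts(normalize=True)   # same, as fractions of the total

print(counts.index[0], counts["Oslo"])   # Oslo 3
print(counts["Rome"], counts["Lima"])    # 2 1
print(shares["Oslo"])                    # 0.5
```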

41
Q

Fill in the blank:

.agg({'col': 'sum'}) applies ______.

A

A specific function to a column

Example: df.groupby('category').agg({'sales': 'sum'}).
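
The dictionary passed to .agg() can name a different function for each column. A minimal sketch on made-up data (the units column is an added assumption):

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame(
    {
        "category": ["a", "b", "a", "b"],
        "sales": [10, 20, 30, 40],
        "units": [1, 2, 3, 4],
    }
)

# Sum the sales and average the units within each category.
summary = df.groupby("category").agg({"sales": "sum", "units": "mean"})

print(summary.loc["a", "sales"], summary.loc["b", "sales"])   # 40 60
print(summary.loc["a", "units"], summary.loc["b", "units"])   # 2.0 3.0
```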