Importing Flat Files & Other Data Flashcards

Importing Data in Python (Part 1)

1
Q

how to access the system shell in IPython (on DataCamp)

A

!

2
Q

display directory contents

A

! ls

3
Q

open a text file as read-only

A

open('file.txt', 'r')

4
Q

print an open file

A

print(file.read())

5
Q

check if a file is closed

A

file.closed

6
Q

close a file

A

file.close()

7
Q

alternative to opening and closing a file

A

context manager: with open('file.txt') as file:

8
Q

read one line of a file

A

file.readline()
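
The file-handling cards above can be tied together in one short sketch. It writes a throwaway file first so that it runs anywhere; the file name and its contents are made up for illustration.

```python
import os
import tempfile

# Create a small throwaway text file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.txt")  # hypothetical file name
with open(path, "w") as f:
    f.write("first line\nsecond line\nthird line\n")

# Manual open/close: the caller is responsible for closing the file.
file = open(path, "r")
text = file.read()            # the whole file as one string
file.close()
closed_after_manual = file.closed

# Context manager: the file is closed automatically when the block ends.
with open(path, "r") as file:
    first = file.readline()   # reads up to and including the first newline
    second = file.readline()  # the next call continues with the following line
closed_after_with = file.closed

print(text)
print(repr(first), repr(second), closed_after_manual, closed_after_with)
```

Each readline() call picks up where the previous one stopped, which is why two calls return two different lines.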

9
Q

flat files

A

text files containing tabular data without structured relationships (unlike a relational database)

10
Q

packages to import flat files

A

NumPy or pandas

11
Q

how to import a flat file with NumPy

A

np.loadtxt(file, delimiter=, skiprows=, usecols=, dtype=)
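
A minimal sketch of np.loadtxt() with these arguments. The tab-delimited data is invented, and io.StringIO stands in for a file on disk so the example is self-contained.

```python
import io

import numpy as np

# Hypothetical tab-delimited data with one header row.
raw = "a\tb\tc\n1\t2\t3\n4\t5\t6\n"

data = np.loadtxt(
    io.StringIO(raw),   # a file path would work here as well
    delimiter="\t",     # columns are separated by tabs
    skiprows=1,         # skip the header row
    usecols=[0, 2],     # keep only the first and third columns
    dtype=float,        # every value is parsed as a float
)

print(data.shape)
print(data)
```

Because np.loadtxt() returns a plain array with a single dtype, it suits purely numeric files; mixed types are better handled by np.genfromtxt() or pandas.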

12
Q

tab delimiter

A

'\t'

13
Q

how to import mixed datatypes with NumPy

A

np.genfromtxt(file, delimiter=, names=, dtype=None)

14
Q

names argument

A

names=True tells NumPy that the first row is a header containing the column names

15
Q

what does genfromtxt() produce

A

a structured array: a 1D array in which each element is one row of the imported flat file

16
Q

access row of a structured array

A

array[index]

17
Q

access column of a structured array

A

array['Column name']
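
The structured-array cards can be sketched as follows. The CSV content and column names are made up, and encoding="utf-8" is an added assumption so that text columns come back as str rather than bytes.

```python
import io

import numpy as np

# Hypothetical comma-separated data mixing text, integers and floats.
raw = "name,age,score\nAda,36,9.5\nLin,41,8.0\n"

people = np.genfromtxt(
    io.StringIO(raw),
    delimiter=",",
    names=True,        # the first row holds the column names
    dtype=None,        # let NumPy infer a type for each column
    encoding="utf-8",  # assumption: decode text columns as str
)

first_row = people[0]  # one row (a record) of the structured array
ages = people["age"]   # one column, selected by field name

print(people.shape, people.dtype.names)
print(first_row, ages)
```

The result is one-dimensional: its length is the number of data rows, and the columns live in the dtype as named fields.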

18
Q

similar to genfromtxt() with default argument dtype=None

A

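np.recfromcsv() (removed from the main namespace in NumPy 2.0; np.genfromtxt() with delimiter=',' is the replacement)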

19
Q

np.recfromcsv() defaults

A

delimiter=',', names=True, dtype=None

20
Q

importing flat file with pandas as DataFrame

A

pd.read_csv('file')

21
Q

converting a DataFrame to numpy array

A

df.values (or df.to_numpy(), which the pandas documentation recommends)

22
Q

missing values in a DataFrame

A

NA or NaN (use the na_values argument of pd.read_csv() to list the strings that should be read as missing)

23
Q

pandas equivalent of delimiter

A

sep=

24
Q

comment argument

A

ignores everything on a line after the given character (e.g. comment='#')
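
A small sketch combining the pd.read_csv() arguments from the cards above. The semicolon-separated data and the "missing" marker are invented for illustration.

```python
import io

import pandas as pd

# Hypothetical data: a comment line, ';' as separator, "missing" as NA marker.
raw = (
    "# temperatures at noon\n"
    "id;city;temp\n"
    "1;Oslo;3.5\n"
    "2;Lima;missing\n"
    "3;Pune;29.0\n"
)

df = pd.read_csv(
    io.StringIO(raw),       # a file path would work here as well
    sep=";",                # pandas equivalent of NumPy's delimiter
    comment="#",            # ignore everything after '#'
    na_values=["missing"],  # read this string as NaN
)

arr = df.values             # the DataFrame as a NumPy array

print(df)
print(arr.shape)
```

The fully commented first line is skipped, so the header is still picked up from the "id;city;temp" row.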

25
Q

explore working directory in Python

A

import os

os.listdir(os.getcwd())

26
Q

importing pickle files

A

import pickle

pickle.load(file) (after opening the file in binary read mode, 'rb', with a context manager)
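
A minimal round trip, assuming a throwaway file and a made-up dictionary. Note the binary modes 'wb' and 'rb'.

```python
import os
import pickle
import tempfile

record = {"name": "example", "values": [1, 2, 3]}  # hypothetical data
path = os.path.join(tempfile.mkdtemp(), "data.pkl")  # hypothetical file name

# Write the object to disk in binary mode.
with open(path, "wb") as file:
    pickle.dump(record, file)

# Read it back, again in binary mode.
with open(path, "rb") as file:
    loaded = pickle.load(file)

print(loaded)
```

Unpickling can execute arbitrary code, so pickle.load() should only be used on files from a trusted source.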

27
Q

import Excel with pandas

A

pd.ExcelFile(file)

28
Q

Excel sheet names

A

spreadsheet.sheet_names

29
Q

import a given sheet

A

spreadsheet.parse('specific sheet')

30
Q

how to import SAS files

A

from sas7bdat import SAS7BDAT

file.to_data_frame() (called on the object opened with SAS7BDAT('file'))

31
Q

context manager for SAS files

A

with SAS7BDAT('file') as file:

32
Q

import stata (.dta) files with pandas

A

pd.read_stata('file')

33
Q

importing HDF5 files

A

h5py.File(file, 'r')

34
Q

importing MATLAB files

A

import scipy.io

scipy.io.loadmat('filename')
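
A hedged sketch of what loadmat() returns. It first writes a small .mat file with scipy.io.savemat() so that it is self-contained; the variable name "readings" and the file name are made up.

```python
import os
import tempfile

import numpy as np
import scipy.io

path = os.path.join(tempfile.mkdtemp(), "example.mat")  # hypothetical file name

# Save one MATLAB variable called "readings" (an invented name).
scipy.io.savemat(path, {"readings": np.array([[1.0, 2.0], [3.0, 4.0]])})

# loadmat() returns a dict: keys are MATLAB variable names, values are arrays.
mat = scipy.io.loadmat(path)
readings = mat["readings"]

print(sorted(mat.keys()))
print(readings)
```

Besides the saved variables, the dictionary also holds metadata entries such as '__header__'.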