Importing Data Flashcards

1
Q

What does the command ! ls do?

A

In IPython, !ls runs the shell command ls, which displays the contents of your current directory.

2
Q

What are some of the arguments that np.loadtxt() takes?

A

There are a number of arguments that np.loadtxt() takes that you'll find useful: delimiter changes the delimiter that loadtxt() is expecting, for example, you can use ',' and '\t' for comma-delimited and tab-delimited files respectively; skiprows allows you to specify how many rows (not indices) you wish to skip; usecols takes a list of the indices of the columns you wish to keep.
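
A minimal sketch of these three arguments working together. The tab-delimited text below is made up for illustration and is read from an in-memory buffer (io.StringIO) rather than a real file, so the example runs on its own.

```python
import io

import numpy as np

# Hypothetical tab-delimited data with one header row (not from a real file).
text = "x\ty\tz\n1\t2\t3\n4\t5\t6\n"

data = np.loadtxt(
    io.StringIO(text),  # a filename string would work here as well
    delimiter="\t",     # fields are separated by tabs
    skiprows=1,         # skip one row: the header
    usecols=[0, 2],     # keep only the first and third columns
)

print(data)
```

Here skiprows=1 means "skip one row", not "skip the row at index 1", which is the distinction the answer above draws.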

3
Q

What does np.genfromtxt() do?

A

np.genfromtxt() can import flat files whose columns hold different data types. If we pass dtype=None to it, it will figure out what type each column should be.

Import ‘titanic.csv’ using the function np.genfromtxt() as follows:

data = np.genfromtxt('titanic.csv', delimiter=',', names=True, dtype=None)

Here, the first argument is the filename, the second specifies the delimiter, and the third argument, names=True, tells NumPy that the file has a header row. Because the data are of different types, data is an object called a structured array. Because numpy arrays have to contain elements that are all the same type, the structured array solves this by being a 1D array, where each element of the array is a row of the flat file imported. You can test this by checking out the array's shape in the shell by executing np.shape(data).
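
To make the structured-array idea concrete, here is a small sketch that does not need titanic.csv. The CSV text and its column names are invented for illustration, and encoding="utf-8" is an added assumption so that text columns come back as ordinary strings.

```python
import io

import numpy as np

# Hypothetical mixed-type CSV content (made up, not the real titanic.csv).
csv_text = "Name,Age,Fare\nAnn,29,211.3\nBob,35,8.05\n"

data = np.genfromtxt(
    io.StringIO(csv_text),
    delimiter=",",
    names=True,        # first row holds the column names
    dtype=None,        # let NumPy infer a type for each column
    encoding="utf-8",  # assumption: decode text columns as str
)

print(np.shape(data))    # 1D: one element per row of the file
print(data.dtype.names)  # column names taken from the header
print(data["Age"])       # select a whole column by name
print(data[0])           # select a whole row by index
```

Each element of data is one row, and columns are reached by name rather than by a second index.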

4
Q

Good to remember

A

Another function, np.recfromcsv(), behaves similarly to np.genfromtxt(), except that its default dtype is None.

5
Q

How do you convert a dataframe df into a numpy array?

A

df.values converts the dataframe df into a numpy array
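
A short sketch of the conversion, using a made-up two-column DataFrame. It also shows df.to_numpy(), which newer pandas versions offer for the same purpose.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with one integer and one float column.
df = pd.DataFrame({"a": [1, 2], "b": [3.5, 4.5]})

arr = df.values       # attribute access, as in the answer above
arr2 = df.to_numpy()  # method form available in newer pandas

print(type(arr))
print(arr)
```

Because a NumPy array holds a single dtype, the integer column is upcast to float here so that both columns fit in one array.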

6
Q

What is a pickled file?

A

There are a number of datatypes that cannot be saved easily to flat files, such as lists and dictionaries. If you want your files to be human readable, you may want to save them as text files in a clever manner. JSONs, which you will see in a later chapter, are appropriate for Python dictionaries.

However, if you merely want to be able to import them into Python, you can serialize them. All this means is converting the object into a sequence of bytes, or a bytestream. A pickled file is a file that stores this bytestream, written with Python's pickle module.
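
As a rough illustration of serializing and restoring an object, the sketch below pickles a small dictionary to a temporary file and loads it back. The dictionary contents and the file name record.pkl are placeholders chosen for this example.

```python
import os
import pickle
import tempfile

# Hypothetical object that does not map neatly onto a flat file.
record = {"name": "example", "columns": ["Age", "Fare"], "count": 2}

# Write the bytestream to disk; pickled files are opened in binary mode.
path = os.path.join(tempfile.mkdtemp(), "record.pkl")
with open(path, "wb") as f:
    pickle.dump(record, f)

# Read the bytestream back and rebuild an equivalent object.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored)
```

Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.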

7
Q

How do you open an excel file in pandas?

How do you check the worksheet names?

A

xl = pd.ExcelFile(filename)

xl.sheet_names

8
Q

How do you load a sheet into a DataFrame by name?

How do you load a sheet into a DataFrame by index?

A

df1 = xl.parse('2004')

where xl is the pd.ExcelFile object that contains all the worksheets

df2 = xl.parse(0)

9
Q

Parse the second sheet by index. In doing so, parse only the first column with the usecols parameter (called parse_cols in older pandas versions), skip the first row and rename the column 'Country'. Pass the column selection to usecols as a list.

A

df2 = xl.parse(1, usecols=[0], skiprows=[0], names=['Country'])

10
Q

How do you correctly import the function SAS7BDAT() from the package sas7bdat?

A

from sas7bdat import SAS7BDAT

11
Q

How do you read Stata files as DataFrames?

A

pd.read_stata(filename)

12
Q

How do you make a histogram from a column in a dataframe?

A

pd.DataFrame.hist(df[['column_name']])

13
Q

What is the correct way of using the h5py function, File(), to import the file in h5py_file into an object, h5py_data, for reading only?

A

h5py_data = h5py.File(h5py_file, 'r')

14
Q

How do you load a Matlab file?

A

scipy.io.loadmat('albeck_gene_expression.mat')

15
Q

How do you create a connection to a relational database?

A

# Import necessary module
from sqlalchemy import create_engine

# Create engine: engine
engine = create_engine('sqlite:///Chinook.sqlite')

16
Q

How do you view all tables in a relational database?

A

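table_names = engine.table_names()

Note: engine.table_names() is deprecated in SQLAlchemy 1.4 and removed in 2.0. In those versions, use inspect(engine).get_table_names(), with inspect imported from sqlalchemy.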

17
Q

What is the workflow of SQL querying?

A

Workflow of SQL querying

● Import packages and functions

● Create the database engine

● Connect to the engine

● Query the database

● Save query results to a DataFrame

● Close the connection
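
The steps above can be sketched as follows. To keep the example self-contained it uses an in-memory SQLite database instead of Chinook.sqlite, and the Orders table with its two columns and rows is made up for illustration.

```python
# 1. Import packages and functions
import pandas as pd
from sqlalchemy import create_engine, text

# 2. Create the database engine (assumption: in-memory SQLite, not Chinook)
engine = create_engine("sqlite://")

# 3. Connect to the engine
con = engine.connect()
try:
    # Set up a tiny hypothetical table so there is something to query.
    con.execute(text("CREATE TABLE Orders (OrderID INTEGER, Customer TEXT)"))
    con.execute(text("INSERT INTO Orders VALUES (1, 'Ann'), (2, 'Bob')"))

    # 4. Query the database
    rs = con.execute(text("SELECT * FROM Orders"))
    columns = list(rs.keys())

    # 5. Save query results to a DataFrame
    rows = [tuple(row) for row in rs.fetchall()]
    df = pd.DataFrame(rows, columns=columns)
finally:
    # 6. Close the connection
    con.close()

print(df)
```

Wrapping the work in try/finally (or a with block) is one way to make sure the connection is closed even if the query fails.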

18
Q

How do you query a sql database using pandas?

A

df = pd.read_sql_query("SELECT * FROM Orders", engine)
