FA5 + M5 - Sheet1 Flashcards

1
Q

Which of the following libraries are used for mathematical and statistical operations on multi-dimensional arrays and matrices in Python?

Group of answer choices

Matplotlib

NumPy

Pandas

A

NumPy

2
Q

Which of the following libraries are used for data visualization in Python?

Group of answer choices

NumPy

Matplotlib

SciPy

A

Matplotlib

3
Q

Which of the following libraries are used for deep learning in Python?

Group of answer choices

TensorFlow

Scikit-learn

Keras

A

TensorFlow

4
Q

Which of the following libraries are used for natural language processing in Python?

Group of answer choices

NLTK

Scrapy

Scikit-learn

A

NLTK

5
Q

Which of the following libraries are used for creating spider bots that scan website pages and collect structured data in Python?

Group of answer choices

Scrapy

Pandas

SciPy

A

Scrapy

6
Q

Which of the following libraries are used for object identification, speech recognition, and more in Python?

Group of answer choices

PyTorch

Keras

Dist-keras

A

It should be TensorFlow, but PyTorch is the answer marked correct in Canvas

7
Q

Which of the following libraries are used for reading data, selecting and filtering data, and data manipulation in Python? There are two correct answers among the options; just choose one.

Group of answer choices

PyTorch

Pandas

NumPy

SciPy

A

Pandas
NumPy

8
Q

Which of the following libraries are used for creating interactive and scalable visualizations in a browser using JavaScript widgets in Python? There are two correct answers among the choices; just select one.

Group of answer choices

SciPy

Bokeh

NumPy

Plotly

A

Bokeh
Plotly

9
Q

Which Python libraries are built on NumPy? There are two correct answers among the choices; just select one.

Group of answer choices

Pandas

Seaborn

Scikit-Learn

Matplotlib

A

Pandas
Scikit-Learn

10
Q

Which Python library provides machine learning algorithms?

Group of answer choices

Pandas

Scikit-Learn

NumPy

Matplotlib

A

Scikit-Learn

11
Q

Data Wrangling:

A

SciPy
NumPy
pandas

12
Q

Statistics

A

StatsModels

13
Q

NLP

A

Natural Language Toolkit
SpaCy
gensim

14
Q

Machine Learning

A

scikit-learn
xgboost
lightgbm
catboost
eli5

15
Q

Deep Learning

A

TensorFlow
PyTorch
Keras

16
Q

Distributed Deep Learning

A

dist-keras
elephas
spark-deep-learning

17
Q

Visualization

A

matplotlib
Bokeh
plotly
Seaborn
pydot

18
Q

it is intended for processing large multidimensional arrays and matrices, and an extensive collection of high-level mathematical functions and implemented methods makes it possible to perform various operations with these objects

A

NumPy (numpy.org)

19
Q

It is based on NumPy and therefore extends its capabilities. SciPy's main data structure is again a multidimensional array, implemented by NumPy.

A

SciPy (scipy.org/scipylib)

20
Q

The package contains tools that help with solving linear algebra, probability theory, integral calculus and many more tasks

A

SciPy

21
Q

provides high-level data structure and a vast variety of tools for analysis. The great feature of this package is the ability to translate rather complex operations with data into one or two commands.

A

Pandas (pandas.pydata.org)

22
Q

contains many built-in methods for grouping, filtering, and combining data, as well as the time-series functionality

A

Pandas

23
Q

is a low-level library for creating two-dimensional diagrams and graphs.

A

Matplotlib (matplotlib.org)

24
Q

With its help, you can build diverse charts, from histograms and scatterplots to non-Cartesian coordinate graphs.

A

Matplotlib (matplotlib.org)

25
Q

Moreover, many popular plotting libraries are designed to work in conjunction with ____

A

matplotlib

26
Q

is essentially a higher-level API based on the matplotlib library.

A

Seaborn (seaborn.pydata.org)

27
Q

It contains more suitable default settings for processing charts.

A

Seaborn

28
Q

Also, there is a rich gallery of visualizations including some complex types like time series, jointplots, and violin diagrams

A

Seaborn

29
Q

is a popular library that allows you to build sophisticated graphics easily.

A

Plotly (plot.ly/python/)

30
Q

The package is adapted to work in interactive web applications.

A

Plotly

31
Q

Among its remarkable visualizations are contour graphics, ternary plots, and 3D charts

A

Plotly

32
Q

The ____ library creates interactive and scalable visualizations in a browser using JavaScript widgets.

A

Bokeh (bokeh.pydata.org/en/latest/)

33
Q

The library provides a versatile collection of graphs, styling possibilities, interaction abilities in the form of linking plots, adding widgets, and defining callbacks, and many more useful features.

A

Bokeh

34
Q

is a popular framework for deep and machine learning, developed in Google Brain.

A

TensorFlow (tensorflow.org)

35
Q

It provides abilities to work with artificial neural networks with multiple data sets.

A

TensorFlow

36
Q

Among the most popular TensorFlow applications are _____ and more.

A

object identification, speech recognition,

37
Q

is a large framework that allows you to perform tensor computations with GPU acceleration, create dynamic computational graphs and automatically calculate gradients.

A

PyTorch (pytorch.org)

38
Q

Above this, ____ offers a rich API for solving applications related to neural networks

A

PyTorch

39
Q

is a high-level library for working with neural networks, running on top of TensorFlow and Theano and, as a result of the new releases, other backends as well.

A

Keras (keras.io)

40
Q

It simplifies many specific tasks and greatly reduces the amount of monotonous code. However, it may not be suitable for some complicated things.

A

Keras (keras.io)

41
Q

These packages allow you to train neural networks based on the Keras library directly with the help of Apache Spark

A

Dist-keras (joerihermans.com/work/distributed-keras/)

42
Q

dist-keras and others are gaining popularity and developing rapidly, and it is very difficult to single out one of the libraries since they are all designed to ______

A

solve a common task.

43
Q

This Python module based on NumPy and SciPy is one of the best libraries for working with data.

A

Scikit-learn (scikit-learn.org/stable)

44
Q

It provides algorithms for many standard machine learning and data mining tasks such as clustering, regression, classification, dimensionality reduction, and model selection

A

Scikit-learn

45
Q

is an extension module that makes several frequent item set mining implementations available as functions.

A

PyFim

46
Q

In PyFim, currently _______ are available as functions, although the interfaces do not offer all of the options of the command-line program

A

apriori, eclat, fpgrowth, sam, relim, carpenter, ista, accretion and apriacc

47
Q

Often the results of machine learning models predictions are not entirely clear, and this is the challenge that ___ library helps to deal with.

A

eli5

48
Q

it is a package for visualization and debugging machine learning models and tracking the work of an algorithm step by step.

A

Eli5 (eli5.readthedocs.io/en/latest/)

49
Q

It provides support for scikit-learn, XGBoost, LightGBM, lightning, and sklearn-crfsuite libraries and performs the different tasks for each of them

A

eli5

50
Q

is a set of libraries, a whole platform for natural language processing.

A

NLTK (nltk.org)

51
Q

With the help of ____, you can process and analyze text in a variety of ways, tokenize and tag it, extract information, etc.

A

NLTK

52
Q

is also used for prototyping and building research systems

A

NLTK

53
Q

is a Python library for robust semantic analysis, topic modeling and vector-space modeling, and is built upon NumPy and SciPy.

A

Gensim (radimrehurek.com/gensim)

54
Q

Gensim provides an implementation of popular NLP algorithms, such as _____.

A

word2vec

55
Q

Although gensim has its own models.wrappers.fasttext implementation, the ____ can also be used for efficient learning of word representations.

A

fasttext library

56
Q

is a library used to create spider bots that scan website pages and collect structured data.

A

Scrapy (scrapy.org)

57
Q

In addition, Scrapy can extract data from the ___

58
Q

The library happens to be very handy due to its extensibility and portability

59
Q

Introduces objects for multi-dimensional arrays and matrices, as well as functions that make it easy to perform advanced mathematical and statistical operations on those objects

60
Q

Provides vectorization of mathematical operations on arrays and matrices, which significantly improves performance
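
A minimal sketch of what "vectorization" means in the card above, assuming NumPy is installed. The array `prices` and its values are made up for illustration; they are not from the course material.

```python
import numpy as np

# Hypothetical data: four prices stored in a NumPy array.
prices = np.array([10.0, 20.0, 30.0, 40.0])

# Vectorized: one expression is applied to every element, with no Python loop.
discounted = prices * 0.9

# Equivalent element-by-element loop, shown only for comparison.
looped = np.array([p * 0.9 for p in prices])

print(discounted)
print(np.allclose(discounted, looped))
```

Both forms give the same numbers; the vectorized form runs in compiled code, which is where the speed-up on large arrays comes from.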

61
Q

Many other Python libraries are built on ____

62
Q

adds data structures and tools designed to work with table-like data (similar to Series and Data Frames in R)

63
Q

Provides tools for data manipulation: reshaping, sorting, slicing, aggregation, etc.

64
Q

Allows handling of missing data

65
Q

provides machine learning algorithms: classification, regression, clustering, and model validation

A

Scikit-Learn

66
Q

Built on NumPy, SciPy, and matplotlib

A

Scikit-Learn

67
Q

Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats

A

matplotlib

68
Q

A set of functionalities similar to those of MATLAB

A

matplotlib

69
Q

Line plots, scatter plots, bar charts, histograms, pie charts etc

A

matplotlib

70
Q

Relatively low-level; some effort needed to create advanced visualization

A

matplotlib

71
Q

Based on matplotlib; provides a high-level interface for drawing attractive statistical graphics

72
Q

Seaborn is similar (in style) to the popular ___ library in R

73
Q

Loading Python Libraries

A

import numpy as np
import scipy as sp
import pandas as pd
import matplotlib as mpl
import seaborn as sns

74
Q

Press ____ to execute jupyter cell

A

Shift+Enter

75
Q

There are numerous commands to read other data formats:

A

pd.read_excel('myfile.xlsx', sheet_name='Sheet1', index_col=None, na_values=['NA'])

pd.read_stata('myfile.dta')
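
The reader functions above all follow the same pattern. A small sketch using `pd.read_csv`, which accepts a file-like object as well as a path, so it runs without any file on disk. The CSV text and the column names `rank` and `salary` are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for a file on disk.
csv_text = "rank,salary\nProf,120000\nAsstProf,NA\nProf,135000\n"

# na_values lists the strings that should be read as missing values.
df = pd.read_csv(io.StringIO(csv_text), na_values=["NA"])

print(df.shape)                   # (rows, columns)
print(df["salary"].isna().sum())  # number of missing salaries
```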

76
Q

List first 5 records

77
Q

To view the first 10 records

A

df.head(10)

78
Q

To view the last few records

A

df.tail(10)
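
A short sketch of the record-viewing cards above, assuming a data frame named `df`; the 20-row frame here is invented for the example. Note that `head()` and `tail()` are called on the data frame itself, not on the `pd` module.

```python
import pandas as pd

# Hypothetical 20-row data frame.
df = pd.DataFrame({"id": range(1, 21)})

first5 = df.head()      # head() with no argument returns the first 5 rows
first10 = df.head(10)   # same rows as df.iloc[:10]
last10 = df.tail(10)    # last 10 rows

print(len(first5), len(first10), len(last10))
```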

79
Q

The most general dtype. Will be assigned to your column if column has mixed type numbers and strings

A

object (string)

80
Q

Numeric characters; 64 refers to the number of bits of memory allocated to hold each value

A

int64 (int)

81
Q

Numeric characters with decimals. If a column contains numbers and NaNs, pandas will default to float64, in case the missing value has a decimal

A

float64 (float)

82
Q

Values meant to hold time data. Look into these for time series experiments

A

datetime64[ns], timedelta64[ns] (N/A)

83
Q

Check a particular column type

A

df['salary'].dtype

84
Q

Check types for all the columns

85
Q

list the types of the columns
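
The two cards above have no recorded answer. As a hedged sketch, `DataFrame.dtypes` is the pandas attribute that lists the type of every column; the column names and values below are invented for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "mixed": [1, "two"],        # numbers and strings together -> object
    "count": [3, 4],            # whole numbers -> int64
    "salary": [100, np.nan],    # a NaN forces the column to float64
})

print(df["salary"].dtype)  # type of one column
print(df.dtypes)           # types of all columns
```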

86
Q

list the column names

87
Q

list the row labels and column names

88
Q

number of dimensions

89
Q

number of elements

90
Q

return a tuple representing the dimensionality

91
Q

numpy representation of the data
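
The attribute cards above (column names through NumPy representation) have no recorded answers. The sketch below matches each prompt to a pandas `DataFrame` attribute that fits the description; the pairing is my reading of the prompts, not something stated in the deck.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

print(list(df.columns))  # column names
print(df.axes)           # row labels and column names
print(df.ndim)           # number of dimensions
print(df.size)           # number of elements
print(df.shape)          # tuple (rows, columns)
print(df.values)         # NumPy representation of the data
```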

92
Q

Unlike attributes, python methods have ___

A

parentheses.

93
Q

All attributes and methods can be listed with a ____

A

dir() function
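
A small sketch of the two cards above: `dir()` lists both attributes and methods, and only methods are called with parentheses. The one-column frame is made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

names = dir(df)  # names of all attributes and methods

print("shape" in names, "head" in names)
print(df.shape)    # attribute: no parentheses
print(df.head(2))  # method: parentheses, optionally with arguments
```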

94
Q

first/last n rows

A

head( [n] ), tail( [n] )

95
Q

generate descriptive statistics (for numeric columns only)

A

describe()

96
Q

return max/min values for all numeric columns

A

max(), min()

97
Q

return mean/median values for all numeric columns

A

mean(), median()

98
Q

standard deviation

99
Q

returns a random sample of the data frame

A

sample([n])

100
Q

drop all the records with missing values
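
A sketch of the descriptive methods from the cards above on a tiny, invented `salary` column. The cards for standard deviation and for dropping records with missing values have no recorded answer; `std()` and `dropna()` are the pandas methods that match those prompts.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value.
df = pd.DataFrame({"salary": [100.0, 200.0, np.nan, 400.0]})

print(df.describe())                  # count, mean, std, min, quartiles, max
print(df["salary"].max(), df["salary"].min())
print(df["salary"].mean(), df["salary"].median())
print(df["salary"].std())             # sample standard deviation
print(df.sample(2, random_state=0))   # random sample of 2 rows

clean = df.dropna()                   # drop records with missing values
print(len(clean))
```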

101
Q

Using “group by” method we can:

A

Split the data into groups based on some criteria
Calculate statistics (or apply a function) to each group
Similar to group_by() from the dplyr package in R

102
Q

group data using rank

A

df_rank = df.groupby(['rank'])
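
A minimal split-then-summarize sketch for the "group by" cards above. The `rank` and `salary` values are invented.

```python
import pandas as pd

df = pd.DataFrame({
    "rank": ["Prof", "Prof", "AsstProf", "AsstProf"],
    "salary": [120, 140, 80, 90],
})

# Split the rows into groups by rank.
df_rank = df.groupby("rank")

# Calculate a statistic for each group.
mean_salary = df_rank["salary"].mean()

print(mean_salary)
```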

103
Q

To subset the data we can apply ____

A

Boolean indexing.

104
Q

To subset the data we can apply Boolean indexing.

This indexing is commonly known as a ____

105
Q

Any ____ can be used to subset the data:

A

Boolean operator
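
A sketch of Boolean indexing as described in the cards above, on invented data. When conditions are combined, each one needs its own parentheses, and `&` / `|` are used instead of `and` / `or`.

```python
import pandas as pd

df = pd.DataFrame({
    "rank": ["Prof", "Prof", "AsstProf"],
    "salary": [150, 90, 120],
})

# Keep only the rows where the condition is True.
high = df[df["salary"] > 100]

# Combine two conditions with & (element-wise "and").
prof_high = df[(df["salary"] > 100) & (df["rank"] == "Prof")]

print(high)
print(prof_high)
```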

106
Q

There are a number of ways to subset the Data Frame:

A

one or more columns
one or more rows
a subset of rows and columns

107
Q

Rows and columns can be selected by their position or label
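
A hedged sketch of selection by position versus by label, using the standard pandas indexers `iloc` (position) and `loc` (label). The index labels `a`, `b`, `c` are made up.

```python
import pandas as pd

df = pd.DataFrame({"salary": [100, 200, 300]}, index=["a", "b", "c"])

by_pos = df.iloc[0:2]        # positions 0 and 1; the end position is excluded
by_label = df.loc["a":"b"]   # labels a through b; the end label is included

print(by_pos)
print(by_label)
```

Both selections return the same two rows here; the difference in how the end of the slice is treated is the usual source of off-by-one surprises.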

108
Q

When selecting one column, it is possible to use a single set of brackets, but the resulting object will be a ____ (not a DataFrame):

109
Q

When we need to select more than one column and/or make the output a DataFrame, we should use ____

A

double brackets
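
A small sketch of the bracket rule above, on an invented two-column frame. The type names printed are what pandas returns for each form.

```python
import pandas as pd

df = pd.DataFrame({"rank": ["Prof", "AsstProf"], "salary": [120, 80]})

one = df["salary"]            # single brackets: one column
frame = df[["salary"]]        # double brackets: one column, kept as a table
two = df[["rank", "salary"]]  # double brackets: several columns

print(type(one).__name__)
print(type(frame).__name__)
print(two.shape)
```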

110
Q

When summing the data, missing values will be treated as ___

111
Q

If all values are missing, the sum will be equal to____

112
Q

methods ignore missing values but preserve them in the resulting arrays

A

cumsum() and cumprod()

113
Q

Missing values in ___ method are excluded (just like in R)

114
Q

Many descriptive statistics methods have ___ option to control if missing data should be excluded. This value is set to True by default (unlike R)
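
A hedged sketch of how missing values behave in the methods mentioned in the cards above. It assumes pandas 0.22 or later, where the sum of an all-missing Series is 0 by default (earlier versions returned NaN, so course material written for older pandas may say otherwise). `skipna` is the pandas option that controls whether missing data is excluded.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0])

total = s.sum()                       # NaN is skipped
running = s.cumsum()                  # NaN is skipped but kept in the result
mean_default = s.mean()               # skipna=True by default
mean_strict = s.mean(skipna=False)    # NaN propagates

# Assumption: pandas >= 0.22, where an all-NaN sum defaults to 0.
all_missing_sum = pd.Series([np.nan, np.nan]).sum()

print(total, mean_default, mean_strict, all_missing_sum)
print(running.tolist())
```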

115
Q

computing a summary statistic about each group, i.e.

compute group sums or means
compute group sizes/counts

A

Aggregation

116
Q

Common aggregation functions:

A

min, max
count, sum, prod
mean, median, mode, mad
std, var

117
Q

are useful when multiple statistics are computed per column
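
The card above has no recorded answer. As a sketch, pandas' `agg()` method computes several statistics per column in one call; the columns `salary` and `service` are invented for the example.

```python
import pandas as pd

df = pd.DataFrame({"salary": [100, 200, 300], "service": [1, 2, 3]})

# One row per statistic, one column per input column.
stats = df[["salary", "service"]].agg(["min", "mean", "max"])

print(stats)
```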

118
Q

Basic statistic (count, mean, std, min, quantiles, max)

A

describe()

119
Q

Minimum and maximum values

120
Q

Arithmetic average, median, and mode

A

mean, median, mode

121
Q

Variance and standard deviation

122
Q

Standard error of mean

123
Q

Sample skewness

124
Q

kurtosis
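
Several of the statistics cards above have no recorded answer. The sketch below uses the pandas Series methods whose names match those prompts (`var`, `std`, `sem`, `skew`, `kurt`); the pairing is an assumption on my part, and the data is made up.

```python
import pandas as pd

# Hypothetical right-skewed sample.
s = pd.Series([1.0, 2.0, 3.0, 4.0, 10.0])

print(s.min(), s.max())   # minimum and maximum values
print(s.var(), s.std())   # variance and standard deviation
print(s.sem())            # standard error of the mean
print(s.skew())           # sample skewness
print(s.kurt())           # kurtosis
```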

125
Q

histogram

126
Q

estimate of central tendency for a numeric variable

127
Q

similar to boxplot, also shows the probability density of the data

A

violinplot

128
Q

Scatterplot

129
Q

regression plot

130
Q

Pairplot

131
Q

Boxplot

132
Q

categorical scatterplot

133
Q

general categorical plot

A

factorplot (renamed catplot in seaborn 0.9 and later)

134
Q

both have a number of functions for statistical analysis

A

statsmodels and scikit-learn

135
Q

mostly used for regular analysis using R-style formulas

A

statsmodels

136
Q

is more tailored for Machine Learning

A

scikit-learn

137
Q

statsmodels:

A

linear regressions
ANOVA tests
hypothesis tests
many more

138
Q

scikit-learn:

A

kmeans
support vector machines
random forests
many more