Week 2 Flashcards

1
Q

5 steps in Machine Learning Process

A
  • Data Collection
  • Data Exploration and Preparation
  • Model Training
  • Model Evaluation
  • Model Improvement
2
Q

Data Collection

A

Involves gathering the learning material, or data set, that an algorithm will use to generate actionable knowledge. In most cases, the data will need to be combined into a single source, such as a spreadsheet or a single data set.

3
Q

Data Exploration and Preparation

A

Exploring the data generally involves getting a basic understanding of a dataset through numerous variable summaries and visual plots. Through this exploration, we will often identify problems with the data, including missing values, noise, erroneous data, and skewed distributions.
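
A minimal sketch of this kind of exploration in base R. The `patients` data frame and its values are hypothetical, made up only to show a variable summary and a missing-value count.

```r
# Hypothetical toy data; the values are invented for illustration.
patients <- data.frame(
  temperature = c(98.1, 98.6, NA, 101.4),
  age = c(34, 51, 29, 62)
)

# Per-column summary: min, quartiles, mean, max, and the number of NAs
print(summary(patients))

# Count the missing values in each column
missing_counts <- colSums(is.na(patients))
print(missing_counts)
```

A plot such as `hist(patients$age)` would add the visual side of the exploration, for example to spot a skewed distribution.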

4
Q

Model Training

A

The stage in which a subset of the entire data set (the training data) is used to build the model.
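
A minimal sketch of training on a subset, using the built-in `mtcars` data set. The 75% split, the seed, and the choice of a linear model (`lm`) are assumptions for illustration, not part of the card.

```r
set.seed(42)  # make the random split reproducible

# mtcars ships with base R and has 32 rows
n <- nrow(mtcars)
train_idx <- sample(n, size = round(0.75 * n))  # 75% is an assumed split

train <- mtcars[train_idx, ]   # rows used to build the model
test  <- mtcars[-train_idx, ]  # rows held back for later evaluation

# Fit a simple linear model on the training subset only
model <- lm(mpg ~ wt, data = train)
print(coef(model))
```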

5
Q

Model Evaluation

A
  • Because each machine learning model results in a biased solution to the learning problem, it is important to evaluate how well the algorithm learned from its experience, and to re-evaluate it as more data becomes available.
  • It's the stage in the machine learning process where you test how well the algorithm has learned from the data.
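
One way to test how well a model has learned is to measure its error on data it did not see during training. The sketch below assumes a linear model on the built-in `mtcars` data and uses root mean squared error (RMSE) as the metric; both are illustrative choices.

```r
set.seed(42)
train_idx <- sample(nrow(mtcars), size = 24)
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

model <- lm(mpg ~ wt, data = train)

# Predict for the held-out rows the model has not seen
predicted <- predict(model, newdata = test)

# RMSE: typical size of a prediction error, in the units of mpg
rmse <- sqrt(mean((test$mpg - predicted)^2))
print(rmse)
```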
6
Q

Model Improvement

A

If better performance is needed, you might need to supplement the data with additional data, or perform additional preparation steps to obtain more accurate data.

7
Q

Vectors

A
  • Fundamental data structure of R
  • All the elements in a vector must be of the same type
  • Either all elements are character, or all are numeric, or all are logical
  • Two special values: NULL indicates the absence of any value, and NA indicates a missing value
  • R vectors are ordered, so an element is accessed by its position in the vector
    • Indexing always begins at [1]
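
The points above can be checked in a few lines. The `scores` and `mixed` vectors are hypothetical examples.

```r
scores <- c(90, 85, NA, 72)   # NA marks a missing value inside the vector

print(typeof(scores))   # "double": all elements share one type
print(scores[1])        # first element; R indexes from 1
print(scores[2:3])      # elements 2 through 3
print(is.na(scores))    # which entries are missing

# Mixing types coerces every element to a common type (character here)
mixed <- c(1, "a", TRUE)
print(typeof(mixed))    # "character"

print(length(NULL))     # 0: NULL is the absence of any value
```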
8
Q

How are Vectors created?

A

Vectors are created using

c()

Example :

subjects
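
A minimal sketch of creating vectors with `c()`. Only the variable name `subjects` comes from the card; the element values and the other two vectors are hypothetical.

```r
# Hypothetical values, one vector per basic type
subjects    <- c("Math", "Biology", "History")   # character
temperature <- c(98.1, 98.6, 101.4)              # numeric
flu_status  <- c(FALSE, FALSE, TRUE)             # logical

print(length(subjects))   # number of elements
print(class(subjects))    # type of the elements
```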

9
Q

Factors

A
  • A special case of the vector that is solely used to represent categorical or ordinal variables.
  • Example: Creating a factor from a vector
    • subjects
    • Faculty
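
A minimal sketch of creating a factor from a vector with `factor()`. The names `subjects` and `Faculty` follow the card; the element values are hypothetical.

```r
# Hypothetical character vector with a repeated category
subjects <- c("Math", "Biology", "Math", "History")

# Convert the vector to a factor; the distinct values become the levels
Faculty <- factor(subjects)

print(levels(Faculty))   # the distinct categories
print(table(Faculty))    # how often each category occurs
```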
10
Q

Categorical vs Ordinal Variables

A
  1. A categorical (nominal) variable has two or more categories with no intrinsic ordering
    • (female, male)
  2. An ordinal variable has a clear ordering of its categories
    • (high, medium, low)
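
In R the distinction can be expressed with unordered and ordered factors. This is a sketch with hypothetical variables `gender` and `severity`.

```r
# Nominal: categories without an order
gender <- factor(c("female", "male", "female"))
print(is.ordered(gender))

# Ordinal: state the order explicitly, from lowest to highest
severity <- factor(
  c("low", "high", "medium", "low"),
  levels = c("low", "medium", "high"),
  ordered = TRUE
)
print(is.ordered(severity))

# Comparisons are meaningful only for ordered factors
print(severity[1] < severity[2])   # is "low" below "high"?
```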
11
Q

List

A
  • Used for storing an ordered set of elements
  • Unlike a vector, a list can hold elements of different types
  • Example:
    • subject1 <- list(
    • temperature = temperature[1],
    • flu_status = flu_status[1],
    • gender = gender[1],
    • blood = blood[1]
    • )
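
A runnable sketch of the list example. The source vectors and their values are hypothetical, and the `fullname` component is an added assumption to show a character element alongside the numeric, logical, and factor ones.

```r
# Hypothetical patient vectors, one value per patient
subject_name <- c("John Doe", "Jane Doe", "Steve Graves")
temperature  <- c(98.1, 98.6, 101.4)
flu_status   <- c(FALSE, FALSE, TRUE)
gender       <- factor(c("MALE", "FEMALE", "MALE"))
blood        <- factor(c("O", "AB", "A"))

# Collect the first patient's values, of mixed types, into one list
subject1 <- list(
  fullname = subject_name[1],   # assumed extra component
  temperature = temperature[1],
  flu_status = flu_status[1],
  gender = gender[1],
  blood = blood[1]
)

print(subject1$temperature)       # access a component by name with $
print(subject1[["flu_status"]])   # or with double brackets
str(subject1)                     # compact view of every component's type
```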
12
Q

Data Frame

A
  • Most used data structure in R
  • List of vectors or factors, each having exactly the same number of values
  • Syntax: data.frame()
  • Example:
    • emp.data <- data.frame(
    • emp_id = c(1:3),
    • emp_name = c("Pat", "John", "Mike"),
    • stringsAsFactors = FALSE
    • )
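
The data frame example as a runnable sketch, followed by two quick checks of its shape and column types.

```r
emp.data <- data.frame(
  emp_id = 1:3,
  emp_name = c("Pat", "John", "Mike"),
  stringsAsFactors = FALSE   # keep emp_name as character, not factor
)

print(emp.data)
print(dim(emp.data))   # rows, then columns
str(emp.data)          # type of each column
```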
13
Q

Data Frame Contd….

A
  • Analogous to spreadsheet
  • A data frame is two-dimensional and is often displayed as a matrix (rows and columns)
  • In the language of machine learning, columns are often called 'features' or 'attributes', while rows are called 'examples'
14
Q

To extract a column:

A

emp.data$emp_name

emp.data[c("emp_name", "salary")]
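
A sketch of both extraction forms. The `salary` column is not defined in the earlier data frame example, so its values here are hypothetical. R names are case-sensitive, so the data frame must be referred to as `emp.data`.

```r
emp.data <- data.frame(
  emp_id = 1:3,
  emp_name = c("Pat", "John", "Mike"),
  salary = c(52000, 48000, 61000),   # hypothetical values
  stringsAsFactors = FALSE
)

# $ returns a single column as a plain vector
names_only <- emp.data$emp_name
print(names_only)

# A character vector of names inside [ ] returns a smaller data frame
name_salary <- emp.data[c("emp_name", "salary")]
print(name_salary)
```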

15
Q

Matrix

A
  • Two-dimensional table with rows and columns
  • A matrix can contain only one type of data element
  • The syntax for creating a matrix: matrix()
  • You also need to specify the number of rows (nrow) or the number of columns (ncol)
16
Q

Creating a Matrix example

A

m

This creates a 2x2 matrix:

     [,1] [,2]
[1,]    1    3
[2,]    2    4
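
One call that would produce this matrix is sketched below. The exact call is an assumption, since the card shows only the variable name `m`. By default, `matrix()` fills values column by column.

```r
# Four values, two rows; R infers two columns and fills by column
m <- matrix(c(1, 2, 3, 4), nrow = 2)
print(m)

# Index as [row, column]
print(m[1, 2])   # row 1, column 2

# byrow = TRUE fills row by row instead
m_by_row <- matrix(c(1, 2, 3, 4), nrow = 2, byrow = TRUE)
print(m_by_row)
```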
