Data Science As A Field Of Knowledge Flashcards

1
Q

Machine learning

A

A subset of artificial intelligence that uses statistical techniques to enable computer systems to learn from experience without being explicitly programmed

2
Q

Supervised learning

A

A type of machine learning in which a model is trained on data with a known target variable, so that it learns how to predict that target variable.
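
As a rough illustration of learning from a known target, the sketch below fits a straight line to labeled examples and then predicts the target for an unseen input. The data (`hours`, `scores`) and the helper `fit_line` are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical labeled data: hours studied (feature) and exam score (known target).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# "Training": learn the relationship between the feature and the known target.
slope, intercept = fit_line(hours, scores)

# Prediction for an input that was not in the training data.
predicted = intercept + slope * 6.0
```

The target values are what make this supervised: the fit is judged against answers that are already known.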

3
Q

Unsupervised model

A

A model with no known target variable; this type of model is free to make connections between data points without having to consider an outside target variable.
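
One common example of this idea is clustering. The sketch below is a tiny one-dimensional k-means, assuming two starting centers picked by hand; the function name `kmeans_1d` and the data are hypothetical.

```python
def kmeans_1d(values, centers, rounds=10):
    """Tiny 1-D k-means: group values around the nearest center; no labels are used."""
    clusters = [[] for _ in centers]
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
    return centers, clusters


values = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
```

With this data the values split into a low group and a high group, found purely from how the points relate to each other.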

4
Q

Exploratory data analysis (EDA)

A

The process of exploring a data set to discover insights, identify patterns, establish relationships and trends, and test assumptions.

5
Q

Bias

A

Tendency of a sampling method to overestimate or underestimate the value of an underlying population parameter.
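
A small sketch of a sampling method that overestimates a population parameter, here the mean. The population and the sampling rule are hypothetical and deliberately extreme so the effect is easy to see.

```python
# Hypothetical population: the whole numbers 1 through 100.
population = list(range(1, 101))
true_mean = sum(population) / len(population)

# A biased sampling method: only units with a value above 50 can be selected.
biased_sample = [x for x in population if x > 50]
biased_estimate = sum(biased_sample) / len(biased_sample)

# Bias of the estimate: how far it lands from the true population mean.
bias = biased_estimate - true_mean
```

Because low values can never be sampled, the estimate is too high no matter how many units are drawn.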

6
Q

Target variable

A

The variable a model is trained to predict; typically, a component of a business problem or objective.

7
Q

Fixed data set

A

A data set that does not receive new data regularly.

8
Q

Interpretability

A

How easily a human can understand or predict a decision or result.

9
Q

Iteration

A

Function that repeats in a specified order, often until a specific result occurs.
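
A minimal sketch of repeating a step until a specific result occurs, using Newton's method for a square root as the example. The function name, tolerance, and step limit are assumptions made for illustration.

```python
def sqrt_by_iteration(x, tolerance=1e-12, max_steps=100):
    """Approximate the square root of a positive number by repeating one update step."""
    guess = max(x, 1.0)
    steps = 0
    # Repeat the same step until the result is close enough or the step budget runs out.
    while abs(guess * guess - x) > tolerance and steps < max_steps:
        guess = (guess + x / guess) / 2  # Newton's update for guess**2 == x
        steps += 1
    return guess, steps


root, steps = sqrt_by_iteration(2.0)
```

The stopping condition is the "specific result": the loop ends once the squared guess is within the tolerance of `x`.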

10
Q

iterative

A

refers to a process where the design of a product or application is improved by repeated review and testing

11
Q

Anomalies

A

data points that stand out from other data points in the data set and don’t conform to the normal behavior in the data. (They deviate from the data set’s normal behavioral patterns.)

12
Q

Outlier

A

a single data point that goes far outside the average value of a group of statistics. It is markedly different from the norm in some respect.
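
One simple way to flag such a point is to measure how many standard deviations it sits from the mean. The sketch below assumes a cutoff of 2 standard deviations, which is a rule of thumb rather than a fixed standard; `find_outliers` and the readings are hypothetical.

```python
import statistics


def find_outliers(values, z_threshold=2.0):
    """Return values lying more than z_threshold standard deviations from the mean."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all values are identical, so nothing stands out
    return [v for v in values if abs(v - mean) / spread > z_threshold]


readings = [10, 11, 12, 11, 10, 12, 11, 50]
outliers = find_outliers(readings)
```

Here only the reading of 50 is flagged. In very small samples a single extreme value inflates the standard deviation, so other rules (for example, ones based on quartiles) may work better.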

13
Q

Data Visualization

A

is the representation of data through the use of common graphics, such as charts, plots, infographics, and even animations, to communicate complex data relationships and data-driven insights in a way that is easy to understand.

14
Q

Features

A

an individual measurable property within a recorded dataset.

They are often called “variables” or “attributes.” Relevant features have a correlation or bearing (called feature importance) on a model’s use case.

15
Q

Relevant Features

A

have a correlation or bearing (called ____ importance) on a model’s use case.

16
Q

Spurious correlation

A

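when two variables appear to be related but are not: the correlation arises by coincidence or from a third, confounding variable rather than from any real connection between them.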
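
A quick sketch of how two unrelated series can still produce a high correlation coefficient. The yearly figures below are invented for illustration; both simply rise over time. `pearson` is a hand-written helper, not a library call.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical yearly counts for two things with no real connection.
streaming_subscriptions = [10, 20, 30, 40, 50]
beekeeping_permits = [3, 5, 8, 9, 12]

r = pearson(streaming_subscriptions, beekeeping_permits)
```

The coefficient comes out close to 1, yet the shared upward trend over time is the only thing the two series have in common.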

17
Q

Overfitting

A

occurs when the model cannot generalize and fits too closely to the training dataset instead
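
A toy sketch of the idea, assuming hypothetical data that roughly follows y = 2x + 1. The `memorizer` reproduces the training answers exactly but has nothing useful to say about new inputs, while a simple hand-picked line generalizes better.

```python
# Hypothetical noisy data, roughly y = 2x + 1.
train = {1: 3.1, 2: 4.9, 3: 7.2, 4: 8.8}
test = {5: 11.1, 6: 12.9}


def memorizer(x):
    """Overfit model: returns the stored answer of the closest training input."""
    nearest = min(train, key=lambda k: abs(k - x))
    return train[nearest]


def simple_line(x):
    """Simpler model (chosen by hand here) that captures the overall trend."""
    return 2 * x + 1


def mean_abs_error(model, data):
    return sum(abs(model(x) - y) for x, y in data.items()) / len(data)


memorizer_train_error = mean_abs_error(memorizer, train)
memorizer_test_error = mean_abs_error(memorizer, test)
line_test_error = mean_abs_error(simple_line, test)
```

The memorizer has zero error on the training data and a much larger error on the unseen data, which is the typical signature of overfitting.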

18
Q

Lean startup

A

a method for establishing a new company or product that focuses on customer feedback and flexible development over traditional business planning.

19
Q

minimum viable product

A

is a development technique in which a new product or website is developed with sufficient features to satisfy early adopters. The final, complete set of features is only designed and developed after considering feedback from the product’s initial users.

20
Q

model validation

A

is the process that is carried out after Model Training where the trained model is evaluated with a testing data set.
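
The sketch below shows the order of operations: train on one part of the data, then evaluate on a held-out testing set. The data, the split, and the threshold "model" are all hypothetical and kept tiny on purpose.

```python
# Hypothetical labeled examples: (feature value, label).
data = [
    (1.0, 0), (2.0, 0), (3.0, 0),
    (7.0, 1), (8.0, 1), (9.0, 1),
    (2.5, 0), (7.5, 1),
]
train, test = data[:6], data[6:]  # the last two rows are held out for testing


def train_threshold(rows):
    """Model training: put the cutoff halfway between the two class means."""
    zeros = [x for x, y in rows if y == 0]
    ones = [x for x, y in rows if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2


threshold = train_threshold(train)


def predict(x):
    return 1 if x > threshold else 0


# Model validation: measure accuracy only on rows the model never saw in training.
accuracy = sum(predict(x) == y for x, y in test) / len(test)
```

Keeping the testing rows out of training is the point: the accuracy then reflects how the model handles data it has not seen.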

21
Q

Principles behind the Agile Manifesto

A
  1. Highest priority is to satisfy the customer through early and continuous delivery of valuable software.
  2. Welcome changing requirements, even late in development.
  3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
  4. Business people and developers must work together daily throughout the project.
  5. Build projects around motivated individuals. Give them the environment and support they need and trust them to get the job done.
  6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
  7. Working software is the primary measure of progress.
  8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
  9. Continuous attention to technical excellence and good design enhances agility.
  10. Simplicity–the art of maximizing the amount of work not done–is essential.
  11. The best architectures, requirements, and designs emerge from self-organizing teams.
  12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
    https://agilemanifesto.org/principles.html