2 - Innovating with Data Flashcards

1
Q

Data is any [what] that is useful to an organisation?

Such as?

A

Information.

Such as: spreadsheets, emails, audio, images, ideas

2
Q

What are 2 challenges of legacy systems when it comes to data?

A
  • Processing large volumes and varieties of new data (batch or real-time)
  • Finding a cost-effective solution for setting up and maintaining data centres
  • Scaling resource capacity up or down
  • Accessing historical data
  • Deriving insights from old and new data
3
Q

How did budget airlines transform by unlocking the power of data?

A
  • They used past data to predict how many meals would be purchased on certain flights, avoiding both wastage and customer dissatisfaction
  • Inputs included destination, time of day, and flight connections before and after
  • This uncovered actionable insights for predicting the right number of meals
4
Q

Data mapping for retail. Each group of data sets below makes up a [what] data bucket?

  • Transactions data set, item returns data set, footfall data set = [what] data bucket?
  • Staffing levels data set, deliveries (stock delivery dates), sales performance, staff structure = [what] data bucket?
A

User

Corporate

5
Q

What are structured and unstructured data? Give examples of both.

A

Structured
- highly organised (e.g. customer info with names, addresses etc.); easily stored and managed in databases

Unstructured
- no predefined organisation (e.g. word processing docs, audio files, videos, images)

6
Q

Example of a used car dealership requiring and using both structured and unstructured data?

A

A used car dealership built an ML model to predict the price of each car coming in, using:

a photo of the car (unstructured)
the pricing of previous similar cars (structured)

It used this combined data to predict the price.

The time to value a car dropped from 20 minutes to 3 minutes.

7
Q

What are the key benefits of using cloud technology to unlock value from data, especially for traditional enterprises?

  • Businesses can process [how much?] of data in real time
  • Businesses can query their data and [get what] instantly.
A

terabytes

retrieve results

8
Q

To get the most value out of data, you need what 3 things?

A
  • to know what you have
  • to find it easily
  • to use it while keeping it secure
9
Q

What are the 3 key terms around data storage?

A

Databases
Data Warehouses
Data Lakes

10
Q

What are Google's 2 fully managed database options? What kind of data does a database store?

A

Cloud SQL
Cloud Spanner

Transactional data

11
Q

What is Google's data warehouse option? What is it good for?

A

BigQuery

Assembles data from multiple sources to make it useful for analysis.

Can transform unstructured data into semi-structured data and combine it with structured data for analysis

Enables rapid analysis of multi-dimensional datasets

12
Q

What data does a data lake generally hold? What is Google's solution?

A

Raw data of any type, often including back-up data

Cloud Storage

13
Q

Fill in the blanks for the Cloud Storage classes:

Nearline - best for data accessed once per X
Coldline - best for data accessed once per X days
Archive - best for data accessed once per X

A
  • Nearline - best for data accessed once per month
  • Coldline - best for data accessed once per 90 days
  • Archive - best for data accessed once per year
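
The access frequencies above can be turned into a simple lookup. The sketch below is illustrative only: `suggest_storage_class` is a hypothetical helper, not part of any Google library, and the `STANDARD` fallback for frequently accessed data is an assumption that goes beyond this card.

```python
def suggest_storage_class(days_between_accesses: float) -> str:
    """Suggest a Cloud Storage class from the expected gap between reads.

    Thresholds follow the card: about once a month -> Nearline,
    once per 90 days -> Coldline, once a year -> Archive.
    """
    if days_between_accesses >= 365:
        return "ARCHIVE"
    if days_between_accesses >= 90:
        return "COLDLINE"
    if days_between_accesses >= 30:
        return "NEARLINE"
    # Assumption: anything read more often than monthly stays in Standard.
    return "STANDARD"


for gap in (1, 30, 90, 365):
    print(gap, suggest_storage_class(gap))
```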
14
Q

What is GCP’s Business intelligence solution? What does it do?

A

Looker

A data platform that sits on top of an analytics database and makes it simple to describe your data and define business metrics

With a reliable source of truth for business data, anyone can analyse and explore it and share insights with a simple link

15
Q

What is ML?

Most dashboards you use as a company probably rely on backward-looking data (reports etc.) to show what has happened in the past.

To create value in your business, what do you need to do with that backward-looking data?

A

Establish trends from the backward-looking data and use ML to make predictions that help with future decisions
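
As a rough illustration of turning backward-looking data into a forward-looking estimate, the sketch below fits a least-squares trend line to past values and extrapolates one step ahead. A straight-line fit stands in for a real ML model here, and the `past_sales` figures are made up for the example.

```python
def fit_line(ys):
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_next(ys):
    """Extrapolate the fitted trend to the next (unseen) period."""
    slope, intercept = fit_line(ys)
    return slope * len(ys) + intercept


past_sales = [100, 110, 120, 130]  # hypothetical monthly figures
print(predict_next(past_sales))
```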

16
Q

AI is a term that describes any kind of machine with the ability to act autonomously.

ML is a branch of AI - what does it mean?

A

ML means computers learn from data rather than following a complex set of rules
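
A minimal sketch of "learning from data instead of rules": the 1-nearest-neighbour classifier below contains no hand-coded thresholds and simply copies the label of the closest labelled example. The feature values and the "small"/"large" labels are invented for illustration.

```python
def nearest_label(examples, point):
    """Return the label of the training example closest to `point`."""

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(examples, key=lambda ex: squared_distance(ex[0], point))
    return label


# Hypothetical labelled data: (features, correct answer)
examples = [
    ((1.0, 1.0), "small"),
    ((1.2, 0.8), "small"),
    ((5.0, 5.2), "large"),
    ((4.8, 5.5), "large"),
]

print(nearest_label(examples, (1.1, 0.9)))
print(nearest_label(examples, (5.1, 5.0)))
```

Changing the behaviour means changing the examples, not the code, which is why bad labels act like bugs.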

17
Q

ML is teaching a computer how to solve problems by providing it with the correct answers. What is an example of this in everyday life?

A

Tax Returns

Weather Patterns

18
Q

Bugs in traditional software: mistakes in code
Bugs in ML: mistakes in data

Google created an ML model that would diagnose diabetic retinopathy almost as well as ophthalmologists can.

They trained the ML model using labelled images of the backs of eyes, each label being the diagnosis.

What’s an example of the possible bugs here?

A

Because humans were involved, the data may have included incorrect labels or human bias, which is then passed into the model itself.

19
Q

The best data has 3 qualities, what are they?

C
C
C

A

  • Coverage - the data spans the scope of the problem domain and all possible scenarios, i.e. all possible input and output data.
    Example: if you train a model to detect faults in car parts but only show it red parts, the model might not be able to detect defects in blue parts
  • Consistent, Clean - if bringing in data from multiple documents, formatting must be the same across the board, e.g. timestamps
    Example: if you train with images of car parts that contain shadows, the model may treat the shadow as part of the car
    Human error: data marked incorrectly in documents (dirty data) will train the model incorrectly
  • Complete
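
To make the timestamp consistency point concrete, here is a small sketch that normalises timestamps from several sources to ISO 8601 before they are used for training. The three input formats in `KNOWN_FORMATS` are assumptions chosen for the example; real sources would need their own list.

```python
from datetime import datetime

# Assumed source formats; extend to match the documents actually ingested.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M", "%Y%m%dT%H%M%S")


def normalise_timestamp(raw: str) -> str:
    """Convert a timestamp in any known format to an ISO 8601 string."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).isoformat()
        except ValueError:
            continue  # try the next candidate format
    # Fail loudly instead of letting inconsistent data through.
    raise ValueError(f"Unrecognised timestamp format: {raw!r}")


for raw in ("2023-01-05 14:30:00", "05/01/2023 14:30", "20230105T143000"):
    print(normalise_timestamp(raw))
```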
20
Q

ML is more accessible now, and GCP has many AI/ML solutions you can leverage without the traditional cost and effort. What are the 3 options for using ML models?

(Existing and custom options)

A
  • Use a pre-trained model trained on Google's data, e.g. Vision API (detects faces, text)
  • Train an existing ML model with your data
  • Build a custom ML model and train it with your data
21
Q

Use cases for ML.

1-2 key business problems that ML is suited to fix:

A

Replacing or simplifying rule-based systems
- Google Search: uses RankBrain to rank the most likely relevant results instead of hand-written rules
Automating processes
- speech-to-text instead of manually writing reports
Understanding unstructured data
Creating personalised customer experiences