LM 11: Introduction to Big Data Techniques Flashcards

1
Q

What is fintech?

A

refers to technological innovation in the design and delivery of financial services and products

2
Q

What are the 4 characteristics or V’s of big data?

A
  1. volume
  2. velocity
  3. variety
  4. veracity (accuracy)
3
Q

What are the 6 main sources of big data? FBGISI

A
  1. financial markets
  2. businesses
  3. governments
  4. individuals
  5. sensors
  6. internet of things
4
Q

What are the 3 main sources of alternative data? IBS

A
  1. individuals
  2. business processes
  3. sensors
5
Q

What are 3 ways data can be organized? SSU

A
  1. structured
  2. semi-structured
  3. unstructured
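
A minimal sketch of the three forms, using only the Python standard library. The tickers, prices, and headline text are made-up illustrations, not real data.

```python
import csv
import io
import json

# Structured: fixed rows and columns, like a database table or CSV file.
structured = io.StringIO("ticker,price\nABC,10.5\nXYZ,20.0\n")
rows = list(csv.DictReader(structured))

# Semi-structured: tagged fields, but records need not share the same fields.
semi_structured = json.loads('[{"ticker": "ABC", "tags": ["tech"]}, {"ticker": "XYZ"}]')

# Unstructured: free text with no predefined fields.
unstructured = "ABC beat earnings expectations; XYZ issued a profit warning."

print(rows[0]["price"])                # every row has a "price" column
print(semi_structured[1].get("tags"))  # this record simply has no "tags" field
print(len(unstructured.split()))       # only crude measures such as word count come for free
```
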
6
Q

What are the 3 challenges to big data? QVA

A
  1. quality
  2. volume
  3. appropriateness
7
Q

What is artificial intelligence?

A

enables computers to perform tasks that traditionally have required human intelligence

8
Q

What is machine learning (ML)?

A

computer programs that learn how to complete tasks, improving over time as more data become available

9
Q

What are the 3 types of machine learning? SUD

A
  1. supervised learning (given labeled inputs and outputs as training data, the algorithm tries to find the model that best relates them)
  2. unsupervised learning (algorithm seeks to describe data and find patterns)
  3. deep learning (uses neural networks to identify patterns; applied in image & speech recognition)
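
A small sketch contrasting the first two types, assuming toy one-dimensional data invented for illustration. The supervised part fits a line by ordinary least squares; the unsupervised part is a tiny two-center k-means loop. Both are stand-ins chosen for brevity, and deep learning is not shown.

```python
# Supervised: labeled pairs (x, y); fit y = intercept + slope * x by least squares.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
mean_x = sum(xs) / len(xs)
mean_y = sum(ys) / len(ys)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x


def predict(x):
    """Predict the output for a new input using the fitted line."""
    return intercept + slope * x


# Unsupervised: no labels; group points around two centers (tiny k-means).
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers = [min(points), max(points)]  # simple starting guess
for _ in range(10):
    groups = [[], []]
    for p in points:
        nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
        groups[nearest].append(p)
    centers = [sum(g) / len(g) for g in groups]

print(predict(5.0))  # prediction for an unseen input
print(centers)       # the two cluster centers found without any labels
```
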
10
Q

What is the difference between underfit and overfit machine learning?

A

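underfit: the model fails to recognize true relationships in the training data set (it treats them as noise)

overfit: the model fits the training data too closely, treating noise as true relationships, so it predicts poorly on new data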
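
A rough illustration, assuming made-up data where the true relationship is y = 2x and the training set carries alternating noise of +1 and -1. The three models (a constant mean, a nearest-point memorizer, and a fitted line) are hypothetical stand-ins for underfit, overfit, and reasonable fit.

```python
def mse(model, xs, ys):
    """Mean squared error of a model over a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


train_x = [0, 1, 2, 3, 4, 5]
train_y = [1, 1, 5, 5, 9, 9]        # 2 * x plus alternating noise of +1 / -1
test_x = [0.4, 1.4, 2.4, 3.4, 4.4]
test_y = [2 * x for x in test_x]    # noise-free values of the true relationship

mean_y = sum(train_y) / len(train_y)


def underfit(x):
    """Ignores x entirely, so it misses the true relationship."""
    return mean_y


lookup = dict(zip(train_x, train_y))


def overfit(x):
    """Memorizes the training points, noise included."""
    nearest = min(train_x, key=lambda t: abs(t - x))
    return lookup[nearest]


mean_x = sum(train_x) / len(train_x)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(train_x, train_y))
         / sum((x - mean_x) ** 2 for x in train_x))
intercept = mean_y - slope * mean_x


def linear(x):
    """Least-squares line: captures the trend without chasing the noise."""
    return intercept + slope * x


for name, model in [("underfit", underfit), ("overfit", overfit), ("linear", linear)]:
    print(name, round(mse(model, train_x, train_y), 3), round(mse(model, test_x, test_y), 3))
```

In this sketch the memorizer has zero training error yet a larger test error than the fitted line, while the constant model is poor on both sets.
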

11
Q

What is data science?

A

uses computer science and statistics to extract information from big data

12
Q

Describe the 5 data processing methods data scientists use. DDDST

A
  1. data capture (how data is collected & transformed into a usable format)
  2. data curation (cleaning data to ensure high quality)
  3. data storage (recording, archiving, & accessing data)
  4. search (locating specific information in large datasets)
  5. transfer (moving data from their source or storage location to the analytical tool)
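
The five steps above can be sketched end to end with the Python standard library. This is only an illustration under assumptions: the raw feed, the `prices` table, and the price threshold are invented, and an in-memory SQLite database stands in for real storage.

```python
import csv
import io
import sqlite3

# Hypothetical raw feed; the XYZ row is missing its price.
raw = "ticker,price\nABC,10.5\nXYZ,\nDEF,7.25\n"

# 1. Capture: collect the raw text and parse it into a usable format.
captured = list(csv.DictReader(io.StringIO(raw)))

# 2. Curation: drop rows with a missing price and convert types.
curated = [(r["ticker"], float(r["price"])) for r in captured if r["price"]]

# 3. Storage: record the cleaned rows in a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (ticker TEXT, price REAL)")
conn.executemany("INSERT INTO prices VALUES (?, ?)", curated)

# 4. Search: locate specific records in the stored data.
cursor = conn.execute("SELECT ticker, price FROM prices WHERE price > ?", (8,))

# 5. Transfer: move the results out of storage into the analysis code.
result = cursor.fetchall()
conn.close()

print(result)
```
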
13
Q

What is data visualization?

A

how data will be displayed & summarized in graphical form

14
Q

What is the difference between text analytics and natural language processing?

A

text analytics uses computer programs to analyze unstructured text or voice-based datasets

natural language processing (NLP) is an application of text analytics that focuses on interpreting human language
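
A very simple text-analytics sketch: count tone words in a headline. The word lists and headlines are hypothetical, and real NLP goes well beyond word counting (it would need to handle grammar, negation, and context).

```python
import re
from collections import Counter

# Hypothetical tone word lists, for illustration only.
POSITIVE = {"beat", "growth", "strong"}
NEGATIVE = {"miss", "warning", "weak"}


def tone(text):
    """Return positive-word count minus negative-word count for a text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return sum(counts[w] for w in POSITIVE) - sum(counts[w] for w in NEGATIVE)


print(tone("Strong growth as earnings beat forecasts"))
print(tone("Profit warning after weak sales"))
```
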

15
Q

What is corporate exhaust?

A

refers to the trail of data left by business activities and transactions. Examples include supply chain information, banking transactions, and point-of-sale data.
