11. Introduction to Big Data Techniques Flashcards
Alternative data
Data that are generated from non-traditional sources, such as social media and sensor networks.
Artificial intelligence (AI)
Computer systems that are capable of performing tasks that previously required human intelligence. AI methods are sometimes better suited than traditional quantitative and statistical methods to identifying complex, non-linear relationships.
Big data
The vast amount of information being generated by both traditional sources—for example, stock exchanges, companies, governments—and non-traditional sources—for example, electronic devices, social media, sensor networks, and company exhaust.
Data science
An interdisciplinary field that harnesses advances in computer science, statistics, and other disciplines for the purpose of extracting information from big data (or data in general).
Deep learning
An area of artificial intelligence in which a system uses neural networks to perform multistage, non-linear data processing to identify patterns. Also called deep learning nets.
Expert system
A type of computer programming, often based on “if–then” rules, that attempts to simulate the knowledge base and analytical abilities of human experts in specific problem-solving contexts.
Fintech
Technological innovation in the financial services industry, specifically with the design and delivery of financial services and products. It may also refer more broadly to companies involved in developing the new technologies and their applications, as well as the business sector that includes such companies.
Internet of Things (IoT)
The vast array of physical devices, home appliances, smart buildings, vehicles, and other items that are embedded with electronics, sensors, software, and network connections that enable the objects in the system to interact and share information.
Machine learning (ML)
Involves computer-based techniques that seek to extract knowledge from large amounts of data without making any assumptions about the data’s underlying probability distribution. The goal of ML algorithms is to automate decision-making processes by generalizing, or “learning,” from known examples to determine an underlying structure in the data.
Natural language processing (NLP)
A field of research within the field of text analytics and at the intersection of computer science, AI, and linguistics that focuses on developing computer programs to analyze and interpret human language.
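One of the most basic NLP preprocessing steps is tokenization: splitting raw text into word tokens that a program can count and analyze. A minimal sketch using only the Python standard library (the transcript snippet is hypothetical):

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

# Hypothetical snippet from a quarterly earnings-call transcript.
transcript = "Revenue grew this quarter, and revenue guidance was raised."
tokens = tokenize(transcript)
frequencies = Counter(tokens)  # e.g., frequencies["revenue"] == 2
```

Real NLP pipelines go much further (stemming, stop-word removal, sentiment scoring), but they build on this same tokenize-and-count foundation.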
Neural networks
Computer programs whose design is loosely modeled on how the human brain learns and processes information.
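The basic unit of a neural network is a single artificial neuron: a weighted sum of inputs plus a bias, passed through a non-linear activation function. A minimal sketch (the weights, bias, and inputs below are hypothetical, not a trained model):

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum passed through a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))  # sigmoid squashes output to (0, 1)

# Hypothetical inputs and weights; a network stacks many such neurons
# in layers, and training adjusts the weights and biases.
output = neuron(inputs=[0.5, -1.0], weights=[0.8, 0.3], bias=0.1)
```

Deep learning networks (see above) chain many layers of such neurons, which is what enables the multistage, non-linear processing the definition describes.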
Overfitting
When a machine learning model learns the input and target dataset too precisely, making the system more likely to discover false relationships or unsubstantiated patterns that will lead to prediction errors.
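A deliberately contrived sketch of the idea: a lookup-table "model" that memorizes every training example achieves a perfect training fit, noise included, but cannot generalize to inputs it has never seen. All labels below are hypothetical:

```python
# Hypothetical labeled training data; the label for input 2 is noise.
train = {1: "up", 2: "down", 3: "up", 4: "up"}

def memorizer(x):
    # Overfitted model: reproduces the training data exactly, noise included.
    return train.get(x, "unknown")

def simple_rule(x):
    # Deliberately simple model: always predict the majority class.
    return "up"

train_accuracy = sum(memorizer(x) == y for x, y in train.items()) / len(train)
# The memorizer scores 1.0 on the training data, yet returns "unknown"
# for any new input, while the simple rule still gives a usable answer.
```

Real overfitting is subtler than pure memorization, but the symptom is the same: excellent in-sample fit paired with poor out-of-sample prediction.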
Scraping
An automated, large-scale, algorithm-driven approach that retrieves otherwise unstructured data available on websites and creates data in a more structured format.
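A minimal sketch of the "unstructured HTML in, structured data out" step, using only the standard library's `html.parser`. The HTML price table here is a hypothetical hardcoded string; a real scraper would fetch pages over the network:

```python
from html.parser import HTMLParser

# Hypothetical web page fragment: a price table to be scraped.
HTML = """
<table>
  <tr><td>ABC</td><td>101.50</td></tr>
  <tr><td>XYZ</td><td>47.20</td></tr>
</table>
"""

class TableScraper(HTMLParser):
    """Collect each table row as a structured (ticker, price) tuple."""
    def __init__(self):
        super().__init__()
        self.rows = []      # structured output
        self.current = []   # cells of the row being parsed
        self.in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False
        elif tag == "tr" and self.current:
            self.rows.append((self.current[0], float(self.current[1])))
            self.current = []

    def handle_data(self, data):
        if self.in_cell:
            self.current.append(data.strip())

scraper = TableScraper()
scraper.feed(HTML)
# scraper.rows is now [("ABC", 101.5), ("XYZ", 47.2)]
```

Production scrapers add request scheduling, error handling, and site-specific parsing rules, but the core transformation, from markup to rows and columns, is the one shown here.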
Supervised learning
A type of machine learning in which the system attempts to learn to model relationships based on labeled training data.
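A minimal supervised-learning sketch: a one-nearest-neighbor classifier that "learns" from labeled examples by predicting the label of the most similar training point. The features and labels below are hypothetical:

```python
# Hypothetical labeled training data:
# each example is ([P/E ratio, dividend yield], style label).
labeled_training_data = [
    ([8.0, 4.0], "value"),
    ([10.0, 3.5], "value"),
    ([40.0, 0.0], "growth"),
    ([55.0, 0.5], "growth"),
]

def predict(features):
    """Return the label of the closest training example (1-nearest neighbor)."""
    def distance(example):
        xs, _ = example
        return sum((a - b) ** 2 for a, b in zip(xs, features))
    _, label = min(labeled_training_data, key=distance)
    return label

prediction = predict([9.0, 3.8])  # near the "value" examples
```

The defining feature of supervised learning is visible in the data structure itself: every training example carries a known target label.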
Text analytics
Involves the use of computer programs to analyze and derive meaning typically from large, unstructured text- or voice-based datasets, such as company filings, written reports, quarterly earnings calls, social media, email, internet postings, and surveys.
Underfitting
When a machine learning model treats true parameters as if they were noise and fails to recognize relationships in the training data, making it likely to miss the patterns that underlie the data.
Unsupervised learning
A type of machine learning in which the system tries to learn the structure of unlabeled data.
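A minimal unsupervised-learning sketch: k-means clustering (here k = 2) on unlabeled one-dimensional data, using only the standard library. Note that, unlike the supervised example, no labels appear anywhere; the data points are hypothetical daily returns in percent:

```python
# Hypothetical unlabeled data: daily returns (%) with two apparent regimes.
data = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]

def k_means(points, centroids, iterations=10):
    """Alternate assignment and update steps to find cluster centers."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

centroids, clusters = k_means(data, centroids=[0.0, 1.0])
# The algorithm discovers the two groups on its own, with no labels given.
```

The contrast with the supervised sketch is the essential point: the structure (two clusters) is inferred from the data rather than taught from labeled examples.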