Big Data Flashcards
What is FinTech?
Technology-driven innovation in the finance industry
What were early forms of FinTech?
Data processing and automation of routine tasks
What are two important applications of FinTech in quantitative analysis?
Analysis of large datasets, including alternative data
Analytical tools such as AI
What is meant by Big Data?
The vast amount of information generated by industry, governments, individuals, and electronic devices. It includes data from traditional sources (stock exchanges, companies) as well as non-traditional sources (social media, sensor networks).
What are the four characteristics of Big Data?
Volume (large amounts of data)
Velocity (high speed and frequency, real-time data)
Variety (many different sources and formats)
Veracity (credibility, reliability)
What are sources of Big Data?
Financial markets
Businesses
Governments
Individuals
Sensors
Internet of Things
What is the difference between traditional business intelligence and Big Data?
Traditional business intelligence relies mainly on data from traditional sources; Big Data also incorporates alternative data sources.
What are the three main sources of alternative data?
Individuals
Business processes
Sensors
What are challenges of Big Data?
Quality of data
Volume of data
Appropriateness of data
What is Artificial Intelligence?
Computer systems that are capable of performing tasks that traditionally have required human intelligence.
What are neural networks?
Programming modeled on how the human brain learns and processes information
What is Machine Learning?
Computer-based techniques that seek to extract knowledge from large amounts of data without making any assumptions about the data’s underlying probability distribution.
What is an expert system?
A type of computer programming that attempted to simulate the knowledge base and analytical abilities of human experts in specific problem-solving contexts
What is the goal of Machine Learning?
Generate structure or predictions from data without any help from a human. Find the pattern, apply the pattern.
What are the three datasets involved in Machine Learning?
Training dataset: Identify relationships between inputs and outputs.
Validation dataset: Validate relationships and tune the model.
Test dataset: Test the model’s ability to predict well on new data.
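How might the three datasets be used in practice?
A minimal sketch, not a definitive implementation. The 60/20/20 split, the synthetic data (y = 2x plus noise), the k-nearest-neighbour model, and the helper names split_dataset, knn_predict, and mean_squared_error are all assumptions made for illustration. The training set supplies the relationships, the validation set picks k, and the test set is used once to check performance on new data.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle rows and split them into training, validation, and test lists."""
    if train_frac <= 0 or val_frac <= 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must be positive and leave room for a test set")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # everything left over
    return train, val, test


def knn_predict(train, x, k):
    """Average the outputs of the k training rows whose input is closest to x."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def mean_squared_error(train, rows, k):
    """Mean squared prediction error on rows, using train as the reference data."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in rows) / len(rows)


# Synthetic (input, output) pairs; purely illustrative, not real market data.
rng = random.Random(42)
data = []
for _ in range(200):
    x = rng.uniform(0, 10)
    data.append((x, 2 * x + rng.gauss(0, 1)))

train, val, test = split_dataset(data)

# Tune the model: choose the k with the lowest error on the validation set.
candidate_ks = [1, 3, 5, 9, 15]
best_k = min(candidate_ks, key=lambda k: mean_squared_error(train, val, k))

# Final check: measure error on the test set, which played no part in tuning.
test_mse = mean_squared_error(train, test, best_k)
print(f"best k = {best_k}, test MSE = {test_mse:.3f}")
```

The test set is kept out of both fitting and tuning so that its error is an honest estimate of how the model would perform on data it has never seen.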