Part 3 Flashcards
1
Q
Streaming analytics is categorized..
A
based on the location of the data that is being processed
2
Q
Types of streaming analytics and definitions
A
- Edge: data collected within the device (e.g. a sensor)
- In-stream: data collected between the device and the server (e.g. network security logs)
- At-rest: data collected and stored at rest (e.g. a database of customer information)
3
Q
Data scientists need skills in .. to be effective
A
Computer science, mathematics, domain expertise
4
Q
Data-driven organizations
A
Drive towards a data culture, where access to data is needed and data-driven decision making is expected.
5
Q
Data quality issues:
A
- Noisy data: contains a large amount of misleading information
- Dirty data: contains missing or erroneous values
- Sparse data: contains very few actual values
- Inadequate data: incomplete data
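
A rough way to check for two of these issues in practice is sketched below. The function names and sample readings are hypothetical, and treating zeros and missing entries as "not actual values" is an assumption made only for illustration.

```python
from typing import Optional, Sequence


def missing_fraction(values: Sequence[Optional[float]]) -> float:
    """Share of entries that are missing (None): a rough dirty-data signal."""
    if not values:
        return 0.0
    return sum(v is None for v in values) / len(values)


def sparsity(values: Sequence[Optional[float]]) -> float:
    """Share of entries that are zero or missing: a rough sparse-data signal.

    Assumption: zero and None both count as "no actual value".
    """
    if not values:
        return 0.0
    return sum(v is None or v == 0 for v in values) / len(values)


# Hypothetical sensor readings, not taken from any real data set.
readings = [0, 0, None, 3.5, 0, 0, None, 0]
dirty_share = missing_fraction(readings)
sparse_share = sparsity(readings)
print(f"missing: {dirty_share:.2%}, sparse: {sparse_share:.2%}")
```

High values for either share suggest the data needs cleaning or more collection before modeling.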
6
Q
Clean data may contain too many features to be modeled efficiently (Curse of Dimensionality). Some processes to address this issue are:
A
- Feature extraction: combining features into a smaller set of features
- Feature selection: selecting the most meaningful features
- Feature engineering: combining existing features in a data set with external features
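
The three processes above can be sketched as follows. This is a minimal illustration under stated assumptions: the rows, column names, and helper functions are hypothetical, a variance threshold stands in for "most meaningful", and BMI is just one possible way to combine two features into one.

```python
from statistics import pvariance

# Hypothetical data set: three records with three candidate features.
rows = [
    {"id": 1, "height_cm": 170.0, "weight_kg": 65.0, "constant": 1.0},
    {"id": 2, "height_cm": 180.0, "weight_kg": 81.0, "constant": 1.0},
    {"id": 3, "height_cm": 160.0, "weight_kg": 51.2, "constant": 1.0},
]


def select_features(rows, candidates, min_variance=1e-9):
    """Feature selection: keep features whose variance exceeds a threshold.

    Assumption: variance is used here as a simple proxy for "meaningful".
    """
    return [f for f in candidates if pvariance([r[f] for r in rows]) > min_variance]


def extract_bmi(row):
    """Feature extraction: combine height and weight into one feature (BMI)."""
    height_m = row["height_cm"] / 100
    return row["weight_kg"] / height_m ** 2


def add_external(rows, external, key="id", name="region"):
    """Feature engineering: add a feature looked up from an external source."""
    return [{**r, name: external[r[key]]} for r in rows]


selected = select_features(rows, ["height_cm", "weight_kg", "constant"])
bmi_values = [extract_bmi(r) for r in rows]
# Hypothetical external lookup table keyed by record id.
enriched = add_external(rows, {1: "north", 2: "south", 3: "north"})
```

Here `selected` drops the constant column, `bmi_values` replaces two features with one, and `enriched` carries an extra `region` field per record.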
7
Q
A powerful secure computing infrastructure enables data scientists..
A
to cycle through multiple data preparation techniques and different models to find the best solution