Lecture 16: Data Forensics and Analysis Flashcards

1
Q

What steps are in the forensic process?

A
  • Seizure
  • Imaging
  • Analysis
  • Reporting
2
Q

What is data analysis and what is needed?

A

Data analysis: “separation of a whole into its component parts”

Needed: Sufficient relevant data

3
Q

What are the constraints of a data analysis task?

A
  • Time
  • Access to data
  • Technological resources
4
Q

Data limitations

A
  • Missing data
  • Altered data
  • Different forms of the same data
  • Different definitions of the same data
  • Non-existent data
5
Q

What are the steps in the data analysis process?

A
  • Planning
  • Data collection
  • Data preparation
  • Data analysis
6
Q

What does the analysis plan contain?

A
  • Purpose of the analysis
  • Audience for the analysis
  • Data availability and quality (choices about what data to include)
7
Q

What does data preparation entail?

A
  • Data inventory (list items, sources and dates)
  • Data pre-processing (improve quality, data cleaning)
  • Databases and data warehouse
8
Q

What are the major tasks in data pre‐processing?

A
  • Data summarisation (identify typical properties of the data and understand its distribution: central tendency and dispersion)
  • Data cleaning
  • Data integration
  • Data transformation
  • Data reduction
  • Data discretisation
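
Two of the tasks above, transformation and discretisation, can be illustrated with a minimal sketch. The card does not name specific methods, so min-max scaling and equal-width binning are assumptions chosen for illustration, as are the function names and the sample ages.

```python
def min_max_scale(values):
    """Data transformation: rescale values linearly into the range [0, 1].

    Assumes the values are not all identical (otherwise hi - lo is zero).
    """
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Data discretisation: map each value to one of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [20, 30, 40, 60]          # hypothetical sample data
scaled = min_max_scale(ages)     # values between 0 and 1
binned = equal_width_bins(ages, 2)  # bin index 0 or 1 for each age
```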
9
Q

What does data cleaning entail?

A

Handle missing values:
– Ignore the tuple
– Fill in the missing value manually
– Use a global constant
– Use the attribute mean
– Use the attribute mean for all samples belonging to the same class
– Use the most probable value
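
Three of the strategies above can be sketched as follows. This is a minimal illustration, not a reference implementation: the record layout (class label, value), the use of None to mark a missing value, and all function names are assumptions.

```python
from statistics import mean

# Hypothetical records: (class label, value); None marks a missing value.
records = [
    ("a", 10.0), ("a", None), ("a", 14.0),
    ("b", 30.0), ("b", None), ("b", 50.0),
]


def ignore_tuples(rows):
    """Drop every record whose value is missing."""
    return [(c, v) for c, v in rows if v is not None]


def fill_with_attribute_mean(rows):
    """Replace each missing value with the mean of all known values."""
    overall = mean([v for _, v in rows if v is not None])
    return [(c, overall if v is None else v) for c, v in rows]


def fill_with_class_mean(rows):
    """Replace each missing value with the mean of known values in its class.

    Assumes every class has at least one known value.
    """
    known = {}
    for c, v in rows:
        if v is not None:
            known.setdefault(c, []).append(v)
    class_mean = {c: mean(vs) for c, vs in known.items()}
    return [(c, class_mean[c] if v is None else v) for c, v in rows]
```

With the sample records, the attribute mean fills both gaps with 26.0, while the class mean fills them with 12.0 for class "a" and 40.0 for class "b".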

10
Q

What is the mean?

A

The mean is the average of a data set, calculated by dividing the sum of all values by the number of values
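
As a quick check of the definition, a minimal sketch (the function name and sample values are illustrative assumptions):

```python
def arithmetic_mean(values):
    """Sum of all values divided by the number of values."""
    return sum(values) / len(values)


sample = [2, 4, 4, 5, 10]          # hypothetical data set
result = arithmetic_mean(sample)   # 25 / 5
```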

11
Q

What is the median?

A

The median is the middle value of the sorted data set if N is odd, and the average of the two middle values if N is even
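
The odd and even cases can be sketched as below. The function name is an illustrative assumption, and the sketch assumes a non-empty list.

```python
def median(values):
    """Middle value of the sorted data (odd N), or the average
    of the two middle values (even N)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_case = median([5, 1, 3])      # sorted: 1, 3, 5
even_case = median([4, 1, 3, 2])  # sorted: 1, 2, 3, 4
```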

12
Q

What is the mode?

A

Value that occurs most frequently in the set
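
A data set can have more than one most frequent value, so this minimal sketch returns a list. The function name is an illustrative assumption, and a non-empty input is assumed.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [v for v, c in counts.items() if c == highest]


single = modes([4, 4, 1])        # one mode
double = modes([1, 2, 2, 3, 3])  # two values tie for most frequent
```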

13
Q

What is the midrange?

A

The midrange is the average of the smallest and largest values in a data set
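
A minimal sketch of the definition (function name and sample values are illustrative assumptions):

```python
def midrange(values):
    """Average of the smallest and largest values."""
    return (min(values) + max(values)) / 2


result = midrange([3, 9, 1, 7])  # (1 + 9) / 2
```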

14
Q

What is data dispersion?

A

The degree to which numerical data tend to spread
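
The card does not list specific measures, so the sketch below assumes three common ones: range, variance and standard deviation (population versions, from Python's `statistics` module). The function name and sample values are illustrative.

```python
from statistics import pstdev, pvariance


def spread(values):
    """Summarise dispersion with range, population variance and
    population standard deviation."""
    return {
        "range": max(values) - min(values),
        "variance": pvariance(values),
        "std_dev": pstdev(values),
    }


summary = spread([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

For this sample the mean is 5, so the variance is 4 and the standard deviation is 2.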

15
Q

What is behaviour profiling?

A

The capability to recognize patterns of criminal activity and to predict when and where crimes are likely to take place

16
Q

What is data mining?

A

Knowledge discovery in databases
KDD = data mining process

17
Q

What can data mining be used for?

A

– Market analysis and management
– Risk analysis and management
– Fraud detection and management

18
Q

Data mining techniques

A
  • Link analysis (graphical analysis)
  • Clustering analysis (grouping a set of data objects into clusters)
  • Intelligent agents
  • Text mining
  • Neural Networks
  • Machine‐learning algorithms
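
To make "grouping a set of data objects into clusters" concrete, here is a toy sketch that groups one-dimensional values by the gaps between them. This is a deliberately simple stand-in, not a standard algorithm such as k-means; the function name, the `max_gap` parameter and the sample amounts are all assumptions.

```python
def cluster_by_gap(values, max_gap):
    """Group sorted values; start a new cluster whenever the gap to the
    previous value exceeds max_gap. Assumes a non-empty input."""
    ordered = sorted(values)
    clusters = [[ordered[0]]]
    for v in ordered[1:]:
        if v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)   # close enough: same cluster
        else:
            clusters.append([v])     # large gap: new cluster
    return clusters


amounts = [12, 15, 14, 98, 102, 500]   # hypothetical transaction amounts
groups = cluster_by_gap(amounts, max_gap=10)
```

A value that ends up alone in its own cluster, like 500 here, is the kind of outlier an analyst might inspect further in a fraud-detection setting.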