MS Past paper qs Flashcards
Describe in detail the linear sweep disassembly algorithm, its strengths and limitations (if any). [6 marks]
2017/18/19/23
The linear sweep disassembly algorithm is a type of static malware analysis where in a binary, we iterate through all code segments and we decode all the bytes consecutively and then we parse them into a list of instructions. 1st 4 - data, rest code.
binary blob -> iterate through code -> decode bytes one by one linearly -> parse into a list of instructions
Strengths:
- provides complete coverage of a program’s code sections - won’t miss code
- x86 can automatically resynchronise itself after just a few instructions
- safe for disassembling ELF binaries, don’t typically contain inline data (good for benign libraries)
Limitations:
- doesn’t understand control flow
- compiler often mixes code with data as memory address is unknown so doesn’t actually know what’s the code and what’s the data, but it just assumes everything is good
- with jump tables it can intersperse data: if data is mistaken for code may encounter invalid opcodes
- some data can correspond to valid opcodes so this will output bogus instructions
- as a result can become desynchronised so misses instructions
Describe in detail the recursive traversal disassembly algorithm, its strengths and limitations (if any). [6 marks]
2017/18/19/23
- Begin disassembling from known entry points in the binary.
- Follow the control flow recursively until encountering branches.
- If the branch is unconditional, follow the known address and continue the flow.
- If the branch is conditional, the address is unknown, so the instruction pointer changes accordingly. Otherwise, continue linearly.
- Keep track of return instructions and start disassembling from there recursively.
Strengths
- can easily distinguish code from data
- good for inline/malicious code as it’s not easily fooled into producing bogus output
Limitations
- Indirect code invocation: Jump tables create indirect control flow, making it harder for disassemblers to track instruction flow because there’s no direct target address to follow.
- Hard to follow: Disassemblers struggle with jump tables since they can’t statically resolve where the data is loaded, some need runtime conditions resolved so may miss blocks of code, unless they use special heuristics to handle them. Hard to know where to jump during analysis.
- Some instructions, like “ret” instructions, cause errors and prevent further analysis, leading to missed instructions and incomplete disassembly.
- Recursive Traversal Disassemblers might fail to discover instructions at specific addresses, like 0x443903-0x443925, possibly due to complexities in the code or limitations in the disassembler’s approach.
What is a botnet and what is a botnet C&C server? [3 marks]
2017/18/19/22
A botnet is a network of bots, which are malware that carry out some malicious action in co-ordination with other bots.
A botnet C&C server is a command and control server which the bots utilise to communicate over. This is the server which the attacker/botmaster uses to control the bots.
How are botnets usually created (cite at least 2 different ways)?
Describe two different ways in which botnets are usually created.
[2 marks]
2017/18/19
- As a computer worm
- A Trojan horse resident in a program
Describe three methods botnets use to determine rendez-vous points for C&C servers. [9 marks]
2022
- Botnets may be centralised in which case each bot communicates directly with the botmaster. However, the botmaster would then become a bottleneck for large botnets
- Many botnets use a hierarchical structure in which the botmaster communicates with a set of bots that are in turn botmasters for other bots. This allows control over a large botnet
- Peer-to-peer botnets use a C&C structure in which there is no single C&C server. Instead, a peer-to-peer network is constructed, with the bots acting as peers. Thus, if some portion of the botnet is deleted, the remainder of the botnet can continue to function
fast flux/domain flux
Describe in detail how a drive-by download attack works and what it is used for. [3 marks]
2017/18/22/23
It is a type of infection vector which is the way a malware uses to propagate itself on a computer
This is unintentional download of malicious code through typically taking advantage of an app, OS or web browser that contains security flaws due to unsuccessful/lack of updates.
Uses:
- Exploiting APIs for browser plugins to enable the downloading and execution of arbitrary files
- Exploiting vulnerabilities in the web browser or plugins
State what a rootkit is and describe at least one of its hooking techniques; specify pros and cons. [4 marks]
2017/18/19/22
A rootkit is a pernicious Trojan horse, it hides itself on a system so it can carry out its actions without being detected.
Hooking techniques: altering part of a kernel, for example, the Knark rootkit modifies entries in the system call table to invoke new versions in a kernel-loadable module. These new versions hijack system calls related to examining the file system, network connections, spawning processes, etc., concealing the presence of the rootkit.
Pros:
- Stealth: Kernel modification techniques allow rootkits to operate stealthily by altering core system functions, evading detection by traditional security measures.
- Persistence: Once the kernel is modified, the rootkit can maintain control over the system across reboots, ensuring persistent access for the attacker.
Cons:
- Complexity: Modifying the kernel requires a deep understanding of system internals, making the development and deployment of such rootkits complex.
- Detection: Detection methods such as comparing the system call table stored in the kernel with a copy stored on disk at boot time can potentially uncover the presence of kernel-level modifications.
Describe in a few sentences what clustering (i.e., unsupervised learning) and classification (i.e., supervised learning) are. For each of these two categories, provide the name of one well-known algorithm we have discussed in class. [4 marks]
2018/19/22/23
In clustering, there are an unknown number of classes and the model does not have any prior knowledge. It groups and interprets similar data based only on input data.
(Hidden Markov model)
In classification, there is a known number of classes and we learn from a set of labelled samples. Given a labeled dataset, we find a model that separates them into their respective classes.
(K-Nearest Neighbours)
Report the formulas of Accuracy, F1-Score, Precision and Recall (1 mark for each formula).
2018/19/22/23
Accuracy = (True Positive + True Negative) / ALL
F1 = (2 * Precision * Recall) / Precision + Recall
Precision = True Positive / (True positive + False positive)
Recall = True Positive / (True Positive + False Negative)
Discuss the different behaviour of F1-Score with respect to “Accuracy” in the case of a dummy binary supervised classifier that, given any application as input, predicts always ”0” (benign), and never ”1” (malicious); consider a testing dataset of 1000 apps, with 990 app having ground truth label “benign”, and 10 app having ground truth label “malicious”. In particular, for the comparison, refer to the following definition of Accuracy:
Accuracy = (T P + T N) / (T P + F P + T N + F N)
where TP are True Positives, FP are False Positives, TN are True Negatives, and FN are False Negatives.
[5 marks]
2018/19/22/23
Accuracy:
Total = 1000
TP = 0, none are malicious, all predicted benign
TN = 990, predicted benign actually benign
FP = 10, predicted benign actually malicious
FN = 0, predicted no malicious but all benign
Accuracy = (0 + 990) / (0 + 10 + 990 + 0) = 990/1000 = 0.99
F1-Score:
The F1-score is calculated using precision and recall. Precision is the ratio of true positives to the sum of true positives and false positives. Recall is the ratio of true positives to the sum of true positives and false negatives.
Precision = TP / (TP + FP) = 0 / (0 + 10) = 0
Recall = TP / (TP + FN) = 0 / (0 + 0) = 0
F1 = 2 * Precision * Recall / Precision + Recall = 2 * 0 * 0 / 0 + 0 = 0
Comparison:
- Accuracy gives a high value (0.99), indicating that the classifier performs well in terms of overall correctness. However, this metric can be misleading because it does not consider the class imbalance (i.e., the majority class “benign” dominates the accuracy).
- F1-score, on the other hand, is 0 due to the absence of true positives. This highlights the limitation of F1-score in situations where one class dominates the dataset and the classifier fails to predict the minority class.
Discuss what are True Positive (TP), True Negatives (TN), False Positive (FP), False Negatives (FN) in the context of a malware classifier that classifies an input sample as malicious or benign. [8 marks]
2023
TP = binary malware, system malware - CORRECT MALICIOUS - Malware labeled as Malware
TN = binary malware, system not malware - CORRECT BENIGN - A normal program labeled as safe
FP = binary not malware, system malware (Type 1 Error) - INCORRECT MALICIOUS - A normal program labeled as Malware
FN = binary not malware, system not malware (Type 2 Error) - INCORRECT BENIGN - Malware labeled as safe
Describe the high-level view of the algorithm with an example picture in
2D space, showing the decision boundary, some feature points, and the
support vectors (4 marks).
2018/19
|. . . / / / *. *
|. */ / / *. *
|. *. / / / *. *
|. / / /. *
|. / / / *. *
_____________________________
decision boundary / hyperplane = the line in the middle
support vectors = feature points that are on the margins
feature points = *
Describe the Radial Basis Function (RBF) non-linear kernel through a
visual example in a 2D graph (3 marks).
2018/19
same as SVM normal but it’s in a circle, outer circle one class inner circle another class
You can conceptualize an SVM with the RBF kernel as generating a smoothed linear combination of spheres around each data point ( x ):
- Inside each sphere, the classification is the same as ( x ), while outside is assigned the opposite class.
- The parameter ( \gamma ) determines the radius of these spheres, i.e., the proximity required to classify a point similarly.
- The parameter ( C ) determines the degree of “smoothing.”
With the aid of diagrams, show and explain the main differences between overwriting viruses, companion viruses, and parasitic viruses (in all their forms). [9 marks]
2019/22/23
Overwriting viruses:
- overwrites section of program file with virus, may break infected program
- infect host file with bigger/smaller file size
- potentially break infected program
Companion viruses:
- don’t overwrite or infect files, but create new files with same name as the legitimate one but with a different file extension
- so when user runs legit file, companion file also executes
- in DOS if no extension is given order of execution is COM, EXE then BAT
- eg. infects COM file of same name as EXE file
Parasitic viruses:
- prepending: insert virus code at beginning of executable then shift original code to follow virus
- appending: append virus code to executable, insert JMP at beginning of executable
- fragmenting: code intermixed with original ones
Briefly explain what a phishing attack is. [2 marks]
2019/23
Phishing is the act of impersonating a legitimate entity, typically a web site associated with a business, to obtain information such as passwords, credit card numbers, and other private information without authorisation.
- Requires that the attackers create a website that display a page that looks like it belongs to a bank.
- User is fooled into thinking this is the legit page for the bank.
- Attacker creates an email instructing victims to click on an enclosed link that will take them to the fake page.
- However URL is the real bank but underlying link is the fake bank.
- User enters personal credentials and attacker saves credentials for later use.