Malware Analysis Theory Flashcards
Which teams perform malware analysis?
- Security Operations: teams analyze malware to write detections for malicious activity in their networks
- Incident Response: teams analyze malware to determine what damage has been done to an environment so they can remediate and revert that damage
- Threat Hunt: teams analyze malware to identify IOCs, which they use to hunt for malware in a network
- Malware Researchers: teams at security product vendors analyze malware to add detections for it to their security products
- Threat Research: teams at OS vendors like Microsoft and Google analyze malware to discover the vulnerabilities it exploits and add more security features to the OS/applications
What are the rules for handling malware in a safe environment?
- Never analyze malware or suspected malware on a machine that does not have the sole purpose of analyzing malware
- When malware samples are not being analyzed, or when they are being moved between locations, always keep them in password-protected zip/rar or other archives to avoid accidental detonation
- Only extract the malware from this password-protected archive inside the isolated environment, and only when analyzing it.
- Create an isolated VM specifically for malware analysis that can be reverted to a clean slate once you are done.
- Ensure that all internet connections are closed or at least monitored.
- Once you are done with malware analysis, revert the VM to its clean slate for the next malware analysis session to avoid residue from a previous malware execution corrupting the next one.
What are executable files often called?
a binary or PE (Portable Executable) file
What happens during static malware analysis?
malware is analyzed without being executed
What are some examples of static malware analysis tasks?
- checking for strings in malware
- checking the PE header for information related to different sections
- looking at the code using a disassembler
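The PE header check listed above can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a real PE parser: the function name `list_pe_sections` is hypothetical, and it reads only the DOS header, the COFF file header and the section table, assuming a well-formed file.

```python
import struct

def list_pe_sections(data: bytes) -> list[tuple[str, int, int]]:
    """Return (name, virtual_size, raw_size) for each section of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature
    (_machine, num_sections, _timestamp, _sym_ptr, _num_syms,
     opt_header_size, _characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    # The section table starts right after the optional header
    table = pe_offset + 4 + 20 + opt_header_size
    sections = []
    for i in range(num_sections):
        # Each section header is 40 bytes; only the first four fields are read
        name, virtual_size, _virtual_addr, raw_size = struct.unpack_from(
            "<8sIII", data, table + i * 40)
        label = name.rstrip(b"\x00").decode("ascii", "replace")
        sections.append((label, virtual_size, raw_size))
    return sections
```

Unusual section names, or a raw size far smaller than the virtual size, are often treated as hints that a sample may be packed, although neither is conclusive on its own.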
What happens during dynamic malware analysis?
the malware is run in a VM, either manually with tools installed to monitor its activity or in a sandbox that performs this task automatically
What is the Linux distribution built for malware analysis?
REMnux
Which command is used to detect the actual file type in Linux?
file
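The idea behind this kind of detection is that the type is read from the file's leading "magic" bytes rather than from its extension. A minimal sketch in Python, assuming a tiny hand-picked signature table (the real `file` command uses a far larger database; `MAGIC` and `guess_type` are hypothetical names):

```python
# A few well-known file signatures; illustrative only, not exhaustive
MAGIC = {
    b"MZ": "DOS/PE executable",
    b"\x7fELF": "ELF executable",
    b"PK\x03\x04": "ZIP archive",
    b"%PDF": "PDF document",
}

def guess_type(data: bytes) -> str:
    """Guess a file type from its leading bytes, ignoring any extension."""
    for magic, label in MAGIC.items():
        if data.startswith(magic):
            return label
    return "unknown"
```

A sample named `invoice.pdf` whose content starts with `MZ` would be reported as an executable here, which is exactly the mismatch an analyst wants to catch.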
Which Linux command lists down the printable strings present in a file?
strings {filename}
What can the strings command reveal?
embedded text such as URLs, file paths, error messages, Windows API calls or even specific keywords
What can be breached when uploading a malware sample to a third-party malware analyzer?
confidentiality (the malware may contain sensitive information specific to a targeted company)
Which Linux command is used to calculate an MD5 checksum?
md5sum
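The same checksum can be computed with Python's standard `hashlib` module. A minimal sketch (the helper name `md5_of_file` and the chunk size are arbitrary choices); it reads the file in chunks so large samples do not have to fit in memory:

```python
import hashlib

def md5_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting hash identifies the sample without requiring anyone to open or share the file itself.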
Which Linux command shows the access, modify, change and, where available, birth time of a file?
stat
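The same timestamps are available from Python through `os.stat`. A minimal sketch, assuming a Linux analysis machine (`file_times` is a hypothetical helper; the birth time is exposed only on some platforms, so it is read defensively):

```python
import os
from datetime import datetime, timezone

def file_times(path: str) -> dict[str, str]:
    """Return access, modify, change (and birth, if exposed) times as UTC ISO strings."""
    st = os.stat(path)

    def iso(ts: float) -> str:
        return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()

    times = {
        "access": iso(st.st_atime),
        "modify": iso(st.st_mtime),
        "change": iso(st.st_ctime),  # on Linux: last metadata change
    }
    # st_birthtime is not present on every platform
    birth = getattr(st, "st_birthtime", None)
    if birth is not None:
        times["birth"] = iso(birth)
    return times
```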
What do most PE files use to perform the bulk of their work?
Windows API
What is entropy?
a measure of randomness or unpredictability in a dataset, such as a file or network traffic
What is high entropy usually associated with?
encrypted or compressed data, where the content appears random and lacks obvious patterns or structure
Why is entropy important for malware analysis?
analyzing the entropy of files can help identify potential malware: unusually high entropy in a file format that is not typically highly randomized, like an executable file, suggests encrypted or compressed content
Which Linux tool is used for PE file analysis?
pecheck
What is the name of a GUI-based tool used to analyze PE files?
pe-tree
What is the entropy value range?
values typically range from 0 to 8 (bits per byte), with the upper limit corresponding to the number of bits in a byte
What are Low Entropy values? What do they suggest?
- values are near 0
- suggests very little randomness, with many repeating patterns
- typical for simple text files or files with lots of redundant data
What are Medium Entropy values? What do they suggest?
- values are around 4-6
- common for normal executable code and data
- suggest a mix of structured data and some randomness
What are High Entropy Values? What do they suggest?
- values are close to 8
- indicate a high degree of randomness, similar to what you would expect from encrypted or compressed data
- little to no visible pattern or structure
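The 0 to 8 scale used in these cards matches Shannon entropy computed over byte values. A minimal sketch of that calculation (the function name `shannon_entropy` is an assumption, and real tools may compute it per section or per sliding window rather than over the whole file):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    # Sum -p * log2(p) over the frequency of each byte value present
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())
```

A buffer of identical bytes scores 0, while a buffer in which all 256 byte values occur equally often scores 8, which is why compressed or encrypted data lands near the top of the range.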
Which open-source sandbox is the most widely known sandbox in the malware analysis community?
Cuckoo Sandbox
What is the issue with Cuckoo Sandbox?
- it has been archived, and an update is pending
- doesn’t support Python 3, making it obsolete right now
Which sandbox is the more advanced version of Cuckoo Sandbox?
CAPE Sandbox
Describe the CAPE Sandbox: what does it support and who is it used by?
- supports debugging and memory dumping to help unpack packed malware
- used by experienced engineers; advanced knowledge is required to make full use of it
- actively developed so far and supports Python 3
What are some of the common online sandboxes?
- Online Cuckoo Sandbox
- Online CAPE Sandbox
- Any.run
- Intezer
- Hybrid Analysis
Why is malware packing used by the attackers?
to make it difficult to analyze malware statically
What happens when a strings search is run against packed malware?
little or no important information is shown, because the original program's strings are hidden by the packing
What are the techniques that malware uses for sandbox evasion?
- long sleep calls
  - the malware is programmed not to perform any activity for a long time after execution
  - the purpose of this technique is to time out the sandbox
- user inactivity detection
  - the malware waits for user activity before performing malicious activity, based on the premise that there will be no user in a sandbox
  - advanced malware also detects patterns in mouse movements that are often used in automated sandboxes
- footprinting user activity
  - some malware checks for user files or activity, such as files in the MS Office history or the internet browsing history
  - if little or no activity is found, the malware will consider the machine a sandbox and quit
- detecting VMs
  - VMs leave artifacts that can be identified by malware, such as drivers (VMware or VirtualBox specific)
What is the malware analysis process?
What questions should be asked when analyzing the malware?
- Is the file malicious?
  - the first question the analyst needs to answer
- How does the malware modify the system?
  - What files does it drop?
  - What registry entries does it add or modify?
- What does the malware contact and why?
  - most malware communicates over the network
  - What protocol and port is it using?
- How can the malware be detected or removed?
  - the answer comes from answering the questions above
What do packers do?
encrypt the original program in addition to compressing it
What is a Sandnet?
a specific type of sandbox environment used for analyzing and observing malware behavior in a networked context
What does a Sandnet do?
mimic internet services, like DNS or web servers, allowing the malware to communicate as if it were in a real network environment
What does the free Windows tool FakeNet do?
- tricks malware into thinking it’s online by intercepting and responding to network communication
- logs network connections into a logfile and creates a pcap
What are the 2 types of strings to look for during an analysis?
- ASCII
- Unicode
What are ASCII strings composed of?
characters that are each one byte long (7 bits, with the most significant bit set to 0), followed by a terminating null character
How many ASCII characters are there?
128
Why was Unicode created?
to fix the 128 character limitation of ASCII
What are the possible code unit lengths in Unicode encodings?
8, 16 or 32 bits (UTF-8, UTF-16, UTF-32)
Why does a 16-bit Unicode string show an extra character (displayed as .) between each character?
because each character is 16 bits long and these specific characters need only 1 byte, so the extra byte is set to 0
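The zero bytes are easy to see by encoding the same text both ways. A short illustration in Python, assuming little-endian UTF-16 as commonly found in Windows binaries (the sample text `cmd.exe` is arbitrary):

```python
text = "cmd.exe"

ascii_bytes = text.encode("ascii")      # one byte per character
utf16_bytes = text.encode("utf-16-le")  # two bytes per character

print(ascii_bytes.hex(" "))  # 63 6d 64 2e 65 78 65
print(utf16_bytes.hex(" "))  # 63 00 6d 00 64 00 2e 00 65 00 78 00 65 00
```

Hex viewers typically render each `00` byte as a dot, which is where the apparent extra character comes from.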
Why is it important to understand the difference between ASCII and Unicode when analyzing strings in malware?
some tools only look for ASCII strings by default and won’t find any Unicode strings, because they interpret the null byte after each character as the end of a one-character ASCII string; make sure that the tool being used extracts both ASCII and Unicode strings
Why do certain tools extract only strings that have a minimum number of characters?
to cut down on garbage strings; the longer the extracted strings are, the less likely they are to be garbage
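Both ideas, extracting ASCII as well as 16-bit Unicode strings and enforcing a minimum length, can be sketched with two regular expressions. This is a simplified illustration rather than how any particular tool is implemented; `extract_strings` and its default of 4 characters are assumptions, and only printable ASCII characters stored as UTF-16LE are matched.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return printable ASCII and UTF-16LE strings of at least min_len characters."""
    # Runs of printable ASCII bytes
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    # Runs of printable ASCII characters each followed by a zero byte
    utf16_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in utf16_re.finditer(data)]
    return found
```

Raising `min_len` trades recall for less noise. GNU `strings` exposes the same controls through its `-n` (minimum length) and `-e l` (16-bit little-endian) options.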