Book Image

Mastering Malware Analysis - Second Edition

By : Alexey Kleymenov, Amr Thabet
5 (2)
Book Image

Mastering Malware Analysis - Second Edition

5 (2)
By: Alexey Kleymenov, Amr Thabet

Overview of this book

New and developing technologies inevitably bring new types of malware with them, creating a huge demand for IT professionals that can keep malware at bay. With the help of this updated second edition of Mastering Malware Analysis, you’ll be able to add valuable reverse-engineering skills to your CV and learn how to protect organizations in the most efficient way. This book will familiarize you with multiple universal patterns behind different malicious software types and teach you how to analyze them using a variety of approaches. You'll learn how to examine malware code and determine the damage it can possibly cause to systems, along with ensuring that the right prevention or remediation steps are followed. As you cover all aspects of malware analysis for Windows, Linux, macOS, and mobile platforms in detail, you’ll also get to grips with obfuscation, anti-debugging, and other advanced anti-reverse-engineering techniques. The skills you acquire in this cybersecurity book will help you deal with all types of modern malware, strengthen your defenses, and prevent or promptly mitigate breaches regardless of the platforms involved. By the end of this book, you will have learned how to efficiently analyze samples, investigate suspicious activity, and build innovative solutions to handle malware incidents.
Table of Contents (20 chapters)
1
Part 1 Fundamental Theory
4
Part 2 Diving Deep into Windows Malware
10
Part 3 Examining Cross-Platform and Bytecode-Based Malware
14
Part 4 Looking into IoT and Other Platforms

Choosing your analysis strategy

Reverse engineering is a time-consuming process, and in many cases, there aren't the resources available to allow engineers to dive as deep as they would like to. Prioritizing the most important things and focusing on them will ensure that the best result is produced within the allocated time every time. Here is some advice that may help in this challenging task.

Understand your audience

Depending on who is going to use the result of your work, the actionable deliverables may be very different. Examples of the potential use cases for reverse engineering include the following:

  • Threat intelligence: Here, the focus will be mainly on obtaining IoCs, such as hashes, filenames, and network artifacts. Therefore, extracting embedded payloads and downloading remote samples, as well as finding other related modules involved and extracting C&C information from all of them, will likely be the top priority.
  • AV detection: In this case, the focus will be on anything unique enough to create a robust detection that doesn’t produce false positives (FPs). Examples are distinctive pieces of code and strings related to the malicious functionality and any custom encryption algorithms used. Understanding the main logic will help choose the right category, and code and data similarity will lead to assigning the malware family.
  • Technical article or conference presentation: Here, the most important part will be interesting novel technical details related to functionality, similarities with other malware families, and actor attribution.
  • Article for the general public: For non-technical people, it is common to provide a high-level description of functionality without many technical details, focusing mainly on impact.

Answer your audience’s questions

It’s very important to answer the main questions your audience is asking. Make the answers clear and easy to find in your analysis report.

Here is a list of possible questions your audience might need an answer to in your report:

As long as this part is clear, we can start prioritizing particular topics.

Define your goals

Once the audience is confirmed, define your goals carefully based on the resources available: first, time and skillset. After this, prioritize the selected goals and focus on the most important ones first. It is very easy to get lost in assembly when doing static analysis, so having a checklist of what needs to be done and in what priority will help you get back on track.

Avoid unnecessary technical details

Regardless of who is going to consume the result of your work, having too many extra details won’t show your level of expertise but will simply complicate the understanding of the work and result in wasted time. Common examples include executed instructions, WinAPIs used, standard registry keys accessed, or mutexes created. Therefore, you should do the following:

  • Choose the level of detail required depending on the target audience.
  • If some fact doesn’t help the reader, avoid elaborating on it.
  • Don’t just mention technical details – explain their high-level purpose and why the attackers had to explicitly use them.

Finally, make sure that the most important sections are covered in detail and are definitely correct. Never attempt to make statements based purely on gut feeling or prior knowledge without any material facts related to the current sample. You can always use the appropriate wording for something that you have spotted but don’t have time to dig deeper into (for example: “there are indications that... but more work is required to confirm it”).

Example structures

Here are some of the details that are generally included in the resulting work, depending on its format and the audience.

Technical article

In most cases, the following information will be useful:

  • Sample(s) details:
    • Hashes (MD5, SHA1, SHA2)
    • Compilation timestamps
    • File types and sizes
    • In-the-wild (ITW) filenames
    • AV vendors’ detections
  • Modules’ relationships (if there are several involved)
  • For each module:
    • A description of the main functionality
    • Persistence mechanisms
    • Network communications:
      • Protocols
      • Encryption algorithms and keys
      • C&C details (IP addresses, domains, URLs, unique whois details, host countries, and so on)
    • Anti-reverse engineering techniques used
  • IoCs
  • Detection rules (YARA, Snort, and others)

General-public article

  • High-level functionality description with a focus on the impact
  • The scale of the attack
  • Victim profile:
    • Types of organizations targeted
    • Victims’ geolocation
    • Loss estimates
  • Actor attribution:
    • Sample similarity
    • Matched IoCs (hashes, network artifacts, filenames, and so on)
    • Language codepages and strings used
    • Compilation timestamps

Typical analysis workflow

Now that we know what to focus on, the next question is: how do we organize the work to produce the best possible result in a timely fashion? The following steps are suggested for you to follow:

  • Triage: Here, collect the maximum amount of easily available information on the sample:
    • Analyze the PE header.
    • Check whether the sample is likely to be packed or not (high-entropy blocks).
    • Check public resources for known IoCs (hashes, network artifacts, AV detection names, and so on).
  • Behavioral analysis: Most of the information will be obtained from file, registry, and network operations. This way, we will have an idea about the capabilities of the potential sample.
  • Unpacking (if necessary): Static analysis is impossible before the sample is unpacked as the actual malware’s code and data are not readily available yet.
  • Static analysis: Performed with the help of disassemblers and decompilers:
    • Start from available strings and commonly misused WinAPIs.
  • Dynamic analysis: Performed with the help of debuggers. May be quite expensive to set up and perform, so use it only when needed:
    • Confirming certain functionality
    • Handling string/APIs/embedded payloads/communications encryption