
F-Secure excels again in the MITRE ATT&CK® Evaluation

Christine Bejerasco, Vice President, Tactical Defense Unit, F-Secure

In 2018, MITRE introduced the ATT&CK® evaluation as an EDR product assessment leveraging the ATT&CK® framework for APT3. Last October, the results of that evaluation confirmed F-Secure’s industry-leading capabilities in detecting advanced attacks. We appreciate MITRE’s commitment to these evaluations, as they provide transparency on the effectiveness of EDR products in detecting targeted cyber attacks.

The F-Secure team is excited to share our excellent results from the second round of the MITRE ATT&CK® evaluation

The main focus of the latest evaluation (Round 2) is on detection capabilities against APT29 (aka The Dukes). APT29 is a threat actor that successfully performed espionage for seven years before it was discovered by F-Secure in 2015. Our research on ‘The Dukes’ became the first contribution to MITRE’s knowledge base for APT29. This evaluation round assessed F-Secure's EDR product, Rapid Detection & Response, as well as Countercept, F-Secure's Managed Detection and Response (MDR) service.

In the second round of the MITRE ATT&CK® evaluation, we demonstrated strong capabilities in:
  • Delivering actionable information fast with a minimal number of false positives.
  • Delivering great total coverage and visibility into indicators of attack.
  • Incorporating managed services (MDR) to increase the likelihood of detecting attacks sooner.

As with Round 1, Round 2 is focused on assigning detection categories to test cases based on the data collection, analysis, and presentation provided by each solution. Each result is assigned to one of the main detection categories and optionally one or more modifiers.

Our detection and response technology once again delivers excellent results

The evaluation results demonstrate how F-Secure's detection and response technology is highly effective against the 58 ATT&CK techniques used across the 134 steps in the test.

The chart below illustrates how F-Secure swiftly detects the APT29 attack scenario with excellent coverage when compared to the other 20 vendors participating in the evaluation, including the vendors with the highest detection coverage. The chart also shows the distribution of the different detection types delivered by the EDR products, as well as the additional visibility that managed services contributed to the overall detection coverage.

Figure 1 Top vendors by detection coverage across the main detection categories

Of the 134 tests in the evaluation, F-Secure covered 118. We alerted immediately on 10 tests with general detection information flagging non-specific behavior, and on 28 tests with detailed information about a specific technique. A further 52 tests were alerted through human analysis by threat hunters from our managed service, and telemetry provided additional visibility for 28 tests.
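The breakdown above can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in this article; the category labels and the helper name `coverage_summary` are illustrative, and it assumes each test is counted under exactly one category, as the article's own total implies.

```python
# Counts quoted in this article for Round 2 (134 tests in total).
TOTAL_TESTS = 134
detections = {
    "general": 10,    # immediate alerts on non-specific behavior
    "technique": 28,  # immediate alerts naming a specific technique
    "mssp": 52,       # found through human analysis by threat hunters
    "telemetry": 28,  # visibility through telemetry
}


def coverage_summary(counts, total):
    """Return the covered test count and percentages of the total."""
    covered = sum(counts.values())
    return {
        "covered": covered,
        "coverage_pct": round(100 * covered / total, 1),
        "share_pct": {k: round(100 * v / total, 1) for k, v in counts.items()},
    }


summary = coverage_summary(detections, TOTAL_TESTS)
print(summary["covered"], summary["coverage_pct"])  # 118 88.1

# Same calculation with the managed-service (MSSP) contribution removed.
product_only = {k: v for k, v in detections.items() if k != "mssp"}
print(coverage_summary(product_only, TOTAL_TESTS)["covered"])  # 66
```

Splitting out the MSSP category this way makes it easy to see how much of the total comes from the product alone and how much from human analysis.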

The chart doesn't account for configuration changes made during the assessment, which may not reflect the configuration settings that vendors recommend to end users. The numbers will change for some vendors when the contribution of the MSSP category is removed from their overall coverage.

Detection coverage is not the only factor to consider when deciding on the best detection and response solution. Having a high number of general, tactic or technique detections leads to a higher score because it means fewer attacks are missed. Access to higher-fidelity detections, however, gives users more time to investigate events that are more likely to be real attacks, rather than sifting through a sea of data that may be predominantly false positives.

F-Secure’s detection and response technology balances this by having specific detections for anomalous events with different severity levels as well as keeping the noise down for investigators and aggregating alerts into what we call ‘incidents’ or Broad Context Detections. These incidents are prioritized based on a risk scoring system to assist the user in addressing more critical incidents first. 

Since targeted attacks are always designed to avoid detection, it’s critical to detect and stop them at the right stages of the kill chain, and it’s important to understand at which stages activity went undetected. Attackers are likely to perform code execution at some point during the compromise, and can do so multiple times. This is why we at F-Secure have focused significant effort on ensuring we have a rich set of telemetry and detection capabilities for code execution. Attackers may also establish persistence to ensure longevity in the target system. Good coverage, especially for code execution and persistence, increases the visibility of important attacker techniques. Focusing on the kill chain stages that are common across different attackers increases the likelihood of uncovering targeted attacks.

The latest MITRE ATT&CK® Evaluation highlights specific benefits of using a Managed Detection and Response service

Managed services on top of detection capabilities can be a valuable ally for highly targeted organizations. Whilst the evaluation is mainly intended to assess the detection prowess of an EDR tool on a per use case/test procedure basis, the inclusion of the ‘MSSP’ detection category allowed us to highlight the importance of having a strong team on hand to both investigate and understand the nature of the threat.

Of the 134 tests in this evaluation, our managed service made a significant contribution in a total of 52, as the chart below illustrates. A combination of our capabilities, human and technological, is what has delivered this result. We can demonstrate the value our managed service delivers using examples from the evaluation that highlight how an attack would be detected, investigated and ultimately contained.

Figure 2 F-Secure’s detection coverage across the main detection categories

F-Secure Countercept’s Detection and Response Team (DRT) has the skills to investigate things that might not immediately seem suspicious - and then develop a detection rule for the future. The fact that we collect and store a great deal of telemetry means that, alongside the skills, we have the means to do this; not every vendor has this capability. We’ll see how quickly that can be done in a moment. 

The evaluation involved an unsuspecting user opening files or links in emails that led to unexpected code execution that was detected and flagged. 

Unexpected code execution, even coming from legitimate looking files or links, doesn’t guarantee something malicious is taking place. 

In this case, the user had opened a .doc file, resulting in code execution, which our agent then picked up and reported, since the execution then triggered interactive sessions. Understanding whether this is a bad thing takes a little human intervention. In this case, a human investigator would spot that something was amiss: a ‘right to left override’ method for obfuscation was used. Because there are only a few legitimate reasons to do this, it’s something that calls for further investigation. This is an easy deduction for a human to make, but it’s also something that a decent rule would spot; creating just such a rule would be an immediate outcome. Our DRT has done exactly that already, by the way.
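As a rough illustration of what such a rule could look like, the sketch below flags file names containing the Unicode right-to-left override character (U+202E). It is not F-Secure's detection logic: the function names and the sample file name are hypothetical, and `displayed_name` is only a simplified approximation of how a renderer would show the name.

```python
RLO = "\u202e"  # Unicode RIGHT-TO-LEFT OVERRIDE control character


def has_rtlo(filename: str) -> bool:
    """Flag file names that contain the right-to-left override character."""
    return RLO in filename


def displayed_name(filename: str) -> str:
    """Approximate the name a user sees: text after the RLO appears reversed.

    This is a simplification of real bidirectional text rendering.
    """
    head, sep, tail = filename.partition(RLO)
    return head + tail[::-1] if sep else filename


# Hypothetical example: a screensaver executable disguised as a document.
suspicious = "report" + RLO + "cod.scr"
print(has_rtlo(suspicious))        # True
print(displayed_name(suspicious))  # reportrcs.doc
print(has_rtlo("report.doc"))      # False
```

The real extension here is `.scr`, while the name shown to the user ends in `.doc`, which is why the presence of this character in a file name is worth an analyst's attention.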

Whilst telemetry is key for investigations, sometimes it’s necessary to look at the actual artefact used to work out what’s happened. 

Traditionally, retrieving artefacts has been an Incident Response activity, but it’s something our DRT do in most investigations, and our endpoint agent makes it easy to accomplish. In this case, the initial script execution spawned cmd and PowerShell sessions that downloaded a further file (monkey.png), which itself executed another PowerShell script. By retrieving and analyzing the monkey.png file, it’s possible to understand what the script it contains does: in this case, it creates a UAC bypass and launches a high-integrity process that furthers the attack.

The above scenario happened behind the scenes and isn’t something that the evaluation specifically allows us to demonstrate, but ultimately the point of an MDR service like Countercept is to do the heavy lifting for the customer we’re defending. An effective MDR will confirm that an incident is actually taking place - and then initiate the right response steps to contain the impact of that incident. 

As the product evaluation only checks for detection coverage, it wasn’t possible to contain these incidents and showcase our response capabilities. In reality, both attacks would have been stopped after the initial user execution. That’s because the alert would have led to retrieval and analysis of the artefact, revealing malicious intent. An experienced human investigator, such as our in-house threat hunters, presented with a suspicion, has enough data, context and experience to investigate and then respond as required. This is exactly what good MDR does: it alerts when you need to know, helps fill in the rest of the gaps to create understanding and then generates an appropriate response to help the organization monitor and contain the threat. 

Broader context and having MITRE ATT&CK® built-in

A solid EDR solution not only gives security teams the needed visibility, but also helps cut through the noise to avoid overloading human experts by flagging only relevant detections and providing a broader context for advanced cyber attacks.  

Our approach is not directly comparable with that of other vendors, since F-Secure’s cloud-native EDR technology is powered by behavioral analytics that group multiple detections into incidents and present them with broader context to minimize alert fatigue. This also helps filter out false positives brought about by a large number of individual events, and contributes to faster triaging and prioritization of detections as well as response actions. F-Secure also provides a common taxonomy that aids user investigations by linking all detections to relevant techniques in the MITRE ATT&CK® knowledge base.

Below is an example of a detection with broader context. The launch of an unknown process could have been benign on its own, but the broader context and a combination of events from relevant endpoints raised the risk score to ensure that it would not go unnoticed.

From the example below, we can see an attacker getting into a target system as a result of spear phishing and social engineering a user to launch an unknown executable. The attack continues with the attacker destroying evidence and moving laterally to the next target. 

Figure showing an attacker getting into a target system as a result of spear phishing and social engineering a user to launch an unknown executable.

The above attack includes stealing credentials from the system which is visible when clicking on a specific process and opening the process view as illustrated below. This image also demonstrates how multiple detections are aggregated into a single Broad Context Detection™ that appears in the MITRE evaluation as only one aggregated incident that includes one or more tactics and techniques:

Figure demonstrates how multiple detections are aggregated into a single Broad Context Detection™

F-Secure can aggregate by 'process chain', host, or user, so that users look at a collection of detections that indicate a suspicious event rather than at each alert individually. Our purpose-built Broad Context Detection™ automates the combination of events, both new and historical, into incidents with risk levels and host importance, giving users a more holistic understanding of the attack. This approach significantly reduces the time needed to go through individual events and work out their sequence. Behind the scenes, the detections we have created for these different TTPs are fueled by behavioral and reputational data amassed from the threat knowledge that also powers our endpoint protection (EPP) products. Knowledge of threats and of attack methodologies increases the product's capability to detect anomalous activity, whether it is performed automatically or by attackers at the keyboard.
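The grouping idea can be sketched in a few lines. The example below is purely illustrative and is not how Broad Context Detection™ is implemented: the `Detection` fields, the 1 to 5 severity scale, and the risk formula (highest severity plus one point per additional detection) are all assumptions made for the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    host: str
    user: str
    process_chain: str  # e.g. "winword.exe>powershell.exe"
    technique: str      # ATT&CK technique ID
    severity: int       # hypothetical scale: 1 (low) to 5 (high)


def aggregate(detections, field):
    """Group detections by 'host', 'user' or 'process_chain' into incidents."""
    groups = defaultdict(list)
    for d in detections:
        groups[getattr(d, field)].append(d)

    incidents = []
    for value, members in groups.items():
        incidents.append({
            field: value,
            "techniques": sorted({d.technique for d in members}),
            "count": len(members),
            # Hypothetical risk score: worst severity, plus one per extra detection.
            "risk": max(d.severity for d in members) + len(members) - 1,
        })
    # Highest-risk incidents first, so they are addressed first.
    return sorted(incidents, key=lambda i: i["risk"], reverse=True)


sample = [
    Detection("ws-01", "alice", "winword.exe>powershell.exe", "T1059.001", 3),
    Detection("ws-01", "alice", "winword.exe>powershell.exe", "T1003", 5),
    Detection("ws-01", "alice", "winword.exe>powershell.exe", "T1070", 4),
    Detection("ws-02", "bob", "explorer.exe>notepad.exe", "T1059.001", 1),
]

incidents = aggregate(sample, "host")
for incident in incidents:
    print(incident["host"], incident["risk"], incident["techniques"])
```

Four individual alerts collapse into two incidents here, with the host showing several related techniques ranked first. Calling `aggregate(sample, "user")` or `aggregate(sample, "process_chain")` groups the same detections along the other two dimensions.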

Look beyond the MITRE ATT&CK® when evaluating the right tool or service

We believe that MITRE ATT&CK® evaluations provide much-needed transparency in the advanced threat detection domain. However, it’s also fair to acknowledge that the MITRE evaluation, like any test, is just one form of evaluation performed in an isolated environment. It doesn’t test response capabilities, which are the second part of a Detection & Response product. Organizations looking to compare EDR, MDR or MSSP providers should consider MITRE assessments as one of the criteria for selecting a vendor that can enable them to detect attacker tactics, techniques and procedures (TTPs) such as those described in the MITRE ATT&CK® framework.

Check the solutions below for more information on defending your organization against targeted attacks using one of the world’s best detection and response technologies, and seasoned threat hunters.