Devising a delegated alerts model for SecOps
SecOps teams are at the forefront of the action: they are the first to detect a breach and often the last to leave the office once the breach has been successfully contained.
However, that commitment and diligence can come at a price. SecOps teams are among the most burnout-prone teams in the entire realm of cybersecurity. We are forced to make a choice between verbose alerts that can overwhelm us but keep the company safe and more concise alerts that could lead to a breach due to a lack of proper visibility.
Adding to this predicament is the fact that many SecOps teams are understaffed, creating a recipe for burnout and potential breaches.
Yet, it’s during times of stress that the best ideas are forged, and the SecOps team at TrueLayer is constantly improving existing processes to strike a balance between our satisfaction/happiness and the company's security.
With that in mind, I want to introduce you to our delegated alerts model used by the SecOps team at TrueLayer to speed up responses to alerts, while minimising mental overload and the dreaded burnout.
Let's start with a classic scenario we’ve seen many times in our team.
An example of SecOps without a delegated alerts model
Meet Bob. Bob is a new hire who inadvertently skipped his security induction provided to all new employees. During the daily review of alerts, one of the analysts identifies a policy violation from Bob on our Security Information and Event Management (SIEM) system. The process for handling such events follows a standard procedure:
Look through the logs to detect any other unusual activity, as it could be an indication of a compromise.
If the alert is considered non-malicious or the analyst is unsure, they must reach out to Bob for clarification on the activity.
A standard set of questions is asked. Once the analyst is satisfied with the response, they add the necessary comments to the SIEM case and close the alert.
While this process may seem straightforward, it presents a few challenges:
What if Bob is on holiday when the analyst reaches out? If that’s the case, the analyst must actively keep track of the activity.
The analyst follows the SecOps playbook to get more context. It’s important, but it can feel repetitive.
If the user is not responding immediately or is responding intermittently (a reality of remote work), the analyst must now keep track of an asynchronous conversation, while managing a busy workload.
Managing multiple tickets, tasks and conversations is demanding, contributing to the day-to-day stress of SecOps.
Types of alerts SecOps need to be on the lookout for
It sounds cliché, but security is everyone’s responsibility. At TrueLayer, we emphasise this through continual knowledge sharing and involving team members outside of SecOps in various critical processes. The Security team's role is not to block people but rather to empower them to act and behave in a manner that aligns with their interests and the company’s objectives.
And it’s important to note that certain alerts are not initiated by attackers but rather by employees who may have unintentionally failed to comply with internal rules. These individuals often prove to be the best resources to handle and provide context on the activity.
SecOps analysts tend to encounter one of three types of alerts:
Policy violations (eg a user running unauthorised software on their work device)
Potentially malicious activity (eg a login on AWS using root credentials)
Malicious activity (eg malware detected on endpoints by the Endpoint Detection and Response tool)
Some of these alerts require the SecOps team to reach out to the end-user to get more context. Of all the alert types, policy violations are the simplest to obtain a justification for, as the rules are quite simple and there is little room for ambiguity from the end-user’s perspective. This type is currently being 'delegated' to the end-user through automation, avoiding the need for manual outreach to the user.
The second type is slightly more complex, and we are still in the process of delegating it. The last type requires the eye of an analyst and therefore cannot be delegated.
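To make the split between the three alert types concrete, here is a minimal sketch of how an alert could be routed by type. The enum values, the queue names and the `route_alert` function are all hypothetical and are not taken from our implementation; they only mirror the current state described above, where policy violations are the one type delegated to the end-user.

```python
from enum import Enum


class AlertType(Enum):
    POLICY_VIOLATION = "policy_violation"
    POTENTIALLY_MALICIOUS = "potentially_malicious"
    MALICIOUS = "malicious"


# Hypothetical routing table: only policy violations go to the end-user today.
ROUTES = {
    AlertType.POLICY_VIOLATION: "delegate_to_user",
    # Delegation of this type is still in progress, so an analyst handles it.
    AlertType.POTENTIALLY_MALICIOUS: "analyst_review",
    # This type always needs the eye of an analyst.
    AlertType.MALICIOUS: "analyst_review",
}


def route_alert(alert_type: AlertType) -> str:
    """Return the name of the (hypothetical) queue that handles this alert type."""
    return ROUTES[alert_type]
```

Keeping the mapping in a single table means that delegating the second type later would be a one-line change rather than a change to the routing logic.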
Requirements for a delegable alert
Based on our experience, we believe that an alert can only be delegated if it meets the following criteria:
The rule behind the alert is simple: the alert was triggered based on a user’s action against a resource. As the number of actions and resources increases, the activity becomes increasingly complex to explain to the user in an automated way.
The alert can be directly pinpointed to the user: the alert is for an action performed by a user directly, rather than as an indirect consequence of one of their actions.
Obtaining context does not require analyst intervention: it should be possible to extract most, if not all, of the context from the alert and/or one or more services (eg API, database) without the intervention of an analyst.
The alert can be explained to any reasonably tech-savvy user: the user should be able to understand their actions without an understanding of cybersecurity.
If an alert meets all of the above requirements, it is simple to implement a workflow where the user is notified and has enough context to explain their actions.
The following table shows two examples of alerts and how they meet the requirements for a delegable alert:
The alert... | ...is simple | ...can be pinpointed to the user | ...does not require analyst intervention | ...can be explained to a tech-savvy user | ...can be delegated |
---|---|---|---|---|---|
User accessed the company's IdP from an unauthorised device | ✅ | ✅ | ✅ | ✅ | ✅ |
An OSX login item was created (potential malware persistence) | ❌ | ❌ | ❌ | ✅ | ❌ |
User logged in from two different countries within one hour | ✅ | ✅ | ❌ | ✅ | ❌ |
The first case can be easily justified, as all we need to do is provide the user with three pieces of information:
timestamp for the activity
the operating system used
and the name of the Identity Provider (IdP)
The second example may be an indirect consequence of the installation of another application, and confirming this and pinpointing it to the actions of a user is not an easy task from an automation perspective.
The third example could, in theory, be automated, as it is easy to ask a user if they are travelling when we observe a login from different countries. However, it fails one of the requirements: context cannot be obtained without analyst intervention. The problem is compounded by the fact that VPNs are more ubiquitous these days, and users may not be aware that using such tools may throw off security detections.
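The four requirements can be expressed as a simple all-or-nothing check. The sketch below is illustrative only: the `AlertProfile` class, its field names and the `is_delegable` function are hypothetical, and the three profiles simply restate the rows of the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertProfile:
    """Hypothetical record of how an alert scores against the four requirements."""
    name: str
    simple_rule: bool
    pinpointed_to_user: bool
    context_without_analyst: bool
    explainable_to_user: bool


def is_delegable(profile: AlertProfile) -> bool:
    """An alert is delegable only if it meets every requirement."""
    return all((
        profile.simple_rule,
        profile.pinpointed_to_user,
        profile.context_without_analyst,
        profile.explainable_to_user,
    ))


# The three example alerts, scored as in the table.
EXAMPLES = [
    AlertProfile("User accessed the company's IdP from an unauthorised device",
                 True, True, True, True),
    AlertProfile("An OSX login item was created",
                 False, False, False, True),
    AlertProfile("User logged in from two different countries within one hour",
                 True, True, False, True),
]

for profile in EXAMPLES:
    outcome = "delegable" if is_delegable(profile) else "needs an analyst"
    print(f"{profile.name}: {outcome}")
```

Only the first alert passes, because a single unmet requirement is enough to keep an alert with the analysts.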
The solution
The core piece of our solution is the SOC Bot, a bot developed by the SecOps team at TrueLayer, which requires the following three main components:
SIEM capable of exporting any new alerts to an external application (eg using webhooks)
Collaboration platform with API (eg Microsoft Teams, Slack)
Database
The schematic below shows how the SOC Bot operates whenever a new alert is triggered on the SIEM:
The SIEM sends an alert notification to the SOC Bot through webhooks.
The SOC Bot requests a justification from the user through the company’s collaboration application.
The user receives a notification from the SOC Bot and is asked to fill out a short form to justify the activity.
After the form is submitted, the SOC Bot forwards the justification to the SecOps team for review through the company’s chat or collaboration application. The message includes a form with the justification, a drop-down menu for selecting a verdict and fields for the analyst to add their comments and final decision. If the analyst finds the justification inadequate or insufficient, they can contact the user through the chat or collaboration application to request additional information.
Once the analyst decides that the case can be closed, they submit their verdict and comments through the same form in the company’s collaboration application, which also includes the end-user's justification.
The SOC Bot then processes the analyst's comments, verdict and the end-user's justification. It then closes the initial alert on the SIEM with the provided verdict. The comments are also recorded in the alert for future audit purposes.
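The workflow above can be sketched end to end with stand-ins for the three components. This is not the SOC Bot's actual code: `FakeChat` and `FakeSiem` are hypothetical placeholders for the collaboration platform and SIEM APIs, the table layout and status names are assumptions, and an in-memory SQLite database stands in for the real database.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class FakeChat:
    """Stand-in for the collaboration platform API (hypothetical interface)."""
    sent: list = field(default_factory=list)

    def send(self, recipient: str, text: str) -> None:
        self.sent.append((recipient, text))


@dataclass
class FakeSiem:
    """Stand-in for the SIEM API (hypothetical interface)."""
    closed: dict = field(default_factory=dict)

    def close_alert(self, alert_id: str, verdict: str, comments: str) -> None:
        self.closed[alert_id] = (verdict, comments)


class SocBot:
    def __init__(self, chat: FakeChat, siem: FakeSiem) -> None:
        self.chat = chat
        self.siem = siem
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE cases (alert_id TEXT PRIMARY KEY, username TEXT, "
            "summary TEXT, justification TEXT, status TEXT)"
        )

    def on_webhook(self, payload: dict) -> None:
        """Store the new alert and ask the user to justify the activity."""
        self.db.execute(
            "INSERT INTO cases VALUES (?, ?, ?, NULL, 'awaiting_user')",
            (payload["alert_id"], payload["user"], payload["summary"]),
        )
        self.chat.send(payload["user"], f"Please justify: {payload['summary']}")

    def on_user_form(self, alert_id: str, justification: str) -> None:
        """Record the user's justification and forward it to SecOps for review."""
        self.db.execute(
            "UPDATE cases SET justification = ?, status = 'awaiting_analyst' "
            "WHERE alert_id = ?",
            (justification, alert_id),
        )
        self.chat.send("secops", f"[{alert_id}] user says: {justification}")

    def on_analyst_form(self, alert_id: str, verdict: str, comments: str) -> None:
        """Close the SIEM alert with the verdict and keep the comments for audit."""
        (justification,) = self.db.execute(
            "SELECT justification FROM cases WHERE alert_id = ?", (alert_id,)
        ).fetchone()
        note = f"{comments} | user justification: {justification}"
        self.siem.close_alert(alert_id, verdict, note)
        self.db.execute(
            "UPDATE cases SET status = 'closed' WHERE alert_id = ?", (alert_id,)
        )


# Walk one alert through the whole flow.
bot = SocBot(FakeChat(), FakeSiem())
bot.on_webhook({"alert_id": "A-1", "user": "bob",
                "summary": "IdP login from an unauthorised device"})
bot.on_user_form("A-1", "I used my personal laptop by mistake")
bot.on_analyst_form("A-1", "benign", "User reminded of device policy")
```

The database row is what lets the bot wait on an asynchronous reply: the analyst no longer has to remember the open conversation, because the case status records who the next action belongs to.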
One step further
The SecOps team at TrueLayer is committed to automating everyday and critical tasks. By incorporating an external bot and a database, we were able to create an automated event response platform that leverages our security stack to handle anything from simple investigations to incidents.
Additionally, we implemented an alert prioritisation and notification model that utilises various methods and third-party tools. This ensures that the team never misses alerts deemed critical, even out of hours.
Closing notes
At TrueLayer, security is the responsibility of everyone, and this forms the basis of our delegated alerts model. While we've omitted implementation details from this post for security reasons, the principles and workflow should be easy to replicate for anyone willing to embrace this idea.
This model has saved countless hours for analysts, enabling them to concentrate on critical alerts and projects, while allowing the SOC Bot to manage repetitive tasks and playbooks.
Want to join TrueLayer's engineering team? Take a look at our open roles in Product Development.