SIEM vendors claim to provide machine learning functionalities in their solutions. Gartner recently covered the growing arena of Machine Learning Log Analysis, and how it is being positioned as a complement to SIEM. What do CISOs and security directors need to look for to effectively navigate ML in their security platform?
Machine Learning (ML) is a branch of Artificial Intelligence (AI) which uses algorithms and statistical models to perform tasks without using explicit instructions, relying on patterns and inference instead. In the security arena, ML’s purpose is to take raw log and event data and turn it into actionable intelligence about security events. Like other forms of AI, ML is instrumental in bringing more intelligent automation to the cumbersome and complex challenge of managing security systems.
Traditionally, security products used (or claimed to use) ML to detect behavioral anomalies that might lead to a security incident – typically calling it “detecting unknown attacks.”
In the past few years, more and more cybersecurity operations initiatives have been using ML to help automate the management of security tools, deriving much more value from them and from the data they generate, and doing so fast. It’s all about time to value from your security tools, or in other words, shortening the time to respond to real attacks.
Gartner Defines the ML Log Analysis Arena
Gartner’s Security Research Director Eric Ahlm published a report titled “Emerging Technology Analysis: Machine Learning Log Analysis Disrupts Traditional SIEM Buying Models” in October 2019. The report highlights some ways in which this new arena of ML Log Analysis is challenging the SIEM arena, and leading organizations to redirect some of their SIEM budget to ML Log Analysis.
While the report includes recommendations for ML Log Analysis vendors on how to position their offerings, the question remains: how should buyers view this new arena, and how should they judge whether their SIEM requires a “wingman” in the form of ML Log Analysis?
The Gartner report positions ML Log Analysis everywhere on the spectrum from reinforcement for SIEM (with added budgets) to full SIEM replacement. “An ML-based log solution can augment functionality, help scale data or operations, or in some specific cases out right replace an existing SIEM,” states the report.
Example providers listed in the report include empow, Uplevel Security and others.
Drivers for ML Implementation
What drives some organizations to add ML log analysis to SIEM in their platforms?
One driver is a common SIEM disease: license cost creep. Most SIEM pricing models are based on data volume: the less data the SIEM needs to digest, the lower the cost. Some organizations therefore add an ML log analysis intelligence layer in front of their SIEM to crunch some of the data and produce fewer alerts for the SIEM to digest. This can be a good approach and will fit some end users’ needs. For other users, however, it will not work, because regulation requires them to retain every piece of raw log data.
A second driver for adding ML log analysis to an existing SIEM is identified in the Gartner report under the headline “Scale”: scaling investigators through alert reduction and accuracy, scaling knowledge through predictive analysis, and scaling the response time normally required for lookup, manual data linkage or search tasks.
To achieve improved scaling, an ML intelligence layer is integrated into the SIEM’s data repository or sits on top of it. Its promise is to analyze the collected data and remove false positives and noise through automatic investigation actions, prioritizing the most relevant data that the user should focus on.
ML Value Criteria – what have you done for me lately?
The main question security teams should ask themselves is not ‘do we have ML’, which is so generic as to be almost meaningless, but rather: Is the ML technology in my network giving me the benefits I want, as per the drivers outlined above – cost and scalability?
When evaluating solutions that promise AI and ML, whether as part of a SIEM or as independent ML Log Analysis software, we need to look again for the BENEFIT and ask ourselves the following questions:
- Does this ML functionality enable me to meet the data volume needs of my organization?
- If the AI utilizes, for example, a supervised ML process, then what industries was the training data taken from? Different industries (banking, retail, insurance, manufacturing, etc.) experience different types of security events that can impact the effectiveness of ML. Make sure the data sets used are associated with your industry, or at least with a similar one.
- Who are the security domain experts who provided feedback for the algorithm’s creation process? If the solution doesn’t employ the right domain experts to optimize ML algorithms, the ML will remain a theoretical exercise.
- How frequently does the algorithm need to be retrained to maintain its effectiveness? How is the system updated with retrained models?
And most importantly, the evaluation criteria should be based on how well the solution helps you meet your goals, or drivers:
- If my driver is cost, then show me percent of data reduction.
- If my driver is scalability, or automating log investigation to achieve a shorter time to respond, then show me trends of these KPIs as part of my cyber operations.
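The two KPI checks above can be sketched as a short calculation. This is only an illustration under stated assumptions: the function names, the gigabyte volumes and the weekly time-to-respond figures are all hypothetical, and a least-squares slope is just one simple way to express a "trend"; a vendor may report these numbers differently.

```python
from statistics import mean


def percent_data_reduction(raw_gb: float, forwarded_gb: float) -> float:
    """Share of raw log volume that no longer reaches the SIEM, in percent."""
    if raw_gb <= 0:
        raise ValueError("raw_gb must be positive")
    return 100.0 * (raw_gb - forwarded_gb) / raw_gb


def trend_per_period(values: list[float]) -> float:
    """Least-squares slope of a KPI measured over equally spaced periods.

    For time to respond, a negative slope means the KPI is improving.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two periods")
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


# Hypothetical figures, for illustration only.
reduction = percent_data_reduction(raw_gb=500.0, forwarded_gb=350.0)
weekly_hours_to_respond = [10.0, 9.0, 8.0, 7.0]
slope = trend_per_period(weekly_hours_to_respond)

print(f"data reduction: {reduction:.1f}%")
print(f"time-to-respond trend: {slope:+.2f} hours per week")
```

With real measurements in place of the made-up inputs, the same two numbers give a buyer something concrete to ask a vendor for: how much data volume was taken off the SIEM, and whether time to respond is actually moving in the right direction.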
Whether from SIEM or from ML log analysis platforms, if you aren’t getting the benefits you need from ML, it’s time to continue the search.
Avi Chesla, Founder & CEO, empow