<div dir="ltr">To be fair, the advantage of the network position is that it avoids interference with your host-protection programs (aka implants). And evading on the host is possible too. But both are probably necessary at some level.</div><br><div class="gmail_quote"><div dir="ltr">On Wed, Jun 21, 2017 at 1:41 PM Dominique Brezinski &lt;<a href="mailto:dominique.brezinski@gmail.com">dominique.brezinski@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Let me tell a little story about statistical analysis of network traffic. I may or may not have been associated with someone who built a very large-scale, statistics-based detection mechanism using unsampled network flow and HTTP proxy logs. 3200 cores chugged through the trailing X weeks of traffic, for hundreds of thousands of hosts, building usage profiles and then measuring the distance of the current day's activity for each host from the baseline profile. <div><br></div><div>As a digression, in this unsupervised learning space all hope depends on quality feature selection. Once quality features are selected/engineered, the most basic of distance measures is sufficient to detect anomalies. The detected anomalies are just that -- anomalies. With regard to threat detection, the false positive and false negative rates are still likely too high to operationalize. You can ML like a boss and use Symbolic Aggregate Approximation (SAX) to represent your logs as images, then use a Convolutional Neural Network (CNN) to do the feature extraction for you, and feed the results through a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) to detect the anomalies -- which is approximately what Niara does based on their Spark Summit presentation. Or you can ML like a security engineer and use domain knowledge to identify discriminating features and use some simple Euclidean distance measures to detect anomalies. 
I have done both with the same approximate results. That was a statistics joke.</div><div><br></div><div>The result of all this statistical analysis is a set of findings about hosts that deviate from normal by some measure on one or more features. So what? Well, that is exactly what the team responsible for triaging and operationalizing alerts said. This is where the real work begins. Now, if the host communicated with domains that are novel among the population, for example, the domains would be provided as evidence. The domain information could be enriched with threat intel and results from services like OpenDNS. The monitoring team still says, "yeah ok, it talked to some sketchy shit. What are we really supposed to do about that? I mean really do, so we are not scaling a very expensive whack-a-mole team?" Right.</div><div><br></div><div>Now we go pull all the process execution and process-to-network events from the hosts. Now when a network anomaly occurs, you essentially build the activity graph that resulted in the anomalous network traffic. This looks actionable. It is.</div><div><br></div><div>The thing is, once you have that on-host activity, as Dave said some might say, you really don't need the network data anymore. You get to the same result earlier in the activity chain, with actionable results rich in context that is easily assessed by analysts and incident responders. Even better, you don't need to use statistics. There are better models using this data that are quite good for detection and hunting.</div><div><br></div><div>Some of us like belts and suspenders when we have to depend on imperfect techniques to mitigate risk, so network-level instrumentation presents data from a plane with a different attack surface that correlates with host data. That is a nice feature if you can take advantage of it. Network data is also OS/device independent. 
Building some anomaly detection on network data provides broad coverage at a low engineering cost; however, the compute and storage costs are usually quite high. There are a lot of trade-offs. Honestly, most people get lost and never get clarity about what they are trying to detect, how they are trying to detect it, and whether the data and techniques align with their desired results. They take an opportunistic stab at whatever data they have and fall down the rabbit hole.</div><div><br></div><div>Dom</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jun 21, 2017 at 7:25 AM, dave aitel <span dir="ltr">&lt;<a href="mailto:dave@immunityinc.com" target="_blank">dave@immunityinc.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000">
<p>Let's talk about the <a href="http://www.cnbc.com/2017/06/20/cisco-introduces-encrypted-traffic-analytics-to-detect-malwre.html" target="_blank">giant
pile of wrong that is this reporting on Cisco's new marketing
campaign</a> around detecting encrypted malware traffic. "This
is a seminal moment in networking" is the quote from their CEO
that CNBC decided to run. Let's revisit the basics of this "new"
technology: do statistical analysis on encrypted data to find
malware traffic. <br>
</p>
<p>People have <a href="https://www.schneier.com/blog/archives/2008/06/eavesdropping_o_2.html" target="_blank">literally
decoded conversations</a> from encrypted data using that same
basic technique. Not even recently - that work is from 2008 and
was not surprising even then.<br>
</p>
<p>"<span>The software,
which will be offered as a subscription service, is currently in
field trials with 75 customers, and according to Robbins, is 99
percent effective."</span></p>
<p><span>99% effective
with the kind of traffic a normal network sees means you are
FLOODED AND OVERWHELMED WITH FALSE POSITIVES. They don't even
specify what that number means, though. Is it false
positives? False negatives? Both? Let's just say this: even 99.99% is
useless when doing a network-based IDS. All that might get you
is an indicator you can use to remotely load a more
sophisticated tool onto an endpoint for further detailed
analysis. You essentially need BOTH if you have this level of
network-based IDS, and the endpoint people will probably say you
don't need the network sniffer anymore, because scaling good
analysis at that level at anything near real time is nearly
impossible (cf. <a href="https://www.youtube.com/watch?v=2OTRU--HtLM" target="_blank">Alex
Stamos's talk</a>), to the point where they still try to sell
you stuff that has 1% false positive rates. :)</span></p>
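<p><span>To put rough numbers on the false-positive point, here is a minimal sketch of the base-rate arithmetic. Everything in it is an assumption made up for illustration, not something from Cisco or the CNBC article: the helper name alert_breakdown, the one million flows a day, the 1-in-10,000 malicious rate, and the reading of "99 percent effective" as a 99% true positive rate paired with a 1% false positive rate.</span></p>
<pre>
```python
def alert_breakdown(total_flows, malicious_fraction, true_positive_rate, false_positive_rate):
    """Return (true_alerts, false_alerts, precision) for one day of traffic."""
    malicious = total_flows * malicious_fraction
    benign = total_flows - malicious
    # Alerts that point at real malware traffic.
    true_alerts = malicious * true_positive_rate
    # Alerts raised on perfectly normal traffic.
    false_alerts = benign * false_positive_rate
    # Fraction of all alerts that are worth an analyst's time.
    precision = true_alerts / (true_alerts + false_alerts)
    return true_alerts, false_alerts, precision


# Hypothetical inputs: 1,000,000 flows per day, 1 in 10,000 of them malicious.
for label, tpr, fpr in [("99%", 0.99, 0.01), ("99.99%", 0.9999, 0.0001)]:
    true_alerts, false_alerts, precision = alert_breakdown(1_000_000, 0.0001, tpr, fpr)
    print(
        f"{label}: {true_alerts:.0f} real alerts, "
        f"{false_alerts:.0f} false alerts, precision {precision:.1%}"
    )
```
</pre>
<p><span>Under those assumed numbers, the 99% detector raises about 10,000 false alerts a day against roughly 99 real ones, and at 99.99% about half of all alerts are still false. A rarer threat or a busier network makes the ratio worse.</span></p>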
<p><span>I'm going to
bug our big customers to see if any of them are in this 75-customer
field trial and what they think of it in real life. And I'm going to be
honest and say that if you are thinking of investing in this
sort of thing, but you haven't tested it against <a href="https://www.cobaltstrike.com/" target="_blank">Cobalt Strike</a> and <a href="https://www.immunityinc.com/products/innuendo/" target="_blank">INNUENDO</a>,
then you are knowingly buying snake oil. A good percentage of
our consulting business right now is literally just that kind of
testing, because these anomaly detection products are so expensive
and so hard to test.</span></p>
<p><span>Anyways,
maybe I am wrong! If you are one of the privileged 75 and you
love this and it is amazing, let me/us know!<span class="m_4844909271262295015HOEnZb"><font color="#888888"><br>
</font></span></span></p><span class="m_4844909271262295015HOEnZb"><font color="#888888">
<p><span>-dave</span></p>
</font></span></div>
<br>_______________________________________________<br>
Dailydave mailing list<br>
<a href="mailto:Dailydave@lists.immunityinc.com" target="_blank">Dailydave@lists.immunityinc.com</a><br>
<a href="https://lists.immunityinc.com/mailman/listinfo/dailydave" rel="noreferrer" target="_blank">https://lists.immunityinc.com/mailman/listinfo/dailydave</a><br>
<br></blockquote></div><br></div>
</blockquote></div>