[Dailydave] Machine Learning and Dimensions and stuff

William Kupersanin wkupersa at gmail.com
Sat Nov 22 10:46:08 EST 2014

Yes. Sorry. I'll try to elaborate with an example. It's not really supervised
or unsupervised learning, but I think it gets to the point. Say, for
example, that one of my indicators is low-probability parent-child process
relationships that lead to potentially high-risk applications such as
powershell or reg. I haven't tried this one, but I am willing to bet that the
intersection of these conditions has a fairly high signal-to-noise ratio.
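
As a rough, untested sketch of that indicator in Python: it assumes
process-creation events are already available as (parent image, child image)
pairs. The function names, the HIGH_RISK set, the 0.05 threshold, and the
add-one smoothing are all illustrative assumptions, not a description of any
existing tool.

```python
from collections import Counter

# Hypothetical "high risk" child images; the post names powershell and reg.
HIGH_RISK = {"powershell.exe", "reg.exe"}


def build_baseline(events):
    """Count (parent, child) pairs and per-parent totals from past events."""
    pairs = [(parent.lower(), child.lower()) for parent, child in events]
    pair_counts = Counter(pairs)
    parent_counts = Counter(parent for parent, _ in pairs)
    children = {child for _, child in pairs}
    return pair_counts, parent_counts, children


def child_probability(baseline, parent, child):
    """Estimate P(child | parent) with add-one smoothing (an assumed choice)."""
    pair_counts, parent_counts, children = baseline
    parent, child = parent.lower(), child.lower()
    vocab = max(len(children), 1)  # number of distinct child images seen
    return (pair_counts[(parent, child)] + 1) / (parent_counts[parent] + vocab)


def is_suspicious(baseline, parent, child, threshold=0.05):
    """Flag only the intersection: a rare pairing AND a high-risk child."""
    rare = child_probability(baseline, parent, child) < threshold
    return rare and child.lower() in HIGH_RISK
```

With a baseline in which explorer.exe routinely starts powershell.exe but
winword.exe never does, is_suspicious(baseline, "winword.exe",
"powershell.exe") flags the event while the explorer.exe case does not. A
parent that never appears in the baseline only gets the uniform smoothed
estimate, so it would need separate handling.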

If I understood Halvar's comment correctly, as this becomes well known, the
adversary will take measures to avoid "tripping" this analytic. Thinking
about how they might approach this, they might inject code into common
processes in order to avoid the odd parent-child relationship, or they
might bring their own tools to avoid using powershell or reg.

My response is twofold. First, either of these scenarios is also detectable,
and it becomes an arms race of sensors and analytics versus techniques. Event
Tracing for Windows and various COTS tools already give you some of the
capabilities to detect them. Second, the point that I explained poorly earlier
is that making the adversary work harder to stay on our systems is game
changing. I'd argue that we don't do a lot in this area and that the old
techniques for credential access, exfiltration, and command and control keep
working. When we can begin to challenge the adversary and make our systems
contested space, we raise their cost of information. Finally.

Hope this makes more sense,
On Nov 22, 2014 4:30 AM, "shadown [at] gmail" <shadown at gmail.com> wrote:

> Willie, could you elaborate?
> I'm interested in details, from vague statements we don't learn anything
> new. Please remember this is not the physical world, and very different
> rules apply.
> Cheers,
>   Sergio
> On 21.11.2014, at 22:19, William Kupersanin <wkupersa at gmail.com> wrote:
> The implication, though, is that even if the adversary adapts, the ML
> analytic is forcing the adversary to operate in a smaller space to avoid
> appearing anomalous. I consider anything that can shift the balance of cost
> from the defender to the adversary to be wildly successful.
> --Willie
> On Thu, Nov 20, 2014 at 5:25 PM, Halvar Flake <HalVar at gmx.de> wrote:
>> Hey all,
>> thanks for the link, and it is indeed a fun talk :-)
>> An important detail that many people in "machine learning for security"
>> neglect is that the vast majority of ML algorithms were not designed for
>> (and will not function well in) an adversarial model. Normally, one is
>> trying to model an unknown statistical process based on past observables;
>> the concept that the statistical process may adapt itself with the intent
>> of fooling you isn't really of interest when you try to recognize faces /
>> letters / cats / copyrighted content programmatically.
>> For entertainment, I think everyone that plays with statistics / curve
>> fitting / machine learning in our field should have a look at two things:
>>     http://cvdazzle.com/ - people trying crazy makeup / hair styles to
>>     screw with face detection.
>>     http://blaine-nelson.com/research/pubs/Huang-Joseph-AISec-2011 - a
>>     riot of a paper that introduces "Adversarial Machine Learning"
>> This doesn't mean that you can't have huge successes temporarily using ML
>> / curve fitting / statistics; attackers haven't felt the need to adapt to
>> anything but AV signatures and DNS blacklisting yet, so relatively simple
>> ML will have big gains initially. I suspect, though, that a really
>> important part of using ML for defense in any form is "not becoming an
>> oracle" - which is often counter to commercial success. It may be that the
>> only good, long-term ML-based defense is one that can't be bought.
>> Cheers,
>> Halvar
>> *Sent:* Thursday, 20 November 2014 at 19:16
>> *From:* "Dave Aitel" <dave at immunityinc.com>
>> *To:* dailydave at lists.immunityinc.com
>> *Subject:* [Dailydave] Machine Learning and Dimensions and stuff
>> https://vimeo.com/112322888
>> Dmitri pointed me at the above talk which is essentially a good
>> specialized 101-level lecture on how machine learning works in the
>> security space.
>> There's not much to criticize in the talk! (It has a lot of the features
>> of El Jefe!) They use a real graph database to run their algorithms
>> against process trees - but if you wanted to heckle you'd ask "Doesn't
>> the CreateProcess() API call also take 'parent process' as an
>> argument? What IS the rate of false positives? Because if you can't get
>> it down to basically 0 then you are essentially wasting your time? etc."
>> :>
>> But again, nobody asked any hard questions - and while the talk nibbled
>> around the edges of the tradeoffs with using machine learning techniques
>> on this kind of data, it didn't go into any depth at all about which
>> ones they've tried and failed at. It's a technical talk, but it's not a
>> DETAILED talk in the sense of "Here's some outliers that show us where
>> we fail and where we succeed and perhaps why".
>> That said, if you don't have a plan to do this sort of thing, then
>> you're probably failing at some level, so worth a watch. :>
>> -dave
>> _______________________________________________
>> Dailydave mailing list
>> Dailydave at lists.immunityinc.com
>> https://lists.immunityinc.com/mailman/listinfo/dailydave