<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div>Hey all,</div>


<div>&nbsp;</div>


<div>Thanks for the link, it is indeed a fun talk :-)</div>


<div>&nbsp;</div>


<div>An important detail that many people in &quot;machine learning for security&quot; <span style="line-height: 1.6em;">neglect is that the vast&nbsp;</span><span style="line-height: 1.6em;">majority </span></div>


<div><span style="line-height: 1.6em;">of ML algorithms were not designed for (and will not&nbsp;</span><span style="line-height: 1.6em;">function well in) an adversarial model. Normally,</span></div>


<div><span style="line-height: 1.6em;">one is trying to model an unknown statistical process based on past observables; the concept that the&nbsp;</span></div>


<div><span style="line-height: 1.6em;">statistical process may adapt itself with the intent of fooling you isn&#39;t really of interest when you try to</span></div>


<div><span style="line-height: 1.6em;">recognize faces / letters / cats / copyrighted content programmatically.</span></div>


<div>&nbsp;</div>


<div><span style="line-height: 1.6em;">For entertainment, I think everyone who plays with statistics / curve fitting / machine learning in our field</span></div>


<div><span style="line-height: 1.6em;">should have a look at two things:</span></div>


<div>&nbsp; &nbsp;</div>


<div>&nbsp; &nbsp; http://cvdazzle.com/ - people trying crazy makeup / hair styles to screw with face detection.</div>


<div><span style="font-family: Verdana; font-size: 12px; line-height: 19.2000007629395px;">&nbsp; &nbsp; http://blaine-nelson.com/research/pubs/Huang-Joseph-AISec-2011 - a riot of a paper that introduces &quot;Adversarial Machine Learning&quot;</span></div>


<div>&nbsp;</div>


<div>This doesn&#39;t mean that you can&#39;t have huge successes temporarily using ML / curve fitting / statistics;</div>


<div>attackers&nbsp;<span style="line-height: 1.6em;">haven&#39;t felt the need to adapt to anything but AV signatures and DNS blacklisting yet, so&nbsp;</span><span style="line-height: 1.6em;">relatively simple&nbsp;</span></div>


<div><span style="line-height: 1.6em;">ML will have big gains initially. I suspect, though, that a really important part of using ML for defense in any form</span></div>


<div><span style="line-height: 1.6em;">is &quot;not becoming an oracle&quot; (something an attacker can query at will to check whether a sample gets flagged) - which is often counter to commercial success. It may be that the only good, long-term</span></div>


<div><span style="line-height: 1.6em;">ML-based defense is one that can&#39;t be bought.</span></div>


<div>&nbsp;</div>


<div><span style="line-height: 1.6em;">Cheers,</span></div>


<div><span style="line-height: 1.6em;">Halvar</span></div>


<div>&nbsp;</div>




<div>&nbsp;

<div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;">

<div style="margin:0 0 10px 0;"><b>Sent:</b>&nbsp;Thursday, 20 November 2014 at 19:16<br/>

<b>From:</b>&nbsp;&quot;Dave Aitel&quot; &lt;dave@immunityinc.com&gt;<br/>

<b>To:</b>&nbsp;dailydave@lists.immunityinc.com<br/>

<b>Subject:</b>&nbsp;[Dailydave] Machine Learning and Dimensions and stuff</div>


<div name="quoted-content"><a href="https://vimeo.com/112322888" target="_blank">https://vimeo.com/112322888</a><br/>

<br/>

Dmitri pointed me at the above talk which is essentially a good<br/>

specialized 101-level lecture on how machine learning works in the<br/>

security space.<br/>

<br/>

There&#39;s not much to criticize in the talk! (It has a lot of the features<br/>

of El Jefe!) They use a real graph database to run their algorithms<br/>

against process trees - but if you wanted to heckle you&#39;d ask &quot;Doesn&#39;t<br/>

the CreateProcess() system call also take &quot;parent process&quot; as an<br/>

argument? What IS the rate of false positives? Because if you can&#39;t get<br/>

it down to basically 0 then you are essentially wasting your time? etc.&quot; :&gt;<br/>

<br/>

But again, nobody asked any hard questions - and while the talk nibbled<br/>

around the edges of the tradeoffs with using machine learning techniques<br/>

on this kind of data, it didn&#39;t go into any depth at all about which<br/>

ones they&#39;ve tried and failed at. It&#39;s a technical talk, but it&#39;s not a<br/>

DETAILED talk in the sense of &quot;Here&#39;s some outliers that show us where<br/>

we fail and where we succeed and perhaps why&quot;.<br/>

<br/>

That said, if you don&#39;t have a plan to do this sort of thing, then<br/>

you&#39;re probably failing at some level, so worth a watch. :&gt;<br/>

<br/>

-dave<br/>

<br/>

<br/>

_______________________________________________<br/>

Dailydave mailing list<br/>

Dailydave@lists.immunityinc.com<br/>

<a href="https://lists.immunityinc.com/mailman/listinfo/dailydave" target="_blank">https://lists.immunityinc.com/mailman/listinfo/dailydave</a></div>

</div>

</div></div></body></html>