[Dailydave] AI

Wed Mar 30 13:03:11 EDT 2016

Sven,

Your general point is well taken. However, I'd contend that while most problems in security don't boil down to simple image classification tasks, there are certainly valid ways of applying the spatial nature of CNNs to security problems. Namely, mapping data that is not traditionally visual to an image representing that data (e.g. binary -> png) can yield, and in my experience has yielded, very promising results. Granted, it's debatable whether it's better to use a technique more suited to the original data set than to transform it into an image, but that's a conversation for another day. The bottom line is finding a model that consistently gives good results in the context of the question being answered.
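
As a minimal sketch of the binary-to-image mapping mentioned above (an illustration under stated assumptions, not the specific pipeline behind the results described): raw bytes are laid out row by row as 8-bit grayscale pixels. The function names, the fixed row width of 16, and the zero padding are hypothetical choices, and PGM is used instead of PNG only so the example needs nothing beyond the standard library.

```python
import math

def bytes_to_grid(data: bytes, width: int = 16) -> list[list[int]]:
    """Lay raw bytes out row by row as 8-bit grayscale pixel values."""
    if width <= 0:
        raise ValueError("width must be positive")
    height = max(1, math.ceil(len(data) / width))
    # Assumption: zero-pad the final row so every row has the same length.
    padded = data.ljust(width * height, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]

def grid_to_pgm(grid: list[list[int]]) -> bytes:
    """Serialize the grid as a binary PGM (P5) grayscale image."""
    height, width = len(grid), len(grid[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(v for row in grid for v in row)

sample = bytes(range(40))               # stand-in for a binary's contents
grid = bytes_to_grid(sample, width=16)  # 3 rows of 16 pixels
pgm = grid_to_pgm(grid)                 # could be written to a .pgm file
```

The row width matters in practice: it decides which bytes end up as vertical neighbours, and therefore which spatial patterns a CNN is able to pick up.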

On the point of caring only about the results and not about the technology/process involved, I'm not sure I agree. When we get into extremely complex technologies that give us binary, "good/bad" answers to not-so-simple questions, I think it's imperative to understand the basis upon which the technology arrived at the answer. It may not be feasible with commercial (read: intellectual property) solutions, but it is nonetheless important. An example can be found in dynamic malware analysis systems, where understanding the perspective from which data is collected helps frame the efficacy of the result with respect to potential detection by malware.

Just some food for thought.

Chris Smoak
Georgia Tech Research Institute

From: <dailydave-bounces at lists.immunityinc.com> on behalf of Sven Krasser <sven at crowdstrike.com>
Date: Wednesday, March 30, 2016 at 10:49 AM
To: dave aitel <dave at immunityinc.com>, "dailydave at lists.immunityinc.com" <dailydave at lists.immunityinc.com>
Subject: Re: [Dailydave] AI

Hey Dave,

You got some things right and some things wrong. In security, most problems are not image classification related and do not benefit at the same level from the recent advances in Convolutional Neural Networks. Also, TensorFlow is not the first freely available Deep Learning library, nor is it the first freely available Machine Learning classification library by a long shot. Take a look at, e.g., some of the presentations that the MLSec Project made available. ML has been in security products for decades (and I worked on shipping products with it back in the day at CipherTrust, before people cared what technology stopped the threats as long as they were stopped). What’s new is that Machine Learning now also appears on marketing materials. So the question one should ask oneself is whether you still have a product once the ML hype has worn off.

Best,
-Sven

--
Sven Krasser, Ph.D.
Chief Scientist, CrowdStrike, Inc.
http://www.crowdstrike.com | http://tinyurl.com/cs-svenk

From: <dailydave-bounces at lists.immunityinc.com> on behalf of dave aitel <dave at immunityinc.com>
Date: Wednesday, March 30, 2016 at 5:56 AM
To: "dailydave at lists.immunityinc.com" <dailydave at lists.immunityinc.com>
Subject: [Dailydave] AI

There are only a few real computers in the world, and I think we are just beginning to feel their influence. For example, here is a sample project I am working on now that image classification is a solved problem.

Like many of you on this list, I dabble in Brazilian jiu-jitsu. In fact, in a week we are doing an open mat at INFILTRATE for everyone from newcomers who've always wanted to try to choke me out to people in the community who are already very good at choking people.

Like many sports, BJJ is typically scored according to a ruleset based on the different positions you end up in. Being on top is usually better. Being able to get on top after you are on the bottom is worth two points. Being able to completely mount someone is worth three points. Getting on their back is worth four points. Generally a tournament will hire judges, and they will award points based on their understanding of the rules, their personal feelings towards the contestants, and whatever other factors are floating in their heads.

What I'm working on is collecting a set of images of BJJ, then annotating them as to what positions the different people are in. This essentially maps every image into a vector space - and after training a neural network using modern techniques you can have a program that looks at an image and then outputs "Blue is in top mount".

Part of the key here is that you don't have to tell it that the picture is BJJ. Every picture that program sees is two people doing BJJ. All it has to do is output what positions they are in.
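
The output stage described above can be sketched as follows. This is only an illustration: the label set, the `describe` helper, and the example scores are hypothetical, and the network that would produce the per-class scores is left out entirely. Because every input is assumed to be a BJJ picture, each class is simply an (athlete, position) pair.

```python
import math

# Hypothetical label set: one class per (athlete, position) pair.
CLASSES = [(athlete, position)
           for athlete in ("Blue", "White")
           for position in ("top mount", "back control", "closed guard")]

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def describe(logits):
    """Pick the most likely class and phrase it as a caption."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    athlete, position = CLASSES[best]
    return f"{athlete} is in {position}", probs[best]

# Made-up scores standing in for a trained network's output on one image.
caption, confidence = describe([2.5, 0.1, -1.0, 0.3, -0.5, 0.0])
```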

And in the end, by assigning point values to transitions between positions, you will have an automatic BJJ judge. I've applied for a TensorFlow API key from Google because, although this is not a hard problem by ML standards, I want to do it the right way and get good scalable results on video later.
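
A rough sketch of the transition-scoring idea, using the point values quoted earlier in this message (two for getting on top from the bottom, three for mount, four for the back). The per-frame label format, the `score_match` name, and the rule of scoring only when the label changes are assumptions made for illustration. Real rulesets add conditions, such as holding a position for a few seconds, that are ignored here.

```python
from collections import Counter

# Point values from the description above; label vocabulary is hypothetical.
POSITION_POINTS = {"mount": 3, "back": 4}
SWEEP_POINTS = 2  # getting on top after being on the bottom

def score_match(frames):
    """Score a match from per-frame (dominant_athlete, position) labels."""
    scores = Counter()
    prev = None
    for label in frames:
        if label == prev:
            continue  # same position held across frames: no new transition
        athlete, position = label
        if position in POSITION_POINTS:
            scores[athlete] += POSITION_POINTS[position]
        elif position == "top" and prev is not None and prev[0] != athlete:
            scores[athlete] += SWEEP_POINTS  # top athlete changed: a sweep
        prev = label
    return dict(scores)

frames = [
    ("white", "top"), ("white", "top"),     # starting position, no points
    ("blue", "top"),                        # blue sweeps
    ("blue", "mount"), ("blue", "mount"),   # blue mounts
    ("blue", "back"),                       # blue takes the back
]
result = score_match(frames)
```

Collapsing repeated labels is what keeps a held position from being scored on every video frame.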

And of course, the same thing is true for the process information El Jefe<https://eljefe.immunityinc.com/> will give you. All those "behavioral analysis machine learning intrusion detection" startups are about to be crushed by simple open source projects that use Google and MS and Amazon's exported Machine Learning APIs.

-dave
