<html>

  <head>


    <meta http-equiv="content-type" content="text/html; charset=utf-8">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <br>

    This paper is bad in many ways, but in particular it confuses

    binaries with 0days (which are more about vulnerabilities), uses

    a simplistic "windows of vulnerability" model, and tries to derive

    real data from the Symantec WINE dataset.<br>

<a class="moz-txt-link-freetext" href="https://users.ece.cmu.edu/~tdumitra/public_documents/bilge12_zero_day.pdf">https://users.ece.cmu.edu/~tdumitra/public_documents/bilge12_zero_day.pdf</a><br>

    <br>

    A brief word about the WINE dataset and datasets like it: It is

    impossible to remove massive observer bias from them. All I want you

    to do is read the above paper and ask yourself "If the most used

    0day on the market was in Symantec's endpoint protection, what would

    this paper look like?" A good rule of thumb is

    that if someone is talking about "Windows of vulnerability" they

    have oversimplified the problem beyond recognition.<br>

    <br>

    People who rely on IDS data to talk about 0days show a bizarre

    level of cognitive dissonance about how poorly their data supports

    the conclusions they are trying to draw. The only valid thing you

    can say from that kind of data is "sometimes we get lucky and find

    an 0day". The same is true of using the Verizon data to try to

    understand attacks. Their conclusions this year are demonstrably

    nonsensical, but the basic methodology has been the same every

    year...<br>

    <br>

    This is a must-read:

    <a class="moz-txt-link-freetext" href="http://blog.trailofbits.com/2016/05/05/the-dbirs-forest-of-exploit-signatures/">http://blog.trailofbits.com/2016/05/05/the-dbirs-forest-of-exploit-signatures/</a>

    <br>

    <br>

    But when you hear me go on and on about how academia has completely

    lost its way in security, it's because of papers like the one at the

    top of this email. When you don't have the data you need to draw a

    conclusion, but you are forced to publish something, you get shit

    results. And then we make government and corporate policy decisions

    based on those results.<br>

    <br>

    -dave<br>

    (P.S. The Windows emulator WINE is great, and not related to the

    Symantec WINE dataset:

    <a class="moz-txt-link-freetext" href="https://www.caida.org/workshops/telescope/slides/telescope1103_wine.pdf">https://www.caida.org/workshops/telescope/slides/telescope1103_wine.pdf</a>)<br>

    (P.P.S. A behavioral Windows dataset would actually be of great

    value. Maybe CrowdStrike could drop one out?)<br>

    <br>

    <br>

  </body>

</html>