Joshua Saxe

@joshua_saxe

Chief Scientist @ Sophos (views my own). Book: Malware Data Science. Recent paper: …. Interested in ML, cyber, social-sci, philosophy

Joined May 2013

Tweets

  1. Pinned Tweet
    Jan 29

    1\ I've written a little compiler to ship ML models as standalone YARA rules, and done proof-of-concept detectors for Mach-O, RTF files, and PowerShell scripts. So far I have decision trees, random forests, and logistic regression (LR) working.

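To make the idea in the pinned tweet concrete, here is a minimal sketch of how a decision tree over string-presence features could be flattened into a YARA condition. It is not the author's compiler: the `Leaf`/`Node` classes, the `tree_to_condition` and `compile_rule` helpers, the feature strings, and the rule name are all hypothetical, and random forests and logistic regression are not covered.

```python
import re
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Leaf:
    malicious: bool  # class predicted at this leaf


@dataclass
class Node:
    feature: str                   # YARA string identifier, e.g. "enc" for $enc
    present: Union["Node", Leaf]   # subtree taken when the string is found
    absent: Union["Node", Leaf]    # subtree taken when it is not found


def tree_to_condition(tree) -> Optional[str]:
    """Return a YARA boolean expression that is true exactly when the tree
    predicts 'malicious', or None if no path reaches a malicious leaf."""
    if isinstance(tree, Leaf):
        return "true" if tree.malicious else None
    branches = []
    for test, subtree in ((f"${tree.feature}", tree.present),
                          (f"not ${tree.feature}", tree.absent)):
        sub = tree_to_condition(subtree)
        if sub is None:
            continue  # this branch never leads to a malicious leaf
        branches.append(test if sub == "true" else f"({test} and {sub})")
    if not branches:
        return None
    return branches[0] if len(branches) == 1 else "(" + " or ".join(branches) + ")"


def compile_rule(name: str, strings: dict, tree) -> str:
    """Emit rule text, declaring only the strings the condition references."""
    condition = tree_to_condition(tree) or "false"
    used = set(re.findall(r"\$(\w+)", condition))
    lines = [f"rule {name}", "{"]
    if used:
        lines.append("    strings:")
        for ident, literal in strings.items():
            if ident in used:
                escaped = literal.replace("\\", "\\\\").replace('"', '\\"')
                lines.append(f'        ${ident} = "{escaped}"')
    lines += ["    condition:", f"        {condition}", "}"]
    return "\n".join(lines)


# Hypothetical PowerShell features and a hand-built (not trained) tree.
STRINGS = {"enc": "-EncodedCommand", "dl": "DownloadString", "iex": "Invoke-Expression"}
TREE = Node("enc",
            present=Node("dl", Leaf(True), Leaf(False)),
            absent=Node("iex", Leaf(True), Leaf(False)))
RULE = compile_rule("ps_tree_demo", STRINGS, TREE)
print(RULE)
```

Each root-to-leaf path that ends in a malicious leaf becomes one `and` clause, and the clauses are joined with `or`, so the rule fires exactly when the tree would.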
  2. Retweeted
    Jan 30

    1\ A file seen at "Downloads\svchost.exe" that doesn't *look* like svchost.exe *might* be a problem. Indeed, AI's , and show that a neural net that takes a file's path alongside its contents gets ~30% better detection.

  3. Retweeted
    Jan 21

    VirusTotal is arguably the ImageNet of cybersecurity AI, having provided large-scale labeled data and a rough means by which to compare approaches. This was a happy accident though, and is not VT's purpose -- which is why efforts like et al.'s EMBER are so important.

  4. Retweeted
    Jan 23

    ML is mystified when it's marketed, but effective models are sometimes simple. E.g. this phishing model, which scores an email by adding up word and phrase weights in the text. Positive weights are "bad", negative weights are "good."

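The additive scoring described in the tweet above can be sketched in a few lines. The phrases, weights, and bias below are invented for illustration and are not the model from the tweet; counting every occurrence of a phrase and squashing the sum with a sigmoid are also assumptions.

```python
import math
import re

# Hypothetical weights: positive means "bad", negative means "good".
WEIGHTS = {
    "verify your account": 2.0,
    "password": 1.0,
    "urgent": 1.5,
    "meeting": -1.0,
    "unsubscribe": -0.5,
}
BIAS = -1.0  # hypothetical intercept


def score_email(text, weights=WEIGHTS, bias=BIAS):
    """Sum the weight of every word or phrase occurrence found in the text."""
    normalized = re.sub(r"\s+", " ", text.lower())
    total = bias
    for phrase, weight in weights.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        total += weight * len(re.findall(pattern, normalized))
    return total


def probability(score):
    """Map the raw score to (0, 1) with a logistic function."""
    return 1.0 / (1.0 + math.exp(-score))


phish = score_email("URGENT: verify your account password")
benign = score_email("Agenda for Monday's meeting")
print(phish, round(probability(phish), 3))
print(benign, round(probability(benign), 3))
```

Because the score is a plain sum, every contribution can be read off directly, which is part of what makes such a model easy to inspect.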
  5. Retweeted
    Jan 30

    Large scale malware similarity visualization work by , myself, and others. We built a prototype set of analytics and accompanying GUI to accelerate malware analysis over many samples, and did a user study showing efficacy.

  6. Feb 1

    Some infosec knowledge is useful for months (knowledge of a given campaign), some for years (TTPs), and some for decades (the halting problem). Here's a "Pyramid of Pain" (cc/ ) inspired model of knowledge in cyber I find useful for myself.

  7. Feb 1

    3\ And this problem isn't going away. I think advances in reinforcement learning's ability to explore very large state spaces in games like Go (e.g. AlphaGo) provide some promise here. But be aware that any commercial or free malware sandbox you use today has this problem.

  8. Feb 1

    2\ The problem isn't that the malware did any sandbox evasion. It's simply that we didn't know the right program inputs (e.g. C2 interaction) to get the malware to exercise the untouched parts of the graph. Code coverage is a very hard, unsolved computer science problem.

  9. Feb 1

    1\ Malware sandboxes are useful but extremely limited. Here's a malware call graph, and in red are the functions the malware actually *executed* when run in a sandbox -- a minuscule fraction of the malware's potential badness!

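The coverage gap this thread describes can be quantified with a short sketch: compare the functions reachable in a static call graph against the set a sandbox run actually executed. The graph, function names, and executed set below are hypothetical, and the sketch assumes the call graph is already available as an adjacency mapping (real call graphs also miss indirect calls).

```python
from collections import deque


def reachable(call_graph, entry):
    """Breadth-first search over caller -> callees edges from the entry point."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def coverage(call_graph, entry, executed):
    """Return (fraction of reachable functions executed, sorted missed functions)."""
    potential = reachable(call_graph, entry)
    hit = potential & set(executed)
    return len(hit) / len(potential), sorted(potential - hit)


# Hypothetical malware call graph: most behavior sits behind a C2 command.
CALL_GRAPH = {
    "main": ["check_env", "connect_c2"],
    "connect_c2": ["parse_command"],
    "parse_command": ["keylog", "exfiltrate", "self_delete"],
}
# Hypothetical sandbox trace: the C2 server never answered.
EXECUTED = {"main", "check_env", "connect_c2"}

ratio, missed = coverage(CALL_GRAPH, "main", EXECUTED)
print(f"{ratio:.0%} of reachable functions executed; missed: {missed}")
```

In this toy case nothing was evaded; the run simply never supplied the input (a C2 response) needed to reach `parse_command` and everything behind it.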
  10. Retweeted
    Jan 26

    1\ There's an intuition in that behavioral malware detection (ML or not) is better than file-based (static) detection because it's resilient to packing and detects malware in the act. In fact, empirical results bear out that files, even packed files, are better signals.

  11. Feb 1

    Useful summary of known/best-guess information about the coronavirus from the NYTimes

  12. Retweeted
    Jan 27

    If our car engine sounds funny, our internal anomaly detector takes it to a mechanic. Begs the question: could your average user, fed the right visual/auditory encodings of PC behavior (say in a little animated visualization during system startup), help detect breached machines?

  13. Retweeted
    Jan 31

    Facial Recognition meets malware clustering: training on family names plus some embedding tricks stolen from the FR literature plus t-SNE leads to super sharp clusters, with a few cases of potential mislabeling to dig into (check out the potential FNs southeast of the origin)!

  14. Jan 31

    A genealogy of 10 malware EXEs that share code - and one that doesn't belong - visualized. The rows are individual samples, and the color-blocks are their functions. A somewhat complicated algorithm is used to draw a plausible evolutionary lineage. Work from my CyberGenome days.

  15. Jan 31

    For a non-mathy thorough intro to ML detection, Chapter 6 of and my book is available for free on the site. Goes through logistic regression, kNN, decision trees, random forests, and when it makes sense to use each.

  16. Retweeted
    Jan 26

    1\ Let's bypass a convolutional neural network trained to recognize previously unseen bad URLs. The classifier gives a score between 0 (benign) and 1 (definitely malicious). I start by making up a phishing URL: hxxp://wellsfargo-customer-support.webhosting.pl/login

  17. Retweeted
    Jan 24

    1/ Here's a thread on how to build the kind of security artifact "social network" graph popularized by and others, but customized, and on your own private security data. Consider the following graph, where the nodes are malware samples:

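One possible way to build a graph whose nodes are malware samples is to connect samples that share enough attributes. The sketch below is an assumption, not the method from the thread (whose later tweets are not shown here): it uses Jaccard similarity over hypothetical per-sample attribute sets, and the sample names, attributes, and the 0.5 threshold are made up.

```python
from itertools import combinations


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(samples, threshold=0.5):
    """Return {(sample1, sample2): similarity} for pairs at or above the threshold."""
    edges = {}
    for n1, n2 in combinations(sorted(samples), 2):
        sim = jaccard(samples[n1], samples[n2])
        if sim >= threshold:
            edges[(n1, n2)] = sim
    return edges


# Hypothetical attributes extracted from each sample (domains, mutexes, packer).
SAMPLES = {
    "sample_a": {"evil.example", "mutex_x", "upx"},
    "sample_b": {"evil.example", "mutex_x", "rc4"},
    "sample_c": {"other.example"},
}

EDGES = build_graph(SAMPLES)
print(EDGES)
```

Comparing all pairs is quadratic in the number of samples, so this only suits small private datasets; larger ones would need an approximate method.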
  18. Retweeted
    Jan 2

    Thread on cognitive biases in cybersecurity I've noticed: Maginot Line: you got breached by an impersonation attack, so you go buy an anti-impersonation solution and assume you're much safer. Sort of like checking people's shoes at the airport.

  19. Jan 30

    5\ The result, as would be intuitive, is much better detection accuracy. Please see the paper for more info!

  20. Jan 30

    4\ These analyses are then merged so that reasoning about the file path in the context of the binary, and vice-versa, can take place. Good targets for detection are, for example, unsigned system utilities that have been replaced by very different looking files.

  21. Jan 30

    3\ And a character string / file path analyzer, which does a high-capacity analysis of file path strings...
