Say someone is investigating an intrusion on Windows Vista or later where the attacker most likely added files, including malware, but AV doesn't find anything. Could someone run ssdeep on a previous VSC (Volume Shadow Copy) and then on the original file system, and discard files that are around 90% the same? The malware will probably be completely new files and so would not be discarded.
That should reduce a lot of data. With further filtering, such as showing only binaries, I would think you'd have a good start on the timeline and just a few suspicious binaries to submit to VirusTotal and analyze further using other methods.
Is anyone already using ssdeep like this, and if so, does it work?
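To make the idea concrete, here is a rough Python sketch of the comparison. It is only an illustration under some assumptions: the VSC and the current volume are both mounted as directories, files are paired by relative path (ssdeep can also match any file against any other, which this does not do), and the function names `relative_files`, `suspicious_files` and `ssdeep_backend` are made up for this example. The hashing and comparison functions are passed in so the logic can be tried without ssdeep installed; `ssdeep_backend` assumes the third-party python-ssdeep bindings.

```python
import os


def relative_files(root):
    """Yield (relative_path, absolute_path) for every file under root."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            yield os.path.relpath(full, root), full


def suspicious_files(vsc_root, live_root, hash_file, compare, threshold=90):
    """Return (relative_path, score) for live files that are new or changed.

    score is None when the file has no counterpart in the snapshot.
    Files scoring at or above threshold are discarded as 'about the same'.
    """
    # Windows paths are case-insensitive, so pair files on a lowercased key.
    baseline = {rel.lower(): full for rel, full in relative_files(vsc_root)}
    keep = []
    for rel, full in relative_files(live_root):
        old = baseline.get(rel.lower())
        if old is None:
            keep.append((rel, None))  # not in the snapshot: a new file
            continue
        score = compare(hash_file(old), hash_file(full))
        if score < threshold:
            keep.append((rel, score))  # changed too much to discard
    keep.sort(key=lambda item: item[0])
    return keep


def ssdeep_backend():
    """Return (hash_file, compare) from python-ssdeep (assumed installed)."""
    import ssdeep
    return ssdeep.hash_from_file, ssdeep.compare
```

With the bindings installed, a call would look like `suspicious_files(vsc_dir, live_dir, *ssdeep_backend())`, where `vsc_dir` and `live_dir` are whatever mount points you use. The 90 threshold is just the figure from the question, not a tested value.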
Could someone run ssdeep on a previous VSC and then on the original file system and discard files that are around 90% the same? The malware will probably be completely new files and not discarded.
That should reduce a lot of data.
Not sure what ssdeep adds over straight hashing. Straight hashing, too, would throw out a lot of data. (Perhaps you are thinking that patched binaries are a possibility, so that it would not be necessary to have the exact version as the hash source? That might be so.)
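For comparison, the straight-hashing baseline could be sketched like this in Python. It assumes the snapshot and the current volume are mounted as directories, and the function names `sha256_of`, `files_under` and `not_in_snapshot` are invented for the example. A file is discarded only if a byte-identical file exists anywhere in the snapshot, so a patched or slightly edited file stays in the result.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_under(root):
    """Yield the full path of every file under root."""
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            yield os.path.join(dirpath, name)


def not_in_snapshot(vsc_root, live_root):
    """Return relative paths of live files whose exact content is not in the snapshot."""
    known = {sha256_of(path) for path in files_under(vsc_root)}
    return sorted(
        os.path.relpath(path, live_root)
        for path in files_under(live_root)
        if sha256_of(path) not in known
    )
```

Running both filters over the same pair of directories and counting what each leaves behind would be one way to see how much extra reduction the fuzzy match actually buys.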
Another possibility is to throw out executables (and .cab and a few other file types) that are digitally signed, and for which the signature can be verified properly.
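A possible sketch of the signature filter, in Python. It assumes a Windows analysis machine with PowerShell available and uses the `Get-AuthenticodeSignature` cmdlet, keeping every file whose reported status is anything other than `Valid`. The function names `authenticode_status` and `unsigned_or_invalid` are made up for the example, and the verifier is a parameter so a different tool can be swapped in. Note that verification uses the certificate store of the machine running the check, which may not match the examined system.

```python
import subprocess


def authenticode_status(path):
    """Return the Authenticode status PowerShell reports for one file (Windows only)."""
    quoted = path.replace("'", "''")  # escape single quotes for PowerShell
    command = "(Get-AuthenticodeSignature -LiteralPath '%s').Status" % quoted
    result = subprocess.run(
        ["powershell", "-NoProfile", "-Command", command],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def unsigned_or_invalid(paths, status_of=authenticode_status):
    """Keep only files whose signature is missing or did not verify as Valid."""
    return [path for path in paths if status_of(path) != "Valid"]
```

Starting one PowerShell process per file is slow, so this is better applied after the hash-based filtering has already cut the list down.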
Is anyone already using ssdeep like this and if so does it work?
It sounds like an interesting study otherwise: do ssdeep and other fuzzy hashing methods give any real benefits (say, a reduced number of hashes) over straight hashing for files from an OS distribution for which patches can be expected?
You pretty much got what I meant. Between the time the VSC was created and the actual analysis, a lot of files could get slightly changed, by either an admin or a user. The only real benefit is more data reduction.
I agree, looking for executables that aren't signed is another good one to further whittle it down to a few suspicious programs. Thanks!