Hello,
I am looking to import the NSRL hashes into my encase tool. I read the data on the NSRL site but came away with the idea that once I load the hashes they will only identify a file as "known", meaning it could be a WinXP system file or a "known" malware file.
Am I incorrect in that?
Basically I want a list of known MS (NT/2K/XP….) files so I don't have to look at each one and wonder if it's good or bad.
Thanks
Quick Google Search…
http//
I am looking to import the NSRL hashes into my encase tool. I read the data on the NSRL site but came away with the idea that once I load the hashes they will only identify a file as "known", meaning it could be a WinXP system file or a "known" malware file.
It *could* be malware. As the NSRL people insist on collecting hashes only from floppy disks, CDs and DVDs, however, that malware must have made it onto the software distribution disc. That does sometimes happen, which is why there can be no guarantee that the hashes are free of malware. (Some very old CDs from the good old MS-DOS days may even have legitimate programs – like HALT.COM – that today will be flagged as malware because some virus uses that code to destroy files.)
And as to 'known' … look over the structure of the database. You'll find a lot of information as to the source(s) of a particular hash, so it's not just 'known but nothing more'.
Importing the information into EnCase is not pretty, as EnCase will retain only the 'first' source of the hash, which is generally some software distribution with a name beginning with A, and which looks very odd unless you know why. Don't rely on this information. (I prefer to remove it, to avoid the risk.)
But EnCase uses the hashes as 'known' hashes only – you are supposed to use them to *remove* files from your examination, to be able to concentrate on the 'unknown' files.
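To illustrate the 'remove known files' idea outside EnCase, here is a minimal Python sketch. It is not how EnCase does it internally; `sha1_of` and `split_known_unknown` are hypothetical helper names, and `known_hashes` is assumed to be a set of uppercase hex SHA-1 strings you have already loaded from whatever hash set you use.

```python
import hashlib


def sha1_of(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA-1 of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_known_unknown(paths, known_hashes):
    """Split paths into (known, unknown) by SHA-1 membership.

    'known' only means the hash is in the set, not that the file is good.
    """
    known, unknown = [], []
    for path in paths:
        if sha1_of(path) in known_hashes:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

The 'unknown' list is what you would concentrate on; the 'known' list is set aside, not certified clean.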
Note that any hash collection will have the same problem – it won't be 'chemically pure'. So you never trust the hash to identify the file with 100% certainty – you just use it as a possibly unreliable indicator, and then verify the file by other means. If identified as CP, you do look at the image. If identified as malware, you either analyze it or pass it to a trusted analyzer.
Added later: if you just want NT/Win2k/etc. files, you will have to filter them out of the database yourself, for instance by the platform tag. EnCase won't do it for you.
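A rough sketch of that filtering step in Python, done before importing anything into EnCase. It assumes the comma-separated RDS text files with a header row, where the OS table has `OpSystemCode` and `OpSystemName` columns and the file table has `SHA-1` and `OpSystemCode` columns; check the header lines of your own copy, as the layout may differ between RDS versions. The function names are made up for this example.

```python
import csv


def os_codes_matching(os_lines, keywords):
    """Return the OpSystemCode values whose OS name contains any keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    codes = set()
    for row in csv.DictReader(os_lines):
        name = row["OpSystemName"].lower()
        if any(keyword in name for keyword in lowered):
            codes.add(row["OpSystemCode"])
    return codes


def hashes_for_os(file_lines, os_codes):
    """Return the SHA-1 values of file records tagged with one of os_codes."""
    return {
        row["SHA-1"]
        for row in csv.DictReader(file_lines)
        if row["OpSystemCode"] in os_codes
    }
```

Both functions take any iterable of text lines, so you can pass an open file object, e.g. `open("NSRLOS.txt", newline="")` (file name assumed), and then write the resulting hashes out in whatever import format you need.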
I would suggest that Keydet89's response listing that google link is rather a poor one.
The book referred to there covers hash sets only at a very basic level, as it does many areas of EnCase, more for the sake of inclusion, I suspect, than to provide any real reference. That makes it suitable for nothing more than a brief introduction to the subject, which you must already have had to be asking such questions anyway.
I am pursuing several questions myself on the subject (Hash Sets, EnCase 6.8) as it happens, for the case I am currently working on, and wish to credit Athulin for supplying a good and comprehensive reply to the original question.
I had posted a definition of the NSRL (and Hashkeeper) and was shortly thereafter contacted by Doug White of NIST to clarify some misconceptions I had, in particular my understanding of the Known Good vs. Known Bad issue you referenced. Their FAQ states it nicely, and here's the relevant section of my post:
Known (vs. Known Good and Known Bad)
Also, the NSRL does not identify software as good or malicious but instead provides a simple automated file classification. The reasoning for this approach is that software can be malicious in some settings and not in others. If you are using the NSRL to eliminate files you must analyze during an investigation, it is important to review the “ApplicationType” field and only eliminate files you deem irrelevant. The NSRL FAQ nicely sums things up by stating the files in the NSRL database are “known” - NOT “known good” OR “known bad” - just “known application files.”
So basically you have to filter on ApplicationType to tailor your results. Also, bear in mind that all NSRL entries are traceable to the original manufacturer install media, which is a big plus. My full definition is at
http//
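For what filtering on ApplicationType might look like in practice, here is a minimal Python sketch. It assumes the RDS product table is a comma-separated text file with a header row containing `ProductCode` and `ApplicationType` columns, and that the file table has `SHA-1` and `ProductCode` columns; verify that against your own download. The function names and the example type strings are illustrative only, and the match is on the whole field, case-insensitively.

```python
import csv


def product_codes_by_type(prod_lines, wanted_types):
    """Return ProductCode values whose ApplicationType is in wanted_types."""
    wanted = {app_type.strip().lower() for app_type in wanted_types}
    return {
        row["ProductCode"]
        for row in csv.DictReader(prod_lines)
        if row["ApplicationType"].strip().lower() in wanted
    }


def hashes_for_products(file_lines, product_codes):
    """Return SHA-1 values of file records belonging to the given products."""
    return {
        row["SHA-1"]
        for row in csv.DictReader(file_lines)
        if row["ProductCode"] in product_codes
    }
```

You would pick the types you consider irrelevant to your case, collect their product codes, and eliminate only the hashes belonging to those products, leaving everything else in the examination.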