NSRL Hash lists

30 Posts
12 Users
0 Reactions
5,044 Views
Jamie
(@jamie)
Moderator
Joined: 5 years ago
Posts: 1288
 

Hello Harlan, I wondered if the image of me in a skirt would bring you out of the woodwork.

You and I have been over this ground many times before and you know my position (and please, the "I was only trying to help" line is getting old).

So, no more of the same if you want to continue posting, I really can't put it much plainer - or more politely - than that.

Jamie


   
(@seanmcl)
Honorable Member
Joined: 19 years ago
Posts: 700
 

I started working with NSRL hash sets, and I'm looking for articles, info, software that could help me with the comparison.

Comparison of what? The NIST hashsets can be very valuable if you understand how they are created, how they can be used, and the limits to which you can rely on them.

Commonly, I use the NIST hash sets to eliminate many "known" files from additional investigation, especially when I am looking for evidence masquerading as system files. But I also augment these with downloads from sites such as Microsoft (which offers monthly ISO CD/DVD images of security hotfixes) and other sources, since the hashsets may not be as up to date as needed to analyze a recently patched system (nor should they be expected to be).

Put another way, just because something was not identified as being a known file by the NIST hashsets, doesn't mean that it isn't legitimate. The hashsets are but one tool to be used to focus the investigation.

They are not a substitute for good judgement, experience or skill.

And if I may make a final point, the creation of resources such as the NIST hash sets is not a trivial task. So it is no surprise that they cannot and will never be contemporary with the latest hot fixes.

Because the NIST hash sets include files that I never use and rarely see, they are a good resource with which to filter out those things which are not commonly encountered but are benign.

But I, and many others, also make hashes of our own systems (e.g., fresh installs, OEM installs), to complement what is contained in NIST. These may not be as rigorously vetted as are the NIST hashes (which is why they should not be released to the community as a whole), but they can be very helpful to an investigator trying to find the needle in the haystack.
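The filtering described above can be sketched in a few lines of Python. This is only an illustration, not how any particular forensic tool does it: it assumes the known hashes have already been loaded into a set of uppercase SHA-1 hex strings, and the names sha1_of_file and split_known_unknown are made up for the example.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the uppercase hex SHA-1 of a file, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def split_known_unknown(root, known_hashes):
    """Partition files under root into (known, unknown) lists by SHA-1.

    known_hashes is assumed to be a set of uppercase SHA-1 hex strings,
    e.g. NSRL hashes merged with hashes from your own reference installs.
    """
    known, unknown = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        if sha1_of_file(path) in known_hashes:
            known.append(path)
        else:
            unknown.append(path)
    return known, unknown
```

Anything that lands in the unknown list is only "not in the set", not "suspicious"; it is simply what is left to look at.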


   
(@dwhitenist)
Active Member
Joined: 15 years ago
Posts: 11
 

There are a few videos on YouTube of importing NSRL into Encase, FTK, etc. done by random people. What do you wish to learn? Drop us an email - nsrl at nist dot gov - and we can lend a hand.

Doug

Hey guys,

I started working with NSRL hash sets, and I'm looking for articles, info, software that could help me with the comparison.

Thanks in advance.


   
(@dwhitenist)
Active Member
Joined: 15 years ago
Posts: 11
 

Very good points about the NSRL. We don't often hear specific feedback like this.

We are working to address some of these areas. We are no longer limited to hardcopy, shrinkwrapped apps; we have a process for acquiring download/clickwrapped apps, including the MS updates. We have started a process of installing OSes and patches on VMs and harvesting the files from those; input on what OS is most useful to you will help us focus this work. (Is W2K still relevant? Vista Home? 32bit versions? etc.) Would faster updates be useful - rather than getting 1M new hashes every 3 months on CD, would 200K every 2 weeks via downloads serve you better?

Again, I appreciate hearing from NSRL users, and don't be shy about suggestions or complaints. Doug - nsrl at nist dot gov

"I'm pullin for ya, we're all in this together" - Red Green
http://youtu.be/LqK7H_XTd00

But I also augment these with downloads from sites such as Microsoft (which offers monthly ISO CD/DVD images of security hotfixes) and other sources, since the hashsets may not be as up to date as needed to analyze a recently patched system (nor should they be expected to be).

Put another way, just because something was not identified as being a known file by the NIST hashsets, doesn't mean that it isn't legitimate.

And if I may make a final point, the creation of resources such as the NIST hash sets is not a trivial task. So it is no surprise that they cannot and will never be contemporary with the latest hot fixes.

But I, and many others, also make hashes of our own systems (e.g., fresh installs, OEM installs), to complement what is contained in NIST.


   
(@jonstewart)
Eminent Member
Joined: 16 years ago
Posts: 47
 

Out of curiosity, how much time does it typically take to import the NSRL into EnCase/FTK/X-Ways?

Thanks,

Jon


   
(@douglasbrush)
Prominent Member
Joined: 16 years ago
Posts: 812
 

Doug,
Aside from your awesome name, it is great to have feedback on the NIST process and a channel to offer insight.

I use the NSRL set in EnCase consistently for both DF/IR cases and e-discovery. The biggest current issue I am experiencing is that there are more and more "known" files each day in both apps and OSes. In the past, de-NISTing an XP box with the standard kind of apps, I could cut out 30-40% of the files. Now on a Vista or Win 7 box with lots of downloaded and updated apps (Firefox, for example - we should be at version 15 by Friday) that drops to about 15%. But I still do a lot of XP and am now adding more and more Win 7, so I really do need as much as I can get.

It would be nice to have more frequent updates. I can't imagine the time it takes to build out new sets, test and verify - it's a lot of work. I wouldn't expect weekly, but maybe somehow a way to pull down individual sets of apps or OSes to update my current lists.

What about some sync method? Could that work? Going through it in my head, the overhead of management probably outweighs the practicality.

I do agree with Sean - you have to build your own lightsaber. If you start gravitating to certain types of work and investigations you will start finding your own common "known" files to de-NIST and filter, so it's good practice to build your own sets and not rely on someone else entirely. It's like anything else in forensics: you employ a lot of common methods and tools, but ultimately you should employ things that allow you to make the proper evaluations.
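For anyone wanting to try the "build your own set" idea, here is a rough Python sketch, under the assumption that you have a mounted reference install (a fresh OS build, a standard corporate image) to walk. The function names hash_tree and reduction_percent are invented for this example, and the percentage is just the share of case file hashes that the known set eliminates, the same kind of figure as the 30-40% and 15% quoted above.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Return the set of uppercase SHA-1 hex digests for every file under root.

    Reads each file fully into memory, which is fine for a sketch but not
    for very large files.
    """
    hashes = set()
    for path in Path(root).rglob("*"):
        if path.is_file():
            hashes.add(hashlib.sha1(path.read_bytes()).hexdigest().upper())
    return hashes


def reduction_percent(case_hashes, known_hashes):
    """Percentage of case files (one hash per file) found in known_hashes."""
    case_hashes = list(case_hashes)
    if not case_hashes:
        return 0.0
    eliminated = sum(1 for h in case_hashes if h in known_hashes)
    return 100.0 * eliminated / len(case_hashes)
```

A combined filter is then just a set union, e.g. known = nsrl_hashes | hash_tree("reference_install"), where nsrl_hashes and the path are placeholders for whatever you have loaded.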


   
(@dwhitenist)
Active Member
Joined: 15 years ago
Posts: 11
 

NIST imports every release candidate hashset into several tools as part of our internal QC process. NIST also provides pre-release access to vendors for testing with the candidate hashsets. The NIST imports are done on a fairly high-end PC, usually with the current releases of the software. (My apologies, I don't have the PC specs and tool versions handy right now)

Given the set of 4 CDs, we find it takes roughly 2 hours to import into any given tool in our environment. I'd like to know times from other investigators.

We are working with vendors to make that process simpler and faster, by building native database format releases which we might distribute from the NIST NSRL download webpage.

Depending on the investigator's need, we provide reduced-size hashsets.
http://www.nsrl.nist.gov/Downloads.htm#reduced
The 4 CD set for RDS 2.33 contains information about 66M files, all of which is imported despite the fact that only 20M of those files are unique. We offer a "minimal" set containing info on only 20M files - if you don't care about all the possible sources of those files. We offer a "unique" set containing info on only 12M files - these are the files appearing only once in every app we have processed, if that is your need.
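To see how much of a full set is duplicated hash information before importing it, something like the following Python sketch could be run over the hash file. It assumes the file is comma-separated text with a header row that includes a "SHA-1" column; check the header of the release you actually have. The function name count_records_and_unique is made up for the example.

```python
import csv


def count_records_and_unique(lines, hash_column="SHA-1"):
    """Return (total records, distinct hash values) for CSV rows with a header.

    lines can be an open text file or any iterable of CSV lines.
    The "SHA-1" column name is an assumption about the file layout.
    """
    total = 0
    seen = set()
    for row in csv.DictReader(lines):
        total += 1
        seen.add(row[hash_column].upper())
    return total, len(seen)
```

On a full release the first number corresponds to the record count being imported (the 66M figure above) and the second to the distinct hashes (the 20M figure), though holding tens of millions of hashes in a Python set takes a fair amount of memory.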

Out of curiosity, how much time does it typically take to import the NSRL into EnCase/FTK/X-Ways?

Thanks,

Jon


   
(@jonstewart)
Eminent Member
Joined: 16 years ago
Posts: 47
 

Doug, thanks. I was wondering if the EnCase import took too long whether there'd be a way I could write a faster conversion script, but two hours doesn't seem too bad.

I am confused, though, by the difference between the minimal set and the unique set. Can you explain that further?

Thanks,

Jon


   
(@dwhitenist)
Active Member
Joined: 15 years ago
Posts: 11
 

I am confused, though, by the difference between the minimal set and the unique set. Can you explain that further?

Jon,

Yes, and that is sometimes hard to explain.

An example would be NSRL has 15,000+ software apps in the library. We have 5,000 MSDN DVDs. We have 20,000 files named "notepad.exe" on those DVDs - basically 4 copies on every DVD. Let's assume the 4 files on each DVD have the same SHA-1 & MD5, and that MS doesn't modify the notepad.exe file very often, say every 500 releases.

In our database, which you all don't see, we have 20,000 metadata records about "notepad.exe".

However, we publish the 4 CD set based on software distros, and we only publish one example of any file in a distro. So the May 2011 MSDN DVD will be tied to only 1 example of "notepad.exe", June to 1 example, etc. This means you will have 5,000 examples on the 4 CDs.

But MS only changed the file 10 times over the 5,000 DVDs - the SHA-1 and MD5 are duplicated many times on the 4 CD release.

That's where the "minimal" set comes into play. If you only want to eliminate a file based on the hash, and you don't care what distro it came from, the minimal set just has the 10 SHA-1 and MD5 hashes of the "notepad.exe" files. You don't import 4,990 duplicate pieces of info.

Let's say one of those 10 "notepad.exe" files was an intermediate bugfix, that it only came out in Oct 1999. Only one of its kind on any MS release ever.

The "unique" hashset release would only have the Oct 1999 "notepad.exe" info, because the other 9 in the "minimal" release came from many DVDs. If you found a PC containing a hash that matched the "unique" notepad.exe, you would be positive that (as far as NIST knows) it could only have come from the Oct 1999 MSDN DVD.

This is the best way I have right now of explaining it.

One of the reasons we do this is because we have a piece of steg malware containing a certain DLL. That DLL is also used by a 3D home design app and a kids' dollhouse game. If you find that DLL, it doesn't necessarily mean the suspect had a steg app installed, but that's a flag for the investigator to check things out at another level if you are using the 4 CD set. If you trigger off a hash from the "unique" set, and it says it came from malware, then, as far as NIST has collected, yes, it did.
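The notepad.exe example above can be put into a small Python sketch. This is one reading of the explanation, not NIST's actual process: it assumes the full release can be treated as (hash, distro) pairs, that "minimal" means every distinct hash with the source dropped, and that "unique" means hashes seen in exactly one distro. The function name and the sample values are invented.

```python
from collections import defaultdict


def minimal_and_unique(records):
    """records: iterable of (sha1, distro) pairs, one per published file entry.

    Returns (minimal, unique):
      minimal - set of every distinct hash, source ignored
      unique  - dict of hashes seen in exactly one distro -> that distro
    """
    distros_by_hash = defaultdict(set)
    for sha1, distro in records:
        distros_by_hash[sha1].add(distro)
    minimal = set(distros_by_hash)
    unique = {
        sha1: next(iter(distros))
        for sha1, distros in distros_by_hash.items()
        if len(distros) == 1
    }
    return minimal, unique


# Toy data: one notepad.exe build shipped on two DVDs, one on a single DVD.
records = [
    ("AAA", "MSDN 1999-09"),
    ("AAA", "MSDN 1999-11"),
    ("BBB", "MSDN 1999-10"),
]
minimal, unique = minimal_and_unique(records)
```

Here minimal holds both hashes, which is all you need for elimination, while unique holds only "BBB", the one hash that points back to a single source. A shared DLL like the one in the steg example would be in minimal but not in unique.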


   
(@jonathan)
Prominent Member
Joined: 20 years ago
Posts: 878
 

We offer a "unique" set containing info on only 12M files - these are the files appearing only once in every app we have processed, if that is your need.

First of all it's great to have someone from NIST posting here - hope you stick around!

Using the NSRL lists in X-Ways is something I do at the beginning of every case I look at; it often reduces the total number of files by around a third, so I see the library as an invaluable tool. Would love to have more regular updates if that's possible; many of the investigations I do involve examining computers that were in use only yesterday or even this morning! If it doesn't exist already, how about setting up an email list/Twitter feed so people could be informed of when updates are available?

With regard to the 'unique' hash sets would I be correct in saying that these contain hashes of all the known files in your library and should be used if I am not interested in their origin?

Edited - after your intermediary post it seems that the 'minimal' set is the one for me.


   