The biggest issue I'm running into is that there are more and more "knowns" each day, in both apps and OSs. In the past, de-NISTing an XP box with the standard apps could cut out 30-40% of the files. Now, on a Vista or Win 7 box with lots of downloaded and updated apps (Firefox, for example - we should be at version 15 by Friday), that drops to about 15%. But I still do a lot of XP and am adding more and more Win 7, so I really do need as much as I can get.
It would be nice to have more frequent updates. …
I do agree with Sean - you have to build your own lightsaber. …
Always nice to meet another Doug. D (Be careful, there is another "Doug White" in computer forensics, but he's the good twin. We have met.)
Yes, that downward hit rate has been the bane of our existence, for reasons we can remedy (given time and computing power) and for some reasons we can't (dang kids with $80/TB drives). When we ramp up the VM installs, we - and you - should see a big improvement. It looks good (80%-90%) in proof of concept tests. Real world, we'll see.
The releases are tied to the CD mastering process. It's about 2 weeks from when we hit the MAKE button to when we send out the QC'ed master CDs to duplicating and mailing. Internally, we have an oversight process where we check hashsets on a per-software app basis and vet them to be in the next CD burn. If we didn't burn, that vetting could release an app's hashset to a download service immediately.
Rolling your own - for a period, we had some tools that anyone could use to build an RDS-format collection, and NIST would host "hashsets of interest to the community" which NIST couldn't verify. That fell by the wayside, but if there is anything we can do to help, give us a call.
Thanks, Jonathan.
I mentioned updates in another post, I don't see them being more frequent than every 2 weeks, realistically. And there are policy hoops to jump through.
I am on twitter, and mention the releases. dwhitenist - http//
Minimal seems to be the appropriate one, yes.
First of all it's great to have someone from NIST posting here - hope you stick around!
Using the NSRL lists in X-Ways is something I do at the beginning of every case I look at; it often reduces the total number of files by around a third, so I see the library as an invaluable tool. Would love to have more regular updates if that's possible; many of the investigations I do involve examining computers that were in use only yesterday, or even this morning! If it doesn't exist already, how about setting up an email list/Twitter feed so people could be informed when updates are available?
With regard to the 'unique' hash sets, would I be correct in saying that these contain hashes of all the known files in your library and should be used if I am not interested in their origin?
Edited - after your intermediary post it seems that the 'minimal' set is the one for me.
Hey guys,
I didn't have internet access for the past couple of days, hence the delay in my response.
First of all, thank you to everyone who replied - and by replied I mean wrote something useful, not "google it", which in a way insults someone's intelligence. I won't waste your time with questions without proper research beforehand.
I can see how my question was far from detailed, to say the least. What I meant was that I want to take the NSRL hash sets and compare them to an image without using EnCase or FTK. I'm interested in a list of "known good" files: I want to create a list of files that are not in the lists or that have a "bad" signature. I'm OK with a lot of info as a starter.
dwhitenist - thank you, and if you could point me to some info regarding the best way of comparing the RDS to a list of files with their MD5s and SHA-1s from an image, that would be great. Again, preferably without the use of EnCase or FTK.
Thank you.
I can give a few suggestions, which depend on your long-range needs.
If you are going to be using the list for quite some time, doing many searches, and collecting other information (this appears to be your intent), then I would suggest you import the RDS NSRL*.txt files into a database, e.g. Access, PostgreSQL, etc. and use the good 3rd party tools available.
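To illustrate the database route, here's a minimal sketch using Python's built-in sqlite3 module. The two sample rows are synthetic stand-ins in the classic RDS NSRLFile.txt layout ("SHA-1","MD5","CRC32","FileName","FileSize","ProductCode","OpSystemCode","SpecialCode"); in practice you'd stream the real NSRLFile.txt from the RDS discs instead.

```python
import csv
import io
import sqlite3

# Synthetic rows in the RDS NSRLFile.txt layout; replace with the real file.
SAMPLE = '''"SHA-1","MD5","CRC32","FileName","FileSize","ProductCode","OpSystemCode","SpecialCode"
"0000004DA6391F7F5D2F7FCCF36CEBDA60C6EA02","0E53C14A3E48D94FF596A2824307B492","AA6A7B16","00br2026.gif","2226","228","358",""
"000000A9E47BD385A0A3685AA12C2DB6FD727A20","176308F27DD52890F013A3FD80F92E51","D749B562","femvo523.wav","42748","4887","358",""
'''

conn = sqlite3.connect(":memory:")  # use a file path for a persistent database
conn.execute("""CREATE TABLE nsrl (
    sha1 TEXT, md5 TEXT, crc32 TEXT, filename TEXT,
    filesize INTEGER, product_code TEXT, os_code TEXT, special_code TEXT)""")

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)                        # skip the header row
conn.executemany("INSERT INTO nsrl VALUES (?,?,?,?,?,?,?,?)", reader)
conn.execute("CREATE INDEX idx_sha1 ON nsrl (sha1)")  # index makes lookups fast

# Look up a suspect hash the way a forensic tool would.
hit = conn.execute("SELECT filename FROM nsrl WHERE sha1 = ?",
                   ("0000004DA6391F7F5D2F7FCCF36CEBDA60C6EA02",)).fetchone()
print(hit[0])   # → 00br2026.gif
```

Once the table is indexed, repeated lookups are near-instant, which is the payoff over re-grepping the flat file for every case.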
On the quick-and-dirty side, if you are familiar with unix/linux, you can use a few simple commands to do what you need on an ad hoc basis. If you have a list of MD5s or SHA-1s in a file called "suspects.txt", the command "fgrep -f suspects.txt NSRLFile.txt" will show the matching hashes. The command "cut -c 2-41 NSRLFile.txt | sort -u" will give you the deduplicated list of SHA-1s NSRL contains (plain "uniq" only collapses adjacent duplicates, so sort first), and you can do some pre-filtering with that set. The cut, grep, fgrep, sort, uniq, and comm commands can do a good bit of what you are asking. Also awk, if you want to get your hands very dirty. You're on your own on Windows - I'd suggest Cygwin (sorry).
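For a cross-platform take on the same idea, a short Python sketch can hash the files from a mounted image and keep only those NSRL has never seen. This is illustrative only: the throwaway directory and file names below stand in for a mounted evidence image, and the known-hash set stands in for the output of "cut -c 2-41 NSRLFile.txt | sort -u".

```python
import hashlib
import os
import tempfile

def sha1_of(path, bufsize=1 << 20):
    """Hash a file in chunks so large evidence files don't exhaust RAM."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        while chunk := f.read(bufsize):
            h.update(chunk)
    return h.hexdigest().upper()

# Demo data: a throwaway directory standing in for a mounted image.
root = tempfile.mkdtemp()
with open(os.path.join(root, "known.txt"), "wb") as f:
    f.write(b"hello world\n")
with open(os.path.join(root, "mystery.bin"), "wb") as f:
    f.write(b"\xde\xad\xbe\xef")

# Pretend this set was loaded from the NSRL SHA-1 column.
known_hashes = {sha1_of(os.path.join(root, "known.txt"))}

unknown = sorted(
    name for name in os.listdir(root)
    if sha1_of(os.path.join(root, name)) not in known_hashes
)
print(unknown)  # only the file the hash set has never seen
```

The set-membership test is the whole trick: de-NISTing is just "discard everything whose hash is in the known set," so an in-memory set (or an indexed table, for bigger collections) is all you need.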
You may also want to look at Sleuthkit - the hfind command seems to work well with the NSRL hashsets.
Just wanted to come back and say thanks dwhitenist. Your last post was really helpful.
We are working with vendors to make that process simpler and faster, by building native database format releases which we might distribute from the NIST NSRL download webpage.
Dwhitenist, We are interested in submitting the NSRL list from OSForensics in native format if you go down this path. What is the best way to arrange this?
At the moment the import of the 4 CDs takes about 10 hours. So having a native database can save a lot of time.
NIST also provides pre-release access to vendors for testing with the candidate hashsets.
This would also be interesting for us.
Government Modernizing Software Forensics Database
http//
Doug- you 'da man! Thanks for your continued hard work on this.
dwhitenist,
Does NSRL contain hashes for patches, updates, hotfixes or whatever you want to call them? Or are the hash sets made only from the install media as shipped by the developer?
TonyC
Sorry about the delay answering - busy few weeks. For both of these activities, drop an email to nsrl at nist dot gov and we'll get it worked out.
Yes, NSRL collects the updates and patches. BUT those are still not installed.
Historically, NSRL harvested the files from the purchased (shrinkwrapped) CD/DVDs. We now have a procedure to include downloaded (clickwrapped) software, including the updates, service packs, … Those are still handled through the usual chain of cracking open the .CABs, .ZIPs, etc., but not installing them.
We have made images of our original CD/DVDs on a SAN, so now we have begun a project that allows us to take a virtual machine, point it at the disc image of Windows 7 (for example), install the OS, take snapshots of the files, registry, RAM, and track changes through updates and installation of other apps. We are starting slow so we don't get overwhelmed by the data. We should be able to apply the patches to the VMs, feed the hashes back into the NSRL, and get better hit rates.