Reviewed by Jonathan Krause of Forensic Control
Well, this is an interesting proposition. Early last December Nuix, the respected producers of eDiscovery software, released an intriguing, and as far as I know in this sector, unique, application. Called Proof Finder, it’s a restricted and limited version of their enterprise product which they’re making available for just $100 (approximately £65 GBP). It’ll be available for 10 weeks (or until 1000 licenses are sold) and is capable of investigating up to 10GB of data per case. Best of all, 100% of proceeds go to the charity Room to Read, who work to increase literacy skills of children in developing countries. Make no mistake, Nuix deserve high praise for this project – it’s a great idea and a brilliant piece of marketing as well.
I bought it as soon as I read about it. What could go wrong? As it turns out, very little: the only issues I have had with it were licensing-related. Proof Finder is the first product with which Nuix has used dongle-less licensing. The license code they were initially using proved overly sensitive to changes in hardware and would report that no license was present if, for example, an external hard disk was changed; not an uncommon occurrence in computer forensics. As far as I’m aware this licensing issue has now been resolved. The license is not transferable between machines, but its low cost makes buying one copy for your workstation and one for your laptop not an unreasonable proposition.
I found using Proof Finder to be both logical and intuitive. The first step is unsurprising in that you create your case and then add the evidence. There are two ways to go about adding evidence if we assume your starting point is an image file. The first method is to add E01, L01, DD or AD1 files to Proof Finder directly and let it extract the file types you specify; the second is to use another tool to carve the files first. I used X-Ways to carve out the file types I wanted, as it retains exported files’ metadata, and then pointed Proof Finder to the folders containing these files. For most computer forensic and eDiscovery cases the second method may be more inclusive, as data can be extracted from unallocated space, swap space and slack space first – something that Proof Finder can’t do. Proof Finder supports adding images of FAT32, NTFS, EXT2 and EXT3 file systems – other file systems, such as those used by Apple computers or FAT16, would need to be pre-carved using another forensic tool.
10GB isn’t much, is it? I think for a lot of scenarios it will be fine. In my test case I extracted every edb, eml, emlx, ost, pst, doc, docx, pdf, xls and xlsx file from a MacBook and four PCs, which in total came to under 10GB. To see if I could stretch Proof Finder, the PC images were of Greek computers – I can report the non-Roman character sets did not faze the program at all.
Figure 1: setting the case parameters
One thing to note is that the directory selected for the case index must be on a local drive and not a network one; evidence files, while not required to be accessed locally, do benefit from being on a local drive. The Advanced Settings tab, below, allows the configuration of some more advanced options.
Figure 2: now it’s getting a bit more interesting…
For example, selecting “Extract named entities” will extract IP addresses, Social Security numbers, credit card numbers, etc. A skin tone analysis option is not something you’d necessarily associate with an eDiscovery tool but its inclusion is welcome. I’m not sure whether this would be as useful as a black-and-white image detector that could help filter scanned documents. The “Parallel Processing” tab allows the user to set the number of ‘workers’ (Nuix speak for processor cores). The maximum number of workers Proof Finder can use is two. This tab is also where the amount of memory allocated per worker can be set. Nuix advise that 4GB RAM per worker should be used; two workers at 4GB each come to 8GB, so a 64-bit machine with 12GB RAM, which leaves some headroom for the operating system, is what Proof Finder would be happy with.
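To give a feel for what the “Extract named entities” option mentioned above involves, here is a minimal Python sketch. It is not Nuix's implementation, which is not described in this review; it simply assumes regular expressions for IPv4 addresses, US Social Security numbers and card numbers, with a Luhn checksum to discard digit runs that cannot be real card numbers. The pattern names, the function names and the sample text are all made up for illustration.

```python
import re

# Illustrative patterns only; these are assumptions, not Nuix's own rules.
OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 16 digits, optionally separated by single spaces or hyphens.
CARD = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def extract_entities(text: str) -> dict[str, list[str]]:
    """Collect candidate IP addresses, SSNs and card numbers from text."""
    return {
        "ip": IPV4.findall(text),
        "ssn": SSN.findall(text),
        "card": [m for m in CARD.findall(text) if luhn_valid(m)],
    }


# Made-up sample text; 4111 1111 1111 1111 is a widely used test card number.
sample = (
    "Login from 192.168.0.10, SSN on file 123-45-6789, "
    "paid with 4111 1111 1111 1111 and not with 1234 5678 9012 3456."
)
found = extract_entities(sample)
print(found)
```

The checksum step matters because long runs of digits are common in documents, and a bare pattern would flag many of them as card numbers.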
Figure 3: many different forms of evidence can be added to a case
Adding evidence can be done by using drag and drop or by navigating to folders. The “Add Mail Store” option does not refer to Exchange EDB files or to Lotus Notes NSF files (these are instead added by using “Add Files” or “Add Folders”) but instead to IMAP, POP, and GroupWise resources from which Gmail, Yahoo, AOL etc. can be harvested.
Before processing case data, pre-filter options are available so that only data of interest is processed. For example, it allows drilling down into E01 files to select just the user directories, or into an Exchange EDB to select the custodian mailboxes of interest. During processing Proof Finder extracts and indexes all content and metadata, hashes it with MD5 (allowing for de-duplication) and stamps each item with a GUID and item ID. The list of files that Nuix can process is pretty impressive; at present over 250 file types can be dealt with. See Proof Finder Supported Files for the current list.
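The idea behind hash-based de-duplication can be sketched in a few lines of Python. This is only an illustration under simple assumptions (items held in memory, identical content meaning identical MD5 digests), not a description of how Proof Finder does it; the function names and the sample items are hypothetical.

```python
import hashlib
from collections import defaultdict


def md5_of_bytes(data: bytes) -> str:
    """Return the hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def group_by_hash(items: dict[str, bytes]) -> dict[str, list[str]]:
    """Map each MD5 digest to the names of all items with that content."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name, content in items.items():
        groups[md5_of_bytes(content)].append(name)
    return dict(groups)


# Hypothetical items; names and contents are made up for illustration.
sample = {
    "inbox/report.doc": b"quarterly figures",
    "sent/report-copy.doc": b"quarterly figures",
    "inbox/notes.txt": b"meeting notes",
}

# Any digest shared by more than one item marks a set of duplicates.
duplicate_sets = [
    names for names in group_by_hash(sample).values() if len(names) > 1
]
print(duplicate_sets)
```

Items whose digests match can then be reviewed once rather than several times, which is the practical benefit of de-duplication in a large mail or document set.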
If a problem is encountered with processing, go to the Help Menu, select System Diagnostics, save to File and send the resultant zip file to [email protected]. A nice feature.
Figure 4: processing in action
On my workstation, which was running Windows 7 64-bit with 12GB RAM and an Intel Core i7 920 processor, Proof Finder took 50 minutes to process just under 130,000 items totalling 9GB in size, which in my opinion compares very favourably with competing products.
Figure 5: the user interface
Once processing finishes, an easy-to-navigate user interface is populated where data can be searched and refined before exporting those items of interest. The left-hand side of the interface lists evidence added, evidence custodians and processed item types (which lists, amongst other things, file types, encrypted files, files with bad extensions and what languages the file contents are in). From here it’s possible to tick-select items (evidence and item types) to show the corresponding files in the results panel, such as requesting to display only the Excel files associated with the custodian J Smith. The data in the results panel (the middle panel in Figure 5) can be viewed in a number of formats, including as thumbnails, in table format as shown, as a word list showing the frequency of all words in the selected data, or as an event map graphically showing relationships.
The panel on the right side is the preview area where any selected item can be viewed in detail. Tabs in this panel enable views of just the text of the file (as in Figure 5, with a particular keyword highlighted), the file’s metadata, a PDF view of the file, a native view of the file (for example in Outlook or Word) and a word list, listing each word which appears in the file and its total number of occurrences.
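A word list of the kind described above is easy to picture as a frequency count. The following Python sketch is an assumption about the general technique, not Proof Finder's code: it lower-cases the text, splits it into runs of word characters and counts each one. The function name is hypothetical.

```python
import re
from collections import Counter


def word_list(text: str) -> list[tuple[str, int]]:
    """Return (word, count) pairs, most frequent first."""
    # \w is Unicode-aware in Python 3, so non-Roman scripts are counted too.
    words = re.findall(r"\w+", text.lower())
    return Counter(words).most_common()


english = word_list("The invoice was sent. The invoice was paid.")
greek = word_list("Καλημέρα καλημέρα")
print(english)
print(greek)
```

Because the pattern is Unicode-aware, the same function handles Greek text as readily as English.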
Panels can be moved around, closed or moved to a second monitor, and can be returned to their default layout by clicking Reset Layout under Window on the menu bar. Running across the top of the panels is a search bar; entering a term here will return matching results instantaneously. Selecting an item and viewing it in the preview panel highlights the search terms in question. The top of the preview panel lists the item’s path, how many duplicates it has, how many near duplicates there are, related items and similar items, along with options to de-duplicate and filter by date. The Advanced Search button allows for the building of more complex searches, including keyword lists and the use of regular expressions.
Once files of interest have been selected they can easily be exported – email messages can be exported in PST, NSF, MBOX, EML and HTML formats, amongst other options. Reports can include some or every available piece of metadata if required.
It feels a little uncharitable criticising such a full-featured piece of software which is so cheap and in aid of such a good cause. Nevertheless, I will. There are two small enhancements I’d like to see. The first is a ‘help’ button on each option box which would bring up definitions, explanations and the consequences of selecting particular options. The second is a way to view how much of the 10GB limit is being used – perhaps this is already detailed somewhere but I couldn’t find it. Still, these are minor points which the rest of Proof Finder’s features massively outweigh. Some of the things I didn’t have the space to mention in this review are the ability to import NSRL hash sets, the scripting language Proof Finder supports and the ability to look at historical searches conducted minutes, hours or days ago.
In conclusion, this is a wonderfully useful piece of software that is going to form an important part of my computer forensic and eDiscovery toolkit. At only $100 per year it represents truly amazing value for money – even more so when you take into account that all proceeds go to charity. If you’ve not bought it as yet, do so and do so soon; the 1,000 copies it’s limited to won’t last long.
Reviewer: Jonathan Krause
Version reviewed: Proof Finder 3.6.2
Cost: $100.00 USD
8 thoughts on “Review: Proof Finder by Nuix”
Thanks for this review!
There’s one major drawback in your article, that I’ve read otherwise as well – but this is the main reason you might not want to buy the prooffinder licence:
According to http://www.prooffinder.com/terms-and-conditions , you can only analyze 10 GB _in total_, not 10GB per case:
“Data Set / Limit: No limit on the number of cases processed however the total cumulative expanded data is limited to 10GB.”
So even if this might not be hardcoded in the licence (which I don’t know), at least you’re breaking your licence agreement if your next case is bigger than 1GB.
Good point Marek, glad to see that Nuix have moved on from the Robert Agius scandal.
(No lies I work for Nuix). There is no cumulative counter as mentioned by “Marek” and despite the legalese it is 10gb per case. Unlimited cases.
Version 2 (just released in March) [And no I’m not a sales guy] has increased the cap to 15GB (increased by request) and will run until we reach the goal.
Thanx for the update concerning the case volume. I guess the terms&conditions should then be rewritten, as they still say:
“Data Set / Limit: No limit on the number of cases processed however the total cumulative expanded data is limited to 15GB.”
What does “cumulative” mean when in the same sentence describing that the number of cases is not limited, but the amount of data is? There could easily be added that this is “cumulative expanded data per case is limited to 15GB”.
Just my 2ct….
I’ve to reply to myself: I’ve just talked to a nuix guy and addressed the issue that the sentence is ambiguous.
He also confirmed that a single case can have up to 15 GB in volume, and there is no overall counter of the amount of data that was processed in subsequent cases.
Wonderful review. Is there any follow-up review of the recently released V2 of Proof Finder?
A well done review. I see that it’s possible to collect Yahoo and Gmail email data with Proof Finder. Is it possible to limit those collections to specific folders within the mailboxes?
Yes… However… with Yahoo, in order for the service to be exposed you need to upgrade to the premium version (believe it’s called Yahoo Premium… or the pro version).
As for gmail yes it’s possible, however…. (a stickler) you need to enable imap and imap propagation of all the labels in order to grab everything, otherwise (iirc) it defaults to a pop3 grab which is inbox only.