Why False Positives Are Important

By Jamie McQuaid

Most forensic examiners are familiar with seeing false positives in their search or processing results. False positives will always be present in tools that conduct some form of data carving in their searching and/or processing.

I often get questions from forensic examiners (both new and experienced) on whether the data that IEF or AXIOM has found is valid. Without seeing the data myself, it’s quite difficult to determine the validity of the information, so I’ll typically respond with several follow-up questions to understand what the examiner is seeing. This helps me assess the likelihood of the data being either valid or a false positive.

Most forensics tools either parse out structured data from a file system and present it to an examiner, or search sector by sector, carving out files (or data) from unstructured evidence. Many tools do both.

Parsing
Parsing data from an MFT or root directory will have very few false positives because the structure of the file system is usually well defined and there are many checks and balances to ensure that the data being analyzed is represented exactly as expected.

Carving
Carving unstructured data is fuzzier, and the likelihood of false positives increases quite a bit. Most tools carve data based on a set of signatures: a header, a footer, or any other unique identifier or constant that signifies that the data belongs to a particular app or artifact, regardless of the file names, paths or other file system attributes that typically assist in classifying the data.
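To make the idea concrete, here is a minimal sketch of signature-based carving in Python. It is not how IEF or AXIOM work internally; the function name carve_offsets and the synthetic buffer are assumptions made up for illustration. It simply reports every offset where a known header (here the common JPEG start bytes FF D8 FF) appears in a block of raw data, with no knowledge of file names or file system metadata.

```python
from typing import List

# Well-known JPEG start-of-image bytes, used here only as an example signature.
JPEG_HEADER = b"\xff\xd8\xff"


def carve_offsets(data: bytes, signature: bytes) -> List[int]:
    """Return every offset in `data` where `signature` occurs."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        # Resume one byte later so overlapping matches are not skipped.
        pos = data.find(signature, pos + 1)
    return offsets


# Synthetic stand-in for unallocated space (hypothetical data, not real evidence).
blob = b"\x00" * 10 + JPEG_HEADER + b"junk" + JPEG_HEADER + b"\x00" * 5
hits = carve_offsets(blob, JPEG_HEADER)
print(hits)
```

Every hit is only a candidate. Nothing in this search confirms that a real picture follows the three matching bytes, which is exactly where false positives come from.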


And carving data doesn’t necessarily mean deleted data. Deleted data can be parsed from the file system or carved from unallocated space. Quite often we will carve data from allocated files such as pictures from a document or parse out deleted data from unallocated space based on the MFT record. The user’s action of deletion is independent of whether something was carved or not.

An Argument for False Positives
In general, false positives are not bugs in your forensics software; they’re simply matches to the criteria used to carve through a hard drive or mobile phone with potentially several terabytes of unstructured data.

There are bound to be some questionable matches for your signatures because the possible combinations of bytes in that much data are almost infinite. If your tool produces no false positives, I would argue that its carving signatures are not aggressive enough and it is potentially not getting all of the relevant data for an app or artifact.

With IEF and AXIOM, we could restrict our carving for many apps, but it would limit your results and potentially miss important evidence. In my investigations, I would rather review 10 false positives than miss one piece of relevant evidence (a false negative).

Now this is not meant to be an apology for any of your tools (including ours), or to give software the freedom to create hundreds or thousands of false positives for your case. We work to minimize the number of false positives recovered when carving data from your evidence, but this can be challenging when the apps and their data are constantly changing. At Magnet Forensics, we will often carve data based on a signature for the file type or artifact and then conduct one or more validations on the data to ensure that it is the artifact in question.

For example, most examiners know that if they find a file header of SCCA (0x53 43 43 41) at offset 4 of a file, it is likely a prefetch file in Windows. However, carving for SCCA in unallocated space will yield quite a few hits, some of which will be false positives alongside the valid prefetch files you’re looking for. To help minimize false positives, it’s great to understand the entire structure of the file in question if/when possible.

Tools can check for other unique items like strings, timestamps or other values that can help minimize the number of matches found, especially in unstructured data where you don’t always know where one file ends and another begins.

Even with these additional checks you may still get some false positives, but you will get far fewer than by just searching for SCCA. This technique helps for a lot of different file or artifact types, but it isn’t always as easy or clear.
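As a rough illustration of signature-plus-validation, the sketch below searches a buffer for SCCA and then applies two extra checks to each hit. It is not Magnet Forensics’ actual validation logic. The header layout (a 4-byte version at offset 0, the signature at offset 4, and a 4-byte file size at offset 12), the list of version values, and the size cap are all assumptions drawn from publicly documented descriptions of the uncompressed prefetch format; the function names are hypothetical.

```python
import struct

SIGNATURE = b"SCCA"
# Assumption: format versions reported in public prefetch format notes.
KNOWN_VERSIONS = {17, 23, 26, 30}
# Assumption: arbitrary upper bound used only to reject absurd size fields.
MAX_PLAUSIBLE_SIZE = 10 * 1024 * 1024


def looks_like_prefetch(data: bytes, start: int) -> bool:
    """Apply extra checks to a candidate file that would begin at `start`."""
    if start < 0 or start + 16 > len(data):
        return False  # not enough room for the fields we want to read
    # Assumed layout: version (4), signature (4), unknown (4), file size (4).
    version, sig, _unknown, size = struct.unpack_from("<I4sII", data, start)
    if sig != SIGNATURE:
        return False
    if version not in KNOWN_VERSIONS:
        return False
    return 16 < size <= MAX_PLAUSIBLE_SIZE


def find_prefetch_candidates(data: bytes):
    """Return (presumed_file_start, passed_validation) for every SCCA hit."""
    results = []
    pos = data.find(SIGNATURE)
    while pos != -1:
        start = pos - 4  # the signature sits at offset 4 of the file
        results.append((start, looks_like_prefetch(data, start)))
        pos = data.find(SIGNATURE, pos + 1)
    return results


# Synthetic buffer: one plausible header and one stray "SCCA" string.
valid = struct.pack("<I4sII", 23, SIGNATURE, 0x11, 5000)
noise = b"xxxxSCCAyyyyzzzz"
buf = valid + b"\x00" * 32 + noise
candidates = find_prefetch_candidates(buf)
print(candidates)
```

A signature-only search would report both hits here. The version and size checks discard the stray string, but a block of random data could still pass them by chance, so the validated results need review as well.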

If you don’t know the structure of the file or there aren’t any additional checks you can perform, you still may want to carve for that data but the number of false positives is going to be much higher. It’s a tradeoff between making sure you recover all the potential evidence and minimizing the amount of data you’ll need to review after processing.

Beyond that, giving examiners the ability to quickly identify false positives is another way tools can greatly improve results.

How to Identify False Positives
To read more on ways to identify false positives, visit the Magnet Forensics Blog.
