When looking for digital evidence, one has to look through a large number of files on the disk to discover just the few important pieces. Automating the evidence search can help locate evidence stored in files that were moved, renamed or deleted. This article offers a general overview of data carving techniques used in today's computer forensic tools, outlines the benefits and limitations of the technology, and demonstrates how to use carving in a forensic tool to discover evidence.
The algorithm then analyzes the file header (assuming that it is in a certain format) and attempts to determine the length of the file. While this may sound simple on paper, determining the correct file length is not always straightforward. Some formats (e.g. PDF, DOC, PNG) specify the length of the file in the header; other formats (e.g. JPEG or SQLite) don't.
This means that further analysis of subsequent data blocks is required when carving these files. For example, carving a SQLite database involves reading and analyzing subsequent data blocks in order to determine whether or not they contain valid records in the SQLite database format.
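For a format without a length field, one common way to find the end of the file is to scan forward for an end-of-file marker. The following is a minimal sketch of that idea for JPEG, not the method used by any particular tool. It assumes the data is contiguous and that the first End Of Image marker (FF D9) after the header really ends the file, which is not always true (embedded thumbnails carry their own markers). The function name carve_jpegs and the max_size limit are illustrative assumptions.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # Start Of Image marker plus the first byte of the next marker
JPEG_FOOTER = b"\xff\xd9"      # End Of Image marker


def carve_jpegs(data: bytes, max_size: int = 10 * 1024 * 1024) -> list[tuple[int, int]]:
    """Return (offset, length) pairs for candidate JPEG files found in raw data.

    Assumption: files are contiguous, and the first footer found after a
    header closes that file. max_size caps how far the footer search looks.
    """
    results = []
    pos = data.find(JPEG_HEADER)
    while pos != -1:
        # Look for the footer only within max_size bytes of the header.
        end = data.find(JPEG_FOOTER, pos + len(JPEG_HEADER), pos + max_size)
        if end != -1:
            results.append((pos, end + len(JPEG_FOOTER) - pos))
        # Continue the search just past this header, not past the carved file.
        pos = data.find(JPEG_HEADER, pos + 1)
    return results


if __name__ == "__main__":
    sample = b"junk" + b"\xff\xd8\xff\xe0" + b"A" * 10 + b"\xff\xd9" + b"more"
    for offset, length in carve_jpegs(sample):
        print(f"candidate JPEG at offset {offset}, {length} bytes")
```

A real carver would validate the internal structure of each candidate instead of trusting the first footer it meets.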
Now, what happens if a file being carved was already partially overwritten? In this case, the carving algorithm will obviously extract incomplete or corrupted files. What is more interesting is what happens next: instead of extracting a file of (N) blocks and resuming carving from block number (N+1), the carving algorithm actually returns to the data block located immediately after the detected header, and resumes carving from that point.
This makes it possible to deal with partially overwritten and fragmented files. However, one consequence of this approach is that the carved data set may be larger than the data originally available on the disk being carved. This is why we recommend having free space equal to 1.5 to 3 times the capacity of the disk being carved.
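To see why the output can outgrow the input, consider the toy sketch below. It uses a made-up file format (the 4-byte signature HDR! followed by one byte giving the file length in blocks), so the signature, the block size of 512 bytes and both function names are assumptions for illustration only. After every hit the loop moves on to the very next block instead of skipping past the carved file, so overlapping candidates are each written out in full.

```python
BLOCK = 512           # assumed block size in bytes
SIGNATURE = b"HDR!"   # hypothetical signature of a made-up format


def carve_blocks(disk: bytes) -> list[bytes]:
    """Carve every candidate file, resuming at the block right after each header."""
    carved = []
    for i in range(len(disk) // BLOCK):
        start = i * BLOCK
        header = disk[start:start + len(SIGNATURE) + 1]
        if header.startswith(SIGNATURE):
            n_blocks = header[-1]  # toy format: length in blocks follows the signature
            carved.append(disk[start:start + n_blocks * BLOCK])
        # No jump to block i + n_blocks here: the next block is examined as well.
    return carved


def recommended_free_space(disk_size: int) -> tuple[int, int]:
    """Lower and upper bound of the 1.5x to 3x free-space recommendation."""
    return disk_size * 3 // 2, disk_size * 3


if __name__ == "__main__":
    header_block = (SIGNATURE + b"\x03").ljust(BLOCK, b"\x00")
    disk = header_block * 2 + b"\x00" * (2 * BLOCK)  # 4 blocks, headers in blocks 0 and 1
    files = carve_blocks(disk)
    print(len(disk), sum(len(f) for f in files))
    print(recommended_free_space(len(disk)))
```

Here a 2048-byte image yields two overlapping 1536-byte candidates, 3072 bytes in total, which is the kind of growth the free-space recommendation accounts for.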
Again, this sounds great in theory, but what about fragmentation? The technique works great for contiguous files, but can fail miserably on fragmented data.
Now, there are at least two distinctly different ways to handle carving of fragmented data sets. The first approach just assumes that a certain number of data blocks following the file's header belong to that file, ignoring the existence of the file system. This method is often used if there is no file system available.
There is also another, more complex approach that reads the file system before making assumptions. With this approach, carving will treat occupied and unoccupied sectors separately.
Let's say, for example, that we have four data blocks marked 1, 2, 3 and 4. Sectors 1, 3 and 4 are unused, while sector 2 is occupied by existing data. A carving algorithm determined that a DOC file begins at sector 1, and is 2 sectors long.
A simple carving algorithm will extract the content of sectors 1 and 2, producing a corrupted file.
A smart algorithm will check the file system and realize that sector 2 is occupied by a different file, so it'll extract sectors 1 and 3, possibly producing a working document. Well, or maybe not.
Of course, either algorithm could be wrong. Nonetheless, separate treatment of occupied and unoccupied data blocks definitely has its benefits.
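The four-sector example can be written down as a short sketch. The allocation map is a plain dictionary standing in for whatever the file system reports, and both function names are hypothetical. Neither function claims to recover the right data; they only show which sectors each strategy would pick.

```python
def pick_sectors_simple(start: int, length: int) -> list[int]:
    """Ignore the file system: take `length` consecutive sectors from `start`."""
    return [start + k for k in range(length)]


def pick_sectors_fs_aware(allocated: dict[int, bool], start: int, length: int) -> list[int]:
    """Skip sectors the file system marks as occupied by other data."""
    picked = []
    sector = start
    last = max(allocated)
    while len(picked) < length and sector <= last:
        if not allocated[sector]:
            picked.append(sector)
        sector += 1
    return picked


if __name__ == "__main__":
    # Sectors 1, 3 and 4 are unused; sector 2 holds another file's data.
    allocated = {1: False, 2: True, 3: False, 4: False}
    print(pick_sectors_simple(1, 2))               # sectors 1 and 2
    print(pick_sectors_fs_aware(allocated, 1, 2))  # sectors 1 and 3
```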
In digital forensics, carving is used to scan the existing file system as much as the free space. Suspects can move or rename files, change file extensions and attempt other naive anti-forensic techniques to make finding evidence more difficult. Indeed, if the Windows\WinSxS\ folder alone contains several hundred files and folders with long, obscure names, who is going to notice one more folder named "amd64_microsoft-windows-bing-shell-education_31bf3856ad364e35_10.0.10240.16384_none_f414688676e1420e" when analyzing the system? This is where carving of allocated disk space comes to the rescue. Carving becomes a truly indispensable technique when searching for deleted or obscured evidence.
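A changed extension does not change the file's content, which is why signature-based inspection of allocated files still finds it. The sketch below compares a file's leading bytes with a few well-known signatures and flags files whose extension does not match. The signature values are the standard ones for these formats; the extension sets, the table itself and the function names are illustrative assumptions and far from complete.

```python
import os

# Well-known leading bytes mapped to extensions conventionally used for them.
SIGNATURES = {
    b"\xff\xd8\xff": {".jpg", ".jpeg"},
    b"\x89PNG\r\n\x1a\n": {".png"},
    b"%PDF-": {".pdf"},
    b"SQLite format 3\x00": {".db", ".sqlite", ".sqlite3"},
}


def expected_extensions(head: bytes) -> set[str]:
    """Return the extensions expected for these leading bytes, or an empty set."""
    for magic, extensions in SIGNATURES.items():
        if head.startswith(magic):
            return extensions
    return set()


def extension_mismatch(filename: str, head: bytes) -> bool:
    """True if the content has a known signature that the extension contradicts."""
    expected = expected_extensions(head)
    if not expected:
        return False  # unknown content: nothing to compare against
    extension = os.path.splitext(filename)[1].lower()
    return extension not in expected


if __name__ == "__main__":
    jpeg_head = b"\xff\xd8\xff\xe0\x00\x10JFIF"
    print(extension_mismatch("notes.txt", jpeg_head))  # a JPEG renamed to .txt
    print(extension_mismatch("photo.JPG", jpeg_head))  # extension matches content
```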
Carving is an integral part of Belkasoft Evidence Center. The entire procedure is automated, allowing you to pick what types of evidence to carve for and to choose whether to carve the entire disk contents or to analyze only certain allocated (or unallocated) areas, which saves time. Moreover, you can choose to carve only the free space inside allocated areas. Since Belkasoft Evidence Center locates and analyzes the data automatically, choosing to carve only free space eases and speeds up the examination by reducing the amount of data to carve. It also avoids duplicating evidence that the tool has already discovered.
Belkasoft Evidence Center allows you to carve devices or images for hundreds of different kinds of forensically important artifacts, including documents, pictures, system and registry files, SQLite databases, browser data, messenger and peer-to-peer communication histories, and more. (Note that while the discussion above covered file carving only, Belkasoft Evidence Center also carves for individual chats, visited URLs, emails and so on.) Being able to choose what to look for is particularly convenient when you already know, or can assume, what kind of evidence you need and want to zero in on it quickly.
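Carving for artifacts smaller than a file follows the same principle: scan raw bytes for a recognizable pattern. The sketch below pulls ASCII URLs out of an arbitrary byte buffer with a regular expression. It is only an illustration of the general idea and says nothing about how Belkasoft Evidence Center implements it; the pattern, the function name carve_urls and the ASCII-only assumption (UTF-16 strings would be missed) are all assumptions.

```python
import re

# ASCII "http://" or "https://" followed by characters that are legal in a URL.
URL_RE = re.compile(rb"https?://[A-Za-z0-9._~:/?#\[\]@!$&'()*+,;=%-]+")


def carve_urls(raw: bytes) -> list[tuple[int, str]]:
    """Return (offset, url) pairs for ASCII URLs found anywhere in raw bytes."""
    return [(m.start(), m.group().decode("ascii")) for m in URL_RE.finditer(raw)]


if __name__ == "__main__":
    raw = (b"\x00\x01http://belkasoft.com/ram-capturer\x00"
           b"junk https://example.org/a?b=1\xff")
    for offset, url in carve_urls(raw):
        print(offset, url)
```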
It is important to note that with Evidence Center you can also carve a Live RAM dump, which can be, and most of the time is, a crucial source of digital evidence. While Belkasoft Evidence Center supports the output of any other RAM dumping tool on the market, it also comes with a free, powerful volatile memory acquisition product: Belkasoft Live RAM Capturer. Live RAM Capturer is available for download: http://belkasoft.com/ram-capturer.
Live RAM contents are often fragmented, which can be a serious problem for investigators. Belkasoft Evidence Center addresses this with a smart carving mode called BelkaCarving™. BelkaCarving deals effectively with data fragmentation, allowing a more accurate recovery of evidence that would not be available otherwise.
Besides a RAM image file, you can also specify a path to hibernation or page files (hiberfil.sys and pagefile.sys). These two kinds of files contain Live RAM data that Windows writes to the hard drive during normal operation. They are therefore an important source of live memory artifacts: the RAM contents they hold may survive the computer being switched off and can be discovered by Belkasoft Evidence Center.
Once the product has finished the analysis, it sorts the data it found by type and lays it out so that it is easy and convenient to review. You can then inspect the desired artifacts even more closely with one of the built-in low-level tools, for example, Hex Viewer.