Logical evidence file size reduction  

xandstorm
(@xandstorm)
Member

Hi all,

I have a 3TB network disk that the suspect tampered with and deleted files from before the device was seized for examination.
I ran a file carve operation on the disk and subsequently applied some regex search patterns to it.

Nothing complicated so far.

However, upon exporting the search results to a logical evidence file, the size of the LEF export exceeds 300TB.
This is an unworkable amount of data, and just exporting it will take weeks if not longer to complete.

The challenge here is that the majority of the search pattern hits fall in the disk's unallocated space.
The LEF export process copies large chunks of the same unallocated regions over and over, which is what drives the LEF past 300TB.

I was wondering whether it is possible to export each chunk of unallocated disk space just once, instead of re-copying / re-exporting the same chunk over and over again.
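Exporting each region of unallocated space only once is essentially interval merging over the byte ranges of the search hits: overlapping hit ranges collapse into a single region to copy. A minimal sketch of the idea, with hypothetical byte ranges (not tied to any particular forensic suite):

```python
def merge_ranges(ranges):
    """Merge overlapping (start, end) byte ranges so each region
    of unallocated space would be exported only once."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of duplicating it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Three hits, two of which fall in the same chunk of unallocated space.
hits = [(0, 4096), (1024, 8192), (100_000, 104_096)]
print(merge_ranges(hits))  # [(0, 8192), (100000, 104096)]
```

Two overlapping hits become one exported region, so the export size is bounded by the size of the unallocated space itself rather than by the number of hits.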

Is there anyone on this list who has a solution for this LEF size problem in particular, or for the reduction of LEF size in general?

Thanks!

Rg,
Lex

Quote
Posted : 17/04/2019 7:30 pm
TinyBrain
(@tinybrain)
Active Member

Lex, look into AWS DynamoDB to solve the downsizing.

ReplyQuote
Posted : 25/05/2019 12:01 pm
xandstorm
(@xandstorm)
Member

Hi TinyBrain,

Thank you for the reply. I will look into this later; currently traveling.

Saludos,
Lex

ReplyQuote
Posted : 25/05/2019 12:24 pm
armresl
(@armresl)
Senior Member

Did you mean 300TB or 3TB?

ReplyQuote
Posted : 26/05/2019 1:25 am
xandstorm
(@xandstorm)
Member

Hi armresl,

The seized disk is 3TB, while the exported LEF dataset comes to a theoretical 300TB, and I have tried the export in several forensic suites.

Rg,
Lex

ReplyQuote
Posted : 26/05/2019 2:19 pm
watcher
(@watcher)
Member

xandstorm wrote:
"The seized disk is 3TB, while the exported LEF dataset comes to a theoretical 300TB, and I have tried the export in several forensic suites."

While at first blush, this seems impossible, I have seen something similar on a smaller scale.

The variant I saw had to do with millions of tiny files, mostly GIFs and thumbnails. Attempting to copy them overflowed the destination even though the target media was much bigger than the source media. The problem was that the source media used a very small block size, while the target media used a rather large one. The net result was that each tiny file took up at least one large block on the destination media.

Look at the media block sizes of both source and destination.
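The overflow described above can be reproduced with simple arithmetic: round every file up to a whole number of blocks on the destination and the totals diverge quickly. Illustrative numbers only, not taken from the case:

```python
def on_disk_size(file_sizes, block_size):
    """Total space consumed when each file is rounded up to a whole
    number of blocks (clusters) on the target media."""
    # -(-size // block_size) is ceiling division in integer arithmetic.
    return sum(-(-size // block_size) * block_size for size in file_sizes)

# Two million 2 KB thumbnails: about 4 GB of actual data.
files = [2048] * 2_000_000
print(on_disk_size(files, 4096))   # 8192000000  (~8 GB on 4 KB blocks)
print(on_disk_size(files, 65536))  # 131072000000 (~131 GB on 64 KB blocks)
```

The same data takes sixteen times the space just from the block-size change, which is the mechanism behind the "bigger target overflows anyway" effect.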

If you need to deal with a huge number of tiny files, putting them into a database instead of separate files may work better.

Come to think of it, this would make an interesting anti-forensics method.

ReplyQuote
Posted : 27/05/2019 6:29 pm
xandstorm
(@xandstorm)
Member

Hi watcher,

I think the whole problem here is the lack of adequate deduplication options when it comes to exporting to a LEF.

It is understandable that, from the perspective of maintaining forensic integrity, each search pattern hit is associated with the file it was found in and that the file in question is exported to the LEF as well.

The issue for me started when a large number of search pattern hits were related to a file carve operation or an unallocated disk space search. The export-to-LEF process then copies large chunks of the unallocated disk space again and again.

I have seen the same with PST container files: when there is a search pattern hit in one e-mail, most forensic suites export the entire PST file to the LEF.

It just seems that, with the ever-growing size of seized evidence data, LEF export processes are lagging behind, particularly when it comes to deduplication options.
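The missing deduplication can be approximated outside the suite: hash each exported item and skip anything whose content has already been seen, so a PST (or unallocated chunk) that produced many hits is written only once. A rough sketch under that assumption; the item list and names are hypothetical, not an API of any forensic tool:

```python
import hashlib

def dedupe_export(items):
    """Yield each (name, data) item only the first time its content
    hash is seen, so the same container or chunk is exported once."""
    seen = set()
    for name, data in items:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield name, data

hits = [("hit1.pst", b"same container"),
        ("hit2.pst", b"same container"),   # duplicate content, skipped
        ("hit3.bin", b"different data")]
print([name for name, _ in dedupe_export(hits)])  # ['hit1.pst', 'hit3.bin']
```

In a real workflow you would hash file content in fixed-size reads rather than loading whole multi-gigabyte containers into memory, but the dedup logic is the same.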

At the moment I am running some test exports and will definitely look into the block size advice.

Rg,
Lex

ReplyQuote
Posted : 28/05/2019 4:27 am
TinyBrain
(@tinybrain)
Active Member

You could also use the toolbox inside Hadoop MapReduce.

ReplyQuote
Posted : 28/05/2019 8:57 am
xandstorm
(@xandstorm)
Member

Thank you TinyBrain,

Will look into that too.

Interestingly enough, I was just able to export an 8TB LEF onto a 4TB USB drive.

Despite the error messages about not enough free space, and although some forensic suites even grey out the continue button based on the size, it appears that at least one of the suites does deduplicate adequately. According to the log files, the LEF export process completed successfully and without errors. Remarkable.

Rg,
Lex

ReplyQuote
Posted : 28/05/2019 2:34 pm
Rich2005
(@rich2005)
Active Member

I'm not sure of your exact strategy but I'd suggest breaking things up.

Create a data set of all your live files first and separate this (this will obviously be considerably less than the original drive size).

You've then got to deal with all your deleted files, other non-live files such as those from volume shadows if you've got lots of those, and then carved files.

I'd again suggest filtering for all your non-live files, then running some sort of validation process, such as checking that signatures are OK. Even better, do some further filtering: if your tool supports text summaries (like NUIX), you might quickly be able to filter out lots of items that show no decodable textual information. This isn't perfect, but it is a valid strategy if done in the knowledge that you may be excluding image-based or problematic documents. You could also remove, by hash, all duplicates that already exist in the file-system set, as an example.
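The "remove duplicates by hash against the file-system set" step can be sketched as follows. The helper names are hypothetical, and real suites do this internally, but the logic is just set membership on content hashes:

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Content hash used for duplicate matching (MD5 is typical in
    forensic dedup contexts, though any stable hash would do)."""
    return hashlib.md5(data).hexdigest()

def filter_against_live(live_files, recovered_files):
    """Drop recovered/carved files whose hash already exists in the
    live file set, keeping only genuinely new material."""
    live_hashes = {md5_hex(data) for data in live_files}
    return [data for data in recovered_files if md5_hex(data) not in live_hashes]

live = [b"report.docx contents", b"photo.jpg contents"]
recovered = [b"report.docx contents", b"deleted-only contents"]
print(filter_against_live(live, recovered))  # [b'deleted-only contents']
```

Applied after the live-file set is exported, this trims the deleted/carved sets down to only what is not already in evidence.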

For the carving you could do much the same sort of thing. Although, if you do have NUIX, I'd warn that it is unlikely to match many things by hash, because their carving logic is nonsense: it carves until the end of the sector, I believe, rather than to what appears to be the end of the file, so the carving results will usually not match identical documents by hash.

ReplyQuote
Posted : 28/05/2019 4:50 pm