Tools for scanning dd images / Finding an encrypted file

(@gtbase)
Posts: 10
Active Member
Topic starter
 

Hi,

I have a large (300+ GB) dd image which contains an encrypted file buried somewhere in it. The dd image was made from a partition whose first few GBs were overwritten by mistake by another filesystem, therefore (1) the dd image cannot be mounted, and (2) the names and locations of its files are lost. The files can only be accessed as raw data, with software such as PhotoRec.
Now, the one file I am looking for is encrypted, and it does not contain any header or signature, nor any meaningful string in it.
But I am confident I may still be able to find its location within the dd image by exclusion, that is, by finding a contiguous stream of data which does not contain any recognizable strings. The file I need is larger than 4 GB, and it would be almost impossible for any other data stream of that size in that dd image not to contain at least one recognizable file header/signature or meaningful string.

I need some advice from you guys regarding which tools I should be using. What I need is something similar to PhotoRec (which carves out files from raw data based on headers/signatures) – only, that tool should be able to work 'in the reverse', so to speak, by identifying data streams (of a given size) which do NOT contain any recognizable string.

Please, help me out here. What tools can I use?

Thanks,

GtBase

 
Posted : 17/11/2017 5:15 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

This is one of those cases where a "negative" approach could be used.

I.e. you find each and every file recovered by PhotoRec as an address/extent on the image, then you overwrite them (in the image) with 00's.

What remains is a subset of what was on the disk excluding recoverable files, where you might have more luck in finding what you are looking for.
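A minimal Python sketch of the zeroing step, assuming the extents of the recovered files are already known as (offset, length) pairs in bytes. The function name `zero_extents` and the extent list are hypothetical, and this should only ever be run on a working copy of the image, never on the original.

```python
def zero_extents(image_path, extents, chunk_size=1 << 20):
    """Overwrite each (offset, length) byte range of the image with 0x00 bytes.

    Assumption: every extent lies inside the file; an extent reaching past
    the end of the file would make the file grow.
    """
    zeros = bytes(chunk_size)
    with open(image_path, "r+b") as img:
        for offset, length in extents:
            img.seek(offset)
            remaining = length
            while remaining > 0:
                n = min(remaining, chunk_size)
                img.write(zeros[:n])
                remaining -= n
```

For example, `zero_extents("copy.dd", [(1048576, 524288)])` would blank 512 KiB starting at the 1 MiB mark of a (hypothetical) working copy named `copy.dd`.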

I would personally use DMDE (as opposed to PhotoRec) as it provides more "filesystem level" info on the files that can be identified.

Before the above I would make really sure that the mistake actually overwrote all the info about the underlying filesystem, i.e. analyze the whole thing with DMDE; maybe some fraction of the info survived the accidental overwrite.

It seems like you are assuming that your 4 GB or so encrypted area is contiguous, but this may (or may not) be the case.

jaclaz

 
Posted : 17/11/2017 7:10 pm
(@gtbase)
Posts: 10
Active Member
Topic starter
 

I checked out DMDE, thanks for the info. I'm still trying it out, but it seems that it can only do what PhotoRec also does, pretty much.

What I was looking for is some tool which tells you in some way (visually or otherwise) in which part of the disk the files were found. Having some sort of visual mapping would be great, but a simple text-mode offset index/table for found files would also be nice. From this info I could work backwards and find out where the bone is buried, so to speak.

Does anyone know of any software tool which produces some mapping or prints out a table of offset addresses of where the files were found within the disk image?

 
Posted : 17/11/2017 8:42 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

I checked out DMDE, thanks for the info. I'm still trying it out, but it seems that it can only do what PhotoRec also does, pretty much.

What do you think that "Cluster Map" does?

jaclaz

 
Posted : 17/11/2017 8:57 pm
gungora
(@gungora)
Posts: 33
Eminent Member
 

Does the encrypted file contain any known byte sequences that could be helpful in identifying it?

If not, I would be inclined to write a script that scans the dd image and keeps track of when consecutive ASCII characters are encountered. Each time such a potentially meaningful string is encountered, starting position would be reset to the end of the potentially meaningful string. When the distance between the last starting position and the current position exceeds 4 GB, those values would be recorded for further analysis.

I do not know of any software that does this out of the box. If you decide to pursue something along these lines, feel free to PM me and I would be happy to help.
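A rough Python sketch of such a script, under stated assumptions: "potentially meaningful string" is taken to mean a run of at least `min_run` printable ASCII bytes (0x20 to 0x7e), and the names `find_stringless_gaps`, `min_run` and `min_gap` are made up for illustration. It reads the image in chunks and reports every stretch of at least `min_gap` bytes that contains no such run.

```python
import re

PRINTABLE = bytes(range(0x20, 0x7F))

def find_stringless_gaps(stream, min_run=8, min_gap=4 * 1024**3, chunk_size=1 << 20):
    """Yield (start, end) byte offsets of stretches of at least min_gap bytes
    that contain no run of min_run or more printable ASCII bytes.

    A very long run that crosses a chunk boundary can shift a reported
    gap start by fewer than min_run bytes; that is accepted in this sketch.
    """
    run_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_run)
    gap_start = 0   # offset just past the last string seen so far
    base = 0        # absolute offset of buf[0]
    tail = b""      # short printable run held back from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        buf = tail + chunk
        keep = 0
        if chunk:
            # A short trailing printable run may continue in the next chunk.
            trailing = len(buf) - len(buf.rstrip(PRINTABLE))
            if trailing < min_run:
                keep = trailing
        body = buf[:len(buf) - keep]
        for m in run_re.finditer(body):
            start, end = base + m.start(), base + m.end()
            if start - gap_start >= min_gap:
                yield (gap_start, start)
            gap_start = end
        base += len(body)
        tail = buf[len(body):]
        if not chunk:
            break
    if base - gap_start >= min_gap:
        yield (gap_start, base)
```

It could be used as `with open("image.dd", "rb") as f: print(list(find_stringless_gaps(f)))`, where `image.dd` is a placeholder name. Pure Python will be slow on a 300+ GB image, so this is more a statement of the logic than a tuned tool.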


 
Posted : 17/11/2017 10:09 pm
(@athulin)
Posts: 1156
Noble Member
 

I have a large (300+ GB) dd image which contains an encrypted file buried somewhere in it. The dd image was made from a partition whose first few GBs were overwritten by mistake by another filesystem, therefore (1) the dd image cannot be mounted, and (2) the names and locations of its files are lost. The files can only be accessed as raw data,

This sounds too much like a class or study assignment for me to feel comfortable helping you directly, especially as I do not know the purpose of it.

However, I think I can point out that the only criteria you have for identifying this file are its size, and the absence of recognizable strings in a chunk of bytes of that size.

So … as encryption tends to increase entropy, an encrypted file will appear to be 'more random' than anything else. That means that it will contain recognizable strings just by accident – and the appearance of such false positives directly impacts your ability to identify at least 4 GB that do not contain any such strings. You clearly want to minimize those.

One part of this appears to be to formulate a definition of 'recognizable string' that does not lead to *any* false positives in a 4GB-sized stream of random bytes.

Of course, poor encryption would produce relatively low-entropy results, so a poorly encrypted file might look less random than a well-compressed file.
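To put a number on the false-positive problem, here is a back-of-the-envelope Python sketch. It assumes the encrypted data behaves like independent, uniformly random bytes and that a "recognizable string" means a run of printable ASCII bytes (0x20 to 0x7e, i.e. 95 of the 256 byte values); both are modelling assumptions, and the function name is made up. A run of at least `min_run` printable bytes starts at a given position when the preceding byte is non-printable and the next `min_run` bytes are printable.

```python
def expected_random_runs(n_bytes, min_run):
    """Approximate expected number of printable-ASCII runs of at least
    min_run bytes in n_bytes of uniformly random data (edge effects ignored)."""
    p = 95 / 256  # chance that one random byte is printable ASCII
    return n_bytes * (1 - p) * p ** min_run

if __name__ == "__main__":
    four_gib = 4 * 1024**3
    for min_run in (8, 12, 16, 20, 24, 28):
        print(min_run, expected_random_runs(four_gib, min_run))
```

Under these assumptions a threshold of 20 characters still leaves a handful of accidental strings in 4 GiB of random bytes, while thresholds in the high twenties push the expected count well below one.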

 
Posted : 18/11/2017 8:36 am
(@gtbase)
Posts: 10
Active Member
Topic starter
 

This sounds too much like a class or study assignment for me to feel comfortable helping you directly, especially as I do not know the purpose of it.

LOL. It isn't a class assignment, and I am not even a student. I am just a user who screwed up his system. Here is the back story: I was trying to install GhostBSD on my system, and I chose the manual install procedure. I am familiar with Linux installations, but not with FreeBSD ones, so I ended up choosing the wrong partition and I (partially) overwrote it. FYI, the FreeBSD way of naming storage units / partitions is different from the Linux one, which uses sda, sdb, etc. That is what misled me. I realized my mistake when it was too late, at the end of the GhostBSD installation, when my Linux /home partition had already been overwritten.
By using PhotoRec I was able to recover most of my files – except for that 4 GB encrypted container, because it has no header to be identified by. That's, in all honesty, my story.

However, I think I can point out that the only criteria you have for identifying this file are its size, and the absence of recognizable strings in a chunk of bytes of that size.

That is exactly my strategy.

Of course there will be false positives, but I am clutching at straws here. As someone rightly pointed out in a previous post, there is not even a guarantee that all the data is still stored in one contiguous block. But I am hopeful, because most probably it was one block at the time it was created and its size never changed later on, so I see no reason why it should have been split.
By the way, the file was created by Veracrypt, i.e. with strong encryption, so the entropy is supposed to be very high.

I am wondering, is this not a trivial case of forensic recovery? Are there not already any tools optimized for what I am trying to do? That's what I was hoping for…

 
Posted : 19/11/2017 11:29 am
(@athulin)
Posts: 1156
Noble Member
 

I am wondering, is this not a trivial case of forensic recovery? Are there not already any tools optimized for what I am trying to do? That's what I was hoping for…

If you know where the file started, or you can find a sector that contains an identifiable header … then it's easy.

But if you have a featureless file, … it's not that easy. So … if Veracrypt containers are featureless, it's not so easy. Otherwise, just look for whatever file header they have, and search for that.

My idea was to treat the image as an array of consecutive 4 GB segments, do the equivalent of 'strings -n 20' on each, and just count how many such strings are returned. Look for two segments that stick out by returning the fewest strings. That's a best guess at where the file is located. Then, do a similar zoom-in with a smaller segment length near those, perhaps 100 MB or so, and again look for 40 consecutive segments that stick out as being 'high entropy'. And keep going until you're down to sector level or near enough.
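The first pass of that idea might look roughly like the following Python sketch. It is not the real `strings` utility: it assumes a "string" is a run of at least `min_run` printable ASCII bytes, the function names are invented for illustration, and runs split across read boundaries may be missed, which should not matter for a coarse ranking.

```python
import re

def strings_per_segment(stream, segment_size, min_run=20, chunk_size=1 << 20):
    """Return a list holding, for each consecutive segment of segment_size bytes,
    the number of printable-ASCII runs of at least min_run bytes found in it."""
    run_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_run)
    counts = []
    while True:
        remaining = segment_size
        seen = 0
        count = 0
        while remaining > 0:
            chunk = stream.read(min(chunk_size, remaining))
            if not chunk:
                break
            seen += len(chunk)
            remaining -= len(chunk)
            count += sum(1 for _ in run_re.finditer(chunk))
        if seen == 0:
            break              # nothing left to read
        counts.append(count)
        if remaining > 0:
            break              # end of image reached inside this segment
    return counts

def quietest_segments(counts, how_many=2):
    """Return the indices of the segments with the fewest strings."""
    return sorted(range(len(counts)), key=counts.__getitem__)[:how_many]
```

Calling `strings_per_segment(f, 4 * 1024**3)` on the opened image and then `quietest_segments(...)` on the result gives the candidate segments; the same two functions can be reused for the zoom-in by seeking to the candidate area and passing a smaller `segment_size`.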

Of course, if you have compressed files in the image as well, they will mess things up. They may be possible to remove first, though, as they're likely to have known file headers.

No, this is not an easy thing to do. ZIP files, yes. TIFF files, yes. As long as you have a recognizable file header. Whether or not Veracrypt provides one … don't know. But I suspect not.

 
Posted : 19/11/2017 11:56 am
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

So we are one (little) step ahead, we now know that it is Veracrypt.
And one step behind, as you didn't say which filesystem it is, but probably (and contrary to my guess) it is one of the EXT Linux types.
No idea if the cluster map feature of DMDE works on such filesystems.

Truecrypt (and consequently Veracrypt) volumes could be detected (the program was patched later to avoid the risk).
https://github.com/antagon/TCHunt-ng
I cannot say whether the program may still be of help.

And in any case it will depend on the EXACT program version used, the EXACT filesystem used and the EXACT setup used.

Still, unless something has changed lately (entirely possible) the hidden volume would normally be inside a "container" that should have an identifiable header.
https://veracrypt.codeplex.com/wikipage?title=Hidden%20Volume

jaclaz
EDIT: Somehow I posted a wrong link; it is now the correct one, to TCHunt-ng.

 
Posted : 19/11/2017 4:28 pm
(@gtbase)
Posts: 10
Active Member
Topic starter
 

@athulin Your advice sounds very good, I'll try following it.
I had found an alternate method using PhotoRec's log file and extracting from it the sector number of the recovered files, then mapping their distribution to find where there is a gap. But this method is more complicated, because it requires using a spreadsheet to analyze the data (in addition to having to use grep or awk to extract the data from the log file).
So, the method you suggested is simpler and probably better.
Thanks again.
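For what it's worth, the gap-finding part of the log-based method does not need a spreadsheet. The Python sketch below assumes the sector ranges of the recovered files have already been extracted (with grep/awk or otherwise) into (first_sector, last_sector) pairs; parsing the PhotoRec log itself is deliberately left out, and the function name is hypothetical.

```python
def find_gaps(ranges, total_sectors, min_gap):
    """Return (first, last) inclusive sector ranges of at least min_gap sectors
    that are not covered by any recovered file.

    ranges: iterable of (first_sector, last_sector) pairs, inclusive,
    in any order and possibly overlapping.
    """
    gaps = []
    cursor = 0  # first sector not yet known to belong to a recovered file
    for first, last in sorted(ranges):
        if first - cursor >= min_gap:
            gaps.append((cursor, first - 1))
        cursor = max(cursor, last + 1)
    if total_sectors - cursor >= min_gap:
        gaps.append((cursor, total_sectors - 1))
    return gaps
```

With 512-byte sectors, a 4 GiB container corresponds to `min_gap = 4 * 1024**3 // 512`, i.e. 8388608 sectors.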

 
Posted : 19/11/2017 5:23 pm