detection of file-hollowing

mgilhespy
(@mgilhespy)
Posts: 102
Estimable Member
Topic starter
 

Folks, looking for some group wisdom here..

A few days ago I was given a bunch of files to look at. The context is not related to any criminal investigation. My involvement is essentially curiosity.

The files had no headers I could recognise but there were structures in them which persuaded me they were images (only a vague hunch at first). The "file" command (gnuwin version) concurred with me, listing them as TARGA image files (.tga) - however no TARGA file viewer I could find would open them - nor would any other viewer I've tried so far. After reading through various posts here on FF I finally decided to try transplanting headers and eventually got a result by inserting JPEG headers. Now, on viewing the files (irfanview, windows viewer, various others all the same result) they appeared degraded, partially corrupted, with banding around the edges or through the middle. I ran a strings search over the files and discovered some hex encoded ascii (which stood out a mile) and which I believe had been pasted into the middle of the files. Cutting out those bytes resulted in "good" images without any of the prior degraded parts.
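The strings search and the cutting step described above can be sketched roughly as follows. This is only an illustration, not the procedure actually used: the names `find_hex_runs` and `cut_runs` are hypothetical, and the 64-character minimum run length is an assumed threshold chosen to skip short accidental matches in binary data.

```python
import re

# Assumed threshold: only report runs of at least 64 ASCII hex digits.
MIN_RUN = 64
# Match whole hex pairs so every reported run decodes to complete bytes.
HEX_RUN = re.compile(rb"(?:[0-9A-Fa-f]{2}){%d,}" % (MIN_RUN // 2))


def find_hex_runs(data: bytes):
    """Return (offset, length) for every long run of ASCII hex digits."""
    return [(m.start(), m.end() - m.start()) for m in HEX_RUN.finditer(data)]


def cut_runs(data: bytes, runs):
    """Return a copy of data with the given (offset, length) spans removed."""
    out = bytearray()
    pos = 0
    for offset, length in runs:
        out += data[pos:offset]   # keep everything before the run
        pos = offset + length     # skip over the run itself
    out += data[pos:]             # keep the tail after the last run
    return bytes(out)
```

Compressed image data rarely contains dozens of consecutive bytes that all happen to be ASCII hex digits, which is why such runs stand out; uncompressed or text-based formats would need a different threshold.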

The question for the group is, how would you go about scanning a system for the presence of other such "hollowed" files, perhaps not all images, being used as a "container" for hidden data? In this case it was the fact that the files did not have recognisable headers that alerted my friend to them. What if the insertion was done while being sure to leave any header/footer intact?

Ideas most welcome.

Thanks

Michael

 
Posted : 22/09/2016 3:03 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

Simplified, you are looking for a generic steganography detector?

*like*
http://guillermito2.net/stegano/tools/
http://stegsecret.sourceforge.net/
https://github.com/b3dk7/StegExpose/

https://www.backbonesecurity.com/EnhancedSteganographyDetection.aspx
https://www.wetstonetech.com/product/discoverthehidden/

Do some searching for "steganography detection" and "steganalysis".

The issue here is two-fold: more or less it resolves to doing semi-statistical analysis of files to detect some sort of anomaly and raise an alarm, and then finding/proving (and hopefully decrypting) the hidden message (if any).

I believe they don't behave very differently (conceptually) from an antivirus, there is a heuristic engine of some kind and possibly a database of known steganography tools (and of their characteristics), mileage and results will vary (depending on the tool used, if it actually creates noticeable anomalies, if it is among the "known" tools, etc.).

jaclaz

 
Posted : 22/09/2016 3:51 pm
passcodeunlock
(@passcodeunlock)
Posts: 792
Prominent Member
 

Steganography… maybe, or maybe files encrypted by some ransomware?!

 
Posted : 22/09/2016 4:07 pm
(@athulin)
Posts: 1156
Noble Member
 

The question for the group is, how would you go about scanning a system for the presence of other such "hollowed" files, perhaps not all images, being used as a "container" for hidden data?

Depends. You can easily stuff an ISO or FAT image with 'secret' data without it being noticeable, unless you do a very careful examination of the numbers of free and allocated sectors. You can stuff things into TIFF images without anyone noticing – except if someone sums up the allocated bytes really carefully.

At most you can verify that the file format is being followed.

In this case it was the fact that the files did not have recognisable headers that alerted my friend to them.

That's not very strange, I don't think. Most file headers aren't well-described.
And some files don't have recognizable headers – they have footers instead – they're identified by their last bytes. Some don't have either, but are recognized by other methods. I mentioned ISO images: they are identified by looking at bytes 32 KB into the image …
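To illustrate the ISO case: an ISO 9660 image carries its first volume descriptor at sector 16 (2048-byte sectors), and the standard identifier "CD001" follows the one-byte descriptor type. A minimal sketch, with `looks_like_iso9660` being a hypothetical helper name and the image passed in as bytes:

```python
ISO_SECTOR_SIZE = 2048
# The first volume descriptor sits at sector 16; byte 0 of it is the
# descriptor type, so the identifier starts one byte later (offset 32769).
ISO_ID_OFFSET = 16 * ISO_SECTOR_SIZE + 1
ISO_ID = b"CD001"


def looks_like_iso9660(image: bytes) -> bool:
    """Return True if the ISO 9660 identifier is found 32 KB into the image."""
    return image[ISO_ID_OFFSET:ISO_ID_OFFSET + len(ISO_ID)] == ISO_ID
```

Nothing in the first 32 KB is inspected by this check, which is exactly why that area offers room for unrelated data.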

Apart from such possibilities, it sounds more like a case of files that have been deleted and carved but not restored correctly.


 
Posted : 23/09/2016 12:14 am
mgilhespy
(@mgilhespy)
Posts: 102
Estimable Member
Topic starter
 

Yes, this is steganography in the most generic sense (of being hidden writing), however it is different in that the data appears to have been stuffed into the containers (cover objects) without any real care about the effect on them (when viewed, the images had obviously been altered in some way) - thus missing the invisibility aspect that I understand to be an essential characteristic of good stego.
Having "de-corrupted" some of the images by carving away the additional bytes, we were able to find "originals" still on the system, along with a bunch of others which we assumed would also be found being used as cover objects. We've managed to understand that a data-exfil was going on and to a large extent what the data was - although unfortunately this detection is very very late for the org concerned.

I'm still no closer to a generic way to find files which have been used as containers in this (somewhat crude) way. None of the steganography detection tools I ran picked anything up at all - they all appear to be looking for specific manipulation techniques ("real" stego) such as LSB, DCT etc.

MG

 
Posted : 25/09/2016 8:08 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

Yes, this is steganography in the most generic sense (of being hidden writing), however it is different in that the data appears to have been stuffed into the containers (cover objects) without any real care about the effect on them (when viewed, the images had obviously been altered in some way) - thus missing the invisibility aspect that I understand to be an essential characteristic of good stego.

Sure :) , and possibly in your specific case it is just a case of corrupted, partially overwritten files or, as Athulin hinted, a case of "bad" carving.

In other words, before stating that something has been stuffed into something else you might want to consider how it could be the result of some accident (as you said the essence of steganography is that it should be hidden and not easily found, i.e. the "container" behaves "normally").

jaclaz

 
Posted : 25/09/2016 9:18 pm
mgilhespy
(@mgilhespy)
Posts: 102
Estimable Member
Topic starter
 

Thanks for the comments. I thought I may be missing some obvious check, but it seems not.

There is more to the situation than I can comment on right now, however this is definitely not the result of corruption. I've seen the original disk image from where the pictures were taken and in my opinion there is also no chance that these strings appeared as a result of a "mis-carve" of data by the person doing the acquisition. The blocks of ascii (hex encoded) in each picture are all close to the same size (suggesting scripting?) and the pictures being used as cover objects all appear to have been downloaded at about the same time. The picture names have numbers in them, which we took to indicate sequencing, and we have been able to concatenate many of the hex extracts to find larger portions of readable text. There's just no way this is chance/random error. Looking for traces of the images being sent out now (exfil path).
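The concatenation step can be sketched like this. It is a guess at the general shape only: the file names in the usage note are invented, and the assumption that the first number in each name is the sequence number is mine, not something confirmed above.

```python
import re


def sequence_number(name: str) -> int:
    """Return the first integer found in a file name (assumed to be the sequence)."""
    match = re.search(r"\d+", name)
    return int(match.group()) if match else -1


def reassemble(extracts: dict) -> bytes:
    """Decode hex extracts keyed by file name and join them in numeric order.

    `extracts` maps a file name to the ASCII hex bytes cut out of that file.
    """
    ordered = sorted(extracts, key=sequence_number)
    return b"".join(bytes.fromhex(extracts[name].decode("ascii")) for name in ordered)
```

For example, `reassemble({"img_2.jpg": b"6f20776f", "img_1.jpg": b"68656c6c"})` decodes and joins the two pieces in the order 1, 2. Sorting numerically matters because a plain string sort would place `img_10` before `img_2`.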

We have basically stumbled upon these "crude containers" in this case and my open question is whether there's a general check we could have done to make sure we got them all. I don't know of one if there is.

 
Posted : 26/09/2016 12:45 pm
(@mscotgrove)
Posts: 938
Prominent Member
 

You stated that you cleaned up the images by removing embedded text. My question is: what was the length of the embedded text? If it was the length of a cluster, then many data recovery programs could produce the same result when carving a fragmented file. If the length was not a cluster size, then it was probably not a carving error. The same applies to whether or not the added text started on a cluster boundary.
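That test is simple arithmetic. A minimal sketch, assuming a 4096-byte cluster (a common default, but the real value has to come from the file system in question) and treating any whole multiple of the cluster size as "cluster-sized"; `cluster_aligned` is a hypothetical name:

```python
def cluster_aligned(offset: int, length: int, cluster_size: int = 4096) -> bool:
    """Return True if a foreign block starts on a cluster boundary and
    spans a whole number of clusters, as a mis-carved fragment would."""
    starts_on_boundary = offset % cluster_size == 0
    whole_clusters = length > 0 and length % cluster_size == 0
    return starts_on_boundary and whole_clusters
```

A block for which this returns False is hard to explain as a misplaced cluster; one for which it returns True is merely consistent with a carving error, not proof of one.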

Another problem I have often seen is with data recovery programs that try to recover from FAT32. On deletion the high 16 bits of the cluster pointer are set to zero; many programs ignore this and so recover files from the wrong location.
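For reference, a FAT32 directory entry stores the first cluster in two 16-bit halves: the high word at byte offset 20 and the low word at offset 26 of the 32-byte entry. A rough sketch of reading them follows; `first_cluster` and `high_word_suspect` are hypothetical helpers, and the second one is only a heuristic flag, since a file that really lived below cluster 65536 would also have a zero high word.

```python
import struct

DELETED_MARKER = 0xE5  # first name byte of a deleted FAT directory entry


def first_cluster(entry: bytes) -> int:
    """Combine the high and low words of the first-cluster field."""
    (high,) = struct.unpack_from("<H", entry, 20)  # high 16 bits
    (low,) = struct.unpack_from("<H", entry, 26)   # low 16 bits
    return (high << 16) | low


def high_word_suspect(entry: bytes) -> bool:
    """Flag deleted entries whose high word is zero (possibly wiped on deletion)."""
    (high,) = struct.unpack_from("<H", entry, 20)
    return entry[0] == DELETED_MARKER and high == 0
```

If the high word was zeroed, `first_cluster` returns only the low 16 bits, so a recovery tool that trusts the value reads from a much lower cluster than the file originally occupied.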

 
Posted : 26/09/2016 6:49 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

IF the amount of "embedded data" is significant, maybe checking byte frequencies might reveal them, but otherwise I see no practical, "universal" method to check those files.
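A byte-frequency check could look something like the sketch below, which computes Shannon entropy over fixed windows. The window size and threshold are assumptions: hex-encoded text uses at most 22 distinct byte values (0-9, a-f, A-F), so it cannot exceed about 4.46 bits per byte, while compressed image data usually sits close to 8.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Return the entropy of chunk in bits per byte (0.0 for empty input)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    return -sum(n / total * math.log2(n / total) for n in Counter(chunk).values())


def low_entropy_windows(data: bytes, window: int = 512, threshold: float = 4.5):
    """Return (offset, entropy) for each full window at or below the threshold."""
    hits = []
    for offset in range(0, len(data) - window + 1, window):
        entropy = shannon_entropy(data[offset:offset + window])
        if entropy <= threshold:
            hits.append((offset, entropy))
    return hits
```

This only helps when the container is normally high-entropy (JPEG, PNG, archives) and the insert is at least a window long; uncompressed formats have plenty of legitimate low-entropy regions and would produce false alarms.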

I still believe that some kind of error may be part of the cause. I mean, let's try to analyze what steganography is used for:
1) to hide messages in such a way that they are not seen
2) to send these messages or however retrieve them

The simple encoding of the message(s) may be an attempt to "obfuscate" the message(s), but if the idea was - additional to the one of "hiding" - also that of keeping the contents "private" I would guess that everyone would have used encryption (as opposed to "encoding") or both encryption and encoding.
Even a simple cypher would probably have been enough to make the encoded part (even once decoded) look like "garbage".

On the other hand even if we assume that the origin of the file is a (poor) attempt at steganography, we still need to verify if the contents are retrievable.

If - say - the contents are all of the same length and at a same address inside the files (that are numbered) it would be trivial for the receiver of the message (or for the author if the images were used as "hidden storage") to retrieve them in case of need (and would make the hypothesis of an attempt at steganography more plausible).

But again, the same could be true of some "error" in the computer (a program gone berserk, a corruption of the filesystem driver, an error in flushing a cache; a lot of things can go wrong).

If instead the contents are of different length and placed at different offsets in the images, then to retrieve the "hidden" message one would need a method (or an algorithm) that would be an added layer of complication that doesn't "match" with the simplicity of the hiding.

jaclaz

 
Posted : 26/09/2016 8:15 pm
mgilhespy
(@mgilhespy)
Posts: 102
Estimable Member
Topic starter
 

Considering a small padding (front and end) which I had previously missed, the inserted contents are all exactly the same length and are all inserted at position file-len/2.

With that we examined all images on the system to see if we could find hex encoded ascii strings at that location and recovered what we think is the rest of the content.
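A check of that kind might be sketched as below. It is not the script that was actually used: `hex_run_at_midpoint` is a hypothetical name, the 32-character minimum run and the 64-byte slack (to allow for the padding) are assumed values, and the midpoint is taken from the file as found rather than from the original image.

```python
import re

# Assumed minimum: 32 consecutive ASCII hex digits count as a run.
HEX_RUN = re.compile(rb"[0-9A-Fa-f]{32,}")


def hex_run_at_midpoint(data: bytes, slack: int = 64):
    """Return (offset, length) of a hex run covering the file midpoint, else None.

    `slack` widens the run on both sides to tolerate padding around the payload.
    """
    midpoint = len(data) // 2
    for match in HEX_RUN.finditer(data):
        if match.start() - slack <= midpoint <= match.end() + slack:
            return match.start(), match.end() - match.start()
    return None
```

Run over every image on a system, this flags only files with a hex block straddling the middle, which fits this particular scheme but would miss payloads placed anywhere else.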

The naming convention of the files is definitely intended to show a sequence for rebuilding the content.

I will not likely have any involvement in the further search for an exfil path, but if I hear anything that I can post I will.

Thanks again for the comments and keeping us questioning our assumptions.

 
Posted : 27/09/2016 12:44 pm