Howdy,
I have not been involved with many mobile forensics cases to date, so I apologize if what I'm saying is a bit n00bish. However, I recently heard about all the weird 7-bit SMS encodings that can be used for text messages, as documented here
http//
My question is under what circumstances do you encounter this kind of data?
i.e., do you find such data in physical acquisitions of phones or "logical" acquisitions? Which makes and models of phones? Will you find this data in RAM acquisitions of phones (still a fast-moving target, I know)? Which mobile OSes? Will you ever find this kind of data on a computer? Are there any tools for dealing with this kind of data?
Thanks very much,
Jon
GSM 7-bit encoding is found on a lot of phones, as it is a compact way to store text messages that use only the 128 characters of the GSM default alphabet (which overlaps heavily with 7-bit ASCII but is not identical to it). Each character takes 7 bits rather than 8.
All the phone tools that I've come across will translate 7-bit GSM automatically when you extract the data. In some cases, however, when you perform a physical extraction, a tool may not convert any of the data for one reason or another. In that case, if you know enough about the way that phones store data, you can locate the area where text messages are stored yourself and then have the tools translate any 7-bit GSM encoded text that you find there.
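
To make the space saving concrete, here is a minimal Python 3 sketch of the septet packing used for SMS text, where 7-bit values are packed into octets least significant bits first. The function names `pack_septets` and `unpack_septets` are made up for illustration and do not come from any tool mentioned in this thread. The sketch works on raw septet values only; mapping those values to characters is a separate step.

```python
def pack_septets(septets):
    """Pack 7-bit values into octets, least significant bits first."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial octet, zero padded
    return bytes(out)


def unpack_septets(data, count):
    """Recover `count` 7-bit values from packed octets."""
    acc = 0
    nbits = 0
    out = []
    for b in data:
        acc |= b << nbits
        nbits += 8
        while nbits >= 7 and len(out) < count:
            out.append(acc & 0x7F)
            acc >>= 7
            nbits -= 7
    return out


# For plain lowercase letters the GSM 7-bit values match ASCII,
# so ord() is good enough for this small demonstration.
septets = [ord(c) for c in "hello"]
packed = pack_septets(septets)
print(packed.hex())  # e8329bfd06 (5 characters in 5 octets; 8 would fit in 7)
```

The septet count has to be supplied when unpacking because the last octet may contain padding bits that would otherwise look like an extra character.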
Phone RAM would likely also hold this data, although I don't know if that would be any use to you since the same message would undoubtedly be on the device's less volatile memory.
I haven't come across any computers which have had 7bit GSM encoded data on them, although it may just be that I have come across them but haven't noticed because they're encoded.
Thanks for the quick reply, Joe. Very helpful. How do you locate this kind of data if it doesn't get extracted? By looking for SMS header information around it?
I'm considering adding support for the various 7-bit encodings to our
cheers,
Jon
Jon,
I recently created a Python 2.7 script which encodes and decodes GSM7 text. Do you want a copy? If so, send me a PM.
Thanks,
Chris
In fact I've decided to upload it to sourceforge - find it
There is the module itself (GSM_Codec) and an example of its use. I can send you the test binaries as well if you want.
Note that the module DOESN'T carve GSM7 text - instead you give it the stuff you know is encoded text and it decodes it.
It assumes you are going to use the GSM7_alphabet.txt file to create a simple list which lets you more easily translate the relevant characters.
It's not an efficient or an elegant method, but it works. If lightgrep can do something better, well, I think the forensic community will be better for it :)
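
For readers who have not seen the lookup-list approach before, the following is a rough Python 3 sketch of translating septet values into characters. It is not the GSM_Codec module and does not read GSM7_alphabet.txt; the names `GSM7_BASIC`, `GSM7_EXT` and `decode_septets` are invented for this example. It assumes the septets have already been unpacked from the packed octets, and it covers only the default alphabet and its basic extension table, not any national language tables.

```python
# GSM default alphabet: the string index is the septet value.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
assert len(GSM7_BASIC) == 128

# Characters reached by sending the escape septet (0x1B) first.
GSM7_EXT = {
    0x0A: "\f", 0x14: "^", 0x28: "{", 0x29: "}", 0x2F: "\\",
    0x3C: "[", 0x3D: "~", 0x3E: "]", 0x40: "|", 0x65: "€",
}
ESC = 0x1B


def decode_septets(septets):
    """Translate unpacked septet values into a Python string."""
    out = []
    it = iter(septets)
    for s in it:
        if s == ESC:
            nxt = next(it, None)
            if nxt is None:
                break  # dangling escape at the end of the data
            # Unknown escape sequences fall back to the basic table.
            out.append(GSM7_EXT.get(nxt, GSM7_BASIC[nxt & 0x7F]))
        else:
            out.append(GSM7_BASIC[s & 0x7F])
    return "".join(out)


print(decode_septets([0x68, 0x65, 0x6C, 0x6C, 0x6F]))  # hello
print(decode_septets([0x00, 0x1B, 0x65, 0x11]))        # @€_
```

Note how septet 0x00 is "@" and 0x11 is "_", which is where this alphabet parts ways with ASCII.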
Hi Chris,
Thanks for this. The good news is that a lot of what we need to change in lightgrep to support GSM 7-bit encodings is also needed to support searching base64, and we most definitely want to search for base64. However, I don't have a good sense for what performance will be like with GSM 7-bit support. Because it's not an octet-based encoding, a hit can start at any bit position within a byte rather than only on byte boundaries, so you have to consider a far greater set of potential starts of hits. It will definitely be slower than searching for UTF-16 or UTF-8; I'm just not sure how much slower.
Jon
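
To illustrate why a non-octet encoding multiplies the possible starts of a hit, here is a naive Python 3 sketch. It is purely illustrative and says nothing about how lightgrep does or will do this; the names `septets_to_int`, `packed_variants` and `find_packed` are hypothetical. The idea is that a keyword inside packed 7-bit data can begin at any of 8 bit offsets within a byte, so each keyword turns into 8 byte patterns, each with a mask for the partially used first and last bytes.

```python
def septets_to_int(septets):
    """Concatenate 7-bit values into one integer, first septet lowest."""
    value = 0
    for i, s in enumerate(septets):
        value |= (s & 0x7F) << (7 * i)
    return value


def packed_variants(septets):
    """Yield (shift, pattern, mask) for each of the 8 bit alignments."""
    kw_bits = 7 * len(septets)
    value = septets_to_int(septets)
    for shift in range(8):
        nbytes = (shift + kw_bits + 7) // 8
        pattern = (value << shift).to_bytes(nbytes, "little")
        mask = (((1 << kw_bits) - 1) << shift).to_bytes(nbytes, "little")
        yield shift, pattern, mask


def find_packed(haystack, septets):
    """Return (byte_offset, bit_shift) for every masked match."""
    hits = []
    for shift, pattern, mask in packed_variants(septets):
        n = len(pattern)
        for pos in range(len(haystack) - n + 1):
            window = haystack[pos:pos + n]
            if all((w & m) == p for w, m, p in zip(window, mask, pattern)):
                hits.append((pos, shift))
    return hits


# Lowercase letters and space share their values with ASCII, so ord() will do.
message = [ord(c) for c in "say hello"]
packed = septets_to_int(message).to_bytes((7 * len(message) + 7) // 8, "little")
print(find_packed(packed, [ord(c) for c in "hello"]))
```

In this example "hello" starts 4 septets (28 bits) into the message, which is byte 3 at bit offset 4, so the hit list should include `(3, 4)`. A real search engine would compile the 8 variants into its automaton instead of scanning 8 times, but the extra states are where the slowdown comes from.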
Hi Jon,
Yeah, it is a thorny problem - good luck in getting it to work! Let me know if there's any way I can help :)
Thanks,
Chris