In a recent forensic case involving recovered deleted sms messages from an sms.db file on an IOS mobile device none of the mainstream mobile phone forensic software made the link between sender and recipient for the recovered records of interest.
I have been asked a few times recently about obtaining the third party of a deleted IOS SMS message that has been recovered by the Forensic Browser for SQLite. The procedure is simply to create a JOIN between two (or three – there are two ways of establishing the third party) tables and if the data is in the relevant tables then a link is made. Unfortunately when messages are deleted, particularly when entire conversations are deleted, the primary keys on the all-important tables are often overwritten making these joins impossible.
When the case mentioned was looked at with the Browser neither of the JOINS described above allowed the investigators to ascertain who the third party in a deleted conversation was. So I was asked to take a look at the DB further and I managed to find a rather interesting, to me, third rather obscure route to make the link. This was made possible by the new Structured Storage Manager in the Forensic Browser for SQLite, but the technique may be of interest to those who don’t use my software.
In this article I’ll describe the structure of the sms.db database and how the different tables relate to each other. I’ll then explain the two ‘normal’ joins and why they usually don’t work when trying to ID the third party for a deleted message and I’ll show the third route mentioned above which can often work where the first two don’t.
First I’ll provide an overview of the relationships between the tables. I am not going to discuss all elements and constraints of the different tables as this goes beyond the scope of this article, but I’ll cover in overview an example of a user deleting a conversation to explain what happens and how data can be recovered.
Firstly the relationships between the different tables:
There are four main tables:
|Contains an entry (row) for each chat (or conversation) between the ‘owner’ of the DB being examined and any of their contacts. Each chat is uniquely identified by the primary key of the table, the ROWID column. Both the chat identifier and the guid column contain the phone number or occasionally name of the third party to the conversation
|Contains the details of each of the messages
|Contains the third party details in the id and uncanonicalized id fields
|Contains a list of file attachment details.
So in summary the database contains a series of conversations (chat table) each chat will have one or more messages (message table) associated with it, each chat can have one or more participants (handle table) and each message can have zero or more attachments (attachment table).
Each major table is joined to other tables via an intermediate join table (e.g. chat handle join) this allows one chat to have multiple participants without having to duplicate entries in the chat table or the handle table.
Let’s take a look at a real, but anonymised, chat from my phone – chat number 109.
In the display below I have provided the SQL behind the queries that show the relevant data for chat 109, it can be seen that chat 109 is between me and one friend
You can also see (the dotted line) that there is a further connection between the message and handle table (message.handle_id -> handle.ROWID) that is not explicitly defined by the table schemas.
What happens when a conversation is deleted?
Intuitively if you deleted a conversation from a phone you would expect all of the associated messages to be deleted and if you deleted all of the associated messages in a conversation then you would expect the associated entry in the conversation (chat) table to be deleted.
The same can be said for the handle table. The default set up for the sms.db is for the handle entry to be deleted when all message entries that point to it are deleted (more on this in another article).
So in summary when a conversation is deleted in its entirety the relevant entries in the message table are deleted and if there are no further messages which involve this contact the relevant row in the handle table is deleted.
So what should be self-evident from the diagram above is that if we recover a deleted text in the message table then the associated entry in the handle table must either exist or be recovered for us to establish who the third party in the conversation is, or we need to follow the links back to the chat table (each of which must also exist or be recovered).
Identifying the third party the ‘standard way’
If there is a single additional participant in the conversation then the handle can be obtained from the handle_id in the messages table and a simple join used to show the other party:
If the entry in the handle table is not present or cannot be recovered then providing that the relevant records in BOTH the chat_message_join and chat table are recovered then a slightly more complex join can be made to retrieve the same information.
If there are multiple participants, then the chain of tables must be followed back around, i.e.: message -> chat_message_join -> chat -> chat_handle_join -> handle – in order to recover all participants. I’ll leave this query to you to work on – it’s straight forward and just the same as the previous query – just involving more tables.
There is however a problem when dealing with deleted records and this relates to an SQLite artefact called freeblocks.
Freeblocks are structures within an SQLite database page that track all blocks of unused space greater than 3 bytes in size. They do this by writing a 4 byte structure to the start of each block of free space that includes the size of the current block and a pointer to the next block. When a record is deleted from an SQLite table it usually results in a new block of free space comprising the deleted record of which the first four bytes are overwritten by this four byte structure. The screenshot below shows a page with two freeblocks (the unused space before the first record is not part of the linked list of freeblocks) the freeblock structure is underlined (a freeblock pointer of 0x0000 signifies the last freeblock):
A major complication has now arisen that will affect our success rate when trying to recover records.
The first two entries in an SQLite record (row) are the Payload length followed by the ROWID.
In most cases these two variable length integers take up less than 4 bytes of storage so when a record is deleted these two records are usually overwritten by the freeblock structure. As the ROWID is usually a records primary key then loss of this value usually means that recovered records in different tables cannot be related to each other.
Luckily when two adjacent records are deleted at the same time a freeblock will be created that encompasses both the records, but only the start of the first record will be overwritten. Unfortunately this does not help in our scenario because there would be only one chat record per conversation and normally only one other participant in the handle table in a conversation; and even if there were more than one other participant it is not likely that they will have consecutive handle_ids.
So what is the solution?
Happily conversations usually involve more than one message, normally lots, and they are very immediate in that a message often gets an immediate reply (message and reply are usually adjacent in the database). So we can hopefully recover lots of messages and hopefully some of them with intact (not overwritten) ROWID’s.
All we need to do is to link one of the recovered messages back to either the chat table or the handle table to establish who the third party participant was.
But before I show you how to do this I need to digress and talk about the Structured Storage Manager mentioned in the opening lines of this article. The Structured Storage Manager (SSM) can be used to break down structured storage (think binary plists, Facebook orca blobs etc.) into a table structure so that the Forensic Browser can query them and happily there is one such blob in the database that can help us.
The chat table contains a column called “properties” that often holds a binary plist.
Which when decoding using the Forensic Browser built in Binary Plist viewer it looks like this:
The interesting item here is the CKChatWatermarkMessageID integer which my testing shows is the ID of one of the messages associated with this chat. In fact what seems to happen is that this blob is updated regularly and the messageID (and following timestamp) represents the last message in a conversation.
We can use the SSM to break down the binary plist into a table. To do this we run the SSM from the Tools menu:
In the following dialog we identify the table and fieldname of the field holding the Binary Plist blob (properties) and we also identify the primary key for the table so we can relate the decoded data back to the original row. We also tell the SSM what table we want the new data stored in and what type of data it is (Binary Plist in this case):
Once we hit OK the new table is created and we can easily create a query that associates a chat identifier (phone number) with each message (CKChatWatermarkMessageID) from the decoded BPList:
The query above is a little complicated particularly when we need to use it in conjunction with another query so I want to simplify the above SQL by using a VIEW. This is best explained by example – I choose “Create view on current visual query” from the Queries menu:
I give this the name watermarks (it could of course be any meaningful name):
The newly created VIEW is a sort of virtual table so I can now run a new very much simpler query that exactly replaces the above:
The list of CKChatWatermarkMessageID and telephone numbers/contacts details above is from all recovered conversations, as well as from some conversations that are still live.
Finally I want to find any one (or more) of the rows from the above query that has a CKChatWatermarkMessageID that is *IN* a list of any of the ROWIDs (message IDs) for any message with handle_id = 108 i.e. our deleted conversation.
A query showing these ROWIDs and a few extra fields for context is below:
Note that that list of messages that we have recovered from chat 108 is not complete and also some of the recovered messages may have had their ROWID overwritten as previously described. But, all we need is to be lucky enough to find one matching recovered message to determine who the conversation was with.
The highlighted *IN* above gives a clue as to the SQL syntax. We can use the SQLite in expression to filter on a list of values that are in a specified list, the format is:
SELECT * from xxx where expression IN (value1, value2, .... value_n)
value1, value2… can be replaced by a SQL query that returns just one column (known as a correlated subquery):
There you have it – we have managed to associate a message ID from the deleted conversation with the decoded binary plist from a recovered chat record with an overwritten primary key and thus establish the third party for all of the messages in chat 108. We have also managed to do what the mainstream mobile forensics tools failed to do and we therefore reiterate that although they quite rightly have their place you should always back up anything they find, or fail to find, with another method.