As “the Cloud” (a varied mix of internet-based services ranging from web-based email accounts and online storage to services that synchronise data across multiple computers) becomes more relevant, and the dominance of the PC or tablet as the exclusive “home” for data declines, the days when simply taking a snapshot of a computer captured all available data have gone.
For a number of years Google have offered online tools for creating, editing and publishing a range of files, including Word and Excel documents. More recently this service has been linked to Google Drive, which offers more functionality and also allows users to synchronise data across various devices. An individual with a Google Drive account is allocated 5GB of free storage and can obtain further storage at a cost.
Google Drive supports many of the file types and formats we work with every day, including docx, xlsx and pptx. However, any user who creates a new file (for example, a word processing file) via the Google Drive website will by default generate a file with an extension of .gdoc (.gslides for a presentation and .gsheet for a spreadsheet). These are Google-formatted files and are synchronised across all devices running Google Drive.
Running Google Drive on a standard Windows PC creates, by default, a folder within the user profile. On a Windows 7 PC the Google Drive data is stored in the following path: C:\Users\USERNAME\Google Drive. Google Drive deals quite robustly with maintaining and synchronising the data as changes are made. A successful synchronisation is indicated by a small green tick and an out-of-date synchronisation by blue arrows, as shown below.
Because the gdoc file is created on the PC during the synchronisation process, it generates and maintains its own metadata on the PC. By right-clicking on the gdoc file and viewing its properties you see the usual dates and times (created, modified and accessed).
As you would expect, the created date reflects the time the file was first created and successfully synchronised on the PC.
The important point to note is that if a file is created via the Google website at 1200 on 1st January 2013, but the PC with Google Drive installed is not connected to the internet until 1630 on 10th January 2013, then the creation date of the gdoc file on the PC will show 1630 on 10th January 2013, because this was when the PC synchronised with the Google Drive website. The modified and accessed dates update as you would expect, with the same limitations associated with the created date.
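To make the point concrete, the local dates described above can be read programmatically. The following is a minimal sketch, assuming Python is available on the examination machine; the function name `local_timestamps` is purely illustrative. Whatever it reports is only the PC's view of the file, that is, when the pointer file arrived or changed locally, not when the document was authored on Google's servers.

```python
import os
from datetime import datetime, timezone


def local_timestamps(path):
    """Return the local file-system dates for one file as UTC datetimes."""
    st = os.stat(path)
    # st_birthtime is the creation time where the platform exposes it.
    # Where it is missing we fall back to st_ctime, which is the creation
    # time on older Python versions on Windows but the inode change time
    # on Linux, so "created" should be read with the platform in mind.
    created = st.st_birthtime if hasattr(st, "st_birthtime") else st.st_ctime

    def to_utc(timestamp):
        return datetime.fromtimestamp(timestamp, tz=timezone.utc)

    return {
        "created": to_utc(created),
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
    }
```

Calling `local_timestamps` on a gdoc file inside the Google Drive folder returns the same three dates that the Windows properties dialog displays, normalised to UTC so they can be compared with other sources.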
The valuable metadata is stored on the Google Drive servers; however, this presents us with a challenge:
- how do we gain access to the account?
- and how do we get the metadata out?
One important piece of metadata maintained by Google is the revision history.
The revision history is a cross between “track changes” and a backup solution: Google “snapshots” the data when changes are made, which permits users to jump back, at the click of a button, to any version of the file as it stood before those changes.
This means that it is possible to see what a document looked like several days ago, even after a number of changes to the content have since been made. This is fantastic information; however, it is not readily available to download or capture in an offline form.
Instead, this data can only be captured by communicating with Google Drive through its own API. This is a tricky challenge; nevertheless, with the appropriate programming skills the revision history data can be captured.
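As a rough illustration of what such a capture might look like, the sketch below pages through the revision list for one file. It is not the method used in our own work. It assumes the current Drive API v3 accessed through the google-api-python-client library, and that `service` is an already authorised client object; the function name `list_revisions` and the choice of fields are illustrative only.

```python
def list_revisions(service, file_id):
    """Collect every revision record for one file, following pagination.

    Assumption: `service` is an authorised Drive API v3 client built with
    google-api-python-client. Obtaining credentials is out of scope here.
    """
    revisions = []
    page_token = None
    while True:
        response = service.revisions().list(
            fileId=file_id,
            pageToken=page_token,
            # Request only what we need: an id, a timestamp and an author.
            fields="nextPageToken, "
                   "revisions(id, modifiedTime, lastModifyingUser(displayName))",
        ).execute()
        revisions.extend(response.get("revisions", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            # No further pages, so the list is complete.
            return revisions
```

With that library, a client is normally created with `build("drive", "v3", credentials=creds)` from `googleapiclient.discovery`, where `creds` comes from whichever authorisation flow is appropriate for the account holder's consent. Each returned record can then be written out alongside the local timestamps to build a timeline of the document.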
If we take a deeper look at the gdoc file which is created on a PC, we notice it is tiny, only 1KB in size. This is because the content of the actual file is not stored on your PC. The gdoc file is nothing more than a pointer to the data on the Google Drive server.
If we look inside the gdoc file, we find a URL, which is a unique reference to the data on the Google Drive systems; only those with appropriate account credentials can view the data. This is also true of gslides and gsheet files.
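The pointer can be pulled out of such a file with very little code. The sketch below is a minimal example that makes no assumption about the internal layout of the file beyond the fact that it contains a URL somewhere in its text; the names `URL_PATTERN` and `pointer_urls` are illustrative.

```python
import re
from pathlib import Path

# Match http(s) URLs, stopping at whitespace, quotes or angle brackets.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def pointer_urls(path):
    """Return every URL found in a Google pointer file (.gdoc, .gsheet, .gslides)."""
    # The file is tiny, so reading it whole is fine; undecodable bytes are
    # replaced rather than raising, since the exact encoding is not assumed.
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return URL_PATTERN.findall(text)
```

Recording these URLs during an examination gives a list of the server-side documents that the PC points to, which can then be requested from the account holder or captured via the API.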
There are two important considerations when dealing with Google-formatted data, including documents, spreadsheets and presentations:
- First, forensically imaging PCs with Google Drive installed and Google-formatted files stored on the PC is an incomplete exercise because, although the PC holds pointers to data held on the Google server, it does not hold the actual data.
- Second, there is a huge amount of valuable information stored on Google Drive about files, in particular the revision history. Where Google Drive is in use, efforts should be made to harvest this data with a view to building, if necessary, a more detailed picture of the evolution of the file.
For clarity, I should add that files in a non-Google format that are stored in a user’s Google Drive are synchronised and stored in full on users’ PCs: they do not adopt the same pointer system that is utilised by Google-formatted files.
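In practice this distinction can be used to triage a Google Drive folder found on an image. The following sketch, with illustrative names throughout, walks a folder and separates pointer files, whose content lives only on Google's servers, from files that are held in full on the PC. It relies solely on the three extensions mentioned earlier.

```python
from pathlib import Path

# Extensions of Google-formatted pointer files, as described in this post.
POINTER_EXTENSIONS = {".gdoc", ".gsheet", ".gslides"}


def split_drive_folder(root):
    """Split the files under `root` into (pointer files, locally held files)."""
    pointers, full_copies = [], []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue  # skip directories
        if path.suffix.lower() in POINTER_EXTENSIONS:
            pointers.append(path)  # content is NOT on this PC
        else:
            full_copies.append(path)  # content was synchronised in full
    return pointers, full_copies
```

For example, `split_drive_folder(r"C:\Users\USERNAME\Google Drive")` would return two lists; every entry in the first list marks a document whose content and revision history still need to be obtained from Google Drive itself.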
Keep an eye on our blog page for future posts relating to this topic.