First off, I'm very new in this area and English is not my first language.
I'm trying to reconstruct the version history of Office documents (especially Excel's .xls). I have two questions at the moment:
1. When a document is loaded, does Office create a temporary backup file before loading it into memory? The point of this question is to recover these temp files and see what changes have been made at what time.
2. There is a lot of information added to the beginning of the file. I notice that locations 0x464 and 0x46c in XLS files contain date/time information (but I still can't figure out what the timestamps are used for). Are there any other timestamps and/or any other information that can be used to reconstruct a history timeline? What are their byte offsets?
I have another question, though not exactly about MS Office. I've read that date/time is stored as a 64-bit value counting 100-nanosecond intervals from 1 Jan 1601 00:00:00 (the Windows FILETIME format). Is there any source code to convert this 64-bit value to a readable date/time format?
Cheers
Hi there,
In terms of temp files, I think it depends on the Excel version and the file type. For example, I just opened a .xls in Excel 2007 and no temp file was created; I then opened a .xlsx file and hey presto - a temp file. Office temp files start with a tilde (~) and have a .tmp extension. This page might help a little http//
As well as this from Microsoft (relates to Word though) http//
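If you want to check a folder for leftovers matching that naming pattern, here is a minimal Python sketch. It only assumes the tilde-prefix and .tmp-extension convention described above; the function name is made up for illustration, and other Office versions may use different names.

```python
from pathlib import Path

def find_office_temp_files(folder):
    """Return files under folder whose names start with '~' and end in '.tmp'."""
    hits = []
    # rglob walks the folder and all of its subfolders.
    for path in Path(folder).rglob("~*"):
        if path.is_file() and path.suffix.lower() == ".tmp":
            hits.append(path)
    return sorted(hits)
```

Calling `find_office_temp_files(...)` on the folder that holds the document, and on the user's temp folder, before and after opening the file should show whether anything new appeared.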
In terms of the timestamps and offsets, have you tried something like OLEDeconstruct from Sanderson Forensics? http//
This pulls metadata out, including timestamps. It has a couple of different views, including sector allocations, and might help in answering your questions. The only thing is that I thought metadata such as timestamps was stored at the end of the file, not the beginning. So what kind of information are you seeing at the beginning?
thanks ddewildt.
Please correct me if I'm wrong.
To my understanding, when a document is opened, MS Office opens the file and loads the whole content into memory, since it doesn't create a temp backup file for the XLS format.
MS Office only creates an auto-recovery file if the file being edited is not saved within the 10-minute interval.
Therefore, it wouldn't be possible to reconstruct a version history if the file is constantly saved before the 10-minute auto-save is reached.
——
The OLEDeconstruct program only pulls out the properties that are visible to the user. I believe I have read somewhere that MS Office adds a lot of unnecessary information to the metadata. This information can be accessed through the document properties in MS Office. However, it would still be useful if I could get the byte offsets.
I'll just throw this in, as it seems to have been overlooked:
Excel 97
Excel 2000
Excel "XP"
Excel 2003
already behave (slightly) differently when running and on "normal" .xls files.
Excel 2007 and the new .xlsx format behave differently again, so I presume that whatever comes out of this research will be version specific, or at least file format specific.
jaclaz
Couple of things -
Have a look at the MS OfficeVis Tool - parses Excel files -
http//
Fact sheet to go with that -
http//
I can't see that it translates the time for you, though.
FTK Imager has a date interpreter built in that will interpret FILETIME values.
DD has a tool that will decode dates and times -
http//
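If you would rather do the conversion yourself, here is a minimal Python sketch. It assumes the value is a standard Windows FILETIME, i.e. a 64-bit count of 100-nanosecond ticks since 1 Jan 1601 UTC, stored little-endian on disk. The function names are just for illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond ticks since 1601-01-01 00:00:00 UTC.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(ticks: int) -> datetime:
    """Convert a 64-bit FILETIME tick count to a UTC datetime."""
    # datetime resolves microseconds, so the last decimal digit is dropped.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)

def filetime_from_bytes(raw: bytes) -> datetime:
    """Decode 8 little-endian bytes as they appear in the file."""
    (ticks,) = struct.unpack("<Q", raw)
    return filetime_to_datetime(ticks)
```

To try it on the offsets mentioned in the original question, read the file into `data` and call `filetime_from_bytes(data[0x464:0x464 + 8])`. Whether those offsets really hold FILETIME values is not confirmed here, so treat an implausible date as a sign that they do not.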
If you want to see what is happening when you open an MS Office document, use the Sysinternals tool Process Monitor, if you are not already doing that.
If you want to find dates in FILETIME format it is quite easy: just search for the last two bytes of a FILETIME from around the time you know the document existed, e.g. c9 01 for 2008/9 or c8 01 for 2007/8 ish.
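The search can be scripted. Below is a rough Python sketch of the idea, assuming little-endian FILETIME values, so that the two marker bytes are the last two of the eight. Each value of that top byte pair covers about 326 days. The function name and the default marker are illustrative, and every hit is only a candidate, since the same two bytes can turn up in unrelated data.

```python
import struct
from datetime import datetime, timedelta, timezone

FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def find_filetime_candidates(data: bytes, high_bytes: bytes = b"\xc9\x01"):
    """Yield (offset, datetime) for each 8-byte window that ends in high_bytes."""
    pos = data.find(high_bytes)
    while pos != -1:
        start = pos - 6  # the marker occupies bytes 7 and 8 of the value
        if start >= 0:
            (ticks,) = struct.unpack_from("<Q", data, start)
            yield start, FILETIME_EPOCH + timedelta(microseconds=ticks // 10)
        pos = data.find(high_bytes, pos + 1)
```

Read the document into `data` with `open(path, "rb").read()` and loop over `find_filetime_candidates(data)`; pass `b"\xc8\x01"` as the second argument to look at the earlier period.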
H
Hi there,
Sorry about the delay in reply - I missed your post and only saw it now…
I'm not too sure what happens in memory when an xls file is opened. If you want to test it, maybe use something like OllyDbg to see the content of memory once the file is opened. Also, echoing Harry's recommendation of ProcMon, I'd also look at FileMon from Sysinternals to see what files are created when opening the docs. From that you might identify some more temp files (if there are any).
In terms of auto-saves I'm not too sure it will help with Excel, as auto-save does not run in Excel by default http//
But, as jaclaz rightly points out, different versions of Excel behave differently, and the above link only applies to Excel 2000 and earlier.
Not sure if any of this actually helps with your original questions!
Thank you for all your comments. They do help a lot ^^
I'd just like to clarify the Sysinternals tools, as I have seen several comments referring to Filemon.
Filemon and Regmon are legacy tools and were superseded in late 2006 by Process Monitor, which does the job of both as well as a number of other functions. From the web site -
"Process Monitor is an advanced monitoring tool for Windows that shows real-time file system, Registry and process/thread activity. It combines the features of two legacy Sysinternals utilities, Filemon and Regmon, and adds an extensive list of enhancements including rich and non-destructive filtering, comprehensive event properties such as session IDs and user names, reliable process information, full thread stacks with integrated symbol support for each operation, simultaneous logging to a file, and much more."
It is not to be confused with Process Explorer, another Sysinternals tool, which can be used as an alternative to Windows Task Manager -
"Find out what files, registry keys and other objects processes have open, which DLLs they have loaded, and more. This uniquely powerful utility will even show you who owns each process."
H