————————————————————
Due to me not being able to reformat our thesis in a good way I strongly suggest you look at the whole paper in PDF format here: http://hh.diva-portal.org/smash/get/diva2:635743/FULLTEXT02.pdf
/Philip
————————————————————
Forensic analysis of the ESE database in Internet Explorer 10
Bachelor thesis
June 2013
Authors: Bonnie Malmström & Philip Teveldal
School of Information Science, Computer and Electrical Engineering
Halmstad University
Preface
This project started out as a collaboration with the Swedish Tax Agency (SKV)
in Gothenburg. Due to time issues, they are not able to acquire images of drives
in many of their investigations and are thus forced to gather as much data as
possible using live forensics. They presented us with a problem they encounter
while doing live forensics on various systems; the browser artifacts are often
difficult to acquire due to outdated software or time-frame problems. In early
draft versions our project goal was therefore to create a script for EnCase, using
EnScript, which would be able to parse web artifacts from the latest versions of
the browsers Internet Explorer, Firefox, Chrome, Safari and Opera – and
present this in an easily-readable format. However, as the project developed we
were facing the many changes present in the newly released Internet Explorer
10 (i.e., the change of database from index.dat to WebCacheV01.dat). We could
only find very sparse information about the new database in IE10, and the
project evolved into mainly targeting the forensic aspects of Internet Explorer
10 and the behavior of WebCacheV01.dat.
Acknowledgements
We are very thankful to everyone who supported us in our work by providing
their ideas, criticism and time.
We would like to thank our supervisor Mattias Weckstén for providing us with
great guidance and pointing us in the right directions.
Further, we would also like to thank Anders Lager at the Swedish Tax Agency
who helped us come up with the initial idea to this project, even though it later
evolved into something pretty different.
Finally, we would like to thank Howard Chivers for letting us experiment with
his program wdsCarve, and for taking the time to help us when we got stuck.
Abstract
With Internet Explorer 10, Microsoft changed the way of storing web related
information. Instead of the old index.dat files, Internet Explorer 10 uses an ESE
database called WebCacheV01.dat to maintain its web cache, history and
cookies. This database contains a wealth of information that can be of great
interest to a forensic investigator. This thesis explores the structure of the new
database, what information it contains, how it behaves in different situations,
and also shows that it is possible to recover deleted database records – even
when the InPrivate browsing mode has been used.
1 Introduction
Today, computers are a big part of many people’s lives. Many times they are
connected to the Internet and we use them to play games, find information and
communicate with others – among many other things. It is likely that most of
the time spent on the Internet is while interacting with a web browser. The
browser is the program we use to access, view and communicate with web sites
and other documents and files stored on web servers. Every visited page, every
bookmark and every viewed document can leave traces on the user’s system,
and this is why web history analysis has become such an important part of a
computer forensic investigation.
1.1 Motivation
An increasing number of both criminal and civil cases rely heavily on digital
evidence and Internet activity. The ability to examine a criminal’s browsing
history is often critical not only in high-profile criminal cases, but also in
minor fraud cases. Web browser artifacts can help find
offenses ranging from corporate policy violations, committed by employees of
the company, to more serious crimes like child pornography or hacking related
offenses. Even if the investigated crime itself isn’t a literal computer crime (i.e.,
the computer has not been used in the commission of the crime), the suspect
may still have used a web browser to search for information related to the
crime. By retrieving the browser history, cookies, cache and downloaded files,
it is possible to determine the suspect’s online activity.
In the 2001 book Computer Forensics: Incident Response Essentials, authors
Kruse and Heiser define computer forensics as “the science of acquiring,
retrieving, and presenting data that has been processed electronically and
stored on computer media” [1]. There are programs out there that can lift and
present information from Internet Explorer 10 (e.g., Internet Evidence Finder
[2]), but in order to be a good forensic investigator, one also needs to
understand why artifacts exist, where they are, and how they got there. That is
why the focus of this thesis is on the structure of the database and its value
in a computer forensic investigation.
1.2 Problem Description
With the launch of Windows 8 in October, 2012, the general public was
introduced to Internet Explorer 10. From a forensic perspective, the most
important change is that the previously used index.dat files are now replaced
with an ESE (also known as JET Blue) database, named WebCacheV01.dat. This
renders most of the previous Internet history grabbers obsolete when dealing
with Internet Explorer 10. As IE10 was released for Windows 7 in February
2013, it is safe to assume forensic investigators will now come across more
systems running IE10, especially as Windows 7 is the most used OS as of April,
2013 [3, 4]. The main problem is that there is a lack of information about this
specific ESE database, and that makes it hard to know its real value in an
investigation. To get a better understanding of its value as a forensic artifact, it
needs to be examined. To help us focus our research and get a better understanding of what has to be examined, we have defined four research questions.
- What is the basic structure of the database?
- What information is stored in the database?
- Is any data from an InPrivate browsing session stored in the database?
- Is it possible to recover deleted records from the database?
1.3 Related Work
We have yet to find any papers on the specific WebCacheV01.dat database, but
there are some reports on other ESE databases that have proven to be good
resources.
For our experiments on the structure of the database, we used a paper written
by Joachim Metz, “Extensible Storage Engine (ESE) Database File (EDB) format
specification” [5], as a reference. By spending a lot of time reverse engineering
the ESE file format, Metz has figured out a great deal about the format.
Howard Chivers and Christopher Hargreaves have published a paper on how to
recover data from a different ESE database – the Windows Search Database.
Their paper, “Forensic data recovery from the Windows Search Database” [6],
has been invaluable for us, as it led us to Chivers, who kindly let us use his
program wdsCarve to experiment with the recovery of deleted database
records.
We have also looked into the report “Forensic examination of Windows Live
Messenger 2009 Extensible Storage Engine” [7] by Wouter van Dongen, Willem
Toorop and Joeri Blokhuis. It proved to be a good resource on how to analyze
the structure and behavior of an ESE database.
1.4 Thesis Outline
This thesis has been divided into chapters which are organized as follows.
Chapter 2 presents some background information regarding the ESE file
format and web browser caching.
Chapter 3 presents the method used to examine the database.
Chapter 4 presents a brief overview of the ESE file format followed by an
in-depth part about ESE’s log and cache functions with focus on Internet
Explorer 10.
Chapter 5 presents the conducted experiments.
Chapter 6 presents the results and a discussion of the experiments.
Chapter 7 presents the conclusion of the thesis; what has been done and what
has been achieved.
2 Technical background
To better understand why and how things happen in the ESE database, some
technical knowledge is needed about the basic function of ESE and how it
performs its cache operations.
2.1 Extensible Storage Engine
The Extensible Storage Engine, or simply ESE, is a highly advanced indexed and
sequential access method (ISAM) from Microsoft. It is very versatile when it
comes to handling different data sizes, ranging from very small to very large (1
MB – 1 TB). ESE uses a crash recovery system to make sure data can be
consistent even in the event of a system crash. Its advanced caching system
makes sure ESE has a consistent high performance when accessing data. The
software itself is very “lightweight” making it ideal for running in auxiliary
roles.
The primary role for ESE is to be used where the need for fast and light data
storage is of importance. Apart from being used as the main storage of web
history in Internet Explorer 10, it is also used by applications such as Windows
Mail, Windows Desktop Search, Microsoft Exchange Server, Active Directory
and Windows Live Messenger, among many others.
ESE was first introduced in Windows 2000 and was formerly known as JET
Blue. The term JET (Joint Engine Technology), however, can also refer to a
different API, called JET Red, which is very different from JET Blue. [8]
ESE is implemented as a single user-mode DLL file (esent.dll). This binary
allows the user to make advanced queries to the database and is quite
powerful. [9]
The unit of storage in the ESE database is called a page. The current version in
Windows 7 uses a 32 kB page size. All database records are stored within
different pages and, with the exception of “long” records, every reference and
record needs to fit within a single page. The ESE database is structured
as a B-tree. One could imagine the structure of the B-tree as a
flipped tree, i.e., the stem and root of the tree are at the top and the branches
reach downwards. A simplification of the idea can be seen in figure 1.
Figure 1: Simplified overview of a B-tree
The B is usually considered to stand for “balanced”, and this refers to the fact
that the length of the path from the root to every database entry is the same.
This means that finding any entry in the database takes the same number
of decisions. The “balancing” that takes place within the ESE database
consists of constantly moving around records and entries. The important
aspect is that the database does not overwrite the space marked for deletion
when a record is moved. Because of this there is a high
probability that older copies of records are still present in unallocated
space, as long as they have not yet been overwritten by another record. [6]
2.2 Web browser caching
Many web sites contain the same elements on most of their pages, for example
the favicon, images, CSS, and so on. The browser cache exists because someone
once came up with the idea that it is faster to open these files from your hard
disk than to download them from the Internet. So instead of downloading the
files over and over, the browser downloads them once and stores them on the
hard disk.
The cache is used to improve how fast data is loaded while browsing. Most
times, when a web page is accessed, it is downloaded to the browser’s cache on
the hard drive. The next time that page is accessed, and has not been modified, the browser will instead open it using the files stored in the cache. Deleting the
cache from within the browser will force it to download all files again and
rebuild the site with fresh content.
All this downloaded content can help build a map over a user’s browsing
history and online habits.
3 Methodology
Because of the general lack of information regarding this specific subject we
have chosen to use an empirical research method [21]. We have performed
experiments based on our defined research questions and this method allows
us to draw from previous knowledge when we look at results and conclusions.
3.1 Tools Used
In order to successfully analyze the WebCacheV01.dat database, we have used a
variety of tools and programs. Here follows a presentation of each program and
also our motivation as to why we chose to work with these specific programs.
VMware Workstation v9.0.2
VMware Workstation [11] is a virtualization software that enables the user to
set up one or more virtual machines and use them on the actual machine. There
are many similar programs (for example VirtualBox, Xen and Kernel-based
Virtual Machine (KVM)), but we chose VMware since we have previously
worked with their software and felt comfortable using it again as it has the
functionality we require.
ShadowCopy v2.02
ShadowCopy [12] is a program from Runtime that lets you copy any file even if
it is locked by Windows. This program was recommended on some of the blogs
we read about acquiring the locked ESE database.
ESEDatabaseView v1.07
ESEDatabaseView [13] is a program by Nirsoft built to access ESE databases.
We used it to get an overview of the database and to verify what data had been
stored in our experiments. We also tried another program (Woanware’s
EseDbViewer), but felt that ESEDatabaseView was easier to use.
WinHex 17.0
WinHex [14] from X-Ways is a hexadecimal file editor, used to inspect and edit
files. We used it to look at the structure of the WebCacheV01.dat database. Our
choice of WinHex is nothing more than personal preference; we tried
other editors as well and they worked fine.
wdsCarve v1.13
This program [15], created by Howard Chivers, is a forensic tool used to inspect
and carve the contents of an ESE database. It is available from the author for
forensic examiners and researchers.
4 WebCacheV01.dat
As most forensic researchers know, IE used to keep track of cached files on the
system in index files called index.dat. The new ESE database introduced in IE10
is still just an index (i.e., it points to cached files on the system, but doesn’t
contain the actual files). So why change what is already working?
After email contact with Eric Lawrence, a former Microsoft employee working
with the construction of IE10, we learned that the old index.dat files, which
were used in Internet Explorer 1 through 9 to cache entries, were cross-process
memory-mapped index files. These index files were designed for optimal
performance for the most common computers of the early-mid 1990s. For
instance, the data structure that was used in the index file was designed to fit
on the on-chip cache of a 486 processor. Since then, processors have grown
much more powerful (e.g., larger caches and faster clock). Because of this, the
old cache index code was no longer very efficient, especially compared to
operations that proper databases are good at, like running multi-condition
queries. The decision to move the old cache index to a proper database helped
simplify the code, improved performance and enhanced both durability and
reliability of the caching process.
4.1 The Internet Explorer 10 WebCache directory
Internet Explorer 10 has its main storage of database files in the following
directory:
%systemdrive%\Users\%user%\AppData\Local\Microsoft\Windows\WebCache
Inside the folder are a number of files that work together in different ways; see
table 1.
Table 1: Overview of files in the WebCache directory
Note: V01 is the base used for all files in the WebCache directory in the current
releases of Windows 7 and 8. There have also been reports on V16 and V24 but
they seem to belong to old beta versions of Windows 8.
4.2 ESE logging explained
The very first time we acquired a copy of the database we noticed
that it did not always update properly after the browser was shut
down. In order to better understand this we decided to look into the caching
process of ESE.
Transaction log files contain all the different database operations before they
are written to the database file. They are used to bring the database up to date
if the system crashes or if any process relevant to the database operations is
terminated. The .log files are recognized by Windows as text files but
are actually written in binary format. If the log files need to be used for a
database recovery the restoration process is called a soft recovery, as opposed
to a hard recovery which is done when the log files are missing. The log files are
of a fixed size, where the size is determined by a pre-configured value called
JET_paramLogFileSize. When the log file is “filled” it gets renamed to
<base><generation>.log (e.g., V01#####.log) and a new log file is created for
storage.
Reserved transaction log files are created when critical operations need to be
saved for the database to get a clean shutdown. The reserved transaction log
files are mainly a safety net for the database in the event it would, for example,
run out of disk space and operations can no longer be written to disk. In order
to still pull off a clean shutdown of the database the most critical operations get
written to these log files in anticipation of critical errors. In most cases these
files do not contain any spectacular information but mainly critical operations
needed for the database to achieve a clean shutdown state.
The checkpoint files are used to store different log file sequences. Data is first
written to the log files and then cached to memory, and only at a later point
does the data get flushed from the log files to the actual ESE database. This is mainly
for performance reasons but might have a large impact on how the log files
should be handled from a forensic point of view. In the case of
WebCacheV01.dat, data gets written to the database only when the system is
shut down using a clean shutdown method. This means that if the system gets
abruptly halted, crashes or is left running for a longer period of time, the
browser history is largely not found inside the WebCacheV01.dat database file,
but is instead located in the log files. [16, 17]
4.3 ESE database cache in-depth
As we learned from the previous section (4.2), the ESE database uses many
different operations before writing data to the actual database file on disk.
However there is even more caching handled in memory before it gets written
to the .log files. The process of the RAM caching done by ESE databases is the
following:
When the ESE database receives its first operation it promptly stores this in a
log buffer. These log buffers are used as a storage container in RAM for the data
prior to the exchange to .log files on the disk. The default size of a log buffer
is the same as a disk sector, i.e., 512 bytes. The minimum number of log
buffers is 128 sectors and the maximum is 10240 sectors (approximately
5.2 MB).
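The buffer limits quoted above can be checked with a few lines of arithmetic. The following is only a small sketch: the function name log_buffer_bytes is our own, and the only inputs are the sector size and sector counts stated in the text.

```python
SECTOR_SIZE = 512  # bytes per log buffer, as stated in the text


def log_buffer_bytes(sectors: int, sector_size: int = SECTOR_SIZE) -> int:
    """Return the total RAM taken by the given number of sector-sized buffers."""
    return sectors * sector_size


minimum = log_buffer_bytes(128)    # 65536 bytes, i.e. 64 kB
maximum = log_buffer_bytes(10240)  # 5242880 bytes, i.e. roughly 5.2 MB
print(minimum, maximum)
```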
As the log buffers reach maximum capacity, the data needs to be moved from
RAM to disk and into the log files. This mission is carried out by the log writer.
Each operation gets written to the disk from memory in a synchronous fashion
and is carried out very swiftly since it is of grave importance that data gets
moved from RAM to disk if a system failure were to happen.
In order to turn the operations stored in RAM into actual data on disk the log
writer uses IS buffers. The IS buffers are each 4 KB in size and grouped together
by the ESE inside RAM. The IS buffers are used to yet again cache the data
before it is written to disk. Depending on the OS used, the IS buffers used by
ESE can reach different sizes, for example the Exchange 2000 Server can have
its IS buffers reach a size of 900 MB.
When the IS buffers are done caching, the lazy writer has the final task of
writing the data to the log files contained on disk. Since the amount of pages to
be written can be vast, the lazy writer is tasked with prioritizing them and
moving them to disk in such a fashion that the disk I/O system doesn’t get
flooded. When the lazy writer is finished, the data is static on disk and located
in the .log files. [18]
An interesting aspect to take note of here is that many forensic examiners who
are faced with a system running Windows 7 would probably follow “protocol”
and shut down the system by pulling the power cord instead of doing a clean
shutdown as you would with a system running server applications. This could
pose a problem since you would end up with data from the many ESE databases
in Windows in log files and RAM. However, the risk of losing important data is
very small due to the crash recovery system built into ESE. [19]
4.4 Using esentutl to recover an ESE database
Esentutl.exe is a command-line tool built into Windows that we have used a lot
in the work of this thesis. It provides database utilities for ESE and can, among
other things, be used to view metadata or recover an ESE database to a clean
shutdown mode. Esentutl is located in the following folder:
%systemdrive%\Windows\System32
To check which state the database is in, we use esentutl with the /mh switch.
This outputs the header information from the database in an easy to read
format, as seen in figure 2.
>esentutl /mh WebCacheV01.dat
If the state is dirty (which is usually the case), we want to recover the database
to a clean state by flushing the log files to the database. This is done using the /r
switch, the base of the log file (V01) and the /d option.
>esentutl /r V01 /d
The /d is to make sure esentutl uses the log files in the current directory
instead of searching through the log files for the path to the original log files. In
order to successfully flush the log files to the database it may in some cases also
be necessary to remove the checkpoint file. This makes sure every log file goes
into the database, and not only the ones the checkpoint file believes are missing.
To confirm the database is now in a clean state we use the /mh switch again.
Figure 2: Example of the output generated by the /mh switch
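The check-and-recover procedure described above could be scripted roughly as follows. This is a sketch under stated assumptions, not a tested forensic tool: the function names are our own, it assumes esentutl is on the PATH of a Windows system, and it assumes the /mh output contains a line of the form "State: <value>". It should only ever be run against a working copy of the WebCache files, never the original evidence.

```python
import subprocess
from typing import Optional


def parse_state(mh_output: str) -> Optional[str]:
    """Return the text after 'State:' in esentutl /mh output, or None."""
    for line in mh_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "State":
            return value.strip()
    return None


def database_state(db_path: str) -> Optional[str]:
    """Run 'esentutl /mh <db_path>' (Windows only) and return the state."""
    result = subprocess.run(
        ["esentutl", "/mh", db_path],
        capture_output=True, text=True, check=False,
    )
    return parse_state(result.stdout)


def recover(base: str = "V01", working_dir: str = ".") -> int:
    """Run 'esentutl /r <base> /d' inside working_dir; return the exit code."""
    completed = subprocess.run(["esentutl", "/r", base, "/d"], cwd=working_dir)
    return completed.returncode
```

A typical sequence would be to call database_state on the copied WebCacheV01.dat, call recover in the directory holding the copied log files if the state is reported as dirty, and then call database_state again to confirm the result.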
4.5 Looking at WebCacheV01.dat through a hex editor
Examining the WebCacheV01.dat database in WinHex is a huge task, as a
seemingly empty database may consist of many thousands of pages. There is a vast
number of timestamps and entries, and in this section we will try to cover the most
basic entries that may be of value for a forensic examiner. Figure 3
shows the database header in hex view.
The ESE database stores its data in little-endian byte order, i.e., with the
least significant byte first. This is important to keep in mind when reading
values from the hex editor, since the data might be displayed differently from how the
database itself reads its data.
In the file header of the database we find that the first 4 bytes are an XOR checksum.
[5]
The 4 bytes following the checksum are a file signature. The file signature has
offset 4, and the value is EF CD AB 89. This is of significance in a forensic data
mining operation where you might want to search in unallocated space for an ESE
database. Keep in mind though that this signature is common to all ESE databases,
not only WebCacheV01.dat. There is also a high possibility that the database is
fragmented; however, a signature hit gives a clear indication that there are fragments of a
database in unallocated space that may contain evidence.
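As an illustration of such a signature search, the sketch below checks a header for the signature and scans a block of raw data for candidate header positions. The function names are our own and the scan is deliberately naive: it reads everything into memory and makes no attempt to validate the checksum or to handle fragmentation.

```python
from typing import List

ESE_SIGNATURE = bytes.fromhex("EFCDAB89")  # value given in the text
SIGNATURE_OFFSET = 4                       # the signature follows the checksum


def has_ese_signature(header: bytes) -> bool:
    """Return True if the bytes at offset 4 match the ESE file signature."""
    return header[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 4] == ESE_SIGNATURE


def find_signature_hits(data: bytes) -> List[int]:
    """Return offsets at which a candidate ESE file header would start."""
    hits = []
    position = data.find(ESE_SIGNATURE)
    while position != -1:
        if position >= SIGNATURE_OFFSET:
            # The header begins 4 bytes before the signature itself.
            hits.append(position - SIGNATURE_OFFSET)
        position = data.find(ESE_SIGNATURE, position + 1)
    return hits
```

Every hit is only a candidate, since the same 4 bytes can occur by chance and the signature is shared by all ESE databases.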
At offset 24 to 51 we find the database signature. At offset 4 inside the database
signature we find an 8-byte entry consisting of the creation date and time for
the entire ESE database. At offset 0 of this entry we find seconds, and the consecutive bytes
that follow are minutes, hours, day, month and year. The last 2 bytes are
filler bytes. The byte at offset 5 represents the year, where 0 represents the year 1900.
Taking our database’s timestamp as an example (see figure 3), we have the byte 71 at
offset 5. 71 converted from hexadecimal to decimal is 113; with the base year of
1900 we simply add 113 and arrive at the year 2013.
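The decoding described above can be sketched as follows. The names LogTime and parse_creation_time are our own, and the offsets are the ones given in the text. The month byte is returned as stored, since the text does not say how it is numbered.

```python
from typing import NamedTuple

DB_SIGNATURE_OFFSET = 24  # start of the database signature (from the text)
CREATION_TIME_OFFSET = 4  # offset of the timestamp inside the signature


class LogTime(NamedTuple):
    seconds: int
    minutes: int
    hours: int
    day: int
    month: int  # raw byte value, not interpreted further here
    year: int


def parse_creation_time(header: bytes) -> LogTime:
    """Decode the 8-byte creation time stored in the database signature."""
    start = DB_SIGNATURE_OFFSET + CREATION_TIME_OFFSET
    raw = header[start:start + 8]
    if len(raw) < 8:
        raise ValueError("header is too short to contain the creation time")
    # Six one-byte fields; the last two bytes of the entry are filler.
    seconds, minutes, hours, day, month, year = raw[:6]
    return LogTime(seconds, minutes, hours, day, month, 1900 + year)
```

With the year byte 71 (hexadecimal) from our database, the function returns the year 2013, matching the manual calculation.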
At offset 52 we find the database state. The most common values to see are 2 and 3,
the dirty and clean shutdown states respectively.
At offset 236 we find the page size entry that consists of 4 bytes. The hex in the
entry is 00 80 00 00. Since the ESE database uses little-endian we read this as 00 00
80 00. When 8000 is converted from hexadecimal to decimal we get 32 768, giving
us a 32 kB page size for the WebCacheV01.dat database. This means the pages start
at offset 0, 32768, 65536, 98304 and so on (i.e., they increase in steps of 32 kB).
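The little-endian reading of the state and page size fields can be reproduced with a short sketch. The function names are our own, the header used in the demonstration is synthetic, and reading the state as a 4-byte value is an assumption on our part, as the text only gives its offset.

```python
import struct
from typing import Dict, List

STATE_OFFSET = 52       # database state (from the text)
PAGE_SIZE_OFFSET = 236  # page size (from the text)
STATE_NAMES = {2: "dirty shutdown", 3: "clean shutdown"}


def read_u32_le(buf: bytes, offset: int) -> int:
    """Read an unsigned 32-bit little-endian integer at the given offset."""
    return struct.unpack_from("<I", buf, offset)[0]


def describe_header(header: bytes) -> Dict[str, object]:
    """Return the database state and page size found in a file header."""
    state = read_u32_le(header, STATE_OFFSET)  # 4-byte width is an assumption
    return {
        "state": STATE_NAMES.get(state, "other (%d)" % state),
        "page_size": read_u32_le(header, PAGE_SIZE_OFFSET),
    }


def page_offsets(page_size: int, count: int) -> List[int]:
    """Return the file offsets at which the first `count` pages start."""
    return [index * page_size for index in range(count)]


# Synthetic header carrying only the two fields discussed above.
header = bytearray(240)
header[STATE_OFFSET] = 2
header[PAGE_SIZE_OFFSET:PAGE_SIZE_OFFSET + 4] = bytes.fromhex("00800000")
info = describe_header(bytes(header))
print(info["state"], info["page_size"], page_offsets(info["page_size"], 4))
```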
Figure 3: Hex view of the database header
5 Experiments
Using the software ESEDatabaseView a basic examination of the database was
performed in experiment 1. The goal was to present both the program
ESEDatabaseView and the structure of the WebCacheV01.dat database with its
containers and source paths.
In order to explore the possibility of acquiring deleted records as well as
possible data stored from a session of InPrivate browsing the second
experiment was performed. We based this experiment upon the work of
Chivers and Hargreaves “Forensic data recovery from the Windows Search
Database” [6] and the work “Forensic examination of Windows Live Messenger
2009 Extensible Storage Engine” [7] by van Dongen et al., but with the ESE
database of Internet Explorer 10 in mind. With guidance from Howard Chivers,
and the use of his software wdsCarve, a series of exploratory attempts were
made.
5.1 Preparing the lab environment
The experiments on the WebCacheV01.dat database have been conducted on a
virtual machine with the following specifications:
Windows 7 Professional x64, Service Pack 1
2 GB RAM
1 CPU, 4 cores
60 GB HDD
After installing the OS we proceeded with updating the system using the built in
Windows Update. We made sure all available updates, including Internet
Explorer 10, were installed, and then created a snapshot of the completely up to
date system in VMware Workstation. After every experiment we used this
snapshot to revert the machine back to a point where the browser had never
been launched.
Note: When installing IE10 on a Windows 7 system a new registry key is created.
It is located under “Software/Microsoft/Internet Explorer” in the HKCU hive, right
under the key TypedURLs, and is called TypedURLsTime. (In Windows 8 the key is
there from scratch.)
We chose to conduct the experiments on a Windows 7 machine mainly because
it, at this moment (April 2013), is the most used operating system [3, 4].
However, most of the information presented in this thesis should also be true
for the WebCacheV01.dat database in Windows 8.
5.2 Acquiring files from the WebCache directory
Most times on a running system we find that the ESE database in Internet
Explorer 10, WebCacheV01.dat, is locked (i.e., in use by a program or service).
This is because it is dependent on the WinINet.dll.
WinINet (Windows Internet) is an application programming interface (API)
which enables applications to interact with the protocols HTTP and FTP to
access Internet resources. This DLL is loaded by the program taskhost.exe on
system startup. [10]
Taskhost.exe is a generic process in Windows which acts as a host for processes
that run from DLLs rather than from EXEs. There may be many instances of
taskhost.exe running on a system, as there will be one instance of taskhost.exe
for every DLL-based service that is running.
As the WebCacheV01.dat database is kept online by WinINet, it cannot easily
be copied out of the WebCache directory. We did some research
online and noticed that the most common way to deal with this is by using the
Volume Shadow Copy Service to copy the file. There are many programs that
can do this in an easy way, but we found this to be very time consuming and
wondered what would happen if we instead just disabled the taskhost.exe
service. This seemed to work just fine, and to verify nothing happened to the
database using this method, we calculated MD5 hashes for the database when
recovered using both methods. First we used the program ShadowCopy to
acquire the database and when that was done we disabled the taskhost.exe
process and copied the database again. The MD5 values for the files were an
exact match, see figure 4. We therefore adopted this method by creating a batch
file on a USB drive and used it to acquire the database in our experiments (see
Appendix A).
Figure 4: MD5 hashes for the database acquired in two different manners
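The comparison of the two acquired copies can be sketched in a few lines. This is not the batch file from Appendix A; it is only an illustration, with function names of our own, of how the MD5 values of two files can be computed and compared.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """Return True if both files produce the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

Reading in chunks keeps memory use low even for large database files. The two paths passed to same_content would be the copy made with ShadowCopy and the copy made after stopping taskhost.exe.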
5.3 Experiment 1: Database overview
The database used for this experiment contains regular browsing, such as
Google searches, news reading and document downloads.
Note: The ESE database is not consistent and the container numbers may
change from system to system (“History” can for example be containers 2 and 4).
When opening the database in ESEDatabaseView we find all the containers in a
drop-down menu in the upper left corner. When we navigate to the table
named “Containers” we get an overview of the entire database as seen
in figures 5 and 6 below.
Figure 5: Left side of the table named “Containers”, providing information such as
container IDs and name of the data stored in the specific containers
Figure 6: Right side of the table named “Containers”, providing the full paths to the data
stored in the containers
The first container in our database is named “feedplat”. This container is the
home of RSS feeds stored within the browser and its full source destination is:
%systemdrive%\Users\%user%\AppData\Local\Microsoft\Feeds Cache\
The second container is named “ietld”. This is a collection of top level domains,
full path to its source is:
%systemdrive%\Users\%user%\AppData\Roaming\Microsoft\Windows\IETldCache\
The third and sixth containers are of great importance since they contain the
visited URLs together with timestamps. The containers named MSHist01* are
also of importance and linked with containers 3 and 6. More about these
further down.
The fourth container is named “IECompatCache” and is a pre-compiled list from
Microsoft of sites that are best viewed in Compatibility View mode.
Its source can be found here:
%systemdrive%\Users\%user%\AppData\Roaming\Microsoft\Windows\IECompatCache\
Regarding the fifth container named “iecompatuaCache”, there is very little
information about what exactly it is and we have yet to find exactly what it
contains. We believe however it is closely related to the “IECompatCache”
container. Its location can be found here:
%systemdrive%\Users\%user%\AppData\Roaming\Microsoft\Windows\iecompatuaCache\
Containers 7 and 9 are named “Content” and cover the “low” and regular
temporary internet files. Source paths:
%systemdrive%\Users\%user%\AppData\Local\Microsoft\Windows\Temporary Internet Files\Low\Content.IE5\
%systemdrive%\Users\%user%\AppData\Local\Microsoft\Windows\Temporary Internet Files\Content.IE5\
Containers 8 and 11 are named “Cookies” and contain the browser’s “low”
and regular saved cookies. Their source paths are:
%systemdrive%\Users\%user%\AppData\Roaming\Microsoft\Windows\Cookies\Low\
%systemdrive%\Users\%user%\AppData\Roaming\Microsoft\Windows\Cookies\
Container 12 is named “DOMStore” and is the location of Web Storage “cookies”.
DOM stands for Document Object Model. The storage can be compared to
regular HTTP cookies because it allows sites to save specific data to the
system, only in larger amounts and with some new options [20]. Its source
path is:
%systemdrive%\Users\%user%\AppData\LocalLow\Microsoft\Internet Explorer\DOMStore\
Container 13 is named “iedownload” and contains information about and history
of downloaded files (if any). Its path is:
%systemdrive%\Users\%user%\AppData\Roaming\Microsoft\Windows\IEDownloadHistory\
Container 10 is of importance since we believe that it is made up of the URLs
visited per day. Its name follows the pattern MSHist01YYYYMMDDYYYYMMDD.
Based on observations we believe that if you surf the web on, for example,
March 19, 2013, and March 20, 2013, a container named
MSHist012013031920130320 would be created in the database. The very same URLs
found in this container are also stored in containers 3 and 6.
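The naming pattern can be illustrated with a minimal Python sketch. It assumes, based only on the observation above, that the first date is the start of the period and the second date is its end; the function names are our own and are not part of any tool.

```python
import re
from datetime import date, datetime

# Assumed layout (from the observation above): "MSHist01" + start date + end date.
_PATTERN = re.compile(r"^MSHist01(\d{8})(\d{8})$")


def build_container_name(start: date, end: date) -> str:
    """Build a container name such as MSHist012013031920130320."""
    return f"MSHist01{start:%Y%m%d}{end:%Y%m%d}"


def parse_container_name(name: str) -> tuple[date, date]:
    """Split a container name into its two dates, or raise ValueError."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an MSHist01 container name: {name!r}")
    first, second = (
        datetime.strptime(part, "%Y%m%d").date() for part in match.groups()
    )
    return first, second
```

Parsing the name this way lets an examiner sort or filter the daily containers by date without opening each one.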
The source location for the MSHist01* files is:
%systemdrive%\Users\%user%\AppData\Local\Microsoft\Windows\History\History.IE5\
The History.IE5 folder is a hidden directory in Windows and can (depending on
access rights and system) be hard to access. If that is the case you could try
accessing it using the attributes seen in figure 7 below.
Figure 7: CMD input
When you have access to the folder you will find the MSHist01* files. When we
access these files, we find what seems to be the old index.dat file. More
investigation is needed to say if this index.dat file is the same or similar to the
old index.dat files used in IE1 through IE9. Figures 8 and 9 below show the
folder access using CMD.
Figure 8: Files contained within the folder History.IE5 shown with command “dir” in CMD
Figure 9: Using switch /a with command “dir” we see the index.dat inside the MSHIST01 folder
Containers 3 and 6 are named ”History” and contain the visited URLs. Inside
the containers we also find timestamps for each of the visited URLs. The
History containers have the same source location as the MSHist01* containers
and share their data.
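The container descriptions in this section can be collected into a small lookup table. The Python sketch below is only a summary of the names and source paths listed above; the dictionary layout and the helper `expand` are our own illustration, and the text does not state which container ID belongs to the “low” variant, so no IDs are mapped.

```python
# Container name -> source path templates, as listed in this section.
_ROAMING = r"%systemdrive%\Users\%user%\AppData\Roaming\Microsoft\Windows"
_LOCAL = r"%systemdrive%\Users\%user%\AppData\Local\Microsoft\Windows"

CONTAINER_SOURCES = {
    "IECompatCache": [_ROAMING + r"\IECompatCache"],
    "iecompatuaCache": [_ROAMING + r"\iecompatuaCache"],
    "Content": [
        _LOCAL + r"\Temporary Internet Files\Low\Content.IE5",
        _LOCAL + r"\Temporary Internet Files\Content.IE5",
    ],
    "Cookies": [_ROAMING + r"\Cookies\Low", _ROAMING + r"\Cookies"],
    "DOMStore": [
        r"%systemdrive%\Users\%user%\AppData\LocalLow\Microsoft"
        r"\Internet Explorer\DOMStore"
    ],
    "iedownload": [_ROAMING + r"\IEDownloadHistory"],
    "MSHist01*": [_LOCAL + r"\History\History.IE5"],
    "History": [_LOCAL + r"\History\History.IE5"],
}


def expand(template: str, system_drive: str, user: str) -> str:
    """Fill in the two placeholders used in this section's paths."""
    return template.replace("%systemdrive%", system_drive).replace("%user%", user)
```

For example, `expand(CONTAINER_SOURCES["Cookies"][0], "C:", "alice")` gives the “low” cookie folder for a hypothetical user named alice.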
5.4 Experiment 2: Recovery of deleted database records
As seen in the previous section (5.3), the WebCacheV01.dat database gathers its
information from various locations on the system. When a user clears the
browsing history from within the browser, as in figure 10, the records are also
deleted from the database. This experiment was conducted to answer our last
research question: is it possible to recover deleted records from the database?
Figure 10: Options to delete browsing history in Internet Explorer 10
Filling up the database
To have full control over the browsing history in this experiment we decided to
fill up a new database instead of using the one from the previous experiment.
As was done in experiment 1, the database was filled up by some quite
extensive surfing. After the browsing was done the system was rebooted in
order to make sure the log files were flushed to the database.
With the previous browsing session “in the bag”, the browser was re-opened
and started in InPrivate mode. A couple of key searches were done in this mode
using Google so that we would be able to easily distinguish InPrivate searches
from our regular browsing session.
The system was then rebooted yet again and as it came back up, we acquired
the database with our batch file. After another reboot the browsing history was
deleted using the options inside the browser (see figure 10). A final site was
visited (www.aftonbladet.se) after the history had been deleted.
The system was yet again rebooted and the database was acquired with the use
of the batch script.
Verifying data stored in the two acquired databases
We recovered both the acquired databases to a clean shutdown state using
esentutl (as demonstrated in section 4.4), and opened them in Nirsoft’s
ESEDatabaseView for analysis.
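Before opening an acquired database it can be useful to confirm that recovery actually left it in a clean shutdown state. The Python sketch below is one possible check, not the procedure used in this thesis: it assumes that `esentutl /mh` prints a header dump containing a line of the form “State: Clean Shutdown” or “State: Dirty Shutdown”, and the function names are our own.

```python
import re
import subprocess
from typing import Optional


def parse_state(header_dump: str) -> Optional[str]:
    """Return the value of the 'State:' line in a header dump, if present."""
    match = re.search(r"^\s*State:\s*(.+?)\s*$", header_dump, re.MULTILINE)
    return match.group(1) if match else None


def database_state(db_path: str) -> Optional[str]:
    """Run 'esentutl /mh' on a database copy and report its shutdown state.

    Only meaningful on a Windows system where esentutl is on the PATH.
    """
    result = subprocess.run(
        ["esentutl", "/mh", db_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_state(result.stdout)
```

A result other than “Clean Shutdown” would suggest that the log files have not yet been replayed into the database copy.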
When looking through the first acquired database we could, as expected, find
all visited URLs and the downloaded files, but none of the URLs visited in the
InPrivate session. (When InPrivate mode was engaged we searched for “how to
kill superman with kryptonite” and “power rangers” using Google, and visited
the top search results.)
In the second database we could only find records connected to the site we
visited after deleting the browsing history; the earlier records had been
deleted (see figure 11).
Figure 11: Browsing history present in the last acquired database
Carving with wdsCarve
Figure 12 shows the progress of the carving program.
Figure 12: Initiate the database carving
The output file, CarvedData.csv, can be viewed using Excel. Each carved record
takes up one row, as seen in figure 13.
Figure 13: Overview of the output data from wdsCarve performed on the last acquired database.
As seen in figure 13 there is more data than just the URL from the single site
that was visited shortly after the browsing history was deleted. On
examination, every single record that had been deleted through Internet
Explorer’s own interface was recovered using this carving technique.
The output data in the file CarvedData.csv is quite large in this experiment, so
in order to find if there are any traces of our InPrivate session we simply used
CTRL+F to search through the document. As previously mentioned, we googled
the string “how to kill superman with kryptonite” in our InPrivate browsing
session.
The search for “superman” yields the following result:
794,2,0,3081990198304673138,3,1301,131393,2,1,2013-05-10 07:49:40,2013-05-10 07:49:40,2013-05-10 06:49:40,0,2013-05-10 07:49:40,0,0,0,http://www.google.se/url?sa=t&rct=j&q=how%20to%20kill%20superman%20with%20kryptonite&source=web&cd=1&sqi=2&ved=0CCoQFjAA&url=http%3A%2F%2Fwww.killermovies.com%2Fforums%2Farchive%2Findex.php%2Ft-410593-how-exactly-doeskryptonite-killsuperman.html&ei=i6aMUazqEMjBtAawl4DIAg&usg=AFQjCNF5iwfq5FnIb6xO4dwzQnT1GWSgCw&bvm=bv.46340616,d.Yms,url[1].htm,-,-,-,-,-,-,7192577
The string above is one of several records connected with the InPrivate search.
A larger output can be seen in figure 14 below. The first portion is a timestamp
followed by a URL. The time provided is in UTC.
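The manual CTRL+F search can also be scripted. The Python sketch below is an illustration under our own assumptions, not part of wdsCarve: it treats each row of CarvedData.csv as plain text (the URLs themselves contain commas, so splitting on commas is unreliable) and decodes percent-encoding first, so that a phrase such as “kill superman” matches “kill%20superman”.

```python
from urllib.parse import unquote


def find_records(lines, keyword):
    """Return (line number, decoded line) for every line containing keyword.

    Matching is case-insensitive and is done after percent-decoding.
    """
    needle = keyword.lower()
    hits = []
    for number, line in enumerate(lines, start=1):
        decoded = unquote(line.rstrip("\r\n"))
        if needle in decoded.lower():
            hits.append((number, decoded))
    return hits
```

The function accepts any iterable of text lines, so an open file handle for CarvedData.csv can be passed directly; the file's text encoding is not documented here and would have to be checked first.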
Figure 14: Carving shows database records of InPrivate browsing
More research is needed to tell if there are any pronounced differences
between regular browsing history and InPrivate history when carved. In this
experiment the search terms used while InPrivate was active were known, which
is not always the case in an investigation. An indicator that makes it easy to
distinguish between regular and InPrivate browsing could therefore prove
valuable.
6 Results & Discussion
In chapter 4 we presented the WebCache directory and what it contains. We
explained WinINet and its connection to ESE and we also took a look at
WebCacheV01.dat using a hex editor to get a better understanding of how it
works. After that we performed two experiments to answer our research
questions regarding what data is stored in the database and whether it is
possible to recover any data that has been deleted from the database.
6.1 Experiment 1
Starting out, our idea was to feed the database small insertions
so that it wouldn’t grow too large and become a hassle to analyze. When making
successively larger data insertions we instead found that the database didn’t
grow the way we had anticipated. At first we thought it would
create additional containers as it grew, but instead it seems to use the
MSHIST01 containers to store data and simply move history data to its specific
container as it fills up, without creating additional “History” containers.
Instead we made the bulk of our analysis using our “largest” database.
Noteworthy is that we cannot say for sure that there won’t be additional
containers like “History” if the database reaches a very large size since this
hasn’t been tested.
We found that this initial experiment served as a good introduction to how
the database is constructed and gave us a basic understanding as we
moved on to experiment 2. Table 2 shows the different data types stored in the
database.
Table 2: Overview of containers in WebCacheV01.dat
6.2 Experiment 2
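Carving data with wdsCarve was successful: every record deleted through
Internet Explorer’s own interface was recovered, as were records from the
InPrivate session. The result also ties in well with previous chapters.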
The fact that it is possible to carve deleted records from the Extensible Storage
Engine stems from its design, where records that are deleted are in fact
just flagged as writable space. As stated by Chivers and Hargreaves [6], the
space occupied by deleted records is not re-used until that part of the database
is re-organized. This happens within the ESE database to maintain the
balanced structure of the B-tree (see chapter 2.1.2).
A comparison can be made to various file systems, for example NTFS, where
deleted files are removed from the “file index” and flagged as free space.
However the files can still be recovered using a carver if they have not yet been
overwritten.
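The principle can be shown with a toy model in Python. This is a deliberately simplified sketch and does not reflect the real ESE page layout: a “page” is just a list of slots, deleting a record only sets a flag, and the data disappears only when the page is re-organized. All names are our own.

```python
class ToyPage:
    """Toy model of a database page where deletion only flags a slot."""

    def __init__(self):
        self._slots = []  # each slot is [data, deleted_flag]

    def insert(self, data):
        self._slots.append([data, False])

    def delete(self, data):
        # The record is flagged as free space; its bytes stay in place.
        for slot in self._slots:
            if slot[0] == data and not slot[1]:
                slot[1] = True
                return True
        return False

    def live_records(self):
        # What a normal database viewer would show.
        return [data for data, deleted in self._slots if not deleted]

    def carve(self):
        # What a carver sees: every slot, flagged or not.
        return [data for data, _ in self._slots]

    def reorganize(self):
        # Only now is the space of deleted records actually reclaimed.
        self._slots = [slot for slot in self._slots if not slot[1]]
```

In this model a deleted URL is missing from `live_records()` but still returned by `carve()` until `reorganize()` runs, which mirrors why the deleted history could be recovered in experiment 2.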
The InPrivate browsing mode doesn’t change how the ESE database behaves;
data is still stored as usual. The difference is that the records
of browsing history are deleted when the InPrivate window is closed. As
previously discussed, this does not prevent the use of carving to recover
InPrivate browsing history.
7 Conclusion
Web history analysis is an important part of a digital forensic investigation. In
Internet Explorer 10 there is a new interesting ESE database called
WebCacheV01.dat. When acquiring the database it is important to also collect
all log files in the WebCache directory. This is to make sure the database gets as
up to date as possible when flushing the log files to the database using the
recovery option in esentutl. We found that the most convenient way to do this
was to kill the taskhost.exe process and then copy out all the files, preferably
using a batch file.
The experiments conducted for this thesis have shown that it is possible to
recover information that has previously been deleted from the database, along
with the cached files themselves. This is also possible when the browsing has
been done using the InPrivate browsing mode. The tool used for carving
deleted records from the database is Howard Chivers’ wdsCarve, which is
available from the author for forensic examiners and researchers.
The information in this thesis should provide a good resource for anyone
looking to create a tool to recover information from the WebCacheV01.dat
database. Most of what is said about the WebCacheV01.dat database in
Windows 7 should also be true for the same database in Windows 8.
7.1 Future Work
More research is needed since some new questions have arisen as the
experiments and work have progressed. How does the database behave if it
contains a vast amount of data, let’s say two years of recorded browsing
history? Will there be additional “History” containers or will it just keep adding
MSHist01* containers along the way? Due to time constraints we have not been
able to look into this as of yet.
Another useful thing to look further into would be the carved InPrivate records.
As of now, we have yet to find any indicator that immediately tells that the
record originates from an InPrivate browsing session. It would prove valuable
for forensic examinations to be able to distinguish between normal browsing
sessions and InPrivate sessions when looking at carved data.
References
[1] Kruse W., Heiser J., ”Computer Forensics: Incident Response Essentials”,
2001, p. 2
[2] Magnet Forensics, “Internet Evidence Finder”, (accessed April 2013),
http://www.magnetforensics.com/products/internet-evidence-finder/
[3] W3Schools, “OS Platform Statistics”, (accessed April 2013),
http://www.w3schools.com/browsers/browsers_os.asp
[4] NetMarketShare, “Desktop Operating System Market Share“, (accessed April
2013), http://www.netmarketshare.com/
[5] Metz J., ”Extensible Storage Engine (ESE) Database File (EDB) format
specification, v0.0.19”, 2012 (accessed April 2013),
https://libesedb.googlecode.com/files/Extensible%20Storage%20Engine%20%28ESE%29%20Database%20File%20%28EDB%29%20format.pdf
[6] Chivers H., Hargreaves C., “Forensic data recovery from the Windows Search
Database”, 2011,
http://www.sciencedirect.com/science/article/pii/S1742287611000028
[7] van Dongen W., Toorop W., Blokhuis J., ”Forensic examination of Windows
Live Messenger 2009 Extensible Storage Engine”, 2009,
https://www.os3.nl/_media/2008-2009/students/willem_toorop/wlm2009_ese_fin.pdf
[8] Microsoft MSDN, ”Extensible Storage Engine”, (accessed April 2013),
http://msdn.microsoft.com/en-us/library/5c485eff-4329-4dc1-aa45-fb66e6554792.aspx
[9] CodeProject – Bakiev A., “Extensible Storage Engine”, 2011 (accessed April
2013), http://www.codeproject.com/Articles/52715/Extensible-Storage-Engine
[10] Microsoft MSDN, “About WinINet”, (accessed April 2013),
http://msdn.microsoft.com/en-us/library/windows/desktop/aa383630(v=vs.85).aspx
[11] VMware, “VMware Workstation”, (accessed April 2013),
http://www.vmware.com/se/products/desktop_virtualization/workstation/overview.html
[12] Runtime Software, “ShadowCopy”, (accessed April 2013),
http://www.runtime.org/shadow-copy.htm
[13] NirSoft, “ESEDatabaseView”, (accessed April 2013),
http://www.nirsoft.net/utils/ese_database_view.html
[14] X-Ways, “WinHex”, (accessed April 2013), http://www.x-ways.net/winhex/
[15] Chivers H., “wdsCarve”, (accessed April 2013). Available to forensic
investigators and researchers from the author.
[16] Microsoft MSDN, “Extensible Storage Engine Files”, (accessed April 2013),
http://msdn.microsoft.com/en-us/library/windows/desktop/gg294069(v=exchg.10).aspx
[17] Baher M., “Who said that transaction goes from Logs to DB!!!!!”, 2008
(accessed April 2013),
http://blogs.technet.com/b/mbaher/archive/2008/01/22/who-said-that-transaction-goes-from-logs-to-db.aspx
[18] Microsoft TechNet, “Default ESE log buffers have been changed”, (accessed
April 2013), http://technet.microsoft.com/en-us/library/aa998538(v=exchg.80).aspx
[19] Bunting S., “EnCE The official EnCase Certified Examiner (Second edition)”,
2008, p. 95
[20] Microsoft MSDN, “Introduction to Web Storage”, (accessed April 2013),
http://msdn.microsoft.com/en-us/library/cc197062(v=vs.85).aspx
[21] Mississippi State University, “Empirical Research – Tutorial”, (accessed
July 2013), http://library.msstate.edu/li/tutorial/empirical
Appendix A
We created this batch file and put it on a USB drive to speed up the acquiring of
the WebCache directory for our experiments. It kills the taskhost.exe processes
and copies all files in the WebCache directory to the USB drive.
@echo off
:: Forcibly kills the process taskhost.exe and all child processes.
taskkill /f /im "taskhost.exe" /t
:: Assumption: the listing never sets %drive%. Since the batch file is run
:: from the USB drive, fall back to the drive letter of this script.
if not defined drive set "drive=%~d0"
:: Tells xcopy to copy _all_ files and subdirectories and create
:: the folders on the flash drive if they don't exist.
set copywc=xcopy /s /c /e /h /i /r /y
:: Copy the WebCache directory to the flash drive.
%copywc% "%userprofile%\AppData\Local\Microsoft\Windows\WebCache" "%drive%\IE10\%computername% - %username%\WebCache"
:: Create a logfile with date and time.
echo Timestamp: %date% %time% >> "%drive%\IE10\%computername% - %username%\logfile.txt"
cls
Congratulations on a very good article! Does anybody know whether the file WebCacheV01.dat is ever overwritten, or does it contain all data since Internet Explorer 10 was first used?
Congrats!!!
A new forensic tool is able to analyze normal records and recover deleted records from WebCacheV01.dat.
It is better than ESEDatabaseView, EseDbViewer, ESECarve…
The tool's name is ieforensic:
https://sites.google.com/site/ieforensic/
I read your paper.
A new tool (IEForensic) is able to analyze normal records and recover deleted records.
It can also parse the download path, HTTP response header, web page title…
Use it:
https://sites.google.com/site/ieforensic/