Thomas Glanzmann discusses his research at DFRWS EU 2017.
Glanzmann: Characterizing loss of digital evidence due to abstraction layers. So what we did is we created a [formal] model mapping different layers and analyzing what gets lost due to abstraction layers.
Yeah, Felix Freiling is from the [FAU] and [00:28] and Hans P. Reiser is from the University of Passau. I’m self-employed; I work as an [on-site] trainer for IT infrastructure in the areas of [virtualization, storage network], and backup, and I also provide [web] environments. And I am in the process of becoming a PhD student.
This is our agenda for today. We introduce some charts and we talk about contributions, definitions, and identify the model in practice, and then we come to the conclusions.
Abstraction layers are hierarchical: you probably know that a higher layer is mapped to a lower layer. You can use them to manage resources efficiently. One example, which we just heard about, is deduplication, which is an abstraction layer. Another example would be lazy allocation: if a 2 TB hard disk is allocated, only the space that is currently in use is physically allocated. The details of the lower layers are usually hidden from the higher layers, at least in most cases, because they are not necessary for the higher layers. A problem, or what could be a problem, is establishing authenticity. So often you strive for the lowest layer possible in order to establish this authenticity, [although] in cloud environments this might not always be applicable because you don’t have access to the lower layers. Then there is ambiguity and the possibility of interpretation errors. Let me give you one example.
For example, virtual machine hard disks can be fully provisioned. That means the space is allocated; however, they are lazily initialized. What that means is that the space is [blocked] but not overwritten. So the blocks may still contain old data, and if you are not cautious, you might attribute blocks which never belonged to the virtual machine hard disk to this virtual machine hard disk, because you did not interpret the data structures correctly. And lower layers might be out of reach; we talked about this.
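Editorial sketch of the lazily initialized disk pitfall described above. This is a toy model under stated assumptions: the class `LazyZeroedDisk`, the shared `datastore` list and the `written` bitmap are hypothetical names for illustration, not any hypervisor's actual on-disk format.

```python
class LazyZeroedDisk:
    """Toy model: space is reserved up front, but blocks are not wiped."""

    def __init__(self, datastore, first_block, n_blocks):
        self.datastore = datastore          # shared list of lower-layer blocks
        self.first = first_block            # start of the reserved range
        self.n = n_blocks
        self.written = [False] * n_blocks   # bitmap: has the guest written this block?

    def write(self, i, data):
        self.datastore[self.first + i] = data
        self.written[i] = True

    def read(self, i):
        # A correct reader consults the bitmap and returns zeroes for untouched blocks.
        if not self.written[i]:
            return b"\x00"
        return self.datastore[self.first + i]

    def naive_carve(self):
        # An incorrect reader copies the reserved range verbatim.
        return self.datastore[self.first:self.first + self.n]


# The lower layer still holds stale data from whoever used the space before.
datastore = [b"old-A", b"old-B", b"old-C", b"old-D"]
disk = LazyZeroedDisk(datastore, first_block=1, n_blocks=3)
disk.write(0, b"new-X")

correct = [disk.read(i) for i in range(3)]   # honours the bitmap
naive = disk.naive_carve()                   # ignores the bitmap
```

In this sketch `naive` contains `old-C` and `old-D`, data that was never written through this virtual disk, which is the attribution error the speaker warns about.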
So our question is: under which conditions is it possible to reconstruct higher levels of abstraction from data given at lower layers of abstraction? We distinguish between actively used, deleted, overwritten, or otherwise unused data. And yeah, some things cannot be interpreted, especially deleted files [for the active …]; data [reconstruction] is not always possible. And yeah, we have a lot of abstraction layers. We will see later in a practical example that even in a usual environment we have up to nine abstraction layers, sometimes even 11 or more, which can be problematic. The other thing is that you might have concurrent access, which can add fragmentation. Sometimes you have up to 12 virtual machines on one host, which of course makes [attribution] even harder. And attribution I already talked about: when you have too many tenants on one single hard disk or physical storage device, it might be hard to attribute which data belongs to which one, especially if it’s [engaged] or no longer in use.
There are some related works. Abstraction layers are used in software and hardware design, we know that. We have abstraction transformation functions from Carrier. He goes a little bit deeper than we do, so he also [04:15], for example, problems in the implementation of the [reverse-engineered] data structures which are not fully understood, and also of course problems while collecting the evidence [04:26] of that.
Then, virtualization poses problems for incident response and forensics, obviously because you cannot access the layers you probably want to access, and it also adds problems for attribution. Then we talk about recovering deleted files and processes from memory. This is of course possible in most cases. However, if you need to collect memory dumps from a hypervisor, for example, you probably need to take this hypervisor offline, which could affect multiple users, which, again, is a problem. And there are cloud environments which aid forensic analysis. One would be this example.
Another example which [I’ll probably find better] is [05:09] from the University of Michigan, where you can actually replay every single instruction if you want multiple weeks or months back. Obviously, you will never find such an environment in a production environment, as this is more an academic approach.
Cloud-native artifacts can only be acquired by physical access, at least partly. However, the APIs and also the network file systems are changing so that this might change in the near future.
Our contribution is a formal model. [Yeah, we] derive conditions for evidence collection. We focus on block-based storage; by block-based storage we also mean memory, so pages, and of course they might be fragmented, and they also might be of different sizes. We talk about active and inactive data: by active data we mean, for example, files that currently exist, and by inactive data, for example, data that has been removed.
And what we will see later is that our conditions are only partly satisfied by real data structures, so we look at a few [data visualization] and then some data is lost through abstractions, so we cover file systems, virtual memory, guest physical memory, logical volumes, and RAID. However, [the model] could also be improved, for example, through the application of [06:23].
And what we ignore completely is API-based storage abstractions, [like NFS], because they hide state information. We will see later on that this will probably change.
Okay, we look at a layer … a single level of abstraction at a time. We have an upper and lower layer. The upper [06:43] lower layer and each layer provides computation and storage resources. We focus mainly on storage resources, and the upper layer is implemented by using resources from the lower layer, which is probably obvious.
Storage in our case can be either memory or disk. Both layers consist of a finite, ordered sequence of blocks. On the hard disk you can think of a logical [block] address; in memory you can think of a page number. Block sizes on the upper and lower layer can differ, but they don’t have to. And each block is identified by a unique index.
Here we see a mapping. We have our upper layer, u, with its sequence of blocks, which [07:25] the lower layer, also a sequence of blocks, and we have here our phi-t, which is our mapping at the point in time t. The t is of course increasing. We need this time scale in order to identify what [will be later the single most recently deleted] block.
So here we see that 1 points here, and that 2, however, points to this one, which would be, for example, an example of [deduplication]. Or here, one upper layer block points to two lower layer blocks, because there might be a difference in the block size.
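Editorial sketch of a mapping at a single point in time where one upper-layer block is backed by two lower-layer blocks. The block sizes, the dictionary `phi_t` and the function `translate` are assumptions chosen for illustration; the talk does not prescribe a representation.

```python
UPPER_BLOCK = 1024   # assumed upper-layer block size in bytes
LOWER_BLOCK = 512    # assumed lower-layer block size in bytes

# phi at one point in time: upper index -> ordered list of lower indices.
phi_t = {
    0: [4, 5],   # one upper block backed by two lower blocks
    1: [9, 2],   # the lower blocks need not be contiguous
}


def translate(phi, upper_index, offset):
    """Map (upper block, byte offset) to (lower block, byte offset)."""
    if not 0 <= offset < UPPER_BLOCK:
        raise ValueError("offset outside the upper block")
    lower_blocks = phi[upper_index]              # KeyError means: not mapped
    piece, inner = divmod(offset, LOWER_BLOCK)   # which lower block, and where in it
    return lower_blocks[piece], inner


first = translate(phi_t, 0, 0)
second = translate(phi_t, 1, 700)
```

Byte 700 of upper block 1 falls into the second lower block of that mapping, at offset 188.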
Okay, mappings change over time. In the beginning we don’t have any mapping. We see here that the time increases. We still have our upper and lower layer and our mapping phi-t. Here you can see that two blocks point to the same one, then here the mapping is changed, here it is changed again, and here it is completely [removed].
So the mappings change over time, that’s clear. When a previous mapping is removed, we call the block unmapped. If a mapping is added, we call it mapped. Unmapped blocks are of special interest for digital forensics, you probably know that. So what we look at here is the most recent deletion time. Here, [currently], we don’t have any mapping. Here we have an active mapping. And here this dotted arrow means that we have a deleted block, so a mapping which used to be active and no longer is. Here we can see that another mapping for this underlying block has been changed, here it [has been removed], and here it has been changed again. What we will see later is that we are actually [likely to] identify the most recently deleted [points].
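Editorial sketch of mappings changing over time, with a record of deletions so that the most recent deletion can be identified. The class `MappingHistory` and its methods are hypothetical; a real lower layer usually keeps no such log, which is exactly why this information can be lost.

```python
class MappingHistory:
    """Toy log of map/unmap events with a logical clock t."""

    def __init__(self):
        self.t = 0
        self.active = {}    # upper index -> lower index
        self.deleted = []   # (deletion time, upper, lower), oldest first

    def map(self, upper, lower):
        self.t += 1
        if upper in self.active:
            # Remapping implicitly deletes the previous mapping.
            self.deleted.append((self.t, upper, self.active[upper]))
        self.active[upper] = lower

    def unmap(self, upper):
        self.t += 1
        self.deleted.append((self.t, upper, self.active.pop(upper)))

    def most_recently_deleted(self):
        return self.deleted[-1] if self.deleted else None


h = MappingHistory()
h.map(1, 7)    # t=1
h.map(2, 7)    # t=2: two upper blocks share one lower block
h.map(1, 3)    # t=3: mapping 1 -> 7 is replaced, so it counts as deleted
h.unmap(2)     # t=4: mapping 2 -> 7 is removed
```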
Then we have mapping-aware and mapping-agnostic lower layers. What does that mean? A normal layer is mapping-agnostic: the lower layer doesn’t know whether a block or page is currently in use. With a mapping-aware layer, the lower layer knows whether the block is currently in use or not. An example of that is UNMAP or TRIM, you have probably heard of it. [SCSI] UNMAP is a [SCSI] command which allows the upper layer to release a block in the lower layer. This is implemented on [SSDs] and also on enterprise [09:56] systems, as we will see later on the [09:59] different [vendors]. [TRIM] is the equivalent ATA command, which has been in every [SSD] for the last ten years or so. And TRIM might [10:15] after a block has been trimmed, might [return] the content, while UNMAP doesn’t do that. So UNMAP is specified to [return] zeroes once issued; TRIM isn’t.
So we have mapping-aware and mapping-agnostic lower layers. Of course the mapping-aware layers pose a big problem for evidence collection, because our data could be released sooner and so no longer be available when someone wants to do the [10:48] of evidence.
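Editorial sketch contrasting the two kinds of lower layer. The class `LowerLayer` and its `release` method are hypothetical; clearing the block on release follows the talk's description of UNMAP and is a simplification of what real devices do.

```python
class LowerLayer:
    """Toy lower layer that may or may not be told about deleted mappings."""

    def __init__(self, n_blocks, mapping_aware):
        self.blocks = [b""] * n_blocks
        self.mapping_aware = mapping_aware

    def write(self, i, data):
        self.blocks[i] = data

    def release(self, i):
        # Called when the upper layer deletes its mapping to block i.
        if self.mapping_aware:
            self.blocks[i] = b""   # UNMAP-like: the content is gone for later reads
        # Mapping-agnostic: the deletion never reaches the lower layer.

    def read(self, i):
        return self.blocks[i]


agnostic = LowerLayer(4, mapping_aware=False)
aware = LowerLayer(4, mapping_aware=True)
for layer in (agnostic, aware):
    layer.write(2, b"evidence")
    layer.release(2)

recovered_agnostic = agnostic.read(2)
recovered_aware = aware.read(2)
```

On the agnostic layer the content survives until something overwrites it; on the aware layer it is unavailable through the normal read path right after the release.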
Evidence collection depends on the level of access, and evidence can be collected at the upper or the lower layer. Sometimes you only have access to the upper layer; other times you want to strive for the lowest layer necessary to capture the relevant information. In the case of UNMAP or TRIM, that could be [in a chip-off layer], because [through the SATA] or [11:17] [interface] of the disk you can no longer access the block, unless of course you modify the firmware. And usually you strive for a bit-by-bit, block-by-block, or page-by-page copy, depending on the access pattern.
We already talked about active mappings and deleted mappings, and we define three increasingly powerful notions of reconstruction. The first problem is finding and decoding phi. Phi is the management data structure which is used to go from the upper layer to the lower layer. It might be stored on the lower layer, it might be stored somewhere else in memory, and it might be very, very time-consuming to reverse engineer such a thing. For example, for the VMFS file system or the VxFS file system from [Veritas], [it took] multiple years. And the reverse engineering is still incomplete, as this can be very cumbersome, especially for an exotic file system.
Then we have our first definition, which is active mappings reconstruction. Given a copy of the evidence of the lower layer at time t, solving active mappings reconstruction means that we find and decode phi-t from the evidence and enumerate all elements of phi-t. So this is the easy case, where everything is mapped.
Definition two, deleted mappings reconstruction, takes definition one as a baseline and adds the constraint to enumerate a non-trivial subset delta of the deleted mappings of phi-t. The empty set is only allowed if no deleted mappings exist. We need this condition, because otherwise we are unable to distinguish between definitions one and two.
And last deleted mappings reconstruction adds another constraint: to enumerate the most recently deleted mappings from that subset. We have a little dilemma here: deleted mappings reconstruction and last deleted mappings reconstruction could be [folded] together, because the empty set might be added. So if you have any idea how we could formulate this constraint better, to distinguish between deleted mappings and last deleted mappings reconstruction, we would really like to hear your feedback.
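Editorial sketch of the three notions, under a strong assumption: the evidence contains several numbered versions of the management data structure. This is only one informal reading of the definitions as spoken; the function `reconstruct` and the `(seqno, mapping)` input format are hypothetical, and the formal definitions in the paper may differ.

```python
def reconstruct(versions):
    """versions: list of (seqno, {upper: lower}) recovered from the lower layer."""
    ordered = sorted(versions, key=lambda v: v[0])

    # Notion 1: the active mappings are the pairs of the newest version.
    current = set(ordered[-1][1].items())

    # Notion 2: deleted mappings are pairs seen in older versions only.
    last_seen = {}   # pair -> newest seqno in which it still appeared
    for seqno, mapping in ordered[:-1]:
        for pair in mapping.items():
            if pair not in current:
                last_seen[pair] = seqno
    deleted = set(last_seen)

    # Notion 3: the most recently deleted pairs survived the longest.
    newest = max(last_seen.values(), default=None)
    last_deleted = {p for p, s in last_seen.items() if s == newest}
    return current, deleted, last_deleted


versions = [
    (1, {0: 10, 1: 11, 2: 12}),
    (2, {0: 10, 1: 11}),
    (3, {0: 10}),
]
active, deleted, last_deleted = reconstruct(versions)
```

If only the newest version survives, `deleted` and `last_deleted` are both empty, which mirrors the situation where only active mappings reconstruction is solvable.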
So, identifying the model in practice: we focus on a particular abstraction layer and a concrete implementation, [14:02] identification of the upper and lower layer [remain] the evidence. We exclude, as we already talked about, [memory backup and] disk snapshots. Of course, for evidence collection, backups and memory and disk snapshots might be very, very interesting, because you can find a lot of artefacts in there. However, backups now [also get] mapping-aware. For example, [Veeam], a backup software for VMware which is often used, actually analyses the NTFS file system to identify which blocks are currently in use, and only backs up the blocks which are [actually] referenced. In addition to that, it can, for example, remove blocks from [complete paths].
So you can, for example, say, “I don’t want to back up my Windows directory or my temp files or my user …” and so on. For evidence collection this is of course a really big problem, because now the backups also lack data.
We look at pure management mechanisms, meaning the active mapping, so excluding backups and memory and disk snapshots. Of course, the line between pure management and backups is a little bit blurred.
Let’s have a look at a usual environment. Up here we have a hypervisor; for our example we choose [ESX server]. We have two virtual machines running on the host. First we look at the virtual [address] space, which is where software lives, that would be the kernel or [15:29] applications. It is mapped to the physical [address] space, which is part of the physical memory, and simulated [PCI] devices, which are then mapped to the host physical layer, which is the real RAM and the physical [PCI] devices. Then we have the file system within the virtual machine, which is on the block device. We could introduce another layer, which, for example, could be a [15:51] even if it’s [15:53]. And then we have a virtual machine hard disk, which is just a file.
Here it gets hard. For example, [15:59] supports seven different formats for virtual machine hard disks. They have [16:06], which means it’s initialized [at the front]. We have [lazy] initialized, which means it’s allocated but not initialized, and the blocks which have been written are [marked through] a bitmap. Then we have [thin] allocated, and then we have [raw] device mapping, which just passes commands through to the physical device. Then we have [NFS], and then we have [16:26] [which is a very small block size] [16:27] [thin provision]. So you see there is a variety of different virtual machine disk formats, and every one comes with some problem or other.
Then we have a cluster file system, which means that up to 2048 hosts can access it, and we usually access it through [SCSI], in this case [iSCSI] or [Fibre Channel]. As an example storage array we use [16:55]; we could use any other [16:56], or even the Linux kernel has a target implementation which can [do iSCSI] or [Fibre Channel]. And here we have a volume, and this volume is a [17:06] nodes. Nodes are physical entities, and every physical disk is connected to two nodes, so in case one node fails, the other node can take over the disks and the data stays accessible. This is also necessary to [17:23] updates.
And what we can see here is that this is a [stripe] across all nodes. In this case we have two nodes, and in the case of [17:33] you can have up to … nodes. Then we have a RAID set, in this case RAID 1, but it could also be a different RAID level. This is RAID 1 with a set size of two, meaning there are two copies in the system. The special thing about the [17:44], which many other storage systems do these days as well, is that for one volume it allocates chunks from each disk of the same class. So if you have 300 disks in there, you allocate one gigabyte at a time from every disk, and once you have allocated one gigabyte from every disk, you allocate the second gigabyte from every disk. Of course this makes it very complex to reconstruct the volume, and it is also a big issue if you need to collect the evidence, when we talked about [eight terabyte] this morning, or about [a petabyte].
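Editorial sketch of the round-robin chunk allocation described above, to show why a single volume ends up scattered over every disk. The function `locate_chunk` and its strict disk-by-disk ordering are assumptions for illustration; real arrays use vendor-specific placement and add RAID on top.

```python
GIB = 1 << 30   # one gigabyte chunk, as in the talk's example


def locate_chunk(volume_offset, n_disks, chunk_size=GIB):
    """Return (disk index, allocation round, offset inside the chunk)."""
    chunk, inner = divmod(volume_offset, chunk_size)
    round_no, disk = divmod(chunk, n_disks)   # one chunk per disk per round
    return disk, round_no, inner


start = locate_chunk(0, n_disks=300)
wrap = locate_chunk(300 * GIB, n_disks=300)
later = locate_chunk(301 * GIB + 5, n_disks=300)
```

With 300 disks, the first 300 GiB of the volume touch every disk once before the second round begins, so collecting one volume means reading from all of them.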
[So much for that.] So these are the abstraction layers in practice; now we look at a few examples. The first example is virtual memory management in Linux. I don’t know if you are familiar with this. This is [18:40] map a virtual address to a physical address. I used a 32-bit example. The lowest 12 bits are easy, they are just mapped one to one, because, unlike with deduplication, pages are aligned. That means you can pass through the last 12 bits, and two to the power of 12 is 4096, the page size of a 32-bit system. Then, here, we have the [19:08], which maps the virtual to the physical address space. Each process has its own value stored there. Then you go to the page index and take the first ten bits as a page [index; if the direct bit is set], you are done. If not, you have a second indirection and take the next 10 bits to find the offset, and then you are done.
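Editorial sketch of the two-level, 32-bit address translation described above, using the 10/10/12 bit split. The nested dictionaries stand in for the page directory and page tables; they and the function names are assumptions for illustration, and large pages (the "direct" case) are left out.

```python
def split(vaddr):
    """Split a 32-bit virtual address into directory index, table index, offset."""
    directory = (vaddr >> 22) & 0x3FF   # top 10 bits
    table = (vaddr >> 12) & 0x3FF       # next 10 bits
    offset = vaddr & 0xFFF              # low 12 bits, passed through unchanged
    return directory, table, offset


def walk(page_directory, vaddr):
    """Return the physical address, or None if the page is not mapped."""
    d, t, off = split(vaddr)
    page_table = page_directory.get(d)
    if page_table is None:
        return None
    frame = page_table.get(t)
    if frame is None:
        return None
    return (frame << 12) | off


# Hypothetical per-process tables: directory entry 32 -> table entry 72 -> frame 0x1A2B.
page_directory = {32: {72: 0x1A2B}}
parts = split(0x08048123)
phys = walk(page_directory, 0x08048123)
unmapped = walk(page_directory, 0x00001000)
```

Once an entry is cleared from these tables, nothing in the structure itself records which frame it used to point to, which matches the observation that only the active mappings can be reconstructed from it.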
This is a [page walk] for a 32-bit system. For 64-bit it’s equivalent; the only difference is that you don’t have two levels of indirection but four. The page size is not 4k but 8k. And yeah, what we found out concerns the case when a page is unmapped from a process, which is very unlikely, because [19:53] doesn’t do that if you don’t ask explicitly [19:55], so if you [do a] free library call, it will probably not run an [20:00] command in order to [free] the page. What we saw is that when you do that, the data structure is overwritten and you can no longer find the last [deleted] mapping. The [active] mapping, however, you can of course reconstruct. So this solves definition one [20:18] definition two; however, using other methods you could of course [interpolate], so that you can probably also solve definition two. But for our example, we focus strictly on what we can [prove] here.
Now, a very similar example is DOS/MBR partition management. We have four partitions, each with start, stop, and the partition number. Again, when you delete a partition, [it’s zeroed out], so you can’t identify the last deleted [partition]. That means you get no idea where this partition was. In practice, of course, you can again interpolate: you can look, for example, for the signature at the beginning of a file system, a swap space, or whatever, and then reconstruct the data; you can look for [a hole]. However, when we only use this data structure as evidence, it only solves definition one.
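Editorial sketch of reading the four primary entries of a DOS/MBR partition table and of what remains after an entry is zeroed. The layout used (table at byte 446, 16-byte entries, boot signature `55 AA` at byte 510) is the classic MBR layout; the function name and the demo values are made up for illustration.

```python
import struct


def parse_mbr(sector):
    """Return the non-empty primary partition entries of a 512-byte MBR."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature")
    partitions = []
    for n in range(4):
        entry = sector[446 + 16 * n: 446 + 16 * (n + 1)]
        # status, (CHS start skipped), type, (CHS end skipped), LBA start, sector count
        status, ptype, lba_start, n_sectors = struct.unpack("<B3xB3xII", entry)
        if ptype == 0 and n_sectors == 0:
            continue   # empty slot
        partitions.append(
            {"number": n + 1, "type": ptype, "start": lba_start, "sectors": n_sectors}
        )
    return partitions


# Build a demo sector with one Linux partition, then delete it by zeroing the entry.
sector = bytearray(512)
sector[446:462] = struct.pack("<B3xB3xII", 0x80, 0x83, 2048, 204800)
sector[510:512] = b"\x55\xaa"
before = parse_mbr(bytes(sector))

sector[446:462] = bytes(16)
after = parse_mbr(bytes(sector))
```

After the entry is zeroed, the table carries no trace of the former start sector or length, so any recovery has to come from other evidence such as file system signatures.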
The last example is the logical volume manager of Linux. Here we [have two]. I’ll quickly explain it. We have here a volume [21:22] zero, and we have a logical volume. A logical volume consists of virtual [extents], usually 4 MB by default, which map to physical [extents], which are also 4 MB in size. And there is a 1 MB label; the label itself is only 512 bytes, but the rest, 1 MB minus 512 bytes, is [21:42] regions. In there a plain-text data structure is stored, with a pointer, that describes the mapping. The nice thing is that this data structure is usually less than 4 KB, but we have almost 1 MB of space, which means we find multiple iterations of it. So we can solve the deleted mappings problem and also the most recently deleted mappings problem, our definitions 2 and 3.
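Editorial sketch of finding multiple generations of the plain-text metadata in a raw dump of the area after the label. It only searches for `seqno = N` lines, which LVM2 text metadata contains; the function name and the demo buffer are hypothetical, and a real tool would also have to parse the surrounding headers and handle wrap-around in the buffer.

```python
import re

SEQNO = re.compile(rb"seqno = (\d+)")


def find_metadata_versions(area):
    """Return (seqno, byte offset) for every metadata copy found, oldest first."""
    hits = [(int(m.group(1)), m.start()) for m in SEQNO.finditer(area)]
    return sorted(hits)


# Demo buffer: a 512-byte label placeholder followed by three generations.
area = (
    b"\x00" * 512
    + b'vg0 {\nid = "aaa"\nseqno = 1\n}\n\x00'
    + b'vg0 {\nid = "aaa"\nseqno = 2\n}\n\x00'
    + b'vg0 {\nid = "aaa"\nseqno = 3\n}\n\x00'
)
versions = find_metadata_versions(area)
```

The highest sequence number is the active state; lower ones are earlier states from which deleted mappings can be derived.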
And here’s an example of such a thing. At the beginning this [zero], as we can see here, had [three] virtual [extents] [22:18], then it was shrunk to use only one physical [extent]. So here we have two deleted mappings. Then another logical volume was created. Due to space constraints, it has to reuse physical [extents], and then we have them here.
Of course, as soon as logical volume one writes data here, it doesn’t help us to reconstruct any evidence, because then the data is overwritten and useless for us. But until that, it’s possible.
Then here we quickly see the text format. We have an ID which is unique, we have a sequence number which is increased every time the metadata changes, and we have an [extent size], which by default is 4 MB; it is given in blocks, and these are 512-byte blocks. Then we have the physical volumes, again with a unique identifier, which you can find in the [metadata] region, and the device size, [given in] blocks, and the PE start, which is again in 512-byte blocks, so that’s one megabyte, and the PE count, which is how many physical [extents] we have here.
Then we have the logical volume, with an ID and a creation time. The segment count counts segments, where a segment is space which is contiguously allocated. Then the extent count, how big it is, and whether it’s striped; in this case there is only one physical volume, so it doesn’t really matter if it’s striped or linear. And then where it starts. This is the mapping from the upper to the lower layer, what we refer to as the management layer.
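Editorial sketch of how the fields just described turn a logical extent into a position on the physical volume. The dictionaries mimic fields of the LVM2 text metadata (`start_extent`, `extent_count`, `stripes`, `pe_start`, `extent_size`), but the function and the sample values are made up, and only the linear case with a single stripe is handled.

```python
def logical_to_physical(segments, pvs, extent_size, logical_extent):
    """Return (pv name, physical extent, first sector), or None if unmapped."""
    for seg in segments:
        first = seg["start_extent"]
        if first <= logical_extent < first + seg["extent_count"]:
            pv_name, pv_first = seg["stripes"]          # single stripe: (pv, start PE)
            pe = pv_first + (logical_extent - first)
            sector = pvs[pv_name]["pe_start"] + pe * extent_size
            return pv_name, pe, sector
    return None


extent_size = 8192                      # 8192 sectors of 512 bytes = 4 MB
pvs = {"pv0": {"pe_start": 2048}}       # 2048 sectors = 1 MB before the first extent
segments = [{"start_extent": 0, "extent_count": 3, "stripes": ("pv0", 5)}]

hit = logical_to_physical(segments, pvs, extent_size, 2)
miss = logical_to_physical(segments, pvs, extent_size, 3)
```

Applying the same function to an older copy of the metadata gives the location of extents that the current copy no longer maps.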
Then, let’s come to the conclusion. We looked at the effect of abstraction layers on forensic evidence collection. What we saw is that collection on higher layers is often inferior to collection on lower layers. Then, something similar, the semantic gap: the higher layers actually hide information from the lower layers. And then, logical access restricts evidence for mapping-aware systems. This means that if you have any lower layer which knows whether blocks are being referenced or not, it can do garbage collection.
For example, on an SSD, when you do a SCSI UNMAP, the block is no longer accessible. You have to do chip-off or use patched firmware. There is also of course the garbage collection of the SSD, which is sometimes done after a power cycle, which then erases blocks so that you are no longer able to access them. The [24:51] is [once in mind] that blocks are being released when you do that, for example, on a volume that’s been provisioned.
Then, evidence collection using APIs improves. An example of that would be NFS version [4.2], which is currently in the [standardization] process. Currently, for example, when you retrieve a large file over NFS, you have to read every block, because you don’t know whether it’s physically [25:14] or not.
In NFS [4.2] you can actually see whether it’s physically allocated or not.
Then, in mapping-agnostic systems, deleted mappings can often be recovered unless overwritten. We already saw that: if the layer is agnostic, the blocks are not actively being reused. That means if no one has written to that particular block, it can be recovered. One thing that we have to think about: when we look, for example, at the [Carrier] paper, we can see that file systems might be a little more complex than our model is. Of course, maybe we can just [25:50] the file system in order to fit our model.
However, we still have to think about what we are going to do here. What we will probably do in the future is develop a tool which implements some of these mappings. And compositionality: currently we look at one abstraction [layer] at a time, and the idea is to deal with multiple abstractions in one piece and be able to access them through the active mappings. This is [trivial], at least if the management data [structures] are well understood. For other things, the effort could be much, much higher.
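Editorial sketch of the compositionality idea: if the active mapping of each layer is known, the mappings can be chained to follow a block through the whole stack. The function `compose` and the three sample layers are hypothetical and use one-to-one block mappings for brevity.

```python
from functools import reduce


def compose(upper_to_mid, mid_to_lower):
    """Chain two mappings; blocks without a mapping further down are dropped."""
    result = {}
    for upper, mid in upper_to_mid.items():
        if mid in mid_to_lower:
            result[upper] = mid_to_lower[mid]
    return result


# Hypothetical stack: file system block -> logical volume block -> disk location.
file_system = {"file-block-0": 3, "file-block-1": 4, "file-block-2": 8}
volume_manager = {3: 100, 4: 101}
raid = {100: ("disk0", 50), 101: ("disk1", 50)}

through_stack = reduce(compose, [file_system, volume_manager, raid])
```

Here `file-block-2` disappears from the result because the middle layer has no active mapping for block 8, which is the kind of loss the model is meant to characterize.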
Okay. That’s it from me. If you have any questions, go ahead.
Presenter: So we have time for one or two short questions …
Glanzmann: Everyone is tired at the end of the day. No questions [27:02].
End of Transcript