Adding APFS Support to The Sleuth Kit Framework

Joe Sylve discusses his work at DFRWS USA 2018.

Joe: So, as [Brad said], my name is Joe Sylve, I’m the Director of Research & Development at BlackBag Technologies. I’m going to talk about some work that we’ve done to integrate APFS support into the Sleuth Kit framework.

I was going to say that this might be a little bit of an awkward talk for me, but I’m not in a Santa suit, so I’m not going to say that anymore.

[laughter]

Joe: But the overview here is that we’ve got pretty full support for APFS in TSK, but the awkward part is that I can’t give it to you yet. Yeah, boo. BlackBag has been very nice and agreed to allow me to release the work. It’s just going to be released “soon”, because our commercial competitors haven’t really come close yet. But I would say [crosses fingers] … six months?


But what we were looking into doing is releasing our [pool] storage implementation, which … relates to [Jay and Martin’s] paper [on BTRFS] and the paper from last year [on ZFS]. While their work was pretty awesome, their implementation is mostly [wrapped around the command-line] tools, whereas our pool storage is actually in the framework itself. So, we’re going to be working with those guys in the coming weeks to actually get that stuff compatible with our pool storage layers in the framework and push all of that upstream. So, that one first, and then the APFS [we’ll get to you].

So, what can we do? We can do most things. We can fully parse the APFS containers – the containers are just the APFS terminology for the pools. We can fully parse all the file system data and metadata that you would expect, [02:07], including compressed and sparse files, and encrypted files, whether that encryption is the native APFS encryption or comes from devices that have been upgraded from Core Storage, from HFS+ to APFS. If you have an HFS+ system, under most conditions, and you upgrade to macOS 10.13, your file system, whether you know it or not, is probably now not an HFS+ file system but an APFS file system, because these things automatically get upgraded. The encryption works a little bit differently in those scenarios.

We’re also now able to parse through the snapshots. APFS takes snapshots hourly if you have Time Machine on on your system. They don’t stick around for a whole long time, and the rule of thumb is that, on average, you’ll have a snapshot for every hour for the last 24 or so hours. So, it’s not like a volume shadow copy that’s going to stick around pretty much forever until you delete it or run out of space. But I think it’d be useful in scenarios where you’re kicking down the door and someone’s deleting their files.

What we’re still currently working on is support for analysis of the new iMac Pro and now the new MacBook Pro, and maybe others of their higher-end systems, because Apple is really, really focused on security and privacy these days, which, personally, is a good thing; professionally, it makes my life a little bit more difficult. But these devices actually come with hardware, a T2 chip, which has two mechanisms. One is [03:52] storage, where the keybags that have all the encrypted keys from the first step of the decryption are not stored on the disk, like they are with most APFS … well, they’re stored on the physical hard drive, but they’re not stored on what you would read [04:07] from the disk; they’re actually stored on the chip. So, we’re able to now extract that information from the chip and pass it to the chip itself for decryption, but there’s still some steps there that we haven’t quite worked out yet. And [the Apple goddess] did not let me get [04:22], so still working.

And additionally, we do not have full support for fusion drives, mostly because Apple’s implementation of this hasn’t seemed to stabilize yet, [if they] don’t even have it enabled by default, but I am told that will change with the next version. So, right now, if you do somehow come across a device that has a fusion drive, you can still analyze it by imaging the APFS container, the logical container, [which will be the logical concatenation] of the two different physical partitions. But if you have [a dead] drive that you can’t boot up the system, and you have to [get into both the] physical sectors individually, we’re going to be working on doing those things together in pool storage.
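
The "logical concatenation" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ConcatenatedImage` is a hypothetical name, not part of Sleuth Kit, and the sketch assumes the two partitions are simply laid end to end in one address space, which may not match how a real fusion drive maps its blocks.

```python
import io
from bisect import bisect_right


class ConcatenatedImage:
    """Hypothetical helper: presents several byte sources as one address space."""

    def __init__(self, parts):
        # parts: list of (seekable file-like object, size in bytes), in logical order
        self._parts = parts
        self._starts = []
        offset = 0
        for _, size in parts:
            self._starts.append(offset)  # logical offset where this part begins
            offset += size
        self.size = offset

    def read(self, offset, length):
        """Read up to `length` bytes starting at logical `offset`."""
        if offset < 0:
            raise ValueError("offset must be non-negative")
        out = bytearray()
        end = min(offset + length, self.size)
        while offset < end:
            # Find the part that contains this logical offset.
            idx = bisect_right(self._starts, offset) - 1
            src, size = self._parts[idx]
            local = offset - self._starts[idx]
            chunk = min(end - offset, size - local)
            src.seek(local)
            data = src.read(chunk)
            if len(data) != chunk:
                raise IOError("short read from underlying part")
            out += data
            offset += chunk
        return bytes(out)


# Two stand-in "partitions" (e.g. an SSD part and an HDD part).
ssd = io.BytesIO(b"AAAA")
hdd = io.BytesIO(b"BBBBBB")
image = ConcatenatedImage([(ssd, 4), (hdd, 6)])
spanning = image.read(2, 4)  # crosses the boundary between the two parts
```

A read that straddles the boundary is split into one read per underlying part, so anything layered on top sees a single contiguous image.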

Obviously, Sleuth Kit doesn’t support pool file systems, so we had to make a few framework changes, and luckily, the framework changes are fairly minimal. So, if you’re using Sleuth Kit in your tools, there are only a few things in this new version that you would have to know how to do. First, we’ve added a pool storage layer, and this sits between the [05:44] system layer and the file system layer. And it has pretty much the set of APIs that you would expect from Sleuth Kit. These are very, very similar to the API calls for the file system layer.

However, this layer is optional. You don’t have to actually open up the pool layer for devices that don’t have the pool layer, for instance. So, you have your calls, you have various [open] calls whether you’re doing a disk [06:10] whatever [close]. TSK pool read is pretty much the same as TSK FS read, but you’re reading from a pool block rather than from the image block.
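
To make the layering concrete, here is a minimal Python sketch of an image layer, a partition layer, and an optional pool layer that reads in pool blocks. The class names, the 512-byte sector, and the 4096-byte block size are assumptions chosen for illustration; none of this is the Sleuth Kit API.

```python
class Image:
    """Lowest layer: the raw bytes of a disk image."""

    def __init__(self, data):
        self._data = data

    def read(self, offset, length):
        return self._data[offset:offset + length]


class Partition:
    """Volume-system layer: a byte range inside the image."""

    def __init__(self, image, start, size):
        self._image = image
        self._start = start
        self._size = size

    def read(self, offset, length):
        # Clamp the read so it never runs past the end of the partition.
        length = max(0, min(length, self._size - offset))
        return self._image.read(self._start + offset, length)


class Pool:
    """Optional pool layer: addresses the partition in fixed-size blocks."""

    def __init__(self, partition, block_size):
        self._partition = partition
        self._block_size = block_size

    def read_block(self, block_num):
        return self._partition.read(block_num * self._block_size, self._block_size)


SECTOR = 512   # example sector size
BLOCK = 4096   # example pool block size

# Two empty sectors, then three pool blocks filled with 0x00, 0x01, 0x02.
disk = bytes(SECTOR * 2) + b"".join(bytes([i]) * BLOCK for i in range(3))
partition = Partition(Image(disk), start=SECTOR * 2, size=BLOCK * 3)
pool = Pool(partition, BLOCK)
block1 = pool.read_block(1)
```

Code that works on a plain volume can stop at `Partition`; only pool-backed file systems need the extra `Pool` step, which is the sense in which the layer is optional.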

Getting the unallocated runs: in pool storage systems, because the individual volumes themselves can grow and shrink based on usage, the file system itself doesn’t generally have unallocated space associated with it. If you’ve got unallocated space, it now belongs to the pool. So, we need to have a way, at the pool layer, of collecting all these unallocated blocks that we can carve through.
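
As a rough illustration of what collecting unallocated runs means, the following Python sketch turns a per-block allocation map into (start, length) runs that a carver could walk. The function name and the list-of-booleans input are assumptions for the example; the real pool layer's data structures are not described in the talk.

```python
def unallocated_runs(allocated):
    """Yield (start_block, length) for each maximal run of free blocks.

    `allocated` is a sequence of booleans, one per pool block,
    where True means the block is in use.
    """
    run_start = None
    for block, in_use in enumerate(allocated):
        if not in_use and run_start is None:
            run_start = block                      # a free run begins here
        elif in_use and run_start is not None:
            yield (run_start, block - run_start)   # the free run just ended
            run_start = None
    if run_start is not None:
        # The map ended while still inside a free run.
        yield (run_start, len(allocated) - run_start)


bitmap = [True, False, False, True, True, False]
runs = list(unallocated_runs(bitmap))
```

Reporting runs rather than individual blocks keeps the output compact when the pool has large free extents.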

And … yadda, yadda, yadda … the rest of the stuff is what you would expect.

Very few changes to the file system layer. Pretty much just a different open command, just like right now you have the TSK FS image open – if you have just a volume image, or [TS, FS, VS] open or [07:17] whatever happens to be, if you do have a volume. This is just … there’s a special command, special function to call, to open a pool, a file system that happens to be on a pool volume.

There are a few [new dependencies], and I would be interested to know if these would be a problem for anyone. The actual implementation itself is written in [a variant of C++] or a reasonably modern [variant, C++14], because we should not be programming like it’s 1998. Of course, there’s the [C wraparound], so you only need to have C++ support to use it.

However, this does [08:05] it has potential issues with [08:07] TSK, as … for whatever reason, [08:11] compiling requires Visual Studio 2008 [08:16] 2007, and obviously Visual Studio 2008 does not have support for a C++ [standard] that came out in 2014. So, I would be very interested in exploring workarounds for this. I think it can be done, just without static binding of the [08:34] but we’ll see. [08:37] is supposed to die, eventually.

Because we support encryption, there is a new dependency on OpenSSL. I’m not [08:50]. If you have any better suggestions or reasons why OpenSSL should not be a requirement for Sleuth Kit, I would be interested in hearing from you.

Future work that still needs to be done. We don’t use, internally, the [Java or the Python bindings], so those [are ones that] need to be updated. Visual Studio compilation is probably just a matter of changing the Visual Studio project files used. [We cross-compile it with MinGW] for Windows, so I didn’t even bother opening up Visual Studio. So, that’s probably just [09:31]. And the [large thing here] is that we will be working with the authors in [pooling] the existing [ZFS and BTRFS] implementations into a pooled storage layer, and getting those actually upstream. So, [as soon as we can we’ll work on] more file systems for everyone.

I guess for now, the marketing team would want me to say that if you want to parse APFS volumes you can [get a licensed] BlackLight.

So, let’s do some live demos [10:07]. Even better, my images, so I’m going to be analyzing my running disk, so [10:16]. Yeah.

Alright, so there, the [tool link] has also been updated, the command line tools. Obviously, we needed to add some tool to [10:33] get information about the pools themselves, so there is … I know this is sort of conflicting to the other tools that [have been] put out, but maybe we’ll come up with something [that works for everyone].

There’s a [pstat] tool, and you can give it a raw disk image, but in this case, I’m giving it a disk. And this is going to give you information about the pool, and the different pool volumes.

Nope. Why not? Yeah, our disk. Nope.

[laughter]

Joe: Oh, I’m sorry, I’m doing this on the disk. Of course. The disk image and not the … container. First we do that. So, just like regular Sleuth Kit, you have your disk itself, your physical disk is still going to be partitioned in the physical partitions, but one of these partitions is going to be used for the pool, and again, if it’s a fusion drive, you’re going to have more than one partition, so our tools still use the [11:54], as you would expect.

So, [pstat] now … I’m giving it the correct [offset]. It’s going to give me information about the [general] pool, so it’s generalized information about the container, this is the APFS-specific information, but I imagine there’s a reasonable analog for most pool file storage systems. And the breakdown of the different volumes there on the system, [12:24] our main volume and SSD. The encryption … these are the [wrapped] encryption keys … there are multiple different volumes here, you can see this is two … there are actually four different volumes.

Once we decide which volume we would like to analyze, the important thing here that you need to remember is this … okay, so this is our main volume, the volume block number. Because we have a pool storage system, the … all the different pools [in] different volumes are going to be pulling from different blocks. So, these [aren’t contiguous], right? You can’t just say that it’s going to be block zero where the file system starts. So, every different volume among these file systems is probably going to have like one [13:13] block, it’s the starting point. So, we need to know that starting point.
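
A toy model shows why the starting block has to be looked up rather than assumed. In this Python sketch, two volumes draw blocks from one shared free list, so their blocks interleave and neither starts at block zero. `ToyPool` and its methods are hypothetical, and the first-free allocation policy is a simplification, not how APFS allocates.

```python
class ToyPool:
    """Hypothetical pool in which every volume allocates from one shared free list."""

    def __init__(self, total_blocks, reserved=1):
        # The first `reserved` blocks stand in for pool-level metadata.
        self._free = list(range(reserved, total_blocks))
        self.volume_root = {}  # volume name -> its starting block
        self.owner = {}        # block number -> volume name

    def allocate(self, name):
        """Give the next free block to volume `name`."""
        if not self._free:
            raise RuntimeError("pool is full")
        block = self._free.pop(0)
        self.owner[block] = name
        return block

    def create_volume(self, name):
        """Create a volume and record the block where it starts."""
        root = self.allocate(name)
        self.volume_root[name] = root
        return root

    def blocks_of(self, name):
        return sorted(b for b, n in self.owner.items() if n == name)


pool = ToyPool(total_blocks=16)
pool.create_volume("Main")
pool.create_volume("Data")
pool.allocate("Main")
pool.allocate("Data")
pool.allocate("Main")

main_blocks = pool.blocks_of("Main")
data_blocks = pool.blocks_of("Data")
```

Because the two volumes grow independently, each one's blocks end up scattered through the pool, and the only reliable entry point is the recorded starting block for that volume.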

So, we’re going to copy that. And then if I do … we’re already in the [13:25] … I’m going to be able to pass this [as a new] command line option, [13:32] [hopefully it will … didn’t seem to conflict] with any of the other tools.

So … yeah. But this is encrypted, so you need to give it a password, so there’s a new [13:46] for your key, and luckily, I don’t have to show you my password here on the command line, because with APFS, all you need is the password of any user on the system [or the] recovery key to decrypt it, so I’ve created a new user. The password is not my password. [14:05].

This is a running system, I bet you the file system block has changed since I was talking.

It did.

Yeah, okay. So, copy-on-write file systems overwrite their [14:31], remember that. So, now that we’re able to do this, [number …] this is actually decrypted, so it’s also going to give me information about the [unwrap] passwords, and here’s also all the different snapshots that are on that volume. So, I’m going to copy this for later, and of course [14:50] going to work as you would expect, all the different tools, [14:55] all that stuff works.

This is the most … if you don’t give it any other options, this is going to be the active volume, and if you want to parse the filesystem state at the time of the snapshot, there’s a new [capital S flag] as well. And that’s [… just trust me, that’s from] parsing the snapshot.

You all know how Sleuth Kit works, so I don’t have to go any further, it’s largely unchanged after that, it’s just the few other flags that you need to [pass your] information [through the] decryption password [15:29].

That being said, are there any questions? Other than “When are you going to release this?” because I can’t answer [you right now].

Host: How about a round of applause for Joe?

[applause]

Host: Questions?

Joe: Great. [Works for me.]

Host: Thanks once again, [15:56], and next up we have [Darryl] to …

End of transcript
