Using Santa To Augment Forensic Investigations

James Nettesheim and Gary Brown discuss their work at DFRWS EU 2018.

Gary: Hi. The title of this talk is ‘He’s Making a List, and We’re Checking it Twice: Santa for Forensic Analysis’. I want to point out that it was very difficult coming up with this title. We had many runners-up, including ‘I’m Telling You Why: Santa as a Forensics Tool’, ‘He Sees You When You’re Happy and Knows Just What You [Take]’, and ‘I Saw [00:37] Santa Claus’. None of those worked out, and ‘He’s Making a List, and We’re Checking it Twice’ is the title.

But you’re probably wondering who we are. I’m Gary. I’m on the Digital Forensics team at Google. I handle all security incidents, with a specialty on [00:53] last couple of years, and that’s kind of how I [fell into Santa]. And before that, I worked in the Detection team at Google, and before that, I did detection for the Federal Reserve’s National Incident Response team.

I like sugar, most of the four major food groups – candy, candy corns, candy canes, and syrup. I also enjoy fast food, and I’m a television enthusiast.

James: Thank you, Gary. I’m James. I’m also at Google, [we respond to all the security] [01:23] [I’ve had a life with the] US government, the United Nations, and [01:26], I also love a good [cheese] joke, so [if anyone] wants to share a [cheese] joke later, let me know. Like, for example, what was left [of a] French cheese factory after the explosion? Just de brie.


[laughter]

James: It’s a good one.

So, what are we actually here for? We’re here to talk about Santa. Santa is an open-source tool from Google. So, we’ll talk about what it is, we’ll talk about some of the well-known stuff about it, talk about some of the lesser-known stuff, and then some analysis [01:59].

So, what is it? We call it Santa, because it knows whether your binary is naughty or nice. It’s [application whitelisting] for macOS. So, yeah, if your application [isn’t whitelisted], you can try and double-click that malware, which users love to do, [02:14] block it for you.

The way we’ve done this is – in full disclosure, neither Gary nor myself are the developers on this, but we love to use it, so we thought we’d share it with you. macOS is extensible through kernel extensions, and you can use what are called kernel programming interfaces to interact with the kernel and make decisions on things such as file executions. We use the kernel authorization KPI, which provides a number of powerful features for us. This allows us to listen in on the virtual file system activity going on on the Mac. And we can take actions based on [02:54] or file system operation on Macs. [02:55] are just like how the kernel sees files, it’s the virtual way that the kernel will see a file on the Mac. [03:01] can either take direct action or we can take indirect action. The great thing about it, it’s fully open-source. And with the latest versions of macOS, kernel extensions have to be digitally signed.

So, we’ve taken care of that for you as well. We have an Apple-issued digital signing certificate, so you can take this open-source and install it across your enterprise or home PC, anything like that.

There’s a lot of different [binaries of related concepts], but we’ll focus primarily on the whitelisting and then the file system [tracking]. So, there’s a [Santa driver] and it registers two different types of KPIs and [03:40] them under the [Kauth scope], it’s the [vnode] listener that listens to [vnode] file execution [03:46]. And then the file [03:47] [scenario], which gets us executions, deletions, renames, [03:51] changes. We’ve also added disk mount tracking by leveraging Apple’s Disk Arbitration Framework.

So, why do we track executions in two different ways? Why both? Why not both? [04:04]. So, we use the first one to make an [allow or deny] decision on the executable [04:10], and then we use the second one to log process [04:14] the action that was actually taken, did it execute or was it blocked?

I’m not going to dive too deep into this, but kind of a quick overview of how this works. We register a [Kauth scope vnode] listener. This flow will then help us track [Kauth vnode execute events]. The Kauth [04:33] function is executed by the kernel when a process is trying to execute a file. And it basically provides information on where the executable is located. The [Santa driver] will then check its local cache to see if it has an allow or deny entry for that vnode/file system [ID] combination. And it returns that decision to the Kauth KPI, which decides if it will allow that execution to take place. If it receives a deny, it blocks and we’re done. [05:01] carry on. If it receives an allow, then [it passes it on to other] Kauth listeners, that’s where we can then start tracking the process arguments and things like that.

If there is no entry for the vnode [ID] in the cache, then we reach out to the server and try and get a decision for the file there. Any [write to the vnode ID] will invalidate the cache entry, so that’s how, if a file is updated, we won’t … we will make sure that we hash it again and get the information from our external server.
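The cache behaviour James describes can be sketched roughly as follows. This is a toy Python model, not Santa's actual code: the class name, the `slow_lookup` callback (standing in for hashing the file and asking for a decision), and the sample paths are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

ALLOW, DENY = "ALLOW", "DENY"
Key = Tuple[int, int]  # (file system ID, vnode ID)


class DecisionCache:
    """Toy model of the allow/deny cache described in the talk."""

    def __init__(self, slow_lookup: Callable[[str], str]) -> None:
        # slow_lookup is a hypothetical stand-in for hashing the file
        # and asking for a decision outside the cache.
        self._slow_lookup = slow_lookup
        self._entries: Dict[Key, str] = {}

    def on_execute(self, fs_id: int, vnode_id: int, path: str) -> str:
        key = (fs_id, vnode_id)
        if key not in self._entries:  # cache miss: take the slow path once
            self._entries[key] = self._slow_lookup(path)
        return self._entries[key]

    def on_write(self, fs_id: int, vnode_id: int) -> None:
        # A write drops the entry, so the next execution is evaluated again.
        self._entries.pop((fs_id, vnode_id), None)


lookups: List[str] = []


def fake_lookup(path: str) -> str:
    """Illustrative decision source that records every slow-path call."""
    lookups.append(path)
    return DENY if path.endswith("/dropper") else ALLOW


cache = DecisionCache(fake_lookup)
tool = "/Applications/Tool.app/Contents/MacOS/Tool"  # made-up path
first = cache.on_execute(1, 42, tool)
second = cache.on_execute(1, 42, tool)  # served from the cache
cache.on_write(1, 42)                   # file modified: entry invalidated
third = cache.on_execute(1, 42, tool)   # looked up again
```

The point of keying on the vnode and file system ID pair rather than the path is that the decision follows the file the kernel sees, and any write to it forces a fresh evaluation.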

But – so, application-wise this is great, but we also have a lot of file writes and modifications that you can listen to with this. So, we get all write data events to files, we get deletion events, we get rename events, we get exchange events and [link] events and [close] events. The primary ones we’re [05:50] are write events, the delete events, and the rename events. Everything’s logged [05:58] Santa, santa.log. All execution [06:01] are logged there. Gary will talk about this a little bit, but you can set up regex to track which files are being written to, based on their path. And we’re bypassing the unified logging system in Apple right now. So, everything’s still in plain text, in text files, which makes it a little bit easier to work with logs [06:20] or something like that [06:22].
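Because the log is plain text, a few lines of Python are enough to start working with it. The sketch below assumes each line has a prefix followed by pipe-delimited `key=value` fields. The sample lines are made up for illustration and are not captured Santa output, so check the layout against your own santa.log before relying on it.

```python
from typing import Dict, List


def parse_line(line: str) -> Dict[str, str]:
    """Split one log line into fields, assuming 'prefix: k=v|k=v|...'."""
    _, _, payload = line.partition(": ")
    fields: Dict[str, str] = {}
    for part in payload.strip().split("|"):
        key, sep, value = part.partition("=")
        if sep:  # skip fragments that are not key=value
            fields[key] = value
    return fields


# Illustrative lines only; paths and timestamps are invented.
SAMPLE: List[str] = [
    "[2018-03-21T10:00:00.000Z] I santad: action=EXEC|decision=ALLOW|path=/usr/bin/true",
    "[2018-03-21T10:00:05.000Z] I santad: action=WRITE|path=/Users/example/Downloads/report.pdf",
    "[2018-03-21T10:00:09.000Z] I santad: action=DELETE|path=/Users/example/.Trash/report.pdf",
]

events = [parse_line(line) for line in SAMPLE]
file_events = [e for e in events if e.get("action") in {"WRITE", "DELETE", "RENAME"}]
```

A naive split like this breaks if a path or argument itself contains a pipe character, so a production parser would need to handle escaping.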

We’ve also … binary [06:25] is hard, right? So, how do you deploy this across an enterprise and see who … and allow people to execute the software they want? We’ve created a tool called Upvote, which is [06:36] becoming a buzzword, I hope it becomes the next blockchain. It basically allows your users to upvote files and decide if they want to run them. Which, there is no danger in that, [06:47] try and run anything bad. But it also allows you to do it per user, and [06:52] Santa, so I can log on to a variety of different devices, and my policy follows me, versus following [06:59]. It’s also open-source, and you can get it at google/upvote.

One thing you’ll notice is that if five users across our enterprise – and this is configurable – say they think this executable is safe, then it will be allowed locally on those users’ workstations to execute. You can change this [of course]. Gary and I, as Santa’s helpers, have full power to allow blocks [07:23] across the whole enterprise. And as you can see [in the logs here], so people are upvoting a file, and of course, that’s the [07:33] page for that. So, I [would be wary] of … definitely monitor this, this has got some pretty good information, because users love to run malware. When [I’m trying to run] malware, I just tell Gary to upvote it, and he … because he’s my friend.

[laughter]

Gary: Yeah, you may want to set high thresholds and/or discourage company mailing lists that say, “Hey, can you just blindly upvote this? I’m trying to run it on my machine, it’s not letting me.” That is, more often than not, how this happens.

James: If you look at [08:01] you’ll see people sitting around each other, constantly upvoting their tools that they’re downloading. But it does save you a fair amount of administrative overhead and it gets in the way of a fair amount of malware too.
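The threshold idea can be sketched in a few lines. This is a hypothetical Python model of the rule as described in the talk (a configurable number of distinct users must vote, and the file is then allowed for those voters), not Upvote's actual implementation. The class and method names are invented.

```python
from collections import defaultdict
from typing import DefaultDict, Set


class VoteTally:
    """Hypothetical model of per-file voting with a configurable threshold."""

    def __init__(self, threshold: int = 5) -> None:
        self.threshold = threshold
        self._voters: DefaultDict[str, Set[str]] = defaultdict(set)

    def upvote(self, sha256: str, user: str) -> None:
        # A set means repeat votes from the same user count only once.
        self._voters[sha256].add(user)

    def allowed_for(self, sha256: str, user: str) -> bool:
        voters = self._voters.get(sha256, set())
        # Allowed only once enough distinct users voted, and only for voters.
        return len(voters) >= self.threshold and user in voters


tally = VoteTally(threshold=5)
for name in ["ana", "ben", "cho", "dev"]:
    tally.upvote("abc123", name)
tally.upvote("abc123", "ana")              # duplicate vote, still four voters
before = tally.allowed_for("abc123", "ana")
tally.upvote("abc123", "eli")              # fifth distinct voter
after = tally.allowed_for("abc123", "ana")
bystander = tally.allowed_for("abc123", "zed")
```

Raising `threshold` is the knob Gary refers to: the higher it is, the harder it becomes for a small group to approve something by voting for each other's downloads.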

So, what can you do with this for malware [types]? [Michael George from Dropbox] recently wrote [08:18] blog post, and he was happy for us to kind of point this out, how they were using Santa in their environment – they weren’t in deny mode, but he showed how you could use it to track the [Proton] malware, which was the [HandBrake] supply chain issue recently, where they were basically pushing a [bad] backdoored version of the app. So, you can see here, [08:40] the first execution of the app. And then you can see what it was doing. [In this top one] you can see it’s [zipping up one] password vault for you, and then you can see the kernel command to execute all of this data to the c2 command, [08:56] dump is. You can see all of this locally and see exactly what the malware was doing, without doing any other analysis, other than looking at [your central logs].

But you can even do more. What if you want to hunt? While we were looking through this locally in our environment, we noticed that the legitimate [09:18] mounted itself in a volume called [HandBrake] with a version number. That one didn’t have that version number. So, we were able to find some other versions of this malware purely based on looking at the names of the [filepaths] that were happening in the [Santa logs].
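A hunt like this comes down to pattern-matching on logged paths. The Python sketch below flags volume names that start with the app name but lack a version suffix. The exact naming scheme (`HandBrake-<version>`) and the sample paths are assumptions for illustration; the talk only says that the legitimate volume carried a version number and the malicious one did not.

```python
import re
from typing import Iterable, List

# Assumed naming: legitimate images mount as /Volumes/HandBrake-<version>.
VOLUME = re.compile(r"^/Volumes/(HandBrake[^/]*)(?:/|$)")
VERSIONED = re.compile(r"^HandBrake-\d+(\.\d+)+$")


def suspicious_volumes(paths: Iterable[str]) -> List[str]:
    """Return volume names that look like the app but carry no version."""
    found: List[str] = []
    for path in paths:
        match = VOLUME.match(path)
        if not match:
            continue
        name = match.group(1)
        if not VERSIONED.match(name) and name not in found:
            found.append(name)
    return found


# Invented example paths, as they might appear in execution log entries.
logged_paths = [
    "/Volumes/HandBrake-1.0.7/HandBrake.app/Contents/MacOS/HandBrake",
    "/Volumes/HandBrake/HandBrake.app/Contents/MacOS/HandBrake",
    "/Applications/Safari.app/Contents/MacOS/Safari",
]
hits = suspicious_volumes(logged_paths)
```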

And then, finally, what if you’re doing some live analysis? You just show up at your workstation, there’s a Santa control binary, and you can run [fileinfo] on the apps you have downloaded to your Mac, and it’ll tell you where it was downloaded from, what the referrer URL was, what the agent that downloaded it was, and [09:53] things like that. Oftentimes, you can just run this through your apps directory and try to find [that app] that came from a weird location. [Pretty cool, right?]

But what more can we do?

Gary: Hi again.

James: I love that [10:09].

Gary: James talked a little bit about using it for … to block or allow executions. But there are some other cool side-effects, as James mentioned. We are getting good [instrumentation] on writes, renames, and deletions, and the current mechanisms, prior to Santa, [left us] wanting more. For example, FSEvents – I only learned this recently, but FSEvents doesn’t log timestamps. Timestamps are inferred by the timestamp on the FSEvents file itself. And that can be very scary.

Journaling, quickly [10:48], and then [10:50] things like Spotlight, it’s inconsistent, it’s [reporting] time, [for you to track] things over time, that’s better. So, with Santa [tracker] we have, as I mentioned, writes, deletions, renames, [links] and exchanges, and as James mentioned earlier, [disk mounts] are kind of a [11:03] through Disk Arbitration. So, there’s potentially a lot of data there. This is done by regex. So, you can include or exclude directories and sub-directories thereof.

Here’s an example one. Fill in a custom one that caters to your environment and filters out your log files that you don’t care too much about, and [want to] include [things that you do]. For example, [11:28] and particularly what we’re going to talk about here is volumes.
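The include/exclude idea can be illustrated with two patterns. This Python sketch shows only the filtering logic, not Santa's configuration syntax. Both patterns and the sample paths are hypothetical and would need tuning for a real environment.

```python
import re

# Hypothetical choices: watch removable volumes and home directories,
# but skip noisy cache and log directories underneath them.
INCLUDE = re.compile(r"^/(Volumes|Users)/")
EXCLUDE = re.compile(r"/Library/(Caches|Logs)/")


def should_log(path: str) -> bool:
    """True when a path is in scope and not in an excluded subdirectory."""
    return bool(INCLUDE.search(path)) and not EXCLUDE.search(path)


usb_write = should_log("/Volumes/UNTITLED/plans.docx")
cache_write = should_log("/Users/example/Library/Caches/app/index.db")
system_write = should_log("/private/var/log/system.log")
```

Keeping the exclusions narrow matters for forensics: every directory filtered out to reduce log volume is a place where file activity will go unrecorded.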

So, USB tracking is probably the biggest [hit] here. It logs and tries to resolve models and serials for USB devices that are plugged in. And of course you’ll end up with default … things like [noname] or untitled, and it can be … especially with the cheaper USBs, it can be hard to [disambiguate], but we can see when they’re mounted. This is an example of the Drive FS or Drive File Stream utility, and this is kind of indexing, if you’ll see … you’ll see that … oh, no. That’s the next one. Yeah. So, this is [volume/googledrive], this is [12:14], and you can see it kind of discovers all these other files, and that’s a good contextual artefact that you may be able to use.

Renames – you can see, like the [12:27] a Chrome file being downloaded or a file being downloaded from Chrome. So, we see the temp file, we see the [12:32], we know that the download has completed. We know where it is. And then we can see it being renamed. And put in its final location. So, from a timeline, we can see the beginning of the download, we see the plug-in of a USB device, we see the download finish, most importantly we see that file go to the USB, and then we see the file get trashed and manually deleted. And as I mentioned before, you can use things like Spotlight [12:58].

So, we see the [13:02] here – there’s a [noname] and there’s an untitled. The serial’s all the same, and the name is [13:08]. And if you start looking at Spotlight [UIDs], you can see that there are three distinct ones. And this could be two devices where one was overwritten, but by seeing that first one later on, we know that instead of one or maybe two devices, there are three distinct devices in play. And that can definitely come in handy when your investigative groups or lawyers have gone to speak to someone and they’re denying the existence of these devices.

So, we get some [13:38] forensics intel here as well. We can see deletions. It provides context and intent. Was the trash cleared? How often is the trash cleared? You can check the timestamp and see if everything’s going at once or if one thing was moved to trash and then that one item was deleted. That, again, it isn’t rock solid, but it can provide some insight into the user’s intent.
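One way to look at the timing question is to group deletion events into bursts. The Python sketch below is an assumption-laden illustration: the two-second gap, the function name, and the sample events are invented, and as Gary says the result is a hint about intent rather than proof.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple


def group_bursts(
    events: List[Tuple[datetime, str]],
    gap: timedelta = timedelta(seconds=2),
) -> List[List[str]]:
    """Group deletions whose timestamps fall within `gap` of the previous one."""
    bursts: List[List[str]] = []
    last: Optional[datetime] = None
    for when, path in sorted(events):
        if last is not None and when - last <= gap:
            bursts[-1].append(path)  # continues the current burst
        else:
            bursts.append([path])    # starts a new burst
        last = when
    return bursts


# Invented deletion events under a user's trash directory.
t0 = datetime(2018, 3, 21, 10, 0, 0)
deletions = [
    (t0, "/Users/example/.Trash/a.pdf"),
    (t0 + timedelta(seconds=1), "/Users/example/.Trash/b.pdf"),
    (t0 + timedelta(seconds=1, milliseconds=500), "/Users/example/.Trash/c.pdf"),
    (t0 + timedelta(minutes=10), "/Users/example/.Trash/d.zip"),
]
bursts = group_bursts(deletions)
```

A large burst is consistent with the whole trash being emptied at once, while a burst of one suggests a single item was singled out for deletion.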

James: Yeah, just to wrap everything up now, there’s different ways to do this in an enterprise. You can do a raw log review. We like to export everything to a big database, and then we can search it pretty quickly and easily. We’re also … no Google talk can not mention [14:16], so we’re building a [14:17] parser, and that’s coming soon. And then you can do it with the [14:21] control binary as well. So, there’s lots of different ways you can leverage this information across your enterprise.

There are some bad people out there too, Mr. Grinches, that have proven that they can bypass Santa. But at the end of the day, what we really like is that application whitelisting is a big way for enterprises to protect your environment or even your local workstations. And then you can get all this other great … USB tracking, file write tracking, and things like that, for free. So, why not install it, why not use it? And start finding [14:52].

Any questions?

[applause]

Audience member: My concern, and I hope you can address it, is that you have that upvote thing, which I think makes [15:16] crowdsourcing, I think it’s a cool idea. But what’s stopping my malware from just sending those [15:22] to [15:25].

James: We’re definitely … there’s a Google authentication mechanism that we use. And we use two-factor, so you would be forced to touch your [15:33] or something before you could get [15:36] upvote it. But if the malware can spoof that somehow, theoretically, it’s [not] already running, so … but yeah, if something was running, you could … it was probably [15:44].

Gary: That’s assuming you already have … something would already have to be compromised.

Audience member: Yeah. Fair enough.

Audience member: I assume the answer to this is going to be scale. [Have you guys considered this as a bridge] to this issue, being able to sort of have a [mediated] review of the most upvoted things, in other words, not [16:05] automated mechanism [16:08] basically be a [nomination] mechanism for [16:11].

James: We could do that. We could see what’s being [upvoted]. We could change the threshold and say if it hits this, then it’ll [go under a queue], and [16:18], and you can set it to where people could never actually fully upvote something. But it’s a give-and-take with [overhead] of course, too, once you get into tens of thousands [16:29].

Gary: There were human security engineers that were tasked with being the arbiters of what does and doesn’t get approved, and they hated that. So, I know … and it’s hard. Are there better uses of their time? And the answer is yes. So, this is kind of doable [16:48].

[bell rings]

Gary: That’s time.

[laughter]

Host: Do you want to finish what you were saying?

Gary: Oh, no. I’ve finished.

Host: Okay, brilliant. Well, thank you once again. If you’d like to show your appreciation with another round of applause …

End of transcript
