Adam Pridgen discusses his research at DFRWS EU 2017.
Pridgen: Thank you, everybody, for being here. I know that I stand between you and lunch. I’m here presenting research on behalf of myself and my colleagues Dan and Simson. So let’s get started.
As you all know, Java uses automatic memory management. Automatic memory management means that the developers no longer have control over the memory that they’re allocating or deallocating. A lot of this happens behind the scenes, using the garbage collector. Java uses a generational garbage collector, but other managed runtimes use different types of garbage collectors. What this really means is that data cannot be explicitly destroyed, so if we have some type of application in which we want to protect sensitive data, we can’t do it.
But that works out pretty good for us. We’ll talk about that in a little bit.
The other problem is when we’re using generational garbage collection, we’re creating multiple copies as we move data from one generation to the next generation.
Our motivation behind this research has several prongs to it. First of all, we’ve noticed that malware authors are starting to look at managed runtimes as a mechanism for writing their malware and deploying it to multiple systems. The reason for this is that the managed runtime gives the developer the ability to write it one time and deploy it to multiple platforms, like Mac, Linux, or even Windows. The other problem is that a lot of the internet applications we see today, such as Apache Struts, are written in managed runtimes like Java. This gives the attacker the advantage if they understand how to write Java applications, because they can live off the land and exploit the fact that the runtime is something that they can manage and manipulate.
And the other thing that we really wanted to take part in, based off our previous research, is that these managed runtimes retain a number of different artifacts: processes, threads, passwords, sites that were visited, commands that were run. So that’s really the heart of what our research focuses on.
So from this, we have several research questions that we posed. First of all, can we exploit this metadata inside the garbage collector memory to increase the efficiency of our investigations, especially when we start investigating server-side type of applications? Next, can we create a viable timeline based off of the garbage collection’s allocation strategy? And then, finally, what kind of information can we obtain from these artifacts, and can it help us make better decisions and take more actionable approaches to the way we do our response. Then, finally – this is something that we explore [maybe] for future work – can this approach or these methodologies be applied to runtimes like JavaScript or .NET.
The rest of the presentation is laid out in the following manner. I give an overview of managed runtimes, basically the garbage collection process, or generational garbage collection. Talk a little bit about our approach. Give the evaluation of our approach, and then conclude the presentation.
So, generational garbage collection can be thought of as a multi-partition memory structure. The heap in the generational garbage collection is engineered based off object lifetime. In general, objects live and die very quickly, and then, the objects that do remain for a long time are expected to stay in the heap for a long … well, [03:51] periods of time.
Just to break this slide down – we have two main generations. We have the tenured generation. This is considered the permanent generation. This is where objects live pretty much forever, until a full garbage collection takes place. And in the young generation, we have two main memory structures – we have the survivor space and then we have the Eden space. The Eden space is partitioned for thread local allocation, with thread local allocation buffers, and this is where objects are allocated by each thread, independently, so they don’t cause a deadlock in the overall application runtime. So basically, it enables them to bump the pointer and allocate memory as they need to meet the application’s demand. And then, after a garbage collection cycle, the objects are promoted into the survivor space. Survivor space uses copy collection.
Copy collection means that any time garbage collection happens here in the survivor space, the object is just copied to the next survivor space, and so on and so forth, until the object is actually promoted up into the tenured space.
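[Editor's sketch] The bump-pointer allocation described above can be illustrated with a toy model. This is a minimal sketch, not HotSpot's implementation: the `Tlab` class, the 64-byte Eden, and the two hypothetical threads are assumptions made for illustration only.

```python
class Tlab:
    """Toy thread-local allocation buffer: a slice of a shared Eden bytearray."""

    def __init__(self, eden: bytearray, start: int, size: int):
        self.eden = eden
        self.top = start          # next free byte (the "bump" pointer)
        self.end = start + size   # first byte past this buffer

    def allocate(self, payload: bytes):
        """Return the offset of the new object, or None if the TLAB is full."""
        if self.top + len(payload) > self.end:
            return None           # a real runtime would get a new TLAB or trigger GC
        addr = self.top
        self.eden[addr:addr + len(payload)] = payload
        self.top += len(payload)  # bump the pointer; no shared lock is involved
        return addr


eden = bytearray(64)
tlab_a = Tlab(eden, 0, 32)    # owned by hypothetical thread A
tlab_b = Tlab(eden, 32, 32)   # owned by hypothetical thread B

first = tlab_a.allocate(b"alpha")
second = tlab_a.allocate(b"beta")
other = tlab_b.allocate(b"gamma")
```

Because each thread only advances its own pointer, consecutive allocations by one thread land at increasing addresses, which is the property the timelining discussion later relies on.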
This is the distinction between unmanaged and managed memory. Unmanaged memory works in the following manner. We do an allocation for our data object, and that allocation is typically done with malloc or new, using C or C++, and then we perform some type of data mutation on the structure of this data block. Then, once we’re done with it, we choose to do some type of sanitization and deallocate it, or, if we’re not in the security mindset, we just deallocate it. Note that the memory doesn’t ever change its location, so the object will live and die in the same place, and it makes it more likely that this data can actually be overwritten.
There are also certain constructs within the operating system such that, when the memory is deallocated, the operating system will automatically sanitize it. And this is from our previous research.
What does it look like for a managed runtime? I know this slide looks a little noisy, but it makes a bit more sense once I start walking through it. First of all, in our Eden space right here, [in the TLAB], we have an allocation that takes place. Let’s say we’re allocating a [long] [06:07] [string or a password], for instance. So we allocate the object, the string, we assign the password to the string, and then all of a sudden, the garbage collection takes place. When that garbage collection takes place, our string is moved up into the survivor space, and that latent data from the string remains in the TLAB. So this means that we have two copies – two potential copies or two potential opportunities – to recover that data.
Now, another garbage collection takes place because a large allocation occurred. Now that object is copied into the tenured space, and we have this latent object that exists in the survivor space, and we also have a latent object existing in the TLAB space. So this is where our multiple copies come from.
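[Editor's sketch] The walk-through above, in which each collection copies the live object and leaves the old bytes behind, can be mimicked with a flat byte array. The space offsets, the helper names, and the sample secret are all hypothetical; a real collector also updates references and reuses space, which this sketch ignores.

```python
def copy_object(heap: bytearray, src: int, length: int, dst: int) -> int:
    """Copy an object to a new space; the source bytes are left untouched."""
    heap[dst:dst + length] = heap[src:src + length]
    return dst


def count_copies(heap: bytearray, needle: bytes) -> int:
    """Count non-overlapping occurrences of needle in the raw heap bytes."""
    return bytes(heap).count(needle)


# Hypothetical layout: Eden at 0..63, survivor at 64..127, tenured at 128..191.
EDEN, SURVIVOR, TENURED = 0, 64, 128
heap = bytearray(192)

secret = b"hunter2"
heap[EDEN:EDEN + len(secret)] = secret                   # allocation in the TLAB
copies_after_alloc = count_copies(heap, secret)

live = copy_object(heap, EDEN, len(secret), SURVIVOR)    # first collection
copies_after_minor = count_copies(heap, secret)

live = copy_object(heap, live, len(secret), TENURED)     # promotion
copies_after_promotion = count_copies(heap, secret)
```

Only one of the three copies is reachable by the program at the end, but all three are present in a memory capture until something overwrites them.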
Under some circumstances, and generally in the Eden space and the survivor space, data can actually be overwritten. And this is kind of the security mechanism for taking care of sensitive data. But this is bad for people who are security-minded but great for people who want to do forensics on it.
So what exactly does an object look like in the heap? Here we have a string array of 2. We’ve got this first reference right here to a string, and it points to our string object. In our string object, this first little double word is just basically a header for the OOP, and then the second defines the type, so this tells us that this is a string, and then this third field tells us that we’re pointing to a char array, and this char array says it’s [07:36] 5 [07:38]. And then this designates that this is the char type, and then here’s the character buffer that makes up our string.
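[Editor's sketch] The header, type, and character-buffer chain described above can be followed with a few `struct` reads. The layout here is a deliberate simplification and an assumption: three 32-bit little-endian words per object header, a char array laid out as header, type, length, then UTF-16 data, and made-up addresses for the types. Real JVM layouts differ by version and carry more fields.

```python
import struct

STRING_KLASS = 0x1000      # hypothetical address of the String type
CHAR_ARRAY_KLASS = 0x2000  # hypothetical address of the char[] type
BASE = 0x8000              # hypothetical address where this heap slice is mapped


def build_heap(text: str) -> bytes:
    """Lay out a String object followed by its char[] as 32-bit words."""
    chars = text.encode("utf-16-le")
    array_addr = BASE + 12                      # char[] follows the 12-byte String
    string_obj = struct.pack("<III", 0x1, STRING_KLASS, array_addr)
    array_obj = struct.pack("<III", 0x1, CHAR_ARRAY_KLASS, len(chars) // 2) + chars
    return string_obj + array_obj


def read_string(heap: bytes, addr: int) -> str:
    """Follow header -> type -> char[] reference and decode the buffer."""
    off = addr - BASE
    _mark, klass, value_ptr = struct.unpack_from("<III", heap, off)
    if klass != STRING_KLASS:
        raise ValueError("not a String object")
    arr = value_ptr - BASE
    _mark, arr_klass, length = struct.unpack_from("<III", heap, arr)
    if arr_klass != CHAR_ARRAY_KLASS:
        raise ValueError("value field does not point to a char[]")
    return heap[arr + 12:arr + 12 + 2 * length].decode("utf-16-le")


heap = build_heap("hello")
recovered = read_string(heap, BASE)
```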
If the developer creates a new string and then reassigns it to our string array, what happens is this reference is updated with our new string. In this case, it’s moved from [mash] to [group], but what happens is this string is still retained in the heap, full structure and everything. And it won’t be overwritten until garbage collection takes place and the data is reallocated to a new object. So [08:10] take advantage of this and recover these artifacts to do forensics and [do timelining and event recreation].
So we created this framework called RecOOP that is generally focused on the [08:23] but can be re-targeted for .NET with additional research. In the first steps, we capture the memory. Basically, we do a [de-selector], we dump [08:34] in something like one, or some other memory capture facility. Then we reconstruct the process using [08:40]. Reconstruction of processes takes all of the physical memory pages and reorganizes them into a process-oriented layout, using the virtual memory [08:50], and then we extract our loaded types from the [08:54] of the JVM, we locate all the managed memory and then enumerate all the objects for reconstruction and timelining.
The overview of RecOOP goes something like this. We have an overlay layer that allows us to do overlays on top of the memory blocks and recover the pertinent structures for reading and writing to … well, mostly reading the data from the memory. Then we have our type management and our reconstruction layer that allows us to extract out of the OOPs and then reconstruct the Java object into a Python object. And then we have an interface that you could repurpose for another managed runtime.
In this case, we’re actually only focusing on x86 32-bit for Windows and Linux. This would probably work on Mac, just with a little bit of rejiggering for the overlays. But for the most part, it’s [09:50], so which is [09:52] with the overlays.
The first step in our process, the first real step in our process, is extracting out the types from the system dictionary. [10:02] [frontlines] have some way of doing advanced reflection or understanding the types they’re loading and executing against. In the JVM, this is the system dictionary. This retains all the classes that are … string, class, socket, things of that nature. Then we have a symbol table, and the symbol table includes all the strings for all the different methods, fields, and classes that we would encounter. The symbol table is where we find all the constants and long-lived objects.
Our approach is to extract this, mine the structures … or mine the information we find here for the loaded data structures, and then move forward to reconstruct everything. To identify all these structures, what we do is we scan the process memory for the JVM. We look for invariant fields that we would typically find in the system dictionary, symbol table, and string table.
We attempt to parse the structure as one of those elements or one of those data structures. And then, if we get back [the same] results, then we move on to the next step. We also use a number of constraints that keep the parsing from going out of control. For instance, if we read memory and let’s say the first value looks correct but the next value is something outlandish, it tells us that we need to read 2000 elements or 3000 elements, we would consider that out of bounds, and we would stop with that parsing and move on to the next candidate.
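[Editor's sketch] The constrained parsing described above, where a candidate is abandoned as soon as a field looks outlandish, might look roughly like this. The table format (a constant marker word, an entry count, then that many 32-bit entries), the marker value, and the bound of 1024 entries are hypothetical and are not RecOOP's actual structures or limits.

```python
import struct

INVARIANT = 0x0000000C   # hypothetical constant field expected at the start of a table
MAX_ENTRIES = 1024       # hypothetical sanity bound on the entry count


def try_parse_table(mem: bytes, off: int):
    """Return the list of entries if a plausible table starts at off, else None."""
    if off + 8 > len(mem):
        return None
    invariant, count = struct.unpack_from("<II", mem, off)
    if invariant != INVARIANT:
        return None
    if count == 0 or count > MAX_ENTRIES:
        return None                       # outlandish count: abandon this candidate
    end = off + 8 + 4 * count
    if end > len(mem):
        return None                       # table would run past the captured memory
    return list(struct.unpack_from("<%dI" % count, mem, off + 8))


def scan(mem: bytes):
    """Try every 4-byte-aligned offset and collect the candidates that parse."""
    hits = []
    for off in range(0, len(mem) - 7, 4):
        entries = try_parse_table(mem, off)
        if entries is not None:
            hits.append((off, entries))
    return hits


bogus = struct.pack("<II", INVARIANT, 3000)    # right marker, absurd count
valid = struct.pack("<II", INVARIANT, 2) + struct.pack("<II", 0xA1, 0xB2)
memory = bogus + b"\x00" * 8 + valid
found = scan(memory)
```

The first candidate has the right marker but claims 3000 entries, so the scan drops it and moves on, which is the behaviour the talk describes.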
Once we’ve identified all of our different types, once we’ve identified all our symbols, the next step is identifying where the managed memory is within the JVM process. The JVM process … well, the JVM itself manages its own memory with its own memory system, so there’s not just this managed memory segment, there’s also a segment for the [11:52] and there’s also elements for the underlying resource structures and the actual JVM itself. Our goal is to actually find that place in the memory where the data objects for the JVM actually exist.
To do this, there are two approaches that could be taken. The easiest approach that we use is scanning the JVM process for logs. These logs are produced every time garbage collection takes place, and they tell us exactly where the different spaces are. So the Eden space, which is where all the short-lived objects exist, would exist in 0xa480000 whatever up to 256 bytes, or 256 megabytes of space. From space, similarly, and then the permanent space.
And we use this information to actually identify the segments of the managed memory. But then, we can also use the type information that we recovered from the system dictionary. What’s interesting about Java – and if you remember from that previous slide where I talked about the memory layout? Every object has a pointer telling … a pointer [13:06] system, or a pointer to the type that it’s actually implementing. And this can give us a lot of information as far as, okay, whether or not this is an OOP or this is just some random bit of data.
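[Editor's sketch] Scanning a memory image for garbage collection log text can be done with a byte-level regular expression. The sample lines below are modeled on the heap summary a HotSpot serial collector prints, but the exact format varies by JVM version and collector, and the addresses and sizes are invented for illustration, not taken from the talk.

```python
import re

# One heap-summary line per space: name, size, percent used, then three addresses.
SPACE_RE = re.compile(
    rb"(eden|from|to|the)\s+space\s+(\d+)K,\s+(\d+)% used\s+"
    rb"\[(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+),\s*(0x[0-9a-fA-F]+)\)"
)


def find_spaces(mem: bytes):
    """Return a dict of space name -> (bottom, top, end) parsed from log text in mem."""
    spaces = {}
    for m in SPACE_RE.finditer(mem):
        name = m.group(1).decode("ascii")
        bottom, top, end = (int(g, 16) for g in m.group(4, 5, 6))
        if bottom <= top <= end:              # discard corrupted or partial lines
            spaces[name] = (bottom, top, end)
    return spaces


# Illustrative sample only: log text embedded between unrelated bytes.
sample = (
    b"\x00\x17junk"
    b"  eden space 4416K,  12% used [0xa4800000, 0xa4884000, 0xa4c50000)\n"
    b"  from space 512K,   0% used [0xa4c50000, 0xa4c50000, 0xa4cd0000)\n"
    b"  to   space 512K,   0% used [0xa4cd0000, 0xa4cd0000, 0xa4d50000)\n"
    b"\xff\xfe"
)
spaces = find_spaces(sample)
```

When several log entries survive in memory, later matches overwrite earlier ones in this sketch; an analyst may prefer to keep them all and compare.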
So what we can do is we can scan these managed memory segments for high distributions of type pointers, and then also look for a range of unique pointers. And this gives us a good layout, especially for the Eden space and for the tenured space. The reason why the survivor space has such an awful distribution is mainly because it’s just … it’s [copy] collection. It’s just kind of like this intermediate space where stuff gets spread out over each garbage collection.
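[Editor's sketch] The type-pointer distribution idea can be approximated by counting, per page, how many 32-bit words match a known type address and how many distinct types appear. The page size, the thresholds, and the three type addresses below are assumptions for illustration; the talk does not give the actual values used.

```python
import struct

PAGE = 4096


def page_stats(mem: bytes, base: int, klass_addrs: set):
    """Per page, return (page_address, hit_density, unique_type_count)."""
    stats = []
    for start in range(0, len(mem), PAGE):
        page = mem[start:start + PAGE]
        nwords = len(page) // 4
        words = struct.unpack_from("<%dI" % nwords, page)
        hits = [w for w in words if w in klass_addrs]
        density = len(hits) / nwords if nwords else 0.0
        stats.append((base + start, density, len(set(hits))))
    return stats


def likely_managed(stats, min_density=0.02, min_unique=2):
    """Keep pages dense in type pointers that reference several distinct types."""
    return [addr for addr, density, unique in stats
            if density >= min_density and unique >= min_unique]


klasses = {0x1000, 0x2000, 0x3000}              # hypothetical recovered type addresses
order = sorted(klasses)
heap_page = b"".join(struct.pack("<IIII", 1, order[i % 3], 0, 0) for i in range(256))
noise_page = struct.pack("<I", 0xDEADBEEF) * 1024
stats = page_stats(heap_page + noise_page, 0x10000000, klasses)
managed = likely_managed(stats)
```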
Once we’ve identified our managed memory, the next step is to scan for the managed objects and recover them. When we’re scanning for the managed objects, what we typically do is look for that type pointer within the … or a type pointer pointing to the [type parse]. And then we parse the next few bytes, and the last … [then work forward] to figure out whether or not this is a valid object and this is something we can recover data from. And notice, since we have the type pointer for the type of object we want to extract, we can also look at the different fields for the Java class. So we’re not looking at just the type pointer, but we’re also looking at whether the fields match: for a type such as the string, which has a character buffer, does that next object have a character buffer, and does that character buffer line up with what we would expect as a value for the buffer size.
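[Editor's sketch] The cross-check described above, where a type pointer alone is not trusted until the referenced character buffer also lines up, might be sketched as follows. The 32-bit layout, the type addresses, the mapping base, and the length bound are hypothetical simplifications, not RecOOP's actual checks.

```python
import struct

STRING_KLASS, CHAR_KLASS = 0x1000, 0x2000   # hypothetical type addresses
BASE = 0x8000                               # hypothetical mapping address
MAX_CHARS = 65535                           # hypothetical sanity bound


def valid_string_at(heap: bytes, off: int):
    """Return the decoded string if a consistent String object starts at off, else None."""
    if off < 0 or off + 12 > len(heap):
        return None
    _mark, klass, value_ptr = struct.unpack_from("<III", heap, off)
    if klass != STRING_KLASS:
        return None
    arr = value_ptr - BASE
    if arr < 0 or arr % 4 or arr + 12 > len(heap):
        return None                          # reference leaves the captured region
    _mark, arr_klass, length = struct.unpack_from("<III", heap, arr)
    if arr_klass != CHAR_KLASS or length > MAX_CHARS:
        return None                          # referenced object is not a plausible char[]
    end = arr + 12 + 2 * length
    if end > len(heap):
        return None                          # buffer size does not line up
    return heap[arr + 12:end].decode("utf-16-le", errors="replace")


def scan_strings(heap: bytes):
    """Check every 4-byte-aligned offset and return (address, text) for valid objects."""
    found = []
    for off in range(0, len(heap) - 11, 4):
        text = valid_string_at(heap, off)
        if text is not None:
            found.append((BASE + off, text))
    return found


chars = "root".encode("utf-16-le")
decoy = struct.pack("<III", 1, STRING_KLASS, 0xDEAD0000)   # dangling reference
real = struct.pack("<III", 1, STRING_KLASS, BASE + 24)     # points at the char[]
array = struct.pack("<III", 1, CHAR_KLASS, 4) + chars
heap = decoy + real + array
strings = scan_strings(heap)
```

The decoy carries the right type pointer but its reference does not resolve, so only the second candidate is reported.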
In general, when we’re looking for artifacts using this approach, we focus on looking for Java threads, sockets, and mostly files that we wouldn’t be able to recover off of disk. So our general approach in this regard is – let’s say that the attacker has managed to upload their code and then delete it from disk. So how do we actually recover the Java malware that was executing, that we can’t recover from the disk itself? This allows us to do some more detailed analysis against those, I guess, [file-less] malware attacks.
In the case of JAR entries and JAR files, we can actually recover these by looking for the objects [or … the objects and types] that implement the JAR entries and the JAR files. And when we start talking about looking at sockets, we can also start looking at buffer data that could be recovered from the various buffers that implement the socket itself. And then, we can also recover information related to processes.
For our evaluation of our framework, we didn’t just use off-the-shelf malware like [adamant] or another Java backdoor. We actually wrote an implant exhibiting the type of behavior we would see with Java malware, and then we ran that against a [16:05] we think we would see. So basically, mimicking … somebody gains access to a system, running through, trying to figure out what access does that particular attacker have within the local network, can they proxy traffic in and out, can they do things like file manipulation, and can they create and run an operating system [16:26].
The next step, beyond that, to check our progress throughout the script, we actually take memory snapshots to ensure that … first of all, to capture the information that we think we can … or capture the information that’s produced and determine whether or not we can actually map that to that [16:45] script.
In this image, we just focus on the five key elements found in a malware sample – we want to load code, load Java [software] from over the network … Java [software] in this case would be [17:01] transferring a live Java class over the network, loading it, and executing it. Exfiltrating data, executing operating system commands, modifying files, and then also trying to proxy traffic within the local network.
In our first experiment, we just wanted to see what information we could recover and how it compared against other tools. With Volatility, one of the core problems with using this in a JVM context is you will lose a lot of artifacts over time, because the operating system will reallocate memory that’s been deallocated after the process dies, for instance.
In this case, when our JVM keeps running, the socket data is retained, and it also keeps it in an ordered fashion in the heap, so we can actually assign time values to figure out how long … or figure out in what order things took place. In this case, we have proxied traffic and we’re actually able to recover the buffered socket data from the attacker, and then … the inside compromised host and the outside attacker. This allows us to get more context from what has happened in general.
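[Editor's sketch] If objects in one space are laid out in allocation order, sorting recovered artifacts by address gives a relative ordering of events. The sketch below assumes exactly that and nothing more: it yields ranks, not wall-clock times, and it only compares objects found inside the same space. The `Recovered` record, the addresses, and the descriptions are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Recovered:
    address: int      # where the object was found in the heap
    kind: str         # e.g. "socket-buffer" or "process"
    summary: str      # analyst-facing description


def relative_timeline(objects, space_start: int, space_end: int):
    """Order objects found inside one space by address and number them."""
    in_space = [o for o in objects if space_start <= o.address < space_end]
    in_space.sort(key=lambda o: o.address)
    return [(rank, o.kind, o.summary) for rank, o in enumerate(in_space, start=1)]


# Hypothetical findings from a single space plus one object from elsewhere.
findings = [
    Recovered(0xA4800300, "socket-buffer", "reply from inside host"),
    Recovered(0xA4800100, "socket-buffer", "request from outside attacker"),
    Recovered(0xB0000040, "process", "object in another space"),
    Recovered(0xA4800200, "process", "command launched"),
]
timeline = relative_timeline(findings, 0xA4800000, 0xA4C50000)
```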
Another thing we wanted to look at was: what if the JVM is used to execute operating system processes, what information can we gather from that? The general approach to starting a process in the JVM is the ProcessBuilder class. The ProcessBuilder takes [a series … a string command] and translates that into a fork command on Linux and then executes that within the environment. So at times, you might be able to recover the string command, the ProcessBuilder, but it’s not like … well, what we can recover is the PID for the process that we were looking at and latent data within the buffer itself. In this case, our attacker executed cat [18:54] password, uname, group, and then also catted the shadow file, and … oh, that’s [19:04]. Catted the shadow file, catted the password file, ran nmap, ran history and grep for what the history … or grep for anything that’s taken place, did ifconfig, and then another set of commands.
This gives us some context around what has happened and what the attacker did when they were on the box. But it also gives us the PID under which these processes executed. So we went back and used Volatility to see if we could recover the process information. And we found that Volatility actually would not list these particular processes. So this kind of gives us an additional leg up or an additional tool in our forensics toolbox to help us with that analysis.
Another thing we look at is – generally, when we’re analyzing malware, the malware itself is obfuscated to the point where it has a number of random functions that don’t do anything. A lot of times, those functions aren’t [20:03]. So we can actually exploit information from within the JVM such as method calls and other elements along those lines … sorry, method [call times] to determine whether or not a method was called. And we can also look at how many times it was called. For instance, for [maintenance start], it was executed one time, so we know that this is kind of a relevant function, because it was used once. But we don’t know how relevant it is, but we can go back and look at the code that we recovered from memory later. Another thing we can look at is [send data] … or look at how many times it was called.
These were called a pretty significant number of times, so these might actually be important. So this would help us determine what we should be analyzing and what we should look at when we start moving forward with reverse engineering, the [20:52] that we recover from memory.
The one last point that I want to make about our research is the longevity of objects created, or the longevity of process objects created in Java. This is kind [what we encountered through our] intuition. We knew that with garbage collection, we would see a fall-off of objects, but with the actual process objects, they were retained through our entire experiment. The other thing that’s fairly interesting is the output buffer of our processes. If you notice, it kind of goes up and then it remains constant. And this runs counter to what happens when a garbage collection takes place at these two points.
So this is where the garbage collection takes place, and we know this because the process builder, the commands that we were recovering, were wiped out, and we couldn’t recover them after this point – for experiments [t25] and less. However, we could still recover data buffers from commands that were executed from around [t20] and onward.
What I’ve shown with our research is that memory analysis at higher levels in the managed runtime can help us understand what threat actors do when they are actually using managed runtime tools, which is kind of that step up, moving a step above Volatility. We’ve shown that given the memory allocation strategy of garbage collectors, the timelining effort is actually much easier than it would be when you’re looking at raw native objects.
And then finally, I’ve kind of given a generalized method for analyzing future garbage collected managed runtimes, because with .NET and Javas- … or Google V8 JavaScript, they have similar types of garbage collectors, and these properties can actually be leveraged to perform investigations or malware analysis against these particular runtimes. And with that, I’ll open it up to questions. Thanks.
[applause]
Audience member: Thank you, for such a presentation. So basically, what I see here is the garbage collection is the one which is a problem, and also, the operating system allocates the memory for something else. So can we introduce a [23:15] where we don’t let the program execute completely? Before the garbage collection we can just abnormally terminate the program where all the objects are loaded, and we can just take the memory dump, and it’s much easier to analyze. Have you thought about this?
Pridgen: I haven’t thought about that, but yeah, it’s within the realm of possibility. Some of my previous research focused on measuring the retention of [TLSPs] inside the JVM process. And in order to do that, I had to modify … I had modified the JVM to actually get rid of the data within the heap or overwrite the sensitive data. So it’s within the realm of possibility that someone could actually introduce a [byte code] or introduce some initial functionality and crash the … yeah.
Audience member: [laughs] So then it will be much easier. Thank you.
Host: Any more questions? From the back maybe? Okay. Well, I’d like to ask you one question. When you recover objects from the heap, clearly some of them could be partially overwritten. So how does RecOOP recover, or how does it deal with objects which are not complete or, say, references of which may not be there?
Pridgen: In order to do [24:34] that are not there, we simply ignore them. Because the point [24:39] once we attempt to parse it. For objects that are partially overwritten – so like the character buffer – there isn’t a lot we can do. The only thing we can do is interpret the character buffer in the string as it is, and then leave it up to the analyst to determine whether or not it’s a valid string. So a case in point – we can do a logic check. Strings aren’t supposed to exceed 65,535 characters or so. So we can look at those buffers, and if they exceed that length, then we can say this is not a [valid character]. Or this is not a valid object. And we could point that out, but we don’t do anything to that level of analysis. We leave it up to the person doing the investigation.
Host: Okay. But it is possible for RecOOP to produce those partially damaged objects, so they’re not totally lost.
Pridgen: Yes. Right. That’s kind of the risk that we run. Yeah.
Host: Okay. Thank you very much, Adam, again. Thank you.
[applause]
End of Transcript