Matthew: Welcome to our presentation. I’m Matthew Piscitelli.
Tyler: And I’m Tyler Thomas.
Matthew: And we performed memory forensics on USB attack platforms. This work was supported by National Science Foundation Grant number 1921813. We looked at two USB attack platforms, Hak5’s USB Rubber Ducky, and Bash Bunny. We carried out attacks using these devices and looked for artifacts present in memory.
We found Windows diagnostic event logs, as well as networking events, that contained information such as unique device identifiers, timestamps for device usage and IP addresses. We created a series of Volatility plugins to extract this information from memory and feed it into a visualization framework and time series analysis tool.
Tyler: So I’ll begin by giving a brief overview of what a USB attack platform is and why someone would want to use it. So generally speaking, the devices we looked at are standalone USB devices that are capable of spoofing multiple logical peripherals, be it a keyboard (a human interface device), an RNDIS Ethernet gadget or a mass storage device.
So in a standard configuration, these devices send keystrokes rapidly to the victim machine to inject PowerShell payloads, open a terminal, or send in whatever commands they need to send in. A device might have a listener creating a socket on its RNDIS Ethernet gadget so that it can send data over a network interface that it creates with its own DHCP server, or it could also have the ability to transfer files back and forth with the host as a standard USB storage device.
Generally these would be used by low-skill attackers because these are commercially available products that are not that inconspicuous. And generally the goal would be to conduct reconnaissance, exfiltrate data and to deliver and execute payloads on the victim machine.
And here is a very cartoonish threat model that we’ve created to illustrate this. We have our one physical USB device that generates three, more or less, logical devices: it sends keystrokes, transfers files, and creates network connections. So obviously this is a problem that people have run into before. How do we detect USB devices that have been used on computers?
So past work has looked at trying to identify malicious USB attack platforms by reading in the keyboard buffer and determining the speed, rate and pattern at which keystrokes are being sent in, and comparing that with typical human typing patterns using machine learning algorithms and AI models.
So this anomaly-based detection approach is successful, but likewise, if you can build an AI to learn how a human types, you can build an AI to send keystrokes as if it were human. So further attempts have been made to do a deep introspection of USB packets, the metadata and the Windows event logs to detect these devices, and those proved to be successful.
However, we’re trying to look at it from a different perspective. We are trying to look at latent in-memory artifacts on the host system that are independent of the actions of the attacker. So if the attacker clears the event logs or changes the registry, it doesn’t matter. If they’re able to defeat all of these fancy AI and packet-timing-based models, we should still be able to see what they’re doing because these artifacts will be there independent of their actions.
Matthew: So we carried out a series of open-source attacks using the two USB attack platforms. Hak5 publishes many different payloads for these devices on their hub that perform different actions, such as exfiltrating data or instantiating a reverse shell.
So for the USB Rubber Ducky we carried out an attack to exfiltrate files, and on the Bash Bunny, which has an RNDIS Ethernet gadget, we carried out an attack to create a reverse shell connection between the host machine and the Bash Bunny by opening sockets. And all these attacks were carried out on the hardware and software versions shown below. We did it all in Windows 10 on modern equipment.
So initially we carried out these attacks and performed actions such as plugging in and unplugging the attack platforms, as well as changing the device modes, such as arming mode. And we collected a small set of memory images throughout this process and explored them manually using Unix utilities like strings and grep, as well as pattern-matching utilities such as YARA.
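The manual search described here can be approximated in a few lines of Python. This is only a minimal sketch, not the tooling used in the talk: it assumes the artifact of interest is an ASCII USB hardware ID of the common Windows form `VID_xxxx&PID_xxxx`, and the function names `find_usb_ids` and `scan_image` are hypothetical. Windows often stores strings as UTF-16LE, which this sketch would miss.

```python
import re

# Assumption: ASCII hardware IDs of the form VID_xxxx&PID_xxxx.
# UTF-16LE copies of the same string would need a second pattern.
USB_ID_RE = re.compile(rb"VID_([0-9A-Fa-f]{4})&PID_([0-9A-Fa-f]{4})")


def find_usb_ids(data: bytes):
    """Return sorted unique (vendor_id, product_id) pairs found in a buffer."""
    hits = {
        (m.group(1).decode().upper(), m.group(2).decode().upper())
        for m in USB_ID_RE.finditer(data)
    }
    return sorted(hits)


def scan_image(path, chunk_size=1 << 20, overlap=64):
    """Scan a raw memory image chunk by chunk.

    A small overlap is carried between chunks so that a match straddling
    a chunk boundary is not lost.
    """
    found = set()
    tail = b""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            found.update(find_usb_ids(tail + chunk))
            tail = chunk[-overlap:]
    return sorted(found)
```

Calling `scan_image` on an image path (any path you supply) returns the distinct vendor and product ID pairs seen anywhere in the file, which is roughly what piping `strings` into `grep` gives you.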
And during this process, we discovered a considerable number of memory artifacts, such as Windows diagnostic events in the form of JSON blobs, that contained information such as unique identifiers. We also discovered network DHCP diagnostics that gave us information such as when a device was probing for an IP address, when it received an IP address, and when Windows correlated an IP address with a specific interface, such as the Ethernet interface created by the Bash Bunny.
We also discovered some of the payloads that we carried out in memory, because they essentially boil down to being straight PowerShell payloads. We found them in memory as part of the Windows logging, and we also found the PowerShell embedded in the proprietary scripting languages used by the two devices, meaning that the scripts entered memory before they were executed.
Next we created a series of Volatility plugins, which we’ll talk more about later, to essentially just extract out all these different artifacts in a standardized format so that we can feed them into the visualization framework that we created as part of a prior paper, ‘Memory Foreshadow’.
So we devised an experiment to take a series of memory dumps, one every few minutes over a period of 24 hours, automatically have our Volatility plugins run on each of these images and extract the artifacts, and then feed that into our time series analysis to determine how these artifacts persist in memory over time, even after the attacks are all carried out and the device has been removed from the USB port.
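As a rough illustration of the time series step, the sketch below takes one record per memory dump and reports when each artifact was first and last seen. The input shape (minutes since the start of the experiment plus a set of artifact labels) and the name `persistence` are assumptions made for this example; the actual framework's data model is not described in the talk.

```python
def persistence(snapshots):
    """Map each artifact to (first_seen, last_seen) in minutes.

    snapshots: iterable of (minutes_since_start, artifacts_found) pairs,
    one per memory dump. The shape is hypothetical.
    """
    first_seen, last_seen = {}, {}
    # Sort by time only, so the artifact sets are never compared.
    for minutes, artifacts in sorted(snapshots, key=lambda s: s[0]):
        for artifact in artifacts:
            first_seen.setdefault(artifact, minutes)
            last_seen[artifact] = minutes
    return {a: (first_seen[a], last_seen[a]) for a in first_seen}
```

An artifact whose last-seen time equals the time of the final dump survived the whole experiment; one whose last-seen time is early was overwritten or freed along the way.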
We discovered as a result of this that some artifacts remain in memory for long periods of time, in fact for the entire duration of the experiment. These include some of those Windows-level diagnostic events that are held in system processes that wouldn’t be killed under normal conditions, such as svchost and netsh.
Tyler: So this table gives an overview of some of the artifacts that we could use as indicators of compromise. These are all things that we presumed would be in memory in some shape or form and we can be taken as you are [indistinct] to detect the usage of these two devices.
The first row in the table is the reverse shell payload executed by the Bash Bunny. And you’ll notice the Q STRING in front of it, which is the proprietary scripting language. And we attempted to reboot the system and re-execute the payload, and it was still found.
So for whatever reason, the entire payload in its original format is in memory on the host system. It is not just interpreting it locally on the Bash Bunny device. It was actually transferring over the full proprietary scripting language file and interpreting it there.
Next we have two rows with the USB IDs of the Rubber Ducky and Bash Bunny respectively. You know, every single USB product has a vendor ID and a product ID. The vendor ID is given to it to represent the manufacturer, be it Apple or whoever. And the product ID represents the product within that manufacturer’s line of products.
So the Rubber Ducky’s vendor ID, 05AC, is actually Apple. So it is impersonating an Apple keyboard: 0220 is a specific Apple keyboard. So the Rubber Ducky is actually trying to hide what it is, which is unique in this case because the Bash Bunny does not. It provides F000 as the vendor ID, which is not a real vendor, and FF03 as the product ID, which is also not a real product. So that in and of itself is extremely suspicious, if the vendor ID and product ID are not in the publicly available USB ID database.
Strangely enough, even though the Rubber Ducky goes so far as to pretend to be an Apple keyboard, when it provides its device ID and hardware identifier to the operating system, the string ‘ducky’ is in clear text in both. So it’s inconsistent in its attempt to hide its identity, and these things are where you would want to look. If you were trying to detect a more advanced or sophisticated USB attack platform, you would deeply investigate the device ID, the hardware ID and the USB IDs and see if they match up.
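The two checks described here, whether the USB ID exists in the public database and whether the device’s own strings agree with the vendor it claims, could be sketched as follows. This is an illustrative sketch only: `KNOWN_USB_IDS` is a one-entry stand-in for the real USB ID list, `check_device` is a hypothetical helper, and the "strings must mention the vendor" rule is a simplifying assumption rather than anything the plugins are stated to do.

```python
# Stand-in for the public USB ID database; a real check would load the
# full list. The single entry reflects the IDs mentioned in the talk.
KNOWN_USB_IDS = {
    ("05AC", "0220"): ("Apple", "keyboard"),
}


def check_device(vendor_id, product_id, device_strings):
    """Return a list of reasons why a device looks suspicious.

    device_strings: identifier strings the device supplied to the host
    (device ID, hardware ID, manufacturer, model). Hypothetical input.
    """
    reasons = []
    entry = KNOWN_USB_IDS.get((vendor_id.upper(), product_id.upper()))
    if entry is None:
        reasons.append("vendor/product ID not in USB ID database")
    else:
        vendor = entry[0]
        # Assumed heuristic: a genuine device names its vendor somewhere.
        if not any(vendor.lower() in s.lower() for s in device_strings):
            reasons.append(
                f"claims vendor {vendor} but identifier strings do not mention it"
            )
    return reasons
```

With this sketch, the Bash Bunny’s F000/FF03 pair is flagged by the first rule, and a device that claims 05AC/0220 while its strings only say "ducky" is flagged by the second.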
So Matt mentioned earlier that we were able to find Windows diagnostic events, and we believe these to be telemetry events that are being sent back to Microsoft, so that Microsoft knows what kind of USB devices are being plugged into Windows computers. That way it knows what drivers it has to support in the future, and what vendors and products are the most popular, for analytics and whatnot.
So we found two that were of interest. The first is DeviceConfig, which happens when a device is plugged in for the first time. Within this JSON object we have a timestamp of event creation; the ID of the user, which is important because we can know who the device was plugged in under; the device driver and the driver installation date; and the device vendor and product ID, which is the USB ID as we mentioned earlier.
InventoryDevicePnpAdd: this is more interesting, and this was generated every time the device was plugged in, not just the first time. It contains the parent device container ID, which is very important for our purposes, because as stated earlier, one physical device is capable of spoofing multiple logical devices.
This container ID represents the individual physical device. So if you see one container ID that has multiple logical devices belonging to it, you can presume that there’s something strange going on and maybe want to look into it, because one physical USB device is representing itself as multiple things.
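The container ID heuristic amounts to a group-by. The sketch below is a minimal illustration under stated assumptions: each extracted event is modelled as a dict with hypothetical `container_id` and `device_class` keys, which are not the real field names in the diagnostic events.

```python
from collections import defaultdict


def multi_function_containers(events):
    """Return container IDs that expose more than one logical device class.

    events: iterable of dicts with hypothetical keys 'container_id' and
    'device_class' (for example 'keyboard', 'net', 'storage').
    """
    groups = defaultdict(set)
    for event in events:
        groups[event["container_id"]].add(event["device_class"])
    return {
        container: sorted(classes)
        for container, classes in groups.items()
        if len(classes) > 1
    }
```

A hit is not proof of an attack, since legitimate composite devices also exist, but it narrows down which physical devices deserve a closer look.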
And you also have the vendor-supplied manufacturer name and the vendor-supplied model information, which is where, on the previous slide, the Rubber Ducky was telling the system what it was. And then you have driver information and the driver provider, which is obviously useful to know how the device is interfacing with the operating system and what it’s actually doing.
Next we have the DHCP logs that we found in the svchost process, which is where netsh, the Windows networking utility, is found in memory. And we found these five event logs, or events, in memory. And we were able to reconstruct the full netsh log output using these.
And this is information related to the RNDIS gadget from the Bash Bunny, where it is making its own DHCP server and giving the Windows machine its own IP address. So this is obviously very suspicious and can be detected fairly easily with our tool.
Matthew: So to facilitate extraction of these memory artifacts and analysis of their presence over time, we created a series of tools. We created two Volatility plugins, the first one being usbhunt, which looks for that telemetry data and extracts it. The second is dhcphunt, which extracts those DHCP logs from the svchost process.
They both look within those process address spaces, as well as scan the entire system memory image to look through deallocated space for the artifacts. Additionally they feed JSON into our JSON reconstruction algorithm, which I’ll talk a little more about later, to produce valid JSON objects that can be easily converted to other formats such as CSV. So the plugins output string data for certain artifacts, such as the DHCP logs, and searchable JSON objects for others, like those diagnostic events.
So a common issue we run into with this kind of memory analysis is that when artifacts are presented as JSON data, specifically when they are deallocated, either because they’ve been deallocated by the process or because they’re in a process which has been killed, they’re often overwritten in some fashion. So either they have a chunk taken out of the middle, or we might have the beginning of a valid JSON object, but then the end has been overwritten by arbitrary data.
So this makes it difficult to extract information from this JSON without just treating it as a string. So we created an algorithm based on a state machine, similar to a lexer, that will look for where the JSON object has been overwritten and repair it with a stack-based method based on the valid JSON delimiters and the values that are actually present in the artifacts. And thus it allows us to more easily feed this information from our experiments into our visualization framework for analysis.
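To make the idea concrete, here is a minimal sketch of a stack-based repair for a JSON object whose tail has been cut off or overwritten. It is not the algorithm from the talk: the function name `repair_truncated_json` is hypothetical, the input is assumed to start at the opening brace, and a chunk missing from the middle of an object is not handled. The scanner tracks string and escape state like a small lexer, keeps a stack of closing delimiters still owed, and falls back to the last comma or opening bracket if the tail cannot be closed cleanly.

```python
import json

CLOSERS = {"{": "}", "[": "]"}


def repair_truncated_json(text):
    """Best-effort parse of JSON with a damaged tail; None if unrecoverable."""
    stack = []        # closing delimiters still owed, innermost last
    checkpoints = []  # (cut position, closers owed at that position)
    in_string = escaped = False
    end = len(text)
    for i, ch in enumerate(text):
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in CLOSERS:
            stack.append(CLOSERS[ch])
            checkpoints.append((i + 1, list(stack)))
        elif ch in "}]":
            if not stack or stack[-1] != ch:
                end = i  # mismatched delimiter: treat the rest as overwritten
                break
            stack.pop()
            if not stack:
                end = i + 1  # complete top-level value; ignore what follows
                break
        elif ch == ",":
            checkpoints.append((i, list(stack)))
    # First try closing the text as scanned, then back off to checkpoints.
    candidates = [(text[:end] + ('"' if in_string else ""), stack)]
    candidates += [(text[:pos], owed) for pos, owed in reversed(checkpoints)]
    for prefix, owed in candidates:
        try:
            return json.loads(prefix + "".join(reversed(owed)))
        except json.JSONDecodeError:
            continue
    return None
```

Letting `json.loads` validate each candidate keeps the sketch short: any fields that cannot be closed into valid JSON are dropped, and only the fields before the damage are returned.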
So these are two examples of the output of our Volatility plugin. So the first one is the InventoryDevicePnpAdd diagnostic event, and the second is the DeviceConfig diagnostic event. So you can see these are valid JSON in both cases and we can save them in whatever fashion we please.
And this is an example of the output of our dhcphunt Volatility plugin. So we output the reconstructed logs as plain text, and you can see a lot of information here such as the GUIDs of the different network adapters that are receiving IP addresses, which is especially relevant for the Bash Bunny with the Ethernet gadget.
So you can see the GUID assigned by Windows to the RNDIS Ethernet interface, and you can see the IP address that it is being assigned by the DHCP server running on the Bash Bunny.
So we fed the output of our plugin into a visualization framework that allows us to more easily see how artifacts are persisting in memory over time. So the top graph shows the device PnP and inventory diagnostic events. The second one shows unique device identifiers, and the third one shows DHCP diagnostic events.
So what we can glean from this is that in the case of the Bash Bunny, the diagnostic events remain in memory for a reasonable amount of time, until about halfway through the experiment, which is 12 hours. However, the other two kinds of artifacts remain in memory presumably indefinitely, or at least for the duration of the experiment, or until the device is rebooted. And they are present in varying quantities at different times, presumably because they were being moved around as memory management occurs, but in all cases we encountered, they were extractable.
A similar case exists for the Rubber Ducky. Unfortunately in this case, some of the diagnostic events did not persist in memory for as long as we would have hoped. They were overwritten almost immediately after the device was unplugged.
However, for the other two types, the unique identifiers for both the storage and the vendor as well as all of the DHCP events, the same was true as with the Bash Bunny: everything was present in memory for the duration of the experiment and was retrievable.
So as far as future work is concerned one thing that we surmise is that the type of DHCP traffic or logs generated may be able to be used to identify the different logical devices’ presence on a USB attack platform. So for this case, for the Rubber Ducky, if you look at the bottom most graph, you can see that there’s a very high ratio of receiving to ack DHCP events. And in this case with the Rubber Ducky, there is no Ethernet gadget present on board.
However, if we go back to the Bash Bunny and we look at that same bottom graph, you can see that there is a much higher ratio of ack to receiving events and this attack platform does in fact have an Ethernet gadget on board. So we surmise that it may be possible to determine whether or not a USB device plugged into a machine has an Ethernet interface based on analysis of these logs.
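The comparison of the two graphs reduces to a ratio of event counts. The following is a rough sketch of that idea only: the labels `"ack"` and `"receiving"` are hypothetical names for the two DHCP event kinds, and the threshold of 1.0 is an arbitrary assumption that has not been validated on real data.

```python
from collections import Counter


def ack_to_receiving_ratio(event_kinds):
    """Ratio of 'ack' to 'receiving' DHCP events, or None if there are none.

    event_kinds: iterable of strings; the two labels are hypothetical.
    """
    counts = Counter(event_kinds)
    if counts["receiving"] == 0:
        return float("inf") if counts["ack"] else None
    return counts["ack"] / counts["receiving"]


def may_have_ethernet_gadget(event_kinds, threshold=1.0):
    """Assumed rule: more acks than receiving events suggests an interface."""
    ratio = ack_to_receiving_ratio(event_kinds)
    return ratio is not None and ratio > threshold
```

Whether a fixed threshold holds up across devices and workloads is exactly the open question raised here as future work.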
Tyler: So the main takeaways from our work were that there are certain diagnostic events that remain in memory independent of the registry and event log, so we are able to find these devices, whether or not the attacker does basically everything they can to cover their tracks short of rebooting the machine. If they reboot the machine, there’s not much we can do from here.
And these diagnostic artifacts contained vendor and product IDs, the user accounts responsible for the events, the time that the events occurred and the driver information. This could be used to answer critical questions such as: was a device plugged in, and if a device was plugged in, what did it present itself as? What was it trying to do on the system? Did these things not correlate? Was it presenting itself as an Apple keyboard and then it started doing networking stuff and accessing networking drivers? So these are useful for an investigator trying to determine what happened with the USB device on the system.
For future work, what we could do to improve our experiment would be to run it for longer and simulate user activity. So we just ran it on a system that was dormant; not much was happening on the system. We could have been opening, you know, Firefox and going on Facebook and stuff like that to move memory around, make the operating system deal with that kind of stuff and see how that changes the memory and the amount of time that these objects are present.
We could also try alternative acquisition methods, rather than just copying VMEM files from a virtual machine, which was what we went with. And most importantly and most interestingly, the Windows diagnostic events that we looked at, the DeviceConfig and InventoryDevicePnpAdd events, are mandatory events that are well documented in Microsoft’s documentation. If they are mandatory, you cannot disable them. They have to give you all of the information about what information they’re collecting on you.
However, if you opt into these optional diagnostic events, they are not documented. We do not know how many of them there are. I’m sure if you spent the time, you could figure it out, but it is not on Microsoft’s website, and there might be a plethora of forensic information contained within these optional diagnostic events. Not just related to USB devices, you know, these diagnostic events that we’ve found can be applied to all USB devices. However, there may be other diagnostic events that could provide forensically relevant information to other forensic domains.
Matthew: So to sum up our work we discovered that with these USB attack platforms, the artifacts generated by the operating system remain in memory for extended durations, potentially until the system is rebooted as they reside in system processes.
Additionally, there are unique identifiers generated by plugging in these devices that give insight into what devices were plugged into a computer at any given time. And finally, DHCP network-related logs that are generated by Windows can help identify rogue interfaces that should not be present on a device, as well as what actions were carried out on these interfaces.
And a special thanks to Ibrahim Baggili, Bhavik Ashok Nahar, the Connecticut Institute of Technology and the University of New Haven Cyber Forensic Research Education Group, as well as The National Science Foundation for supporting our work. Thank you very much for attending our presentation.
Tyler: Thank you.