Bruce Nikkel: Okay, thanks everyone. I hope you enjoyed the break. Welcome to the first session: session one, with the theme of file system forensics. We have two interesting papers in this session. The first one is a systematic approach to understanding MACB timestamps on Unix-like systems and the primary author is Aurélien Thierry together with Tilo Müller, and Aurélien is giving the presentation.
Aurélien: So I will be talking today about timestamps, file timestamps, especially on Unix systems, so basically Linux, BSD, and macOS. I’m Aurélien Thierry, I work at Deutsche Telekom Security, and the work is co-authored with Tilo Müller, who is a professor at the Hochschule University of Applied Sciences, also in Germany.
So we looked into timestamp forensics, and especially file timestamps, because we believe they are fundamental artifacts that you find in many investigations.
Looking into those, two main questions arise. The first one is: when a user performs an action on his operating system, how does it impact file timestamps? And the other one goes the other way around: if you have a set of timestamps, because you’re doing hard drive forensics, for instance, what can you infer about what happened on the system? So what files were modified, or what applications were used?
You’ve probably already seen this kind of table. This one is from the SANS Institute, focusing on Windows; it’s called Windows Time Rules. It basically lists common operations, like creating a new file, modifying a file, reading a file and so on, and tries to answer both of those questions.
We focused on Unix-like systems, so operating systems like Linux, OpenBSD, FreeBSD, and macOS. Our idea was to have some automated testing, automated profiling, to determine timestamp updates. We use this framework to better understand how timestamps are updated on those platforms, and we also provide tables for these operating systems that can be used by investigators.
On Unix, so, Linux, BSD, and macOS, there is something interesting. There is a specification which is called POSIX, which basically aims to help compatibility, portability across systems, and POSIX actually mentions and specifies a lot of things about timestamps.
For instance, it specifies what should happen when you are using command line utilities on Linux, like cp or cat. It specifies also the standard libraries and also what happens on the kernel, so basically in the system calls.
It’s important to note that we use POSIX as a guide for our work, but none of the operating systems aims for full compliance; they all aim for a level of compliance that works on those systems. There is one exception, which is that one version of macOS is actually POSIX-certified.
So I will be talking about MACB timestamps. So there are four timestamps. The first one is M, M for modify. It is basically the last time the file was written to. Then you have A for access. So when the file was read. You have C for change. So basically when the file owner changed or when the access rights changed. And then you have a fourth timestamp that is B for birth or file creation. This one is actually not specified by POSIX, but we will see that actually all of the operating systems support it in some way or another.
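As a rough illustration of the four timestamps just described, the following Python sketch reads them for a single file. The helper name `macb` is hypothetical and is not part of the framework presented in the talk. It assumes Python's `os.stat`, which only exposes a birth timestamp (`st_birthtime`) on some platforms, so B is reported as `None` where the attribute is missing, even if the file system stores one.

```python
import os
import tempfile


def macb(path):
    """Return the M, A, C and B timestamps of `path` in nanoseconds.

    Hypothetical helper for illustration only. B is None when the
    Python build on this platform does not expose st_birthtime.
    """
    st = os.stat(path, follow_symlinks=False)
    # st_birthtime is a float in seconds, so the conversion below loses
    # sub-microsecond precision; it is good enough for a quick look.
    birth = getattr(st, "st_birthtime", None)
    return {
        "M": st.st_mtime_ns,  # last data modification
        "A": st.st_atime_ns,  # last access (read)
        "C": st.st_ctime_ns,  # last status change (owner, mode, ...)
        "B": int(birth * 1_000_000_000) if birth is not None else None,
    }


if __name__ == "__main__":
    fd, name = tempfile.mkstemp()
    os.close(fd)
    for key, value in macb(name).items():
        print(key, value)
    os.remove(name)
```

On a freshly created file the M, A and C values are normally identical or very close together, which matches the "new file" row discussed later in the talk.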
So let’s take our user. He’s using Linux, Ubuntu, and he’s using a text editor. This one is called Geany; it’s for the GNOME desktop environment. He has a file open that he is modifying, and he clicks on save file.
What he’s actually doing is not directly writing to the file system. The editor will be calling Middleware, so a software library that is common to all GNOME applications, which is called GTK. This library itself, again, will not write directly to the drive, but calls a standard library, which is the libc.
The libc itself is implemented using system calls from the Kernel, and in the end the Kernel actually writes to the file system, and this results in some timestamp updates.
If we take another user who is using the same application, but on OpenBSD, so a different operating system, the Middleware is actually the same. But if you look into the standard library, the libc is not the same on Linux and on OpenBSD, and of course the Kernels are different. So there is no intrinsic reason that the timestamps updated on both systems would be the same, even though the operation done is the same.
So the first thing to take away is that if you want to understand timestamp updates, you have to consider each of those layers, because this is a software stack. At the top of the stack we have the application that the users are actually using, and at the bottom of the stack, we have these file systems where the timestamps are really located. So to understand timestamps updates, you have to consider each of those layers.
So how does POSIX actually specify timestamp updates? The first thing to note is that a timestamp update is a two-step process. In the first step, the operation done by the user marks the timestamp for update. In the Kernel it’s basically implemented as a flag in the inode.
At a later point, the timestamp will actually be updated, meaning that the current time will be written to the drive. This also clears the flag that was set in the previous phase. This actual update shall happen under one of three circumstances.
The first one is, if you’re closing the file, then these flags shall be flushed and the timestamps updated. If you are reading or modifying the timestamps, then the file timestamps shall also be updated beforehand. And the Kernel can also decide to do it earlier; typically, after a 30-second delay, it’ll flush the timestamps and write them to the disk.
I have one example here of a file read: you are reading a file using the standard library, so the libc. You have three operations. First you open the file with fopen. POSIX actually specifies that an fopen in read mode does not change anything on timestamps. But then when you actually read data from the file, so doing fread, it will mark the access timestamp for update.
And at a later point, you close the file with fclose, and POSIX specifies that the timestamp shall be updated at this point at the latest. So there is this kind of room between the fread and the fclose in which the Kernel can decide when to actually update the timestamp.
So our approach was to do some automated testing on those timestamp updates. This is a schematic algorithm for it. At the core, we are basically running the operations that we’re testing. So in the case of the file read, we are running fopen, fread, and fclose.
Before and after, we fetch the current time on the system, and after that, we fetch the timestamps of the file that we looked into. In this example it’s called file.txt. We compare the fetched timestamps with the times recorded before and after the operation, and determine which timestamps were updated.
So here we will see that only the access timestamp was updated, which is expected when you’re reading a file. And then we compare these results with the specification, and POSIX specifies exactly that only the access timestamp shall be updated. So in this case, the test passes.
So the first thing we did, as I explained, is this compliance test against POSIX. It only looks into what’s specified by POSIX. So basically utilities, command-line utilities, then standard libraries, and then the Kernel. It’s a framework written in C.
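The profiling idea can be sketched in a few lines of Python. This is a simplified variant and not the authors' C framework: instead of comparing the file's timestamps with the system time taken before and after the operation, it compares two `stat` snapshots taken before and after, which avoids depending on how precisely the wall clock and the kernel's file clock agree. The names `snapshot`, `profile`, `read_file` and `append_byte`, and the 50 ms settling delay, are assumptions made for this example.

```python
import os
import time

# Birth time is left out because os.stat does not expose it everywhere.
FIELDS = {"M": "st_mtime_ns", "A": "st_atime_ns", "C": "st_ctime_ns"}


def snapshot(path):
    """Fetch the M, A and C timestamps of `path` in nanoseconds."""
    st = os.stat(path)
    return {name: getattr(st, attr) for name, attr in FIELDS.items()}


def profile(path, operation, settle=0.05):
    """Run operation(path) and return the set of timestamps that changed."""
    before = snapshot(path)
    time.sleep(settle)  # let the clock advance so an update becomes visible
    operation(path)
    after = snapshot(path)
    return {name for name in FIELDS if after[name] != before[name]}


def read_file(path):
    """Open, read and close, analogous to fopen, fread and fclose."""
    with open(path, "rb") as handle:
        handle.read()


def append_byte(path):
    """Write one byte at the end of the file."""
    with open(path, "ab") as handle:
        handle.write(b"x")
```

Calling `profile("file.txt", read_file)` would be expected to report only A on a system that performs every access-time update. With mount options that filter access-time updates, the result can be empty, so the outcome has to be interpreted together with the mount configuration. On file systems with one-second resolution, `settle` would need to be larger than one second.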
We also wanted to go a bit further and look into things that are not directly specified by POSIX, so the first thing we did was define common operations on files; so such as creating a new file, reading a file, modifying a file, copying a file, and so on.
We tried each time to have multiple implementations of these operations to be able to compare and validate them, because POSIX specifies what happens with the libc but also what happens with command-line utilities like cat, and they shall actually behave the same.
So then we compare these results. We also looked into Middleware. Because POSIX does not say anything about Middleware, we did not have any specification to compare to, so we only did some profiling.
And we also looked into applications, graphical applications. On this side, it was a bit different because we needed to be able to simulate user input, and we used a Python library to simulate keystrokes on the keyboard.
So I will take the software stack that I described previously spanning from the application to the file system and beginning with the bottom of the stack, so with the file system, explain how timestamps are handled basically.
So if you look into the file systems of the operating systems we’ve been looking into, we found that all of them actually support the birth timestamp, so the fourth timestamp, which is not specified by POSIX. All of the operating systems have a file system in which there is a field for it.
We also looked into timestamp resolution, and most of them have a timestamp resolution of one nanosecond, except for HFS+, which has a one-second resolution. So this is for the file system, but if you look into the Kernels, it’s a bit of a different story. For instance, FFS2 on OpenBSD has a field for the birth timestamp, but the Kernel actually never uses it; it’s always set to zero. So it’s not usable on this system.
And on FreeBSD, there’s also a difference with resolution, because file system UFS2 supports the nanosecond resolution, but actually by default FreeBSD will only use a microsecond resolution.
So at the interface between Kernel and file systems there are mount options. Mount options are basically some configuration that describes how the operating system will actually use the file system that you have on it.
And so for our case, it’s really interesting because they act as a filter between what happens before in the stack and what happens on the file system. And on most operating systems, there are some mount options that will disable completely or partially the updates to the accessed timestamp.
And it’s especially a problem on Linux because by default, it’s using the relatime mount option, which will perform updates to the accessed timestamp only if it’s older than one day or if it is earlier than the modified or the changed timestamp.
There are, of course, some other options which allow you to perform all of these timestamp updates, or to disable them completely. On the other operating systems it’s a bit different, because by default they perform all of the timestamp updates, but they also have some options to disable updates to the accessed timestamp completely or partially. Well, of course, except OpenBSD, which does not update the birth timestamp.
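The relatime rule described above can be written as a small predicate. This is a simplified model for illustration, not kernel code: it works on nanosecond integers and treats "one day" as exactly 24 hours, and the name `relatime_allows_update` is hypothetical.

```python
DAY_NS = 24 * 60 * 60 * 1_000_000_000  # one day in nanoseconds


def relatime_allows_update(atime_ns, mtime_ns, ctime_ns, now_ns):
    """Model of whether relatime lets an access update the A timestamp.

    The update goes through when the stored access time is not newer
    than the modify or change time, or when it is at least a day old.
    """
    if mtime_ns >= atime_ns:
        return True
    if ctime_ns >= atime_ns:
        return True
    return now_ns - atime_ns >= DAY_NS
```

For an investigator the practical consequence is that, under relatime, the access timestamp of a file that is read many times within a day only records the first of those reads after the last modification or status change.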
So we also looked into timestomping. Timestomping is basically the question of what could a user tamper on a file system to prevent investigation. Basically, could he modify timestamps on his system?
The first thing to note is that on Unix, so on Linux, BSD, and macOS, if you have elevated privileges, so if you are the root user, you will basically be able to modify anything, so it’s not a really interesting case for this.
But the question is then what a standard user can do with his own files. Of the four timestamps that we have, the modify and the access timestamps can be arbitrarily set in the future or in the past without any restrictions on all of the systems. The changed timestamp cannot be modified so easily; it can only be updated.
And the birth timestamp, on Linux, cannot be set arbitrarily, but on FreeBSD and macOS it can be set arbitrarily in the future or in the past. So the changed timestamp will be the most stable against timestomping, because it can basically only be updated to the current time. There is no way to set it to the future or to the past.
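A short Python sketch of what this looks like for a standard user working on his own file. It uses `os.utime`, which sets the access and modify timestamps; the helper name `backdate` and the chosen target date are assumptions for the example. The point it illustrates is that the changed timestamp is not set to the requested value but moves to the current time, which leaves a trace of the tampering.

```python
import os
import tempfile


def backdate(path, when_ns):
    """Set the A and M timestamps of `path` to `when_ns` (nanoseconds).

    Returns the stat result afterwards so the caller can inspect
    what happened to the C timestamp.
    """
    os.utime(path, ns=(when_ns, when_ns))  # (atime_ns, mtime_ns)
    return os.stat(path)


if __name__ == "__main__":
    fd, name = tempfile.mkstemp()
    os.close(fd)
    target = 10**18  # 10**9 seconds after the epoch, i.e. September 2001
    st = backdate(name, target)
    print("M set to target:", st.st_mtime_ns == target)
    print("A set to target:", st.st_atime_ns == target)
    print("C newer than target:", st.st_ctime_ns > target)
    os.remove(name)
```

A file whose modify and access timestamps are much older than its changed timestamp is therefore worth a second look, although legitimate tools such as archive extractors produce the same pattern.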
So this is the first table we have. Basically, it’s looking into what POSIX specifies; so utilities, standard libraries and the Kernel. Again, you have to consider that in concrete cases, many updates to the access timestamp will be skipped because of mount options, so these have to be considered as filters.
I will describe some of the typical operations that we see here. The first one is when you’re creating a new file on the system: this file will get updated M, A, C, and B timestamps. That’s expected; all of the timestamps will be new, so set to the current time.
If you are reading or executing a file, it’s also a bit expected: reading a file or executing a file on those platforms will always update the access timestamp.
If you overwrite a file, the modified timestamp will be updated along with the changed timestamp. That’s also a common thing that the changed timestamp is always updated along with the modified one.
Then we have some operations which behave unexpectedly. For instance, when you are doing a directory listing, so basically ls on your system, POSIX mandates that the directory shall see its accessed timestamp updated, because it is basically like reading a file.
But FreeBSD does not do that; it does not modify any timestamp, which is not compliant, and also maybe unexpected in our case. We also looked into symbolic links, so following or reading them. Following a symbolic link shall also update the link’s access timestamp, but FreeBSD, OpenBSD and macOS do not do that, which is again unexpected and not compliant.
There is also this point on moving files around on the same hard drive, so basically renaming or moving a file. POSIX doesn’t really specify what will happen. It says that the changed timestamp may or may not be updated, depending on the implementation.
And actually all the operating systems except macOS update the changed timestamp. This means that on macOS, if you’re moving a file around, it will not have any timestamps updated.
And there is also this one: when you’re moving files across file systems, it’s a bit more complicated. Usually the copy will get the same modify and access timestamps as the source.
Then the changed timestamp will be updated, and the birth timestamp actually depends on the operating system. On Linux it’s a new one, and on FreeBSD and macOS the file actually gets the modified timestamp of the source file as its birth timestamp, which is again a deviation, but this is not specified by POSIX.
So looking only into compliance tests, we implemented more than 200 tests. We basically found out that no implementation is fully compliant. There are also always some edge-cases, which are not compliant, but usually which are not very interesting for investigations or also from a software point of view.
If you take Linux with the non-default strictatime mount option, it is actually mostly compliant. Apart from the issues I already described on the previous slide, we also had some issues with timestamps that fail to be updated on FreeBSD and on macOS: in some circumstances they seem to be updated and in others they are not, and this is something that we still need to investigate.
So up until now, I have only talked about what is specified by POSIX. I will now talk a bit about Middleware and applications.
So we looked into, as I explained, the Middleware. For the GNOME desktop environment, there is a library called Gio, which handles the input and output, so basically the file operations for those libraries. There is one function called g_file_copy(), which handles copying files around.
The first thing to note is that it behaves differently from what we saw with POSIX. The modified timestamp of the copy is inherited from the source; it is not getting a new one, which was the case with POSIX and with the cp command-line utility.
There is also another point, which is that the modified and the accessed timestamp of the copied file are actually truncated to the microsecond resolution, which is really unexpected. And it’s actually a known bug that dates from 2010, known from Gio and from Nautilus. Nautilus is actually a file manager, which is used by default on Ubuntu and on the GNOME desktop environments.
And this has the consequence that if you are copying files around with Nautilus, you will see this pattern here, where the modified and accessed timestamps have three zeros at the end of their nanosecond component. So this could be used to find files that were copied with this file manager.
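A check for this pattern could look like the following Python sketch. The function names are hypothetical and the heuristic is deliberately crude: it only tests whether both timestamps are whole microseconds. It assumes the timestamps come from a file system with nanosecond resolution; on a system that stores or uses coarser resolutions (for instance the one-second HFS+ or the default microsecond setting on FreeBSD mentioned earlier) every file would match, so the check says nothing there.

```python
def truncated_to_microsecond(timestamp_ns):
    """True when the last three digits of the nanosecond value are zero."""
    return timestamp_ns % 1000 == 0


def matches_gio_copy_pattern(mtime_ns, atime_ns):
    """Heuristic: both M and A look truncated to microsecond resolution.

    A match is only a hint. Roughly one nanosecond timestamp in a
    thousand ends in three zeros by chance, so requiring both M and A
    to match reduces, but does not remove, false positives.
    """
    return truncated_to_microsecond(mtime_ns) and truncated_to_microsecond(atime_ns)
```

For example, a modify time of `1_600_000_000_123_456_000` and an access time of `1_600_000_000_654_321_000` match the pattern, while `1_600_000_000_123_456_789` does not.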
We also looked into text editors, so into reading and modifying files with text editors. For reading files, we found that all of them only update the accessed timestamp, which is expected.
If we look into writing to a file, there is a known pattern that is different from POSIX. We previously saw that writing to a file updates the modified and the changed timestamps, but many text editors have another strategy to save data: they write to a temporary file and then replace the original file with the temporary one.
And in this case, we will see that all of the timestamps are actually updated. Both strategies are used by a wide range of text editors, so you have to consider this also when looking into text files being modified.
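The two save strategies can be contrasted with a minimal Python sketch. The function names are hypothetical, and real editors add steps that are omitted here, such as backup files, syncing to disk and preserving permissions. The key difference is that the second strategy leaves a different file (a new inode) under the old name, which is why all of its timestamps, including birth where supported, are new.

```python
import os
import tempfile


def save_in_place(path, data):
    """Overwrite the existing file: same inode, M and C are updated."""
    with open(path, "wb") as handle:
        handle.write(data)


def save_via_rename(path, data):
    """Write a temporary file, then rename it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file is created in the same directory so that the
    # rename stays on one file system.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    os.replace(temp_path, path)  # the old file is replaced in one step
```

Comparing `os.stat(path).st_ino` before and after each call shows the difference: the inode number stays the same for `save_in_place` and changes for `save_via_rename`.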
So a bit about the project. We wrote the framework and it’s available online. The idea for us is to keep the results in the tables on the GitHub repository updated, so that they can be used by practitioners during investigations.
We have tables also for each of the operating systems, because the one I presented before is comparing all of them all along with POSIX, but when doing an investigation, for instance, here on an OpenBSD, you are not interested in having the results for macOS, for instance. So we have these tables that we want to keep updated, and we’re open to any contribution to the project. So feel free to ask.
So basically, we wrote this open-source framework, we published the tables and we used that framework to better understand how timestamp updates are performed on Unix-like systems, on those operating systems.
The thing I want to emphasize again is that, so first timestamp updates are always a two-step process. So the timestamp is first marked for update, then updated.
Then there is this idea that there is a software stack from the application to the file system, and that all of the layers of the stack have to be considered in order to really understand what’s going on with the timestamp updates. And even though there are a lot of specifications from POSIX, we found many minor deviations, but also some major ones that I mentioned previously.
As the next steps that we could have for the project, we want to be looking into newer versions of macOS. I don’t know if you noticed, but I talked about the HFS+ file system, which is a bit old, and we would like to look into APFS on macOS, on newer versions.
We would also like to look into file managers, and maybe some other operating systems like Android or iOS, and compare the results with what is already known on Windows to have some comparative results. Thank you for your attention.
Bruce: Thanks. Are there any questions from the audience?
Audience member 1: I had a quick question, actually. It’s a two-point question. In regards to the stat command output, you said FreeBSD had a microsecond resolution. Was there a way to increase that resolution beyond the microsecond resolution? And then my second question would be in regards to, you know, the POSIX-compliant operating systems that you chose: FreeBSD, OpenBSD. Why didn’t you include NetBSD?
Bruce: Could you repeat the last question again?
Audience member 1: Sure. My first question concerned the stat command output in regards to the FreeBSD microsecond resolution. Could that resolution be increased beyond the microsecond range?
Aurélien: Yeah, so basically on FreeBSD it’s configurable. By default it is the microsecond, but you can choose either the second resolution or the nanosecond. So you can use the full nanosecond resolution; it’s just not the default. On the second question of why OpenBSD and FreeBSD and not NetBSD, it’s only a question of time. There is no real technical reasoning behind it.
Audience member 1: Thank you.
Audience member 2: Hello. Great. Thank you very much and also thank you for sharing what you’ve done. My question is actually, so at the beginning, you already said that it’s a difference, if you take a look at, for example, the low-level system calls, which do file operations compared to applications. During your research, have you figured out which application implements which kind of system calls in which order? And if so, have you written this down and can share it with me?
Aurélien: So we’ve been looking a lot into this. The thing is, it also takes a lot of time to really track which application does exactly what. We mostly looked into it when we found some weird results, to really understand why we had some deviations. I think in the paper there must be one or two examples, but if you want to, we can also just discuss it later, because it’s really dependent on the application.
Audience member 2: Thank you.
Bruce: Okay. Were there any more questions from the room here? Otherwise I think we have some online.
Moderator: Great. Yeah, we have a question from Martin Lambert from the chat. He says, “You mentioned it briefly when talking about the editors, but is there any difference between moving a file to a non-existing destination path, like you might with mv, versus overwriting an existing file with the moved file?”
Aurélien: Yes, it’s actually in the tables. I’m looking at the table now because I don’t know it by heart. Basically, overwriting a file is just like writing to a file. So it’s only modifying the changed timestamp; it’s not modifying all four timestamps.
Audience member 3: Hi. Have you tried different file systems, like Linux and same applications and same everything before, but different file systems?
Aurélien: Not yet, no. For the other file systems, I actually don’t expect it to be very different, because most of this is implemented in the VFS and not in the individual file system, and there are not really many differences in how each file system is handled. Of course it has to be validated, but that is the idea behind it.
Bruce: Any other questions from the room here? Good. Then I think we’ve got another question from the chat.
Moderator: Yeah, sorry, this isn’t from the chat. This is just me abusing the privilege of reading the questions from the chat to ask a question for myself. But you mentioned that the root account can essentially use touch to arbitrarily affect timestamps. And presumably touch works by some system call?
I don’t really know about the API for the Linux Kernel very much, but if you are interested in constructing a system that is secure against timestomping, are there ways in which you can provide flags to the Kernel or would other Kernel patches be required to make it so that the Kernel enforces the legitimacy of timestamps?
Aurélien: At this point, I don’t think it’s really possible to modify the existing operating systems to really protect them against timestomping. The thing is, if you look into the root account, you can basically do anything.
And one [of] the solutions for instance, is to just change the date and then modify your file. They will get the new date that you chose, and then you get back to the original date and then you will have done timestomping. So it’s really difficult, I think, to really protect against those when you have elevated privileges.
Bruce: Good. Any other questions? Last chance. All right. Let’s give him a thanks for that presentation.