How To Solve Digital Forensics’ Biggest Challenges With Oxygen Forensics

Si: Hello, everyone, and welcome to the Forensic Focus Podcast. Today, we have Matt Finnegan joining us from Oxygen Forensics. Oxygen, as you know, we’ve spoken to a few times in various guises, specializing predominantly, in my understanding, in mobile device forensics. Matt knows something about this and we’ll quiz him about that in a little while. But starting off, Matt, how on earth did you get into this wonderful game that we enjoy and love on a daily basis? 

Matt: Yeah, so, I think like a lot of people, I actually fell into digital forensics a little bit by accident. I got into it through the military. I joined the Navy when I was about 20 and originally in the Navy, I did other things, not directly related to digital forensics, but I was in a technical role and the military was doing some digital forensics things and they don’t, you know…there isn’t a dedicated branch or, you know, job for that. So as a techie, one day I was put into a job where I needed to do digital forensics in support of the military. So I really kind of fell into it by accident. I never really knew anything about digital forensics in detail before I was told one day that that was now my job and I had to learn it.

Si: “Congratulations! Here you are. Here’s some kit, go do this.” 

Matt: Yeah. And it’s like 15 years later and I’m still doing it. So, I enjoy it. It’s super interesting.

Si: Were they good enough to give you any training in it? Or was it really a baptism of fire of…? 


Matt: No, I was pretty fortunate actually. I got sent on quite a few external commercial training courses. There was a lot of, you know, in-team, you know, the more experienced people talking people through, you know, how things work and what everything is before you go and kind of do it live. So I was very fortunate. I went on some of the SANS courses, I did some internal training and then we were quite fortunate to get kind of follow-on training.

So, you know, external courses again, maybe more SANS courses or vendor courses or things like chip-off courses, ISP courses. So, I was very fortunate in that regard actually. But to be honest, I’ve always said, you really do learn more in your first day actually ripping phones or extracting phones than on any of those courses, right? You know, the first day you run into something that you’ve never seen before and you start to learn from day one on the job as well.

Si: Yeah, no, I couldn’t agree more. And so is it mobile phones that you’ve been doing from day one or are you a computer examiner as well? 

Matt: Definitely started with mobile phones, and that was always the main focus, but, you know, stuff that was coming off the battlefield…it could be anything. You know, predominantly mobile phones, and, you know, way back then, it was a weird mixture of maybe a lot of feature phones, a few smartphones. Although not as many back then. But it could be computers. Drone forensics became quite hot a few years into doing that, or sometimes you just get the weird stuff like a router or, you know, some servers that have been pulled out of some racks somewhere or something like that. So it really could be anything, but definitely with a focus on mobiles, that’s where it, you know, most of the information came from. 

Si: So, I mean, obviously you being at Oxygen now at some point decided that the military was coming to an end and you moved on. Did you do anything in the middle or was it straight over to Oxygen?

Matt: So I went from being a practitioner, UK military, UK government, directly into Oxygen in the job that I’m doing now. So, you know, practitioner on the Friday and, you know, in my current job as a solutions architect at Oxygen on the Monday.

Si: So, what’s the definition of solutions architect? Because I mean, I’ve worked in IT and strangely…actually, it’s reasonably variable depending upon the organization you’re working for as to what a solutions architect actually does day to day!

Matt: Yeah. It’s one of those grand titles that gets put on you and then you have to try and explain what it is. So there’s…what we tend to do as solutions architects…the best way I can describe it is really just trying to, you know, explain what our software does to people that are, you know, maybe looking for forensic solutions. But I think just as important as that is explaining where it fits, because forensics is a toolbox industry. It always will be. You know, no matter how hard anybody tries to change that, digital forensics will always be that industry where you need a number of different tools.

So also trying to explain to people, you know, where we might fit in their existing toolbox, you know, listening to, you know, what people’s pain points or problems that they’re having and then trying to see how you know any one of our pieces of software might be able to help with that, whether it’s you know a particular type of extraction or a particular analysis problem or an analytic or anything really. So it’s trying to assess what people need and then trying to kind of see if we can help and explain how we can help, is probably how I can put it into a nutshell.

Si: And in that regard…you’re dealing with customers obviously all the time to whether that…are you involved in sort of development side of things as well?

Matt: Yeah, we definitely have a big input into development because, you know, digital forensics is constantly changing. And what people need, a lot of the time, you know, they can put feature requests in through, you know, support mechanisms, etc, etc. But a lot of the time some of the best information that the development teams and product teams will get are actually from the solutions architects, because we’re going out and meeting people, and people that are still actively doing digital forensics every day.

So we do build up quite a good understanding when we, you know, we have calls with people or we go and meet, you know, even current customers for a day, just to, you know, get a feel for how they’re using the products. And, you know, really importantly is there anything that we’re missing? You know, where should the development of the product go? Because, you know, developers can, you know, have a vision or, you know, the product team can have a vision, but there’s no point putting time and effort into that if it’s not what people actually want.

So definitely our feedback is really valued by the development team. And we’re really, really appreciative actually when, you know, either people who aren’t customers or especially existing customers take the time to, you know, give detailed feedback about what direction should we go in and the reasons for that, because it really, really helps to focus that development and we don’t waste our time and we don’t release something that isn’t useful. 

Si: No, absolutely. And you alluded to what I imagine is probably the biggest problem in…well, in forensics…in digital forensics as a whole, but certainly in mobile forensics, even more than computer-based, which is that the speed of change is, you know, hugely rapid. I mean, versions of Windows are bad enough, but, you know, versions of phones, versions of Android. And again, you know, that…we’ve got a Windows machine…okay, yeah, the hardware may be different, but that generally doesn’t have that much of an impact on the forensic analysis. Whereas the different hardware in all of the mobile devices that exist can be hugely different. How…what sort of impact is this actually having on the way that we’re approaching extractions from mobile devices?

Matt: Yeah, I mean, I think you touched on kind of two things there. One is the speed of change, and the other one is fragmentation, right? The number of different types of devices. And I remember when I started to do digital forensics and mobile forensics, it was much simpler. It definitely was. You know, the places that we tended to work, for a start, we saw a lot of feature phones or old Nokias, which were quite trivial to extract. And even the feature phones, which were those kind of weird, off brand, you know, straight out of Shenzhen, China type things, they were usually like one of maybe three chipsets.

It was, you know, Spreadtrum, MediaTek or Coolsand. And when you understand, you know, how they work and how the extraction methods for those work, whether it’s a commercial tool or a flasher box, which is, you know, what we used to turn to quite a lot, it was pretty trivial to get extractions from those. And then, you know, things became a little bit more complicated with smartphones and Android. But, you know, at the start it was still pretty simple.

You know, even if you had a locked device, in particular Android, you could just do, you know…if there wasn’t a tool method of doing it, you could just do a chip-off or an ISP or maybe a JTAG. And there was definitely a…I would say kind of a golden age where like nothing would stop you getting into that device, especially if it was an Android, because you had those hardware methods available to you. But then, like you said, speed of change, encryption came in and there was a big panic, you know, “encryption is here”. But at first it wasn’t on by default, you had to enable it yourself, and nobody did that.

And then it became enforced and there was a little bit more panic. But then it turned out that they were using a default password. So it was harder because of the hardware backed encryption and things like that, but there were still ways around it. But then things just keep evolving. So, you know, Android and iOS, they moved from full disk encryption to file based encryption, which again made things harder, but not impossible. It just gets incrementally harder and harder. And to kind of talk to your question a little bit…and I’m aware that I’m going off on a bit of a tangent here, maybe so interrupt me if you want to refocus me in…

Si: No, we welcome tangents, it’s fine!

Matt: …in a particular direction. You know, the state of things at the moment, I would say, with, you know, locked mobile phones, Android or iOS to use the two main examples, you know, with…and I’ll talk more about Android because in some ways it’s more interesting to talk about because iOS is a little bit more black and white. Obviously we’re talking about file based encryption.

Nowadays that means that the user password is almost definitely, you know, 99.9% of cases going to be tied to the encryption on the device. And there’s going to be that hardware backed element as well. So there are really, you know, two main, I would say, methodologies to go after those, you know, get into those devices. The first is at the operating system level. So the device is booted up and it’s, you know, more of a screen lock bypass and an exploit to get, you know, escalated privileges and then extract the data, which is already decrypted. Because, you know, if the device is in that AFU state…so I guess what I’m talking about here is the sort of BFU/AFU type methodologies that work at the operating system level and then the other…

Si: So just for clarity: AFU, BFU? 

Matt: Yeah, so, with the…I’ll almost go back a little bit, to the introduction of file based encryption. You know, one of the things I didn’t actually mention is that a lot of the evolution in device security is actually driven by user experience. You know, at the start, a lot of these devices used full disk encryption, but it didn’t use your password for the encryption unless you turned on something like secure start. Not great for the user experience because when you turn the phone on, it means that it doesn’t really do anything until you put your password in.

So you’re not going to receive calls or WhatsApp messages, etc etc. So quite secure, but not good for the user experience. And vendors, you know, they’re thinking with two hats on. They’re thinking with the average, you know, general user’s hat on, “what do I want my experience to be like?” And they’re thinking with that security focused hat on, you know, the tinfoil hat, if you like. And that really, I think, is what drove the adoption of file based encryption, because it meant that you could pick and choose what is automatically decrypted when you just turn the device on.

So you can turn the device on and it will be able to do some stuff like receive calls or, you know, various different things. That means that some stuff is decrypted automatically. And the stuff that is decrypted automatically is the stuff that we would deem to be available in, you know, what’s termed a before first unlock (BFU) extraction.

So you just turn the phone on, it will decrypt some stuff. And if you’re able to get around the screen lock, that’s the stuff that you’d be able to get in a decrypted format. When you put your password in, you know, everything (or maybe everything) becomes decrypted on the device. And when you lock the phone again, generally speaking, that data will remain decrypted, all those keys will remain available after you’ve locked that device if it’s been unlocked at least once before [the after first unlock, or AFU, state]. So in those instances, you know, and this isn’t actually, you know, one of the approaches that we use at the moment, it’s really a case of getting around the screen lock, and then some form of privilege escalation to, you know, be able to extract data from the device. The other methodology, which is one that we use more heavily, is the ability to do offline decryption.

So, you know…not always offline decryption, but quite often. So, you know, just extract all the data from the phone as a physical image, along with, you know, whichever hardware keys or hardware values are used in the encryption on the device, so that we can decrypt it offline and, you know, brute force or guess the user password if we need to.
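[Editor’s illustration] The idea of guessing a user password offline, once the device-bound values have been extracted, can be sketched roughly as below. This is a toy model under loud assumptions: the key derivation (PBKDF2 with the hardware value as salt), the `verifier` value and the function names are all hypothetical, and none of it reflects how any real device or Oxygen product derives keys.

```python
import hashlib
from itertools import product
from typing import Optional


def derive_key(pin: str, hardware_value: bytes, iterations: int = 100) -> bytes:
    """Toy derivation (assumption): mix the user PIN with a device-bound value.

    Real devices use far costlier, hardware-assisted schemes; the low
    iteration count here only keeps the example quick to run.
    """
    return hashlib.pbkdf2_hmac("sha256", pin.encode(), hardware_value, iterations)


def brute_force_pin(verifier: bytes, hardware_value: bytes, digits: int = 4) -> Optional[str]:
    """Try every numeric PIN of the given length against a known verifier."""
    for candidate in product("0123456789", repeat=digits):
        pin = "".join(candidate)
        if derive_key(pin, hardware_value) == verifier:
            return pin
    return None  # PIN is not a numeric code of this length
```

The point of the sketch is only that the guessing loop runs on the examiner’s workstation rather than on the phone, which is why the extracted hardware value is needed as an input.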

There is also another method, which is kind of a halfway house, where you let the device do some of the decryption stuff, but maybe I won’t, you know, go into that at the moment. And both those methods are valid. Both, you know, both of them work in certain scenarios. You know, the offline decryption works better if you have a device that’s already switched off. The, you know, the AFU method might work better if you have a device that is in AFU mode because you might not need to recover the user password. But, you know, things are always moving quickly.
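[Editor’s illustration] The BFU/AFU distinction described above can be modelled in a few lines. This is a deliberate simplification: the two storage labels borrow Android’s naming for its device encrypted and credential encrypted storage classes, and the function name is hypothetical.

```python
from enum import Enum
from typing import Set


class LockState(Enum):
    BFU = "before first unlock"
    AFU = "after first unlock"


def readable_storage(state: LockState) -> Set[str]:
    """Return which storage classes have keys available in a given state.

    Simplified model: device-encrypted data is usable as soon as the phone
    boots, credential-encrypted data only once the user has unlocked it.
    """
    readable = {"device_encrypted"}
    if state is LockState.AFU:
        readable.add("credential_encrypted")
    return readable
```

In this model a forced reboot moves a device from `LockState.AFU` back to `LockState.BFU`, which is why a reboot timer reduces what an operating-system-level method can reach.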

And actually, you know, quite heavily in the news recently there was this story that iOS had introduced a mechanism that will reboot an iPhone if it’s been locked for three days. I don’t know the exact details, but that’s specifically to combat the use of AFU style methodologies. So vendors or developers are always obviously looking at how digital forensics companies are kind of breaking into these devices and introducing mechanisms to counter that.

And if I recall correctly (and I might be wrong on this) I don’t think iOS were the first people to…or Apple were the first people to do this. I think this is something that was introduced in GrapheneOS, which is a security focused derivative of Android that usually runs on Pixel devices. And I don’t think it will be long before, you know, that’s introduced into Android as a mainstream update. You know, whether or not all the vendors will choose to implement that, but I can see it coming. And there’s other things as well. (And I apologize, this is a really, really long answer to your question!)

Si: It’s fine!

Matt: You know, one of the other things that I can see happening is the fact that normally once you unlock a phone and then re-lock it, data remains decrypted. That could also change, right? There is already, I think, an API call in Android that allows app developers, if they want to, and if they’re using the Keystore, which is an Android subsystem that allows app developers to encrypt their app’s data with hardware backed keys…you know, there’s already mechanisms in there that allow developers to only make that data decrypted when the phone is unlocked. They’re just not used. They’re really, really rarely used. In fact, I don’t know of any applications that use those.

And, you know, it may be that more applications start to make use of those. But again, you have to think about the user experience. Would that impact the user experience? And it’s always that trade off. And you might be able to implement that kind of thing in a way that doesn’t really impact the user experience, but it’d be interesting to see if that sort of thing comes into play, you know, eroding the effectiveness of some of those methods.

But if it does, it probably…it won’t be like universal. Like we…there’s always this panic when a new security mechanism or type of encryption comes out that this is the end of the world. There’s going to be no way around it. And either the uptake of that feature will be very low because it impacts the user experience, or there will be a way around it that you know, people will be able to find.

So it’s never the end of the world, but I guess kind of in closing to your question, we’ve gone from that golden age, I think, where you could just do a chip-off or an ISP or JTAG and it made you look really, really good because you could get into any phone and people were always impressed, to the age where, you know, exploiting a vulnerability has become necessary. And exploiting vulnerabilities that exist in ever decreasing attack surfaces, you know, that’s becoming harder and harder. The things that we can interact with on these locked devices are becoming smaller and smaller.

The attack surface is ever decreasing. So, you know, maybe we’ll reach a point where it’s kind of the end of the world, but I don’t think we’re there at the moment. And actually just one last thing I would say just before I hand it back to you, because I’m kind of hogging the mic here, so apologies for that. I think what’s kind of come into focus as devices have become harder to extract data from, particularly if they’re locked, you know, the amount of data that people have has expanded so much that actually there are other places that we can look for it.

You might be able to find the data that fills in the gaps in the cloud or on a computer, or get some credentials from a computer that you can do a cloud extraction with. So, because, you know, the data volume has bloomed, you know, the amount of data that people have associated with them is much bigger, it means that you might be able to get some of that data and fill in the gaps from somewhere else. So that’s kind of come more into focus, and cloud forensics is only ever on the up in terms of importance.

Si: I mean, you’re lining these up. I mean the…my only objection to you hogging the mic is the fact that you steal the question before I have an opportunity to ask it. I’m not serious! So, I mean, yeah. And, you know, again, Oxygen has capabilities for cloud extraction and it’s certainly a more prevalent thing that I’m seeing in cases. I try generally not to deal with mobile phones. It’s not my expertise. I don’t have the equipment and I leave that sort of thing to professionals.

But cloud is cropping up more and more often as a component of cases that I see. Let’s move on to that and talk about, sort of, cloud. What…are we able to extract sort of things like encryption keys and other stuff, or is it purely other evidential material, because obviously everything’s…if it’s being backed up to the cloud, you just pull the cloud copy and you don’t worry about what’s on the phone, you just go, “this is a backup of that, and therefore…”, that way. Or can you use it as a, you know, a plain text to try and break a ciphertext? Is it…what are the sort of things that we’re gaining by doing this cloud analysis?

Matt: There’s so much data that could be in the cloud. It’s kind of all of the above to be totally honest. It could just be, like you said, the content that was on the phone, it could be the messages from…whether it’s WhatsApp or Telegram or Viber or, you know, any number of different messaging applications. It could also be like you say, there’s a chance that there could be stored passwords in some of these cloud services. You know, there could be just entire backups of devices.

There could be an entire iCloud backup of a device, which not only gives you, you know, data from the device, but it also potentially gives you that historic snapshot of data that was there a week ago or a month ago, but might not even be there if you did do a successful mobile extraction today. And it’s loads of other stuff as well. It’s like a hundred plus supported services in the cloud extractor that we have. And it could be Uber, it could be geolocational data. It could be, you know, the MapMyRun or the Adidas running applications, which again is geolocational data. It may not even be that you need to know where somebody was at a particular time, it could be just useful to build up a pattern of life. You know, “what does somebody normally do on a Monday morning? They normally go for a run, but on this day, they didn’t go for a run on Monday morning.”

It could be the absence of data that, you know, helps. So it’s a really, really wide variety. If you think about any of the apps that you have on your phone, with the exception of some that may not, you know, store data in the cloud, you will probably be able to extract data from any of them if a cloud extractor for that particular app exists.

So it’s incredibly broad and sometimes these cloud extractions can get out of control a little bit. I’ve spoken to people in the past, more than one organization that has done cloud extractions and they’ve said, “yeah, we should have date ranged the extraction because it’s like two terabytes and it’s still going.” So it’s a ridiculous amount of data. It can be in the terabytes. It could be way more than what you would get from the device on its own. 
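[Editor’s illustration] A minimal sketch of what date ranging means in practice: keep only records whose timestamps fall inside the window of interest. The `CloudRecord` type and `in_date_range` function are hypothetical names for illustration; a real tool would normally apply the range when requesting data rather than filtering afterwards.

```python
from datetime import datetime, timezone
from typing import Iterable, Iterator, NamedTuple


class CloudRecord(NamedTuple):
    service: str        # e.g. a messaging or fitness service (illustrative)
    timestamp: datetime
    size_bytes: int


def in_date_range(
    records: Iterable[CloudRecord], start: datetime, end: datetime
) -> Iterator[CloudRecord]:
    """Yield records with start <= timestamp < end (half-open interval)."""
    for record in records:
        if start <= record.timestamp < end:
            yield record
```

Using timezone-aware datetimes (for example `datetime(2024, 1, 1, tzinfo=timezone.utc)`) for both the records and the bounds avoids comparing naive and aware values, which Python rejects.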

Si: Yeah, no. I, again, I’ve been…my experience of them has been fascinating. And yeah, I’ve had one where I’ve definitely…where I’m still downloading it and come back two or three days later and it’s still going and you’re wondering why on earth you started this whole thing. But for things like, let’s say WhatsApp. WhatsApp is fantastically prevalent and incredibly commonly used in criminal offenses, for some strange reason, I don’t know.

Anyway, it’s the idea of time limited messaging, I’m assuming. But what sort of differences are we going to be looking at if…depending upon the way that we go about extracting WhatsApp? So, what am I getting from…what are the problems with doing…what am I going to miss if I do a cloud extraction? What more can I get by doing it directly on the phone? Or, you know, is there a best way to approach it?

Matt: So WhatsApp is an interesting one actually, because it’s more a case of data being synced between different devices. There could be messages sat in the cloud, but generally they’re the messages that haven’t been delivered yet. And it kind of opens a can of worms actually, because if you do a WhatsApp extraction via the cloud, you could get messages that were actually sent and never actually received on the phone because maybe it was put into airplane mode or switched off when it was seized. And we tend to see actually that the chat history for WhatsApp, when you extract it from the cloud, tends to go back about a year.

And it could even be, you know, syncing those messages from other devices, that the WhatsApp data is on. So it’s a…WhatsApp is a bit of a weird one, and I probably don’t even understand it as much as I should. But, as I understand it, it’s not even necessarily that the data is held in the cloud, it’s more that it’s being synced from other WhatsApp devices that have that data on them.

As far as I know, only the queued messages are actually held in the cloud. And there is an interesting capability within the cloud extractor in Detective around WhatsApp, which is not actually doing a WhatsApp cloud extraction as such, but if you do get a…because if you set WhatsApp to back up on a device, whether it’s an iPhone or an Android, it will save those backups periodically, however often you tell it to make those backups in the WhatsApp app.

And those could be in Google Drive, they could be in iCloud backups or they could just be on the SD card of the device itself if it’s set to save locally. They are encrypted and normally to decrypt them, you need a key that you can only get from a full file system or a physical and then decrypted extraction. There is an interesting capability in the cloud extractor actually that if you have one of those encrypted backups and you don’t have a full extraction with that decryption key, you can actually, effectively, authenticate with the WhatsApp server as that user, and get it to send you the key to be able to decrypt those backups.

So again, it’s…that can be really, really useful and maybe even sometimes more useful than doing a normal cloud extraction because you could get snapshots. You know, if somebody’s making a weekly backup, you’ll get that perfect snapshot of all the data that was there a week ago, two weeks ago, you know, three weeks ago. And there could be deleted records that are deleted now, but they weren’t deleted then. So you’re not even having to do that kind of recover deleted records to be able to get some of that old data. It’s just there in the backup. 
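[Editor’s illustration] The value of periodic snapshots can be shown with a small comparison: anything present in an older backup but missing from a newer one was removed in between. The dictionary layout (message ID mapped to text) and the function name are assumptions for illustration, not the format of any real backup.

```python
from typing import Dict


def deleted_since(older: Dict[str, str], newer: Dict[str, str]) -> Dict[str, str]:
    """Return messages that exist in the older snapshot but not the newer one.

    Both snapshots map a message ID to its text (hypothetical layout).
    """
    return {
        message_id: text
        for message_id, text in older.items()
        if message_id not in newer
    }
```

Run across each consecutive pair of weekly backups, this gives a rough picture of when records disappeared, without any deleted-record recovery.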

Si: So how…I mean, obviously, you know, you’ve got a mobile phone…or you’ve got a case. Let’s go with a case. It’s easier to think about an individual. And they probably have at least one device, possibly two, maybe…well, probably two. Possibly one! Seems to me people have more mobile phones than I do. But I mean, you’re looking at an iPad, an iPhone, maybe another Android phone, if they’ve got something, you know, if they’re business people…there may be multiple phones and things like this.

And you’ve got cloud extractions of all of that. You’re talking…and you’ve got snapshots of something over a long period of time. How do we manage that amount of data and how do we reconcile it against itself? I mean, if you’re talking about snapshots, obviously you’re going to have multiple copies of…potentially have multiple copies of the same message across several backups with some that have some messages, some that don’t have other messages. And it becomes a jigsaw puzzle with overlapping pieces.

Matt: Yeah, it definitely can. And you’re right, everybody…pretty much everybody now has…probably has a couple of devices. They might even have the same accounts on those devices. And there’s also that drawer that everyone has in their house with all their old phones in it as well.

Si: Mine’s behind me. It’s got about 12, so yeah!

Matt: You can always tell when you’ve been given the contents of that drawer. But it’s actually really, really useful because a lot of the time, you know, their Android device from 10 years ago has the same PIN code as their, you know, current brand new iPhone 16. So, you know, password reuse always really, really helps. But to kind of go back to your point about, you know, you can have an overwhelming amount of data and how do you reconcile that?

I think having the ability to put different data sources and multiple devices or extractions in a single case, which is something that we do do, and it’s kind of for that reason, is really important. And, you know, one of the ways that I would normally go about making a little bit more sense of that data is (and I’m talking about Detective here), you know, to go to a case level timeline.

So you’ve got, you know, maybe 10 different extractions of different types in a case, if you look at the case level timeline, you can deduplicate, you know, identical results or messages and just have a timeline with kind of one of each in there. So it does make things easier, but yeah, you’re right…I mean, even just, you know, loading huge amounts of data, big extractions, is something that is also a problem in digital forensics nowadays because, you know, a lot of this software was written when you had a Nokia that you got a few text messages from, and it wasn’t going to take long to load. I think one of the challenges that vendors have at the moment is, you know, it doesn’t matter if you have a super powerful workstation, if the parsing code is relatively inefficient, then it’s still going to take a long time (and a bit of a plug, so apologies) and that’s kind of the reason that…

Si: No, this is fine. You’re from Oxygen. I expect these things!

Matt: …was rewritten actually with kind of, you know, a view of making it as fast as it can be on the parsing and analytics side a few years ago, because data volumes are just getting bigger. You know, an average phone extraction is probably over a hundred gigabytes nowadays, which is pretty wild when you think back to when it was four or eight, maybe sixteen gigabytes. And then you add in all the cloud stuff and then you add in stuff from computers and it does get difficult to manage. So having an efficient piece of software in terms of processing, but also having the ability to look at data on a case level, and be able to deduplicate and timeline things is really helpful when you have, you know, a lot of different extractions.

Si: In…I mean…so with regard to the ability to process huge amounts of data, I mean we talk about it on the workstation basis and we’ve talked about the cloud, but we’ve not talked about the cloud as an analysis platform. Is that something that you’re able to leverage? Obviously, with, you know, instantaneous scalability to throw a million processes at something, many hands make light work, kind of stuff. Is that something that you do? Or is that something…because I’m aware of various products that do make use of the cloud, and I have personal opinions on the security and the implications of uploading forensic evidence into the cloud. But, you know, what are your sort of thoughts and opinions on it?

Matt: Yeah, I mean, if it’s…you do raise an interesting point around, you know, if you’re using the cloud, that could mean two things. It means that, you know, the first is that you’re using something that you’ve set up, whether that’s in a commercial provider or if that’s just kind of your own mini cloud effectively, or it’s something that somebody else is hosting. And with anything that somebody else is hosting or that is hosted in a commercial provider, there is always going to be a concern around data integrity and data security. And I can’t see us going in that particular direction. Although I’m definitely not the boss, and I can’t speak, you know, with certainty around that.

I think what would be quite interesting would be more just generic distributed processing. Like you were saying, whether that’s in a different cloud somewhere or in something that, you know, you’ve set up, being able to distribute it. We are looking at, you know, maybe adding capabilities around being able to automate processing (and I’m not even sure if I should be saying that, so we can cut this…)

Si: I won’t press the point!

Matt: But yeah, I think actually…like I said before, it’s fine to throw a million processors at something, but I think equally important or maybe even more important is actually how is the code written that’s doing the processing? If it’s a program that is not as efficient as it should be or could be, then you’re always going to be fighting a losing battle.

I think it’s, you know, important to get the kind of framework right and the groundwork right in terms of, you know, this is a fast piece of software, and then look at the question of how you do the distributed processing afterwards. Because actually if it’s, you know, fast software then actually it will still load extractions relatively quickly. Even if it’s something that you have to leave running overnight, I think for most people, if it’s the case that it will be done by the next morning, that’s probably acceptable. And it’s there when you come in in the morning, ready to go.

I think what’s increasingly adding to the overhead of processing is that we’re not just, you know, pulling the WhatsApp messages out of a database and pulling files out of a file system anymore, increasingly people are expecting other analytics like OCR, speech to text, translation, even malware detection. Those things do all take longer. So, you know, being able to again GPU accelerate those…actually, if you can GPU accelerate something, like speech to text, you probably don’t even need to distribute that. You know, one GPU is probably going to be enough to do that very, very quickly.

So, I guess I’m kind of with you, I’m a little bit of a skeptic on the really massively distributed things. If nothing else, not just because of the security implications, but also the complexity. You know, when these kind of systems are kind of built, they tend to fall over quite a lot in my experience. You know, as things become more complex, it looks brilliant on paper until it breaks and you have to get a specialized engineer in to figure out why it’s gone wrong. You know, suddenly you’ve got a big amount of downtime. So again, I think there’s a…when I was saying before security is a trade off of, you know, user experience versus security, I think with processing, it’s also a trade off of complexity and reliability, you know, versus speed. And actually we might be able to achieve what we need with just single workstations still. Long answer. Sorry. 

Si: No, no, it’s an excellent answer. And actually, I’m going to challenge you on this. Now you sound…and, you know, your background is clearly technical. And you’re coming across very technical. And we’re talking about legacy code bases and how they’re not well written. Is there an opportunity, do you think, for refining forensic tools in general by moving to a better designed operating system? Windows itself is a large piece of legacy code, and it’s quite often a limiting factor in writing high performance stuff. It’s not designed as a high performance operating system, unlike…well, Linux isn’t necessarily designed as a high performance operating system, but at least Unix is designed as a high performance operating system, and therefore there are opportunities for more performant code to run on alternative operating systems than Windows. What do you reckon? 

Matt: So, you sound like much more of an expert in this than me. I think that you’re probably right. And I guess, you know, as an example, which is not directly, you know…it’s not a direct parallel, but if you look at any supercomputer, right, it’s not going to be running Windows, it’s going to be running some form of Unix or Linux. Yes, I think you’re probably right. But again, I would worry about the complexity. That would be some sort of really complex bespoke type system, where it’s not going to be the case that you just download the Windows installer from the software portal and then, you know, within 10 minutes, you’re up and running.

For organizations with huge amounts of data or that want to do clever analytics across huge amounts of data, yeah, I think something like a different operating system probably would be useful. And to be honest, they’re probably the type of people that are looking at cloud, and they’re probably not even thinking about Windows. They might even be running their own custom analytics and software to do it.

So, you know, can I see a forensics vendor implementing a kind of, you know, processing engine on a different operating system? Maybe, but I think more likely they would probably host it themselves, and then we would be back in that, “you’re going to have to upload all your data to the cloud for it to be processed,” type question. And it definitely is taking longer to process data, but I’m not sure that we’re at the stage where that’s absolutely necessary at the moment. Although for, you know, some people data processing speeds are more of a priority than others. So kind of a half answer to your question, I think, Si. I’m not really sure!

Si: No, it’s fine. It was an unfair question. I mean, I do the majority of my work on…I use Linux, I use Windows and I use MacOS. My background is as a Unix systems administrator, so for me, I’m very comfortable in Linux and MacOS because I can go to the command line and do whatever I want and it usually works!

Matt: …actually install programs on Linux, which I think stumps most of us at times. 

Si: And it’s interesting what you said, because actually I…one of the things I’ve done in a past life, was set up what was called a Beowulf cluster, which was actually commodity desktop machines repurposed with Linux to run as a large parallel processing machine. So it was about 32 odd PCs sitting in the rack, each of which obviously is a standalone PC, but they’re all talking to each other to do distributed processing. And that was for cancer research and doing, you know, distributed computing on cancer simulations and stuff like that.

So there’s a lot of technology that exists in other parts of computing that we haven’t really come around to leveraging in forensics yet, in some ways. So I kind of feel that, you know, there’s opportunities still for speed increases. And the funny thing about Beowulf clusters actually is that they were fundamentally sold as the workstation you’ve just retired off the desktop, as opposed to the one you’re actually using goes into the cluster, okay. And then, you know, that just gradually keeps expanding and it doesn’t…it kind of matters less how performant each chip is when you’ve got more of them over time. It’s an odd model, but it does work. 

Matt: That’s a really noble use case as well. It’s better than Bitcoin mining or something like that! Yeah, I mean, it’s an interesting concept, right? If you’ve got 20 examiners that each have a workstation, why not be able to tap into everybody else’s resources when they’re not using them, right? Like you say, it’s technology that exists, so it’s…

Si: Well, it’s the whole fundamental design principle for virtualization is that…you know, certainly in a data center whereby, you know, one machine wasn’t running one web server all the time and therefore you could run 50 web servers and they would probably be okay, on a single machine and you virtualized it out. And, again, we don’t really see…well, to be fair, if we’re running forensics jobs, we tend to be taxing the processor to the top of its ability while we are doing it.

But like you say, in a lab full of 20 people, 10 of them are writing reports at any given time, and 10 are actually, you know, doing something. So, there may be some room for shared or distributed computing. But like you said, GPUs are a fascinating thing in so much as it is such a significant processor in and of itself, or more to the point, so many cores in and of itself that it’s churning through some stuff so quickly now. And burning so much energy. 
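
The idea of borrowing workstations that are sitting idle while their owners write reports can be sketched as a small scheduling problem. The Python below is only an illustration of that idea, not a description of how any forensic product works: the workstation names, the job durations and the `assign_jobs` helper are all hypothetical. Each queued processing job is handed to whichever idle machine has the least work queued so far.

```python
import heapq


def assign_jobs(idle_workstations, job_hours):
    """Greedily give each job to the idle workstation with the least queued work.

    idle_workstations: list of workstation names (hypothetical labels).
    job_hours: dict mapping job name -> estimated processing hours.
    Returns (assignments, hours_until_last_job_finishes).
    """
    if not idle_workstations:
        raise ValueError("no idle workstations available")

    # Min-heap of (hours already queued, workstation name).
    heap = [(0.0, name) for name in idle_workstations]
    heapq.heapify(heap)
    assignments = {name: [] for name in idle_workstations}

    # Place the longest jobs first so they are spread across machines early.
    for job, hours in sorted(job_hours.items(), key=lambda kv: -kv[1]):
        load, name = heapq.heappop(heap)
        assignments[name].append(job)
        heapq.heappush(heap, (load + hours, name))

    finish = max(load for load, _ in heap)
    return assignments, finish


# Hypothetical queue: estimated hours per extraction to process.
jobs = {"phone_a": 8, "phone_b": 4, "phone_c": 4, "laptop_d": 2, "tablet_e": 2}
plan, hours = assign_jobs(["ws01", "ws02"], jobs)
print(plan, hours)
```

With these made-up numbers, one workstation would need 20 hours for the whole queue, while two idle machines finish in 10. The sketch ignores everything that makes the real problem hard, such as moving evidence between machines securely and a user reclaiming a workstation mid-job.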

Matt: Yeah, that’s an interesting consideration. What is the carbon footprint of digital forensics, right? 

Si: Yeah, yeah. 

Matt: …voice to text transcriptions. 

Si: But I’m going to circle back to something you said earlier, and it got me thinking because obviously I have my drawer of, you know, 50 phones behind me or whatever it actually is. I don’t know. Too many! But what I do is, is I actually…I do a factory reset on them before I chuck them in the drawer. Well, most of them. I’ve got a couple that are still running apps that I have to go and find the charger for to pull them up because I haven’t figured out how to get the one time passwords off the bloody things. But, how effective is a factory reset? Is there anything that’s recoverable after that? Or is it easy to…good, solid, reformatting, with a couple of overwrites that we would consider for a hard disk? 

Matt: You know, I always say digital forensics people are like politicians. We don’t give straight answers. You won’t get a yes or no answer from a digital forensics person usually. So I’m not going to give you one! The answer I’m going to give you is: I think it depends. You know, it depends on a whole host of factors really, you know, what generation is the device, what’s the operating system, what version of the operating system, but I…probably most importantly, what is the encryption type on the device?

You know, in years gone by, if you…before, you know, the Android devices were encrypted. Yeah, if you did a physical extraction of a device that had been factory reset, you were going to get loads of messages and photos and all kinds of data back. Nowadays I think it’s much, much less. And I can’t remember looking at a specific case myself. I’m trying to think back…iPhones. There may be some indicators, but in terms of, you know, actual indicators that something was factory reset possibly. But in terms of user data, I wouldn’t really expect to find anything, especially with file based encryption being quite prevalent nowadays, I would be surprised if there was much left. But I’m not an expert on that, so I’m going to say: probably not much. 

Si: Okay, so what I’m taking away from that is for the older devices, shred them, but anything more modern, you’ll probably be fine. 

Matt: Yeah, I think so. Just don’t forget to delete all that data out of the cloud as well, right? If you do want to get rid of it permanently.

Si: Oh, well, this is it, isn’t it? This is how on earth do you delete the footprint that we’ve created now? And I wouldn’t even know where to start. It is terrifying. I mean, I could write to…go to the Google and sort of hit the, “please delete all my data” button, but I’d have no clue about whether they did or didn’t and the extent beyond that. And, it was the thought the other day is…we were talking about AI generation and as everybody is at some point sooner or later, in any sort of forensics conversation, but it was about voice simulation. So you know, taking someone’s voice. And it’s like, I’ve now done half a dozen of these (more than half a dozen of these) podcasts. And there’s more than enough voice data to turn me into a simulation. And can I get it back? No. There’s no hope. It wasn’t a consideration when I started doing this, but could I go back and pull that in now? No. No chance. So you’ve got to kind of accept that we are now part of a digital world, which is…it’s just going to be what it is.

Matt: Everybody’s immortal in a small way now, right? 

Si: Yeah. Although I do…I have come across this wonderful cheat, okay. And I don’t know how well it will work out for you. But for me, the thing that made me vanish from the internet most effectively was having a very, very similar name to a very successful gymnast in the US. Because now anybody who searches Simon Biles actually gets automatically corrected to Simone and she gets…she is the top search for everything that you try and put my name in for. I used to come up on the top…on the first page of Google, and I don’t think I even show to about page 30 now, so it’s an interesting technique to get oneself removed from the internet. And quite hard to reproduce.

Matt: There’s a…I think there’s a writer for like a PC World magazine or something that has the same name as me. So I think that’s what normally comes up when you search my name. I don’t know though, do you want to be on the first page of Google or are you quite happy…?

Si: Well, actually, to be honest, at the time when I was on the first page of Google, I was fine with it. Now I’m very happy to be on page 30! I’m quite glad to have been pushed back. It’s not a big deal for me. I do not have sufficient vanity to care, but I do find it amusing. So, I mean, I’ve touched on AI, but I don’t want to talk about it anymore! Not in a bad way, but I think it’s starting to sound a little cliched. But what else is coming in our future of…well, let’s talk about mobile devices. But what’s coming in the future for us that’s going to present a challenge, do you think?

Matt: Well, you’ve mentioned AI, but we’ll skip over that one. I think that the…some of the biggest challenges are going to be around extraction. And I think I’ve covered some of them already. And I think it’s going to be…I think two things. I think the first thing is that vendors are becoming much more…not even much more, but they are taking note of what digital forensics companies are doing. You know, they’re actively, you know, patching vulnerabilities. But they’re taking steps that are quite specific to how digital forensics tools extract data.

You know, the example with iOS (and I think GrapheneOS already does it automatically) rebooting after three days. That’s very targeted at digital forensics, right? There is an argument that, you know, some malware is probably not persistent and it’s probably good hygiene to reboot your device every few days just to clear any malware out of RAM that might be sitting there. But that’s quite edge case. That’s quite niche. I think that is very, very focused at digital forensics companies. And, you know, that’s the top level. That’s the Android and the iOS of the world.

I think app developers are becoming more security focused and may also start to take note of, “how do we mitigate tools that are using AFU extraction methods?” It’s what I was talking about before, where, you know, there are APIs available (or API calls available) to developers now that allow them to say, “this data: once the phone is locked, re-encrypt it.” You know, “don’t make those encryption keys available anymore.” And that, at the app level, could also be a problem as we go on.
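
The effect of an app saying "once the phone is locked, don't make those keys available anymore" can be shown with a toy model. The Python below is a conceptual sketch only: it is not a real operating system API, and the class name, the two protection labels and the file names are all invented for illustration. It tracks which file keys are held in memory and drops the stricter ones whenever the device locks.

```python
from dataclasses import dataclass, field

# Hypothetical protection labels, invented for this sketch.
UNTIL_FIRST_UNLOCK = "available_after_first_unlock"
ONLY_WHILE_UNLOCKED = "available_only_while_unlocked"


@dataclass
class ToyKeyStore:
    """Toy model of which per-file keys are held in memory at a given moment."""
    protection: dict = field(default_factory=dict)    # file name -> label
    keys_in_memory: set = field(default_factory=set)  # files with a usable key

    def add_file(self, name, protection_label):
        self.protection[name] = protection_label

    def unlock(self):
        # After the user unlocks, keys for every file become available.
        self.keys_in_memory = set(self.protection)

    def lock(self):
        # On lock, evict keys for files marked "only while unlocked".
        self.keys_in_memory = {
            name for name in self.keys_in_memory
            if self.protection[name] != ONLY_WHILE_UNLOCKED
        }

    def readable_files(self):
        """Files an AFU-style extraction could decrypt in this toy model."""
        return sorted(self.keys_in_memory)


store = ToyKeyStore()
store.add_file("photos.db", UNTIL_FIRST_UNLOCK)
store.add_file("secure_chat.db", ONLY_WHILE_UNLOCKED)
store.unlock()   # first unlock after boot
store.lock()     # device is now locked, in an after-first-unlock state
print(store.readable_files())
```

In this model, a locked after-first-unlock device still exposes `photos.db` but not `secure_chat.db`, which is the sense in which stricter per-app protection could reduce what an AFU extraction recovers.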

And it may be that stuff like AFU extractions just decrease in viability, and we have to look more to those hardware key extractions or kind of, you know, lower level beneath the operating system level extractions and do, you know, either offline decryption or on the device, but not when the device is fully booted. But even that’s becoming harder because those attack surfaces are quite small as well. You’re looking at boot ROMs, which are, you know, tiny code bases, or boot loaders, which again are quite limited code bases that…I think if Samsung or if, you know, Xiaomi, or whoever wanted to go through the, you know, low level boot ROM/boot loader code bases with a fine tooth comb, I reckon they could patch everything in a few weeks. If there are vulnerabilities there.

You know, some vulnerabilities are more complex to find, you know, than really simple ones. You know, some of them, you might have to put the device in a certain situation for it to…but I do worry that the attack surface is relatively limited. There are the mitigations that, you know, the top level vendors, if you like, like Android and Apple are putting in place, but I think that will trickle down into the app developers as well.

So, I think that…and it’s kind of a cliche just to say extraction, but I think that that’s only going to get harder, especially with what we’ve seen in the last few weeks around some of these new security mechanisms. Which are actually not even that new, you know, they’ve been around in some operating systems for a little while. It’s the fact that they’re actively taking note of what digital forensics tools are doing and effectively targeting us in a way to mitigate against those specifically, is quite interesting. We’re on the radar, should we say. The entire industry is on the radar.

Si: Well, you’re doing…going back to the military, you’re doing free pen testing for them, effectively, is what you’re doing. You know, software vendors in…if you’re trying to create a secure solution, you know, you have to pay somebody to come and test it for you to make sure that you know that it’s locked down properly. You guys are doing this for free for them, and then producing it. And there’s this interesting sort of issue that presents itself at that point, which is what we do, we have to be able to go into court and explain.

Therefore, we need to know what we’ve done and therefore you need to tell us what we’ve done by doing it. And therefore anybody else is also able to get that information themselves. You know, Samsung could come and buy a copy of Oxygen, possibly under a different name, but they will get the training. They’ll be able to look at the tool and they’ll understand it. There is a company out there who we both know starts with “gray” and ends in “key”, that don’t share this information and keep it sort of locked down. And there’s this interesting problem that it presents, which is, okay, so we get Graykey obtained stuff. How do we actually defend that in court by standing up and going, “okay, I’ve got this, I have no idea how I got it or how it came out of that, but it’s here and I want you to trust it.”

We’ve got this lovely war going on whereby, okay, so the vendors keep…the phone vendors keep improving their stuff. You guys keep doing your jobs properly by finding the opportunities to get in and then telling people how they work so that you can justify it in the court of law. And then there’s these other guys who come along and ride slightly roughshod over the whole concept. What do you reckon? 

Matt: The black box, quite literally, right? You know, it’s interesting. In some regards, I get it, right? Encryption is the problem, right? And if you go way back to like the start of encryption, you know, it was the case that governments didn’t even want the average user to have it because it is such a powerful thing. And it, you know, good encryption is inherently strong. You know, you’re not gonna brute force an AES key in trillions of years.
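
To put a rough number on the brute-force point, here is a back-of-the-envelope calculation in Python. The guessing rate of one trillion keys per second is purely an assumption chosen for illustration, and the `years_to_search_half` helper is a hypothetical name. It estimates the average time to find a key by exhaustive search, which means trying half of the keyspace.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_search_half(key_bits, guesses_per_second):
    """Average years to find a key by exhaustive search (half the keyspace)."""
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


# Assumption: an attacker testing one trillion keys every second.
ASSUMED_RATE = 1e12

for bits in (128, 256):
    years = years_to_search_half(bits, ASSUMED_RATE)
    print(f"AES-{bits}: about {years:.1e} years on average")
```

Even at that assumed rate, a 128-bit key works out to roughly 5 × 10^18 years on average, far beyond "trillions", and a 256-bit key is larger again by many orders of magnitude.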

So you have to find, you know, ways around it. Which, you know, we have to have, you know, methods that can help us. I say have to, you know, it’s just my opinion. So, you know, if a vendor does find a way around a screen lock to get the decrypted data, I can see the reason for wanting to protect that method because it will be patched in a heartbeat. I think there’s the wanting to protect it so that examiners can get the data and there’s also the, you know, the business…the commercial side of wanting to protect it because you’ve probably, you know, it’s taken a while to find, it’s not cheap to do these things and you don’t want it to go away tomorrow, which they always can. You’re always on a knife edge.

So, you know, there are reasons to protect the methods. But you’re right, you know, anybody could say, “well, you don’t know what happened on that device. It may be that you never…you were able to get the data, but you were never able to unlock the device to actually, you know, verify what was on it.” So how do you know that even if the tool is good that it hasn’t even been tampered with to put, you know, fake data on the device, or does it give you the ability to put fake data on the device? Or when it’s extracting, you know, is it extracting valid data or is it manipulating it or changing it in some way?

And I…you know, I don’t have a good answer. You’re absolutely right. Anybody could challenge that and say, “if it is literally a black box that you have no idea how it’s gaining access, or, you know, how it’s extracting data, how do you know that it’s correct or that it’s even working properly?” And I don’t think that you can, I think you just have to trust. I think people just rely on the trust of the reputation of the companies, that they’re not doing anything untoward. But it does create a problem, I think around where…I don’t know if there’s any cases where it’s been challenged specifically, because people that are using the tools don’t 100% understand how they’re working. It’s definitely a Pandora’s box. Or, an interesting question, but yeah. 

Si: An interesting question. Well, I’m going to draw a line on that because we’ve literally just hit the hour. And I’ve run out of questions and…but it’s been an absolutely fascinating discussion. And thank you so much for coming on to the podcast to have a chat and tell us all about yourself and your thoughts, and to enlighten us as to what Oxygen is up to and what they’re doing. I’ve thoroughly enjoyed this. So thank you again.

Listeners, obviously you are listening to this, so you do know where to find this in some way, shape or form. But if you have a preferred podcast player such as Spotify and/or Apple Podcasts and/or whatever podcast thingy that Desi always remembers and I always forget…but we’re also on Forensic Focus website and on YouTube, and please, we’d be delighted if you came and listened to us at any point again in the future, because we have, like Matt, so many fascinating people who come on and tell us all about amazing things that are cutting edge in the industry…in the, you know…they’re experts, we aren’t! It’s absolutely wonderful. So, thank you very much for joining us, Matt.

Thank you so much for coming in and talking. And, it’s been an absolute pleasure. And I hope you’ll come back again in future, and tell us about something else and what’s going on.

Matt: Well, thank you very much for hosting me, it’s been really good to talk to you. So thank you again, Si.
