Christa: Hello and welcome to the Forensic Focus podcast. Monthly we interview experts from the digital forensics and incident response community on a host of topics ranging from technical aspects to career soft skills. I’m your host, Christa Miller.
Today we’re talking with Sarah Edwards, senior digital forensics researcher at BlackBag Technologies. Sarah works in the DC Metro area and specializes in Mac and mobile forensics. She has worked with various federal law enforcement agencies and has performed a variety of investigations including computer intrusions, criminal intelligence, and terrorism products.
Sarah’s research interests include anything and everything Apple-related: mobile devices, digital profiling, and Mac and mobile device security. Sarah has presented at many industry security and forensic conferences and is the author-instructor of SANS FOR518 Mac Forensic Analysis and Incident Response.
Today we’re talking about Sarah’s most recent research around forensic pattern of life analysis, her APOLLO tool built for that purpose, and some of the issues examiners might encounter. Sarah, welcome.
Sarah: Oh, thank you. Hello Christa. Happy to be on the podcast. Thank you.
Christa: So pattern of life analysis started with the intelligence world and then made its way to digital forensics via link analysis capabilities included in some of the existing tools. What became important about going beyond those communications links to also link [inaudible]?
Sarah: I see usage data. I think the really important pieces from a forensic standpoint are trying to figure out what the user is doing at any given point in time. What applications are they using, who are they talking to? How are they talking to them? Are they using something like a messages application, or maybe a secure messenger that is kind of a big deal when it comes to forensics, and getting that data, and then down into actual physical device usage. Is the device unplugged, plugged in?
Do they have a certain pattern? You know, I always plug in my phone at night. Do they have that same pattern? Combining all these things together can really tell a story about how the user uses their device.
And if you’re looking for something in particular for an investigation, I think that that pattern, or the lack thereof, or perhaps a certain day, can really help you out in an investigation.
Christa: So, going beyond the digital forensics tools that have the link analysis: what’s important about adding in those pieces of data to the existing apps that they’re using that you’ve mentioned, or the communication patterns specifically?
Sarah: So those being added into the tools themselves, I think it adds into the analysis whether or not you’re using the tools.
So you can look at, let’s say, third-party apps all day long, but how does the user use that app? Are they using it once, they downloaded it three years ago and completely forgot about it because it was a crap app? Or are they using it consistently? Are they using it every morning to talk to somebody? Are they using it throughout the day? Do they use it every Thursday? So just getting that context can bring a lot into different investigations.
Christa: Your blogs and webcast show how much of that data it’s possible to extract even given a narrow window of time, and how correlation can take time across databases. How detailed a picture do investigators really need?
Sarah: Yeah, so definitely everybody’s case is going to be different. So I originally wrote APOLLO as kind of a proof of concept tool for my own investigations.
I was spending too much time querying the 20-something databases, and it just takes time, and most folks just don’t have the time to take. You know, they’ve got case in, case out. So I wrote APOLLO to make it as quick for the investigator as possible.
But once you get the data, it’s a matter of the investigation. So with APOLLO you’re going to have a million plus different records from 20 different databases. That is too much for any single person to go through. You’re not going to be scrolling through this. So, depending on your case, you’re going to be filtering on certain things, whether it’s a certain application, whether it’s a certain time period, whether it’s a certain contact. And that’s just going to be on a case by case basis, a literal case by case basis. Whatever your analysis tells you you need to look for.
Christa: You talked about some of the patterns you’re generally looking for. Is it always about identifying anomalous from normal patterns, or are there other kinds of patterns?
Sarah: Well, yes and no. Certainly you’re going to look for the one-offs. Again going back to the application usage, if they are using an application all the time and then all of a sudden they stop for like the last two weeks or so, that anomaly is not part of their pattern anymore. So that would be perhaps significant to an investigation.
But then you’re also going to look at your normal patterns as well as their day-to-day. Where do they go? So like say looking for locations, they go to the coffee shop every morning, they go to the gym, they go to someplace for lunch, you know, using those patterns.
Maybe you’re looking at a person who breaks those patterns, and you want to know why did they go over here versus the same place that they went to for the last four weeks or so. So it’s just using the normal activity versus the anomalous activity in ways that can help an investigation move forward.
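[Illustration] The normal-versus-anomalous comparison Sarah describes can be sketched in a few lines of Python. This is a minimal sketch, not APOLLO code: the function name `usage_profile`, the dictionary keys, and the sample dates are all made up for the example, and the rule "a silence longer than any earlier gap is a break in pattern" is only one simple assumption about what counts as anomalous.

```python
from collections import Counter
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def usage_profile(use_dates, as_of):
    """Summarize when an app was used and whether the recent silence is unusual."""
    days = sorted(set(use_dates))
    if not days:
        raise ValueError("no usage dates supplied")
    # Gaps (in days) between consecutive days on which the app was used.
    gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
    longest_gap = max(gaps, default=0)
    silence = (as_of - days[-1]).days
    return {
        "weekday_counts": Counter(WEEKDAYS[d.weekday()] for d in days),
        "longest_gap_days": longest_gap,
        "days_since_last_use": silence,
        # Hypothetical rule: flag a break when the silence exceeds every earlier gap.
        "broke_pattern": silence > longest_gap,
    }


# Fabricated example: daily use from 1 to 16 September 2019, examined on 30 September.
sample = [date(2019, 9, 1) + timedelta(days=n) for n in range(16)]
profile = usage_profile(sample, as_of=date(2019, 9, 30))
print(profile["days_since_last_use"], profile["broke_pattern"])  # prints: 14 True
```

The weekday counts address the "do they use it every Thursday?" kind of question, while the gap comparison addresses the "they suddenly stopped two weeks ago" kind.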
Christa: So how far back does the data generally need to go to discern those kinds of patterns, and which databases are generally helpful to answer those questions?
Sarah: It’s all a matter of what your investigation is, honestly. So things like the application usage: the knowledgeC database goes back about four weeks or so, which is a nice chunk of time. However, you get some of the similar application usage in another database called the power log database. And that goes back to about three days or so. So every database is going to be a little bit different.
Even in the different tables, the different queries will have different retention periods for that data. Everything from maybe a few hours to 24 hours, to a week, to four weeks, to a few months. As far as which databases are useful, that’s part of the investigation. If I’m looking for location-based information, I’m going to go to the locationd, the routined, what’s commonly known as the significant locations in iOS devices, because that actually builds up over a period of time.
So there’s some of those databases [which] keep a very detailed location record of where a user was for about a week or so. But they also store the significant locations for quite a bit longer. Looking at my own data, I have locations in there for, gosh, probably over a year or so. But it’s not every location for a year. So different retention periods, data get shifted off and moved, and all that stuff.
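[Illustration] One quick way to see how far back a given table actually goes is to ask for its earliest and latest timestamps. The sketch below is an assumption-laden example rather than an APOLLO query: the table `events` and column `start_date` are hypothetical, and it assumes the column holds seconds since 2001-01-01 UTC, a convention used by many Apple Core Data stores. Other databases use other epochs, so check before trusting the conversion.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Assumed epoch: seconds counted from 2001-01-01 00:00:00 UTC.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)


def retention_span(conn, table, column):
    """Return (earliest, latest, span) for a timestamp column, or None if empty."""
    # Table and column names are interpolated, so pass only trusted identifiers.
    low, high = conn.execute(
        f"SELECT MIN({column}), MAX({column}) FROM {table}"
    ).fetchone()
    if low is None:
        return None
    earliest = APPLE_EPOCH + timedelta(seconds=low)
    latest = APPLE_EPOCH + timedelta(seconds=high)
    return earliest, latest, latest - earliest


# Fabricated data: two records exactly 28 days apart.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (start_date REAL)")
conn.executemany("INSERT INTO events VALUES (?)", [(0.0,), (28 * 86400.0,)])
earliest, latest, span = retention_span(conn, "events", "start_date")
print(span.days)  # prints: 28
```

Running the same check against each table of interest gives a rough picture of which sources still cover the time window a case cares about.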
Christa: So really when you’re looking at this stuff, you want to get to the data as quick as possible and save that off before it gets rotated out. So how would an examiner go about working with their investigation or their case agent on determining which of those databases are important to what they’re trying to get to?
Sarah: Yeah, it’s certainly looking back at what does the case need? Does the investigator need to know where a certain person was, what application they were using, what was their battery status? All of those are going to come from three different databases.
Some of those have analogs in the same databases. A lot of the knowledgeC, the power log stuff, can be across the different databases. It’s stored slightly differently, but you’re going to be looking at it in a couple of different databases.
So it’s really kind of figuring out where do you want your investigation to go? What pieces of data can help move that forward? What questions do you need to answer? And to try to look at that. I try to do as many detailed blogs as possible to try to describe all the different things that are available in these databases. But we’re talking about, you know, hundreds of potential things and everybody’s investigation is going to be different.
Some people might use [a] particular query for the power log: was the flashlight on or off? Other people might think that’s so trivial and not needed, depending on a particular investigation and how it needs to move forward.
Christa: So that actually leads into another question I had about queries and or filters going into the APOLLO tool. What are some other examples of the queries and filters that examiners can use to get to useful data?
Sarah: So again, if we’re talking about the SQL queries, I try to find useful things and I try to keep an open mind when I’m poring through these databases. There are some very seemingly trivial things in some of these databases, like the flashlight or whether it’s plugged in or not, or you know, what fingerprint did they use to authenticate to the device. But you honestly never know what’s going to be used.
So I try to throw that all in there together. Certainly I’m not grabbing everything, because sometimes I just have no idea what it is or how it can be useful. But it’s up to the investigator to know what they’re looking for. And I always encourage investigators to email me saying, Hey, I don’t know if this is stored anywhere, but this is the thing that I’m looking for.
And I can usually do some tests to figure out whether it’s stored there or not. It’s not always quick. But you know, given a certain importance in an investigation, I have found some very useful minutiae that can be very good for investigations.
Now as far as filtering goes, once you’ve pumped all these databases into APOLLO, you definitely want to filter it. Again, I mentioned millions of different records. You’re not going to scroll through every single one of them. You’re going to be pivoting off of some piece of data: a contact, an application, a moment in time.
I don’t joke when I say within a one-minute time period on an iOS device, we’re looking at hundreds and hundreds of different records from all these different databases. So you really do have to filter on a particular thing. And of course it’s multiple filtering. You’ll pivot from one thing to another to another to another.
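[Illustration] The filter-then-pivot workflow can be sketched with plain SQL over a combined table of records. Everything here is hypothetical: the `records` table, its three columns, the bundle ID `com.example.securechat`, and the sample rows are fabricated for the example and are not APOLLO’s actual output schema. Timestamps are assumed to be ISO-style text so that string comparison matches chronological order.

```python
import sqlite3


def pivot(conn, start, end, keyword=None):
    """Return records in [start, end), optionally narrowed to a keyword in `detail`."""
    sql = "SELECT ts, activity, detail FROM records WHERE ts >= ? AND ts < ?"
    params = [start, end]
    if keyword is not None:
        sql += " AND detail LIKE ?"
        params.append(f"%{keyword}%")
    sql += " ORDER BY ts"
    return conn.execute(sql, params).fetchall()


# Fabricated sample rows standing in for a merged timeline.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (ts TEXT, activity TEXT, detail TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?, ?)",
    [
        ("2019-10-01 08:00:05", "Application In Focus", "com.apple.MobileSMS"),
        ("2019-10-01 08:00:20", "Device Plugged In", "no"),
        ("2019-10-01 08:00:41", "Application In Focus", "com.example.securechat"),
        ("2019-10-01 21:15:00", "Device Plugged In", "yes"),
        ("2019-10-02 08:01:10", "Application In Focus", "com.example.securechat"),
    ],
)

# First filter: everything in a one-minute window.
minute = pivot(conn, "2019-10-01 08:00:00", "2019-10-01 08:01:00")
# Pivot: take an app seen in that window and follow it across other days.
app_hits = pivot(conn, "2019-10-01 00:00:00", "2019-10-03 00:00:00", "securechat")
print(len(minute), len(app_hits))  # prints: 3 2
```

Each pivot reuses a value found in the previous result (here an application) as the next filter, which is the "one thing to another to another" chain in miniature.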
Christa: So you also, in your webcast for SANS, talked about some specific artifacts including the power log .gz archives, without which you mentioned analysis would be missing information. How so?
Sarah: Yes. So APOLLO is not perfect and I am the first one to say that it’s not perfect. There are some places that I’m missing data because I haven’t programmed that into APOLLO just yet. So I want to make people aware that it’s not going to grab everything. You still need to potentially do manual analysis to get some of that data out. So things like… well, APOLLO specifically works with SQLite databases. You pull these out of file system dumps. They have transactions that are written in the WAL, the write-ahead log for SQLite databases. APOLLO is not going to parse that. APOLLO does not coalesce these databases before it parses that information out.
So there could be, and there will be, plenty of entries, quote unquote “hiding” in that write-ahead log. It’s something that I hope to have features for in the future. You know, this is an in-my-own-time kind of thing, so it’s just a matter of when I can sit down and do it. But it is on my little roadmap.
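[Illustration] To show what entries "hiding" in the write-ahead log look like, the sketch below builds a toy SQLite database, leaves one row only in the `-wal` file, and then compares a copy extracted without the WAL against a copy extracted with it. This is not how APOLLO works and not a forensic procedure: the file names, the `events` table, and the helper names are invented, and `PRAGMA wal_checkpoint` rewrites the database file, so it should only ever be run on a working copy, never on original evidence.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path
from typing import Tuple


def count_rows(db_path: Path) -> int:
    """Count rows in the toy `events` table; SQLite replays an adjacent -wal file."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
    finally:
        conn.close()


def coalesce_wal(db_path: Path) -> None:
    """Fold pending WAL frames into the main file. Modifies the file: copies only."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
    finally:
        conn.close()


def demo() -> Tuple[int, int]:
    """Return (rows seen without the WAL, rows seen with it) for a toy database."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp)
        live = work / "live.db"
        writer = sqlite3.connect(live)
        writer.execute("PRAGMA journal_mode=WAL").fetchall()
        writer.execute("CREATE TABLE events (ts INTEGER, note TEXT)")
        writer.execute("INSERT INTO events VALUES (1, 'already checkpointed')")
        writer.commit()
        # Push row 1 into the main file, then add row 2, which stays in the WAL
        # for as long as the writer connection remains open.
        writer.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
        writer.execute("INSERT INTO events VALUES (2, 'only in the WAL')")
        writer.commit()

        # Simulate two extractions: one that left the -wal file behind, one that kept it.
        no_wal = work / "no_wal" / "copy.db"
        with_wal = work / "with_wal" / "copy.db"
        for target in (no_wal, with_wal):
            target.parent.mkdir()
            shutil.copy(live, target)
        shutil.copy(str(live) + "-wal", str(with_wal) + "-wal")
        writer.close()

        coalesce_wal(with_wal)
        return count_rows(no_wal), count_rows(with_wal)


print(demo())  # prints: (1, 2)
```

The second row exists only in the WAL at extraction time, so a workflow that drops or ignores the `-wal` file never sees it.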
And then the power log: power log is great. Power log is a database that has, gosh I don’t even know, probably 200 plus different tables in it. So it’s got a lot of weird stuff that’s being recorded and I’m only parsing out the main database. It’s called CurrentPowerlog.PLSQL. However, most devices have archives of these power logs. They’re called CurrentPowerlog, some timestamp, .PLSQL.gz. So I am not unarchiving those or parsing those out.
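[Illustration] Until a tool handles the archives for you, a gzip-compressed SQLite file can be unpacked by hand and then queried like any other database. The sketch below uses only Python’s standard library; the function name `unarchive_sqlite` and the output-directory layout are assumptions for the example, and it presumes the archive is an ordinary gzip file whose name ends in `.gz`.

```python
import gzip
import shutil
from pathlib import Path


def unarchive_sqlite(gz_path: Path, out_dir: Path) -> Path:
    """Decompress `name.gz` into out_dir/name and return the new path."""
    if gz_path.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {gz_path.name}")
    out_path = out_dir / gz_path.stem  # "x.PLSQL.gz" becomes "x.PLSQL"
    # Stream the decompressed bytes so large archives are not loaded into memory.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

The returned path can be opened with `sqlite3.connect()` and run through the same queries as the current power log, keeping in mind that an older archive may have a different schema.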
Again, it’s another thing on my roadmap. I just don’t have it done just yet. And you know what, I’m probably missing data in other places. There’s other new databases coming up all the time. I haven’t implemented screen time yet. That is one that’s requested. But that database is particularly tricky. But again, it’s on my to do list.
I have a very long to do list, but it keeps me busy. It keeps me looking at the artifacts and I love doing it. I just have all these jobs that get in the way of actually getting this done, I think, like so many other examiners do. Right. I know, I know. I do have the fortunate… being fortunate to be able to do a lot of the research. I had been doing a lot of this pattern of life stuff for many, many years. This is not new to my research, so I really, I spend a lot of time on it. It’s a lot of personal time spent.
Christa: I want to actually circle back as you did mention that there are a lot of changes, like as you’re doing this research, that the databases themselves are changing. The databases are changing every year. This can change the queries in APOLLO a bit. What does this mean for a tool like APOLLO versus, for instance, vendor tools that are relying on the same schemas, especially for examiners who might be trying to use the different tools to validate the results?
Sarah: Yeah, that’s going to be a problem for both me and the vendors. Apple loves to add new features. They love to change database schemas, not even on a major upgrade like an iOS 12 to 13, but sometimes even on a point release, which makes it extra fun for some of us, which means if we’re all working with the same queries and they add in a new column, a new table, new whatever to the database, we have to update that.
And sometimes, you know, because there’s hundreds of databases on these devices, they’re going to miss it. You know, half the time I try to do the major upgrades, the 12 to 13. Next year I’ll do the 14, and so on and so on. But there’s going to be minor changes in between that. And currently APOLLO doesn’t really do the minor point releases.
I haven’t had a need to really put that in there, but it could absolutely happen. So that is also why I have the YOLO option. You know, the ‘you only live once,’ basically try to run all these queries. If it works, great. If not, oh well. You know, so it’s not perfect, but it’s at least kind of a bandaid on that problem until somebody realizes and sends me a pull request, or sends me an email [that] says, Hey, this is broken now, or this is updated. Can you take a look at it? Because again, I don’t have the time to keep up with all of this, other than the major ones or once I realize that something has changed.
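[Illustration] The "try every query and see what works" idea can be sketched as a loop that records failures instead of stopping. This is a guess at the general approach, not APOLLO’s actual implementation: the function name `run_all_queries`, the query names, and the toy `usage` table with its missing `launch_reason` column are all hypothetical.

```python
import sqlite3


def run_all_queries(conn, queries):
    """Run each named query; collect rows for those that work, errors for the rest."""
    results, failures = {}, {}
    for name, sql in queries.items():
        try:
            results[name] = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # A changed schema (renamed column, dropped table) typically lands here.
            failures[name] = str(exc)
    return results, failures


# Fabricated database whose schema lacks a column one query expects.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE usage (app TEXT, seconds INTEGER)")
conn.execute("INSERT INTO usage VALUES ('Maps', 30)")

queries = {
    "app_usage": "SELECT app, seconds FROM usage",
    "app_launch_reason": "SELECT app, launch_reason FROM usage",
}
results, failures = run_all_queries(conn, queries)
print(sorted(results), sorted(failures))
```

Keeping the error text for each failed query is what turns a silent miss into something an examiner can report back, along with the operating system version and database involved.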
Christa: Is that something that the community can help with? Is this an open source tool that the community can come out and help with that research, or help add to that?
Sarah: Oh yes. I have not found a whole lot of folks helping me with that other than Alexis, but I have specifically made APOLLO so it’s relatively easy for the everyday examiner to update it. All you need to do is be able to write a SQL query.
Not saying SQL queries are all easy all the time. If you look at some of the power log queries, I have like three or four different selects and a lot of craziness and math going on with them. But that’s the easiest way to update that. If you can’t write a query, just drop me a line saying, Hey, this does not work. Provide me details: which operating system you’re running, which database, you know, have you looked into the database, do you see what any changes are? Just try to give me as much context as possible, and I’ll try to update it as necessary.
Christa: And then on a similar note, because you’ve talked quite a bit about the amount of time that all of this takes you, Alexis, Mike Williamson, and some others have spoken about the need to take time to verify the results that you’re getting back from APOLLO and from other tools. But specifically in this context when so many labs are backlogged or have resource limitations, what are your recommendations for testing and verifying data?
Sarah: I bet it’s hard: time. People just don’t have time. So you know, of course you could run it through as many different tools as possible. Some tools will accept data, and some will not. But getting in there, getting on the device and physically testing it.
You know, if I’m looking at whether the flashlight has been turned on or off, I’m going to pull out a jailbroken device. I’m going to test that and I’m going to look at the particular database and run the query on it on a live device. That is the quickest way to verify that.
So if you’re coming down to writing a report and have a very, very specific record that you want to absolutely verify if it’s correct or not, get in there and do it. You know, it does take a little bit of time, but testing does not need to be an extremely lengthy thing. I have a whole other presentation that I did: it’s called Poking the Bear, on how to test on Mac and iOS devices.
You know, it doesn’t have to be long-term or highly intensive, but you keep a jailbroken device around just specifically for that purpose. So I do find that quite useful. It can be five minutes of testing, but you can feel a little bit more comfortable about the data that you’re going to present to whomever.
Christa: As far as a pattern of life analysis overall, I want to pull back a little bit and ask, there’s some obvious applicability here. We’ve talked about criminal investigations and counter terrorism and so on. Is pattern of life analysis also useful in corporate investigations or litigation?
Sarah: Oh, absolutely. It’s all an investigative kind of utility. Are your corporate investigations having to deal with, let’s say, exfiltration of corporate data? If I’m on their iOS device and I see that they took a picture with the camera, they saved the picture, they maybe uploaded it to Dropbox, they sent a message to, you know, a competing company over whatever secure chat messenger there might be. It’s the same thing as other investigations. You’re putting that series of events together to tell a story. So it really doesn’t matter what type of investigation it is, it’s all a matter of how you use the data.
Christa: And then I’m going to close out here: tell us a little bit more about the APOLLO plugin for BlackLight. What does the BlackLight interface offer above and beyond APOLLO’s existing functionality?
Sarah: It’s prettier. So APOLLO’s output is ugly, and it’s ugly because I can’t make things pretty. Because we are taking… I say ‘we’… because I’m taking data from 20 plus different databases, it’s going to all be different.
You know, there is sometimes a database where I’m pulling, let’s say, two to three columns, and other ones where I’m pulling 50 columns. And trying to normalize that and make it easy to go through is very, very difficult. I basically push it into a CSV file or a SQLite database. It’s not consistent across databases. I’m really just pulling that data so I can see it, I can parse it out visually, but again, it’s very ugly.
So bringing that into BlackLight, you can poke around and look at all the different APOLLO modules. They’ve all been split out, their columns have been split out, so it’s just a little easier to visually take a look at.
So that’s one of the major pieces to that. And also because it’s a plugin, whenever I do an update to APOLLO or the modules, you can reload that into BlackLight and use the newest features. You don’t have to wait for BlackLight to update, you can just load it in there, or download it from Github, load into BlackLight and move on, because I do update them quite frequently, especially when they’re big things and big changes that I have been doing.
I just updated it recently for iOS 13, officially last week. And so a lot of those changes can be put into BlackLight now without having to download a new version of BlackLight.
Christa: Awesome.
Sarah: Yup.
Christa: Is there anything else about pattern of life analysis that we have not covered that you’d like to cover, or anything in general?
Sarah: I think there’s so much more to move on. I’ve talked mostly about iOS. Another thing that I am going to work on, another thing that’s on my to do list, is to get more of the Mac side of that.
Now if anybody’s actually run APOLLO on a Mac system, you will notice that it does work. You’ll have to use the YOLO option. But a lot of the database schemas across Mac and iOS do tend to be similar. They’re not always the same, but I will be working on the Mac side of the house.
I want to do iOS first because I think, you know, with our full file system dumps, we are getting this data now and it’s really important. Our mobile devices are very intimate objects. They really tell what’s going on. It’s time to move a lot of that to the Mac side.
I’d love for somebody to do perhaps a little bit more on the Android side. Alexis really did quite a bit with his Artemis. Unfortunately on Android you don’t get a lot of the same data being stored in SQLite databases, they are going to be stored in quite a few other different file formats. But the capability is there now to use Artemis for some of that parsing. So I’d really like to see, you know, cross-platform other artifacts being put in there.
I’m not probably going to do it, I’ll be perfectly honest. Apple keeps me very busy with its updates. But I’d love to see more community involvement with that. So I keep it out in the community. I want the community to use it. I put a lot of work in there and I love hearing the stories when folks say, Hey, I used APOLLO for this really cool case. You know, so always drop me a line. If you have used APOLLO, whatever details you can provide, I’m very nosy.
Christa: Yeah. Well that’s good to know. Thank you.
Sarah: Yeah.
Christa: And I think that about wraps our time here. Sarah, thank you again for joining us and thanks to our listeners for joining us on the Forensic Focus podcast. You can find more articles, information, and forums at www.forensicfocus.com. If there are any topics you would like us to cover, or if you’d like to suggest someone for us to interview, please let us know.