Join the forum discussion here.
View the webinar on YouTube here.
Read a full transcript of the webinar here.

Transcript
Amy: Good morning, everyone. I'm Amy from the marketing team here at Magnet Forensics. I'd like to welcome you to our webinar today. I [00:12] this is a one-way audio webinar. We have plenty of time for questions. So to submit a question, use the Questions panel in the GoToMeeting application, on the right-hand side of your screen. We are recording our session today, and we will distribute it by the end of the week. With that, I'd like to introduce Jessica Hyde, Director of Forensics with Magnet Forensics, to present to you 'Connecting the Dots Between Artifacts and User Activity'.
Jessica: Hi, thank you, everyone, for joining. So we'll be talking today about connecting the dots between artifacts and user activity. Just a little bit about me – my name's Jessica Hyde, as Amy mentioned. I'm the Director of Forensics here at Magnet. That means I mostly work with the product and development teams to ensure forensic examiner input is taken into account in the things we do in product, as well as doing a lot of review of the things we build into product.
Before working for Magnet, I was a contractor. I worked for Basis Technology for a couple of years, running a mobile forensics team where we specialized in devices that could not be acquired or analyzed by commercial tools. Prior to that, I did digital forensics as a consultant for Ernst & Young, or EY. Before that, I was a contractor again, working for American Systems, doing mobile forensics there, in a lab where I received and imaged devices, primarily doing JTAG and chip-off extractions, and then deep analysis. Prior to that, I was serving in the United States Marine Corps. I do have a second job, because one's not good enough. [01:56]. I teach mobile device forensics as an adjunct professor at George Mason University, where I also received my Master of Science in Computer Forensics.
So what are we going to talk about today? Today we're going to talk about what artifact attributes are, and how we can relate artifacts based on those attributes. Then we'll talk about some operating system-type artifacts that are worthwhile to use in correlating data, and then I'm going to show you what we've been working on for quite some time and just released about three weeks ago, called Connections. It's part of Axiom, and it makes the correlations with these artifacts simple to do and to present. And then we'll take some questions.
So we have a couple of goals today, and the first one is to learn how we’re going to link common artifacts, how we determine that artifacts are correlated. Then we’re going to actually learn about how we can see connections by doing that, and how that can be used to actually replicate the journey of a file, and how that actually helps us understand user activity.
So why do we want to do this? Well, we want to discuss linking common artifacts with each other so that we can see how they correlate. This isn't limited to just understanding relationships between people and calls – the kind of thing you might see in call chain analysis, where we're correlating different identities and communication methods – we can also go a little bit further and understand the relationships between different files based on their attributes. So that can be how a media file correlates to another file, or how it correlates to associated file system artifacts. And the goal here is to be able to tie all of those things together – how an individual or an identity, a selector, depending on your terminology, communicates with other individuals, shares files and data, as well as how those files move, and then provide attribution as well as other operating system artifacts that can help us really understand the relationships.
So what do we mean when we say we're connecting the dots? Now, there's a lot of structured data that we recover forensically. And what's really interesting about the conversation and presentation we're talking about today is that this should be stuff that everyone is doing right now, but we came up with a new way to make it easier to do. So right now, we take a lot of structured data that relates to when something was done, because of timestamps; what type of activity was done; was something viewed or accessed; which user account was doing it; and frequency. So we can take that structured data and we can provide context to it. So by dealing with artifacts first and looking at parsed content as well as looking at metadata, we can start to really build a real understanding of what's going on, and then look at the aspect of attribution at the same time. So how do we look at all of this at the same time, so that we can begin answering the questions we get on our cases?
And I don't know what kinds of questions each of you individually are getting on your cases, but I know some of the questions that I regularly get as an examiner, that are things that I needed to answer, which are: How did this file get on a computer and where did it go? How was the file shared? How else was it viewed? When was it viewed? What else was viewed at the same time? What other activity was going on at the same time?
So how do we figure that out? Well, we look at the individual attributes of an artifact. What is an attribute? It's a property of a specific artifact. Historically, in forensics, we've typically discussed artifacts as almost being the smallest element, right? We parse and carve an artifact – e.g. a Google Chrome download – and that Google Chrome download, that's an artifact. Well, that artifact contains multiple different elements. It contains the file path, it contains the application, it includes the source where it came from, it has a time/date stamp. So instead of just saying that we're going to look at that artifact, what we want to do is start pivoting on those attributes and pairing those attributes.
An attribute is a property of a specific artifact. For example, let's say we're looking at Facebook contacts. The attributes that we might parse from a Facebook contact could include a profile ID, a first name, a last name, a display name, a small picture URL, a big picture URL, and a phone number. What's interesting here, and important to look at, is that there are actually relationships not just externally – that same last name may be used in other areas, or that profile ID might be used for a Facebook message – but also internal to the artifact itself. So we want to correlate that the first name and last name correlate with the display name as well as with the picture, and that creates an identity.
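To make that concrete, here is a minimal sketch in Python – not AXIOM's internal data model, just an illustration with hypothetical values – of an artifact represented as a set of named attributes, using the Facebook contact example above:

# A minimal sketch (not AXIOM's internal model) of an artifact as a set of
# named attributes; all values below are hypothetical.
from dataclasses import dataclass, field

@dataclass
class Artifact:
    kind: str        # e.g. "Facebook Contact", "Chrome Download", "LNK File"
    source: str      # the evidence item the artifact was parsed or carved from
    attributes: dict = field(default_factory=dict)

facebook_contact = Artifact(
    kind="Facebook Contact",
    source="laptop.E01",
    attributes={
        "profile_id": "100001234567890",
        "first_name": "Monica",
        "last_name": "Neff",
        "display_name": "Monica Neff",
        "phone_number": "555-0100",
    },
)
# The internal relationship – profile ID, names, display name and picture all
# describing the same person – is what lets these attributes form an identity.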
Another type of artifact – well, all artifacts have multiple attributes, but just as an example of the types of attributes, especially as we move into more operating system-type artifacts – is a prefetch file. So for example, a prefetch file contains an application name, the application run count, the last run date/time, the operating system version, etc.
So what do we want to do? We want to use a rules-based methodology – steering away from anything that isn't rules-based – to group artifacts that share common properties. And the whole goal of sharing those common properties is to give meaningful relationships to other artifacts. So those shared properties could include, let's say, the file name or file path, and then, based on that, we can state facts.
So let's talk about the connections. Let's say we have a message – and messages is a pretty big category: email, chat, SMS, etc. Well, we can make connections based on the attachment itself, based on the application that was used, based on information that was maybe shared – for example, in a chat application you might be sharing a contact card or you might be sharing the geolocation that you're at – as well as who transferred that data. So that can be the program that sent it, it can be the sender themselves, or it can be an account ID. And the whole point is once we have a message, we can now correlate that message based on every other existence of that attachment or usage of that application, or the program that transferred it, or who the sender was for that account ID. And don't worry, we'll show this to you graphically in a few moments. It'll make a little bit more sense.
But what you're basically thinking about – and I'm sure a lot of people are already saying, "Jessica, this is call chain analysis, right?" But it's not, because currently in forensics we don't do that kind of chain analysis with files, documents, pictures, etc. – until now. So the goal is that we can make connections based off that filename. The file path – file path is probably my favorite. So we can say that a filename may have been located at a file path if we have, let's say, a shellbag in common with that file path. Then we just need to look at that time instance and see if it was there. We can look based on the hash value and see what other files exist with that hash value.
And here’s the thing. As examiners, we’re all doing this all the time. We’re saying, “Alright, I have this filename. Where else does it exist?” And we’re pivoting on a keyword search. Or, “I have this file path. What other artifacts exist at this file path?” Or, “I have this hash value. What else exists with this hash value?” We spend a lot of time doing this manually, when this is a concept that can be done programmatically because they’re definitive. Looking at the application that was used for something, and then what else was done with that application. We can also correlate off of operating system artifacts, and this is probably the most important place, because in a lot of ways this provides us attribution.
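As a rough illustration of that programmatic, rules-based pivoting – this is only a sketch, not how AXIOM implements Connections, and build_index and pivot are invented names – artifacts can be indexed by the value of each shared attribute, so that pivoting on a filename, file path or hash returns every other artifact carrying the same value (this reuses the Artifact class sketched earlier):

from collections import defaultdict

# A sketch of rules-based grouping: index artifacts by the exact value of a
# shared attribute, so a pivot on file name, file path or hash becomes a
# lookup rather than a manual keyword search.
def build_index(artifacts, attribute_names=("file_name", "file_path", "md5_hash")):
    index = defaultdict(list)
    for artifact in artifacts:
        for name in attribute_names:
            value = artifact.attributes.get(name)
            if value:
                # The rule is definitive: identical values connect artifacts.
                index[(name, str(value).lower())].append(artifact)
    return index

def pivot(index, attribute_name, value):
    # Every artifact that shares this attribute value, e.g.
    # pivot(index, "file_name", "newcannabisgrowbible.pdf")
    return index.get((attribute_name, str(value).lower()), [])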
So this can include things like, in prefetch, we'll get the application name of the application that was run from a particular location, typically the first time it was run, and the path it was launched from. So we'll have a lot of good information there. Then we'll have the file path for the link files. So if we pivot off a file and we want to see if there's a link file, we can then look at that file's file path, see all the link files, and correlate that to potentially either the system creating the link file or the user creating the link file. But regardless of which way it was created, it can also show definitively that something existed on the system or was accessed from the system, even if that artifact itself no longer exists.
We have jump lists, and jump lists are great, because these are recent files for an application. So we have evidence of that. And then of course we have shellbags, and shellbags are just a fantastic artifact, and thanks to David Cowen's podcast and [11:42] post the other day, I learned a whole bunch of new ways that shellbags are created. But typically, we think of them in terms of Explorer windows being open and items being navigated to.
So now we need to start putting all of this information together, so that we can answer questions. How did a file get on this computer and where did it go? The goal here is to really tell a story. How did the document get here? Was it opened? And we’ll go ahead and walk through some cases to really show this. How was a file shared? How else was it viewed? When was it viewed? What else was viewed at the same time? And these are the typical types of questions we’re trying to answer as examiners, and we dig through our tools to find these answers.
So let’s actually go look at some data, because I’m really not a fan of PowerPoint slides, if you couldn’t tell by the fact that mine weren’t so captivating.
So I have up here Axiom. And if you’re not familiar with Axiom, it is Magnet Forensics’ full forensics suite, and it is very similar to IEF, for those of you who are used to IEF in terms of looking at artifacts first. So what I’m going to do is I’m going to just go to a document of interest. So we have some documents here … let’s look at PDFs.
And if we look down here, I have a document which I already tagged, called the Newcannabisgrowbible. And it's maybe of importance in my case, and so I've got these questions we want to answer – how did it get here? Well, what you'll see now, as of version 1.2, is that after the case runs, this icon is created. And this is how you enter the Connections Explorer. And what you'll see is that there's more than one, because we were talking before about attributes, and what you see here on your Details card are all of the parsed attributes for a connection, and so we could go in on the source, we could go in on the author, we could go in on the filename. And I'm going to go ahead and go in on the filename.
Once I go in on the filename I’m brought to a graph. And the purpose of this graph is to show you all of the relations based on this filename. So on the right side here, I see every artifact we have that contains the filename as an attribute. In the graph, I see everything external that pertains to having a shared relationship. So what you’ll see is I can very quickly see that this newcannabisgrowbible PDF was downloaded from this URL, and if I hover over this URL, I can learn more about where this came from. I’ll just scroll down here. And what I can see is that this URL was accessed with the identifier [14:43] both Chrome and Internet Explorer. And if I look at this instance here, I can see that this actually is the Chrome download.
Additionally, I can see very quickly that it was transferred, that this filename or a file having this name, was transferred to this user, live:monicaneff4, using this URL by live … transferred by the live account Isiah.dashner. I also see some file paths here that I’ll probably want to explore as well.
Let’s go ahead and look over here on the right-hand side. I’m going to go ahead and pop in my Details card. I would normally do this on two windows and pop out the Explorer view so I can navigate and click through it to the second window. But here, I’ll see all of the same information I would typically see on a Details card. So here is the actual chat message, where you can see that the media URL exists, as does the file. So this PDF file exists in this chat message, you can see here the PDF that we looked at when we came in. What you may notice is that there’s a second PDF, and what this PDF is, is this is the PDF not as it exists parsed in Axiom, but as it exists in the file system, and since it’s from the file system, you can go ahead and do things like on-the-fly decoding, because it’s just [the same] as any other artifact pulled directly from the MFT in the file system view.
And then here I can see the information about my Chrome downloads. So I’m going to go ahead and close that. Something I really want to explore now is my file paths, because I know that’s where I often get my attribution artifacts. So I’m going to go ahead and double-click on my file path, and you’ll see I already tagged them here, but we can see the device ID that it was accessed on, we can see that it was downloaded from this URL, and then we can again see that it was accessed with both IE and Chrome. And then we see the file again.
But what’s really important to me here is that I’m seeing these link files. And then I can go ahead into the details of these link files, and I can see the path, here for the document, as well as here, and so we’ve got two different link files, we also have a jump list for this path, as well as the downloads. So now we have some good attribution, operating system type artifacts. And you’ll see I tagged some of these. So the way that I personally would now work from this case is I would go ahead and I’ll go back home, real quick, so that’ll bring me back to my original view, and then I’ll go ahead, since I did use a tag, I’ll go ahead and filter down to that tag, and then I’ll go ahead and throw it into a timeline view, and I’ll pop in the Details card. So I can begin building a timeline based on these items.
So that’s good for that, but that’s only across one piece of evidence. All of this existed on the computer. So really quickly, I’ll show you that this is … for this case, I have a couple of pieces of evidence, I have a USB, I have an Android phone, I have a computer, and you’ll see here I have some cloud data. So actually from the cloud, I’ve pulled a couple of different cloud accounts. So I pulled Dropbox, Facebook Messenger, I pulled all of Facebook, and iCloud backups.
And actually, what's interesting about this is when I was processing this case to build this presentation, I was sent the encrypted iTunes backup. I was processing it at zero dark thirty, and while I had the encrypted iTunes backup, the person who sent it to me did not send me the backup password. None of the passwords I had in the case seemed to work on it, but I did realize that I had the Gmail account and password from having looked at the computer, and so I went ahead and just guessed – well, made an assumption that that was probably the iTunes account that was used for the iCloud backups – and went ahead and did an iCloud acquisition, and voila! I actually had three backups. And I'll just show you real quick, just by showing you something that would only exist on the iOS device, like iOS iMessages, that those then were parsed out. So that's pretty handy, to be able to do that, and that's something that would happen to me on a real case, where I don't necessarily have the password, but I'm just [19:26] other ways.
So let’s go ahead and look at something from the cloud, and try to see if we can pull evidence from different places. Here you see a picture, and you guys can make fun of me, but I do watch American football, and I am a Giants fan, and so this is Eli Manning. He might not be the best. This is Eli Manning with the Minnesota Vikings, who did beat the Giants this year, but this picture isn’t from this year, so it’s okay. As we can see by the filename, it’s probably from October 2016, possibly.
So let’s go ahead and explore connections and see if we can figure out where this picture came from, other than knowing it’s in Dropbox. As soon as we go to Dropbox on here, we quickly see that this was accessed from Dropbox, we have some possible file paths … you’ll notice that we have a file path both on the computer and then we have a file path to the cloud backup. The reason we have two different applications for [access list] is one of these is going to be the instance from the computer and one of these is going to be the instance from the cloud, hence we have the multiple places. Here’s that source of the jpeg existing in the source. But I’m fairly curious as to if this exists anywhere else, and when I look at that file hash – I’ll do that again so you can see – I can see that there’s another instance of this with another name. Some other existence of this same hash value.
So I have this same hash, the same image, existing under another name – instead of the date format it had in the Dropbox, it's called Giants Viking Football.jpg. So I'm going to go ahead and click there. And when I do that, I see some really interesting things. So I have a file path on the E: drive … and let me go ahead and start tagging some of these as bookmarks. I have a location on the E: drive, which probably means this was on removable media, and I also have a file path. So we've gotten something from a Dropbox, we have existence of this picture – let's go ahead and look at this picture's existence here … And if we look on the Details card, we can see that this is on the USB. So we have the existence on the USB, we have the cloud. What about figuring out if this has ever been on the computer? So that's file path … we don't seem to have any goodies there.
Let’s go ahead and go back one. And there is a Back button here. You can also save a node – so if I really want to save this node and come back to this picture later, all I do is just hold over that node, and it’ll go ahead and save for me. You see, I have a saved node here, so I can go to it later.
So I’m going to go ahead and go to this file path, and lo and behold, I have some link files. So we’ve now gone from the Dropbox on the cloud, from the cloud artifact, to seeing evidence of it existing with the same hash value, renamed, on a USB drive, and now in addition to the hash value of the rename on the USB drive, we now have evidence of it being accessed with the link file. And if we look at the link file of course, that’s coming from the PC. So we started really, really quickly with the Dropbox, and in just a few clicks, we were able to quickly see evidence of this not only being in the cloud Dropbox account, but being accessed from the computer and existing on the USB.
So I don’t know about you guys, but these are the kinds of things that I used to draw out manually, and make graphical representations of, to be able to show this to my end customers, so that they could understand how I was saying something got somewhere, and proving attribution. I mean link files are gold, right? Because now we can show not only did this come from the cloud, and was placed on the USB, but it was accessed on this system.
I’m going to go home one more time and I’m going to go another way, because sometimes when I’m thinking about attribution artifacts, sometimes I actually don’t start at the file of interest. Sometimes, I actually start from operating system artifacts. I was actually talking with an examiner at HTCIA about this. So let’s say I’m starting by looking at my link files, because I really want to make sure that I’m starting with things that are attributed. And here, I might find a link file that I’m interested in. It’s on the E: drive, it’s actually to a specific file. And I go in there.
So if I go in on this one, I very quickly – because I started on a file path, I’m going to start with my link files … I’ll see … [do they] have a second one … I’ll go ahead and mark that. And now I have the filename. So let me go in on that filename. And now I actually have the picture, I can again look at the details. I’m going to go ahead and tag these real quick. Those are now my evidence. But let me see if this existed somewhere. So now, actually, I have on here, I have that the source is from the SanDisk, which I kind of expected – that’s not far off, being that the file path was on the E: drive. I do have another file path for this. Let’s go ahead and look at that.
So this one doesn’t have much interesting here. But if we look at our details card, we can see this is also coming from a USB. So that’s not terribly interesting. Let me go back one.
But what I do have here is I have a hash, and – oh, look. That file hash has another match; the file's been renamed. I'm going to go ahead and go through that file hash. And I can actually even start … almost work this backwards from the one we were looking at before, because now we're getting to the cloud. So what I'm able to see now – double-click on that instead of single-clicking – what I'm able to see now is that this jpeg was renamed, and also exists on the USB. So we'll go file hash, and … not only on the USB, but here it is in its other existence that came from the cloud. And we see it also coming from Dropbox.
What's great here is we're able to really, really quickly replicate and find all of the other artifacts. We've got some neat features in here, as far as warning you when you go a little too far. So for example, maybe I don't necessarily want to look at everything on the source of the PC, or on the source of the cloud. So it will kind of prevent you from doing that. But sometimes when you get to the [26:32] identifiers … I'm going to go ahead and go back one. And I'll go back one more, get to the USB. Oh, good. [26:44]. Sometimes when you get to device identifiers, you'll have a lot of hits off of something, where it's not manageable. And you'll kind of get that input, because when you hover over it, you'll see that there's almost a full starburst, and you might not want to go that route. But this helps us quickly find artifacts that correlate to other artifacts, and be able to replicate that file journey.
You can also very easily just go ahead and print these, and save these visual graphs. And like I said before, you can save them – so you can go right to them. As I mentioned, when you've double-clicked on something, in the matching results you're going to see everything that has the property that's written under it in common – so these are the artifacts that have the filename. When you click on something once and make it a focus node, you're going to see all the artifacts in common. So now we'll see all the artifacts in common with that file path – though there wouldn't be any there. But then, if I make this my new focus node again, I'll now see everything with that file path as I'm going through. So that's how you navigate the graph.
I did want to show you guys another example. Can you [pivot from external] E: drive to the USB registry key? So we actually parse out the … I’m sorry, that was a question that just came in. Thank you, Jonathan.
So we actually parse out the USB registry keys, so let's actually go to a USB registry key and pivot. Alright. So for anything that we're parsing the data from, you're going to be able to do that. So if we go to the operating system here, and we actually go to our USB devices … let's go ahead and just … I'm just picking one. You'll see here that we have all of your sources here. And if you wanted to validate the source in a registry key, you would go here.
When you're doing that from the other graph … I'm going to go ahead and just go straight back to artifacts. Let's say we've come in on the file name, and you are now on a USB device. And any artifact that was stored here could have brought you here. I guess I could have done this from that USB device. When I click on the details here, I will be able to go in on to the registry key. So you can absolutely pivot and validate, in the exact registry key, the USB that you've seen. So absolutely, you can do that pivot. Now I can go right back to my connections if I want.
I’m going to show one other example that’s pretty interesting. Because it’s a good one that’s pretty quick to run if somebody wants to just test this out and try it. And I do have a case of this pre-built, so anyone who would like to give it a shot, just let me know.
There's a scenario called the M57-Jean scenario, which the Naval Postgraduate School put out. And I just wanted to pick something with a public image that had a scenario that you could solve using this tool quickly, so that people could test this out and have a known solution. And the scenario basically goes like this. Jean works for a company called M57. Some personally identifiable information, PII, got onto the internet from the organization. Jean claims she had nothing to do with it. So the question is: is it true that Jean had nothing to do with this information getting out into the public? She says there is no way she could have – she's never sent this document to anyone; it just existed standalone on her computer. And so, the information we're given is that it's the m57biz.xls document. And this is a really small image set. It's an E01, it's about 6 gigs total, so you can easily download it. And all I did was process it with version 1.2 of Axiom, so that the relationships would be built. And if anybody wants the link to this case file, this is publicly available, as is the scenario. You can also request a solution.
So the first thing we know is we know we have an m57biz.xls document. And because we have the artifacts-first view, you can go ahead and just navigate the documents. Straight to Excel documents, and here's that document. And real quick, in the preview, we can actually see – yep, here's some personally identifiable information: social security numbers, salaries, names, there's an attachment jpeg in there, and there's some really unique socials in here – one, two, three, four, five, six, seven, eight, nine, eleven. So how do we quickly answer the question in this scenario? Could Jean possibly have any implication in the data being sent off of her device?
What I'm going to go ahead and do is just click on the connections icon for that m57biz document. Immediately, I see an email, and I see both instances of the parsed document in the [files section]. I actually want to go ahead and look at that email. So I'm going to go ahead and look at that, and so – yeah. We have an email – "Hi Jean … sorry to bother you, but I need this information now … the VC guy is being insistent … send me the social security numbers of current employees and intended hires …" And the spreadsheet is attached to the email.
We actually, technically, answered the scenario: it is possible that Jean shared this information outside of her computer, which she claims she didn't do, because we can see the attachment there. But we probably want to dig a little bit further. So let's look at the graph, and I'll go ahead and save this graph. So just holding on the node … we'll put it into the saved nodes. Or not. Once we have this node, what I want to see really quickly is that this document was accessed with Excel, it was modified by Alison Smith – so this document was modified by the other user. It exists at this path. It was also modified by the Jean user account. It was transferred to this email address by this user. And it existed in these file paths. So I think we've already got some decent information here, but there's probably way more we can learn. Let's say that I'm really interested – and you'll see the second gray line – in what other communication Jean has sent to Alison.
What I’m going to do now is I’m going to make the Alison node the primary, but I’m going to look at the emails that she sent to Jean. Now I can actually scroll through and actually read through all of the emails just between those two users, and determine if there’s anything else that might be of interest in my case.
Here’s an instance where we’re asking for real data, and so this might be of interest, right? So the whole point is that we can quickly go through, and we now have this narrowed down query so that we can find our evidence of interest.
Actually, I'm just going to go back one. Back two, to that first node that we were at. So we can also answer some other things really quickly as well. We can ask: does this exist under another name? Well, if we go to the source – sources will always include the hash value – I'm going to go ahead and skip to that hash value, and if I look at the hash, you'll see nothing comes off of that hash, so there's no other instance of that hash in this case.
With that, you can see that you can really quickly answer the scenario here, and you can see throughout this entire case, if you look through, there are some interesting emails between Jean and Alison. I would of course go ahead and still look at the file paths for attribution artifacts before I disappear.
And I do have a whole host of link files. And while I've got a device identifier, I'll show you … sometimes you might not want [35:20] – we try to give you that visual cue that it might be a little onerous to go [35:24].
Overall, I hope what you’re seeing is that on the backend, we can actually, by using rules-based methodology, define the relationships between artifacts, provide that information in context, and allow you to see the journey of a file – where it came from, where it went, if it was viewed, etc. – just by looking at the artifacts that exist.
So now I’m going to go ahead and start looking at the questions.
How can I request that evidence, the question and answer you just talked about? George, consider it requested. So if anybody wants that, you can just type into the chat box, and I'll go ahead and send you the link to the site where the Jean M57 case can be found, as well as a link to a share site where I have this case, so you can bring it down. If you don't have Axiom, you can request a trial and then you can just open this in Axiom.
How did I acquire the cloud, i.e. Dropbox evidence? Great question, Greg. I used … the same day, actually, that we released connections earlier this month, we released the cloud product, and we can acquire iCloud, Google accounts, Twitter accounts, Facebook accounts, and a whole slew of other cloud artifacts with that.
What is the average processing time for a case to get this level of information? Thomas, actually a great question, and I meant to answer that, so thank you for asking it. I will tell you that we wanted to get this out as soon as we had the feature done, and this took about a year, because Jamie McQuaid and I actually validated and verified every single intra-artifact relationship, and then all the inter-artifact relationships were also validated by hand to make sure that this was all correct. We wanted to make sure this was [realistic]. So the processing time right now is actually quite slow. When you process a case in Axiom, your processing will finish, you'll be working in your case, and then Connections will kick off. If you don't want Connections to process when you're first looking through your case, you can go ahead and cancel that process, and the next time you open the case, it'll kick off again. Right now, I would suggest running this over a weekend if you want to see those connections built. You'll know that they're complete because you'll have that icon. So it does take some processing time; however, it's post-processing time, after you're already working in the case. We are working on optimizing that, and in our next releases, this'll be optimized.
How did we … we answered the question on how we acquired this. I’m seeing a lot of requests for [the case files]. That’s great. I will happily send that to you.
Does this [indexed data] in unallocated space include searching of attribution links to files or data that is deleted? Excellent question, David. Anything we parse, we're going to correlate on. And we have a lot of carving that we do. The same carving that exists in IEF, if you're more familiar with IEF, exists in Axiom. So we're going to go ahead and run our carvers. Now, when we carve data and have something from unallocated, we will probably, in many instances, not have as many attributes to correlate off of, but any attributes we have, we will. For example, we're still going to hash a file. So let's say we're able to carve a complete file out of unallocated. If we carve that entire file out of unallocated and don't have any MFT reference data, but do have the existence of that file, and that file hash matches a hash on a secondary piece of media or somewhere else, that will absolutely get a correlation. So it all depends on what is found on that device, as far as what is carved. But for any artifact we carve, we're going to go ahead and do that correlation. Let's say we carve a URL or a Google Chrome download. That'll still be associated to that artifact and you would still see all of the other secondary correlations also there. So yes.
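As a small, hypothetical illustration of that point about carved data: a file recovered whole from unallocated space still yields a hash, and an identical hash on another piece of media is a direct correlation even when no file system metadata survives. The function names and labels below are invented for the sketch:

import hashlib

# Hash a carved file and look for the same content on other evidence.
def md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()

def correlate_by_hash(carved: dict, allocated: dict):
    """carved / allocated map a label (e.g. 'unallocated offset 0x4F000',
    'USB: pics/giants.jpg') to the file's raw bytes; returns matching pairs."""
    known = {md5_hex(data): label for label, data in allocated.items()}
    return [(label, known[md5_hex(data)]) for label, data in carved.items()
            if md5_hex(data) in known]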
Just reading through to find questions. They’ve mostly got “Can you please send the case?” Case, case, case, case … okay.
Do you need a separate search warrant for cloud data? Great question, Vic. Cloud data is going to depend on your jurisdiction and where you are, and this is actually a pretty global call, so your rules, where you are, are going to differ greatly. If it's a corporate-owned device, you may not need a search warrant, for example. You may be able to use the credentials for the corporate-owned account, and go ahead and pull cloud data. For example, if your organization uses Google or Dropbox [40:42]. It is, in my opinion, always best to get a warrant if you can for cloud data. And maybe that's something people should start proactively including in their warrants, so that they can find more forms of data. But it really does depend on your jurisdiction. I know in the US there are states in which it is common that people are going after cloud, and that there are states in which it is not common that people are going after cloud. I do know that in the international community, there are some countries that allow you to pull data from the cloud if the device at the time of seizure was connected to that cloud account. And there are other countries where it says that you can never access cloud data. And a lot of these rules are being built right now, and precedent is being set, and there isn't really a lot of good case law in those countries to establish this yet. It's kind of a really new, challenging area, from a legal perspective. And since it is challenging from a legal perspective, if you're not the data owner, I think it is always best to try to get a search warrant.
When I open an existing case, it starts to build connections, but it runs for a long time without completing. Yeah, [Aaron], you know what – it says here that you sent logs to support. I'm going to go ahead and check on that case for you. Do unstructured data sets work as well, like [42:13], pagefile.sys? Good point, [Aaron]. Yes, unstructured data sets will work as well.
One of the things to note about Axiom, and IEF as well, is that we actually don't care if we recognize the file system in order to go ahead and run our carvers. So we'll go ahead and still parse contextual data out from the case, even if it's from a memory dump, a PCAP, a video game system – anything where we don't necessarily recognize the file system like we would on smartphones and computers. So even with that, especially if you have that data in conjunction with other evidence on the case – like I had shown here, where this one has one piece of evidence, and on the other case I have multiple pieces of evidence – you'll definitely then get to build some of those cross-correlations. But absolutely, this will still run, regardless.
Okay. A lot more requests for the [case file]. Oh … how about … I believe I’m pronouncing your name correctly. If you’re not seeing connection icons – this was introduced with version 1.2. I would make sure that you’re on 1.2. [Vic], very good point – consent can also be of help. For those people who asked for the answers to the scenario, you can request them from that site.
Can [43:44] location connections? Great question, Patrick. We do not yet have geolocations being correlated in here from call data records, because we’re not currently bringing in CDR data. However, geolocations themselves do match. So if it’s an exact match on the geolocation, it will correlate as an attribute.
For a typical image size, what is the estimate of the size of additional storage of Axiom-generated files? So … [William Wong], that's a really good question about typical image size versus how big the case file is that's created. So the MFDB, or Magnet Forensics Database, is based not on the size of the image but on the amount of carved artifacts. So it actually really depends on the density of the data on the drive. We do reference a lot of items off of the regular image for the file system view, and where you really typically see an enlargement is if you're creating a portable case external to Axiom to share with somebody else, and in that portable case you're exporting out all of the videos and pictures, because then those videos and pictures will be stored in the case. So it really depends on how you're using Axiom, what size the MFDB will be.
That is true – [Daniel] brings up that a search warrant for a phone does not include cloud storage for the device, and that's a hundred per cent correct. You want to make sure that you are asking for cloud data in your warrants.
Are there any additions to the reporting options that allow for the new Connections feature? Great question, David. Right now, you can print out the graph itself by right-clicking on the graph. However, we are working in the future to incorporate it specifically in the reporting.
Is the correlation between different items of evidence automatic? Yes, it is, because we're basing it on those relationships – the actual attributes being the same – and correlating between them.
Does the connections deduplicate files [45:59] number of times the same file appears? You will actually get a separate instance in the right-hand pane, in the Details pane, for each instance of the file as it exists, depending on how you process your case. So if you process your case to dedupe, you won’t have those duplicates; if you didn’t process your case to dedupe, you will.
Can you … are there any restrictions on the number of cases that could be ingested into the software? Very good question, Abhinav. So yes, there are. Our backend is a SQLite database that's doing all this correlation. So this is not for cross-collection analysis. I would not put petabytes of data in and attempt to try to run that with Connections. It will not work. I typically try to keep under 4 TB of images in a case.
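For context on why a SQLite backend sets a practical limit, here is a toy schema and query – invented purely for illustration, not the real MFDB layout – showing attribute correlation expressed as a self-join. The join grows with the number of artifact attributes, which is why artifact count, not image size, is what matters:

import sqlite3

# Toy schema (not the real MFDB): one row per artifact, one row per attribute.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artifacts  (id INTEGER PRIMARY KEY, kind TEXT, source TEXT);
    CREATE TABLE attributes (artifact_id INTEGER, name TEXT, value TEXT);
    CREATE INDEX idx_attr ON attributes (name, value);
""")

# Two artifacts are connected when they share an attribute name and value
# (same file name, same file path, same hash, same account, and so on).
connections = conn.execute("""
    SELECT a1.artifact_id, a2.artifact_id, a1.name, a1.value
    FROM attributes AS a1
    JOIN attributes AS a2
      ON  a1.name  = a2.name
      AND a1.value = a2.value
      AND a1.artifact_id < a2.artifact_id
""").fetchall()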
I'm going to make sure … please, I will send you this case. I want you to take the case file that's already built and go ahead and try to build it with this case. I want to see if there are any issues with that.
Does it require an internet connection to run? No, this does not require an internet connection to run. You can run this entirely on a standalone system.
If files are downloaded directly onto some external device, or the destination of the files is changed after downloading, and there's no cloud service active on the system, then what effect would that have on the information? So, if I'm reading the question correctly, Abhinav, I'm not sure if you're asking whether the cloud image is done to an external device, or if you're asking about a case where you have data that was downloaded to an external device and the destination file was changed after downloading. You would probably still have some artifacts. For example, if it was an internet download, you'd probably have that in a cache, possibly. You may have some link files to the external device where it was. You may have some operating system artifacts that may also cue into the fact that that file existed. If the file was changed after downloading, you'll have a difference in hash value but you may have the same name, and you will be able to follow that path based on the name. [I've potentially shown you that.]
If I heard you right, you can't run … [pauses] Is there a limit to terabytes [48:54]? Okay, George, I'm going to ask you to … you can absolutely … I'm actually not clear on the question. If you can't run … so you can run a cloud case on anything that you have legal authority to pull stuff from the cloud for. So that's 100 per cent, George, just based on your legal authority. If you have legal authority and you have the cloud module for Axiom, you can pull that data. It's not limited to the type of case, it's just your legal authority.
Is there a limit to the terabytes on connections before it stops working? It's limited based on the number of artifacts, because of the SQLite database. And just to be clear, when I'm talking about the cloud, I'm talking about cloud acquisition with Axiom, not any parts of your data being stored on a cloud from a case perspective. We're not doing any kind of cloud storage for your cases – just making sure I'm clear about that.
So the 6 GB case – I can tell you exactly how long it took connections to run through the 6 GB case. It took 36 minutes, because I did it recently. So it took 36 minutes. I actually did it live. It’s going to depend on your hardware. So I did it live on a customer’s machine. I ran that case on Thursday. I actually have no idea what the specifications on the hardware on the machine was, because I was in their lab and their environment, and they had downloaded the case, and we just ran it, and it took 35 minutes.
In Axiom processing, during the case setup, there's a Scan Information section. What's that? Great question, Chris. The Scan Information section allows you … in Axiom you can actually reprocess or post-process a piece of evidence or add new evidence [50:57], and that would be a second or third scan. So we allow you to put information in there just for your case file.
How much storage could that take up? Well, you might check offline. I'll go ahead and just pull up the case file, and I'll let you know in a second email how big the case file was – the MFDB – for that 6 GB image. That's easy, I can do that right after this call.
Are there any other questions?
Well, thank you. I'd like to thank everyone for joining and taking the time to join the call. Especially because there were a couple of questions regarding search warrants and cloud, I would like to mention that we have a white paper coming out this week, called '12 Tips for Presenting Evidence in Court', that might be useful to those of you who have asked. And I really hope that you find some value in automating things that we all do every day as part of our cases. So thank you very much. Have a great day.
End of Transcript