Julie O’Shea: Hi everyone, thanks for joining today’s webinar: Timesaving Forensic Techniques For Your Next Case. My name is Julie O’Shea, and I’m the Product Marketing Manager here at BlackBag.
Before we start today, there are a few things I would like to review. We are recording this webinar and we will share the on-demand version when the webinar is complete. If you have any questions, please submit them in the Questions window and we will answer them throughout, or in our Q&A at the end of the webinar.
I’m excited to introduce our speaker today: Ashley Hernandez. Ashley is the Director of Product Development here at BlackBag and has worked in digital forensics for over fifteen years. She has taught and certified investigators in digital forensics and security topics, including speaking at many digital forensics and law enforcement conferences.
Thanks for joining us today, and Ashley, if you’re ready, I will hand the reins over to you to get started.
Ashley Hernandez: Thanks Julie, and thanks everyone for joining us today. Today’s topic is going to be covering some techniques to use to make sure that you’re approaching your case smartly and you’re able to get to the data that you need quickly and easily. We’re proud to have released BlackLight 2019 R1 this week, and I’m going to highlight some of the new features that we’ve added, that we think will make your job go more smoothly, and you’ll be able to get to the data you need more quickly.
So just a little overview of where we’re going to go in the next section of time: there are some general techniques that are going to work regardless of your tools; those are mostly covered in the first parts of this presentation; and then there’s going to be some specifics that we’re going to talk about, including information about our new image categorization; our ability to export to logical evidence files; and our new index search – we’re going to cover how those are specifically designed to speed up your investigations.
So we’re going to cover seven different techniques in this presentation; there’s going to be a combination of Powerpoint and then some live demo, and we will be taking occasional breaks for you guys to ask questions, so please use the Question window if you have any questions.
The first topic that we’re going to talk about is beginning with the end in mind: making a plan. So it’s a highly effective strategy for general productivity, but it’s especially important when we’re looking at starting an investigation. I pulled this thread from Twitter that also inspired a blog post by Brett Shavers; I’ve listed the DFIR.Training link down below if you want to dive more deeply into this discussion.
But one of the ways that we set ourselves up for success when we’re looking at starting a case and making sure we’re using our time and resources wisely, is by going ahead and making a plan. So, knowing what the objectives are for the case, both from the person that you’re accountable to, whether that’s maybe an internal HR department if it’s a corporate use case, or from the district attorney if you’re looking at prosecuting specific types of crimes. Knowing what their bar is for evidence and what they want to see will really help you keep the end goals in mind as you’re looking through how you want to approach the evidence.
Also, taking down some notes about the context of the evidence. This can be everything from information you know about the person who owns the device that you’re looking at, to information about the device itself, like the operating system. As you can see from this Twitter post from one of the folks who started this discussion, they start with the registry, which is a great place to start on Windows investigations, but not as awesome for a Mac investigation. So we want to make sure that we are understanding the type of device that we are working with and things that might be in play, like encryption or other pieces that we might need to get past.
And the last point [is], what are your time constraints? These could be resource constraints, these could be time and cost constraints. Sometimes they’re not actually looking for everything, they just want enough to move to the next step; or there might be places where they don’t want to see anything from unallocated, things that have been carved out, because it’s too hard for them to move forward with the investigation. So they might have some parameters that can help limit what you need to do.
Another example of this would be time constraints, search warrants, and other instances where you’re only allowed to look at certain amounts of data.
So there were a couple of other things on the thread, I recommend that you go and check out Twitter for some of this active discussion, as well as Brett’s post; but just finding out how you want to start is a great way to set up what you want to do.
So I’m going to show a quick example of this in BlackLight. And one of the new features that we added inside of BlackLight with this release is the ability to add investigative notes. So you can see that here, on the bottom left of my screen. And I just went ahead and wrote some intake notes here.
So this is going to work similar to a notepad. These investigative notes are kept with your case, but they’re not included in the report. This is more for you to just capture your thoughts; maybe copy and paste some different information as you go through, doing your investigation, to remind you where you’re at. You finish at the end of the day, and you know the next lead that you want to follow up on is you want to check images for a specific thing; you want to say “I want to do that next.” You can just leave these here, so you have some notes for you. You can make as many of these as you like; they will not block the interface, so if you want still to interact with the interface, you can have this up, and just have a scratch sheet that then goes along with your case. It’s just a nice place to have your notes and preparation for what you’re going to do.
So we’ve got our plan, and now we’re ready to get started. The next tip that I want to focus on is making sure that you know your tool. And this is going to help you, both in knowing what your tool can do, and knowing how long specific things take to run. Ideally, like in BlackLight, you can run certain options upfront to start your investigation, and then run longer options further down the line.
Some functions just take a long time because they have to touch every file: things like hashing, if you’re going to do that across your whole case, that’s going to take a little while because it’s going to read all the bytes: there’s not a way to super-speed that process up.
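As a rough illustration of why hashing cannot be sped up much, here is a minimal Python sketch of chunked hashing. It is not how BlackLight hashes evidence; the function name `hash_stream` and the chunk size are assumptions made for the example. The point is only that every byte has to pass through the digest.

```python
import hashlib
import io


def hash_stream(stream, algorithm="sha256", chunk_size=1024 * 1024):
    """Hash a file-like object by reading every byte in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream: every byte has now been fed to the digest
        digest.update(chunk)
    return digest.hexdigest()


# Hypothetical stand-in for an evidence file: an in-memory byte stream.
sample = io.BytesIO(b"example evidence bytes" * 1000)
print(hash_stream(sample, "md5"))
```

The loop runs once per chunk, so the cost grows with the amount of data in the case, whichever algorithm is chosen.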
Other processes, though: maybe you know your tool does this faster or better than other tools, and knowing where you can get information is really important.
Another thing that you can look at is if you know that there are specific areas, like the Registry, that you want to dive into first, and that you want to look at quickly, you can make templates for you to look into those areas right off the bat, if those are things that you always do as part of your investigation, you can go ahead and look at those pieces first. So the more you’re able to customise your tool, and know where those options are, the faster that can go.
So when you’re jumping into BlackLight, you need to be as wise as Indiana Jones and choose wisely what you’re going to start with. So I know that over here on the Ingestion Options, there’s something that says ‘All.’ But likely, ‘All’ is not what you want to run when you’re working a case that is time-sensitive. ‘All’ will work if you’re going to process everything over the weekend, when you have a backlog queue and you just want to jump in and work the case once it’s done all of its processing, but if you’re looking at a time-sensitive type case then you’re going to want to start with a subset of options. And you can always run any of these other options later. So I’m going to show you a little bit about how those two pieces work.
So if we’re in BlackLight we could do that at the processing options. There’s also some quick places that we could look at if we are interested in starting off in a specific part: so, actionable intelligence – I know that the particular one that we looked at had some info about wanting to start in the registry; that might be one place that you want; if you’re looking at a security incident you might want to look at file downloads. Knowing where to go when you start is a great way to know what you’d like to do. But if you were interested in the registry, one of the items that we have here – I have an evidence file that has both a Mac and a bootcamped Windows volume on it – is I could look through and, if I know I always want to get some basic security information about a user, I could go ahead and say “This area of the registry is a significant area for me.”
So over here on the right, I have the option to say, “Whenever I do an investigation, I think the user’s information in the registry, under the SAM section, even though it might be pulled out in my main tool, is something that I want to verify.” And if I can make it as significant, now I can come and look at all the different places in the registry that I think of as significant, and I can look through those automatically to pull out stuff like, let’s go ahead and see what the last logged in date was, or when he last changed his password. We can take this, and again if I’m just trying to keep some notes for myself, I might go ahead and say “Well, I saw that the last logged in date was that.” And then I just have it for a note. I’m not necessarily wanting to put that in my report; but I have a note for that, as some place that I might want to come back to later is when I last saw activity on this machine.
So, first few steps are: make a plan, and know how to work your tool to do that for you. So, do we have any questions, Julie, about those first two pieces?
Julie: Yes, so we have a few that came in. The first one: Are the investigative notes saved with the case, and can anyone see them if I share the case file?
Ashley: So, the investigative notes are saved with the case. They’re not part of the report, although you could take these and tag the data that you have in here, if you want to copy and paste them in; but they are part of the case file, so if you shared the case file they would be there; it’s not specific to the user.
Julie: Great. Can I save my favourite processing options, is our next question?
Ashley: Yes, you can save your processing options. So when we look inside of our evidence status here, you can see that I’ve run some options on certain files and not on others. But as part of our case template – so up here we see File>Save Case Template – you can choose to save things like tags, file filters, or searches that you’ve done; and if you edit ingestion options, this will let you pick what you want to have, right out of the gate, for the different items. So if you wanted to have a custom thing where you always ran picture analysis, and you did file identification and hashing, you could go ahead and set those as your default for ingestion options.
Julie: Thank you. And the last question in this section: Are there any options I can’t run later?
Ashley: Pretty much all of the options you can run later. So you can choose to run these options incrementally. The one exception is some interesting behaviour around APFS and unallocated space, because that is a special area of the drive, but in general, on most evidence types, any of these could be run later. So, start with what you know is necessary, and you know how it works and how to use it; and then if there’s longer processes, you can usually run those later once you’ve got a triage of the device that you’re looking at.
Are we ready to move to the next section?
Julie: Yes, thank you.
Ashley: Alright. So the next thing we want to do is narrow down what we’re looking at. And traditionally, this has meant filtering; maybe some hash sets – we’re going to show how those work – but this is trying to focus your attention, not on the whole drive or the full set of evidence, but really looking in on a specific area. So we’re going to show just some basic saved filters, I’m not going to build all of these for us now, but we’re going to show how that works and how you can make sure that you’re using your hash sets to exclude anything you already know is on an operating system – it’s surprising how much that actually impacts the amount that you’re working with – and then we’re also going to highlight in this section how you can use the logical evidence file format to export files. So again, we want to limit how much we’re working with, and make sure it matches the scope of what we’ve been asked to do.
Alright. So, coming back in to our view here, we’re going to go up to the Filter view, because we want to work on some specific… looking for specific types of files. So I’ve actually saved a filter here that has a few different levels, and the first thing you’re going to notice is that I’ve used my hash sets here. So if you guys are not aware, in BlackLight, you can go to hash sets and it will actually allow you to download the known Windows and Mac hash sets. And both of those I have actually run the known file hash sets against my case. So when I look at these PDFs here, I can see that I chose to look for anything that is a PDF – not by extension, but anything after signature analysis that’s a PDF – I want to make sure it’s not an empty PDF, and that it’s not an already known file. So I don’t want to see things that are known files. So I’m saying all of these conditions are met:
– it’s a PDF
– it’s not empty
– and it’s not part of the Windows or Mac operating system.
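The three conditions above can be sketched in Python as a single "all conditions met" test. This is an illustration only, not BlackLight's filter engine: the `FileRecord` fields, the `matches_filter` name, and the placeholder hash values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    name: str
    detected_type: str  # type found by signature analysis, not by extension
    size: int           # size in bytes
    md5: str            # hash used to look the file up in known-file hash sets


def matches_filter(record, known_hashes):
    """True only when all three conditions from the filter are met."""
    return (
        record.detected_type == "pdf"       # it's a PDF
        and record.size > 0                 # it's not empty
        and record.md5 not in known_hashes  # it's not a known OS file
    )


# Hypothetical records; the short hash strings are placeholders, not real MD5s.
known_os_hashes = {"bbb"}
records = [
    FileRecord("report.pdf", "pdf", 2048, "aaa"),
    FileRecord("empty.pdf", "pdf", 0, "d41d8cd98f00b204e9800998ecf8427e"),
    FileRecord("os_help.pdf", "pdf", 4096, "bbb"),
    FileRecord("notes.docx", "docx", 1024, "ccc"),
]
to_review = [r.name for r in records if matches_filter(r, known_os_hashes)]
print(to_review)
```

Only `report.pdf` survives: the empty file, the known operating system file, and the non-PDF are each removed by one of the conditions.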
So the first thing I can do with those – you can see there’s 117 here – I could choose to select all of them and tag these items, so I could tag these files as PDFs that I need to review, which I actually already did before: you can see them showing up here on the ‘PDFs to review’ as the items that I have selected. And I can also tell that they’re already tagged, because I can see the tag item – the column here shows that these ones are tagged. So I could do that for PDFs, to look at those; I could also say we have a lot of these common things here… so maybe you’re looking at an image-based case; or maybe you’re working on a case where you’re looking more for documents – Office documents which are separate from PDFs, I could filter for those – and that would allow me to look at just the 39 documents that are not part of the known set.
Again, I can highlight all of these, I can choose to tag files, and mark them as tagged. So I went ahead and did that. But I did want to show you the difference between when you’re using these hash sets, and not. So here I have 39 documents, Office files, that are found that are not part of this set. But if I go ahead and remove these hash sets, you’ll see that there’s 57 of those ones; and if I were to do the same one that I did before, with PDFs… I think that was the one where there was more of them… with the hash set removing items, there were 78, but without it, there’s going to be 1654. So a lot of these were actually PDFs that were installed as part of the operating system; definitely 78 is a lot less that I have to review than 1654. So if you’re able to eliminate some of these broader areas, it will really save you time on the system.
And you guys can create additional hash sets beyond just the ones that we ship with, but those are ones that are updated for both the Mac and Windows system.
Another thing that we can do is go ahead and combine items that we’ve tagged. So I’m going to make a new filter here. And we have this tag name, and I’m going to say I want to tag anything that has the PDFs – maybe I want to go ahead and work with both the PDF tags, and the ones that were the Office files. And I want them to be – instead of it’s in both, it’s a PDF and an Office file – that likely isn’t going to happen; I want to make sure that it’s either in the PDF tag or in the Office tag.
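The difference between "in both tags" and "in either tag" is the difference between a set intersection and a set union. A minimal Python sketch, with made-up tag names and file paths that are assumptions for the example:

```python
# Hypothetical tagged file sets, keyed by tag name.
tags = {
    "PDFs to review": {"/Users/a/invoice.pdf", "/Users/a/manual.pdf"},
    "Office docs to review": {"/Users/a/budget.xlsx", "/Users/a/letter.docx"},
}

pdfs = tags["PDFs to review"]
office = tags["Office docs to review"]

in_both = pdfs & office    # "all conditions met": a file would need both tags
in_either = pdfs | office  # "any condition met": a file needs only one tag

print(len(in_both))    # no file is both a PDF and an Office document here
print(sorted(in_either))
```

With these sample sets the intersection is empty, which is why the filter needs the "either" form to return the combined group.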
So here I’m showing the group of those two together: I’ve found all the Office documents and all the PDFs. Now, maybe that’s all that they wanted to see; they might want to add a date range to that, or maybe some other stuff, but if they just wanted all the docs and PDFs that are not part of the operating system, I can now choose to export those into a set using our export selected files as a logical evidence file. You’ll notice the extension here says L01; these are actually the EnCase logical evidence file format, which is a forensic format that EnCase created originally, but many of the forensic tools support it, so this can be ingested by most any of the forensic tools, including BlackLight; but it’s also supported by tools that support review, like Relativity or ReclaiMe. Being able to bring these in, all of the metadata that we see here is preserved, and it’s preserved in a container file.
So when we choose to export files, we’ve made it really simple in that we’re just going to ask you to go ahead and export it, and we’re going to take the files that are in here and export them into a logical evidence file, preserving their folder path. So the other piece of logical evidence files is, sometimes when you’re looking at the difference between Mac and Windows, Mac and Windows have different limitations on how far down into the file system structure you can read. So by having it within a container – kind of like if you had it inside of a zip – you can actually have those structures preserved as they were originally on the disk.
So if you export from the filtered view, even though you haven’t chosen to export the whole folder structure, it is still preserved: when we export logical evidence files, we automatically preserve the folder structure, because we assume that you always want that.
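To make the container idea concrete, here is a small Python sketch that uses a zip archive as a stand-in, following the zip comparison above. It does not produce an L01 file and is not BlackLight's export code; the `export_to_container` name and the sample paths are assumptions. It only shows that a container can carry each file's original path, including very deep paths, regardless of the host file system.

```python
import io
import zipfile


def export_to_container(files):
    """Write {original_path: content} into an in-memory zip, keeping full paths."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for original_path, content in files.items():
            # Store the path relative to the volume root inside the container.
            archive.writestr(original_path.lstrip("/"), content)
    return buffer.getvalue()


# Hypothetical selection from a filtered view, including one very deep path.
deep_path = "/Users/suspect/" + "/".join(["nested"] * 60) + "/plan.pdf"
selected = {
    "/Users/suspect/Documents/notes.docx": b"docx bytes",
    deep_path: b"%PDF-1.4 sample",
}

container = export_to_container(selected)
with zipfile.ZipFile(io.BytesIO(container)) as archive:
    stored_names = archive.namelist()
print(len(deep_path), stored_names[0])
```

The deep path is longer than the traditional 260-character Windows path limit, yet it is stored intact because it lives inside the container rather than on the examiner's file system.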
That would be the same as, if we moved over onto the Browser view, and maybe you just wanted to export a user’s directory, you could go down to the user’s directory and again choose to export the following files as a logical evidence file. And then just bring those in as you would any other logical evidence file that you’ve worked with.
Any questions on the logical evidence file stuff, Julie?
Julie: Yeah. We have had a few questions come in. So the first one is: Are the logical evidence files encrypted?
Ashley: So, by default in our implementation we do not allow for encryption of logical evidence files, so you are not going to be able to do encryption within the file itself. You could always put it on a drive, or put it within a container that you’ve encrypted.
Julie: And the next question is: What type of files can I export to logical evidence files, and does it support email?
Ashley: So currently, in this iteration, we only do files like you would see on the system, not files extracted during parsing, like emails within PSTs. Now Mac’s a little different because EML and EMLX files sit on the file system as they are. You could export those, because they are files, but say you’re looking for data that was parsed out of a .dat file, that is not going to be exported currently in the logical evidence file. We would end up exporting the index.dat file, or the whole PST, at this point. So we do know that that’s a request, and we’ll be looking at that a little bit later down the road.
Julie: And the last one that came in about this section: Which tools support importing logical evidence files?
Ashley: So I mentioned a couple of the review tools, which are Relativity and ReclaiMe, but I do know that EnCase supports it, and I believe X-Ways does, and some of the other forensic tools that you might have. Many of the tools do read the logical evidence file format. What is new for BlackLight is our ability to create these files, rather than just read them. And a note of clarification, since I saw a question: Are we creating the L01 or the LX01? EnCase has done an update to their standard called LX01; we’re just doing the basic support for the original evidence file, the L01. That’s why you don’t see an option there to choose between L01 and LX01.
Any other questions?
Julie: I think that’s all for this section.
Ashley: OK, perfect. Alright. So we’ve narrowed down our scope; we could bring in those logical evidence files; maybe you just want to make a logical evidence file of all the graphics, that would be one way that you could filter down to what you’re looking for. We might still look at filtering down by date specific files, so I’m going to show you quickly how to make a filter that will just look across a set of dates. But one of the techniques that is new, is our new image categorization that is in BlackLight 2019 R1.
So we have a whole webinar on Image Analyzer’s integration into BlackLight, so please check that out if you want more details. But I want to go into some of the highlights of why we’re so excited to have image categorization.
So when we look at image cases, there’s just so many images that are out there. And there’s been some great things around hashing, and even tools that help you identify known hashes of files that have exploitative images in them. But we want to be able to provide the capability to highlight items that might be categories that are risky on a device, whether it’s for law enforcement looking for specific types of images, or for corporate use cases where it could look for improper use of a device by looking at pictures they shouldn’t be looking at at work. We want to make sure that it’s easy for us to find those pictures that are more likely to have that kind of material. And we want to do that using machine learning based technologies, so it’s looking at new things. It’s actually analysing the images across your case to see if there’s any of this type of content.
So I’m going to focus in this section on how we can do that type of investigation. So, say you’re told, “Hey, I want to know also if there’s any pictures of weapons, of the subject holding weapons,” or “I want to see if they’re on probation and they’re not allowed to be using drugs, I want to see if there’s anything with drugs,” might be a law enforcement use case for this piece. But it might also be just a corporate use case of “I want to make sure people aren’t looking at explicit material during work hours, can you run a search? There’s been a complaint that this particular person is doing that in an area where everybody can see them, during work hours. So can you show me if there’s material in that area?”
So one of the things that we did actually improve, beyond just our ability to add Image Analyzer, is we’ve done some rework here to make it just a little simpler when we’re reviewing pictures: now pictures is our first tab. But you’ll see that we have the ability to now sort and bring up to the top of your investigation specific risky categories. So if you are law enforcement and you want to see if there’s any pictures of weapons, whether it’s of a shooting event if you want to see if there’s information on their system; or as I mentioned, maybe images of drugs; these are the categories that we currently have. I’ll show you as well… we have them listed here.
So here we have all of these different categories available, as well as CSAM is coming soon. And you can select them from this drop-down. So I’m going to sort by weapons first. And then I can choose to either tag these items… there is always the way for you to tag, especially if you’re reviewing a specific type of images; adding additional tags for each type of violation allows you if you’re then going to produce it, to produce it across multiple… by doing that filtering where you stack your tags and to say “Well, I want to see all that were category 1,2,3 of the different types of filters.”
So one of the tricks to doing selection of these a little more quickly, is you can do Sticky Select, which allows you to select pictures without having to hold down any shifts; so that’s available in the Gallery view. But as you can see, these categories really do help you to bring up these images. I didn’t have to narrow down the case by date range, or looking into specific folders; these were able to bubble up to the top. So Image Analyzer is going to continue to improve their model, so you’re going to get updates to our support for these categories, as well as adding new categories as we go along. But I think if you try running these, you’re going to be very happy with how well these identify different categories of threats. Some of them are a little more complex, so you’re maybe going to get more accurate results for things like weapons, which are easier to identify, whereas extremism is a little more rare, and those might be more difficult. But that is one thing that we’re hoping will help, especially with investigators who have to look at all of those images that they may not want to be looking at.
You can always still filter out the items that were in the known hash set; these ones, none of them were in the known hash set, but we can always choose here to apply a filter from our Filters view. So if you had under your File>Filters view running that hash set, so anything in the known hash set would be eliminated from this view, you can go ahead and choose to run that on top of it, so that will reduce more the number of images that you need to check.
Alright, Julie, any questions on that section?
Julie: Yes, we had a lot of great questions coming in about this feature. Are there some new categories that will be coming?
Ashley: So, I mentioned the CSAM, the exploitative material, is one that is going to be coming for children’s images. We’ve also had a request for documents of identification, or items around currency; so those are some of the most popular requests that we get right now, and if you want to request additional categories, you can go to our product request feedback form on our website, and go ahead and request them there, and then we’ll be working with Image Analyzer for them to develop models that help identify those images. Any of these models don’t actually contain the actual files or images themselves – it’s used based on some algorithms and so we’re able to update those rather quickly.
Julie: Great. Does it cost extra?
Ashley: So Image Analyzer and the Image Categorization is built in to everyone who is current for BlackLight, so if you’ve downloaded 2019 R1, it is there, and you don’t need to do anything else to make it work.
Julie: And does it run on carved files?
Ashley: So, it does run against the carved files. You might need to process again after that, but once it’s carved something that it’s identified as an image, then it can run against those images.
I did have a question about what happens if the image hits on multiple categories. So I’ll show you a second here… the metadata information will have it down on the… you can either do it here on the bottom left, or in the metadata down below. So this one hit 100% on weapons. If one of the algorithms hits 100%, it’s not going to do the other algorithms against it, because it says it’s 100% that particular category. Now if it’s more varied, like 99.9%, it would still run the other algorithms. So let me see if I can get any that have multiple here…
It looks like most of these are pretty high. You can see here that it also hit for possibly having drugs or extremism; you’re going to see those items set up across the categories. Only the things at 100% match for something that’s a weapon or something that’s drugs, would it not run the other categories. Otherwise you’re going to see all of the categories assessed. And right now in the product, you do have to run all the categories; you can’t choose to only select specific ones.
I think there was a question about setting up a date filter, so I also want to show how that can be done. I did create a date filter in mine, and then I will also show you how you can do this in the next section, when we talk about searching.
So if you wanted to set a filter for a date range, this is going to look across any created, modified, accessed, or date added. So say I knew my investigation happened, and my scope was only for the year 2017, or if you’re looking for a networking event you might actually have it even smaller, to a specific day. But you can run this filter, which, this one’s going to return a lot more files. So if I run the year down this, I might create a logical evidence file of it… it only had dates in this particular range.
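A date-range filter of this kind can be sketched in Python as "keep the file if any of its timestamps falls inside the range". This is only an illustration: the `FileTimes` structure, its field names, and the sample dates are assumptions, not BlackLight's data model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class FileTimes:
    name: str
    created: Optional[datetime] = None
    modified: Optional[datetime] = None
    accessed: Optional[datetime] = None
    added: Optional[datetime] = None


def in_date_range(record, start, end):
    """True if any available timestamp satisfies start <= timestamp < end."""
    stamps = (record.created, record.modified, record.accessed, record.added)
    return any(ts is not None and start <= ts < end for ts in stamps)


# Scope: the year 2017 (start inclusive, end exclusive).
start, end = datetime(2017, 1, 1), datetime(2018, 1, 1)
files = [
    FileTimes("old.jpg", created=datetime(2015, 3, 1), modified=datetime(2015, 3, 2)),
    FileTimes("edited.doc", created=datetime(2016, 12, 30), modified=datetime(2017, 6, 5)),
    FileTimes("no_dates.bin"),
]
in_scope = [f.name for f in files if in_date_range(f, start, end)]
print(in_scope)
```

Because a single matching timestamp is enough, a file created in 2016 but modified in 2017 is still in scope, which is one reason a filter like this can return more files than expected.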
I could choose to tag these, and say “These are just the particular files within this date range,” and then I can add to that additional filtering if I want; I think that was part of the question. But once you have this built in, when you go to the Media view, now you can choose to run that current filter and it’s going to narrow down my pictures to things in that particular date range.
So you can combine the filtering with the image categorization, you can do those things together.
And then those were part of the saved searches that were saved when we did the saved case template, so you could choose to save those as part of the template.
Alright. So I think we’re going to move on to the next section, so we don’t run out of time, and we’ll get to some other questions if we have them.
So the next thing that we added in BlackLight 2019 R1 is the index querying. So there’s kind of, in forensics, three general types of searching that happens. And each of them is really powerful for a specific use case. Understanding when to use each one is really what can save you a lot of time, because searching is just an intensive process, and we want to make sure that we don’t miss any data, and that we’ve done it correctly, so that we can confidently say “This is the data that you were looking for” or “Yes or no, those names did or didn’t exist.”
So we’re going to focus on two of these types: one is sometimes called raw keywords, or raw data searching. That is going to encompass anything where we’re looking for directly, in a stream, without trying to break it up with any patterns, specific words or characters in a specific order. It’s not about looking for things that are chopped up by words.
And then regular expressions are patterns. So email addresses, where you know you’re going to have something.something@something.com, or you’re going to have something similar to that. There are known regular expressions for those; for IP addresses; for social security numbers… where it’s a pattern, it could be a combination of numbers and letters and special characters, those are all done through… kind of more like the raw keyword searching, but a layer on top of that is called regular expressions.
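As a hedged illustration of pattern searching, here are three deliberately simplified regular expressions in Python. They are not the expressions BlackLight ships with; real-world patterns are usually stricter (for example, this IPv4 pattern does not check that each number is 255 or below), and the sample text is invented.

```python
import re

# Simplified patterns for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

text = "Contact jane.doe@example.com from 192.168.1.10, SSN 123-45-6789."

emails = EMAIL.findall(text)
addresses = IPV4.findall(text)
ssns = SSN.findall(text)
print(emails, addresses, ssns)
```

Each pattern describes a shape (letters, digits, and separators in a set order) rather than a fixed string, which is what separates this from a plain keyword.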
And then the last type of search theme, which is new for BlackLight in this release, is index querying. So with index querying we’ll talk about the advantages, and how that works in the next couple of slides.
So let’s go over why raw keyword searches have been the preferred and most used option that we have available.
Raw keyword searching is the type of searching that you do against a drive where you want to look at every single character, in a row, and see if it matches what characters you’re sending it. I want to find ABCDE, all together, in that order, on the drive, regardless of whether it’s in unallocated space, or part of the metadata, I want to find all of those characters as I’ve set it. And I don’t care if there’s a space afterwards, or more letters after it. I just want to see those characters, in a row.
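A raw keyword search can be sketched in Python as a scan over a byte stream that reports every offset where the exact byte sequence appears. This is a simplified sketch, not BlackLight's search engine; the `raw_search` name and chunk size are assumptions. The small overlap carried between chunks makes sure a keyword split across a chunk boundary is still found.

```python
import io


def raw_search(stream, keyword, chunk_size=4096):
    """Return every byte offset in the stream where keyword occurs."""
    if not keyword:
        raise ValueError("keyword must not be empty")
    overlap = len(keyword) - 1
    offsets = []
    tail = b""    # last few bytes of the previous window
    position = 0  # absolute offset of the start of the current window
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        window = tail + chunk
        start = 0
        while True:
            index = window.find(keyword, start)
            if index == -1:
                break
            offsets.append(position + index)
            start = index + 1
        # Keep just enough bytes to catch a match that straddles two chunks.
        tail = window[-overlap:] if overlap else b""
        position += len(window) - len(tail)
    return offsets


# Hypothetical "drive": junk bytes with the keyword embedded twice.
drive = io.BytesIO(b"xxABCDExx" + b"junk" * 10 + b"ABCDEF")
hits = raw_search(drive, b"ABCDE", chunk_size=4)
print(hits)
```

Nothing here depends on files, words, or spaces: the scan only cares that the bytes appear in that order, which is why the same approach works in unallocated space or memory.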
So you can search anywhere on the drive with keyword searching, typically. There can be exceptions, where you’re looking inside of compound items, where we’ve had to pull that expression out. But you can look for things like patterns, and you can look for things in memory, where it can be more… a lot of junk, then actual words, then a lot of junk. It’s easier to identify looking for those specific values using keyword searching.
I’m going to quickly show you how we do keyword searching in BlackLight, and some ways to see data, but the only limitation on keyword searching is, every time you want to look for something new, you’re going to have to run that against the whole drive. So, if you have 100 keywords that you run upfront, that you always run on a case, then you can put those in to the keyword search and it will look everywhere for them. But then, if you want to do 101, the next one, you’re going to have to re-search the drive. It’s still going to give you the most comprehensive results; but it is going to take a little bit more time once you add additional words to it.
So within BlackLight, we have these two types of searches, and the content search is what we would do for keyword searching. If I want to add a keyword search here I can choose where I want to search, I can search in all my files, I could search against specific areas; I might want to just search the Windows and the volume shadow copy that I’ve brought in from Windows. And I can even narrow it down by path: maybe I only want to search the Users directory, I can do that. I can also choose, do I want to search content or file names, or only file names? I’m going to choose content here. I can choose things like case sensitivity.
Deep search is helpful when you have files like .doc and .docx. So with .doc the text is on the drive, but with .docx you have to look inside the container. And then another thing we could do, if we made that date range search – maybe we only want to search documents from 2017 – is use that one that I have just done, the date filter, and limit my keyword search to that area.
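The .docx point can be illustrated with a short Python sketch. A .docx file is a ZIP container whose text lives in a compressed member, so a raw search over the container bytes should not see the word, while a search of the decompressed members does. The helper names are hypothetical, and the container built here is only docx-like (a real .docx has additional required parts).

```python
import io
import zipfile


def build_docx_like(text: str) -> bytes:
    """Build a minimal ZIP with a word/document.xml member, similar to how .docx stores text."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", f"<w:document><w:t>{text}</w:t></w:document>")
    return buf.getvalue()


def deep_search(container: bytes, keyword: bytes) -> list:
    """Return the names of members whose decompressed content contains keyword."""
    with zipfile.ZipFile(io.BytesIO(container)) as zf:
        return [name for name in zf.namelist() if keyword in zf.read(name)]


raw = build_docx_like("The stolen Ford Mustang was parked outside. " * 20)
raw_hit = b"Mustang" in raw              # raw search over the compressed container bytes
deep_hits = deep_search(raw, b"Mustang")  # search inside the container
```

Here `raw_hit` is expected to be false because the text is stored DEFLATE-compressed, while `deep_hits` names the member that holds the text.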
Another thing that is helpful, especially if you’re going to look across, say, the whole Users directory, is maybe you know you don’t want to look at executables or bin files. This is excluding them, so you wouldn’t want to put anything in there that you do want to search. And you could also search for whatever keywords you know about your case. So from our initial notes, we know that it was about a stolen Ford Mustang, so I could put in ‘Mustang’ and ‘Ford’, and I could run this keyword search. So I can either save it, so I have it going forward, or I could run it, and that would allow me to do a keyword search looking anywhere on the drive for it.
So those are some ways to speed up regular keyword searching – it’s still a powerful technique – but today I want to focus on index querying, since that is new for us.
So this initial implementation of our index query is going to focus on the files that are on the drive. So when we think about an index, it’s actually a two-part process. The first process is going to be processing all of the documents on the system, to pull out what words are actually in those documents. So we are focusing on things that are like words, like the word ‘Josh’ or the word ‘tree’ or the word ‘Ford’, or whatever. We’re looking at words, not necessarily things that have letters and numbers mixed together. So if it’s kind of in the Leetspeak specification, that’s not considered by most indexers a word itself. Along with words, you also get to index the metadata, so you could look for things like ‘device is athlon4’. Those are optional types of searches that, typically, forensic index engines allow you to do.
You do spend some time upfront building the index, pulling out all the words in the case, but then it’s going to be very quick to run your different searches to look for different types of words. And then the other piece that’s an advantage to the index, is you can locate pretty easily – where it’s going to be more complex to do that with keyword searching – words that are near each other; in a specific order; and you can combine that with things like date ranges and paths. So there is an upfront cost to build the index, but once it’s built, you’re going to be able more quickly to follow the lead in your case, to find the data that you’re looking for.
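The two-part process (build once, then query cheaply) can be sketched in Python as a tiny inverted index. This is an illustration under stated assumptions, not BlackLight's or Lucene's implementation: the tokenizer keeps only purely alphabetic tokens to mimic the "Leetspeak is not a word" remark, terms are lowercased, and `build_index`, `lookup`, and the sample documents are hypothetical.

```python
import re
from collections import defaultdict


def build_index(docs: dict) -> dict:
    """Part one: walk every document once and map each word to the files containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for token in re.findall(r"[A-Za-z0-9]+", text):
            if token.isalpha():  # assumption: mixed letters and digits are not words
                index[token.lower()].add(name)
    return index


def lookup(index: dict, term: str) -> set:
    """Part two: answer a query from the index without rereading any document."""
    return index.get(term.lower(), set())


docs = {
    "notes.txt": "Josh sold the Ford Mustang",
    "chat.txt": "meet josh by the tree",
    "l33t.txt": "selling a mu5t4ng",
}
index = build_index(docs)
ford_files = lookup(index, "FORD")
josh_files = lookup(index, "josh")
leet_files = lookup(index, "mu5t4ng")
```

Each additional query is a dictionary lookup, which is why adding a new term after the index exists does not require another pass over the evidence.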
So again, over on the left, we now have our index searches – I’ve set up a couple of these for us to look at – and the first one that we could look at is going to be called ‘Mustang.’ So I am going to grab that one real quick. Give me a second. This is why you shouldn’t do all of the things live. Do we have questions, Julie, while I send this piece?
Julie: Yes. So a few questions have come in during this section. The first one: Does indexing search slack or unallocated?
Ashley: Indexing does not currently search slack or unallocated. We want to focus indexing on the areas that we think are going to be most relevant for you to look at. So again, keyword searching is going to be better for doing any sort of unallocated or slack piece, but when we’re looking at the actual data within the files, that is where the index is going to add the most value.
Julie: Got it. So is searching the index case-sensitive?
Ashley: Searching the index is not case-sensitive. So I’m going to show you how we can search the index here. So I have the search here, and I have my terms. But you’ll notice that ‘Mustang’ has a capital ‘M’, and ‘ford’ has a lower case ‘f’. So if I run this query I’m going to be able to see things that have the word ‘Mustang’ or ‘ford’, and I can highlight multiple of these files, and I’m going to see them in the hit context down below.
So I can see here, ‘ford’ and ‘Mustang’ are both highlighted, and I can see that those words are in the files. So I can actually highlight multiple of the files – this is individual files that are responsive – and I can go through the hits themselves, down in this middle section, with some previews of that particular area.
So those are ways that you can look for files that have either the word ‘Mustang’ or the word ‘Ford.’ I did mention that we can also look across more complex pieces, so the operators that we have are ‘AND,’ ‘OR,’ and ‘NOT.’ We can group things together using parentheses. So if we were looking for our created date, or modified date, or accessed date, and we wanted to do it in a particular path, and make sure the file size was larger than one, and it had ‘Mustang’, we could actually combine all of those pieces together and run it in one section. And then highlighting one file will give you just that hit; if you highlight multiple files, you’ll be able to search across those files for any of the hits.
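As a rough model of combining operators with metadata conditions, the Python sketch below composes small predicates with AND, OR and NOT. It does not reproduce BlackLight's query syntax, which the cheat sheet covers; `FileRecord`, `has`, the sample files, and the extra `rental` exclusion are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FileRecord:
    path: str
    size: int
    modified: date
    words: frozenset  # lowercased words extracted from the file


def has(term):
    return lambda f: term.lower() in f.words


def AND(*preds):
    return lambda f: all(p(f) for p in preds)


def OR(*preds):
    return lambda f: any(p(f) for p in preds)


def NOT(pred):
    return lambda f: not pred(f)


# (Mustang OR Ford) AND NOT rental, under /Users/, size > 1, modified during 2017.
query = AND(
    OR(has("Mustang"), has("Ford")),
    NOT(has("rental")),
    lambda f: f.path.startswith("/Users/"),
    lambda f: f.size > 1,
    lambda f: date(2017, 1, 1) <= f.modified <= date(2017, 12, 31),
)

files = [
    FileRecord("/Users/josh/notes.txt", 120, date(2017, 5, 2), frozenset({"stolen", "ford", "mustang"})),
    FileRecord("/Users/josh/rental.txt", 80, date(2017, 6, 1), frozenset({"ford", "rental"})),
    FileRecord("/Windows/log.txt", 999, date(2017, 5, 2), frozenset({"mustang"})),
    FileRecord("/Users/josh/old.txt", 50, date(2015, 1, 1), frozenset({"mustang"})),
]
matches = [f.path for f in files if query(f)]
```

Nesting the calls plays the role of grouping: the OR is evaluated as a unit before being combined with the other conditions.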
And the last piece that we can do is looking for terms within a certain number of words of each other. So if I wanted to look across ‘Facebook’ and ‘Josh’, I could look for one within so many words of the other. So this allows me to say “I want to find the word ‘Josh’ within five words of ‘Facebook’,” with each of these being considered a word, to look at them together.
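A proximity search can be sketched in Python by comparing word positions. This is a simplified model, assuming distance means the difference between token positions; Lucene's own proximity matching is defined somewhat differently, and `positions` and `within` are hypothetical helper names.

```python
import re


def positions(text: str, term: str) -> list:
    """Return the word positions at which term occurs, ignoring case."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9']+", text)]
    return [i for i, t in enumerate(tokens) if t == term.lower()]


def within(text: str, a: str, b: str, distance: int) -> bool:
    """True if some occurrence of a is at most `distance` words from some occurrence of b."""
    return any(
        abs(i - j) <= distance
        for i in positions(text, a)
        for j in positions(text, b)
    )


message = "Josh posted the photo on Facebook last night"
near_five = within(message, "Josh", "Facebook", 5)
near_three = within(message, "Josh", "Facebook", 3)
```

In stock Lucene query syntax a proximity query is written in the form `"josh facebook"~5`; whether BlackLight accepts exactly that form is worth confirming against the operator cheat sheet.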
So there’s a lot of different complexities to indexing. One of the things that we’ve provided is a cheat sheet for how the operators work. And that is also available for you in the release notes.
There was a question about what index is used, whether it’s dtSearch-based or Lucene-based; this is a Lucene-based index. It is our own implementation of it, and you’ll notice that if you’re familiar with the Lucene or Elasticsearch query languages, those are going to be similar to what you’re using as far as our searching.
Were there any other questions, Julie?
Julie: Yes. So, how large are the index files, and how long does it take?
Ashley: So the index files are a lot smaller, because we’re not trying to index things that are not as valuable. So you’re going to have index files that are significantly smaller than your case file, or even your evidence file, because we’re focusing on things like documents and not things like executables or memory files. As far as speed, that’s always variable. I think if you try it, you’ll be pleased with how quickly it does run, but the indexing upfront is going to take you a similar amount of time to, like, a keyword search; maybe a little bit longer than that. And then once you do the queries, you can quickly go through queries of data without having to wait.
Julie: Great. And how about saving index searches? Can users save them?
Ashley: Yeah, so every one that I’ve added here I can go ahead and choose to create and save another one. So these were ones that I actually had saved in the case, and then if I want to add a new one, I can go ahead and add a new one. And once I name it, it will be saved here. If you want to copy any ones that you already had from other cases, those will be available for you as well.
Julie: Great. And the last question from this section will focus on: Can I exclude files from being indexed?
Ashley: So right now, we don’t do any exclusion in the index options, so when you look at the evidence status for indexing, it’s pretty much off or on; you’re going to be indexing the files on the file system. The only way to really accomplish that would be to filter down which files you wanted to index and make a logical evidence file of those, and bring those in, and then that would allow you to just search those files themselves.
Julie: That’s great, thanks Ashley.
Ashley: Alright. So the next part that we can look at, as a time-saving technique, is to make sure that you hand off data early; that will help you narrow down your case, and also get other people started so that they can give you more details back.
So this can be especially helpful in mobile cases, but it can also be helpful in a regular case when you get a request for something reasonable, like “I want to review all of the documents, text files, PDFs, etc. from 500 machines.” If you show them a report that says “That’s 10 million files,” they might come back and say, “OK, I don’t want all of those, so let’s go ahead and narrow down what I’m looking at.”
So there’s a couple of ways that we can do that. We can look at the actual information about the device, to give them an overview of, “Hey, this is about how much data is listed for this particular device.” We can also give them information, again, pulling out that actionable intel: “Here’s the top contacts this person has been working with; if you’re talking to Josh, you might also want to talk to Arlene, or Evan, because they do a lot of messages,” or “Graham Gibson seems like he messages him, calls him” – those kinds of statistics really help you focus on which relationships you should be looking at first.
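The "top contacts" statistic amounts to counting communication events per contact. A minimal Python sketch, assuming the events have already been extracted as (contact, kind) pairs; the data and the `top_contacts` helper are invented for illustration and do not reflect BlackLight's report format.

```python
from collections import Counter


def top_contacts(events, n=3):
    """Return the n contacts with the most communication events, busiest first."""
    counts = Counter(contact for contact, _kind in events)
    return counts.most_common(n)


# Hypothetical extracted events: (contact, kind of communication).
events = [
    ("Arlene", "message"), ("Evan", "message"), ("Arlene", "message"),
    ("Graham Gibson", "call"), ("Arlene", "call"), ("Evan", "message"),
]
ranking = top_contacts(events, n=2)
```

A ranking like this is cheap to produce early, which is the point of the hand-off: it tells the investigator which relationships to look at first.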
So the other piece is sometimes just doing an early report. So our reporting is really simple, and we can go ahead and look at a report here that is going to be just on this phone that I have ingested here. So if the investigator just wants to know who has this person been talking to, who have they been calling, and are there any other people that are important for us to focus on, you can quickly, without having to do additional processing or do anything else, bring in the phone and choose to just give them the basic information about the device by selecting the device; and then choose under Case Data that you only want to run a report on this selected evidence.
In this case, if I want to do it just on Tenisha’s phone, I can choose that one, and I’ll unselect these, and go back to my report, and then I can choose to look at the voicemail and calls just for that particular person. So I’d be saying “OK, here’s what’s on the phone, here’s an overview, and then here’s the most recent emails, and then here are the most recent calls, and how long, and what their status is.”
So quickly exporting these and handing that data over, so that they can start working the case right away, allows you to go right back and focus on some of the deeper work.
Alright. Another way is to just prioritise which device you’re looking at first. So then you may want to determine which machine is the hottest – which one you want to look at first. We talked about content, but you might also want to look at a timeline of activities; which apps have they used most recently? So we could look at things like usage, who’s logged in to specific machines; if I was looking for Josh’s machine, is there a Josh account even on that machine, or not? Also things like file downloads, we covered that earlier.
Having a list of your favourite places to look at depending on your case type will also help, and you can always look at things by a time range. So if you just want to look at a specific time range, to narrow it down, you can do that using some of the built-in filtering and artifacts that are extracted.
So we know that datasets are not going to be getting any smaller, so having a plan before you get started and beginning with the end in mind is one way for us to make sure that we use techniques that make it quicker for us to work through the devices that we have.
Making sure that you have a tool where you can bring in all the evidence and also can extract out just the pieces that you want to look at is helpful; and then heavy use of filtering and tagging will really help you to narrow your focus.
The better you know the tool, and the more you set up some of those preferences, like we had the filters here, or having some of those date references built in, the categories built in… the more you know about your tool, the better you’re going to be able to quickly move through your case. So spending the time to know your keywords, knowing what things take more time, and then sharing that information with other investigators, whether they’re on your own team and you say “Hey, we’re on this kind of investigation, use this case template, it’s going to help you get data faster” really helps us to focus on which devices are important.
So were there any other questions, Julie?
Julie: Yes, we’ve had a lot of great questions come in, so I’ll focus on a few right now, and then we’ll respond to everyone individually after – I want to just be mindful of the top of the hour mark.
So what report formats are available?
Ashley: So… we can see which ones are available right here: you can go out to HTML, PDF, docx, txt, CSV; you can export to any of those.
Julie: Great, and when will all these features be available?
Ashley: So all of these features are available in BlackLight 2019 R1, so you guys should be able to use them right away, as soon as you download the new build. So if you go to our software downloads, they’re available there. And all of these are covered in a little bit of step-by-step in our blog posts, as well as in the notes. So I highly recommend the release notes, especially for the index; all of those pieces are covered there.
Julie: Excellent. Well Ashley, thank you for walking us through all the latest features in BlackLight and showing us how examiners can find the most relevant information as quickly as possible.
Again, if you submitted a question and we didn’t get a chance to answer it live, a member of our team will reach out and follow up with you individually. I’ve seen a few questions come in about the recording, and yes, we will be sending this out once it’s available, so stay tuned!
If you’re interested in learning more, like Ashley said, we have a blog where we outline a lot of this, as well as our products and services; but you can also email sales@blackbagtech.com. And don’t forget to follow us on Twitter, Facebook and LinkedIn, and subscribe to our blog.
Thanks again Ashley, and thank you all for joining. Have a great day!