Complex Digital Data Calls for a New Breed of Data Hunters

Panelists:
Carolyn Casey JD, Sr. Director, Industry Relations at AccessData; John Wilson, President at Discovery Squared, LLC; John Grim, Investigative Response Team Lead at Verizon; Jason Britton, IT Technical Engineer at iHeartMedia

Carolyn: Thank you so much, and welcome, everybody. We are thrilled to have you here today. My name is [Carolyn, as] [unclear] mentioned, and I’m an attorney, I work at AccessData as Senior Director of Industry Relations. And I have been in the e-discovery and information [unclear] management industry for about ten or twelve years. Love it, and I get to follow trends that impact our customers, and hopefully bring them some insight and thought leadership in those areas as well as help the company here with the strategy.

In a moment I’m going to introduce our wonderful panel of speakers who are with us today. We are glad you are here. Thanks for coming.

Okay, we do have a lot of great content for you, by the way, today, and we want to have the experts have time to give their case studies, which they’ve prepared. So we may not have time for questions, but please do send in questions. We’ll be sure to get back to you via email if we don’t get to them today.

But again, welcome to this webinar. We’re going to talk about this new breed of data hunter that is taking over IT [unclear] at corporations, surviving and managing through the deluge of digital investigations. Really, what we’re coining the new primal hunting and gathering instinct. These teams are getting asked by lots of different corporate teams to quickly find information so they can analyze it. They need to look across networks and devices, and all sorts of different data sources to pull up data for an investigation or a regulatory inquiry or e-discovery. This is really calling for a new type of hunter, as we call them, who helps these teams with their investigation.

Today we’ll walk through… I’ll introduce the panellists, they will talk about the new hunter breed and the evolution they see happening for that role and for the organisations today, and then we’ll talk about how the hunter interacts with the tribe, namely, and frequently, legal, HR, compliance, and other groups that need their investigative help. And then we’ll move into case studies, where the panellists will share some details on three different case studies to give you some practical insight on managing all of this.

I now have the great pleasure to introduce what has just been a wonderful team of folks to get to know, your speakers today. I’ll start by introducing Jason Britton of iHeartMedia, formerly of [RAMCO]. Jason is an IT Technical Engineer at iHeartMedia. He’s worked with a number of companies as an expert in internet response, he’s done policy development and training, incident response investigation, he works in forensics, malware research, extensive career in this area, and he really has a passion about data security and the technology that supports it. He and the others really are this new breed that we’re talking about and celebrating, in a way, here today. Jason holds numerous certifications, as you can see there. Welcome, Jason. I’m also going to introduce –

Jason: [Thanks for having me.]

Carolyn: Oh, you bet. Going to introduce John Grim of the Verizon RISK Team, and John is the investigative response team lead there, and also, interestingly, the primary author of a great study, the Verizon Data Breach Digest. Mr Grim brings 14 years of experience in conducting digital forensics investigations with both the government and civilian security sectors. Currently John is on the Verizon team providing services to clients, not to Verizon itself. He leads a team of highly trained technical digital investigators, and he responds to cyber-related security incidents, conducts on-site threat mitigations, breach research, containment activities, you name it. He too is one of these new data hunters and analyzers.

Also, welcome to John Wilson of Discovery Squared. John is President of that company, he is a licensed private investigator, certified forensic examiner and information technology veteran. John brings two decades of experience to the party, working with the US government, public and private companies, and he serves clients in many different industries, as a trusted advisor to law firms, corporate legal departments, outside counsel and executive, and best practices for litigation readiness. So again, we’re really fortunate to have this wonderful team of experts.

Let’s start by talking a little bit about what’s going on inside corporations in their area of digital forensics, forensics investigations of data. There’s a lot of external and internal forces that have really pushed the corporate digital investigations to change and evolve. And we’ll have the panel comment on what’s driving that change, and then tell us about this new breed of data hunters – what do they look like?

John Wilson – we have two Johns today, by the way, so you’ll hear me calling them by their last names. So Wilson, what kind of evolution in enterprise digital forensics are you seeing and can you talk about this new breed a little bit? What group inside corporations do they fit in, who do they report to, what are they all about? What are you seeing there, John Wilson?

Wilson: So, the industry is changing, evolving as technology continues to evolve and data sets have gotten to tremendous sizes, where an average computer can have a 2TB hard drive and be filled with content that has to be searched, and in addition to that you have the complexities, as businesses have evolved and the e-discovery requests have continued to grow, and… originally it was a little IP theft and litigation type responses, to now, you’ve got to deal with malware, you’ve got to find and identify malware, and prevent the zero day exploits within your company network, you have to prevent data leakage from the organization. So it’s no longer just an HR and a legal thing, you’ve got compliance departments that have to handle government requests, you have departments from all over the organization that now suddenly have e-discovery needs, they’ve got to find out what documents they have, what documents have been distributed or accessed, and that haystack of documents has gotten twenty times the size that it was just five years ago.

Carolyn: Yeah, exactly, and there’s privacy regulations, [hits on] data protection, there are cyber security regulations now, there’s… personally identifiable information needs to be protected. It just seems like there’s a lot of forces in the regulatory area impacting it too. Yeah, thanks for that perspective, John. Maybe we’ll turn to John Grim of Verizon. John, what do you see as the new technical challenges that digital investigators are facing inside corporations? How do they overcome them, what kinds of technology and tools must they be able to work with as the new hunter gatherer?

Grim: Sure. Actually, one of the biggest challenges – and I think it was mentioned a little bit earlier here – is the sheer amounts of data that is essentially involved with a case, and in particular data breach cases. With hard drives storing more and more data, and more and more systems storing data, gone are the days when we could just simply take a hard drive out of a system, do a bit-for-bit image and analyze it. We’re faced with servers, multiple servers, and we need to go ahead and scope down, triage down, and focus on which servers are going to be the most [eligible] in terms of forensic analysis.

The second thing is probably the environments. When I first started out doing digital forensic investigations we were doing a lot of [dead box] selection and [dead box] analysis. Now we’re faced with doing collection in live environments, and the live environments collection [unclear] investigations, you’ve got to get that live system image, and that volatile data. Furthermore, we can’t necessarily shut down the servers that are involved, they’re needed for business purposes. So you’ve got to be able to adapt and use tools and go ahead and collect against those live systems.

The other challenges also are the operating systems. Are you dealing with Linux machines, Windows machines? Various different versions of Windows machines, Macintosh, perhaps even mobile devices. So you have to be very flexible and have a toolkit that can [cover down] all those different types of operating systems and different types of systems themselves. And furthermore, in terms of environments, the location is a challenge. Are you going to be able to go onsite and collect or do you have to do the collection remotely? And if you do the collection remotely, you certainly have challenges there with [throughput], in terms of data speeds and the capacity of pulling that data across the wire.

And then I think probably the third technical is – it’s maybe not really a technical challenge, it’s the challenge of working with all kinds of different, other stakeholders, technical and non-technical stakeholders, different backgrounds, and more specifically, different roles and responsibilities within the incident response situation you find yourselves in. Legal folks are involved, corporate communications are involved, human resources, as well as IT security. So everybody has a role to play, so it’s no longer just focusing on that hard disk drive and doing the analysis, it’s actually working with everybody else to give them the information they need so they can perform their jobs.

So I think those are probably the biggest challenges.

Carolyn: Great. That sounds like a couple of great insights there. Thanks, both of you, for that perspective. We’ll pick up on what John was just talking about there, and talk about the hunters and the tribe, the stakeholders that they work with and in a way serve, almost their internal customers, and what’s been changing there. Jason, why don’t we go to you a bit? Maybe you can talk a little bit about how the digital investigative team works with the stakeholders, aka the tribe. What kind of communications goes on, how is data shared? How does this tribe all work together efficiently? And maybe if you see some bottlenecks or places maybe where efficiency gains could be had, comment on those too if you will.

Jason: Okay. I’ll be happy to. The communications has been a big problem for a lot of different groups we worked in. Just because there’s not a policy defined, there’s not a lot of experience in working with this huge level of… John was speaking previously, the huge amount of data we’re dealing with, the huge amount of… the shift from dead box to live box, the shift from… you can’t kill the servers, so the servers have to stay on.

It’s a complete shift in the way we do investigations, and [unclear] we have to communicate with all the different teams and stakeholders, everyone from HR to legal to external investigative entities as needed, to [certs], to collecting data from vendors, to getting vendors involved in our investigations. It’s a whole new game in the communications that are necessary. We have to be able to be consistent and stable in our communications of what we need, what we want, what’s expected as our deliverables from our forensics work and from our investigation work, as again, it’s… the data difference is so huge, it makes it hard to communicate to some of these traditional entities what we’re dealing with.

When we’re asked for different things like internet histories, it’s a totally different game now than it used to be. It’s not just one browser on one computer. It’s multiple browsers, multiple devices, it can be on a thumb drive, it can be on a thousand different places, and it just makes our hunting and communicating all the needs and information back and forth just all that more difficult.

Carolyn: Yeah, yes indeed. Great comments there. Thank you so much. One of the things that [Sharon McKinnon], an analyst with [Forrester], observed yesterday on a talk I was sitting in on… she commented on the growing extension of e-discovery technology beyond legal, to help compliance and HR and other see patterns and gain insights from enterprise data. And in fact, we’ve had requests to train HR and audit folks on using e-discovery technology, whether it’s email analysis and being able to see communication patterns between multiple custodians, potential custodians, or using some of the analytics even, to gain insights about the data pool that you have to review and analyze. John, can you talk about this a little bit? Are you seeing compliance, HR, other groups starting to use e-discovery technology or looking at that now or in the future? John Wilson, sorry.

Wilson: [laughs] Yeah, HR, legal, compliance, all of them are getting significantly more interested, and they want to see those data maps and they want to see the communication, the timelines, and the communication patterns, who’s talking to who, it’s becoming much more important as these data volumes have increased to such a substantial level that you don’t have time to review 80 TB of data in order to complete a case.

You have to start figuring out, hey, here’s my point of entry for whatever incident has occurred, and I’ve got to start mapping around that, so you’ve got to be using these technologies to really build and define those maps, and then deliver it to HR or compliance or legal, whichever division you’re dealing with and providing that information to, because they’re the ones that are going to have the context to say, “Okay, oh, we see that the data is moving from here to there, so now we’re interested in this data over here,” whereas us, as the hunters, we need that guidance, because again, the data volumes have gotten so large, it’s hard to stay linear and just walk through all the data. There’s no longer the time for that.
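The "who's talking to who" mapping described above can be illustrated with a minimal sketch. It assumes the message metadata has already been extracted into simple (sender, recipients) records; the addresses, record layout, and function names are hypothetical and are not taken from any tool mentioned in the webinar.

```python
from collections import Counter

# Hypothetical message records: (sender, list of recipients).
messages = [
    ("alice@example.com", ["bob@example.com"]),
    ("alice@example.com", ["bob@example.com", "carol@example.com"]),
    ("bob@example.com", ["alice@example.com"]),
]

def communication_map(records):
    """Count messages for each (sender, recipient) pair."""
    pairs = Counter()
    for sender, recipients in records:
        for recipient in recipients:
            pairs[(sender.lower(), recipient.lower())] += 1
    return pairs

def contacts_of(pairs, seed):
    """List everyone who exchanged mail with the seed address, busiest first."""
    totals = Counter()
    for (sender, recipient), count in pairs.items():
        if sender == seed:
            totals[recipient] += count
        elif recipient == seed:
            totals[sender] += count
    return totals.most_common()

pairs = communication_map(messages)
print(contacts_of(pairs, "alice@example.com"))
```

Starting from a single "point of entry" address, the ranked contact list gives the stakeholder something to react to ("we are interested in this person next") without anyone reading the whole data set linearly.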

Carolyn: Yeah, it’s almost an iterative process where the party that needs the information might look at early batches and say, “Hey, in this email I noticed that this guy was also mentioned. Have we collected his data?” or “Can you get me some more information from this repository where we found lots of data on X, Y, and Z?” Really quickly, because I do want to move to the case studies, but how are the chief compliance officers and VPs of HR or folks in ethics and compliance groups… once the investigation is complete, you guys have done the forensic collection and done some forensic analysis, how do you give the data over to the folks that need to look at it and how do they look at it if they don’t have an e-discovery technology, how do they go rapidly through emails and files preserving metadata etc? Could someone jump in and comment on that?

Wilson: I can certainly do that. And that’s again, ten years ago was, “Hey, here’s my report.” It was all printed out in paper and I handed them the report, and they would read through it, dig through it, and say, “Hey, that’s what I was interested in, great.” Today those volumes are so huge that you really do have to start looking at, evaluating the technology and figuring out a way of, “Hey, now I can push my whole report or my timeline to their review interface, whatever that may be.” And figuring out those technological ways of handling those things so that you can then, again… it’s finding a needle within a haystack, a needle, so you’ve got to have something that allows them to run down that rabbit hole without having to go through that linear approach, and you just can’t do it in paper anymore, it’s got to be digital, it’s got to be electronic. You got to have a review platform or a data platform to try and submit that information and allow them to see that, review that in order to be effective and efficient and [timely].

Carolyn: Yeah. What we really see is the data needs to be found quickly so it can get to analysis. Because the folks we’re looking at on the screen here, they’re all under pressure to respond to a request from a regulator or a very sensitive matter. So speeding up the process, managing the risks throughout it, and getting them the data they need, and then having tools to analyse it seems to be where we see a lot of evolution and change going on.

Alright, well, let’s shift into our next section here, which is… we’re excited to bring you some down and dirty insights into a couple of use cases, almost. And the panellists will set the context for you, as you see on the slide, that first slide there, and then walk you through the situation of a specific investigation that they were involved in or that they [unclear] to [protect the innocent]. They’ll talk about the workflows and stakeholders and the challenges and the approaches that were used, and then finally share with you takeaways, lessons learned and [unclear].

So John Grim of Verizon, I’ll turn it over to you for your case study.

Grim: Absolutely, thank you. [As you said,] folks can see here, I’ve got a slide here for cyber-espionage threat actions, and what I wanted to do was concentrate on a scenario that we’ve seen in the past involving [content] management systems. For those not familiar with [content] management systems, these are ubiquitous, they’re used for various [cipher] purposes such as publishing or modifying [consent], organising data, or even managing users.

So what you can see here with this slide is the cyber-espionage threat action. And we had a particular scenario involving a CMS compromise that involved a lot of these underlying threat actions here. What I wanted to show you is in terms of espionage what we see a lot of times is a complexity that involves a hacker, involves malware, it usually involves some kind of social engineering, as you can see [unclear], and then that malware is specialised – it usually has some kind of command control component to it, or a backdoor, or it steals credentials. So this particular scenario I chose illustrates the cyber-espionage. And just incidentally, in terms of the threat action motivations, looking at our caseload, espionage is second only to financial motivation. So 80% of the data breaches that we’ve seen over the years have a financial motivation behind it, [9%] have espionage.

So this scenario involves a CMS compromise, and this has elements of cyber-espionage as well as financial motivation. And in this particular situation, the CMS, which is, as I mentioned, ubiquitous, was targeted by threat actors. And the typical scenario here, we [unclear] holiday. And this involved actual, real-world pirates who were attacking global shipping conglomerates’ cargo. And the victims actually noticed a change in tactics from these pirates. They noticed that they weren’t just attacking ships at random, locking up the crew, or the crew locking themselves up for safety purposes all day while the pirates would rummage through the cargo crates. The change in tactics involved targeted attacks. It seemed like specific vessels were targeted, specific cargo was targeted, specific containers were targeted within shipping vessels.

And the crew also noticed … because normally they would lock themselves in a room and be there all day until the pirates would leave. But it seemed like the pirates were getting on and getting off the ship very quickly, and only specific types of cargo was targeted. It turns out these are high-value crates containing diamonds and jewellery, some valuables. So digital forensic investigators were called in to determine if there was any kind of digital angle behind this targeted attack by these modern-day pirates. And as it turned out, when we looked at this situation, we were dealing with content management systems. And in particular, what we started to do was we started to focus in on how would the information be gained by these potential pirates who can know specifically what vessels to target, specifically what cargo to target.

And it turns out what we ended up having to do is we took the approach that involved what we see many times nowadays in forensic data breaches – a complex attack needs complex methodology. So instead of just collecting images of servers that we suspected were involved, what we did was we set up a network forensics capability. We were able to capture packets, we were able to look at net flow, we were able to look at network logs and narrow down to the specific server that looked like it was involved or was being leveraged by these pirates for the attack.
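As a rough illustration of narrowing down to a specific server from flow data, the sketch below totals the bytes each internal host sends to external addresses and ranks the hosts. The flow record layout, the sample addresses, and the assumption that the internal network is 10.0.0.0/8 are all hypothetical; real net flow exports carry many more fields.

```python
import ipaddress
from collections import defaultdict

# Assumption: everything inside this range counts as "internal".
INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")

# Hypothetical, simplified flow records: (source IP, destination IP, bytes sent).
flows = [
    ("10.0.0.5", "203.0.113.9", 48_000_000),
    ("10.0.0.5", "10.0.0.7", 1_200),
    ("10.0.0.7", "198.51.100.4", 15_000),
    ("10.0.0.5", "203.0.113.9", 52_000_000),
]

def is_internal(address):
    """True if the address falls inside the assumed internal network."""
    return ipaddress.ip_address(address) in INTERNAL_NET

def outbound_bytes_by_host(records):
    """Sum bytes sent from internal hosts to external hosts, largest first."""
    totals = defaultdict(int)
    for src, dst, nbytes in records:
        if is_internal(src) and not is_internal(dst):
            totals[src] += nbytes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

for host, total in outbound_bytes_by_host(flows):
    print(host, total)
```

A host that stands out at the top of such a ranking is only a candidate for closer collection (logs, packet captures, volatile data), not a finding in itself.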

So what we did was we collected logs, net flow data, packet captures, [unclear] system images, and we collected volatile data as well. And in addition to that we had a malware reverse engineer on hand to look at any malware that we found. And it just so turned out that these threat actors were leveraging a digital component where they were having someone go in and hack into this environment, get into the content management servers and access this data, things such as shipping routes, the schedules of ships, the inventories and the bills of lading, and were exfiltrating this information by way of malware that they had installed on the computer systems, and were feeding that information to these pirates, who were leveraging it for targeted attacks on certain vessels.

So this is a good example of a complex situation where we wouldn’t really have the time to sit and image a whole bunch of systems, and then maybe a week after we’re completed with our imaging, start looking at it, and we needed to go into a live environment, capture network data, capture endpoint data, capture volatile data, and then look at any malware that we found. And it turns out that the malicious actors, the digital threat actors had a malicious [unclear] that they were running commands, using the [unclear] to upload and download data, and this data included, as I mentioned, those bills of lading, the crate numbers, the vessel numbers, and the shipping schedules.

So one of the things [unclear] this particular scenario we see all the time with our data breaches, is our job just isn’t over after we collect [a certain] analysis. What we need to do is feed real-time advice and guidance to the victims in terms of containment, eradication of malware, remediation of the system, as well as recovery.

So in this particular case we had to advise the customer to take certain systems offline. We advised the customer to go ahead and block the threat actor IP address. We also advised the customer or victim to reset compromised passwords, to [unclear] those certain CMS systems that we had identified as being compromised, and then we also advised that they implement periodic vulnerability scans and then take a more formalised [unclear] approach.

So what we did was we advised, from start to finish, how the victim can get themselves back on their feet again, and hopefully prevent or mitigate this type of situation from occurring in the future.

[unclear] really good example of the new complexities that we’re faced with in terms of threat actors, in terms of the environments that we have to work in, as well as the speed with which we have to operate to not only collect but also parse the data, analyse the data, and give timely feedback to the victim, so that they can go ahead and combat the situation or the threat actor that they’re currently facing. So [unclear].

Carolyn: That was just really fascinating, thanks so much for sharing that anatomy of an incident response. Loved the pirates factor, and I’m picturing guys with [unclear] hats on and black patches on their eyes. So [good colour too].

[laughter]

Carolyn: Thank you very much.

Grim: You’re welcome.

Carolyn: Okay. We’ll move now on to kind of a human resource case study example from Jason. Jason, take it away.

Jason: Alright. Well, ours was a little piratey, but it was not from the vector we expected again. Maybe we should have done our research a little better. We came to find out, like the slide shows, a large percentage of attacks were carried out by insiders. We hadn’t properly prepared, our company wasn’t prepared, we didn’t have procedures in place.

So this all started with an alarm from our [sim], [posted to our] [unclear] guys about a PUP, just a simple potentially unwanted program, just one of those awareness things, they’re not really concerning. Traced it down, found it was on a thumb drive, found it was on an approved thumb drive. That started setting off some alarms and bells and whistles in the right places. So we were asked by management to begin tracking the person’s usage, just pull a basic panel of what they’ve been doing with the thumb drives, and their basic activity on the web recently. So we reported to HR that we [were needing to do] this investigation, they contacted legal, legal sent the go-ahead, and we brought them in with all the details and gave them an initial report. Then began tracking the user’s activities, and found that he had been browsing regularly on job-hunting sites, [unclear] email an awful lot, and also plugging in multiple devices from home while accessing network shares.

So obviously this set off a lot of alarms for us internally, so we began doing [packet] captures on all the data going to and from his station, we began daily loads of information up to management, turned on a new tool we got that allowed us to track every file that was written to or read from the thumb drives on his endpoint device, which made life a whole lot easier.

We were actually able to establish exactly what files he was copying, which turned out to be, again, private files and databases of customer data, and research data that should not have been leaving the company and was confidential. So as we gathered all this data, then it became the time to decide how to present it to legal and HR. We didn’t have a good system for it, and it was a lot of data, so it became a very manual process. I wish we’d had a proper e-discovery tool in place for this. But we… days of presentation and data to legal was finally brought in to law enforcement. And then we started learning that our company didn’t have the policies and procedures in place to properly handle it, so it took a lot more digging on our end to get it handled by the police. But it came to…

Our biggest takeaways from this whole thing was we didn’t have a proper policy in place and procedure for how to handle thumb drives, how to handle internal investigations, and how to properly work with law enforcement on handling this. While we thought we had all this marvellous incident response, thought we had all this marvellous stuff, all the data we had to dig out, we couldn’t present it in the proper way, couldn’t correlate it properly, and it was taking a lot of extra man hours to do.

So for us, it showed that while we can trust our employees, we can’t trust them that much, and we didn’t have the proper mitigation, logging policies, anything in place for remediation. It took us weeks to do the investigation while the user was siphoning out data. It did end up with… we found out that the user was hunting for a new job and had been offered a better job at the place he was looking at if he could bring over certain kinds of data. So it was a very questionable deal he was working through overall. So I guess for us it came down to lack of tools, lack of proper visibility into it, and it really changed the entire operations management, and how we handle our communications between HR and legal to get them a better e-discovery tool and get them actually on the same page as us so we could present the data in a way that made sense to them, instead of days of having to show reports and log lines that didn’t translate well into their world.

Carolyn: Fascinating. And by the way, for the audience, as you’re hearing these amazing, detailed case studies, please do type in any questions you may have for the panellists here. It looks like we will have some time for questions at the end. So thanks for that great anatomy of that internal threat investigation around employee theft of intellectual property, which probably happens a lot more than any of us want to admit in our organisations. Alright, let’s move on to John Wilson, and he’s going to walk us through an anatomy of a large, complex e-discovery matter. John Wilson, the stage is yours.

Wilson: Alright, so I guess I want to set the table, in that in 2016, data sizes have gotten very large. Everybody knows hard drives have gotten large. But also understanding custodian… an average user in a corporate email environment receives anywhere between 40-50,000 emails a year. A C-level user is probably close to twice that. And that’s a large amount of volume just in email. Then you start talking about smartphones – smartphones now you can have smartphones that have 500 gigs of data, and text messages, your millennials that work for the various organisations, they do a lot of things by text. And those numbers are probably very conservative – 85 text messages a day is a low number for an average millennial employee, very low number. But that’s overall, across your whole organisation.

And then you got thumb drives obviously, you can have thumb drives that are no bigger than the tip of your finger, and they hold 256 gigs a day. The one that most people don’t think about when they’re thinking about litigation and cases is the social media interactions within our organisation, people that are representing your company, that are carrying on chats on Facebook or chatting with customers or with other employees within the company, and actually getting work done. And your average user in the workplace has five or six social accounts, and they’re sending hundreds of messages each day on each of those platforms. And that just brings together this whole picture of volumes of data.

And as we go into the case study here, we’re talking about a large metropolitan transit authority that has multiple internal agencies, it’s broken into small departments, and was involved in a half-billion-dollar litigation. We had 850-plus custodians that needed to be collected from. And if we were to actually collect all the data in a traditional forensic sense from all of those custodians, we’d have had over a petabyte to a petabyte and a half of data that would have to have been processed and analysed. And in today’s world, that’s a lot of data to chew, and that’s even more crazy when you start talking about getting into the review of that data and involving the attorneys to do that review.

So we had to really get creative with our workflows and figuring out how we were going to deliver the documents that needed to be delivered for this case. We also had to defend a spoliation claim as well as, then, get the documents delivered to the opposing side within the constraints of a court-mandated delivery time period, which was under six months.

So it became really critical that you had to preserve just the relevant data for each custodian, so we actually started doing custodian interviews, and really started figuring out where do the custodians have their data, what kind of data do they have, and then starting to apply those technologies, and we really needed to get into using the fuzzy hashing to start saying, “Hey, just finding duplicates isn’t going to reduce the population enough. It’s not going to get us from a petabyte and a half down to 200 terabytes or 500 terabytes even.” So we had to start getting into… we need to start doing some fuzzy hashing, we need to figure out, hey, is this document pretty similar to that document, then let’s identify what the differences are and what we need to determine from there.
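The difference between exact-duplicate detection and the "is this document pretty similar to that document" question can be sketched as follows. This is not the fuzzy hashing method used in the case, which the transcript does not specify; it swaps in a simpler technique, Jaccard similarity over three-word shingles, purely for illustration. The sample documents and function names are hypothetical.

```python
import hashlib

def exact_hash(text):
    """Exact-duplicate fingerprint: any one-character change alters the hash."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def shingles(text, size=3):
    """Set of overlapping word n-grams (shingles) in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def jaccard(a, b):
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)

doc_a = "Route 12 will be closed from Main Street to Oak Avenue on Friday"
doc_b = "Route 12 will be closed from Main Street to Oak Avenue on Saturday"
doc_c = "Quarterly budget figures are attached for review"

print(exact_hash(doc_a) == exact_hash(doc_b))  # hashes differ, so not exact duplicates
print(round(jaccard(doc_a, doc_b), 2))         # high similarity: near-duplicates
print(round(jaccard(doc_a, doc_c), 2))         # no shared shingles
```

Here the two route-closure notes differ by one word, so exact hashing treats them as unrelated, while the shingle comparison finds 10 of 12 distinct shingles in common (about 0.83). Grouping such near-duplicates is what lets a reviewer look at one representative plus the differences.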

And then a big component of the spoliation claim actually had to do with four custodians out of these 800-plus custodians using social media to communicate about route closures and things of that nature within the transit authority, and dealing with those issues. So legal hadn’t even thought of social media really being a key factor. We started delving into it, we discovered that hey, they were communicating using social media, so they were sending Facebook messages to the guy on the other end of the line to say, “Hey, we’re going to have a closure from here to here.” And we wound up going from four custodians that we ultimately identified as the key people with this social media data to actually having to collect social media from a hundred custodians. That volume of data…

Carolyn: Wow.

Wilson: Go ahead.

Carolyn: No, I was just saying wow. That’s huge.

Wilson: That’s huge. And it was something that was never even thought of on the frontend of the case side. They’re assuming that employees have sent emails to say, “Hey, we’re going to have a route closure,” or whatever the case may be, and it was only through really strong investigative work and building those strong workflows to really limit that data, start building those maps of who was communicating with whom, and what kind of data was really relevant to the matter, that we got that data to be proportional, to make sure we’re getting the right amount of data that meets our case needs, and that it’s all relevant, we’re not capturing whole hard drives. You have 800 custodians, you can’t capture whole hard drives. You just don’t have that luxury to get that data collected, processed, reviewed.

Carolyn: [unclear] more relevant, targeted collections of the highly [relevant data], is that what you’re saying?

Wilson: That is correct, yes. You had to focus, apply proportionality rules, and then start looking at only the really relevant data, finding very specific keywords, building those heat-maps of all the keywords within the case, and start to identify those really relevant keywords and pull that data down, so that we actually reduced the total data set that had to be collected and analysed for the case to under 200 terabytes.
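As a rough illustration of the keyword heat-map idea mentioned here, the sketch below counts keyword hits per custodian in Python. It is an assumption about what such a tally might look like, not the workflow used in the case; the function name and the input shape (pairs of custodian and text) are hypothetical.

```python
import re
from collections import Counter, defaultdict


def keyword_heatmap(docs, keywords):
    """docs is an iterable of (custodian, text) pairs.

    Returns {custodian: Counter({keyword: hit_count})}, omitting
    custodians with no hits. Matching is whole-word, case-insensitive.
    """
    patterns = {
        kw: re.compile(r"\b" + re.escape(kw) + r"\b", re.IGNORECASE)
        for kw in keywords
    }
    heat = defaultdict(Counter)
    for custodian, text in docs:
        for kw, pattern in patterns.items():
            hits = len(pattern.findall(text))
            if hits:
                heat[custodian][kw] += hits
    return dict(heat)
```

Custodians with the highest counts for case-specific terms would be the natural candidates for targeted collection, rather than imaging every drive.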

[crosstalk]

Wilson: Yeah.

Carolyn: Yeah, and that had to be reviewed. [laughs]

Wilson: Yeah, and you still –

Carolyn: Was [TAR] used at all on this? Technology-assisted review?

Wilson: I’m sorry?

Carolyn: [unclear] Was technology-assisted review used at all on that still very large corpus you had? After the [unclear]?

Wilson: The TAR was used after the culling, after we got it down into that 200 terabytes, because again, still… that was exactly legal’s statement to me, was, “200 terabytes is still a lot of data! That’s a lot of documents! What are we going to do? We’ve got 90 days to respond here.” And that’s where we had to go into using that technology-assisted review to further refine and really pull out those relevant sets of documents.

Carolyn: Yeah, we see people using TAR in combination, like it sounds like you did earlier, with de-duplication and keyword searching, and that kind of thing. It gains more credibility, TAR that is, in legal circles. I think very big law firms helping clients in rapid, large and complex [unclear] are more apt to be adopting technology-assisted review or machine learning in review. And the rest of the legal community is coming along. This is quite fascinating. I know that we did this survey recently where we asked over a hundred global law firms what the hot e-discovery issues were that were on their watch, if you will, what’s out there that they’re still working on and concerned about. One of course was mobile devices, which we’ve touched on, just the volume of data that’s even on a single device that you have to preserve, collect, and then analyze. And also, number two was social media, as John has just been touching on here: that need to mine all sorts of different data types, like social media, for insights on particular cases. And the last one would be internet of things, which is certainly on the horizon and complicates digital investigations and collections as well.

I wanted to, this time, invite the other panellists… Since John Wilson has just finished his case study, I wondered if John Grim or Jason had any comments or maybe an idea popped for you as John walked us through this large, complex case. Jason or John Grim, any thoughts? If not, maybe we can go back to Jason’s presentation there, with that very interesting case study on intellectual property theft by an employee. John Grim and John Wilson, as you have looked at similar investigations, did anything pop for you in this particular one, or areas where you see what’s working and what’s not working in these kinds of HR investigations?

Grim: [unclear] These types of investigations obviously are a bit different than data breaches. [unclear] primarily going to be HR and legal that are heading up these types of [unclear]. I’ve found that working with these incidents, you’ve got to really understand what the objectives are. One of the things that you can do that helps you, obviously aside from identifying which data devices [unclear] is a timeframe to look for. What is the timeframe of suspected activity – that will really help out. You can do timelines [unclear], you can [unclear] systems and their logs. Another thing too is to get a [unclear] of keywords that are relevant to that particular investigation [unclear] filenames, or names of particular proprietary information, products or devices or whatever that are involved. All of those kind of things, those keywords can really help in [unclear] things, because really, you’re not looking necessarily at malware. You can use different techniques to find malware. [What you’re looking for] is actual information that was [potentially exfiltrated] out of the environment by the insider.

So identify what systems he had access to. To include social media helps, emails, mobile devices, [unclear] devices, file shares. Find out that timeframe of those suspected activities, so they can help you narrow down and limit… or [have further] [unclear] what you’re looking at. And then give that [unclear] keyword list that you can use to run against those evidence sources. And again, you need to really work with HR and legal for assistance for a lot of those things that you’re going to be doing.
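The approach outlined here, narrowing by a timeframe of suspected activity and then running a keyword list against the evidence sources, can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the item fields (`timestamp`, `text`), the function name and the substring matching are hypothetical, not a description of any particular forensic tool.

```python
from datetime import datetime


def filter_items(items, start, end, keywords):
    """Keep items inside [start, end] that mention at least one keyword.

    items: list of dicts with a 'timestamp' (datetime) and 'text' (str).
    Matching is a simple case-insensitive substring check.
    Each returned item gains a 'matched' list of the keywords found.
    """
    lowered = [kw.lower() for kw in keywords]
    hits = []
    for item in items:
        # Step 1: discard anything outside the suspected timeframe.
        if not (start <= item["timestamp"] <= end):
            continue
        # Step 2: run the keyword list against what remains.
        text = item["text"].lower()
        matched = [kw for kw in lowered if kw in text]
        if matched:
            hits.append({**item, "matched": matched})
    return hits
```

Applying the date window first keeps the keyword pass small, which matters when the evidence sources span email, file shares and mobile devices.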

Carolyn: Great. Thanks for that perspective.

Wilson: This is John Wilson. I was just going to add, one of the key things that I’ve seen especially in these kinds of investigations is your data breach or your data compromise can be in the megabytes, it can be [T], but it’s critical information. So you’ve got to keep that in mind when you’re talking about how now we’ve got to search and index a hundred TB and you’re looking for a 10MB compromise. It takes some really advanced work and some great policies, procedures and knowledge of your data to mitigate those risks.

Carolyn: Well said. And Jason, before we move on – your wonderful case study here, was there anything else that you wanted to add on this particular case study?

Jason: Nothing in particular. Again, I think the biggest thing is the lessons learned. The incident is itself… but learning where you can improve after each incident is I think one of the bigger things that you can do to improve your [stance] every time, as just the nature of your takeaway from that, how to do better in the next one.

Carolyn: Right. And getting the latest incident response and forensic investigation technology working for you, I take it, was kind of the takeaway there too.

By the way, let me invite the audience to send in any questions. Earlier we were getting a little bit of feedback that maybe we needed to speak up, it wasn’t quite as loud, but that may have been limited to a few participants, by the way. But please do send us some questions if you have any for the speakers here. And I think [unclear] why don’t we also invite Jason and John Wilson to comment on any thoughts they had as John Grim walked us through this compromise of a content management system, related to robbery on the high seas.

Wilson: Again, another big takeaway there that jumped out for me was interviews, and that is so often overlooked in digital forensic HR matters. Talking to people involved in a case, and having that hands-on touch, but then taking that information, not just having the interview just to have it, but taking that information and digitising it – putting it in, setting up your keywords and your parameters around the case and where those breaches may be, using that information. So often the interview gets done, and then it’s just a documentation of hey, this is where they stored stuff but they never actually digitised that information and turned it into intelligent assets to help you solve your case.

Carolyn: Yeah, it’s often overlooked, isn’t it? In our digital world, there’s still the importance of sitting down face to face and interviewing custodians or potential actors about where their data might reside, what kinds of activity they were involved in, and using that in complement with scanning across their network shares, their computers, thumb drives etc, to identify custodians and also identify where the information resides. It’s good to combine the two.

It sounded like somebody else had a thought there too. Please go ahead.

Jason: This is Jason. I was just [obviously going to note on] [unclear] made a big difference having the right tools in the right place, having thought about how to properly deploy those technologies at the right time to really help your investigation along without having tons of data just constantly being wasted, I think makes all the difference, and knowing what tools you have, when to deploy them, and when to let them sit it out. It makes a big difference.

Carolyn: Another question that we often get here at AccessData, that Jason, I might put to you, and certainly the others are welcome to talk about it too, is this: What can corporate digital forensics teams learn from law enforcement? It’s certainly one that comes to mind that I’d be interested in your take on, and this would apply mostly to large organisations, by the way.

But I’ve heard a very senior person at a large law enforcement digital crime lab make this statement that the days of one hard drive and one examiner are over, that the data is just everywhere, and it can be stored in Google searches that we’ve done for our entire life, and stored on our thermostat, or even voice-controlled TVs that record live chats if you’re sitting in your living room. This whole internet of things and this explosion of data types we’ve been talking about – that really calls for a new approach, he calls it a team approach. Because there’s just too much data. And [they look at] ways to manage a team of examiners, to make assignments, to corroborate, to shift resources as needed. “Here, you take the internet history; I’m going to parse down into this mobile phone data.”

Interested in your thoughts on where this new breed is going as part of a team, given all that we’ve talked about with these massive data types. [unclear] Jason, others?

Jason: Yeah, I think that’s an important thing to note – when I started out in forensics, it was very much that one-man, one computer, you went and captured a hard drive and made a report. I think being able to notice that the world is shifting and… I personally do use a team of investigators. I mean, this has gotten too big for me to handle. I can’t handle the whole mass of cloud drives and all their data and all their social media and all their… and being able to look to law enforcement for some of the trends they’re seeing, and also, I think there’s been a lot of give and take between law enforcement and the forensics community over the years. They’ve learned a lot from the corporate world, and corporates have learned a lot from them.

But I truly say that they have given us some interesting insights in some of these recent cases on the internet of things and stuff, and places that we need to look, and the ways that we need to shift as a corporate investigations group, away from that traditional – and even away from the little bit newer – into that cutting edge of the internet of things, of the clouds, the thumb drives, the ubiquity of our existence between our phones and our laptops and our TVs, and how all that plays together, and all the data pieces that we’re missing in between that can completely change the course of an investigation.

Carolyn: Yeah, how do we mine it all, that totality, and geospatial information and all of that, mine it for data and to respond to manage risks, such as what we’re talking about here. Other thoughts on what we were just talking about from the other panellists, and back to this new breed, and even what we might see on the horizon, or what additional challenges do you see coming for this new breed of data hunter?

Wilson: This is John Wilson again. Two things that I would comment on are mobile phones – there’s thousands of new mobile phones introduced every year. So that’s why a team approach is absolutely necessary, because there’s just so much [unclear] there’s too much for a single person to know. You need to balance strengths and weaknesses, and have a team approach that can cover the breadth and width of a matter that you’re involved in. And then secondarily, just new tech – bitcoin, investigations… bitcoin… traffic is certainly going to be of… or really blockchain, but bitcoin and blockchain are certainly of extreme interest in the coming years, even as healthcare and insurance start using it to manage their transactions and data transfers. I foresee that’s going to be a very prolific growing area.

Carolyn: Yeah, that’s interesting, financial services, banks seem to be exploring and moving in towards using the blockchain technology, which would eliminate a lot of intermediaries that data passes through, where there’s sort of a credentials, a small group of people who can exchange money, if you will. Yeah, that’s coming. Other thoughts on… from John Grim or Jason, and I think that what we also might do here… I don’t know if you guys wanted to comment on that subject, but I also think we could move to… maybe leave the audience with your one or two best practices for them, almost a summary observation of how these digital investigation teams are evolving inside corporations, how the demand for their services is growing, from HR and legal and compliance and audit, as the organisations are swamped with data types. What do you want to leave the audience with? And maybe we’ll go to John Grim and ask for your parting observations, John.

Grim: Sure. Data breaches, cyber-security incidents, they’re not just an IT problem anymore, they’re a problem and a burden that needs to be shouldered by multiple stakeholders. So [unclear] my parting advice would be to know who all the stakeholders are, their roles, responsibilities, and authorities, train together, have an incident response plan, a set plan, identify any gaps, address those gaps, and keep updating that plan, and making sure that everybody is on the [unclear] should an incident occur.

Carolyn: Great. Thank you, and Jason, let’s go to you.

Jason: I got to say again, that the key takeaways [of mine] would definitely tie into the… I agree with the documentation, keeping your policies and procedures, but also one of the things that we learned is practising – tabletop exercises, working through it, but not just with IT, but bringing in the HR and the legal components, so they know how to actually physically do the process when we get to them, not just on paper.

Carolyn: Thank you. And John Wilson. [unclear]

Wilson: My big thing is companies today still don’t understand where their key data is. Just really, they think, “Oh, it’s all in email, or it’s all on our file share,” and data is on cellphones, it’s on computers, it’s on the internet of things, devices, it’s everywhere. So a company really needs to – especially as a company grows – it really needs to assess and evaluate where their data is stored, and how they’re storing it, and how they’re managing that. Because so many of those devices have much lower security levels and all of a sudden you find out that all your customer account information is on your sales guy’s mobile phone – that he drops it and somebody has now published it all out to the world.

Carolyn: Boom – it’s scary, isn’t it? I know there was a [unclear] 48% of [BYOD] employees disabled security settings on their mobile devices without the company knowing. I see your point there.

I want to thank our wonderful panellists, John Grim of Verizon, John Wilson of Discovery Squared, and – I’ll give you their pictures here – Jason Britton of iHeartMedia. Thoroughly enjoyed hearing your insights and discussion. I hope the audience enjoyed it and benefitted from it as well.

In closing, we’re proud here at AccessData to provide forensics solutions for complex investigations and for data-heavy, large-scale e-discovery, helping our clients solve some of the challenges that were discussed today.

With that, I’ll ask the host if there’s any other housekeeping to be done, and otherwise I think we’re done. Thank you so much for joining us for this discussion on the new breed of data hunters.

End of Transcript
