We’re Not in Document Land Anymore: Modern Data Collections for Litigation Technology Professionals

Gail: Hello everyone. Thanks for joining today’s webinar entitled “We’re Not In Document Land Anymore: Modern Data Collection for Litigation Technology Professionals”. My name’s Gail Green and I’m the marketing specialist here with Cellebrite Enterprise Solutions. There are a few notes that I’d like to review before we get started. We’re recording the webinar today and we’ll share an on-demand version after the webinar is complete. If you have questions, please submit them in the questions window and we’ll answer them in our Q&A at the end of the webinar. If we don’t get your question, we’ll follow up with you afterwards. Now I’d like to introduce our speaker today, Monica Harris.

Monica is an experienced E-discovery professional and over the past 15 years has specialized in development, implementation, and training of proprietary software for companies such as KrolLDiscovery and Consilio. Before joining Cellebrite, she worked with the US Food and Drug Administration, where she oversaw policy and procedure curation as well as enterprise solution rollout and training for agency litigation and Freedom of Information requests.

Monica’s an active leader and mentor in the E-discovery community. She is also the immediate past president of the Association of Certified E-Discovery Specialists (ACEDS), DC chapter, a board member of the Master’s Conference and committee chair of the DC Master’s Conference. Monica has served as the assistant director for Women in eDiscovery (WiE), DC chapter, and collaborates frequently with the DCIE, the DC Bar and the Women’s Bar Association of DC. Thanks so much for joining us today, Monica. And if you’re ready, I’ll hand it over to you to get started.

Monica: Thank you Gail. And thank you everyone for joining us for today’s webinar: “We’re Not In Document Land Anymore: Modern Data Collection for Litigation Technology Professionals”. As Gail shared with everyone, my name is Monica Harris. I’m the product business manager for Cellebrite Enterprise Solutions and I’ll be happy to host the webinar today.

So, for today’s discussion, we’re going to take everyone on a journey from modern data collection 1.0 to modern data collection 2.0. Here at Cellebrite Enterprise Solutions, we describe modern data as mobile, cloud and computer. And when we say 1.0, we are talking about modern data collection before the pandemic, specifically. So let’s take a look at some of the challenges that we had with modern data collection for mobile, computer and cloud before 2020.



Starting with a physical image is a best practice for litigation technology professionals. Before the pandemic, we saw the…just the general idea that in order to not miss any data or to not miss any relevant facts, specifically, you collected everything. If you have everything, then you can’t miss anything. Some people would say a lot of this philosophy not only came from some conservative legal professionals but also came from the forensic examiners or the litigation technology collection professionals themselves.

A lot of that may have had something to do with the fact that we saw a transition of individuals who had done forensics collection in the public sector (so that would be for our federal, our state, our local agencies, even our law enforcement) having successful careers retiring and then transitioning into the private sector. For criminal cases, we always went and collected all the data. Then we saw that best practice move into the private sector as well.

So, with all of the data that was being collected, either because it was a best practice that had followed individuals from sector to sector, or just to ensure that nothing was missed because everything was had, usually that data was culled after collection. But as devices grew larger, whether it was the amount of memory on your computer or even the size of your phone, that meant that collections were growing because we collected everything. And you could see on the left hand side of the screen, we have a stat for expenditures. From 2012 to 2019, you almost see that stat double. That’s because the devices themselves were growing in size and we were collecting everything.

In addition to collecting everything, you saw forensic examiners and litigation technology professionals traveling in order to do collections. You still see some of that now, but it was a little different before the pandemic in the fact that collections were centralized. If there’s 15 to 20 custodians on average per case, they all know each other, more than likely they’re working at the same organizations.

This meant that litigation technology collection professionals could go to a site and at that site they could either ask everyone to bring their device into a room (of course with everyone already being in the building), or they could travel around the building to individuals and collect from them in that manner. There was a cost associated with the travel, but that was a pass-through cost. So it was a very common practice to see your collection professionals out of the office up to 70% of the time.

In addition to that, those traveling litigation technology collection professionals, they were using very sophisticated collection tools, and we still have sophisticated collection tools to this day. The difference being then…well there’s really not a difference, you really see the same thing between 1.0 and 2.0 in terms of the sophistication. I think there’s just a difference between whether or not we’re talking about certifications that are industry specific. So for E-discovery for example, we have ACEDS. ACEDS is not vendor specific, it won’t necessarily train you on how to use a tool, but it will give you an excellent foundation for E-discovery.

It’s very similar in forensics or for collection professionals. For those sophisticated tools, there were certifications from the vendors themselves that were required to operate the tools. So, here for Cellebrite, you see one of our certifications is the Cellebrite Certified Operator. But the general idea is that if you were working with forensic software, you were certified in that tool.

Before the pandemic, we also began to see the rise of social media and as a result of that, our collection tools had to evolve in order to collect it. Facebook, Twitter and Instagram became more prevalent in discovery as civil cases began to open up to the idea that not only could you find a wealth of discovery and communications like email, you could also find it in social media where we were posting and talking about things that were happening, and what we were talking about, that could be relevant to the case at hand.

There was a consistent (or maybe even a persistent) conversation going on about whether or not we could collect from those sources. If an individual posted information about their weekend that was relevant to the case, say, it was an injury case and they were out and about frolicking on the weekend, but during the weekday it was a little bit of a different story. But let’s say that post, for example, was private, just private to the friends of that individual. Was it private enough to not be used in discovery?

That was an ongoing conversation, but we definitely saw a rise in communication on social media platforms, and then we also saw emerging technologies that focused on social media collection specifically as well. So, collection tools like Onna and Hanzo for example, that had connectors that would attach specifically, or connect specifically, to social media platforms.

We can’t talk about social media platforms without talking about the rise in chat applications as well. On the left hand side of the screen, there’s a stat for WhatsApp, and before the pandemic, WhatsApp had 1.5 billion users. After the pandemic (or now), WhatsApp has 2.44 billion users. And just for context, there are 7 billion people in the world. So, right now WhatsApp has a third of the people in the world on their platform, but we began to see that rise, that adoption before the pandemic.

For a lot of companies that gave out company issued devices, we didn’t have WhatsApp, right? There was MDM software or other protocols in place that meant that you could not add chat applications to your phone. But we also began to see the rise in BYOD before the pandemic as well, or employees who were using their own devices to enable work.

Because we saw the rise of social media platforms that we just talked about and also this rise in chat applications, we also began to see employees talking to each other across platforms. So, now that there were additional messaging applications, some of them associated with social media, so Facebook Messenger, Instagram Messenger, for example, you could start a conversation via text message and then you could move onto a social media site and message in place with their messaging app.

So, you would see individuals move from text messages to Facebook to WhatsApp as they wanted to talk about things that were relevant at that moment to an individual. And so, you saw the rise of social media, the rise of chat applications, and then individuals beginning to talk across platforms.

Last but not least, before the pandemic review platforms rendered modern data very similar to how they rendered data from documents. So, of course with email when we are talking about communications, there is one communication per document, right? That makes sense until your communication goes from paragraphs to a sentence or a few words and the communications become continuous. There were options to either review one chat or one message per document, or you had the ability to review them in a spreadsheet, for instance. I’ve seen many a spreadsheet loaded to a document review platform to have a team of legal professionals review it.

There are caveats to that as well. Whereas you weren’t necessarily looking at one chat or one message per document and you had context because you were looking at the main Excel spreadsheet, it wasn’t necessarily the most streamlined way to review data if you had a team of legal experts looking at discovery, right? A team of individuals working with a single document. In addition to that, searching and filtering and redacting that data, whether it was all in one spreadsheet or whether it was messages spread across multiple documents, wasn’t fluid either.

So, now that we’ve talked about some of the challenges that we saw with Modern Data Collection 1.0, or some of the challenges that we saw with modern data collection before the pandemic, let’s talk about the resolution. So, for “physical imaging is the best practice for litigation technology professionals,” technology companies that built forensic software needed to build more robust collection tools that could transfer large amounts of data. When we think about that stat that we saw earlier, the expenditure of E-discovery collections growing, we did reference the fact that that had in part something to do with the fact that the devices themselves were getting larger.

We went from 8 to 16 to 32 gigs in terms of phones, and at that time maybe even had 256 and 512. Those were large amounts of data that, depending on how the collection was being conducted, may have needed to move across the network; it just depended on whether or not you were collecting online or locally.

And then we also saw a slight change. So, in addition to doing that physical, we also began to look at full file system, right?

The difference between a full file system and an image being unallocated data, right? You still had the ability to pick up some deleted data but whether or not you had unallocated, and whether or not you might find your smoking gun there. So, we began to consider ways of collecting data that was a little bit more flexible for the amount of data that we were collecting.

In addition to that, we needed to create continuous training for collection professionals because those tools that were collecting data, as we saw social media evolve, as we saw chat applications evolve, the collection tools were evolving as well, and that meant that the experts who were using them, they needed to stay up to date.

Particularly if you were traveling, that could mean that you were the person…you were the person that was there and you may not have access to someone. It could have been after hours, it could have been a remote location and you didn’t have access to ask anyone for assistance, so making sure that you were up to date with the latest and greatest certification was important. Whether that was to run the tools right there during the collection, or whether or not that was to testify at a later time about the collection that was performed.

We had to incorporate social media and chat application collection into our forensics tools. There was a question of whether or not we would have that ability in existing tools so that you could add to what was already in your forensics toolkit, or whether or not you were bringing new tools into your forensics toolkit.

We also had to consider when taking a look specifically at chat applications, whether or not we were looking at chat applications which were for personal use, like WhatsApp for example, or whether we were looking at chat applications that were administered by your IT department. Because we began to see the rise of chat applications like Slack and Teams, for example. And then we had to think about what came beyond, what comes beyond social media and chat? Like the internet of things, for example.

Last but not least, we had to reimagine the review of modern data in review platforms, so that we were not looking at text messages or chat data the same way that we look at documents. We were looking at it in its native format, or the same way that it displays on our phones, or if you’re using Slack on your desktop for instance, the same way that it is rendered there. Review of native format that easily lent itself to searching and filtering, so that you could quickly get to the relevant content and also, for going from collections to review platforms, export of that data in a format that review platforms could handle.

All right, now that we’ve taken a look at the resolution for 1.0, let’s take a look at some of the solutions that we created for 1.0 collections. Three ways to collect the most data for traveling litigation technology collection professionals, starting with UFED. For forensic professionals who were traveling and were in the field, we had not only UFED for local collections, you could take a dongle with you and perform collections there, but we also had detachable for individuals who were in the field but wanted to keep their license in a centralized repository.

We also had Digital Collector, formerly known as Macquisition. Digital Collector is also something that could be taken into the field, or easily taken into the field, by forensics collectors because it is on a USB drive. A USB drive that initially started at 512GB but now moves up to 1 terabyte. Digital Collector can collect from both Windows and Mac, but it was formerly known as Macquisition because of how well it collects from Macs, including ones with T2 encryption.

And last but not least, Inspector, which not only has the ability to collect from Windows and Macs, but also brings in the analysis portion of what could be happening during collections for discovery. So, you had the ability to not only triage your computer data before you collected it, but once you collected it, you can then perform a deeper analysis of the data before you exported it in load file format so that it could be loaded to a review platform.

You also saw the integration or a standalone option for social media and chat with our Cloud Analyzer solution. Cloud Analyzer could be integrated into Physical Analyzer. So, if you already had Physical Analyzer as part of your toolkit, Cloud Analyzer could easily be set up within that. Or Cloud Analyzer could be set up by itself, giving collection experts and then also analysis experts some flexibility there in terms of how they wanted to use the product. And last but not least, certifications.

Earlier, we spoke about the Cellebrite Certified Operator Certification, but Cellebrite actually has several certification programs, a few of which are listed here, and cover all of the modern data types that we have talked about: mobile, computer and cloud, and even moves from collection into analysis.

All right, so now that we’ve taken a look at the challenges, the resolutions and the solutions for Modern Data 1.0, or the way that we collected from those various sources before the pandemic, let’s take a look at how things have changed after the pandemic, or right now today, and some of the challenges that we’re seeing there.

Starting with our hybrid workforce. I think that before we all came back into the office, before vaccinations and boosters and things of that nature, we were all remote. But most of us are still to some extent in a hybrid, a little bit of being in the office, a little bit of being at home. So, what that means is our devices are now moving with us. If you’re in the office, your computer, your laptop, your phone, it’s there. You could take your laptop and your phone home, you may have devices in both. So our devices are moving with us.

In addition to that, you could be using your own device to enable work, whether that’s your phone and that BYOD that we had seen become more prevalent, which rose in usage after the pandemic, or whether or not you were using your own laptop and connecting via VPN, for example. And in addition to that, because of the hybrid workforce, we’re now moving in and out of the office, our devices are moving with us, and perhaps being out of the office doesn’t necessarily mean you’re at home. Maybe you took advantage of the fact that a majority of the time you’re not in the office and you were traveling, but still working. That meant that our devices could be anywhere, we could be anywhere, the data could be anywhere and collections were no longer centralized.

We began to see consistent or persistent updates for not only the hardware that we had, but then the software that went with it. So, when we’re talking about mobile devices specifically, the number of updates that we began to see for our devices increased. In 2022 alone there have been several updates to iOS. I know a lot of us are looking forward to iOS 16, for example, that will be available in the fall. In addition to that, consumers are usually on plans where you’re upgrading a minimum of every two years, if not sooner. And then of course, for security reasons and even feature enhancements, we begin to see more updates in chat applications that we use on our phones as well.

We talked a little bit about BYOD when we talked about the hybrid workforce, but we began to see much, much more of it, a shift towards BYOD and a shift away from company owned devices. And with that shift, we began to see more of a mix of business and personal data on our phones. You could be using your phone to make calls to customers, but at the same time you could have pictures from your birthday party that weekend, for example.

A lot of the reason when we took a look at several surveys of why individuals were transitioning when they had a choice of having either the company issued device or their own device was functionality. It could be that you had the Apple ecosystem in your home and the company was not offering Apple phones, and so you might choose to use your own device or vice versa. You could be an Android user and the company was offering iPhones, which we see more prevalent in North America.

We saw a lot of critical data that could be found in chat applications and workplace apps as well. And we talked a little bit about it with 1.0 collection, with chat applications being WhatsApp, Facebook Messenger, Signal even, and then the workplace apps being more like Slack or Teams, for example. With that shift to hybrid work and with employees and also their devices being just about anywhere, we began to see a rise in the amount of data that was in chat applications. It’s not that you saw any less email in your inbox, it’s that you saw more chat or more communication in the chat collaboration apps themselves.

We had moved away from the phone. If you wanted to have that important conversation, perhaps you documented it in an email or perhaps you started it in a chat application and documented it in email afterwards. So, we began to see this critical data that was then found in these chat applications and work apps when before the pandemic, we saw more of that critical data in our inboxes.

And the amount of data that we could collect began to grow. So, we talked about the fact it’s really the culmination of everything that we’ve talked about to this point. So the fact that we were hybrid, the fact that we needed quicker turnaround time to some of our communications, even though we were still communicating via email, we were communicating more on workplace apps as well. And then the devices themselves were growing in order to be able to support the load. We began to see one terabyte phones for example.

So, there was increased volume, velocity and variety of data due to where you could be using it, how often you could be using it, and then the devices and the operating systems updating themselves. It meant that culling data at the time of collection became more important than ever. We talked about, for 1.0, how we would collect everything and cull later, but there was so much data being generated throughout chat applications, across our phones, in cloud that it really became more important to cull at the time of collection if you could, or be selective about that collection.

And of course the time and cost that it takes to review the information or the data: if you were collecting everything and we were generating far more data, primarily because we were no longer centralized in an office, then that was all data that you had to review, and that would mean that your review costs would go up.

We also saw mobile data preservation. With more critical information being on our phones and even in chat applications, we then not only had to preserve data that was in Office 365, we had to preserve data that was on phones as well. Collect to preserve became a best practice, but unlike going into your email or your Office 365 to collect, collect to preserve on phones was not necessarily continuous preservation. So, there was the need to go back to the source and preserve at regular intervals to make sure that you had the most recent copy of the data for the preservation.

But as you went to these phones, as you went to these chat applications, which were often BYOD, then the concerns about privacy came into play, right? Making sure that if you were going in to preserve text messages, that you were only preserving the text messages that were related to business and not to personal information.

Still even after the pandemic, or today (I don’t necessarily know if you could say that we’re past the pandemic), there was still the need for document review platforms to render the data natively or in a way that we were used to seeing it, that we could easily traverse it, that we could easily review it, search it, filter it, redact it, and produce it.

So, those are the challenges that we see today with mobile data. Then let’s take a look at the resolutions that we have today for some of those challenges that we just talked about. The resolution today is to build collection tools that can collect from employees and their devices anywhere that they can be located. This directly addresses the pain point of having a hybrid workforce. The fact that individuals could be anywhere: at home, in the office, maybe they bought that RV and they’re going around the country, but their devices and their data are no longer centralized. It’s no longer a best practice to send individuals out.

There are still individuals or, you know, litigation technology collection professionals, forensic examiners, who are traveling. We do see that. But with employees being anywhere when those professionals travel to collect, that has changed. And as a result of that, we have seen the time to get to the device grow because they’re everywhere. So, either now devices are coming back into a centralized point or perhaps someone is going out depending on the case to collect, or perhaps we’re even shipping out licenses and we’re on very long support calls, walking individuals through how to collect the data. But we’ve seen an extension, a longer time to get to that data. Although the amount of time that we have to review the data, that hasn’t changed.

We need to enhance collection tools with selective collection, specifically with an eye towards employee privacy. That mix of personal and business that could be on your laptop or your mobile, employees as custodians are highly sensitive to that. The idea that we’re going to go and collect everything, that was very much a 1.0 collection best practice. And today we’re looking for collection tools that have the ability to be targeted so that we can make sure that we have employee privacy at the forefront of our minds.

Enhanced collection tools with the ability to collect critical data in messaging apps. Because BYOD means messaging apps can be used alongside workplace apps, then we are seeing that rise, that rise in critical data in those apps. So, we have to make sure that we have the ability to get to the data there. So not just your SMS or your text messages, but we also need to be able to get to chat applications too, distinguishing between what you may be using personally and what your company may be administering. But we also need to make sure that we can get to chat applications because individuals are talking across platforms. We began to see the rise of that before the pandemic, but it is far more prevalent today.

And lastly, enhancing collection tools to seamlessly convert mobile data and chat applications into a format that review platforms can easily render. That way we have that native review, a review in a document review platform that looks very similar to how you would see data that is rendered on your phone. In the past that had been a manual process, a manual process that may not necessarily have been repeatable, perhaps not even defensible. So, automating that process is definitely a best practice.

So, what does that mean for the solution of modern day collections today? One way: one way to collect and preserve for mobile devices. So that includes not only the SMS or the MMS that you can find on your phone, but your chat applications as well. For 1.0, we talked about three different solutions, for 2.0, we have one solution. One solution that can collect from Mac computers, Windows computers, iOS devices (whether that be phone or tablet), Android, and also chat applications such as WhatsApp. And that one solution is called Endpoint Inspector. Endpoint Inspector not only has the ability to collect from the different sources that I just named, but it also has the ability to do targeted collection.

Targeted collection with the thought in mind of protecting employee data. So, we don’t need to go after everything that’s on the phone or all of the artifacts on the phone. If you’re just interested in taking a look at an individual’s text messages and the call logs, only that information can be collected from the phone and that information can be collected remotely. So regardless of whether the employee may be in the office, at home, or traveling, you can still get to the relevant data as quickly as possible anywhere that it may reside.

In addition to that, we have one solution for getting to that critical data in chat applications, and that also has the ability to include deleted data as well. And that solution is called Mobile Elite SaaS. Mobile Elite SaaS has full file system collection. We talked a little bit about that when we were talking about modern data collection 1.0, moving from the physical image to full file system, which is how we have the ability to collect deleted data, but also access chat applications such as WhatsApp, Telegram and Signal.

For Mobile Elite SaaS, because it is a SaaS solution, all of the updates for the tool are in the Cellebrite hosted cloud. We are constantly updating because chat applications and the operating systems for our mobile devices are constantly updating. So, we’ve also removed some of the complexity from that piece as well.

There is no longer a need to constantly update, as the tools are consistently updating because the applications that we’re collecting from are consistently updating. You can plug in and collect from some of the more sophisticated applications that are available today.

And one way to convert mobile data, including chat apps to a format that review platforms can ingest and we call that solution Legalview, which is integrated with Physical Analyzer. Legalview is going to allow you to take data from either mobile, iOS, Android, that could be SMS, MMS, or that could be some of the more popular chat applications that we see today. And it’s going to convert that data into a form that document review platforms can easily handle and render natively, including RSMF.

In conclusion to today’s webinar, the key takeaways from today’s presentation are the following: the Cellebrite collection, analysis and review ecosystem for the modern day workplace or modern collection 2.0 has remote collection for mobile, computer and cloud. In addition to that, we have selective or targeted collection that could be used to protect employee data on mobile and computer. We have full file system collections so that you can get access to some of the more critical data in chat applications, including deleted data.

And we also have conversion for the review of modern data that’s mobile, computer and cloud for review platforms. If you’re interested to know more, please reach out to us. We can be reached on Twitter or LinkedIn at @CellebriteES. Thank you for joining today’s webinar.

Gail: Okay, Monica, we did have some questions that came in while we were talking. The first one is: do Cellebrite certifications require continuing education credits?

Monica: Well, that’s a great question. Yes, we spoke about several certifications during the presentation. So for the CCO and the CCPA certifications, those do require re-certification every two years. And for the CCME, that is every year. Great question.

Gail: Great. Next question is: can UFED perform full file system extractions for iOS and Android devices, or just one of those?

Monica: Oh, that’s a great question. So, UFED does have the ability to perform full file system extractions for iOS, and also for Android. During the presentation for mobile collection 2.0, we also spoke about Mobile Elite, which has the widest range of full file system collection capability for iOS and for Android as well. So, while you will have full file system extraction capability with UFED, Mobile Elite is going to give you that capability for the widest range of iOS and Android devices.

Gail: And the last question we have time for today: can any of the Cellebrite Collection products collect Signal data?

Monica: That is a great question. Absolutely. So, with full file system capability…well, let me pull back…so we can collect Signal data. How much of it that we’re going to collect depends on the extraction type. So, throughout the presentation, not only with the last question that we had, but when we were talking about some of the pain points of modern data 1.0 collection, and then some of the solutions for modern data 2.0 collection, for full file system extraction, you’re going to see the most data and that includes chat applications, like Signal.

Gail: Great, that’s super helpful. Overall, I think (and I think everybody will agree) that this webinar has been very useful. So thanks so much Monica. We’ve had a few more questions come in that we didn’t have time for, however, we will reach out to each of you individually after the webinar to answer those questions.

Also, after the webinar, you’ll see a prompt asking for some feedback on what topics you’d like to see going forward. We’d really appreciate it if you’d take a minute to help us decide future webinars. Thanks again, Monica, and thank you all for joining us today. Have a great rest of your day.
