Searching And Filtering Emails When Forensically Collecting Mailboxes

by Arman Gungor

When mailboxes are forensically preserved for eDiscovery or digital forensic investigations, their contents are almost always searched and filtered. Filtering emails helps overcome time, scope and cost constraints and alleviates privacy concerns.

There are two main ways of filtering emails: before the forensic acquisition or after it. Each method has its pros and cons, which we will discuss here.

Filtering Emails After Forensic Collection

This method involves forensically collecting each mailbox in its entirety. Once the collection is complete, each mailbox is ingested into eDiscovery or digital forensic investigation tools and searched before subsequent steps such as processing, analysis and review.

Pros

  • Flexibility — Case requirements, keywords and date ranges are all subject to change. It is not uncommon for a legal team to discover more search terms after they have started document review. When you have access to the entire mailbox, you can go back and re-run your searches without having to perform another acquisition.
  • More powerful search options — Digital forensic investigation and eDiscovery tools are able to extract attachments and embedded objects recursively, perform optical character recognition (OCR) on pages that do not have extractable text and create a searchable index of the entire mailbox. A well-designed tool also provides you with detailed reports on which documents cannot be searched (e.g., encrypted, corrupt or unrecognized files).
  • Familiar workflow — You are likely very familiar with the capabilities and search syntax of your eDiscovery and digital forensic investigation tools. It is an advantage to be able to use the workflow you are already comfortable with.
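The flexibility and reporting points above can be illustrated with a minimal Python sketch. It assumes the collected mailbox is already available as a dictionary of raw RFC 822 messages keyed by a hypothetical item ID; `extract_text` and `search_collection` are illustrative names, not functions of any particular forensic tool. It only does plain substring matching on the subject and plain-text body, so it does not extract attachments, run OCR or build a real index the way a dedicated tool would.

```python
from email import message_from_bytes, policy


def extract_text(raw):
    """Return lowercased subject + plain-text body, or None if nothing is searchable."""
    msg = message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        return None  # no plain-text body part (e.g., a binary-only item)
    try:
        return f"{msg.get('Subject', '')}\n{body.get_content()}".lower()
    except LookupError:
        return None  # body declares a character set Python does not know


def search_collection(collection, terms):
    """Search every collected message and report the items that could not be searched."""
    hits, unsearchable = {}, []
    for item_id, raw in collection.items():
        text = extract_text(raw)
        if text is None:
            unsearchable.append(item_id)
            continue
        matched = [term for term in terms if term.lower() in text]
        if matched:
            hits[item_id] = matched
    return hits, unsearchable
```

Because the whole mailbox is on hand, `search_collection` can be called again with a revised term list at any time, and the `unsearchable` list gives a simple account of the items the search could not cover.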

Cons

  • Collecting entire mailboxes can take a long time — Depending on the size of the mailbox and the capabilities of the server, you might have to allocate several hours to collect a mailbox in its entirety.
  • Privacy and scope concerns — The owner of the mailbox may object to the collection of their entire mailbox. The mailbox may contain confidential information that is outside the scope of the engagement. This issue is often exacerbated when collecting an opposing party’s mailbox.
  • Increased processing time & cost — Ingesting and searching mailboxes take time and often have associated costs—usually proportional to the data size. Collecting the entire mailbox increases processing time and cost as you would be starting off with a larger amount of data.

Filtering Emails Before Forensic Collection

I like to refer to this method as pre-acquisition searching and filtering. In this scenario, searches are run directly on the email server and the forensic email collection is limited to only the responsive emails.


Most email providers such as Gmail and Office 365, as well as on-premises Exchange and IMAP servers, support searching. Forensic email collection tools can perform nearly instantaneous searches on email servers and display the results.

Pros

  • Time savings — Having to collect a large mailbox (e.g., 200k+ messages) from a slow email provider such as Yahoo can take a long time. What if only a very small percentage of those messages were responsive? You could run a search on the server side within a couple of minutes and forensically collect only the few hundred responsive items rapidly.
  • Helps with privacy and scope concerns — I have run into many cases where I was simply not allowed to collect an entire mailbox due to privacy and scope concerns. I was instructed to limit the collection to only messages between certain individuals and within certain date ranges. Performing server-side email searching is the only way to accommodate such requests.
  • Reduced processing time & cost down the line — When the data universe is limited from the very beginning, a smaller amount of data will be run through subsequent steps such as ingestion, processing, analysis, and review. This often results in significant time and cost savings.

Cons

  • Changes in scope & instructions — If there is a change in the scope of the project, you might have to go back and perform a supplemental forensic collection using the revised search parameters.
  • Limited search capabilities — Server-side searching is much more limited in terms of functionality, especially when it comes to blanket keyword searches. Depending on the case, you could use server-side searches for filtering emails by recipients, subjects, dates, etc. The ability to search attachments depends on the capabilities of the server. File types that are not recognized by the email server would not be indexed for searching.
  • Search syntax learning curve — The search syntax that you would use to search emails on the server side changes from service to service, or from server to server in on-premises scenarios. For example, the search syntax in the Gmail application programming interface (API) is quite different from the Exchange Advanced Query Syntax (AQS) used in Exchange Web Services. The IMAP SEARCH command is a completely different ball game. The good news is that you would not need to deal with the APIs directly, and a well-designed forensic email collection tool would provide you with an intuitive user interface and guidance on the search syntax.
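To give a feel for how the syntax differs, here is a minimal Python sketch that expresses the same request (messages from one sender within a date range) as a Gmail search query and as IMAP SEARCH criteria. The function names are hypothetical, and the sketch only builds the query strings; it does not connect to any server. Exact date boundary behavior (time zones, and whether the server compares the Date header or its own internal date) varies by service and should be verified before relying on the results.

```python
from datetime import date

# IMAP requires English month abbreviations, so avoid locale-dependent strftime("%b").
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def gmail_query(sender, start, end):
    """Build a Gmail search string using the from:, after: and before: operators."""
    return f"from:{sender} after:{start:%Y/%m/%d} before:{end:%Y/%m/%d}"


def imap_search_criteria(sender, start, end):
    """Build IMAP SEARCH criteria; SINCE and BEFORE compare the server's internal date."""
    def imap_date(d):
        return f"{d.day:02d}-{_MONTHS[d.month - 1]}-{d.year}"
    return f'FROM "{sender}" SINCE {imap_date(start)} BEFORE {imap_date(end)}'


print(gmail_query("alice@example.com", date(2020, 1, 1), date(2020, 2, 1)))
print(imap_search_criteria("alice@example.com", date(2020, 1, 1), date(2020, 2, 1)))
```

The two outputs describe the same intent in different dialects, which is the kind of translation a collection tool's user interface would normally handle for you.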

A Hybrid Approach

Some email archival tools have the option to filter emails during acquisition. They do not execute the search on the server side. Instead, they download each message, evaluate it against the search criteria, and then save it or discard it based on responsiveness.

This method does not have the advantages of pre-acquisition searching on the server side, because you still have to download all the emails. Because attachments are not extracted, OCRed (when necessary) and indexed on-the-fly, you do not get the benefits of using a powerful eDiscovery or digital forensic investigation tool for performing the search, either.
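The download-then-evaluate pattern described above can be sketched in Python as follows. This is an illustration under assumptions, not how any specific archival tool works: `download_all` is a hypothetical callable that yields `(item_id, raw_message)` pairs, and the criteria are limited to participants and a date range checked against the message headers.

```python
from datetime import date
from email import message_from_bytes, policy
from email.utils import getaddresses


def is_responsive(raw, participants, start, end):
    """Check one downloaded message against participant and date criteria."""
    msg = message_from_bytes(raw, policy=policy.default)
    date_header = msg["Date"]
    if date_header is None or date_header.datetime is None:
        return False  # sketch choice: undated items are treated as non-responsive
    if not (start <= date_header.datetime.date() <= end):
        return False
    fields = [str(v) for name in ("From", "To", "Cc") for v in msg.get_all(name, [])]
    addresses = {addr.lower() for _, addr in getaddresses(fields)}
    return bool(addresses & participants)


def filter_during_acquisition(download_all, participants, start, end):
    """Keep only responsive messages; every message is still transferred first."""
    kept = {}
    for item_id, raw in download_all():
        if is_responsive(raw, participants, start, end):
            kept[item_id] = raw
    return kept
```

Note that the loop touches every message in the mailbox, so the transfer time is the same as for a full collection, and attachment contents are never examined.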

Conclusion

Filtering emails before forensic collection and filtering them after it both have their pros and cons. In some cases, you may find that only one of these methods is a viable option. For instance, if you are restricted from forensically preserving the entire mailbox of an opposing party due to privacy concerns, you might have to perform your search on the server side. On the other hand, if you have a long list of complex queries with proximity and Boolean operators, which need to be executed on all documents (including attachments and documents without extractable text), ingesting the emails into your eDiscovery or digital forensic investigation tools and performing the searching and filtering there might be the only option.
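As an illustration of the kind of query that typically calls for post-collection tools, here is a minimal Python sketch of a proximity search combined with a Boolean condition. The tokenizer and the `within` function are simplified assumptions for illustration; real eDiscovery tools have their own tokenization rules and query syntax.

```python
import re


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within(text, term_a, term_b, max_distance):
    """Return True if term_a and term_b occur within max_distance words of each other."""
    tokens = tokenize(text)
    positions_a = [i for i, tok in enumerate(tokens) if tok == term_a.lower()]
    positions_b = [i for i, tok in enumerate(tokens) if tok == term_b.lower()]
    return any(abs(i - j) <= max_distance for i in positions_a for j in positions_b)


text = "The merger was announced before the final price was agreed."
# Proximity AND NOT: "merger" within 6 words of "price", and "draft" absent.
responsive = within(text, "merger", "price", 6) and "draft" not in tokenize(text)
print(responsive)
```

Running such a query requires the full extracted text of each message and attachment, which is why it is usually executed after ingestion rather than on the email server.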

In some cases, you might have the flexibility to choose which method to use. It is important to be familiar with both options and understand the trade-offs so that you can make informed decisions.

About The Author

Arman Gungor, CCE, is a digital forensics and eDiscovery expert and the founder of Metaspike. He has over 21 years’ computer and technology experience and has been appointed by courts as a neutral computer forensics expert as well as a neutral eDiscovery consultant.
