What Changes Do We Need To See In eDiscovery? Part VII

Harold Burt-Gerrans talks about how to deal with structured data in ediscovery cases.

I’m baaaaa–ack! And you thought I was done with my earlier six-part series, but I have a new topic to
add to my rants and raves. For review, the previous series was:

And today’s new topic – Structured Data – What is it and how to deal with it

Data Types

We typically classify the documents we tend to see in eDiscovery as emails, power points, spreadsheets,
etc., and these comprise the majority of the documents in the review. However, this document
classification is really a second-level classification – the primary level is “Unstructured Data” vs. “Structured
Data”.

Unstructured Data: This data is free flowing, without any rigid rules as to how the content is organized.
Written documents such as Word files, and all the file types listed above, are typically considered
unstructured data.


Get The Latest DFIR News

Join the Forensic Focus newsletter for the best DFIR articles in your inbox every month.


Unsubscribe any time. We respect your privacy - read our privacy policy.

Structured Data: This data is organized with defined rules – the most common types of files would be
database files, where the database consists of data tables, and each table consists of a set of columns
containing a specific type of data.

In the strictest interpretation, data files are either one or the other: structured or unstructured. That said,
there are times when there may appear to be crossover points, such as a spreadsheet that contains only a
worksheet with a defined set of columns and each row is an individual record. For example, Column A is
‘Name,’ Column B is ‘Street Address,’ Column C is ‘City,’ and so on. Conversely, there can also be free
formed text fields inside of a database – like a product description, or some other field containing text
describing an object/situation.

The eDiscovery industry has been managing unstructured data for years, so I don’t need to discuss how
well we do it, with consideration to my suggestions in the previous six articles.

Dealing with Structured Data

There are going to be exceptions to the situations I present below, however, there is a good possibility
that if you are receiving structured data, it can be managed as I will describe. Oddly, the processes being
discussed are designed to convert structured data into unstructured document types that can be
handled in typical eDiscovery platforms.

1) Packaged Application Data: There are a lot of packaged applications where the data is stored in
an underlying database. A very common example of this is an accounting system such as SAP or
QuickBooks (one extreme to the other). Depending on the nature of the matter, accounting data
may be significant. Typically packaged applications come along with either a defined set of
reports, or with a built-in report writer tool that can be used to extract the data in a format that
makes sense because the application is providing the contextual structure needed to
understand the data.

The best approach to working with this data is to work with the client staff who usually create
or use the data, to pull off reports that contain the information that would be relevant to the
matter. For example, in a Mergers and Acquisition matter, you would work with the accountants
to extract GL Ledger Reports, Balance Sheets, A/R and A/P subledger reports, etc.

2) Proprietary Databases: These are becoming more common as companies implement systems to track corporate specific data. These may be done using in-house development staff, or by leveraging external application hosting environments, such as implementing a customized complaint tracking system within SalesForce or a Support Ticketing System within a case management tool.

How do we handle these? Read part eight here.

About The Author

Harold Burt-Gerrans is Director, eDiscovery and Computer Forensics for Epiq Canada. He has over 15 years’ experience in the industry, mostly with H&A eDiscovery (acquired by Epiq, April 1, 2019).

Leave a Comment

Latest Videos

In this episode of the Forensic Focus podcast, Desi and Si discuss different online programming courses and what they think about the popular platform, Udemy. They also talk about Flipper, Dev boards, and Raspberry Pi, and delve into the fascinating phenomenon of running the classic game Doom on unlikely devices.

Throughout the episode, Desi and Si share their digital forensics expertise, referencing some of the cases they have been working on and highlighting particular methodologies and technologies that have an impact on cybersecurity.

Show Notes:

100 Days of Code: The Complete Python Pro Bootcamp for 2023 - https://www.udemy.com/course/100-days-of-code/

Domestika - https://www.domestika.org/en

MIT OpenCourseWare - https://www.youtube.com/@mitocw 

MasterClass - https://www.masterclass.com/

Raspberry Pi 400 Complete Kit - https://core-electronics.com.au/raspberry-pi-400-kit.html

Flipper Discord - https://discord.com/invite/flipper

Flipper Zero - https://flipperzero.one/

This Programmer Figured Out How to Play Doom on a Pregnancy Test - https://www.popularmechanics.com/science/a33957256/this-programmer-figured-out-how-to-play-doom-on-a-pregnancy-test/

Here’s a dude playing Doom Eternal on his fridge - https://www.polygon.com/2020/10/13/21514933/doom-eternal-refrigerator-door-samsung-smart-refrigerator-xbox-game-pass-richard-mallard

Doom hacker gets Doom running in Doom - https://www.pcgamer.com/doom-hacker-gets-doom-running-in-doom/

Doom Running On A Calculator Powered By Old Potatoes - https://kotaku.com/doom-running-on-a-calculator-powered-by-old-potatoes-1845374069

GoldenEra - https://www.imdb.com/title/tt11753760/

Racing the Beam - https://en.wikipedia.org/wiki/Racing_the_Beam

High Score (TV series) - https://en.wikipedia.org/wiki/High_Score_(TV_series)

Microcontroller Courses (Udemy) - https://www.udemy.com/topic/microcontroller/

The story of Final Fantasy XIV’s renegade do-good modders - https://www.pcgamesn.com/final-fantasy-xiv/ffxiv-modders-renegade-do-gooders

Logical fallacies - https://yourlogicalfallacyis.com/

In this episode of the Forensic Focus podcast, Desi and Si discuss different online programming courses and what they think about the popular platform, Udemy. They also talk about Flipper, Dev boards, and Raspberry Pi, and delve into the fascinating phenomenon of running the classic game Doom on unlikely devices.

Throughout the episode, Desi and Si share their digital forensics expertise, referencing some of the cases they have been working on and highlighting particular methodologies and technologies that have an impact on cybersecurity.

Show Notes:

100 Days of Code: The Complete Python Pro Bootcamp for 2023 - https://www.udemy.com/course/100-days-of-code/

Domestika - https://www.domestika.org/en

MIT OpenCourseWare - https://www.youtube.com/@mitocw

MasterClass - https://www.masterclass.com/

Raspberry Pi 400 Complete Kit - https://core-electronics.com.au/raspberry-pi-400-kit.html

Flipper Discord - https://discord.com/invite/flipper

Flipper Zero - https://flipperzero.one/

This Programmer Figured Out How to Play Doom on a Pregnancy Test - https://www.popularmechanics.com/science/a33957256/this-programmer-figured-out-how-to-play-doom-on-a-pregnancy-test/

Here’s a dude playing Doom Eternal on his fridge - https://www.polygon.com/2020/10/13/21514933/doom-eternal-refrigerator-door-samsung-smart-refrigerator-xbox-game-pass-richard-mallard

Doom hacker gets Doom running in Doom - https://www.pcgamer.com/doom-hacker-gets-doom-running-in-doom/

Doom Running On A Calculator Powered By Old Potatoes - https://kotaku.com/doom-running-on-a-calculator-powered-by-old-potatoes-1845374069

GoldenEra - https://www.imdb.com/title/tt11753760/

Racing the Beam - https://en.wikipedia.org/wiki/Racing_the_Beam

High Score (TV series) - https://en.wikipedia.org/wiki/High_Score_(TV_series)

Microcontroller Courses (Udemy) - https://www.udemy.com/topic/microcontroller/

The story of Final Fantasy XIV’s renegade do-good modders - https://www.pcgamesn.com/final-fantasy-xiv/ffxiv-modders-renegade-do-gooders

Logical fallacies - https://yourlogicalfallacyis.com/

YouTube Video UCQajlJPesqmyWJDN52AZI4Q_5f72B6DD5wk

Programming Languages, Flipper And Gaming

Forensic Focus 24th May 2023 11:43 am

In this episode of the Forensic Focus podcast, Si and Desi talk to Mackenzie Jackson, Developer Advocate at Git Guardian. 

Mackenzie discusses the problem of hard-coded and leaked credentials in Git repositories, the task of scanning Git repositories for leaked credentials, and how that’s helped by the setup of GitHub and Git. 

He also looks at some public and private cases of security breaches through Git repositories and recommends tools you can use to combat attackers on Git. 

Show Notes:

Toyota Suffered a Data Breach by Accidentally Exposing A Secret Key Publicly On GitHub (GitGuardian) - https://blog.gitguardian.com/toyota-accidently-exposed-a-secret-key-publicly-on-github-for-five-years/

GitHub.com rotates its exposed private SSH key (Bleeping Computer) - https://www.bleepingcomputer.com/news/security/githubcom-rotates-its-exposed-private-ssh-key/

Conpago - https://www.conpago.com.au/

Source Code as a Vulnerability - A Deep Dive into the Real Security Threats From the Twitch Leak (GitGuardian) - https://blog.gitguardian.com/security-threats-from-the-twitch-leak/

Teenagers Leveraging Insider Threats: Lapsus$ Hacker Group (Forbes) - https://www.forbes.com/sites/emilsayegh/2023/03/15/teenagers-leveraging-insider-threats-lapsus-hacker-group

Lapsus$: Oxford teen accused of being multi-millionaire cyber-criminal (BBC) - https://www.bbc.co.uk/news/technology-60864283

Dynamic Secrets (HashiCorp) - https://developer.hashicorp.com/vault/tutorials/getting-started/getting-started-dynamic-secrets

Crappy code, crappy Copilot. GitHub Copilot is writing vulnerable code and it could be your fault (GitGuardian) - https://blog.gitguardian.com/crappy-code-crappy-copilot/

trufflesecurity/trufflehog (GitHub) - https://github.com/trufflesecurity/trufflehog

gitleaks/gitleaks (GitHub) - https://github.com/gitleaks/gitleaks

Git (Wikipedia) - https://en.wikipedia.org/wiki/Git

awslabs/git-secrets (GitHub) - https://github.com/awslabs/git-secrets

In this episode of the Forensic Focus podcast, Si and Desi talk to Mackenzie Jackson, Developer Advocate at Git Guardian.

Mackenzie discusses the problem of hard-coded and leaked credentials in Git repositories, the task of scanning Git repositories for leaked credentials, and how that’s helped by the setup of GitHub and Git.

He also looks at some public and private cases of security breaches through Git repositories and recommends tools you can use to combat attackers on Git.

Show Notes:

Toyota Suffered a Data Breach by Accidentally Exposing A Secret Key Publicly On GitHub (GitGuardian) - https://blog.gitguardian.com/toyota-accidently-exposed-a-secret-key-publicly-on-github-for-five-years/

GitHub.com rotates its exposed private SSH key (Bleeping Computer) - https://www.bleepingcomputer.com/news/security/githubcom-rotates-its-exposed-private-ssh-key/

Conpago - https://www.conpago.com.au/

Source Code as a Vulnerability - A Deep Dive into the Real Security Threats From the Twitch Leak (GitGuardian) - https://blog.gitguardian.com/security-threats-from-the-twitch-leak/

Teenagers Leveraging Insider Threats: Lapsus$ Hacker Group (Forbes) - https://www.forbes.com/sites/emilsayegh/2023/03/15/teenagers-leveraging-insider-threats-lapsus-hacker-group

Lapsus$: Oxford teen accused of being multi-millionaire cyber-criminal (BBC) - https://www.bbc.co.uk/news/technology-60864283

Dynamic Secrets (HashiCorp) - https://developer.hashicorp.com/vault/tutorials/getting-started/getting-started-dynamic-secrets

Crappy code, crappy Copilot. GitHub Copilot is writing vulnerable code and it could be your fault (GitGuardian) - https://blog.gitguardian.com/crappy-code-crappy-copilot/

trufflesecurity/trufflehog (GitHub) - https://github.com/trufflesecurity/trufflehog

gitleaks/gitleaks (GitHub) - https://github.com/gitleaks/gitleaks

Git (Wikipedia) - https://en.wikipedia.org/wiki/Git

awslabs/git-secrets (GitHub) - https://github.com/awslabs/git-secrets

YouTube Video UCQajlJPesqmyWJDN52AZI4Q_BX15Z_xF8mA

Preventing Data Leaks With Git Guardian

Forensic Focus 3rd May 2023 11:07 am

This error message is only visible to WordPress admins

Important: No API Key Entered.

Many features are not available without adding an API Key. Please go to the YouTube Feed settings page to add an API key after following these instructions.

Latest Articles

Share to...