As digital forensics moves toward a more established presence within the forensic sciences, 2022 starts off on an existential note thanks to a set of papers exploring digital forensic expertise and process, and the language used to describe both.
Technical papers published during the month of January include discussions on the technological behaviors of previously convicted child sexual exploitation material (CSEM) offenders, the forensic discoverability of iOS vault applications, ransomware detection and forensic analysis research, and an attention-based explainable deepfake detection algorithm.
Digital forensic expertise and process – and quality
At WIREs Forensic Science, Teesside University’s Graeme Horsman and independent expert consultant Brett Shavers asked: “Who is the digital forensic expert and what is their expertise?” The paper focuses on expert witnesses called to conduct specific investigative work and provide opinion evidence at trial, noting, “In operational reality, the DFS field has long since outgrown the period in which one single individual may viably consider themselves to have extensive expertise of all its facets.”
Horsman and Shavers cited two critical factors in this:
- The rapid pace of change in technology that can make it more difficult for a digital forensics practitioner to attain and maintain “expert” status.
- The likelihood that those seeking an expert witness may not have the knowledge (or the time capacity) needed to make an informed decision.
Their recommendation: practitioners should self-define particular subsets of expertise using a simple structure: “I maintain expertise in the areas of <DEFINE>, acquired via <DEFINE> and evidenceable through <DEFINE>.”
The structure takes into account expertise acquired both formally and informally, through case experience and research as well as professional development, and the evidence needed to support that expertise.
These definitions mitigate not only the “shelf life” of expertise, but also the risk of an expert assuming expertise in one area transfers to another. Ultimately, the authors argued, “an expert’s true expertise comes from their ability to test, interpret, quantify, and communicate any findings with rigor and accuracy.”
This includes bias mitigation, the subject of a paper written by the Norwegian Police University College’s Nina Sunde. “Strategies for safeguarding examiner objectivity and evidence reliability during digital forensic investigations” focuses on one example of digital forensic examiners’ practical, day-to-day work: formulating hypotheses as to how the evidence got onto the device.
Based on a survey of 53 practitioners who had participated in an experiment, Sunde’s work confirms “that hypotheses played an important role in how the DF practitioners handled contextual information before the analysis and safeguarding objectivity during the analysis.”
However, the survey results highlighted three ways in which the practitioners’ hypotheses fell short:
- 45% of respondents analyzed evidence without an innocence hypothesis in mind.
- 34% applied no techniques to maintain their objectivity during the analysis.
- Although the majority used dual tool verification to ensure their evidence was reliable, 38% did not use any techniques at all to examine or control evidence reliability.
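The dual tool verification most respondents reported can be sketched as a hash comparison of the artifacts each tool extracts from the same source image. The function names and structure below are illustrative, not drawn from Sunde’s paper:

```python
import hashlib

def sha256_file(path):
    """Hash a file so extracted artifacts can be compared byte for byte."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def cross_verify(hashes_a, hashes_b):
    """Compare artifact hashes produced by two independent tools.

    Each argument maps an artifact name to its SHA-256 hex digest.
    Returns the names of artifacts whose contents differ, or that only
    one tool recovered; any discrepancy warrants manual review before
    the evidence is relied upon.
    """
    return sorted(
        name for name in set(hashes_a) | set(hashes_b)
        if hashes_a.get(name) != hashes_b.get(name)
    )
```

An empty result means the two tools agree on every artifact; anything else points the examiner at exactly the items needing a closer look.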
“These insights,” wrote Sunde, “are essential for the DF community to guide the development of procedures that safeguard fair investigations and effective error mitigation strategies.”
Hypothesis formation is one part of “those core thought processes, decisions and behaviours that form part of effective investigative practices” that are frequently omitted from current foundational digital forensic investigation process models, which focus on physical examination tasks.
That’s the conclusion of a different piece, “Unboxing the digital forensic investigation process,” coauthored by Sunde with Horsman at Science & Justice. Their suggested solution: the Digital Forensic Workflow Model (DFWM), “a contribution to the digital forensic management toolbox [that] enables the identification and management of risk and supports error mitigation at each stage of the workflow.”
The paper’s abstract stresses the need for continued research into the various physical and cognitive tasks of digital forensics, taking case investigative strategies into account.
Part of this is shoring up parts of the process that happen outside the forensic lab. Horsman’s paper “Conducting a ‘manual examination’ of a device as part of a digital investigation,” published at Forensic Science International: Digital Investigation, describes the on-scene need for first responders or investigators – sometimes – to examine live devices for potential evidence.
To support these personnel, first in determining whether a device contains relevant data, and second in preparing to conduct an interrogation or other action based on that data, Horsman proposed a manual device examination framework or procedure they could follow.
Underpinning these processes are standards – whether formally adopted and used to accredit labs and their operations, or informally followed. The language those standards use can help or hinder, wrote Angus Marshall at Forensic Science International: Digital Investigation, in “The unwanted effects of imprecise language in forensic science standards.”
By comparing the language used by two sources – ISO/IEC 17025 (2017) and the 2020 guidance on digital forensic methodology from the United Kingdom’s Forensic Science Regulator (FSR) – with language from ISO/IEC 27037, ISO/IEC 27041, and ISO/IEC 27042, Marshall argued that the FSR overemphasizes the criminal justice system as customer for digital forensics services.
Doing so, he concluded, could risk the following impacts:
- End user requirements that are so over-emphasized as to detract from requirements for “purely internal use of processes.”
- Overly complex, difficult-to-validate “monolithic” processes and methods that give only an appearance of being “fit for purpose.”
- Confusion around method validation and re-validation concepts.
- Pressure to “check the boxes” around becoming accredited.
- Reports that focus on mitigating risk to the customer.
- Potential for accreditation of incorrect processes leading to “cognitive and confirmation biased examinations,” or in other words, “for the accredited processes to fall below what might be considered the best standard possible.”
To mitigate these risks, Marshall argued in favor of taking the time to reevaluate and redesign standard operating procedures, and in particular to review “customer” and “end user” terminology in context.
Drilling into technical whys and hows toward better results
The four technical papers from January diverge in content and focus, but each seeks to improve results in ways social, technical, or, arguably, both.
Investigating, treating, and deterring child sexual exploitation material (CSEM) offenders starts with understanding how and why they choose the technologies they choose. At the Journal of Digital Forensics, Security, and Law (JDFSL), authors Chad Steel of George Mason University, and University of Edinburgh’s Emily Newman, Suzanne O’Rourke, and Ethel Quayle, revealed the “Technical Behaviours of Child Sexual Exploitation Material Offenders” based on a survey of 78 United States-based convicted offenders:
- Both utility and perceived risk factored into the offenders’ technology choices.
- Most respondents used more than one technology to view CSEM.
- The most common “gateway” technologies were peer-to-peer and web browser software.
- Another “gateway”: adult pornography, which almost all CSEM offenders reported viewing first.
- A “substantial minority” of CSEM offenders viewed CSEM without storing it.
- Offenders used countermeasures mainly to reduce anxiety rather than to avoid detection; so while they used more countermeasures than the nonoffending public, they were no more likely to use encryption.
Many CSEM offenders do continue to store illicit material, though, as do offenders in other crime categories. Also at JDFSL last month, “Forensic Discoverability of iOS Vault Applications,” authored by Purdue University’s Alissa Gilbert and Kathryn C. Seigfried-Spellar, used three forensic toolkits to analyze artifacts left behind by five different iOS photo vaults.
Their results showed that although each toolkit returned different findings, all recovered some artifacts. “The media left behind was due to the photo vaults not protecting their information as claimed,” the authors concluded, by “using basic obfuscation techniques in place of security controls.” Gilbert and Seigfried-Spellar intend to continue their research by testing their methods against newer security controls.
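The weakness the authors describe can be illustrated with a toy scheme, invented here for illustration and not taken from any of the tested vault apps: base64 plus a fixed XOR hides media from casual browsing, but because it carries no secret key, any examiner who learns the scheme can reverse it.

```python
import base64

XOR_BYTE = 0x5A  # hard-coded "secret": obfuscation, not encryption

def obfuscate(data: bytes) -> bytes:
    """Toy 'vault' transform: XOR each byte, then base64-encode.
    There is no key material, so this is reversible by anyone."""
    return base64.b64encode(bytes(b ^ XOR_BYTE for b in data))

def deobfuscate(blob: bytes) -> bytes:
    """Recover the original media without any key at all."""
    return bytes(b ^ XOR_BYTE for b in base64.b64decode(blob))
```

A genuine security control would instead derive a key from a user secret (e.g., a passphrase) so that recovery without that secret is computationally infeasible; that distinction is what the tested apps failed to provide.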
Researchers from the University of Texas at San Antonio explored “Deepfake forensics analysis: An explainable hierarchical ensemble of weakly supervised models” at Forensic Science International: Synergy. Samuel Henrique Silva, Mazal Bethany, Alexis Megan Votto, Ian Henry Scarff, Nicole Beebe, and Peyman Najafirad argued that the convolutional neural networks used for the binary classification task in deep learning – an overall successful approach – may not be appropriate “in a scenario in which the generation algorithms constantly evolve.”
Their solution: “a hierarchical explainable forensics algorithm that incorporates humans in the detection loop.” The algorithm is attention-based, evaluating multiple regions of a face rather than relying on a single focal point for its decision, and it ensembles more than one model to improve generalization. Tested against the Deepfake Detection Challenge Dataset, the model achieved 92.4% accuracy, which held on datasets not used in the training process.
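The broad shape of an ensemble with a human-review fallback can be expressed as a minimal sketch; this is the general idea only, not the authors’ actual model, and the thresholds are illustrative:

```python
def ensemble_verdict(scores, fake_threshold=0.5, margin=0.1):
    """Combine per-model deepfake scores (0 = real, 1 = fake) by averaging.

    Borderline cases, where the mean score falls within `margin` of the
    decision threshold, are routed to a human reviewer, echoing the
    human-in-the-loop design described in the paper.
    """
    mean = sum(scores) / len(scores)
    if abs(mean - fake_threshold) < margin:
        return "human_review", mean
    return ("fake" if mean > fake_threshold else "real"), mean
```

Averaging several models’ scores is one simple way to hedge against a single detector failing on a generation algorithm it never saw in training.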
Finally, ransomware detection and forensic analysis research suffers from a deficiency in reproducibility, wrote authors from Edinburgh Napier University. At Forensic Science International: Digital Investigation, Simon R. Davies, Richard Macfarlane, and William J. Buchanan offered “NapierOne: A modern mixed file data set alternative to Govdocs1.”
Designed to address the dearth of “detail on how the test data set was created, or sufficient description of its actual content, to allow it to be recreated by other researchers interested in reconstructing their environment and validating the research results,” NapierOne comprises nearly 500,000 unique files organized into 100 separate data subsets spanning 44 distinct file types. It is intended to complement the original Govdocs1 dataset and is available to researchers free of charge.
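The reproducibility gap such a data set closes can be sketched as a manifest: recording each file’s path, size, and hash lets other researchers confirm they hold an identical copy. This is a generic illustration, not NapierOne’s actual distribution format:

```python
import hashlib
import os
from collections import Counter

def build_manifest(root):
    """Walk a data set directory and record, per file, its relative path,
    size, and SHA-256 digest, plus a count of file extensions. Publishing
    such a manifest allows a downloaded copy to be verified exactly."""
    files, ext_counts = {}, Counter()
    for dirpath, _, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            rel = os.path.relpath(path, root)
            files[rel] = {"size": os.path.getsize(path), "sha256": digest}
            ext_counts[os.path.splitext(name)[1].lower()] += 1
    return {"files": files, "extensions": dict(ext_counts)}
```

Re-running the manifest build on a copy and comparing the two dictionaries verifies, file by file, that an experiment’s inputs match the published data set.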