by Christa Miller, Forensic Focus
Image recognition and categorization have never been more in demand, thanks to the spread of extremist propaganda, child sexual abuse material (CSAM), and other illicit activity across the internet.
Because of the sheer amount of material online, investigators assigned to these kinds of cases need ways to recognize it quickly and also to categorize it — to separate known from fresh material. This is particularly important when it comes to neutralizing active threats and rescuing victims — as well as preserving investigators’ mental health by limiting the amount of material they have to see.
Thanks to developments in machine learning and artificial intelligence, a number of vendor products have been able to incorporate rapid recognition or categorization tools into their software. Here, we take a look at what’s available.
Image Recognition & Categorization Technology
Image recognition is often powered by artificial intelligence, which is trained through machine learning to differentiate objects (and, on a higher end, faces) in pictures. Examples include money, weapons, drugs, militant clothing, nudity, etc.
A step or more beyond hashing, image recognition technology focuses on the object or action rather than on whether it’s been seen before, as hashing does. That makes it possible to discern new victims or new crimes, as described in this article from Wired.
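The limitation of hashing described above can be shown with a minimal sketch. It assumes SHA-1 as the hash and uses made-up byte strings as stand-ins for image files; the function names are illustrative and do not come from any of the tools discussed here. A file matches only if it is byte-for-byte identical to one already in the known set, which is why previously unseen material needs a classifier instead.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of a file's bytes as a hex string."""
    return hashlib.sha1(data).hexdigest()


def is_known(data: bytes, known_hashes: set) -> bool:
    """True only when this exact byte sequence has been hashed before."""
    return sha1_hex(data) in known_hashes


# Hypothetical stand-ins for file contents (not real image data).
original = b"example image bytes"
re_encoded = original + b"\x00"  # a single extra byte, e.g. after re-saving

known_hashes = {sha1_hex(original)}

print(is_known(original, known_hashes))    # True: exact copy of known material
print(is_known(re_encoded, known_hashes))  # False: any change defeats the match
```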
Once pictures and videos are identified, of course, it’s important to be able to categorize them according to the degree of criminal activity they represent — an important step in bringing charges. Most often applied to CSAM, categorization helps investigators to separate the known from the unknown, the relevant from the irrelevant, and even animation from real-life imagery.
Categorization resources like Project VIC (in the United States), the Child Abuse Image Database (CAID) in the United Kingdom, INTERPOL’s International Child Sexual Exploitation (ICSE) database, and C4All give investigators a way to collaborate and share hashed data, breaking down data silos so that everyone can benefit. Many of the tools listed below can export newly hashed images to these databases.
AccessData FTK / Lab
FTK gave users the ability to import and export Project VIC data in 2017, but more recently — as of FTK v7.1 — AccessData brought in machine learning-powered object and facial recognition. By relying on open-source machine learning technology from Google, the new system allows investigators to feed images into the software, training it in real time to make accurate, precise identifications of what the images contain. By training the system to seek specific individuals or objects within images, users can then filter the results to focus on those pictures.
ADF
ADF has offered image classification for CSAM detection since 2006. The original classifier was developed internally using Haar wavelet features and a k-nearest neighbors classification algorithm. Its results were less accurate than today's, but it eliminated a lot of noise very quickly. In 2018, ADF upgraded its image classifier to use TensorFlow, a widely used open-source machine learning library, and trained it to recognize a variety of visual classes relevant to its users (such as bestiality, child abuse, people, pornography, portraits, scanned documents, upskirting, US currency, vehicles, and weapons).
Because image classification is time-consuming and the ADF tools are often used to quickly qualify exhibits on-scene or in the lab, the classification only starts after the data collection is complete. To further save time, all pictures go through an initial filter that eliminates non-photographs (icons, clip art, and other pixel art) so only the remaining pictures are classified.
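For readers unfamiliar with the older approach, here is a toy sketch of Haar wavelet features combined with k-nearest neighbors voting. It is not ADF's implementation: the 4x4 grayscale grids, the "flat" and "striped" labels, and all function names are assumptions chosen only to make the idea concrete.

```python
import math
from collections import Counter


def haar_1d(values):
    """One Haar level: pairwise averages followed by pairwise half-differences."""
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    return [(a + b) / 2 for a, b in pairs] + [(a - b) / 2 for a, b in pairs]


def haar_features(image):
    """Transform every row, then every column, and flatten into one vector."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [value for col in cols for value in col]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training vectors nearest the query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def tile(pattern):
    """Build a 4x4 grayscale image by repeating one 4-pixel row (toy data)."""
    return [list(pattern) for _ in range(4)]


# Hypothetical training set: uniform images versus vertically striped images.
train = [
    (haar_features(tile([60, 60, 60, 60])), "flat"),
    (haar_features(tile([120, 120, 120, 120])), "flat"),
    (haar_features(tile([180, 180, 180, 180])), "flat"),
    (haar_features(tile([0, 255, 0, 255])), "striped"),
    (haar_features(tile([30, 220, 30, 220])), "striped"),
    (haar_features(tile([0, 200, 0, 200])), "striped"),
]

print(knn_predict(train, haar_features(tile([100, 100, 100, 100]))))  # flat
print(knn_predict(train, haar_features(tile([10, 240, 10, 240]))))    # striped
```

The half-difference coefficients capture local contrast, so the striped images sit far from the uniform ones in feature space even when their average brightness is similar.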
The image classifier works on all the collected pictures and videos coming from the file system, archives, databases, thumbnails, documents, emails, messages, downloads, and unallocated space. Users can also select the exact confidence value, lowering it to show more pictures of the same class as time permits.
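The effect of a user-selected confidence value can be sketched as a simple filter. This is a minimal illustration, assuming each classified picture carries a label and a confidence score between 0 and 1; the `Prediction` type, file names, and scores are hypothetical and not taken from the ADF products.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    path: str          # where the picture was found (made-up names below)
    label: str         # visual class assigned by the classifier
    confidence: float  # classifier score, assumed to lie in [0, 1]


def filter_by_confidence(predictions, label, min_confidence):
    """Keep predictions of one class that meet the chosen confidence value."""
    return [
        p for p in predictions
        if p.label == label and p.confidence >= min_confidence
    ]


predictions = [
    Prediction("img_001.jpg", "weapons", 0.97),
    Prediction("img_002.jpg", "weapons", 0.62),
    Prediction("img_003.jpg", "vehicles", 0.91),
    Prediction("img_004.jpg", "weapons", 0.35),
]

# A strict threshold gives a short, high-precision list to review first.
print(len(filter_by_confidence(predictions, "weapons", 0.90)))  # 1
# Lowering it surfaces more candidates of the same class as time permits.
print(len(filter_by_confidence(predictions, "weapons", 0.50)))  # 2
```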
In addition to image classification, the ADF products support image matching with PhotoDNA as well as MD5/SHA1 hash matching by importing VICS-compatible datasets.
Click here to watch a short video that shows ADF image recognition and classification in action.
Autopsy
Autopsy is an open-source digital forensics platform, known for being the graphical user interface to the Sleuth Kit command-line forensics framework. As open-source software, Autopsy is completely free to use and might be a good alternative for smaller labs on a budget that need to investigate child exploitation cases and get fast results.
Primarily developed and supported by Basis Technology, Autopsy is supplemented through community-developed modules. Among them:
- FDRI—Facial Detection and Recognition in Images. Taking Second Place in the 2018 OSDFCon Module Development Contest, FDRI relies on deep learning for its face detection and face recognition capabilities.
- Image Classification for Autopsy, a submission to the 2018 contest, automatically classifies the objects — cars, guns, or anything the user selects — it finds in images. Watch the video here and find the source code here.
- A file-level ingest module, FaceRadar, which detects image files and then scans each one for faces.
- An “OpenCV” object detection module, located in Autopsy’s “Experimental” part because it doesn’t come with any trained models. A classification module is planned for a future release.
Autopsy integrates Project VIC and C4All databases via modules in its Law Enforcement Bundle.
Belkasoft Evidence Center
Belkasoft Evidence Center supports several kinds of image recognition: faces, text (optical character recognition), and pornography within pictures and video key frames, as well as forged image detection. As of v9.5, BEC also identifies pictures modified with hand-drawn arrows or other marks that denote potential drug dead-drops or other distribution points.
BEC relies on artificial neural networks for its image recognition. It draws on detection technology for skin, eyes, nose, and mouth features to help accurately identify faces and explicit content; and on de-skew, resolution increase, and other technologies to help identify text.
In addition, a separate Forgery Detection Plugin detects altered or modified JPEG pictures, including those saved at a different compression level, cropped, or with altered content such as exposure tuning.
BlackBag BlackLight
BlackBag partnered with Image Analyzer early in 2019 to bring AI-driven image recognition combined with triage and prioritization techniques. As of its 2019 R1 version, BlackLight searches for pornography, weapons, drugs, extremism, gore, alcohol, and even swimwear and underwear, identifying new material not previously hashed.
By focusing on the detection of visual threats, as opposed to everyday objects, Image Analyzer shortens investigative time. Because it’s built into BlackLight, the solution can be run on pictures and videos even with no Internet connection — and at no additional charge.
The integration allows investigators to examine the riskiest content categories first. This makes it possible for users to start reviewing the images or videos with the likeliest relevance to their case based on the algorithm’s confidence in whether a given content type is present. The available categories will continue to grow as BlackBag cooperates with Image Analyzer to provide user requested categories.
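One way to picture this kind of prioritization is a review queue sorted first by category risk and then by classifier confidence. The sketch below is an assumption for illustration only: the risk ordering, file names, and scores are invented, and it does not reflect how BlackLight or Image Analyzer rank content internally.

```python
# Hypothetical risk ordering, riskiest first (an assumption, not BlackLight's).
RISK_ORDER = ["extremism", "weapons", "drugs", "alcohol"]


def review_queue(hits):
    """Sort (path, category, confidence) hits: riskiest category first,
    then highest confidence first within each category."""
    rank = {category: i for i, category in enumerate(RISK_ORDER)}
    return sorted(hits, key=lambda h: (rank.get(h[1], len(rank)), -h[2]))


hits = [
    ("a.jpg", "alcohol", 0.99),
    ("b.jpg", "weapons", 0.70),
    ("c.jpg", "extremism", 0.55),
    ("d.jpg", "weapons", 0.95),
]

for path, category, confidence in review_queue(hits):
    print(path, category, confidence)
# c.jpg extremism 0.55
# d.jpg weapons 0.95
# b.jpg weapons 0.7
# a.jpg alcohol 0.99
```

Categories missing from the ordering fall to the end of the queue rather than being dropped, so nothing is silently excluded from review.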
For categorization, BlackLight allows the export of relevant pictures to Project VIC, BlueBear LACE, and C4All to categorize the images. However, it also partners with Semantics 21 to analyze and categorize media, making for a more streamlined approach to categorization.
Cellebrite UFED Analytics
Separate from its extraction or forensic analysis tools, Cellebrite’s UFED Analytics includes algorithms that identify weapons, drugs, CSAM, adult content, documents, and screenshots. In addition, facial recognition allows investigators to cross-reference and match individual faces across collected pictures and videos.
Once pictures are identified, UFED Analytics automatically categorizes images and individual video frames. Not only does this eliminate manual review of duplicative evidence; it also identifies and correlates unknown or unique images.
The platform’s integration with Project VIC, CAID and other defined hash value databases then allows investigators to match collected evidence against hash values of existing known material, as well as to categorize and export newly discovered material for sharing with other investigators.
Cyan Forensics
Cyan Forensics’ triage tools rely on Contraband Filters to allow investigators to scan pictures and videos on site. These filters replace MD5 hashing, allowing for speedier identification of contraband. Built using original material of extremist and CSAM content, the software allows investigators to identify whether contraband exists without exposing additional material, which is key when triaging on scene.
Griffeye Analyze DI
One of the most widely known and used image categorization tools, Analyze DI makes use of a new generation of algorithms for its Face Recognition technology. Face Recognition identifies and matches suspects and victims in imported images and videos — even in complex lighting conditions, blurry and noisy streams, or when faces are positioned at an angle.
This enables investigators to break out all unique individuals so they can narrow down, structure and prioritize relevant material, reducing both the time they have to take and their own exposure to the material. In addition, Face Recognition enables users to quickly find images in a case that contain similar faces to a suspect or victim. It enhances Analyze DI’s Analyze Relations link analysis tool by linking images of the same person together to show potential new links.
In addition, Griffeye Brain, a CSAM and object classifier trained on real case data, scans large numbers of previously unseen pictures and footage and suggests images that are likely to depict CSAM.
To categorize the images it finds, Analyze DI incorporates technologies and methodologies produced through Project VIC. Its image and video hashing pre-categorizes known data and stacks duplicates. In addition, Griffeye partners with many leading digital forensic software solutions including tools from ADF Solutions, Amped, BlueBear, Magnet Forensics, and Nuix.
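Pre-categorizing known data and stacking duplicates can both be expressed as operations on file hashes. The following is a minimal sketch under stated assumptions: it uses SHA-1, byte strings stand in for media files, and the category label and the `known_categories` lookup are hypothetical rather than real Project VIC data or Griffeye's own method.

```python
import hashlib
from collections import defaultdict


def stack_duplicates(files):
    """Group file names by the SHA-1 of their content, so exact copies
    collapse into one stack that needs reviewing only once."""
    stacks = defaultdict(list)
    for name, data in files.items():
        stacks[hashlib.sha1(data).hexdigest()].append(name)
    return dict(stacks)


def precategorize(stacks, known_categories):
    """Assign each stack the category already recorded for its hash, if any."""
    return {
        digest: known_categories.get(digest, "uncategorized")
        for digest in stacks
    }


# Made-up file contents standing in for imported media.
files = {
    "phone/001.jpg": b"picture one",
    "laptop/copy_of_001.jpg": b"picture one",
    "phone/002.jpg": b"picture two",
}

# Hypothetical hash set shared by other investigators.
known_categories = {hashlib.sha1(b"picture one").hexdigest(): "category A"}

stacks = stack_duplicates(files)
categories = precategorize(stacks, known_categories)

print(len(stacks))                          # 2 stacks from 3 files
print(sorted(categories.values()))          # ['category A', 'uncategorized']
```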
Griffeye DI Core is free for everyone to use.
Magnet Forensics AXIOM
Over the past two years, Magnet Forensics has been busy adding image recognition capabilities to its Magnet.AI feature within AXIOM. Magnet.AI scans any artifact, including chat and email attachments, web cache data, and video thumbnails, that contains image data.
Potential results include depictions of child sexual abuse, nudity, weapons, and drugs; potential screenshots; money, documents, and personal identification such as driver’s licenses or passports; and vehicles, buildings (exteriors) and drones.
AXIOM integrates with Griffeye Analyze DI and Semantics 21 products, but it also features newly enhanced compatibility with Project VIC and CAID hash sets for redesigned media categorization in its own platform. Like other solutions, it makes it possible for investigators to share new data with the investigative community.
MSAB XRY / XAMN
The extraction tool XRY from MSAB automatically recognizes the contents of images during the decoding process and sorts them into categories such as drugs, weapons and people. Images identified as CSAM during this process are hidden from examiners’ eyes, protecting them from trauma.
Examiners can then immediately zero in on categories of interest when later examining the data in XAMN – the analysis software suite from MSAB. A Location Range Filter gives investigators the ability to focus searches by geographic location.
XAMN is interoperable with the Project VIC database. If new pornographic images are found, examiners can tag them and export the hash values to the Project VIC database.
Nuix
Nuix incorporates Google’s open-source machine-learning capabilities, enabling investigators to rapidly identify images using predefined models to filter images of drugs, guns, money, weapons, cars, adult content and child abuse content.
This streamlines investigations by allowing investigators to quickly differentiate between previously known and verified images, and new images that may provide new clues about unknown victims and violators. Additionally, the system can be trained in real time to accurately and precisely identify what the images contain based on their specific investigative requirements. Nuix also incorporates skin tone analysis and facial detection technologies to further empower investigators.
Investigators can easily use these capabilities to enhance existing workflows and automate the processing and analysis of huge volumes of data. Nuix also facilitates the exchange of intelligence information and allows for the sharing of files, tags, and other metadata from any application or tool by eliminating the need to reprocess data.
Using Nuix, investigators can export unclassified files, using OData, into tools such as Griffeye Analyze DI, then analyze, categorize, and tag these unknown files. Nuix also integrates with Project VIC and CAID.
Oxygen Forensic Detective
In June 2019, Oxygen Forensics announced that integrated technology from Rank One Computing, a leading provider of facial recognition and biometrics technology, would allow for facial recognition within its tool at no additional cost to customers. The capability will allow Oxygen Forensic Detective users to capture and analyze aggregated image and video data extracted from more than 27,000 unique devices.
Oxygen Forensic Detective’s seamless integration with Project VIC allows users to identify CSAM using hash sets. The results are shown both in the software’s Project VIC section and on a separate tab in File Browser, categorized according to Project VIC classification.
Paraben Corporation E3 Platform Boost
Paraben also partners with Image Analyzer to offer the enhanced Image Analysis Boost to its Electronic Evidence Examiner (E3) Platform. Able to obtain data from mobile, computer, cloud, Internet of Things, and email sources, E3 relies on the Boost to detect images containing pornography, swimwear and underwear, extremism, drugs, gore, and weapons. Variable threshold options allow the Image Analysis Boost to be configured for use by any organization, from law enforcement to corporate environments. (Note: an active E3 Platform 2.0 license is required to be able to deploy the Image Analysis Add-On.)
Semantics 21 Laser-i Series
Semantics 21’s suite of products categorizes images from photos and video, and can triage suspect media. Facial identification, victim identification, age recognition, object detection, and nudity detection are all part of the feature set, filtering through media to understand content, its relevance, and whether further review is needed. A real-time intelligence database has full compliance and support for Project VIC (and its UK CAID variant).
T3K LEAP
T3K’s Law Enforcement Analytical Product (LEAP) is designed for front-line personnel to triage content on smartphones within minutes and without the need for specialist knowledge or training. Its picture recognition focuses on terrorism, human trafficking, smuggling and the trade of illegal goods, and CSAM using advanced object recognition.
As artificial intelligence advances and becomes easier to deploy, expect more vendors to add solutions — via development or partnership — that make your job easier to do. Watch Forensic Focus for the latest news!