Database of Software “Fingerprints” Expands to Include Computer Games

by Richard Press, NIST

One of the largest software libraries in the world just grew larger. The National Software Reference Library (NSRL), which archives copies of the world’s most widely installed software titles, has expanded to include computer game software from three popular PC gaming distribution platforms—Steam, Origin and Blizzard.

The NSRL, which is maintained by computer scientists at the National Institute of Standards and Technology (NIST), allows cybersecurity and forensics experts to keep track of the immense and ever-growing volume of software on the world’s computers, mobile phones and other digital devices. It is the largest publicly known collection of its kind in the world.

To people who work in cybersecurity and digital forensics, the world is a vast and ever-rising ocean of digital objects. NIST’s Reference Data Set—a list of more than 40 million hashes, or digital “fingerprints” of known software files—helps them quickly find what they’re looking for.
Credit: K. Irvine/NIST

The NSRL does not loan out the software in its collection. However, NIST runs every file in the NSRL through an algorithm that generates a digital “fingerprint,” also known as a hash: a fixed-length string of letters and numbers (40 hexadecimal characters in the case of SHA-1) that, for practical purposes, uniquely identifies that file. Every quarter, NIST releases an updated list of hashes to the public. The list, which NIST calls the Reference Data Set, or RDS, can be freely downloaded from the agency’s website. The latest RDS contains more than 40 million hashes, including those for the recently added video game files.
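As a rough illustration of what generating such a fingerprint involves, the sketch below hashes a file with Python's standard hashlib module. SHA-1 is assumed here as the algorithm, and the function name file_sha1 and the chunk size are hypothetical choices for this example, not a description of NIST's own tooling.

```python
import hashlib
from pathlib import Path


def file_sha1(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-1 digest of a file as an uppercase hex string."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        # Read in chunks so large files never have to fit in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()
```

Changing even one byte of the file produces a different digest, which is why the hash can stand in for the file when comparing against a reference list.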

To people who work in the fields of cybersecurity and digital forensics, the world is a vast and ever-rising ocean of digital objects. The RDS allows them to navigate that ocean and quickly find what they’re looking for.


Many crimes today involve some form of digital evidence, and the NSRL helps investigators process that evidence more quickly. If investigators have a seized hard drive or mobile phone, for instance, they can hash all the files on that device, then compare that hash list to NIST’s RDS. Files that match can typically be ignored because they are known software files that wouldn’t contain information relevant to the investigation.

“After they filter out all of the known files, they’re left with everything that’s not recognized,” said Doug White, the NIST computer scientist who runs the NSRL. “Those are the files that might be interesting.”
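The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the known hashes are assumed to have already been loaded from the RDS into a set of SHA-1 hex strings, and the names sha1_of_file and unknown_files are hypothetical, not part of any NIST or forensic tool.

```python
import hashlib
from pathlib import Path
from typing import Iterable, List


def sha1_of_file(path: Path) -> str:
    """Return the SHA-1 digest of a file as an uppercase hex string."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def unknown_files(root: Path, known_hashes: Iterable[str]) -> List[Path]:
    """Return files under root whose hash is absent from known_hashes."""
    # Assumption: known_hashes holds SHA-1 hex strings taken from the RDS.
    # Normalize case so the comparison does not depend on how they were stored.
    known = {h.upper() for h in known_hashes}
    leftovers = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and sha1_of_file(path) not in known:
            leftovers.append(path)  # not recognized, so possibly interesting
    return leftovers
```

Holding the known hashes in a set keeps each lookup at roughly constant time, which matters when the reference list runs to tens of millions of entries.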

Digital forensic investigators at all levels of government and in private industry rely on the RDS to efficiently manage their caseload.

The NSRL contains operating system software, office software, media players, device drivers—all types of software files that are commonly installed on personal computers. In 2016, the NSRL expanded to include hundreds of thousands of mobile apps, which extended its usefulness to mobile phones.

The recent addition of gaming software to the NSRL reflects the growing popularity of that software category. “We’re not watching what gamers are doing,” White said. “But we need to include gaming software in the NSRL if we want to stay relevant.”

Among the video game titles added to the NSRL are “PlayerUnknown’s Battlegrounds,” “World of Warcraft” and “Mass Effect.”

“These games are insanely popular,” said Eric Trapnell, a NIST computer scientist who helped curate the collection and is a gamer in his spare time. “Some of them have install bases in the millions.”

Many of the titles were donated to the NSRL by Valve Software, which owns the Steam platform; Electronic Arts, which owns Origin; and Activision Blizzard, which owns Blizzard. Other titles were purchased if their install base was large enough to justify the expense. All titles in the NSRL are properly licensed and acquired.

While the NSRL exists primarily to support cybersecurity and law enforcement efforts, it is also considered a repository of culturally significant digital artifacts. Just as important books, films and audio recordings are preserved at the Library of Congress, software is preserved at the NSRL, which functions as a national software archive. Historians consider this important because most of modern culture is both produced and consumed using software.

“Think of all the PowerPoints and Word documents that have tremendous historical significance,” said Trevor Owens, head of Digital Content Management at the Library of Congress. He might have added digital artworks, maps and interactive media. “Those documents might be lost, if future historians don’t have access to a comprehensive collection of software.”

An earlier batch of video games was added to the NSRL two years ago, including first editions of “Mario Bros.,” “Asteroids” and “Sim City,” preserving these retro titles and associated artwork for posterity.

While law enforcement professionals and digital culture geeks might seem strange bedfellows, White says he’s not surprised by their shared interest in the software library. “We preserve the software and make the RDS available to the public,” White said. “The more people who find that useful, the better.”

This article was originally published on NIST.gov.
