Unique File Identification in the National Software Reference Library
National Institute of Standards & Technology
100 Bureau Drive, Stop 8970
Gaithersburg, MD 20899
smead@nist.gov
The National Software Reference Library (NSRL) provides a repository of known software, file profiles, and file signatures for use by law enforcement and other organizations involved with computer forensic investigations. The NSRL is comprised of three major elements:
- A physical library of commercial software packages.
- A database of information about each file within each software package.
- A smaller database of the most widely used information that is updated and released quarterly. This database is called the NSRL Reference Data Set (RDS) and is NIST Special Database #28 [18].
This paper examines whether the techniques used to create file signatures in the NSRL produce unique results—a core characteristic that the NSRL depends on for the majority of its uses. The uniqueness of the file identification is analyzed via two methods: an empirical analysis of the file signatures within the NSRL and research into the recent attacks on the hash algorithms used to generate the file signatures within the NSRL.
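The empirical check described above can be sketched as follows. This is only an illustration and not the procedure used by the NSRL: it assumes each file record carries both an MD5 and a SHA-1 value, and it treats two records as distinct files when they agree on one signature but differ on the other. The function name `find_collisions`, the record layout, and the sample values are hypothetical.

```python
from collections import defaultdict


def find_collisions(records, key="md5", check="sha1"):
    """Return signatures under `key` that are shared by records whose
    `check` signatures differ (i.e. candidate collisions for `key`)."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec[key]].add(rec[check])
    # A signature is suspicious only if it maps to more than one distinct file.
    return {sig: sorted(vals) for sig, vals in seen.items() if len(vals) > 1}


# Made-up toy records, not NSRL data. The second record is a duplicate
# of the first (same file shipped in two packages), not a collision.
records = [
    {"md5": "aa11", "sha1": "s-001"},
    {"md5": "aa11", "sha1": "s-001"},
    {"md5": "bb22", "sha1": "s-002"},
    {"md5": "cc33", "sha1": "s-003"},
]

print(find_collisions(records))                          # MD5 checked against SHA-1
print(find_collisions(records, key="sha1", check="md5")) # SHA-1 checked against MD5
```

Running the check in both directions matters because a duplicate file is expected to repeat both signatures, while a collision would repeat only one of them.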
The research addresses the following questions:
- Are the file signatures in the NSRL unique? The NSRL was examined for distinct files that generated the same signature (i.e., a collision).
- How likely is it that collisions will occur in the future? The probability of future collisions depends directly on the randomness of the file signatures. We ran statistical tests to answer the following questions:
  - Do file signatures appear to be random?
  - Do files bias the randomness of the file signatures in any detectable way?
- Do the recent attacks on MD5 and SHA-1 pose any specific threats to the NSRL?
The conclusions of this paper are:
- There are no file signature collisions in the NSRL for either MD5 or SHA-1.
- There was no detectable bias introduced by hashing files, and so the probability of future collisions is negligible.1
- Although there are methods to attack the underlying hash algorithms, they are not relevant to the NSRL.
1 The probability of a collision between hashes in either MD5 or SHA-1 is so small that it is effectively zero. Even if the NSRL doubles in size each year, it would take more than 50 years before there would be enough SHA-1 file signatures to encounter a collision just by chance.
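The estimate in footnote 1 can be reproduced roughly with the standard birthday approximation. The sketch below rests on assumptions that are not stated in the footnote: signatures are treated as uniformly random values, the starting count is the 10,793,831 unique SHA-1 signatures reported for NSRL version 2.11, and "enough signatures" is taken to mean on the order of 2^(bits/2). The function names are illustrative.

```python
import math


def collision_probability(n, bits):
    """Birthday approximation of P(at least one collision) among
    n uniformly random values of `bits` bits: 1 - exp(-n(n-1)/2 / 2^bits)."""
    pairs = n * (n - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2.0 ** bits)


def years_of_doubling(n0, bits):
    """Years of annual doubling needed for n0 signatures to grow
    to 2^(bits/2), the usual birthday bound."""
    return bits / 2 - math.log2(n0)


n0 = 10_793_831  # unique SHA-1 signatures reported for NSRL 2.11

for name, bits in (("MD5", 128), ("SHA-1", 160)):
    p = collision_probability(n0, bits)
    years = years_of_doubling(n0, bits)
    print(f"{name}: p = {p:.3e}, years of doubling to birthday bound = {years:.1f}")
```

Under these assumptions the SHA-1 figure comes out between 56 and 57 years of doubling, which is consistent with the "more than 50 years" stated in the footnote.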
1.0 Introduction
The National Software Reference Library (NSRL) provides a repository of known software, file profiles, and file signatures for use by law enforcement and other organizations involved with computer forensic investigations. The NSRL is comprised of three major elements:
- A physical library of commercial software packages.
- A database of information about each file within each software package.
- A smaller database of the most widely used information that is updated and released quarterly. This database is called the NSRL Reference Data Set (RDS) and is NIST Special Database #28 [18].
The NSRL project was initiated in 1999 at the request of the FBI, the DoD Cyber Crime Center, and the National Institute of Justice. The project is part of the forensic sciences program at NIST’s Office of Law Enforcement Standards.2 The first release of the NSRL RDS3 was in 2001.
As of February 2006, the NSRL contains over 7083 software application packages, which together include over 34 million files. Because many files are shared by multiple applications, the NSRL contains many duplicates; there are currently over 10 million unique files.4
During a forensic investigation, hundreds of thousands of files may be encountered. The NSRL is used to identify the known files among them, which can reduce the time spent examining a computer. Files that match common operating system or application files do not need to be searched, either manually or electronically, for evidence. For example, if a forensic examiner were searching images on a Microsoft Windows™ 2000 system, a comparison against the NSRL would identify over 4000 image files that come with the standard Windows™ installation5. These specific files could be excluded from further examination because their content is known and would not contain evidence.
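A minimal sketch of this kind of known-file filtering is shown below. It is not an NSRL tool: the function names are hypothetical, the set `known_sha1` stands in for SHA-1 values loaded from the RDS, and storing the values as uppercase hexadecimal strings is an assumption about how the reference set is held in memory.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-1 of a file, reading it in chunks to bound memory use."""
    h = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest().upper()  # assumption: reference set uses uppercase hex


def split_known_unknown(paths, known_sha1):
    """Partition paths into files whose SHA-1 is in the reference set
    and files that still need examination."""
    known, unknown = [], []
    for p in paths:
        (known if sha1_of_file(p) in known_sha1 else unknown).append(p)
    return known, unknown


# Demo with two temporary files; the "reference set" is built from the
# first file only, so the second file should remain for examination.
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp, "system.dll")
    a.write_bytes(b"known content")
    b = Path(tmp, "notes.txt")
    b.write_bytes(b"unreviewed content")
    known_sha1 = {sha1_of_file(a)}  # stand-in for hashes loaded from the RDS
    known, unknown = split_known_unknown([a, b], known_sha1)
    print("excluded:", [p.name for p in known])
    print("to examine:", [p.name for p in unknown])
```

Holding the reference values in a set keeps each lookup at roughly constant cost, which matters when hundreds of thousands of files are compared against millions of signatures.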
Additionally, some NSRL matches are used to determine which software applications were used on a system. This may help the investigator decide how and where to search for evidence. For example, if a computer system contains password-cracking applications, keyboard loggers, or rootkit6 packages, this may lead to further investigation to determine whether the system was used to hack into other computer systems. This type of matching could also be used to resolve an intellectual property question if a system contained proprietary software for which the system owner had no license.
2 Additional information on the Office of Law Enforcement Standards (OLES) is at http://www.eeel.nist.gov/oles/index.html.
3 The RDS is available at http://www.nsrl.nist.gov.
4 The most recent version of the NSRL, 2.11, contains 34,994,666 files and 10,793,831 unique SHA-1 file signatures.
5 Certain commercial software, equipment, instruments, or materials are identified in this paper to foster understanding. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.
6 A rootkit commonly refers to software installed after an attacker has gained access to a system to allow continued access, and to actively hide traces of the attacker’s activity.















