First published November 2004
The following is an extract from Chapter 8 of “Windows Forensics & Incident Recovery” by Harlan Carvey.
Collecting data from a potentially compromised system is relatively simple. There are several methodologies for collecting data that an investigator can adapt to her needs. Some investigators may simply go to the “victim” system and, at the console, run native tools and any installed anti-virus software. Others may download tools from the Internet or a network drive, or bring those tools with them on a diskette. Still others may take a more stringent approach to their investigative techniques in an effort to preserve potential evidence on the system, realizing that their actions will leave footprints of some kind and attempting to minimize the effects of those actions on the “victim” system.
It should go without saying that collecting and analyzing data from a potentially compromised system is paramount. When a system is suspected to have been compromised in some manner, simply reinstalling the system from “clean” media or from a known-good image can be just as bad as ignoring the issue. Without determining the nature of the incident, there is no way for the administrator or investigator to know how to protect the system. Placing that system back into production may lead to it being compromised all over again. In addition, other systems within the infrastructure may also have been compromised, so the investigator needs to understand the complete nature and scope of the incident. This can only be done by collecting and analyzing data from potentially compromised systems. In order to accomplish this, the investigator needs a methodology and toolkit that will allow her to quickly and efficiently gather and correlate data from multiple systems, if necessary, so that she can then make decisions and provide guidance regarding follow-up actions. Shooting from the hip and speculating about what happened can lead to a complete misunderstanding and misrepresentation of the issue, and the actions taken to resolve the issue may end up being completely ineffective.
This chapter will initially cover the collection of data, but we have to realize that collecting the data is the easy part. Correlating the data and understanding what it means requires an additional step. How does the investigator find files or processes that the attacker has taken great effort to hide? What constitutes “suspicious” activity? The primary focus of this chapter will be to address the forensically sound collection of data from a system, but this chapter will also discuss how to understand the data that has been collected.
The actual Perl code for the Forensic Server Project (FSP) server and client components will not be reproduced in this chapter, as was done with the scripts in previous chapters. The code and its function will be described in detail, but the actual code itself is hundreds of lines long. The code for the server component and the two client components described in this chapter is included on the accompanying CD.
The Forensic Server Project
The preferred method of obtaining volatile (and some non-volatile) data from a Windows system in a forensically sound manner is to use netcat or cryptcat (see the “Netcat” sidebar in Chapter 3, Data Hiding). This methodology lets the investigator pipe the output of commands run from a CD through the network connection provided by netcat/cryptcat to a waiting listener on a remote system. However, this process still requires that the investigator record a good deal of documentation by hand, making the process cumbersome and unlikely to be used in all cases. The purpose of the Forensic Server Project (FSP) is to provide a framework for performing forensically sound data collection from potentially compromised systems. The project accomplishes this by collecting data and transporting it to a waiting server via the system’s network interface. This way, files are not written to the potentially compromised system, as doing so would overwrite deleted files and potentially jeopardize a follow-up legal investigation. The general framework for the FSP not only allows for it to be run from removable media, such as a USB-connected thumb drive, but with minor modifications to the code, it can also write data to those thumb drives.
The FSP consists of a server component that resides on a system managed by the investigator and client components that the investigator places on a CD (or thumb drive) for use on “victim” systems. The client components retrieve information from the “victim” systems and send it to the server. In the current version of the FSP, the communications between the client components and the server are not encrypted. However, the open source nature of the FSP makes this capability easy to add, and it will be included in future versions of the FSP. The client components communicate with the server by using verbs, or action identifier keywords. When the client wants the server to take a particular action, it will send a keyword, and the server will perform a set of predefined actions based on that keyword.
The client components communicate with the server using TCP/IP in order to provide a greater level of flexibility in diverse network environments. Other protocols, such as FTP or Microsoft file sharing, can limit the communications between the components by requiring specific ports to be open. In some cases, communications may be required to pass through firewalls that limit a wide range of communications protocols and ports. By allowing any port to be used, the FSP provides a great deal of flexibility for a variety of network environments.
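The verb-based exchange described above can be sketched as follows. This is an illustrative sketch only, written in Python rather than the book's Perl: the verb names, replies, and handler logic here are hypothetical and are not the actual FSP protocol. It does show the two properties the text describes: the client sends an action keyword followed by data, the server dispatches on that keyword, and the listening port is entirely the investigator's choice.

```python
import socket
import threading

# Hypothetical verb handlers; the real FSP defines its own set of keywords.
HANDLERS = {
    "DATA": lambda payload: "OK stored %d bytes" % len(payload),
    "CLOSELOG": lambda payload: "OK case log closed",
}

def serve_one(srv):
    """Accept one connection, read one 'VERB payload' line, and reply."""
    conn, _ = srv.accept()
    line = conn.makefile("r").readline().strip()
    verb, _, payload = line.partition(" ")
    handler = HANDLERS.get(verb, lambda p: "ERR unknown verb")
    conn.sendall((handler(payload) + "\n").encode())
    conn.close()

def exchange(verb, payload):
    """Run one client/server round trip on a loopback socket."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))   # port 0 = any free port; the FSP likewise
    srv.listen(1)                # lets the investigator pick the port
    port = srv.getsockname()[1]
    t = threading.Thread(target=serve_one, args=(srv,))
    t.start()
    c = socket.create_connection(("127.0.0.1", port))
    c.sendall(("%s %s\n" % (verb, payload)).encode())
    reply = c.makefile("r").readline().strip()
    c.close()
    t.join()
    srv.close()
    return reply
```

Because the dispatch is just a lookup on the leading keyword, adding a new server action is a matter of registering another handler, which is what makes the verb approach easy to extend.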
Professor Ronald L. Rivest of MIT developed the MD5 message digest algorithm. This algorithm takes an input message of arbitrary length and produces a 128-bit message digest, or “fingerprint.” According to the executive summary of RFC 1321, which describes the MD5 message digest, it should be computationally infeasible to produce two messages having the same message digest. This characteristic makes the MD5 hash a powerful tool for ensuring the integrity of data. For example, the MD5 hash generated for a file will be different if so much as a single bit within that file is changed. For this reason, applications that monitor file systems for changes use the MD5 algorithm. As the MD5 message digest algorithm provides for one-way encryption (i.e., the resulting digest cannot be decrypted), it provides an excellent facility for protecting information such as passwords and for generating hashes to ensure the integrity of data such as strings and files. The MD5 message digest algorithm is implemented in Perl via the Digest::MD5 module. This module is not part of the standard ActiveState Perl distribution but is easily installed via the Perl Package Manager (PPM).

The Secure Hash Algorithm 1 (SHA-1) was developed by the National Institute of Standards and Technology (NIST) and is described in RFC 3174. Similar to the MD5 algorithm, SHA-1 takes an input message of arbitrary length and computes a 160-bit message digest. Like the MD5 message digest algorithm, SHA-1 provides an excellent mechanism for ensuring the integrity of files. The algorithm is implemented in Perl via the Digest::SHA1 module, which is also installed via PPM.

The server component also has facilities for performing analysis and correlation of the collected data, making it easier for the investigator to review it and make decisions. Some of these facilities, such as scripts for correlating data, come with the server.
However, using a little imagination and Perl programming skill, the investigator can extend these capabilities and even create new ones.
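The hashing just described can be demonstrated in a few lines. The book's own code uses the Perl Digest::MD5 and Digest::SHA1 modules; the sketch below shows the same MD5 and SHA-1 digests via Python's standard hashlib, purely for illustration. Reading the file in chunks keeps memory use constant regardless of file size.

```python
import hashlib

def file_digests(path):
    """Return (MD5, SHA-1) hex digests of a file, read in 8 KB chunks."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()
```

Changing even a single byte of the input file produces completely different digests, which is exactly the property that makes these algorithms useful for verifying that a copied file has not been altered in transit.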
The FSP is intended for use by the investigator. The FSP is an open source project, written in Perl so that it can be easily extended. As the FSP is written in Perl, the server component can be run on either Windows or Linux systems, and client components can be written in Python, Visual Basic, or other scripting or programming languages. The investigator can use specially designed client components to collect specific information from various systems, or a single client component can be designed for use by the first responder to collect a wide range of data. Using Perl, for example, the Forensic Server Project client components consist of the following:
The First Responder Utility (FRU), or fru.pl, which automatically collects a wide range of volatile (and some non-volatile) data from the “victim” system. This component does not have the flexibility inherent in the other components and must be fully configured by the administrator or investigator prior to being written to a CD. The reason for this is to reduce the amount of interaction the first responder has with the system by automating the collection of a wide range of information. All the first responder has to do is insert the CD into the “victim” system, launch a known-good, “clean” command prompt (cmd.exe retrieved from a “clean” system) for the appropriate system, and launch the FRU (i.e., the fru.pl script) via a batch file. Once the GUI dialog appears, the first responder will enter the IP address and port of the Forensic Server and hit the “GO” button. As long as the server component is configured and running and is reachable via TCP/IP communications, the FRU will begin automatically transferring the data it collects to the server. The server will record the activity and store the collected data for later analysis. The FRU can then be moved to another machine and run again without restarting the server.

A component for copying files from the “victim” system. The investigator first uses the GUI to select the files she wants to copy from the “victim” system (i.e., web server log files, suspicious executables, etc.). This component will then automatically collect information from the selected files, such as their MAC times, hashes, and other information, before copying the file to the server. Once on the server, the file’s hash is verified to ensure its integrity during transport. (Note: This component is available with the version of the FSP shipped with this book.)
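The integrity check performed by the file-copying component can be illustrated as follows. This is a hypothetical sketch, not the FSP's actual Perl code: hash the file before transport, copy it, then re-hash the copy and compare, refusing to accept a copy whose digest does not match.

```python
import hashlib
import os
import shutil

def md5sum(path):
    """MD5 hex digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def copy_with_verification(src, dst):
    """Copy src to dst, then verify the copy's MD5 matches the original."""
    before = shutil.copy2 and md5sum(src)   # hash the original first
    shutil.copy2(src, dst)                  # copy2 also preserves timestamps
    after = md5sum(dst)
    if before != after:
        os.remove(dst)                      # discard the corrupt copy
        raise IOError("hash mismatch: copy of %s is corrupt" % src)
    return after
```

In the real FSP, the hash travels with the file to the server, and the comparison happens on the server side; the principle is the same.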
Other components can be easily created using Perl or any other programming language the investigator (or developer) chooses. Possible components include:
A volatile information collector for retrieving information from the memory on the “victim” system, such as clipboard contents, processes, network connections, etc. This is a subset of what the FRU collects from a system, and it can be extended to include items such as the contents of process memory, etc.
A component for running commands external to the programming or scripting language. This is also a subset of the functionality available in the FRU, and it can provide additional flexibility in the toolset for the investigator. The purpose of such a component would be to provide the investigator with the necessary framework for running arbitrary commands instead of using a preconfigured, hard-coded component such as the FRU.
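The core of such an external-command component can be sketched in a few lines. This is an illustrative Python sketch under the assumption that the component simply captures an arbitrary command's output for shipment to the server; in a real response, the commands themselves would be "clean" tools run from the CD, not binaries on the "victim" system.

```python
import subprocess

def run_command(argv):
    """Run an external command; return its exit code and combined output."""
    proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return proc.returncode, out.decode(errors="replace")
```

The captured output would then be sent to the server over the same socket connection the other components use, rather than being written to the local disk.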
The function of the FSP is not only to facilitate the collection of information in a forensically sound manner but also to automate the documentation of that collection and to allow for the analysis of that data. When collecting information from systems using tools such as netcat or cryptcat (netcat with Twofish encryption) by piping the output of the commands through the tool to a waiting server, the investigator needs to document each of the commands used. When copying files, specific information about each file (MAC times, hashes, etc.) needs to be collected and documented. Once the files have been copied, the hashes need to be verified. This can be a time-consuming, laborious process that is also prone to mistakes. The server component of the FSP automates the collection of information as well as the creation of documentation. When the server receives data or a file, it automatically calculates MD5 and SHA-1 hashes for what it receives and logs them to the case log file.
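The automated documentation step can be sketched as follows. This is an illustration only, assuming a simple one-line-per-item log; the field names and layout here are hypothetical and do not reproduce the FSP's actual case log format. The point is that every received item is hashed and recorded with a timestamp, with no manual note-taking required.

```python
import hashlib
import time

def log_receipt(logfile, name, data):
    """Hash received data and append a timestamped entry to the case log."""
    md5 = hashlib.md5(data).hexdigest()
    sha1 = hashlib.sha1(data).hexdigest()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
    entry = "%s  %s  MD5:%s  SHA1:%s\n" % (stamp, name, md5, sha1)
    with open(logfile, "a") as log:
        log.write(entry)
    return entry
```

Because the log is written by the server as the data arrives, the documentation is produced at the same moment as the collection, which removes the error-prone manual step the text describes.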