January Newsletter: Talking Point
From this month's newsletter:
Each month we will focus on a particular issue as suggested by a Forensic Focus member and give other members the opportunity to discuss the matter in the forums and/or respond to the questioner directly. This month David Peterson from Forensics Explorers submitted the following:
"I would like to see some research or an article that deals with the massive storage we'll be forced to "dig" through soon as technology increases the size of common hard drives. Hitachi recently released a 1/2 terabyte hard drive for use in an everyday PC. How is the computer forensics field dealing with these problems (i.e. our software taking longer and longer to look through larger and larger storage devices)? How does "dead forensics" matter if the information is so old by the time it is found that it becomes non-pertinent? Where does network forensics fit into law enforcement investigations as pertains to CALEA and the Patriot Act? Thanks for your time."
If you would like to discuss the above in the Forensic Focus forums please do so here!
Well, I'd like to start this discussion by suggesting one established method for reducing the amount of data it is necessary to sort through, and another method which is currently being developed.
1) Currently, a common method of decreasing the amount of data which needs to be searched is the use of exclusionary hash lists. By exclusionary hash lists I mean lists of files which are specific to a certain program or operating system, together with their corresponding hash values. For example, if a computer has been illegally accessed and it is believed a rootkit has been installed on the system, you can take a hash list from a known-good installation of whatever operating system was on the computer and compare those hash values with the hashes of the files on the system. It is then safe to say that every file whose hash value matches has not been altered, which allows the examiner to focus only on those files which appear to have been modified since the original installation.
Another example of using hash lists to speed up the examination of a computer is the FBI's Hashkeeper lists, which record the hash values of known and suspected child pornography. By using these lists, an examiner can quickly find images on the computer which have already been proven to be child pornography, along with the corresponding cases to back up his evidence. This example falls more into the inclusionary use of hash lists than the exclusionary, but it still helps an examiner quickly identify what he is looking for.
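The inclusionary case is the mirror image: instead of discarding matches, the examiner keeps them. A hypothetical sketch, assuming the reference list can be represented as a mapping from hash value to a case reference (the actual Hashkeeper distribution format is not described here):

```python
def match_known_files(path_hashes, known_bad):
    """Inclusionary matching: given {path: hash} for files on the
    evidence drive and a reference list {hash: case_reference}, return
    only the files that match known material, with their references."""
    return {path: known_bad[h]
            for path, h in path_hashes.items()
            if h in known_bad}
```

The same hash computation serves both techniques; only the direction of the set membership test changes, which is why one pass over the drive can feed both an exclusionary and an inclusionary list.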
2) Another method which is currently being explored for use in computer forensics is a heuristic-type approach. By identifying a computer user's behavior patterns (how they use their computer, where they store personal files, when they typically access certain types of documents, and so on), it will soon be possible to use an automated process that first creates a profile of the 'user' and then uses that profile to help identify potentially important areas in an investigation. Since I am not intimately familiar with this process yet, I'll simply leave it at that and allow the rest of you to comment and build your own suppositions.
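Since the poster describes this approach only in outline, the following is a purely speculative sketch of what profile-driven prioritisation might look like: a profile records which directories and file types a user favours, and candidate files are ranked accordingly. Every name and the profile structure here are invented for illustration:

```python
def score_paths(paths, profile):
    """Hypothetical prioritisation: rank paths higher when they fall
    under directories or carry extensions the profile says this user
    favours, so an examiner looks at likely locations first."""
    def score(path):
        s = 0
        # Weight for living under a directory the user frequents.
        for directory, weight in profile.get("directories", {}).items():
            if path.startswith(directory):
                s += weight
        # Weight for file types the user commonly works with.
        ext = path.rsplit(".", 1)[-1] if "." in path else ""
        s += profile.get("extensions", {}).get(ext, 0)
        return s

    return sorted(paths, key=score, reverse=True)
```

A real system would build the profile automatically from timestamps and file-system metadata rather than take it as a hand-written dictionary; this sketch only shows the ranking step.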
Good points, Jeff.
I was trying to recall where I'd read about the heuristic approach before and then remembered it being mentioned in Neil Barrett's "Traces of Guilt" book. Do you have any links to ongoing work at all?
Just to touch on the "massive data" issue the following news item from a couple of days ago seems relevant:
I've just posted a topic which may have some relevance to this discussion (i.e. the searching of large amounts of data). Thought I'd mention it here.