The Cost of Storing Digital Images
Several weeks ago I had the honor of being mentored one-on-one in computer forensics for a full week by Greg Marshall, EnCE. Greg is not only a world-class computer forensics practitioner; he is also an investigator's investigator. After doing nothing but eat, drink and sleep investigation for the past 41 years, I believe I'm qualified to judge investigators, and for my money, Greg is a Top Gun.
Although I'm still reeling from all of the sophisticated CF techniques Greg taught me (or, more accurately, tried to teach me!), my most immediate problem lies in the mundane administrative area of storing computer images and reducing cost.
Although I've dealt with all sorts of evidence over the years and have never had a loss or destruction of evidence, Greg has convinced me that when it comes to HD images, we are dealing with something so fragile that we are professionally obligated to go above and beyond in ensuring that computer forensic images are protected for months or even years.
We all know that the textbook solution to storing evidence of this type is redundancy and remote distribution. Those in law enforcement often seize the original HD and get to hold it as evidence for as long as they might need it. That gives them the opportunity to go back to the original HD should the forensic image be lost, damaged or called into question. In civil practice, it is less often that we can hold an HD as evidence. We often have to use a combination of the implied specter of a court order and gentle persuasion to get the HD image we need. For example, tomorrow I will go to a lawyer's office and image the HD of a laptop belonging to his client, a former girlfriend of a multi-million dollar embezzler. I will get one shot at the computer and walk away with an external HD containing an image.
Greg will tell you that before he does anything else with that image, he would burn a verified copy to a set of DVDs. In this case, I don't know how big the drive will be, but it is unlikely to be more than 40 GB. For the sake of this discussion, I'll assume that is the case. OK, I have a first-generation image on my external HD and a verified copy of that on DVDs. I'm feeling pretty safe. Let's see what that storage will cost me.
I figure that it will take me about 1.5 hours to burn the 8-10 DVDs needed for this 40 GB backup. Our normal billing rate is $75/hour. That means it will cost me about $112 in unbillable time just to make a redundant set of discs. My storage costs are:
Hard Disk space = $40
DVD Media = $4
Labor = $112 to burn DVDs
Total holding costs = $156.00
I don't like it, but I can live with it. I can even pass that $156 cost on to my client.
The problem is that aside from laptops and the like, 40 GB is no longer representative. I recently sent a 60 GB HD back to Maxtor under warranty. They replaced it with an 80 GB drive because they no longer had a 60 GB in stock. More to the point, standard stock drives are getting larger and larger. We recently took into evidence a 200 GB and a 250 GB drive, and I have another 250 GB drive that I will take into evidence next week. Using the same approach as before, let's look at the storage costs for a 250 GB drive.
Hard Disk space = $250
DVD Media = $25
Labor = $715 to burn DVDs
Total holding costs = $990.00
As you can see, our costs come to almost $4/GB before we even start an analysis. I am interested in ideas or alternate approaches that would allow us to do the job right and at the same time, cut costs.
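For anyone who wants to rerun the numbers with their own rates, here is a minimal sketch of the arithmetic above. The function names are made up for illustration; the dollar figures are the ones quoted in the post.

```python
def holding_cost(disk, media, labor):
    """Total holding cost in dollars: hard disk space + DVD media + burn labor."""
    return disk + media + labor


def cost_per_gb(total, size_gb):
    """Holding cost divided by the size of the imaged drive."""
    return total / size_gb


# Figures quoted in the post for a 40 GB and a 250 GB drive.
small_total = holding_cost(disk=40, media=4, labor=112)
large_total = holding_cost(disk=250, media=25, labor=715)

print(small_total, round(cost_per_gb(small_total, 40), 2))    # 156 3.9
print(large_total, round(cost_per_gb(large_total, 250), 2))   # 990 3.96
```

Both cases land just under $4/GB, which suggests the cost scales roughly linearly with drive size under this approach.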
Actually, I'm doing a first-generation image on an HP NAS with redundant disks and interfaces, then I do a DLT backup of the image on a fresh DLT …
The source disk or disks are hooked to a Linux live CD and transferred over a Cat 5e crossover cable on gigabit Ethernet …
Similar… We image to a network RAID, then archive to tape (AIT). Although tape sounds old-fashioned, it's a tried and tested method of backing up your data. Tape might be the way to go, Al, if you are considering keeping those backups for some time. It's a little more costly for the initial outlay than, for example, DVD (but check eBay, you might pick up a bargain), but at least you can fit hundreds of gigs on one tape in one go, rather than spending many hours burning to disk, with the risk of a failed disk every so often.
Andy, hard to beat the cost and linear speed of a DLT!
Thank you, Al, for the kind words. I think that this board is a good resource due to the richness of experience of its contributors. Alan contributes greatly to that richness and it has been a pleasure to get to know him.
I agree with Andy that tape backups may be a good option for you, Al. I have always been concerned, however, with the long-term viability of the tapes. I have to admit that I don't use tape backup, nor have I ever. I am wary, however, from the experiences of others, that a tape drive's heads may shift over time. While backups saved one day and restored a week later would be fine, it may be a different story when years have passed, and maybe you are using a different drive than the one originally used. Perhaps technology has improved and this issue is no longer a concern. If it isn't, tape seems a better option than DVDs. DVDs are far from trouble free, but by using good media and verifying the images I feel pretty good about using them. I don't bill all that time, only a fraction of it. Mainly it's just swapping disks, and it can be done on a dedicated machine while working on other things. You need a fast burner, but processor speed and memory are not that important. I usually burn a disk, put it in my analysis machine for verification and start another one in the burner. I rarely get a bad disk unless I'm tasking the computer with other things as it burns.
Another option, which is currently pretty costly, is a robot system such as those sold by forensic-computers.com. The $5000 unit holds 25 disks, burns, prints labels, and verifies the data.
I have been looking at external storage options myself. I don't need network storage as I'm the only one accessing the images, but am favoring some type of FireWire RAID. Not the LaCie units that I know you've had trouble with, but perhaps a unit that could be configured as RAID 5, such as those from WiebeTech. If I had a RAID 5 for image storage I wouldn't feel the need to archive right away (although it's probably still a good idea). These units are also somewhat portable, which would allow them to be used in the field for acquiring a large RAID should the need arise.
I don't want to veer too far off topic here, but in terms of backing up an image to, let's say, multiple CDs/DVDs, what is the recommended process for verifying the integrity of the finished product?
Assuming I have a 40 GB image (and an associated MD5 value) that has been backed up to 8 or so DVDs… do you have to rebuild the image to verify that the hash matches the original, or are checks done during the copying sufficient?
Verification can be accomplished in a couple of different ways. Nero, and probably some other burning applications, have a verify process built in that can be set to run after each burn. This makes the burn process longer, but requires no action by you. Swap disks every 20 minutes or so as long as everything is going well. I don't use it just because I haven't tested its reliability very thoroughly.
If you acquire images to the .e01 format then you know that these evidence files have within them CRC values for each block of data (default block size is 32k in EnCase) as well as an MD5 hash value for the entire evidence file. EnCase has a verification tool built in that recomputes each of these CRC values and compares it with the original. It also recomputes the MD5 hash of the evidence file as a whole. If any are different, the sector blocks are flagged by EnCase. If I put 3 image files on a DVD I can point EnCase at all 3 at once and let it run. It takes about 11 minutes to complete and doesn't require me to verify each one separately.
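To make the block-level idea concrete, here is a rough sketch of per-block checksums in Python. It is only an illustration of the principle: it does not read or reproduce the real .e01 layout, `zlib.crc32` is a stand-in for whatever checksum the format actually stores, and the function names are made up.

```python
import zlib

BLOCK_SIZE = 32 * 1024  # 32k blocks, matching the default mentioned above


def block_crcs(data, block_size=BLOCK_SIZE):
    """Compute one CRC-32 value per fixed-size block of data."""
    return [
        zlib.crc32(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]


def find_bad_blocks(data, stored_crcs, block_size=BLOCK_SIZE):
    """Recompute the block CRCs and return the indexes that no longer match."""
    current = block_crcs(data, block_size)
    return [i for i, (now, then) in enumerate(zip(current, stored_crcs)) if now != then]


# Demo: four blocks of sample data, then one byte damaged inside block 2.
original = bytes(range(256)) * 512          # 131072 bytes = 4 blocks of 32k
stored = block_crcs(original)               # values recorded at acquisition time

damaged = bytearray(original)
damaged[2 * BLOCK_SIZE + 10] ^= 0xFF        # flip every bit of a single byte

print(find_bad_blocks(original, stored))        # []
print(find_bad_blocks(bytes(damaged), stored))  # [2]
```

The benefit of per-block values over a single whole-file hash is that a mismatch tells you where the damage is, not just that something changed.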
If you are using some other application without this type of function, you could compute a hash value of each evidence file and recompute it after burning.
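A minimal sketch of that hash-before, hash-after check, using only the Python standard library. The file names and the helper names `md5_of_file` and `mismatched` are hypothetical; in practice the second directory would be the mounted DVD rather than a temp folder.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Stream a file through MD5 so large evidence files never sit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched(recorded, burned_dir):
    """Return names whose burned copy is missing or hashes differently.

    recorded maps file name -> MD5 hex digest taken before burning.
    """
    bad = []
    for name, expected in recorded.items():
        path = os.path.join(burned_dir, name)
        if not os.path.isfile(path) or md5_of_file(path) != expected:
            bad.append(name)
    return sorted(bad)


# Demo with two small stand-in "evidence files"; the second copy is corrupted.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    for name, payload in [("image.001", b"A" * 5000), ("image.002", b"B" * 5000)]:
        with open(os.path.join(src, name), "wb") as handle:
            handle.write(payload)
    recorded = {name: md5_of_file(os.path.join(src, name)) for name in os.listdir(src)}

    with open(os.path.join(dst, "image.001"), "wb") as handle:
        handle.write(b"A" * 5000)               # faithful copy
    with open(os.path.join(dst, "image.002"), "wb") as handle:
        handle.write(b"B" * 4999 + b"X")        # last byte differs
    demo_result = mismatched(recorded, dst)

print(demo_result)  # ['image.002']
```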
Either way you go verification is a necessity. Errors are too common when burning to optical media.
Thanks for that Greg,
Like a lot of people who are starting out, I am not quite ready for the $$$ of EnCase and have been doing much of my work in WinHex and other tools.
This does, however, seem like another good feature, and as I move along I will surely delve into Guidance's software and add it to my toolkit… I have the demo CD, which many have stated gives you an idea of the software but can be a bit frustrating…
Thanks again for the info, it cleared up some of my fogginess on the subject.
femur, we use SAIT (Super AIT), which stores 500 GB (that's manufacturer's GB) and transfers at 30 MB/s (uncompressed). It has a larger capacity than DLT.
To my knowledge, SDLT 600 stores 300 GB and transfers at 36 MB/s. Slightly faster, but it hasn't got the storage.
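As a rough back-of-the-envelope check on those figures, the sketch below converts capacity and transfer rate into hours. It assumes decimal "manufacturer's" units (1 GB = 1000 MB) and a sustained uncompressed rate, which real drives may not hold; the function name is made up.

```python
def hours_to_write(size_gb, rate_mb_per_s):
    """Hours needed to write size_gb at a sustained rate, using 1 GB = 1000 MB."""
    seconds = size_gb * 1000 / rate_mb_per_s
    return seconds / 3600


# Time to fill one tape at the quoted uncompressed rates.
print(round(hours_to_write(500, 30), 2))  # SAIT:     4.63
print(round(hours_to_write(300, 36), 2))  # SDLT 600: 2.31

# Time to archive a single 250 GB image on each.
print(round(hours_to_write(250, 30), 2))  # 2.31
print(round(hours_to_write(250, 36), 2))  # 1.93
```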
I actually don't have much to do with the tape backup we use, as a colleague has taken ownership of archiving, but he tells me that many factors come into play with using tape, such as where on the network we place the drive, what software is used, etc.
If you are not storing vast amounts of data (we currently have about 5 terabytes, with room for 5 more or thereabouts), then I agree with Greg, DVD is a realistic option. But be careful of the CD/DVD burning robots, as we've had one for a couple of years and it's been a complete waste of money. It hardly ever worked right.
It wasn't from Fernico, was it?
Hi Mark, it's a Cedar Rimage… I can't remember where we got it from. Some snake oil salesman duped my previous manager into throwing good money away on it 🙂
In researching the storage options, I made contact with Rob Caffey of Chase Laboratories, 310-577-1702. He is an expert on tape backup. His equipment and supplies are not cheap, but they could be made to fit into the budget. I found him to be very helpful in spite of the fact that I decided against the tape option. I had previously pretty much given up on DVDs, which take up too much of my time.
As I analyzed it, my hardware infrastructure set the limitations and was really the deciding factor.
Some of you guys are working with elaborate networks that are set up to image to RAID drives and have servers dedicated to backups. My laboratory consists of one very good and one mediocre desktop, both of which I built myself. In addition to both being licensed for several different CF applications, those machines are also used for a variety of investigative tasks, business administration and personal tasks.
My field unit is a middle-of-the-road HP laptop. When not in the field, that unit is part of my "network", which consists of trying to share a printer and accessing the internet. Probably 90% of my imaging takes place out in the field. Both my lab units and my field unit have drive locks and a variety of adapters. My images are placed on a 300 GB Maxtor external HD in the field and transported back to the office. (We have 10 of those in service.) That is the point at which the backup decision has to be made. If I go the DVD route, my personal time and 50% of my laboratory are (assuming a 250 GB image) tied up for at least 11 hours. To use modern tape technology I would probably have to bring in another computer and enhance my network at a cost of about $10,000.
After considering all the options and risks, I have decided to place all of my backup images onto one of those external HDs, take it out of service, tag it, seal it and put it in my safe deposit box at the bank. My cost will be a minimum of $1/GB, depending upon the number and size of the backup images I put on each drive. That cost is being partially offset by a $1/GB storage surcharge I am charging my client.
How long will one of those HDs last if it is stored unpowered in a bank vault? I don't know! In fact, I'm not sure how long I want to store them.
One huge difference between private sector work and law enforcement work is the time allowed for acquisitions. I never appreciated how much of a luxury it was to be able to start an acquisition, lock up the lab, and come in to find it complete Monday morning. If it hung up somewhere, so what, I could do it again. The benefit was that I could take the extra time to compress images. I remember imaging a 120 GB drive (which turned out never to have been used) with full compression into less than 1 GB worth of evidence files. Based on this experience I would suppose that one could achieve compression (using EnCase) of unused space to 1/100th of its original size. Some of these huge drives we are seeing could be compressed way down since people rarely use all that space (in my experience, 40 GB of allocated space on a drive is rare). I'm going to put in a feature request with Guidance Software that a "recompression" feature be built in. The image could be acquired in the fastest manner (no compression), then compressed later overnight or over a weekend for archival.
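The effect is easy to see with any general-purpose compressor. The sketch below uses Python's `zlib` purely as a stand-in; it says nothing about which algorithm or settings EnCase actually uses. It compares a megabyte of zero-filled "unused" space with a megabyte of random bytes standing in for data that is already dense.

```python
import os
import zlib

CHUNK = 1024 * 1024  # 1 MiB sample


def ratio(data, level=9):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, level)) / len(data)


zeros = bytes(CHUNK)             # never-written space, all zero bytes
random_data = os.urandom(CHUNK)  # stand-in for incompressible content

print(f"zero-filled space: {ratio(zeros):.4f}")
print(f"random data:       {ratio(random_data):.4f}")
```

The zero-filled sample shrinks to a tiny fraction of its size while the random sample does not shrink at all, so the saving on a real image depends almost entirely on how much of the drive was never written or was wiped.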
Another possibility would be to mount the image with one of the utilities out there that can do this and re-acquire the mounted image with full compression. It would be an interesting verification test to see if the hashes would match. I'm not sure how the various image-mounting applications cache the necessary writes to the drive and whether that cache would be part of the mounted drive or not. If it is, then the new MD5 won't match the original. I've got a couple of cases here that I'll try it out on. If anyone's interested I'll post the results.
WinHex's backup manager allows images to be compressed and hashed, with an integrity check done when restoring the image to its original size.
The compression will obviously differ depending on the data contained within the image, but testing has shown anywhere from 15-85%. I haven't been able to test on many bigger drives, but I would think the same results will hold true.
You actually have the option to save as .whx backups with compression and encryption, raw images with no compression, or EnCase .e01 image files, which can be compressed but not encrypted.
All splittable at your discretion (650 MB for CDs, for example)… which sort of answers my earlier question on spanning of images.
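On the spanning question, one point that may help: for a raw image that has simply been cut into pieces, feeding the pieces to the hash in order gives the same MD5 as the unsplit image, so there is no need to rebuild the image on disk first. A minimal sketch with in-memory data follows; the helper names are made up, the 650-byte pieces stand in for 650 MB segments, and this does not apply as written to compressed or container formats.

```python
import hashlib


def split(data, size):
    """Cut data into consecutive pieces of at most `size` bytes."""
    return [data[offset:offset + size] for offset in range(0, len(data), size)]


def md5_of_segments(segments):
    """Hash the segments in order, as if they were one continuous stream."""
    digest = hashlib.md5()
    for segment in segments:
        digest.update(segment)
    return digest.hexdigest()


image = bytes(range(256)) * 40        # 10240-byte stand-in for a raw image
segments = split(image, 650)          # 650-byte stand-in for 650 MB pieces

whole_hash = hashlib.md5(image).hexdigest()
print(md5_of_segments(segments) == whole_hash)                  # True
print(md5_of_segments(list(reversed(segments))) == whole_hash)  # False
```

Order matters, as the second line shows, so the segments have to be read back in the sequence they were written.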
Thanks for the above posts, as they piqued my interest in the image archival aspect of forensics… which I hadn't much looked into.
I received a quick reply from Guidance stating that "recompression" (they call it re-acquisition) has been possible for some time now. So I guess you learn something every day. It will be my standard practice now.