Open source/free software solutions for extractions?

 Suai
(@suai)
Posts: 14
Active Member
Topic starter
 

Dear members, just recently I came across this forum and this will be my first post, so let me start with a brief introduction and hope you can solve some issues I'm facing.

I've been tasked with the creation of a technical support unit within a law enforcement department in what I consider to have been the initial months of a stressful work environment (no training, very limited resources and a very large workload). It's definitely become a high-speed learning experience, so please excuse but correct me if my terminology is not precise.

At the moment I'm handling disk extractions providing police investigators with a complete image of the disks. This is quickly generating issues regarding file space and interpreting the image file systems once they are mounted.

I've seen similar units work more efficiently with licensed software (EnCase) with an embedded script which extracts just what investigators might be interested in (images, videos, documents, e-mails, etc) in folders, together with a generated CSV file with crucial information about the extractions such as file hashes, timestamps and file path.

I've tried mimicking the process with free software such as Autopsy, but working this way is not time efficient, as I have to manually tag the files (I can't export the categories Autopsy has already analyzed that I'm interested in, such as "fotos"). I haven't worked with many other software programs, but Autopsy seems to handle system resources pretty poorly and I have a hard time processing larger files as well.

Does anyone know of a more time efficient open source software solution to handle this process? Extract file types based on their extensions in organized folders with a generated CSV file for investigators? Would a licensed software suite such as X-Ways perform this process any better?

 
Posted : 27/01/2020 1:03 pm
keydet89
(@keydet89)
Posts: 3568
Famed Member
 

If you're trying to, say, extract all image files (based on extension) from a mounted image, something like a batch file or shell script would work just fine.

You could even do something like generate a CSV and include hashes, using a scripting language, such as PowerShell (include error checking, etc.).
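A minimal sketch of that idea, written in Python rather than PowerShell or a batch file (a swap made purely for illustration). It assumes the image is already mounted read-only and that the output folder lives outside the mount point. The category map, the function name `extract_by_extension`, the `manifest.csv` file name and its column names are all hypothetical choices, not anything prescribed in this thread.

```python
import csv
import hashlib
import shutil
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical category map; extend it to whatever the investigator asks for.
CATEGORIES = {
    "images": {".jpg", ".jpeg", ".png", ".gif", ".bmp"},
    "documents": {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".txt"},
    "email": {".pst", ".ost", ".eml", ".msg"},
}


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def iso_utc(timestamp):
    """Render a POSIX timestamp as an ISO 8601 string in UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


def extract_by_extension(mount_root, out_dir):
    """Copy matching files into per-category folders and write manifest.csv.

    Returns the number of files that matched a category (including any
    that failed to copy; those get a row with the error text filled in).
    """
    mount_root, out_dir = Path(mount_root), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    with open(out_dir / "manifest.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["category", "original_path", "exported_as",
                         "size", "modified_utc", "sha256", "error"])
        for source in sorted(mount_root.rglob("*")):
            category = next((name for name, exts in CATEGORIES.items()
                             if source.suffix.lower() in exts), None)
            if category is None or source.is_symlink() or not source.is_file():
                continue
            count += 1
            # A running number avoids clashes between files with the same name.
            target = out_dir / category / f"{count:06d}_{source.name}"
            try:
                target.parent.mkdir(exist_ok=True)
                info = source.stat()
                shutil.copy2(source, target)  # copy2 keeps the modification time
                writer.writerow([category, str(source), target.name, info.st_size,
                                 iso_utc(info.st_mtime), sha256_of(target), ""])
            except OSError as exc:
                writer.writerow([category, str(source), "", "", "", "", str(exc)])
    return count
```

Something like `extract_by_extension("/mnt/evidence", "/cases/0001/export")` (both paths are placeholders) would then leave one folder per category plus the CSV. Only the modification time is recorded here, because creation time is not exposed uniformly across platforms, and the hash is taken from the exported copy, i.e. the file the investigator actually receives.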

 
Posted : 27/01/2020 1:58 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

Sure, as all images have the .jpg, .gif and .bmp extension …

@Suai
For images you can try ghiro
https://www.getghiro.org/

Most probably you will need additionally - though it is not among the fastest tools on earth - a parser like bulk-extractor (this is its Kali Linux page)
https://tools.kali.org/forensics/bulk-extractor
or the (hopefully enhanced) bulk-extractor-rec
http://www.kazamiya.net/en/bulk_extractor-rec

And a carver like photorec
https://www.cgsecurity.org/wiki/PhotoRec

jaclaz

 
Posted : 27/01/2020 4:33 pm
(@athulin)
Posts: 1156
Noble Member
 

Does anyone know of a more time efficient open source software solution to handle this process? Extract file types based on their extensions in organized folders with a generated CSV file for investigators? Would a licensed software suite such as X-Ways perform this process any better?

First, 'more time efficient' depends very much on what degree of efficiency you have now, and you haven't said.

It also depends on your platform. You mention EnCase, which means at least a Windows platform. You mentioned Autopsy, which is multi-platform, so you may also have Mac or Linux. What you choose depends on that: are you committed to Windows only? Any particular release? Or do you need software that is easily portable?

With fairly simple Unix scripting skills you can use tools like find(1) to identify image files in a mounted drive/image, and pass them on for extraction as well as any pre- or postprocessing such as hashing, or in cases where intrusions are suspected, look for signs of poor system configuration.

In a Windows environment you may be able to use the Windows 10 Ubuntu app, or other similar environments such as Cygwin, etc, if you need to have tools on several platforms. Don't know about the reverse situation, but PowerShell is said to run on Unix. (I have no experience from that, though.)

What other platform-dependent tools do you rely on?

With no Unix scripting skills … the picture is changed. Can you get that knowledge from outside? Perhaps there is no alternative except to get your solution ready-made?

But without knowing exactly where the technical limitations, or your personal limitations, lie, or exactly what 'more time efficient' means, there can be only vague ideas here. EnCase relied very much on scripting to get some of the more useful things done – I've not used it since 6.x, so I can't say if modern scripting modes help here. However, when the platform changes and script changes are required, and you find that you don't have the source code … if all you have is a packaged script (EnPack?), it's a problem.

Ultimately, your fundamental question about time efficiency can only be answered by testing.

You may need to think of getting a local partner that you can sit down with, and discuss different options with, and who can do any scripting or similar minor development or tests that you yourself don't have time for. Sometimes law enforcement and similar forensic organizations have benchmarking partners. In those I've been involved with, this partnership was, in part, a yearly capture-the-flag challenge from one party to the other; the most valuable experience was the post-mortem discussion of 'Why did you do this? Why didn't you do that? We had placed clues here, and here … did you find them? Why or why not?' and the discussions that led to.

 
Posted : 27/01/2020 4:55 pm
 Suai
(@suai)
Posts: 14
Active Member
Topic starter
 

Thanks for the input.

As far as scripting goes, it's beyond my knowledge to write a script at the moment.

Regarding platforms, I'm currently working both on Windows 10 machines as well as booting on Linux distributions (CAINE, DEFT, PALADIN). Most extractions are performed in a lab environment and not in the field. I'm not dealing with in-depth forensic analysis, simply providing investigators with an 'easy to read' format of seized digital devices.

As an example, an investigator dealing with a money laundering case might simply want to analyze any e-mails, documents and images off of seized PCs and laptops. By efficient, I mean being able to deal with an overload of seized PCs, laptops and USB drives, and being able to extract data based on file extensions that might be of interest to the investigator into categorized folders, along with info about the files that have been extracted (hashes, creation date, path), leaving the actual analysis and individual selection to the investigator. Autopsy seems to do a decent job at analyzing the disk image, but actually selecting and exporting full categories that might be of interest to investigators is not possible.

I certainly need to test some of the tools that have been mentioned, but with the workload stacking up it's hard to find the time to experiment.

 
Posted : 27/01/2020 11:22 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

As an example, an investigator dealing with a money laundering case might simply want to analyze any e-mails, documents and images off of seized PCs and laptops.

What you just described is more *like* e-discovery than forensics, i.e. something *like* docfetcher

http://docfetcher.sourceforge.net/en/index.html

or, more generally, content search tools may do?

Or something more complex *like* FreeEed

http://www.freeeed.org/index.php
https://github.com/shmsoft/FreeEed
https://github.com/shmsoft/FreeEed/wiki

I am not so sure that it is a good approach to give an investigator a "partial" extraction based on file extension, because, as hinted, if you go for "canonical" file extensions, you won't get any of the (say) .jlf files (which may actually be renamed .zip files where the suspect keeps all relevant data) nor any .prn files (which may not be printer files).
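To make the renamed-file point concrete, here is a rough Python sketch that compares a file's extension with its leading bytes (its "magic number"). The signature values are the well-known ones for these formats; the function names `sniff_type` and `find_mismatches` and the table of expected extensions are illustrative assumptions, and real tools check far more formats and look deeper than the first few bytes.

```python
from pathlib import Path

# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),  # also the container used by .docx/.xlsx/.odt
]

# Extensions normally expected for each detected type (illustrative only).
EXPECTED_EXTENSIONS = {
    "jpeg": {".jpg", ".jpeg"},
    "png": {".png"},
    "gif": {".gif"},
    "pdf": {".pdf"},
    "zip": {".zip", ".docx", ".xlsx", ".pptx", ".odt", ".jar"},
}


def sniff_type(path):
    """Return a type name based on the file's first bytes, or None if unknown."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    for magic, kind in SIGNATURES:
        if header.startswith(magic):
            return kind
    return None


def find_mismatches(root):
    """Yield (path, detected_type) for files whose extension hides their type."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        kind = sniff_type(path)
        if kind and path.suffix.lower() not in EXPECTED_EXTENSIONS[kind]:
            yield path, kind
```

A file named `backup.jlf` that starts with `PK\x03\x04` would be reported as a zip, while a plain-text `.prn` file returns `None` from `sniff_type`. Note that `None` only means "no signature in this short table matched", not that the file is uninteresting.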

jaclaz

 
Posted : 28/01/2020 9:16 am
 Suai
(@suai)
Posts: 14
Active Member
Topic starter
 

I've been catching up on some of your helpful answers.
I ask you to bear with my inexperience.

@jaclaz, it seems like what I'm trying to achieve is e-discovery using a digital forensic suite to analyze a disk image.
As with the example using EnCase software and outputting file info using a script, if I'm not mistaken this would carve out data from unallocated space as well as hidden file types and present them accordingly.

Looked into docfetcher, but that seems to work by installing the software on the actual system, which would mean modifying the original evidence. Maybe I'm missing something.

 
Posted : 30/01/2020 11:45 am
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

Sure, it needs to be installed (although there is also a "portable" version) on "a" system, like any other program.
I.e.
1) install (or use a portable version)
2) attach/mount an evidence file (image of suspect system disk) as a drive
3) ?
4) profit!

But it answers only a subset of the (more general) forensic questions (as it won't touch anything that is not in the file system).

Basically it is you who has to make the decision.

e-discovery is (simplified) the gathering of files that (again roughly) meet two (or three) requirements
1) they do exist in the filesystem
2) they are correctly named (meaning having the correct extension) <- this applies to simple tools like docfetcher
3) or they are at least non-corrupted <- this applies to more complex tools like FreeEed that detect file types beyond and besides their extension

digital forensics is about *everything*, i.e. it includes
1) files existing in the file system, files deleted, fragments (or whole files) recovered from unallocated, *everything*
2) the classification is then by file type (and not necessarily by extension, that may be misleading)
3) and more generally what happened, who did what (and when), etc.

It makes little sense to say at first that you want to "group" and "export" files by extension to have an investigator (not a forensics expert) look at them, and later to also want files (or fragments thereof) that were deleted or recovered from unallocated space, as these latter may well be (and often are)
1) wrongly categorized
2) mis-identified
3) partial or partially corrupted
and thus likely to be missed, impossible to render/parse (or both) by a non-expert in digital forensics.
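Why recovered data so often ends up partial or mis-identified is easier to see with a deliberately naive header/footer carver. The Python sketch below scans raw bytes for a JPEG start marker and the next end marker. It is an illustration only and not how PhotoRec or any other real carver works internally; the function name `carve_jpegs` and the `max_size` cut-off are assumptions made for the example.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus first marker byte
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(data, max_size=20 * 1024 * 1024):
    """Return a list of (offset, candidate_bytes, complete) tuples.

    'complete' only means that an end marker was found within max_size
    bytes of the header, not that the bytes form a valid picture.
    """
    results = []
    position = 0
    while True:
        start = data.find(JPEG_START, position)
        if start == -1:
            break
        end = data.find(JPEG_END, start + len(JPEG_START), start + max_size)
        if end == -1:
            # Header with no footer in range: keep it as a partial fragment.
            results.append((start, data[start:start + max_size], False))
            position = start + len(JPEG_START)
        else:
            stop = end + len(JPEG_END)
            results.append((start, data[start:stop], True))
            position = stop
    return results
```

Even in this toy form the weak spots show: the end marker also occurs inside embedded thumbnails, so a "complete" hit can be a truncated picture, a fragmented file yields a candidate with foreign data in the middle, and a header whose tail was overwritten comes back as a partial blob that a normal viewer will not open.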

With all due respect, it seems to me that you need to first focus on the specific *needs* you have, derive a suitable procedure/process from those, and only then check whether this (or that) tool fits in the plan. In Italy we have a saying, "avere la botte piena e la moglie ubriaca (e l'uva sulla vigna)", that translates roughly to "wanting the wife drunk and the wine barrel full (and the grapes in the vineyard)", which is the equivalent of the English "have your cake and eat it too", and I believe it is the Spanish "También yo quisiera tener el pastel y comerlo."

The first approach, let's call it "light e-discovery", is "simple" and can be carried out (in a short time) by almost anyone with a very minimal set of tools and very minimal training.

The second one, let's call it "full e-discovery", is "intermediate" and can be carried out (in a longer time) by almost anyone with proper (and not-so-minimal) training.

The third one, let's call it "proper digital forensics", is "advanced" and can be carried out (in a much, much longer time) properly only by a professional and well trained digital forensics practitioner [1], using each and every tool of the trade (+1).

Maybe you additionally need a "triage" tool preventively, see a few related threads here
https://www.forensicfocus.com/Forums/viewtopic/t=5585/
https://www.forensicfocus.com/Forums/viewtopic/t=10931/
https://www.forensicfocus.com/Forums/viewtopic/t=12514/
https://www.forensicfocus.com/Forums/viewtopic/t=12958/

The side problem is the (indirect) risk of calling things by the "wrong" name, which may lead to consequences. If you call a "light e-discovery" (carried out by an untrained investigator with this or that simple tool) a "forensics investigation" (which is something a lot more complex), you expose yourself (and/or your office) to possible issues when (if) a "real" forensics investigation is carried out and reveals data that contradicts your findings. So I believe that you need a sound definition of whatever you choose to perform, setting out both the scope and limitations besides the procedures involved.

As a side-side note, read about "assumptionware"
https://www.forensicfocus.com/Forums/viewtopic/t=12169/
http://www.zdziarski.com/blog/?p=3717

jaclaz

[1] please understand that the various commercial or freeware tools are exactly that, tools; what really counts for valid results is the experience and knowledge of the examiner using them. Unlike what most people think, tools are (still) far away from being an all-round solution, even if the trend (dictated by time, budget and in some cases ignorance of the managers) is towards the idea of half-trained monkeys pressing buttons and the tool automagically providing complete and valid results.

 
Posted : 30/01/2020 1:41 pm
bshavers
(@bshavers)
Posts: 210
Estimable Member
 

If just looking to extract by file type/extension for an investigator to review, it is easy enough and free (not open source, but free).

1. Add the image with FTK Imager (or if it is the original evidence machine, boot to WinFE)
2. Create a new "Custom Content Sources" container.
3. Add the file types (*jpg as an example)
4. Create the container (check the "create directory listing of all files in the image after they are created")

You'll then have an encapsulated file with all the .jpg files, along with a CSV file listing. The investigators can also use FTK Imager to view the images, along with being able to export what they need from the container file.

No scripts needed, with a widely-used forensic application.

 
Posted : 31/01/2020 6:43 pm
 Suai
(@suai)
Posts: 14
Active Member
Topic starter
 

Ok, getting some thoughts sorted out with your contributions.

As you pointed out, jaclaz, in one of your points, the correct wording I should have used is by "file type" and not necessarily extension.

My goal is the treatment of digital evidence to provide investigators (neither forensic experts nor necessarily computer literate) with an extraction in a 'familiar' format (typical 'click and view' Windows folders) whenever a complete or exact disk image is not required or asked for, as this generates problems in terms of both digital storage space and understanding/interpreting of disk images.

As much as I am slowly finding computer forensics more interesting, the purpose of my work, aside from digital acquisitions, is to speed up investigations without the need for a digital computer forensic expert to perform a subsequent analysis of every piece of seized evidence, as the backlog is too much.

I agree that focusing on specific "needs" and finding a suitable procedure is what I should prioritize, and is in fact what I'm trying to get my head around without exposing myself to legal issues and trying to find the right balance between "efficient and sufficient". It's quickly becoming obvious that there is no "one-size-fits-all" solution.

I believe you hit the nail on the head when you mention time, budget and ignorance of managers/superiors. In my particular case, I'm the half-trained (or better said, self-trained) monkey lol.

@bshavers, I think your proposal might be along the lines of what I'm trying to achieve. I'll run some tests with FTK Imager and see.

Thanks again.

 
Posted : 03/02/2020 7:16 pm