Search Hits


Post Posted: Wed Dec 12, 2012 12:31 pm

Hi,

I'm looking for a 'one stop' software solution (preferably free) that will index files and, using a keyword list, produce CSV output showing which keywords appear in each document. The files are 'general' office docs and mail (EML and MSG).

For example:

Doc1.doc Keyword1 Keyword2 Keyword3
Test.xlsx Keyword2 Keyword3
Scan.pdf Keyword1 Keyword3
Message.eml Keyword4

These keywords need to be able to be input as GREP expressions, and a keyword might also consist of two parts, which I want reported only if both appear, i.e. 'elephant' and 'giraffe' together, but not if only one appears.
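To make the requirement concrete, here is a minimal Python sketch of the output described above. It assumes the text of each file has already been extracted by some other tool (parsing DOC, XLSX, PDF, EML and MSG is not shown), and the keyword labels, patterns, and function names are hypothetical. Each label maps to one or more regular expressions, and a label is reported only when all of its patterns match, which covers the 'elephant' and 'giraffe' case.

```python
import csv
import io
import re

# Hypothetical keyword list: label -> list of regex patterns.
# A label is reported for a document only if ALL of its patterns match.
KEYWORDS = {
    "Keyword1": [r"\binvoice\b"],
    "Keyword2": [r"\bacc(oun)?t\s*no\b"],
    "elephant+giraffe": [r"\belephant\b", r"\bgiraffe\b"],
}

# Hypothetical sample data: file name -> already-extracted text.
DOCS = {
    "Doc1.doc": "Invoice for the elephant and giraffe enclosure",
    "Test.xlsx": "Acct no 1234, elephant only",
    "Scan.pdf": "nothing relevant here",
}


def compile_keywords(keywords):
    """Compile every pattern once, case-insensitively."""
    return {
        label: [re.compile(p, re.IGNORECASE) for p in patterns]
        for label, patterns in keywords.items()
    }


def hits_for_text(text, compiled):
    """Return the labels whose patterns all match somewhere in the text."""
    return [
        label
        for label, patterns in compiled.items()
        if all(p.search(text) for p in patterns)
    ]


def write_report(documents, keywords, out):
    """Write one CSV row per document that has at least one keyword hit."""
    compiled = compile_keywords(keywords)
    writer = csv.writer(out)
    writer.writerow(["file", "keywords"])
    for name, text in documents.items():
        hits = hits_for_text(text, compiled)
        if hits:
            writer.writerow([name, ";".join(hits)])


if __name__ == "__main__":
    buffer = io.StringIO()
    write_report(DOCS, KEYWORDS, buffer)
    print(buffer.getvalue())
```

This scans every document once per pattern, so it only suits small data sets; it is a sketch of the desired report, not a replacement for an indexing tool.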

I've got an idea of how to go about this in EnCase, by loading the CSV output into a database and querying it, but is there a simpler solution out there?

Many thanks for any advice.

Kindest regards

shep47
Member
 
 
  

Re: Search Hits

Post Posted: Thu Dec 13, 2012 3:43 am

As far as I know, this is either impossible or impractical in EnCase v6. There is no way to create an exported list like that or combine search hits in the way you describe using a basic keyword search. You may be able to do it if you index the case, but sadly indexing in EnCase 6 is very poor.

X-Ways, on the other hand, will do everything you need from a simple keyword search. Once a search is complete, X-Ways will populate a special column with "keyword hits". You could then export the filename along with this column to get the output you want.
X-Ways also handles "combination" keywords very well. You can do exactly what you specified quite easily, i.e. only show files where "elephant" AND "giraffe" both appear.

Chris_Ed
Senior Member
 
 
  

Re: Search Hits

Post Posted: Thu Dec 13, 2012 7:05 am

Hi Chris,

Many thanks for the reply. I agree that EnCase is impractical. It is achievable, as I've done it on a small test data set, but it involves a several-stage process to get to my desired result. Putting this into practice on several much larger sets with a larger number of keywords would be achievable but logistically a nightmare.

I will give X-Ways a go.

Shep  

shep47
Member
 
 
  

Re: Search Hits

Post Posted: Thu Dec 13, 2012 9:46 am

dtSearch will provide a list of keywords, and their frequency.
It also has the ability to look "inside" non-text documents, such as MSG, PDF, spreadsheets and MS Office documents.  

jhup
Senior Member
 
 
  

Re: Search Hits

Post Posted: Thu Dec 13, 2012 11:51 am

You may want to try P2 Commander. It does a good job with email and can perform recursive searches through email, archives, OLE strings, etc., so the search results will be comprehensive. You can then bookmark the results and create a CSV report showing the path to each search result. You can even choose to have the target files exported along with the report.

PM me and I can get you a sample report.
_________________
Paraben Corporation 

paraben
Member
 
 
  

Re: Search Hits

Post Posted: Thu Dec 13, 2012 12:36 pm

In FTK, you can use the Labels function to assign labels to your responsive files. You could then export the file properties with the Labels column included in the report. However, this means running one keyword search at a time then creating the label and applying it to the responsive files for that search instance. This becomes impractical when you have a long list of keywords.  

flamerescue150
Member
 
 
  

Re: Search Hits

Post Posted: Thu Dec 13, 2012 7:21 pm

Was the production of a CSV file the ultimate aim?

Or was the production of a CSV file just an intermediate step which you believed was necessary to perform a search via grep or to load the data into a custom database?

Because if the ultimate aim was to search across a set of documents (with various boolean expressions), there are much better ways to go about building an index than what you have proposed.

For example, your solution doesn't deal with stemming, exact phrase searches, different character sets, stop words, ranking results by relevance, breaking up email archive files into individual emails, searching by criteria other than keywords (e.g. dates, file names, email from addresses), wildcard searches, etc.
Your solution will also be slow.
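
As a rough illustration of the difference an index makes, the Python sketch below builds a tiny inverted index (term to set of documents) and answers a boolean AND query by intersecting sets instead of rescanning every file. The stop-word list, function names, and sample documents are assumptions for illustration only, and it handles none of the other features listed above (stemming, phrases, ranking, and so on).

```python
import re
from collections import defaultdict

# Illustrative stop-word list, not a standard one.
STOP_WORDS = {"the", "and", "a", "an", "of", "in"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, drop stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower())
            if t not in STOP_WORDS]


def build_index(documents):
    """Map each term to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def search_all(index, *terms):
    """Return the documents containing every term (boolean AND)."""
    sets = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*sets) if sets else set()


if __name__ == "__main__":
    docs = {
        "a.txt": "The elephant and the giraffe",
        "b.txt": "An elephant in the room",
    }
    idx = build_index(docs)
    print(search_all(idx, "elephant", "giraffe"))
```

The index is built once, after which each query costs only a few set lookups, which is why dedicated indexing tools scale better than repeated keyword scans.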

You are much better off using a pre-made solution, like the well-respected products suggested above, or our own OSForensics software, which may well do what you want for free.

Passmark
Senior Member
 
 