Hello,
Does anyone know of any good whitepapers/reports on effective keyword searching and use of conditions in EnCase?
Thanks
You should read the *.CHM help files that you're prompted to install or leave off when you install EnCase on your machine.
As far as conditions go, a good rule of thumb is that if you can't say in plain English what your condition is doing in a short paragraph, it's too complex. Having perfect logic is not the whole picture. You need to be able to clearly express what you're doing to someone whose knowledge of information sciences will be far less than your own.
Now keywords… you'll want to keep them simple for your own sanity (and safety). Never use embedded or compounded GREP statements when you don't have to. The engine is home grown, and as useful as EnCase is, as much of a lifeline its search is, it is buggy. The good thing is, you should never experience the bugs when sticking to regular day-to-day keyword terms and simple GREPs. Two rules to follow…
* never nest groups "(expression1 (exp2 | exp3))"
* never use more than one OR "|" if you can avoid it (it is, of course, ok to write simple keywords like "blah.(com)|(net)|(org)")
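One way to stick to these two rules is to expand a nested expression into a flat list of plain keywords before entering them. A minimal sketch in Python, purely for illustration: `expand_keywords` is a made-up helper, not anything EnCase provides, and it only does string concatenation.

```python
from itertools import product


def expand_keywords(parts):
    """Expand a sequence of alternatives into flat keywords.

    `parts` is a list of lists; each inner list holds the alternatives
    for one position. Every combination becomes one simple keyword.
    """
    return ["".join(combo) for combo in product(*parts)]


# Instead of one keyword with a nested group, enter several simple ones.
keywords = expand_keywords([["blah."], ["com", "net", "org"]])
for kw in keywords:
    print(kw)
```

More keywords to manage, but each one is trivial to explain and none of them depends on how the engine handles grouping.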
And to be nit-picky, the EnCE Exam book is dead wrong on its GREP examples. When it showed some example on searching credit card numbers or social security #s, it used "[^#](expression….)[^#]" … i.e. not-a-number, then my expression, then not-a-number. That fails if your intended search hit is the first or last text entry of a file, because there is no character before or after it to match the not-a-number. …So read up on GREP in Unix tutorials, and you'll have a better understanding of how to apply EnCase's implemented subset of GREP.
…plainly, just don't be afraid of simplicity & breaking your conditions and keywords into a couple of simple items. Hope that's enough help.
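The start/end-of-file failure is easy to reproduce outside EnCase. The sketch below uses Python's `re` module, not EnCase's engine, and assumes EnCase's "#" (any digit) translates to `[0-9]`; the SSN-shaped pattern and the sample strings are made up for the demonstration.

```python
import re

# Book-style pattern: not-a-digit, the expression, not-a-digit.
# Assumed Python translation of the EnCase GREP form.
BOOK_STYLE = re.compile(r"[^0-9][0-9]{3}[ -]?[0-9]{2}[ -]?[0-9]{4}[^0-9]")

# Python-only alternative using lookarounds, which assert "no digit here"
# without requiring a character to exist. Do not assume EnCase supports these.
LOOKAROUND = re.compile(r"(?<![0-9])[0-9]{3}[ -]?[0-9]{2}[ -]?[0-9]{4}(?![0-9])")


def has_hit(pattern, data):
    """True if the pattern matches anywhere in data."""
    return pattern.search(data) is not None


mid_file = "name: 123-45-6789, dob"
at_start = "123-45-6789, dob"   # hit is the first thing in the file
at_end = "ssn 123-45-6789"      # hit is the last thing in the file

for label, sample in [("mid", mid_file), ("start", at_start), ("end", at_end)]:
    print(label, has_hit(BOOK_STYLE, sample), has_hit(LOOKAROUND, sample))
```

The book-style pattern only finds the mid-file sample, because "[^#]" has to consume a real character on each side.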
Thanks Logg. What I am actually looking for is information such as: with keywords, choosing "whole word" doesn't find hits in the registry or in web URLs because of the characters that can surround the hits. Or things like test scenarios where, if you start a keyword search and are getting too many hits, you want to re-evaluate your keywords, but before you stop, delete, and restart, you want to restart your case before you SAVE ALL; otherwise all your false positives will still be loaded up in RAM with the case file.
As with conditions, again just certain gotchas w/ layering.
I am just looking for tips, tricks and hints from those in the field who have found these out first hand.
Thanks,
Read the EnScripts called "..within 'n' words" and you'll quickly see what a "word" is programmatically defined as. It is very difficult to define a word while encompassing all the regular exceptions.
When you want to find "whole word," that's the exact scenario I described with the EnCE exam book. You will not get hits if your GREP is too specific (as in the example above).
As far as getting an avalanche of hits and wanting to stop, just stop as you wrote out, evaluate your hits & why you've gotten the overrun, delete the hits (they're still in RAM), close w/o saving (so you should have hit Save All before the run), reopen, then make and run a new keyword. This is my example of *simple & many* … if you want an SSN, you need 2 expressions
1) GREP keyword "[^#]#{3,3}[ \-]?##[ \-]?#{4,4}[^#]"
(this takes all mid-file hits … plus some false-positives)
2) condition *using the keyword above* where (hit location is at EntryFile length = 0 || hit location + hit length = end of file - 1)
(this takes care of start and end of file hits that you'd otherwise miss)
(the equation changes if you want logical vs physical)
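The two-expression idea can be sketched outside EnCase like this. It is Python's `re`, not EnCase GREP or an EnCase condition, and it is one interpretation of the approach: pass 1 is the guarded keyword, pass 2 runs the bare expression and keeps only hits touching the start or end of the data. `find_ssn_like` and the constant names are made up; note Python offsets are zero-based with an exclusive end, so "end of file" here is `len(data)` rather than "end of file - 1".

```python
import re

DIGITS = "0123456789"
CORE = r"[0-9]{3}[ -]?[0-9]{2}[ -]?[0-9]{4}"
MID_FILE = re.compile(r"[^0-9](" + CORE + r")[^0-9]")  # guarded keyword
BARE = re.compile(CORE)                                # expression alone


def find_ssn_like(data):
    """Return sorted (offset, length) pairs for SSN-shaped hits."""
    hits = set()
    # Pass 1: mid-file hits with a non-digit on both sides.
    for m in MID_FILE.finditer(data):
        hits.add((m.start(1), len(m.group(1))))
    # Pass 2: hits that touch the start or end of the data.
    for m in BARE.finditer(data):
        at_start = m.start() == 0
        at_end = m.end() == len(data)
        if not (at_start or at_end):
            continue
        # The side that is not a file boundary must still be a non-digit.
        left_ok = at_start or data[m.start() - 1] not in DIGITS
        right_ok = at_end or data[m.end()] not in DIGITS
        if left_ok and right_ok:
            hits.add((m.start(), m.end() - m.start()))
    return sorted(hits)


print(find_ssn_like("123-45-6789 and 987-65-4321"))
```

One caveat with the guarded form: the trailing non-digit is consumed by the match, so two hits separated by a single character will not both be found in pass 1.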
So you can see that when asking something as simple as "find me a whole word," the answer can be quite complex.
As with conditions, again just certain gotchas w/ layering.
Written above
* never nest groups "(expression1 (exp2 | exp3))"
* never use more than one OR "|" if you can avoid it (it is, of course, ok to write simple keywords like "blah.(com)|(net)|(org)")
if you can't say in plain English what your condition is doing in a short paragraph, it's too complex.
Finally, biggest tip: READ UNIX GREP BOOKS. You'll understand the caveats of GREP, so you'll know how to prepare when building your own terms.