EnCase GREP / REGEX dialect

(@xandstorm)
Posts: 56
Trusted Member
Topic starter
 

Hi guys,

Does anyone on this list have any recent experience with the EnCase (version 8) GREP / REGEX features?

Specifically, what exact GREP / REGEX dialect does EnCase actually "speak"?
There is no mention of it in the manual, and the response from the Guidance / OpenText tech support rep was "sorry I don't know".

I am running a few tests within EnCase and, apart from some very rudimentary search patterns that return results, the majority of the operators appear not to work properly at all.

Some examples:

1. The logical OR "|" pipe character "breaks" any result as soon as something is typed after the "|" character.
2. The quantifier ranges {x,x}: same thing.
3. Unicode character codes even return a syntax error.
Etc.

Basically, for anything other than plain text words, EnCase doesn't appear to find anything anymore.

Note: the same image evidence files were tested in other forensic software suites, where the searches performed as expected.

Does anyone have similar experience with this?

Saludos,
Lex

 
Posted : 26/01/2019 4:21 pm
(@gsibat)
Posts: 12
Active Member
 

EnCase's use of GREP is covered in our Building an Investigation course. EnCase uses a subset of the operators used on *nix systems, and that subset has been the same through previous versions. With regard to the pipe symbol: whatever appears immediately before or after the pipe is subject to the logic of 'OR', whether that is a character, set, or group. E.g. cat|sat would only apply the logic of OR between the t and the s, not between the words cat and sat. A set is represented by square brackets: [cat]|[sat] would only look for a single instance of one of the letters within the square brackets, not all of them together. This represents a logic of OR for these letters.

To look for the words cat or sat, you would use a set represented by the letters within parentheses, e.g. (cat). This represents the logic of AND for the letters within the parentheses. Therefore, to look for cat or sat, our expression would be (cat)|(sat), which would look for cat OR sat.
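
For comparison, the difference between square brackets and parentheses can be tried out in a general-purpose regex engine. The sketch below uses Python's `re` module, not EnCase, so it only shows how a standard engine reads the two patterns; the sample sentence is made up for illustration.

```python
import re

text = "the cat sat on the mat"

# Square brackets: every hit is ONE character taken from the listed letters.
bracket_hits = re.findall(r"[cat]|[sat]", text)

# Parentheses: every hit is the whole letter sequence, in that order.
group_hits = re.findall(r"(?:cat)|(?:sat)", text)

print(bracket_hits)  # a list of single letters
print(group_hits)    # ['cat', 'sat']
```

Whether EnCase GREP produces the same hits for these two patterns would have to be checked in its keyword tester.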

{min,max} applies to whatever character, set or group precedes this operator, and means that it is repeated a minimum to a maximum number of times. The minimum value can be as low as 1 (one), and the maximum can go up to 256.
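
As a point of reference, this is how a bounded quantifier behaves in Python's `re` module. It is a sketch of the standard semantics only; the 1 to 256 limits mentioned above are EnCase-specific and are not modelled here, and the test words are invented.

```python
import re

# a{2,4}: the preceding item 'a' repeated at least 2 and at most 4 times.
bounded = re.compile(r"ba{2,4}d")

candidates = ["bad", "baad", "baaaad", "baaaaad"]
matches = [s for s in candidates if bounded.fullmatch(s)]
print(matches)  # ['baad', 'baaaad']

# When a group precedes the quantifier, the whole group is repeated.
grouped = re.compile(r"(?:ab){2,3}")
print(bool(grouped.fullmatch("abab")))  # True
print(bool(grouped.fullmatch("ab")))    # False
```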

To represent Unicode characters, you use '\w', e.g. uppercase A in Unicode would be \w0041. For hexadecimal you would use \x, so A would be \x41.
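
Note that in most other regex dialects `\w` means "word character" and Unicode code points are written `\uHHHH`, which may explain syntax errors when patterns are carried over between tools. The sketch below assumes the `\wHHHH` form described above and rewrites it for testing in Python; `encase_unicode_to_python` is a hypothetical helper written for this example, not part of EnCase or of Python.

```python
import re

def encase_unicode_to_python(pattern: str) -> str:
    # Hypothetical helper: rewrites \wHHHH (as described in the post) to \uHHHH.
    # Naive on purpose: it does not handle an escaped backslash before the 'w'.
    return re.sub(r"\\w([0-9A-Fa-f]{4})", lambda m: "\\u" + m.group(1), pattern)

converted = encase_unicode_to_python(r"\w0041BC")
print(converted)                               # \u0041BC
print(re.findall(converted, "xxABCxx"))        # ['ABC']

# The hexadecimal form \x41 is understood by Python as written.
print(re.findall(r"\x41", "xxABCxx"))          # ['A']
```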

The most obvious thing that people miss is selecting the option to enable GREP within the search creation dialogue box.

Hope that helps.

 
Posted : 26/01/2019 5:52 pm
(@athulin)
Posts: 1156
Noble Member
 

A set is represented by Square Brackets [cat]|[sat] would only look for a single instance of the letters within the square brackets […]

In order to look for the words cat or sat. This would be achieved by using a set, represented by the letters within parentheses (cat).

That probably answers the OP's question: the terminology is nowhere near standard regex terminology, so of course it will be difficult to disentangle.

I think I remember a precedence problem back in the 6.x days. That might have been |, which in Unix grep has lower precedence than concatenation (actually lower than anything else), so cat|sat matches 'cat' or 'sat', not 'catat' or 'casat'.

If EnCase really needs (cat)|(sat), as suggested by Gsibat, that would indicate that concatenation has lower precedence than | ! And if so, anyone who knows Unix grep will be at a serious disadvantage when using EnCase grep, simply because their regexps won't work as expected if they rely on Unix-style precedence.
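
The two readings of cat|sat can be written out explicitly. This is a minimal sketch in Python's `re` module, which follows the Unix-style precedence; the second pattern spells out the other reading by hand and is not a claim about what EnCase actually does internally.

```python
import re

words = ["cat", "sat", "catat", "casat"]

# Unix-style reading: alternation has the lowest precedence,
# so the pattern means 'cat' OR 'sat'.
unix_style = re.compile(r"cat|sat")

# The other reading, written explicitly: '|' binds only the 't' and the 's'.
tight_or = re.compile(r"ca(?:t|s)at")

unix_hits = [w for w in words if unix_style.fullmatch(w)]
tight_hits = [w for w in words if tight_or.fullmatch(w)]

print(unix_hits)   # ['cat', 'sat']
print(tight_hits)  # ['catat', 'casat']
```

Running all four test words through the EnCase keyword tester with the plain pattern cat|sat would show which reading it implements.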

 
Posted : 26/01/2019 7:50 pm
(@xandstorm)
Posts: 56
Trusted Member
Topic starter
 

Hope that helps.

Hello Gsibat,

Thank you for your feedback.
It does help a little to understand what syntax to apply.

Still, the results / hits in the keyword tester come up either empty or irregular.
After adjusting the test text file and reloading it into the keyword tester, sometimes results show up but sometimes not.

Two more examples:

1. Searching for patterns that contain the word "advisor", optionally followed by any number of characters. When I test that search pattern as (advisor.*), it sometimes returns a hit and sometimes doesn't. And when there is a hit, the search appears to stop there, as other instances of the same keyword in that same file are not returned / highlighted as hits.

2. GREP / REGEX search patterns are said to be logical "AND" by default, but in most if not all REGEX dialects one would use lookaheads to achieve logical AND search patterns, for instance a logical AND search for the words "BOB" AND "ALICE".
With the special characters available in EnCase GREP, I have no idea how to achieve that.

Do you have any advice on the above?
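
Both points can be illustrated in a standard engine. The sketch below uses Python's `re` module and made-up sample strings. It shows that a greedy `.*` absorbs later occurrences on the same line into the first hit, and how lookaheads express a logical AND. It says nothing about whether EnCase GREP supports lookaheads.

```python
import re

# Point 1: greedy '.*' runs to the end of the line, so a second 'advisor'
# on the same line becomes part of the first hit, not a separate one.
line = "advisor one, advisor two"
greedy_hits = [m.group() for m in re.finditer(r"advisor.*", line)]
plain_hits = [m.group() for m in re.finditer(r"advisor", line)]
print(greedy_hits)  # ['advisor one, advisor two']
print(plain_hits)   # ['advisor', 'advisor']

# Point 2: logical AND via lookaheads. Both words must occur somewhere
# in the text, in any order. DOTALL lets '.' cross line breaks.
both_words = re.compile(r"^(?=.*BOB)(?=.*ALICE)", re.DOTALL)

def contains_both(text: str) -> bool:
    return both_words.search(text) is not None

print(contains_both("ALICE emailed BOB"))  # True
print(contains_both("BOB emailed CAROL"))  # False
```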

Rg,
Lex

 
Posted : 26/01/2019 8:30 pm
(@gsibat)
Posts: 12
Active Member
 

Hi Lex

Usually anything that is entered into the search field assumes the logic of AND unless qualified by use of the grep operators. To achieve what you are after: if you were to create a simple expression of 'Bob Alice', you would get hits exactly as you typed it, i.e. bob [space] alice. You could instead create two separate keywords, or any number of individual keywords, and then in the Keywords Hit tab combine the search results by selecting the hits you wish to combine and applying the logic of 'AND' by choosing 'must match all' or the logic of 'OR' by choosing 'must match any'.
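
The idea of combining per-keyword results can be sketched outside EnCase with plain set operations. Everything in the example below is hypothetical (the `files_with_hits` helper, the file names and their contents); it only illustrates the 'must match all' and 'must match any' logic and does not use any EnCase API.

```python
def files_with_hits(documents: dict[str, str], keyword: str) -> set[str]:
    # Hypothetical helper: names of the documents containing the keyword,
    # compared case-insensitively.
    needle = keyword.lower()
    return {name for name, text in documents.items() if needle in text.lower()}

# Made-up test data standing in for files in a case.
documents = {
    "a.txt": "Bob wrote to Alice",
    "b.txt": "Bob only",
    "c.txt": "nothing here",
}

bob_files = files_with_hits(documents, "bob")
alice_files = files_with_hits(documents, "alice")

match_all = bob_files & alice_files   # 'must match all' (AND)
match_any = bob_files | alice_files   # 'must match any' (OR)

print(sorted(match_all))  # ['a.txt']
print(sorted(match_any))  # ['a.txt', 'b.txt']
```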

 
Posted : 27/01/2019 7:20 pm
(@xandstorm)
Posts: 56
Trusted Member
Topic starter
 

Hi Gsibat,

Thanks again for your feedback.

This concerns a second-opinion case in which the use of GREP / REGEX search patterns is mandatory.

I think the keyword searches that have to be applied are pretty straightforward from the "logic" perspective, but for some reason I cannot get them to work. The latest test runs even return "s" characters when applying the more or less standard REGEX "\s" space character operator in a search pattern.

Another aspect that appears not to work at all is the "whole word" tick box; I also tried the word boundary "\b" operator.
The results still list file hits in which only part of the word-bounded search pattern is present.

Is there a way to custom add specific characters that are to be considered part of the applicable alphabet character set, so word boundaries can be determined more explicitly?

So again, for some reason I cannot get it to work, which might say more about me than about EnCase, but I am going to give up on EnCase for this specific case for now.

Rg,
Lex

 
Posted : 28/01/2019 3:51 am
(@gsibat)
Posts: 12
Active Member
 

Lex

The first thing to remind you of is that EnCase Grep is not Regex, so the variables that are applicable to Regex do not apply to EnCase Grep. There is no way to add any additional operators.

When it comes to whole word, the way it is implemented in EnCase Grep is to simply enable the option to use whole word. What it does is look for a space before and after your expression. eg if you searched for the whole word 'soft' it would not report a hit in 'Microsoft'. You can represent the same by typing \x20soft\x20 where \x20 is hexadecimal for space.
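
A space-delimited match and a regex word boundary are not the same test. The sketch below compares the two in Python's `re` module with invented sample strings. It does not reproduce EnCase's whole-word option; it only shows where the two definitions differ in a standard engine.

```python
import re

samples = ["Microsoft", "a soft landing", "soft", "too soft.", "soft-boiled"]

# Literal space (hex 20) required before and after the word.
space_delimited = re.compile(r"\x20soft\x20")

# Regex word boundary: transition between a word and a non-word character,
# or the start / end of the text.
word_boundary = re.compile(r"\bsoft\b")

space_hits = [s for s in samples if space_delimited.search(s)]
boundary_hits = [s for s in samples if word_boundary.search(s)]

print(space_hits)     # ['a soft landing']
print(boundary_hits)  # ['a soft landing', 'soft', 'too soft.', 'soft-boiled']
```

Neither pattern matches 'Microsoft' here, so substring hits on a whole-word search would not be expected from either definition.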

In EnCase Grep, a '\' in front of any Grep operator makes it match as its literal representation, e.g. where ? means repeat the preceding character, set or group one or zero times, \? would look for the ? as a punctuation character.
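
The same operator-versus-literal distinction exists in standard engines. A minimal sketch in Python's `re` module, with made-up test strings:

```python
import re

# '?' as an operator: the preceding 'u' may appear zero or one time.
optional_u = re.compile(r"colou?r")
spellings = [w for w in ["color", "colour", "colouur"] if optional_u.fullmatch(w)]
print(spellings)  # ['color', 'colour']

# '\?' as a literal question mark.
literal_q = re.compile(r"why\?")
question_hits = literal_q.findall("why? why not")
print(question_hits)  # ['why?']

# re.escape adds the backslash for you when the search term is plain text.
escaped = re.escape("why?")
print(escaped)  # why\?
```

In Python `\s` is a whitespace class, not an escaped letter. Whether EnCase GREP reads `\s` as a literal "s", which would fit the "s" hits reported earlier in the thread, is only a guess.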

Cheers

 
Posted : 28/01/2019 1:54 pm
(@xandstorm)
Posts: 56
Trusted Member
Topic starter
 

When it comes to whole word, the way it is implemented in EnCase Grep is to simply enable the option to use whole word. What it does is look for a space before and after your expression. eg if you searched for the whole word 'soft' it would not report a hit in 'Microsoft'. You can represent the same by typing \x20soft\x20 where \x20 is hexadecimal for space.

That "looking for a space" to determine a word boundary is exactly what is failing here.
In all my test runs, substrings of "whole word" grep searches were also listed and highlighted as hits.

The result is a huge number of false positives that, according to the EnCase GREP syntax, shouldn't be listed as hits at all.

This was the main reason for my question about which exact version of GREP / REGEX is implemented within EnCase.

Saludos,
Lex

 
Posted : 28/01/2019 3:53 pm
(@gsibat)
Posts: 12
Active Member
 

How big is the data set you're working with? Is it something that can be shared? Happy to work with you offline to see what's happening.

cheers

 
Posted : 28/01/2019 10:43 pm
(@xandstorm)
Posts: 56
Trusted Member
Topic starter
 

How big is the data set you're working with? Is it something that can be shared? Happy to work with you offline to see what's happening.
cheers

Hi Gsibat,

It is a rather big dataset, encompassing almost 1TB, so it is too big to share conveniently, I'm afraid.

Rg,
Lex

 
Posted : 28/01/2019 11:40 pm