Forensic vs. Interpretive or Analytical Tools?

29 Posts
6 Users
0 Likes
2,062 Views
pcstopper18
(@pcstopper18)
Posts: 60
Trusted Member
Topic starter
 

All,

I have recently become aware of a perspective regarding tool validation and verification and wanted to reach out for some feedback. Is anyone familiar with an approach where some tools are categorized as interpretive or analytical in nature and, as such, are not required to be validated? Only "forensic tools," or the "forensic" functions of a tool suite, would be. I assume such functions would be imaging, hashing, carving, searching, or similar.

I recall a conversation in the past about working toward accreditation in one's environment and having one's imaging capability accredited as opposed to everything. I assume this approach may be related to that or a similar thought process?

The bottom line is: must every "tool" be validated before use, or just a "forensic tool"? Is there a distinction to be made among the software used in an examination? If so, what is it, and what is the basis?

 
Posted : 18/10/2015 5:08 am
(@mscotgrove)
Posts: 938
Prominent Member
 

Most tools are commercial products, and I am sure all tools have bugs. Most tools also have updates, and with them chances to 'change' the bugs. Without updates, a tool becomes out of date very quickly.

I am sure there are multiple ways to investigate a disk and come to different, but possibly equally valid, conclusions. Is that down to the tool or the operator, and how would validation handle it?

The most important part of any investigation must be the operator. If a tool shows something important, the operator must be able to determine whether the tool is correct by other means. This might be by using a tool from a completely different source, or by going in with a hex viewer.

I don't think validation is very practical. One reason is that tools work on completely different data each time they are run. The only way to validate a tool would be to ensure it has seen all possible types of data, past and present, and every small release would then have to be tested against this effectively infinite amount of data again.

Personally, I am often concerned that some questions on this forum assume blind faith in a tool. I am sure we all trust our favourite tools, but ultimately each run needs to be checked to confirm it gives a valid result. The operator must understand what the tool is doing and not blindly trust anything.

(As we have seen with certain diesel cars recently, passing a test does not mean it works!)

 
Posted : 18/10/2015 4:45 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

If I may, this issue touches on the philosophical aspect (more properly, the field of epistemology).
There are mainly two "tool paradigms":

  1. a given (computer/OS/network/whatever) process/mechanism is known and documented; it leaves traces or artifacts, and the tool only automates the analysis or collection of those traces, or simply makes that analysis or collection faster than using a "plain" tool (like a hex editor)
  2. a given (computer/OS/network/whatever) process/mechanism is discovered by a commercial entity that does not document it (or does not document it properly) but simply implements it in a program (or hardware)

Of course, in the first case any finding can be validated easily, by simply using another similar tool, or by manually checking the actual values/fields/whatever and verifying them against the result of the tool and the documented process.

The second case is much trickier. In some cases it produces a verifiable result: as an example, a commercial password cracker may well use a proprietary algorithm exploiting an unpublished vulnerability in the encryption; since what it produces in the end is a password, and that password either opens the archive or encrypted container or it does not, there is no real need to know HOW the tool reached the goal, as the goal itself is verifiable. In other cases, though, the result may derive from an interpretation (via a proprietary discovery, algorithm, or "translation") of the actual data that - since the process creating the artifact or leaving the traces is not documented, or not documented well enough - may (or may not) be accurate and is not directly verifiable.

About the point that mscotgrove raised :) I will give you an example from a completely different field 😯 .
We have all used Excel since - say - 1992 to create (simple or complex) spreadsheets essentially based on the four main mathematical operators (plus a few more operators/functions, of course), and the results of those calculations were always found to be correct (maybe sometimes with a loss of accuracy when very small numbers are involved, but still within the expected tolerances).
Everyone - from the sheer mass of Excel users - has been empirically convinced that Excel can multiply correctly (and of course this is normally true), so the tool had been "validated by acclamation".
Yet it was "suddenly" found that Excel 2007 could not multiply (a fully known and documented process) 77.1*850 properly (it came out with 100,000 instead of 65,535), see here:
http://www.theregister.co.uk/2007/09/26/excel_2007_bug/
https://blogs.office.com/2007/09/25/calculation-issue-update/
http://www.joelonsoftware.com/items/2007/09/26b.html

More or less, you could multiply *any* pair of numbers (exception made for this specific pair or another handful of them) and every single multiplication would give a correct result - but not this particular one.
Since no one can possibly test an infinite number of pairs of numbers to be multiplied, the tool is likely to be validated, yet in some cases it would be greatly inaccurate.
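
A minimal sketch (in Python rather than Excel) of why this particular pair of numbers is hazardous: 77.1 has no exact binary representation, so in IEEE 754 double precision the product lands a hair below 65,535 (0xFFFF). Per the links above, the multiplication inside Excel 2007 was actually correct; it was the code formatting such near-65,535 results for display that went astray.

```python
# 77.1 cannot be represented exactly in binary floating point, so the
# product is not exactly 65535 -- it is a hair below it.
x = 77.1 * 850
print(x)               # 65534.99999999999
print(x == 65535)      # False
print(f"{77.1:.20f}")  # ~77.09999999999999431566 (the value actually stored)
```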

And now, still OT (but not by much), some recent news:
http://arstechnica.com/tech-policy/2015/10/secret-source-code-pronounces-you-guilty-as-charged/

    jaclaz

 
Posted : 18/10/2015 6:20 pm
pcstopper18
(@pcstopper18)
Posts: 60
Trusted Member
Topic starter
 

mscotgrove

I am sure that many, if not most, will agree with your points. I know that I do. The practitioner and their skills, knowledge, education, and training are what is paramount. I don't know anyone in an accredited environment (which doesn't mean there are none) who believes validation is "practical," and I don't know anyone who would make that argument. Practicality aside, it is a normative practice, or is becoming one, due to the belief that the concept is the most prudent and correct action to take.

However, it is not the necessity or lack thereof that I am concerned with at the moment.

“Personally, I am often concerned that some questions on this forum assume blind faith in a tool. I am sure we all trust our favourite tools, but ultimately each run needs to be checked to confirm it gives a valid result. The operator must understand what the tool is doing and not blindly trust anything.”

I don’t know if this is directed at my question or if it is a frustrated observation?

jaclaz

My original question is both philosophical and practical. Does the notion I describe ring a bell and does anyone practice that idea?

As far as I am concerned, your first paradigm should be the norm. Even if not everything can be documented and known to provide a foundation for the tools in paradigm 1 at all times, the process is more sound, verifiable, and defensible, and can be duplicated by anyone - all of which you have noted. If I had my way, any tool used in the criminal justice system would have to be open source. Notice I did not say free, and I did not say it couldn't be proprietary (which is what copyright and patent laws are for). The secrecy serves no one but the vendor, as opposed to the criminal justice process. It's antithetical to the search for truth that the justice system (however imperfectly) undertakes.

Great feedback overall. However, my original question remains. The assumption is that the practitioner works in an accredited lab; they have to validate their tools as a matter of accredited practice. A suggested policy is: can they categorize some tools or certain functions as "forensic" in nature and other tools used to view/collate/interact/etc. as "interpretive" or "analytical," and validate only the forensic ones? Is there a basis for this? Is this in line with, or contrary to, practices/policies anyone is familiar with?

 
Posted : 18/10/2015 10:39 pm
(@mscotgrove)
Posts: 938
Prominent Member
 

"The secrecy serves no one but the vendor"

Being the vendor, I can agree with this. Having been writing commercial software for over 30 years, I have built up knowledge, methods, etc. that I do not want to share. The income has been used to pay staff, fund more development, and so on. I have zero trust in copyright or patent laws, so I just keep my secrets to myself. Open source is not something I would ever consider.

As an observation, in my opinion, the Forensics world is much more open about knowledge than the data recovery world.

I am amused by the Excel error. 65535, as everyone will know, is 0xFFFF, or -1 on 16-bit systems if you must use signed values. This was a common value for ERROR that functions could return. I would guess that the function returned that value, which was then identical to the error return.

NB, Microsoft has now redefined ERROR to be 0 (same as FALSE)
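
A small illustration of the point, as a sketch in Python: the same 16-bit pattern 0xFFFF reads as 65,535 unsigned or -1 signed, so a function legitimately returning 65,535 is indistinguishable from one returning a -1 ERROR sentinel.

```python
import struct

raw = b"\xff\xff"  # the 16-bit pattern 0xFFFF

print(struct.unpack("<H", raw)[0])  # 65535 -- a legitimate computed value...
print(struct.unpack("<h", raw)[0])  # -1    -- ...or a classic ERROR sentinel

# A caller checking "result == -1" on a signed 16-bit value cannot tell
# a genuine 65535 apart from the error return.
```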

 
Posted : 19/10/2015 2:20 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

However, my original question remains. The assumption is that the practitioner works in an accredited lab; they have to validate their tools as a matter of accredited practice. A suggested policy is: can they categorize some tools or certain functions as "forensic" in nature and other tools used to view/collate/interact/etc. as "interpretive" or "analytical," and validate only the forensic ones? Is there a basis for this? Is this in line with, or contrary to, practices/policies anyone is familiar with?

Yep :) - but the point is that validation does not exist 😯 (or it exists only in the minds of people who believe in it).

As mscotgrove stated, it is impossible to actually validate anything more complex than a very simple program (which, BTW, in your classification would be the one that would not need validation at all).
And there is in any case the issue - growing with the complexity of the analysis - of validating the operator.

It all comes from a false (or rather, inappropriate) perception:

namely, that forensics (digital as well as its other branches) is either an industry (it is not), to which you can apply ISO 9001 (which is not about "levels of quality" but rather about "constancy of quality", leveraging the implied repetition of industrial manufacturing), or a laboratory (which it also is not), to which you can apply ISO 27000 (as what you use in a laboratory are measuring instruments that work along known principles, can be objectively tuned/calibrated, and in the end measure physical quantities).

Of course, bits and pieces of both standards can be applied to digital forensics, but only as "generic procedures" or to some (the easiest, best-known, best-documented) parts of the processes involved.

The fact is that, IMHO, a forensic analyst is more than anything else an artist (or rather, an artisan), and the *whatever* he/she produces in the end is often outside any pre-set scheme, or on the border of (or beyond) new discoveries.

Let's say that researcher A finds a new kind of artifact that may (say) give the exact location of a device at a given time (let's imagine that this fictional example is about an undocumented log of the GPS part of a phone, written in an encrypted form).
Let's also say that the researcher managed to run more than a few tests with the data he had and verified his decryption algorithm.
Now let's say that he is from Norway and that his results are repeatable in Norway and in all of the EU.
Someone (a professional forensic body or state regulator) may well decide (something like 3 or 4 years later, i.e. when two or three later generations of new phones are commonly in use) to "validate" the method.
Some US investigators may decide, after the EU validation (not before it), to use the program or algorithm.
The corresponding US regulator approval/validation is likely to arrive (if ever) another two years later. In the meantime a few people have been jailed and/or convicted thanks to the data thus acquired (or acquitted on the basis of those same data), when another case makes it evident that the EU algorithm is not valid in the US because - still, say - *something* is lost in converting metric to imperial, and the location data has a bias or offset of 2^12 furlongs (or possibly 3^11 rods) ;)

So who is wrong here?
The researcher from Norway, the EU regulator, the US investigators or all three?
And what about the n EU investigators who decided to wait those three years before using the novel technique/tool?
How many people will have been convicted when innocent, or acquitted when guilty, in the meantime?

jaclaz

 
Posted : 19/10/2015 5:54 pm
pcstopper18
(@pcstopper18)
Posts: 60
Trusted Member
Topic starter
 

mscotgrove

As a quick FYI, I have no issue with vendors, small or large, commercial or open source alike, and meant no slight with my secrecy statement. It was only my observation and opinion that the fundamental priorities of those who make the tools differ from those of criminal justice systems. I also know that for most, copyrights and patents don't do anything to "protect," only to punish and hopefully reimburse. That very situation is just an additional issue that compounds the problem. Other industries can be very cutthroat in this. It is in your best interest, and others', to protect your secrets. I am not ignorant of that fact. I purely wanted to communicate an ideal in a flawed world.

jaclaz

“Perfect” validation does not exist. I agree. However, it is definitely possible to validate a single function (purpose and complexity of the function aside). The lack of perfect validation doesn't undermine those who believe in the concept, only the extent to which they think it can be applied.

The classification I described said nothing of simplicity vs. complexity. It was “forensic” (whatever one determines that to be) vs. interpretive or analytical (whatever that is determined to be).

You forgot ISO 17025. :D There are in fact some things that are very "industry" and "laboratory" in application within forensics. Testing in disciplines such as toxicology is very repetitive in nature, and the instruments have to be calibrated and such. So those standards can be applied (however perfectly or imperfectly). Digital, on the other hand, is what doesn't fit. And I would agree that many of the standards are inappropriate in their application to digital forensic practice.

Your example scenario highlights only the inconvenience of validation (in this context), not whether the notion is feasible. I would say no one in that scenario was "wrong." If they all did as described, they were doing what had been deemed appropriate for their "space." The "what if" must be weighed appropriately, as many people can be and are convicted or acquitted "in the meantime" regardless of what is done and how it's done, just like with every other possible law, cure, etc.

With all that in mind, IYO, is there a more appropriate approach? Is the application of standards problematic in general, or just the ones you named? This matters, of course, since the answer to my original question depends on one's foundational understanding/position.

 
Posted : 19/10/2015 7:00 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

We - unfortunately - need to go by example.

You can validate (to a certain extent) a simple tool that transforms (say)
0x56250490 into GMT: Mon, 19 Oct 2015 14:56:16 GMT
but only once you have validated the notion that the hex number is represented "normally" (while what you see with a hex editor would be 90 04 25 56 on Intel systems) and that the field actually represents an epoch time.

Though there will always be the risk that for some "strange" date the tool under validation won't work, it would be reasonable to expect it to work for "valid" dates (let's say dates up to ten years in the past and ten years in the future).
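
As a concrete sketch of such a simple, checkable tool (in Python; the byte string and the plausibility window are just the figures from this example):

```python
import struct
from datetime import datetime, timedelta, timezone

raw = b"\x90\x04\x25\x56"  # the bytes as seen in a hex editor on Intel systems

epoch = struct.unpack("<I", raw)[0]  # little-endian uint32 -> 0x56250490
stamp = datetime.fromtimestamp(epoch, tz=timezone.utc)
print(hex(epoch), stamp)   # 0x56250490 2015-10-19 14:56:16+00:00

# The "reasonable dates" check from above: flag anything outside a
# window of ten years either side of the examination date.
now = datetime.now(timezone.utc)
print(abs(now - stamp) < timedelta(days=3653))
```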

But let's say that this four-byte field is undocumented, that a proprietary tool has found out it is an epoch timestamp used to "tag" an entry in a proprietary database format (and what you find on disk is actually - still, say - that thingy ROT13'd or XORred), and that the field is normally all 00's unless the *whatever* app has crashed within 5 minutes of updating the database.

No one else has this piece of information, and though another tool (open source or proprietary) can actually read other fields of the database, that particular field is marked "unknown (always zero)".

You may analyze tens or hundreds of such databases successfully before finding a case where that field is non-zero; when it happens, the first tool will provide a valid result, while the second may well go astray or provide a "false" result.
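
To make the two tools concrete, here is a purely hypothetical sketch (the XOR key, field layout, and crash behaviour are all invented for illustration): tool A knows the undocumented obfuscation, tool B only knows the public documentation.

```python
import struct
from datetime import datetime, timezone

XOR_KEY = 0x5A  # hypothetical, undocumented obfuscation key

def tool_a(field: bytes) -> str:
    """Knows the proprietary finding: the field is an XORred epoch timestamp."""
    if field == b"\x00\x00\x00\x00":
        return "empty"
    epoch = struct.unpack("<I", bytes(b ^ XOR_KEY for b in field))[0]
    return str(datetime.fromtimestamp(epoch, tz=timezone.utc))

def tool_b(field: bytes) -> str:
    """Only knows the public docs for the database format."""
    return "unknown (always zero)"

# A record written after a crash: both tools were "validated" on the usual
# zero-filled samples, but only tool A gives a meaningful answer here.
crash_field = bytes(b ^ XOR_KEY for b in struct.pack("<I", 1445266576))
print(tool_a(crash_field))  # 2015-10-19 14:56:16+00:00
print(tool_b(crash_field))  # unknown (always zero)
```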

Though not really the same thing, the work by Jonathan Grier
http://www.forensicfocus.com/Forums/viewtopic/p=6560130/#6560130

and - to a much lesser degree - my somewhat related observations (more questions than answers, JFYI)
http://www.forensicfocus.com/Forums/viewtopic/t=13018/
http://reboot.pro/topic/19746-queer-ntfs-andor-xp-behaviour/

represent something that is still, if not "unknown," at least "largely debatable," and that pertains to the most-used filesystem and operating systems since 1995 or so. Or, if you prefer: after 20 years or so we are still finding queer things that are not fully explainable, not fully documented, and not even entirely "reliable" or "reproducible".

jaclaz

 
Posted : 19/10/2015 8:43 pm
pcstopper18
(@pcstopper18)
Posts: 60
Trusted Member
Topic starter
 

jaclaz

Where are you going with this example? I can't tell if you are responding to my last post or if this is a similar example in the vein of your last post and you're making a point…

 
Posted : 19/10/2015 10:14 pm
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

jaclaz

Where are you going with this example? I can't tell if you are responding to my last post or if this is a similar example in the vein of your last post and you're making a point…

The point was made before; the example in my last post was more about how you could validate the one tool or the other - i.e., in practice, which of the two tools would you validate?

One tool is almost always right (though not 100% accurate), and it is possible/probable that you or someone else will validate it given a number of known, documented prerequisites/assumptions.

The second one will also validate with the same prerequisites/assumptions on the same test data sets.

So you will have two validated tools that in some cases (not common, but not as rare as the pair of numbers in the earlier Excel-related example) will give you different results.

What is the actual value of the validation?

You have two validated tools, and you cannot say (without further information) which of the two provides the right result, while on every other data set they both produce the same result.

Which one will you choose?

jaclaz

 
Posted : 19/10/2015 11:20 pm