
Forensic vs. Interpretive or Analytical Tools?

29 Posts
6 Users
0 Reactions
3,589 Views
jaclaz
(@jaclaz)
Illustrious Member
Joined: 18 years ago
Posts: 5133
 

Not a “committee” but rather an actual organization/entity.

Sure :), but there was a reason why I used the term "committee" ;). We now have a dedicated site for making one's own tree-swing cartoon:
http://projectcartoon.com/

There are organizations that demonstrate it can be done more timely. The Defense Cyber Crime Center has a research arm, the Defense Cyber Crime Institute. DCCI does, among other research things, tool validations. Their validations are more extensive and much faster than NIST (though not perfect of course). Unfortunately for everyone else, they are only available to DoD and Federal LE and intelligence.
I don’t think any good solution can ever be bleeding edge, but keeping up can be done if it is uniquely purposed and open for all.

More timely unfortunately doesn't mean "timely enough".

The Excel example posted before is a classic case: the (very few) people who until then had successfully multiplied 77.1*850 suddenly couldn't anymore, for the whole period between the release of the new software version (end 2006/beginning 2007) and the discovery of the bug (September 2007).

Imagine all the missiles that went 36 miles off target in those 9 months, all the over-invoicing in the world, etc., IF Excel had actually been used for "mission critical" calculations.
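Just for the record, the arithmetic itself is trivial; a quick Python sketch (any language would do) shows the underlying value was never the problem, only how Excel 2007 formatted it:

# The multiplication that exposed the Excel 2007 display bug.
x = 77.1 * 850
print(x)           # a hair under 65535 (ordinary binary floating point)
print(f"{x:.2f}")  # 65535.00 -- what a correct display layer shows;
                   # Excel 2007's formatting code rendered it as 100000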

There is a reason why *somewhere* in the EULA the good MS guys put a clause denying responsibility for the results of calculations made in Excel.

The only way out would be pre-validation, i.e. the organization should examine and validate the software tool before the software house sells it to its clients (and the same should happen for each and every update).
More or less, this would equate to strictly regulating the commerce of such software, or restricting use in Court to only the (possibly very few) actually validated tools (including obsolete versions, possibly missing "new features" that software on the unregulated market would provide).

Possible but unlikely to ever happen for obvious reasons.

And you probably would still have to validate the actual forensic examiner and the way the software is used (far beyond what ISO 9000/17000/27000/whatever require) …

jaclaz


   
pcstopper18
(@pcstopper18)
Trusted Member
Joined: 15 years ago
Posts: 60
Topic starter  

jaclaz

More timely unfortunately doesn't mean "timely enough".

Well of course. Technically speaking, anything happening after the release of the software is not timely enough, as you point out. The only conclusion is a pre-release validation scenario. Ideally, this is done by a party independent of vendor influence. This could solve the timeliness of catching issues, but not update-related ones. It would still be problematic if the validator couldn't keep up with the need to have updated software readily available.


   
(@Anonymous 6593)
Guest
Joined: 17 years ago
Posts: 1158
 

You don't validate a tool to then use it on other carefully created data … "fabricated data", as jaclaz put it. You validate it to support its use in an uncontrolled environment, or in a controlled environment with uncontrolled data.

We seem to differ as to terminology. My view of 'validate' is that it takes place only on … 'fabricated data', if you like. The 'validation' is either of a claim of the tool manufacturer ('This tool does X') or of a strong expectation of the tool user ('This tool should do X'), and the validation verifies that it does so within some limits of acceptability. It may even verify what behaviour the tool shows on non-X information, or X-data that lies in illegal or ambiguous areas. (Example: what does tool X do with incorrect ISO 9660 time stamps, say one that specifies month 43 of a year (formally illegal), or one that specifies February 30th as a date (legal according to ISO 9660, but not a date that can be directly compared with normal dates)? I've seen such dates on a few CDs, almost certainly a bug in the CD-mastering software. Even if it's illegal, I want tools to tell me that there's something wrong, not silently translate February 30th into March 2nd.)
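To make the 'tell me, don't normalize' point concrete, here is a minimal sketch (Python; the function name and reporting format are my own invention) for the 7-byte ISO 9660 directory-record timestamp:

from datetime import datetime

def decode_iso9660_timestamp(raw: bytes) -> str:
    # 7-byte ISO 9660 directory-record timestamp: years since 1900, month,
    # day, hour, minute, second, GMT offset in 15-minute units (signed byte).
    if len(raw) != 7:
        raise ValueError("expected 7 bytes")
    year = 1900 + raw[0]
    month, day, hour, minute, second = raw[1], raw[2], raw[3], raw[4], raw[5]
    offset = raw[6] - 256 if raw[6] > 127 else raw[6]

    if not 1 <= month <= 12:
        # month 43 and friends: formally illegal -- say so, loudly
        return f"ILLEGAL timestamp (month {month}), raw bytes {raw.hex()}"
    try:
        dt = datetime(year, month, day, hour, minute, second)
    except ValueError:
        # February 30th and similar: report it, do not roll it into March 2nd
        return (f"SUSPECT timestamp {year:04d}-{month:02d}-{day:02d} "
                f"{hour:02d}:{minute:02d}:{second:02d}, raw bytes {raw.hex()}")
    return f"{dt.isoformat()} (offset {offset * 15:+d} min from GMT)"

With something like this, a recording date of February 30th comes back flagged as suspect instead of silently becoming March 2nd.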

Wild data is … 'wild'. There is no assurance that it covers the domain of X sufficiently well. For a tool that examines FAT12/16, say, there's no evidence that it covers different implementations of FAT12/16. Or, say, that it covers the FAT profile known as ECMA-107. Or that it can deal with an MBR or VBR that doesn't follow expectations, but can still be executed by the standard boot process. All that can be hoped for is that it is statistically representative.

Wild data can be very useful for discovering variations in behaviour, but then it has to have some claim to being statistically representative. For CD time stamps, I would like to see a corpus of a few thousand ISO images, or so, from different mastering programs. Something like https://archive.org/details/cdbbsarchive.
And wild data can be useful when the tool totally chokes – like some tools do (or did) on a Windows Server 2012 R2 file system on which data deduplication has been enabled – provided there is some understanding of what is going on.

And wild data cannot really be used for testing. A test is not 'let's run it and see if it works reasonably well'. It is … or should be … the production of a test protocol, which can then be evaluated in a context. That either presupposes a protocol, or sufficient information for the tester to create one.

… wouldn’t NIST’s data sets and Digital Corpora qualify?

Thanks for mentioning these!

The NIST data sets are very, very close to what I would like to see – indeed they may even reach or surpass it, at least as far as the principles they mention go. I find no note that I've looked at this site before, so thanks again for mentioning it.

The file carving tests seem to include almost all the information that I think is necessary. There seem to be areas left out (possibly deliberately), so I would like to see some kind of design philosophy: why these particular files, why these particular transformations, and what should also have been tested but was not possible or convenient to do at the time. A tool to create these kinds of images would add to the utility, and would allow testers to create tests that do not lend themselves to … call it 'pre-meditation' (e.g. 'evil file carving tool checks the image hash, realizes it's a known NIST test image, and simply spews out a prepared report').

If the File Carving tests are representative of the entire site, this looks very good. (I will spend some time going over this site.)

The Digital Corpora are, as far as I can see, still 'wild data', and so useless for the purposes I have in mind. There's no way to look at file Y in an NTFS image and see that it should be a junction point, for example.

It is probably useful for education, … which seems to be the reason why it exists. But it doesn't seem useful for *testing* a tool. Perhaps for dry-testing a methodology or investigative procedures.

But then I have a software development background, so I carry a lot of testing methodology ideas from there.


   
jaclaz
(@jaclaz)
Illustrious Member
Joined: 18 years ago
Posts: 5133
 

Maybe we can put it this way (I am not sure there is that much difference between "validation" and "verification"; as jhup stated earlier they are interconnected, and they seem to me like two sides of the same coin).

Validation is a process through which we make sure that the tool creates a given "expected" result from analyzing "expected" (or "standard" or "common" or "compliant") datasets.

Verification is a process in which a single result is verified to be correct by comparing it to the result of another tool (when/if such an alternative tool exists) or to a manual calculation (when/if the information to do so is available).

As I see it both processes are valid and needed/useful as long as both conditions apply:

  1. the dataset is "expected" (or "standard" or "common" or "compliant")
  2. the correct result is known or can be calculated (as said, through one or more other tools or manually)

If both conditions hold, everything is fine and dandy.
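In pseudo-practical terms, a throwaway Python sketch (tool names, dataset names and expected counts are all made up, only to pin the two terms down):

import subprocess

# Hypothetical reference datasets: image -> expected result (here a file
# count), established beforehand manually or with an independent tool.
REFERENCE_SET = {
    "fat16_reference.dd": 42,
    "ntfs_reference.dd": 17,
}

def validate(tool_cmd):
    # Validation: run the tool on "expected"/compliant datasets and check
    # that it reproduces the known results.
    ok = True
    for image, expected in REFERENCE_SET.items():
        out = subprocess.run([tool_cmd, image], capture_output=True, text=True)
        found = int(out.stdout.strip())  # assumes the tool prints a file count
        if found != expected:
            print(f"FAIL on {image}: expected {expected}, got {found}")
            ok = False
    return ok

def verify(result_a, result_b):
    # Verification: cross-check a single result against another tool's
    # output or a manual calculation.
    return result_a == result_b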

But the issue remains, and as soon as the dataset is in any way "unexpected" (or "non-standard" or "uncommon" or "non-compliant") what happens is "a suffusion of yellow".

Reference for those not familiar with Douglas Adams' books:
http://www.thateden.co.uk/dirk/

There is no way to know in advance how a given tool (particularly a "closed source" one) will behave with "unexpected" data, because the basic nature of something unexpected is its unexpectedness: NOBODY expects the Spanish Inquisition!

Reference for those not familiar with Monty Python's sketches:
https://en.wikipedia.org/wiki/The_Spanish_Inquisition_(Monty_Python)

So the point might be that a "validated tool", i.e. one that passes all tests with expected data, has a better probability of dealing correctly with the unexpected than a "non-validated" tool. While this has some grounds, in the sense that a tool that passes validation is likely to have been carefully and accurately written/programmed, it doesn't really hold up against the exception.

As another common example, software that has been running on the internet for years or decades and is considered reliable is one day found to be vulnerable to hacks, very often related to something as simple as a string parsing error or similar, but in any case triggered by "unexpected" data. In the case of the famous Heartbleed bug it was the non-standard length of the "echo" request, as xkcd nicely explained:
http://xkcd.com/1354/
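A toy sketch of that class of error (Python, not the actual OpenSSL code, just the shape of the mistake and of the defensive alternative):

def handle_echo_request(packet: bytes) -> bytes:
    # First two bytes: length of the payload, as claimed by the sender.
    claimed_len = int.from_bytes(packet[:2], "big")
    payload = packet[2:]

    # Buggy shape: trust claimed_len and echo that many bytes back.
    # In Heartbleed this read past the real payload into adjacent memory.
    #   return buffer[2:2 + claimed_len]

    # Defensive shape: reject "unexpected" input instead of guessing.
    if claimed_len != len(payload):
        raise ValueError(f"claimed length {claimed_len}, "
                         f"actual payload {len(payload)} bytes")
    return payload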

jaclaz


   
pcstopper18
(@pcstopper18)
Trusted Member
Joined: 15 years ago
Posts: 60
Topic starter  

athulin
You are very welcome sir!

jaclaz
I wholeheartedly agree. I might even use some of your verbiage in my own notes. Hope you don't mind ;-)

I think conceptually we are all on the same page. I have been able to glean a great deal. As with most things in life, one's perspective (practitioner vs. programmer vs. researcher, etc.) informs one's take on this. Now it's just a matter of: to what extent can this be done from a practical standpoint? What extent will withstand legal scrutiny, and what extent is congruent with the "science"?

In a perfect situation, I would think a comprehensive protocol for validating everything a vendor "says" a tool does would be in order. As reality dictates, this is infeasible; I don't even recommend it. Who has the time and the resources? (If you do, then by all means raise the bar.) With those being limited, what is sufficient? Should software testing by the vendors be scrutinized and taken to task?

Right now, I think function-based testing of only those things you "actually" use should be sufficient. Or, as noted in my OP, you say XYZ functions (independent of tool) are "forensic" and only do those?

Would a call for less complex tools also be a solution? Stop making massive suites and stick to more streamlined functionality perhaps?


   
jaclaz
(@jaclaz)
Illustrious Member
Joined: 18 years ago
Posts: 5133
 

I might even use some of your verbiage in my own notes. Hope you don't mind ;-)

Of course not :), it's rather an honour for a non-native English speaker like me that my "verbiage" is considered "reproducible".

Would a call for less complex tools also be a solution? Stop making massive suites and stick to more streamlined functionality perhaps?

As a dinosaur coming from the DOS era I would obviously approve of such a move; it would surely make things easier for the validation process, as you could validate single parts or "commands" more easily.
At the end of the day the "suite" would be nothing but a batch file issuing a given set of more elementary commands in sequence, and (in a perfect world) each single command/tool could be replaced by another one performing the same function, but I doubt that any of the software makers would be interested in doing something like that.
If you look at the many small .exe's from (say) Mares or Tzworks, that is seemingly the idea behind them, and in that case the "suite" is just the digital forensic investigator's knowledge/experience, but we risk going off topic, landing on the "artist/artisan" nature of the procedure vs. the "one button forensics" one. 😯

jaclaz


   
MDCR
(@mdcr)
Reputable Member
Joined: 15 years ago
Posts: 376
 

Like the conversation, keep it up.

Just going to throw a few branches on the fire to keep it going:

1. I ran into a timestamp from the future (totally unrelated to Doc and Marty). I thought: hmm… is there something wrong with the date parser? Is the database field set to the right format? But I ended up realising that the clock battery on that specific computer was broken and the clock was producing weird datestamps. So, do not dismiss the tool before you validate the data by looking at it.

2. A little request to you developers out there: I would appreciate it if programs written for forensic purposes told you when they are operating outside their normal parameters, i.e. a program written for processing data on pre-version-5 NTFS partitions should tell me that this is the case, so I would know that the results may not be correct. (That was just an example; I've actually not found any problems with NTFS 5 and any tools, but you get my point.)
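Something along these lines would already be enough (a hypothetical Python sketch with made-up version numbers):

import logging

# Versions this (hypothetical) parser was actually written and tested for.
TESTED_NTFS_VERSIONS = {(3, 0), (3, 1)}

def check_operating_range(major: int, minor: int) -> None:
    # Warn loudly when asked to parse a file system version the tool was
    # never validated against, instead of silently producing output.
    if (major, minor) not in TESTED_NTFS_VERSIONS:
        logging.warning(
            "NTFS version %d.%d is outside the tested range %s; "
            "results may be incomplete or incorrect.",
            major, minor, sorted(TESTED_NTFS_VERSIONS))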


   
pcstopper18
(@pcstopper18)
Trusted Member
Joined: 15 years ago
Posts: 60
Topic starter  

jaclaz
Well, sometimes someone does a better job of putting things into words in a particular instance, and I have found that when that happens, you should make use of it. :)

Yeah, from a validation perspective that may be an ideal setup for forensic tools. Unfortunately, many prefer the suites for ease of use. I have no issue with that; however, if there is anything to be learned from our discussion, it is that there is more going on behind the scenes and in the greater overall picture, and that "ease of use" can sometimes, and for some people, be a "convenience" at best and a "crutch" at worst … and for a variety of reasons. Validation, especially in this discussion/context, is one of those reasons.

MDCR
Good points. As noted earlier in this discussion, "failing loud" is definitely a preferred capability that is not universally implemented. It would even help address the validation question if tools could tell you, in some fashion, "this was found and it fits what I know; this was found and I don't know what it is." Any variation or expression of this capability would go a long way. Some do, many don't.


   
(@mscotgrove)
Prominent Member
Joined: 17 years ago
Posts: 940
 

Would a call for less complex tools also be a solution? Stop making massive suites and stick to more streamlined functionality perhaps?

As a developer, simple tools are easier to develop and verify. However, which group of functions do you leave out?

Lots of small apps is one approach, but many programs will need the same library of features (logging, drive selection, error reporting etc) that all need maintaining. One may need to update 20 programs, rather than maintain just one.

If a customer (the person who pays the bills) wants a new feature, it is often easier to add it to the main program than to create a new app, and month by month we end up with bigger programs with many complex features. With parallel processing, multiple tasks can be carried out at the same time, e.g. one can image and do keyword searching in a single disk pass. This should be quicker than two apps doing two disk passes.
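Roughly what I mean, as a toy single-pass sketch in Python (made-up keyword list; matches that span chunk boundaries are ignored for brevity):

import hashlib

KEYWORDS = [b"invoice", b"passwd"]  # made-up search terms

def image_and_search(src_path, img_path, chunk_size=1 << 20):
    # One pass over the source: every chunk is written to the image,
    # hashed for verification, and scanned for keywords.
    md5 = hashlib.md5()
    hits = []
    offset = 0
    with open(src_path, "rb") as src, open(img_path, "wb") as img:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            img.write(chunk)
            md5.update(chunk)
            for kw in KEYWORDS:
                pos = chunk.find(kw)
                if pos != -1:
                    hits.append((kw, offset + pos))
            offset += len(chunk)
    return md5.hexdigest(), hits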


   