Child Exploitation Hash Sets

  

Chris_Ed
Senior Member
 

Re: Child Exploitation Hash Sets

Posted: Sep 16, 16 06:30

While it is of course programmatically easy to change a file in order to generate a new hash, there will always be enough uncertainty about detection that huge efforts won't be made in this regard. With a readily available hashset, there is no uncertainty.

If the hashsets were made publicly available I would be utterly shocked if there weren't sites on Tor which, within 24 hours of this availability, would guarantee (and advertise) that their entire collection of material is not found in any LE hashset.

Hash sets are not entirely ideal, yes, and perhaps with a large enough collection of child abuse images then we could effectively train a machine to spot them with decent accuracy, but for right now it is still the fastest way to detect this sort of stuff.

I work in the private sector and I can appreciate the frustrations, but IMO there are some things which rightfully should remain in the LE domain.  
 
  

PaulSanderson
Senior Member
 

Re: Child Exploitation Hash Sets

Posted: Sep 16, 16 06:58

If you are developing a tool that you want to work with a particular hash set then, as others have said, you don't need the hashes themselves.

The format of the hashset(s) is all you need and I see no reason why they can't be publicly available.
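
As a rough illustration of that point, a tool can be written and tested against the file format alone, using hashes of harmless test data. This is only a minimal sketch: the format assumed here (one hex MD5 digest per line, with `#` comment lines) is hypothetical, and the function names are made up for the example. Real hash set formats differ and are not described in this thread.

```python
import hashlib
import io


def load_hashset(lines):
    """Parse a hypothetical hash set: one hex MD5 per line, '#' starts a comment."""
    hashes = set()
    for line in lines:
        entry = line.strip().lower()
        if entry and not entry.startswith("#"):
            hashes.add(entry)
    return hashes


def md5_of(data: bytes) -> str:
    """Hex MD5 digest of a byte string."""
    return hashlib.md5(data).hexdigest()


def find_matches(files, hashset):
    """Return the sorted names of files (name -> bytes) whose MD5 is in the set."""
    return sorted(name for name, data in files.items() if md5_of(data) in hashset)


# Stand-in data: the "hash set" is built from harmless test bytes,
# so the matching logic can be exercised without any real hashes.
test_hashset = io.StringIO("# test set\n" + md5_of(b"known test file") + "\n")
hashset = load_hashset(test_hashset)
files = {"a.bin": b"known test file", "b.bin": b"something else"}
print(find_matches(files, hashset))
```

Only `load_hashset` would need to change to support a different format; the lookup itself is a plain set membership test.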
_________________
Paul Sanderson
SQLite Forensics Book
www.amazon.com/SQLite-...entries*=0

Forensic Toolkit for SQLite
sandersonforensics.com...for-SQLite 
 
  

jaclaz
Senior Member
 

Re: Child Exploitation Hash Sets

Posted: Sep 16, 16 11:09

- Chris_Ed

Hash sets are not entirely ideal, yes, and perhaps with a large enough collection of child abuse images then we could effectively train a machine to spot them with decent accuracy, but for right now it is still the fastest way to detect this sort of stuff.

Allow me to partially disagree.
Image recognition is already accurate enough to identify images, maybe not as accurate as one might like it to be, and possibly the real issue is processing power/times.
But one could use the approach to EXCLUDE "not-CP" images.
I mean you have to analyze 10,000 images.
You pass them through a program that recognizes categories, like "panorama", "buildings", "trees" etc. or more generally any image that does not contain a human figure, what remains is further analyzed.
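
A minimal sketch of that exclusion step, assuming the labels have already been produced by some image classifier (not shown, and not any particular product). The label names and the `triage` function are made up for the example. Anything the classifier is not clearly sure about stays in the review pile.

```python
# Hypothetical labels that mark an image as clearly irrelevant.
EXCLUDE_LABELS = {"panorama", "buildings", "trees"}


def triage(labelled_images):
    """Split (name, label) pairs into an excluded list and a review list.

    Only labels explicitly listed in EXCLUDE_LABELS are excluded; unknown
    or unexpected labels are kept for further analysis.
    """
    excluded, review = [], []
    for name, label in labelled_images:
        if label in EXCLUDE_LABELS:
            excluded.append(name)
        else:
            review.append(name)
    return excluded, review


sample = [
    ("img1.jpg", "trees"),
    ("img2.jpg", "person"),
    ("img3.jpg", "buildings"),
    ("img4.jpg", "unknown"),
]
excluded, review = triage(sample)
print("excluded:", excluded)
print("review:", review)
```

The choice to exclude only on an explicit allow-list of labels is deliberate: a misclassified image then costs review time rather than being silently dropped.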

There is this site by Wolfram (the people that make Mathematica) that (at least to me) is impressive:
www.imageidentify.com/

And even something *like* google image search, once provided with a large enough set of images would be capable of doing at least the exclusion.

It seems to me like we are already a generation ahead of what was discussed here a few years ago:
www.forensicfocus.com/...ic/t=9693/

jaclaz
_________________
- In theory there is no difference between theory and practice, but in practice there is. - 
 
  

Chris_Ed
Senior Member
 

Re: Child Exploitation Hash Sets

Posted: Sep 16, 16 11:19

That's actually a really good idea - I hadn't considered using it to exclude irrelevant stuff. It could even give you a breakdown of the general subjects so that you could dip test, if you wanted. Like:

54043 - cats
3334 - cars
849332 - flowers

It still has the problem of requiring a significantly-sized server farm to do this in any sort of reasonable time, so it's probably out of the question for many provincial police forces, but on a nationwide scale..  
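
The breakdown and dip test could look something like the sketch below, assuming each image already carries a label from a classifier (which is the expensive part and is not shown). The function names and sample data are made up for the example.

```python
import random
from collections import Counter, defaultdict


def breakdown(labelled_images):
    """Count how many images fall under each label."""
    return Counter(label for _, label in labelled_images)


def dip_sample(labelled_images, per_category, seed=0):
    """Pick up to per_category random images from each label for a manual check."""
    rng = random.Random(seed)  # fixed seed so the sample can be reproduced
    by_label = defaultdict(list)
    for name, label in labelled_images:
        by_label[label].append(name)
    return {
        label: rng.sample(names, min(per_category, len(names)))
        for label, names in by_label.items()
    }


images = [("a.jpg", "cats"), ("b.jpg", "cars"), ("c.jpg", "cats"), ("d.jpg", "flowers")]
for label, n in breakdown(images).most_common():
    print(n, "-", label)
print(dip_sample(images, per_category=1))
```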
 
  

jaclaz
Senior Member
 

Re: Child Exploitation Hash Sets

Posted: Sep 16, 16 12:48

- Chris_Ed
That's actually a really good idea - I hadn't considered using it to exclude irrelevant stuff. It could even give you a breakdown of the general subjects so that you could dip test, if you wanted.

That is the good thing about exchanging ideas :) Everyone may have a different way of seeing the same thing. As an example, I saw the real drawback of a "public CP hashset" in a totally opposite way:
- Chris_Ed

If the hashsets were made publicly available I would be utterly shocked if there weren't sites on Tor which, within 24 hours of this availability, would guarantee (and advertise) that their entire collection of material is not found in any LE hashset.


I would expect to soon find on the "normal" web all kinds of images, including and especially demotivational posters and lolcats, modified to compute a hash included in the CP hashset ....
... creating tens, hundreds, thousands, millions of false positives on all computers ... :o

jaclaz
_________________
- In theory there is no difference between theory and practice, but in practice there is. - 
 
  

tracedf
Senior Member
 

Re: Child Exploitation Hash Sets

Posted: Sep 16, 16 15:53

- jaclaz

I would expect to soon find on the "normal" web all kinds of images, including and especially demotivational posters and lolcats, modified to compute a hash included in the CP hashset ....
... creating tens, hundreds, thousands, millions of false positives on all computers ... :o

jaclaz


That's not possible given any of the currently known attacks on MD5 or SHA-1. There are two basic criteria for a hash function:

1) It should not be feasible to find two inputs with the same hash (a collision).
2) Given a hash, it should not be feasible to find an input that produces that specific hash.

MD5 fails at #1; there are known collisions. Researchers have been able to produce collisions in SHA-1 (using 64 GPUs for 10 days), but only if they can pick the Initialization Vector which, in practice, you cannot.

Collisions, which violate criterion #1, are different from what you need for criterion #2 because you don't have to match a specific hash. If you want to produce two messages, A and B, that produce the same hash value X, you are free to modify both A and B. They don't have to land on any specific value as long as they are the same.

Think of the difference this way:

1) Find two people that have the same birthday.
2) Find someone else that has MY birthday.

What you are talking about is an attack on criterion #2. Given known hash values (from the CP hashset), find additional images that produce the same hash values. There is no currently known attack that would allow you to do this for MD5 or SHA-1.
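
The gap between the two problems can be seen on a toy scale. The sketch below is only an illustration: it uses a deliberately weak 16-bit hash (the first two bytes of SHA-256, my own choice for the demo) so that both brute-force searches finish quickly, and the function names are made up. It says nothing about real attacks on MD5 or SHA-1.

```python
import hashlib
from itertools import count


def toy_hash(data: bytes) -> int:
    """First 16 bits of SHA-256; deliberately weak so the searches finish quickly."""
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")


def find_collision():
    """Problem 1: any two inputs with the same hash. Returns (a, b, attempts)."""
    seen = {}
    for i in count():
        msg = b"msg-%d" % i
        h = toy_hash(msg)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


def find_match_for(target: int):
    """Problem 2: an input that hits one specific hash. Returns (msg, attempts)."""
    for i in count():
        msg = b"try-%d" % i
        if toy_hash(msg) == target:
            return msg, i + 1


a, b, collision_attempts = find_collision()
target = toy_hash(b"my birthday")
match, match_attempts = find_match_for(target)
print("collision found after", collision_attempts, "attempts")
print("specific hash matched after", match_attempts, "attempts")
```

For an ideal n-bit hash, brute force needs on the order of 2^(n/2) attempts for problem 1 and 2^n for problem 2. With 16 bits that is a few hundred versus tens of thousands on average; with a 128-bit or 160-bit digest the second number is far out of reach without a structural weakness in the algorithm.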

SHA-2 and SHA-3 (which I haven't seen used in forensics as far as I can recall) are currently safe against #1 and #2.

Without a serious advance in the cryptanalysis of MD5 and/or SHA-1, poisoning the hashset by producing non-CP images that match the CP hashset is an impossibility. If such an advance occurs, all forensic products and hash sets would need to move away from MD5 and adopt another hash algorithm.

-Steven  
 
  

tracedf
Senior Member
 

Re: Child Exploitation Hash Sets

Posted: Sep 16, 16 15:57

- Chris_Ed

If the hashsets were made publicly available I would be utterly shocked if there weren't sites on Tor which, within 24 hours of this availability, would guarantee (and advertise) that their entire collection of material is not found in any LE hashset.



Any of them could do this now. There's no reason to have the hash set; they just need to make each image unique by modifying at least one bit of the image. They could build a web app to serve up the images and modify one pixel at random each time the image is accessed; every download would be unique and therefore every hash value would be unique.

-Steven  
 
