Griffeye 101: A Crash Course In Visual Media Investigations
Join the forum discussion here.
View the webinar on YouTube here.
Read a full transcript of the webinar here.
Rey Navarro: Hello everybody. We haven’t started the webinar yet. This is Rey Navarro with Griffeye, and I see that we got a number of folks on, some early in the morning, some in the afternoon, some late into the evening. I just want to see if all of you can hear me just fine, if my audio is coming in, and that you see the Griffeye 101 splash screen up on your screen.
Excellent, excellent. I’m getting a lot of confirmations, so that is great.
We’re going to give a couple of folks some more time to log in, and start at 10 AM sharp. I see Thailand, I see Switzerland, [France], good old USA, [02:43], Netherlands. All across the world, so this is excellent. Iceland, Canada. Welcome to everyone.
So, we will give everyone some time to get in, and then we’ll begin. We have a minute to go. So, bear with me and we will begin.
[03:08] the shout-outs – New England, Poland, North Carolina, Thailand again, the UK, Russia, Las Vegas! Right. Wisconsin, India, Nebraska, Connecticut. I’m based out of DC, so it’s mid-morning for me, and I look forward to teaching you guys hopefully something new this morning.
Sound has stopped. Don’t worry, I’m still here.
So, it is 10 AM sharp here, on the east coast. I’m based out of DC. This is Rey Navarro, I’m with Griffeye. Senior Director for North America, and I appreciate everyone that has logged in for this morning’s WebEx and webinar on Griffeye. This is Griffeye 101, a crash course in visual media investigations.
So just one more check – if I could get a couple more confirmations that you can see Griffeye 101 on your screen and that you hear me loud and clear.
Excellent, I’m getting those confirmations now. As you all can see, you’ve got our chat up on the screens for today’s format.
Because I can’t hear you all, I would like to definitely open up the presentation, the WebEx to questions, but save those for the end. The presentation and the practical portion of the WebEx is going to take about an hour, and then I’m going to save the last half hour to try to field as many questions as I can keep up with, assuming that you all may have similar questions.
To give you a brief introduction to myself, I’m the Senior Director for North America. Taking care of things primarily out of the United States, but taking care of Canada, US, and Mexico. And we’ve got a number of folks around the world joining us today. Again, some will be early hours of the morning, some late into the evening, across the entire world. Last time I checked, we had just over 300 or so people at least registered a couple of days ago, so … love that we’ve got great attendance.
With that, I shall begin. Some of you may or may not be familiar with Griffeye, but what we try to help law enforcement with is a platform to investigate visual media – we’re talking about pictures and videos. This is the premier platform for that. And so, why do we do that?
Well, pictures and videos are being produced on a massive scale, and it’s just getting even more and more pronounced and exponential in growth as time goes on. You’ve got digital cameras which are kind of older technology but continue to make it easy for us to produce those kinds of images, or produce images, period. And videos. Or surveillance cams. Everyone’s got a cellphone, or at least two of them. You’ve got one for work, you’ve got one for your own personal use. Of course, you’ve got the internet to post all your images and share, and then your laptop to store them. All these different hard drives … storage is so easy to get, and cheap. It’s inexpensive, it’s accessible. And so, they’re being produced on a massive scale.
What does that mean? That means that everyone that’s on this WebEx now is probably handling that sort of evidence. With it being so easy to produce, it is also a big and important part of your investigation. Here, in the United States, we’ve got the Department of Homeland Security, we have a specific subsection of it, Homeland Security Investigations HSI, and just them alone, in the type of investigations that involve child pornography, they collected over six petabytes of data in 2016. That is a staggering figure – petabytes. I mean, we think in terabytes, we think in gigabytes, but as far as the total … that is a lot of pictures and videos to mow through. And that is just for that one use case. You’ve got homicides to investigate, you’ve got gangs and trafficking and all sorts of other crimes, fraud, to track and try to keep up with.
So, this means that you have a problem set that … how do you get through that? I’m going to keep going and referencing back to the child pornography case, but those cases … the largest case I’ve heard of, single case, involved over 22 terabytes of pictures and videos. 22 terabytes – you buy a laptop, today’s standard, 500 gigabyte, one terabyte, maybe two terabyte … this is someone that collected, hoarded, stored, and distributed. This is a wild, wild case.
So you need something to help you mow through that. The bottom line is that you are overwhelmed with pictures and videos. That’s what we try to help you with. We’re trying to help … and that’s what we’re going to go over. How do you get through that quickly and efficiently?
The whole idea is to … in the very beginning, you’ve got all your different sources. You need a way to process and analyze just that overwhelming amount of information. It needs to be automated and prioritized for you, so that it’s easy and you have a running head start. And how do you do that? We’ll go over that. Basically, it works off a model of having a set of knowns and having your platform, having your software figure out what you don’t need to look at and where you should start.
Once you have that and you know – these are perhaps images and videos that you haven’t looked at or haven’t been seen by anybody else, then now you can start your investigation and look, detect, highlight those critical clues, so you can prosecute. Once you have that, then perhaps you can find the relationships between the data. Not necessarily full-length analysis, but each one of these images and videos, they’re typically very rich with embedded information, primarily Exif, but just from the pictures themselves, they can be indexed in such a way that any sort of visual similarities can also be [matched] for you.
So, once you have the important information that you need to focus on, you’ve found all your clues, you need a way to visualize and review it. And if you’re doing this, depending on what size your organization is – you’ve got a single agency, perhaps a police department, a sheriff’s department, sheriff’s office here in the United States that may include a couple of examiners and a couple of investigators that do this type of work, and then you also have the national scale – you’ve got [TICAC], you’ve got the [ICAC] over here. You’ve got HSI and federal agencies, large-scale organizations. It needs to be able to scale out from the basic level or smallest level out to the national level. And once you have that, you need to be able to manage and organize your information, and if you’re doing it right and you’re building that repository of knowns, now you have an intelligence source, and you’re building intelligence that you can reference in future cases, and keep accumulating that knowledge.
Eventually, the ideal situation is: once you’ve accumulated that knowledge, you can share that information with other organizations and be even stronger in your investigations, because what you’ll see after today is how to cut down your work, but by building that intelligence repository, you’re cutting down the work of not just your organization, but, if you’re sharing it, everyone else.
So, how does it work? Well, first, Griffeye is an open platform that is modular. The core of the program is to investigate media, visual media, and to build upon it, there is an API. So, there are some integrations that are fully built in, such as video fingerprinting with F1, Microsoft PhotoDNA for photo [stacking] and deduplicating, the identifier for very, very fine, granular level object identification. Then you’ve got camera forensics if you want to be able to hang out and look for Exif data on the internet. You’ve got social media identifier to leverage the social media artefacts that you can find in the images and videos. There are all these different ways, but the whole idea is keep it open, keep it plugged and ready, provide that API openly to law enforcement and our users, and then other things, such as integration or plug-in ability with Hubstream, third-party tools such as Magnet, Cellebrite, XRY, Amped software is fully integrated now, Nuix, you’ve got all these different possibilities.
But beyond that, if you notice at the bottom of the screen, if you’ve invested in visual technology such as facial recognition for your agency, perhaps plate recognition, you’ve got fingerprinting services, you’ve got ID card identification or perhaps passport services in agencies within your countries – all of those are pictures of everybody, and if you can map those out, index them, and keep them in your intelligence library of [hashes], your set of “knowns”, now you’ve got a really powerful way to get a head start on your work. And that API facilitates plugging into your government-owned systems that you’ve already invested in. So, this way, you don’t have a company like ours trying to do what everyone else already does well. We can stay focused on what we do well, which is help you investigate pictures and videos, and then, with that API, go ahead and plug in all that other technology, to make you even more effective in your investigations.
So, I hope that all makes sense, and this visual, this diagram, helps describe that in a really effective way.
So, what is being kept track of? What we have here in the United States is we have a project called Project VIC, and it is a project geared towards those that investigate child pornography, child exploitation, those child sex abuse type of cases. And what is kept track of – this program started off about five or six years ago, and what it did was it provided a way for law enforcement to collaborate and share all those hashes of known CP and be able to help others figure out if their cases contain it, if they’re distributing it, if there’s victims in there.
But the whole idea – not just the victims, but where they are – the whole idea was to make it standard. Standardize the way that information was shared, through the JSON format, the VICS hash record, which contains everything from binary hashes to robust hashes, and fuzzy hashes such as PhotoDNA, all the Exif data, they’re sharing the categorization, so that they can say, “Hey, these are the hashes and knowns of child pornography, and if you are an investigator that does those type of cases, this helps you in a way that not just processing your cases, but also, helping you with your mental health.” So you don’t have to look at something that someone – not just someone, but three other law enforcement officials have already said, “That is known child porn, known child sex abuse material.” You don’t have to look at that unless you need to validate, and you can focus on all the different things that haven’t been seen by anybody else before.
So, that is the rundown on Project VIC. The whole idea was open standards and also changing the investigation workflow and mentality from just prosecuting someone for having the bad material to also getting you to a point where you can look at the rest of the material effectively, so that we can start not just identifying the victims and the other victims that may be in the dataset, or may be in the rest of the digital evidence, but also locate them.
This is actually the original napkin sketch of that idea, of sharing, collaborating. And what I like about it is that it has almost forced the industry, the rest of the third-party tools out there, the forensic image acquisition tools, to also find a way to collaborate with each other, so that you could share this information freely and easily between the platforms, and that universal language has become the Project VIC format, the JSON format for databases. Very cool stuff.
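To make the idea of a shared hash record a little more concrete, here is a minimal sketch of how one entry in such a JSON exchange file might be written out and read back. The field names (md5, sha1, robust_hash, category, exif) and the helper functions are hypothetical placeholders for illustration only and are not the actual Project VIC schema. The sample hash values are simply the MD5 and SHA1 of an empty file.

```python
import json

# Hypothetical record layout for illustration; NOT the real VICS schema.
record = {
    "md5": "d41d8cd98f00b204e9800998ecf8427e",           # MD5 of an empty file
    "sha1": "da39a3ee5e6b4b0d3255bfef95601890afd80709",  # SHA1 of an empty file
    "robust_hash": None,   # placeholder for a fuzzy/robust hash value
    "category": 0,         # agency-defined category number
    "exif": {"Make": "ExampleCam", "DateTimeOriginal": "2016:01:01 10:00:00"},
}


def export_records(records):
    """Serialize a list of hash records to a JSON string."""
    return json.dumps({"records": records}, indent=2, sort_keys=True)


def import_records(text):
    """Parse a JSON string back into a list of hash records."""
    return json.loads(text)["records"]


# One tool exports, another imports: the record survives the round trip.
round_tripped = import_records(export_records([record]))
```

The point of the sketch is only that a plain, agreed-upon text format lets different tools exchange the same hashes, categories and metadata without caring which tool produced them.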
An example of that workflow would be, from left to right: you’ve got your forensic tool, you’ve collected all your images, and then you would pump out all those different … just the images and the videos, to send it into a software like this, to just go ahead and start your investigation on all the visual media. If you work in a team environment, you can set it up in a collaboration server setup, where everyone is able to work collaboratively, simultaneously, on the same case. I had mentioned the case that involved over 22 million images – that’s a lot for one person. So, perhaps [18:45] the work and being able to work real-time all on the same case is ideal. And eventually, you’re saving all that work, all the hashes that you’re collecting from case to case to case, so that you can reference them, perhaps an enterprise setup, and build out that intelligence library and repository of known hashes. When you’re in that situation, you’ve got a really, really good setup for the future.
I had mentioned the open API – an example of that, and a great example of that, is the work that we’ve been doing with one of our partners, Magnet, with the AXIOM platform. That’s kind of a one-stop shop for collecting all your images, right? You can image your hard drives, you can take your cellphone dumps, you can grab some internet artefacts and put it all in one place. But with the integration efforts that we’ve been working on with Magnet, you can then take the images and videos and send them directly to Griffeye, generally through JSON format, work on what you need to do, then re-ingest that work back into AXIOM, so that you can take the case, look at it, prosecute on it from a holistic standpoint, taking into account perhaps also the messages and emails that you’ve also collected. There’s not just images and videos that will seal the deal here. You would need to take the case as a whole. So, doing that and working with Magnet has been great, because now there’s one integrated platform – or, excuse me, two integrated platforms that work seamlessly together to help you work even more effectively.
This is just a little roadmap of where that is, just to inform everyone on this webinar – right now, you have your Magnet AXIOM platform, you go ahead and export all your pictures and videos, put them right into your Griffeye platform, do your work. In the near term, we’re working on making that kind of a background sync process, a workflow that you can set up. And the future is to make that a fully automated process between the platforms, so that you can go seamlessly between one and the other.
Let’s go into some of the more core functions, techniques that you would be using within Griffeye. First is visual matching. How do we do that? You’ve got binary hashing, you’ve got robust hashing. Going back to visual matching, that is done through PhotoDNA and similar search patterns. What I’ve got up on my screen right now is the Griffeye platform. And you’ll notice that to the right, after a case is processed, you are bouncing everything off of your knowns. The whole idea is you’ve got hash databases, so up here to the right, you see that items were already categorized for you, and you’ve got so many in non-pertinent, so many in – excuse me, non-pertinent one-offs, and files of interest, and a set of uncategorized. How is that done?
First, you’ve got hash databases. These hash databases, as we mentioned, house the hashes, the binary hashes, specifically SHA1 and MD5. If you are able to get SHA256, 384, and also 512, it’s keeping track of that. You’ve got a way here to set up your known hash values. On top of that, you’re also keeping the PhotoDNA hashes. So, there’s different ways to do that. Besides those that I mentioned, you also have, if you notice, NSRL List, that you can pump in.
What does that do for you? Here in the United States, we have the National Institute of Standards and Technology, which goes ahead and baselines all the different software that’s available and that the government uses. And what it does in that process is it also provides a hash list of known system files. Every time you image a hard drive or, say, a specific operating system, you’ve got a set of files that you know you’re going to get every time, the system files.
Probably at least … generally, over 60,000 of those files each time that you will recover and parse out. And so, if you can have a set of known hashes that … every time I run into these system files, I’ll set them aside as not important, basically. Then you’re reducing the amount that you have to look at. You also have the ability, besides Project VIC here in the United States, if you’re working with Interpol and have access to ICSE files, go ahead and put those in as your knowns. You’ve got EnCase and C4P support, for [C4All] and EnCase’s tools. You can also put in different files. Let’s say you’ve got animated files, and if you just want to go ahead and throw those in and point the sources to that, then you can go ahead and do that and say, “These hashes are known categories for us.” So, that way, you can set up your knowns.
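The idea of bouncing a case off a set of knowns can be sketched in a few lines. This is a minimal illustration of hash-set lookup in general, not Griffeye’s implementation; the function names, file names, and the "non-pertinent" label are assumptions made for the example.

```python
import hashlib


def sha1_of(data: bytes) -> str:
    """Return the SHA1 hex digest of a file's contents."""
    return hashlib.sha1(data).hexdigest()


def triage(files, known):
    """Split files into pre-categorized and uncategorized.

    files: dict mapping file name -> file bytes
    known: dict mapping hash -> category label (the "knowns")
    """
    categorized, uncategorized = {}, []
    for name, data in files.items():
        label = known.get(sha1_of(data))
        if label is None:
            uncategorized.append(name)   # no database hit: needs review
        else:
            categorized[name] = label    # already known: set aside
    return categorized, uncategorized


# Hypothetical example: one known system file, one never-seen file.
system_file = b"pretend operating-system file"
known = {sha1_of(system_file): "non-pertinent"}
files = {"kernel.dll": system_file, "holiday.jpg": b"never seen before"}

categorized, uncategorized = triage(files, known)
```

Only the files that land in the uncategorized list still need a human to look at them, which is where the reduction in workload comes from.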
There was one question [24:51] going to take one second here. I’m seeing a lot of folks now are – or have been having trouble with their browsers. Chrome is probably the best way to use [25:06]. So, those on Firefox or any other browsers, sorry for the troubles you’ve had, but with Chrome, it’s probably your best bet.
Oh, looks like Safari works as well. So I hope that clears that up.
And I see a question here about [FTK]. Yes, [FTK] is also supported.
In this case … let me show you what you’d be able to ingest. If I hover over Forensic Images, you can go ahead and you can ingest raw images, such as E01s from your imaging tools, like FTK, EnCase. You’ve also got support to put in [ISOs, bins, raws, DBs, IMGs, DMGs, and VHDs]. So, if you want to go ahead and put those in, you can put them in directly and just ingest those sources. Just one thing to note: Griffeye Analyze is not a forensic tool, and so, although it will go ahead and extract all the pictures and videos, it will not carve out from unallocated or the deleted files. You’ll want to do that prior to ingesting, and perhaps export it out to a folder format, which is another format that can be supported. So, let’s say you did this in UFED and you did a file system dump, you put it to the root folder, go ahead and say, “Hey, this is iPhone 7-Plus,” and mark it as such. That’s what you can do as far as putting in your different sources.
The first level of efficiency was achieved by going ahead and bouncing off of your knowns, and what you’re left with here is a total of, in this small case, almost 6500 files, uncategorized, which means it didn’t hit any of the databases. We’re about a thousand. And generally, what we’re hearing from the ones that are using the platform, they’re achieving about anywhere between 80 and 90% efficiency when they have a high-quality hash database. It’s pretty effective in that manner. If you started off with, say, 6500 and went down to 1000, you’re also achieving another level of efficiency by seeing what is left after clicking on Categorize. What I’m left after … on Categorize, you see here, 1000 are left, and then you have 768. That’s because of photo stacking.
So, let me jump back to the slides and go over some of that technology. There is one question I saw here real quick, about Physical Analyzer. That is good, if you want to use the [C4All] XML format. You can also export directly to a JSON format, with the attachments. That could work. And I was just teaching some folks here, out of a sheriff’s office on the west coast, and we were able to also just put in the binary file from that physical. If you want to [28:43] [that bin] file – because you’re not going to worry about all of the message databases and things like that, you just need to go pull out all the pictures and videos, right? So, if you’ve got a solid [.bin] file, you can just go ahead and point it towards the [.bin]. And that would be just as effective, from the physical. And that would be from any cellphone extraction tool.
So, when it comes to visual matching technologies, you’ve got binary and robust hashing. I think for this group, Forensic Focus, I don’t really need to go too much into binary hashing. Essentially, it’s a unique fingerprint for every file. So, in the case of pictures and videos, if you went in and, say, modified one of the pixels, it produces a totally different hash. So it’s a good way to find exact matches.
In this case, you’ve got … for example, this dog. And its MD5 hash is such, you find another file, you run into another file that has the same MD5 hash, you got a match. The problem is, as I mentioned, you modify that same picture just slightly, you’re going to produce a different hash, you’re not going to get a match. So, you could see this same dog … you could see it multiple times in your case. Let’s say you image, in your original acquisition tool, for your hard drive, and you’re reviewing everything, in gallery view, you’re going to see this same picture, just slightly modified, over and over again, and we don’t want that, especially in a child pornography case.
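A quick way to see this behaviour is to hash some bytes, flip a single bit, and hash again. The sketch below uses Python’s standard hashlib; the byte string is just a stand-in for image data, not a real picture.

```python
import hashlib

original = bytes(range(256)) * 4    # stand-in for the bytes of an image file

modified = bytearray(original)
modified[100] ^= 0x01               # flip one bit, like nudging a single pixel
modified = bytes(modified)

h_orig = hashlib.md5(original).hexdigest()
h_copy = hashlib.md5(bytes(original)).hexdigest()   # exact binary copy
h_mod = hashlib.md5(modified).hexdigest()           # slightly modified copy
```

The exact copy produces the same digest as the original, while the one-bit change produces a completely different one, which is why binary hashes only find exact duplicates.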
So, you’ve got robust hashing. Robust hashing would essentially – using PhotoDNA in this case – allows you, regardless of file format, size, maybe a slight change in hue, to just find the same images and stack them, and allow you to deduplicate right off the bat, before you even start your review.
So, what do I mean by that? In Griffeye, let’s say you’ve got these [groups] here. Looks like grapes, they’re actually gooseberries. And if I hover over the top left corner of the thumbnail, you’ll notice there are six visual copies. Now, those six visual copies – let me expand this just a little bit – those six visual copies reside in these folders. You notice I’ve got four different sources for this, actually five different sources for this data set. And those different gooseberries, the same pictures, reside in those folders, and there are two exact copies, two binary copies, that are in those locations.
So, the dataset included eight of these grapes. And you would have normally run into these grapes eight times. And that’s a lot of work. It could really total up to a lot of work, especially if you were talking about [31:54]. So, with the six visual copies, the two original jpegs, the same exact size; you’ll notice the other one is a PNG in a different resolution, or is a PNG file, and the other one is an additional PNG in a separate, different resolution. So those are the visual copies, here are the binary copies, but you are only reviewing this once.
And if you want to go ahead and categorize that, you’ll notice that there’s a number corresponding to each category. I’m going to change this to category two by hitting the button number two, and now, all six of those files have been categorized with category two, in this case. This is a generic use case that you’re seeing on your screen right now. So, that’s your next set of efficiency – you’ve got a thousand, but you’re really only reviewing, if you look at the number and the figure down here in the bottom, you’re only reviewing 768, so you’ve just removed about 300 items from your review [32:56], as far as reviewing.
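The figures quoted for this demo case can be worked through directly. This is just the arithmetic on the rounded numbers mentioned in the talk (about 6500 files ingested, about 1000 uncategorized after the hash databases, 768 after stacking), so the results are approximate.

```python
ingested = 6500        # "almost 6500" files in the demo case
uncategorized = 1000   # left after matching against the known hashes
to_review = 768        # left after stacking visual copies

hash_reduction = 1 - uncategorized / ingested      # share removed by known hashes
stack_reduction = 1 - to_review / uncategorized    # share removed by stacking
overall_reduction = 1 - to_review / ingested       # combined effect
stacked_away = uncategorized - to_review           # items folded into stacks

print(f"known hashes: {hash_reduction:.1%}")
print(f"stacking:     {stack_reduction:.1%} ({stacked_away} items)")
print(f"overall:      {overall_reduction:.1%}")
```

On these numbers the known hashes remove roughly 85% of the files, which sits inside the 80 to 90% range mentioned earlier. Stacking then folds away 232 of the remaining 1000, a little under the "about 300" quoted, and the combined reduction is around 88%.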
So, with PhotoDNA, how does it work? PhotoDNA is essentially a way to normalize photos. It takes an image, turns it to black and white, turns it into a square and grids it out. And each one of these grids then produces a color distribution/histogram that can be matched up. If you’ve got another photo that happened to be of the same dog, if enough of the blocks match up as far as their color distribution and histogram analysis, you’ve got a match. And it produces a really lightweight, 144-byte, unique hash, PhotoDNA hash signature in this case. You’ve got that, and it allows you to go ahead and take something like this original picture, apply that threshold [for matching], and find the other same photos. And in this case, you’ve got one that’s in black and white, you’ve got – and slightly bigger, you’ve got another one that is smaller in resolution, but it’s the same photo, and so those would be stacked for you within the Griffeye platform.
Now, if you’ve got other ones that are just … deviate too much from that threshold, then they’re not going to be a match, and so, it’s really [fine-tuned]. If something is too small, you’ve got that. If it’s modified too much, such as the one in the bottom that has a little obscured box over the bowl, then it’s also not going to be a match.
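PhotoDNA itself is proprietary, so the sketch below swaps in a much simpler, well-known technique (an "average hash") purely to illustrate the general idea of gridding an image, reducing each cell to a number, and matching within a threshold. It is not the PhotoDNA algorithm and does not produce a 144-byte signature. The grid size, threshold, and the tiny synthetic grayscale images are all assumptions chosen for the example.

```python
def block_means(pixels, grid=4):
    """Average a grayscale image (list of rows) down to grid x grid cells."""
    h, w = len(pixels), len(pixels[0])
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * h // grid, (gy + 1) * h // grid
            x0, x1 = gx * w // grid, (gx + 1) * w // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    return cells


def average_hash(pixels, grid=4):
    """One bit per cell: 1 if the cell is brighter than the overall mean."""
    cells = block_means(pixels, grid)
    mean = sum(cells) / len(cells)
    return [1 if c > mean else 0 for c in cells]


def distance(a, b):
    """Number of differing bits between two hashes (Hamming distance)."""
    return sum(x != y for x, y in zip(a, b))


def is_match(img_a, img_b, threshold=2):
    """Treat two images as visual copies if their hashes are close enough."""
    return distance(average_hash(img_a), average_hash(img_b)) <= threshold


# Synthetic 8x8 test image: brightness increases from left to right.
original = [[x * 30 + y for x in range(8)] for y in range(8)]
# Same picture at double the resolution.
upscaled = [[original[y // 2][x // 2] for x in range(16)] for y in range(16)]
# Same picture, slightly brighter.
brighter = [[v + 20 for v in row] for row in original]
# A different picture: brightness increases from top to bottom.
different = [[y * 30 + x for x in range(8)] for y in range(8)]
```

With these inputs, the upscaled and brightened versions hash the same as the original and would be stacked with it, while the different picture falls outside the threshold and would not.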
So, this allows you to quickly group them and essentially deduplicate, stack them in your review. When that happens, then all of the different duplicates or visual duplicates, end up getting pre-categorized, and you’re already hitting the ground with a running start, thereby eliminating things that you don’t need to see, especially if they’ve been pre-categorized and non-pertinent, not important essentially. And so you’re just whittling away all that work before you even start.
Now, before we jump into metadata, let me show you what I’m talking about when you are also leveraging the visual similarities. Let’s say, for example, you’ve got a photo like this. Here’s a group of people, and they have picture frames, and you say, “I need to find other photos that also have this group of people.” The key thing here being picture frames. So, I’m going to go ahead and hit the ENTER button, and that will produce a set of results that find other photos immediately that share common characteristics to this, namely what is being looked for as far as the image record information is colors, contours, shapes, textures – textures are really important to find these matches.
So, what it did was it found a set of photos that also contained – looks like the common theme here is picture frame. And some of the same people in the same exact outfits, and the next one down then shows your first false positive. So this search technique is not [thresholded] like it is in PhotoDNA. That way you can still perhaps look at other results. We don’t want to eliminate other possibilities, so we have this [on] threshold for you. But you can see pretty quickly, you’ve found five other photos right off the bat that had the same group by hitting ENTER.
Now, if I use this as another example – I’m selecting the photo of the child that’s in the urban playground with the adult. You can see that the environment in this case is something that also can be taken into account. So, I’m going to hit ENTER to find similar images to that one, and you’ll see that you have results that are given to you where there’s other photos of the same girl, in the same outfit, hanging out with the same adult in the [blue polo].
If I scroll down to some of the false positives, you’ll see that we’ve got some false positive results, but the results aren’t too long, there aren’t too many to have to go through. Because here’s another one where there’s the same adult in different orientation with the same young girl, and if you scroll down a little further, you’ll notice that you have another photo, completely different face, but it did pick up on the girl’s outfit, and with another adult, because the original adult was actually this person with the blue polo. So, that’s one easy way to go ahead, just simply picking one and hitting ENTER. More than likely, if you’re investigating one of these photos, it’s because you’ve had a specific interest in it and it’s worth just hitting ENTER to see what other ones were taken probably in the same place or had the same people.
You can also try to get a little more granular.
I’m noticing that some folks are still having trouble getting up [39:07], there’s one question about being able to view it later. Yes, we will be providing this for replay after it is done today.
Then there was one question about … is there any way of turning [39:25] specifically [regarding reporting due to providing a number of reports and keys]. [39:31] [fully chargeable images].
There is not a way to necessarily turn it off, because you can’t necessarily reduce the amount of files that you had ingested in the case, that’s how many there were in there. It will show you the representation of it, but it will lay out how many visual copies there were and how many actual copies of the image of interest. Because there can only be so many binary duplicates. I hope that helps.
So, with that, perhaps you want to get a little more granular in your search.
So, let’s say I take this photo, for example. There is a photo of … looks like a female being obscured in a room. So, I’m going to double-click that to kind of blow it up. And you’ll notice that, well … perhaps I want to see what other things were going on here, in this room. And I can choose a reference point. So, I’m going to take this selection tool from the top and select a unique reference point in the room, this being, in this case, the stand to her right. And if you want that as your reference point, I’m going to right-click and choose to search for similar images to this current frame. And the results are pretty good. You’ve got a number of photos that were taken in this room that contain that frame. And in some of them, the person is actually revealed and un-obscured. So, pretty good search right there.
Now, what about metadata? Metadata, it’s in almost all your photos. There are going to be definitely cases where it doesn’t include metadata, but when it does, you’ve got a wealth of information which you can compare against and look for. Everything from the basic stuff, like when it was taken, when the photo was taken, where it was taken perhaps, with GPS data, the camera model, serial number, the resolution, you’ve got all this different metadata that can be… that is very useful information for you.
So, what happens when you have that?
If we use metadata as our reference point and try to see what other photos share the same metadata, then that has a different keystroke, and I’m going to use one of the photos here. Let’s use this one as an example. This one is kind of a dark picture – it looks like the back of a car, perhaps a taxi, and if you wanted to take a closer look … let’s do a quick shadow boost and get a better idea of what’s going on in this picture, like “yeah, this is something that I want to compare against and investigate a little further.” So, you’ll notice here that it does contain GPS and Exif data when I hover over that particular icon and the thumbnail header. So, if I take this photo and I do a different keystroke, hold down SHIFT and hit ENTER, you’ll find other photos that share that same Exif.
So, initially, he was looking at the camera, and the first set of results, if you notice, the first one is another one of him looking away, but in the bottom of each of these thumbnail headers is GPS and date and resolution. So, that would match up on the first eight results. All these first search results, by hitting SHIFT-ENTER, showed me other photos that were taken in the same place, same vicinity, on the same date, with the same resolution. Those were the strongest similarities by Exif, and if it’s with the same resolution, it probably was taken with the same camera. So, whoever took his photo in the cab was near this hotel, and it could probably provide you a new lead, to say, “Hey …” perhaps I take that initial photo, take it to the hotel, boost it up, and say, “Hey, have you seen this guy?” Because you can tell right away where that photo was taken.
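The kind of ranking described here, where photos sharing vicinity, date and resolution with a reference photo come first, can be sketched as a simple score. This is an illustrative assumption about how such a comparison could work, not how Griffeye computes it; the field names, the one-kilometre radius, and the sample photos are all made up for the example.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def exif_score(ref, other, radius_km=1.0):
    """Count how many of three fields agree: vicinity, date, resolution."""
    score = 0
    if ref.get("gps") and other.get("gps"):
        if haversine_km(*ref["gps"], *other["gps"]) <= radius_km:
            score += 1
    if ref.get("date") and ref.get("date") == other.get("date"):
        score += 1
    if ref.get("resolution") and ref.get("resolution") == other.get("resolution"):
        score += 1
    return score


def rank_by_exif(ref, candidates):
    """Sort candidates so the strongest metadata matches come first."""
    return sorted(candidates, key=lambda c: exif_score(ref, c), reverse=True)


# Hypothetical sample data.
ref = {"gps": (51.5074, -0.1278), "date": "2016-05-01", "resolution": (4032, 3024)}
candidates = [
    {"name": "c", "gps": None, "date": "2015-01-01", "resolution": (640, 480)},
    {"name": "b", "gps": (37.7749, -122.4194), "date": "2016-06-10",
     "resolution": (4032, 3024)},
    {"name": "a", "gps": (51.5080, -0.1280), "date": "2016-05-01",
     "resolution": (4032, 3024)},
]
ranked = rank_by_exif(ref, candidates)
```

Photo "a" agrees on all three fields and is listed first, "b" only shares the resolution, and "c" shares nothing, so it ends up last, much like the weaker results further down the list on screen.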
And using that as an example, going over to my map view … let’s go ahead and turn on some actual cities and streets, you can see where that was initially taken by turning on the GPS. And I’m scrolling down, it looks like it was taken in London. So, it’s a great way to map it out. If you want to do the same thing … take all the photos. I’m going to select one and pick all of them. Let’s go ahead and map out everything, leverage that GPS data. So, what is presented to me here is where all the different photos were taken across the world – all the red dots represent that. So I can click on each one of these that have GPS data. If you want to refine that a little further … let’s say, actually, all I am interested in is going in and finding which photos were taken specifically in the good old US of A. And perhaps to refine it even further, I’m only interested in what photos were taken in the Bay Area.
So, narrowing down my screen here, I’m going to filter based on that map. So, the results here – you’ll notice I have two filters right now, based on those [46:12], and these photos in my screen were the ones that were taken in the Bay Area. Going back to my map, plotting out the Bay Area, it looks like a path of photos … there were a number of photos taken in San Francisco and San Jose area. Alright?
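Filtering on a map region comes down to testing each photo's GPS coordinates against the visible area. A minimal sketch, assuming the region is a simple latitude/longitude rectangle; the `BAY_AREA` bounds below are rough illustrative values, not anything taken from the software, and the function does not handle regions that cross the 180° meridian.

```python
from typing import Dict, List, Optional, Tuple

# Rough, illustrative bounds for the San Francisco Bay Area (assumed values).
BAY_AREA: Dict[str, float] = {"south": 36.9, "west": -123.1, "north": 38.3, "east": -121.5}


def in_box(point: Tuple[float, float], south: float, west: float,
           north: float, east: float) -> bool:
    """True if a (lat, lon) point lies inside the rectangle, edges included."""
    lat, lon = point
    return south <= lat <= north and west <= lon <= east


def filter_by_box(photos: List[Tuple[str, Optional[Tuple[float, float]]]],
                  box: Dict[str, float]) -> List[str]:
    """Keep the names of photos whose GPS point falls in the box; skip photos without GPS."""
    return [name for name, gps in photos if gps is not None and in_box(gps, **box)]
```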
I had talked earlier about Amped and the integration efforts. So, I just wanted to touch on that a little bit. Again, with that Open API, you can see that Amped here is one of the ones listed on this diagram that I wanted to key in on. They’ve got a pretty cool set of software, a suite of software that’s geared towards law enforcement, and specifically the Amped products that are integrated are Amped FIVE, Authenticate, and DVRConv. So if you want to do a little pre-work on the evidence before ingesting it into Griffeye, or perhaps work on it while you’re in Griffeye, you have that integration already built in. Where that is best seen is … I’m going to open up the … excuse me … the Analyze Forensic Marketplace, and you’ll see down here all the different plugins that are fully integrated and ready for you to enable and work with while you’re in the platform. Okay?
So, shifting over from images, now let’s talk about videos. With videos … I’m going to filter on just videos now.
You’ll notice that videos are presented to you in a 64-frame storyboard. That way, while you have videos up, you can get a good sense right away whether or not it’s worth taking a look. So, if I took this one and I wanted to do a quick triage, a quick review of it, I can take my mouse and mouseover it, just to see what is happening in each video. And no, no, nope, nope, this is all junk.
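A storyboard like this needs a way to pick which frames to show. The webinar does not say how the 64 frames are chosen, so the sketch below simply assumes evenly spaced sampling: the video is cut into equal segments and the middle frame of each is taken. The function name and the midpoint choice are illustrative.

```python
from typing import List


def storyboard_indices(total_frames: int, tiles: int = 64) -> List[int]:
    """Pick up to `tiles` evenly spaced frame indices, one from the middle of each segment."""
    if total_frames <= 0:
        return []
    n = min(tiles, total_frames)  # short clips get one tile per frame
    # Midpoint of segment i is (i + 0.5) * total_frames / n, computed in integer arithmetic.
    return [(2 * i + 1) * total_frames // (2 * n) for i in range(n)]
```

Sampling midpoints rather than segment starts avoids always showing the very first frame, which is often a black or title frame.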
Well, if I did want to take a closer look at something like this one, a surveillance cam on a garage door. Let’s go ahead and open that up. This can be presented to you in either a single frame or [multi-stream]. So, you’ve got six [screens] or six [streams], nine streams or twelve streams, all of them tabbed out. So, if you took one of the markers and just went from position one to position two, you’ve just rolled through the entire video from the start to end, through your six different screens.
There’s also analysis done here at the bottom that shows you, in blue, motion; in green, nudity, which doesn’t apply in this case with the surveillance cam; and there would be another one if there was sound, in orange, for audio.
We’ve also got, to the left, the video broken down into chapters, essentially scenes at five-minute minimum intervals. So, that would be to the left, where you could kind of pick and choose your different scenes. And I could take a scene like this, a kind of scene with the overview of that, and kind of go back and forth. But the power in this type of video, when it comes to surveillance, is the motion filtering.
So, let’s say I wanted to go ahead and take this … there’s a lot of dead time, as you can see. And I only care about what’s happening when cars are coming in and out. So, let’s go ahead and apply a motion filter. And you’ll notice these large grey areas pop up. Depending on what threshold you set over here for sensitivity, you can go ahead and filter on motion, play over the filter points, and now, if you watch the screen and the marker, you’ll notice that the marker will jump from the points of the video that have something going on, and then skip all of the dead space. So, let’s speed this up. And now the video can be reviewed where you’re only watching the portions where there was something going on.
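The motion filter described above can be thought of as thresholding a per-frame motion score and playing only the stretches that clear the threshold. A minimal sketch under that assumption; how the software actually scores motion is not stated in the talk, so `scores` here is just a hypothetical list of numbers, one per frame.

```python
from typing import List, Sequence, Tuple


def motion_segments(scores: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Return half-open (start, end) frame ranges where the score is at or above the threshold."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i                      # a new active stretch begins
        elif start is not None:
            segments.append((start, i))        # the stretch ended on the previous frame
            start = None
    if start is not None:
        segments.append((start, len(scores)))  # the video ended during an active stretch
    return segments
```

A player that jumps from one returned range to the next skips the dead time in between. Lowering the threshold corresponds to raising the sensitivity: more frames qualify, so less of the video is skipped.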
Perhaps you want to get a little more granular than the overall video. If you remember the example that I used with the female in the room and that stand … then, we can go ahead and try to pick out a certain part of the video if that’s what’s of interest to you. So, let’s say I want to analyze this video but I only care about what happened around this SUV. So, I’m creating a box around it, and the software is going in and analyzing all the frames for this region of the video. And that’s the way videos are treated.
During the processing of your files, besides hashing and all the different things that’s going on, one thing that happens is in videos, frames are extracted and analyzed for you, because [52:31] … videos are stitched-together pictures, when it comes down to it. So, now that it’s been analyzed for movement around just this car, using the same technique and filtering out just for motion, you’ll notice that the graph at the bottom has changed, and I’m going to apply the motion filter, and now, playing at the same sped-up, eight-times speed, play the filtered portion, and just jump to the points of the video where there’s somebody or something happening around this car, skipping any passers-by or perhaps anything else that happened outside of the box that I had created.
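Restricting the analysis to a box around the car can be sketched with simple frame differencing: compare each frame with the previous one, but only inside the selected region. This is one common way to estimate motion, used here as an assumption; the talk does not say which method the software uses. Frames are represented as plain nested lists of grayscale values so the example has no dependencies.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # rows of grayscale pixel values


def roi_motion_scores(frames: Sequence[Frame], box: Tuple[int, int, int, int]) -> List[float]:
    """Mean absolute pixel change inside the box between each pair of consecutive frames.

    box is (top, left, bottom, right) with rows top..bottom-1 and columns left..right-1.
    """
    top, left, bottom, right = box
    area = (bottom - top) * (right - left)
    scores: List[float] = []
    for prev, cur in zip(frames, frames[1:]):
        total = 0
        for r in range(top, bottom):
            for c in range(left, right):
                total += abs(cur[r][c] - prev[r][c])
        scores.append(total / area)
    return scores
```

Changes outside the box contribute nothing to the score, which is why passers-by elsewhere in the frame no longer trigger the filter.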
So, another really effective way of reviewing videos … perhaps you want to see what happened around a postal box or a doorway. This is an effective way to do that.
Now, jumping back to images … I’m going to go back to the thumbnail view and clear all these filters. One thing that I would say at the end of your investigation as a good “check your work” search would be for thumbnail mismatch. We’re going to filter on photos that do not contain … or that contain embedded thumbnails within the JPEG or the PNG file that do not match what you see here.
So, let’s say you had acquired … created your image in another tool, and you’re doing a review. I would also probably skip over files like this where it’s hard to see what’s going on. These are low-quality, probably really small images. But the thing is, with the thumbnail mismatch filtering, it tells you … let me click on this little dialog … create this dialog … and tells you that there’s other thumbnails embedded within that picture. The first thumbnail is not too bad. The original that’s, in this case, a 50x50, and it looks like there is another image that is embedded in there, that is a 160x120. That’s okay. It gives you a little bit of a better picture. But it looks like there’s more than one. And if we toggle over to the next image, this is a much better one. Probably of the original, and at 570x375, and now I’ve got a much clearer face, and I can work with this. So, this is a good “check your work” technique – that’s what I like to call it – where you can look over the things that you may have overlooked, if you will. So that’s one way.
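One rough way to spot embedded previews in a JPEG is to look for additional start-of-image markers inside the file. Every JPEG stream begins with the bytes `FF D8 FF`, so any later occurrence hints at a second image stored inside the first, such as an Exif thumbnail. The sketch below is only a heuristic byte scan, not how the software detects mismatches; it can report false positives if those bytes happen to appear in compressed data, and it does not decode or compare the images.

```python
from typing import List

JPEG_SOI = b"\xff\xd8\xff"  # start-of-image marker followed by the next marker's 0xFF


def find_embedded_jpeg_offsets(data: bytes) -> List[int]:
    """Byte offsets of JPEG start markers found after the file's own header at offset 0."""
    offsets: List[int] = []
    pos = data.find(JPEG_SOI, 1)  # start at 1 to skip the main image's marker
    while pos != -1:
        offsets.append(pos)
        pos = data.find(JPEG_SOI, pos + 1)
    return offsets
```

Each offset found could then be carved out and decoded separately so its content and dimensions can be compared with the main image.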
So, all these different things that I’ve shown you, these are things that you can do within the platform. You could be on a closed-network, air-gapped system, and you’ll be fine. But if you do have access to a [white] line and an open internet, there are also ways to leverage the information that’s embedded in the dataset for … using external, third-party tools … excuse me, tools that hang out of the software.
What do I mean by that? Well, let’s use a specific scenario as an example. You’ve got a case and you recovered a laptop, a phone, and also recovered a digital camera, a DSLR in this case. So, let’s go down … that digital camera happens to be a Canon EOS 450D. In this case, there were 108 photos that were taken with the Canon EOS 450D. So, let’s [56:58] on that. And, in this case, this is actually the victim. So, you’re seeing a lot of photos that were taken of the victim. And notice that when I hover over this, it matches up. This matches up with the serial number, Exif serial number that was used for this photo or that you got from this photo is 088 and so on, ending in 2234. Well, I want to find other photos that were also taken with that specific serial number. So, I’m going to go ahead and right-click it and find the photos that were taken with that specific camera body serial number. The result being these photos. Well, if you look closely at the thumbnail header, it looks like there’s also a match on camera forensics.
Now, I got some folks here from the UK, there’s a person out of the UK, Matt Barnes, who started a site called Stolen Camera Finder. He lost his very expensive DSLR, and to try to find it, he created a web crawler tool that basically indexed Exif data from the internet, open source. And waited until somebody posted a photo that contained Exif which matched with his serial number for his camera. And one day, he did get an email, and it worked. And he found his camera after that. Well, if I hit Camera Forensics, it will bring me to the cameraforensics.com site … [58:47] opening on Camera Forensics now, and what Camera Forensics does is essentially index that open-source information, that open-source Exif. We bring that over – that’s what popped up. And I actually was here, and that photo, with that serial number, matched up, and one of the photos that was taken was posted to this blog. And if I click on that source, it actually brings me to the Norwegian blog site for what appears to be the owner of that camera.
So, it’s a good way to leverage that information. If you have that Exif data, go ahead … or, excuse me, if you got a match on Camera Forensics based on that Exif, it will tell you, and it can also help you track down where that photo … or who took that photo. Or [59:51].
Now, what if you don’t have Exif data? Well, you got a problem. Especially in the case of social media. Let’s go and clear these filters again. And now, filter on social media identifier. Now, this tool, if I filter on Facebook … what’s the problem with social media posts? The Exif data gets stripped out. These are photos that are posted to the internet, and you’ve got … what, for example, Facebook will do, it will rename it and post it, but it won’t contain the Exif data anymore, it’ll be stripped out. So, if I take these guys right here, this group of buddies, and I see that the social media identifier icon is lit up. I can go ahead and click on Social Media Identifier, and it will take me to the Facebook site from which it was downloaded, and you’ll notice that in this case, he’s also got his buddies unobscured, because it’s from the original photo. So, that’s a way to leverage that information, even if the Exif is not present.
I’m kind of coming up to the end of my hour, so I’m going to go through a couple more slides and information here. We had talked a little bit about Amped, and so this is a good example of what that photo enhancement can do for you, especially in this case, in the bottom, you’ve got reflections getting in the way of a clear picture for you. So, going ahead and using that within the Griffeye platform is a pretty powerful way to get some results here.
Now, we’ve talked a little bit about the similarity search, where you can go ahead and find matches based on similarities. But you’ve also got low-level feature extraction. This is where you can take something like a street sign or a tattoo and have that mapped out for you. The [identifier] is a tool that is integrated into the Griffeye platform and allows you to pick out those finer details and see what else contains it. So, since we’re running a little short, let me show you an example of what I’m talking about. Let’s say you want to find other pictures that contain this tattoo. What that [identifier] technology – what it will do is it will create unique digital fingerprints for 300 points in the picture. Those are typically hardline edges, high-contrast points. So, when you do that, you want to find where this ‘Rivera Kills’ tattoo exists in any of the other photos. It found it first in this mug shot, where he kind of had a bad day. And then, you’ve got another mug shot where it looks like he had a much better day. But with the work of the other folks, who, if you notice, these are bookmarks, associated with three other gang members that led to other relationships. But the point being here, kind of looking at this photo as your starting point.
That is a pretty powerful integration right there. Other examples of that would be this mural, where you can see all the different points that were pulled out of it and visual fingerprints created, to find other photos of that mural, or, in this case, depending on which you’re keying in on, whether that is the back wall or the person, perhaps logos, those are the matches that were found. And you can see this imaginary person, where in one case the ears are matched up and the eyes are matched up. So, a good way to visually see what we’re talking about as far as low-level object identification.
So, in summary, when it comes to these investigations, this is a checklist that’s definitely a fluid document, but based on what we’ve heard and feedback that we’ve received from users and investigators, you want to review your images contained within your hidden and encrypted folders. Make sure you look at that, because we’re not going to be able to carve it out for you. So, go ahead and look at that stuff, especially [01:05:05] files. Carve that, grab that information from the hidden folders. Take into account all your crime scene photos and the backgrounds, because you can use those as reference points to find similarities. You want to search for images taken by any cameras found in the suspect’s possession. Look for images with GPS data, because you can map them all out. And look for images with Exif data, because that’s a pretty powerful reference point for you to compare with the other images and videos in your evidence. You want to review images with the largest file size first, because those are probably the most important for you. Look for people’s names … we didn’t get to go into the grid view and keyword matches, but that is something you can search against. And you want to look for new files with embedded thumbnails. This is the thumbnail mismatch example that I was talking about.
So, with that, I wanted to save some time at the end for questions. And one of the questions … or it looks like there was a lot of chat activity. I’m sorry, I couldn’t keep track of it and talk at the same time, I had to go through the presentation. But there was one question regarding DI Core and DI Pro. DI Core is the free version for law enforcement, and the best way to explain that is if I go to the forensic marketplace, this tells you what is reserved for Pro, in the blue, and what is free for … what is in Core version. Now, the only one that’s not in blue is CameraForensics. CameraForensics is included in Pro but not included in Core. I hope that answers that question.
Let’s see. I’m going to start scrolling through here. Can Griffeye import data from UFED Cellebrite? Yes. In UFED, there is an export function that I believe contains the Griffeye logo that will push out a json file and a folder with attachments. And that, in that case, when you want to add that file to Griffeye … I’m going to bring up a prompt here for adding files … what you’ll want to do is choose the [Vic’s case] format, which is json. Navigate to where that folder is, and bring that in. Make sure that the attachment folder is in the same folder as the json file and you’re good to go. The other option you notice here for forensic image – you can import the binary. So, if you were able to obtain a physical with UFED, you can just go and send in the binary file directly. If you did a logical with UFED, then you’ll want to take the root folder of the logical extraction and just point it to that. So, that’s another option.
How do you get a copy of Griffeye? That’s another question. Let me bring that up. So, what you’ll want to do is go to griffeye.com and, over here, in the top right, is My Griffeye. And you’ll want to go from griffeye.com to My Griffeye on the right, hover down to register, fill all that in, and then, from there we can help you with getting a free license, if you’re looking to get a DI Core license.
Next question is: Do you see Griffeye continuing to provide the DI Core for law enforcement for the next several years? That is our commitment, and we haven’t changed it through the last six years. It’s hard for me to predict the future, but I know that we want to keep supporting law enforcement and that we’ve not just continued to provide Core, the free version, to law enforcement, we opened it up to all law enforcement. So, I think that’s a good sign. It used to just be a pro bono version for those that investigated CSA cases, child sex abuse cases, but now, starting this year, it is a free version for all law enforcement.
The latest Physical Analyzer is 7.2, and it’s still exporting in XML and not in json. I’d have to double-check with Cellebrite. Unfortunately, I don’t have the latest on where they are. I know that initially it was a [C4All] export, and the idea was to get it to pump out in json. But even if that’s the case, the workaround at this point is just to go ahead and … when you choose your import choice, I would go either binary, if you’ve got the physical, go ahead and put in the binary directly; or do the logical folder, if you’ve got that.
There was a question here about the difference between DI Core and DI Pro. Hang on one second. What I’ll do is take a little time to explain that some. If you can bear with me, I’m going to bring up another slide deck here.
And expand on it a little bit more …
So, with DI Core, the first difference is in the fingerprinting service, the database manager. Whereas before you had … With Core, you had to keep it as a locally maintained hash database, but with DI Pro, you can connect unlimited other DI Pros to the same hash database. That’s one big difference. I didn’t really mention that – it’s a finer detail. But if I stick with this slide deck here …
The user interface is pretty much the same. In reporting, you have the ability to create additional templates in DI Pro. So yeah, that’s one difference. It’s a difference worth noting.
Jump across here – video utility pack, we did the work within the video … when it comes to filtering out motion, taking a very specific region of the video and filtering on that region, analyzing it, that is part of the video utility pack, which is only in DI Pro. Everything as far as categorization and intelligence features, like voting, tags, bookmarking, series, annotations, those are all the same across the board. Excuse me – except for annotations. We didn’t really go over that, but it’s a way to mark up the photos with your own comments, directly on to the photo. Alerts and notifications, pretty much the same.
But the big difference came in the apps. I had brought that up last time. Let me jump back to Griffeye here. So, in the forensic marketplace … the ones that are included in … let’s just say the ones that you have access to in Core are the Forensic Utility Pack, NCMEC Utility Pack, VICS (Odata), and the NetClean app. So, that is the biggest difference. Everything else here that I have in blue and CameraForensics, those are reserved for the DI Pro version.
So, the ones that would probably affect you the most are the Video Utility Pack – when we went to the Facebook page from the photo, the social media identifier, camera forensics, where we went to any other sites that had the same Exif on the internet, as far as the photos that are posted – those are probably the key apps and capabilities that you’ll be missing out on with the Core version. And then, with the Pro version, you also have access to not just Amped third-party integration, but most of the other integrations down the path that we’re working on with other third-party tools. Those are reserved for the Pro user. I hope that answers that question.
Now with the [densifier], I’d have to get back to you on the pricing there. It is a separate third-party license, but it does require you having the collaboration server instead of the DI Pro license. Just because, as I’d mentioned, with the [densifier] and object identification, what’s being created are 300 fingerprints per photo. So, that’s a pretty heavy load, especially for the database in which it’s … so you’ll want to be on the server license instead of just the individual workstation. So, there’s that. But the server license starts at USD 5000. I’m not sure which country you’re in. So, we can talk about that later on.
Regarding … there was another chat here, a message. One of the viewers was able to test it, with the UFED, and the best thing to do with the export from UFED is to use the [C4All] XML format, and it will take it and run with that. So, just for your information … and thank you for that.
Cost of DI Pro – I can tell you that. That’s USD 995. I’m not sure which … Larry Harrison, so it sounds like you may be here in the US. Just about a thousand bucks.
And another question I had – this may be in-depth, but does Analyze make use of multicore processing such as eight- to 16-core machines; if so, is this configurable to adjust the number of threads allowed for Analyze to use? Great question. Yes. Griffeye will leverage all of your cores and distribute the workload and all the different tasks and processes across all of your cores if you have that sort of setup. It’s actually the ideal setup. It will also take advantage, obviously, of all your RAM at this point. Not necessarily or not really leveraging GPUs yet, but the hope and plan is perhaps to do that in the future. But yes, with multicores, most definitely, because we are running several … multiple processes every time we’re processing a case. So, I hope that answers that question.
Are there any other questions that I missed? If it was early, early on, I probably missed it as the stream of chat messages came through. So please feel free to put the question back up if you don’t mind, just so I can see it. Otherwise, I think that may be it.
Free trials – okay, here’s another question. I see free trials on the website, how to go about a point of contact to test the software? I am a point of contact, but the best thing to do … I’ll put my point of contact information up in a second. But I wanted to bring up this page … one second.
[Everyone sees] the Griffeye page … when you register, you will have access to a free trial. So, you’ll want to go ahead and register here. And you should be good to go.
That’s the best step. Once you’re in, we get … let me show you what the My Griffeye portal looks like. Bear with me one second while I bring up this web page.
So once you’re in here … this is the Griffeye portal, and I’ve logged in with my login, and now on the top right, you’ve got Products. If I click on Products, [01:19:48] trial … all that registration page will do is get you an account, with your My Griffeye account. Make sure you use your agency email address. But you’ll notice you can either obtain a trial, which will be a DI Pro trial, or a DI Core, a license, both of which show no cost. [01:20:10] trial. You can add that to your cart. Let’s say, for example, the DI Core. Add to your cart, cart out, and then you’ll get an email with that license, and it should be good to go.
There was a question here – can Griffeye work with [X-Ways] or [Lace]? I believe … I know that [X-Ways] will support the export that you can ingest into Griffeye. You can also, like I said, take the raw images – go ahead and put those in. There’s also [Lace]. And that one I’m not too sure about, but we’ve … yeah, I don’t want to give you bad information. So, I’m not sure. Griffeye – the question here is does it require an internet connection terminal? It does not require an internet connection. We do have an online activation process for your license, but you also have an offline method. And you don’t need to use any of the internet-related features if you don’t … you don’t need one. You don’t need an internet connection to run it.
There’s a question here – I noticed in the case I worked yesterday, there are a few identical photos that are 90-degree rotations of one another, but they were not linked in Griffeye as visually similar. Is this just an artefact of PhotoDNA or is there a manual way to link them?
When they’re rotated like that, PhotoDNA will not stack them. That’s actually one of the limitations. The other limitation is 50x50 pixels – if it’s too small, it’s not going to have enough to compare against. And if there is enough of a change, like say someone tried to obscure their face with a black box, then that won’t stack them in PhotoDNA. Which … visually similar … it should be able to find it, even with the rotation. But if you want to link them, you can just put them under a common bookmark or … but there’s no way for you to manually stack them if they haven’t been stacked for you. I hope that works. I hope that answers your question.
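The rotation limitation is easy to reproduce with any grid-based perceptual hash. PhotoDNA's algorithm is proprietary, so the sketch below uses a simple average hash as a stand-in: each pixel becomes one bit depending on whether it is brighter than the image mean. Because the bits are tied to pixel positions, rotating the image by 90 degrees moves them around and the hashes stop matching. The helper that tries all four rotations is an illustrative workaround for square grids, not a feature of the software.

```python
from typing import List, Sequence, Tuple

Grid = Sequence[Sequence[int]]  # small square grayscale image


def average_hash(pixels: Grid) -> Tuple[int, ...]:
    """One bit per pixel: 1 if brighter than the image mean, else 0."""
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if v > mean else 0 for v in flat)


def rotate90(pixels: Grid) -> List[List[int]]:
    """Rotate a grid 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance_over_rotations(ref: Grid, candidate: Grid) -> int:
    """Smallest hash distance between ref and the candidate at 0, 90, 180 and 270 degrees."""
    ref_hash = average_hash(ref)
    best = len(ref_hash)
    rotated = [list(row) for row in candidate]
    for _ in range(4):
        best = min(best, hamming(ref_hash, average_hash(rotated)))
        rotated = rotate90(rotated)
    return best
```

With a 4x4 image whose left half is bright, the hash of the image and the hash of its 90-degree rotation differ in half of their bits, while the rotation-aware comparison brings the distance back to zero.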
How long does the free Pro trial last? It should be 30 days. Yeah, one month. One month for the trial. What I did was I brought up the trial license that I could also add to the cart, and just check out, and it says subscription length is one month.
Alright. I think that rounds out the questions here. If there are no other questions, we’ll go ahead and … bring up one thing.
Some of you wanted my contact information. So, I’ll go ahead and put that up on the screen for you. And with that, I guess this …
This will conclude today’s webinar and WebEx. I truly thank everyone for joining us today. And I have my contact information up on the screen in case you’d like to … in case any other questions come up along the way. Everyone, have a great day. It’s amazing to see the turnout that we have. It’s all across the world, all different times of the day, and I truly appreciate your time today. Thank you very much.
End of Transcript