Martino: Good morning, good afternoon, good night, wherever you are. So, as Michelle said, there’s a lot of things. This is not our typical webinar, in the sense that it will not be too technical. I will not show any software demo or any new feature, but I will talk in general about some very important topics in our industry. If you follow our blog, you may already be familiar with some of the topics. Some others come from more recent studies that we have done.
In general, I think it could be interesting for newcomers to our field, but also offer a different perspective for experts. So, I hope you’ll find it interesting. On the list of participants I see a lot of well-known names! For the others, yeah, I’m Martino Jerian, CEO and founder of Amped Software. I’m an electronic engineer. I have past experience as a forensic expert and as a contract professor at various universities. I founded Amped Software in 2008, in Italy, and since 2019 we have had a subsidiary in the US. And what’s very important, in my opinion, and what underlies the overarching topic of this webinar and what we do in general, is the idea of justice through science.
So, a bit more information about us. This is a group photo of the people working with me, from last year’s meeting here in Trieste. There are some new colleagues as well, but I won’t Photoshop them into the photo! Just a quick summary of our company values, which I think are very important for the kind of work that we do.
First of all, we are completely focused on the best possible support to the customer. I think that having the best technology is useless if you don’t provide the best possible support. We have the most complete products on the market for digital media evidence, image enhancement, and authentication for forensic purposes. Our software has been widely adopted and accepted worldwide. It has been designed from the beginning for forensic applications. And we are also quite involved in the scientific and forensic community.
A couple of words about our users. We have users in more than one hundred countries. Between licenses and multiple seats of our software, we have more than 4,500 licenses and seats in more than 1,400 organizations. And in these years we have trained more than 3,000 people. So you have more or less an idea of where we are and how present we are in the field.
I think I’ll start with an interesting trend, since we are speaking about companies. I wanted to make a brief mention of the current industry landscape. As many of you probably know, there have been a lot of changes in the digital forensics industry. I think this picture on the right is pretty representative of the current situation! We’ve seen in the past years a lot of small companies acquired by big companies, and big companies acquired by huge companies or going public. And we now have fewer players, but bigger ones.
We are (by choice) one of the few independent companies left. We are 100% privately owned, no investors. And disclaimer: this 100% is me! So pretty easy. This allows keeping our entire focus on the customer and employee satisfaction. And this is, in my opinion, the most important thing for us to be fully aligned on this point.
So, very few words about our ecosystem of solutions. We are mostly known for Amped FIVE, which stands for Forensic Image and Video Enhancement, and which is a kind of Swiss army knife for anything you need during image and video investigations. We have more than 140 different tools for any need: conversion, presentation, enhancement, analysis, measurement, so you are likely familiar with it.
Then we have Amped Replay, which is the multi-tool just with the essential knives, those which are most often useful and which are not too sharp, not too dangerous! Because it’s a tool which is supposed to go into the hands also of people who don’t have a lot of technical knowledge about video evidence, but still have to deal with the same issues on videos.
Then we have DVRCONV, which, being an Italian company, I thought was pretty appropriate to present as a kind of pasta maker, which takes videos in proprietary formats and gives them the shape, or the format, that you need to make them playable on your system. And finally, Authenticate, which is a tool for image authentication, and we also provide some video tools. So, you see here a screenshot of our deep fake detection tool. And here you see, more or less, the level of competence that you need to use the various tools.
So, let’s go into the core of the webinar. I’ll start with some references about the power of video evidence. I always stress this in my presentations, but I think it’s always a current topic. So, let’s lay the foundation straight, for starters. What is forensic video analysis? “Forensic video analysis is the scientific examination, comparison and/or evaluation of video in legal matters.” There are different terms with slightly different nuances: forensic image analysis, video image forensics, multimedia forensics, digital multimedia evidence, video evidence. I mean, they are more or less slightly different aspects of the same topic or field, just so we make things a bit more confusing!
Why is it so helpful and important? If you think about the 5WH investigative model, images and videos can often answer all these questions: who, where, what, when, why, and how has a crime or offense been committed? If you think about other kinds of evidence, very few other types can give you such a complete picture, literally, of the crime scene.
This has been published on Police1 by a police department in the US and shows the amount, impressive amount, of video data during investigations in seven years. From 2013 to 2019, we see an increase of 125 times the amount of footage. And it is not just the quantity, it’s also the results that we can obtain. This is another very interesting study done by the British Transport Police.
Basically, they examined more than 250,000 cases, and they saw that, as written here, “CCTV was available to investigators in 45% of cases and judged to be useful in 29% (65% of cases in which it was available).” And so you see that in two-thirds of the cases where it is available, it’s of some value. There is no other kind of evidence which is so widely available and useful to investigations.
This is also another similar study, which shows that in 42 of the 44 homicide cases studied, CCTV helped in some capacity, much more than any other kind of evidence as you see here in the plot on the right. And very interestingly, also they mentioned the challenges that we will speak about during this webinar. And you can see here, “currently the integrity and provenance of CCTV evidence may easily be compromised by risky practices and decisions made around how footage is recovered, shared, viewed, interpreted, and packaged.” Very, very interesting study.
And we also have our survey from last year, the state of video forensics. And here, too, you see similar results. CCTV and DVR videos are, according to this survey, the most useful kind of evidence. Here we compare them with other kinds of evidence, both digital and physical.
So, all this power, but there are challenges. And here you see the outline, basically, of the very vast range of topics that I will cover in this webinar. So, video is still growing and we need to manage the amount of data in the right way. We can’t always trust photo and video evidence, especially nowadays with deep fakes and all kinds of crazy AI things!
Proprietary video formats coming from surveillance systems complicate investigations and introduce various risks, including cybersecurity risks. Then there is the big issue of image quality, and why forensic video enhancement is essential to truthfully reconstruct a fact. Then the impact of artificial intelligence on video forensics and the new AI processing technologies: how much can we trust them? And the challenges with the interpretation of video evidence.
Let’s start! So, video evidence is still growing. It was growing when I started, and it still is, every year. I just want to show you again what I showed a minute ago. We have data until 2019 here, but what has been happening after 2019? There has been COVID. So, I was very curious to discover what the impact has been. A few customers told me, “you know, we have less cases”. I thought, “really?”
So, let’s do a survey! And as you see here, it was very rare. For almost 60% of our respondents, casework increased during the pandemic, and it increased a little for about 24%. So, in three quarters of the cases it increased, and for the others it either decreased or remained constant. So, what was the reason for this increase? According to the various replies: an increase in crime, and a change in image and video quality, which means better quality.
We don’t just throw away the video because it’s completely useless. And various other things. Interestingly, we see a lot more devices, in a market which I thought was already quite saturated. Many mentioned cloud-connected cameras and doorbell cameras, like Nest, Ring and so on.
And we have this not so recent, but growing, trend to adopt what are called digital evidence management systems, DEMS, or in some places, DAMS, digital assets management systems. They are…I see a big variation. In some countries, they are virtually unheard of. In some countries, they are adopted centrally or by various police forces.
There is a very wide array of solutions, from companies very small to huge; there are cloud solutions and on-premises solutions. And they are key to properly collect, share and store evidence. About one year ago, I did a little poll on my LinkedIn profile. Interestingly (maybe the situation has changed since), the majority of respondents were still not using a real DEMS, just storing on local hard disks, NAS, DVDs. As regards DEMS, the adoption is mostly cloud, with slightly less than half on premises.
Still, they come with some challenges, of course. We must remember that DEMS are not analysis tools. In some instances, we have seen DEMS sold as the final solution for any need related to image and video evidence management. They are a great tool, but I heard a very rough comparison: they are the evidence locker, not the microscope, of, let’s say, traditional forensic evidence. So, let’s not confuse things.
Also, it’s very important to properly understand the relationship between DEMS and media. We need to understand whether, in the DEMS solution that we are choosing, file integrity is guaranteed, or whether files undergo some changes, which can be critical for the chain of custody. And whether and how proprietary video formats are handled, or whether the system just supports standard video formats, or is even mostly an image-specific DEMS.
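One simple way to check the file integrity point is to compare a cryptographic hash of the file before it goes into a system with a hash of the file that comes back out. The following is a minimal sketch, not a description of how any particular DEMS works: the choice of SHA-256, the helper names `sha256_of_file` and `same_content`, and the throwaway file names are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(original_path, retrieved_path):
    """True only if both files have the same SHA-256 digest."""
    return sha256_of_file(original_path) == sha256_of_file(retrieved_path)


# Throwaway files standing in for an exported video and two retrieved copies.
with tempfile.TemporaryDirectory() as folder:
    original = os.path.join(folder, "export.bin")
    untouched = os.path.join(folder, "copy.bin")
    altered = os.path.join(folder, "transcoded.bin")
    payload = bytes(range(256)) * 64
    for path, data in (
        (original, payload),
        (untouched, payload),
        (altered, payload + b"\x00"),  # a single extra byte simulates a change
    ):
        with open(path, "wb") as handle:
            handle.write(data)

    copy_is_intact = same_content(original, untouched)
    altered_is_intact = same_content(original, altered)

print("identical copy matches:", copy_is_intact)
print("altered copy matches:", altered_is_intact)
```

If the digest recorded at acquisition and the digest of the retrieved file differ, the file has changed somewhere along the way, even if it still plays and looks the same.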
Some of them offer public submission directly through a portal provided to citizens, and this is not so much a technical limitation of DEMS as, mostly, a question of policy. How much can we trust someone else to collect the evidence for us, rather than us, as experts or investigators, doing it? Also, about accurate video playback: many DEMS are provided as web applications. This means that most of the time video needs to be transcoded before being played.
And in a web environment it is not always easy to provide pixel-accurate playback, frame-by-frame seeking and things like that. For this reason, my recommendation is to also involve some video domain experts in the choice of a DEMS, not just people who are experts on IT topics. Someone who understands video and can validate the system and the requirements.
Okay, second topic: trusting what we see. I think most of you have seen this. Let’s speak about deep fakes. About one year ago (I think), towards the beginning of the war between Russia and Ukraine, this video of President Zelensky surfaced on social media, where basically he was telling his soldiers to surrender, but it was actually a deep fake, not done very well, but still problematic.
Another kind of deep fake: synthesized faces. Faces of people who do not exist. This has been a pretty popular LinkedIn post of mine. For a couple of years already I have routinely received, let’s say, contact requests on LinkedIn which ring a bell!
So, I put this lady’s picture into our deep fake detector in Amped Authenticate, and it clearly marked it as fake. Also, if you look at the profile, there are a few things which really don’t sound right. But it’s a reality. I mean, social media nowadays is full of bots and full of people that do not exist. And this is probably the most popular engine for creating these faces.
These people seem pretty legit, but they don’t exist. Normally, they still have some quirks. In this case, you see the earrings; other times you see other imperfections. And the problem is that they work very well. In this study, they did a survey to understand how real the fake faces produced by different networks from different years looked. And you see here in this plot on the right that, basically, since 2019 fake faces seem more real than real faces. Because they’re kind of average faces, normal, they normally don’t have anything notable. So, people trust them more than real images.
So this is a screenshot of our GAN deep fake detection in Authenticate. This is a picture of two colleagues. They are real, as you see. This is a GAN-generated image, and we also tried to put the same face over one of my colleagues, and it is still detected. So you see, for this kind of deep fake, we already have a solution which works with most of the current generators, but our research on other kinds of deep fakes has been going on for a few years, and we are still working on that.
However, the problem is subtler, in my opinion. It is not just about deep fakes, which are all over the place. It’s about normal images. This is a pretty recent case. Basically, there was a guy on Reddit claiming that pictures of the moon taken with some Samsung phones were basically created by adding detail learned from pictures of the moon used in training. This is actually something which has been going on for a few years online, on various forums, with different positions from Samsung and different people doing different tests.
In this case, the guy took a picture of the moon, made a very low resolution version of it, displayed it on a screen, and took a picture of the screen showing the low resolution moon, and the system added more detail than was actually in the displayed picture! And, as usual, there is this site which I like a lot, How-To Geek, which gives a pretty nice and simple reply. I want to read it with you. They say, “the image then runs through an AI detail enhancement engine to eliminate noise in the photos and enhance details.
Since the AI model was trained on real photos of the moon, it can fill in details that may not be visible in the raw data.” So, it’s doing something. It’s not just processing what you have there. It’s putting something in, maybe, and they add, “that’s how modern smartphones capture all photos, not just the photos of the moon. The AI replacement is just more aggressive with moon photos on Galaxy phones.” It can fill in details which were not there.
But that’s not all. AI video compression is coming. So this seems like the new revolution, and you already hear about heavy AI on cameras. So, the big question is, can we trust images and videos anymore? Eventually it’ll become quite rare to have images and videos which don’t have a measurable computational component, in my opinion. And it was my opinion seven years ago, I wrote a blog post on the topic, and it’s getting bigger and bigger.
We must say, though, that this is not a new issue. Even with traditional digital photos and videos, there’s a lot of processing and a risk of getting things wrong, as we will see in the last part of this presentation. So, nothing really changes, it may just get worse, and now more than ever technical preparation is key. And I want to give you a practical example. Maybe you’ve seen this video (I just put a frame here). This was a kind of public appearance of President Putin in a video.
The video which surfaced initially was quite low quality, and people were claiming that it was green screen, that it was fake, because you see very clearly in the video that the hand was passing through the microphone. So, “oh, this is clearly fake”. They found other kinds of evidence, but this was the main thing that people were noticing.
Then they found a higher quality video where basically it was clear that, I mean, the hand and the microphone relationship was fine and normal. So, this is one of the main examples of how a simple video compression can make things appear different than they are. This is now with the traditional compression and other artifacts from imaging.
It will be worse with AI everywhere, but it is not a new problem, really. So, I don’t think we need to say in the future, “we cannot accept any kind of evidence because it’s done in some part with AI”, but we will need to understand much better what we can trust and what we cannot, and always be alert about it.
Third topic: proprietary video: “it’s not just play it”. So, again, from our survey from the past year, we wanted to understand the frequency of different data types and data sources for video. And this is always the same. The vast majority of videos come from CCTV and DVR, followed by videos and images from mobile phones. The problem is that videos from CCTV (as most of you know) don’t come in an easy-to-play format, but as different kinds of files, with many different extensions, that need a specific player.
And there are many issues. Really, if you look around mailing lists, forums, and communities of people working on video evidence, nine questions out of ten are, “do you have the player for .XYZ format?” You see here a list (I won’t go through all of them) of challenges with proprietary formats, but in my opinion, the biggest are the IT security risk of looking for low quality players around the internet, and trying different tools, which often compromises the stability of a system with conflicting codecs and libraries.
Then the quality: can they provide the maximum quality during playback and export, or do they do something bad under the hood? And also how much time is needed for looking for a player and for verifying that it is working properly, because that’s not obvious. So this is a big issue. And, I mean, you are surely familiar with many of these; these are just a couple of screenshots of typical DVR players.
And if you don’t do the acquisition properly, for example, you do a screen capture, or you export without being careful about the video settings, you can get things like this: images with embedding, interlacing and compression artifacts, moiré effects from taking a picture of the screen. You see here wrong decoding artifacts. There is also a grammatical error in the name of the camera, but that’s another topic! Stuff like this.
I mean, pictures of a screen, videos of a screen, maybe sent through WhatsApp. This is not good evidence. I mean, the file is the evidence and it must be treated as such. The visual component is a part, it’s important, but it’s just a part. Every conversion, remember, may cause a loss of data. And essentially, working on a screenshot is like doing ballistic analysis from a photo of a bullet. It’s not the evidence, it’s a photo of the evidence. So, we have these effects if we don’t do it properly. Loss of file originality.
So, it’s a clear impact on the chain of custody. How much can we trust this if we don’t trust the starting point? Loss of quality? For sure. Every step will introduce quality degradation (it can be big or small), but maybe we are borderline with a license plate or a face, and we’ll lose the little that is left. And loss of metadata: information about timing, maybe GPS information, the camera name, which can be essential during an investigation.
So, to solve this, we have our Amped Conversion Engine, which allows us to automatically and accurately convert videos from the vast majority of proprietary formats. You may be familiar with it because we use it in Amped FIVE, DVRCONV and Replay, and we are actually working with some DEMS providers to include our conversion engine in their systems, which will help a lot to properly manage video evidence there.
Okay, so another of my favorite topics: video enhancement. Video enhancement, as I write here, is essential for the truth. Low image quality is often the main issue. Again, from our survey from the past year, you see that low image quality is by far the biggest issue, with 118 replies. Much more than proprietary CCTV video files, which is still a pretty big issue, and bigger than the amount of cases, interpretation and all the other issues. So, these are the typical unlucky cases that we have to deal with.
So, why is enhancement important for forensics? I want to give you a bit of background. A digital image, as I write here, is basically created by a sequence of physical and digital processes that create a representation of light information in a specific moment, in a specific place, as a sequence of 0s and 1s. The technical limitations of the systems will introduce some defects that will make the image different compared to the original scene, and often less intelligible or useful during investigations.
So, to give you an example, the real walls are usually straight, not curved, like in this case. So, it’s of fundamental importance to understand how these defects are introduced and in which order to correct them. This, to obtain a more accurate and faithful representation of the scene. This is the real reason why enhancement is important. Not just because we need a nice picture. Actually often the final picture would be overall worse, but we need to get a more realistic form of what we already have.
So, what are the requirements for evidentiary use? (This is in general about all kinds of evidence processing.) Accuracy: the methodologies should, as much as possible, be free from error and avoid or limit bias and other kinds of deviations. They should be repeatable: if I do the same analysis in one day, one week, one year, I should get the same result. And they should be reproducible by a third party with the proper competencies and tools. If we don’t respect these three principles, we cannot speak about scientific evidence processing. So, which kinds of algorithms are acceptable to fit into this framework?
First of all, they should be explainable and understandable by a competent operator, no black box. How often do we see in court the need to explain what you have done and how it works? You don’t need to know all the nitty-gritty details of an algorithm and its implementation, but you need to be able to explain, more or less, what it does. It’s very important. Otherwise, it’s black magic! They should be validated.
They should be accepted by the scientific community, or at least open to scrutiny in order to validate them. They should be deterministic in order to guarantee repeatability and reproducibility: they cannot change every time we run them. And to avoid bias, they should not rely on external data. They should be self-contained and work only with data coming from the input.
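The repeatability requirement can be checked mechanically: run the same process several times on the same input and confirm that the output is bit-for-bit identical. The sketch below is only an illustration under stated assumptions. The two toy processes (`stretch_contrast` and `add_random_noise`), the helper `is_repeatable`, and the sample pixel values are hypothetical and are not taken from any product.

```python
import hashlib
import random


def stretch_contrast(pixels):
    """Deterministic linear contrast stretch of 8-bit values to 0..255."""
    low, high = min(pixels), max(pixels)
    if high == low:
        return list(pixels)
    return [round((value - low) * 255 / (high - low)) for value in pixels]


def add_random_noise(pixels):
    """Non-deterministic process: unseeded random offsets on every call."""
    return [min(255, max(0, value + random.randint(-5, 5))) for value in pixels]


def fingerprint(pixels):
    """Hash the output so two runs can be compared exactly."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


def is_repeatable(process, pixels, runs=5):
    """True if every run of the process yields exactly the same output."""
    fingerprints = {fingerprint(process(pixels)) for _ in range(runs)}
    return len(fingerprints) == 1


sample = [52, 60, 61, 70, 90, 110, 128, 140]
print("contrast stretch repeatable:", is_repeatable(stretch_contrast, sample))
print("random noise repeatable:", is_repeatable(add_random_noise, sample))
```

A process that depends on unseeded randomness fails this check, which is one practical reading of "they cannot change every time we run them". Reproducibility by a third party additionally requires that the same tools and parameters are documented and available.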
So, this is about the single algorithms. What about multiple algorithms? If you have attended our training in the past, you know about the idea behind the image generation model. As I explained before, from the real scene to the actual final visualization of an image there are different steps which convert the light into, ultimately, a sequence of numbers. In each of these stages, the scene, the camera, the storage, and the final playback, there are some imperfections, like, for example, in the camera, as we’ve seen in the example, optical distortion.
There are different issues of color and intensity levels, interlacing, blurring. In the storage there may be compression. There are different effects deriving from different steps in a certain order. So what is important is to understand these effects, and invert them in the opposite order to that in which they were introduced. And there is an intuitive justification: when we dress, we first put on our socks, and then our shoes. When we undress, we first take off our shoes and then our socks! There is also a mathematical justification and explanation that we normally give in our training, but also from a pretty practical point of view it makes a lot of sense.
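One standard way to see the socks-and-shoes rule is that undoing a chain of invertible steps means applying the inverses in reverse order: if a defect f is applied first and a defect g second, the correction is f-inverse applied after g-inverse. The following toy sketch shows this with two made-up "defects" on pixel values, a gain and an offset. These functions are purely illustrative assumptions and are much simpler than real blur or perspective.

```python
def apply_gain(value):
    """First defect introduced: values are doubled."""
    return value * 2.0


def undo_gain(value):
    return value / 2.0


def add_offset(value):
    """Second defect introduced: a constant is added."""
    return value + 10.0


def undo_offset(value):
    return value - 10.0


scene = [0.0, 25.0, 50.0]

# Image formation: gain first, then offset.
observed = [add_offset(apply_gain(value)) for value in scene]

# Correct restoration: undo the last defect first, then the first one.
right_order = [undo_gain(undo_offset(value)) for value in observed]

# Wrong restoration: the same two corrections applied in the wrong order.
wrong_order = [undo_offset(undo_gain(value)) for value in observed]

print("scene:      ", scene)
print("observed:   ", observed)
print("right order:", right_order)
print("wrong order:", wrong_order)
```

Both restorations use exactly the same two corrections; only the order differs, and only the reversed order recovers the original values.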
And you can see a couple of examples. In this case, we have a boarding pass for a flight, affected by blur and perspective. We need to understand in which order perspective and blur were introduced when the image was created. And here you see that if you correct them in the wrong order, first the perspective and then the blur, you get this result. If you follow the right order, you see the picture is still not great, but you can see the name of the passenger, the flight and many other details. Another example is interlacing and stabilization. (This is actually a video, not a single frame. You see a strong effect of interlacing.)
If we apply the wrong order, first we stabilize and then we deinterlace, you get this. If you deinterlace and then stabilize, you get a much better image. And, something we also say often in the training, when an image is interlaced, deinterlacing is usually one of the very first things that you have to do, because in the image processing and creation chain interlacing is one of the last steps, let’s say. Another example: we need to isolate and undistort this building. In this case, if we first crop and then try to undistort, you get this. If you first apply the undistortion filter to the whole image (the undistortion works with an image whose distortion is at the center), and you correct and then crop, you see, you get the correct building.
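To make the interlacing point concrete: an interlaced frame stores two fields, captured at slightly different moments, on alternating rows. A moving object therefore appears in two positions within the same frame, which is why stabilizing before deinterlacing works on mixed data. The sketch below separates the two fields of a tiny made-up frame and rebuilds a full-height image from each by simple line doubling. Line doubling is just the simplest possible approach chosen for illustration, the helper names are hypothetical, and which field comes first in time depends on the source.

```python
def split_fields(frame):
    """Split an interlaced frame (a list of rows) into its two fields."""
    top_field = frame[0::2]     # rows 0, 2, 4, ...
    bottom_field = frame[1::2]  # rows 1, 3, 5, ...
    return top_field, bottom_field


def line_double(field):
    """Rebuild a full-height image from one field by repeating each row."""
    doubled = []
    for row in field:
        doubled.append(list(row))
        doubled.append(list(row))
    return doubled


# A bright object (value 9) that moved two columns between the two fields.
interlaced = [
    [0, 9, 9, 0, 0, 0],  # top field: object at columns 1-2
    [0, 0, 0, 9, 9, 0],  # bottom field: object at columns 3-4
    [0, 9, 9, 0, 0, 0],
    [0, 0, 0, 9, 9, 0],
]

top, bottom = split_fields(interlaced)
top_image = line_double(top)
bottom_image = line_double(bottom)

for name, image in (("top field", top_image), ("bottom field", bottom_image)):
    print(name)
    for row in image:
        print("  ", row)
```

Each rebuilt image shows the object in a single position, at the cost of half the vertical detail; the original frame shows a comb-like mix of both positions.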
So, what can we do? And this is something I always say (and it’s very important): with forensic image and video enhancement, we can reach excellent results, but only when the information is already present in the original data. If this information is there, but it’s kind of hidden by defects, then within certain limits we can attenuate these defects and show the actual information better. But only if it’s there. We cannot, and we should not, create new information. So, how successful is image and video enhancement in practice?
This is not from last year’s survey, but from the one in 2021. We asked our users how often they get different kinds of improvement. Overall they get good results in 27% of cases and partial results in 31%. It is a bit subjective what is a partial result and what is a good one, but as you see here, in almost 60% of cases they get something useful for their investigation. In other cases, there is no improvement. Maybe because we have the typical three-pixel white license plate, or very small faces. In other cases, they say, “maybe there is something, but I’m not able to get something out”. So, maybe with a bit of training or experience or help from colleagues, we can get something. Still, you see, very often it’s useful.
What are the most frequent quality issues? This is from internal data that we have. By far the biggest issue is compression. Even more than low resolution, which, you see, is the second. This is because, especially nowadays with full HD, 4K and higher cameras, we often have a lot of pixels, but then the values of these pixels are not good. We have big blocks of pixels with the same values, and a lot of artifacts. So, in my opinion, we should aim at better compression rather than adding pixel, pixel, pixel. And then follow low contrast, motion blur, and issues with the capture. A lot of ‘other’, because every time you see something different, but compression artifacts and low resolution are the biggest issues.
So, this is a case where there’s nothing to do. If you check here, the vertical resolution of this license plate is three pixels. Even with pen and paper, you cannot draw a number with three pixels. Another typical situation is a saturated license plate. Everything becomes white. Nothing to do here. No information to recover at all. Let’s see a few nicer examples. In this case, we have corrected the perspective and applied deblurring to make the license plate appear. In this case, we worked on a video, putting together exposure, stabilization and frame integration. A very typical workflow, but easy and successful.
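Frame integration works because random noise differs from frame to frame while the scene, once the frames are aligned, stays the same. Averaging N aligned frames keeps the scene and, for independent noise, should shrink the noise by roughly the square root of N. The following is a minimal simulation under those assumptions: a flat synthetic scene, Gaussian noise, perfectly aligned frames, and a fixed seed so the run is repeatable. The function names and numbers are illustrative only.

```python
import random


def integrate_frames(frames):
    """Average already-aligned frames pixel by pixel."""
    count = len(frames)
    return [sum(values) / count for values in zip(*frames)]


def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true scene."""
    squared = sum((e - t) ** 2 for e, t in zip(estimate, truth))
    return (squared / len(truth)) ** 0.5


rng = random.Random(2023)    # fixed seed so the simulation is repeatable
true_scene = [100.0] * 2000  # flat synthetic scene, 2000 pixels


def noisy_frame():
    """One simulated capture: the scene plus independent Gaussian noise."""
    return [value + rng.gauss(0.0, 20.0) for value in true_scene]


frames = [noisy_frame() for _ in range(16)]

single_error = rms_error(frames[0], true_scene)
integrated_error = rms_error(integrate_frames(frames), true_scene)

print("error of a single frame:     ", round(single_error, 2))
print("error after integrating 16:  ", round(integrated_error, 2))
```

Nothing new is invented here: the average only uses values that were actually recorded, which is why this kind of processing fits the "information must already be in the data" principle. In a real video, the frames must first be stabilized so that the same scene point lands on the same pixel.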
This is a kind of borderline image. You can read the license plate, not perfectly, but it is still a success. And this is an algorithm that we developed and published, and that I’m very proud of, which allows us to automatically align different perspectives of the same planar object, typically license plates, and it is a filter that you see in Amped FIVE as Perspective Stabilization, to be used together with Perspective Super Resolution. This is an example of a photo of a screen, which is very blurred. And with deblurring, we can read the words in the chat. Another nice example.
So, we spoke a bit about enhancement, but the most important question that I get asked all the time is this one: how can I justify to the court the fact that I processed an image used as evidence? People are very scared about this. And the reply is actually quite straightforward, and it summarizes what I said in the introduction: if we understand the defects and we correct them, we can obtain a more accurate representation of the scene of interest compared to the original image or video, which leads to an interesting observation, in my opinion.
Maybe the processed image is more authentic than the original one. I think this is good food for thought. You know, the classical definition of authentication is [inaudible] if an image represents what it purports to be. So, you see, a processed image in this case, if done with the proper workflow and the proper algorithms, is, in a sense, more authentic: it is not the original file, but it better represents the original scene. And here you can see our blog post, where I go into much more detail on this topic.
Another big topic: can we use AI for video forensics, AI processing? And I wanted to show you this tweet from Sam Altman. Sam Altman is the CEO of OpenAI, which is the company behind ChatGPT, which is everywhere nowadays, and even Sam Altman says very clearly (read it with me): “ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness.” And he goes further: “It’s a mistake to be relying on it for anything important right now. It’s a preview of progress.
We have lots of work to do on robustness and truthfulness.” And finally, “it does know a lot, but the danger is that it is confident and wrong a significant fraction of the time.” And this is speaking about text, but for images it is exactly the same. And also here: this is a blog post which is quite old, because Google was one of the first to work on images with AI, and everybody went crazy every time, “oh, finally we have CSI enhancement!” I say, “no, this is not an enhanced image! It’s creating a new image, taking two pixels as an example, based on the training dataset.”
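The reason an upscaler has to draw on something other than the input can be shown with a tiny example: reducing resolution is a many-to-one operation, so very different scenes can produce exactly the same low resolution pixels. The sketch below assumes a simple 2x2 block average as the downsampling model, which is a simplification chosen for illustration; the three tiny "scenes" are made up.

```python
def downsample_2x(image):
    """Halve the resolution by averaging each 2x2 block into one pixel."""
    rows = []
    for r in range(0, len(image), 2):
        row = []
        for c in range(0, len(image[0]), 2):
            block_sum = (
                image[r][c] + image[r][c + 1]
                + image[r + 1][c] + image[r + 1][c + 1]
            )
            row.append(block_sum / 4)
        rows.append(row)
    return rows


# Three different high resolution scenes.
scene_a = [[0, 200], [200, 0]]      # one diagonal
scene_b = [[200, 0], [0, 200]]      # the opposite diagonal
scene_c = [[100, 100], [100, 100]]  # flat gray

low_a = downsample_2x(scene_a)
low_b = downsample_2x(scene_b)
low_c = downsample_2x(scene_c)

print("low resolution of a:", low_a)
print("low resolution of b:", low_b)
print("low resolution of c:", low_c)
```

Given only the low resolution pixel, nothing in the data says which of the three scenes was in front of the camera. Any method that outputs one specific detailed version has to take that choice from elsewhere, and in an AI upscaler that "elsewhere" is the training dataset.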
What are the issues with AI enhancement? Remember, these are the principles we need to follow in choosing an algorithm that is good for forensics. And basically the issues are here: explainable. Most of the current AI systems cannot be explained. We just feed in some training data, saying that some input should give some output, and then it trains, and we cannot really explain very well how it works. And also they depend a lot on the data used for the training.
So, depending on how I train the machine, I can get different results. And it’s been shown to introduce a lot of bias, both at the algorithmic level and at the human level during interpretation. So, essentially, I trained my network, and we did a lot of experimentation and also published a paper on the topic. I trained this network to do AI upsampling. In some cases it works nicely. In this case, this is Will Smith. Looks not bad. This is Angelina Jolie after AI enhancement. In one case it’s decent, in the other it doesn’t work very well. Why? We don’t know. It’s the way it is.
This is an image of Morgan Freeman. You see the original image in high resolution. You see traditional bicubic upsampling, and on the right you see AI upsampling. If you don’t know him, you see this image and it is amazing; starting from a low resolution image, it’s very nice. But would you trust using something like this for comparison? As Sam Altman put it, it’s wrong but confident! This seems very legit, but if you look at the eyes, they are quite different. Look at the hair; it removes the earrings. This stuff, in my opinion, is very dangerous. Technologically, it’s amazing, but for forensics…
And I’ll show you another practical example. You may remember this from last year, the slap of Will Smith to Chris Rock. There was a kind of conspiracy theory going around on the internet. Some people zoomed in on the frame and said, “look, Chris Rock seems to be wearing a pad. Maybe it was staged, and he put on a pad to protect himself because he was expecting this slap.”
Actually, it has been shown (you can read more at this link) that basically they took a frame of the video and enhanced it with an AI enhancement app called Remedy, and they were able to reproduce this result. So, this is an artifact of the processing. Imagine if this was actual evidence, not just gossip or conspiracy theories! So, there are many examples in real life, right? Of things that can go wrong. So, I wrote a quite long and referenced blog post on the topic. It would need an entire presentation to go through it. But I want to put my conclusions in brief, in this table.
I divide applications between evidentiary use and investigative use. Evidentiary use, of course, means that you use the created image as evidence. Investigative use means that you just use it as a lead to advance the investigation, and you do not use it in court. And I also distinguish between enhancement, when from an image you use AI to produce another image (usually of better quality), and analysis, when you need to get some decision from an image. So, from an input image you get some decision. Kind of facial identification, CSC detection picture, motion detection, event recognition, stuff like that.
So, my assessment is that we should not use AI-based enhancement for images to be used as evidence. As regards investigations, you can probably use it, but the images should be clearly marked as not to become evidence because, you know, the boundaries are sometimes quite blurred between having something for investigation and it becoming evidence. And we should educate the users that they can be misled by a wrongly enhanced image, which looks very nice. As regards analysis, AI can be used for evidence, but only for decision support. I wouldn’t trust an AI system to definitely say, “it’s the same person in these two pictures.”
I can get help to kind of triage a lot of suspects, but the final decision should belong to the analyst. For investigation, it is the same idea: decision support only. Know the reliability. We must say, “okay, how often is this algorithm wrong and right? In which cases? What’s the reliability in this case?” And also limit bias, because if I get an outcome from a specific system, I can discard other possibilities. So I should adopt bias mitigation techniques.
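The question “how often is this algorithm wrong and right, and in which cases?” can be made concrete with a small measurement. What follows is only a minimal sketch, not a method described in the webinar: it assumes you have run a system on test cases with known answers and recorded, for each case, the confidence it reported and whether it was correct. The function name, the confidence bands and the sample outcomes are all hypothetical.

```python
def reliability_by_confidence(results, bin_edges=(0.0, 0.5, 0.9, 1.0)):
    """Summarize how often a system is right within each confidence band.

    results: iterable of (confidence, is_correct) pairs, obtained by running
    the system on test cases whose true answer is known.
    """
    results = list(results)
    last = len(bin_edges) - 2
    summary = []
    for i in range(len(bin_edges) - 1):
        low, high = bin_edges[i], bin_edges[i + 1]
        # The last band includes its upper edge so that 1.0 is counted.
        in_band = [ok for conf, ok in results
                   if low <= conf < high or (i == last and conf == high)]
        accuracy = sum(in_band) / len(in_band) if in_band else None
        summary.append({"band": (low, high),
                        "cases": len(in_band),
                        "accuracy": accuracy})
    return summary


# Made-up outcomes, for illustration only: (reported confidence, was it right?)
test_outcomes = [
    (0.99, True), (0.95, True), (0.93, False),
    (0.66, False), (0.60, True), (0.20, False),
]

report = reliability_by_confidence(test_outcomes)
for row in report:
    print(row)
```

If the accuracy in the highest band is clearly below the confidence the system reports, that is the “confident and wrong” behavior the analyst needs to know about before relying on the output.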
Okay. And this is an application we have been studying with AI: low quality license plate reading. So the idea is to take a low quality image of a license plate. From this, we obtain a better quality image, and then we try to read it, putting together possible readings, both from the original quality image and from the enhanced one. So, I’ll show an example of how the system works. This is the original image. Here you see the denoised image. And then from here we propose combinations with their probabilities.
So, if you see here, the first letter is an A, and it is marked blue, because it matches the actual ground truth; it’s correct. And you see the probability of being correct is 0.99 something. In this case, despite the image being quite low quality, the network has read all the letters properly. Let’s see a more difficult case. Again, here you see the original image and the processed image. Also, graphically we could obtain better results. We had actually trained a more aggressive network as well, which was giving perfect letters, but again, as Sam Altman said, it was a misleading impression of greatness, because the results were actually worse.
It looked like you were getting perfect numbers, but more often than in this case, they were not the right numbers. And here you see the results. You see it got the right numbers here. The fourth number, basically, is read as a zero, with a probability of 0.66. The actual number is 7, which has been detected with a very low probability. So, you see here it got it wrong, but it wasn’t sure about it anyway. In this case, also, the second option was the correct one, which was detected with a very low probability.
Still, this was a bit more worrying, because it said, “okay, with very high probability, 0.93”. Still not as high as the others, but it was looking pretty confident. Anyway, you see that by putting together the various combinations, it’s a great tool for triage. Again, using this automatic system directly as evidence is not a great idea! But it can be very, very useful to test the different combinations. And we have been publishing the results of this. This is the paper that you can go and read if you’re interested in it.
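The idea of putting together the various combinations for triage can be sketched in a few lines. This is an illustrative sketch only and not the system described in the paper: it assumes the reader outputs, for each character position, a few candidate characters with a probability each, and it treats the positions as independent (a simplifying assumption). The function name and the sample probabilities are hypothetical.

```python
from itertools import product
from math import prod


def rank_plate_candidates(position_probs, top_n=5):
    """Rank full-plate readings by joint probability.

    position_probs: one dict per character position, mapping each candidate
    character to its probability. Positions are assumed independent, so the
    score of a full reading is the product of its per-character values.
    """
    options = [list(p.items()) for p in position_probs]
    candidates = []
    for combo in product(*options):
        text = "".join(ch for ch, _ in combo)
        score = prod(p for _, p in combo)
        candidates.append((text, score))
    # Highest joint probability first.
    candidates.sort(key=lambda c: c[1], reverse=True)
    return candidates[:top_n]


# Hypothetical per-position outputs for a three-character fragment.
example = [
    {"A": 0.99, "4": 0.01},
    {"B": 0.93, "8": 0.07},
    {"0": 0.66, "7": 0.20, "9": 0.14},
]

ranked = rank_plate_candidates(example, top_n=3)
for text, score in ranked:
    print(text, round(score, 4))
```

The output is a ranked list of readings to check against other information, not a single answer. Keeping the lower-ranked readings matters because, as in the example from the slides, the correct character can be the second option.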
And we approach the last main topic of this presentation, about video interpretation. What we see can be misleading. So, this is probably the biggest challenge of our field. Video evidence seems easy. Everybody can do it! Everybody can watch a movie. Everybody takes pictures of their vacations, but it’s actually easy to get it wrong. So, anybody can view a video. It doesn’t mean that anybody can analyze and understand it properly. Why? There are two reasons. The limitations of the technology (the imaging technology, the compression issues, the optical issues) and the limitations and the bias of the human visual system, which is not just the eye, but especially the brain. The tricks that our brain applies to what the eye sees. And I’ll show a few examples from our blog series, “video evidence pitfalls”.
Probably one of the biggest ones, and a very, very common one, is infrared. From infrared footage, you cannot get colors. Most cameras, most surveillance cameras, when it’s night, if there is not enough light, switch to infrared, and from the image on the right, you cannot get the colors of the image on the left, simply because it is not visible color. It’s infrared light, and it’s a completely different kind of signal. You see here another example, a T-shirt. Again, in infrared, what you see changing are the different fabrics, not the changes of color in the same fabric that you would expect. Aspect ratio: this is also easy to miss.
Most surveillance systems, if the footage is not exported and handled correctly, will, kind of, present an incorrect aspect ratio, the relation between vertical and horizontal proportions. So, as in this case, you can mistake a limousine for a van, or vice versa, if you’re not careful and you don’t know how to do it. We have quite a few blog posts on aspect ratio on our blog as well. It’s a topic which seems easy, but there are a lot of complexities behind it.
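The arithmetic behind an aspect ratio correction can be shown with a short sketch. It is only a simplified illustration, assuming the intended display aspect ratio of the system is already known from its documentation or from a reference recording; finding that ratio is the hard part and is not covered here. The function names and the 704x288 frame with a 4:3 display ratio are hypothetical example values.

```python
from fractions import Fraction


def corrected_size(stored_width, stored_height, display_aspect):
    """Return the (width, height) at which a stored frame should be shown.

    The stored width is kept and only the height is recomputed, so that
    width / height matches the intended display aspect ratio.
    display_aspect is a (horizontal, vertical) pair such as (4, 3).
    """
    dar = Fraction(*display_aspect)
    return stored_width, round(stored_width / dar)


def pixel_aspect_ratio(stored_width, stored_height, display_aspect):
    """Return how wide each stored pixel is relative to its height."""
    dar = Fraction(*display_aspect)
    return dar / Fraction(stored_width, stored_height)


# Hypothetical example: a frame stored as 704x288, meant to be shown at 4:3.
size = corrected_size(704, 288, (4, 3))
par = pixel_aspect_ratio(704, 288, (4, 3))
print(size, par)
```

Viewing such a frame with square pixels would make every object look much wider than it is, which is how one vehicle type can be mistaken for another.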
Lens distortion. Different lenses, different optics, can change the shape of things a lot. They change the appearance of the face of a person, which is very important for face comparison. Perspective. Size comparison may be tricky. We should measure in the crime scene if possible.
Doing a simple comparison can be really misleading. You know, there are a lot of different optical illusions and tricks that you can find on the internet, and it’s important to understand that whenever possible you need to measure things. Again, color fidelity: it’s hard to evaluate from poor quality CCTV video. This is a case where the exact same subject appears completely different in two different cameras from the same system. And you see here, in one case it appears blue, in the other gray, but it’s the same person, the same clothes.
And then compression artifacts. Compression artifacts do two big things. They hide existing real details, and they add artifacts, which are kind of fake details caused by the compression. In this case, you see here, there is a very strong effect of compression. You can see, close to the wheel, something which may or may not be a scratch, or, yeah, most likely just an artifact of the compression. And then there is the big topic of bias. As humans, often we see what we want to see.
This is a very nice example in my opinion, because from the picture on the left we try to recover a license plate and working with the parameters, someone obtained this one. If you really want to see some numbers, you see here, probably, 5, 7, 1, maybe a 2 later. But if you look at the original image, these are not real digits. There is no information here.
You see the patterns all around the image, the artifacts of the deblurring, the ringing effect. You see this kind of pattern all around the image, and in this pattern, by chance, if you find the right parameters and you squint your eyes hard enough, you can see some numbers, but it’s just because you want to see them. And this is just one of many examples. I mean, people can count the number of persons in a car where there are just shadows from the compression. I mean, there are many, many cases. We need to be objective.
And so we reach the conclusion. And I want to draft a few final considerations, which I think can summarize more or less the topics of this presentation. Everyone is a stakeholder. So, the macro issue, in my opinion, is that we as forensic analysts are just some of the stakeholders. We are aware of the power and the challenges of image and video evidence, but most people are not.
And what about first responders, investigation chiefs in a police organization, judges, prosecutors, but even public opinion, journalists, laymen who may go on to serve on a jury? They understand that DNA, fingerprints, maybe mobile forensics need some experts. But video, I mean, why do you need specialized tools, why do you need competencies, experts? Because it’s a video. Everybody can watch it and understand it. And as you have seen in this presentation, it’s not so.
So, how can we improve the status quo for our industry? How to work correctly on image and video evidence? And I drafted here a few very, very high level topics. First of all, awareness and education. First of all, to solve an issue, we must realize that we have one, and we realize many others don’t! We need to educate all the stakeholders that you have seen on the previous presentation on the power and complexities of video evidence. We need to strive for originality and proper acquisition.
The evidence should be the original, acquired in the correct way, respecting the chain of custody. In this way we guarantee the best quality and less risk of the evidence being rejected in court. And then we need to authenticate and verify. We should always, always ask ourselves, “is this video that you received the original and unaltered? Is it a transcoding? Is it the subject of some tampering? Can we trust the media? Can we trust the source of this data?” And the key here is: verify. And we don’t need, in all cases, a kind of full authentication assessment, but at least in our workflow, always ask, “what if?”
And do the appropriate steps depending on the situation. And finally, the processing should be correct and transparent. We should follow a workflow which is scientific, correct, repeatable and reproducible. Following guidelines: there are a lot of guidelines in the industry. There are those from the [inaudible], from OSAC, from the UK FSR, and those from the MC. I mean, they are of very good quality and have been validated by different scientific and forensic communities. We should use them more because they are very important to ensure a consistent workflow across the same lab and across the industry.
And my personal contribution to the first point, to create awareness, is in a lot of activities I’ve been doing in the past couple of years. The biggest so far was this event that I organized at the European Parliament about raising awareness of the issues that I presented, of course at a much, much higher level than here. This is a picture also of myself with the President of the European Parliament. And there will be a follow-up event this year in May again. And I prepared a document and a dissemination plan with a set of principles.
I call it “Essential Concepts and Principles for the Use of Video Evidence for Public Safety and Criminal Justice”. And it has two sections: one is a higher level institutional section, and one is a more technical section, with very general recommendations about working in the proper way on video evidence. This is not a replacement for any guidelines. As I said, there are many of very high quality that go way more in depth than I could do! You can consider this a kind of advertisement for the actual guidelines from organizations like MC, [inaudible], and so on, that I mentioned before. And basically, it aims at raising awareness. It’s not a very long document, but I think it contains the most important points that we should be aware of to improve our industry.
So, if we follow these few main points that I mentioned so far in order to cope with the issues that I presented in this webinar, what can we hopefully obtain? First of all, improved safety and security. We streamline further processing across the industry, different labs, colleagues. And in my opinion, working correctly allows for solving more crimes, which is what we want! We improve justice.
If we use the original data, assess the authenticity and the integrity, process them in the right way, and stop taking pictures of a monitor with the DVR player on it, we can have fairer trials and fewer judicial errors. And overall, I mean, you can have it all. You get better efficiency. Doing things the right way is also faster because we have a more efficient workflow. That means less time spent by people reinventing the wheel, and hopefully fewer issues caused by mishandled evidence.
And that’s it on my side. I know it’s been very dense. If you want to reach out to me, very likely you have my email address already! But you can also reach me on LinkedIn, which I think is a good platform to stay in touch and update each other. If you need any logistic information about this webinar and the follow-up, you can reach our info email address. And if you have questions, I’m here looking at the chat and the Q&A. Thanks a lot for your time. I hope you enjoyed the presentation.