by David Spreadborough, Amped Software
To help in setting the scene, let me tell you a story.
A young guy decided to steal a charity box from a post office. The theft itself could not be seen on the video, and the only CCTV of him running out of the store was from behind, so you could not see what was in front of him. He was picked up again on street surveillance and was seen to run into an alleyway that was poorly lit. He exited seconds later with nothing in his arms.
A few minutes later he was stopped in the street by local police officers and one went into the alleyway where they found the stolen container on the floor.
During interviews, he stated that he had nothing to do with the theft and the item was already in the alleyway when he got there. At this point, there was no actual proof that he committed the theft.
This is where forensic video analysis comes in, and the use of techniques to give us clean snapshots of moments in time without video noise or other moving objects and then mix those snapshots to only visualize the differences.
We will come back to this case a little later but first, let us understand the first technique, frame averaging.
This is a mathematical process to find the average value in a set of pixels. This is extremely powerful as it assists us in removing noise as well as undesired moving objects (e.g. mosquitos) in a video.
Let us say you had a greyscale video of a static vehicle and you captured 6 frames. There would be noise in each frame, and that noise would be of different intensity and in different locations. It would be random. The vehicle would look patterned or speckled with the noise. Frame averaging would compute the mean value of every pixel across the 6 frames so the vehicle is ‘smooth’ once again.
In our vehicle example, the luminance values at an individual pixel on each of the 6 frames may be: 181, 187, 189, 175, 167, 181. Remember, we have 256 possible values between black and white (0-255).
We now need to add our values together and then divide them by the number of values (6).
The resulting pixel now has a value of 180.
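For those who like to see the numbers, that single-pixel average can be sketched in a few lines of Python:

```python
# Luminance values (0-255) for the same pixel across six frames
samples = [181, 187, 189, 175, 167, 181]

# The averaged pixel is the mean, rounded to the nearest whole value
average = round(sum(samples) / len(samples))
print(average)  # 180
```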
Frame averaging is a ‘many to one’ process. Many frames go in, and a single image comes out.
Thanks to the wonder of powerful computers, it is possible to crunch millions and millions of values to achieve our result.
In this shot of a height chart, the top image shows a single frame with the signal and compression noise.
At the bottom, we have an average of 382 frames, with the random digital noise removed. It is now much easier to see the edges of the height chart blocks.
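As an illustration of the principle (not the Amped FIVE implementation), here is a minimal NumPy sketch using synthetic greyscale frames: a static scene corrupted by random noise, averaged into a single image.

```python
import numpy as np

# Synthetic example: 382 greyscale frames of a static scene (height x width),
# each corrupted by random sensor noise
rng = np.random.default_rng(0)
clean = np.full((240, 320), 128.0)                  # the true static scene
frames = np.clip(clean + rng.normal(0, 10, (382, 240, 320)), 0, 255)

# Frame averaging: many frames in, one image out
averaged = frames.mean(axis=0)

# The random noise in the average shrinks dramatically, so edges
# and fine detail become far easier to see
print(frames[0].std(), averaged.std())
```

Averaging N frames reduces random noise by roughly the square root of N, which is why averaging hundreds of frames gives such a clean result.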
How about a bigger example…
You record 10 minutes of footage of a street scene on a static camera. At 25 frames per second, that is 15,000 frames. As this is color, we have values for the Red, the Green, and the Blue, so that is 45,000 values for each pixel position across the footage!
If a person walked through the scene, a vehicle drove by, or a bird flew through, frame averaging would remove them, and your result would be a nice, clean shot of the area in which the impact of noise and moving objects is dramatically reduced.
The power of digital image processing means that we can now also stabilize objects to hold them in the same location before frame averaging. The results can be utterly amazing; this is one of the most powerful weapons in a forensic video analyst’s arsenal.
Frame averaging is not just a tool for forensic image restoration and enhancement. It is used a lot in landscape and astrophotography too.
So, now that we know we can remove noise and even moving objects using frame averaging, what about identifying the differences between two moments in time?
Well, let us return briefly to our opening case. We have a static camera recording the area of the alleyway. We have a period of time before the male ran in and a period of time after he came out.
The area is at the far end of the camera’s field of view, it is dark, and the video has signal and compression noise.
What we must ask therefore is: are there any differences in the floor area after the male exits that would suggest that an item is present that was not there before? We must be mindful of objects and noise in the video and we must not misinterpret noise as a static object.
The process itself is quite simple but before we look at it practically, let me go through it in stages.
- We select a series of frames, let us say 50, which is our ‘before’ instance. We process these for any color and light enhancements and then we frame average them together to create a base ‘before’ image.
- We do the same, with the same number of frames after the incident.
- We mix the two images and use a blending option called ‘Absolute Difference’. We therefore only see the differences between the two moments in time. Random differences such as movements or noise are not seen as these have been averaged out.
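The three steps above can be sketched in Python with NumPy (the ‘before’ and ‘after’ arrays here are toy stand-ins for the two averaged snapshots):

```python
import numpy as np

def absolute_difference(before: np.ndarray, after: np.ndarray) -> np.ndarray:
    """Blend two images so that only their differences remain visible."""
    # Work in a signed type so the subtraction cannot wrap around
    return np.abs(before.astype(np.int16) - after.astype(np.int16)).astype(np.uint8)

# Toy averaged snapshots: a dark, static 'floor' area (uint8 greyscale)
before = np.full((100, 100), 40, dtype=np.uint8)
after = before.copy()
after[60:80, 30:50] = 90   # an item appears between the two moments in time

diff = absolute_difference(before, after)
# Unchanged pixels are black (0); the new item shows as a bright region
print(diff.max())  # 50
```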
The resulting image would reveal an area of luminance on the floor of the alleyway – an item was not there when he went in but was there when he came out.
I have used this process many times on a variety of case types and have often found it to be the only way to ‘see’ something!
If we had only used a single frame from before and a single frame from after, the movement and noise in the video may have masked the static objects we were trying to reveal.
To complete this tutorial, we can look at this test example. The difference here is quite large for demonstration purposes, but the process would identify a change in a single pixel.
In Amped FIVE, we can utilize our video in many ways without the need to make various copies or clips. In the test example, I have 5 chains in the History panel, all with the same video as the start point.
Full = The entire video
Before = 101 frames before an item is removed from the bookshelf
After = 101 frames after an item is removed from the bookshelf
Average Mix = A blend between the two frame average results
Single Frame Mix = A blend between a single frame before, and a single frame after
After loading the video, the next thing is a basic Levels adjustment to increase the light. I could have used one of several different filters, such as Exposure, or Retinex, but the choice will depend on the video being used and the issues encountered.
In this example, I then selected 101 frames in the Range Selector filter.
Finally, we frame average!
This has removed all the noise from the image, along with any objects appearing temporarily, such as a moth flying through!
This is our ‘Before’ chain.
I can copy this chain, rename the copy as ‘After’, and then simply change my Range parameters to select 101 frames after something has occurred.
I now have my clean time snapshots in ‘Before’, and ‘After’ chains.
Now it’s time to utilize the Video Mixer, found under the Link category of filters.
After selecting our Input chains, the Before and After, we can choose the mixing Mode. I want Absolute Difference.
Any pixels that are the same will be black. Any differences will be brighter. Sometimes the differences are very subtle and to see these, the filter has a gain control slider to help in visualizing the differences.
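Conceptually, a gain control like this is a multiply-and-clip on the difference image. Here is a hedged sketch of that idea (not Amped FIVE’s internal code):

```python
import numpy as np

def apply_gain(diff: np.ndarray, gain: float) -> np.ndarray:
    """Brighten a difference image so subtle changes become visible."""
    return np.clip(diff.astype(np.float64) * gain, 0, 255).astype(np.uint8)

# A subtle difference of only 4 grey levels is almost invisible...
subtle = np.zeros((50, 50), dtype=np.uint8)
subtle[20:25, 20:25] = 4

# ...but a 20x gain turns it into an obvious bright patch
boosted = apply_gain(subtle, 20.0)
print(boosted.max())  # 80
```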
The time differences are understandable as the clock was moving on the date overlay. There is light change on an object attached to the fridge, but the most visible change is in the bookshelf – the area where an item was removed.
To see what the image would be like without the frame averaging first, we can complete another mix but only select a single frame and not our averages. I have zoomed in a little to see all the noise.
In two recent cases where I used this exact process, the object to be visualized was only 4-5 pixels in size. In both instances, if I had not frame averaged, the objects would have been the same size as video and compression noise and as such would not have been seen!
The process of creating clean time snapshots and visualizing the difference between the two is simple and quick to do in Amped FIVE. It is extremely powerful and impactful, and really does help you to see through the noise to identify small changes over time.