Raahat, tell us a bit about yourself and your background. What does a day in your life look like?
Well, I’m a PhD research scholar, currently working at Panjab University, Chandigarh, India. I’m actually writing my thesis these days! I have an academic background (bachelor’s, master’s, PhD… the usual). A typical day in my life consists of research, more research, writing about the research, some more research, and talking to other colleagues about their research (boring… I know!). I do, however, enjoy several other non-research-related pursuits, but I guess I’ll mention those in the response to the final question.
What was it that first sparked your interest in digital forensics?
I happened upon this research area purely by chance. I was looking for an interesting research problem to take up for my PhD and I started reading the literature pertaining to a research field called video content analysis. That particular research area basically deals with automatic analysis of videos to detect and determine temporal and spatial events; it includes functionalities like motion detection, flame and smoke detection, shape recognition, face recognition, number plate detection, object tracking, and such. It was through the study of that literature that I came across the topic of video tamper and forgery detection, and once I did, I never moved on to anything else. I decided then and there that this was the research field I was looking for all along.
You're passionate about video forensics in particular: why is this such an important area of investigation?
As mediums of information, videos are especially privileged. They provide an unmitigated eyewitness account of an event, and it is this power of description that leads us to use them in an evidentiary capacity for making numerous critical decisions and judgments with long-term consequences in highly sensitive areas like politics, journalism, civil and criminal investigations and litigation, surveillance, and military and intelligence operations. Now, in all these areas, video evidence is considered particularly persuasive, which, as it turns out, is not without its dangers, because as influential as it may be, video evidence is neither self-proving nor necessarily true.
Video data, especially digital video data, is inherently prone to conscious semantic manipulations; by altering a handful of pixels, one can essentially alter the version of “reality” presented by a video. So, to use video evidence safely and with reasonable conviction, it becomes essential to first ascertain that the video has not undergone any content modification that could have altered the meaning it conveys and turned it into an inaccurate representation of the event of interest. In a situation where reliance on video evidence is unavoidable, dependence on untrustworthy evidence could be detrimental.
You've recently done some research into detecting copy-paste forgeries in digital videos. Tell us a bit about the aims of the project and your methodology.
Copy-paste forgeries are the kinds of forgeries that involve insertion or removal of an object or objects into or from a set of video frames. Copy-paste forgeries alter the information presented by the video scene, which has a direct effect on our basic understanding of what that scene represents, and so, from a forensic standpoint, the challenge of detecting such forgeries is especially significant.
Now, forgery or tamper detection is an intricate challenge, and if subjective inspection by a human expert fails to do the job, we have to rely on certain specialized forensic schemes that are designed to detect the various inconsistencies and abnormalities that every content modification operation inevitably introduces in the underlying characteristics of the video in question. Basically, everything that happens to a video once it has been generated leaves a unique mark or trace on it. Every trace is like a fingerprint of a particular content modification operation; we detect the trace and it leads us back to that specific operation.
My most recent research has resulted in the development of three specialized copy-paste detection schemes. The first scheme uses Sensor Pattern Noise (SPN) as a forensic feature (‘forensic features’ are those features of the digital content that exhibit the effects of content modifications; effects of the modification usually manifest as abnormalities or variations in the normal behavior or patterns of these features).
SPN is the unique uncorrelated pattern of noise that every digital recording device introduces in every image or video that it records. This noise is the result of inhomogeneity of the imaging sensors (such as charge coupled device (CCD) sensors or complementary metal oxide semiconductor (CMOS) sensors) used in the recording device. These inhomogeneities arise due to different sensitivities of the silicon wafers used in the imaging sensors.
SPN not only varies from camera to camera but also follows a consistent pattern over every single image or video frame recorded by a particular camera. Now, when video frames undergo copy-paste forgery, the SPN patterns in those frames, which were previously quite uniform, start to exhibit certain inconsistencies. These inconsistencies can be detected if we examine the correlation between SPN patterns in spatially similar frame-blocks in temporally adjacent frames.
The reason for these inconsistencies is that after an object or region is removed from a particular frame, the resultant missing regions or holes in the affected frames need to be filled in, in a visually plausible manner. This is done with the help of certain inpainting techniques, which use the most coherent regions from neighboring frames or regions of the same frame to fill in the missing regions, so as to maintain temporal and spatial consistency within the forged video. This implies that videos that suffer from such forgeries exhibit unnatural similarities or correlations between regions of successive video frames or regions of the same frame. Such similarities are not present among those frame-regions that have not undergone any kind of tampering. So, by identifying abnormal SPN correlations among different frames or frame-regions, tampered frame-regions can be distinguished from authentic (untouched) frame-regions. This final distinction is made with the help of a Gaussian Mixture Model (GMM).
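As a rough illustration of the block-correlation idea described above (a sketch under stated assumptions, not the scheme developed in this research), the Python snippet below extracts a noise residual from two synthetic frames and correlates co-located blocks. The box-filter denoiser, the 16-pixel block size, the synthetic frames, and the fixed 0.8 threshold are all assumptions made for the example; in particular, the threshold merely stands in for the GMM-based decision mentioned in the interview.

```python
import numpy as np

def noise_residual(frame, k=3):
    """Residual = frame minus a k x k box-filtered copy (a crude stand-in denoiser)."""
    pad = k // 2
    padded = np.pad(frame.astype(np.float64), pad, mode="reflect")
    smooth = np.zeros(frame.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            smooth += padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    return frame - smooth / (k * k)

def block_correlations(res_a, res_b, block=16):
    """Normalized correlation between co-located blocks of two residuals."""
    rows, cols = res_a.shape[0] // block, res_a.shape[1] // block
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            a = res_a[r * block:(r + 1) * block, c * block:(c + 1) * block].ravel()
            b = res_b[r * block:(r + 1) * block, c * block:(c + 1) * block].ravel()
            a = a - a.mean()
            b = b - b.mean()
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            out[r, c] = (a @ b) / denom if denom > 0 else 0.0
    return out

# Synthetic data (assumption): a flat scene, a weak fixed sensor pattern shared
# by both frames, and stronger independent noise in each frame.
rng = np.random.default_rng(0)
spn = rng.normal(0.0, 1.0, (64, 64))
frame_a = 128 + spn + rng.normal(0.0, 3.0, (64, 64))
frame_b = 128 + spn + rng.normal(0.0, 3.0, (64, 64))

# Simulate a hole in frame_b that was filled with pixels from frame_a.
frame_b[16:32, 16:32] = frame_a[16:32, 16:32]

corr = block_correlations(noise_residual(frame_a), noise_residual(frame_b))
suspect = np.argwhere(corr > 0.8)  # illustrative threshold, not a GMM
print(suspect)
```

Untouched blocks share only the weak sensor pattern, so their correlation stays low, while the filled-in block correlates almost perfectly with its source; that abnormal similarity is what flags it.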
The second scheme uses abnormalities in interpolation artifacts to identify copy-pasted regions in video frames. Interpolation is the process whereby new pixel values are estimated with the help of known values of surrounding pixels.
During the life cycle of a digital video, interpolation occurs either during the video acquisition process, or when the video frames undergo any geometric transformation such as scaling, translation, or rotation. Interpolation artifacts are also known as ‘demosaicing artifacts’ or ‘Color Filter Array (CFA) artifacts’. During the acquisition of a color frame (with three color-channels − red, green, and blue), the light entering the lens of the acquisition device is filtered by a CFA before reaching the CCD or CMOS sensor. Therefore, for each pixel location, only one particular color is gathered and the missing pixel values for the three color-layers are then obtained by applying an interpolation process known as demosaicing (hence the name ‘demosaicing’ or ‘CFA artifacts’).
Basically, a video frame acquired using a digital camera, in the absence of any post-processing operation, exhibits uniform demosaicing artifacts on every group of pixels. In a frame with tampered regions however, these artifacts will be markedly inconsistent. By analyzing local inconsistencies in these artifacts, copy-pasted regions can be identified.
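To make the idea of uniform versus inconsistent demosaicing artifacts concrete, here is a simplified Python sketch (an illustration under stated assumptions, not the scheme developed in this research). It models only the green channel of a Bayer-style CFA with bilinear interpolation, and measures per block how differently the "acquired" and "interpolated" pixel lattices respond to a four-neighbor predictor. The helper names, the synthetic data, and the imbalance score are all hypothetical choices for the example.

```python
import numpy as np

def four_neighbor_mean(img):
    """Mean of the up, down, left, and right neighbors of every pixel."""
    p = np.pad(img, 1, mode="reflect")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0

def green_lattice(shape):
    """True where the (assumed) CFA samples green: (row + col) is even."""
    yy, xx = np.mgrid[0:shape[0], 0:shape[1]]
    return (yy + xx) % 2 == 0

def lattice_imbalance(img, block=16):
    """Per-block score: near 1 when one lattice is perfectly predictable from
    its neighbors (intact demosaicing trace), near 0 when both behave alike."""
    acquired = green_lattice(img.shape)
    err = np.abs(img - four_neighbor_mean(img))
    rows, cols = img.shape[0] // block, img.shape[1] // block
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            sl = (slice(r * block, (r + 1) * block),
                  slice(c * block, (c + 1) * block))
            e_acq = err[sl][acquired[sl]].mean()
            e_int = err[sl][~acquired[sl]].mean()
            out[r, c] = abs(e_acq - e_int) / (e_acq + e_int + 1e-12)
    return out

# Synthetic "camera" frame (assumption): keep sensor samples on the green
# lattice and bilinearly interpolate the missing positions.
rng = np.random.default_rng(1)
raw = rng.normal(128.0, 20.0, (64, 64))
frame = np.where(green_lattice(raw.shape), raw, four_neighbor_mean(raw))

# Paste in a region that never went through this demosaicing step.
frame[16:32, 16:32] = rng.normal(128.0, 20.0, (16, 16))

imbalance = lattice_imbalance(frame)
print(np.round(imbalance, 2))
```

In this toy setup the pasted block is the only one whose score drops toward zero, which mirrors the local inconsistency described above. Real cameras use more elaborate demosaicing and compression, so a practical detector would need a more robust feature.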
The third scheme is a simplistic Hausdorff distance based clustering scheme where identical pixels (i.e., pixels that are copied from one frame-region and pasted onto another region) are clustered into groups, thereby separating them from the normal groups of pixels within the tampered frames.
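The interview does not spell out how the clusters are formed, so the following Python sketch only illustrates the underlying measure: the symmetric Hausdorff distance between two point sets, applied here to frame blocks represented as (row offset, column offset, intensity) points. The block representation, the block size, and the near-zero matching tolerance are assumptions made for the example.

```python
import numpy as np

def hausdorff(a, b):
    """Symmetric Hausdorff distance between point sets a (n, d) and b (m, d)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def block_points(frame, r, c, block=8):
    """Represent one block as points (row offset, col offset, intensity)."""
    dy, dx = np.mgrid[0:block, 0:block]
    vals = frame[r * block:(r + 1) * block, c * block:(c + 1) * block]
    return np.stack([dy.ravel(), dx.ravel(), vals.ravel()], axis=1).astype(np.float64)

# Synthetic frame (assumption) with one block copied onto another location.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(32, 32)).astype(np.float64)
frame[16:24, 24:32] = frame[0:8, 0:8]

# Group block pairs whose Hausdorff distance is (numerically) zero.
ids = [(r, c) for r in range(4) for c in range(4)]
matches = [(p, q) for i, p in enumerate(ids) for q in ids[i + 1:]
           if hausdorff(block_points(frame, *p), block_points(frame, *q)) < 1e-9]
print(matches)
```

Identical blocks sit at distance zero from each other while unrelated blocks do not, so the copied source and destination fall into the same group. A real forgery would rarely be a bit-exact, block-aligned copy, which is why a practical scheme would need a tolerance and a proper clustering step.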
Aside from these copy-paste detection schemes, I have previously developed forensic solutions for the detection of other kinds of video forgeries, namely inter-frame forgeries (which involve manipulation of the arrangement of frames in a video, such as frame-insertion, frame-removal, or frame-replication), and upscale-crop forgeries (which entail cropping the frames of a video to eliminate some incriminating evidence at the extremities of said frames, and then enlarging the affected frames so as to maintain consistent resolution across the entire video).
How common are digital video forgeries, and can you give us some examples of when they might occur?
In the past, because of the inherent complexity of video data processing and a general lack of readily available high-tech video processing tools, we had fewer reservations about accepting videos as unbiased and truthful representations of reality. This, however, is no longer the case.
There have been some documented instances of footage tampering in the past, such as the instances of missing footage witnessed in the case of the July 2005 police shooting of Jean Charles de Menezes in London and again in the 2013 Kendrick Johnson murder case in Georgia, US. Then there were the infamous cases of footage tampering that came to light during the trial of Srebrenican war criminal Radovan Karadžić in 2006 and again in the Sandra Bland custodial death case in 2015. Now, in all these and several other cases, there is an evident lack of experience on the part of the forgers, but this inexperience neither precludes nor undermines the very tangible threat that content tampering poses in our society today. After all, the forgers can always grow more clever, and the forgeries more insidious and inconspicuous. It’s always a good idea to be several steps ahead of the enemy, isn’t it?
What conclusions did you come to during the research project?
During this, and all my previous research projects, the first conclusion I came to is that regardless of what has been achieved, there is loads more left to accomplish. Like Einstein once said, “In theory, theory and practice are the same. In practice, they are not.” In the field of video tamper detection, we may have come a long way over the last two decades in theory, but in practice, we still have a lot of milestones to reach.
Another conclusion (of sorts) that I always come to is an even deeper understanding of the fact that video tamper detection, like any endeavor of discovery and deduction, is equal parts science and art. The trace evidence generated by a content modification operation requires scientific quantification, but the interpretation of what the presence (or absence) of this evidence means in the given context requires the precision and receptiveness of an artist.
Are you working on any new projects at the moment?
I am not working on any new projects at the moment. I have recently finished developing a comprehensive video content authentication framework, which was the task I set out to do for my PhD; I’m currently writing the thesis.
Finally, when you're not working, what do you enjoy doing in your spare time?
In my spare time, whenever I can find any, I basically just listen to music, read a book, paint or sketch, play my guitar a little bit, or just binge-watch a good TV show. Anything in the crime/thriller/detective fiction/police procedural drama category would do!
Raahat Devender Singh is a Research Scholar at Panjab University, Chandigarh. You can find her online course, Digital Video Forensics: Uncovering the Truth in a World of Distorted Realities, on eForensics Magazine.