How To Do Context Analysis of Digital Images in Amped Authenticate

By Marco Fontani, Forensics Director at Amped Software

For several years, researchers and practitioners in image and video processing have been warning that “seeing is no longer believing”. Image and video manipulation techniques have become powerful and widely available, making the presumed objectivity of pictures an outdated concept.

The recent Russian invasion of Ukraine that shattered Eastern Europe is also making it evident that today’s conflicts are fought not only with weapons, but also with misleading information. Deepfakes and other kinds of manipulated media often go viral before anyone can check their reliability.

Still, experts have often been able to debunk fake content, and quite often they didn’t achieve that by detecting traces of manipulation; rather, they resorted to so-called “Context Analysis”. But what is Context Analysis? How can Amped Authenticate help with that? This article deals with these questions and will hopefully broaden your view of what image authentication is.

Image Authentication Is Not Just “Finding Manipulated Pixels”

Image authentication deals with the task of establishing whether a given picture is a true and accurate representation of what it purports to be. Authenticating an image includes several steps, which are very well explained in the recent Image Authentication Best Practice Manual (IA-BPM) created by the European Network of Forensic Science Institutes (ENFSI) [1].




While many equate image authentication with the idea of localizing manipulated pixels within an image, “Local Analysis” is just one part of the process, as shown in the diagram below, taken from the IA-BPM.

Figure 1: Illustration of methods for digital image authentication (from the ENFSI Image Authentication BPM [1])

The complete authentication workflow includes many steps, for example, examining the digital file before even reaching the pixels: this is referred to as “Auxiliary Data Analysis” in the BPM, and it includes analysis of external digital context data, file structure analysis, and embedded metadata analysis.

Then we turn to “Image Content Analysis”, where we find “Analysis of Visual Content”, “Global Analysis”, and as anticipated, Local Analysis. The diagram then contains a “Strategy” section where Context Analysis is presented. Citing the BPM, “Context analysis aims to discover all possible elements which are inconsistent with the temporal and geographical context along with technical acquisition and encoding context.”

That sounds very interesting, doesn’t it? Let’s dive deeper into Context Analysis.

Context Analysis Within Amped Authenticate

The fact that an image has not been tampered with at the pixel level does not imply it is authentic. For example, if someone uses a picture of a hostage to prove they are alive, it becomes very important to assess the reliability of image metadata, especially time. Another example: a picture is used to show that soldiers are entering a city. It will be important to check the geographical consistency of what is shown in the image against the declared location.

As you may have guessed already, Context Analysis can be targeted from two different perspectives:

  • We may look at the properties of the digital file (to reveal, for example, if the alleged source device is not consistent with how information is stored). This relates to checking the “technical acquisition and encoding context”, as recommended in the BPM.
  • We may cross-check what is shown in the visual content of the image and in its metadata against external references. This is more related to “discover elements which are inconsistent with the temporal and geographical context”.

Let’s see how we can target both tasks with Amped Authenticate.

Checking the Technical Acquisition and Encoding Context

Although most images are nowadays captured and stored in a few common standard formats (principally JPEG and HEIF), acquisition devices still offer plenty of customization possibilities. Some notable examples are:

  • The file name and location on the device;
  • The ordering, structure, and presence of some data within the file (e.g., presence of metadata, thumbnail image, preview image, etc.);
  • For JPEG images, the possible use of a customized JPEG Quantization Table instead of the standard ones.

So if an image is allegedly captured with an iPhone X smartphone and has not been modified in any way, we would expect it to have a name like “IMG_NNNN” (the standard for Apple devices) and to contain an embedded thumbnail image. Also, if the image is in JPEG format, we would expect it to use one of Apple’s customized JPEG Quantization Tables.
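To give a concrete feel for what “file structure” means here, below is a minimal sketch (in Python, purely for illustration, and assuming a well-formed file) of a walker that lists the marker segments of a JPEG file. Tools like Amped Authenticate perform a far more thorough parsing, but the principle is the same: the sequence of segments is itself a fingerprint of the encoder.

```python
import struct

def list_jpeg_segments(data: bytes):
    """Walk the JPEG marker structure and return (marker, length) pairs.

    Minimal sketch: it stops at the entropy-coded image data and does not
    handle every corner case that real-world files may present.
    """
    assert data[:2] == b"\xff\xd8", "not a JPEG (missing SOI marker)"
    segments = []
    offset = 2
    while offset < len(data) - 1:
        if data[offset] != 0xFF:
            break  # reached entropy-coded data or a malformed area
        marker = data[offset + 1]
        if marker == 0xD9:  # EOI: end of image
            segments.append(("EOI", 0))
            break
        # Segment length is big-endian and includes the two length bytes
        length = struct.unpack(">H", data[offset + 2:offset + 4])[0]
        segments.append((f"0x{marker:02X}", length))
        offset += 2 + length
    return segments
```

Comparing the segment list of an evidence file against those of reference images from the same camera model quickly exposes structural anomalies, such as an extra or missing APP segment.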

For example, this is a sample captured with an iPhone 11 Pro.

It is indeed named IMG_7947.jpg and, once loaded in Authenticate, we can see that it has all the typical properties described above (consistent metadata, Apple-specific quantization tables).

But more generally, how could we check the file structure and metadata for a given picture? The best way to go is to obtain some reference images, that is, images that we know were taken with a device of the same model and, possibly, running the same firmware/operating system.

How can we obtain such reference images? Either we get hold of a physical device and capture them ourselves, or we use content available on the web. Amped Authenticate features two tools for finding reference images on the web:

  • Search Images From Same Camera Model (Flickr)… 
  • Search Images From Same Camera Model (CameraForensics)…

Both tools have the same objective: locate images on the web to be used for comparison. The particularity of Flickr is that it can be used for free and lets you customize the search up to a certain point. For example, you can exclude from the search images that contain an editing software name in their Exif Software metadata. 

Once you hit “Search”, Authenticate will look on Flickr to find suitable images. This may take a while since the search is done online. The results are presented in a table, where you can select images and download them with a right-click.

On the other hand, the CameraForensics tool is more powerful since it leverages the CameraForensics.com database, but of course, you’ll need an active subscription with the third-party website. As you can see below, you can even restrict the search to images having the same JPEG Quantization Table, and once you hit “Search”, the results are returned instantly.

Once we have our reference imagery, we can load each reference image in the Reference panel of Amped Authenticate, and compare the JPEG Structure, Exif, and JPEG QT filters output side by side.

For example, the JPEG Structure comparison reveals that everything looks similar except for the presence of an extra image inside our evidence file.

Clicking on Preview, you’ll notice the evidence file indeed contains a depth image, which is captured with the infrared camera capabilities of the iPhone 11 Pro.

If all these checks went smoothly, we are now more confident about the technical acquisition and encoding context part of the Context Analysis.

Of course, don’t forget to ask yourself even more basic questions. For example, the above image is dated March 2017, and it says it was captured with an iPhone 11 Pro. Did that model exist at that time? If it wasn’t released yet, we may have a problem! And the same for the Exif Software field, which in Apple devices corresponds to the installed iOS version: was it released at the declared acquisition time?
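This sanity check is trivial to automate. The sketch below uses a small, illustrative table of release dates; before relying on such a check in a real case, always verify the dates against an authoritative source.

```python
from datetime import date

# Illustrative reference table: verify these dates independently before use.
RELEASE_DATES = {
    "iPhone X": date(2017, 11, 3),
    "iPhone 11 Pro": date(2019, 9, 20),
}

def model_existed(model: str, exif_date: date) -> bool:
    """Return True if the claimed camera model was already released
    on the date declared in the image's Exif metadata."""
    released = RELEASE_DATES.get(model)
    if released is None:
        raise KeyError(f"no reference release date for {model!r}")
    return exif_date >= released
```

For our example, an image dated March 2017 claiming to come from an iPhone 11 Pro would immediately fail this check, since that model was released in 2019.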

Checking the Temporal and Geographical Context

When images are used as proof of something, it’s usually important to know when and where they were originally captured. This information is often available in Exif metadata. Let’s take this image as an example:

After selecting the Exif filter, the image reveals that metadata contains both the (declared) GPS position and acquisition time for this shot.

However, you probably already know that this kind of metadata is easily editable (a hex editor is more than enough to do that) while leaving virtually no trace.
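To appreciate how fragile these fields are, consider that Exif dates are stored as fixed-width ASCII strings (“YYYY:MM:DD HH:MM:SS”), so a same-length byte substitution, which is exactly what a hex editor does, leaves the rest of the file untouched. A deliberately naive sketch:

```python
def patch_exif_date(data: bytes, old: str, new: str) -> bytes:
    """Overwrite an Exif ASCII date in place, as a hex editor would.

    Because Exif dates are fixed-width, a same-length substitution keeps
    every offset in the file unchanged, leaving virtually no trace.
    """
    old_b, new_b = old.encode("ascii"), new.encode("ascii")
    assert len(old_b) == len(new_b), "replacement must keep the same length"
    return data.replace(old_b, new_b)
```

This is, of course, shown only to make the point: a metadata field on its own is a claim, not evidence, which is precisely why cross-checking against external sources matters.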

What we can do, however, is to resort to external information sources to check the geographical and temporal context. In Amped Authenticate, this is as simple as clicking on the Tools menu, which reveals two useful functions:

Show Image Location on Google Maps is a quite self-explanatory tool: it will pick up GPS coordinates from Exif metadata and show the corresponding location on Google Maps using your default browser:
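Under the hood, this only requires converting the Exif degrees/minutes/seconds notation into decimal coordinates and building a Maps URL. A minimal sketch (the sample coordinates below are purely illustrative):

```python
def dms_to_decimal(degrees: float, minutes: float, seconds: float, ref: str) -> float:
    """Convert Exif-style degrees/minutes/seconds plus a hemisphere
    reference ("N"/"S"/"E"/"W") to a signed decimal coordinate."""
    value = degrees + minutes / 60 + seconds / 3600
    return -value if ref in ("S", "W") else value

def maps_url(lat_dms, lat_ref, lon_dms, lon_ref) -> str:
    """Build a Google Maps URL from Exif-style GPS coordinates."""
    lat = dms_to_decimal(*lat_dms, lat_ref)
    lon = dms_to_decimal(*lon_dms, lon_ref)
    return f"https://www.google.com/maps?q={lat:.6f},{lon:.6f}"
```

For example, `maps_url((43, 46, 12.0), "N", (11, 15, 0.0), "E")` produces a link to a point near Florence.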

When it’s available, you could then turn to Google Street View to view pictures taken in nearby places. In our case, this reveals that several pictures are available for that location:

Visiting a few of these pictures reveals a quite consistent geographical context:

Remember that the GPS location is not necessarily accurate: it may be off by several meters (or even completely wrong if the device is experiencing issues, or counter-forensics measures are being taken).

Based on the above, it seems that the geographical information in the image is reliable. Now what about the temporal information? The Exif “CreateDate” field reads 2017:03:17, 18:56:10. How can we check whether we should trust this piece of data?

Here the Check Sun Position for Image Location and Date on Suncalc.org tool looks promising! Let’s first have a look at how Suncalc.org presents the information for a typical image:

Now let’s use the tool for our evidence image, and we’ll be sent again to the default system browser, this time on a page where the image’s GPS position and date are automatically configured. As you can see below, on the date and time declared in the “CreateDate” metadata the Sun had already set!

This is clearly in contrast with the visual content of the image, where the Sun is still up! Having been warned by this potential inconsistency, we decide to take a deeper look at Exif metadata to compare the various time fields. We see that all the time metadata provided by the device read 18:56, but the time provided by the GPS clock reads 15:56.

Before jumping to conclusions, remember that the device time is usually compensated for the time zone, while GPS time follows UTC, so we first need to compensate for the time zone before comparing these values. As shown on https://www.timeanddate.com, the time zone at the image location on March 17th is UTC+1. This means that the GPS clock says the picture was taken at 16:56, not 18:56 as declared in image metadata. Which one should we trust?
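The time-zone compensation itself is a one-liner with Python’s standard library. The sketch below reproduces the arithmetic for our example (GPS clock at 15:56 UTC, scene time zone UTC+1):

```python
from datetime import datetime, timedelta, timezone

def gps_to_local(gps_dt_utc: datetime, utc_offset_hours: float) -> datetime:
    """Shift a naive GPS (UTC) timestamp into the local time zone
    of the scene, given its UTC offset in hours."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return gps_dt_utc.replace(tzinfo=timezone.utc).astimezone(local_tz)

# GPS clock reads 15:56:10 UTC; Florence in March is UTC+1
local_time = gps_to_local(datetime(2017, 3, 17, 15, 56, 10), 1)
```

With these values, `local_time` comes out as 16:56:10 local, two hours earlier than the 18:56 declared by the device.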

Let’s put 16:56 on the same Suncalc.org page we had before (you can just drag the yellow circle on the top bar to set the hour).

We now see that the Sun had not set yet at that time. We can also compare the direction of sunlight and the direction of shadows in the image. Suncalc even offers an additional tool, which allows you to calculate and draw on the map the direction and length of an object’s shadow, provided you know its real height.
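The geometry behind the shadow tool is elementary trigonometry: on flat ground, a vertical object of height h casts a shadow of length h / tan(elevation), where elevation is the Sun’s angle above the horizon. A minimal sketch:

```python
import math

def shadow_length(object_height_m: float, sun_elevation_deg: float) -> float:
    """Length (in meters) of the shadow cast by a vertical object of
    known height, given the Sun's elevation above the horizon.
    Assumes flat, horizontal ground."""
    if not 0 < sun_elevation_deg <= 90:
        raise ValueError("the Sun must be above the horizon")
    return object_height_m / math.tan(math.radians(sun_elevation_deg))
```

For example, with the Sun at 45° of elevation a 2 m pole casts a 2 m shadow; the lower the Sun, the longer the shadow, which is why shadow lengths constrain the time of day so effectively.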

We can thus cross-check shadow directions and lengths in the image with those provided by the website, and we see that 16:56 is a much more credible acquisition time for the evidence image. This leads us to hypothesize that, for some innocent or malicious reason, the clock of the device which captured the photo was set to a different time than the real one, or image metadata were altered after the acquisition. Luckily for us, the GPS clock information is retrieved directly from the GPS and was not affected (or, if the time was forged, the forger forgot to change the GPS clock data).

Now, all this Sun-related analysis helps us check the reliability of the time of day, but it doesn’t really ensure much about the specific date: if the picture was actually taken a few days before or after, the Sun’s position would change very little, perhaps too little to be noticeable with the above comparison.

You may then consider using weather information as a (weak) verification technique: if the image shows a crystal-clear sky, while archived weather data report a storm on that day, something must be wrong! For example, weatherarchive.com reports that on March 17, 2017, there was no rain reported in Florence, which is consistent with the visual content of the image.

Before concluding, one more note: besides the technical and geographical/temporal context, one more element that may need investigation is the publishing context and the possible repurposing of an existing image. When possible, it is usually a good idea to check whether the investigated image can be found on the web, and gather more information based on where, when, and in which context it was published.

Although this process would deserve an entire article (or even book!), Amped Authenticate’s Search Similar Images on the Web tool is a good starting point.

It will ask you for permission to send the image pixels to Google Images, and it will show the search results in your default system browser.

Conclusion

The good and bad side of context analysis is that you can dive as deep as you wish (with the associated risk of going down the rabbit hole every time). The general idea is that, using the web and open-source intelligence techniques, you can often corroborate or disqualify the reliability of visual content in a way that would be hard to achieve by focusing on pixel analysis alone.

As we have seen, Amped Authenticate provides several tools that let you investigate the digital and physical context of an image effectively and quickly!

Bibliography

[1] European Network of Forensic Science Institutes, Best Practice Manual for Digital Image Authentication (ENFSI-BPM-DI-03), April 2021, available online: https://enfsi.eu/about-enfsi/structure/working-groups/documents-page/documents/best-practice-manuals/

Contact Amped Software to learn more.
