PDF Manipulated  

Fra93
(@fra93)
New Member

Hi everybody,

I need to understand whether some PDF files, sent to me by a customer, have been manipulated or not.

I looked at the metadata in Acrobat Professional and also extracted this information with ExifTool.

For two of these three PDF files, the creation date and the modification date are different.
Moreover, the creation date is later than the modification date.

Could this date problem be caused by saving the PDF file with the customer's password?

Are there other checks to understand whether these PDF files have been manipulated?

Thanks
Francesco

Posted : 21/08/2019 8:55 am
jaclaz
(@jaclaz)
Community Legend

Strange "creation" vs. "modified" dates are pretty much normal if the file has been copied across filesystems, see (only as an example, last two threads revolving around similar issues)
https://www.forensicfocus.com/Forums/viewtopic/t=17972/
https://www.forensicfocus.com/Forums/viewtopic/t=17992/

Assuming you are talking of the "file" dates.

The PDF format also has some internal dates; depending on how exactly the file has been edited and by which tool, you might be able to check those internal dates for modification, see
https://stackoverflow.com/questions/1664161/how-to-check-if-pdf-was-modified

But there are hundreds or thousands of PDF tools, many of which simply don't respect (or work around) the specifications, so YMMGV.

jaclaz

Posted : 22/08/2019 9:21 am
UnallocatedClusters
(@unallocatedclusters)
Senior Member

Hello,

I recommend opening each PDF file using OSForensics' File Viewer tool, and then "Extract All Text".

OSForensics will extract all available embedded text from your PDF files and let you visually review the extracted text for Dates and other important metadata.

For example, if one of your PDF files has an embedded JPG picture file, and the embedded JPG picture file has EXIF metadata, OSForensics will extract the embedded EXIF metadata dates as well as other embedded date values as per below

PDF files can contain embedded “XML” packets, which are designated by bracketed “xap” values. These “XML” packet metadata dates are not file system dates, but rather date values that are automatically embedded within PDF files by Adobe software, recording the dates and times files were added to a given PDF file.

An example of embedded PDF “XML” metadata date streams:
2014-02-02T19:43:17Z
2014-02-02T19:43:17Z
2014-02-02T19:43:17Z

PDF files can contain multiple XML streams and packets, which can be embedded and appended within a given PDF file at different dates and times.

If one adds, embeds, a new JPG file into an existing PDF file, Adobe will add new XML metadata "xap" metadata date values for the newly embedded JPG file in addition to the existing internal embedded "xap" metadata date values.

Typically at the very end of the embedded text within PDF files, one will see the traditional PDF file Creation and Modified dates (note these values do NOT have the "xap" value):

/CreationDate (D:20140203034317+08'00')
/ModDate (D:20140203034317+08'00')
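
As a rough illustration of how the two date notations relate, here is a minimal Python sketch that parses a PDF Info dictionary date of the form D:YYYYMMDDHHmmSS+HH'mm' and compares it with an ISO 8601 date of the kind found in the XMP packet. The function name parse_pdf_date is hypothetical, and treating a missing offset as UTC is an assumption made only for this sketch.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS followed by an optional Z or +HH'mm' / -HH'mm' offset.
# Everything after the year is optional; the colon after D is accepted as optional too.
_PDF_DATE = re.compile(
    r"^D:?(?P<year>\d{4})(?P<month>\d{2})?(?P<day>\d{2})?"
    r"(?P<hour>\d{2})?(?P<minute>\d{2})?(?P<second>\d{2})?"
    r"(?:(?P<sign>[+\-Z])(?:(?P<off_h>\d{2})'?(?:(?P<off_m>\d{2})'?)?)?)?$"
)


def parse_pdf_date(value):
    """Parse an Info dictionary date string into a timezone-aware datetime."""
    match = _PDF_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not a PDF date string: {value!r}")
    g = match.groupdict()
    if g["sign"] in (None, "Z"):
        # Assumption for this sketch: no offset given means UTC.
        tz = timezone.utc
    else:
        offset = timedelta(hours=int(g["off_h"] or 0), minutes=int(g["off_m"] or 0))
        tz = timezone(offset if g["sign"] == "+" else -offset)
    return datetime(
        int(g["year"]), int(g["month"] or 1), int(g["day"] or 1),
        int(g["hour"] or 0), int(g["minute"] or 0), int(g["second"] or 0),
        tzinfo=tz,
    )


info_date = parse_pdf_date("D:20140203034317+08'00'")
# replace() keeps this working on Python versions whose fromisoformat rejects "Z".
xmp_date = datetime.fromisoformat("2014-02-02T19:43:17Z".replace("Z", "+00:00"))
same_instant = info_date == xmp_date  # aware datetimes compare by instant
```

With the example values above, the Info dictionary date and the XMP date describe the same instant, only written in different time zones.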

Another embedded value to look at is the “/Producer” value, which shows what PDF generation engine version was used to create your PDF files.

For example a “/Producer” value of
560065007200730069006f006e002000310037002e0031002e0030002e003500370032> translates to “Corel PDF Engine Version 17”.

One can Google the release date of Corel's PDF Engine Version 17 to determine when the PDF file was generated (on or after the PDF engine release date), for example.
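
For anyone who wants to decode such a hex value by hand, here is a minimal Python sketch. It assumes the value is a PDF hex string holding UTF-16BE text (normally starting with the FEFF byte order mark). The function name is hypothetical, and the Latin-1 fallback is only a stand-in for PDFDocEncoding. The fragment quoted above has lost its beginning, so the sketch also assumes the cut fell just before a "00" byte and puts that byte back to realign the 16-bit units.

```python
def decode_pdf_hex_string(hex_text):
    """Decode a complete PDF hex string such as <FEFF0056...> into text."""
    digits = "".join(hex_text.strip().strip("<>").split())
    if len(digits) % 2:
        digits += "0"  # an odd final hex digit is read as if followed by 0
    raw = bytes.fromhex(digits)
    if raw.startswith(b"\xfe\xff"):
        # Byte order mark present: the rest is UTF-16 big-endian text.
        return raw[2:].decode("utf-16-be")
    # Assumption: Latin-1 as a rough substitute for PDFDocEncoding.
    return raw.decode("latin-1")


# The truncated fragment from the post, without the trailing ">".
fragment = "560065007200730069006f006e002000310037002e0031002e0030002e003500370032"

# Assumption: one leading "00" byte was cut off; restoring it realigns the text.
version_text = bytes.fromhex("00" + fragment).decode("utf-16-be")
# version_text is "Version 17.1.0.572"
```

Only the tail of the Producer value survives in the quoted fragment, so the sketch recovers the version part and nothing more.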

NOTE: I have no association with Passmark (maker of OSForensics) but I have found OSForensics' ability to "extract" embedded text from files (and then make that text searchable) invaluable for this type of forensic analysis.

Posted : 22/08/2019 10:26 pm
fissa
(@fissa)
New Member

Hi Jaclaz and unallocated, I find this topic very interesting since I have a case where I must examine a PDF document as well.
In my case the PDF was sent as an attachment to an email.
How do I extract and save the PDF on my system without changing the modified, created and last accessed dates? I thought I'd save the entire .msg including the PDF and add it as a single file in EnCase. Would this work?
Or shouldn't I bother, and just save the PDF on my system and look into it with the tools suggested above?
It's a fraud investigation, so no pictures in the document.

Thanks for the help. I'm new to PDF investigation.

With kind regards.

Posted : 24/08/2019 7:55 am
Fra93
(@fra93)
New Member

Hello everyone,
thanks for your feedback and sorry for the delay in responding.

Over the last few days I have done some further analysis.

To explain better: the PDFs that the client sent me are bank statements.

In one of them, the one with different modification and creation dates, I can't select the numbers in the PDF (credits and debits).

I used both OSForensics and Xpdf, and the result is the same: the numbers present are images.

However, in the metadata I extracted with both tools, I don't have the embedded PDF "XML" metadata.

Could someone explain to me why?

Posted : 30/08/2019 8:28 am
jaclaz
(@jaclaz)
Community Legend

More generally (besides and before any tampering detection), if you have a set of n documents with the same (exactly the same) origin, i.e. provided by the same party, through the same means, and automatically generated by the same software, and if any document n - m (with m greater than 0) differs from the others, you already have enough grounds to suspect something "q***r" happened.

Still, given the number of programs/tools/OS built-in provisions and what not that are "compatible" with PDF files, the "exactly" is an issue.

There is no doubt that either all documents sent from the bank are exactly the same format or from a given date onwards you have a change to a "new" format that however you can find in later documents (before another change).

But let's say that
1) periodically user "A" gets in the browser the link to the .pdf and proceeds to "Save as" to a given directory
2) one day user "B" (or user "A" after a vacation, or just distracted) instead opens the .pdf file and then proceeds to "print to .pdf"

The file in the latter case would be different, while still not having been tampered with at all.

AFAIK, the XML data (actually XMP, to be picky) can be inside the .pdf but does not have to be, i.e. some tools/applications add it, others don't, and there are also different versions
https://www.pdflib.com/pdf-knowledge-base/xmp/xmp-overview/
https://en.wikipedia.org/wiki/Extensible_Metadata_Platform

And (the matter is documented by Adobe), when there are two sources of metadata, tools should "choose" which one to "trust"
https://help.adobe.com/en_US/livecycle/11.0/Services/WS92d06802c76abadb-2460d90e12eb4e989f1-7ffe.2.html
and may also update the "wrong" one.

And, having said that the XMP fields are optional, the .pdf standard itself has provisions for Created and Modified dates in the Info dictionary, see the thread on stackoverflow I already linked to, and here
https://superuser.com/questions/161576/is-there-any-authoring-information-of-pdf
BUT again there are lots of tools around that produce .pdf files which are not fully compliant with the standards.

jaclaz

Posted : 30/08/2019 3:34 pm
Cults14
(@cults14)
Active Member

Also interesting for me right now for a case I'm working on (internal investigation)

I can see from PDF metadata that a small number of invoices "from a vendor" matches one previous quotation; title, producer, creation date, even visual layout. Invoices from the vendor before and after the suspect ones don't look the same and don't have the same metadata.

There are other elements that point the finger at unauthorised creation of these PDF invoices by one of our users. Is it possible to prove that the user actually created them? Or is the best we can do to show a pattern of usual events, evidence and behaviour vs what we think is unusual, i.e. circumstantial?

Cheers

Posted : 08/09/2019 10:19 pm
athulin
(@athulin)
Community Legend

I can see from PDF metadata that a small number of invoices "from a vendor" matches one previous quotation; title, producer, creation date, even visual layout. Invoices from the vendor before and after the suspect ones don't look the same and don't have the same metadata.

So? To my ears that sounds overly suspicious. You don't look at metadata to decide if a legal document has been faked.

An invoice can be delivered, lost (I once lost half a dozen invoices when my bag was lost at a railway station), copies requested (with new metadata, possibly), printed, sent in, rejected ('we must have PDF'), scanned on my office scanner (into a new PDF document), delivered and accepted and paid. None of that changed the legal document of the invoice (unless you have specific legislation and regulation or specific circumstances that do so, of course). Yes, the final PDF documents may look unusual, but then I don't lose my bag that often. The legal documents have not changed a bit: still the same amount to pay, still the same bank account to pay to, and no changes in any relevant fine print.

Any company can have temporary IT problems and have to fall back on alternate productions of invoices. Or they may even be trying out a new invoicing service.

It seems to me that the very first thing to check, if you fear that an invoice has been altered, is to verify it with its point of origin. Request a copy, not from the user but from the seller. If there's no difference in legal content, strange metadata are unlikely to be relevant, and are probably best explained by the user. If there is a difference, you may have something to investigate. (If your concerns are so large that you think this would alert someone, you (as a company) should probably talk to a lawyer or even law enforcement. The first stop for you, though, may be your boss.)

If you have auditors, this may be something best done by them.

There are other elements that point the finger at unauthorised creation of these PDF invoices by one of our users. Is it possible to prove that the user actually created them?

Anything is possible. PDF as a document format does not do it automatically, though. If you want that, you insist on digitally signed documents.

If you do have a concept of 'authorized creation of PDF', you probably have additional regulations that you are or should be the experts on yourself. If you have it, you should not need to look at metadata to determine if it has taken place or not.

Or is the best we can do to show a pattern of usual events, evidence and behaviour vs what we think is unusual?

That is a question that should be answered by you yourself or someone in your organization. As I don't know who 'we' are, I can only suggest that you identify the best person in your organization to talk to.

Unusual events are not necessarily fraudulent or have hostile intent. They're just unusual. Incidents are often best investigated from the standard legal principle 'innocent unless proven guilty'. Your company may have policies or guidelines on this – if not, go ask the right person for guidance.

Posted : 09/09/2019 8:27 am
Cults14
(@cults14)
Active Member

Athulin, thanks for your comments, I will PM you

Cheers

Posted : 09/09/2019 9:26 am
EugeneBelk
(@eugenebelk)
New Member

My two cents on the basis of my experience with Belkasoft. It can extract metadata from PDF files and file metadata from the corresponding file system. It is the analysis of this metadata taken together that makes it possible to determine whether any manipulations with PDF files have been made or not.

Posted : 10/09/2019 2:53 pm
Cults14
(@cults14)
Active Member

Thanks

Yes, but can we say for sure who manipulated it? UserA has 3 different versions of the same PDF in outgoing mail attachments, but that's the only place they appear "live". One version (the last one) exists in all Volume Shadow Copies

But there is no record of any application accessing the PDF anywhere, not even on external media or network shares

Cheers

Posted : 10/09/2019 3:06 pm
Cults14
(@cults14)
Active Member

Same subject but different, does anyone know of a tool that you can point at a bunch of PDFs and get a CSV or other report on all the metadata fields which you see in Properties of PDF documents in Reader?

It's the Date fields I'm after, BEC seems to do that for M$ Office docs but not PDF

Cheers

Posted : 16/09/2019 12:01 pm
jaclaz
(@jaclaz)
Community Legend

Same subject but different, does anyone know of a tool that you can point at a bunch of PDFs and get a CSV or other report on all the metadata fields which you see in Properties of PDF documents in Reader?

It's the Date fields I'm after, BEC seems to do that for M$ Office docs but not PDF

Cheers

Doesn't the "simple" exiftool provide what you need/want?

https://www.sno.phy.queensu.ca/~phil/exiftool/

https://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/PDF.html

Like many similar tools, the syntax (for anything more than "plain-plain") is a bit complex
https://sno.phy.queensu.ca/~phil/exiftool/exiftool_pod.html

but of course it can be managed with a little dedication/patience; simple use is simple ;) and there are examples.
https://owl.phy.queensu.ca/~phil/exiftool/examples.html
https://www.crossref.org/blog/exiftool/

Again, remember that metadata in info dictionary and XMP metadata are different sets.
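
For the kind of bulk date report asked about here, exiftool's own -csv option may already be enough. If more control is wanted, a minimal Python sketch could call exiftool with JSON output and flatten the result into CSV. It assumes exiftool is installed and on the PATH. The function names are hypothetical, and the tag names mentioned in the note below (such as PDF:CreateDate and XMP-xmp:ModifyDate) are illustrative assumptions about what the output might contain.

```python
import csv
import io
import json
import subprocess


def exiftool_command(paths):
    """Build the exiftool call: JSON output, duplicates kept, group names, date tags only."""
    return ["exiftool", "-j", "-a", "-G1", "-time:all", *paths]


def rows_to_csv(records):
    """Flatten a list of exiftool JSON records into CSV text, one column per tag."""
    columns = ["SourceFile"]
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow(record)
    return buffer.getvalue()


def date_report(paths):
    """Run exiftool on the given files and return the date tags as CSV text."""
    result = subprocess.run(
        exiftool_command(paths), capture_output=True, text=True, check=True
    )
    return rows_to_csv(json.loads(result.stdout))
```

Calling date_report(["a.pdf", "b.pdf"]) should give one row per file. Because -G1 keeps the group names, Info dictionary dates and XMP dates should land in separate columns (for instance PDF:CreateDate next to XMP-xmp:ModifyDate), which makes it easier to spot files where the two sets disagree.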

jaclaz

Posted : 16/09/2019 1:04 pm
Cults14
(@cults14)
Active Member

Thanks as always Jaclaz (in both threads :D)

Peter

Posted : 16/09/2019 8:52 pm
Sachin999
(@sachin999)
New Member

The suspected forged PDF document that I am working on has a Modified date (filesystem) earlier than the PDF creation date (info). Is this a clue to forgery?

More details

The pdf file is found in two different computers with the same "anomaly". It is "produced by" MS Word 10 and does have the section of XMP metadata XML (must be manually removed).

modified time (MFT) of two different copies of the same file
Thu Apr 27 20:54:53.0000000 UTC+0530 2017

creation and modified time (info) (both are equal)
2017-04-27 22:24:29

I have tested saving a Word file as PDF; the creation and modification dates in the MFT and in the PDF info are exactly the same.
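
To put a number on the gap, here is a minimal Python sketch using the two timestamps above: the filesystem modified time of 20:54:53 at UTC+05:30 and the PDF info time of 22:24:29 on 2017-04-27. The PDF info value is shown without a time zone, so the sketch tries two assumed readings (same zone as the filesystem time, and UTC); neither is confirmed by the data.

```python
from datetime import datetime, timedelta, timezone

# Filesystem (MFT) modified time, with the UTC+0530 offset shown in the listing.
LOCAL_ZONE = timezone(timedelta(hours=5, minutes=30))
fs_modified = datetime(2017, 4, 27, 20, 54, 53, tzinfo=LOCAL_ZONE)

# PDF info creation/modification time, zone not shown in the listing.
pdf_info_naive = datetime(2017, 4, 27, 22, 24, 29)


def gap(fs_time, pdf_naive, assumed_zone):
    """Return how much later the PDF info time is, under an assumed time zone."""
    return pdf_naive.replace(tzinfo=assumed_zone) - fs_time


# Assumption 1: the PDF info time is in the same zone as the filesystem time.
gap_same_zone = gap(fs_modified, pdf_info_naive, LOCAL_ZONE)

# Assumption 2: the PDF info time is in UTC.
gap_if_utc = gap(fs_modified, pdf_info_naive, timezone.utc)
```

Under the first assumption the internal date is 1 hour 29 minutes 36 seconds later than the filesystem modified time; under the second it is 6 hours 59 minutes 36 seconds later. The sketch only quantifies the gap. What caused it is a separate question.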

I am new to such analysis, any help or comment is highly appreciated.

Posted : 14/12/2019 10:11 am