
PDF Manipulated

(@fra93)
Posts: 10
Eminent Member
Topic starter
 

Hi everybody,

I need to understand whether some PDF files, sent to me by a customer, have been manipulated or not.

I have looked at the metadata in Acrobat Professional and also extracted this information with ExifTool.

For two of these three PDF files, the creation date and modification date are different.
Moreover, the creation date is later than the modification date.

Could this problem with the dates be caused by the customer saving the PDF file with a password?

Are there other checks to determine whether these PDF files have been manipulated?

Thanks
Francesco

 
Posted : 21/08/2019 7:55 am
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

Strange "creation" vs. "modified" dates are pretty much normal if the file has been copied across filesystems, see (only as an example, last two threads revolving around similar issues)
https://www.forensicfocus.com/Forums/viewtopic/t=17972/
https://www.forensicfocus.com/Forums/viewtopic/t=17992/

Assuming you are talking of the "file" dates.

The PDF format also has some internal dates. Depending on how exactly the file has been edited, and by which tool, you might be able to check those internal dates for modification, see
https://stackoverflow.com/questions/1664161/how-to-check-if-pdf-was-modified

But there are hundreds or thousands of PDF tools, many of which simply don't respect (or work around) the specifications, so YMMGV.

jaclaz

 
Posted : 22/08/2019 8:21 am
UnallocatedClusters
(@unallocatedclusters)
Posts: 577
Honorable Member
 

Hello,

I recommend opening each PDF file using OSForensics' File Viewer tool, and then "Extract All Text".

OSForensics will extract all available embedded text from your PDF files and let you visually review the extracted text for Dates and other important metadata.

For example, if one of your PDF files has an embedded JPG picture file, and the embedded JPG picture file has EXIF metadata, OSForensics will extract the embedded EXIF metadata dates as well as other embedded date values, as shown below.

PDF files can contain embedded “XML” packets, which are designated by bracketed “xap” values. These “XML” packet metadata dates are not file system dates, but rather date values that are automatically embedded within PDF files by Adobe software, recording the dates and times files were added to a given PDF file.

An example of embedded PDF “XML” metadata date streams:
<xap:CreateDate>2014-02-02T19:43:17Z</xap:CreateDate>
<xap:ModifyDate>2014-02-02T19:43:17Z</xap:ModifyDate>
<xap:MetadataDate>2014-02-02T19:43:17Z</xap:MetadataDate>

PDF files can contain multiple XML streams and packets, which can be embedded and appended within a given PDF file at different dates and times.

If one adds (embeds) a new JPG file into an existing PDF file, Adobe will add new XML "xap" metadata date values for the newly embedded JPG file in addition to the existing internal embedded "xap" metadata date values.
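For anyone who prefers to script this rather than read the extracted text by eye, here is a minimal Python sketch that lists every XMP date element found in the raw bytes of a file. It assumes the dates are stored as uncompressed elements with the usual colon-separated `xap:` prefix (or the newer `xmp:` prefix); XMP dates written as RDF attributes, or sitting inside compressed streams, will not be found this way. The function name `find_xmp_dates` is my own, not part of any tool mentioned here.

```python
import re

# Matches e.g. <xap:CreateDate>...</xap:CreateDate>, and the newer xmp: prefix.
XMP_DATE_RE = re.compile(
    rb"<(xap|xmp):(CreateDate|ModifyDate|MetadataDate)>([^<]+)</\1:\2>"
)


def find_xmp_dates(raw: bytes) -> list:
    """Return (field, value) pairs for each XMP date element, in file order."""
    return [
        (m.group(2).decode("ascii"), m.group(3).decode("ascii", "replace").strip())
        for m in XMP_DATE_RE.finditer(raw)
    ]


if __name__ == "__main__":
    # Made-up bytes standing in for the contents of a PDF with two XMP packets.
    sample = (
        b"<xap:CreateDate>2014-02-02T19:43:17Z</xap:CreateDate>\n"
        b"<xap:ModifyDate>2014-02-02T19:43:17Z</xap:ModifyDate>\n"
        b"...\n"
        b"<xmp:ModifyDate>2015-06-01T08:00:00Z</xmp:ModifyDate>\n"
    )
    for field, value in find_xmp_dates(sample):
        print(field, value)
```

On a real file one would pass `open(path, "rb").read()` instead of the sample bytes. More than one value for the same field is a hint that several packets are present, which is worth a closer look but is not proof of anything by itself.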

Typically at the very end of the embedded text within PDF files, one will see the traditional PDF file Creation and Modified dates (note these values do NOT have the "xap" value):

/CreationDate (D:20140203034317+08'00')
/ModDate (D:20140203034317+08'00')
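These values carry a local time plus a UTC offset, so they need converting before they can be compared with the "xap" values, which in the example above are in UTC. A rough Python sketch of that conversion follows. It assumes the full `YYYYMMDDHHmmSS` form is present, tolerates the `D` prefix with or without a colon, and treats a missing offset as UTC, which is an assumption on my part. `parse_pdf_date` is a hypothetical helper name.

```python
import re
from datetime import datetime, timedelta, timezone

# D:YYYYMMDDHHmmSS followed by an optional 'Z' or an offset such as +08'00'
PDF_DATE_RE = re.compile(
    r"D:?(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})"
    r"(?:(Z)|([+-])(\d{2})'?(\d{2})?'?)?"
)


def parse_pdf_date(value: str) -> datetime:
    """Convert a PDF date string into a timezone-aware datetime."""
    m = PDF_DATE_RE.fullmatch(value.strip())
    if m is None:
        raise ValueError(f"not a full PDF date string: {value!r}")
    year, month, day, hour, minute, second = (int(g) for g in m.groups()[:6])
    _zulu, sign, off_h, off_m = m.groups()[6:]
    if sign:
        offset = timedelta(hours=int(off_h), minutes=int(off_m or 0))
        if sign == "-":
            offset = -offset
    else:
        offset = timedelta(0)  # 'Z' or no offset given: treated as UTC (assumption)
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone(offset))


if __name__ == "__main__":
    info_date = parse_pdf_date("D:20140203034317+08'00'")
    xmp_date = datetime.fromisoformat("2014-02-02T19:43:17+00:00")
    # True when both values describe the same instant.
    print(info_date == xmp_date)
```

With the example values from this post, the two sources agree once the +08:00 offset is taken into account. The same helper can be used to check whether a creation date falls after a modification date.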

Another embedded value to look at is the “/Producer” entry, which shows what PDF generation engine version was used to create your PDF files.

For example, a “/Producer” value of
<feff0043006f00720065006c002000500044004600200045006e00670069006e0065002000560065007200730069006f006e002000310037002e0031002e0030002e003500370032>
translates to “Corel PDF Engine Version 17.1.0.572”.
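The hex value above is a PDF hex string: the leading `feff` is a byte order mark, and the rest is UTF-16 big-endian text. A small Python sketch for decoding such values follows. `decode_pdf_hex_string` is a name I made up, and the Latin-1 fallback for strings without a byte order mark is only a rough stand-in for PDFDocEncoding.

```python
def decode_pdf_hex_string(hex_string: str) -> str:
    """Decode a PDF hex string such as <feff0043...> into readable text."""
    # Drop the angle brackets and any whitespace or line breaks inside the value.
    cleaned = "".join(hex_string.strip().strip("<>").split())
    raw = bytes.fromhex(cleaned)
    if raw.startswith(b"\xfe\xff"):
        # Byte order mark present: the remainder is UTF-16 big-endian.
        return raw[2:].decode("utf-16-be")
    # No byte order mark: approximate PDFDocEncoding with Latin-1 (assumption).
    return raw.decode("latin-1")


if __name__ == "__main__":
    producer = (
        "<feff0043006f00720065006c002000500044004600200045006e00670069006e0065002000"
        "560065007200730069006f006e002000310037002e0031002e0030002e003500370032>"
    )
    print(decode_pdf_hex_string(producer))
```

Run on the value quoted in this post, it prints the full producer string including the build number.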

One can Google the release date of Corel's PDF Engine Version 17 to determine when the PDF file was generated (on or after the PDF engine release date), for example.

NOTE: I have no association with Passmark (maker of OSForensics) but I have found OSForensics' ability to "extract" embedded text from files (and then make that text searchable) invaluable for this type of forensic analysis.

 
Posted : 22/08/2019 9:26 pm
(@fissa)
Posts: 27
Eminent Member
 

Hi jaclaz and UnallocatedClusters, I find this topic very interesting since I have a case where I must examine a PDF document as well.
In my case the PDF was attached to an email.
How do I extract and save the PDF on my system without changing the modified, created and last accessed dates? I thought I'd save the entire .msg including the PDF and add it as a single file in EnCase. Would this work?
Or shouldn't I be bothered, and just save the PDF on my system and look into it with the tools suggested above?
It's a fraud investigation, so there are no pictures in the document.

Thanks for the help. I'm new to PDF investigation.

With kind regards.

 
Posted : 24/08/2019 6:55 am
(@fra93)
Posts: 10
Eminent Member
Topic starter
 

Hello everyone,
thanks for your feedback and sorry for the delay in responding.

Over the past few days I have done further analysis.

To explain better, the PDFs that the client sent me are bank statements.

In one of them, the one whose modification and creation dates differ, I can't select the numbers in the PDF (credits and debits).

I used both OSForensics and Xpdf, and the result is the same: the numbers are images.

However, in the metadata I extracted with both tools, I don't have the embedded PDF "XML" metadata.

Could someone explain to me why?

 
Posted : 30/08/2019 7:28 am
jaclaz
(@jaclaz)
Posts: 5133
Illustrious Member
 

More generally (besides and before any tampering detection), if you have a set of n documents with the same (exactly the same) origin, i.e. provided by the same party, through the same means, and automatically generated by the same software, and any of them differs from the others, you already have enough grounds to suspect something "queer" happened.

Still, given the number of programs/tools/OS built-in provisions and what not that are "compatible" with PDF files, the "exactly" is an issue.

There is no doubt that either all documents sent from the bank are in exactly the same format, or from a given date onwards there is a change to a "new" format, which you can however find in later documents (before another change).

But let's say that
1) periodically user "A" gets in the browser the link to the .pdf and proceeds to "Save as" to a given directory
2) one day user "B" (or user "A" after vacation , or just distracted) instead opens the .pdf file and then proceeds to "print to .pdf"

The file in the latter case would be different, while still not having been tampered with at all.
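To make the "same origin, same format" comparison concrete, here is a minimal Python sketch that groups documents by a metadata fingerprint and lists the ones that do not match the majority. The file names, the producer strings and the choice of `Producer` and `Creator` as fingerprint fields are all made up for illustration, and the metadata is assumed to have been extracted already with whatever tool you trust.

```python
from collections import Counter


def flag_outliers(docs: dict, keys=("Producer", "Creator")) -> list:
    """Return names of documents whose fingerprint differs from the most common one."""
    if not docs:
        return []
    fingerprints = {
        name: tuple(meta.get(key, "") for key in keys) for name, meta in docs.items()
    }
    # The most frequent fingerprint is taken as the "normal" one (assumption).
    majority, _count = Counter(fingerprints.values()).most_common(1)[0]
    return sorted(name for name, fp in fingerprints.items() if fp != majority)


if __name__ == "__main__":
    # Hypothetical metadata for three statements from the same bank.
    statements = {
        "statement_05.pdf": {"Producer": "ExampleBank Generator 4.2"},
        "statement_06.pdf": {"Producer": "ExampleBank Generator 4.2"},
        "statement_07.pdf": {"Producer": "Some Print To PDF Driver"},
    }
    print(flag_outliers(statements))
```

As in the "Save as" versus "print to .pdf" case above, a flagged file is only a candidate for a closer look: it may differ for perfectly innocent reasons.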

AFAIK, the XML data (actually XMP, to be picky) may be inside the .pdf but need not be, i.e. some tools/applications write it, others don't, and there are also different versions
https://www.pdflib.com/pdf-knowledge-base/xmp/xmp-overview/
https://en.wikipedia.org/wiki/Extensible_Metadata_Platform

And (the matter is documented by Adobe), when there are two sources of metadata, tools should "choose" which one to "trust"
https://help.adobe.com/en_US/livecycle/11.0/Services/WS92d06802c76abadb-2460d90e12eb4e989f1-7ffe.2.html
and may also update the "wrong" one.

And, having said that the XML fields are optional, the actual .pdf standard has provisions for Created and Modified dates in the Info dictionary, see the thread on stackoverflow I already linked to, and here
https://superuser.com/questions/161576/is-there-any-authoring-information-of-pdf
BUT again there are lots of tools around that produce .pdf files which are not fully compliant with the standards.

jaclaz

 
Posted : 30/08/2019 2:34 pm
(@cults14)
Posts: 367
Reputable Member
 

Also interesting for me right now, for a case I'm working on (internal investigation).

I can see from PDF metadata that a small number of invoices "from a vendor" match one previous quotation: title, producer, creation date, even visual layout. Invoices from the vendor before and after the suspect ones don't look the same and don't have the same metadata.

There are other elements that point the finger at unauthorised creation of these PDF invoices by one of our users. Is it possible to prove that the user actually created them? Or is the best we can do to show a pattern of usual events, evidence and behaviour vs what we think is unusual, i.e. circumstantial?

Cheers

 
Posted : 08/09/2019 9:19 pm
(@athulin)
Posts: 1156
Noble Member
 

I can see from PDF metadata that a small number of invoices "from a vendor" matches one previous quotation; title, producer, creation date, even visual layout. Invoices from the vendor before and after the suspect ones don't look the same and don't have the same metadata.

So? To my ears that sounds overly suspicious. You don't look at metadata to decide if a legal document has been faked.

An invoice can be delivered, lost (I once lost half a dozen invoices when my bag was lost at a railway station), copies requested (with new metadata, possibly), printed, sent in, rejected ('we must have PDF'), scanned on my office scanner (into a new PDF document), delivered and accepted and paid. None of that changed the legal document of the invoice (unless you have specific legislation and regulation or specific circumstances that do so, of course). Yes, the final PDF documents may look unusual, but then I don't lose my bag that often. The legal documents have not changed a bit: still the same amount to pay, still the same bank account to pay to, and no changes in any relevant fine print.

Any company can have temporary IT problems and have to fall back on alternate productions of invoices. Or they may even be trying out a new invoicing service.

It seems to me that the very first thing to do if you fear that an invoice has been altered is to verify it with its point of origin. Request a copy, not from the user but from the seller. If there's no difference in legal content, strange metadata are unlikely to be relevant, and are probably best explained by the user. If there is a difference, you may have something to investigate. (If your concerns are so large that you think this would alert someone, you (as a company) should probably talk to a lawyer or even law enforcement. The first stop for you, though, may be your boss.)

If you have auditors, this may be something best done by them.

There are other elements that point the finger at unauthorised creation of these PDF invoices by one of our users, is it possible to prove that the user actually created them?

Anything is possible. PDF as a document format does not do it automatically, though. If you want that, you insist on digitally signed documents.

If you do have a concept of 'authorized creation of PDF', you probably have additional regulations that you are or should be the experts on yourself. If you have it, you should not need to look at metadata to determine if it has taken place or not.

Or is the best we can do to show a pattern of usual events, evidence and behaviour vs what we think is unusual?

That is a question that should be answered by you yourself or someone in your organization. As I don't know who 'we' are, I can only suggest that you identify the best person in your organization to talk to.

Unusual events are not necessarily fraudulent or have hostile intent. They're just unusual. Incidents are often best investigated from the standard legal principle 'innocent unless proven guilty'. Your company may have policies or guidelines on this – if not, go ask the right person for guidance.

 
Posted : 09/09/2019 7:27 am
(@cults14)
Posts: 367
Reputable Member
 

Athulin, thanks for your comments, I will PM you

Cheers

 
Posted : 09/09/2019 8:26 am
(@eugenebelk)
Posts: 16
Active Member
 

My two cents on the basis of my experience with Belkasoft. It can extract metadata from PDF files and file metadata from the corresponding file system. It is the analysis of this metadata taken together that makes it possible to determine whether any manipulations with PDF files have been made or not.

 
Posted : 10/09/2019 1:53 pm