PDF file analysis
PDF files have evolved to support triggered actions and embedded JavaScript. To analyze a PDF, we can extract its event information and examine what the JavaScript will do. Didier Stevens' PDF Tools can help us with this. The toolset is written in Python, so we will again need Python installed. PDF Tools can be downloaded from https://blog.didierstevens.com/programs/pdf-tools/, where you will also find a description of each tool in the package.
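To make the idea of "event information" concrete, here is a minimal sketch that counts a few PDF names often associated with automatic actions and scripts. It is not part of PDF Tools and is not how those tools are implemented. The function name count_pdf_names and the keyword list are assumptions chosen for illustration.

```python
import re

# Names commonly associated with automatic actions and scripts in PDFs.
# This list is an illustrative assumption, not the keyword list of any tool.
SUSPICIOUS_NAMES = (b"/OpenAction", b"/AA", b"/JS", b"/JavaScript", b"/Launch")


def count_pdf_names(data: bytes) -> dict:
    """Count occurrences of each name in raw, uncompressed PDF bytes."""
    counts = {}
    for name in SUSPICIOUS_NAMES:
        # The negative lookahead skips longer names that merely start with
        # the one we are looking for (for example, /AAB when counting /AA).
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, data))
    return counts


def count_pdf_names_in_file(path: str) -> dict:
    """Read a file in binary mode and count the names it contains."""
    with open(path, "rb") as handle:
        return count_pdf_names(handle.read())
```

Calling count_pdf_names_in_file("demo_01.pdf") would return a dictionary of counts for a local copy of the file. This sketch only sees names stored as plain text. It misses names written with hex escapes and anything inside compressed streams, which is one reason to rely on dedicated tooling for real analysis.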
Let's try PDF Tools on https://github.com/PacktPublishing/Mastering-Reverse-Engineering/blob/master/ch13/demo_01.pdf. Using pdfid.py, execute the following line:
python pdfid.py demo_01.pdf
The following screenshot shows the result of pdfid on demo_01.pdf:
Here, we can see that there is JavaScript code embedded in the file. Let's now try the pdf-parser.py file so that we can extract more information. Some elements in the PDF file can be compressed and will not be readable. The pdf-parser tool...