I have been testing the -m 10400 through -m 10700 PDF modes with a buddy and stumbled on a weird issue. I started out with a test PDF that has the owner password set but the user password unset. Using Acrobat 11 Pro, I generated this PDF with the owner password 'hashcat'. Then I extracted the hash using the latest pdf2john from GitHub and ran hashcat with the following parameters:
Code:
cudaHashcat64 -m 10500 -a 3 PDF-password-is-hashcat.hash ?l?l?l?l?l?l?l
The hash target specified in 'PDF-password-is-hashcat.hash' looked like this:
Code:
$pdf$2*3*128*-1028*1*16*da42ee15d4b3e08fe5b9ecea0e02ad0f*32*c9b59d72c7c670c42eeb4fca1d2ca15000000000000000000000000000000000*32*c4ff3e868dc87604626c2b8c259297a14d58c6309c70b00afdfb1fbba10ee571
It only took about a half hour to crack on a GTX 670. So I imagine it'll be a lot faster on an ATI-7970.
Anyhow, here is where it gets weird.
The original file was output using the Adobe Acrobat 10.0 Paper Capture Plugin. The document security properties were identical other than a different password.
Since that worked I quickly put together a batch script to run through common password masks:
http://pastebin.com/2fDeQkQ6
The first line immediately threw an error:
Code:
WARNING: Hashfile 'PDF-password-is-...hash' in line 1 ($pdf$4*4*128*-1324*1*32*e85bab525883d8493ece960c6038dcdcc75a428632fd4e45ba43bfe17ec3adc5*32*f9ce566a10eba70977b1b24f23d0861c00000000000000000000000000000000*32*00e5507dabd18be0aa0d9f4c70607c0fa183ba9cf5e503026a524319b5e775f8): Line-length exception
Parsed Hashes: 1/1 (100.00%)
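For anyone who wants to sanity-check a `$pdf$` line before handing it to hashcat, here is a minimal Python sketch. The field order (V, R, key bits, P, encrypt-metadata flag, then length/hex pairs for the document ID, /U and /O) is my assumption from looking at pdf2john output, and `check_pdf_hash` is just a hypothetical helper name. It only compares each declared byte length with the hex data that follows it; it does not reproduce hashcat's own parser, so passing this check does not mean hashcat will accept the line.

```python
def check_pdf_hash(line):
    """Split a $pdf$ hash and compare declared lengths with the hex data.

    Assumed layout (taken from pdf2john output, not from hashcat's source):
    $pdf$V*R*bits*P*enc_meta*id_len*id*u_len*u*o_len*o
    """
    prefix = "$pdf$"
    if not line.startswith(prefix):
        raise ValueError("not a $pdf$ hash")
    fields = line[len(prefix):].strip().split("*")
    if len(fields) < 11:
        raise ValueError("expected at least 11 fields, got %d" % len(fields))

    report = {}
    # From field 5 onwards the data comes in (declared length, hex string) pairs.
    for name, idx in (("id", 5), ("u", 7), ("o", 9)):
        declared = int(fields[idx])
        hex_data = fields[idx + 1]
        actual = len(hex_data) // 2  # two hex digits per byte
        ok = declared == actual and len(hex_data) % 2 == 0
        report[name] = (declared, actual, ok)
    return report
```

Each entry in the returned dict is `(declared_bytes, actual_bytes, matches)`. A line that stops right after the ID field, with only a trailing `*`, raises `ValueError` because the /U and /O fields are missing.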
The hash was extracted using the latest pdf2john.py, the same way I extracted it from the first PDF. That got me curious. I tried several other versions of pdf2john (john-1.8.0-jumbo-1, john179j5, and an older copy I have) and got similar output.
For example, john-1.8.0-jumbo-1 output:
Code:
PDF-password-is-...pdf:$pdf$4*4*128*-1324*1*32*e85bab525883d8493ece960c6038dcdcc75a428632fd4e45ba43bfe17ec3adc5*:::::PDF-password-is-...pdf
To make doubly sure the data was correct, I quickly grabbed a PDF-parser tool and wrote a companion script to convert the owner object hash to a hexstring for hashcat.
The two are clearly different:
pdf2john:
00e5507dabd18be0aa0d9f4c70607c0fa183ba9cf5e503026a524319b5e775f8
manual extraction:
00e5507dabd18be0aa5c5c729f4c70607c0fa183ba9cf5e503026a524319b5e775f8
To better show the difference:
Code:
00e5507dabd18be0aa 0d 9f4c70607c0fa183ba9cf5e503026a524319b5e775f8
00e5507dabd18be0aa 5c5c72 9f4c70607c0fa183ba9cf5e503026a524319b5e775f8
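The split shown above can also be found mechanically. A rough Python sketch, assuming both inputs are even-length hex strings; `differing_middle` is a hypothetical helper name. It trims the shared prefix and suffix back to whole bytes so the differing part is never cut in the middle of a hex pair.

```python
import os


def differing_middle(a, b):
    """Return (common_prefix, middle_a, middle_b, common_suffix) for two hex strings."""
    prefix = os.path.commonprefix([a, b])
    prefix = prefix[:len(prefix) // 2 * 2]  # keep whole bytes only
    a_rest, b_rest = a[len(prefix):], b[len(prefix):]

    # Reverse the remainders so commonprefix finds the shared suffix.
    suffix = os.path.commonprefix([a_rest[::-1], b_rest[::-1]])[::-1]
    cut = len(suffix) // 2 * 2  # keep whole bytes only
    suffix = suffix[len(suffix) - cut:]

    mid_a = a_rest[:len(a_rest) - cut]
    mid_b = b_rest[:len(b_rest) - cut]
    return prefix, mid_a, mid_b, suffix
```

Called with the pdf2john string and the manually extracted string, the two middle parts are the bytes that differ between them.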
Looking at the PDF in a hex editor, I see 33 bytes in total between the parentheses of the /O entry (including the starting 0x00 and the ending 0xf8):
Code:
_               start-v                          v-my script screws up as well?
008c5fc8: 38 2f 4f 28 00 e5 50 7d ab d1 8b e0 aa 5c 72 9f 4c 70 60 7c 0f a1 83 ba :8/O(..P}.....\r.Lp`|....
008c5fe0: 9c f5 e5 03 02 6a 52 43 19 b5 e7 75 f8 29 2f 50 20 2d 31 33 32 34 2f 52 :.....jRC...u.)/P -1324/R
_                                             ^-end
I've tried all three versions:
Code:
00e5507dabd18be0aa 0d 9f4c70607c0fa183ba9cf5e503026a524319b5e775f8 : PDF2JOHN
00e5507dabd18be0aa 5c5c72 9f4c70607c0fa183ba9cf5e503026a524319b5e775f8 : Manual
00e5507dabd18be0aa 5c72 9f4c70607c0fa183ba9cf5e503026a524319b5e775f8 : Hex from file
And all of them give the same error.
0x5c 0x5c 0x72 translates to: \\r
The file itself contains 0x5c 0x72, i.e. \r, so it looks like pdf2john is processing that sequence as a carriage return (0x0d).
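To check that reading, here is a minimal Python sketch that decodes the backslash escapes of a PDF literal string (the bytes between the parentheses). It assumes the escape set from the PDF specification: `\n`, `\r`, `\t`, `\b`, `\f`, `\(`, `\)`, `\\`, one to three octal digits, and backslash plus end-of-line as a line continuation. `unescape_pdf_literal` is a hypothetical helper name, and the sketch leaves out the rule that a bare end-of-line inside the string is normalised to a line feed.

```python
def unescape_pdf_literal(raw):
    """Decode backslash escapes in the body of a PDF literal string (bytes in, bytes out)."""
    simple = {
        ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09,
        ord("b"): 0x08, ord("f"): 0x0C,
        ord("("): 0x28, ord(")"): 0x29, ord("\\"): 0x5C,
    }
    out = bytearray()
    i = 0
    while i < len(raw):
        byte = raw[i]
        if byte != 0x5C or i + 1 >= len(raw):
            # Ordinary byte (or a lone trailing backslash): copy as-is.
            out.append(byte)
            i += 1
            continue
        nxt = raw[i + 1]
        if nxt in simple:
            out.append(simple[nxt])
            i += 2
        elif 0x30 <= nxt <= 0x37:
            # Octal escape: one to three digits, high-order overflow ignored.
            j = i + 1
            while j < len(raw) and j - (i + 1) < 3 and 0x30 <= raw[j] <= 0x37:
                j += 1
            out.append(int(raw[i + 1:j].decode("ascii"), 8) & 0xFF)
            i = j
        elif nxt in (0x0D, 0x0A):
            # Backslash + end-of-line is a line continuation: emit nothing.
            i += 2
            if nxt == 0x0D and i < len(raw) and raw[i] == 0x0A:
                i += 1
        else:
            # Unknown escape: drop the backslash, keep the following byte.
            out.append(nxt)
            i += 2
    return bytes(out)


# The 33 bytes as they appear in the file ("Hex from file" above).
from_file = bytes.fromhex(
    "00e5507dabd18be0aa5c729f4c70607c0fa183ba9cf5e503026a524319b5e775f8"
)
decoded = unescape_pdf_literal(from_file)
print(len(from_file), "->", len(decoded))  # 33 -> 32
print(decoded.hex())
```

Under those assumptions, the 33 bytes from the file decode to 32 bytes, and the decoded hex has 0d where the file has 5c 72, which is the same string pdf2john produced.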
Any ideas how to resolve this?