Since I (re)started working on hashes, I've been trying to find better ways to represent my "found" hashes, so that they can be re-used. One of the first questions I posed was "how do you represent passwords with embedded CR/LF/NUL characters?" (or other control characters).
In hashcat, the only way that seems to be available is to use hex output; but this means that you need to decide, in advance, which passwords might contain control characters, and output them to a separate file. In certain cases (like a CR or LF at the beginning or end of a password), you can use hex-salt, and one of -m 10 or -m 20, but this is a real pain.
A few months ago, I became frustrated enough to write some software for this, and implement it myself. As a result, I invented a new way to represent passwords that contain control characters. For "regular" passwords, no change is made. "password" still looks like "password" in my dictionary files. But if the password contains a control character (other than space or tab, of course), I "switch modes", and output the hex equivalent, enclosed in $HEX[].
So, a password of "foobar1[CRLF]" becomes:
$HEX[666f6f626172310d0a]
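As a rough sketch of the rule described above (this is not the mdxfind code, which isn't available; the function names are made up for illustration), the encoding side could look something like this in Python. I'm assuming "control character" means bytes 0x00-0x1F other than tab, plus 0x7F (DEL), and that bytes of 0x80 and above are passed through untouched; the post doesn't spell out either detail.

```python
def needs_hex(raw: bytes) -> bool:
    # Assumption: control characters are 0x00-0x1F (except tab, 0x09) and 0x7F.
    # Space (0x20) is outside that range, so it never triggers hex mode.
    return any((b < 0x20 and b != 0x09) or b == 0x7F for b in raw)


def encode_password(raw: bytes) -> bytes:
    """Return the dictionary-file form of a password (hypothetical helper)."""
    if needs_hex(raw):
        # "Switch modes": emit the whole password as lowercase hex in $HEX[].
        return b"$HEX[" + raw.hex().encode("ascii") + b"]"
    # Regular passwords are written unchanged.
    return raw


print(encode_password(b"password"))       # b'password'
print(encode_password(b"foobar1\r\n"))    # b'$HEX[666f6f626172310d0a]'
```

One case the sketch ignores: a real password that literally starts with "$HEX[" would be ambiguous when read back, so a real tool would probably want to hex-encode those as well.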
This has the added benefit of being able to use _any_ character in a password. Most hash algorithms (SHA1/MD5, etc) work on bytes in any event, so creating an arbitrary-length string in a dictionary file, of any set of characters, becomes really easy.
But is it really needed?
In the isw 2012 challenge of 139,444,502 hashes, I was able to find 127,224,531 solutions. More than 260,000 of them required $HEX - for example:
$HEX[0000]
$HEX[696d0a]
$HEX[55470a]
and many more.
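To show what those three entries actually contain, here is a minimal decoding sketch in Python. The function name is hypothetical, and falling back to the literal line when the hex is malformed is my own assumption, not something the format defines.

```python
def decode_line(line: bytes) -> bytes:
    """Turn one dictionary line (without its newline) back into raw password bytes."""
    if line.startswith(b"$HEX[") and line.endswith(b"]"):
        body = line[len(b"$HEX["):-1]
        try:
            return bytes.fromhex(body.decode("ascii"))
        except ValueError:
            # Assumption: malformed hex means the line is a literal password.
            return line
    return line


for entry in (b"$HEX[0000]", b"$HEX[696d0a]", b"$HEX[55470a]"):
    print(entry, "->", decode_line(entry))
```

The three examples decode to two NUL bytes, "im" followed by LF, and "UG" followed by LF. The last two are exactly the kind of password that can't live in a one-password-per-line file without some escaping.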
Adding rules to prepend/append CR/CRLF/NUL etc is an easy way to find these, but without a regularized output format, it's impossible to re-use the data - this is particularly the case with passwords containing one or more CR/LF characters.
But can't you just use --hex-output?
Again, yes, this is an option, but to effectively re-use the dictionaries, I would have to convert all of my existing dictionaries to hex-output format. This would double the size of the dictionaries, and increase the loading time.
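A back-of-the-envelope way to see the size difference, sketched in Python. The three-word list and both function names are invented for illustration, and the control-character test is the same assumption as elsewhere (0x00-0x1F except tab, plus 0x7F); sizes count one newline per entry.

```python
def all_hex_size(words):
    # Every password written as hex: 2 characters per byte, plus a newline.
    return sum(2 * len(w) + 1 for w in words)


def mixed_size(words):
    # Plain passwords stay as-is; only the odd ones pay for hex plus the $HEX[] wrapper.
    total = 0
    for w in words:
        if any((b < 0x20 and b != 0x09) or b == 0x7F for b in w):
            total += len(b"$HEX[") + 2 * len(w) + len(b"]") + 1
        else:
            total += len(w) + 1
    return total


words = [b"password", b"foobar1\r\n", b"letmein"]
print(all_hex_size(words))  # 51
print(mixed_size(words))    # 42
```

With only a tiny fraction of real-world passwords needing hex, the mixed file stays very close to the size of the plain dictionary, while the all-hex file approaches twice that.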
But having to parse the $HEX[] format means that loading passwords would be slow!
That's not the case. Note: This is not a plug for my software; it's not available. I'm using this as an illustration only of time required to parse the input.
To check 258M passwords in 9 files against 2,894,100 MD5 hashes, hashcat (without $HEX[] parsing) takes:
real 1m46.145s
Using oclHashcat:
real 3m6.574s
Using mdxfind (my code, including the $HEX[] parsing):
real 1m8.213s
(so roughly a third less wall-clock time than hashcat, even while doing the $HEX[] parsing. My system had a load of 30 while running this; it wasn't idle).
So $HEX isn't slowing anything down. It just is a better way to represent passwords that contain unprintable characters.
If you have a better idea, please let me know. I've been struggling with this for quite some time, and have yet to hear a better plan.
Comments?