When dealing with a forensic image that contains encrypted files, our best friends are often those ever-so-helpful post-it notes, weak passwords, or instances of password reuse involving easily defeated encoding methods. However, fortune doesn't always favor the forensicator, and periodically you have to look for another shortcut for recovering encrypted content.
One approach that can help with this is to build a password dictionary from printable character strings contained within evidence images. The basic idea is that a user may have stored their password (or a derivation of it) somewhere on the original media, or that the password might still be retained in a page or swap file.
A reason to consider this approach is that generating and using a dictionary file is relatively quick, whereas a brute-force attack against a decently complex password (more than 6 characters) can potentially take a very long time if you're up against a good cipher.
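For a rough sense of the scale, here is a back-of-the-envelope calculation, assuming the 95 printable ASCII characters, an 8-character password, and a hypothetical rate of one billion guesses per second:

```
# Worst-case exhaustive search: 95^8 candidates at 10^9 guesses/second,
# expressed in days (integer math; the actual figure is ~76.8 days).
echo $(( 95**8 / 10**9 / 86400 ))
```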
My initial forays into building case-specific password dictionaries involved the Linux strings command, sed, awk, grep, and LOTS of pipes. The overall processing time for this method was rather slow (basically run it and go to bed). However, using the incredibly versatile bulk_extractor tool by Dr. Simson Garfinkel (available in the latest update of RŌNIN-Linux R1), we can generate a media-specific dictionary file fairly quickly.
If you've never used bulk_extractor before, then I recommend checking out its ForensicsWiki entry. The scope and utility of this tool are much broader than the topic of this post.
Here are some quick steps for building a case dictionary file using bulk_extractor and cracklib.
Using bulk_extractor to Build an Initial Wordlist
With the command listed below, we disable all other scanners available in bulk_extractor (-E) save for the wordlist scanner, output the generated wordlist to a specific directory (-o), and designate the image to be evaluated. The default settings will extract words between 6 and 14 characters long; this is adjustable with the -w flag.
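A representative invocation might look like this; the image name (evidence.dd) and output directory (be_output) are placeholders:

```
# Disable every scanner except wordlist (-E wordlist), write results to
# be_output/ (-o), and point bulk_extractor at the evidence image.
bulk_extractor -E wordlist -o be_output evidence.dd
```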
- The scan method employed by bulk_extractor is 100% "agnostic" concerning the actual filesystem contained within the image. We can throw any digital content at it.
- Bulk_extractor employs parallelization for performance. The data read from the image is split into 16 MB pages, with one thread per core committed to processing each page.
- Bulk_extractor is able to pick up where it left off. If we kill the process and restart it, bulk_extractor will read the last processed offset from our output folder and resume there.
After the run has completed, we will find a wordlist_split_000.txt file in our output directory.
A quick evaluation of this file shows us that bulk_extractor has extracted 388,950 unique potential password strings.
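To reproduce that count yourself (the path assumes the be_output directory from the run above):

```
# One candidate string per line, so a line count gives the number of entries.
wc -l be_output/wordlist_split_000.txt
```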
Obviously, the majority of entries contained in our wordlist_split_000.txt file are junk. If desired, we can clean this dictionary up a bit more, and obtain some string derivations, by using the cracklib utility cracklib-format (see the example after this list), which:
- Lowercases all words
- Removes control characters
- Sorts the list
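A minimal sketch of that clean-up step, assuming the output paths used earlier:

```
# cracklib-format writes its cleaned-up list to stdout, so redirect it
# into a new case dictionary file.
cracklib-format be_output/wordlist_split_000.txt > case_dictionary.txt
```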
Since decent password-cracking tools will employ case variance, we often don't lose too much with this clean-up. However, retaining the original wordlist_split_000.txt file is a good idea should your password-cracking tool not support this.
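For example, with John the Ripper (hashes.txt is a hypothetical file of recovered password hashes):

```
# --rules enables John's word-mangling rules (case toggling, suffixes, etc.)
# so each dictionary entry is tried in multiple variations.
john --wordlist=case_dictionary.txt --rules hashes.txt
```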
Another option for reducing the password list to an even shorter set is to use cracklib-check to create a list of weak passwords (short, dictionary-based), as shown below.
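A sketch of that filtering step: cracklib-check labels acceptable candidates with ": OK", so we keep everything it rejects.

```
# Words cracklib-check rejects come back as "word: <reason>"; strip the
# reason and save the weak candidates. Assumes entries don't contain colons.
cracklib-check < case_dictionary.txt | grep -v ': OK' | cut -d: -f1 > weak_passwords.txt
```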
Ideas? / Further Reading
Do you have another tool, method, or process that you use for this? I'd love to hear about it.
Here are a few other links that are useful/relevant:
- List of password cracking tools.
- Great post on automating entropy measurements for detecting potentially encrypted files.
- NYU-Poly Bulk_Extractor Video Overview.