Math puzzle for today
Posted by Bob the Hamster on June 13th, 2007
Current mood: Mathy
I have often heard it said by computer security type people that using real words in a password is a terrible idea, and that all passwords should be made of random letters, numbers, and punctuation.
So here is your math puzzle for the day. Estimate which of the following is the better password:
(1) 8 random characters that may include lowercase letters, uppercase letters, numbers and any punctuation found on a generic keyboard.
February 14th, 2008 at 7:41 am
In python:
>>> len(open('/usr/share/dict/words').read().split())**3
957652250706432L
>>> import string
>>> len(string.printable)**8
10000000000000000L
So 8 randomly selected characters give roughly ten times as many possibilities as three dictionary words. However, if we decide to have four words instead:
>>> len(open('/usr/share/dict/words').read().split())**4
94393867047631589376L
Now the number of possibilities to brute-force is much larger: roughly 9,400 times the figure for 8 random characters. Additionally, a group of words has the advantage of being much easier to remember and is therefore less likely to end up written on a post-it note by the user. Few people can remember a bunch of meaningless characters, especially once case is involved. In fact, many security experts now advocate longer passphrases and encourage everyone to make password fields that accept at least 80 characters.
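The same comparison is often easier to read in bits of entropy. What follows is only a sketch under stated assumptions: every character or word is picked uniformly and independently, WORD_COUNT is the 98,568 implied by the outputs above (98568**3 == 957652250706432), and entropy_bits is a helper name made up for this example. Note also that string.printable counts six whitespace characters such as tab and newline, so 100 slightly overstates what can be typed into a password box; the 94 letters, digits and punctuation marks are shown alongside.

```python
import math
import string


def entropy_bits(pool_size, picks):
    """Bits of entropy for `picks` uniform, independent draws from `pool_size` options."""
    return picks * math.log2(pool_size)


# Assumption: 98568 words, the count implied by the REPL outputs above.
WORD_COUNT = 98568

# string.printable includes whitespace; this narrower set leaves it out.
typeable = string.digits + string.ascii_letters + string.punctuation

eight_printable = entropy_bits(len(string.printable), 8)
eight_typeable = entropy_bits(len(typeable), 8)
three_words = entropy_bits(WORD_COUNT, 3)
four_words = entropy_bits(WORD_COUNT, 4)

for label, bits in [
    ("8 chars from string.printable", eight_printable),
    ("8 chars, no whitespace", eight_typeable),
    ("3 dictionary words", three_words),
    ("4 dictionary words", four_words),
]:
    print(f"{label}: {bits:.1f} bits")
```

Each extra bit doubles the work of a brute-force search, so a gap of a few bits between two options already means an order of magnitude or more.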
February 14th, 2008 at 1:43 pm
Nicely done! But your words file must be smaller than mine.
>>> len(open("/usr/share/dict/words").read().split())**3
12967440272894953L
April 10th, 2009 at 12:19 pm
And those words files must be smaller than the English language. Your number indicates a word file with 234,937 words. I’ve seen estimates that place the sum total of words in English (assuming that one includes lingo and slang) at upwards of six *million* words.
With three words, that represents 2.16e20 (216 million trillion, or 216000000000000000000L) possibilities. That’s 21,600 times the 1e16 password possibilities represented by eight random characters.
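For anyone who wants to check this arithmetic, here is a small sketch. The six million figure is the estimate quoted above rather than a measured count, the eight-character space reuses the len(string.printable)**8 value from the first comment, and the constant names are invented for this example.

```python
# Assumption: six million English words, the estimate quoted above.
ENGLISH_WORDS = 6_000_000
# Word count implied by the previous comment's cube.
DICT_WORDS = 234_937
# len(string.printable)**8 from the first comment.
EIGHT_CHAR_SPACE = 100 ** 8

# Confirm that the cube in the previous comment really corresponds to 234,937 words.
assert DICT_WORDS ** 3 == 12967440272894953

three_word_space = ENGLISH_WORDS ** 3
ratio = three_word_space // EIGHT_CHAR_SPACE

print(three_word_space)  # 216000000000000000000
print(ratio)             # 21600
```

With the 234,937-word file instead, three words give about 1.3e16 possibilities, only slightly more than the eight-character figure.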
Of course, implementing this would require taking the user’s word for it that the three words she typed are, in fact, English (since the English vocabulary is larger than your dictionary and is growing all the time). This actually allows for *more* variation, since it allows for misspellings, nonsense words like “snargleflat”, and random character juxtaposition. :)