Arbitrary Password Restrictions: Why, Internet, Why?!

A Dirty Password History

I have a gaping backlog of weak and shared passwords for various online services I signed up for many, many years ago. My sense of password security has clearly evolved since then, and I've recently started to overcome my laziness: whenever I notice an account is still using an old password, I now create a secure one.

The disturbing revelation: there are so many platforms that impose the weirdest restrictions on a password's format!

»Password too long«

I wouldn't mind if websites were merely zealous about password security and complained about weak passwords. In fact, most websites do, and rightfully so: a password with fewer than 12 characters hardly deserves the name. But you probably wouldn't believe how many websites set an upper limit for the password length. The worst part is that the limit isn't even high: I've seen »Password must be between 8 and 12 characters« on an insurance website.

An upper boundary for a password's length… what are the implications?

Back in the day, it was common to store cleartext passwords in the database. I know I did, if only for one very dirty pet project that never made it beyond prototype state. I was young… also, I was writing PHP (there, I said it). If you store the password string in a database column, some overambitious database engineer might have made it a VARCHAR(12) field, and that would explain the limitation.

But no one's using cleartext passwords anymore, right? Riiight?

But … Fixed-length Password Hashes?

Passwords are salted and hashed, then stored. A beautiful side effect of common cryptographic hash algorithms is that they yield a string of fixed length. The length varies for each hash function, of course: written as hexadecimal, the old MD5 generates a 32-character string, SHA-1 produces 40 characters and a SHA-512 hash has a length of 128. But no matter which of these common hashing functions is used, the resulting string ends up with the same fixed length regardless of the (cleartext) input.
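
To make the fixed-length property concrete, here is a minimal Python sketch using the standard library's hashlib. The function name digest_lengths and the sample passwords are made up for this illustration, and the salting is deliberately simplistic; the only point is that the input length has no effect on the output length.

```python
import hashlib
import os


def digest_lengths(password: str) -> dict:
    """Return the hex digest length per algorithm for a salted password."""
    salt = os.urandom(16)  # random salt, for illustration only
    data = salt + password.encode("utf-8")
    return {
        name: len(hashlib.new(name, data).hexdigest())
        for name in ("md5", "sha1", "sha512")
    }


# A 7-character password and a 250-character one
for pw in ("hunter2", "correcthorsebatterystaple" * 10):
    print(len(pw), digest_lengths(pw))
```

Both lines report 32, 40 and 128 characters, so a column sized for the hash would never need to care how long the password was.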

All else being equal, the longer a password, the more entropy it has and hence the more difficult it is to crack. As XKCD #936 so adequately put it: the all-lowercase, no-numbers, no-special-characters passphrase correcthorsebatterystaple contains approx. 44 bits of entropy and is still a password you can remember¹.
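
The 44 bits can be reproduced with a little arithmetic. This is a rough sketch that assumes every symbol is picked uniformly at random; the helper name entropy_bits is made up here, and the 2048-word list is the assumption behind the comic's 11 bits per word.

```python
import math


def entropy_bits(pool_size: int, length: int) -> float:
    """Entropy in bits of `length` symbols drawn uniformly at random from a pool."""
    return length * math.log2(pool_size)


# XKCD #936: four words picked at random from a list of 2048 words
print(entropy_bits(2048, 4))            # 44.0

# Randomly chosen lowercase-only passwords of 8 and 12 characters
print(round(entropy_bits(26, 8), 1))    # 37.6
print(round(entropy_bits(26, 12), 1))   # 56.4
```

Every extra character adds the same number of bits, which is exactly what an upper length limit takes away.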

Why people would limit the maximum number of characters for a password is completely beyond me, both from a security and a technical perspective.

¹More on passwords you can remember and why they are dangerous below.

»May not include special characters«

Some of the websites that told their users to use both upper- and lowercase letters and digits, interestingly enough, complained about special characters like ( ) { # $ % ö ß.

Having a greater pool to draw entropy from (i.e. having more characters at your disposal) will increase the password strength dramatically. I can only imagine that the technical reasoning behind this restriction is encoding². They either don't trust their web frontend to deliver properly encoded data or they, again, store passwords in cleartext and fear database and/or table collation issues. But hey, the cryptographic hash functions mentioned above only produce ASCII output when written as hexadecimal, so that should solve that, eh?
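
A small sketch of that last point, assuming the password is encoded as UTF-8 before hashing. The helper hex_hash and the sample password are hypothetical, and the salt is left out to keep the example short.

```python
import hashlib


def hex_hash(password: str) -> str:
    """Hash a password after encoding it as UTF-8; returns a hex string."""
    return hashlib.sha512(password.encode("utf-8")).hexdigest()


# Umlauts, ß and symbols go in, plain hexadecimal comes out
digest = hex_hash("pässwörd(){#$%ß")
print(digest.isascii(), len(digest))  # True 128
```

Whatever ends up in the database is plain hexadecimal, so collation settings never get to see the special characters.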

²There is a classic German geek tee sporting a joke: »Schei? encoding« - funny!

Dictionary Attack All The Things!

I've recently read an article about an experiment that tapped Project Gutenberg, Wikipedia and even Twitter to get more words for a dictionary attack. The word list was later enriched further with the contents of news sites, publicly available mailing list archives and IRC chat logs, and it can theoretically suck in every digitally available text under the sun. The result is a word list that can easily crack long passphrases, even when distorted with classic »l34t sp34<« substitutions (e.g. E → 3, S → $).

Add to that the fact that modern password crackers use multiple GPUs in parallel to achieve a ridiculously high number of guesses per second: the machine presented in the article used four AMD Sapphire Radeon 7950 graphics cards, a commodity system for only 800 USD, and it does 30 billion guesses per second.
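
To put that rate into perspective, here is a back-of-the-envelope sketch of how long a plain brute-force run would take. It assumes the quoted 30 billion guesses per second holds for whatever hash is being attacked, which is a simplification. The function name and the example pool sizes are made up for illustration, and a dictionary attack would of course be much faster than this worst case.

```python
GUESSES_PER_SECOND = 30e9  # figure quoted in the article; real rates depend on the hash


def seconds_to_exhaust(pool_size: int, length: int) -> float:
    """Worst-case time to try every password of exactly this length."""
    return pool_size ** length / GUESSES_PER_SECOND


# (pool size, length): lowercase only, letters + digits, printable ASCII
for pool, length in ((26, 8), (62, 8), (62, 12), (94, 12)):
    secs = seconds_to_exhaust(pool, length)
    years = secs / 86400 / 365
    print(f"pool={pool:2d} length={length:2d}: {secs:.3g} s ({years:.3g} years)")
```

An 8-character, lowercase-only password falls in about 7 seconds under these assumptions.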

Consider All Passwords Insecure

There is one good thing about all this: no matter how hard I try to come up with a secure enough password, I will fail. It really drove home the fact that I shouldn't bother remembering passwords at all. The only problem: passwords are here to stay, at least for the time being (use 2FA wherever possible!).

I've been using the tiny but very handy program pwgen for a very long time (try it out with brew install pwgen / apt-get install pwgen). Its selling point is that it creates random but memorable passwords, basically by putting vowels and consonants in an order that humans can still pronounce.

The problem with those passwords: even if you can pronounce them, they cannot be arbitrarily long if you want to remember them. Leaving that necessity behind, pwgen -c -s -1 32 generates a nicely random, 32-character string, and an additional -y also includes symbols. The output is pretty neat and, I would say, reasonably secure.
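
If pwgen isn't at hand, something similar can be sketched with Python's standard secrets module. This is not pwgen's algorithm, just a rough equivalent under my own assumptions: the function name random_password and its parameters are made up, and unlike pwgen -c it does not guarantee that a capital letter appears.

```python
import secrets
import string


def random_password(length: int = 32, symbols: bool = False) -> str:
    """Build a password from cryptographically secure random choices."""
    alphabet = string.ascii_letters + string.digits
    if symbols:
        alphabet += string.punctuation  # roughly what pwgen's -y adds
    return "".join(secrets.choice(alphabet) for _ in range(length))


print(random_password())              # letters and digits only
print(random_password(symbols=True))  # may also contain punctuation
```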

The problem is: where to store it, but I'll leave that for another day.
