Managing Passwords without storing them
This article explores an alternative to regular password managers that attempts to store as little information about passwords as possible.
ADDENDUM 2023-12-27:
Threat modelling is important for any security decision. I did not mention threat modelling in this article, and I won’t go much into detail here, but basically it probably doesn’t improve security in any way. In fact the hashed account/service information described below were plain text embedded in the binary, instead of being stored in an encrypted form in the password database.
I don’t use the application I wrote for this application, but instead use KeepassXC. For more integrated password managers I recommend 1Password which I used previously, but Bitwarden is supposedly a good open source alternative.
I mention this because all of these support more than just storing passwords, including payment information, cryptographic keys, 2fa tokens, etc. While what I propose in the following article can be adjusted to integrate with your security key, I don’t think it is going to be particularly useful since changing the security key will change all of your passwords by design.
With that out of the way, let’s get into the original article, its release date is unknown to me but it was some time in November 2020:
Password managers are the generally recommended way of storing your passwords[1]. Currently available password managers typically have encrypted data files stored either locally or in the cloud, that can only be decrypted with your “master password” These data files contain all of the information necessary to log you in. Password managers also contain password generators and especially browser-based ones have autofill support.
This approach is good enough for most users and provides a signifficantly higher level of security than approaches that don’t use password managers (mostly involving the memorization of passwords)
The Initial Idea
I initially had the idea for doing this around 2015 after byuu, a former retro gaming console (most notably SNES) emulator author, briefly explained their personal password manager, which they apparently use to this day[2].
My initial implementation of the same basic idea was a lot more convoluted and certainly less secure (even though it shouldn’t matter for the password it generated in the end), however I hope my new implementation is a lot closer to secure.[3]
The Gist
In an ideal world, we would not need to store any information about the password at all. However due to conflicting password requirements and counterproductive password expiry rules, you will not be able to find a function that can generate the most secure passwords for every given available service.
We therefore have to store a small amount of metadata, but we can afford to store the sensitive parts of it in a hashed form, since it is only needed to pick the correct entry from the database.
In particular we need to store:
- A hash of some sort unique per login combination (service + username/email)
- The “iteration” number in case you need to rotate the password
- A description on how to generate the password (in machine-readable form, for example by naming a specific generator function)
Proposed Mechanism
Let me preface this section by saying that I’m not a cryptographer or a security export, so before you implement it and start using this mechanism in production, let someone a lot more qualified than me look over this article. I should also mention that I found less-than-ideal decisions with questionable impact on security in my personal implementation of my password manager. These issues have been rectified in this document.
Generic version
You need a CSPRNG, which is a random number generator which is cryptographically secure. In particular this also requires that no matter how much of the output you already know, you cannot predict the next n bits any more accurately than a 1/2^n
chance, however as a PRNG it is fully deterministic, meaning that for the same exact input you receive the same exact output.
This exact property is exploited in stream-ciphers[7] like chacha20 or AES-CTR, which XOR the data with enough random data from the namesake CSPRNG. It has the benefit that you can encrypt data without any padding being necessary, which are difficult to implement correctly and has the chance of being abused in certain cases[4].
The key of the CSPRNG is generated by a “Key Derivation Function” which creates a cryptographic key from less secure imputs, such as Passwords. The input to this KDF should be based on the user password, the login (service + username/email) and the “iteration” number to generate truly unique “seeds”.
The output of the CSPRNG should be used to choose individual characters inside of the password. This can include unicode characters for certain services.
Pseudocode:
key = kdf(password, salt=(site ++ username ++ iteration.to_bytes(8)))
rng = csprng(seed = key)
password = ""
for x in range(password_len):
password += rng.choose(alphabet)
Potential Implementation
This is close to my current implementation, but questionable choices have been replaced. The design does not allow changing the implementation without invalidating the passwords.
As CSPRNG I chose chacha20 with a 128 bit counter (and no nonce). This is because of the large block size (512 bits) and large key size (256 bits).
A nonce is not necessary as each key is only used for one purpose.Addendum 2023-12-28: You really do not want to skip out on the nonce.
As PRF I chose argon2id, as that is considered a great choice for hashing passwords. It has configurable computation and memory hardness and therefore makes bruteforce difficult.
The salt is the sha256 hash of the byte string
"{site}\0{username}\0{iteration}"
. The reason behind this is that this allows the password to be different even with the same master password[5]The database entry hashes are
HMAC-SHA256("{site}\0{username}")
with the key being generated by argon2id with salt “password”. This is to prevent “mining” the public part of a credential from the password databaseAddendum 2023-12-28: as above, this is more information than would be exposed in a password manager database.
Randomness Generation
Unlike described above in the pseudocode section, it might make sense to generate all of the password characters at once. n bits of randomness can generate n log_d(2)
with d
being the alphabet size, or alternatively, n
characters in an alphabet of d
requires ceil(n log_2(d))
bits of information.
An implementation of this would require a bigint implementation however as a 64 character password requires 420 bits (ASCII) or 953 bits (Unicode 11) of random data, but it would fundamentally work like this:
bits_needed = int(math.ceil(password_len * math.log2(len(alphabet))))
bits = rng.bits(bits_needed)
password = ""
for x in range(password_len):
password += alphabet[bits % len(alphabet)]
bits //= len(alphabet)
A more simple method could involve just generating a 32 bit number mod the alphabet size, even if it requires 4x the amount of randomness and doesn’t use bits efficiently. I don’t know if one is to be preferred over the other in cryptography
Password Alphabets
Currently I can’t recommend a unicode-based alphabet, even though Unicode password support is recommended by the NIST[6], support is spotty at best and might even cause problems with the software you are using. For example I found that passwords containing unicode 12, 12.1 or 13 codepoints did not copy correctly from termux, as they replace every codepoint not assigned (according to your device’s OS) with a replacement character. Other services may limit the amount of bytes a password can be in length as opposed to the amount of characters, which means that passwords with 64 random unicode characters are almost certainly longer than the limit of 72 bytes they impose.
On the other hand, many big tech companies do not struggle with unicode at all!
Cons of this Design
Due to design limitations there are a few things that it cannot do
- You can’t change the master password without changing all of your passwords
- You can’t change the username without the password
- You can’t add passwords that you don’t get to choose yourself
The implementation I described also can not safely store notes together with the passwords, such as recovery codes for 2fa. This is definitely possible with an encrypted database. I believe an encrypted database wouldn’t improve the security of the design, since it can be basically boiled down to 32 random bytes followed by a low number
Pros of this designs
Addendum:
I just think it’s neat. Not really a satisfying conclusion here. As described here, it would still need a database, which reduces some of the manageability from both byuu’s implementation of this idea, as well as what I found a few hours after publishing the original article.
Existing designs
EDIT 2 hours later: Shortly after releasing this blog article I have been told that an app similar in concept has already been publically released. It is similar in concept with a few key differences:
- It uses scrypt instead of argon2id. I do not believe there is any meaningful difference in security between these two
- It does not use a CSPRNG to generate randomness, but instead uses the hash directly. This limits the maximum password length to the hash output size.
- Maximum password length is only 20, but could be increased to 63 31, not effectively unlimited[8]
- The keyspace for the maximum security password is over 50x times smaller compared to password with the same length and same alphabet due to the added restrictions, it is however statistically unlikely to not have any alphabetic characters in the password[9]
- Implementations with graphical UIs already exists for all major operating systems
Note: the above list might be slightly inaccurate as the whitepaper on the homepage errorneously claims that hmac-sha-256 outputs 512 bits. Maybe the actual implementation uses HMAC-SHA-512 or maybe the theoretical password size limit is 31 EDIT: The whitepaper does errorneously claim that hmac-sha-256 outputs 512 bits. The implementation however seems to use the correct 256 bits.
- 1.See NCSC, Troy Hunt, etc ↩
- 2.Edited: All tweets surrounding this have been deleted, but they have been archived: https://web.archive.org/web/20201108172826/https://twitter.com/dark_kirb/status/1325490061501210625. Also weird phrasing on my part. As far as I can tell the article was written mere weeks after the tweets here. ↩
- 3.footnote story time addendum: So basically the reason why I don’t think it was secure is because I took the general form of chacha20 and expanded upon it in weird ways. I don’t think I have the original code anymore but it basically involved extending some of the values (like the bits) to extreme levels (I think I extended chacha20 to 16x512 bits instead of the standard 16x32bits). It was almost certainly not secure but it took longer than what this article shows, probably in part because python bignum arithmetic was involved. Sorry for the run-on paragraph but that’s a limitation with this site’s footnote parsing code. ↩
- 4.Padding Oracle Attacks for example ↩
- 5.addendum: Babby’s first canonicalization schema. Today I would probably use PASETO’s Pre-authentication encoding. ↩
- 6.NIST Special Publication 800-63B Section 5.1.1.2 ↩
- 7.Not all stream ciphers work like this, most notably the very broken RC4. ↩
- 8.The actual limit is at 2^131 bits of randomness, maximum limit in characters varies per charset ↩
- 9.There are 52 alphabetic characters out of 86 total characters, the chance of it not being chosen 20 times is
(1-(52/86))^20
or a percentage with 7 zeroes after the decimal point. ↩