Jul 19, 2021
Managing user data on a platform is always a challenge. It is impossible to be 100% secure against data leaks, and the impact of these incidents can be severe. Among sensitive information, the email and password pair deserves special attention.
Passwords have received most of our attention. There are numerous restrictions on length and character mix (upper and lower case, numbers, and special characters). There are mnemonic techniques to help users remember them, and password vaults so that you can use a different password for each account. Finally, we store passwords as salted hashes, and even that isn't enough to keep accounts safe after a leak.
That said, when it comes to emails, why do we keep saving them in plain text, just as we did decades ago? The reason is simple: platforms need to stay in touch with customers, informing them of updates, new products, and even security incidents. Hashes are irreversible, so they would make it impossible to retrieve an email address once saved. If your platform, for some reason, doesn't need to contact the user via email, keeping the address as a salted hash is a good option. But what about when contact is needed? Is it possible to store an email address securely? Fortunately, yes.
A Very Very Old Trick
When we talk about cryptography these days, most people think of cryptocurrencies, blockchain, and related terms. But cryptography is much more than that. Since Caesar's time, messages have been encrypted to keep important information away from enemies.
We currently have many more resources than the distinguished officers of the Roman empire. There are methods we can use to encrypt an email address that keep the result reversible and do not affect server performance. Again, don't expect 100% security. But hey! We're talking about adding one layer of security where none exists!
Let's Do It
One of my favorite methods is to use an XOR algorithm. In the links below you can understand more about how it works:
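In short, XOR combines two values bit by bit, and applying the same key a second time restores the original value. A tiny illustration in Python (the variable names are only for this example):

```python
plain = ord("e")  # 0x65
key = ord("9")    # 0x39

cipher = plain ^ key     # XOR once to hide the value
restored = cipher ^ key  # XOR again with the same key to get it back

print(hex(cipher), chr(restored))  # 0x5c e
```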
Now that you know how it works, we can start a Python implementation quickly. Below, we create two of the necessary functions:
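A minimal sketch of what these two functions might look like. The names `encrypt` and `decrypt` and the hex output match the examples later in this post; repeating the key when it is shorter than the text, and accepting either `str` or `bytes` in `decrypt`, are assumptions:

```python
from itertools import cycle


def encrypt(text: str, key: str) -> str:
    """XOR every byte of `text` with `key` and return the result as hex."""
    if not key:
        raise ValueError("key must not be empty")
    data = text.encode("utf-8")
    # cycle() repeats the key if it is shorter than the text (an assumption).
    xored = bytes(b ^ k for b, k in zip(data, cycle(key.encode("utf-8"))))
    return xored.hex()


def decrypt(hex_text, key: str) -> str:
    """Reverse encrypt(); the hex value may be passed as str or bytes."""
    if not key:
        raise ValueError("key must not be empty")
    if isinstance(hex_text, bytes):
        hex_text = hex_text.decode("ascii")
    data = bytes.fromhex(hex_text)
    xored = bytes(b ^ k for b, k in zip(data, cycle(key.encode("utf-8"))))
    return xored.decode("utf-8")
```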
As you can see, these are very simple functions. So what are the best practices for encoding email addresses? As I said, XOR is far from the most secure form of encryption. But there are some tricks we can use:
Preferably, the key should have the same size (in number of characters) as the text to be encrypted. Keys larger than the source text do not add security as the excess will not be used by the algorithm.
XOR is not an expensive algorithm to process, so we can split the text into parts and encrypt them with different keys.
In our case, we can use the "@" as a delimiter, encoding the two halves separately.
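The splitting idea can be sketched as follows. This is only an illustration: `xor_hex` and `encrypt_email` are hypothetical names, both keys are assumed to be supplied by the caller, and leaving the "@" unencrypted between the two hex halves is an assumption about the storage format:

```python
from itertools import cycle


def xor_hex(text: str, key: str) -> str:
    """XOR `text` with `key` (repeated if shorter) and return hex."""
    if not key:
        raise ValueError("key must not be empty")
    pairs = zip(text.encode("utf-8"), cycle(key.encode("utf-8")))
    return bytes(b ^ k for b, k in pairs).hex()


def encrypt_email(address: str, local_key: str, domain_key: str) -> str:
    """Encrypt the parts before and after the last "@" with separate keys."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected an address of the form local@domain")
    return f"{xor_hex(local, local_key)}@{xor_hex(domain, domain_key)}"
```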
How to Proceed?
For example, we will encrypt the address email@example.com. We can split it into "email", "@", "example.com". The first element consists of five characters. The question now is: how do we define a secure key? Well, we have some values available. Once the password is hashed, we can reuse a portion of this hash as a key.
In our example we will use an MD5 hash and choose a set of five characters from it. This value does not need to be saved, as we already store the hash. It also means that if the database is leaked, nothing in it reveals which characters of the hash were used as the key.
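A minimal sketch of picking key characters out of a digest. The helper name `pick_key`, the placeholder input, and the list of positions are illustrative, and the positions are assumed to be zero-based indexes into the hex string:

```python
import hashlib


def pick_key(digest_hex: str, positions: list[int]) -> str:
    """Build a key from the characters at the given zero-based positions."""
    return "".join(digest_hex[i] for i in positions)


# Placeholder input; in practice this would be the stored password hash.
digest = hashlib.md5(b"password-plus-salt").hexdigest()
key = pick_key(digest, [2, 7, 16, 21, 6])  # five characters for a five-character element
```

Only the list of positions has to be kept secret on the application side, since the digest itself is already in the database.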
Assuming the user's password is "badpass" and the salt "365ropmNUjtq08xSZOiMrgRjG9OMMe82Hh8LU1M" is appended to it, we get the following hash: 249ec31cf8b1946371bdeed6603b8341. We only need five characters, so let's take the characters at positions 2, 7, 16, 21, and 6 (counting from zero): "9", "c", "7", "e", "1". 9c7e1 is our first key. Applying the XOR to the first element, we have the following:
encrypt("email", "9c7e1") -> 5c0e560c5d
To test, let's do the reverse:
decrypt(b"5c0e560c5d", "9c7e1") -> "email"
Nice! Repeat the process with the second element, the part after the "@", using a different key. You have several values at your disposal: an internal administrative key, or other parts of the password hash. You can even use the result of the first element! However, remember to prefer a key with the same length as the element. Finally, your database will look like this:
email: [email protected],
Looks a little bit safer now, doesn't it?
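When the platform needs to contact the user, the stored value has to be turned back into an address. A minimal sketch, assuming each half is stored as hex with the "@" left in place and that both keys are supplied by the caller (`xor_with_key` and `decrypt_email` are hypothetical names):

```python
from itertools import cycle


def xor_with_key(data: bytes, key: str) -> bytes:
    """XOR `data` with `key`, repeating the key if it is shorter."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key.encode("utf-8"))))


def decrypt_email(stored: str, local_key: str, domain_key: str) -> str:
    """Rebuild the address from a stored value of the form hex@hex."""
    local_hex, sep, domain_hex = stored.partition("@")
    if not sep:
        raise ValueError("expected a stored value of the form hex@hex")
    local = xor_with_key(bytes.fromhex(local_hex), local_key).decode("utf-8")
    domain = xor_with_key(bytes.fromhex(domain_hex), domain_key).decode("utf-8")
    return f"{local}@{domain}"
```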
The Final Test
To test the security of our method, here's a challenge: try to break the encryption used in the second element, "example.com". Comment below with the key used to generate the value found in the user's record, and tell me what you think of using algorithms like this to increase users' security, even if only a little.
Update: The use of MD5 here was illustrative. In real environments, prefer Blake2b or any other more secure hash.
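Following the update above, a minimal sketch of swapping MD5 for BLAKE2b using Python's standard `hashlib`. Appending the salt to the password mirrors the example in this post; the function name `salted_blake2b` is hypothetical:

```python
import hashlib


def salted_blake2b(password: str, salt: str) -> str:
    """Return the BLAKE2b hex digest of the password with the salt appended."""
    return hashlib.blake2b((password + salt).encode("utf-8")).hexdigest()
```

With the default digest size, the result is 128 hex characters long, so there are more positions to draw key characters from.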