The world of cryptography is vast and often very complicated, which is why today I'm going to go over the basics of MD5 and explain how it works.
MD stands for ‘Message Digest’ and describes a mathematical function that can be applied to a variable-length string. The number 5 simply indicates that MD5 was the successor to MD4. MD5 is essentially a checksum that is used to validate the integrity of a file or a string, and this is one of its most common uses. Let's take a look at a working example. Let's say you have released some software that you want people to freely distribute. This is all well and good, but what if someone were to tamper with your application with malicious intent? For example, what if they added malware to your program? How would people know? Well, if you had taken an MD5 checksum of your original program and made this information public, then people who downloaded your software could compute the MD5 checksum of their downloaded file and check that it matches yours. If it does, then great! If not, then the file has been corrupted or tampered with.
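To make the download check concrete, here is a minimal sketch in Python using the standard hashlib module. The function names md5_of_file and matches_published_checksum are my own, chosen for illustration, and the published checksum is assumed to be a plain hex string.

```python
import hashlib


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 checksum of a file as a 32-character hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare a file's MD5 against the checksum the author published."""
    # Hex digests are case insensitive, so normalise before comparing.
    return md5_of_file(path) == published_hex.strip().lower()
```

You would call matches_published_checksum with the path of the downloaded file and the checksum copied from the author's site, and treat a False result as a sign that the file should not be trusted.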
How does it work? – I’ll try and explain this as simply as I can. If you're new to cryptography then this is where it can get complicated! Firstly the input file or string is padded: a single 1 bit is appended, followed by zeros, followed by the original length as a 64-bit number, so that the total length is an exact multiple of 512 bits. It is then split into 512-bit blocks. The output of an MD5 hash is always 128 bits. Internally this 128-bit state is held as four 32-bit words, let's call them A, B, C and D. Each 512-bit block is mixed into the state by 64 operations, arranged as four rounds of 16 operations each. Every operation combines a non-linear function of three of the words, a modular addition and a left rotation. Once every block has been processed, the final 128-bit state is the output, which is normally written as a 32-character hex string.
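If it helps to see those steps as code, here is a rough sketch of the algorithm in Python, following the structure given in RFC 1321. The names md5_pad, md5_sketch and left_rotate are my own. It is only meant to show the padding, the four words and the 64 operations, so for real work use hashlib, which does all of this for you.

```python
import math
import struct

# Left-rotation amounts: four rounds of 16 operations each.
S = [7, 12, 17, 22] * 4 + [5, 9, 14, 20] * 4 + [4, 11, 16, 23] * 4 + [6, 10, 15, 21] * 4
# 64 additive constants taken from the sine function, as RFC 1321 specifies.
K = [int(abs(math.sin(i + 1)) * 2**32) & 0xFFFFFFFF for i in range(64)]


def left_rotate(x, n):
    """Rotate a 32-bit value left by n bits."""
    x &= 0xFFFFFFFF
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def md5_pad(message):
    """Append a 1 bit, zeros, then the 64-bit length (multiple of 512 bits)."""
    bit_len = (len(message) * 8) & 0xFFFFFFFFFFFFFFFF
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + struct.pack("<Q", bit_len)


def md5_sketch(message):
    """Return the MD5 digest of a bytes object as a hex string."""
    # Initial values of the four 32-bit words A, B, C and D.
    a0, b0, c0, d0 = 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476
    padded = md5_pad(message)
    for offset in range(0, len(padded), 64):  # one 512-bit block at a time
        m = struct.unpack("<16I", padded[offset:offset + 64])
        a, b, c, d = a0, b0, c0, d0
        for i in range(64):
            # Each round uses its own non-linear function and message order.
            if i < 16:
                f, g = (b & c) | (~b & d), i
            elif i < 32:
                f, g = (d & b) | (~d & c), (5 * i + 1) % 16
            elif i < 48:
                f, g = b ^ c ^ d, (3 * i + 5) % 16
            else:
                f, g = c ^ (b | ~d), (7 * i) % 16
            # Modular addition, then a left rotation.
            f = (f + a + K[i] + m[g]) & 0xFFFFFFFF
            a, d, c = d, c, b
            b = (b + left_rotate(f, S[i])) & 0xFFFFFFFF
        # Fold this block's result back into the running state.
        a0 = (a0 + a) & 0xFFFFFFFF
        b0 = (b0 + b) & 0xFFFFFFFF
        c0 = (c0 + c) & 0xFFFFFFFF
        d0 = (d0 + d) & 0xFFFFFFFF
    # The 128-bit output is the four words in little-endian order.
    return struct.pack("<4I", a0, b0, c0, d0).hex()
```

Calling md5_sketch(b"hello world") should give the same digest as hashlib.md5(b"hello world").hexdigest(), which is an easy way to check the sketch against a trusted implementation.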
Working example –
Let’s run MD5 on the following string: ‘hello world’ (all lowercase, since MD5 is case sensitive). Here is the output:
5eb63bbbe01eeed093cb22bb8f5acdc3
Now to give you an idea of how the function behaves, let's MD5 a very similar string, with only the last letter changed: ‘hello worle’. Here is the output:
18c5650581f01f1a52c87eee5baa754a
Can you see how drastically different the two outputs are? In cryptography this is called ‘The Avalanche Effect’.
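You can reproduce this kind of comparison yourself with Python's hashlib module. The sketch below hashes the lowercase strings ‘hello world’ and ‘hello worle’ and counts how many of the 128 output bits differ. The helper names md5_hex and differing_bits are my own.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of a string (UTF-8 encoded) as hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions at which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = md5_hex("hello world")
second = md5_hex("hello worle")
print(first)
print(second)
print(differing_bits(first, second), "of 128 bits differ")
```

For a well-behaved hash function you would expect roughly half of the bits to flip when the input changes by a single character.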
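Vulnerabilities – MD5 is a one-way function, which means it cannot simply be run in reverse to recover the input. Trying every possible input until one matches a given 128-bit digest is not practical either, at least when the input is long and random. Short or common inputs such as passwords are a different story, and the most common form of attack on an MD5 hash is a Rainbow Table attack. This works in a similar way to brute force, but the guessing is done in advance: the attacker uses a massive precomputed database that maps MD5 hashes back to the inputs that produced them. There have also been numerous demonstrations showing that two different files can be deliberately crafted to generate the same hash (a collision). Without any tampering, though, this is very, very unlikely: the chance of two given files colliding by accident is about 1 in 2^128, or roughly 1 in 3.4 × 10^38.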
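To show the idea behind the precomputed database, here is a small sketch in Python. Note that this is a plain lookup table and not a true rainbow table, which uses chains of hashes to save space, but the effect for the attacker is the same. The word list and the names build_lookup_table and lookup are made up for this example.

```python
import hashlib


def md5_hex(text):
    """Return the MD5 digest of a string (UTF-8 encoded) as hex."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup_table(candidates):
    """Hash every candidate once, up front, and index them by digest."""
    return {md5_hex(word): word for word in candidates}


# Hypothetical word list; real attacks use millions of common passwords.
WORDLIST = ["password", "letmein", "hello world", "qwerty"]
TABLE = build_lookup_table(WORDLIST)


def lookup(hash_hex):
    """Return the original input if its hash is in the table, else None."""
    return TABLE.get(hash_hex.lower())


print(lookup("5eb63bbbe01eeed093cb22bb8f5acdc3"))  # prints: hello world
```

Nothing is being reversed here. The table only finds inputs that somebody already guessed and hashed, which is why long random inputs stay safe and common passwords do not.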
With these vulnerabilities in mind, most people are moving away from MD5 in their applications. Algorithms from the SHA-2 family, such as SHA-256, are recommended instead when security is essential.
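Nevertheless, MD5 has been around for years and is still good enough for certain things, such as spotting accidental corruption in a download. It has also commonly been used to store passwords in databases, on the reasoning that MD5 cannot be reversed. Because of the lookup attacks described above, though, passwords stored as plain MD5 hashes should not be considered safe; a salted, deliberately slow function such as bcrypt, scrypt, Argon2 or PBKDF2 is the better choice.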
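For password storage specifically, a salted and deliberately slow function is a safer choice than a bare MD5 digest. Here is a minimal sketch using PBKDF2 from Python's standard library. The iteration count and the names hash_password and verify_password are my own assumptions for the example, so tune the count for your own hardware.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; higher is slower for attackers too.
ITERATIONS = 200_000


def hash_password(password, salt=None):
    """Return (salt, derived_key) for a password using PBKDF2-HMAC-SHA256."""
    if salt is None:
        # A fresh random salt per password defeats precomputed tables.
        salt = os.urandom(16)
    derived = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, derived


def verify_password(password, salt, expected):
    """Re-derive the key with the stored salt and compare in constant time."""
    _, derived = hash_password(password, salt)
    return hmac.compare_digest(derived, expected)
```

You would store the salt and the derived key together in the database, then call verify_password with both when the user logs in.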
So there you have it, a brief and not too complicated introduction to MD5.
