Today, we will be explaining how hashing algorithms work and what their function is in a cryptocurrency.
What Is Hashing?
A hash algorithm compresses data to a specific size. Hashes enable computers to easily compare or identify files or databases. Instead of comparing the data itself in its original form, they just compare the hash values. Hashing is applied in password storage, computer graphics, SSL certificates, and many other areas.
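To make the comparison idea concrete, here is a minimal sketch in Python using the standard `hashlib` module. The helper name `fingerprint` and the in-memory "files" are our own illustrative assumptions, not part of any particular system.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()

# Two hypothetical "files", held in memory for the sake of the example.
original = b"quarterly report, final version\n" * 1000
copy = bytes(original)
tampered = original.replace(b"final", b"draft", 1)

# Comparing two short digests stands in for comparing the full contents.
same_as_copy = fingerprint(original) == fingerprint(copy)          # True
same_as_tampered = fingerprint(original) == fingerprint(tampered)  # False
```

Each digest here is only 64 hexadecimal characters long, no matter how large the data is, which is what makes the comparison cheap.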
The defining features of hashing are irreversibility and uniqueness. Once a piece of data has been hashed, you cannot reverse the process. Also, in practice you should never get the same hash value when hashing two different pieces of data. If two distinct pieces of data produce the same hash, the occurrence is called a “hash collision,” and an algorithm for which collisions can be found on demand is considered broken for security purposes.
A hashing function is the mathematical function that converts input data of arbitrary length into a compressed numerical value of fixed length. This numerical output is called the hash value, or simply the hash.
The length of the output is dictated by the hashing algorithm used. Common hashing algorithms produce outputs between 160 and 512 bits long.
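As a quick check of the fixed-length property, the sketch below hashes inputs of very different sizes with three algorithms available in Python's `hashlib`. The helper `digest_bits` and the sample inputs are assumptions made for illustration.

```python
import hashlib

# Inputs of very different lengths: empty, one byte, one million bytes.
inputs = [b"", b"a", b"a" * 1_000_000]

def digest_bits(algorithm: str, data: bytes) -> int:
    """Hash the data and return the length of the digest in bits."""
    return len(hashlib.new(algorithm, data).digest()) * 8

# For each algorithm, collect the set of output lengths seen across all inputs.
sizes = {
    name: {digest_bits(name, data) for data in inputs}
    for name in ("sha1", "sha256", "sha512")
}
# Each set holds a single value: 160, 256 and 512 bits respectively.
```

Whatever the size of the input, each algorithm always returns a digest of the same length.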
As we mentioned above, the hash value is generated by running the input data through the hashing algorithm.
In crypto, hashes are applied to public keys: a wallet address, for example, is typically derived by hashing the public key. It is computationally infeasible to recover the original input from the hash value alone.
What Are Hashing Algorithms? How Do Hashing Algorithms Work?
A hash function is the core of the hashing algorithm. In order to generate the hash value of a pre-set length, the input data must be first divided into fixed-sized blocks, as the hash function takes in data at a fixed size. These are known as “data blocks.”
Data blocks have different sizes depending on the algorithm applied. In most cases, the message length won’t be an exact multiple of the block size. To handle this, a padding technique is used: extra bits are appended to the message so that it can be separated into data blocks of a fixed size. The hash function is then applied to each of the resulting data blocks.
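The padding step can be sketched as follows. This example follows the padding rule used by SHA-256 (a marker byte, zero bytes, then the original length as a 64-bit number) with its 64-byte block size. The function names `pad_message` and `split_blocks` are our own, and other algorithms pad differently.

```python
BLOCK_SIZE = 64  # bytes; the block size of SHA-256

def pad_message(message: bytes) -> bytes:
    """Pad a message to a multiple of BLOCK_SIZE, SHA-256 style."""
    bit_length = len(message) * 8
    # A single 0x80 byte marks where the message ends.
    padded = message + b"\x80"
    # Zero bytes fill up to 8 bytes short of a block boundary.
    padded += b"\x00" * ((56 - len(padded) % BLOCK_SIZE) % BLOCK_SIZE)
    # The last 8 bytes store the original length in bits (big-endian).
    padded += bit_length.to_bytes(8, "big")
    return padded

def split_blocks(padded: bytes) -> list:
    """Cut the padded message into fixed-size data blocks."""
    return [padded[i:i + BLOCK_SIZE] for i in range(0, len(padded), BLOCK_SIZE)]

blocks = split_blocks(pad_message(b"a" * 100))  # 100 bytes become 2 blocks
```

Note that even a message that already fills a block exactly still gets padding, so the end of the message is never ambiguous.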
Blocks are processed one at a time, with the output of the first data block being given as input along with the second data block. Then, the output of the second is given as input with the third block, and this goes on until the last block is processed. Thus, the final result depends on every block. If a single bit of the message is modified, the hash value changes completely, with roughly half of the output bits flipping on average. This is known as an “avalanche effect.”
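The chaining described above can be sketched with a toy hash. Everything here is an assumption made for illustration: the all-zero starting state `IV`, the simplified padding, and the `compress` function, which simply borrows SHA-256 as a mixer. It is not how any real algorithm is implemented internally.

```python
import hashlib

BLOCK_SIZE = 64   # bytes per data block
IV = bytes(32)    # hypothetical all-zero starting state; real algorithms use fixed constants

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression step: mix the running state with one data block."""
    return hashlib.sha256(state + block).digest()

def toy_hash(message: bytes) -> bytes:
    # Simplified padding: a 0x80 marker byte, then zeros up to a block boundary.
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK_SIZE)
    state = IV
    # The output of each block becomes an input for the next one.
    for i in range(0, len(padded), BLOCK_SIZE):
        state = compress(state, padded[i:i + BLOCK_SIZE])
    return state

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two digests differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = b"x" * 200
flipped = bytes([message[0] ^ 0x01]) + message[1:]  # flip one bit of the first byte
changed = differing_bits(toy_hash(message), toy_hash(flipped))
```

Here `changed` should come out near 128, about half of the 256 output bits, even though the two messages differ in a single bit of the very first block.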
Even though hashing algorithms were created to serve as one-way functions that cannot be inverted, there have been many cases of hashing algorithms being compromised.
Cryptographic hashes are employed to create digital signatures, used in password storage, file verification systems, message authentication, and various other forms of authentication.
One problem we will discuss in this guide is collisions. Because a hash is a fixed-length string while inputs can be of any length, there are more possible inputs than possible hashes. This means that for every possible input, there are other inputs that generate the same hash.
If someone manages to create collisions on demand, they can pass off fake files or data as genuine, because the fake produces the expected hash. Hash computation should therefore not be so efficient that an attacker can artificially search for collisions by trying huge numbers of inputs.
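To see why collisions must exist, and how quickly they turn up when the output is too short, the sketch below uses a deliberately weak hash: SHA-256 cut down to 16 bits. The names `tiny_hash` and `find_collision` are hypothetical and exist only for this example.

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> bytes:
    """SHA-256 truncated to 16 bits: a deliberately weak hash for illustration."""
    return hashlib.sha256(data).digest()[:2]

def find_collision():
    """Try inputs "0", "1", "2", ... until two of them share a tiny_hash."""
    seen = {}
    for n in count():
        candidate = str(n).encode()
        h = tiny_hash(candidate)
        if h in seen:
            return seen[h], candidate
        seen[h] = candidate

first, second = find_collision()
```

With only 65,536 possible outputs, the search is guaranteed to finish within 65,537 attempts and usually needs only a few hundred. With a full 256-bit output, the same search is far beyond any practical amount of computation.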
A good hash function should be able to:
- Compute the hash value of any kind of data quickly;
- Make it infeasible to reconstruct the message from its hash (a brute-force attack being the only option);
- Be resilient against “pre-image attacks” (in which hackers try to find an input that produces a given hash);
- Avoid hash collisions; it should be infeasible to find two messages with the same hash;
- Result in an avalanche effect when a change is made anywhere in the message.
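The brute-force option mentioned in the list can be sketched as follows. We assume, purely for illustration, that the secret is a four-digit PIN hashed with SHA-256; the helper names and the PIN itself are made up.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as hex."""
    return hashlib.sha256(text.encode()).hexdigest()

def brute_force_pin(target_hash: str):
    """Hash every 4-digit PIN until one matches the target hash."""
    for n in range(10_000):
        guess = f"{n:04d}"
        if sha256_hex(guess) == target_hash:
            return guess
    return None  # the original input was not a 4-digit PIN

target = sha256_hex("4821")          # hypothetical secret PIN
recovered = brute_force_pin(target)  # "4821"
```

The hash is never reversed here. The attacker simply guesses inputs and compares results, which only works because the space of possible inputs is tiny. For long, unpredictable inputs the same approach is hopeless.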
Common Hashing Algorithms
MD5 was one of the first hashing algorithms to be used extensively, until it was compromised. Because of its many vulnerabilities, MD5 has been deemed unsuitable for security purposes. Today, its only use is for checking data against accidental corruption.
Developed by the NSA, the Secure Hash Algorithm is a family of cryptographic hash functions. Its first member, SHA-0 (published in 1993), has been obsolete for decades now.
SHA-1 (1995) generates a 160-bit (20-byte) hash value, usually written as a 40-digit hexadecimal number, and brought only a minor improvement over MD5. Theoretical collision attacks compromised the algorithm in 2005, but its mass replacement only began around 2010.
SHA-2 is the algorithm version that is still in use and regarded as safe. The SHA-2 family consists of six hash functions: SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256.
SHA-3 was developed through a public competition held by the National Institute of Standards and Technology (NIST), announced in 2007. SHA-3 became a standard in 2015, and even though it shares its name with the NSA-designed algorithms, it actually belongs to a family of hashing algorithms known as KECCAK (pronounced ketch-ak).
The difference is that they have a sponge construction: input blocks are absorbed into an internal state that is scrambled by a fixed permutation after each block, and the output is then squeezed out of that state in as many pieces as needed.
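Python's `hashlib` includes the standardized SHA-3 functions, so the difference from SHA-2 is easy to try out. The sketch below also uses SHAKE128, an extendable-output function from the same standard, as a rough illustration of the "squeezing" side of the sponge. The input and variable names are arbitrary.

```python
import hashlib

data = b"hello"

# Same input, same output length, but unrelated digests from the two families.
sha2_digest = hashlib.sha256(data).hexdigest()
sha3_digest = hashlib.sha3_256(data).hexdigest()

# SHAKE128 lets the caller choose how many bytes to squeeze out of the sponge.
shorter = hashlib.shake_128(data).hexdigest(16)  # 16 bytes -> 32 hex characters
longer = hashlib.shake_128(data).hexdigest(64)   # 64 bytes -> 128 hex characters

# Squeezing more output extends the stream; the first 16 bytes stay the same.
is_prefix = longer.startswith(shorter)  # True
```

One caveat: `hashlib.sha3_256` implements SHA-3 as standardized, which uses slightly different padding from the original KECCAK submission, so its digests do not match those of the original KECCAK-256.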
Hashing and How It Is Used in Blockchain
In order to hash data, Bitcoin makes use of SHA256, while Ethereum employs a variant of SHA-3 (KECCAK256). Ethereum’s proof of work algorithm, Dagger-Hashimoto, was designed to be memory-hard, which makes it difficult to speed up with specialized hardware.
SHA256 itself can be computed on any hardware, but mining Bitcoin competitively is in practice only feasible using Application-Specific Integrated Circuits (or ASICs). Bitcoin hashes data with SHA256 by applying the algorithm twice in its protocol. By using a double SHA256, Bitcoin can lessen the damage of a possible length-extension attack.
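Double SHA256 is simple to sketch with `hashlib`: the raw 32-byte digest of the first pass is hashed again. The function name `double_sha256` and the sample input are our own choices for illustration.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice: hash the data, then hash the resulting digest."""
    first_pass = hashlib.sha256(data).digest()
    return hashlib.sha256(first_pass).digest()

digest = double_sha256(b"hello")
single = hashlib.sha256(b"hello").digest()
```

The second pass takes the binary digest as input, not its hexadecimal text form. The result is still 32 bytes long, but it differs from the single-pass hash.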
In this type of attack, a hacker who knows the hash of a secret input and the length of that input can use the hash value to restore the hash function’s internal state, append extra data, and compute a valid hash for the extended message without ever knowing the secret itself.
We hope that now you know how hashing algorithms work and how this type of cryptographic function is relevant to crypto and blockchain.