Using the sha256sum command to ensure file integrity

1.0 SHA-2

A cryptographic hash function (CHF) is an algorithm that maps data of arbitrary size, called the message, to a fixed-size bit array, called the hash value or message digest. It is a one-way function: it is not computationally feasible to recover the original message from the hash value. Even a small change in the message produces a completely different hash value.
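The last property is easy to observe from the shell. The sketch below assumes the GNU coreutils sha256sum command is installed. It hashes two illustrative strings, which are not related to the files discussed later, that differ only in their final character.

```shell
# Hash two example messages that differ only in the last character.
# printf is used instead of echo so that no trailing newline is hashed.
h1=$(printf 'hello' | sha256sum | cut -d' ' -f1)
h2=$(printf 'hellp' | sha256sum | cut -d' ' -f1)

echo "hello: $h1"
echo "hellp: $h2"
```

Both digests are 64 hexadecimal characters long, yet they bear no visible relationship to each other.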

Cryptographic hash functions are used in information security applications such as digital signatures and checksums for validating file contents.

The Secure Hash Algorithms (SHA) are a family of cryptographic hash functions published by the National Institute of Standards and Technology (NIST) as a US Federal Information Processing Standard (FIPS). Four generations, SHA-0 through SHA-3, have been published. Of these, SHA-0 and SHA-1 have been found vulnerable to collision attacks and are deprecated. No practical attacks on SHA-2 or SHA-3 are publicly known, and both are used in information security applications.

SHA-2 is a family of hash functions whose two main members are SHA-256 and SHA-512; the others (SHA-224, SHA-384, SHA-512/224 and SHA-512/256) are truncated variants of these two. SHA-256 produces a 256-bit message digest, whereas SHA-512 produces a 512-bit message digest. SHA-256 is used to authenticate Debian software packages and in the DomainKeys Identified Mail (DKIM) message signing standard.
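The digest sizes can be checked directly, since each hexadecimal character in the output encodes 4 bits. This is a small sketch that assumes the GNU coreutils sha256sum and sha512sum commands are available. The input string abc is only an illustration.

```shell
# Compute both digests of the same example input.
d256=$(printf 'abc' | sha256sum | cut -d' ' -f1)
d512=$(printf 'abc' | sha512sum | cut -d' ' -f1)

# Each hex character carries 4 bits, so length * 4 is the digest size in bits.
echo "SHA-256: $(( ${#d256} * 4 )) bits"
echo "SHA-512: $(( ${#d512} * 4 )) bits"
```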

When we generate a file whose contents should not be modified, we can also compute its SHA-256 checksum and store it, along with the file name, in a separate place. Later, whenever required, we can validate the file contents against the stored checksum.
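As a minimal sketch of this workflow for a single file, assuming GNU coreutils, consider the following. The file name report.txt and the two temporary directories are hypothetical stand-ins for a real data location and a real checksum store.

```shell
# Hypothetical locations: one directory for the data, one for the checksum.
data=$(mktemp -d)
store=$(mktemp -d)

printf 'quarterly figures\n' > "$data/report.txt"

# Record the checksum. Running from inside $data keeps the stored
# file name relative ("report.txt"), so it resolves again later.
(cd "$data" && sha256sum report.txt > "$store/report.txt.sha256")

# Later: validate the file against the stored checksum.
(cd "$data" && sha256sum -c "$store/report.txt.sha256")
result=$?
```

Because the checksum file records the name exactly as it was given on the command line, the check has to be run from a directory in which that name still resolves to the file.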

2.0 Using the sha256sum command

We can use the sha256sum command to compute and check SHA-256 message digests. Suppose we have the following files in a directory.

$ ls -ls
total 418688
 50648 -rw-rw-r-- 1 user1 user1  51857882 May 13 06:20 bar-1.sql
 51960 -rw-rw-r-- 1 user1 user1  53201666 May 12 15:55 bar.sql
166244 -rw-r--r-- 1 user1 user1 170228950 May 13 06:20 foo-1.tar.gz
149836 -rw-r--r-- 1 user1 user1 153424874 May 12 15:57 foo.tar.gz

We need to compute and store the SHA-256 checksum of each file so that anyone who downloads these files can validate their contents against it. We can compute and store the checksums with the following commands:

$ mkdir ../tmp
$
$ sha256sum * >../tmp/checksums
$
$ mv ../tmp/checksums .
$ ls -ls
total 418692
 50648 -rw-rw-r-- 1 user1 user1  51857882 May 13 06:20 bar-1.sql
 51960 -rw-rw-r-- 1 user1 user1  53201666 May 12 15:55 bar.sql
     4 -rw-rw-r-- 1 user1 user1       306 May 13 06:59 checksums
166244 -rw-r--r-- 1 user1 user1 170228950 May 13 06:20 foo-1.tar.gz
149836 -rw-r--r-- 1 user1 user1 153424874 May 12 15:57 foo.tar.gz
$
$ cat checksums
122b3c43c319fc53cb9025053339c33f34c2eeda8ecca27982c504ae79e3fd74  bar-1.sql
19e4bdc32964bf8e45d155520612ded252c91898bf9b2ea824ca12544f052829  bar.sql
17db7824179bc8115b236480422f496aecee42ef656558c9d7ddce45ef4c3148  foo-1.tar.gz
4bc24764e4e6d24f5bedbcb0f5f7d6e0036719b142912e5a9a54033c847434b8  foo.tar.gz

We have taken care to create the checksums file outside the current directory so that the * pattern cannot match it and sha256sum does not try to compute its checksum while it is still being written with the checksums of the other files. We can check the integrity of the files at any time with the command:

$ sha256sum -c checksums
bar-1.sql: OK
bar.sql: OK
foo-1.tar.gz: OK
foo.tar.gz: OK

The command reports OK for each of the four files, as expected.
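For contrast, the following sketch shows what a failed check looks like. It uses hypothetical throwaway files in a temporary directory rather than the files above, and assumes GNU coreutils. When a checksum does not match, sha256sum -c reports FAILED for that file and exits with a non-zero status, which makes the check usable in scripts.

```shell
# Work in a scratch directory with two hypothetical files.
work=$(mktemp -d)
sums=$(mktemp)            # checksum list kept outside the directory
cd "$work" || exit 1
printf 'alpha\n' > a.txt
printf 'beta\n'  > b.txt
sha256sum * > "$sums"

# Simulate corruption of one file after the checksums were recorded.
printf 'tampered\n' >> b.txt

# Prints "a.txt: OK" and "b.txt: FAILED"; the warning on stderr is discarded.
if report=$(sha256sum -c "$sums" 2>/dev/null); then
    status=0
else
    status=$?
fi
echo "$report"
echo "exit status: $status"
```

When only the exit status matters, the --status option of GNU sha256sum suppresses all output, and --quiet omits the OK lines so that only failures are printed.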

3.0 See Also