Computing file hash values on Linux and Windows

2015-11-06 | Martin Hoppenheit | 2 min read

Just some quick notes on computing hash values (aka checksums) for one or more files on Linux and Windows.

Linux/Bash

Compute a hash value for a single file FILE using the SHA-256 or MD5 algorithm:

$ sha256sum FILE
$ md5sum FILE
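
A computed hash is mostly useful when it is checked again later. As a minimal sketch, assuming GNU coreutils, the output of sha256sum can be saved to a file and verified with the -c option. The scratch directory and the name example.txt are made up for this illustration:

```shell
#!/usr/bin/env bash
# Hypothetical scratch directory and file, used only for this demonstration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'some content\n' > example.txt

# Each output line has the form "<hex digest>  <file name>".
sha256sum example.txt > example.txt.sha256

# -c re-reads every file listed in the checksum file and compares digests.
check_output=$(sha256sum -c example.txt.sha256)
echo "$check_output"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

For an unmodified file this prints the file name followed by ": OK". md5sum accepts the same -c option.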

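To compute hash values for all files in a directory DIR (including subdirectories) you can use the good old find command. Either pipe its output to your preferred hash program via xargs or use the -execdir option of find. The xargs variant prints each file's path as find saw it (starting with DIR), whereas -execdir runs md5sum inside each file's own directory and therefore prints just the file name (GNU find prefixes it with ./):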

$ find DIR -type f -print0 | xargs -0 md5sum
$ find DIR -type f -execdir md5sum {} +
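
The two variants produce the same hashes but label the files differently. The following sketch, assuming GNU find and md5sum, runs both on a throwaway directory tree so the difference is visible. The directory layout and the variable names are invented for this example:

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree, created only for this demonstration.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'hello\n' > "$demo/a.txt"
printf 'world\n' > "$demo/sub/b.txt"

# Variant 1: md5sum receives the paths exactly as find printed them,
# so every output line carries the full path starting at $demo.
via_xargs=$(find "$demo" -type f -print0 | xargs -0 md5sum | sort)

# Variant 2: md5sum runs inside each file's directory, so GNU find
# passes the names as ./a.txt and ./b.txt without leading directories.
via_execdir=$(find "$demo" -type f -execdir md5sum {} + | sort)

echo "xargs variant:"
echo "$via_xargs"
echo "-execdir variant:"
echo "$via_execdir"

# Clean up the scratch tree.
rm -rf "$demo"
```

The -execdir output does not depend on where the directory lives, which is what makes it suitable for comparing two directory trees.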

Compare all files in two different directories (e.g., after copying from one to the other) based on their names (without leading directory paths) and hash values. If both directories hold the same files, diff prints nothing and exits with status 0:

$ diff \
    <(find DIR1 -type f -execdir md5sum {} + | sort) \
    <(find DIR2 -type f -execdir md5sum {} + | sort)
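
To see how this behaves in practice, here is a small sketch that wraps the command in a function and tries it on a fresh copy and on a deliberately damaged one. It assumes Bash (for process substitution) and GNU find. The function name compare_dirs and the scratch directories are hypothetical:

```shell
#!/usr/bin/env bash
# compare_dirs is a hypothetical helper wrapping the diff command above.
# Exit status 0 means all names and hashes match, 1 means they differ.
compare_dirs() {
    diff \
        <(find "$1" -type f -execdir md5sum {} + | sort) \
        <(find "$2" -type f -execdir md5sum {} + | sort)
}

# Throwaway source and destination directories for the demonstration.
src=$(mktemp -d)
dst=$(mktemp -d)
printf 'alpha\n' > "$src/one.txt"
printf 'beta\n'  > "$src/two.txt"
cp -R "$src/." "$dst/"

# A fresh copy should compare as identical.
if compare_dirs "$src" "$dst" > /dev/null; then
    status_clean="identical"
else
    status_clean="different"
fi

# Simulate a damaged copy by appending to one file in the destination.
printf 'garbage\n' >> "$dst/two.txt"
if compare_dirs "$src" "$dst" > /dev/null; then
    status_damaged="identical"
else
    status_damaged="different"
fi

echo "fresh copy: $status_clean"
echo "after modification: $status_damaged"

# Clean up both scratch directories.
rm -rf "$src" "$dst"
```

Checking the exit status rather than reading the diff output makes the comparison easy to use in scripts, for example right after a backup job.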

Windows/PowerShell

Compute a hash value for a single file FILE using the SHA-256 (default) or MD5 algorithm:

PS> Get-FileHash FILE
PS> Get-FileHash -Algorithm MD5 FILE
PS> Get-FileHash -Algorithm SHA256 FILE

To compute hash values for all files in a directory DIR (including subdirectories) you can use the Get-ChildItem cmdlet. Pipe its output to the Get-FileHash cmdlet:

PS> Get-ChildItem DIR -Recurse -File | Get-FileHash

Compare all files in two different directories (e.g., after copying from one to the other) based on their hash values:

PS> Compare-Object `
    $(Get-ChildItem DIR1 -Recurse -File | Get-FileHash) `
    $(Get-ChildItem DIR2 -Recurse -File | Get-FileHash) `
    -Property Hash

Note that this compares just the hash values, nothing more, not even file names, so a file that was renamed or moved is not reported. However, if you just want to check whether your copy went smoothly, that’s usually enough.

The hashdeep tool

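If you frequently compare sets of files recursively based on their hash values, the hashdeep tool may be just what you need. It is available for Unix and Windows. With hashdeep, comparing all files in two different directories based on their hash values is as easy as this (-r recurses into subdirectories, -b prints bare file names without directory paths, and -a audits the given files against the list of known hashes passed with -k):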

$ hashdeep -ak <(hashdeep -br DIR1) -br DIR2

If you care about file paths, you need a less concise version involving a temporary file (note the -l option instead of -b to use relative file paths instead of bare file names):

$ cd DIR1
$ hashdeep -lr . > /tmp/hashes
$ cd DIR2
$ hashdeep -lr . -ak /tmp/hashes