Computing file hash values on Linux and Windows
Just some quick notes on computing hash values (aka checksums) for one or more files on Linux and Windows.
Linux/Bash
Compute a hash value for a single file FILE with algorithm SHA256 or MD5:
$ sha256sum FILE
$ md5sum FILE
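A common follow-up is checking a file against a previously recorded hash. The following is a minimal sketch, assuming GNU coreutils; the file names demo.txt and demo.txt.sha256 are placeholders invented for the example:

```shell
# Work in a scratch directory; demo.txt is a hypothetical example file.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello\n' > demo.txt

# Record the hash in the "HASH  NAME" format that sha256sum itself prints.
sha256sum demo.txt > demo.txt.sha256

# Verify later: -c re-reads each listed file and compares its hash
# with the recorded one. Prints "demo.txt: OK" if nothing changed.
sha256sum -c demo.txt.sha256
```

If the file has changed in the meantime, sha256sum -c reports FAILED for it and exits with a non-zero status, which makes the check easy to use in scripts.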
To compute hash values for all files in a directory DIR (including subdirectories) you can use the good old find command. Either pipe its output to your preferred hash function via xargs or use the -execdir option of find:
$ find DIR -type f -print0 | xargs -0 md5sum
$ find DIR -type f -execdir md5sum {} +
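The two variants differ in how file names appear in the output. The xargs pipeline prints each path as find reports it, including DIR and any subdirectories, while -execdir runs md5sum from inside each file's own directory, so only the bare name is shown. A small sketch on a throwaway tree, assuming GNU find (the directory layout and file name are made up for the example):

```shell
# Build a tiny demo tree with one file in a subdirectory.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf 'a\n' > "$demo/sub/a.txt"

# Variant 1: the name column holds the full path, e.g. $demo/sub/a.txt
find "$demo" -type f -print0 | xargs -0 md5sum

# Variant 2: with GNU find, the name column reads ./a.txt
find "$demo" -type f -execdir md5sum {} +
```

Both commands print the same hash for the file; only the name column differs.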
Compare all files in two different directories (e.g., after copying from one to the other) based on their names (without leading directory paths) and hash values:
$ diff \
<(find DIR1 -type f -execdir md5sum {} + | sort) \
<(find DIR2 -type f -execdir md5sum {} + | sort)
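Because -execdir reports bare file names, two files with the same name in different subdirectories look identical to the comparison above as long as their contents match. If relative paths matter, one possible variant runs find from inside each directory. This is only a sketch: the function name hash_tree is hypothetical, sha256sum is swapped in for md5sum, and sort -z and xargs -r assume the GNU tools:

```shell
# hash_tree DIR: print "HASH  ./relative/path" for every file under DIR,
# ordered by path so that the listings of two trees line up for diff.
hash_tree() {
    # The subshell keeps the cd from affecting the caller.
    (cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum)
}

# Usage in Bash (diff exits with status 0 when both trees match):
#   diff <(hash_tree DIR1) <(hash_tree DIR2)
```

Sorting the paths before hashing keeps the output order stable, so a moved or renamed file shows up as a difference instead of being silently matched by hash.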
Windows/PowerShell
Compute a hash value for a single file FILE with algorithm SHA256 (default) or MD5:
PS> Get-FileHash FILE
PS> Get-FileHash -Algorithm MD5 FILE
PS> Get-FileHash -Algorithm SHA256 FILE
To compute hash values for all files in a directory DIR (including subdirectories) you can use the Get-ChildItem cmdlet. Pipe its output to the Get-FileHash cmdlet:
PS> Get-ChildItem DIR -Recurse -File | Get-FileHash
Compare all files in two different directories (e.g., after copying from one to the other) based on their hash values:
PS> Compare-Object `
$(Get-ChildItem DIR1 -Recurse -File | Get-FileHash) `
$(Get-ChildItem DIR2 -Recurse -File | Get-FileHash) `
-Property Hash
Note that this will compare just hash values, nothing more, not even file names. However, if you just want to check if your copy action went smoothly, that’s usually enough.
The hashdeep tool
If you frequently compare sets of files recursively based on their hash values, then the hashdeep tool may be just what you need. It is available for Unix and Windows. With hashdeep, comparing all files in two different directories based on their hash values becomes as easy as this:
$ hashdeep -ak <(hashdeep -br DIR1) -br DIR2
If you care about file paths, you need a less concise version involving a temporary file (note the -l option instead of -b to use relative file paths instead of bare file names):
$ cd DIR1
$ hashdeep -lr . > /tmp/hashes
$ cd DIR2
$ hashdeep -lr . -ak /tmp/hashes