CKSUM(1) BSD Reference Manual CKSUM(1)
cksum, sum - display file checksums and block counts
cksum [-a algorithms] [-o 1 | 2] [-b] [-p | -s string | file ...]
cksum [-a algorithms] [-o 1 | 2] -G [file ...]
cksum [-a algorithms] [-o 1 | 2] -t | -x | -c [checklist ...]
sum [file ...]
The cksum utility writes to the standard output a single line for each input file. The format of this line varies with the algorithm being used as follows:

cksum
    The output line consists of three whitespace-separated fields: a CRC checksum, the number of octets in the input, and the name of the file or string. Binary output consists of two 64-bit parts, corresponding to the checksum and length, respectively. If no file name is specified, the standard input is used and no file name is written.

sum
    The output line consists of three whitespace-separated fields: a checksum, the number of kilobytes in the input, and the name of the file or string. Binary output consists of two 64-bit parts, corresponding to the checksum and length, respectively. If no file name is specified, the standard input is used and no file name is written.

sysvsum
    The output line consists of three whitespace-separated fields: a checksum, the number of 512-byte blocks in the input, and the name of the file or string. Binary output consists of two 64-bit parts, corresponding to the checksum and length, respectively. If no file name is specified, the standard input is used and no file name is written.

sfv
    The output line consists of the file name followed by a space (both omitted if the data to be processed is read from standard input), followed by a cksfv(1) compatible CRC expressed as eight sedecimal digits. This algorithm is also called crc32b (the same as in zlib) or (vulgarly) just CRC32 or even CRC; it is used, for instance, for verifying the integrity of downloads from a certain Redmond software house or by Anime fansub groups. You might want to add a comment generated by the following command before the block of SFV hashes:

        $ stat -L -f '; %12z %Sm %N' -t '%H:%M.%S %F' file ...

all others
    The output line consists of four whitespace-separated fields: the name of the algorithm used, the name of the file or string in parentheses, an equals sign, and the cryptographic hash of the input.
    Binary output consists of simply the sedecimal hash value, read from left to right (big endian), converted into bytes; the length is not encoded. If no file name is specified, the standard input is used and only the cryptographic hash is output.

    In the cdb case, the output is the DJB (CDB) hash of the input string. In the oaat case, the output is Jenkins' one-at-a-time hash of the input string. The oaat1 and oaat1s hashes use 0x100 instead of 0 as the initial hash value, and oaat1s also uses RFC 1321-style length padding on the input string. The bafh, nzaat and nzat algorithms are hashes invented in MirBSD and based on oaat; bafh is also partially based on AES. In the suma case, the output is a 32-bit CRC over the file, expressed as eight sedecimal digits; both input processing and binary output use the little-endian convention. In the size case, the output is a decimal unsigned 64-bit quantity giving the size of the data read; the binary representation is big-endian.

The sum utility is identical to the cksum utility, except that it defaults to using historic algorithm 1, as described below. It is provided for compatibility only.

The options are as follows:

-a algorithms
    Use the specified algorithm(s) instead of the default (cksum). Supported algorithms include adler32, bafh, cdb, cksum, md4, md5, nzat, nzaat, oaat, oaat1, oaat1s, rmd160, sfv, sha1, sha256, sha384, sha512, size, sum, suma, sysvsum, tiger, and whirlpool. Multiple algorithms may be specified, separated by a comma or whitespace. Additionally, multiple -a options may be specified on the command line. If an algorithm is repeated, only the first instance is used. Case is ignored when matching algorithms.

-b
    Print the checksum as binary to stdout.

-c [checklist ...]
    Compare all checksums contained in the file checklist with newly computed checksums for the corresponding files. Output consists of the digest used, the file name, and an OK or FAILED for the result of the comparison.
    This will validate any of the supported checksums. If no file is given, stdin is used. The -c option may not be used in conjunction with more than a single -a option. The checklist must either be in normal cksum format or in GNU md5sum compatible format. Verifying cksfv-style input is not supported.

-G
    Be somewhat compatible with the GNU md5sum tool in the output. This mode is also selected if this program is called as md5sum, sha1sum, ... Note that only the -b and -t options are somewhat recognised (and ignored), the -c and -w options and any GNU long options are rejected, and this output mode does not make any sense for many algorithms, such as adler32, bafh, cdb, cksum, nzat, nzaat, oaat, oaat1, oaat1s, sfv, size, sum, suma, and sysvsum.

-o 1 | 2
    Use historic algorithms instead of the (superior) default one (see below).

-p
    Echo stdin to stdout and append the checksum to stdout.

-s string
    Print a checksum of the given string.

-t
    Run a built-in time trial.

-x
    Run a built-in test script. The output conforms to the NESSIE test vector format, Set 1.

Algorithm 1 (aka sum) is the algorithm used by historic BSD systems as the sum algorithm and by historic AT&T System V UNIX systems as the sum algorithm when using the -r option. This is a 16-bit checksum, with a right rotation before each addition; overflow is discarded.

Algorithm 2 (aka sysvsum) is the algorithm used by historic AT&T System V UNIX systems as the default sum algorithm. This is a 32-bit checksum, and is defined as follows:

    s = sum of all bytes;
    r = s % 2^16 + (s % 2^32) / 2^16;
    cksum = (r % 2^16) + r / 2^16;

Both algorithm 1 and 2 write to the standard output the same fields as the default algorithm, except that the size of the file in bytes is replaced with the size of the file in blocks. For historic reasons, the block size is 1024 for algorithm 1 and 512 for algorithm 2. Partial blocks are rounded up.
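The two historic algorithms above can be sketched in a few lines of Python. This is an illustration only, not the code used by cksum; the function names bsd_sum, sysv_sum and blocks are hypothetical, and the rotation is assumed to be by one bit within a 16-bit word.

```python
def bsd_sum(data: bytes) -> int:
    """Algorithm 1: 16-bit checksum, rotated right before each addition."""
    s = 0
    for byte in data:
        # Rotate the 16-bit value right by one bit (assumed rotation width).
        s = (s >> 1) | ((s & 1) << 15)
        # Add the next byte and discard any overflow beyond 16 bits.
        s = (s + byte) & 0xFFFF
    return s


def sysv_sum(data: bytes) -> int:
    """Algorithm 2: the three-step formula given in the text."""
    s = sum(data)
    r = s % 2**16 + (s % 2**32) // 2**16
    return (r % 2**16) + r // 2**16


def blocks(nbytes: int, block_size: int) -> int:
    """Size in blocks, with partial blocks rounded up."""
    return -(-nbytes // block_size)


if __name__ == "__main__":
    sample = b"hello, world\n"
    # Algorithm 1 reports 1024-byte blocks, algorithm 2 reports 512-byte blocks.
    print(bsd_sum(sample), blocks(len(sample), 1024))
    print(sysv_sum(sample), blocks(len(sample), 512))
```

The integer division (//) mirrors the truncating division in the formula; blocks() uses negated floor division so that no floating point is involved in the rounding.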
The default CRC used is based on the polynomial used for CRC error checking in the networking standard ISO 8802-3: 1989. The CRC checksum encoding is defined by the generating polynomial:

    G(x) = x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1

Mathematically, the CRC value corresponding to a given file is defined by the following procedure:

    The n bits to be evaluated are considered to be the coefficients of a mod 2 polynomial M(x) of degree n-1. These n bits are the bits from the file, with the most significant bit being the most significant bit of the first octet of the file and the last bit being the least significant bit of the last octet, padded with zero bits (if necessary) to achieve an integral number of octets, followed by one or more octets representing the length of the file as a binary value, least significant octet first. The smallest number of octets capable of representing this integer are used.

    M(x) is multiplied by x^32 (i.e., shifted left 32 bits) and divided by G(x) using mod 2 division, producing a remainder R(x) of degree <= 31.

    The coefficients of R(x) are considered to be a 32-bit sequence.

    The bit sequence is complemented and the result is the CRC.

The sfv CRC is undocumented, cf. http://www.fodder.org/cksfv/

It seems to be widely known, though, and appears to use the same polynomial and conventions as the (non-ADLER32) crc32 function of gzip(1).

The suma CRC uses little-endian 32-bit block reading conventions, initialisation of the CRC with an all-ones word, and a different 33-bit polynomial.

The other available algorithms are described in their respective man pages in section 3 of the manual.
The cksum and sum utilities exit 0 on success or >0 if an error occurred.
md5(1), rmd160(1), sha1(1), stat(1), adler32(3), md4(3), md5(3), rmd160(3), sfv(3), sha1(3), sha2(3), suma(3), tiger(3), whirlpool(3)

The default calculation is identical to that given in pseudo-code in the following ACM article:

    Dilip V. Sarwate, "Computation of Cyclic Redundancy Checks Via Table Lookup", Communications of the ACM, August 1988.

http://www.cryptonessie.org/
The cksum utility is compliant with the IEEE Std 1003.2-1992 ("POSIX.2") specification. The sfv format and the comment format given above are compatible with the output generated by Bryan Call's cksfv.
A sum command appeared in Version 2 AT&T UNIX. The cksum utility appeared in 4.4BSD and has been enhanced by new algorithms in OpenBSD and several times in MirBSD.
Do not use the adler32, bafh, cdb, cksum, md4, md5, nzat, nzaat, oaat, oaat1, oaat1s, sfv, sha1, size, sum, suma, sysvsum, or tiger algorithms to detect hostile binary modifications. For most of the algorithms listed above, an attacker can trivially produce backdoored daemons which have the same checksum as the standard versions. Even md4 has long been broken, collisions for md5 are published and picked up by script kiddies, and the attack used for md5 has already been successfully mounted on a reduced form of sha1. Use a cryptographically strong checksum (such as RIPEMD-160) instead, or combine two algorithms from different families, for example, rmd160, whirlpool, and, optionally, one of the CRCs.

MirBSD #10-current January 27, 2022