Inputs:
password (P): Bytes (0..2^32-1) Password (or message) to be hashed
salt (S): Bytes (8..2^32-1) Salt (16 bytes recommended for password hashing)
parallelism (p): Number (1..2^24-1) Degree of parallelism (i.e. number of threads)
tagLength (T): Number (4..2^32-1) Desired number of returned bytes
memorySizeKB (m): Number (8p..2^32-1) Amount of memory (in kibibytes) to use
iterations (t): Number (1..2^32-1) Number of iterations to perform
version (v): Number (0x13) The current version is 0x13 (19 decimal)
key (K): Bytes (0..2^32-1) Optional key (Errata: PDF says 0..32 bytes, RFC says 0..2^32 bytes)
associatedData (X): Bytes (0..2^32-1) Optional arbitrary extra data
hashType (y): Number (0=Argon2d, 1=Argon2i, 2=Argon2id)
Output:
tag: Bytes (tagLength) The resulting generated bytes, tagLength bytes long
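The ranges listed above can be checked before any hashing starts. The following is a minimal sketch, not part of any Argon2 library: the class name `Argon2Params`, its field names, and the default values chosen for `key`, `associated_data`, and `hash_type` are assumptions made for illustration.

```python
from dataclasses import dataclass

MAX32 = 2**32 - 1


@dataclass(frozen=True)
class Argon2Params:
    """Hypothetical container; fields mirror the input list above."""
    password: bytes
    salt: bytes
    parallelism: int
    tag_length: int
    memory_size_kb: int
    iterations: int
    version: int = 0x13
    key: bytes = b""              # optional, assumed empty by default
    associated_data: bytes = b""  # optional, assumed empty by default
    hash_type: int = 2            # 0=Argon2d, 1=Argon2i, 2=Argon2id

    def __post_init__(self):
        # Each entry pairs a range check from the input list with an error message.
        checks = [
            (len(self.password) <= MAX32, "password too long"),
            (8 <= len(self.salt) <= MAX32, "salt must be 8..2^32-1 bytes"),
            (1 <= self.parallelism <= 2**24 - 1, "parallelism out of range"),
            (4 <= self.tag_length <= MAX32, "tagLength out of range"),
            (8 * self.parallelism <= self.memory_size_kb <= MAX32,
             "memorySizeKB must be at least 8*parallelism"),
            (1 <= self.iterations <= MAX32, "iterations out of range"),
            (len(self.key) <= MAX32, "key too long"),
            (len(self.associated_data) <= MAX32, "associatedData too long"),
            (self.hash_type in (0, 1, 2), "unknown hashType"),
        ]
        for ok, message in checks:
            if not ok:
                raise ValueError(message)
```

Constructing the object with a 5-byte salt, for example, raises `ValueError` instead of silently hashing with a salt that is too short.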
Generate initial 64-byte block H0.
All the input parameters are concatenated and input as a source of additional entropy.
Errata: RFC says H0 is 64-bits; PDF says H0 is 64-bytes.
Errata: RFC says the Hash is H^, the PDF says it's ℋ (but doesn't document what ℋ is). It's actually Blake2b.
Variable length items are prepended with their length as 32-bit little-endian integers.
buffer ← parallelism ∥ tagLength ∥ memorySizeKB ∥ iterations ∥ version ∥ hashType
∥ Length(password) ∥ password
∥ Length(salt) ∥ salt
∥ Length(key) ∥ key
∥ Length(associatedData) ∥ associatedData
H0 ← Blake2b(buffer, 64) //default hash size of Blake2b is 64-bytes
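The H0 step above can be sketched in Python with the standard library's `hashlib.blake2b`. The function name `initial_hash` and the helper `le32` are illustrative names, not from the source; the sketch assumes every numeric field, like every length prefix, is encoded as a 32-bit little-endian integer, as RFC 9106 specifies.

```python
import hashlib
import struct


def le32(n: int) -> bytes:
    """Encode n as a 32-bit little-endian unsigned integer."""
    return struct.pack("<I", n)


def initial_hash(password: bytes, salt: bytes, parallelism: int,
                 tag_length: int, memory_size_kb: int, iterations: int,
                 hash_type: int, key: bytes = b"",
                 associated_data: bytes = b"", version: int = 0x13) -> bytes:
    """Return the 64-byte block H0 (sketch of the step described above)."""
    buffer = b"".join([
        # Fixed-size parameters first, in the order shown in the pseudocode.
        le32(parallelism), le32(tag_length), le32(memory_size_kb),
        le32(iterations), le32(version), le32(hash_type),
        # Variable-length items, each prefixed with its length.
        le32(len(password)), password,
        le32(len(salt)), salt,
        le32(len(key)), key,
        le32(len(associated_data)), associated_data,
    ])
    return hashlib.blake2b(buffer, digest_size=64).digest()
```

Because the hash type and every other parameter are part of the buffer, changing any single input yields a different H0.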
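For the parallelism parameter p, the memory is divided into a matrix of blocks B[i][j] with p rows.
The values of matrix B are computed as shown in the pseudocode further below,
where H′ is a variable-length hash function built on H.
An implementation of that function follows: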
Function Hash(message, digestSize)
Inputs:
message: Bytes (0..2^32-1) Message to be hashed
digestSize: Integer (1..2^32) Desired number of bytes to be returned
Output:
digest: Bytes (digestSize) The resulting generated bytes, digestSize bytes long
Hash is a variable-length hash function, built using Blake2b, capable of generating
digests up to 2^32 bytes.
If the requested digestSize is 64-bytes or lower, then we use Blake2b directly
if (digestSize <= 64) then
return Blake2b(digestSize ∥ message, digestSize) //concatenate 32-bit little endian digestSize with the message bytes
For desired hashes over 64-bytes (e.g. 1024 bytes for Argon2 blocks),
we use Blake2b to generate twice the number of needed 64-byte blocks,
and then only use 32-bytes from each block
Calculate the number of whole blocks (knowing we're only going to use 32-bytes from each)
r ← Ceil(digestSize/32)-2; //per RFC 9106, so the final block is 33 to 64 bytes long
Generate r whole blocks.
Initial block is generated from message
V_1 ← Blake2b(digestSize ∥ message, 64);
Subsequent blocks are generated from previous blocks
for i ← 2 to r do
V_i ← Blake2b(V_{i-1}, 64)
Generate the final (possibly partial) block
partialBytesNeeded ← digestSize - 32*r;
V_{r+1} ← Blake2b(V_r, partialBytesNeeded)
Concatenate the first 32-bytes of each block V_i
(except the possibly partial last block, which we take the whole thing)
Let A_i represent the lower 32-bytes of block V_i
return A_1 ∥ A_2 ∥ ... ∥ A_r ∥ V_{r+1}
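The variable-length Hash function can be sketched in Python as follows. The name `variable_length_hash` is an illustrative choice. The sketch uses r = Ceil(digestSize/32) - 2, the value given in RFC 9106, so that the final block is between 33 and 64 bytes long and always fits in a single Blake2b call.

```python
import hashlib
import struct


def variable_length_hash(message: bytes, digest_size: int) -> bytes:
    """Blake2b-based hash returning exactly digest_size bytes (sketch)."""
    # The requested length is prepended as a 32-bit little-endian integer.
    prefix = struct.pack("<I", digest_size)
    if digest_size <= 64:
        return hashlib.blake2b(prefix + message,
                               digest_size=digest_size).digest()

    # Number of 64-byte blocks from which only the first 32 bytes are kept.
    r = -(-digest_size // 32) - 2  # Ceil(digest_size / 32) - 2

    v = hashlib.blake2b(prefix + message, digest_size=64).digest()  # V_1
    parts = [v[:32]]
    for _ in range(2, r + 1):                                       # V_2..V_r
        v = hashlib.blake2b(v, digest_size=64).digest()
        parts.append(v[:32])

    # The final block V_{r+1} is used in full.
    partial_bytes_needed = digest_size - 32 * r
    parts.append(hashlib.blake2b(v, digest_size=partial_bytes_needed).digest())
    return b"".join(parts)
```

For a 1024-byte Argon2 block this gives r = 30: thirty 32-byte halves (960 bytes) followed by one full 64-byte block.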
Calculate the number of 1 KiB blocks by rounding memorySizeKB down to the nearest multiple of 4*parallelism kibibytes
blockCount ← Floor(memorySizeKB, 4*parallelism)
Allocate two-dimensional array of 1 KiB blocks (parallelism rows x columnCount columns)
columnCount ← blockCount / parallelism; //In the RFC, columnCount is referred to as q
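The rounding step is plain integer arithmetic. A small sketch, with `memory_layout` as a hypothetical helper name:

```python
from typing import Tuple


def memory_layout(memory_size_kb: int, parallelism: int) -> Tuple[int, int]:
    """Return (block_count, column_count) for the given parameters."""
    unit = 4 * parallelism
    # Round down to the nearest multiple of 4 * parallelism.
    block_count = (memory_size_kb // unit) * unit
    # Each of the `parallelism` rows (lanes) gets the same number of columns.
    column_count = block_count // parallelism
    return block_count, column_count
```

For example, `memory_layout(100, 4)` rounds 100 KiB down to 96 blocks, i.e. 4 rows of 24 columns, while `memory_layout(32, 4)` keeps all 32 blocks as 4 rows of 8 columns.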
Compute the first and second block (i.e. column zero and one) of each lane (i.e. row)
for i ← 0 to parallelism-1 do //for each row
Bi[0] ← Hash(H0 ∥ 0 ∥ i, 1024) //Generate a 1024-byte digest
Bi[1] ← Hash(H0 ∥ 1 ∥ i, 1024) //Generate a 1024-byte digest
Compute remaining columns of each lane
for i ← 0 to parallelism-1 do //for each row
for j ← 2 to columnCount-1 do //for each subsequent column
//the i' and j' indexes depend on whether it's Argon2i, Argon2d, or Argon2id (See section 3.4)
i′, j′ ← GetBlockIndexes(i, j) //the GetBlockIndexes function is not defined
Bi[j] = G(Bi[j-1], Bi′[j′]) //the G hash function is not defined
Further passes when iterations > 1
for nIteration ← 2 to iterations do
for i ← 0 to parallelism-1 do //for each row
for j ← 0 to columnCount-1 do //for each column
//the i' and j' indexes depend on whether it's Argon2i, Argon2d, or Argon2id (See section 3.4)
i′, j′ ← GetBlockIndexes(i, j)
if j == 0 then
Bi[0] = Bi[0] xor G(Bi[columnCount-1], Bi′[j′])
else
Bi[j] = Bi[j] xor G(Bi[j-1], Bi′[j′])
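The difference between the first pass (blocks are overwritten) and later passes (blocks are XORed with the new value, and column 0 wraps around to the last column) can be sketched as below. Since G and GetBlockIndexes are not defined in this text, the sketch takes them as caller-supplied placeholders `g` and `get_block_indexes`; those parameter names and the function name `fill_memory` are hypothetical. It mirrors the simplified loop order of the pseudocode above and is not a complete Argon2 implementation.

```python
from typing import Callable, List, Tuple

Block = bytes


def xor_blocks(a: Block, b: Block) -> Block:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def fill_memory(B: List[List[Block]], iterations: int,
                g: Callable[[Block, Block], Block],
                get_block_indexes: Callable[[int, int], Tuple[int, int]]
                ) -> List[List[Block]]:
    """Fill matrix B in place; columns 0 and 1 must already be set."""
    rows, cols = len(B), len(B[0])

    # First pass: columns 2.. are computed and simply stored.
    for i in range(rows):
        for j in range(2, cols):
            ri, rj = get_block_indexes(i, j)
            B[i][j] = g(B[i][j - 1], B[ri][rj])

    # Further passes: the new value is XORed into the existing block.
    for _ in range(2, iterations + 1):
        for i in range(rows):
            for j in range(cols):
                ri, rj = get_block_indexes(i, j)
                # For j == 0, index -1 selects the last column of the row.
                previous = B[i][j - 1]
                B[i][j] = xor_blocks(B[i][j], g(previous, B[ri][rj]))
    return B
```

Passing the compression and indexing functions as arguments keeps the loop structure visible without committing to any particular definition of either one.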
Compute final block C as the XOR of the last column of each row
C ← B0[columnCount-1]
for i ← 1 to parallelism-1 do
C ← C xor Bi[columnCount-1]
Compute output tag
return Hash(C, tagLength)
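The final block C is a bytewise XOR over the last column of every row. A minimal sketch, assuming the matrix is held as a list of rows of equal-length `bytes` objects (the name `final_block` is illustrative):

```python
from functools import reduce
from typing import List


def final_block(B: List[List[bytes]]) -> bytes:
    """XOR together the last block of each row (lane)."""
    last_column = [row[-1] for row in B]
    return reduce(
        lambda acc, block: bytes(x ^ y for x, y in zip(acc, block)),
        last_column,
    )
```

With a single lane the result is just that lane's last block. The tag is then obtained by passing C through the variable-length Hash with tagLength as the digest size.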