# Level to HN: Reverse engineering GitHub’s identicon algorithm

Silly small instrument to reverse-design Github identicons to particular person accounts.

Constructed by inspecting dgraham/identicon and stewartlord/identicon.js, and factual doing the reverse of what they’re doing.

### How does it work?

Right here is how the true algorithm generates the graphic:

1. It hashes the particular person identification. (The string model of it! Not the integer.)

2. Then nibbles the predominant 8 chunks of the hash digest in 4 bit increments. A bit is 8 bits, so 8 chunks invent 16 nibbles.

3. It uses the parity bit of every nibble (aka is it even or uncommon?) to deem whether or no longer a particular pixel in the 5×5 avatar is stuffed in or no longer. It begins with the heart column and moves outwards, from top to bottom. The first and the 2nd column are reflected alongside the third column, so it is in fact best the exercise of 3×5 nibbles.

Assuming a digest of the achieve `AB CD EF GH IJ KL MN OP`, it paints pixels in the next allege:

``````      commence
▼
Okay  F  A  F′ Okay′
L  G  B  G′ L′
M  H  C  H′ M′
N  I  D  I′ N′
O  J  E  J′ O′
▲
live
``````

The `P` nibble is unused. `F′` to `O′` are factual mirrors of their non-top values.

4. It also does some rgb/hsl math with the decrease 28 bits of the digest to carry a coloration for the pixels, but I didn’t bother studying how that works. Maybe that you just would possibly perchance perchance well presumably also figure it out and add give a enhance to.

And so this program factual implements the reverse of that. It computes the md5 for every number and tests that against the particular person-equipped bit string. The bit string is represented as a html grid, and transformed to a digest sample at compute time.

Read the Python code, it is less complicated to practice: `identicon.py`. (I grow to be in the origin going to exercise Pyodide for this, but that didn’t in fact determine. The code is gentle there factual in case.)

##### Right here is an instance.

This walks via buying for a particular person identification that yields the heart avatar from Jason’s blog put up.

Assuming 0s portray pixels which will doubtless be stuffed in and 1s portray pixels that achieve no longer appear to be, the initial grid is:

``````1 0 0 0 1
0 0 0 0 0
0 1 1 1 0
1 0 0 0 1
0 0 1 0 0
``````

or, since we best care about the predominant 3 columns:

``````1 0 0
0 0 0
0 1 1
1 0 0
0 0 1
``````

Loosely, this suggests we’re searching for a digest of the achieve `00 10 10 01 00 10 01 0-`, following the `A` to `O` instance from above.

Namely, nonetheless, it intention that we’re searching for integers with MD5
digests whose 8 most important chunks resemble the next, in binary:

``````xxx0xxx0
xxx1xxx0
xxx1xxx0
xxx0xxx1
xxx0xxx0
xxx1xxx0
xxx0xxx1
xxx0xxxx
``````

We best care about the parity bit for every nibble, so the `x`s shall be something else.

###### And now taking the number 2013 as an illustration:
>> for byte in h.digest()[:8]:
… print(layout(byte, “08b”)) # pad to 8 bits
10000000
00111000
11011010
10001001
11100100
10011010
11000101
11101010″>

```>>> import hashlib
>>> h = hashlib.md5(b"2013")
>>> for byte in h.digest()[:8]:
...     print(layout(byte, "08b")) # pad to 8 bits
10000000
00111000
11011010
10001001
11100100
10011010
11000101
11101010```

The output suits the sample from above! So theoretically, the account with
an identification of 2013 will must have a robotic as its default avatar. And it does: jeffsmith.png! (the username mapping occurs here)