Rendering a PNG on Ethereum: face.png

Mon Sep 6, 2021 · 2296 words · 12 min

How do you create a PNG image dynamically in Solidity? How do you make it into a non-fungible token? Why would you create such an abomination? Why are they blurry in Safari?

The answers to some of these questions are below!

at the time of writing, there are still ~40 left on facepng.art

Motivation

Why a PNG?

Everyone is making NFTs that generate their artwork on chain using SVG, but nobody is making ones that generate PNGs onchain.
— nick.eth (@nicksdjohnson) August 27, 2021

This is the tweet, right here.

I'm not sure if Nick knows it, but he nerd sniped me out of my entire weekend.

Why an NFT?

All my friends and colleagues were getting cool non-fungible tokens (NFTs) and using them as profile pictures on Twitter and GitHub and I felt left out, but I really like my profile picture:

o_0

I wanted to have my cake and eat it too: participating in the NFT hype, and not giving up my "brand".

Making a one-of-one collectible is pretty boring though, so this one is generative! Nothing too fancy, just choosing a couple properties and combining them, which is a good exercise for doing the encoding on-chain anyway.

So I settled on a small run of procedurally generated character art faces, with all the metadata and image data stored and calculated on-chain. Because I suffer from chronic laziness, I picked out seven eyes and three noses. With the left and right eyes chosen independently, that gives a total supply of 7 × 3 × 7 = 147 faces. Not exactly a huge collection, but that's half the fun of collectibles anyway. To keep things recognizable, each face is unique based on the characters (and ignores the colors).

Implementation

Anatomy of a PNG Image

Portable Network Graphics (PNG) images are a mainstay of the internet. libpng maintains a specification of the format that goes into a lot more depth, and if you're extremely interested in the fine details, I'd recommend reading it. The following hack job is my attempt to cover the parts important to generating the simplest PNG image I could think of, but I'll try to link to the relevant portions of the specification as we go along.

File Signature

The first eight bytes of the file identify the file as a PNG image¹:

+----+----+----+----+----+----+----+----+---
| 89 | 50 | 4e | 47 | 0d | 0a | 1a | 0a |
+----+----+----+----+----+----+----+----+---

There are actually some really cool tricks in the signature to detect various forms of corruption and transmission errors. For our purposes, the signature is just a static array of eight bytes to copy.
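For the curious, here is a small Python sketch (an off-chain illustration, not part of the contract) that spells out what each signature byte guards against, following the rationale in the PNG specification. The helper name `has_png_signature` is made up for this example.

```python
# The eight signature bytes from the table above.
PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")

# 0x89   high bit set: catches channels that strip data to 7 bits
# "PNG"  readable format name
# CR LF  gets altered by transfers that rewrite DOS line endings
# 0x1a   Ctrl-Z: stops the file being listed as text under DOS
# LF     gets altered by transfers that rewrite Unix line endings
assert PNG_SIGNATURE == b"\x89PNG\r\n\x1a\n"


def has_png_signature(data: bytes) -> bool:
    """Return True when data starts with the eight-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE
```

A file that went through a 7-bit channel or a line-ending conversion no longer matches, which is the whole point of the odd byte choices.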

Chunks!

After the signature, the PNG file is divided into chunks.² Each chunk has a four-byte length, a four-byte type, some data, and a four-byte cyclic redundancy check (CRC). The length counts only the data bytes, and the CRC covers the type and the data:

---+--+--+--+--+--+--+--+--+--===--+--+--+--+--+---
   | Length    | Type      | Data  | CRC       |
---+--+--+--+--+--+--+--+--+--===--+--+--+--+--+---
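The chunk layout is easy to check off-chain. The following Python sketch uses the standard library's `zlib.crc32` rather than a hand-written checksum; the function name `chunk` is an assumption for illustration, not something from the contract.

```python
import struct
import zlib


def chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one PNG chunk: length, type, data, CRC-32 of type + data."""
    if len(chunk_type) != 4:
        raise ValueError("chunk type must be exactly four bytes")
    length = struct.pack(">I", len(data))  # big-endian, counts the data only
    crc = struct.pack(">I", zlib.crc32(chunk_type + data))
    return length + chunk_type + data + crc
```

Calling `chunk(b"IEND", b"")` should reproduce the twelve constant bytes of the image trailer, which makes a handy sanity check for the byte order.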

There are four chunks we need to pay attention to: IHDR, PLTE, IDAT, and IEND.

`IHDR` Chunk

The IHDR chunk, which I assume means image header, immediately follows the signature and contains metadata about the image itself:

---+--+--+--+--+--+--+--+--+---+---+---+---+---+---
   | Width     | Height    | D | C | Z | F | I |
---+--+--+--+--+--+--+--+--+---+---+---+---+---+---

D: Bit Depth
C: Color Type
Z: Compression Method
F: Filter Method
I: Interlace Method

Width and height are pretty self-explanatory, bit depth is the number of bits per sample (or palette index), color type determines how colors are represented in the image data, compression method is always zero (for DEFLATE³), filter method chooses a transform to apply to the image data before compression, and the interlace method indicates how the data is ordered (so it can be rendered while being transferred.)

For our purposes, filter⁴ and interlace⁵ will always be zero for the default filter and no interlacing respectively. Since the images I want to encode are fairly simple—basically pixel art—we can use the palette (3) color type with a bit depth of 1.

Width = 48
Height = 48
Bit Depth = 1
Color Type = 3
Compression Method = 0
Filter Method = 0
Interlace Method = 0

With the chunk header and CRC, the IHDR chunk ends up as another static block of bytes:

---++----+----+----+----++----+----+----+----++----+----+----+----++----+----+----+----++----++----++----++----++----++-------------------++---
   || Length            || Type              || Width             || Height            || D  || C  || Z  || F  || I  || CRC               ||
---++----+----+----+----++----+----+----+----++----+----+----+----++----+----+----+----++----++----++----++----++----++----+----+----+----++---
   || 00 | 00 | 00 | 0d || 49 | 48 | 44 | 52 || 00 | 00 | 00 | 30 || 00 | 00 | 00 | 30 || 01 || 03 || 00 || 00 || 00 || 6d | cc | 6b | c4 ||
---++----+----+----+----++----+----+----+----++----+----+----+----++----+----+----+----++----++----++----++----++----++----+----+----+----++---
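To double-check the table, this Python sketch unpacks the same 25 bytes back into named fields and works out how many bytes one row of pixels needs. The function and key names are assumptions made for this example; the sketch does not verify the CRC.

```python
import struct

# The IHDR chunk exactly as listed in the table above.
IHDR_CHUNK = bytes.fromhex("0000000d49484452000000300000003001030000006dcc6bc4")


def parse_ihdr(raw: bytes) -> dict:
    """Unpack the fixed 13-byte IHDR payload into named fields."""
    length, chunk_type = struct.unpack(">I4s", raw[:8])
    if chunk_type != b"IHDR" or length != 13:
        raise ValueError("not an IHDR chunk")
    names = ("width", "height", "bit_depth", "color_type",
             "compression", "filter", "interlace")
    fields = dict(zip(names, struct.unpack(">IIBBBBB", raw[8:21])))
    # Palette images have one sample per pixel, so a row needs
    # width * bit_depth bits, rounded up to whole bytes.
    fields["row_bytes"] = (fields["width"] * fields["bit_depth"] + 7) // 8
    return fields
```

With a width of 48 and a bit depth of 1, each row packs into 6 bytes, which is where the row size used for the image data comes from.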

`PLTE` Chunk

The PLTE chunk⁶ holds an array of colors which can be referenced by index in the image data. Interestingly, this chunk is our first bit of dynamic data that needs to be constructed at runtime.

Inside the PLTE chunk's data, we have each color encoded as three values: red, green, and blue. In face.png, we have only a foreground and a background color, so the PLTE data length will be six bytes (two colors, three bytes each).

Color is a generated attribute, so we can't just hard-code the PLTE chunk; it has to be calculated on the fly.

First the static portion:

bytes constant private HEADER =
    hex"89504e470d0a1a0a"                                       // PNG Signature
    hex"0000000d49484452000000300000003001030000006dcc6bc4"     // IHDR Chunk
    hex"00000006504c5445";                                      // PLTE Chunk (Partial)

Then we copy the colors, and compute the checksum:

uint offset = 0;

// Copy the static portion of the header.
for (uint ii = 0; ii < HEADER.length; ii++) {
    output[offset++] = HEADER[ii];
}

// Copy the background color.
for (uint ii = 0; ii < bg.length; ii++) {
    output[offset++] = bg[ii];
}

// Copy the foreground color.
for (uint ii = 0; ii < fg.length; ii++) {
    output[offset++] = fg[ii];
}

// Compute the palette's checksum.
output.crc32(HEADER.length - 4, offset);
offset += 4;

Finally, the checksum implementation itself:

library Crc32 {
    bytes constant private TABLE = hex"[snip]";

    function table(uint index) private pure returns (uint32) {
        unchecked {
            index *= 4;

            uint32 result =
                uint32(uint8(TABLE[index    ])) << 24;
            result |= uint32(uint8(TABLE[index + 1])) << 16;
            result |= uint32(uint8(TABLE[index + 2])) << 8;
            result |= uint32(uint8(TABLE[index + 3]));
            return result;
        }
    }

    function crc32(bytes memory self, uint offset, uint end) internal pure {
        unchecked {
            uint32 crc = ~uint32(0);

            for (uint ii = offset; ii < end; ii++) {
                crc = (crc >> 8) ^ table((crc & 0xff) ^ uint8(self[ii]));
            }

            crc = ~crc;

            self[end    ] = bytes1(uint8(crc >> 24));
            self[end + 1] = bytes1(uint8(crc >> 16));
            self[end + 2] = bytes1(uint8(crc >> 8));
            self[end + 3] = bytes1(uint8(crc));
        }
    }
}

Length, identifier, bunch of colors, and a checksum. That's about it for the palette.
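A table-driven CRC-32 like the one above can be cross-checked in Python. This sketch generates the lookup table at import time from the standard reflected polynomial 0xEDB88320. It assumes the snipped Solidity `TABLE` holds those same 256 entries. The palette colors in the usage note are placeholders.

```python
import zlib


def make_table() -> list:
    """Build the 256-entry CRC-32 table for the reflected polynomial."""
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = make_table()


def crc32(data: bytes) -> int:
    """Same loop as the Solidity library: start inverted, finish inverted."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc & 0xFF) ^ byte]
    return crc ^ 0xFFFFFFFF


# Placeholder palette: white background, black foreground.
plte_body = b"PLTE" + bytes.fromhex("ffffff") + bytes.fromhex("000000")
assert crc32(plte_body) == zlib.crc32(plte_body)
```

The checksum runs over the chunk type and the data but not the length, which matches the `HEADER.length - 4` starting offset in the Solidity call.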

`IDAT` Chunk

The IDAT (or image data) chunk⁷ is where the real magic happens. In a more traditional PNG image, the encoder does some fancy magic to figure out what filters⁸ to apply, then compresses the data, and writes it out.

In our grossly simplified encoder we wrap the image data in a zlib stream without any compression:

---++----++----++-===--++----+----+----+----++---
   || M  || F  || Data || Adler32           ||
---++----++----++-===--++----+----+----+----++---
   || 78 || 01 || ...  || ?? | ?? | ?? | ?? ||
---++----++----++-===--++----+----+----+----++---

M: Compression method/flags code
F: Additional flags/check bits

The image data itself is 48 rows of 7 bytes. The first byte of each row is the filter type, which is always zero for face.png. The remaining 6 bytes are the actual pixels, split into three 2-byte segments: the left eye, the nose, and the right eye. Each bit is an index into the palette from the PLTE chunk: a zero is the background, and a one is the foreground.
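The row layout can be sketched in a few lines of Python. The constant names mirror ones used in the contract's render function, but their values here are inferred from the description above rather than copied from the source, and `build_scanlines` is a hypothetical helper.

```python
LINES = 48             # rows in the image (assumed from the 48x48 size)
SPRITE_LINE_BYTES = 2  # 16 one-bit pixels per sprite row
WIDTH_BYTES = 6        # 48 one-bit pixels per image row


def build_scanlines(left: bytes, nose: bytes, right: bytes) -> bytes:
    """Interleave three 2-byte-wide sprites into filter-prefixed rows."""
    out = bytearray()
    for line in range(LINES):
        start = line * SPRITE_LINE_BYTES
        end = start + SPRITE_LINE_BYTES
        out.append(0)  # filter type 0: the row is stored as-is
        out += left[start:end] + nose[start:end] + right[start:end]
    return bytes(out)
```

Each row comes out as `WIDTH_BYTES + 1` bytes, so the whole image is 48 × 7 = 336 bytes before it is wrapped for the IDAT chunk.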

Each segment is encoded as a byte array constant in Solidity:

bytes constant private EYES_CRY =
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0004"
    hex"0004"
    hex"0002"
    hex"0002"
    hex"0002"
    hex"0006"
    hex"3ffe"
    hex"7ffc"
    hex"0630"
    hex"1818"
    hex"108c"
    hex"31c6"
    hex"21c6"
    hex"6086"
    hex"6006"
    hex"6006"
    hex"310e"
    hex"3ffc"
    hex"1e78"
    hex"0000"
    hex"0100"
    hex"0180"
    hex"0380"
    hex"0380"
    hex"0100"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000"
    hex"0000";

EYES_CRY would render into something like:
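The rendered image is not reproduced in this text. As a stand-in, the Python sketch below prints the same 48 rows as text, with `#` for foreground bits and `.` for background bits. The row values are copied from the constant above; the names `ROWS_HEX` and `render_rows` are made up for the example.

```python
# The 48 two-byte rows of EYES_CRY, top to bottom.
ROWS_HEX = (
    ["0000"] * 10
    + ["0004", "0004", "0002", "0002", "0002", "0006", "3ffe", "7ffc",
       "0630", "1818", "108c", "31c6", "21c6", "6086", "6006", "6006",
       "310e", "3ffc", "1e78", "0000"]
    + ["0100", "0180", "0380", "0380", "0100"]
    + ["0000"] * 13
)


def render_rows(rows_hex: list) -> list:
    """Turn each 16-bit row into 16 characters, most significant bit first."""
    rows = []
    for row in rows_hex:
        bits = format(int(row, 16), "016b")
        rows.append(bits.replace("0", ".").replace("1", "#"))
    return rows


if __name__ == "__main__":
    print("\n".join(render_rows(ROWS_HEX)))
```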

Concatenate together each row of each template image, and you have your image data!

function render(bytes memory output, uint offset, uint8 leftEyeIndex, uint8 noseIndex, uint8 rightEyeIndex) private pure {
    unchecked {
        bytes memory sprite;

        sprite = eye(leftEyeIndex);

        for (uint line = 0; line < LINES; line++) {
            uint inOffset = line * SPRITE_LINE_BYTES;
            uint outOffset = 1 + (line * (WIDTH_BYTES + 1));

            for (uint column = 0; column < SPRITE_LINE_BYTES; column++) {
                output[offset + outOffset + column] = sprite[inOffset + column];
            }
        }

        sprite = nose(noseIndex);

        for (uint line = 0; line < LINES; line++) {
            uint inOffset = line * SPRITE_LINE_BYTES;
            uint outOffset = 1 + SPRITE_LINE_BYTES + (line * (WIDTH_BYTES + 1));

            for (uint column = 0; column < SPRITE_LINE_BYTES; column++) {
                output[offset + outOffset + column] = sprite[inOffset + column];
            }
        }

        sprite = eye(rightEyeIndex);

        for (uint line = 0; line < LINES; line++) {
            uint inOffset = line * SPRITE_LINE_BYTES;
            uint outOffset = 1 + (2 * SPRITE_LINE_BYTES) + (line * (WIDTH_BYTES + 1));

            for (uint column = 0; column < SPRITE_LINE_BYTES; column++) {
                output[offset + outOffset + column] = sprite[inOffset + column];
            }
        }
    }
}

Since the "compressed" and original data are the same, we can compute the Adler-32 checksum over the same bytes. The Adler-32 checksum in Solidity looks something like:

library Adler32 {
    uint32 constant private MOD = 65521;

    function adler32(bytes memory self, uint offset, uint end) internal pure {
        unchecked {
            uint32 a = 1;
            uint32 b = 0;

            // Process each byte of the data in order
            for (uint ii = offset; ii < end; ii++) {
                a = (a + uint32(uint8(self[ii]))) % MOD;
                b = (b + a) % MOD;
            }

            uint32 adler = (b << 16) | a;

            self[end    ] = bytes1(uint8(adler >> 24));
            self[end + 1] = bytes1(uint8(adler >> 16));
            self[end + 2] = bytes1(uint8(adler >> 8));
            self[end + 3] = bytes1(uint8(adler));
        }
    }
}

Then to finish off the IDAT chunk, we compute the crc32 checksum the same way we did for the PLTE chunk.
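Putting the pieces of the IDAT chunk together off-chain looks roughly like the Python sketch below. One detail the diagram folds into "Data": DEFLATE requires uncompressed bytes to sit inside a stored block, which has a five-byte header (a final-block flag byte, then the length and its one's complement, both little-endian). The sketch assumes a single stored block, which the post does not show explicitly. The function names are hypothetical.

```python
import struct
import zlib

MOD = 65521  # same modulus as the Solidity library


def adler32(data: bytes) -> int:
    """Adler-32 with the same running sums as the Solidity version."""
    a, b = 1, 0
    for byte in data:
        a = (a + byte) % MOD
        b = (b + a) % MOD
    return (b << 16) | a


def stored_zlib_stream(raw: bytes) -> bytes:
    """Wrap raw in a zlib stream holding one uncompressed DEFLATE block."""
    if len(raw) > 0xFFFF:
        raise ValueError("a single stored block holds at most 65535 bytes")
    header = b"\x78\x01"  # the M and F bytes from the diagram
    block = b"\x01" + struct.pack("<HH", len(raw), len(raw) ^ 0xFFFF)
    return header + block + raw + struct.pack(">I", adler32(raw))


def idat_chunk(raw: bytes) -> bytes:
    """Frame the zlib stream as an IDAT chunk with its CRC-32."""
    body = b"IDAT" + stored_zlib_stream(raw)
    length = struct.pack(">I", len(body) - 4)  # the type is not counted
    return length + body + struct.pack(">I", zlib.crc32(body))
```

Feeding the result of `stored_zlib_stream` to `zlib.decompress` should return the original bytes, which is a quick way to confirm the framing is valid.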

`IEND` Chunk

The IEND, or image trailer, chunk⁹ finishes off the rendered PNG. This chunk is just a constant:

bytes constant private TRAILER = hex"0000000049454e44ae426082"; // IEND Chunk

That's basically an entire PNG rendered dynamically on-chain.

We did take a couple shortcuts:

Fixed size/bit depth/color format
No compression
Only two colors

But technically these are valid PNG images!
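One way to back up the "valid PNG" claim off-chain is to walk the output chunk by chunk and re-check every CRC. The Python sketch below does only that structural check; it does not decode pixels or enforce chunk ordering. The function name `chunk_types` is an assumption for illustration.

```python
import struct
import zlib

PNG_SIGNATURE = bytes.fromhex("89504e470d0a1a0a")


def chunk_types(png: bytes) -> list:
    """Walk a PNG, checking the signature and each chunk's CRC-32."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    types, pos = [], 8
    while pos < len(png):
        length, ctype = struct.unpack(">I4s", png[pos:pos + 8])
        data_end = pos + 8 + length
        (crc,) = struct.unpack(">I", png[data_end:data_end + 4])
        # The CRC covers the type and data, not the length field.
        if crc != zlib.crc32(png[pos + 4:data_end]):
            raise ValueError(f"bad CRC in {ctype!r} chunk")
        types.append(ctype.decode("ascii"))
        pos = data_end + 4
    return types
```

For a rendered face, the expected result is the four chunk types discussed above, in order: IHDR, PLTE, IDAT, IEND.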

Challenges

There were some interesting challenges that came up, some in tooling, some in implementation, and some in the marketplace!

Tooling Issues

While Remix and solc have no trouble compiling render.sol, Hardhat seems to have a memory leak and crashes.

The JavaScript VM in Remix will hang if you attempt to render a PNG, at least in Firefox. I suspect that all the BigNumber math is messing things up.

Implementation Issues

I am not a Solidity developer by trade, so a lot of this was new to me. Having to manually implement slices and memcpy-type primitives was tedious and I'm sure I made mistakes.

I'd really like to see full support for slices and slice copies, so that it's possible to write something like:

contract Demo {
    bytes constant FOO = hex"......";

    function copy(uint offset) private pure returns (bytes memory) {
        return FOO[offset : offset + 3];
    }
}

Marketplace Issues

While I might write another post on the topic of ERC-721 metadata, one of the biggest and most annoying issues I encountered was in the encoding of images into a data URL (yes, data URL), nested in JSON, also encoded into another data URL.

The primary marketplace I used to test face.png is OpenSea. OpenSea is a pretty decent website, which is why I was quite surprised they don't support rendering PNG images in data URLs (e.g. data:image/png;base64,...) directly, but do support PNG images embedded in an SVG image.

If you're curious why you get an SVG when you Save As... your favourite face.pngs from OpenSea, this is why.

So we have a PNG, encoded into base64, packed into an SVG, encoded into base64, packed into JSON, ALSO ENCODED INTO BASE64!
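The nesting is easier to see in code. This Python sketch builds the three layers off-chain. The SVG markup and the metadata field values are assumptions for illustration (the contract's exact output is not shown in this post), and `data_url` and `token_uri` are hypothetical helper names.

```python
import base64
import json


def data_url(mime: str, payload: bytes) -> str:
    """Encode payload as a base64 data URL with the given MIME type."""
    return f"data:{mime};base64," + base64.b64encode(payload).decode("ascii")


def token_uri(png: bytes, name: str) -> str:
    """PNG -> data URL -> SVG -> data URL -> JSON -> data URL."""
    png_url = data_url("image/png", png)
    # Assumed wrapper: a bare SVG with a single <image> element.
    svg = (
        '<svg xmlns="http://www.w3.org/2000/svg" width="48" height="48">'
        f'<image width="48" height="48" href="{png_url}"/>'
        "</svg>"
    )
    metadata = {
        "name": name,
        "image": data_url("image/svg+xml", svg.encode("utf-8")),
    }
    return data_url("application/json", json.dumps(metadata).encode("utf-8"))
```

Each base64 layer inflates its input by about a third, so three layers roughly multiply the size of the original PNG bytes by 2.4 before counting the SVG and JSON wrappers.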

yo dawg

`IEND` (The End)

Thanks for taking the time to read about my wasted (but quite enjoyable) weekend! If you bought a token, I really appreciate it, but don't quite understand why.

If this post gets some attention, I might write a post on the token itself, on the random number algorithm I used for attributes, and possibly even on the terrible tooling I built around this project. Stay tuned, and happy hacking!

facepng.art if you want a face of your own