Solidity bytecode metadata


I’m not a smart contract developer, but in my work I’ve spent quite a lot of time reading smart contracts written in Solidity, and even more time inspecting the bytecode that the Solidity compiler generates. The most frustrating thing for me about working with Solidity is that there is no single place that describes all the changes between its versions. As far as I know, the documentation in the official Solidity repo covers the different parts of the language, but it is kept up to date only for the latest version. So if you are not using the latest version, some features may have changed since, and you have to dig through the changelog to find out how they work in the version you use.

What motivated me to write this post is the metadata encoding in Solidity bytecode. In theory, you can use this page, which describes the metadata encoding for a given version, but I wouldn’t count on it being up to date. The encoding has already changed twice in recent minor releases of Solidity, yet the bytecode metadata documentation describes only the latest version (it wasn’t updated for the last release, so I created a PR with the update myself). Also, this file exists only for versions starting with 0.4.17.

In this post, I’ll describe how the metadata encoding changed across different versions of Solidity. The information here was found mostly by trial and error: inspecting bytecode and confirming guesses against the Solidity changelog. If anything is incomplete or wrong, please suggest corrections and improvements.

Metadata encoding in different versions of Solidity

First, some background for those who don’t know how the bytecode is structured. It consists of three parts:

  • Bytecode
  • Encoded metadata
  • Encoded constructor arguments

All these parts are appended together:

bytecode + encoded_metadata + encoded_constructor_arguments
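
As a rough illustration of this layout, here is a minimal Python sketch that concatenates the three parts as hex strings and then recovers the constructor arguments by stripping off the compiler output. The helper names and the sample values (a truncated bytecode prefix, a placeholder metadata string, a single uint256 argument) are assumptions made up for the example, not real contract data.

```python
def encode_uint256(value: int) -> str:
    """Encode one uint256 constructor argument as 32 big-endian bytes (64 hex chars)."""
    return value.to_bytes(32, "big").hex()


def build_creation_input(bytecode: str, encoded_metadata: str,
                         encoded_constructor_arguments: str) -> str:
    """Append the three parts in the order described above."""
    return bytecode + encoded_metadata + encoded_constructor_arguments


def split_constructor_arguments(creation_input: str, compiled: str) -> str:
    """Given the compiler output (bytecode + metadata), return what follows it."""
    if not creation_input.startswith(compiled):
        raise ValueError("creation input does not start with the compiled bytecode")
    return creation_input[len(compiled):]


# Hypothetical sample values, for illustration only.
bytecode = "6080604052"        # truncated bytecode prefix
encoded_metadata = "deadbeef"  # placeholder, not a real metadata encoding
creation_input = build_creation_input(bytecode, encoded_metadata, encode_uint256(42))

print(split_constructor_arguments(creation_input, bytecode + encoded_metadata))
```

Splitting this way only works when you already have the exact compiler output; the sections that follow look at how to locate the metadata without it.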

Early versions of Solidity

It looks like earlier versions of Solidity didn’t include metadata in the bytecode, so constructor arguments were appended directly to the bytecode.

Solidity ~ 0.4.17

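Starting around this version, the Solidity compiler began appending a Swarm hash encoded using CBOR (the bytes 62 7a 7a 72 30 in the prefix are the ASCII key bzzr0). The trailing 0029 is the length of the CBOR data in bytes (0x29 = 41):

“a165627a7a72305820” + swarm_hash + “0029”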
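
To make the pattern concrete, here is a small Python sketch that searches a hex string for this encoding and pulls out the 32-byte hash. The function name and the choice to take the last byte-aligned match are my own assumptions; taking the last match is only a heuristic, since constructor arguments follow the metadata and a contract may embed other contracts' bytecode.

```python
import re

# Prefix from the text, a 32-byte hash (64 hex chars), then the 2-byte suffix.
METADATA_PATTERN = re.compile(r"a165627a7a72305820([0-9a-f]{64})0029")


def find_metadata_hash(bytecode_hex: str):
    """Return the 32-byte hash from the last matching metadata block, or None."""
    code = bytecode_hex.lower()
    if code.startswith("0x"):
        code = code[2:]
    # Keep only matches that start on a byte boundary (even hex offset).
    matches = [m for m in METADATA_PATTERN.finditer(code) if m.start() % 2 == 0]
    return matches[-1].group(1) if matches else None


# The key inside the prefix is plain ASCII.
print(bytes.fromhex("627a7a7230").decode("ascii"))

# The suffix equals the byte length of prefix + hash: 9 + 32 = 41 = 0x29.
print(len(bytes.fromhex("a165627a7a72305820")) + 32 == 0x0029)
```

Calling find_metadata_hash on a full creation input returns the hash as a hex string, or None when the bytecode carries no metadata in this format.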

Solidity >= 0.5.10

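Starting from this version, the Solidity compiler encodes the compiler version alongside the Swarm hash. The bytes 73 6f 6c 63 are the ASCII key solc, the version itself takes three bytes (major, minor, patch), and the length suffix grows to 0032 (0x32 = 50 bytes):

“a265627a7a72305820” + swarm_hash + “64736f6c6343” + compiler_version + “0032”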
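
The compiler_version field can be decoded with a few lines of Python. This is a minimal sketch, assuming the three bytes are major, minor and patch in that order; the function name is made up for the example.

```python
def decode_compiler_version(version_hex: str) -> str:
    """Turn the 3-byte compiler_version field into a 'major.minor.patch' string."""
    raw = bytes.fromhex(version_hex)
    if len(raw) != 3:
        raise ValueError("expected exactly 3 bytes (6 hex characters)")
    major, minor, patch = raw  # iterating over bytes yields integers
    return f"{major}.{minor}.{patch}"


# The key that precedes the version is plain ASCII.
print(bytes.fromhex("736f6c63").decode("ascii"))  # solc

print(decode_compiler_version("00050a"))  # 0.5.10
```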

Solidity >= 0.5.11

Starting from this version, the Swarm hash key in the CBOR encoding was bumped from bzzr0 to bzzr1, so the only change is near the beginning of the encoding (72 30 becomes 72 31):

“a265627a7a72315820” + swarm_hash + “64736f6c6343” + compiler_version + “0032”
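
Putting the three encodings together, the following Python sketch tries each pattern in turn and reports which one a given bytecode uses, along with the hash, the decoded compiler version (when present) and whatever follows the metadata. The function name, the format labels and the returned dictionary are assumptions of this example; encodings not listed in this post simply yield None.

```python
import re

HASH = r"([0-9a-f]{64})"              # 32-byte hash
SOLC = r"64736f6c6343([0-9a-f]{6})"   # "solc" key + 3-byte version

# Newest format first; labels follow the section headings of this post.
FORMATS = [
    (">= 0.5.11", re.compile("a265627a7a72315820" + HASH + SOLC + "0032")),
    (">= 0.5.10", re.compile("a265627a7a72305820" + HASH + SOLC + "0032")),
    ("~ 0.4.17", re.compile("a165627a7a72305820" + HASH + "0029")),
]


def classify_metadata(bytecode_hex: str):
    """Return details of the last metadata block found, or None if nothing matches."""
    code = bytecode_hex.lower()
    if code.startswith("0x"):
        code = code[2:]
    for label, pattern in FORMATS:
        matches = [m for m in pattern.finditer(code) if m.start() % 2 == 0]
        if not matches:
            continue
        m = matches[-1]  # heuristic: the contract's own metadata comes last
        version = None
        if m.lastindex == 2:  # formats that carry a compiler version
            major, minor, patch = bytes.fromhex(m.group(2))
            version = f"{major}.{minor}.{patch}"
        return {
            "format": label,
            "hash": m.group(1),
            "compiler_version": version,
            "constructor_arguments": code[m.end():],
        }
    return None
```

Note that the format label comes from the prefix bytes alone, so the decoded compiler_version is the more reliable indicator of which compiler produced the bytecode.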