Writing Ethereum Contracts the Hard Way - Part 2

August 8, 2022


Yul and deployment

In this article, we switch to the Yul language to write contracts. Yul is an experimental language supported by the Solidity compiler (so solc needs to be installed). It is still very close to assembly, but it helps avoid some repetitive tasks. For example, PUSH instructions are replaced by direct arguments: instead of PUSH1 0x01, PUSH1 0x02, ADD, in Yul we can write add(0x01, 0x02).
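To make the comparison concrete, here is a minimal Python sketch (not part of solc or any real tool) that hand-assembles the PUSH1/PUSH1/ADD sequence into bytecode. The assemble helper and its tiny opcode table are hypothetical names introduced only for illustration; the opcode values 0x60 (PUSH1) and 0x01 (ADD) come from the EVM instruction set.

```python
# Toy assembler covering only the two opcodes used in the example above.
# "assemble" and OPCODES are illustrative names, not a real tool's API.
OPCODES = {"PUSH1": 0x60, "ADD": 0x01}


def assemble(program):
    """Turn a list of (mnemonic, operand-or-None) pairs into EVM bytecode."""
    out = bytearray()
    for mnemonic, operand in program:
        out.append(OPCODES[mnemonic])
        if mnemonic == "PUSH1":
            # PUSH1 is followed by exactly one byte of immediate data.
            if operand is None or not 0 <= operand <= 0xFF:
                raise ValueError("PUSH1 needs a one-byte operand")
            out.append(operand)
    return bytes(out)


bytecode = assemble([("PUSH1", 0x01), ("PUSH1", 0x02), ("ADD", None)])
print(bytecode.hex())  # 6001600201
```

This is the bookkeeping Yul takes over: with add(0x01, 0x02) the compiler decides which PUSH instructions to emit and in which order.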

Another feature we need is handling multiple code blocks together. In the previous post we deployed our code using PUSH13, but that doesn't work with longer code, as PUSH32 is the maximum (remember: a stack slot is 32 bytes). Therefore, we need a more dynamic approach to deployment. The idea is to combine the deployment code and the contract code:

image-20220401141102004

CODECOPY can access the code of the executing contract, and it can copy the bytes from a certain offset to the memory:

image-20220401141151695

In assembly, it would look something like this:

image-20220401141520444

This code can be compiled (600d600c600039600d6000f3), and the real contract code can be appended to it. The only problem is that this code depends on the size of the appended contract (0x0d) and on the size of the deployment code itself (0x0c).
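The dependency on the two sizes can be sketched in Python. This is only an illustration of how the 12-byte prefix is put together, assuming a runtime of at most 255 bytes so that PUSH1 is enough; deployer_prefix and with_deployer are hypothetical helper names, and the 13-byte runtime is a placeholder of zero bytes, not the real contract.

```python
def deployer_prefix(runtime_size, prefix_size=12):
    """Build the 12-byte deployment prefix for a runtime of the given size."""
    if not 0 <= runtime_size <= 0xFF:
        raise ValueError("this sketch only handles runtimes up to 255 bytes")
    return bytes([
        0x60, runtime_size,  # PUSH1 size   (number of bytes to copy)
        0x60, prefix_size,   # PUSH1 offset (runtime starts right after the prefix)
        0x60, 0x00,          # PUSH1 0x00   (destination offset in memory)
        0x39,                # CODECOPY
        0x60, runtime_size,  # PUSH1 size   (length of the returned data)
        0x60, 0x00,          # PUSH1 0x00   (memory offset of the returned data)
        0xF3,                # RETURN
    ])


def with_deployer(runtime):
    """Prepend the deployment prefix to the runtime bytecode."""
    return deployer_prefix(len(runtime)) + runtime


runtime = bytes(13)  # placeholder: 13 zero bytes standing in for the real code
init_code = with_deployer(runtime)
print(init_code[:12].hex())  # 600d600c600039600d6000f3
```

Changing the length of the runtime changes two bytes of the prefix, which is exactly the manual bookkeeping we would like the tooling to do for us.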

Yul provides helper functions that return the offset and length of any binary block (or compiled source code), but this abstraction doesn't hide the simplicity of the underlying EVM instructions.

The simple ADD contract can be written in Yul as follows:

Here we use the CODECOPY instruction (called datacopy in Yul) together with some magic helpers (dataoffset, datasize) to get the offset and size of the code in the second part (runtime).

This code can be compiled with solc --strict-assembly map.yul which prints out the code to the standard output. cethacea has a helper to parse the output and save the code to a file:

Parse parameters

Now we have just enough abstraction to continue and enhance the ADD contract with parsing parameters. We can call the contract with one binary array (--data). Therefore we need to decide how it should be parsed. Let's use the following structure:

image-20220401142331854
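As a rough Python sketch of such a layout, the snippet below assumes a 4-byte operation id followed by two 32-byte big-endian operands. The ids OP_ADD and OP_SUB and the build_calldata helper are hypothetical and chosen only for illustration; they are not taken from the original contract.

```python
OP_ADD = 0x00000001  # hypothetical operation id, assumed for this sketch
OP_SUB = 0x00000002  # hypothetical operation id, assumed for this sketch


def build_calldata(op, a, b):
    """Concatenate a 4-byte operation id and two 32-byte big-endian operands."""
    return op.to_bytes(4, "big") + a.to_bytes(32, "big") + b.to_bytes(32, "big")


data = build_calldata(OP_ADD, 1, 2)
print(len(data))   # 68
print(data.hex())  # the value that would be passed with --data
```

The result is 4 + 32 + 32 = 68 bytes, most of them zeros, because every operand occupies a full 32-byte word.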

To support the two distinct operations (ADD and SUB), we should check the first 4 bytes. This can be done with Yul's switch statement, which uses the following EVM instructions under the hood:

  • EQ compares the top two elements of the stack and pushes the result (1 if they are equal, 0 otherwise)
  • JUMPI is a conditional jump to an address in the code (based on the destination and the condition on the stack, so it can use the result of EQ)
  • JUMPDEST must be the instruction at the address targeted by JUMPI (it only marks a valid jump target to prevent naughty behavior and doesn't do anything else)

The final Yul code looks like this:

The only tricky part is the shr (right shift). We need only the first 4 bytes to identify which operation should be called, but CALLDATALOAD copies 32 bytes. Therefore, we shift the value to the right by 28 bytes (224 bits) to keep only the 4 bytes we care about:

image-20220401143312117
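The same shift and dispatch can be modelled in Python. This is a sketch of the idea rather than a translation of the Yul source: the operation ids 1 and 2 are hypothetical, and calldataload and run are illustrative helper names.

```python
MASK_256 = (1 << 256) - 1  # EVM words are 256 bits wide


def calldataload(calldata, offset):
    """Model CALLDATALOAD: read 32 bytes at offset, zero-padded past the end."""
    word = calldata[offset:offset + 32].ljust(32, b"\x00")
    return int.from_bytes(word, "big")


def run(calldata):
    """Dispatch on the first 4 bytes, like the switch statement described above."""
    op = calldataload(calldata, 0) >> 224  # shr by 224 bits drops the low 28 bytes
    a = calldataload(calldata, 4)
    b = calldataload(calldata, 36)
    if op == 1:  # hypothetical id for ADD
        return (a + b) & MASK_256  # EVM arithmetic wraps modulo 2**256
    if op == 2:  # hypothetical id for SUB
        return (a - b) & MASK_256
    raise ValueError("unknown operation")  # a real contract could revert here


call = (1).to_bytes(4, "big") + (5).to_bytes(32, "big") + (3).to_bytes(32, "big")
print(run(call))  # 8
```

Without the shift, the loaded word would also contain the first 28 bytes of the first operand, so the comparison against the operation id would never match.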

Let's try it out:

And call the contract:

Using Ethereum contract interface

While our previous contract works perfectly and can be deployed and used, it's not very convenient to type so many zeros to call it. The easiest way to solve this is to follow the convention of the Contract Application Binary Interface (ABI), which defines an encoding followed by the majority of smart contract tools (Solidity and others). Technically we can use any other encoding, but following the ABI lets us call our contract from other libraries, Metamask or anything else (and luckily cethacea also supports the encoding).

For our simple use case the encoding is very simple:

  1. The first 4 bytes are the first 4 bytes of the Keccak-256 hash of the method signature (e.g. keccak("add(uint256,uint256)"))
  2. Each uint256 input is represented with 32 bytes (ABI encoding is also based on 32-byte words, and this type fits exactly)
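The two rules above can be sketched in Python. The encode_call helper is hypothetical and only handles uint256 arguments. The selector is passed in rather than computed, because the Python standard library has no Keccak-256 (hashlib.sha3_256 uses different padding and gives a different result); the value 0x771602f7 is the commonly published selector for add(uint256,uint256) and is treated here as an assumption to confirm with your own tooling.

```python
def encode_call(selector, *args):
    """Encode a call with uint256-only arguments: selector + one 32-byte word each."""
    if len(selector) != 4:
        raise ValueError("selector must be exactly 4 bytes")
    body = b"".join(arg.to_bytes(32, "big") for arg in args)
    return selector + body


# Assumption: commonly published selector for add(uint256,uint256).
# Verify it with a Keccak-256 tool; hashlib.sha3_256 is NOT Keccak-256.
ADD_SELECTOR = bytes.fromhex("771602f7")

data = encode_call(ADD_SELECTOR, 1, 2)
print("0x" + data.hex())
```

The layout is the same as in our hand-rolled format (4 bytes followed by 32-byte words); only the way the first 4 bytes are chosen has changed.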

Let's check first the hashes of our methods:

After a new deployment, we can rely on this convention by using cethacea contract query instead of cethacea tx submit. With the --debug option we can see that it calls exactly the same API, but the input parameters are encoded according to the convention.

Creating ABI json file

We can further simplify the call by defining the input/output parameters in a JSON file. Let's create a new file:

It looks very long, but it's nothing more than the input/output type definitions of our methods, which make it possible to call them with the right tooling.
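For readers who prefer to generate such a file, here is a small Python sketch that builds an ABI description for add and sub in the standard ABI JSON shape (type, name, inputs, outputs, stateMutability). The parameter names a, b and result, and the function_entry helper, are assumptions made for illustration.

```python
import json


def function_entry(name, inputs, outputs, mutability):
    """Build one ABI JSON entry describing a function."""
    return {
        "type": "function",
        "name": name,
        "inputs": [{"name": n, "type": t} for n, t in inputs],
        "outputs": [{"name": n, "type": t} for n, t in outputs],
        "stateMutability": mutability,
    }


# Parameter names are illustrative; only the types matter for the encoding.
two_uints = [("a", "uint256"), ("b", "uint256")]
one_uint = [("result", "uint256")]

abi = [
    function_entry("add", two_uints, one_uint, "pure"),
    function_entry("sub", two_uints, one_uint, "pure"),
]
print(json.dumps(abi, indent=2))
```

Most of the length comes from repeating the same type information for every method, which is why the file grows quickly even for tiny contracts.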

Now we don't need to add the parameter types; it's enough to specify the ABI file, and it works as before:

Using persistent store

So far we have always submitted new transactions to execute code in our contract, but that's not always what is intended. Contracts run in an isolated environment with an ephemeral stack and memory (both are dropped after contract execution). But contracts can also use a persistent store which is committed to the blockchain (see the SSTORE instruction).

image-20220401153758956

As a result, we can have two types of contract calls:

  1. If we plan to change the persistent state of a contract, we need to create a new transaction, as we have done until now. This is where we use the eth_send[Raw]Transaction API calls. A good example is an ERC-20 transfer, which needs to store the new balances of the source and destination wallets.
  2. But there is another type of contract execution which doesn't change the state of the contract. This can be done without any transaction, as the persistent state of the contract is always available on the nodes and the code can be executed at any time. This is done with the eth_call RPC call, which takes essentially the same call parameters as eth_sendTransaction.
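To show how similar the two requests are, here is a Python sketch that builds both JSON-RPC payloads. The addresses and the empty data field are placeholders, and rpc_request is a hypothetical helper; note that eth_call additionally takes a block tag such as "latest", while eth_sendTransaction needs a from account that the node can sign for.

```python
import json


def rpc_request(method, params, request_id=1):
    """Wrap a method and its parameters in a JSON-RPC 2.0 envelope."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


call_object = {
    "to": "0x" + "00" * 20,  # placeholder contract address
    "data": "0x",            # placeholder: the ABI-encoded call data goes here
}

# Read-only execution: the call object plus the block to execute against.
read_only = rpc_request("eth_call", [call_object, "latest"])

# State-changing execution: same call object, plus a (placeholder) sender.
tx_object = dict(call_object, **{"from": "0x" + "11" * 20})
state_changing = rpc_request("eth_sendTransaction", [tx_object])

print(json.dumps(read_only))
print(json.dumps(state_changing))
```

Both payloads carry the same to and data fields; what differs is whether the node only simulates the execution or includes it in a block.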

Our helper tool supports both of them. Transactions can be created with cethacea contract call, and read-only calls can be initiated with cethacea contract query.

Let's try it with a simple contract which stores values (uint256) associated with keys (uint256):

This is almost the same as the previous contract but instead of add/sub we store the value (SSTORE) or get the value (SLOAD).
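Before trying it on a real node, the difference between the two call types can be modelled in a few lines of Python. This is a toy model, not how a node is implemented: ToyContract is a hypothetical class, its query and call methods merely mirror the names of the cethacea subcommands, and the key 1 and value 10 are arbitrary.

```python
class ToyContract:
    """Toy model of a key/value contract with persistent storage."""

    def __init__(self):
        self.storage = {}  # committed state; missing keys read as 0, like SLOAD

    def _execute(self, storage, method, *args):
        if method == "put":
            key, value = args
            storage[key] = value  # plays the role of SSTORE
            return None
        if method == "get":
            (key,) = args
            return storage.get(key, 0)  # plays the role of SLOAD
        raise ValueError("unknown method")

    def query(self, method, *args):
        """Read-only call: run against a scratch copy and discard any writes."""
        return self._execute(dict(self.storage), method, *args)

    def call(self, method, *args):
        """Transaction: run against the real storage, so writes persist."""
        return self._execute(self.storage, method, *args)


contract = ToyContract()
contract.query("put", 1, 10)       # executes, but the write is thrown away
print(contract.query("get", 1))    # 0
contract.call("put", 1, 10)        # committed
print(contract.query("get", 1))    # 10
```

A put issued as a query still runs the storing code, but its effect never leaves the scratch copy.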

And let's try to use the query first:

Please note that the last call (get) returns 0 even though we tried to set the value to 10 in the previous line. Because that was not a transaction, the persistent state was not committed to the blockchain. Let's try to change the value with a real transaction:

It's very important to know which method calls may have side effects. Therefore, this can also be added to the ABI JSON:

Here view means that the method reads the internal state (it was pure in our ADD contract, where reading the internal state was not required). For put we use nonpayable, which means that a transaction is required, but without a value.
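A tool can use this field to decide between a read-only call and a transaction. The Python sketch below illustrates the idea with a hand-written ABI for get and put; the parameter names key and value and the needs_transaction helper are assumptions, and this is not how cethacea is necessarily implemented.

```python
READ_ONLY = {"pure", "view"}  # mutability values that never change state


def needs_transaction(abi, name):
    """Return True if calling the named function requires a transaction."""
    for entry in abi:
        if entry.get("type") == "function" and entry.get("name") == name:
            return entry["stateMutability"] not in READ_ONLY
    raise KeyError(name)


uint = "uint256"
store_abi = [
    {
        "type": "function",
        "name": "get",
        "inputs": [{"name": "key", "type": uint}],
        "outputs": [{"name": "value", "type": uint}],
        "stateMutability": "view",
    },
    {
        "type": "function",
        "name": "put",
        "inputs": [{"name": "key", "type": uint}, {"name": "value", "type": uint}],
        "outputs": [],
        "stateMutability": "nonpayable",
    },
]

print(needs_transaction(store_abi, "get"))  # False
print(needs_transaction(store_abi, "put"))  # True
```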

Gas cost

When we use SSTORE, new state has to be stored on the blockchain, which means that all Ethereum nodes will save that state. That's an expensive operation, as Ethereum chain data already requires a lot of space. Therefore, it is priced to make it less tempting to use.

During the execution of the contract, each step has a specific gas cost. The sum of the costs (plus a base amount of 21000 for each transaction) is multiplied by the current gas price (the cost of one unit of gas in ETH) and paid together with the transaction (a query/read-only call is free).
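The arithmetic can be sketched in Python. The numbers are purely illustrative (22000 gas of execution at a gas price of 20 gwei), the function name is hypothetical, and the sketch deliberately ignores details such as calldata costs and refunds.

```python
BASE_GAS = 21_000     # base amount charged for every transaction
WEI_PER_GWEI = 10**9  # gas prices are usually quoted in gwei
WEI_PER_ETH = 10**18


def transaction_fee_wei(execution_gas, gas_price_gwei):
    """Fee in wei: (base gas + execution gas) multiplied by the gas price."""
    return (BASE_GAS + execution_gas) * gas_price_gwei * WEI_PER_GWEI


# Illustrative inputs only, not measurements from a real transaction.
fee = transaction_fee_wei(execution_gas=22_000, gas_price_gwei=20)
print(fee)                # 860000000000000 (wei)
print(fee / WEI_PER_ETH)  # 0.00086 (ETH)
```

Working in integer wei and converting to ETH only for display avoids rounding surprises.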

Executing an ADD operation is not a big deal, but using SSTORE is, because it asks all nodes to store some data forever. Let's try out multiple scenarios:

The first call is very expensive, as each node has to store one more piece of data, and they hate doing it.

The second transaction is cheap as it doesn't really change the data nodes already stored. Not good, but not terrible either.

The third is slightly more expensive, as nodes have to change the stored data.

The last one is the cheapest. It requires touching the persistent store, but after that it's possible to keep a smaller amount of data (as 0 is the default for all slots it's not required to be stored).

This is the reason why ERC-20 transactions can be more expensive if the target wallet has not been used yet: its balance slot is still empty, so the transfer has to store a new value.
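The four scenarios above can be told apart just by comparing the old and the new value of a storage slot. The following Python sketch only names the categories and deliberately leaves out gas numbers, since the exact prices depend on the network rules in force; classify_write and the sample values are hypothetical.

```python
def classify_write(old, new):
    """Name the kind of storage write, based on the slot's old and new value."""
    if old == new:
        return "no-op"   # value unchanged: nothing new to store
    if old == 0:
        return "set"     # empty slot becomes non-zero: new data to keep
    if new == 0:
        return "clear"   # slot returns to the default: data can be dropped
    return "update"      # non-zero value replaced by another non-zero value


# Illustrative sequence of writes to one slot: (old value, new value).
scenarios = [(0, 10), (10, 10), (10, 20), (20, 0)]
for old, new in scenarios:
    print(old, "->", new, ":", classify_write(old, new))
```

In this model the first ERC-20 transfer to a fresh wallet is a set on the recipient's balance slot, while later transfers are updates.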

Conclusion

To sum up, in the previous blog post we created a very simple smart contract, together with the deployer code which deployed our ADD contract to the blockchain. In this post we continued on the hard way and enhanced the smart contract with parameter parsing.

For production use cases, a higher-level contract language (such as Solidity) might be a better choice. But we hope that writing contracts with low-level tools helps in understanding the internals of the EVM and shows the simplicity behind all the complexity.
