One of the mantras of crypto is:

Don’t trust. Verify

But what does "verify" mean? When we talk about a blockchain, it means starting at the beginning of the chain (the genesis block) and pulling every transaction ever recorded in a block from the peers you are connected to. Then you execute every transaction to calculate the intermediate states (the UTXO set in Bitcoin or the account state in Ethereum), and repeat until you reach the latest block. This process is called synchronizing a fully validating node (a full node in Bitcoin).

However, you can remove old states and blocks (pruning mode) and still be a fully validating node, because you have validated the whole blockchain to reach the latest state. You haven’t trusted anyone.

Lately this process has become quite expensive for Ethereum, due to DoS attacks and the deployment of arbitrary code (smart contracts have to be executed in order to move from one state to the next). So, what hardware is needed to run a fully validating Ethereum node and perform the verification without trusting anyone?

Monitoring the synchronization process

I have chosen Parity because it is the only client able to validate the full blockchain and prune the old state on the fly (and so avoid storing more than 1TB of data). Geth is working on it, but it’s not done yet.

This is the command line I executed:

parity --no-warp --cache-size=4096 --db-compaction=ssd --scale-verifiers --num-verifiers=2

The sync process started at: Sep 19, 2018 14:13:22
The sync process ended at: Sep 23, 2018 17:38:24

It took about 4 days to validate the whole Ethereum blockchain. That’s not that much. But what was the hardware consumption?
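As a quick check of the "about 4 days" figure, the elapsed time can be computed from the two timestamps above. This is only a sketch: it assumes both timestamps are in the same time zone, which the log lines do not state.

```python
from datetime import datetime

# Start and end of the sync, copied from the timestamps above.
# Assumption: both are expressed in the same time zone.
start = datetime(2018, 9, 19, 14, 13, 22)
end = datetime(2018, 9, 23, 17, 38, 24)

elapsed = end - start
hours = elapsed.total_seconds() / 3600

print(elapsed)           # 4 days, 3:25:02
print(round(hours, 1))   # 99.4
```

So the sync took a little over 99 hours.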

I have used Collectd + Influxdb + Grafana to monitor this consumption.

Consider it random writes!

You can see the minimum, maximum and average.

We can see that Parity makes intensive use of syscalls

As you can see, there are a lot of syscalls (CPU interrupts) per second.

Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz

Despite the syscalls, the CPU seems to be OK. It may become a problem if the client shares the CPU with other workloads (in fact, I had to move away from an LPAR cloud provider due to CPU interrupt abuse).

From a total of 16GB

(The y-axis label is incorrect. It is blocks per SECOND). At the beginning, the rate is very high. If we compute the average rate excluding that initial burst:

(The y-axis label is incorrect. It is blocks per SECOND). If Ethereum miners find a block every 14 seconds on average, that means ~4.29 blocks per minute. I got an average of 5.67 blocks per second (340.2 blocks per minute), so I would expect to be able to sync a fully validating node with fewer resources, at the cost of more time.
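The arithmetic behind those two rates can be reproduced in a few lines. The variable names are mine; the 14-second block time and the 5.67 blocks per second are the figures quoted above.

```python
# Figures quoted in the text above.
BLOCK_TIME_S = 14          # average seconds between mined blocks
SYNC_RATE_PER_S = 5.67     # average blocks validated per second during sync

chain_rate_per_min = 60 / BLOCK_TIME_S      # blocks the network adds per minute
sync_rate_per_min = SYNC_RATE_PER_S * 60    # blocks the node validates per minute
speedup = sync_rate_per_min / chain_rate_per_min

print(f"{chain_rate_per_min:.2f}")  # 4.29
print(f"{sync_rate_per_min:.1f}")   # 340.2
print(f"{speedup:.1f}")             # 79.4
```

In other words, on this hardware the node validated blocks roughly 79 times faster than the network produced them, which is the headroom I expect to be able to trade for cheaper hardware.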

Conclusions

  • To get a fully validating Ethereum node without storing more than 1TB on SSD, we need a client that can prune while it is syncing the full blockchain
  • It makes intensive use of random writes to disk. It’s a lot, even for an SSD
  • It makes intensive use of syscalls, which interrupt the CPU a lot
  • An SSD able to sustain, on average, 68 MB/s of random writes and 30.9 MB/s of random reads, with more than 112GB of capacity (as of 24/09/2018)
  • 13–14GB of RAM
  • A CPU able to handle a lot of interrupts
  • These requirements are not the minimum. With the above hardware, you will sync 340.2 blocks per minute on average. You will eventually get synced at any rate above 4.29 blocks per minute, but it will take more time
  • The Ethereum community is aware of these requirements and is trying to build a more efficient client. Check out the work of Alexey Akhunov https://www.youtube.com/watch?v=kJi77aV7Fk0 and https://medium.com/@akhounov/turbo-geth-beta-constantinopole-tests-fda38cbe87a
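To make the "it will take more time" point concrete, here is a rough sketch of how long a node needs to catch up with the chain head. The function name and the backlog of 1,000,000 blocks are hypothetical, chosen only for illustration. It also assumes a constant sync rate, which is a simplification, since some blocks are much more expensive to validate than others.

```python
def catch_up_minutes(backlog_blocks, sync_rate, chain_rate=60 / 14):
    """Minutes needed to reach the chain head (hypothetical helper).

    backlog_blocks: blocks still to validate when we start counting
    sync_rate:      blocks validated per minute (assumed constant)
    chain_rate:     blocks the network adds per minute (~4.29 at 14 s/block)
    """
    if sync_rate <= chain_rate:
        # The chain grows at least as fast as we validate: we never catch up.
        raise ValueError("sync rate must exceed the chain growth rate")
    # The backlog shrinks by (sync_rate - chain_rate) blocks every minute.
    return backlog_blocks / (sync_rate - chain_rate)


# Hypothetical backlog, only to compare a fast and a slow machine.
backlog = 1_000_000
fast = catch_up_minutes(backlog, 340.2)  # the rate measured in this post
slow = catch_up_minutes(backlog, 10.0)   # a much weaker, made-up setup

print(f"fast: {fast / 60:.1f} hours")
print(f"slow: {slow / 60 / 24:.1f} days")
```

The closer the sync rate gets to ~4.29 blocks per minute, the faster the catch-up time grows, and at or below that rate the node never reaches the head.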

Future work

The main goal of this work is to find the MINIMUM hardware required to run a fully validating node. I have shared a configuration that works as a starting point. Next, I will restrict these resources to find the minimum and share the results again. (SECOND PART RELEASED: here)
