Casa Blog - Bitcoin Security Made Easy

In order for a Bitcoin wallet to perform basic functions such as receiving and sending transactions, the wallet needs to communicate with the Bitcoin network. This communication requires connecting to one or more nodes on the network. There are a variety of methods to choose from such as Simplified Payment Verification (SPV), Bitcore, Electrum, and so on. I discussed the options in detail in this article:

Securing Your Financial Sovereignty

However, since that article was written, SPV has been deprecated for several reasons, including its poor privacy properties and DoS vulnerabilities. As such, the next most mature protocol still standing is Electrum. Casa engineers recently rewrote our back-end infrastructure to use Electrum. To architect the best possible solution, we needed to investigate how the different server implementations compare in terms of performance. The following report walks you through the entire process we undertook.

The contenders

ElectrumX - https://github.com/spesmilo/electrumx

The second iteration of Electrum server implementations, this project was started in late 2016. Until 2019 it was the only option available for people to run.

Electrs - https://github.com/romanz/electrs

Given the resource requirements of ElectrumX, this project was started specifically to be an efficient re-implementation of Electrum Server in Rust. It has become quite popular to run Electrs on Raspberry Pi-based nodes.

Electrs generally acts as a proxy and forwards many queries to Bitcoin Core to actually serve data.

Esplora Electrs - https://github.com/Blockstream/electrs

Blockstream was looking to run an Electrum server to power their public Esplora instance at blockstream.info, but Electrs was not indexing data well enough to serve enterprise-level queries. Blockstream's fork builds extended indexes and database storage for improved performance under high load.

With these new indexes, Bitcoin Core is no longer queried to serve user requests and is only polled periodically for new blocks and for syncing the mempool.

Not Applicable

Electrum-Server - https://github.com/spesmilo/electrum-server

The original Electrum server, it was the only implementation from 2012 to late 2016. The devs stopped maintaining it in 2017 and handed the reins over to ElectrumX.

Electrum Personal Server - https://github.com/chris-belcher/electrum-personal-server

This is basically a stripped-down Electrum server that is only designed to be used by a single user; it only accepts one connected client at a time. We don't find it worth performance testing since it's not designed to be highly performant.

Bitcoin Wallet Tracker - https://github.com/shesek/bwt#electrum-plugin

This is basically like Electrum Personal Server, but based upon Electrs and designed as an easily installed plugin for the Electrum Client. We don't find it worth performance testing since it's not designed to be highly performant.


Testing pains

Unfortunately, we ran into several issues with Electrs while testing it in our development environment. We hit one issue where Electrs wouldn't shut down as expected:

SIGINT doesn’t stop gracefully · Issue #219 · romanz/electrs
electrs got stuck with the following output after I tried to stop it with sigint. I had to go and kill the process manually. I believe I've seen this before, although it doesn't happen ever...

Memory leaks were causing our server to become unresponsive:

Clean up RPC handles when connections are closed by champo · Pull Request #195 · romanz/electrs
Fixes #188. We have a setup with 2 electrs running behind a TCP load balancer. The LB do a regular health check to electrs RPC port by seeing if they can open a socket, as soon as it's establish...

After those issues were patched, we then found ourselves having to work our way through electrum client challenges as we tuned our performance testing setup.

We initially wrote our test script to use the node-electrum-client npm package; however, it appears to be abandoned. The repository has not been updated in over 2 years, and a pull request from that time carries an unaddressed comment indicating that development moved elsewhere.

The pull request in question was attempting to add "persistence" for connecting to Electrum servers using a ping strategy, which could potentially fix issues we had with sockets hanging. All of the electrum client forks start by merging that PR or equivalent code. We inspected all 47 forks of the "node electrum client" repository and found 4 that appear well maintained:

electrum-client-js - https://github.com/keep-network/electrum-client-js

  • pros: subscriptions, websockets, persistence, many examples, clean commit history, some tests & linting
  • cons: no batching, messy refactor, persistence support may be unpolished

electrumjs - https://github.com/randomnerd/electrumjs

  • pros: subscriptions, well maintained & documented, some tests & linting
  • cons: no batching, messy commit history & refactor

rn-electrum-client - https://github.com/photon-sdk/rn-electrum-client

  • pros: persistence, batch requests, recently maintained
  • cons: extra ReactNative support we didn't need, no tests

ReactNativeElectrumClient - https://github.com/on-ramp/ReactNativeElectrumClient

  • pros: subscriptions
  • cons: extra ReactNative support we didn't need, no tests

For simplicity, we ended up forking the old unmaintained client and added:

  • A timeout to all calls to protect our client from hanging
  • Socket pooling to reduce the overhead of opening and closing sockets
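The two additions above can be sketched roughly as follows. This is a minimal Python illustration, not the code of our Node.js fork: the PooledElectrumClient class, its parameters, and its defaults are hypothetical, and it assumes the server speaks newline-delimited JSON-RPC over plain TCP.

```python
import asyncio
import json


class PooledElectrumClient:
    """Hypothetical client: reuses open sockets and bounds every call with a timeout.

    Create instances inside a running event loop (inside a coroutine).
    """

    def __init__(self, host, port, pool_size=4, timeout=5.0):
        self.host, self.port = host, port
        self.pool_size = pool_size
        self.timeout = timeout
        self._idle = asyncio.Queue()  # sockets that are open but not in use
        self._opened = 0              # sockets currently open (idle or busy)
        self._next_id = 0

    async def _acquire(self):
        # Prefer an idle socket; open a new one only while under the pool limit.
        if self._idle.empty() and self._opened < self.pool_size:
            self._opened += 1
            try:
                return await asyncio.open_connection(self.host, self.port)
            except BaseException:
                self._opened -= 1
                raise
        return await self._idle.get()

    async def call(self, method, *params):
        # The timeout covers connecting, sending, and waiting for the reply.
        return await asyncio.wait_for(self._call(method, params), self.timeout)

    async def _call(self, method, params):
        reader, writer = await self._acquire()
        self._next_id += 1
        request = {"jsonrpc": "2.0", "id": self._next_id,
                   "method": method, "params": list(params)}
        try:
            writer.write((json.dumps(request) + "\n").encode())
            await writer.drain()
            reply = json.loads(await reader.readline())
        except BaseException:
            # A timed-out or broken socket could later deliver a stale reply,
            # so close it instead of returning it to the pool.
            writer.close()
            self._opened -= 1
            raise
        self._idle.put_nowait((reader, writer))  # hand the socket back for reuse
        if reply.get("error"):
            raise RuntimeError(reply["error"])
        return reply.get("result")
```

A timed-out call surfaces as asyncio.TimeoutError, so the test script can record the failure and move on instead of hanging. The sketch is only meant for the sequential request pattern used in our tests; a production pool would also need to wake waiting callers when a socket is dropped.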

Test environment

To ensure consistency of results, we ran all of our tests on the exact same AWS instance with the following properties.

AWS EC2 instance type: c5.xlarge (4 vCPU, 8GB RAM)
AWS EBS volumes:

  • 8 GB gp2 (100 IOPS) / (Operating System)
  • 600 GB gp2 (1800 IOPS) /mnt/bitcoin (used for Bitcoin Core data)
  • 1000 GB gp2 (3000 IOPS) /mnt/data (used for Electrum server data)
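The IOPS figures listed for each volume follow from how gp2 volumes are provisioned: baseline performance is 3 IOPS per GiB, with a floor of 100 and a cap of 16,000. A small sketch of that arithmetic (the function name is ours, chosen for illustration):

```python
def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume: 3 per GiB, at least 100, at most 16,000."""
    return min(max(100, 3 * size_gib), 16_000)


# The three volumes used in the test environment.
for mount, size_gib in [("/", 8), ("/mnt/bitcoin", 600), ("/mnt/data", 1000)]:
    print(f"{mount}: {gp2_baseline_iops(size_gib)} IOPS")
# /: 100 IOPS
# /mnt/bitcoin: 1800 IOPS
# /mnt/data: 3000 IOPS
```

Because the I/O bound implementations were constrained by read speed on the data volume, a larger or faster volume would likely shift their absolute numbers.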

The Electrum server and Bitcoin Core were fully synced at the start of each test. During each set of tests, the only processes running in addition to normal OS processes were:

  • bitcoind
  • Electrum server daemon (the implementation under test)
  • nodejs test script

It's worth noting that the entire test environment was running on a single machine, so no data was sent over a network when making the queries we were timing. In real-world applications your perceived performance will be lower due to network latency.

Test methodology

We're interested in understanding the performance characteristics of each implementation with regard to common queries that would be made for a wallet. Thus, for a wide variety of bitcoin addresses (script hashes) with different characteristics, we chose to benchmark the following queries:

  • Get address balance
  • Get entire list of transactions
  • Get entire list of UTXOs
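In Electrum protocol terms, these three queries correspond to the blockchain.scripthash.get_balance, blockchain.scripthash.get_history, and blockchain.scripthash.listunspent methods, each keyed by a script hash: the SHA-256 of the output script, hex-encoded in reversed byte order. The sketch below shows how such requests could be built; the helper names and the example script are illustrative and are not taken from our test script.

```python
import hashlib
import json

# The three benchmarked queries and their Electrum protocol method names.
BENCHMARK_METHODS = {
    "balance": "blockchain.scripthash.get_balance",
    "history": "blockchain.scripthash.get_history",
    "utxos": "blockchain.scripthash.listunspent",
}


def electrum_scripthash(script_pubkey_hex: str) -> str:
    """SHA-256 of the output script, returned as hex in reversed byte order."""
    digest = hashlib.sha256(bytes.fromhex(script_pubkey_hex)).digest()
    return digest[::-1].hex()


def build_request(request_id: int, query: str, scripthash: str) -> bytes:
    """Serialize one newline-terminated JSON-RPC request for the given query."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": BENCHMARK_METHODS[query],
        "params": [scripthash],
    }
    return (json.dumps(payload) + "\n").encode()


# Example P2PKH output script (25 bytes), used purely for illustration.
example_script = "76a91462e907b15cbf27d5425399ebf6f0fb50ebb88f1888ac"
scripthash = electrum_scripthash(example_script)
for request_id, query in enumerate(BENCHMARK_METHODS, start=1):
    print(build_request(request_id, query, scripthash).decode(), end="")
```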

We noticed that for each implementation the first test run would be substantially slower than subsequent test runs, likely due to caching at various layers of the application stack. Thus, in order to improve the precision of our results we decided to throw out the data from run 1, then we ran 9 more passes and took the median timing result for each address.
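The warm-up and median procedure can be sketched as below. This is a simplified stand-in for our Node.js script: the benchmark function and its query callback are hypothetical names, and the callback is assumed to issue one request and block until the reply arrives.

```python
import statistics
import time


def benchmark(query, addresses, measured_runs=9):
    """Return {address: median seconds} after discarding one warm-up pass."""
    timings = {address: [] for address in addresses}
    for run in range(measured_runs + 1):
        for address in addresses:
            start = time.perf_counter()
            query(address)
            elapsed = time.perf_counter() - start
            if run > 0:  # run 0 only warms the caches, so its timing is dropped
                timings[address].append(elapsed)
    return {address: statistics.median(samples)
            for address, samples in timings.items()}
```

Taking the median rather than the mean keeps a single slow outlier, such as a pause caused by background disk activity, from skewing the result for an address.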

Each test run was performed sequentially; we did not test how the servers handle concurrent request loads. We expect that the results of the single-threaded tests make it quite clear which implementations would be best suited for multithreaded applications.

Test Input Data

All of the bitcoin addresses that received funds between block heights 599900 and 600100 were dumped to a list. They were then filtered to obtain the set of addresses that had their balances affected by between 10 and 999 transactions - initial experiments showed this range to be most interesting. This provided us with a list of 103,000 addresses that was used as the main data set for our tests.

When the first test run against electrs was halfway done, we realized that if we tried to finish 10 runs of the 103,000 address test set against electrs, it would take weeks to complete. As such, we filtered down the addresses again to a reduced set of 57,000 that had their balances affected by between 10 and 100 transactions and performed our multiple passes of tests against that address set. Thus on the resulting charts you'll see that there is far less data for Electrs, but we believe it is sufficient to draw conclusions about overall performance.
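Both address sets come from the same kind of filter, applied with different bounds. A minimal sketch, assuming a hypothetical mapping from each address to the number of transactions affecting its balance (the sample entries are made up):

```python
def select_addresses(tx_counts, min_txs, max_txs):
    """Keep addresses whose transaction count lies within [min_txs, max_txs]."""
    return [address for address, count in tx_counts.items()
            if min_txs <= count <= max_txs]


# Made-up sample data: address label -> number of transactions affecting it.
tx_counts = {"addr_a": 3, "addr_b": 10, "addr_c": 100, "addr_d": 640, "addr_e": 5000}

main_set = select_addresses(tx_counts, 10, 999)     # full test set
reduced_set = select_addresses(tx_counts, 10, 100)  # reduced set used for Electrs
print(main_set)     # ['addr_b', 'addr_c', 'addr_d']
print(reduced_set)  # ['addr_b', 'addr_c']
```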

Implementation notes

ElectrumX

  • Data directory size: 50 GB
  • Performance was I/O bound, constrained by data volume read speed
  • Initial index build took multiple days
  • Requires txindex=1 enabled in Bitcoin Core

Config adjustments:

  • COST_SOFT_LIMIT = 0
  • COST_HARD_LIMIT = 0
  • CACHE_MB = 2048

Electrs (romanz)

  • Data directory size: 60 GB
  • Performance was not I/O bound (reached 30% utilization), but adjusting threads and caching met diminishing returns
  • Initial index build only took a few hours
  • Doesn't require txindex enabled in Bitcoin Core

Config adjustments:

  • bulk_index_threads = 12
  • tx_cache_size_mb = 4096
  • blocktxids_cache_size_mb = 4096
  • txid_limit = 1000

Esplora Electrs (Blockstream)

  • Data directory size: 591 GB
  • I/O bound, constrained by data volume read speed
  • Initial index build took multiple days
  • Doesn't require txindex enabled in Bitcoin Core

Config adjustments:

  • --electrum-txs-limit 1000

We also noted that Esplora Electrs sometimes took nearly 30 minutes to start up, and logging only seems to work at error or debug levels.


Performance test results

Time to return address balance versus total number of transactions affecting the address balance

Our first set of results is pretty straightforward: how long does it take to get the balance of a given bitcoin address? It becomes immediately obvious that Electrs is orders of magnitude slower than Esplora and ElectrumX, which are barely legible on this chart. We can also see that Electrs is scaling within a fairly well-defined best and worst case range. Even though we only tested for addresses that had up to 100 transactions worth of history affecting them, it's quite clear what performance to expect from addresses with 500 or 1000 transactions.

Time to return address balance versus total number of transactions affecting the address balance (Esplora and ElectrumX only)

Let's dig deeper into comparing Esplora with ElectrumX. By throwing out the Electrs results we can zoom in on this much closer race. We can also see that ElectrumX is scaling linearly within a very predictable best and worst case range. But what's going on with Esplora?

The answer lies in Esplora's schema.rs file, which defines a MIN_HISTORY_ITEMS_TO_CACHE threshold.

Basically, Esplora will not bother to cache data unless an address has a certain number of transactions affecting its balance. For addresses with low activity, it recomputes the result on the fly for every request. Thus we end up with an interesting trade-off in performance characteristics! ElectrumX performs better for addresses with fewer than 100 transactions, while Esplora performs better for addresses with more than 100 transactions. And while ElectrumX's worst-case performance appears to degrade linearly, Esplora's cache appears to give it roughly constant-time lookups.

These are certainly considerations worth taking into account based upon the common characteristics of addresses in your system. It's also worth noting that you could always change the MIN_HISTORY_ITEMS_TO_CACHE setting for your Esplora instance.
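To illustrate the trade-off, here is a minimal sketch of threshold-based caching. It is not Esplora's actual code: the BalanceIndex class and its data layout are hypothetical, the threshold of 100 is assumed from the crossover point in our charts, and cache invalidation on new transactions is ignored.

```python
MIN_HISTORY_ITEMS_TO_CACHE = 100  # assumed value, taken from the crossover in the charts


class BalanceIndex:
    """Hypothetical index: caches balances only for addresses with long histories."""

    def __init__(self, history, threshold=MIN_HISTORY_ITEMS_TO_CACHE):
        self.history = history  # {scripthash: [signed satoshi deltas]}
        self.threshold = threshold
        self.cache = {}

    def balance(self, scripthash):
        if scripthash in self.cache:
            return self.cache[scripthash]  # constant-time path for busy addresses
        deltas = self.history.get(scripthash, [])
        total = sum(deltas)  # linear scan over the address history
        if len(deltas) > self.threshold:
            self.cache[scripthash] = total  # only long histories are worth caching
        return total


index = BalanceIndex({"quiet": [5_000, -2_000], "busy": [1_000] * 250})
print(index.balance("quiet"), "quiet" in index.cache)  # 3000 False
print(index.balance("busy"), "busy" in index.cache)    # 250000 True
```

Lowering the threshold trades extra cache storage for faster lookups on low-activity addresses.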

Analysis of the performance of balance lookups against total transaction history showed explainable trend characteristics. But what about comparing balance lookups against total current UTXOs?

If you're familiar with how bitcoin transaction data is structured then you're aware that there is no "address balance" data - rather, an address balance is calculated by summing up the values of all the UTXOs that are encumbered by the locking script that is represented by that address. In terms of query performance we should expect to see a well-defined relationship between current number of UTXOs and time to calculate their balance, while the total number of transactions affecting an address should be more weakly correlated to balance query performance.
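As a concrete illustration, a balance can be derived from a UTXO list like this. The entries mimic the shape of a blockchain.scripthash.listunspent response (tx_hash, tx_pos, height, value); the hashes and values are made up.

```python
def balance_from_utxos(utxos):
    """Sum the satoshi values of all unspent outputs locked to one script."""
    return sum(utxo["value"] for utxo in utxos)


# Made-up UTXOs in the shape returned by blockchain.scripthash.listunspent.
utxos = [
    {"tx_hash": "aa" * 32, "tx_pos": 0, "height": 599_950, "value": 50_000},
    {"tx_hash": "bb" * 32, "tx_pos": 1, "height": 600_010, "value": 120_000},
    {"tx_hash": "cc" * 32, "tx_pos": 0, "height": 600_075, "value": 7_500},
]
print(balance_from_utxos(utxos))  # 177500
```

The work is proportional to the number of UTXOs, not to the number of historical transactions, which is why we expect query time to track the UTXO count more closely.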

Time to return address balance versus current number of UTXOs affecting the address balance

That being said, the Electrs results don't match our expectations. While we can see a clear lower bound on the best-case query performance, the results are generally all over the place. I presume that this is because Electrs does not index UTXOs by address, so it is forced to query bitcoind for an address's entire transaction history and then find the UTXOs. Once again, Electrs is so much slower than the other implementations that we need to remove this data to focus on the real competition.

Time to return address balance versus current number of UTXOs affecting the address balance (Esplora and ElectrumX only)

We can once again see the effects of MIN_HISTORY_ITEMS_TO_CACHE on Esplora. ElectrumX again vastly outperforms Esplora for addresses with fewer than 100 UTXOs, while the opposite is true for addresses with more than 100 UTXOs. Esplora scales amazingly well for addresses with high UTXO counts, but ElectrumX isn't bad either.

Time to return transaction history of address versus total number of transactions affecting address

Now we move on to simpler queries - how long does it take us to retrieve the entire transaction history for a given address? Naturally this will result in more data being returned so we expect to see a linear relationship between the amount of data and amount of time to return it.

The most interesting point to note here is that ElectrumX appears to scale better than Esplora when it comes to returning transaction history. ElectrumX never takes more than 20 milliseconds to return the transaction history, while Esplora can take several hundred milliseconds. Perhaps Esplora is not using the same constant-time caching scheme for transaction history that it's using for UTXOs and balances?

Time to return current UTXO list for address versus current number of UTXOs affecting address

Similarly, how long does it take us to retrieve the entire set of UTXOs for a given address? Electrs is yet again the clear loser, though it's more performant than it was for balance lookups. Once again, we need to zoom in on the more interesting results between Esplora and ElectrumX.

Time to return current UTXO list for address versus current number of UTXOs affecting address (Esplora and ElectrumX only)

Here we can see Esplora really shine. Constant-time scaling of UTXO lookups makes it blazing fast, in some cases returning the entire result set in under a millisecond! ElectrumX returns UTXO lists at a rate that scales linearly, but its performance is no joke - 20 milliseconds for 900 UTXOs is impressive.

Conclusions

Electrs is a great low-resource, easy-to-install option for personal use - especially on resource-constrained hardware. But it's certainly not an option for operating within enterprise infrastructure or as a publicly listed Electrum server.

In general, it appears that ElectrumX provides the best "bang for your bitcoin" with regard to performance vs. resource requirements. But if you are willing to devote more than 10 times as much disk space (591 GB versus 50 GB in our tests) in order to achieve maximum performance, Esplora is the way to go.