Respected cryptographer Moxie Marlinspike has written up some thoughts on Ethereum and the modern ‘crypto’ ecosystem. Although I was involved in Bitcoin right at the beginning, I’ve never been involved in Ethereum or Web3, don’t currently own any cryptocurrencies and I broadly agree with a lot of what he says. Nonetheless I disagree on a few critical points. In this essay I’ll summarize parts of Moxie’s argument, lay out a couple of disagreements and then provide some suggestions for a path forward.
I’ll start with a small technical disagreement before moving on to thoughts about servers and cryptography.
At the core of Moxie’s argument is the observation that Ethereum claims to be a decentralized ecosystem but isn’t. This is broadly true. He also observes that there are many excuses for this situation in circulation, like “it’s early days”, which, as someone who used Bitcoin in 2009, I find quite wrong. Twelve years is sufficient to have solved these problems.
Unfortunately Moxie conflates Ethereum with all blockchain systems:
When people talk about blockchains, they talk about distributed trust, leaderless consensus, and all the mechanics of how that works, but often gloss over the reality that clients ultimately can’t participate in those mechanics. All the network diagrams are of servers, the trust model is between servers, everything is about servers. Blockchains are designed to be a network of peers, but not designed such that it’s really possible for your mobile device or your browser to be one of those peers.
With the shift to mobile, we now live firmly in a world of clients and servers — with the former completely unable to act as the latter — and those questions seem more important to me than ever. Meanwhile, ethereum actually refers to servers as “clients,” so there’s not even a word for an actual untrusted client/server interface that will have to exist somewhere, and no acknowledgement that if successful there will ultimately be billions (!) more clients than servers
I feel a need to respond to this because it’s only actually true for Ethereum. I should know because my first project when I joined the Bitcoin world was to work with Andreas Schildbach on a genuinely peer-to-peer mobile wallet app, which became Bitcoin Wallet for Android. It had/has a competitive UX and was able to grow a large userbase, despite being built in the most decentralized way possible.
We could do this because Satoshi thought very carefully about the untrusted client/server interface. From day one the Bitcoin protocol had a notion of a sort of lightweight client mode. Satoshi didn’t give this mode a clear name but the section of the paper that discussed it was titled “Simplified payment verification”, so I started calling apps that used it SPV wallets and the name stuck. Exactly how SPV mode works is fully explained elsewhere, but briefly, the client app bootstraps connections to the P2P network as normal but sends a special message saying “please don’t send me the contents of every block or transaction, I only want to see transactions matching a filter”. It then downloads the headers of every block from the peers, but not their contents, and does the necessary computations to select the block header chain with the highest total work. Transactions that match the filter come supplied with a Merkle branch linking them to the Merkle tree roots embedded into the headers. In this way a client can traverse a chain of blocks with fairly minimal bandwidth, storage and CPU requirements, whilst keeping the P2P network as an untrusted adversary. The filter in our implementation was a Bloom filter, so you could probabilistically hide what you were interested in (although in practice, real users cared much more about performance than this type of privacy).
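To make the two checks above concrete, here is a minimal Python sketch of what an SPV-style client has to compute: picking the header chain with the most total work, and checking a Merkle branch against a root taken from a header. It is a simplification under stated assumptions, not the code from our wallet. The function names and the `(sibling, sibling_is_right)` branch layout are invented for illustration, and the real wire format for filtered blocks encodes the branch differently.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Bitcoin-style double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_and_branch(leaves, index):
    """Build the tree over `leaves` and return (root, branch) for leaves[index].

    The branch is a list of (sibling_hash, sibling_is_right) pairs, which is
    a hypothetical layout chosen for readability.
    """
    level = list(leaves)
    branch = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd-sized levels duplicate the last hash
        sibling = index ^ 1
        branch.append((level[sibling], sibling > index))
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], branch


def verify_branch(leaf, branch, root):
    """What the light client does: hash upwards and compare with the header's root."""
    h = leaf
    for sibling, sibling_is_right in branch:
        h = dsha256(h + sibling) if sibling_is_right else dsha256(sibling + h)
    return h == root


def chain_work(targets):
    """Total expected work of a header chain, given each header's target."""
    return sum(2**256 // (target + 1) for target in targets)


def best_chain(chains):
    """Select the chain with the highest total work, not the most headers."""
    return max(chains, key=chain_work)
```

The point of `best_chain` is that length alone decides nothing: a shorter chain mined against harder targets beats a longer, easier one. The client never sees the other transactions in a block, only the handful of sibling hashes in the branch.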
Moxie observes that:
One of the surprising things to me about web3, despite being built on “crypto,” is how little cryptography seems to be involved!
I think the protocol just outlined does use cryptography in some interesting ways, or rather, ways that were interestingly new in 2011 when we built out the infrastructure.
This system was very complicated to implement but worked surprisingly well. We implemented lots of performance tricks like background wakeups to keep roughly synced, bandwidth adaptation, measuring peer latencies, syncing at night when plugged in to charge and so on. Whilst SPV wallets were never quite as fast as competitors that simply polled a centralized database, they were fast enough for many users.
That was then. This is now. Why doesn’t Ethereum have SPV clients like Bitcoin did? Well, simply put, it wasn’t designed with resource consumption in mind (nor, frankly, ordinary commerce). A Bitcoin implementation not only controls how much work it does via SPV mode but can also parallelize and shard a lot of full-mode work to get great scalability. This is possible due to the way the contents of the blocks are designed. Ethereum kept the chain-of-blocks idea from Bitcoin but radically changed what those blocks had inside them, and in the process not only lost the ability to have mobile clients but also destroyed its own ability to scale through parallelism.
Unfortunately a frequent problem in the crypto/blockchain space is what Moxie is doing here: conflating Ethereum, Bitcoin and the block chain algorithm, leading to incorrect conclusions like “blockchains don’t scale well” or “blockchains can’t have mobile clients” when the truth is closer to “Ethereum can’t do those things”. (It can of course do many other things Bitcoin couldn’t.) If the question you’re interested in is “what’s up with NFTs?” then this distinction hardly matters, because after the Bitcoin community drank the kool-aid it collapsed as a medium of exchange. Nowadays I see no more opportunities to buy and sell things with Bitcoin than I did a decade ago. The momentum and interest moved to Ethereum. But if the question you’re interested in is “how do I build decentralized, privacy preserving systems”, then the distinction does still matter.
The above argument is a bit nit-picky, so now I want to make a much bolder disagreement.
Through his work on Signal and WhatsApp, Moxie is the primary advocate for what I’d call centralized cryptography. He sums up his position well so I’ll just quote it here:
1. People don’t want to run their own servers, and never will.
2. A protocol moves much more slowly than a platform.
We should accept the premise that people will not run their own servers by designing systems that can distribute trust without having to distribute infrastructure. This means architecture that anticipates and accepts the inevitable outcome of relatively centralized client/server relationships, but uses cryptography (rather than infrastructure) to distribute trust.
After 30+ years, email is still unencrypted; meanwhile WhatsApp went from unencrypted to full e2ee in a year
I agree with points 1 and 2, but there’s a conceptual problem with the argument: cryptography cannot impose any limits on an adversary that also controls the client doing the encryption. Centralized infrastructure that uses cryptography to defeat the centralized infrastructure is a contradiction in terms that can never work.
Let’s put it less abstractly. Moxie claims that Signal and WhatsApp use end to end encryption to ensure they can’t read our messages. How do we know this claim is true? I have nothing against Moxie and have never seen any evidence he’s untrustworthy, but I also assign zero weight to this belief because WhatsApp could be silently changed tomorrow to disable that technology for one or more users, without anyone even noticing, including Moxie himself. As for Signal, it’s at least open source, but there’s no way to check that the client I’m using, or the one my friend is using, actually matches that source code. Even if there were, it would be irrelevant. Centralized infrastructure can claim to provide privacy but can never provide control: they can openly alter the deal at any time and I’d be forced to continue using it, if I couldn’t get my friends to switch to something else.
This is not a theoretical argument. Disabling E2E encryption has already happened, although hardly anyone knows about it. In 2019 WhatsApp imposed forwarding limits on messages in order to “slow down the spread of rumors, viral messages, and fake news”. This represents a total defeat of the Signal protocol’s cryptographic objectives: a basic goal of any modern cryptographic scheme is to ensure the same message encrypted twice doesn’t encrypt to the same bytes. The point of this is to stop the adversary knowing when you’re repeatedly sending the same message, and encryption modes that get this wrong (like AES/ECB) are discredited. Yet once Facebook (the adversary) became dominated by authoritarians who see unlimited communication as chaotic, they simply changed the client to include a forwarding counter outside the encrypted part of the message. There was nothing anyone could do about this. It just showed up one day, and all the fancy mathematics designed to stop this “attack” were irrelevant.
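The shape of the problem can be sketched in a few lines of Python. This is a toy, not the Signal protocol: the cipher is a SHA-256 counter-mode keystream that should never be used for real traffic, and the envelope layout, `make_envelope`, `server_allows` and the limit of 5 are all hypothetical names and values chosen for illustration. It shows that fresh nonces make two encryptions of the same message look unrelated, and that a counter placed next to the ciphertext tells the server what it wants to know anyway.

```python
import hashlib
import os

NONCE_LEN = 16


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(key: bytes, plaintext: bytes, nonce: bytes = None) -> bytes:
    """Encrypt under a fresh random nonce unless one is forced by the caller."""
    nonce = os.urandom(NONCE_LEN) if nonce is None else nonce
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:NONCE_LEN], blob[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(c ^ s for c, s in zip(body, stream))


def make_envelope(key: bytes, plaintext: bytes, forward_count: int) -> dict:
    """Client-side packaging: the counter travels outside the ciphertext."""
    return {"forward_count": forward_count, "ciphertext": encrypt(key, plaintext)}


def server_allows(envelope: dict, limit: int = 5) -> bool:
    """The server holds no key, yet can still decide whether to relay."""
    return envelope["forward_count"] < limit
```

Nothing in `server_allows` touches the key or the ciphertext. The encryption is intact in the narrow mathematical sense, but whoever ships the client decides what goes in the unencrypted part of the envelope.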
If an encryption scheme can’t stop infrastructure providers having opinions on the moral value of messages, what’s the point of it?
Thus despite my respect for what Moxie has designed and accomplished, I have problems with pushing the Signal/WhatsApp approach as something that can provide privacy, decentralized control or even as something that has any effect at all. It is at best a building block for later work that could meet these goals and it probably does act as a brake on Facebook’s worst tendencies, so it’s not nothing, but it’s also not a robust foundation. Really, a simple blog post that says “we promise we don’t log messages to disk” should carry equal weight.
These are big problems. We don’t want people to lose trust in concepts like encryption, privacy or decentralization yet:
- Ethereum is advertised as decentralized, but in practice it’s not.
- E2E messengers are being advertised as private, but in practice they’re not.
What to do?
Firstly let’s look at mobile messengers. There should be a quick incremental improvement here that would let them keep some central control whilst providing meaningful guarantees to their users: threshold signatures. Mobile operating systems check the signatures on app packages before applying updates. These are normal ECDSA or RSA signatures. It’s possible to craft such signatures not from a single private key but from a group effort by the holders of several key ‘shards’. By splitting up their signing key and distributing the shards to a variety of auditing firms with access to the source code, updates can be approved by the group. If the firms are spread around the world and their contracts are public, this can be used to translate arbitrary natural language rules into a binary signed/not signed decision legible by Android/iOS app update engines.
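As a rough illustration of the idea, the sketch below splits a textbook RSA private exponent additively between several auditors, so that a valid signature only appears when every one of them contributes and the key itself is never reassembled. It is deliberately simplified: it is n-of-n rather than a true t-of-n threshold scheme, it uses no padding, the two Mersenne primes are far too small and too well known for real use, a single dealer generates the key, and all function names are hypothetical. A production design would use a proper threshold ECDSA or RSA protocol.

```python
import hashlib
import secrets

# Toy parameters: two known Mersenne primes. Not secure, illustration only.
P = 2**127 - 1
Q = 2**89 - 1
N = P * Q
PHI = (P - 1) * (Q - 1)
E = 65537
D = pow(E, -1, PHI)  # modular inverse, needs Python 3.8+


def split_key(d: int, parties: int) -> list:
    """Additively share d so that the shares sum to d modulo PHI."""
    shares = [secrets.randbelow(PHI) for _ in range(parties - 1)]
    shares.append((d - sum(shares)) % PHI)
    return shares


def digest(package: bytes) -> int:
    """Hash the app package to an integer modulo N (no padding: textbook RSA)."""
    return int.from_bytes(hashlib.sha256(package).digest(), "big") % N


def partial_sign(share: int, package: bytes) -> int:
    """Run by one auditor, using only that auditor's shard."""
    return pow(digest(package), share, N)


def combine(partials: list) -> int:
    """Multiply the partial signatures into one ordinary RSA signature."""
    sig = 1
    for p in partials:
        sig = (sig * p) % N
    return sig


def verify(package: bytes, sig: int) -> bool:
    """What the update engine does: a plain single-key RSA verification."""
    return pow(sig, E, N) == digest(package)
```

The property that matters for the update engine is that `verify` is unchanged: it sees one public key and one signature, and has no idea a group was involved. If any auditor withholds a partial signature, the combined value fails verification.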
This isn’t a total fix, because ultimately the audit firms would need to be paid (they have to check the source matches the social contract being advertised), and thus the central authority, our adversary, will be the one picking the auditors. But it’s still a big step up and would mean that if Facebook suddenly decided merely blocking forwarding isn’t enough to fight “rumors” or “fake news”, they’d be stopped by the auditors who’d (hopefully) refuse to sign the update that takes out the encryption.
The downside to the app developer is relatively small — higher latency on pushing out updates, dollar cost — but it’d be transparent to end users and wouldn’t affect the UX which is what they prize most. They could continue to iterate on the app quickly and without needing to move an ecosystem.
Threshold signing with distributed audit would be a nice upgrade, but what about more conventional approaches? It seems like privacy is impossible without control, control is difficult without decentralization, but our attempts to build decentralized systems aren’t working. What can be done?
Let’s take a step back and re-examine some of our foundational assumptions. Moxie argues:
People don’t want to run their own servers, and never will
The first clause is certainly true. The second is a prediction about the future, which is a notoriously difficult thing to predict.
Is the reluctance to run servers fundamental or is it merely the way things are now? Why don’t non-technical users want to run servers? After all, they have done so in the past via programs like BitTorrent and Gnutella. Multiplayer games are often also servers for latency reasons, relying on a central meeting point only for matchmaking. AirDrop runs a server in order to work. Etc.
Some causes are trivially easy to identify:
- IPv4 address exhaustion / firewalls / NAT.
- Laptops and smartphones are energy/bandwidth constrained.
- The dominant server OS is some mix of Linux, AWS, Kubernetes, Apache/Nginx etc. This stack has horrific usability even for experts. Consider how awkward it is to configure working backups for a new Linux server — this is a basic task which should be easy, yet it’s not.
But these things are all attributes of our current infrastructure, not things that must universally be true. The ‘golden age’ of home servers like BitTorrent nodes, web servers running in people’s bedrooms etc was a few years after the millennium, when:
- The dominant OS was Windows and even server apps had a GUI.
- IPv4 addresses were plentiful.
- Firewalls and NATs were much less used.
- Computers were connected to mains power 24/7.
Clearly then, the factors that constrain self-run decentralized infrastructure today don’t have to be true; they are true because we don’t make them false. An entirely different world is imaginable.
It’s worth noting here that despite widespread industry groupthink of the form “everything should be a web app” (hence the gratuitously named Web3), Apple — the company most strongly associated with tech usability — is not really on board with the whole web/cloud trend. They implement everything as an app that runs locally on powerful and expensive hardware, where your data is fully under your control. Especially on macOS, you share as much or as little of it with Apple as you want. You can reject software updates, or accept them. You back up data to a “time capsule”, easily and locally. You can monitor what apps can do or send on the network. In many ways it’s a traditional mindset reflecting the values of the 90s, yet it has not held Apple back in any way in its competition with the drastically more centralized ChromeOS. I’m not saying Apple is some paragon of decentralization, as obviously they aren’t and the fact it’s worked out this way is mostly due to their history rather than any strong pro-liberty philosophical stance. But still. They’re an existence proof of what’s technically possible.
Thus I conclude it’s not actually a given that users won’t run their own infrastructure, or adopt more decentralized approaches. They don’t do that today because the software industry outside of Cupertino isn’t interested in easily letting them do that. Linux distros in particular have failed to make a highly usable system, even for their own kind of people. And regardless of how much companies like Google or Facebook talk about privacy, their culture is and always will be to immediately leap for “do it all on the server”, which of course Apple knows and fully exploits in their marketing.
I’m painting with a very broad brush here and there are exceptions — the trend towards running ML models on-device is a good and praiseworthy example of this. But I hope you’ll agree that the generalities are right.
I’ve been thinking about these problems for a long time. These days I’m focused on finding incremental, non-radical paths forward. No more peer-to-peer networks for me, at least not for a while.
Moxie suggests this:
We should try to reduce the burden of building software. At this point, software projects require an enormous amount of human effort. Even relatively simple apps require a group of people to sit in front of a computer for eight hours a day, every day, forever. This wasn’t always the case, and there was a time when 50 people working on a software project wasn’t considered a “small team.”
I completely agree. Top of my list: it should be way easier to build and distribute both desktop apps and small, one-machine servers. Far better than encryption is just not sending data to another place to begin with, and it’s possible far more frequently than we actually exploit. Apple can make ultra-competitive yet private and decentralized ‘experiences’ like Pages, Numbers, GarageBand etc because of their deep history of building electronics, client-side software and distribution/supply chains. Yet the iLife apps stay up to date because Apple built their own app store infrastructure and OS to ensure they do. Other parts of the software industry struggle with even these basics.
Step 1 for building a decentralized system: get code onto the powerful, high bandwidth devices users actually plug into AC power today, keep it up to date and for political reasons do it outside of app stores. Ethereum has totally failed at this and Moxie is right to point that out. But the reason they failed is obvious once you actually try to do it: it’s a universe of pain. Awkward, hacked together and frequently abandoned tools, poor software update systems, numerous package and installer formats, and even needing to render icons at lots of different sizes all slow you down.
I spent most of 2021 working on software that addresses all these problems. It lets you make self-updating desktop and server apps with only a small config file and a single command. It’s not quite ready for beta yet and I didn’t actually set out to write an advert for what I’m doing when I started this essay, thus currently there’s no website or mailing list for this project. You’ll just have to keep an eye on my blog. I’ll announce it here when it’s launched, so subscribe for updates if you’re interested.
Suffice it to say that the design lives the philosophy I just espoused:
- It’s a tool, not a service. You run it locally on whatever type of computer you have, and it can make fully signed and notarized downloads for Windows, Mac and Linux without needing you to own those operating systems.
- The apps stay fresh. Online updates are easy, just change the version in the config file, rebuild and re-upload the generated static files to a web server. You don’t have to sacrifice iteration speed and the update tech is the most ‘native’ on each platform (MSIX on Windows, Sparkle on macOS, package managers on Linux).
- The user can be empowered. As long as you allow it, users can review and reject upgrades, e.g. because they’re about to give a presentation and don’t want their setup changing, or they don’t like the new UI etc. Or you can ensure apps always update silently, Chrome-style: useful in enterprise deployments or where there’s a rapidly changing client/server protocol. Pick what makes most sense for your target market.
- Generates Linux server packages. Again, from any OS. They use systemd, can be sandboxed, can depend on other packages like databases, start automatically on install/boot, handle upgrades at the right times etc.
Apps distributed this way don’t have to be peer to peer apps. Simple GUI frontends to centralized services like WhatsApp Web also get easier to make. But once you’re on the desktop (or a Linux server) you have way more options and available tradeoffs on the decentralization/control/privacy/usability spectrum. The first versions of this tool won’t support threshold signed updates for example, but it’s a feature we’d really like to find time for.
I’m excited to launch this product and the company behind it, because I think that really nailing the basics is a key step towards solving some of the problems Moxie identified. Make it easy for developers to do the right thing, and more of us will do it. It’s as simple as that.