https://fiatjaf.com/d5031e5b.html
How IPFS is broken
I once fell for this talk about "content-addressing". It sounds very nice. You know a certain file exists, you know there are probably people who have it, but you don't know where it is or whether it is hosted on a domain somewhere. With content-addressing you can just say "start" and the download will start. You don't have to care.
Other magic properties that address common frustrations: webpages
don't go offline, links don't break, valuable content always finds
its way, other people will distribute your website for you, any
content can be transmitted easily to people near you without anyone
having to rely on third-party centralized servers.
But you know what? Saying a thing is good doesn't automatically make it possible or make it work. For example: saying stuff is addressed by its content doesn't change the fact that the internet is "location-addressed" - you still have to find out which peers have the data you want and connect to them.
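To make that distinction concrete, here is a rough sketch (in Go, not actual IPFS code): the "content address" is nothing more than a hash of the bytes, and somebody still has to maintain a separate index mapping that hash to the locations of peers that actually hold the bytes - the peer addresses below are made up.

    package main

    import (
        "crypto/sha256"
        "fmt"
    )

    func main() {
        content := []byte("some file I want to share")

        // the "content address" is just a hash of the bytes:
        // anyone hashing the same bytes gets the same address
        address := sha256.Sum256(content)
        fmt.Printf("content address: %x\n", address)

        // but the internet is location-addressed, so somebody still has
        // to keep an index from that hash to the peers that actually
        // hold the bytes (made-up addresses, for illustration only)
        providers := map[[32]byte][]string{
            address: {"203.0.113.7:4001", "198.51.100.2:4001"},
        }
        fmt.Println("peers claiming to have it:", providers[address])
    }
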
And what is the solution for that? A DHT!
DHT?
Turns out DHTs have a terrible incentive structure (as you would expect: no one wants to hold and serve data they don't care about to others for free), and the IPFS experience proves they don't work even in a network as small as the IPFS of today.
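To see why, consider the basic mechanics: in a Kademlia-style DHT (roughly what IPFS uses) the record saying "these peers have content H" is stored on whichever nodes happen to have IDs closest to H by XOR distance - that is, on random strangers who have no interest whatsoever in that content. A toy sketch in Go, with made-up node names:

    package main

    import (
        "crypto/sha256"
        "encoding/binary"
        "fmt"
    )

    // xorDistance is the Kademlia distance metric: a record for a key is
    // stored on the nodes whose IDs are closest to the key by XOR.
    // (Only the first 8 bytes are compared here to keep the toy simple.)
    func xorDistance(a, b [32]byte) uint64 {
        var x [8]byte
        for i := range x {
            x[i] = a[i] ^ b[i]
        }
        return binary.BigEndian.Uint64(x[:])
    }

    func main() {
        key := sha256.Sum256([]byte("someone else's cat video"))

        // made-up node IDs
        nodes := map[string][32]byte{
            "my-laptop":       sha256.Sum256([]byte("my-laptop")),
            "random-stranger": sha256.Sum256([]byte("random-stranger")),
            "some-server":     sha256.Sum256([]byte("some-server")),
        }

        // whoever happens to be closest to the key is expected to store
        // and serve the record, whether they care about the content or not
        closest, best := "", ^uint64(0)
        for name, id := range nodes {
            if d := xorDistance(key, id); d < best {
                closest, best = name, d
            }
        }
        fmt.Println("record assigned to:", closest)
    }
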
If you have run an IPFS client you'll notice how much it clogs your
computer. Or maybe you don't, if you are very rich and have a really
powerful computer, but still, it's not something suitable to be run
on the entire world, and on web pages, and servers, and mobile
devices. I imagine there may be a lot of unoptimized code and
technical debt responsible for these and other problems, but the DHT
is certainly the biggest part of it. IPFS can open up to 1000
connections by default and suck up all your bandwidth - and that's
just for exchanging keys with other DHT peers.
Even if you're in "client" mode and limit your connections you'll still get overwhelmed by connections that do stuff I don't understand. And it makes no sense to run an IPFS node as a client anyway: that defeats the entire purpose of having every person host the files they have, and of content-addressability in general; it centralizes the network and brings back the client/server dichotomy that IPFS was created to replace.
Connections?
So, DHTs are a fatal flaw for a network that plans to be big and
interplanetary. But that's not the only problem.
Finding content on IPFS is the slowest experience ever, and for some reason I don't understand downloading is even slower. Even if you are on the same LAN as another machine that has the content you need, it will still take hours to download some small file that you would transfer in seconds with scp - and that's assuming IPFS manages to find the other machine at all, otherwise your command will just sit there stuck for days.
Now, even if you ignore that IPFS objects are supposed to be content-addressable and not location-addressable, and, knowing which peer has the content you want, you go there and explicitly tell IPFS to connect to that peer directly, maybe you'll get a few seconds of (slow) download, but then IPFS will drop the connection and the download will stop. Sometimes - but not always - it helps to add the peer address to your bootstrap node list (though that isn't something you should have to do at all).
IPFS Apps?
Now consider the kind of marketing IPFS does: it tells people to
build "apps" on IPFS. It sponsors "databases" on top of IPFS. It
basically advertises itself as a place developers can just plug their apps into: all users will automatically be connected to each other, data will be saved somewhere between them all and be immediately available, and everything will work in a peer-to-peer manner.
Except it doesn't work that way at all. "libp2p", the IPFS library
for connecting people, is broken and is rewritten every 6 months, but
they keep their beautiful landing pages that say everything works
magically and you can just plug it in. I'm not saying they should
have everything perfect, but at least they should be honest about
what they truly have in place.
It's impossible to connect to other people; after years there's still no js-ipfs and go-ipfs interoperability (and yet they advertise there will be python-ipfs, haskell-ipfs, whoknowswhat-ipfs); connections get dropped; and there are many other problems.
So basically all the IPFS "apps" out there are just apps that want to connect two peers but can't do it manually, because browsers and the IPv4/NAT network don't provide easy ways to do that, and WebRTC is hard and requires servers. They have nothing to do with "content-addressing" anything; they are not trying to build "a forest of merkle trees" nor to distribute or archive content so it can be accessed by all. I don't understand why IPFS has changed its core message to this "full-stack p2p network" thing instead of the basic content-addressing idea.
IPNS?
And what about the database stuff? How can you "content-address" a database whose values are supposed to change? Their approach is to just save all values, past and present, and then use new DHT entries to communicate which is the newest value. This is the IPNS thing.
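Conceptually it's just a mutable pointer published through the same DHT: the name is derived from a public key, and the key's owner signs records saying "the current value for this name is hash X, sequence N". A rough sketch of that record shape in Go (not the real IPNS wire format):

    package main

    import (
        "crypto/ed25519"
        "crypto/sha256"
        "fmt"
    )

    // Record is a stand-in for an IPNS record: a signed claim that "the
    // latest content for this name is ValueHash", with a sequence number
    // so newer records can replace older ones.
    type Record struct {
        ValueHash [32]byte
        Sequence  uint64
        Signature []byte
    }

    func main() {
        pub, priv, _ := ed25519.GenerateKey(nil)

        // the name itself never changes: it is derived from the public key
        name := sha256.Sum256(pub)

        // publish two successive versions of some mutable content
        for seq, content := range []string{"db state v1", "db state v2"} {
            value := sha256.Sum256([]byte(content))
            msg := append(value[:], byte(seq))
            rec := Record{
                ValueHash: value,
                Sequence:  uint64(seq),
                Signature: ed25519.Sign(priv, msg),
            }
            // in IPFS this record goes into the DHT under `name`, and
            // readers have to query the DHT to discover the newest one
            fmt.Printf("name %x... -> seq %d -> value %x...\n",
                name[:4], rec.Sequence, rec.ValueHash[:4])
        }
    }
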
Apparently, just after coming up with the idea of content-addressability, the IPFS folks realized it would never be able to replace the normal internet, as no one would even know what kinds of content existed or when some content was updated. And they didn't want to coexist with the normal internet; they wanted to replace it all, because that message is bolder and gets more funding, maybe? So they invented IPNS, the name system that brings location-addressability back into the system that was supposed to be only content-addressable.
And how do they manage to do it? Again, DHTs. And does it work? Not really. It's limited, it's slow - much slower than normal content-addressed fetches - and most of the time it doesn't work at all, even after hours. But still, although developers will tell you it isn't working yet, the IPFS marketing talks about it as if it were a thing.
Archiving content?
The main use case I had for IPFS was to store content that I personally cared about and that other people might care about too: old articles from dead websites, videos, sometimes entire websites before they're taken down.
So I did that. Over many months I archived stuff on IPFS. The IPFS API and CLI don't make it easy to track where stuff is. The pin command doesn't help, as it just throws your pinned hash into a sea of hashes and subhashes and you're never able to find again what you have pinned.
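All I ever needed was something as trivial as this: a persistent index from human-readable names to the hashes I pinned, kept in a plain file I can actually look at (a sketch in Go, not a real tool; the hash below is just an example):

    package main

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    func main() {
        // a plain local index from names I choose to the hashes I pinned,
        // stored in an ordinary file that can't silently vanish
        index := map[string]string{}
        if data, err := os.ReadFile("pins.json"); err == nil {
            json.Unmarshal(data, &index)
        }

        // example entry (hypothetical hash)
        index["old-article-about-x"] = "QmExampleHashOfSomethingIActuallyCareAbout"

        out, _ := json.MarshalIndent(index, "", "  ")
        os.WriteFile("pins.json", out, 0644)
        fmt.Println("names tracked:", len(index))
    }
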
The IPFS daemon has a fake filesystem that is half-baked in functionality but lets you locally address things by name in a tree structure. It's very hard to update or add new things to, but still doable. Basically, it lets you give names to hashes. I even began to write a wrapper for it, but suddenly, after many weeks of careful content curation and distribution, all my entries in the fake filesystem were gone.
Despite not having lost any of the files, I did lose everything, as I couldn't find them in the sea of hashes on my own computer. After some digging, and help from IPFS developers, I managed to recover part of it, but it involved hacks. My things had vanished because of a bug in the fake filesystem. The bug was fixed, but soon after I hit a similar (new) bug. After that I even tried to build a service for hash archival and discovery, but as all the problems listed above began to pile up I eventually gave up. There were also problems with content canonicalization, with the code the IPFS daemon uses to serve default HTML content over HTTP, with the IPFS browser extension, and others.
Future-proof?
One of the core advertised features of IPFS was that it made content future-proof. I'm not sure they used that exact expression, but basically: you have content, you hash it, you get an address for that content that never expires, and now everybody can refer to the same thing by the same name. Actually, it's better: content is split and hashed into a merkle tree, so there's fine-grained deduplication, people can store only chunks of files, and when a file is to be downloaded lots of people can serve it at the same time, like torrents.
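The mechanics are simple enough to sketch: split the file into chunks, hash each chunk, then hash the list of chunk hashes to get the root address; peers can store and serve individual chunks, and identical chunks in different files deduplicate automatically. A toy illustration in Go (ignoring the real DAG format):

    package main

    import (
        "crypto/sha256"
        "fmt"
    )

    // merkleRoot chunks the content, hashes each chunk, and then hashes
    // the concatenation of the chunk hashes. This is a toy version of how
    // IPFS builds its DAG; the real format is more involved.
    func merkleRoot(content []byte, chunkSize int) [32]byte {
        var leaves []byte
        for start := 0; start < len(content); start += chunkSize {
            end := start + chunkSize
            if end > len(content) {
                end = len(content)
            }
            h := sha256.Sum256(content[start:end]) // each chunk is addressable on its own
            leaves = append(leaves, h[:]...)
        }
        return sha256.Sum256(leaves)
    }

    func main() {
        file := []byte("pretend this is a big file whose chunks many peers can serve")
        fmt.Printf("root address: %x\n", merkleRoot(file, 16))
    }
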
But then come the protocol upgrades. IPFS has used different hashing algorithms and different ways of formatting the hashes, and it will change the default algorithm for building the merkle trees, so the same content now has a gigantic number of possible names/addresses, which defeats the entire purpose - and yes, files hashed using different strategies aren't automagically compatible.
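To make it concrete: the exact same bytes addressed under two different hash functions (or two different chunking strategies, or two address encodings) get addresses with no visible relation to each other, so nothing on the network can tell they refer to the same content:

    package main

    import (
        "crypto/sha256"
        "crypto/sha512"
        "fmt"
    )

    func main() {
        content := []byte("the exact same bytes")

        // same content, different hash function (or chunking strategy, or
        // address encoding) -> completely unrelated addresses
        fmt.Printf("sha2-256: %x\n", sha256.Sum256(content))
        fmt.Printf("sha2-512: %x\n", sha512.Sum512(content))
    }
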
Actually, the merkle algorithm could have been changed by each person on a file-by-file basis since the beginning (you could, for example, split a book file by chapter or page instead of by chunks of bytes), although probably no one ever did that. I know it's not easy to come up with the perfect hashing strategy on the first try, but the way these matters are being approached makes me wonder whether IPFS promoters really care about future-proofing, or maybe we're just in a beta phase forever.
Ethereum?
This is also a big problem. IPFS is built by Ethereum enthusiasts. I can't read the minds of the people behind IPFS, but I would imagine they have a poor understanding of incentives, like the Ethereum people, and that they tend towards scammer-like behavior: getting a ton of funds from investors in exchange for promises they don't know they can fulfill (like Filecoin and IPFS itself) based on half-truths, changing stuff in the middle of the road because some top managers decided they wanted a change (move fast and break things), and squatting fancy names like "distributed web".
The way they market IPFS (which is not the main thing IPFS was initially designed to do) as a "peer-to-peer cloud" is very seductive to Ethereum developers, just as Ethereum itself is: a place somewhere out there that will run your code for you so you don't have to host a server or take any responsibility, and then Infura will serve the content to everybody. In the same vein, Infura is also hosting and serving IPFS content for Ethereum developers these days, for free. Ironically, just like Ethereum's hoax of peer-to-peer money, the IPFS peer-to-peer network may begin to work better for end users as things get more and more centralized.
More about IPFS problems:
* IPFS problems: Too much immutability
* IPFS problems: General confusion
* IPFS problems: Shitcoinery
* IPFS problems: Community
* IPFS problems: Pinning
* IPFS problems: Conceit
* IPFS problems: Inefficiency
* IPFS problems: Dynamic links
See also
* A crappy course on torrents, on the protocol that has done most
things right
* The Tragedy of IPFS in a series of links, an ongoing Twitter
thread.
2020-01-20
Backlinks
* IPFS-dropzone
* Why IPFS cannot work, again
* gravity
* piln
* inicio