Preserving the web is an unsolved problem
No website is guaranteed to exist forever, not even for a timespan that significantly outlives its original author. Keeping one up and running is very cheap, but not entirely free. Even loss-leader offerings like GitHub Pages or the AWS free tier are paid for by somebody, and that somebody might stop doing so for many reasons.
The web is a young medium, yet in its three decades an incredible amount of content has been created and discussed on it, some of which has had a deep and lasting impact. There is nonetheless a high risk that much of what happened on the open web is vanishing at an alarming rate, without a trace, from our collective intellectual and cultural history.
Take, for example, Ward Cunningham, the inventor of the wiki (both the concept and its first implementation). The WikiWikiWeb/C2 Wiki was frozen in 2015, and its content has been in read-only mode ever since. Since then Ward has been remodelling its software basis, but in an issue on GitHub he wrote:
Perhaps I should explain why wiki worked.
I wrote a program in a weekend and then spent two hours a day for the next five years curating the content it held. For another five years a collection of people did the same work with love for what was there. But that was the end. A third cohort of curators did not appear. Content suffered. I had visualizations. I could see it decay. That is what I mean when I say that the those of good will have passed. A security engineer has compared the open internet to the summer of love. It was neat while it was happening but it is over.
I find this stance disheartening, albeit understandable. And the C2 wiki is at least still readable. But for how long will it remain so?
There are many, too many examples of content becoming inaccessible or even being destroyed on a whim. Everything that is hosted by commercial entities for "free" bears the highest risk: think, for example, of Yahoo killing GeoCities. This kind of free (as in beer) is penny-wise and pound-foolish. Single copies are single points of failure, and hoping somebody will keep the lights on is not a viable backup strategy.
Speaking of backups: one of the very few organizations that tackle the problem of preserving the web is the Internet Archive. Sadly, it recently lost a lawsuit against big publishers, which has the potential to completely destroy it financially. And it appears that there is no meta-backup strategy in place.
What can an individual do? Those who create websites might want to consider following Jeff Huang's approach and design their websites to last. Another good idea would probably be to explicitly use a license for free cultural works, so that content can be mirrored without legal headaches.
As much as I'd like to end on a high note, it is obvious that such issues cannot be tackled by an individual or by small idealistic non-profit groups, and the economic incentives are not in favor of an open web.