
title: Web Decay Graph
url: https://www.tbray.org/ongoing/When/201x/2015/05/25/URI-decay
hash_url: b8dd4f6d72

I’ve been writing this blog since 2003 and in that time have laid down, along with way over a million words, 12,373 hyperlinks. I’ve noticed that when something leads me back to an old piece, the links are broken disappointingly often. So I made a little graph of their decay over the last 144 months.

URI decay at ongoing by Tim Bray

The “% Decay” value for each value of “Months Ago” is the percentage of links made in that month that have decayed. For example, just over 5% of the links I made in the month 60 months before May 2015, i.e. May 2010, have decayed.
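The per-month figure described above can be sketched in a few lines of Ruby. This is only an illustration of the arithmetic, not the script behind the graph: the `Link` struct, the `months_ago` and `decay_by_month` helpers, and the sample records are all invented for the example.

```ruby
require 'date'

# Hypothetical record: the date a link was published and whether it still resolves.
Link = Struct.new(:posted_on, :alive)

# Whole calendar months between the month a link was posted and the reference month.
def months_ago(posted_on, reference)
  (reference.year - posted_on.year) * 12 + (reference.month - posted_on.month)
end

# Returns { months_ago => percent_decayed } for every month that contains links.
def decay_by_month(links, reference)
  buckets = links.group_by { |link| months_ago(link.posted_on, reference) }
  buckets.transform_values do |bucket|
    dead = bucket.count { |link| !link.alive }
    (100.0 * dead / bucket.size).round(1)
  end
end

# Made-up sample data: four links from May 2010 (one dead), one from April 2015.
reference = Date.new(2015, 5, 1)
links = [
  Link.new(Date.new(2010, 5, 3), true),
  Link.new(Date.new(2010, 5, 17), false),
  Link.new(Date.new(2010, 5, 29), true),
  Link.new(Date.new(2010, 5, 30), true),
  Link.new(Date.new(2015, 4, 2), true)
]
decay = decay_by_month(links, reference)
```

Each bucket is keyed by how many months before the reference month its links were posted, so the May 2010 links land in bucket 60 when the reference month is May 2015.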

Longer title · “A broad-brush approximation of URI decay focused on links selected for blogging by a Web geek with a camera, computed using a Ruby script cooked up in 45 minutes.” Mind you, the script took the best part of 24 hours to run, because I was too lazy to make it run a hundred or so threads in parallel.
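For a sense of what the parallel version might have looked like, here is a rough sketch of a thread-pool link checker using only the Ruby standard library. It is not the original script. The `alive?` and `check_all` names are hypothetical, and the liveness test (a HEAD request returning 2xx or 3xx within a timeout) is an assumption, since the post does not say how decay was detected.

```ruby
require 'net/http'
require 'uri'

# Assumption: a link is "alive" if a HEAD request returns 2xx or 3xx within
# the timeout. Any error (bad URL, DNS failure, timeout) counts as decayed.
def alive?(url, timeout: 10)
  uri = URI.parse(url)
  return false unless uri.is_a?(URI::HTTP) && uri.host

  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https',
                  open_timeout: timeout, read_timeout: timeout) do |http|
    response = http.head(uri.request_uri)
    response.is_a?(Net::HTTPSuccess) || response.is_a?(Net::HTTPRedirection)
  end
rescue StandardError
  false
end

# Checks every URL with a fixed pool of worker threads pulling from a shared queue.
# `checker` is injectable so the pool can be exercised without network access.
def check_all(urls, workers: 100, checker: method(:alive?))
  queue = Queue.new
  urls.each { |url| queue << url }
  queue.close # pop returns nil once the closed queue is drained

  results = {}
  lock = Mutex.new
  threads = Array.new([workers, 1].max) do
    Thread.new do
      while (url = queue.pop)
        ok = checker.call(url)
        lock.synchronize { results[url] = ok }
      end
    end
  end
  threads.each(&:join)
  results
end

# Usage with a stub checker, so nothing here touches the network.
stub = ->(url) { !url.include?('dead') }
sample = check_all(%w[http://a.example/ http://dead.example/], workers: 4, checker: stub)
```

Because the work is almost entirely waiting on remote servers, plain Ruby threads are enough to overlap the requests. The default of 100 workers simply mirrors the "hundred or so threads" mentioned above.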

I suppose I could regress the hell out of the data and get a prettier line, but the story these numbers are telling is clear enough.

Another way to get a smoother curve would be for someone at Google to throw a Map/Reduce at a historical dataset with hundreds of billions of links.

This is a very sad graph · But to be honest I was expecting worse. I wonder if, a hundred years after I’m dead, the only ones that remain alive will begin with “en.wikipedia.org”?