Wed 2 Jul 2008
I’ve found myself in a few discussions about just how permanent web content is, and how permanent it should be, and what folks’ expectations on that front are. It’s a big question, and I’m curious what sort of take you folks have on it.
What do you expect to stay around? What do you not count on? Why? Is how you see things right now how you’d like to see them? Where do your feelings on this come from, historically?
Etc. Go crazy.
Posted by Josh Millard
7 answers so far!
I would prefer it if when web content is deleted, it would really go away.
If I take down my personal web site, I’d rather nobody be able to find it again on archive.org, or wherever.
If I have an article online that I edit, I’d prefer it if nobody could find previous versions.
I’m the farthest thing from a privacy nut (I guess I’m kind of an exhibitionist) but to me, those sound like reasonable requests. Unfortunately, that’s not the way it works.
Now, if I post something at BigBigQuestion.com, I expect it to be there until the site’s owner removes the site, or removes the post for other reasons. I wouldn’t expect posts to stick around any longer or shorter than that.
I don’t have much in the way of expectations. It’s all (mostly) free information to me. :)
Frnkly dn’t s wht ll th fss s bt.
I’m of the mind that any content created on the site should remain there, barring legal difficulties. There are moments where an admin or moderator or whoever would no doubt like to remove something, and those are all understandable, but it’s a matter of accountability and record keeping to my mind.
What I mean is that I see the wayback machine as one of the greatest sites on the web, because it holds people accountable and keeps the records for when (for some unforeseen reason) it’s useful to go back and look at something now departed.
This doesn’t mean I think everyone should have to keep all web content they’ve generated up for public display, though. The OED has been edited, and you can’t blame them for removing out of date entries and such. On the other hand, it’s enormously valuable (even if only to a minuscule number of people) to be able to find an old edition for comparison and contrast. If they were able to have all old editions destroyed, it would to my mind be a terrible loss, to say nothing of the possibility for covering up wrongdoing. At least, as much wrongdoing as a dictionary is capable of.
Ultimately, I guess what I’m saying is that I think archiving is invaluable, but that it’s our job to enforce it, either through activism or through straight up backups like the wayback machine.
Ideally, I think everything should be kept around, but that it would be ok to go back later and strike through stuff with an explanation such as “I regret telling everyone that I’m afraid of water buffaloes as this has caused me an untold amount of pain both personally and professionally.”
I agree with shmegegge, but:
The OED has been edited, and you can’t blame them for removing out of date entries and such.
The OED is exemplary in its actions; every revised entry has a button you can click to see the old, unrevised entry. I love the OED.
I act as if content will disappear by tomorrow, yet survive perpetually. A Schrödinger’s Internet, determined not by observation, but by desire of the observer/observed. Also, it is typically attached to Murphy’s Law. If it is desirable, interesting, or useful, it will probably disappear tomorrow. If it is a drunken party photo someone posts on Facebook, it will be the #1 Google hit for someone’s name until they win a Nobel prize or a porn actor uses that name as an alias.
While things of note in reality are proper to archive, the internet has often been treated as a parallel world, where actions and words are kept solely in that world and never mentioned outside it. Sometimes, this is a wonderful release that is best lost to time in a matter of days.
Every time I look at the logic of what is and is not archived, it starts to make sense until the internet decides it will have none of that nonsense and promptly defies rationality.
It’s quite important for academic research on the web itself that things stay around, and have at the very least a version history. Currently this is done by downloading and archiving. Though there are efforts to make a web service out of this (WebCite, Webcitation, etc.), they haven’t worked out some fundamental issues, such as how to maintain an archive that’s distributed so there’s no central point of failure (putting everything under one URL doesn’t seem like a great idea, but then how do you link to archive pages?) and who should be in charge of maintaining the archive.