I’m not convinced it’s dangerous to explore whether there are benefits to ephemerality.
I’m also not sure your Rembrandt example shows what you suggest it does. The average Atari 2600 programmer would be more equivalent to the hundreds of now-unknown artists in Rembrandt’s time. The John Carmacks of today will be remembered in detail with or without blanket archive efforts.
Maybe, just maybe, Rembrandt’s status in our minds is a result of generations of people each seeing the individual value in his work. That is, each generation does indeed get to decide what future generations remember. Or at least that used to be true until the digital age.
Maybe the change is an improvement. But maybe not.
And libraries are the epitome of what you’re fighting against. They are by definition works chosen by humans based on judgment calls of their perceived value.
Let’s at least acknowledge that blanket archive efforts are a fundamental change in themselves and a departure from the human status quo for thousands of years. Then let’s debate whether the change is an unabated good.
While I don't endorse your parent's over-the-top rhetoric, and I do agree that there is value in ephemerality and that it's worth noting that libraries are more carefully curated than a dump-and-archive, I think it's also worth noting that these are generally public pages.
All the stuff Tumblr users intentionally wrote and published publicly, but none of their IP address logs and other incidentally collected information, is exactly what ought to be archived and preserved, in my opinion. This is in strong contrast to incidentally collected data including clear PII like IP addresses that many companies today are hoarding forever, when they ought to be ephemeral.
Tumblr blogs often include people's names, faces, and/or details about their personal lives. That's very much personally identifiable information! And while they did post it publicly, they likely didn't do so with the intention of it being saved forever in a publicly available, easily searchable archive. This applies especially to porn blogs where people post their own original content.
There's certainly value in archiving social media but I think it has to be balanced against the harms, instead of defending the practice with literal religious fervor and dismissing all criticism out of hand.
Was there something in particular I said that you felt was defending it with "literal religious fervor and dismissing all criticism out of hand", or were you referring to my grandparent? I don't think I dismissed anything out of hand, I specifically acknowledged both the value of ephemerality and the point that traditional libraries are curated.
I agree that there is a danger that people may not realize how public and permanent the things they published to Tumblr were, or how dangerous it can be to do so (and I downvoted a sibling comment dismissing this danger). However, I think you and I have different threat models.
In my mind, archiving PII that is intentionally published is not particularly harmful because most lay people do, in fact, understand that their avatar, username, and, by default, posts are public on Tumblr. They have had the opportunity to remove that information this whole time, and they still do: Archive.org removes material if you ask them.
By contrast, lay people have no mental model for what kind of information is incidentally collected nor how dangerous or benign it is. Certainly, lay people also can and do misjudge how public and how dangerous the things they intentionally publish are, but the gap is far, far smaller than for incidental information. "Would you tell a stranger this" or "would you write this on a bathroom wall" are decent heuristics: the only difference in danger between text written on a bathroom wall and text written on Tumblr is the potentially wider reach, including the possibility of going viral, on Tumblr. (Photos, of course, can also subtly compromise privacy in ways surprising to a lay person, but the gap is still much smaller than for incidental information.)
In my threat model, that gap in understanding is much, much more dangerous than the intrinsic danger of PII. That's why I think that, as long as Archive.org has a usable removal process, pretty much all the danger is in surveillance capitalism's collection of incidental information, not Archive.org's permanent record of intentionally publicized information.
The reason we fight against censorship (which is what this debate comes down to) with literal religious fervor is because that's how the other side fights for it.
Don't want it archived forever? Don't put it on the Internet. Seems simple enough.
If Archive.org had your attitude, I would actively oppose it. Removing private, personal info is not censorship. And nothing about "just don't put it on the Internet" is simple. What if someone hacked your devices and then put it on the Internet for lolz? What if you shared it in confidence with someone you trusted, who is intentionally putting it on the Internet to hurt you? What if you accidentally pasted the wrong thing or uploaded the wrong file? What if you were a child and didn't understand the dangers?
There obviously should be ways to ameliorate your mistake, which is why it is absolutely critical that Archive.org has a removal process.
Many people writing personal diaries/letters probably didn't do so with the intention of it being saved forever in a publicly available, easily searchable archive.
Yet such data is invaluable to historians and can give us a window in time through the eyes of people who lived that time. Having that publicly available data lost for all time would be an immense loss to future generations.
I'm sure in a few generations, some historians will study those archived porn blogs and get an insight on the evolution of humans' relations to sexuality that today's historians can only dream of.
Ironically, IP addresses are probably the _least_ personally identifiable bit of information in a lot of that stuff. Most people's IPs are assigned to someone else within months, or even hours. But a username, profile picture, etc? Those are potentially identifiable.
In a reply to your sibling I explain how in my view, the fact that lay people have no mental model of what kind of information can be incidentally collected and how dangerous it is, whereas lay people are much more capable of understanding the dangers of a personally identifiable username, profile pic, and personal details revealed in posts, makes the former far more dangerous than the latter.
> The John Carmacks of today will be remembered in detail with or without blanket archive efforts.
Sure, but this leaves us a distorted view of history, where we have lots of details on the lives of "great men" and next to none on how ordinary people lived. Which means the vast majority of people who lived and died in that period end up written out of their own history.
Archaeologists spend a lot of time rooting around in ancient rubbish piles and cesspools, because these are some of the very few places where physical evidence of how ordinary people lived has survived. Nobody in ancient times would have nominated those sites as culturally important or worthy of preservation. But what we know of how ordinary people in those times worked, played, ate and drank comes largely from things dug up from them.
I certainly am sympathetic to the preservationist mindset. OTOH, even if we restrict ourselves to content that is natively created in digital form, the amount of "stuff" that comes into existence every day--much of it not on the public web or public social media--is staggering. (And much is not public for good reasons.)
I'm not convinced that we should feel a compulsion to save all of that. Just because it's more practical to be a pack rat about digital content doesn't mean that, taken to extremes, it doesn't still seem like being a pack rat.
In 2008 I found a parcel of bare EPROMs at a flea market containing 27 games. One of those games was Cabbage Patch Kids Adventures in the Park, and it was spread across 12 chips, each one showing a progressive state of development across 9 months.
To my knowledge, this was the only known find of a vintage Atari 2600 game and its iterative development process. So, 30 years later, the only reason we have this snapshot is that someone found these chips and sold them at the flea market.
The current state of digital preservation is abhorrent. Those ROMs would have taken up less than 1/4 of a 5.25" floppy, but the company behind them never thought to preserve that information or data.
Take-Two Interactive republished BioShock in 2012. They couldn't find their source code. They didn't save it. They had to go machine to machine looking for it. The reissued game is not the same as the original.
As a society, we don't place any value on this stuff, but the potential value of it cannot be understood until the future has occurred. Letting it vanish is a disservice to the future. In the past, if a book was published, it wasn't going to vanish if the publisher went out of business; there would simply be no new copies.
In our digital online age, things vanish in seconds, hours, or days. This is also a very different state of affairs. In the past we could not save everything, but everything didn't have a clock over its head, counting down the seconds from the end of the quarter until it is deleted.
The Library of Congress tries to save everything. Yes, libraries weed the stacks and choose items to host. This is due to space concerns: they can't host everything ever. Digitally, they can, and many host reams of microfilm and old newspapers because they can.
Libraries can, thanks to tech, now host every book ever, digitally, for very low costs. Copyright prevents that.
This is an unabated good. Leaving things behind and forgetting them is how you get Tulsa, Oklahoma, or the Armenian Genocide denials. We don't get to choose what the future finds interesting, and for the first time in history, we do not have to. Why in the ever-loving fuck would you worry about that?
Most likely, only for personal reasons. This is a humanity level problem. Your personal worries are irrelevant in 100 years when everyone who ever knew you is dead anyway. Geocities would be more interesting at that time, as a subject of study.
The Library of Congress, British Library, Bibliothèque Nationale, etc. choose to save everything they are mandated to, and a fair bit extra besides. That includes everything published. They don't save their water cooler chats, personal letters and everything sent by post, everything said on the phone or Facebook, etc.
That bar of publication, perhaps arrived at accidentally, seems quite important in deciding what must be archived and what probably shouldn't be.
Archives of personal letters and ephemera, preserved in manuscript/special collections libraries, are incredibly important research sources. This often includes letters which were never meant to be published. LOC had a project to preserve every tweet (published to the world) until a few years ago - who knows what tweets might be useful to future researchers?
And yet, hundreds of years later, historians and linguists crave letters, post, and telegrams to get a glimpse of actual life outside official publications.
Sure, and a hundred or more years later the family of the author, or relatives of the recipient can decide to release the family letters or telegram from WW1 or the US Civil War etc. That delay, usually at least until the correspondents have died, is important. The affair, the less than ideal belief, and all that other imperfect demonstration of humanity can no longer hurt or embarrass. It ceases to be private and personal and moves into the historic.
Releasing them whilst the (probably famous) sender is still alive is most often done to cause damage, is simply tasteless, or amounts to paid-for revelations in the gutter press.
> Leaving things behind and forgetting them is how you get Tulsa, Oklahoma, or the Armenian Genocide denials. We don't get to choose what the future finds interesting, and for the first time in history, we do not have to.
There is plenty of evidence for the Armenian genocide, the Holocaust, and 9/11. That doesn’t really stop deniers or conspiracy theorists. When it becomes politically advantageous, spreading misinformation becomes weaponized and mainstream. A bunch of nerds saving some ROM dumps isn’t going to really change that.
Given what happened to the Library of Alexandria, it’s also quite idealistic to think archive.org will be around in 100 years or more. Not that we shouldn’t do it... but the future can be unkind even to modern technology.