Recovering posts
My server crashed and so the blog was down all day and — damn! — I lost all posts since the 16th. I just restored some of them but links and comments were lost. Sorry!
UPDATE: I managed to recover some of the meatier comment discussions. I didn’t recover all posts or comments because it’s just laborious. But note well how I did it: Google’s cache had everything my host had lost. It never ceases to amaze, Google.
July 24th, 2008 at 9:52 am
I think this incident highlights one of the points that I’ve been making - the fragility of data stored remotely.
I’m not talking about the usual warning to back up your data that people regularly ignore, but the tendency for such data to be unique and at risk.
One can make an argument that old photos stored in a shoebox are just as at risk as the modern ones stored on a memory card or hard drive, but one depends upon technology to recover it and the other doesn’t.
What I’m really concerned with is the gateway function that can be easily installed when data is only available remotely. This presents huge threats to democracy, democratic movements in authoritarian countries, and innovation in general.
It’s not like we don’t have experience with censorship and repression in this world. One can look at the role of samizdat in the USSR, or even the fiction of “Fahrenheit 451,” where books were memorized, to see how important preserving information is.
The current cozy relationship between Google et al and China is also worrying. There are trends in the US as well.
NY AG Cuomo has intimidated a bunch of ISPs into cutting off access to parts of the Usenet system because it “may” contain child pornography. This is prior restraint, which just this week was declared unconstitutional by a federal court (again) in another case.
In addition, it involves the ISPs in filtering content, which blurs their role as common carriers. If you think Google should stay away from content creation, then so should Comcast, Verizon and the rest of the pipes.
Jeff, as a big supporter of freedom of speech I don’t think you get concerned enough about all the threats to it. Making everything electronic and centralized only makes it easier to throttle.
July 24th, 2008 at 11:35 am
You need to do nightly backups. Linux has tons of choices. Windows has robocopy, which is much like rsync on Unix. Set up a cron job (or a scheduled task on Windows) and forget about it.
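For anyone who wants something concrete, here is a minimal sketch of the kind of script a nightly job could run. It is only an illustration, not what any particular host or tool does: the function name `make_backup` and the `backup-YYYY-MM-DD` file naming are made up for this example, and it assumes the blog's files live in one directory that can be archived as-is.

```python
import datetime
import shutil
from pathlib import Path


def make_backup(source_dir, dest_dir, today=None):
    """Archive source_dir into dest_dir/backup-YYYY-MM-DD.tar.gz.

    Hypothetical helper for illustration; returns the path of the archive.
    """
    today = today or datetime.date.today()
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # One archive per calendar day; the date in the name is an assumption
    # of this sketch, chosen so that older copies are not overwritten.
    base_name = dest / f"backup-{today.isoformat()}"
    # shutil.make_archive appends the ".tar.gz" suffix itself.
    archive = shutil.make_archive(str(base_name), "gztar", root_dir=str(source_dir))
    return Path(archive)
```

A scheduler would then call something like `make_backup("/path/to/blog", "/path/to/backups")` once a night; both paths are placeholders. Keeping the destination on a different machine from the server is the part that actually protects against a crash like this one.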
July 24th, 2008 at 6:12 pm
RS:
Don’t put too much faith in technology. I once had a system of backups where they were recycled at the end of the month.
An important, but seldom accessed file got messed up, but this went unnoticed until the cycle erased the last existing copy.
So should we add in longer archiving? A year, a decade, forever? How would you find the correct version? Do you store one copy of the backup, several, in many locations?
How much is this extra reliability worth? Who pays?
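One common compromise on those questions is tiered retention: keep every recent backup, then thin older ones out to one per month, then one per year. The sketch below is only one way to express that idea; the function name `backups_to_keep` and the 30-day and 12-month thresholds are assumptions picked for illustration, not a recommendation.

```python
import datetime


def backups_to_keep(backup_dates, today, daily_days=30, monthly_months=12):
    """Return the subset of backup_dates that a tiered policy would retain.

    Hypothetical policy: every backup from the last `daily_days` days,
    first-of-month backups for `monthly_months` months, and
    first-of-year backups indefinitely.
    """
    keep = set()
    for d in backup_dates:
        age_days = (today - d).days
        months_old = (today.year - d.year) * 12 + (today.month - d.month)
        if age_days < daily_days:
            keep.add(d)  # recent: keep every daily copy
        elif d.day == 1 and months_old <= monthly_months:
            keep.add(d)  # older: keep only the first of each month
        elif d.month == 1 and d.day == 1:
            keep.add(d)  # oldest: keep only the first of each year
    return keep
```

Under a policy like this, a file that quietly broke two months ago can still be found in a monthly copy, at the cost of storing a few dozen archives instead of one rolling month. It does not answer who pays, and it does not help if the damage goes unnoticed for longer than the gap between retained copies.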
July 25th, 2008 at 10:24 am
This is why I’ve left my young blog at Blogger. For all the little things I wish I could do with it that I haven’t figured out yet, I trust Google’s data storage.
Yes, I have copied all my posts to a long-ass Word document in case somebody one day wants all my posts (like future me), and I’m still not too sure about the legal crap about who owns what’s posted on a blogspot.com blog, but I sleep at night.
Peace.