Date: 2005-11-25 11:43:00
Tags: backup, livejournal
livejournal backup copy
Ever since I started using livejournal, it's annoyed me that the content I write is stored on somebody else's server. It's not that I doubt livejournal will be around for the foreseeable future; I just prefer to keep my own copy of my own content.

I've been halfheartedly addressing this problem by using the charm livejournal client to post. Charm stores an archive copy of every post you make. However, I don't always use charm to post, especially when I'm travelling.

So I finally got around to writing a simple livejournal backup program called ljdump. It's a small Python program that archives livejournal posts as XML files in a subdirectory. You can run it as often as you like, and each run downloads only the posts that are new since the last run.
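
For the curious, here is a rough sketch of the incremental idea. It is not the actual ljdump source, just an illustration under a few assumptions: fetch_since is a hypothetical callable standing in for whatever talks to the livejournal server, each entry is a plain dict with an itemid and a sortable time string, and the file names entry-<itemid>.xml and last_sync.txt are made up for this example.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical file name used to remember how far the last run got.
STATE_FILE = "last_sync.txt"


def entry_to_xml(entry):
    """Serialize one entry dict into a small XML document (one child per key)."""
    root = ET.Element("entry")
    for key in sorted(entry):
        child = ET.SubElement(root, key)
        child.text = str(entry[key])
    return ET.tostring(root, encoding="unicode")


def read_last_sync(archive_dir):
    """Return the timestamp saved by the previous run, or "" on the first run."""
    path = os.path.join(archive_dir, STATE_FILE)
    if not os.path.exists(path):
        return ""
    with open(path, encoding="utf-8") as f:
        return f.read().strip()


def archive_new_entries(fetch_since, archive_dir):
    """Write every entry newer than the saved timestamp; return their item ids.

    fetch_since(timestamp) is an assumed callable that returns a list of
    entry dicts whose "time" value is later than the given timestamp.
    Times are "YYYY-MM-DD HH:MM:SS" strings, so plain string comparison
    orders them correctly.
    """
    os.makedirs(archive_dir, exist_ok=True)
    last_sync = read_last_sync(archive_dir)
    newest = last_sync
    written = []
    for entry in fetch_since(last_sync):
        path = os.path.join(archive_dir, "entry-%d.xml" % entry["itemid"])
        with open(path, "w", encoding="utf-8") as f:
            f.write(entry_to_xml(entry))
        written.append(entry["itemid"])
        newest = max(newest, entry["time"])
    # Remember the newest timestamp seen so the next run can skip older posts.
    with open(os.path.join(archive_dir, STATE_FILE), "w", encoding="utf-8") as f:
        f.write(newest)
    return written
```

Running archive_new_entries twice in a row against an unchanged journal writes everything the first time and nothing the second, which is the behaviour described above.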

Hope somebody finds this useful.
[info]thomasj
2005-11-25T20:36:06Z
Excellent! I've wanted something like this.

I wrote a perl script last year to retroactively make all my posts friends-only.
[info]ehintz
2005-11-25T20:54:16Z
Suh-weet. I've been halfheartedly looking for something like this for a while. Thanks...
[info]ehintz
2005-11-25T22:15:43Z
Feedback, in case you care...

First run through there were about 6 times that it failed with "challenge expired". 2nd run got all those entries.

Also, it choked both times on this entry:

http://www.livejournal.com/users/ehintz/2002/04/23/

Due to the quotes... The "auto-convert older entries from W. Euro (windows)" option on LJ settings fixed that.

Nice tool, thanks for writing 'er.
[info]ghewgill
2005-11-26T22:24:35Z
I saw that "challenge expired" thing once too, but running it again fixed it. Not sure what causes that (possibly on the server side).

That's interesting about the quotes. I use the API version that is supposed to encode everything in UTF-8 and not have that sort of problem, but who knows. The LJ API sucks. :)
[info]goulo
2005-11-26T00:04:21Z
I would swear I saw existing apps that do this sort of lj dumping already, but I'm too lazy to go look for the references...
[info]ghewgill
2005-11-26T22:26:32Z
Before writing this I halfheartedly looked and saw a bunch of Windows apps that did archiving. I just wanted a little command-line tool that dumped to XML files, nothing more. Apparently there is a tool developed by Livejournal themselves (called jbackup or something) that does something similar, but it's in Perl and relies on a bunch of nonstandard Perl modules. I wanted something low-impact.
[info]bovineone
2005-11-26T03:32:52Z
Is this much better than the built-in export functionality that LiveJournal offers? http://www.livejournal.com/support/faqbrowse.bml?faqid=8

It provides an XML export format (or CSV), but it doesn't export the reader comments made to an entry.
[info]dbaker
2005-11-26T03:50:38Z
That can only do one month at a time.
[info]kayateia
2008-03-20T04:52:34Z
I just discovered this while looking for a way to back up old LJ entries with comments. Very nice piece of work, thanks! The XML output looks nice and clean.

I'm curious how you did the display on your web site of the archived entries. I could write something to do something similar myself, but since I'm on a roll of laziness, I figured I'd ask... :)
[info]ghewgill
2008-03-20T04:59:04Z
I rolled a bunch of XSLT that publishes the XML backup to HTML. It's a bit more specific to my configuration but you're welcome to give it a go: http://hewgill.com/viewvc/ljpublish/trunk/
[info]kayateia
2008-03-20T05:04:34Z
Ahh cool, I'll check it out. Thanks!

Greg Hewgill <greg@hewgill.com>