Archiving

I’ve just been perusing old journal entries and pondering the fact that having them exist only on someone else’s server makes me nervous. I don’t want a server crash to destroy them. Does anyone know of a way to easily archive Journals and commentary into a file so I can save them on my own computer? I can cut and paste into Microsoft Word, but it is tedious.

8 thoughts on “Archiving”

  1. Why not just save the HTML file? Or did you want to get the comments as well? In which case I’m not sure what the best course is.

    –John

  2. Unfortunately, as much as I’d love to say there’s a program out there to archive your journal… there isn’t. There *used* to be, but for whatever reason it no longer seems to work with the LJ servers.

    I’ve had the same fears as you, and there’s a pretty decent call for a program like that, but no one’s actually made one yet. Sorry.

  3. I’d do that. I mean, if you really just need to save the stories, you can download the page (File > Save As) and read it offline. I’m sure Howard knows how. I think we couldn’t get it to work a couple years ago for a project at work when we used Internet Explorer, but we ended up using Mozilla and it worked fine. Something like that would be pretty simple; you’d just have to have a reliable place to store them at home, be it a CD-R or a spare hard drive or something like that 🙂

  4. The LJ FAQ entry for saving old entries is here.

    Long story short: The export page is here. You can download your entries in XML or in a comma-separated format. Not pretty, but easy for programs to read. There is no way to save comments other than visiting all your pages and saving them. And if you really want to do something like that, I’d suggest a program like WinHTTrack as opposed to doing it by hand.
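
    For what it’s worth, HTTrack also comes in a command-line flavor, so a mirror run could look something like the line below. Treat it as a sketch: “yourname” is a placeholder, the filter pattern may need adjusting for your journal’s URL layout, and friends-only entries won’t appear unless the mirror is run with your login cookies.

    httrack "http://www.livejournal.com/users/yourname/" -O lj-mirror "+www.livejournal.com/users/yourname/*"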

  5. I’d do as Benabik suggested, but I would use the XML file myself, rather than try to put it into a CSV format, which is useful for Excel, but… 😀

    Well, problem solved, I hope. Amazing how quickly one adapts to something like this, isn’t it?
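
    If it helps, here is a rough sketch of reading that XML export with Perl’s XML::Simple module. I’m assuming the export wraps each post in an entry element with eventtime, subject, and event children; check an actual export file before relying on this, since I haven’t verified the exact tag names.

    use XML::Simple;

    # Parse one month's export (saved here as export.xml) and list each entry.
    # SuppressEmpty makes empty tags come back as undef instead of {}.
    my $lj = XMLin('export.xml', ForceArray => ['entry'], SuppressEmpty => 1);

    foreach my $entry (@{ $lj->{entry} }) {
        my $subject = $entry->{subject} || '(no subject)';
        print "$entry->{eventtime}  $subject\n";
    }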

  6. I felt the same way, so I wrote a Perl program to do exactly that. It’s not pretty, but it works. Just pass it your LJ username and password (so it can get all the protected entries), and it will copy all the entries into one file and make a separate HTML file for each entry with all its comments. Error checking is minimal to non-existent, so it occasionally dies, but it doesn’t damage anything in the process.

    use LWP::UserAgent;
    use URI::URL;
    use HTTP::Cookies;
    #use LWP::Debug qw(+ +conns);
    use HTTP::Request::Common qw(POST);
    use CGI qw/escape unescape/;

    if ($#ARGV < 1) {
        print "Usage: getlj.pl {username} {password} [starting entry]\n";
        exit 42;
    }

    $username  = shift;
    $password  = shift;
    $startwith = shift;
    if ($startwith eq '') { $startwith = 1; }

    $ua = LWP::UserAgent->new;

    # Use a cookie jar so the login session carries over to later requests.
    $cookie_jar = HTTP::Cookies->new({});
    $cookie_jar->clear();
    $ua->cookie_jar($cookie_jar);

    # Set up the login, in order to obtain the proper cookies.
    $url = new URI::URL('http://www.livejournal.com/login.bml');

    $req = POST $url,
        [
            mode     => 'login',
            user     => $username,
            password => $password,
        ];

    $resp = $ua->request($req);

    if ($resp->is_success) {
        print "Login page loaded successfully\n";
        # print $resp->headers_as_string;
        # print $resp->content;
    }
    else {
        print "request failed\n";
        print $resp->message;
        exit 42;
    }

    print "\n-----------------------------------------------------\n";
    print $cookie_jar->as_string();
    print "\n-----------------------------------------------------\n";

    # The flat protocol interface answers with key/value pairs on
    # alternating lines.
    $url = new URI::URL('http://www.livejournal.com/interface/flat');

    # Asking for itemid -1 returns the most recent entry, which tells us
    # how many entries there are in total.
    %getdata = (
        user       => $username,
        password   => $password,
        mode       => 'getevents',
        selecttype => 'one',
        itemid     => '-1',
    );

    # Append to lj.txt when resuming from a given entry; otherwise start over.
    if ($startwith != 1) {
        open (LJFULL, '>>lj.txt');
    } else {
        open (LJFULL, '>lj.txt');
    }

    $req = POST $url, [%getdata];

    print "making request\n";

    $resp = $ua->request($req);

    if ($resp->is_success) {
        print "Item retrieved successfully\n";
        # print $resp->headers_as_string;
        # print $resp->content;
    }
    else {
        print "request failed\n";
        print $resp->message;
        exit 42;
    }

    # Alternating key/value lines load straight into a hash.
    %tag = split ("\n", $resp->content);

    $maxitem = $tag{events_1_itemid};

    print "There are $maxitem items to retrieve\n";

    for ($item = $startwith; $item <= $maxitem; $item++) {

        $getdata{itemid} = $item;

        print "------ request data ----\n";
        foreach $thing (keys %getdata) {
            print "key=$thing, value=$getdata{$thing}\n";
        }

        $req = POST $url, [%getdata];

        $resp = $ua->request($req);

        if ($resp->is_success) {
            print "Item retrieved successfully\n";
        }
        else {
            print "request failed\n";
            print $resp->message;
            exit 42;
        }

        %tag = split ("\n", $resp->content);

        print "-------- tags for item $item -----\n\n";
        print LJFULL "-------- tags for item $item -----\n\n";

        foreach $thing (keys %tag) {
            print "key=$thing, value=$tag{$thing}\n";
            print LJFULL "key=$thing, value=$tag{$thing}\n";
        }

        print "Property count: $tag{prop_count}\n";

        # The public URL for an entry uses itemid * 256 + anum rather than
        # the raw protocol itemid.
        $itemurl = "http://www.livejournal.com/users/$username/"
            . ($tag{'events_1_anum'} + $tag{'events_1_itemid'} * 256) . '.html';
        print "itemurl=$itemurl\n";

        # Fetch the rendered page (entry plus comments) through the
        # logged-in session, and save it under its public name.
        $webdoc = $ua->request(HTTP::Request->new(GET => $itemurl));
        $fn = ($tag{'events_1_anum'} + $tag{'events_1_itemid'} * 256) . '.html';
        open (WEBPAGE, ">$fn");
        print WEBPAGE $webdoc->content;
        close WEBPAGE;
    }

    close LJFULL;
    print "Done\n";
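
    One usage note, in case it isn’t obvious from the code: since the script appends to lj.txt whenever you pass a starting entry, an interrupted run can be resumed with something like “perl getlj.pl yourname yourpass 517”, where 517 stands in for the first item number that hadn’t been fetched yet.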

  7. Semagic

    I’ve been using Semagic for my entries. It has a history function that can be used to look through every post you’ve made. For each entry, you can go into edit mode, then save it to a file. It saves in Semagic’s own proprietary format. Kind of cumbersome, but that’s the best thing I can find so far.

  8. If you have a machine on which you can use the LogJam client, it has a feature to maintain an offline copy of your journal. I don’t know whether the offline copy includes comments.

    (Which reminds me that I should synchronize mine again.)
