Missing history

So it turns out that the LiveJournal to WordPress Importer didn’t actually import everything. I’d been going through and updating links to old entries to point to the corresponding entries here in WordPress, and there were several pages that didn’t actually make it across (it also imported every single comment back to around 2010 twice, so I had to go through and delete all of those duplicates; prior to 2010 it was fine, for some weird reason). That wouldn’t have been too bad, but in between my originally importing my LiveJournal and my discovering this, Kristina’s old LiveJournal account was deleted, which also meant that every single comment of hers on my LiveJournal was now gone, and any fresh import I did directly from the LiveJournal API wouldn’t have them at all. 🙁 Looking back through my old entries was kind of sad, with just “Deleted comment” everywhere in place of Kristina’s actual comments.

I wanted to have the complete history of my LiveJournal here in WordPress, but I also didn’t want to have all of Kristina’s comments missing. I figured SOMETHING could be done!

I’d used ljdump to hit LiveJournal’s API and download each entry there into a raw XML file, and that had grabbed all the journal entries, so clearly something had fucked up in the WordPress import part.

The situation was this:

  • I had most but not all of my old LiveJournal entries imported into WordPress
  • Those entries that made it across did have Kristina’s old comments on them
  • I had all of my own entries downloaded to raw XML
  • ljdump also grabbed all the comments for each entry as well (sans Kristina’s, obviously)

I manually went through and compared the entries in WordPress to those on my LiveJournal month-by-month, and found that there were 68 missing ones in total. I hacked at the LiveJournal to WordPress Importer plugin until I was able to get it to read the raw XML files that’d come directly from ljdump, then spun up a new temporary WordPress install and was able to import just those missing entries. Next, I erased that temporary instance, imported the full backup from this blog, then ran the importer again to bring in just those 68 missing entries from XML, and it worked a treat.
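The month-by-month comparison above was done by hand, but the same check could be scripted. Here is a minimal Python sketch, assuming one XML file per entry in the dump directory, each with an `<itemid>` element somewhere inside it, and assuming you already have the IDs of the entries that made it into WordPress. The function names and the file layout are hypothetical; check them against your own ljdump output before trusting the result.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def dumped_entry_ids(dump_dir):
    """Collect the entry ID from every parseable XML file in dump_dir.

    Assumption: each entry file contains an <itemid> element below the root.
    Files that are not well-formed XML, or have no <itemid>, are skipped.
    """
    ids = set()
    for path in Path(dump_dir).iterdir():
        if not path.is_file():
            continue
        try:
            root = ET.parse(path).getroot()
        except ET.ParseError:
            continue  # not an XML file, ignore it
        node = root.find(".//itemid")
        if node is not None and node.text:
            ids.add(int(node.text.strip()))
    return ids


def missing_entries(dump_dir, imported_ids):
    """Return the sorted IDs that are in the dump but not in imported_ids."""
    return sorted(dumped_entry_ids(dump_dir) - set(imported_ids))
```

Calling `missing_entries("path/to/dump", imported_ids)` would then give the list of entries to feed back through the importer.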

Unfortunately a handful of those entries had also had Kristina’s comments on them previously, so those comments were still missing. Thankfully, me being the digital hoarder that I am, I still had all of the email notifications that LiveJournal had sent me for each and every comment on my journal, and the LiveJournal API actually shows even deleted comments in their properly threaded state, just with no body or detail beyond the username who posted it. So I was able to copy the content and timestamp for each comment of Kristina’s that’d been on those missing entries, and update the raw comment XML with that detail!
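Filling the recovered text back into the comment XML could look something like the Python sketch below. It assumes a comment file made up of `<comment>` elements, each with an `<id>` child and optional `<body>` and `<date>` children. That element layout, the function name and the `recovered` mapping are all assumptions for illustration, not the actual ljdump schema, so adjust the tag names to match the real files.

```python
import xml.etree.ElementTree as ET


def restore_comments(xml_text, recovered):
    """Fill in body and date for comments whose ID appears in `recovered`.

    `recovered` maps a comment ID (as a string) to a (body, date) tuple,
    e.g. copied out of the old email notifications.
    Returns the updated XML as a string and the number of comments restored.
    """
    root = ET.fromstring(xml_text)
    restored = 0
    for comment in root.iter("comment"):
        comment_id = comment.findtext("id")
        if comment_id not in recovered:
            continue
        body_text, date_text = recovered[comment_id]
        for tag, value in (("body", body_text), ("date", date_text)):
            node = comment.find(tag)
            if node is None:
                # Deleted comments may have no such element at all, so add one.
                node = ET.SubElement(comment, tag)
            node.text = value
        restored += 1
    return ET.tostring(root, encoding="unicode"), restored
```

Threading is left alone here: only the body and timestamp are touched, so any parent/child information already in the file stays as the API returned it.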

This is still a work in progress and my next step is to hack at the importer further to read the comments directly from XML (currently it’s reading the journal entries from the XML files, but the comments are still pulled from LiveJournal’s API directly). It’ll definitely be do-able, it may just take a little while because everything related to WordPress is in PHP and I’ve not done any PHPing for quite a number of years now!

This may seem a bit odd, but given I have 13 years of history in LiveJournal, and it’s where Kristina and I initially started chatting a lot more before she visited and we got together, I didn’t want to have these weird entries where Kristina was just essentially erased from my blog history.

I might even put my modifications to the plugin up on Bitbucket if it seems to be working well, given the current LiveJournal to WordPress Importer is a bit shit.
