I'm posting these edits in the hope that someone can pass them along to the people who may be able to use them. Anyone who finds this through a search might get some use out of it, too!
I got my LJ onto my computer in XML format using the ljexport.pl Perl script mentioned on the Codex. I backed up my WP and ran an import via Manage->Import->Livejournal in the WP control panel. It worked, but I was bummed to see no tags. Looking closer, I saw that LJ-cuts weren't handled either; WP uses <!--more--> instead.
I copied wordpress/wp-admin/import/livejournal.php to a testing area and experimented with the code until it picked up the tags and fixed the lj-cuts. Next, I copied the original livejournal.php to livejournal.php.install as a backup and then copied my modified livejournal.php back to wordpress/wp-admin/import/livejournal.php.
Here are the edits I made...
After this code block:
preg_match('|<event>(.*?)</event>|is', $post, $post_content);
$post_content = str_replace(array ('<![CDATA[', ']]>'), '', trim($post_content[1]));
$post_content = $this->unhtmlentities($post_content);
I put this code:
// find <lj-cut> and handle it with the <!--more --> option in WP
// could probably be an if() or while() and still work; preg_replace() replaces
// every match in one pass, and in my simple testing no loop was needed.
// Oh, there's also a <div class="ljcut" text="...">
if ( stristr($post_content, 'lj-') || stristr($post_content, 'ljcut') ) {
    if ( stristr($post_content, 'lj-cut text=') ) { // handles <lj-cut text="...">
        // $1 is the regex placeholder for the parenthesized content
        $post_content = preg_replace('/<lj-cut text="(.*?)">/i', '<!--more $1 -->', $post_content);
    }
    if ( stristr($post_content, 'div class="ljcut"') ) {
        $post_content = preg_replace(':<div class="ljcut" text="(.*?)">:i', '<!--more $1 -->', $post_content);
    }
    if ( stristr($post_content, '<lj-cut>') ) { // plain cut, no text= attribute
        // the s flag lets . match newlines, so cuts spanning several lines are caught
        $post_content = preg_replace(':<lj-cut>(.*?)</lj-cut>:is', '<!--more -->$1', $post_content);
    }
    // this gets rid of </lj-cut>, <lj-raw>, </lj-raw>, and any other <lj-...> tag
    // (matching only inside a single tag so nothing before it on the line is eaten)
    $post_content = preg_replace(':</?lj-[^>]*>:i', '', $post_content);
}
// insert tags
$tags_input = ''; // reset so a post without tags doesn't reuse the previous post's tags
preg_match('|<props>(.*?)</props>|is', $post, $props);
$props = implode(" ", $props);
if ( stristr($props, "taglist") ) {
    if ( preg_match('|taglist\' value=\'(.*?)\' /|is', $props, $tag_match) ) {
        $tags_input = $tag_match[1]; // $tag_match[1] has our tags in a csv string
    }
}
Note that the new code also goes before this code block:
// Clean up content
$post_content = preg_replace('|<(/?[A-Z]+)|e', "'<' . strtolower('$1')", $post_content);
$post_content = str_replace('<br>', '<br />', $post_content);
$post_content = str_replace('<hr>', '<hr />', $post_content);
$post_content = $wpdb->escape($post_content);
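Add $post_content = str_replace('</div>', '', $post_content); to the top of the clean up content area. It removes the closing tag left behind by <div class="ljcut">, but be aware that it strips every </div> in the post: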
// Clean up content
$post_content = str_replace('</div>', '', $post_content);
$post_content = preg_replace('|<(/?[A-Z]+)|e', "'<' . strtolower('$1')", $post_content);
$post_content = str_replace('<br>', '<br />', $post_content);
$post_content = str_replace('<hr>', '<hr />', $post_content);
$post_content = $wpdb->escape($post_content);
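If you want to try the lj-cut handling outside of WordPress first, here is a minimal standalone sketch. The function name lj_cuts_to_more() is made up for this example and is not part of the importer, and the patterns are a simplified take on the idea rather than the exact code in livejournal.php.

```php
<?php
// Hypothetical helper for testing only; not part of the WordPress importer.
function lj_cuts_to_more($content) {
    // <lj-cut text="..."> becomes <!--more ... -->
    $content = preg_replace('/<lj-cut text="(.*?)">/i', '<!--more $1 -->', $content);
    // <div class="ljcut" text="..."> is treated the same way
    $content = preg_replace('/<div class="ljcut" text="(.*?)">/i', '<!--more $1 -->', $content);
    // a bare <lj-cut> becomes a plain <!--more -->
    $content = preg_replace('/<lj-cut>/i', '<!--more -->', $content);
    // strip any remaining <lj-...> or </lj-...> tags
    return preg_replace('/<\/?lj-[^>]*>/i', '', $content);
}

$sample = 'Intro <lj-cut text="Read on">hidden part</lj-cut> end';
echo lj_cuts_to_more($sample), "\n";
```

Running it on a few entries copied from your own export is a quick way to check the result before touching the real importer.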
Now, on to the next grouping of code just below this. Find the line that says $postdata = compact('post_author', 'post_date', 'post_content', 'post_title', 'post_status'); and add , 'tags_input' before the closing parenthesis, so that the line becomes $postdata = compact('post_author', 'post_date', 'post_content', 'post_title', 'post_status', 'tags_input');
We set the categories at the end of the else {} block below that edit by adding these lines:
// set categories: function wp_create_categories(array $categories, $post_id = '') --return $cat_ids
// skip posts that have no tags so no empty category gets created
if ( isset($tags_input) && $tags_input !== '' ) {
    $categories = explode( ", ", $tags_input );
    wp_create_categories($categories, $post_id);
}
The resulting code looks like this:
echo '<li>';
if ($post_id = post_exists($post_title, $post_content, $post_date)) {
    printf(__('Post <i>%s</i> already exists.'), stripslashes($post_title));
} else {
    printf(__('Importing post <i>%s</i>...'), stripslashes($post_title));
    $postdata = compact('post_author', 'post_date', 'post_content', 'post_title', 'post_status', 'tags_input');
    $post_id = wp_insert_post($postdata);
    if (!$post_id) {
        _e("Couldn't get post ID");
        echo '</li>';
        break;
    }
    // set categories: function wp_create_categories(array $categories, $post_id = '') --return $cat_ids
    // skip posts that have no tags so no empty category gets created
    if ( isset($tags_input) && $tags_input !== '' ) {
        $categories = explode( ", ", $tags_input );
        wp_create_categories($categories, $post_id);
    }
}
preg_match_all('|<comment>(.*?)</comment>|is', $post, $comments);
$comments = $comments[1];
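For checking the tag parsing on its own, here is a small sketch that pulls the taglist value out of a props string and splits it into names. The helper name lj_tags_to_names() and the name='taglist' attribute in the sample are assumptions for illustration; the only part taken from the code above is the taglist' value='...' / layout that the regex expects.

```php
<?php
// Hypothetical helper for illustration; not a WordPress function.
function lj_tags_to_names($props) {
    // Same idea as the importer regex: capture the value of the taglist prop.
    if (!preg_match("|taglist' value='(.*?)' /|is", $props, $m)) {
        return array(); // no taglist prop in this post
    }
    // Split on commas, trim whitespace, and drop empty entries.
    $names = array_map('trim', explode(',', $m[1]));
    $names = array_filter($names, function ($n) { return $n !== ''; });
    return array_values($names);
}

// Sample props line; the name= attribute here is an assumed layout.
$props = "<prop name='taglist' value='cats, travel, php' />";
print_r(lj_tags_to_names($props));
```

Splitting on a bare comma and trimming is a little more forgiving than splitting on ", " exactly, in case an export has tags without the space.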
I put a bit of work into this and hope it can help someone. :)