Best practices for migration from Joomla to WordPress?

  • Hi folks,

    The Preamble

    I’ve got a site that’s been using Joomla since Joomla first came on the scene. It’s currently running Joomla 1.5.23.

    I find that Joomla takes way too much of my time to maintain and it becomes a security risk because keeping the core and all the plug-ins, etc. up to date is tedious, to say the least. I’ve got numerous other sites on WP and just love it more and more with every new release. Needless to say, I want to drop Joomla and be solely on WP.

    What I’ve done (and determined) so far
    I’ve read the info at http://codex.wordpress.org/Importing_Content and the part specific to Joomla (it provides three links to various import tools).

    I’ve also read plenty of info on the forums here, and on other web sites.

    I’ve determined that, using one of the migration scripts, it should be fairly easy to get the basic page content into WordPress, although there may be issues with formatting (requiring me to manually reformat posts) and there will definitely be issues with images and any content that relies (in Joomla) on plugins or modules, etc.

    Here’s what I’d appreciate advice on

    1) I’d like to make sure the images issue is as automated as possible, but not sure how to go about it.

    2) Plug-in / module specific issues I don’t mind resolving manually. Shouldn’t be too many (he says with fingers crossed)

    3) Here’s the big and important part that so far I’m not 100% sure how to go about doing:
    – How do I ensure my site doesn’t end up generating loads of 404 errors?

    – What is the best practice for migrating a site to a new system, taking into account the SEO permalinks from the old system which Google (etc.) have stored and ranked?

    Currently my site is in domainname.com/site

    This is because it was originally on some other system and back then I found that setting up Joomla in a new directory and then putting a message on the old site pages notifying users of the move, etc., was the easiest way to go about it.

    I am thinking the new WordPress install will also need to end up in /site, otherwise Google results will all generate 404s. Correct?

    Anyway, the bottom line is that I’d appreciate hearing about how to go about handling what needs to happen AFTER the migration.

    4) I’ve read these articles:
    http://www.iamboredsoiblog.eu/2009/04/20/joomla-migration-to-wordpress/
    http://www.abelcheng.com/a-comprehensive-guide-to-migrating-from-joomla-1-0-to-wordpress/

    Can anyone suggest whether the methods outlined in these articles are the best way to go about it?

    5) What happens to article comments? I am assuming they get lost in the process, unless I manually copy them all over? Is that correct?

    What I’d like to create, after doing this successfully, is a guide that comprehensively spells out the steps needed to achieve a full and proper migration from Joomla 1.5.x to WordPress (including handling the potential issues with 404s, SEO migration, permalinks, etc.)

    Any tips or suggestions on any points in the above would be greatly appreciated.

    Thank you,

    Jonathan

Viewing 15 replies - 16 through 30 (of 32 total)
  • Well, the INSERT statement happens in function j2wp_process_posts_by_steps, which is only called around line 684 in function j2wp_joomla_wp_posts_by_cat, in turn invoked in function j2wp_do_mig on line 178. The INSERT is called conditionally when a post is part of one of the categories the plugin script identifies in lines 86 and following. It’s not called for every single post. How many categories does the plugin report when doing its migration?

    Anyway, I guess in the end the simplest approach is to use the first method I mentioned (adding 'ID' => $R->id around line 376) and force an INSERT just before wp_insert_post on line 408. Connect to the database, then run a query like:

    $query = "INSERT INTO " . $j2wp_wp_tb_prefix . "posts (ID) VALUES ('" . $j2wp_page['ID'] . "')";

    Hopefully that will make wp_insert_post happy.

    Hi “spiff06”,

    How many categories does the plugin report when doing its migration?

    In answer to your question, it depends on whether I tell it to do the whole lot or if I opt to select some, and then depends on how many I select.

    Either way, it will create the categories if they do not exist. If they do exist, it will continue without error.

    Within the context of the existing code, I am not sure how I would go about doing this: “Connect to database, then run query…”

    Would you mind showing me how?

    Thank you…

    For now I’ve imported all my posts the “messy” way, by prepopulating the table.

    If we manage to work out a way to get it happening in a more tidy fashion that would certainly be nice… for other users. I’ll keep working on it, and I’ve also posted a question over on Stackoverflow.

    THE MESSY METHOD
    Prepopulate the wp_posts table in the WP database with as many records as you require (look in Joomla to see how many records you have). I had 398, so I added 398 records to wp_posts.

    HOW? I did it by exporting the empty wp_posts table to a .csv file. I then opened this in Excel (Numbers or OpenOffice would also do). In the spreadsheet application it was easy to autofill 1 to 398 in the ID column.

    I then reimported that .csv file into wp_posts. This gave me a wp_posts with 398 record rows with 1 to 398 in the ID field.

    I then ran version 1.5.4 of Mambo/Joomla to WordPress migrator, which can be installed from within WordPress.

    End result?

    All posts have the same ID as the original Joomla articles.
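
    The spreadsheet step above can also be scripted. Here is a minimal Python sketch that builds a single INSERT statement with one ID-only row per Joomla article. The function name build_placeholder_insert is made up for illustration. It assumes MySQL is not running in strict mode, since the other wp_posts columns are left to their implicit defaults. Passing the real Joomla article IDs instead of 1 to 398 would also cope with gaps in the numbering.

```python
def build_placeholder_insert(table, ids):
    """Return one INSERT statement that creates an ID-only row per article ID."""
    # Hypothetical helper: the table name is checked so it is safe to embed.
    if not table.isidentifier():
        raise ValueError("unexpected table name: %r" % table)
    # De-duplicate and sort so the statement is predictable.
    clean = sorted({int(i) for i in ids})
    if not clean:
        raise ValueError("no IDs given")
    if clean[0] <= 0:
        raise ValueError("IDs must be positive")
    values = ",".join("(%d)" % i for i in clean)
    return "INSERT INTO `%s` (ID) VALUES %s;" % (table, values)


if __name__ == "__main__":
    # Small demonstration; use range(1, 399) for the 398 rows mentioned above.
    print(build_placeholder_insert("wp_posts", range(1, 4)))
```

    The resulting statement can be pasted into phpMyAdmin or the mysql client before running the migrator. Take a database backup first.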

    The module provides a function for that: j2wp_do_wp_connect(). It’s called just before the loop, so you don’t need to worry about selecting the database. Then it’s only a question of running the query (around line 408):

    foreach ( $j2wp_pages as $j2wp_page )
    {
            /* BEGIN Force post id for SEO */
            /* Cast to int so the value is safe to embed in the query. */
            $current_post_id = (int) $j2wp_page['ID'];
            /* Pre-create a row in the posts table carrying the Joomla ID. */
            $query = "INSERT INTO {$j2wp_wp_tb_prefix}posts (ID) VALUES ($current_post_id)";
            $result = mysql_query($query, $CON);
            if (!$result) {
                    echo "Could not create post #$current_post_id<br />";
                    echo mysql_error();
            } else {
                    echo "Created post #$current_post_id<br />";
            }
            /* END Force post id for SEO */
            $id = wp_insert_post( $j2wp_page );

    Not forgetting to add the ID to each page (around line 377):

    $j2wp_pages[] = array(
            'ID' => $R->id, /* Force post id for SEO */
            'post_author' => $user_id,
            'post_content' => $post_content,
            'post_date' => $R->created,
            'post_date_gmt' => $R->created,
            'post_modified' => $R->modified,
            'post_modified_gmt' => $R->modified,
            'post_title' => $R->title,
            'post_status' => 'publish',
            'comment_status' => 'open',
            'ping_status' => 'open',
            'post_name' => $R->alias,
            'tags_input' => $R->metakey,
            'post_type' => 'page'
          );

    Hopefully that’ll do it.

    By the way, this question was asked 3 months ago and was left unresolved. When you do get it to work as required, might want to notify the developer(s) in the appropriate plugin section of these forums.

    Crossed posts again.

    Hi Spiff06,
    Please, do let me know if you wish to drop helping on this at any point. I’ll totally understand, and in no way wish to take your help for granted.

    Now that I have used the “messy method” to get my data into WordPress, I am revisiting what you originally helped me with… the redirections! Hard to believe that’s where this started out.

    As far as #3 and #4 I think you can still refer to posts via identifier, so if you had a post with, say, id=276 in Joomla, giving WordPress something like ?p=276 will reach the proper location.

    Fiddling with .htaccess should point you in the right direction.

    On Joomla I had SEO type permalinks. In the format:
    YYYYMMDD###/sectionname/categoryname/title.html

    ### = the unique identifier. As I write this I am getting the sense it’s darn lucky I’d left that in there, because I am thinking that with the right regular expression I’ll be able to extract that out for a rewrite rule, correct?

    Here is an example of a permalink:
    /20040322256/articles/inspirations-realisations-and-explorations/fate-or-fiction.html

    256 is the unique identifier.

    I gather all the other info can be ignored, and I just need to home in on the 256. Correct?

    What RewriteCond rule would I use for that?

    Note you’ll need to add your Joomla legacy URL rules above the WordPress rules in .htaccess. If you had Joomla pages with other formats, you’ll have to set up several rules to cover all cases.

    As far as I can tell, the only rewrite rules in the htaccess right now are the default ones:

    RewriteCond %{REQUEST_URI} ^(/component/option,com) [NC,OR] ##optional - see notes##
    RewriteCond %{REQUEST_URI} (/|\.php|\.html|\.htm|\.feed|\.pdf|\.raw|/[^.]*)$  [NC]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule (.*) index.php

    On the Joomla site I use a component called SH404SEF. It is set to use the .htaccess method for its SEO activity, so I am a little surprised to see there is no sign of it having put anything into the .htaccess file. Not sure what to make of that.

    I think that covers it as far as SEO. Google will reach the right page with a code 301 (Moved Permanently), and will update its database with the new WordPress URL.

    Do I need to do anything other than provide the redirection rule in order for Google to treat it as a 301? Or does it just assume that to be the case when it sees it has been redirected?

    Again, thank you for your time.

    Jonathan

    The module provides a function for that: j2wp_do_wp_connect(). It’s called just before the loop, so you don’t need to worry about selecting the database. Then it’s only a question of running the query (around line 408):

    I’ve tried implementing this change, as you suggested. It seems, however, that the code is perhaps not following the logic we think it is.

    I did a test. Even if I completely remove (comment out) the wp_insert_post routine we are modifying, it has no impact on the migration script running. Also, I can have that commented out AND have a pre-populated (with ID only) table, and then the migration will successfully complete the “messy method”. So it seems to me that this wp_insert_post routine is not being run at all, or is being used for some other part of the migration script (keep in mind, the script also has the ability to fix URLs in posts, and to relocate images, etc.).

    I am content with just sorting out the redirection part for now, and seeing what I can discover on the Stack Overflow forum. I’ll post any results here for other users, and hopefully learn more about PHP / MySQL in the process.

    Thank you.

    ### = the unique identifier. As I write this I am getting the sense it’s darn lucky I’d left that in there

    Yes indeed. Might want to put it in your WordPress URLs too, although I bet you’ll be staying with WP a while.

    On Joomla I had SEO type permalinks. In the format:
    YYYYMMDD###/sectionname/categoryname/title.html

    This should work:

    RewriteRule ^[0-9]{8}([0-9]*)\/ /?p=$1 [R=301,L]

    Takes practice to get used to regular expressions. See regular-expressions.info and perhaps this cheat sheet for pointers if you need to make up more rules.
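
    To see what that pattern captures without touching the live site, here is a small Python sketch that applies the same regular expression to the example permalink from earlier in the thread. The function name legacy_to_shortlink is made up for illustration. It only mimics the matching. In an .htaccess file, Apache matches the pattern against the path relative to the directory, without the leading slash, which the sketch imitates with lstrip.

```python
import re

# Same pattern as the RewriteRule: an 8-digit date, then the article ID, then a slash.
LEGACY_URL = re.compile(r"^[0-9]{8}([0-9]*)/")


def legacy_to_shortlink(path):
    """Return the /?p=ID target for a legacy Joomla path, or None if the pattern does not match."""
    # Drop the leading slash, as Apache does for rules in .htaccess.
    match = LEGACY_URL.match(path.lstrip("/"))
    if match is None:
        return None
    return "/?p=" + match.group(1)


if __name__ == "__main__":
    example = "/20040322256/articles/inspirations-realisations-and-explorations/fate-or-fiction.html"
    print(legacy_to_shortlink(example))  # prints /?p=256
```

    One thing the sketch makes visible: because the ID group uses *, a path with only the 8-digit date and no ID would still match and produce an empty ?p= value. Using + instead of * in the group would require at least one ID digit.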

    Do I need to do anything other than provide the redirection rule in order for Google to treat it as a 301? Or does it just assume that to be the case when it sees it has been redirected?

    The R=301 flag is what search engines expect (301 redirects).

    You can also search plugins for 301.

    This should work:
    RewriteRule ^[0-9]{8}([0-9]*)\/ /?p=$1 [R=301,L]

    Thank you. Yes, I’ve spent the past hour or so researching regex. First thoughts are that it’s rather complex and yet I can see it is VERY powerful.

    Do I simply put that RewriteRule at the very top of my .htaccess file? Or does something need to be above it (like RewriteBase and/or RewriteEngine On, etc.)?

    From reading a few articles on how to migrate from Joomla to WP I was made aware of the 301 plugins, but from what I gathered they required me to set up a rule for every unique page request, which sounded like a lot of work, which is why I asked for help here, and which is why I’ve been very keen to implement your .htaccess approach 🙂

    I have installed a plugin that will tell me each time a page request results in a 404 page not found, as I am sure there will be links not covered by the RewriteRule, due to various Joomla extensions creating their own permalink structure.
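
    A list of old URLs (from a sitemap or the server logs, for example) can also be checked ahead of time to see which ones the date-plus-ID pattern would cover. Here is a rough Python sketch. The function name split_by_coverage and the second sample URL are made up for illustration, the /site/ base is the old Joomla location mentioned in the original post, and the pattern requires at least one ID digit.

```python
import re

# Date-plus-ID permalink pattern: 8-digit date, one or more ID digits, then a slash.
LEGACY_URL = re.compile(r"^[0-9]{8}([0-9]+)/")


def split_by_coverage(paths, base="/site/"):
    """Split old paths into (redirects, uncovered).

    redirects maps each covered old path to its /?p=ID target;
    uncovered lists the paths that would need a rule of their own.
    """
    redirects, uncovered = {}, []
    for path in paths:
        # Work on the part after the old install directory, if present.
        relative = path[len(base):] if path.startswith(base) else path.lstrip("/")
        match = LEGACY_URL.match(relative)
        if match:
            redirects[path] = "/?p=" + match.group(1)
        else:
            uncovered.append(path)
    return redirects, uncovered


if __name__ == "__main__":
    old_paths = [
        "/site/20040322256/articles/inspirations-realisations-and-explorations/fate-or-fiction.html",
        "/site/index.php?option=com_contact",  # hypothetical extension URL
    ]
    covered, missing = split_by_coverage(old_paths)
    print(covered)
    print(missing)
```

    Anything that lands in the uncovered list is a candidate for an extra rewrite rule or a manual redirect.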

    You’ve really held my hand through this. Thank you. I’ll let you know the result soon.

    Do I simply put that RewriteRule at the very top of my .htaccess file? Or does something need to be above it (like RewriteBase and/or RewriteEngine On, etc.)?

    Well, after RewriteEngine and RewriteBase, which set up the redirection module in Apache. If you modify the permalink structure in WordPress, which you undoubtedly will if you haven’t already, then place that line just above the WordPress block.

    From reading a few articles on how to migrate from Joomla to WP I was made aware of the 301 plugins, but from what I gathered they required me to set up a rule for every unique page request

    No, some of them let you use regex as well.

    Regular expressions are not that complex conceptually. It’s just that there are several ways to get the same result, alternate separators for conflicts or clarity, and the need to escape sequences, all of which can make them hard to read. Then there are backward matching and similar options, which can be rather involved. Tools like the Regulator can help a great deal when learning or designing complex regex.

    Hello Spiff06,
    Amazingly, it’s July 4th and I am just now completing the migration! When I started this (back in Jan 2012) I had a lot of spare time (holiday period), but it took more time than I had available. So it’s now June/July and I am back on the task.

    There is one thing I am not sure about, and I’d greatly appreciate your advice. I found your help invaluable back in January.

    As mentioned in #3 of my OP… I had the Joomla site on domain.com/site (domain.com would redirect to domain.com/site)

    I have my wordpress install located at domain.com/wordpress

    I would actually like my site to load from domain.com

    I am going to use the .htaccess (as suggested by you) to handle the 301 redirects from my old joomla posts to the wordpress ones.

    What I am wondering is how I go about moving the WordPress install to domain.com/ in such a way that nothing breaks. I’d like to ensure that all articles previously at domain.com/site 301 redirect to the same article at domain.com/

    I am aware I can change the home directory for the WP install from /wordpress to / fairly easily. I think I just move all the files, and then plug the new home dir details into the WP config. Correct?
    But I want to make sure I get the 301 redirects right. I’d hate to mess up my search engine status and incoming links from the engines.

    Any tips would be very helpful.

    Hi Spiff, or anyone…
    The trouble I am now facing is that the shortlink /?p=### does not work for pages. It only works for posts.
    Any suggestions?

    There is no difference. A page is a special kind of post.
    'post_type' => 'page'

    Hi Spiff.
    Thanks for mentioning this. It has prompted me to dig around. I see that on one of my other sites ?p= will load pages and posts.

    However, on the site I am working on now (the one I migrated from Joomla) I have to use ?page_id= for pages. If I use ?p= for pages, I get a 404 error. Very strange. I am guessing that a plugin is messing around with the core WP shortlinks.

    I’ll disable them all and see what happens.

    Discovered that it is the 404 Redirected plugin causing the issue. When it is activated, ?p= generates a 404 when the ID is for a PAGE, if the Automatic Redirects option is not activated. If the Automatic Redirect option is activated, then it will redirect to the home page. Either way, it stops the shortlink for PAGES from working.

    I’ve made a tutorial about migrating between the latest versions of Joomla to WordPress. Hope it helps!

    How-To Migrate From Joomla To WordPress
