• Resolved Piotr

    (@p396)


    Hello,

    My problem is a bit complicated, but I’m kind of new to WordPress and I’m still learning PHP, so maybe I can’t see the solution yet.

    Here’s my situation. I’m running a community forum on which I have a news section. I would like to have a CMS that takes posts from that section and displays them on a website just as most blogs do – with categories, tags and so on, and I think it can be achieved with WordPress (actually I’m almost sure it can, but I’m missing some knowledge to figure out how exactly). I’ve read a lot of tutorials, which gave me some ideas.

    To do this, I would have to connect to the forum database and get post data from a table or tables. I tried using wpdb class (new wpdb in my child theme functions, it works great), but is it possible to put this data in a WordPress loop using WP_query? Doing it this way would let me use template tags and give me all the goodness of WordPress (permalinks, search, archives?), the problem is, can the WP_query be used or modified to get certain columns from certain tables in a different database, and treat it as usual post data?

    If yes, great, if no, I have a second idea – copying certain forum columns into a separate table in wordpress database and using that data for custom posts. Probably this would be better, but I’m worried about the number of connections to the forum database – is there a way to copy and update this data periodically or when certain tables are updated (like when a new news-topic is published or edited)? Because in the previous case I was hoping that some kind of caching plugin in WordPress would solve it, but here… I’m not so sure.

    One thing to note – I’m not thinking about editing these posts in WordPress, I don’t need to do anything with them (although some parsing will be needed, but that is next on my “learn how to do it” list, if this works out), I just need a one way connection.

    So… this is it. I was hoping more experienced users can give me advice and point me in the right direction – I would be very grateful for that. Thanks for anything you can share.

Viewing 12 replies - 1 through 12 (of 12 total)
  • Moderator bcworkz

    (@bcworkz)

    On a superficial level, you could use WP_Query. It’s possible to alter the query just before it runs to query your forum table instead of the posts table. Then the query results will be rows from the forum table instead of posts. This is where things will likely fall apart unless the forum columns are remarkably similar to posts columns.

    All the cool features of WP you’re looking for rely on the correct interrelationship between the posts table and other tables, in particular the various terms tables. Unless your forum’s table is virtually a mirror of posts, it’s not going to work well. A caching plugin will not help here. Such a plugin reduces DB hits, it cannot magically make one table look like another.

    You will need to do some sort of regular import routine to translate recent forum posts into posts. This can be done with wp_schedule_event(). A timestamp saved every time the translation runs is used to query for the latest forum posts. Each result row is then translated in turn into a WP post and inserted into the posts table. Since as far as WP is concerned you have a complete, stand alone blog, everything will work fine. The only real difference is how the content is generated.

    Thread Starter Piotr

    (@p396)

    Hi,

    Thanks for your reply. So as I thought, using WP_Query directly is impossible here – you are right that the table structure would have to be identical, and I know it isn’t. Oh, and the caching plugin, just as you said, I was thinking about reducing the database hits, I wasn’t assuming it will change the structure. And the schedule event should help with my second idea, thank you for that.

    So one more thing to figure out, the translation. Would you have any suggestions? If I query forum posts and insert them into WordPress table, there will be problems when trying to use the loop. I would have to somehow translate them before inserting. Or is it possible to do this after inserting and before querying them for the loop?

    Thanks for the help.

    Moderator bcworkz

    (@bcworkz)

    It’s not possible to store the original format in WP and convert on output, otherwise you could just use the original table.

    It needs to happen before insertion. The wp_insert_post() takes a WP_Post object as an argument. Thus you need to build this object using the various columns of the forum table as the data source, but all the object keys are pure WP. Translation will mainly be of the “assign this column value to this post key” nature. There could be some format conversions, perhaps the timestamps are different. One thing to watch out for is character mapping. Unless both tables use identical character maps, be sure the conversion is done correctly.

    Thread Starter Piotr

    (@p396)

    Thank you for your help. I still have some things to learn, but I’m on the right track, so I’ll mark it as resolved.

    Moderator bcworkz

    (@bcworkz)

    You’re most welcome. If you get stuck on something in your learning ‘adventure’, feel free to ask.

    Thread Starter Piotr

    (@p396)

    Hi,

    I’m back after a while. I’m not really stuck (yet), just looking for a few answers. I’m not sure if I should post another topic after setting this one as resolved… I’ll repost it if that’s necessary.

    Anyways – after some reading I’ve created a custom post type, connected to my forum database and inserted a few posts into wp_posts. This required simple SQL functions to convert date and post name columns and surprisingly it works. But then I checked the number of queries and now I’m thinking it’s a terrible idea. Or I’m doing this terribly wrong. Here’s the code:

    // $mydb is new wpdb connecting to forum database
    $forumrows = $mydb->get_results( $sql, ARRAY_N );
    
    foreach ($forumrows as $row) {
    
    $news = array(
            "post_title" => $row["0"],
            "post_content" => $row["1"],
            "post_name" => $row["2"],
            "post_date" => $row["3"],
            "post_date_gmt" => $row["3"],
            "post_type" => "news",
            "post_status" => "publish"
    );
    
    // checking forum section ID  (column "4") to set category
        switch ($row["4"]) {
            case 62:
                $cat_ID = 2;
                break;
            case 63:
                $cat_ID = 3;
                break;
        }
    
    $post_ID = wp_insert_post( $news );
    
    wp_set_object_terms($post_ID, $cat_ID,'category');
    }

    As I said, it works – it queries post data from the forum, inserts it into wp_posts and adds a category depending on the forum section ID. But inserting just 10 posts resulted in 196 queries. If I were to insert all my posts, and there are about 900, I would probably use up database connections limit on my server. And still would need to update them regularly.

    Is there a more efficient way? Or is my code that inefficient?

    Any help will be appreciated.

    Moderator bcworkz

    (@bcworkz)

    Hmmm, that’s interesting! It’s not that your code is inefficient, it’s that WP is inefficient. Both insert post and set terms functions have the potential to make a LOT of queries on one single pass. I hadn’t realized this until you mentioned it, inspiring me to look through the related source code.

    To be fair, some of the possible queries I saw could be coming from a cache. WP does make an effort to not be too much of a pig. All the queries should not use up your alloted DB connections, AFAIK all the queries go through the one or two connections. Still, keeping the query count down is a good thing.

    The insert post function does a lot of work ensuring everything is in order before actually inserting the post, which is the reason for many of the queries. You could probably safely directly insert post rows using $wpdb, eliminating all the other related queries.

    A direct approach is also possible for setting terms, at least in theory. I find the terms tables organization rather confusing, so I would be inclined to let the set terms function be a pig. I’d be afraid to directly manipulate those tables because I don’t fully understand them. If you’re willing to figure out these tables, directly setting terms would save several more queries per post.

    If you’re concerned about hammering your server with this import process, you could set up the script to only do a few dozen posts at a time, saving where it’s at after each cycle. Then schedule it to run every so often. It may take all day to get through everything, but it will not impact the server much.

    BTW, regarding starting a new thread or adding on to an existing. More people will see your post if you start a new thread. But at least in this sub-forum, I pretty much read every new post, so I would see it either way. But I also prioritize existing threads first, resolved or not. I saw this post sooner being an existing thread. If you posted a new thread, it may have been a while before I got around to reading it, but I would most likely have read it eventually.

    Thread Starter Piotr

    (@p396)

    Thank you bcworkz!

    You gave me an idea. A few even. I’ll start from the end. šŸ˜‰

    Scheduling the import won’t help me. I checked my server limits, it has daily queries limits. But the limit is in millions, and by doing simple math (if 10 posts require 200 queries, 1 post = 20 queries) I can see that inserting my 900 posts this way would give me about 18 000 queries. I know that’s A LOT, but even if I wouldn’t care and updated all my posts every hour, I still would be quite safe. I do care though, and the least efficient way I’m considering right now is doing the big update once a day and updating 10 latest posts every hour. That would give me ~23 000 queries. Still a lot, but could be worse. And I prefer to be efficient, so…

    The direct approach. At the very beginning, when I did research on doing this import, I found out how to do this in MySQL – you can use good old INSERT INTO SELECT on both wp_posts and wp_term_relationships (I did it just five minutes ago to be sure if I remember correctly and it works). It’s a little complicated, but here’s how to do it.

    First you insert data into wp_posts – you have to remember you’re not using wp_insert_post anymore, so you need to take care of most of the columns by yourself (post_title, post_content, post_name, post_date, post_date_gmt, post_type, post_status). In my case I needed to use FROM_UNIXTIME for both dates and LCASE and REPLACE space with “-” for topic_name.

    Then the wp_term_relationships table – there are only three columns here (object_id, term_taxonomy_id, term_order). You insert your post ID in the first one, and in the term_taxonomy_id you insert the ID of your category or tag (which you’ll find in wp_terms table). The third column is set to 0 in my case, but it will probably change if I create categories inside categories, I’m not sure.

    And that’s it. My WordPress doesn’t complain after this, I can see the post in the loop, in category view and single post view, so I think it’s OK. But you may be right about using the wp_set_object_terms. If I would need to set a category and a few tags, this could be better. I see a problem though.

    wp_set_object_terms needs the post ID. If I’m not using wp_insert_post, I’m not getting the ID. Would it be OK to insert posts with my ID? Posts on my forum have a topic ID. It would be great if I could use it as post ID, later on I could even import topic replies as comments thanks to it. But will WordPress accept that? The ID is auto incremented, won’t there be any errors if I insert posts with IDs like 120 or 350?

    Oh, and one more question – warning, it’s kind of noobish. ;p As I said, before I did this in MySQL (INSERT INTO wordpress.wp_posts (all the columns) SELECT all the values FROM forumdb.topics and a few joins), how do I use wpdb here? $wpdb for inserting columns, new wpdb ($mydb in my example) for getting post data from the forum but how should the whole query look like?

    Thanks for your patience.

    Moderator bcworkz

    (@bcworkz)

    I think I can answer both the $wpdb and ID questions mostly all at once. The wpdb class gets instantiated by WP during initialization as the global object $wpdb. Once you declare global $wpdb; to insert a row use $wpdb->insert( $table, $data, $format ); Note in the reference that once the row is inserted you can get the inserted row’s ID with $wpdb->insert_id.

    I believe you can also assign your own ID without issue provided it does not already exist. If it does, the row insert should fail.

    The term tables are definitely easier if you are only dealing with existing terms for which you know the ID, in which case you are only assigning the term to a new object. In doing this I think you missed one detail. I believe after adding an object to term_relationships you must also increment the count value for the related term in term_taxonomy.

    It’s strange little details like that which make me reluctant to deal with term tables directly. But mainly when adding new terms. It seems the count is the only gotcha when assigning existing terms to objects, so direct manipulation is probably OK in this case.

    Thread Starter Piotr

    (@p396)

    It works! Thank you bcworkz, I would have missed the taxonomies and it would cause quite a few problems.

    One more question. Yeah, I know, no surprise here. šŸ˜‰ But this one is more interesting, I hope.

    I’m thinking if there is a way to include author’s name in the inserted post. I can’t insert it – wp_post stores only authors ID while my forum table holds the user name. What may make it easier is that I don’t need it to be a link – so no profile or listing posts by the author. Is there an efficient way to include the name in the post outside of the inserting process that I’m doing?

    Moderator bcworkz

    (@bcworkz)

    The only way to make the author ID thing work would be to add each author as a user. If all you really need is the author’s name and not the related user stuff, you could store the name in postmeta, then alter the templates to pull the author info from there instead of using the user ID to retrieve a name.

    It still would be a good idea to enter some user ID in each post row. It could be a special generic user ID that could act as a flag to get the author name from postmeta instead of user data. Then your custom author template tag could use this sort of logic (pseudo code):

    if ( $postdata->author_ID == $SPECIAL ) then
        echo get_postmeta('author_is_not_user');
    else
        echo get_userdata('display_name', $postdata->author_ID)

    Thread Starter Piotr

    (@p396)

    I’m back after some time. I hope you had great holidays.

    I need to thank you for all your advice, bcworkz. After a while I’ve got to a point where I’m building an actual plugin and it works as intended. Now I’m trying to make it use the settings API to dynamically change the sql query and it’s getting a little too complicated for my level of PHP and WordPress knowledge, so it’s time to ask more questions. šŸ˜‰ But I think it will be better if I start a new topic. Thanks again!

Viewing 12 replies - 1 through 12 (of 12 total)
  • The topic ‘Looping posts with content from different database’ is closed to new replies.