WordPress.org

Blogger Importer
[resolved] Extra > s (15 posts)

  1. Workshopshed
    Member
    Plugin Author

    Posted 3 years ago #

    I did a test import yesterday and it added an extra ">" to the title and another to the beginning of the post text.

    Importer version 0.3

  2. Workshopshed
    Member
    Plugin Author

    Posted 3 years ago #

    I've done some experiments with the code. On my home system, which runs PHP 5.2.2, the parsing works correctly. On my hosted system, which runs PHP 5.3.2, I get the extra ">".

    My experimental code is as follows.

    $entry = file_get_contents('Test.xml');
    
    $entry = "<feed>$entry</feed>";
    $AtomParser = new AtomParser();
    $AtomParser->parse( $entry );
    
    echo $AtomParser->entry->title;

    The XML is based on a Blogger feed; an example entry is shown below.

    <entry><id>tag:blogger.com,1999:blog-417730729915399755.post-8397846992898424746</id><published>2011-03-30T12:26:00.001-07:00</published><updated>2011-03-30T12:26:57.023-07:00</updated><title type='text'>Test post published</title></entry>
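
    For comparison, the expected title can be checked without AtomParser by using PHP's built-in SAX-style XML functions. This is only a minimal sketch: the sample entry is trimmed to its title element, and the variable names are illustrative rather than taken from the plugin.

```php
<?php
// Trimmed copy of the sample entry, wrapped in <feed> as in the experiment.
$xml = "<feed><entry><title type='text'>Test post published</title></entry></feed>";

$title = '';
$in_title = false;

$parser = xml_parser_create();
// Keep element names in their original case instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    // Start handler: note when we enter the <title> element.
    function ($p, $name, $attrs) use (&$in_title) {
        if ($name === 'title') {
            $in_title = true;
        }
    },
    // End handler: note when we leave it again.
    function ($p, $name) use (&$in_title) {
        if ($name === 'title') {
            $in_title = false;
        }
    }
);

// Character data may arrive in several chunks, so append rather than assign.
xml_set_character_data_handler($parser, function ($p, $data) use (&$title, &$in_title) {
    if ($in_title) {
        $title .= $data;
    }
});

xml_parse($parser, $xml, true);

echo $title, "\n";
```

    If the title printed by AtomParser differs from this one only by a leading ">", the extra character is being added by the parser class rather than by the feed.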

  3. Workshopshed
    Member
    Plugin Author

    Posted 3 years ago #

    I upgraded my local version to PHP 5.3.6 and reproduced the problem; there is definitely something different about how the parser works.

    I've got a partial fix. It works for the title and content, but some work is still needed to bring the authors across properly: they come through as "anonymous" for the comments and with a ">" for the posts.

    The fix for the title and the posts is to modify the AtomParser end_element function as follows:

    function end_element($parser, $name) {
    
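    		// split() is deprecated as of PHP 5.3, and array_pop() needs a variable rather than an expression
    		$parts = explode(":", $name);
    		$tag = array_pop($parts);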
    
    		if (!empty($this->in_content)) {
    
    			if ($this->in_content[0][0] == $tag &&
    			$this->in_content[0][1] == $this->depth) {
    				array_shift($this->in_content);
    				if ($this->is_xhtml) {
    					$this->in_content = array_slice($this->in_content, 2, count($this->in_content)-3);
    				}
    
    				#AGC Mods to handle PHP 5.3 changes: drop the stray leading ">" pushed by cdata()
    				if (count($this->in_content) > 1) {
    					if ($this->in_content[0] === '>') {
    						$this->in_content = array_slice($this->in_content, 1, count($this->in_content)-1);
    					}
    				}
    				#echo "<br />content<br />";
    				#var_dump ($this->in_content);
    				#echo "<br />";
    
    				$this->entry->$tag = join('',$this->in_content);
    				$this->in_content = array();
    			} else {
    				$endtag = $this->ns_to_prefix($name);
    				if (strpos($this->in_content[count($this->in_content)-1], '<' . $endtag) !== false) {
    					array_push($this->in_content, "/>");
    				} else {
    					array_push($this->in_content, "</$endtag>");
    				}
    
    			}
    		}
    
    		array_shift($this->ns_contexts);
    
    		#print str_repeat(" ", $this->depth * $this->indent) . "end_element('$name')" ."\n";
    
    		$this->depth--;
    	}

  4. Workshopshed
    Member
    Plugin Author

    Posted 3 years ago #

    My partial solution above was just a hack; at the time I did not really understand what was happening in the class. It's a SAX-like XML parser, and the actual problem is in the cdata function, which passes an array to strpos as if it were a string. That call behaves differently in PHP 5.3.

    The correct solution is not to change end_element but to add an extra "if" to the cdata function, as follows:

    function cdata($parser, $data) {
    		#print str_repeat(" ", $this->depth * $this->indent) . "data: #" . $data . "#\n";
    		if (!empty($this->in_content)) {
    			// handle self-closing tags (case: text node found, need to close element started)
    			// AGC:Fix 2011-04-08 strpos() expects its first parameter to be a string, not an array
    			if (count($this->in_content) > 1) {
    				if (strpos($this->in_content[count($this->in_content)-1], '<') !== false) {
    					array_push($this->in_content, ">");
    				}
    			}
    			array_push($this->in_content, $this->xml_escape($data));
    		}
    	}
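
    The guard can also be looked at in isolation. The following is a minimal sketch, not the plugin's code: append_text is a hypothetical helper, the buffer layout (a [tag, depth] array first, string fragments after it) is inferred from the end_element code quoted earlier in this thread, and htmlspecialchars stands in for the parser's xml_escape method.

```php
<?php
// Hypothetical stand-in for the parser's buffer handling: element 0 is the
// [tag, depth] marker array, later elements are string fragments.
function append_text(array $in_content, $data) {
    $last = end($in_content);
    // Only inspect the last element when it is a string fragment;
    // the marker array must never be handed to strpos().
    if (is_string($last) && strpos($last, '<') !== false) {
        $in_content[] = '>';
    }
    // htmlspecialchars() is an assumption standing in for xml_escape().
    $in_content[] = htmlspecialchars($data);
    return $in_content;
}

// Buffer as it looks right after <title> opens: just the marker array.
$buffer = array(array('title', 2));
$buffer = append_text($buffer, 'Test post published');

// Drop the marker, as end_element does, and join the remaining fragments.
array_shift($buffer);
echo join('', $buffer), "\n";
```

    Checking is_string on the last element has the same effect here as the count check in the fix above, because only the first element of the buffer is an array.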

  5. cornerback97
    Member
    Posted 3 years ago #

    @Workshopshed

    Thanks so much for your fix! Seeing that it came 7 hours ago couldn't have been better timing. This error could have been costly for me and my client in the process of switching their blogs, but your fix has helped us tremendously.

    Thanks again!

  6. Workshopshed
    Member
    Plugin Author

    Posted 3 years ago #

    Glad to be of help. I don't know what needs to be done to get the problem and a solution back into the dev stream, but I'm guessing people from WordPress are monitoring the forums and will use this analysis to roll a fix into the Blogger Importer releases.

  7. diniscorreia
    Member
    Posted 3 years ago #

    Oh my, I can't believe you came up with a solution! Thanks a lot!

    I struggled with the issue in the past, and my workaround was to just run a query on the database (now that's a hack).

    This fix should make it into the plugin. I'll ask around and see if there's some place to get it submitted.

    Again, thanks.

  8. Ze Fontainhas
    Moderator
    Posted 2 years ago #

    ...I'll ask around and see if there's some place to get it submitted.

    I thought you'd never ask:
    http://plugins.trac.wordpress.org/

  9. Workshopshed
    Member
    Plugin Author

    Posted 2 years ago #

    With the state that Blogger is in at the moment, I can see that there might be a lot of demand for this plugin in the very near future.

  10. Workshopshed
    Member
    Plugin Author

    Posted 2 years ago #

  11. Workshopshed
    Member
    Plugin Author

    Posted 2 years ago #

    Otto seems to have a handle on this.

    "new branch will fix the problem."

    http://core.trac.wordpress.org/ticket/14525

    I've found the branch here.

    http://plugins.svn.wordpress.org/blogger-importer/branches/oauth/

    I'm testing it with a mini blog I put together.

    1 published post with a picture and two comments, one from a blogger user, one anonymous
    1 draft post with several categories
    1 scheduled post
    1 published post with really long title
    1 published post with a short title.

    The authentication works OK, and my 5 test posts, including the scheduled and draft ones, import OK. However, only one of my 2 comments is loaded; the one from the Blogger user did not load.

    http://minitemp.blogspot.com/feeds/comments/default

  12. Samuel Wood (Otto)
    Tech Ninja
    Plugin Author

    Posted 2 years ago #

    The new branch is still being actively worked on, so I don't expect it to be all perfect yet. However, you have to admit that it does fix the > problem. :)

  13. Workshopshed
    Member
    Plugin Author

    Posted 2 years ago #

    Otto, I just spotted one more thing: the draft post was imported as published in the OAuth version I tried the other day. And yes, it fixes the ">" issue.

  14. Workshopshed
    Member
    Plugin Author

    Posted 2 years ago #

    The issue is now "Milestone Awaiting Review". Is it the issue under review or the fix?

    I'd happily volunteer to test if I knew where the code was.

  15. You can download the patch using the Original Format link at the bottom of this page:
    http://core.trac.wordpress.org/attachment/ticket/14525/14525.diff

    I tested it with 3.2.1 and commented here:
    http://core.trac.wordpress.org/ticket/14525#comment:5

Topic Closed

This topic has been closed to new replies.
