• Resolved Plugin Author Workshopshed

    (@workshopshed)


    I did a test import yesterday and it added an extra “>” onto the title and another onto the beginning of the text.

    Importer version 0.3

Viewing 14 replies - 1 through 14 (of 14 total)
  • Plugin Author Workshopshed

    (@workshopshed)

    I’ve done some experiments with the code. On my home system, which is PHP 5.2.2, the parsing works correctly. On my hosted system, which is PHP 5.3.2, I get the extra “>”.

    My experimental code is as follows.

    $entry = file_get_contents('Test.xml');
    
    $entry = "<feed>$entry</feed>";
    $AtomParser = new AtomParser();
    $AtomParser->parse( $entry );
    
    echo $AtomParser->entry->title;

    The xml is based on a blogger feed and an example is shown below

    <entry><id>tag:blogger.com,1999:blog-417730729915399755.post-8397846992898424746</id><published>2011-03-30T12:26:00.001-07:00</published><updated>2011-03-30T12:26:57.023-07:00</updated><title type='text'>Test post published</title></entry>

    Plugin Author Workshopshed

    (@workshopshed)

    I upgraded my local version to PHP 5.3.6 and reproduced the problem, so there is definitely something different about how the parser works.

    I’ve got a partial fix. It’s working OK for the title and content, but some work is still needed to get the authors across properly: they come across as “anonymous” for the comments and as “>” for the posts.

    The fix for the title and the posts is to modify the AtomParser end_element function as follows:

    function end_element($parser, $name) {
    
    		// split() is deprecated as of PHP 5.3; explode() does the same job here.
    		$parts = explode(":", $name);
    		$tag = array_pop($parts);
    
    		if (!empty($this->in_content)) {
    
    			if ($this->in_content[0][0] == $tag &&
    			$this->in_content[0][1] == $this->depth) {
    				array_shift($this->in_content);
    				if ($this->is_xhtml) {
    					$this->in_content = array_slice($this->in_content, 2, count($this->in_content)-3);
    				}
    
    				#AGC Mods to handle PHP 5.3 changes: drop the stray leading ">"
    				if (count($this->in_content) > 1) {
    					if ($this->in_content[0] === '>') {
    						$this->in_content = array_slice($this->in_content, 1, count($this->in_content)-1);
    					}
    				}
    				#echo "<br />content<br />";
    				#var_dump ($this->in_content);
    				#echo "<br />";
    
    				$this->entry->$tag = join('',$this->in_content);
    				$this->in_content = array();
    			} else {
    				$endtag = $this->ns_to_prefix($name);
    				if (strpos($this->in_content[count($this->in_content)-1], '<' . $endtag) !== false) {
    					array_push($this->in_content, "/>");
    				} else {
    					array_push($this->in_content, "</$endtag>");
    				}
    
    			}
    		}
    
    		array_shift($this->ns_contexts);
    
    		#print str_repeat(" ", $this->depth * $this->indent) . "end_element('$name')" ."\n";
    
    		$this->depth--;
    	}

    Plugin Author Workshopshed

    (@workshopshed)

    My partial solution above was just a hack; at the time I did not really understand what was happening in the class. It’s a SAX-like XML parser, and the actual problem is in the cdata function, which passes an array to strpos where a string is expected. That behaves differently in PHP 5.3.
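
    A minimal sketch of why the version difference matters. The two return values are hard-coded assumptions rather than live calls, so it runs on any PHP version: as far as I can tell, PHP 5.2 casts the array to the string "Array" before searching, while PHP 5.3 rejects the array and returns NULL with a warning. The variable names are only for illustration.

```php
<?php
// Assumed results of strpos( $some_array, '<' ) on each version (not live calls).
$php52 = strpos( 'Array', '<' ); // 5.2: array is cast to the string "Array", no "<" found
$php53 = null;                   // 5.3: array is rejected, strpos() returns null

// The parser tests the result with "!== false" before pushing ">".
var_dump( $php52 !== false ); // bool(false): nothing is pushed
var_dump( $php53 !== false ); // bool(true): a stray ">" is pushed
```

    Because null is not identical to false, the "!== false" test passes on 5.3 and the parser behaves as if it had found an unclosed tag.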

    The correct solution is not to change end_element but to add an extra “if” to the cdata function, as follows:

    function cdata($parser, $data) {
    		#print str_repeat(" ", $this->depth * $this->indent) . "data: #" . $data . "#\n";
    		if (!empty($this->in_content)) {
    			// handle self-closing tags (case: text node found, need to close element started)
    			// AGC:Fix 2011-04-08 Error with StrPos expects first parameter to be a string not an array
    			if (count($this->in_content) > 1) {
    				if (strpos($this->in_content[count($this->in_content)-1], '<') !== false) {
    					array_push($this->in_content, ">");
    				}
    			}
    			array_push($this->in_content, $this->xml_escape($data));
    		}
    	}
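
    To see the effect of the guard in isolation, here is a rough simulation. push_text() is a hypothetical stand-in for the cdata logic, not the importer's code, and $strpos_on_array fakes what strpos() is assumed to return for the array marker on PHP 5.3 (null), so the sketch runs on any PHP version.

```php
<?php
// Hypothetical stand-in for the cdata() logic; expects a non-empty $buffer.
// $strpos_on_array is the assumed strpos() result for an array haystack.
function push_text( array $buffer, $data, $guard, $strpos_on_array ) {
    $last = $buffer[ count( $buffer ) - 1 ];
    // With the guard on, skip the check while only the marker is in the buffer.
    if ( ! $guard || count( $buffer ) > 1 ) {
        $found = is_array( $last ) ? $strpos_on_array : strpos( $last, '<' );
        if ( $found !== false ) {
            $buffer[] = '>';
        }
    }
    $buffer[] = $data;
    return $buffer;
}

// The marker holds the tag name and depth, which is what end_element compares against.
$marker    = array( 'title', 1 );
$unguarded = push_text( array( $marker ), 'Test post published', false, null );
$guarded   = push_text( array( $marker ), 'Test post published', true, null );

// end_element shifts the marker off before joining the pieces.
array_shift( $unguarded );
array_shift( $guarded );

echo join( '', $unguarded ), "\n"; // >Test post published
echo join( '', $guarded ), "\n";   // Test post published
```

    The count check works because the first text node always arrives while the buffer holds only the marker array, which is the one case where strpos would be handed an array.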

    cornerback97

    (@cornerback97)

    @workshopshed

    Thanks so much for your fix! Seeing that it came 7 hours ago couldn’t have been better timing. This error could have been costly for me and my client in the process of switching their blogs, but your fix has helped us tremendously.

    Thanks again!

    Plugin Author Workshopshed

    (@workshopshed)

    Glad to be of help. I don’t know what needs to be done to get the problem and a solution back into the dev stream, but I’m guessing people from WordPress are monitoring the forums and will use this analysis to roll a fix into the Blogger Importer releases.

    diniscorreia

    (@diniscorreia)

    Oh my, I can’t believe you came up with a solution! Thanks a lot!

    I struggled with the issue in the past, and my workaround was to just run a query on the database (now that’s a hack).

    This fix should make it into the plugin – I’ll ask around and see if there’s some place to get it submitted.

    Again, thanks.

    …I’ll ask around and see if there’s some place to get it submitted.

    I thought you’d never ask:
    http://plugins.trac.wordpress.org/

    Plugin Author Workshopshed

    (@workshopshed)

    With the state that Blogger is in at the moment, I can see that there might be a lot of demand for this plugin in the very near future.

    Plugin Author Workshopshed

    (@workshopshed)

    Otto seems to have a handle on this.

    “new branch will fix the problem.”

    http://core.trac.wordpress.org/ticket/14525

    I’ve found the branch here.

    http://plugins.svn.wordpress.org/blogger-importer/branches/oauth/

    I’m testing it with a mini blog I put together.

    1 published post with a picture and two comments, one from a blogger user, one anonymous
    1 draft post with several categories
    1 scheduled post
    1 published post with really long title
    1 published post with a short title.

    The authentication works OK and my 5 test posts, including the scheduled and draft ones, import OK, but only one of my 2 comments is loaded; the one from the Blogger user did not load.

    http://minitemp.blogspot.com/feeds/comments/default

    Plugin Contributor Samuel Wood (Otto)

    (@otto42)

    WordPress.org Admin

    The new branch is still being actively worked on, so I don’t expect it to be all perfect yet. However, you have to admit that it does fix the > problem. 🙂

    Plugin Author Workshopshed

    (@workshopshed)

    Otto, just spotted one more thing. The draft post was imported as published in that OAuth version I tried the other day. And yes, it fixes the “>” issue.

    Plugin Author Workshopshed

    (@workshopshed)

    The issue is now “Milestone Awaiting Review”. Is it the issue under review or the fix?

    I would happily volunteer to test if I knew where the code was.

    You can download the patch using the Original Format link at the bottom of this page:
    http://core.trac.wordpress.org/attachment/ticket/14525/14525.diff

    I tested it with 3.2.1 and commented here:
    http://core.trac.wordpress.org/ticket/14525#comment:5

  • The topic ‘[Plugin: Blogger Importer] Extra > s’ is closed to new replies.