Title: Mark Tuttle's Replies | WordPress.org

---

# Mark Tuttle


## Forum Replies Created

Viewing 15 replies - 1 through 15 (of 20 total)


 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Preserving directory hierarchy](https://wordpress.org/support/topic/plugin-html-import-2-preserving-directory-hierary/)
 *  Thread Starter [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [13 years, 6 months ago](https://wordpress.org/support/topic/plugin-html-import-2-preserving-directory-hierary/#post-3063137)
 * I see. That is not my experience, but I understand your explanation. It would
   ordinarily be a very minor issue, except that I import large sites subtree by
   subtree over extended periods of time as I and other volunteers hack the archaic
   html to meet modern standards before importing. Perhaps the most elegant solution
   is to insert a line into the documentation (even into the configuration page?)
   mentioning this distinction so the next guy like me (if there ever is one) is
   not surprised.
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Don't want to import selected tag or H1 heading](https://wordpress.org/support/topic/plugin-html-import-2-dont-want-to-import-selected-tag-or-h1-heading/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [13 years, 7 months ago](https://wordpress.org/support/topic/plugin-html-import-2-dont-want-to-import-selected-tag-or-h1-heading/#post-2569539)
 * I did not appreciate the consequence of this problem until I tried importing 
   files this month using the tag ‘body’ to select the whole web page for importing.
   Now the body tag appears in post_content in the database, and now WordPress is
   generating invalid html with one ‘body’ tag nested inside the WordPress ‘body’
   tag.
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Preserving directory hierarchy](https://wordpress.org/support/topic/plugin-html-import-2-preserving-directory-hierary/)
 *  Thread Starter [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [13 years, 8 months ago](https://wordpress.org/support/topic/plugin-html-import-2-preserving-directory-hierary/#post-3063007)
 * Looking at method get_post($path,$placeholder) in class HTML_Import in html-importer.php
 *     ```
       // if we're doing hierarchicals and this is an index file of a
       // subdirectory, instead of importing this as a separate page, update
       // the content of the placeholder page we created for the directory
       if (is_post_type_hierarchical($options['type']) &&
           dirname($path) != $options['root_directory'] &&
           basename($path) == $options['index_file']) {
       	$post_id = array_search(dirname($path), $this->filearr);
       	if ($post_id !== 0)
       		$updatepost = true;
       }
   
       if ($updatepost) {
       	$my_post['ID'] = $post_id;
       	wp_update_post( $my_post );
       }
       else // insert new post
       	$post_id = wp_insert_post($my_post);
       ```
   
 * it seems that files in the root directory are not made children of the index 
   file in the root directory, because no placeholder post gets made for the root
   directory, so there is no existing post to update with wp_update_post. Am I reading
   this correctly? So by design the hierarchy for the root directory must be constructed
   manually?
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Retain Class Names Cleaning Up Bad HTML?](https://wordpress.org/support/topic/plugin-html-import-2-retain-class-names-cleaning-up-bad-html/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [13 years, 9 months ago](https://wordpress.org/support/topic/plugin-html-import-2-retain-class-names-cleaning-up-bad-html/#post-2835704)
 * I would try listing “class” as one of the allowed attributes under the “clean
   up html” section of the import tool.
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Redirects not showing permalink structure](https://wordpress.org/support/topic/plugin-html-import-2-redirects-not-showing-permalink-structure/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 1 month ago](https://wordpress.org/support/topic/plugin-html-import-2-redirects-not-showing-permalink-structure/#post-2672789)
 * What do you see when you enter [http://www.site.com/?page_id=435](http://www.site.com/?page_id=435)
   directly into your browser? Do you see the url rewritten to the permalink you
   are expecting?
 * This sounds like a problem with permalinks configuration to me and not this plugin.
   Have you looked at [http://codex.wordpress.org/Using_Permalinks](http://codex.wordpress.org/Using_Permalinks)?
   Is your web server configured as described there under “Permalink Types” (e.g.,
   Apache with mod_rewrite loaded)? Have you tried other permalink settings like
   “Month and name” or “Post name” in place of the “Custom Structure” you seem to
   be using, just for debugging purposes?
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Importing Static HTML Pages](https://wordpress.org/support/topic/plugin-html-import-2-importing-static-html-pages/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 2 months ago](https://wordpress.org/support/topic/plugin-html-import-2-importing-static-html-pages/#post-2251285)
 * If you are not already comfortable with php scripting, and since your target 
   is a single directory of files, it is probably faster and safer to bite the bullet
   and change the slugs manually one page at a time (dashboard -> pages -> all pages
   -> edit). A single evening of boring manual labor in front of the television will
   probably get the job done, and then it will be over.
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Broken in 3.3.1](https://wordpress.org/support/topic/plugin-html-import-2-broken-in-331/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 2 months ago](https://wordpress.org/support/topic/plugin-html-import-2-broken-in-331/#post-2491278)
 * I suspect you misspecified the html element containing the content of the pages
   under dashboard->settings->html import. See [another post](http://wordpress.org/support/topic/plugin-html-import-2-import-only-the-title?replies=5)
   where Stephanie addresses this in the context of the importer grabbing only the
   page titles.
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Are you planning to upgrade for WP 3.3.1 please!](https://wordpress.org/support/topic/plugin-html-import-2-are-you-planning-to-upgrade-for-wp-331-please/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 2 months ago](https://wordpress.org/support/topic/plugin-html-import-2-are-you-planning-to-upgrade-for-wp-331-please/#post-2485289)
 * I’ve used the plugin with 3.3.1 (I saw a [schedule](http://wpdevel.wordpress.com/version-3-4-project-schedule/)
   suggesting 3.4 is coming out in April). What didn’t work?
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Don't want to import selected tag or H1 heading](https://wordpress.org/support/topic/plugin-html-import-2-dont-want-to-import-selected-tag-or-h1-heading/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 2 months ago](https://wordpress.org/support/topic/plugin-html-import-2-dont-want-to-import-selected-tag-or-h1-heading/#post-2569456)
 * I’m not the plugin author, but
 * 1. I think selecting the entire node `<div id="content">…</div>` is a reasonable
   design decision. Can you say why this is a problem? Do you now have two document
   nodes with the same id “content”? Is this as simple as modifying your theme to
   omit the extra `<div id="content"> </div>` wrapper?
 * 2. For the duplicate title, I suspect this is not a problem with the plugin.
   I suspect the problem is that your static pages (as mine did) contain both
   `<title>title string</title>` and `<h1>title string</h1>`, and most WordPress themes repeat
   the title at the top of the body with a line like
 *     ```
       <h1 class="entry-title"><?php the_title(); ?></h1>
       ```
   
 * So one quick solution is just to delete this line from the theme files. It is
   also possible to write a small script to iterate over the pages in the database
   to strip the initial `<h1>…</h1>` element from `$page['post_content']` for each
   `$page` in the database.
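 * A minimal sketch of that stripping step, assuming the duplicate heading is the
   first element of the content. The name strip_leading_h1 is hypothetical, and
   writing the result back to the database (for example with wp_update_post) is
   left out.
 *     ```php
       <?php
       // Hypothetical helper: remove one <h1>...</h1> element from the very
       // start of a page's content and leave everything after it untouched.
       function strip_leading_h1($content) {
         // \A anchors at the start; "s" lets . span newlines, "i" ignores case.
         return preg_replace('@\A\s*<h1(?:>|\s[^>]*>).*?</h1\s*>\s*@is', '', $content, 1);
       }

       echo strip_leading_h1("<h1>Title</h1>\n<p>Body</p>");  // prints: <p>Body</p>
       ```
   
 * Content that does not begin with an h1 element is returned unchanged, so running
   this over every page should be harmless for pages that were already cleaned.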
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Importing Static HTML Pages](https://wordpress.org/support/topic/plugin-html-import-2-importing-static-html-pages/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 2 months ago](https://wordpress.org/support/topic/plugin-html-import-2-importing-static-html-pages/#post-2251281)
 * I’m not the plugin author, but a [subsequent poster asked a similar question](http://wordpress.org/support/topic/plugin-html-import-2-how-do-i-keep-the-existing-slugs?replies=2).
 * I have manually patched slugs to match filenames by computing the mapping $id
   -> $filename from post id to filename, and then writing a script to essentially
 *     ```
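       // Fetch the page as an array; by default WordPress returns an object here.
       $post = get_post($id, ARRAY_A);
       $post['post_name'] = $filename;  // post_name holds the slug
       wp_update_post($post);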
       ```
   
 * By the way, your request is reasonable, but I’m not certain it is always possible.
   There seems to be some requirement that slugs are unique (although I’ve gotten
   away with slugs that are not unique for reasons I can’t explain), so if you have
   two files with the same name FILENAME you might end up with posts having slugs
   FILENAME and FILENAME-2.
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] How do I keep the existing slugs?](https://wordpress.org/support/topic/plugin-html-import-2-how-do-i-keep-the-existing-slugs/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 2 months ago](https://wordpress.org/support/topic/plugin-html-import-2-how-do-i-keep-the-existing-slugs/#post-2589382)
 * I’m not the plugin author, but I believe the HTML_Import class in html-importer.php
   defines a method get_post that reads the file and creates the page. This
   class builds up a WordPress post object in the array $my_post, without explicitly
   specifying the slug, and then inserts the post into the database with the lines
 *     ```
       if ($updatepost) {
         $my_post['ID'] = $post_id;
         wp_update_post( $my_post );
       }
       else // insert new post
         $post_id = wp_insert_post($my_post);
       ```
   
 * I believe these functions wp_update_post and wp_insert_post generate the slug
   from the title when the slug is not specified. So if you know how to compute 
   the slug you want to use, then setting the slug to $slug should be as easy as
   adding
 *     ```
       $my_post['post_name'] = $slug;
       ```
   
 * before the post insertion code.
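 * For the “compute the slug” part, one possibility is to derive it from the file
   name. This is only a sketch: slug_from_path is a hypothetical helper, and it
   assumes the slug should be the lowercased base name without its extension. Inside
   WordPress, sanitize_title would normally do this kind of cleanup.
 *     ```php
       <?php
       // Hypothetical helper: turn an imported file's path into a slug candidate.
       function slug_from_path($path) {
         $name = pathinfo($path, PATHINFO_FILENAME);        // base name without extension
         $name = strtolower($name);
         $name = preg_replace('@[^a-z0-9]+@', '-', $name);  // other characters become "-"
         return trim($name, '-');
       }

       echo slug_from_path('/old-site/docs/About Us.html');  // prints: about-us
       ```
   
 * The result could then be assigned to $slug before the insertion code shown above.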
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Fopen mode not specified](https://wordpress.org/support/topic/plugin-html-import-2-fopen-mode-not-specified/)
 *  [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 2 months ago](https://wordpress.org/support/topic/plugin-html-import-2-fopen-mode-not-specified/#post-2399125)
 * I’m not the plugin author, but the only instance of fopen that I can find in 
   the latest release is in html-importer.php:
 *     ```
       $contents = @fopen($path);  // read entire file
       if (empty($contents))
         $contents = @file_get_contents($path);
       ```
   
 * I think the fopen fails, the at-sign suppresses the error message, and the file_get_contents
   actually reads the file.
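 * If that reading is right, the fopen call could simply be dropped. A sketch of
   the read on its own, not the plugin's code: read_whole_file is a hypothetical
   name. As far as I know, PHP 8 and later throw an ArgumentCountError for an fopen
   call with the mode missing, and the at-sign does not suppress that, so the fallback
   would no longer be reached there.
 *     ```php
       <?php
       // Hypothetical helper: read an entire file, returning '' when the
       // path is missing, not a regular file, or unreadable.
       function read_whole_file($path) {
         if (!is_file($path) || !is_readable($path)) return '';
         $contents = file_get_contents($path);
         return ($contents === false) ? '' : $contents;
       }
       ```
   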
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Importing preformatted text (pre tag)](https://wordpress.org/support/topic/plugin-html-import-2-importing-text/)
 *  Thread Starter [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 3 months ago](https://wordpress.org/support/topic/plugin-html-import-2-importing-text/#post-2550173)
 * To strip the cdata, script, and style blocks, I think it is sufficient to add
   the functions
 *     ```
       function allowed_tag($tag,$allowedtags=NULL) {
         return
           !is_null($allowedtags) &&
           stripos($allowedtags,$tag) !== false;
       }
   
       function strip_cdata_block($string,$allowedtags=NULL) {
         if ($this->allowed_tag('<cdata>',$allowedtags)) return $string;
   
         $delim = "@";
         $cdata_start = preg_quote('<![CDATA[',$delim);
         $cdata_end = preg_quote(']]>',$delim);
         $block = "$cdata_start.*?$cdata_end";
   
         return preg_replace("{$delim}{$block}{$delim}s","",$string);
       }
   
       function strip_tag_block($tag,$string,$allowedtags=NULL) {
         if ($this->allowed_tag($tag,$allowedtags)) return $string;
         if (!preg_match(":<(.*?)>:",$tag,$match)) return $string;
   
         $delim = "@";
         $tag_str = $match[1];
         $tag_start = "<$tag_str(?:>|\\s[^>]*>)";
         $tag_end   = "</$tag_str(?:>|\\s[^>]*>)";
         $block = "$tag_start.*?$tag_end";
   
         return preg_replace("{$delim}{$block}{$delim}is","",$string);
       }
   
       function strip_comment_block($string) {
         $delim = "@";
         $comment_start = preg_quote('<!--',$delim);
         $comment_end = preg_quote('-->',$delim);
         $block = "$comment_start.*?$comment_end";
   
         return preg_replace("{$delim}{$block}{$delim}s","",$string);
       }
       ```
   
 * and add the following calls before strip_tags at the head of clean_html:
 *     ```
       $string = $this->strip_cdata_block($string,$allowtags);
       $string = $this->strip_tag_block('<script>',$string,$allowtags);
       $string = $this->strip_tag_block('<style>',$string,$allowtags);
       $string = $this->strip_comment_block($string);
       ```
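 * The pattern can be tried outside the plugin. A sketch under assumptions: strip_block
   is a hypothetical, simplified standalone variant of strip_tag_block that takes
   a bare tag name and skips the allowed-tags check, applied here to a shortened
   version of the ad markup.
 *     ```php
       <?php
       // Simplified standalone variant of strip_tag_block (hypothetical name):
       // removes every <tag ...>...</tag> block, contents included.
       function strip_block($tag_str, $string) {
         $tag_start = "<$tag_str(?:>|\\s[^>]*>)";
         $tag_end   = "</$tag_str(?:>|\\s[^>]*>)";
         // "i" ignores case, "s" lets .*? span the newlines inside the block.
         return preg_replace("@$tag_start.*?$tag_end@is", "", $string);
       }

       $html = "<div><script type=\"text/javascript\">\n//<![CDATA[\nx = 1;\n//]]>\n</script><p>Text</p></div>";
       echo strip_block('script', $html);  // prints: <div><p>Text</p></div>
       ```
   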
   
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[HTML Import 2] [Plugin: HTML Import 2] Importing preformatted text (pre tag)](https://wordpress.org/support/topic/plugin-html-import-2-importing-text/)
 *  Thread Starter [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 3 months ago](https://wordpress.org/support/topic/plugin-html-import-2-importing-text/#post-2550138)
 * I propose adding to the HTML_Import class defined in html-importer.php the function
 *     ```
       function strip_insignificant_html_whitespace($string) {
         $pre_start = "<pre(?:>|\\s[^>]*>)";
         $pre_end   = "</pre(?:>|\\s[^>]*>)";
   
         $old_parts = preg_split(";($pre_start|$pre_end);i",$string,0,PREG_SPLIT_DELIM_CAPTURE);
         $new_parts = array();
   
         $strip = true;
         foreach ($old_parts as $part) {
           if (preg_match(";$pre_start;i",$part)) {
             $tmp = preg_replace(";\s+;"," ",$part);
             $new_parts[] = preg_replace("; +>;",">",$tmp);
             $strip = false;
             continue;
           }
           if (preg_match(";$pre_end;i",$part)) {
             $tmp = preg_replace(";\s+;"," ",$part);
             $new_parts[] = preg_replace("; +>;",">",$tmp);
             $strip = true;
             continue;
           }
           if ($strip)
             $new_parts[] = preg_replace(";\s+;"," ",$part);
           else
             $new_parts[] = $part;
         }
         return implode("",$new_parts);
       }
       ```
   
 * In clean_html
 *     ```
       replace
         $string = str_replace( '\n', ' ', $string );
       with
         $string = $this->strip_insignificant_html_whitespace($string);
       ```
   
 * In get_post, in the branch guarded by `!empty($my_post['post_content'])`
 *     ```
       replace
         $my_post['post_content'] = ereg_replace("[\n\r]", " ", $my_post['post_content']);
       with
         $my_post['post_content'] = $this->strip_insignificant_html_whitespace($my_post['post_content']);
       ```
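 * To see the intended effect on a small input, here is a more compact sketch of
   the same idea. It is not the function proposed above: collapse_outside_pre is
   a hypothetical name, and it treats each whole pre block as a single delimiter
   instead of tracking start and end tags separately.
 *     ```php
       <?php
       // Hypothetical compact variant: collapse runs of whitespace everywhere
       // except inside <pre>...</pre> blocks.
       function collapse_outside_pre($html) {
         // With one capture group, odd-numbered parts are the <pre> blocks.
         $parts = preg_split('@(<pre(?:>|\s[^>]*>).*?</pre\s*>)@is', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
         foreach ($parts as $i => $part) {
           if ($i % 2 == 0) $parts[$i] = preg_replace('@\s+@', ' ', $part);
         }
         return implode('', $parts);
       }

       echo collapse_outside_pre("<p>a\n  b</p><pre>x\n  y</pre>");
       ```
   
 * The line break inside the paragraph becomes a single space, while the text inside
   the pre element keeps its original line break and indentation.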
   
 * It would be nice also to strip the contents of cdata blocks and `<script>..</script>`
   blocks cleanly. I find examples like
 *     ```
       <div id="googleAds">
         <!-- b e g i n   g o o g l e  a d s  -->
         <script type="text/javascript">
           //<![CDATA[
           <!--
           google_ad_client = "...";
           google_ad_slot = "...";
           google_ad_width = ...;
           google_ad_height = ...;
           //-->
           //]]>
         </script>
         <script type="text/javascript" src="/data/../pagead2.googlesyndication.com/pagead/show_ads.js">
         </script> <!-- e n d   g o o g l e  a d s  -->
       </div>
       ```
   
 * that are not stripped cleanly by the application of the php strip_tags function
   in the plugin.
 *   Forum: [Plugins](https://wordpress.org/support/forum/plugins-and-hacks/)
    In
   reply to: [[Advanced Editor Tools] WordPress 3.3.1 and TinyMCE Advanced 3.4.5 not working](https://wordpress.org/support/topic/wordpress-331-and-tinymce-advanced-345-not-working/)
 *  Thread Starter [Mark Tuttle](https://wordpress.org/support/users/markrtuttle/)
 * (@markrtuttle)
 * [14 years, 4 months ago](https://wordpress.org/support/topic/wordpress-331-and-tinymce-advanced-345-not-working/#post-2490499)
 * What I had missed (forgotten) was that the installation of the TinyMCE Advanced
   plugin adds a configuration page to the dashboard. Go to Dashboard -> Settings
   -> TinyMCE Advanced. There I have the ability to drag and drop the new editing
   buttons I want into the rows of editing buttons in the visual editor. I saved
   that, and the buttons now appear in the visual editor. I had thought they would
   appear in the visual editor by default after installation. I had forgotten that
   I have to choose the buttons I want.
