WP Web Scraper
Scraping html is just showing text (2 posts)

  1. ppfearn
    Posted 3 years ago #

    I'm trying to display the HTML of what I'm scraping, but no matter what I try the plugin either displays only the text version or displays the entire page as HTML.
    I tried using shortcodes, but the closing tags in my XPath were causing problems. I then installed a PHP plugin to allow live PHP in my pages, and I'm still seeing the same problem.

    I can't see much support for HTML output. Have I got the usage wrong? I'm expecting the plugin to place the scraped HTML into my document as-is. For example, I am scraping a table and all I see coming back is the text of its cells. I was expecting the table markup to come back so that it would be displayed as a table.
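
    To make the text-versus-markup distinction concrete, here is a minimal sketch using PHP's built-in DOM extension. It is not the plugin's own code and says nothing about how WP Web Scraper works internally. The sample markup and the id "demo.form" are hypothetical stand-ins, since the real page structure isn't shown here.

```php
<?php
// Hypothetical sample markup standing in for the scraped page.
$html = <<<HTML
<html><body>
<form id="demo.form">
  <table><tr><td>First table</td></tr></table>
  <table><tr><th>Team</th><th>Pts</th></tr><tr><td>Example FC</td><td>12</td></tr></table>
</form>
</body></html>
HTML;

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Same shape as the XPath in this post: the second <table> child
// of the element with a given id.
$table = $xpath->query('//*[@id="demo.form"]/table[2]')->item(0);

// Text only: tags are dropped and the cell text is concatenated.
$asText = $table->textContent;

// Markup: serialize the matched node itself, tags included.
$asHtml = $doc->saveHTML($table);

echo "TEXT: ", trim($asText), "\n";
echo "HTML: ", $asHtml, "\n";
```

    The first line of output is what I'm getting from the plugin; the second is the kind of result I was expecting from output set to html.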

    Upon checking the source I can see this coming back:

     Start of web scrap (created by wp-web-scraper)
     Source URL: http://full-time.thefa.com/DisplayTeam.do?teamID=1769059&divisionseason=616234725
     Xpath: //*[@id="common.ui.team.displayteam.DisplayTeamForm"]/table[2]
     Delivered thru: Cache
     WPWS options: Array
        [postargs] =>
        [cache] => 60
        [user_agent] => WPWS bot (http://hartshead.tk)
        [timeout] => 2
        [on_error] => error_hide
        [output] => html
        [clear_regex] =>
        [clear_selector] =>
        [replace_regex] =>
        [replace_selector] =>
        [replace_with] =>
        [replace_selector_with] =>
        [basehref] =>
        [striptags] =>
        [removetags] =>
        [callback] =>
        [debug] => 1
        [htmldecode] =>

    This makes me think that I've got the right option set to output HTML, but that I may have the usage wrong or may not understand what the plugin is doing.

    I have tried the following:
    <?php echo wpws_get_content('http://full-time.thefa.com/DisplayTeam.do?teamID=1769059&divisionseason=616234725', '', '//*[@id="common.ui.team.displayteam.DisplayTeamForm"]/table[2]', '', '', '', '', 'html', '', 'ppfearn', '', '')?>

    I also tried 'output=html' and 'output="html"', with the same results.
    I'm a bit stuck at this point, so any advice would be great.


  2. uhi888
    Posted 3 years ago #

    I've got the same problem. It would be great to get a solution. Thanks in advance.

Topic Closed

This topic has been closed to new replies.
