WordPress.org

WP Web Scraper
Scraping html is just showing text (2 posts)

  1. ppfearn
    Member
    Posted 1 year ago #

    Hi,
    I'm trying to display the HTML version of what I'm scraping, but no matter what I try it either displays the text version or displays the entire page as HTML.
    I tried using shortcodes, but the closing tags in my XPath were causing problems. I then tried installing a PHP plugin to allow live PHP in my pages, and I'm still seeing the same problem.

    I can't see much support for HTML output. Have I got the usage wrong? I'm expecting it to just place the HTML into my document. For example, I am scraping a table and all I see coming back is the text. I was expecting the table's HTML to come back and then be displayed as a table.
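
    To make that expectation concrete, here is a minimal sketch of the difference between a node's text and its markup. It uses PHP's built-in DOM extension rather than the plugin, so it says nothing about how wp-web-scraper works internally; the sample markup, the "team" id and the variable names are made up for illustration.

```php
<?php
// Hypothetical sample markup standing in for the scraped page.
$html = '<div id="team"><table><tr><td>Played</td><td>10</td></tr></table></div>';

$doc = new DOMDocument();
// Suppress the warnings that real-world, non-validating HTML tends to trigger.
@$doc->loadHTML($html);

// Select the table with an XPath query, much like the Xpath option in the post.
$xpath = new DOMXPath($doc);
$node  = $xpath->query('//*[@id="team"]/table[1]')->item(0);

// textContent concatenates the text of all descendants; the tags are lost.
$asText = $node->textContent;

// saveHTML($node) serialises the node itself, markup included.
$asHtml = $doc->saveHTML($node);

echo $asText, "\n";
echo $asHtml, "\n";
```

    The first value is what I am getting back (cell text run together), while the second is what I was hoping the html output setting would give me.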

    Upon checking the source I can see this coming back:

    <!--
     Start of web scrap (created by wp-web-scraper)
     Source URL: http://full-time.thefa.com/DisplayTeam.do?teamID=1769059&divisionseason=616234725
     Selector:
     Xpath: //*[@id="common.ui.team.displayteam.DisplayTeamForm"]/table[2]
     Delivered thru: Cache
     WPWS options: Array
    (
        [postargs] =>
        [cache] => 60
        [user_agent] => WPWS bot (http://hartshead.tk)
        [timeout] => 2
        [on_error] => error_hide
        [output] => html
        [clear_regex] =>
        [clear_selector] =>
        [replace_regex] =>
        [replace_selector] =>
        [replace_with] =>
        [replace_selector_with] =>
        [basehref] =>
        [striptags] =>
        [removetags] =>
        [callback] =>
        [debug] => 1
        [htmldecode] =>
    )
    -->

    This makes me think that I've got the right code to output HTML, but that I may have the usage wrong or may not understand what the plugin is doing.

    I have tried the following:
    <?php echo wpws_get_content('http://full-time.thefa.com/DisplayTeam.do?teamID=1769059&divisionseason=616234725', '', '//*[@id="common.ui.team.displayteam.DisplayTeamForm"]/table[2]', '', '', '', '', 'html', '', 'ppfearn', '', '')?>

    I also tried 'output=html' and 'output="html"', with the same results.
    I'm a bit stuck at this point, so any advice would be great.
    Thanks

    http://wordpress.org/extend/plugins/wp-web-scrapper/

  2. uhi888
    Member
    Posted 1 year ago #

    I've got the same problem. It would be great to get a solution. Thanks in advance.

This topic has been closed to new replies.
