I am trying to scrape a section of a website; let’s say it’s http://whatever.com/news/index.html
some links are like this:
but images are like this:
I tried basehref="http://whatever.com"
(but then the ../something links don’t work).
I tried basehref="http://whatever.com/news/"
(but then the /something links don’t work).
How do I delete the “..” in the URLs? I don’t think I’m using replace_text / replace_with or clear_regex correctly, because they don’t seem to have any effect.
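For reference, here is a minimal sketch of the behaviour I am after, written in Python rather than in the scraper's own options. It assumes the links should be resolved the way a browser does, against the full page URL rather than a fixed prefix. The `resolve` helper and the sample paths are made up for illustration; only `urllib.parse.urljoin` is a real standard-library call.

```python
from urllib.parse import urljoin

# The page being scraped (the example URL from this post).
PAGE_URL = "http://whatever.com/news/index.html"

def resolve(link, base=PAGE_URL):
    """Hypothetical helper: resolve a link found on the page to an absolute URL.

    urljoin applies standard relative-reference resolution, so "../" steps up
    from the page's directory and a leading "/" starts from the site root.
    """
    return urljoin(base, link)

# Illustrative link forms; the paths themselves are placeholders.
for link in ["../something", "/something", "something.html"]:
    print(link, "->", resolve(link))
```

With the full page URL as the base, both the "../something" and the "/something" forms come out as http://whatever.com/something, so nothing needs to be deleted from the URLs by hand.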
Thanks if you can help