Having just upgraded Simply Static to version 184.108.40.206, I was very surprised to see the number of URLs it scanned on my site leap from 3403 to 6755 pages, even though I didn't add any pages or posts (just some edits). Looking at the generated export log, I see 3351 missing URL errors where previously there were none.
The missing URLs are all of the form:
404 http://virtual.internal/character/imprudentius/GNU Terry Pratchett Found on /character/imprudentius/
404 http://virtual.internal/character/matthew-dixon/text/html; charset=UTF-8 Found on /character/matthew-dixon/
and looking at the HTML for these pages shows that Simply Static now tries to follow the content values of these tags as if they were links:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta http-equiv="X-Clacks-Overhead" content="GNU Terry Pratchett">
(It also followed a <meta name="msapplication-TileImage"> HTML tag, but that one actually was a link.)
It looks to me like Simply Static is now overeager in following meta tags. At the very least it needs to be more intelligent about the value of the http-equiv field, since of the official values only refresh can carry a URL (default-style names a stylesheet set rather than a link). X-Clacks-Overhead is NOT an official value for http-equiv, but ideally it would not trigger a scan either.
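To illustrate the kind of filtering I mean, here is a rough Python sketch. It is not Simply Static's actual code (the plugin is written in PHP), and the names MetaLinkExtractor, extract_meta_links and URL_META_NAMES are made up for this example. It assumes that among http-equiv values only refresh carries a URL, and that meta name values should be followed only when they are on a small allow-list.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical allow-list: meta "name" values whose content is known to be a URL.
URL_META_NAMES = {"msapplication-tileimage"}


class MetaLinkExtractor(HTMLParser):
    """Collects URLs from <meta> tags, skipping content that is not a link."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when written without "=value".
        attr = {key: (value or "") for key, value in attrs}
        content = attr.get("content", "").strip()
        if not content:
            return
        http_equiv = attr.get("http-equiv", "").strip().lower()
        name = attr.get("name", "").strip().lower()

        if http_equiv == "refresh":
            # Expected form: "<seconds>; url=<target>"
            _, _, rest = content.partition(";")
            rest = rest.strip()
            if rest.lower().startswith("url="):
                target = rest[4:].strip().strip("'\"")
                if target:
                    self.links.append(urljoin(self.base_url, target))
        elif not http_equiv and name in URL_META_NAMES:
            self.links.append(urljoin(self.base_url, content))
        # Any other http-equiv (Content-Type, X-Clacks-Overhead, ...) is ignored.


def extract_meta_links(html, base_url):
    """Return the absolute URLs worth crawling from the <meta> tags in html."""
    parser = MetaLinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

With a filter like this, the Content-Type and X-Clacks-Overhead tags from my pages would yield no URLs at all, while the msapplication-TileImage tag would still be followed.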
In addition, the extra URLs doubled the time for the scan.