Support » Plugin: ActivityPub » Pinged by Fediverse crawlers on inexisting resources

  • Resolved didierjm

    (@didierjm)


    Hi Matthias, how are you?
    I am testing the plugin on the site I referenced above.
    It seems to work fine so far: I can find the blog in Mastodon, see updates coming in, see replies coming back, and so on. (Other networks seem to have more trouble, such as Friendica, which can’t find the blog, and Hubzilla, where I can’t comment or like.)
    What’s annoying is that I get a lot of pings from two crawlers trying to open:
    /api/v1/instance
    /api/v1/config
    /api/statusnet/version.json
    /api/statusnet/config
    This generates a lot of 404 errors, as they ping every 20 minutes or so.

    These are:
    fediverse.network crawler
    MastoPeek v0.7.2

    They might be misconfigured (in which case I’ll block them), or they’re looking for “official” Mastodon resources that don’t exist because the blog is not a “real” Mastodon instance.

    What should I do?
    Thx in advance.
    Sincerely
    DJM

Viewing 4 replies - 1 through 4 (of 4 total)
  • Plugin Author Matthias Pfefferle

    (@pfefferle)

    We already discussed that here: https://github.com/pfefferle/wordpress-nodeinfo/issues/2

    These are crawlers that try to list you in one of these networks:

    * https://mastopeek.app-dist.eu/
    * https://fediverse.network/
    * https://the-federation.info/

    They try to get information about your fediverse node. The “standard” here is nodeinfo, which the ActivityPub plugin supports; that’s why you are listed here: https://fediverse.network/amf.didiermary.fr 😉
    The other endpoints are used by Status.net, GNU.social or Mastodon, and the crawler always seems to check all endpoints, whether or not the site uses nodeinfo.

    I am sorry, but I can’t control that 🙁
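For context, a NodeInfo-aware crawler normally starts from the discovery document at `/.well-known/nodeinfo` and follows one of the schema links it finds there. The sketch below shows that selection step only. It is not the code of any crawler named in this thread, and the host `blog.example` and its `href` values are made up for illustration.

```python
import json

# Prefix of the "rel" values used in NodeInfo discovery documents.
NODEINFO_REL_PREFIX = "http://nodeinfo.diaspora.software/ns/schema/"


def pick_nodeinfo_href(discovery_json):
    """Return the href of the highest-versioned NodeInfo link, or None."""
    doc = json.loads(discovery_json)
    best = None  # (version tuple, href)
    for link in doc.get("links", []):
        rel = link.get("rel", "")
        href = link.get("href")
        if not rel.startswith(NODEINFO_REL_PREFIX) or not href:
            continue
        version = rel[len(NODEINFO_REL_PREFIX):]
        try:
            key = tuple(int(part) for part in version.split("."))
        except ValueError:
            continue  # skip links whose version is not numeric
        if best is None or key > best[0]:
            best = (key, href)
    return best[1] if best else None


# Hypothetical discovery document; the hrefs are assumptions, not real URLs.
EXAMPLE_DISCOVERY = json.dumps({
    "links": [
        {"rel": NODEINFO_REL_PREFIX + "1.0",
         "href": "https://blog.example/nodeinfo/1.0"},
        {"rel": NODEINFO_REL_PREFIX + "2.0",
         "href": "https://blog.example/nodeinfo/2.0"},
    ]
})

chosen = pick_nodeinfo_href(EXAMPLE_DISCOVERY)
```

A crawler that only did this would never touch the Mastodon or StatusNet paths. The 404s come from crawlers that probe those paths in addition to nodeinfo.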

    didierjm

    (@didierjm)

    Thanks Matthias, no problem, that’s fine. As I said, it was more a question than a problem. Sorry, I hadn’t checked on GitHub; I will next time.
    I’ll try to redirect their requests to avoid all the 404s. What’s the best endpoint to point them to? The blog’s root?
    Thank you very much.
    Sincerely
    DJM

    Plugin Author Matthias Pfefferle

    (@pfefferle)

    That is a good question, but I think that if you want to redirect the requests, it doesn’t matter which URL you send the crawlers to.

    didierjm

    (@didierjm)

    Thx Matthias,
    In fact, in the plugin I use to manage 404 errors (Redirection), I can easily tell it to ignore the request. Nothing happens, but at least it’s not recorded anymore.
    That matters because plenty of URLs are pinged multiple times per day, sometimes per hour. So far I could list the following, all from the same IP addresses:
    /.well-known/assetlinks.json
    /.well-known/x-social-relay
    /api/gnusocial/version.json
    /api/meta
    /api/statuses/public_timeline.json
    /api/statusnet/config
    /api/statusnet/version.json
    /api/v1/config
    /api/v1/instance
    /api/v1/instance/activity
    /api/v1/instance/peers
    /nodeinfo/2.0.json
    /poco
    /siteinfo.json
    /statistics.json
    And indeed, you’re right about the 3 main agents doing the pinging/crawling (see your message above).
    Sincerely
    DJM
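For anyone who wants to filter these requests outside the Redirection plugin, here is a minimal sketch of the same idea: match each 404 request against the paths listed above and set the matches aside. The path list is copied from this post; the function names and the sample requests are hypothetical, and this is not how the Redirection plugin itself works.

```python
from urllib.parse import urlsplit

# Paths reported in this thread; extend as new probes show up in the logs.
PROBE_PATHS = frozenset({
    "/.well-known/assetlinks.json",
    "/.well-known/x-social-relay",
    "/api/gnusocial/version.json",
    "/api/meta",
    "/api/statuses/public_timeline.json",
    "/api/statusnet/config",
    "/api/statusnet/version.json",
    "/api/v1/config",
    "/api/v1/instance",
    "/api/v1/instance/activity",
    "/api/v1/instance/peers",
    "/nodeinfo/2.0.json",
    "/poco",
    "/siteinfo.json",
    "/statistics.json",
})


def is_known_probe(request_target):
    """Return True if the request path is one of the listed probe paths."""
    path = urlsplit(request_target).path  # drops any query string
    if len(path) > 1:
        path = path.rstrip("/")  # treat "/poco/" like "/poco"
    return path in PROBE_PATHS


def split_404s(request_targets):
    """Split 404 request targets into (known probes, everything else)."""
    probes = [t for t in request_targets if is_known_probe(t)]
    others = [t for t in request_targets if not is_known_probe(t)]
    return probes, others


# Hypothetical sample of 404 request targets.
probes, others = split_404s([
    "/api/v1/instance",
    "/poco/",
    "/api/statusnet/config?x=1",
    "/old-post-that-moved",
])
```

Matching on exact paths, not on prefixes, keeps genuine 404s such as a broken link under `/api/` visible in the log.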

  • The topic ‘Pinged by Fediverse crawlers on inexisting resources’ is closed to new replies.