@11whyohwhy15, as the page info you pasted in indicates, the correct number depends heavily on your page content, type of theme, how smart your widgets are, and other factors like Ajax.
For example, if you are using smart widgets that tailor some content to the visitor (say, by country), they frequently use Ajax (call-backs to the server) to pull in the localized pieces and bust page caches.
So if you load a page, and that page has 4 widgets each making just one Ajax call, then what looks to the visitor like a single page is really 5 calls to the server, not counting the other dynamic things that could be going on.
So it is not that a human user can read 240 pages, or would even click that fast just browsing through; it is that each single click, depending on your site design, could be multiplied several times over.
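As a rough illustration of that fan-out (the widget names and endpoint paths below are made up, not from any particular theme or plugin), a page with a few "smart" widgets might run something like this on load, turning one human page view into several hits on the server:

```ts
// Hypothetical page-load script: 1 HTML request + 4 widget call-backs = 5 hits.
// Endpoint paths and the data-widget attribute are illustrative only.
const widgetEndpoints = [
  "/ajax/geo-pricing",     // localized prices for the visitor's country
  "/ajax/related-posts",   // per-visitor related content
  "/ajax/live-stock",      // uncached stock levels
  "/ajax/comments-count",  // fresh comment counts
];

async function loadWidgets(): Promise<void> {
  // Each fetch is another request the access-limit counter sees,
  // even though the visitor only clicked once.
  await Promise.all(
    widgetEndpoints.map(async (url) => {
      const res = await fetch(url, { credentials: "same-origin" });
      const html = await res.text();
      // Assumes placeholder elements like <div data-widget="/ajax/geo-pricing">
      const slot = document.querySelector(`[data-widget="${url}"]`);
      if (slot) slot.innerHTML = html;
    })
  );
}

window.addEventListener("DOMContentLoaded", () => {
  void loadWidgets();
});
```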
Hence YOU, the site owner, are the only one who can determine the right number for your site. I think the default is merely set so high that the access limit is unlikely to kick in partway through rendering a page, suddenly block off half the content, and generate support calls for THAT reason.
On determining which accesses are truly human or not: that is hard to do reliably.
It depends on how “dumb” the robots are built (most are pretty stupid).
A really well designed robot/crawler can appear VERY human in how it accesses the site.
Heck, forum spammer bots like xRumer and its blog-spamming cousins have “long term” planning built in to appear human.
Register on a site one day, then post a few automated but completely bogus “replies” over a couple of days to look like a “real, active forum member”.
THEN start spamming like crazy after gaining forum cred, once the site allows them to post links.
It’s all in the programming.
But most robots are simplistic. They fail to send certain headers, so they are clearly not a real browser, and they don't load CSS/JS or other assets, so despite their user-agent strings claiming to be a normal browser, they obviously are not.
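Just to make that concrete, here is a minimal sketch of the kind of header sniffing this implies (Node with Express assumed; the specific checks and the scoring threshold are illustrative, not a reliable bot detector):

```ts
// Flags requests that are missing things real browsers almost always send.
import express, { Request, Response, NextFunction } from "express";

function looksLikeDumbBot(req: Request): boolean {
  let suspicion = 0;

  // Real browsers send Accept-Language and a fairly rich Accept header.
  if (!req.headers["accept-language"]) suspicion++;
  if (!req.headers["accept"] || req.headers["accept"] === "*/*") suspicion++;

  // A "Chrome" user agent that never sends Accept-Encoding is fishy.
  const ua = (req.headers["user-agent"] || "").toLowerCase();
  if (ua.includes("chrome") && !req.headers["accept-encoding"]) suspicion++;

  // No user agent at all is almost always a script.
  if (!ua) suspicion += 2;

  return suspicion >= 2; // arbitrary threshold; tune against your own traffic
}

const app = express();

app.use((req: Request, res: Response, next: NextFunction) => {
  if (looksLikeDumbBot(req)) {
    // Don't block outright; just tag the request so later analysis
    // or rate limiting can take it into account.
    res.setHeader("X-Suspected-Bot", "1");
  }
  next();
});

app.get("/", (_req, res) => res.send("Hello"));
app.listen(3000);
```

(Whether a client ever comes back for the page's CSS/JS is another signal in the same spirit, but that needs correlating requests across hits, so it is left out of this sketch.)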
There are a ton of signals that COULD be used for an educated guess at what is human or not. None of them are 100%.
Not even Google Analytics is 100%, because any real human visitor arriving with a tracker blocker such as Ghostery will be prevented from loading the Google Analytics JS scripts. Those users can browse as normal and read every interesting page on your site, and Google Analytics would be none the wiser. Google depends on its JavaScript loading in that person's browser.