I have read all of the top-level articles and looked at some of the nested articles. I will continue searching, but I have some questions.
The first top-level article, https://codex.wordpress.org/FAQ_My_site_was_hacked, and I believe all or most of the others, are written for WordPress 2.7.1. I am running WordPress 4.4. Are there significant differences between the two?
Several references indicate that finding errant scripts, directories and files is difficult. I don’t understand this. Suppose I have a pristine copy of WordPress (downloaded but not installed) and I have a fault with the installed version. As a first-level search, with the pristine and installed directories sitting side by side, couldn’t I do something like:
> cd pristine/
> ls -Al | tr -s ' ' | cut -d' ' -f5,9- > ../pristine.txt
> cd ../installed/
> ls -Al | tr -s ' ' | cut -d' ' -f5,9- > ../installed.txt
> cd ..
> diff pristine.txt installed.txt | grep '>'
What I am comparing is the file names and file sizes in the installed and pristine directories. This process detects files with different sizes and files and folders that are not part of the initial installation. The reduced set of files and folders is then suspicious, but what is important to me is the reduction in search time. A few simple commands search all files.
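The `ls` listing above only covers one directory level. A rough sketch of the same size comparison applied to the whole tree might look like the following. The function names `list_sizes` and `compare_trees` are made up for illustration, `mktemp` is assumed to be available, and file names containing newlines are not handled.

```shell
#!/bin/sh
# list_sizes DIR
# Print "relative/path<TAB>size-in-bytes" for every regular file under DIR,
# sorted by byte order so that comm can compare two listings.
list_sizes() {
    (
        cd "$1" || exit 1
        find . -type f | while IFS= read -r f; do
            size=$(wc -c < "$f" | tr -d ' ')
            printf '%s\t%s\n' "$f" "$size"
        done | LC_ALL=C sort
    )
}

# compare_trees PRISTINE_DIR INSTALLED_DIR
# Print the entries that appear only in the installed tree: files that are
# new, or whose size differs from the pristine copy.
compare_trees() {
    pristine_list=$(mktemp)
    installed_list=$(mktemp)
    list_sizes "$1" > "$pristine_list"
    list_sizes "$2" > "$installed_list"
    # comm -13 suppresses lines unique to the first file and lines common to both.
    LC_ALL=C comm -13 "$pristine_list" "$installed_list"
    rm -f "$pristine_list" "$installed_list"
}
```

It would be called as `compare_trees pristine installed`. Legitimately changed or added files (a configuration file, uploads, plugins, themes) will show up in the output too, so the result is a shortlist to inspect rather than a verdict.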
My other question is whether WordPress distributes a manifest of directories, files and file sizes as part of each release (and whether it should). If it does, the provided data could be used directly without listing the pristine directory.
And as another fillip, does WordPress provide a signature for each file? If it does, and also publishes the algorithm used to calculate the signature, then this could be used to validate existing files, and it would identify files containing bogus data whose sizes happen to match the distributed version.
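Even without a published manifest, one can be built locally from the pristine copy. The sketch below assumes GNU coreutils `sha256sum` is installed; the function names `make_manifest` and `verify_manifest` and the choice of SHA-256 are illustrative assumptions, not anything WordPress specifies.

```shell
#!/bin/sh
# make_manifest DIR
# Print one "hash  ./relative/path" line per regular file under DIR.
make_manifest() {
    (
        cd "$1" || exit 1
        find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2
    )
}

# verify_manifest MANIFEST_FILE DIR
# Re-hash the files named in MANIFEST_FILE relative to DIR and report only
# those that are missing or whose content differs. MANIFEST_FILE should be
# an absolute path because the check runs from inside DIR.
# The exit status is non-zero if any file fails.
verify_manifest() {
    (
        cd "$2" || exit 1
        sha256sum --quiet -c "$1"
    )
}
```

For example, `make_manifest pristine > /tmp/wp.sha256` followed by `verify_manifest /tmp/wp.sha256 installed`. This catches files that were altered without changing their size, but it only checks files listed in the manifest, so extra files dropped into the installation still need a separate listing comparison.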
I have to admit that I am a complete novice (and dumb to boot), but the articles that I read stress that the search process is difficult and nearly impossible. And yet the discovery of altered files appears easy. Further, with a little support from WordPress, it should be possible to identify files integral to correct operation which should never be changed, and to identify pristine file content, and so on.
I guess that I don’t understand. Could you point me to articles which explain why my reasoning is faulty, or could you explain why it is faulty?