Eg I grab wordpress.org/latest.md5 which should only be a few bytes, I compare it to md5 latest.tar.gz, and if they're different I redownload and extract.
I think I may be having a dense moment here. If you have the known MD5 of the tar.gz that belongs to the known latest version of WordPress, and you compare it against the hash sum of the latest tar.gz posted as available for download, why would your script have to guess at anything? If the two don't match, it would be time to download the new version, would it not?
that looks like it's more geared towards ensuring you downloaded a legitimate copy of a specific version and that it came out correct
Yes, it is. I believe that is exactly the intent of using a hash sum. (Forensically speaking, it's just a numeric representation that is, for practical purposes, unique to the content of a file or files.) No two versions of WordPress (different versions, or a corrupted copy of the same version) would realistically share the same hash, so why would any script have to take any "educated guesses"? If the hash doesn't match the known valid signature of the known latest version, it's time to re-download, no?
That matches exactly the desires you expressed in your first post.
"Eg I grab wordpress.org/latest.md5 which should only be a few bytes, I compare it to md5 latest.tar.gz, and if they're different I redownload and extract...
...Just something in a text file that we can parse easily in a script to determine if the file's changed and should be re-downloaded"
Which makes me not understand this statement at all:
it doesn't really help guess what the newest version is, unless your script takes educated stabs in the dark (checks 2.7.2, 2.8.0, etc).
I can't think of a script that would be capable of anticipating the MD5 (or the output of any other algorithm) of a file not yet created. So guessing is out of the equation. It either matches a known signature or it doesn't.
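For what it's worth, here is a rough sketch of the "matches or it doesn't" check in Python. It is only an illustration under assumptions: the function names are made up for this post, and the URL of the published hash (something like the wordpress.org/latest.md5 you mentioned) is whatever you pass in, since I don't know that such a file actually exists at that address.

```python
import hashlib
import os
import urllib.request


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_published_md5(url):
    """Fetch a small text file and return its first whitespace-separated token.

    The URL is an assumption supplied by the caller, e.g. a hypothetical
    latest.md5 file published next to latest.tar.gz.
    """
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode("ascii").split()[0].lower()


def needs_download(local_path, published_md5):
    """True if there is no local copy, or its MD5 differs from the published one."""
    if not os.path.exists(local_path):
        return True
    return md5_of_file(local_path) != published_md5.strip().lower()
```

A script would call fetch_published_md5() once, pass the result to needs_download("latest.tar.gz", ...), and only re-download and extract when that returns True. No guessing at version numbers involved.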
No matter. I'm probably just not seeing what you are really trying to accomplish. Hope you find a solution that works for you.
Cj