Hi,
The filename support depends upon the zip engine that the zips are being created with.
The default zip engine – /usr/bin/zip – supports UTF8, provided that the webserver does. So does PHP’s built-in ZipArchive. If you have neither available, then it falls back to PclZip. PclZip does not natively support UTF8, but you can use a special script (it’s somewhere in this forum… I can find it if you can’t…) that fixes the filenames upon unzipping.
So, I recommend looking at the log of the backup to check which zip engine is being used. If it’s one of the first two, then probably the backup is fine and you’re having a problem with your unzip tools.
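As a rough sketch of that check, the snippet below searches a saved copy of the backup log for the line that names the zip engine. The function name find_zip_engine and the file name backup.log are made up for illustration. The only wording taken from this thread is the "Zip engine:" line in the production log pasted further down; the wording for ZipArchive or PclZip may differ.

```shell
# Hypothetical helper: print the log lines that mention the zip engine.
# $1 is the path to a saved copy of the backup log (no default path is assumed).
find_zip_engine() {
    grep -i 'zip engine' "$1"
}

# Example usage (backup.log is a placeholder file name):
#   find_zip_engine backup.log
```

If nothing is printed, the extract probably does not cover the part of the log where the engine is chosen.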
David
Thread Starter
beltex
(@beltex)
Hi David
OK, that leaves me a bit confused.
I work locally on a (virtual) dev box, and when I look at that log (images are OK on that box) I see no info on zip:
Checking if we have a zip executable available
0000.029 (0) Creation of backups of directories: beginning
0000.032 (0) Beginning creation of dump of plugins (split every: 400 Mb)
0000.281 (0) Total entities for the zip file: 735 directories, 4655 files (0 skipped as non-modified), 79.5 Mb
On the production server I do see it:
Checking if we have a zip executable available
0000.076 (0) Testing: /usr/bin/zip
0000.098 (0) Output: zip warning: binziptest/test.zip not found or empty
0000.106 (0) Output: adding: binziptest/subdir1/ (in=0) (out=0) (stored 0%)
0000.107 (0) Output: adding: binziptest/subdir1/subdir2/ (in=0) (out=0) (stored 0%)
0000.108 (0) Output: adding: binziptest/subdir1/subdir2/test.html (in=128) (out=105) (deflated 18%)
0000.110 (0) Output: total bytes=128, compressed=105 -> 18% savings
0000.123 (0) Output: adding: binziptest/subdir1/subdir2/test2.html (in=135) (out=111) (deflated 18%)
0000.125 (0) Output: total bytes=263, compressed=216 -> 18% savings
0000.169 (0) Working binary zip found: /usr/bin/zip
0000.171 (0) Zip engine: found/will use a binary zip: /usr/bin/zip
0000.172 (0) Creation of backups of directories: beginning
I’ll check if my local box has zip…
If that works I’ll let you know, but thanks for the answer!
Hi,
I’d need to see the complete log file from the local site to be able to tell which zip engine it’s using – the pasted extract isn’t sufficient to indicate.
David
Thread Starter
beltex
(@beltex)
Hi David,
I tested with /bin/zip on both servers but for some reason my filenames are still messed up.
http://pastebin.com/auqM1QNi (dev box)
Hi,
The log file records that your webserver isn’t running in a UTF-8 locale:
LANG: C
That means that any shell commands spawned – like /usr/bin/zip – won’t be, either, and can’t handle non-ASCII characters.
You should either run the webserver in a UTF-8 locale, or make UD switch to using PHP’s zip module instead, with:
define('UPDRAFTPLUS_NO_BINZIP', true);
David
Thread Starter
beltex
(@beltex)
Thanks for your support, that sounds logical. I’ll reconfigure the locales on those boxes and try again.
Thread Starter
beltex
(@beltex)
Strange: I have changed the system language on both systems and set the default encoding to UTF-8 in the httpd confs, then restarted and rebooted, but the results are the same.
Adding define('UPDRAFTPLUS_NO_BINZIP', true); to wp-config does not seem to do much ((
I guess that’s it, it’s now a server issue. I will find a guru who may be able to help.
You can check the first few lines of the UD log file to see what the LANG environment variable is set to – this is what the zip binary picks up and uses. It should be something that includes UTF8, and not be just C. E.g. on mine:
$ echo $LANG
en_GB.utf8
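A small sketch of the same check in script form, assuming a POSIX shell. The function name is_utf8_locale is hypothetical; it only tests whether a locale string mentions UTF-8 in either common spelling (utf8 or UTF-8).

```shell
# Hypothetical helper: succeed if the locale string in $1 names a UTF-8 locale.
is_utf8_locale() {
    case "$1" in
        *[Uu][Tt][Ff]8* | *[Uu][Tt][Ff]-8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Report on the LANG value of the current shell.
if is_utf8_locale "${LANG:-}"; then
    echo "LANG is a UTF-8 locale: ${LANG}"
else
    echo "LANG is not a UTF-8 locale: ${LANG:-unset}"
fi
```

Note that this reports the locale of the shell it is run in, which is not necessarily the one the webserver process has.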
David
Thread Starter
beltex
(@beltex)
Thanks
If I do that in a terminal I get en_US.UTF-8.
That’s when I’m logged in as root, and it is consistent with the default I set up while doing dpkg-reconfigure locales.
I noticed that other programs don’t trust zip to handle filenames directly and use iconv to deal with it.
Is that old info? Please don’t feel I’m trying to blame or push you or something, I’m just very curious now ))
Hmm, something I do notice is that on the old box the content is served as jpg but on the dev box it’s served as html. I take that info from the file type shown in the Chrome inspector.
Hi Beltex,
iconv would only come into it if you’re trying to extract from a zip file created with filenames in the wrong encoding. That can be done; there’s an example script for doing that here: https://wordpress.org/support/topic/garbled-non-ascii-characters-file-names?replies=8#post-7382696
However, if you run your webserver in a UTF8 locale, so that it is able to process UTF8 filenames that are on your filesystem as UTF8, then you won’t need that.
Running this through a browser will tell you what the current setting is:
<?php echo $_ENV['LANG']; ?>
David
Thread Starter
beltex
(@beltex)
Hi David,
Very interesting, it seems that my version of Apache2 does not pick up the default charset setting.
pffff
I’ll update and see if that finally changes it.
Thread Starter
beltex
(@beltex)
While reading up on it, it seems that iso-8859-1 is the standard for Apache and actually should not make that much difference in the general serving of websites. Encodings for websites are stated in the headers, if I’m correct, so running with iso-8859-1 may work fine, and even your plugin may run fine, until of course you have my use case.
AddDefaultCharset should only be used when all of the text resources to which it applies are known to be in that character encoding and it is too inconvenient to label their charset individually. (Apache docs)
It’s not a use case that occurs a lot, but it might still be nice not to depend on that setting for such an important plugin.
I actually don’t understand how the zip binary uses the charset of Apache. Or is that because it’s the Apache user that runs the zip binary?
That’s just my thought though. Would you know of another way to tell zip about the encoding? Could I do something with the user running Apache that would make it work?
I tried updating Apache but it still uses iso-8859-1 by default and it cannot be overridden.
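On the question of how the zip binary is connected to Apache: a program started by the webserver inherits the webserver process's environment, LANG included, whereas AddDefaultCharset only affects the charset label in HTTP response headers. The sketch below reads LANG out of a running process's environment on Linux. lang_from_environ is a hypothetical helper, and <pid> stands for the process ID of an Apache worker, which is not known here.

```shell
# Hypothetical helper: print the LANG entry from a NUL-separated environment
# dump such as /proc/<pid>/environ on Linux. Prints nothing (and fails) if
# LANG is not set in that environment.
lang_from_environ() {
    tr '\0' '\n' < "$1" | grep '^LANG='
}

# Example usage (replace <pid> with a real process ID; reading another user's
# process usually needs root):
#   lang_from_environ /proc/<pid>/environ
```

On Debian-style Apache packages the webserver's environment is typically set in /etc/apache2/envvars, so that is probably the place to look if the value differs from the one in a login shell.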
Hi,
Sorry – I’ve been going down the wrong route. My test setups also have Apache running in LANG=C, and back up non-ASCII filenames without problem. So, it’s not that.
How are you verifying that there’s a problem… how are you able to confirm that it’s the zip files themselves that have a problem, and not the program you’re using to inspect the zips, or unzip the zips?
David
Thread Starter
beltex
(@beltex)
Wow ))) well, no problem, I like solving problems.
I inspect them by restoring them, and I end up with a site that does not work, but I’m still busy with it.