I have the interface in Explorer mode. When I open a remote directory that contains a file with a German umlaut-u ('ü', Unicode code point U+00FC) in its name, the session abruptly closes with this error message:
Connection has been unexpectedly closed. Server sent command exit status 255.
When I click abort, I get another error message:
Connection has been unexpectedly closed. Server sent command exit status 255.
---------------------------
Error listing directory '/tmp/media/sda1/WinScpTest/UmlautU'.
Error changing directory to '/tmp/media/sda1/WinScpTest/UmlautU'.
When I click OK there, sometimes (not always), this error message shows up:
Cannot focus a disabled or invisible window
Please help us improving WinSCP by reporting the error on WinSCP support forum.
---------------------------
Stack trace:
(0023CC0E) Vcl::Forms::TCustomForm::SetActiveControl
I can click "Reconnect" in these dialogs, but it never works; I just get the same error message again. I also cannot connect to this device manually, since WinSCP tries to open that same directory again, which causes the same error.
I need to completely close the session tab, open "Advanced Site Settings" -> Environment -> Directories, and change the "Remote directory" under "Remember last used directory".
In "Advanced Site Settings" -> Environment -> Server environment, "UTF-8 encoding for filenames" is currently set to "On", but this also happens when it is set to "Auto" or "Off".
WinSCP 5.19.5
Just to be thorough, I updated to WinSCP 5.21.3 (current version at the moment) and it still happens.
It also happens with other non-ASCII characters.
The remote device is a custom embedded device running a Buildroot Linux. Using FileZilla to connect to it and enumerate the files in that directory works fine, so I don't think it's a server-side issue.
The log file has this error on line 519:
! 2022-09-12 16:15:34.215 FATAL: cannot encode local path name 'Ã¼.txt'
It's very easy to get confused by the encodings here, so I'll spell it out very explicitly: The log-file seems to be properly encoded in UTF-8, since, when decoded with UTF-8, other non-ASCII text comes out readable (e.g. "Mitteleuropäische Zeit").
The log file, at that quoted location, contains the bytes
27 C3 83 C2 BC 2E 74 78 74 27
When decoded with UTF-8 like the rest of the file, these bytes yield the following Unicode code points (in hexadecimal), shown with the corresponding characters above them for easier orientation:
'  Ã  ¼  .  t  x  t  '
27 C3 BC 2E 74 78 74 27
This looks like the byte sequence produced by encoding the text 'ü.txt' with UTF-8. But, of course, this is not supposed to be a byte sequence, since it has already been decoded. A sequence of bytes and a sequence of Unicode code points are not the same thing. The code points just happen to all be below 256.
You might already understand this well, but I wanted to explain it thoroughly, just to avoid confusion.
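To make the suspected double encoding concrete, here is a minimal Python sketch. It assumes the file name was encoded as UTF-8, wrongly decoded as Windows-1252, and then written to a UTF-8 log file. Whether WinSCP actually does this internally is an assumption; the sketch only shows that this chain yields the bytes seen in the log (without the surrounding quote characters).

```python
# Sketch: reproduce the log bytes by decoding UTF-8 data with the wrong code page.
original = "\u00fc.txt"                      # 'ü.txt'

utf8_bytes = original.encode("utf-8")        # bytes the server presumably sends
misdecoded = utf8_bytes.decode("cp1252")     # wrong decode step (assumed)
log_bytes = misdecoded.encode("utf-8")       # what a UTF-8 log file would store


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


print(hex_dump(utf8_bytes))                          # UTF-8 bytes of the real name
print(" ".join(f"{ord(c):02X}" for c in misdecoded)) # code points after the wrong decode
print(hex_dump(log_bytes))                           # bytes as they would appear in the log
```

The last line printed can be compared directly with the middle of the byte sequence quoted from the log file.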
I wrote a program to check possible ways this text could have been mangled and it found these combinations:
Assuming the text 'ü.txt' was encoded with one encoding and decoded with a different one, which combinations give 'Ã¼.txt' as the resulting text?
'ü.txt' -[65001 Unicode (UTF-8) utf-8]-> '27 C3 BC 2E 74 78 74 27' -[1252 Westeuropäisch (Windows) Windows-1252]-> 'Ã¼.txt'
'ü.txt' -[65001 Unicode (UTF-8) utf-8]-> '27 C3 BC 2E 74 78 74 27' -[1254 Türkisch (Windows) windows-1254]-> 'Ã¼.txt'
'ü.txt' -[65001 Unicode (UTF-8) utf-8]-> '27 C3 BC 2E 74 78 74 27' -[28591 Westeuropäisch (ISO) iso-8859-1]-> 'Ã¼.txt'
'ü.txt' -[65001 Unicode (UTF-8) utf-8]-> '27 C3 BC 2E 74 78 74 27' -[28599 Türkisch (ISO) iso-8859-9]-> 'Ã¼.txt'
So it seems there is a really nasty encoding bug somewhere.
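For anyone who wants to repeat that search, here is a rough Python sketch of the idea. It is not the program I actually used; the candidate codec list and the function name `find_mangling_pairs` are just assumptions for illustration, and a fuller search would cover many more code pages.

```python
import itertools

# Candidate codecs to try (an assumed, deliberately short list).
CANDIDATES = ["utf-8", "cp1252", "cp1254", "iso-8859-1", "iso-8859-9", "cp437", "cp850"]


def find_mangling_pairs(original, mangled, codecs=CANDIDATES):
    """Return (encode_with, decode_with) pairs that turn `original` into `mangled`."""
    matches = []
    for enc, dec in itertools.permutations(codecs, 2):
        try:
            result = original.encode(enc).decode(dec)
        except (UnicodeEncodeError, UnicodeDecodeError):
            continue  # this pair cannot represent or interpret the text
        if result == mangled:
            matches.append((enc, dec))
    return matches


# 'ü.txt' and the mangled form seen in the log, written with escapes
# so the source file's own encoding cannot interfere.
pairs = find_mangling_pairs("\u00fc.txt", "\u00c3\u00bc.txt")
for enc, dec in pairs:
    print(f"{enc} -> {dec}")
```

With this candidate list, every match encodes with UTF-8 and decodes with a single-byte Western European or Turkish code page, in line with the combinations listed above.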
My thoughts:
* According to https://winscp.net/eng/docs/faq_utf8#sftp and https://winscp.net/eng/docs/ui_login_environment#utf, WinSCP should automagically not use UTF-8 when the server sends non-UTF-8 file names, but, as explained above, the problem also persists when the "Use UTF-8" option is set to "Off".
* In FileZilla I can select "Force UTF-8" and it still works fine. On the other hand, when I force it to use the 1252 encoding, the file names come out mangled (in this case, "Ã¼.txt" is displayed as the file name, which matches the "ü.txt" -[UTF-8]-> ... -[1252]-> "Ã¼.txt" pattern). This further indicates that the server is properly sending file names encoded with UTF-8.
* The error message says "cannot encode local path name", as opposed to "decode". This could be a typo, but if we assume that it isn't, the problem didn't occur while decoding the byte sequence sent from the server, but while encoding something. I am not sure what that would be. The surrounding log messages, however, suggest that it did happen while processing incoming data. Not sure what to make of this.
I know how to reproduce the problem or the problem happens frequently enough. I wish to be contacted by the WinSCP team to help resolving the problem.
Description: I am not sure that I want to have SSH keys in a file that I just hand out on the internet, so I redacted those out. Shouldn't matter anyways.