A while ago, a very strange problem was reported to me on a Nextcloud system. The user had uploaded a file with an “ö” in its name to an SMB share that was configured as external storage in the Nextcloud server. But when accessing the folder containing the file over WebDAV, the file did not appear (no matter which WebDAV client was used). After ruling out the usual causes (wrong permissions, etc.), I analyzed the network traffic between the WebDAV client and the server and saw that the file name was indeed not returned after issuing a PROPFIND. So I set some breakpoints in the Nextcloud source code to check whether it was also not returned by the SMB server.
It was returned by the SMB server, but when the Nextcloud system requested more metadata for the file (with the path in the request), the SMB server returned a “file not found” error, which led Nextcloud to discard the file.
How can it happen that the file is first returned by the SMB server when listing files, but the server then immediately reports an error when more metadata is requested?
When looking at the raw bytes of the first (list) and second (metadata) SMB request, I found the culprit. The SMB server sent the bytes 0x6F 0xCC 0x88 for the ö, which is U+006F LATIN SMALL LETTER O followed by U+0308 COMBINING DIAERESIS. In the second request, the Nextcloud server sent 0xC3 0xB6, which is U+00F6 LATIN SMALL LETTER O WITH DIAERESIS. Even though the two characters look exactly the same, their code point sequences are different.
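The difference is easy to reproduce outside of Nextcloud. The following is a minimal Python sketch (Python is my choice for illustration only, and the helper name code_points is made up) that decodes the two byte sequences from the capture and compares them using the standard unicodedata module.

```python
import unicodedata

# Bytes the SMB server sent in the directory listing (decomposed form).
listed = b"\x6f\xcc\x88".decode("utf-8")
# Bytes the Nextcloud server sent in the metadata request (precomposed form).
requested = b"\xc3\xb6".decode("utf-8")


def code_points(text):
    """Return 'U+XXXX NAME' for every code point in text (illustrative helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


print(code_points(listed))
print(code_points(requested))

# The strings render identically but do not compare equal.
print(listed == requested)
# Normalizing both sides to the same form makes them equal.
print(unicodedata.normalize("NFC", listed) == requested)
print(unicodedata.normalize("NFD", requested) == listed)
```

A plain string comparison treats the two names as different; they only match once both are brought into the same normalization form.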
This is known as Unicode equivalence and is, in theory, addressed by Unicode normalization. But here, normalization caused the problem. Before storing the file name in the cache, Nextcloud normalized the file name (to NFC) in a function normalizePath:
public static function normalizePath($path, $stripTrailingSlash = true, $isAbsolutePath = false, $keepUnicode = false) {
    ...
    //normalize unicode if possible
    if (!$keepUnicode) {
        $path = \OC_Util::normalizeUnicode($path);
    }
    ...
}
This normalized path is then used for the second request, which fails because the file is stored in a different normalization form (NFD in this case) on the SMB server.
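To make the sequence of events concrete, here is a toy model in Python. It is not Nextcloud's code: ByteExactShare and visible_files are hypothetical names, and the share is assumed to compare names code point by code point, which matches what I observed on this server.

```python
import unicodedata


class ByteExactShare:
    """Toy stand-in for a share that matches names code point by code point."""

    def __init__(self, names):
        self._names = set(names)

    def list_dir(self):
        return sorted(self._names)

    def stat(self, name):
        if name not in self._names:
            raise FileNotFoundError(name)
        return {"name": name}


def visible_files(share, normalize=True):
    """List the share, then request metadata for each entry by name."""
    visible = []
    for name in share.list_dir():
        # Assumption: the name is normalized to NFC before the second request.
        lookup = unicodedata.normalize("NFC", name) if normalize else name
        try:
            share.stat(lookup)
        except FileNotFoundError:
            continue  # metadata request failed, so the entry is dropped
        visible.append(lookup)
    return visible


# One plain ASCII name and one name stored in NFD on the share.
share = ByteExactShare(["report.txt", "scho\u0308n.txt"])
print(visible_files(share))                   # the NFD-named file is missing
print(visible_files(share, normalize=False))  # both files are listed
```

With normalization switched on, the listing returns the NFD name, the lookup uses the NFC name, and the file silently disappears from the result.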
So what can you do about this? Of course you could write a small script to normalize all your file names to NFC on all shares, but this may not be feasible for large shares with millions of files. It turns out, however, that I was not the first to experience this problem, and there is an “NFD compatibility” option that can be set on a per-share basis. When it is enabled, the original NFD name is also stored in the cache (this leads to additional database queries, which is why the option is not enabled by default).
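For small shares, the renaming script could look roughly like the sketch below. It is only an illustration under a few assumptions: the share is mounted locally on a file system that preserves names exactly as given, the function names plan_nfc_renames and normalize_names are made up, and entries whose NFC name already exists are skipped rather than overwritten. It defaults to a dry run.

```python
import os
import unicodedata


def plan_nfc_renames(root):
    """Yield (old_path, new_path) for each entry whose name is not in NFC."""
    # Walk bottom-up so that children are handled before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            nfc = unicodedata.normalize("NFC", name)
            if nfc != name:
                yield os.path.join(dirpath, name), os.path.join(dirpath, nfc)


def normalize_names(root, dry_run=True):
    """Rename entries under root to NFC; return the (old, new) pairs handled."""
    handled = []
    for old, new in plan_nfc_renames(root):
        if os.path.lexists(new):
            # Both forms exist side by side; leave this for manual review.
            print(f"skip (target exists): {old!r}")
            continue
        if not dry_run:
            os.rename(old, new)
        handled.append((old, new))
    return handled
```

Calling normalize_names on the mount point of a share returns the planned renames without touching anything; passing dry_run=False performs them. On a share with millions of files the walk alone will take a long time, which is the reason the per-share option is usually the more practical route.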