
The problem, as I understand it, was that filenames were read and written as raw bytes. That's fine if everybody is on the same platform, but it doesn't always work when moving between, for example, Linux (UTF-8) and Windows (UTF-16).
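
To make the mismatch concrete, here is a rough Python sketch (the filename is made up for illustration, and this is not Mercurial's code): the same name produces different bytes under UTF-8 and UTF-16, and UTF-8 bytes read back with a different single-byte encoding come out garbled.

```python
# Hypothetical filename, chosen only because it contains non-ASCII characters.
name = "résumé.txt"

utf8_bytes = name.encode("utf-8")       # what a UTF-8 Linux system would typically store
utf16_bytes = name.encode("utf-16-le")  # 16-bit code units, as Windows uses

# Same name, different byte sequences, so a byte-for-byte copy does not carry over.
print(len(utf8_bytes), len(utf16_bytes))

# Reading the UTF-8 bytes as if they were Latin-1 yields mojibake.
mojibake = utf8_bytes.decode("latin-1")
print(mojibake)
```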

I don't know if that's still an issue.



Hm, the Mercurial wiki does mention that it was/is a problem on Windows.

I was surprised because we've been using it for years at work, sharing repositories between Linux and Windows, and we have never been bitten by this. Maybe the Latin-1 subset of Unicode works? Or all the files we worked on just happened to have ASCII names (despite us being mostly non-Americans).
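
One way to test the "all ASCII" guess is a small Python check. This is only a sketch: `same_bytes_everywhere` is a hypothetical helper, and the three codecs are my assumption about which encodings are likely in play. ASCII-only names encode to identical bytes in all three, which would explain never hitting the problem; Latin-1 characters such as "é" do not.

```python
def same_bytes_everywhere(name: str) -> bool:
    """Return True if `name` encodes to identical bytes in UTF-8, Latin-1, and cp1252."""
    try:
        encodings = {name.encode(codec) for codec in ("utf-8", "latin-1", "cp1252")}
    except UnicodeEncodeError:
        # The name is not representable at all in one of the legacy code pages.
        return False
    return len(encodings) == 1


print(same_bytes_everywhere("README.txt"))  # ASCII only
print(same_bytes_everywhere("café.txt"))    # contains a Latin-1 character
```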


Unfortunately, on Linux, filenames are actually arbitrary bytes, not necessarily UTF-8.
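
A short Python illustration of that point (the byte string is a made-up example): a name that is legal on a Linux filesystem can fail strict UTF-8 decoding. Python's own workaround is the `surrogateescape` error handler, which `os.fsdecode` uses on POSIX systems, so the original bytes survive a round trip.

```python
# Hypothetical on-disk name: "café.txt" written as Latin-1, which is not valid UTF-8.
raw_name = b"caf\xe9.txt"

try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate code point,
# so encoding with the same handler restores the exact original bytes.
as_text = raw_name.decode("utf-8", errors="surrogateescape")
round_tripped = as_text.encode("utf-8", errors="surrogateescape")

print(strict_ok, round_tripped == raw_name)
```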


Well, yes, and on Windows they are a sequence of 16-bit values, not necessarily valid UTF-16. My point was that the encoding of filenames differs between platforms, so you can't do a byte-for-byte copy and expect the same results.

As an aside, it'd be perfectly legitimate for Mercurial (and git) to reject filenames that aren't representable in Unicode.
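
A rejection check along those lines could look roughly like this in Python. To be clear, this is a sketch and not what Mercurial or git actually do; `is_unicode_name` and both sample names are hypothetical.

```python
def is_unicode_name(raw: bytes, codec: str) -> bool:
    """Return True if `raw` decodes cleanly under `codec` (strict error handling)."""
    try:
        raw.decode(codec)
    except UnicodeDecodeError:
        return False
    return True


# Linux side: bytes that are not valid UTF-8.
linux_bad = b"caf\xe9.txt"

# Windows side: a lone high surrogate (0xD800) followed by ordinary characters,
# which is a legal sequence of 16-bit values but not valid UTF-16.
windows_bad = b"\x00\xd8" + "a.txt".encode("utf-16-le")

print(is_unicode_name(linux_bad, "utf-8"))
print(is_unicode_name(windows_bad, "utf-16-le"))
```

A tool could run a check like this when a file is added and refuse the name up front, instead of storing bytes that another platform can't interpret.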



