[23:48] <cjwatson> wgrant: I ran into an interesting issue when doing some casual Python 3 porting.  lib/lp/services/webapp/tests/test_servers.py:TestWebServiceRequestToBrowserRequest.test_unicode_path_info tests that Unicode is permitted in PATH_INFO.  PEP-3333 explicitly forbids this (or rather, it allows Unicode but only the bottom 256 codepoints, and anything else must be MIME-encoded), and zope.publisher ...
[23:48] <cjwatson> ... 4.0.0 enforces this at least in some places.  Do you think we should adjust the test in this case?
[23:50] <cjwatson> I can't think of a case where we'd actually need anything above U+00FF in PATH_INFO; surely practically everything there is a name or a fixed segment of some kind.
[23:50] <wgrant> cjwatson: Filenames, mostly.
[23:50] <cjwatson> Don't they get encoded?
[23:51] <wgrant> You'd think so, but it's possible lazr.restful decodes them at some point.
[23:51] <wgrant> Or zserver does
[23:51] <cjwatson> I mean if they don't undergo quoting then we must have other bugs
[23:51] <wgrant> It sounds like it's probably safe, but it needs checking
[23:51] <cjwatson> Yeah
[23:52] <cjwatson> It's conceivable we'd need to upgrade ZTK first, I suppose
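A minimal Python 3 sketch of the PEP 3333 constraint discussed above: a "native string" in the WSGI environ may only contain code points up to U+00FF, so non-Latin-1 text has to travel as UTF-8 bytes reinterpreted as ISO-8859-1. The helper names here are hypothetical, and whether zope.publisher or lazr.restful handle PATH_INFO exactly this way is an assumption that would need checking.

```python
def is_pep3333_native(value):
    """Return True if every code point in value is at most U+00FF."""
    try:
        value.encode("iso-8859-1")
    except UnicodeEncodeError:
        return False
    return True


def wsgi_path_from_text(path):
    """Hypothetical helper: UTF-8 encode the text, then reinterpret the
    bytes as ISO-8859-1 so the result is a valid WSGI native string."""
    return path.encode("utf-8").decode("iso-8859-1")


def text_from_wsgi_path(path_info):
    """Hypothetical helper: reverse of wsgi_path_from_text."""
    return path_info.encode("iso-8859-1").decode("utf-8")
```

With these, a path containing U+05D0 is rejected by `is_pep3333_native` as-is but accepted once passed through `wsgi_path_from_text`, and `text_from_wsgi_path` recovers the original text.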
[23:55] <cjwatson> Also for bonus points py2 urllib.(un)quote and py3 urllib.parse.(un)quote don't behave the same way for non-trivial Unicode
[23:55] <cjwatson> >>> unquote('%D7%90')
[23:55] <cjwatson> 'א'
[23:55] <cjwatson> ^ py3
[23:55] <cjwatson> >>> unquote('%D7%90')
[23:55] <cjwatson> '\xd7\x90'
[23:55] <cjwatson> ^ py2
[23:55] <cjwatson> and the other way (quote() on a non-ASCII unicode string) is a KeyError in py2
[23:59] <cjwatson> so we need to make sure to consistently encode-then-quote and unquote-then-decode
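The encode-then-quote and unquote-then-decode rule could be sketched as below. This is only an illustration, not code from the Launchpad tree: the function names are hypothetical, UTF-8 is assumed as the encoding, and the Python 2 fallback import assumes the quoted input there is a byte `str`, so that `unquote` returns bytes.

```python
try:
    from urllib.parse import quote, unquote_to_bytes  # Python 3
except ImportError:
    # Python 2: unquote() on a byte str already returns bytes.
    from urllib import quote, unquote as unquote_to_bytes


def encode_then_quote(text):
    """Encode text to UTF-8 bytes first, then percent-quote the bytes."""
    return quote(text.encode("utf-8"))


def unquote_then_decode(quoted):
    """Percent-unquote to bytes first, then decode the bytes as UTF-8."""
    return unquote_to_bytes(quoted).decode("utf-8")
```

Going through bytes in both directions means neither interpreter is left to pick an implicit encoding, which is where the py2 and py3 results shown above diverge.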