Discussion:
Query parts in IRIs in plain text mail
Martin J. Dürst
2012-07-11 10:47:09 UTC
This is a test based on the comment by Dave Thaler that IRIs may also
appear in (plain text) email.

If my MUA (Eudora/Penelope/Thunderbird) does what I told it, this mail
should be in iso-8859-1 (Latin-1). If you have any way to check that
it's still Latin-1 at your end, please do so.

The IRI to test,
http://www.sw.it.aoyama.ac.jp/non-existent?résumé
is the same as the one I used in the SVG test.

It won't resolve (and doesn't need to), but should show you where your
browser wants to go
(http://www.sw.it.aoyama.ac.jp/non-existent?r%C3%A9sum%C3%A9 if it uses
UTF-8 for the IRI->URI conversion,
http://www.sw.it.aoyama.ac.jp/non-existent?r%E9sum%E9 if it uses
iso-8859-1).
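
For readers who want to reproduce the two candidate URIs, here is a minimal Python sketch. It assumes the only thing that varies is the charset used to turn the query into octets before percent-encoding; the helper name `query_to_uri` is made up for illustration and is not something a browser or MUA exposes.

```python
from urllib.parse import quote

def query_to_uri(query: str, charset: str) -> str:
    # Encode the query in the given charset, then percent-encode every
    # octet that is not an unreserved ASCII character.
    return quote(query, safe="", encoding=charset)

base = "http://www.sw.it.aoyama.ac.jp/non-existent?"
print(base + query_to_uri("résumé", "utf-8"))
print(base + query_to_uri("résumé", "iso-8859-1"))
```

The first line printed corresponds to the UTF-8 case and the second to the iso-8859-1 case described above.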

Please report any results on the list.

Regards, Martin.
Martin J. Dürst
2012-07-11 10:57:02 UTC
I verified that the mail was in iso-8859-1 (see below for details).
When I click on it, Opera gets activated, and shows the address as
http://www.sw.it.aoyama.ac.jp/non-existent?r%C3%A9sum%C3%A9

This would be consistent with what I got yesterday for SVG. I'm very
interested in knowing what other browsers (and maybe mailers, both sides
are involved) are doing. I can't check that easily myself because I
don't want to change default browsers or MUAs.

Regards, Martin.


P.S.: Here are the relevant lines from the mail as it arrived at my
place. It was downconverted to quoted-printable on the way back from W3C
to Aoyama :-(.

Content-Type: text/plain; charset=ISO-8859-1; format=flowed
X-MIME-Autoconverted: from 8bit to quoted-printable by
scmailgw01.scop.aoyama.ac.jp

The IRI to test,
http://www.sw.it.aoyama.ac.jp/non-existent?r=E9sum=E9
is the same as the one I used in the SVG test.
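
One way to check that the quoted-printable line above still carries Latin-1 octets is to undo the transfer encoding and decode with the declared charset. This is only a sketch using Python's standard `quopri` module on the single line shown; the variable names are illustrative.

```python
import quopri

# The IRI line as it arrived, in quoted-printable form.
line = b"http://www.sw.it.aoyama.ac.jp/non-existent?r=E9sum=E9"

# Undo the transfer encoding: each =E9 becomes the single octet 0xE9.
octets = quopri.decodestring(line)

# Interpret the octets using the charset from the Content-Type header.
text = octets.decode("iso-8859-1")
print(text)
```

If the octet after `r` is 0xE9, as here, the body is still Latin-1; a UTF-8 body would have shown `=C3=A9` instead.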
Dave Thaler
2012-07-11 16:34:52 UTC
Outlook 2010 + IE10:
http://www.sw.it.aoyama.ac.jp/non-existent?résumé
shows in the address bar (which will show IRIs). I suspect that means UTF-8, but a
better test would be one that actually has two pages of content at the two different
URIs, where the content tells you which one was requested (so the display of the
address won't matter).
Dave Thaler
2012-07-11 19:45:14 UTC
Let me take a slightly different example:
http://www.sw.it.aoyama.ac.jp/non-existent?é
(I don't know what charset my email is in)

If the charset were iso-8859-1 then under RFC 3987 as I understand it,
this would become
http://www.sw.it.aoyama.ac.jp/non-existent?%C3%83%C2%A9
In other words, you have to convert iso-8859-1 to UTF-8 and then pct-encode
the UTF-8.

But as I understand 3987bis it would become
http://www.sw.it.aoyama.ac.jp/non-existent?%C3%A9
which would then be passed around via various APIs and protocols
that would not pass the charset along with it.
As such it would be interpreted by the receiving code as pct-encoded UTF-8:
http://www.sw.it.aoyama.ac.jp/non-existent?é
which of course it isn't.
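
The two results above can be reproduced with a short Python sketch under one explicit assumption: that the octets on the wire for the query are C3 A9 while the message is labelled iso-8859-1. That assumption is not stated in the mail; it is simply one reading under which both percent-encoded forms come out as written.

```python
from urllib.parse import quote, unquote

# Assumption: these are the query octets as they sit in the message.
raw = b"\xc3\xa9"

# Path 1: read the octets as iso-8859-1 characters, convert those
# characters to UTF-8, then percent-encode the UTF-8 octets.
chars = raw.decode("iso-8859-1")
via_utf8 = quote(chars, safe="", encoding="utf-8")

# Path 2: percent-encode the original octets directly.
direct = quote(raw, safe="")

# A receiver without charset information decodes path 2 as UTF-8.
seen_by_receiver = unquote(direct, encoding="utf-8")

print(via_utf8, direct, seen_by_receiver)
```

Under this reading, the receiver sees a single character where the labelled charset implied two, which is the mismatch described above.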

Am I missing something?

-Dave