Discussion:
[iri] #44: Reference Unicode TR 46, and if yes, how?
iri issue tracker
2011-10-21 22:50:22 UTC
#44: Reference Unicode TR 46, and if yes, how?


Comment (by duerst@…):

It may make sense to watch other internet drafts as they move through the
IESG for approval, and then look at how we can reuse their text. (As an
example, http://tools.ietf.org/html/draft-ietf-websec-origin-06 mentions
both IDNA 2003 and 2008, in
http://tools.ietf.org/html/draft-ietf-websec-origin-06#section-8.4,
IDNA dependency and migration.)
--
----------------------+------------------------
Reporter: duerst@… | Owner: addison@…
Type: defect | Status: new
Priority: major | Milestone:
Component: 3987bis | Version:
Severity: - | Resolution:
Keywords: |
----------------------+------------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/44#comment:3>
iri <http://tools.ietf.org/wg/iri/>
Peter Saint-Andre
2011-11-09 22:33:05 UTC
Post by iri issue tracker
#44: Reference Unicode TR 46, and if yes, how?
It may make sense to watch other internet drafts as they move through the
IESG for approval, and then look at how we can reuse their text. (As an
example, http://tools.ietf.org/html/draft-ietf-websec-origin-06 mentions
both IDNA 2003 and 2008, in
http://tools.ietf.org/html/draft-ietf-websec-origin-06#section-8.4,
IDNA dependency and migration.)
<hat type='individual'/>

It's not clear to me how UTR 46 is quite on-target for IRIs in general.
If anything, UTR 46 might be referenced from one of the IDNA specs, but
not from the IRI spec (IMHO).

Peter
--
Peter Saint-Andre
https://stpeter.im/
John C Klensin
2011-11-09 23:14:37 UTC
--On Wednesday, November 09, 2011 15:33 -0700 Peter Saint-Andre
Post by Peter Saint-Andre
# 44: Reference Unicode TR 46, and if yes, how?
It may make sense to watch other internet drafts as they
move through the IESG for approval, and then look at how we
can reuse their text. (As an example,
http://tools.ietf.org/html/draft-ietf-websec-origin-06
mentions both IDNA 2003 and 2008, in
http://tools.ietf.org/html/draft-ietf-websec-origin-06#section-8.4,
IDNA dependency and migration.)
<hat type='individual'/>
It's not clear to me how UTR 46 is quite on-target for IRIs in
general. If anything, UTR 46 might be referenced from one of
the IDNA specs, but not from the IRI spec (IMHO).
Peter,

Let me say that a little more strongly. URIs and IRIs need to
be in some sort of reduced canonical form or basically all hope
of comparing them (including for caching purposes) without some
rather complicated algorithm disappears. To the extent to which
they are a good idea at all, mapping procedures like UTR 46 and
RFC 5895 are useful for providing users with more convenience
and flexibility. But, to the extent to which URIs and IRIs are
going to be used between systems, used to identify cached
content, etc., they just don't belong in them. Worse, neither
UTR 46 nor RFC 5895 (especially the former) is a general-purpose
mapping/equivalence routine. They are specific to IDNA and,
to a considerable measure, motivated by a desire to smooth out
the IDNA2003 -> IDNA2008 transition.
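
As a rough illustration of the comparison problem, here is a minimal sketch in Python. The `map_typed_iri` helper and the cache contents are hypothetical, and the mapping (NFKC plus lowercasing of the host) only stands in for the kind of user-convenience mapping discussed here; it is not an implementation of UTR 46 or RFC 5895. It suggests why an exact-match cache lookup only works when any mapping happens before the identifier is passed between systems.

```python
import unicodedata
from urllib.parse import urlsplit, urlunsplit


def map_typed_iri(iri: str) -> str:
    """Hypothetical user-input mapping: NFKC and lowercase on the host only.

    An assumption for illustration, not UTR 46 or RFC 5895.
    """
    parts = urlsplit(iri)
    host = unicodedata.normalize("NFKC", parts.netloc).lower()
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


# A cache keyed by exact string comparison on the canonical form.
cache = {"http://example.com/index": "cached body"}

# What a user might type: fullwidth "E" (U+FF25) and an uppercase TLD.
typed = "http://\uff25xample.COM/index"

hit_without_mapping = typed in cache               # exact match fails
hit_with_mapping = map_typed_iri(typed) in cache   # mapped form matches

print(hit_without_mapping, hit_with_mapping)
```

In this sketch the mapping runs once, at the point of user input, and everything exchanged afterwards is already in the reduced form, so the cache can keep using plain string equality.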

john
Peter Saint-Andre
2011-11-09 23:26:46 UTC
Post by John C Klensin
--On Wednesday, November 09, 2011 15:33 -0700 Peter Saint-Andre
Post by Peter Saint-Andre
# 44: Reference Unicode TR 46, and if yes, how?
It may make sense to watch other internet drafts as they
move through the IESG for approval, and then look at how we
can reuse their text. (As an example,
http://tools.ietf.org/html/draft-ietf-websec-origin-06
mentions both IDNA 2003 and 2008, in
http://tools.ietf.org/html/draft-ietf-websec-origin-06#section-8.4,
IDNA dependency and migration.)
<hat type='individual'/>
It's not clear to me how UTR 46 is quite on-target for IRIs in
general. If anything, UTR 46 might be referenced from one of
the IDNA specs, but not from the IRI spec (IMHO).
Peter,
Let me say that a little more strongly. URIs and IRIs need to
be in some sort of reduced canonical form or basically all hope
of comparing them (including for caching purposes) without some
rather complicated algorithm disappears. To the extent to which
they are a good idea at all, mapping procedures like UTR 46 and
RFC 5895 are useful for providing users with more convenience
and flexibility. But, to the extent to which URIs and IRIs are
going to be used between systems, used to identify cached
content, etc., they just don't belong in them. Worse, neither
UTR 46 nor RFC 5895 (especially the former) is a general-purpose
mapping/equivalence routine. They are specific to IDNA and,
to a considerable measure, motivated by a desire to smooth out
the IDNA2003 -> IDNA2008 transition.
<hat type='individual'/>

You're preaching to the choir. :)

I see no reason to reference either UTR 46 or RFC 5895 in 3987bis, but
other WG participants might disagree.

Peter
--
Peter Saint-Andre
https://stpeter.im/
Chris Weber
2011-11-11 05:25:16 UTC
Post by Peter Saint-Andre
Post by John C Klensin
Peter,
Let me say that a little more strongly. URIs and IRIs need to
be in some sort of reduced canonical form or basically all hope
of comparing them (including for caching purposes) without some
rather complicated algorithm disappears. To the extent to which
they are a good idea at all, mapping procedures like UTR 46 and
RFC 5895 are useful for providing users with more convenience
and flexibility. But, to the extent to which URIs and IRIs are
going to be used between systems, used to identify cached
content, etc., they just don't belong in them. Worse, neither
UTR 46 nor RFC 5895 (especially the former) is a general-purpose
mapping/equivalence routine. They are specific to IDNA and,
to a considerable measure, motivated by a desire to smooth out
the IDNA2003 -> IDNA2008 transition.
<hat type='individual'/>
You're preaching to the choir. :)
I see no reason to reference either UTR 46 or RFC 5895 in 3987bis, but
other WG participants might disagree.
Peter
It sounds like you both agree, and after reading through the original
thread started by Julian
<http://lists.w3.org/Archives/Public/public-iri/2010Sep/0010.html> it
seems this was originally a question for the Section 3.4 Mapping
ireg-name, which has since been corrected. The topic of
canonicalization has been moved along with IRI comparison to
draft-ietf-iri-comparison.

Best regards,
Chris Weber
Mark Davis ☕
2011-11-11 07:16:14 UTC
It really entirely depends on what IRIs are being used for, and what degree
of backwards compatibility is needed. It would break compatibility with
many to most existing implementations to restrict them to IDNA2008. For
example, currently an IRI can be of the form
"http://ÖBB.at <http://xn--bb-eka.at>",
and programs (IE, Chrome, ..., search engines, etc.) expect that to be valid.
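
The mapping implied by that example can be sketched with Python's built-in "idna" codec, which implements the IDNA2003 rules (RFC 3490), including Nameprep case folding. This is only an illustration of IDNA2003-style behavior, under the assumption that the codec is a fair stand-in for what such programs do; it is not UTR 46 and not IDNA2008. The variable names are made up for the sketch.

```python
# Python's built-in "idna" codec follows IDNA2003 (RFC 3490): each label is
# mapped with Nameprep (which case-folds) and then Punycode-encoded.
typed_host = "\u00d6BB.at"             # "ÖBB.at" as a user might type it

ace_host = typed_host.encode("idna")   # map, then encode each label
print(ace_host)                        # b'xn--bb-eka.at'

# Decoding yields the lowercase form, not the string the user typed, so the
# uppercase spelling is accepted only because of the mapping step.
decoded_host = ace_host.decode("idna")
print(decoded_host == typed_host)      # False
```

A processor that applies no mapping would have to treat the uppercase spelling either as invalid or as a different name, which is the compatibility concern raised here.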

If, as John said, an IRI is *only* used to express a canonical form, so
that it is ok that "http://ÖBB.at <http://xn--bb-eka.at>" and
"https://mail.google.com//mail" and so on are illegal IRIs, then it would be
fine* to restrict IRIs to IDNA2008.

(* mostly fine. IDNA2008 does not guarantee backwards compatibility when
used with different versions of Unicode, unfortunately. Luckily there is
only one character so far that used to be valid under IDNA2008 but is no
longer, and that character is fairly obscure.)

Mark
— Il meglio è l’inimico del bene —
[https://plus.google.com/114199149796022210033]
Larry Masinter
2011-11-12 03:22:12 UTC
I think I understand the problem statement, but I don't think I agree with the solution.

Each application has to decide what kind of comparison is appropriate for the application, and comparison for the purpose of "caching" will be different from comparison for the purpose of "cache invalidation" or "deciding whether preloading the cache is useful".

And in any case, "caching" is on a protocol-by-protocol basis; caching for "http" vs "https" vs "ftp" (if that is ever cached) would all be different.
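
To make the point about application-specific comparison concrete, here is a minimal sketch in Python. The two comparison functions loosely follow the simple string comparison and case normalization steps of RFC 3986, Section 6.2; the `COMPARATORS` table and the assignment of a comparison to each purpose are purely hypothetical, chosen only to show that two applications can legitimately disagree about whether the same pair of identifiers is "equal".

```python
from urllib.parse import urlsplit


def simple_string_equal(a: str, b: str) -> bool:
    """Character-by-character comparison, with no normalization at all."""
    return a == b


def syntax_key(iri: str):
    """Comparison key with scheme and host lowercased.

    urlsplit lowercases the scheme, and .hostname returns the host in
    lowercase. Userinfo is ignored here, which is a simplification.
    """
    parts = urlsplit(iri)
    return (parts.scheme, parts.hostname, parts.port,
            parts.path, parts.query, parts.fragment)


def case_normalized_equal(a: str, b: str) -> bool:
    """Equality after lowercasing the case-insensitive components."""
    return syntax_key(a) == syntax_key(b)


# Hypothetical assignment of comparison functions to purposes.
COMPARATORS = {
    "cache-lookup": simple_string_equal,
    "preload-hint": case_normalized_equal,
}

a = "HTTP://Example.COM/a"
b = "http://example.com/a"

for purpose, same in COMPARATORS.items():
    print(purpose, same(a, b))
```

The path is left untouched in both functions, so "http://example.com/A" and "http://example.com/a" remain different under either comparison.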

So to the extent that there's a reference to UTR 46 and RFC 5895, the reference would be in the comparison document...

Do you agree?
Post by Peter Saint-Andre
Post by John C Klensin
Peter,
Let me say that a little more strongly. URIs and IRIs need to be in
some sort of reduced canonical form or basically all hope of
comparing them (including for caching purposes) without some rather
complicated algorithm disappears.
To the extent to which they are a
good idea at all, mapping procedures like UTR 46 and RFC 5895 are
useful for providing users with more convenience and flexibility.
But, to the extent to which URIs and IRIs are going to be used
between systems, used to identify cached
content, etc., they just don't belong in them.
I think all of this belongs in the comparison document now, and not in 3987bis.
Worse, neither
Post by Peter Saint-Andre
Post by John C Klensin
UTR 46 nor RFC 5895 (especially the former) is a general-purpose
mapping/equivalence routine. They are specific to IDNA and, to a
considerable measure, motivated by a desire to smooth out
the IDNA2003 -> IDNA2008 transition.
<hat type='individual'/>
You're preaching to the choir. :)
I see no reason to reference either UTR 46 or RFC 5895 in 3987bis, but
other WG participants might disagree.
Peter
It sounds like you both agree, and after reading through the original thread started by Julian <http://lists.w3.org/Archives/Public/public-iri/2010Sep/0010.html> it seems this was originally a question for the Section 3.4 Mapping
ireg-name, which has since been corrected. The topic of
canonicalization has been moved along with IRI comparison to draft-ie
iri issue tracker
2012-02-16 01:10:58 UTC
#44: Reference Unicode TR 46, and if yes, how?

Changes (by duerst@…):

* owner: addison@… => duerst@…
* component: 3987bis => comparison


Comment:

Issue moved to comparison document. Look at text in
http://tools.ietf.org/html/draft-ietf-websec-strict-transport-sec-04.
--
------------------------+-----------------------
Reporter: duerst@… | Owner: duerst@…
Type: defect | Status: new
Priority: major | Milestone:
Component: comparison | Version:
Severity: - | Resolution:
Keywords: |
------------------------+-----------------------

Ticket URL: <http://trac.tools.ietf.org/wg/iri/trac/ticket/44#comment:4>
iri <http://tools.ietf.org/wg/iri/>