From the dualstack-is-fun department...

Tue Mar 1 13:29:02 CET 2011

On Tue, Mar 1, 2011 at 8:31 AM, Cameron Byrne <cb.list6 at gmail.com> wrote:
>
> On Feb 28, 2011 10:41 PM, "Andrew Yourtchenko" <ayourtch at gmail.com> wrote:
>>
>> Daniel,
>>
>> On Tue, Mar 1, 2011 at 1:34 AM, Daniel Roesen <dr at cluenet.de> wrote:
>> > On Tue, Mar 01, 2011 at 12:07:36AM +0000, Bjoern A. Zeeb wrote:
>> >> And, would you have noticed the IPv6 related bug if happy-eyeballs was
>> >> already implemented or would legacy IP have worked well enough for you
>> >> to not notice (read as - what's better?, getting the bug fixed or not
>> >> noticing and harming IPv6 for longer)?
>> >
>> > IPv6 is not customer-driven, but provider-driven. So the #1 priority must be
>> > that "things keep working" as painlessly as possible.
>> >
>> > But you're right in the sense that with Happy Eyeballs we need methods
>> > to measure problems being masked by HE. How to do that is one of the
>> > things that seem to be missing in the Happy Eyeballs discussion.
>>
>> Totally agree. As I told Bjoern when we met at FOSDEM a few
>> weeks ago - from a pure engineering point of view, I think a good
>> thing that could happen is for IPv4 to suddenly vanish from the face
>> of the earth for 3-4 months. Then we would notice all the problems and
>> fix them (very fast ;-). (Un)fortunately this is not possible, as it
>> would be a major catastrophe from the user experience point of view.
>>
>> Happy Eyeballs is a bit on the other side of the spectrum - by working
>> hard to make the UX as seamless as possible, it does indeed mask these
>> kinds of problems - so with it the chances are high that these
>> problems will not be noticed. Actually, even more so, since the
>> opportunistic connection establishment that you mentioned in the first
>> mail might not even happen if a single protocol consistently wins (so
>> the claim about the increase in load is not 100% true).
>>
>> We plan a bar bof @Prague, I will definitely bring this topic up there
>> too - meantime if you have ideas, feel free to write them up for the
>> discussion.
>>
>> Side remark: I have noticed this trend overall - the more robust you
>> make a protocol against external influences (soft failures instead of
>> hard failures), the "nicer" the user experience is, and the more
>> hellish debugging this protocol becomes for the support/dev folks when
>> the experience slowly degrades to the point of being unacceptable.
>> It's a tough choice.
>>
>
> This also creates the ugly situation where a customer calls the help desk
> saying website x is down, the support person tries to get to website x, and
> it works. The help desk says "works for me", because IPv6 access (or, dare I
> say, IPv4 access) is broken for the non-HE user but works for the HE user.
> If the non-HE user cannot easily convince others that there is a problem,
> that is bad.

Yes, we already have in the latest text:

"Debugging and Troubleshooting

This mechanism is aimed to help the user experience in case of connectivity
problems. However, this precise reason also makes it tougher to use these
applications as a means of the verification that the problems are fixed. To
assist in that regard, the applications implementing the proposal in this
document SHOULD also provide a mechanism to temporarily use only one
address family."

Too weak ? Wrong approach ?
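
To make the "temporarily use only one address family" idea concrete, here
is a rough sketch of what such a switch could look like in an application.
Everything in it is my assumption, not something from the draft: the
HE_FORCE_FAMILY environment variable and both function names are
hypothetical, and a real application might prefer a command-line flag or a
settings dialog.

```python
import os
import socket

# Hypothetical knob (not defined by the draft): HE_FORCE_FAMILY=4 or 6
# pins the application to a single address family for troubleshooting.
_FAMILIES = {"4": socket.AF_INET, "6": socket.AF_INET6}


def forced_family(env=None):
    """Return the pinned address family, or AF_UNSPEC for normal dual-stack."""
    env = os.environ if env is None else env
    value = env.get("HE_FORCE_FAMILY", "").strip()
    # Unknown or missing values fall back to the unrestricted behaviour.
    return _FAMILIES.get(value, socket.AF_UNSPEC)


def candidate_addresses(host, port, env=None):
    """Resolve host, restricted to the pinned family if one is set."""
    return socket.getaddrinfo(host, port, forced_family(env), socket.SOCK_STREAM)
```

With HE_FORCE_FAMILY=6 set, the application only ever sees IPv6 candidates,
so a fix for a broken IPv6 path can be verified without the fallback hiding
the result.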

>
> This is a support nightmare as HE masks the issue and will not be uniformly
> deployed -- ever.
>
> This is a classic dilemma. Masking the problem ostensibly makes it go away,
> but at the same time hampers the ability to resolve it. It is kind of
> like beer :)
> and beer is good, especially when I have been troubleshooting
> connectivity issues all day and my customers keep telling me websites
> are down
> ... but not all of them ... and they all work for me ....

...and "no-one changed anything". (That's what everyone has said for the
past 15 or so years. I keep asking just in case, to see how often it's
"we changed X and Y has broken". I can count those occasions on the
fingers of one hand, vs. the ~mid-4-digits number of the other outcome.
So fundamentally nothing changes - it breaks by itself today too :-)

(but seriously: I appreciate all of the comments. I thought the above
blurb about troubleshooting in the draft should be enough, but maybe
it is not worded strongly enough.
Maybe there needs to be some way to flag the problem that has been
worked around. To whom? How? Is it supposed to be specific to this
area, or should there maybe be something generic? I think I'm getting
on a hyperbolic HE-tangent trajectory with these questions.)
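
Just to have something to shoot at for the "flag the problem" question,
here is a sketch of the simplest local variant: count every case where one
address family failed and the other one quietly rescued the connection.
This is an assumption-laden illustration, not the draft's algorithm: it
tries IPv6 and then IPv4 sequentially instead of racing them, and the names
masked, connect_and_flag and _connect are made up. Where the counters
should be reported is exactly the open question.

```python
import socket
from collections import Counter

# (host, name of the failed family) -> how often the failure was hidden
masked = Counter()


def _connect(family, host, port, timeout=3.0):
    """Plain single-family TCP connect; raises OSError on failure."""
    last_error = OSError(f"no addresses for {host}")
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    for af, kind, proto, _name, addr in infos:
        sock = socket.socket(af, kind, proto)
        sock.settimeout(timeout)
        try:
            sock.connect(addr)
            return sock
        except OSError as exc:
            sock.close()
            last_error = exc
    raise last_error


def connect_and_flag(host, port, connect=_connect):
    """Try IPv6, then IPv4; record failures that the fallback masked."""
    failures = []
    for family in (socket.AF_INET6, socket.AF_INET):
        try:
            conn = connect(family, host, port)
        except OSError:
            failures.append(family)
            continue
        # Success after an earlier failure: the user saw nothing, so note it.
        for failed in failures:
            masked[(host, failed.name)] += 1
        return conn
    raise OSError(f"{host}: all address families failed")
```

The connect parameter is only there so the fallback logic can be exercised
without touching the network.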

cheers,
andrew

>
> Cb
>> cheers,
>> andrew
>