Some very nice broken IPv6 networks at Google and Akamai (Was: Some very nice IPv6 growth as measured by Google)

Jeroen Massar jeroen at massar.ch
Sun Nov 9 19:58:56 CET 2014


On 2014-11-09 12:00, Tore Anderson wrote:
> * Jeroen Massar
> 
>> On 2014-11-08 18:38, Tore Anderson wrote:
>>> Yannis: «We're enabling IPv6 on our CPEs»
>>> Jeroen: «And then getting broken connectivity to Google»
>>>
>>> I'm not a native speaker of English, but I struggle to understand it
>>> any other way than you're saying there's something broken about
>>> Yannis' deployment. I mean, your reply wasn't even a standalone
>>> statement, but a continuation of Yannis' sentence. :-P
>>
>> That statement is correct though. As Google and Akamai IPv6 are
>> currently broken, enabling IPv6 thus breaks connectivity to those
>> sites.
> 
> Only if Google and Akamai are universally broken, which does not seem
> to have been the case. I tested Google from the RING at 23:20 UTC
> yesterday:

And Google confirmed that they fixed "something", though we'll never
really know what they fixed.

Your test was done from colocated hosts, while real people use access
networks.

Thus while such a test gives insight that some of it works, it does not
cover corner cases.


Also note that the Akamai problem (which still persists) is a random
one. Hence whether fetching a single URL works or not is pure luck. As a
typical page has multiple objects though, you'll hit the problem much
more quickly.
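
To put a rough number on that, here is a minimal sketch. It assumes every
object fetch fails independently with the same probability; the 5% figure
and the function name are made up for illustration, not measured from
Akamai.

```python
def chance_of_hitting_failure(p_fail: float, n_objects: int) -> float:
    """Probability that at least one of n independent fetches fails."""
    return 1.0 - (1.0 - p_fail) ** n_objects


# Hypothetical 5% per-object failure rate: compare a single URL with
# pages that pull in 10 or 50 objects.
for n in (1, 10, 50):
    print(n, round(chance_of_hitting_failure(0.05, n), 3))
```

Under those assumptions a single fetch usually looks fine, while a page
with a few dozen objects almost always trips over the problem.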


>> No, PMTUD is fine in both IPv4 and IPv6.
>>
>> What is broken is people wrongly recommending breaking and/or
>> filtering ICMP, which indeed breaks PMTUD.
> 
> There's a critical mass of broken PMTUD on the internet (for whatever
> reasons). It does not matter whose fault it is, the end result is the
> same - the mechanism cannot be relied upon if you actually care about
> service quality.
> 
> From where I'm sitting, Google is advertising me an IPv6 TCP MSS of
> 1386. That speaks volumes. I don't believe for a second that my local
> Google cluster is on links with an MTU of 1434; the clamped TCP MSS must
> have been configured intentionally, and the only reason I can
> think of to do so is to avoid PMTUD.
> 
> What works fine in theory sometimes fails operationally (cf. 6to4).
> Insisting that there exists no problem because it's just everyone else
> who keeps screwing it up doesn't change operational realities.

I am not 'insisting' that there is no problem with PMTUD.

I am stating that the problem has to be fixed at the source, not hidden
in the network.


>> I also have to note that in the 10+ years of having IPv6 we rarely saw
>> PMTU issues, and if we did, contacting the site that was filtering
>> fixed the issue.
> 
> Looking at it from the content side, users using IPv6 tunnels are in a
> tiny, tiny minority, while still managing to be responsible for a
> majority of trouble reports.

Maybe as those users are more technically experienced and are able to
get their message out, while non-techie users just disable IPv6 as is
advised in a LOT of places? :)

[..]
> Native users are immune against these problems, because they do not have
> to use PMTUD.

You are forgetting the little fact that "native" is a really strange
word. Quite a few DSL deployments use PPPoE etc.

There are also a lot of "native" deployments out there that use 6rd.


Instead of just coming with "TUNNELS SUCK!!!!@&$!@#&$%^!*@%!", actually
contact the networks that are broken and try to get them to fix the
problem. You might not want to fix those as it is not your problem, but
it is a problem for access networks.

Note btw that Google is not stating anything about the problem they had.
And Akamai, well, they are still digging.

Thus PMTUD might be an issue, might also be something else completely.

Without insight into those systems, one just has to guess.




>> The two 'workarounds' you mention are all on the *USER* side (RA MTU)
>> or in-network, where you do not know if the *USER* has a smaller MTU.
> 
> LAN RA MTU, yes. TCP MSS, no - it can be done in the ISP's tunnel
> router.

Do you really suggest making the Internet have an MTU of 1280? :)

>> Hence touching it in the network is a no-no.
> 
> It appears to me that the ISPs that are deploying tunnels (6RD) for
> their users consider these a "yes-yes". Presumably because they've
> realised that reducing reliance on PMTUD is in their customers' best
> interest, as it gives the best user experience.
> 
> Is there *any* ISP in the world that does 6RD that does *not* do TCP MSS
> clamping and/or reduced LAN RA MTUs? (Or, for that matter, does IPv4
> through PPPoE and does not do TCP MSS clamping?)
> 
> For what it's worth, the vast majority of tunneled IPv6 traffic we see
> comes from ISPs with 6RD, which generally works fine due to these
> workarounds. Thankfully.

Till people start using non-TCP protocols, and everything breaks.

Hence, don't hide the fact, instead fix it.
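
A minimal sketch of why MSS clamping does not help there, assuming a plain
40-byte IPv6 header with no extension headers (the function names are
illustrative only). UDP has no MSS to clamp, so a datagram sized for a
1500-byte link simply does not fit a 1480-byte tunnel path, and the sender
only finds out through the ICMPv6 Packet Too Big message.

```python
IPV6_HEADER = 40  # fixed IPv6 header; no extension headers assumed
UDP_HEADER = 8


def max_udp_payload(path_mtu: int) -> int:
    """Largest UDP payload that fits in one unfragmented IPv6 packet."""
    return path_mtu - IPV6_HEADER - UDP_HEADER


def fits_path(payload: int, path_mtu: int) -> bool:
    """True if a datagram with this payload crosses the path unfragmented."""
    return payload <= max_udp_payload(path_mtu)


# A sender on a 1500-byte LAN fills the packet completely...
payload = max_udp_payload(1500)
# ...but the path contains a tunnel with an MTU of 1480.
print(payload, fits_path(payload, 1480))
```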

[..]
>> That is indeed an assumption, as we can't see the Google/Akamai end of
>> the connection.
> 
> If you see failures on MTU=1500 links, I think there must be at least
> two distinct problems at play. When users report «MTU 1480 MSS 1220 =
> fix», then that is extremely indicative of a PMTUD problem.

That was reported for the Google case. When I tested it myself, it did
not fix anything.

As you already stated, Google is announcing a small MSS themselves...


> With MTU=1500 links, PMTUD isn't necessary, so it must be some other
> root cause.

Of course PMTUD is needed. Not all links on the Internet are 1280.

Heck, there are even some "transit" providers that use tunnels and thus
have an MTU below 1500.
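
For reference, a rough sketch of the arithmetic behind these numbers. The
overhead table and function names are illustrative assumptions; the header
sizes are the standard ones (IPv4 header 20 bytes for 6in4/6rd, PPPoE plus
PPP 8 bytes, IPv6 header 40 bytes, TCP header 20 bytes without options).

```python
IPV6_HEADER = 40
TCP_HEADER = 20  # without TCP options

# Bytes lost to encapsulation on a 1500-byte underlying link.
ENCAP_OVERHEAD = {"6in4/6rd": 20, "pppoe": 8}


def tunnel_mtu(link_mtu: int, encapsulation: str) -> int:
    """IPv6 MTU left over after the encapsulation header is added."""
    return link_mtu - ENCAP_OVERHEAD[encapsulation]


def tcp_mss(ipv6_mtu: int) -> int:
    """TCP MSS that exactly fills an IPv6 packet of the given MTU."""
    return ipv6_mtu - IPV6_HEADER - TCP_HEADER


for name in ENCAP_OVERHEAD:
    mtu = tunnel_mtu(1500, name)
    print(name, mtu, tcp_mss(mtu))

# The IPv6 minimum MTU of 1280 gives the MSS of 1220 mentioned above.
print(1280, tcp_mss(1280))
```

So the reported "MTU 1480 MSS 1220" combination is a 6in4/6rd-sized MTU
together with an MSS clamped all the way down to the IPv6 minimum.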


>> As you are wearing the hat of a hoster though, you should as there are
>> eyeballs that you want to reach that are behind tunnels and other
>> linktypes with a lower MTU than 1500.
>>
>> Hence, I can only suggest to do your testing also from behind a node
>> that has a lower MTU. eg by configuring a monitoring node with a
>> tunnel into your own network and setting the MTU lower, or do the
>> MTU-in-RA-trick for that interface.
> 
> The problem is that even if tunneled user A has no problems with PMTUD,
> tunneled user B might have. So testing from a node that has a lower MTU
> can't tell me with any degree of certainty that «tunneled users are
> fine».

There is never a 100% test. But at least you cover your bases there.

Greets,
 Jeroen




More information about the ipv6-ops mailing list