Dear Akamai, you got a /32 there not a bunch of /48s - how to break Facebook and annoy lots of users

Mon Aug 20 21:44:46 CEST 2012

Hi everyone.  How's it hangin'?  Seems like we have once again stirred up a lot of dust unintentionally.  I'm going to try to clear up a few things here, so please pardon the length of this response.  Feel free to let me know if anything was not clear.

First, some housekeeping: Akamai should have route6 objects for all our announcements.  I'll have someone get on that.  Mea culpa; please forgive any inconvenience.  (I thought we already cleaned that up after the last thread, so this is especially embarrassing.)

Second, some stats: Akamai has several thousand nodes, but not all use Akamai-owned IP space.  We prefer to use a block from the ISP hosting the node, and obviously those do not need a separate announcement.  When necessary, i.e. when the hosting ISP cannot give us space, we have to decide whether the node is worth using our own IP space.  Frequently, but not always, the answer is "yes".  There are also some nodes which are multi-homed, and obviously those use our space, but they are a minor percentage of the total number of nodes.

Either way, since Akamai has no backbone, each node with unique, Akamai-owned IP space must (obviously) announce its block independently.  (If anyone suggests something like GRE tunnels, I will ridicule them in public. :)

Now on to the issue at hand.  As per an earlier thread here, we have three /32s.  We deaggregate and hand out /48s to our individual nodes.  We also announce the aggregate, as shown earlier in this thread.  If you have trouble reaching a node, please email noc at akamai.com (24/7), or NetSupport-tix at akamai.com (M-F biz-hours++, but likely a better place to get your problem solved).
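For anyone who wants to see the arithmetic behind "deaggregate a /32 into /48s", here is a minimal sketch using Python's standard ipaddress module.  The prefix 2001:db8::/32 is the IPv6 documentation range, used here purely as a stand-in; it is an assumption for illustration, not one of Akamai's actual /32s, and the variable names are hypothetical.

```python
import ipaddress
from itertools import islice

# Assumption: 2001:db8::/32 (documentation prefix) stands in for a real /32.
aggregate = ipaddress.ip_network("2001:db8::/32")

# A /32 splits into 2**(48 - 32) possible /48s.
per_node_count = 2 ** (48 - aggregate.prefixlen)

# Take the first few /48s lazily instead of building the whole list.
first_nodes = list(islice(aggregate.subnets(new_prefix=48), 3))

# Every more-specific /48 is still covered by the aggregate announcement.
all_covered = all(node.subnet_of(aggregate) for node in first_nodes)

print(per_node_count)
print([str(node) for node in first_nodes])
print(all_covered)
```

The point of the sketch is only that one /32 leaves room for tens of thousands of per-node /48s, each of which still falls under the covering aggregate.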

The interesting thing below is that things "sometimes" work.  If a prefix / path is unavailable, it should not work, full stop.  And assuming you are using a topologically close recursive name server, our system should see the disconnectivity and not return that AAAA when you resolve v6 hostnames.  I'm not sure what is wrong with this particular situation, but we'll be looking into it.  Probably some weirdness around SixXS which I cannot grok in my massively sleep-deprived condition, but that's another matter.

As for whether we should deaggregate PA space, I'm afraid that decision is already made.  We are not asking for 1000+ /32s from the RIRs, and there really isn't another good solution to this problem AFAIK.  We are not trying to cause problems, but we have constraints in which we must work as well.

If this does not fit your view of how the world should work, I am afraid we shall have to agree to disagree - unless you can come up with a better solution than asking for 1000+ /32s.

One last note, on a slightly more personal nit to pick.  People have been screaming about the "exponential" growth of the table for well over a decade, and how the world was coming to an end Any Second Now.  I double-checked Nick and he is right, the sky is not falling.  Of course, it is important to not waste slots needlessly, or otherwise be silly with a limited resource.  But The End is not nigh.

I've been doing this for a little while now, and my biggest fear is not another order of magnitude in the v6 table.  Far more likely to destroy the Internet is the growth of Mbps, Gbps, Tbps - in fact, some would argue it has already caused us harm.

The only realistic way to manage the growth of things like VoD is massive distribution of content.  Everyone who has a _LOT_ of traffic is following in Akamai's footsteps by placing non-network connected caches inside broadband networks.  Assuming ISPs treat content providers alike, you will see this problem with many content companies.

Without this distribution, I posit the Internet would have already failed.  To put this in perspective, 1 in 5 bits on most broadband modems worldwide comes from an Akamai server.  Google has similar traffic, Netflix has a lot of traffic in the US and a non-trivial amount in other countries, etc.  It seems more than obvious to me the real danger is creating obstacles to distributing traffic more widely, not whether we have a few thousand more or even a couple orders of magnitude more prefixes in the v6 DFZ.

But then, maybe I'm biased....

-- 
TTFN,
patrick

P.S. Regarding the Subject line: Jeroen, we have different definitions of "lots of users".

On Aug 20, 2012, at 14:48 , Nick Hilliard <nick at foobar.org> wrote:

> On 20/08/2012 19:25, Marco d'Itri wrote:
>> Because some people are trying hard to not repeat the same errors which 
>> are causing the tragedy of the commons of the IPv4 DFZ.
> 
> There is no tragedy of the commons here.
> 
> The Tragedy of the Commons was a cautionary historical incident where a
> small number of people abused a common resource and caused a complete,
> permanent collapse of that resource.
> 
> In the case of the ipv4 dfz, we have a commonage which is well within the
> scaling bounds of equipment which has been on sale for the last 7 years,
> and which looks like it will scale for several years to come.  At that
> stage, normal kit retirement will come into play and the next generation of
> kit will scale well beyond what's currently available, which will
> accommodate expected DFZ growth for many years to come.
> 
> Even if for some reason the v6 table explodes and smashes everyone's
> forwarding engines, we still then have the option of targeted filtering
> because on a global scale, resource consumption will tend to follow a
> Pareto distribution.  This means we can cherry pick the greatest abusers
> and filter them until they sort out their broken policies (i.e. it will
> hurt them more than anyone else).
> 
> All of which is to say that in the worst case, it is not feasible at
> current usage growth rates that we will sustain a complete collapse of the
> Internet due to unconstrained DFZ growth, even in the long term.
> 
> The sky is not falling (I checked earlier today).
> 
> Even still, let's just be sensible.  Let's do our prefix aggregation
> carefully because we know that too much is bad.
> 
>> You mean, get 800 separate PI assignments from the RIRs?  What
>> problem is that going to solve other than annoying the LIRs?  Would you be
>> happier if Akamai announced 800 /32s instead?
> 
> But if Akamai or some other organisation has 800 publicly routed
> sites, then they're going to need 800 v6 prefixes.  It is pointless to tell
> them that they need to use /32 for each just to get around people's
> filters.  Insisting on /32 for each site is fixing the wrong problem.
> 
> Also, please note ripe-555, section 4.
> 
> Nick
>