IPv6 Routing (from hell)

Fri Mar 28 04:46:46 CET 2008

Bernhard Schmidt wrote:
> Hi Nick,
>
>> Michael, you really need to get your own ipv6 address range, get an
>> ASN and talk BGP.  You can do cute and interesting things and have
>> multiple access points from your network into multiple upstreams. 
>> Yeah, you can fudge around with tunnels and build brokenness into
>> your network from day 1, but trust me, you'll regret it.
>
Yes, having a real IPv6 address range, BGP, and an ASN would be helpful,
but so far I can't see a way of avoiding tunnels somewhere in the
architecture.

The costs involved in getting an allocation from APNIC are prohibitive
for a small community-oriented network like air-stream (or for my
other OLPC project in Nicaragua). Perhaps all the wireless community
networks across Australia could band together to become "real", however.
Perhaps one already is...

There are real problems on all levels of the stack, levels 7 and 8
(financial and political) not least among them.
> Those networks usually suffer from a lack of internal bandwidth, so
> you really need optimized ingress routing as well. Impossible to do
> without further deaggregation (down to having each endsite (=/64)
> announced on the closest ingress), which pollutes the routing table
> even more.
>
This is why I felt tunnels were the answer. Whether the tunnels run over
IPv4 or IPv6 is irrelevant. Let me simplify that diagram to focus more on
the external routing issues:

 M1 M2 M3 M4 M5 M6 M7 M8 M9 MX
     G1        G2        G3
       \       |        /      (tunnels)
               G0
               |
            Internet

G1-3 all exchange detailed routing information with G0. G0 presents a
/48 to the world, but does not need to share any of that internal detail
with the world. The connections to G1-G3 are tunnels (running over any
protocol). G0 "figures out" the best routes to each /64.
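
As a rough illustration of that split, here is a minimal Python sketch.
The prefixes come from the 2001:db8::/32 documentation range and the
/64-to-gateway assignments are invented for the example; none of this is
real air-stream addressing, and the function names are hypothetical.

```python
import ipaddress

# Hypothetical addressing: the /48 that G0 announces to the world.
SITE = ipaddress.ip_network("2001:db8:abcd::/48")

# Hypothetical internal table held at G0: each mesh /64 and the gateway
# (G1-G3) whose tunnel currently offers the best path to it.
INTERNAL = {
    ipaddress.ip_network("2001:db8:abcd:1::/64"): "G1",
    ipaddress.ip_network("2001:db8:abcd:4::/64"): "G1",
    ipaddress.ip_network("2001:db8:abcd:9::/64"): "G3",
}


def external_announcements():
    """What the rest of the Internet sees: the aggregate only."""
    for net in INTERNAL:
        if not net.subnet_of(SITE):
            raise ValueError(f"{net} is outside {SITE}")
    return [SITE]


def next_gateway(addr):
    """Longest-prefix match over the internal table; None if nothing matches."""
    ip = ipaddress.ip_address(addr)
    matches = [net for net in INTERNAL if ip in net]
    if not matches:
        return None
    return INTERNAL[max(matches, key=lambda net: net.prefixlen)]
```

In this sketch the outside world gets one route regardless of how many
/64s G0 tracks internally, and next_gateway("2001:db8:abcd:9::1") picks
the tunnel to G3.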

A mesh node going from M1 to M4 would probably use the mesh.

A mesh node needing to get from M1 to M9 will find it more efficient
to go M1-G1-G0-G3-M9.
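
To make that choice concrete, here is a small Python sketch that runs a
plain shortest-path search over the simplified diagram. The link costs
and the points where G1, G2 and G3 attach to the mesh (M1, M5 and M9)
are assumptions picked for illustration, not measurements.

```python
import heapq

MESH_HOP = 4  # assumed cost of one wireless mesh hop
ACCESS = 1    # assumed cost of a mesh node reaching its local gateway
TUNNEL = 5    # assumed cost of one tunnel between a gateway and G0


def build_links():
    """Build an undirected graph: a chain M1..M9 plus three gateways and G0."""
    links = {}

    def add(a, b, cost):
        links.setdefault(a, {})[b] = cost
        links.setdefault(b, {})[a] = cost

    for i in range(1, 9):
        add(f"M{i}", f"M{i + 1}", MESH_HOP)
    # Assumed attachment points for the gateways.
    for mesh_node, gateway in (("M1", "G1"), ("M5", "G2"), ("M9", "G3")):
        add(mesh_node, gateway, ACCESS)
        add(gateway, "G0", TUNNEL)
    return links


def best_path(links, src, dst):
    """Dijkstra's algorithm; returns (total_cost, [nodes]) or None."""
    queue = [(0, src, [src])]
    seen = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, step in links[node].items():
            if nxt not in seen:
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return None


LINKS = build_links()
```

With these assumed costs, best_path(LINKS, "M1", "M4") stays on the mesh
(M1-M2-M3-M4), while best_path(LINKS, "M1", "M9") goes M1-G1-G0-G3-M9.
Different costs would of course move the crossover point.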

The problem arises when you want to have G0, G0', and G0'' located in
different POPs, or if you wanted one of G1-3 to be elected as G0 based
on availability. I've basically been convinced by this discussion so far
that this is an unsolvable problem, given asymmetric data pipes between
the POP and G1-3, and given that G0 needs to exist in a co-lo, providing
not only /48 routing on the outside and /64 (or less) on the inside,
but also failover tunnelling to each of the known G1-X gateways.

That is also an unsolved problem with existing tunnel brokers(?), but,
at least, you don't need to propagate anything inside the network past
G0. Without a parallel routing infrastructure extending all the way up
to the international fiber interconnects, IPv6 networks end up being
very regional in nature in order to be efficient, and tunnels seem
inescapable.

The compromise, which depends on good interconnects between the various
G1-G3 ISP POPs, is probably not too horrible: on the order of 10ms
(SWAG) in most cases, more where they interconnect in another city.

> Folks are debating about giving such networks one routing slot already
> (see the PI discussions),
Where?
> doing PI and BGP on those networks either results in internal
> congestion (announcing a /48 on all ingress points and forwarding the
> traffic all the way through the mesh to the destination) or
> deaggregating to possibly hundreds of routing slots for optimized
> ingress in reasonably sized networks.
Yes, most routing information needs to make it to G*, but it can stay there.
> And remind you, we are usually talking about dozens of el-cheapo 20
> EUR consumer DSL connections to different ISPs in those meshing
> networks, there is just no way of getting native IPv6 with BGP on there.
>
Well, for the G1-GX I was thinking dozens of cheap BSD or Linux x86
boxes rescued from the graveyard, attached to those el-cheapo DSL
connections. Memory is not a problem. Maintenance is (but if you have
dozens of gateways, failure is less of a problem). OpenWrt - maybe. I
sure wish 32+MB versions of stuff based on that code existed.

G0 and G0' have to be robust, with at least 512MB of RAM, but that's no
problem.

The Ms need to be dirt simple, cheap, stupid - they don't need
anything more complex than OLSR in them.

Another note - I have been testing Squid from CVS, which has IPv6<->IPv4
support. So far it seems to be working well in my limited testing: I am
connected to it via ::1 and have been surfing through it for a couple of
days now.

I also noted, to my surprise, that the "wpad" standard
(http://en.wikipedia.org/wiki/Web_Proxy_Autodiscovery_Protocol) "just
works" on the browsers I tried, although coming up with a robust
proxy.pac file and a robust, secure DNS architecture is beyond the scope
of this document. :)

Elsewhere, current radvd 1.1 will emit RDNSS records, but my current
kernels don't have a way of getting them.... I think that is fixed in
2.6.24 and later.

And dibbler-server (DHCPv6) works with dibbler-client, but after about a
day it goes to 100% CPU.

I would like to evolve dnsmasq closer to the ideal combination of dhcp
and dns, I think.
> PI+BGP is a good tool for most multihoming purposes, but not for all
> of them.
>
Reading... testing... reading... playing... reading... bbl
> Bernhard
