Linux IPv6 routing strange behaviour
Jeroen Massar
jeroen at massar.ch
Thu Aug 15 14:32:20 CEST 2013
On 2013-08-15 13:26, Phil Mayers wrote:
> On 15/08/13 12:14, Pim van Pelt wrote:
>> Just to add a datapoint to Max' last remark: at sixxs we moved away from
>> kernel based routing by implementing ipv6 routing in userspace (taking
>> tap input and raw socket output) largely because of neighbor cache
>
> Interesting. Was this custom/proprietary software or is it available
> somewhere?
To add to Pim's comments:
It is quite specific to the problems that SixXS PoPs have:
 - a large number of tunnels and routes;
 - these tunnels are dynamic, so their endpoints change all the time.
Neither the Linux kernel nor, likely, any other kernel is designed for what the SixXS PoPs do, and likely none ever will be. We saw _static_ routing entries being randomly 'forgotten', tunnel interfaces going missing, and other weird effects, all without any errors or warnings whatsoever; what really happened is thus a mystery.
On top of those issues, the routing logic and the caching/neighbor lookups did not help at all either. Note that, from our testing, the same goes for FreeBSD/NetBSD/OpenBSD/OSX (yes, we checked whether OSX was smarter about it; it is not ;)
From our testing, performance characteristics are mostly the same when running sixxsd on the above platforms: it fills about 10G of tunneled traffic on a virtual interface on an i7 at 3.4 GHz. (Simulated traffic, but as everything is a static non-locking lookup that should be quite okay ;) If we ever hit the limits of that setup, we can always think about adding some threads to use the other CPUs (which is why I don't mention quad-core above)...
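For illustration only, and explicitly not sixxsd's code: a minimal Python sketch of what a "static non-locking lookup" could look like. It assumes (hypothetically) that each tunnel's /48 is carved out of one known /40, so the table slot can be computed straight from the address bits and no tree walk or lock is needed. The names BASE, table, slot_for, set_endpoint and lookup, and the endpoint address, are made up for this example.

```python
import ipaddress

# Hypothetical layout: one /40 carved into 256 /48 subnets, one per slot.
BASE = ipaddress.IPv6Network("2001:6f8:900::/40")
SLOT_BITS = 48 - BASE.prefixlen      # 8 bits, so 256 slots
SHIFT = 128 - 48                     # number of bits below the /48 boundary

# Preallocated, fixed-size table; a slot stays None until it is configured.
table = [None] * (1 << SLOT_BITS)

def slot_for(addr):
    """Return the table index for addr, or None if addr is outside BASE."""
    ip = ipaddress.IPv6Address(addr)
    if ip not in BASE:
        return None
    # Keep only the bits between the /40 and the /48 boundary.
    return (int(ip) >> SHIFT) & ((1 << SLOT_BITS) - 1)

def set_endpoint(prefix, endpoint):
    """Store the (assumed) tunnel endpoint for a /48 inside BASE."""
    net = ipaddress.IPv6Network(prefix)
    table[slot_for(net.network_address)] = endpoint

def lookup(addr):
    """Constant-time lookup: compute the slot, read the table."""
    idx = slot_for(addr)
    return None if idx is None else table[idx]

# Example: endpoint address taken from the documentation range.
set_endpoint("2001:6f8:942::/48", "192.0.2.1")
```

With such a layout a changing endpoint is a single slot update, and every packet lookup costs the same regardless of how many tunnels are configured.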
Since deploying it we have also not had any issues with the PoPs themselves anymore, except for hardware outages or routing issues elsewhere on the network. (code can't solve those... yet ;)
sixxsd is available for use solely by SixXS PoPs, but as said, it solves a very specific problem that one is unlikely to have outside that scope. Thus it likely won't solve any problem you are having: as always, actually defining the problem one has might lead to a solution.
Some more details are available here:
http://www.sixxs.net/faq/sixxs/?faq=sixxsd
As a bonus, this is what the routing table of deham01 looks like:
8<--------------
root@deham01:~# ip -6 ro show
2001:6f8:862:1::/64 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit 4294967295
2001:6f8:900:ffff::1 dev sixxs metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
2001:6f8:900::/48 via 2001:6f8:900:ffff::1 dev sixxs metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
2001:6f8:900::/40 via 2001:6f8:900:ffff::1 dev sixxs metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
2001:6f8:1000::/40 via 2001:6f8:900:ffff::1 dev sixxs metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
2001:6f8:1100::/40 via 2001:6f8:900:ffff::1 dev sixxs metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
2001:6f8:1200::/40 via 2001:6f8:900:ffff::1 dev sixxs metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
2001:6f8:1300::/40 via 2001:6f8:900:ffff::1 dev sixxs metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
fe80::/64 dev eth0 proto kernel metric 256 mtu 1500 advmss 1440 hoplimit 4294967295
default via fe80::5:73ff:fea0:1 dev eth0 metric 1024 mtu 1500 advmss 1440 hoplimit 4294967295
------------------>8
Yes, that is 5 /40s worth of address space, and everything is piped into the sixxs interface to a single neighbor that lives on the tapped interface. We thus do still hit the Linux routing logic a bit, but as the table is small and there is a single neighbor, nothing much dynamic happens there. "ip -6 monitor route" is thus nice and silent.
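To make the effect of that small table concrete, here is a rough Python sketch that does a longest-prefix match over the routes shown above. It only models prefix selection (the kernel also considers metrics and more), and the names ROUTES, TABLE and longest_prefix_match are made up for this illustration; the prefixes, next hops and devices are copied from the output above.

```python
import ipaddress

SIXXS_GW = "2001:6f8:900:ffff::1"

# (prefix, next hop or None, device), copied from the "ip -6 ro show" output.
ROUTES = [
    ("2001:6f8:862:1::/64", None, "eth0"),
    ("2001:6f8:900:ffff::1/128", None, "sixxs"),
    ("2001:6f8:900::/48", SIXXS_GW, "sixxs"),
    ("2001:6f8:900::/40", SIXXS_GW, "sixxs"),
    ("2001:6f8:1000::/40", SIXXS_GW, "sixxs"),
    ("2001:6f8:1100::/40", SIXXS_GW, "sixxs"),
    ("2001:6f8:1200::/40", SIXXS_GW, "sixxs"),
    ("2001:6f8:1300::/40", SIXXS_GW, "sixxs"),
    ("fe80::/64", None, "eth0"),
    ("::/0", "fe80::5:73ff:fea0:1", "eth0"),   # "default" in ip's output
]

TABLE = [(ipaddress.IPv6Network(p), via, dev) for p, via, dev in ROUTES]

def longest_prefix_match(addr):
    """Return the (network, via, dev) entry with the longest matching prefix."""
    ip = ipaddress.IPv6Address(addr)
    candidates = [route for route in TABLE if ip in route[0]]
    # ::/0 matches everything, so candidates is never empty.
    return max(candidates, key=lambda route: route[0].prefixlen)

# Any address inside one of the /40s resolves to the single sixxs neighbor.
net, via, dev = longest_prefix_match("2001:6f8:1234::1")
print(net, via, dev)
```

Whatever tunnel an address belongs to, the kernel only ever sees the one next hop on the sixxs interface; the per-tunnel decision is made in userspace.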
Greets,
Jeroen