(Loose) uRPF vs. non-announced IXP space
Bernhard Schmidt
berni at birkenwald.de
Wed Feb 8 12:36:30 CET 2012
Hi,
this problem stems from a local configuration, but I think it is
generic enough to warrant a discussion. None of the parties mentioned
here are to blame from my POV.
We have received a trouble ticket from a user currently on the KDDI
network in Japan who was unable to access services in our network. Upon
further investigation it turned out that pMTU discovery was broken
towards the user.
bschmidt at ping:~$ tracepath6 240f:13:6141:1::1
1?: [LOCALHOST] 0.011ms pmtu 1500
1: vl-60.csr1-2wr.lrz-muenchen.de 0.799ms
1: vl-60.csr1-2wr.lrz-muenchen.de 0.976ms
2: vl-3066.csr1-2wr.lrz.de 0.804ms
3: xr-gar1-pc110-108.x-win.dfn.de 1.177ms
4: zr-fra1-te0-6-0-7.x-win.dfn.de 10.109ms
5: 20gigabitethernet4-3.core1.fra1.he.net 13.161ms
6: 10gigabitethernet5-3.core1.lon1.he.net 27.212ms
7: 10gigabitethernet7-4.core1.nyc4.he.net 95.039ms
8: 10gigabitethernet1-2.core1.nyc1.he.net 94.787ms
9: no reply
10: no reply
11: no reply
12: no reply
bschmidt at ping:~$ traceroute6 -q1 240f:13:6141:1::1 1480
[...]
8 10gigabitethernet1-2.core1.nyc1.he.net (2001:470:0:37::2) 102.575 ms
9 *
10 2001:268:fb80:1::1 (2001:268:fb80:1::1) 267.924 ms
11 2001:268:fb02:6::1 (2001:268:fb02:6::1) 268.199 ms
bschmidt at ping:~$ traceroute6 -q1 240f:13:6141:1::1 1481
[...]
8 10gigabitethernet1-2.core1.nyc1.he.net (2001:470:0:37::2) 95.860 ms
9 *
10 *
11 *
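For illustration only: the two traceroute6 runs above bracket the usable
packet size between 1480 bytes (reaches hop 10 and 11) and 1481 bytes
(lost after hop 8). A minimal Python sketch of automating that search by
bisection follows. `simulated_probe` is a hypothetical stand-in for a real
probe, and the 1280..1500 bounds are assumptions taken from the IPv6
minimum MTU and the local pmtu reported by tracepath6.

```python
def find_path_mtu(probe, low=1280, high=1500):
    """Return the largest size in [low, high] for which probe(size) is True.

    Assumes probe(low) succeeds and that success is monotonic in size.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid       # mid got through, the answer is mid or larger
        else:
            high = mid - 1  # mid was lost, the answer is smaller
    return low


def simulated_probe(size):
    """Hypothetical probe: packets up to 1480 bytes reach the target."""
    return size <= 1480


print(find_path_mtu(simulated_probe))
```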
Hop 9 is, according to the HE.net looking glass, the KDDI router at
NYIIX (2001:504:1::a500:2516:1). I initially suspected a misconfigured
tunnel and contacted KDDI, but it soon turned out that the router was
actually answering and sending ICMPv6 too-big messages when tracing from
other networks.
7: 10gigabitethernet1-2.core1.nyc1.he.net 109.748ms
8: 2001:504:1::a500:2516:1 288.354ms asymm 14
9: 2001:504:1::a500:2516:1 287.596ms pmtu 1480
9: 2001:268:fb80:1::1 287.681ms asymm 13
Long story short, in the end it turned out that my upstream (DFN) has
deployed loose (!) uRPF on transit interfaces. From discussions on IRC I
gather this is a pretty common configuration, for example for
BGP-injected blackholing of sources on Cisco.
The NYIIX peering LAN is not announced in the DFZ, which is also a
pretty common configuration and explicitly allowed (or even recommended)
by RFC 5963.
---
IPv6 prefixes for IXP LANs are typically publicly well known and
taken from dedicated IPv6 blocks for IXP assignments reserved for
this purpose by the different RIRs. These blocks are usually only
meant for addressing the exchange fabric, and may be filtered out by
DFZ (Default Free Zone) operators. When considering the routing of
the IXP LANs two options are identified:
o IXPs may decide that LANs should not to be globally routed in
order to limit the possible origins of a Denial-of-Service (DoS)
attack to its participants' AS (Autonomous System) boundaries. In
this configuration, participants may route these prefixes inside
their networks (e.g., using BGP no-export communities or routing
the IXP LANs within the participants' IGP) to perform fault
management. Using this configuration, the monitoring of the IXP
LANs from outside of its participants' AS boundaries is not
possible.
---
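Both common configurations together lead to pMTU blackholing whenever the
MTU is reduced at the hop behind the peering LAN (e.g. by a tunnel
starting on the peering router): the ICMPv6 too-big message is sourced
from the router's address on the non-announced peering LAN, so it fails
the loose uRPF check and is dropped. Granted, such MTU reductions should
become less common in the future, but they are still in the wild.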
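The interaction can be sketched in a few lines of Python. This is only a
simplified model of a loose-mode uRPF check, not any vendor's
implementation. The two table entries are illustrative assumptions
standing in for routes that cover the HE.net and KDDI addresses seen in
the traces, and no default route is assumed.

```python
import ipaddress


def loose_urpf_accepts(source, fib):
    """Loose-mode check: accept if any route in the table covers the source."""
    addr = ipaddress.ip_address(source)
    return any(addr in ipaddress.ip_network(prefix) for prefix in fib)


# Illustrative DFZ-style table (assumed prefixes, no default route).
# The NYIIX peering LAN is deliberately absent, as it is not announced.
FIB = ["2001:470::/32", "2001:268::/32"]

# A reply sourced from an announced KDDI address passes the check.
print(loose_urpf_accepts("2001:268:fb80:1::1", FIB))       # True
# The too-big message sourced from the NYIIX peering LAN does not.
print(loose_urpf_accepts("2001:504:1::a500:2516:1", FIB))  # False
```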
So what to do in this case? The platform in use (IOS-XR, and I guess a
few more) does not support ACL-based exceptions for uRPF. Getting every
IXP worldwide to announce its prefix is also a daunting task.
For fun, I've tested all IXP peering interfaces HE.net has in PeeringDB
at the moment (by hand, so some mistakes are possible), and found that
out of 50 IXPs only 8 have their prefix visible from my POV.
IXP             AS6939 address                DFZ
=================================================
AMS-IX          2001:7f8:1::a500:6939:1       yes
B-CIX           2001:7F8:19:1::1b1b:1         NO
BigApe          2001:458:26:2::500            yes
Any2 LA         2001:504:13::1a               NO
Any2 SV         2001:504:13:3::21             NO
DE-CIX          2001:7f8::1b1b:0:1            yes
ECIX Berlin     2001:7f8:8:5:0:1b1b:0:1       NO
ECIX Düssel     2001:7f8:8::1b1b:0:1          NO
ECIX Hamburg    2001:7f8:8:10:0:1b1b:0:1      NO
Equinix ASH     2001:504:0:2::6939:1          NO
Equinix CHI     2001:504:0:4::6939:1          NO
Equinix DAL     2001:504:0:5::6939:1          NO
Equinix HK      2001:de8:7::6939:1            NO
Equinix LA      2001:504:0:3::6939:1          NO
Equinix NY      2001:504:f::39                NO
Equinix Newark  2001:504:0:6::6939:1          NO
Equinix PA      2001:504:d::10                NO
Equinix Paris   2001:7f8:43::6939:1           NO
Equinix SJ      2001:504:0:1::6939:1          NO
Equinix Tokyo   2001:de8:4::6939:1            NO
Equinix Sing    2001:de8:5::6939:1            NO
Equinix Zurich  2001:7f8:c:8235:194:42:48:80  NO
France-IX       2001:7f8:54::10               NO
HKIX            2001:7fa:0:1::ca28:a19e       NO
JPIX            2001:de8:8::6939:1            NO
JPNAP           2001:7fa:7:1::6939:1          NO
KCIX            2001:504:1b:1::5              NO
KleyrEX         2001:7f8:33::A100:6939:1      NO
LAIIX           2001:504:a::a500:6939:1       NO
LINX            2001:7f8:4:0::1b1b:1          yes
LoNAP           2001:7f8:17::1b1b:1           yes
MICE            2607:fe10:ffff::52            yes
NASA-AIX        2001:478:6663:100::44         NO
NetNod          2001:7f8:d:fc::187            NO
NIX.CZ          2001:7f8:14::6e:1             NO
NL-IX           2001:7f8:13::a500:6939:1      NO
NOTA            2001:478:124::176             NO
NWAX            2001:0478:0195::42            NO
NYIIX           2001:504:1::a500:6939:1       NO
PaNAP           2001:860:0:6::6939:1          yes
PLIX            2001:7f8:42::a500:6939:1      NO
RMIX Denver     2605:6c00:303:303::69         NO
SIX             2001:504:16::1b1b             NO
SOLIX           2001:7f8:21:10::101           NO
STHIX           2001:7f8:3e::a500:0:6939:1    NO
SwissIX         2001:7f8:24::AA               NO
Telx Atlanta    2001:478:132::75              NO
Telx NY         2001:504:17:115::17           NO
Telx Phoenix    2001:478:186::20              NO
TorIX           2001:0504:001A::34:112        yes
For the record, since our platform does not support loose uRPF and we
want to pass those unreachables no matter what, we use the following
static ACLs on upstream links:
ipv6 access-list ACL-UPSTREAM6-in
 deny ipv6 <ownprefix> any
 permit ipv6 2000::/3 any
 permit ipv6 FE80::/64 any
 permit icmp any any unreachable
 deny ipv6 any any
But that does not offer the same feature set as loose uRPF, of course.
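To see why, here is a rough Python model of the first-match evaluation of
the ACL above. It is a sketch under assumptions: `2001:db8::/32` is a
hypothetical placeholder for <ownprefix>, and the `unreachable` keyword is
modelled as ICMPv6 type 1 (destination unreachable).

```python
import ipaddress

OWN_PREFIX = "2001:db8::/32"  # hypothetical placeholder for <ownprefix>

# (action, protocol, source network, ICMPv6 type or None).
# Protocol "ipv6" matches any packet; rules are checked top to bottom.
ACL = [
    ("deny", "ipv6", OWN_PREFIX, None),
    ("permit", "ipv6", "2000::/3", None),
    ("permit", "ipv6", "fe80::/64", None),
    ("permit", "icmp", "::/0", 1),  # "unreachable", assumed to mean type 1
    ("deny", "ipv6", "::/0", None),
]


def evaluate(acl, src, proto="tcp", icmp_type=None):
    """Return the action of the first rule matching the packet."""
    addr = ipaddress.ip_address(src)
    for action, rule_proto, net, rule_type in acl:
        if addr not in ipaddress.ip_network(net):
            continue
        if rule_proto != "ipv6" and rule_proto != proto:
            continue
        if rule_type is not None and rule_type != icmp_type:
            continue
        return action
    return "deny"  # implicit deny if nothing matched


# Too-big message (ICMPv6 type 2) from the non-announced NYIIX peering LAN.
print(evaluate(ACL, "2001:504:1::a500:2516:1", "icmp", 2))
# Spoofed packet claiming to come from our own prefix.
print(evaluate(ACL, "2001:db8::1"))
```

The ACL accepts any source inside 2000::/3 whether or not a route for it
exists. That lets the too-big message from the peering LAN through, but it
also admits sources that a routing-table-based check (for example
BGP-injected blackholing) would have dropped.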
Best Regards,
Bernhard