(Loose) uRPF vs. non-announced IXP space

Bernhard Schmidt berni at birkenwald.de
Wed Feb 8 12:36:30 CET 2012


Hi,

this problem stems from a local configuration, but I think it is
generic enough to warrant a discussion. From my POV, none of the parties
mentioned here is to blame.

We have received a trouble ticket from a person currently residing in
the KDDI network in Japan not being able to access services in our
network. Upon further investigation it turned out that pMTU discovery
was broken towards the user.

bschmidt at ping:~$ tracepath6 240f:13:6141:1::1
 1?: [LOCALHOST]                        0.011ms pmtu 1500
 1:  vl-60.csr1-2wr.lrz-muenchen.de                        0.799ms
 1:  vl-60.csr1-2wr.lrz-muenchen.de                        0.976ms
 2:  vl-3066.csr1-2wr.lrz.de                               0.804ms
 3:  xr-gar1-pc110-108.x-win.dfn.de                        1.177ms
 4:  zr-fra1-te0-6-0-7.x-win.dfn.de                       10.109ms
 5:  20gigabitethernet4-3.core1.fra1.he.net               13.161ms
 6:  10gigabitethernet5-3.core1.lon1.he.net               27.212ms
 7:  10gigabitethernet7-4.core1.nyc4.he.net               95.039ms
 8:  10gigabitethernet1-2.core1.nyc1.he.net               94.787ms
 9:  no reply
10:  no reply
11:  no reply
12:  no reply

bschmidt at ping:~$ traceroute6 -q1 240f:13:6141:1::1 1480
[...]
 8  10gigabitethernet1-2.core1.nyc1.he.net (2001:470:0:37::2)  102.575 ms
 9  *
10  2001:268:fb80:1::1 (2001:268:fb80:1::1)  267.924 ms
11  2001:268:fb02:6::1 (2001:268:fb02:6::1)  268.199 ms

bschmidt at ping:~$ traceroute6 -q1 240f:13:6141:1::1 1481
[...]
 8  10gigabitethernet1-2.core1.nyc1.he.net (2001:470:0:37::2)  95.860 ms
 9  *
10  *
11  *
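
The two traces above bracket the working size by hand: probes of 1480
bytes get through, probes of 1481 bytes do not. A minimal sketch of
automating that search by bisection follows. The probe function is a
stand-in (an assumption for illustration), not a real network call; in
practice it would wrap whatever sends a packet of the given size and
reports whether an answer came back.

```python
from typing import Callable


def largest_working_size(probe: Callable[[int], bool],
                         lo: int = 1280, hi: int = 1500) -> int:
    """Return the largest size in [lo, hi] for which probe(size) succeeds.

    Assumes probe(lo) succeeds and that every size above the limit fails.
    """
    if not probe(lo):
        raise ValueError("even the smallest probe size fails")
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            lo = mid              # mid works, the limit is at least mid
        else:
            hi = mid - 1          # mid fails, the limit is below mid
    return lo


def fake_probe(size: int) -> bool:
    """Hypothetical stand-in for a path that silently drops anything above 1480."""
    return size <= 1480


print(largest_working_size(fake_probe))  # 1480
```

The lower bound of 1280 is the IPv6 minimum link MTU, so a path that
fails even there has a problem other than pMTU discovery.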

Hop 9 is, according to the HE.net looking glass, the KDDI router at
NYIIX (2001:504:1::a500:2516:1). I initially suspected a misconfigured
tunnel and contacted KDDI, but it soon turned out that the router was
actually answering and sending ICMPv6 too-big messages when tracing from
other networks.

 7:  10gigabitethernet1-2.core1.nyc1.he.net              109.748ms
 8:  2001:504:1::a500:2516:1                             288.354ms asymm 14
 9:  2001:504:1::a500:2516:1                             287.596ms pmtu 1480
 9:  2001:268:fb80:1::1                                  287.681ms asymm 13


Long story short, in the end it turned out that my upstream (DFN) has
deployed loose (!) uRPF on transit interfaces. From discussions on IRC I
gather this is a pretty common configuration, for example for
BGP-injected blackholing of sources on Cisco.

The NYIIX peering LAN prefix is not announced in the DFZ, which is also a
pretty common configuration and explicitly allowed (or even recommended)
by RFC 5963.

---
   IPv6 prefixes for IXP LANs are typically publicly well known and
   taken from dedicated IPv6 blocks for IXP assignments reserved for
   this purpose by the different RIRs.  These blocks are usually only
   meant for addressing the exchange fabric, and may be filtered out by
   DFZ (Default Free Zone) operators.  When considering the routing of
   the IXP LANs two options are identified:

   o  IXPs may decide that LANs should not to be globally routed in
      order to limit the possible origins of a Denial-of-Service (DoS)
      attack to its participants' AS (Autonomous System) boundaries.  In
      this configuration, participants may route these prefixes inside
      their networks (e.g., using BGP no-export communities or routing
      the IXP LANs within the participants' IGP) to perform fault
      management.  Using this configuration, the monitoring of the IXP
      LANs from outside of its participants' AS boundaries is not
      possible.
---

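Both common configurations together lead to pMTU blackholing whenever
the MTU is reduced at the hop behind the peering LAN (i.e. with a tunnel
starting on the peering router): the ICMPv6 too-big message is sourced
from the router's address on the IXP LAN, there is no route for that
address in the DFZ, and loose uRPF therefore drops it. Granted, such
setups should become less common in the future, but they are still in
the wild.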
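
To make the interaction concrete, here is a minimal sketch of the loose
uRPF decision: a packet is accepted if any route covers its source
address, regardless of the interface it arrived on. The routing table
below is a tiny illustrative sample (an assumption, not a real DFZ dump)
with no default route, and the function name is made up for this example.

```python
import ipaddress


def loose_urpf_accepts(src: str, fib) -> bool:
    """Loose uRPF: accept if *any* route in the table covers the source."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in fib)


# Illustrative sample of a default-free table; the NYIIX peering LAN
# (2001:504:1::/64) is deliberately absent because it is not announced.
fib = [ipaddress.ip_network(p) for p in ("2001:470::/32", "2001:268::/32")]

for src in ("2001:470:0:37::2",          # HE.net core router
            "2001:504:1::a500:2516:1",   # KDDI router address on the NYIIX LAN
            "2001:268:fb80:1::1"):       # KDDI router behind the peering LAN
    verdict = "accept" if loose_urpf_accepts(src, fib) else "drop"
    print(src, verdict)
```

With this table only the too-big message sourced from the NYIIX address
is dropped, which matches the traces: hop 9 stays silent while hops 10
and 11 answer.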

So what to do in this case? The platform in use does not support
ACL-based exceptions for uRPF (IOS-XR, I guess there are a few more).
Getting every IXP worldwide to announce their prefix is also a daunting
task.

For fun, I've tested all IXP peering interfaces HE.net has in PeeringDB
at the moment (by hand, so some mistakes are possible), and found that
out of 50 IXPs only 8 have their prefix visible from my POV.

IXP             AS6939 address                  In DFZ
======================================================
AMS-IX          2001:7f8:1::a500:6939:1         yes
B-CIX           2001:7F8:19:1::1b1b:1           NO
BigApe          2001:458:26:2::500              yes
Any2 LA         2001:504:13::1a                 NO
Any2 SV         2001:504:13:3::21               NO
DE-CIX          2001:7f8::1b1b:0:1              yes
ECIX Berlin     2001:7f8:8:5:0:1b1b:0:1         NO
ECIX Düssel     2001:7f8:8::1b1b:0:1            NO
ECIX Hamburg    2001:7f8:8:10:0:1b1b:0:1        NO
Equinix ASH     2001:504:0:2::6939:1            NO
Equinix CHI     2001:504:0:4::6939:1            NO
Equinix DAL     2001:504:0:5::6939:1            NO
Equinix HK      2001:de8:7::6939:1              NO
Equinix LA      2001:504:0:3::6939:1            NO
Equinix NY      2001:504:f::39                  NO
Equinix Newark  2001:504:0:6::6939:1            NO
Equinix PA      2001:504:d::10                  NO
Equinix Paris   2001:7f8:43::6939:1             NO
Equinix SJ      2001:504:0:1::6939:1            NO
Equinix Tokyo   2001:de8:4::6939:1              NO
Equinix Sing    2001:de8:5::6939:1              NO
Equinix Zurich  2001:7f8:c:8235:194:42:48:80    NO
France-IX       2001:7f8:54::10                 NO
HKIX            2001:7fa:0:1::ca28:a19e         NO
JPIX            2001:de8:8::6939:1              NO
JPNAP           2001:7fa:7:1::6939:1            NO
KCIX            2001:504:1b:1::5                NO
KleyrEX         2001:7f8:33::A100:6939:1        NO
LAIIX           2001:504:a::a500:6939:1         NO
LINX            2001:7f8:4:0::1b1b:1            yes
LoNAP           2001:7f8:17::1b1b:1             yes
MICE            2607:fe10:ffff::52              yes
NASA-AIX        2001:478:6663:100::44           NO
NetNod          2001:7f8:d:fc::187              NO
NIX.CZ          2001:7f8:14::6e:1               NO
NL-IX           2001:7f8:13::a500:6939:1        NO
NOTA            2001:478:124::176               NO
NWAX            2001:0478:0195::42              NO
NYIIX           2001:504:1::a500:6939:1         NO
PaNAP           2001:860:0:6::6939:1            yes
PLIX            2001:7f8:42::a500:6939:1        NO
RMIX Denver     2605:6c00:303:303::69           NO
SIX             2001:504:16::1b1b               NO
SOLIX           2001:7f8:21:10::101             NO
STHIX           2001:7f8:3e::a500:0:6939:1      NO
SwissIX         2001:7f8:24::AA                 NO
Telx Atlanta    2001:478:132::75                NO
Telx NY         2001:504:17:115::17             NO
Telx Phoenix    2001:478:186::20                NO
TorIX           2001:0504:001A::34:112          yes

For the record, since our platform does not support loose uRPF and we
want to pass those unreachables no matter what, we use the following
static ACLs on upstream links:

ipv6 access-list ACL-UPSTREAM6-in
 deny ipv6 <ownprefix> any
 permit ipv6 2000::/3 any
 permit ipv6 FE80::/64 any
 permit icmp any any unreachable
 deny ipv6 any any

But that does not offer the same featureset as loose uRPF of course.
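
To illustrate the difference, here is a rough model of how the ACL above
is evaluated (first matching entry wins). It is a sketch for reasoning
only, not a router implementation: 2001:db8::/32 stands in for
<ownprefix> as a hypothetical value, and the ICMP entry is reduced to a
single boolean flag.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                        # "permit" or "deny"
    src: Optional[str]                 # source prefix, None means "any"
    icmp_unreachable_only: bool = False


OWN_PREFIX = "2001:db8::/32"           # hypothetical stand-in for <ownprefix>

ACL = [
    Rule("deny", OWN_PREFIX),
    Rule("permit", "2000::/3"),
    Rule("permit", "fe80::/64"),
    Rule("permit", None, icmp_unreachable_only=True),
    Rule("deny", None),
]


def evaluate(src: str, is_icmp_unreachable: bool = False) -> str:
    """Return the action of the first ACL entry matching the packet."""
    addr = ipaddress.ip_address(src)
    for rule in ACL:
        if rule.icmp_unreachable_only and not is_icmp_unreachable:
            continue                   # entry only applies to unreachables
        if rule.src is None or addr in ipaddress.ip_network(rule.src):
            return rule.action
    return "deny"                      # implicit deny at the end


print(evaluate("2001:504:1::a500:2516:1"))  # IXP LAN source: permit
print(evaluate("2001:db8::1"))              # spoofed own prefix: deny
print(evaluate("fc00::1"))                  # outside 2000::/3: deny
print(evaluate("fc00::1", True))            # unreachable from anywhere: permit
```

Any source inside 2000::/3 passes, whether or not a route for it exists,
so the ACL lets the IXP LAN addresses through but cannot follow the
routing table (or BGP-injected blackhole routes) the way loose uRPF does.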

Best Regards,
Bernhard

