Cisco Routing & Switching

High Availability using IP SLAs

Nowadays, with converged networks carrying data, voice and video over IP, backup internet connections are essential for business continuity. There are several options, depending on where we want to deploy this high availability. A large company needs high-bandwidth, low-latency links, and the ideal solution is two BGP peerings with two different ISPs.

In contrast, if we want high availability in a small company or branch office, two ADSL, cable modem or even cellular connections are enough. This post shows how to set up backup internet access in the latter case.

Based on our topology:

R1 and R4 will act as R5’s gateways to different ISPs. R4 represents an ADSL connection and R1 a cellular/3G connection. We’ll prefer R4 as the default exit point and use R1 only after an R4 failure. R7 represents a remote server on the Internet, and R9 a well-known device in R4’s ISP network, for example a DNS server.

IP SLA Configuration:

The first step when configuring an IP SLA is to select a remote device as the target for the probes. For this purpose we’ll use the DNS server in R4’s ISP network. Before configuring the SLA, there is a key point to take into account: we need a static route to the DNS server through R4. This is essential for the SLA to work correctly. I will explain why later.

R5(config)#ip route
R5(config)#do sh run | sec sla
ip sla 1
 timeout 1000
 frequency 2
ip sla schedule 1 life forever start-time now

With this configuration we instruct the router to check IP reachability every 2 seconds with a timeout of 1 second (1000 ms). If an ICMP echo reply doesn’t arrive within 1 second, the test fails. The schedule starts the test immediately and lets it run forever.

Track object configuration:

The next step is the track object, which monitors the SLA’s state:

R5(config)#track 1 rtr 1 reachability

And finally the default static routes:

R5(config)#ip route track 1
R5(config)#ip route 2

We first configured a static route with the default administrative distance (AD) of 1, tied to the track object. As long as the SLA tests succeed, the track object returns true and the static route is installed in the routing table. The second route has a higher, explicitly configured AD of 2, so it stays hidden until the first one is withdrawn. Such a backup route is known as a “floating static route”.
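The selection logic can be sketched in a few lines of Python. This is only a toy model of what the router does, not IOS code: the `StaticRoute` class, the `best_default` function and the "R4"/"R1" labels are names I made up for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StaticRoute:
    next_hop: str                 # hypothetical label, not a real address
    distance: int                 # administrative distance
    track: Optional[int] = None   # track object the route depends on, if any


def best_default(routes: List[StaticRoute],
                 track_state: Dict[int, bool]) -> Optional[StaticRoute]:
    """Return the route that would be installed: lowest AD among eligible ones."""
    # A tracked route is eligible only while its track object is up.
    eligible = [r for r in routes
                if r.track is None or track_state.get(r.track, False)]
    return min(eligible, key=lambda r: r.distance) if eligible else None


routes = [
    StaticRoute("R4", distance=1, track=1),  # primary, tied to track 1
    StaticRoute("R1", distance=2),           # floating backup, no tracking
]

print(best_default(routes, {1: True}).next_hop)   # R4
print(best_default(routes, {1: False}).next_hop)  # R1
```

While track 1 is up, both routes are candidates and the lower AD wins. When the track goes down the primary is no longer eligible, so the backup with AD 2 takes over.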

Returning to the explicit route to the DNS server, now we can understand its importance. Suppose we don’t configure that static route. At the beginning everything works fine and the SLA tests succeed. But what happens after an R4 failure? The track reports a failure, the new gateway becomes R1 and… oops! We still have reachability to the DNS server through R1’s network, so the SLA tests succeed again and the track returns to true, re-installing the static route via the failed R4!
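To make the flapping visible, here is a deliberately simplified simulation in Python. It assumes one probe per cycle, that the probe follows the host route when it exists and the current default route otherwise, and that the track object mirrors the last probe result. The `simulate` function and its parameters are hypothetical names, not anything from IOS.

```python
from typing import List


def simulate(pinned_via_r4: bool, cycles: int = 6) -> List[str]:
    """Return the default gateway chosen after each probe cycle while R4 is down."""
    link_up = {"R4": False, "R1": True}   # R4 has failed, R1 is healthy
    gateway = "R4"                        # state just before the first failed probe
    history = []
    for _ in range(cycles):
        # The probe uses the host route through R4 if configured,
        # otherwise it simply follows the current default route.
        probe_path = "R4" if pinned_via_r4 else gateway
        track_up = link_up[probe_path]
        # Tracked primary route is installed only while the track is up.
        gateway = "R4" if track_up else "R1"
        history.append(gateway)
    return history


print(simulate(pinned_via_r4=True))   # ['R1', 'R1', 'R1', 'R1', 'R1', 'R1']
print(simulate(pinned_via_r4=False))  # ['R1', 'R4', 'R1', 'R4', 'R1', 'R4']
```

With the host route in place the probe keeps failing for as long as R4 is down, so the backup stays selected. Without it, the default route flaps between R1 and the dead R4 on every cycle.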

Testing the configuration:

The last step is to check that this configuration works correctly. Launch a repeated ping towards R7 (an independent server on the Internet) and see how many packets are lost during the transition between ISPs. I will shut down R4’s Serial0/0 interface to simulate a failure and, once the network has converged, re-enable the interface.

R5#sh ip route
Gateway of last resort is to network is variably subnetted, 3 subnets, 2 masks
C is directly connected, Loopback0
C is directly connected, FastEthernet0/0
S [1/0] via
S* [1/0] via

After shutting down Serial0/0 on R4:

R5#sh ip route
Gateway of last resort is to network is variably subnetted, 3 subnets, 2 masks
C is directly connected, Loopback0
C is directly connected, FastEthernet0/0
S [1/0] via
S* [2/0] via

The ping while the failover happens:

R5#ping repeat 1000
Type escape sequence to abort.
Sending 1000, 100-byte ICMP Echos to, timeout is 2 seconds:
*Mar  1 02:20:58.947: %TRACKING-5-STATE: 1 rtr 1 reachability Up->Down.!!!!!!!!!!!!!!!!!!!
*Mar  1 02:21:13.967: %TRACKING-5-STATE: 1 rtr 1 reachability Down->Up...

Additional information:

Independently of the SLA configuration, I had to tune the EIGRP timers to achieve fast convergence after neighbor failures, and to add some route filtering between R1 and R4.

In this example I run EIGRP between R1, R4, R9 and R7. If we leave the default timers of 60 seconds for hello and 180 seconds for hold time on low-bandwidth links, the network takes too long to converge, because we have to wait up to 180 seconds to declare neighbor R4 down and update the corresponding routing tables. Hence I changed these values to 1 and 2 seconds respectively.

R4(config-if)#ip hello-interval eigrp 2 1
R4(config-if)#ip hold-time eigrp 2 2
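As a quick sanity check of the timer arithmetic, here is a small Python sketch comparing the default 60/180 timers with the 1 and 2 second values mentioned in the text. The `eigrp_detection` helper is a made-up name, and the model only assumes that a neighbor is declared down when the hold timer expires without a hello being received.

```python
def eigrp_detection(hello_s: int, hold_s: int) -> dict:
    """Summarize how long a silent neighbor survives with the given timers."""
    if hold_s < hello_s:
        raise ValueError("hold time should not be shorter than the hello interval")
    return {
        # The neighbor is dropped when the hold timer expires.
        "worst_case_detection_s": hold_s,
        # How many hello intervals fit into one hold time.
        "hello_intervals_per_hold": hold_s / hello_s,
    }


defaults = eigrp_detection(60, 180)
tuned = eigrp_detection(1, 2)

print(defaults["worst_case_detection_s"], tuned["worst_case_detection_s"])    # 180 2
print(defaults["worst_case_detection_s"] // tuned["worst_case_detection_s"])  # 90
```

So the tuned timers cut the worst-case neighbor detection from 180 seconds to 2, at the cost of more hello traffic and less tolerance for lost hellos.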

As for R4 and R1, because EIGRP runs on their fa0/0 interfaces, R4 still learns the route through R1 after its Serial0/0 is disabled. We can avoid this with distribute-lists:

R4#sh ip prefix-list
ip prefix-list ANY_FROM_R1: 1 entries
   seq 5 deny le 32
R4#sh run | sec eigrp
router eigrp 2
 distribute-list prefix ANY_FROM_R1 gateway in FastEthernet0/0
 no auto-summary
