Tuesday, September 8, 2015

How to Make Cisco Prime Infrastructure Possibly Suck a Little Less

So, if you gather a bunch of people from the wireless community who use Cisco products in a room with Cisco people, you will inevitably hear horror stories about PI, with complaints ranging from "it doesn't work with my browser" to nightmare upgrade experiences, or even the occasional report that Prime has risen out of the Sea of Japan and is currently attacking Tokyo.  Cisco is taking care of the first one...hopefully...with Prime 3.0 being written in HTML5 instead of Flash.  That last one...well, sorry Japan, you're on your own there.  However, that middle one is still just awful, and if you have never seen PI eat itself to death during an upgrade you are a lucky individual.

I have personally talked to people that have lost everything during an in-line upgrade.  It can take weeks to recover if you have lots of maps.   I now avoid in-line upgrades which is your first pro-tip.  In-line upgrades are about as smart as buying an off the rack suit that's six sizes too small and expecting it not to tear itself apart as you wear it while you're doing jumping jacks.  Think that analogy is stupid?  Well, it's still ten times smarter than an in-line upgrade.

Anyway, at one of these meetings of the minds one of the attendees asked if it was possible to get the maps out of Prime, since those were the most difficult bit to replace.  Turns out there is a way.  I have been using it for some time to mitigate a worst-case scenario while attempting to migrate Prime.  This is what people commonly refer to as "upgrading Prime".

Annnnnd....without further ado...

1. Click on the site maps link in Prime. In classic theme you go to Monitor -> Site Maps.  In the devil's user interface or what Cisco refers to as "Converged Theme" you go to Maps -> Site Maps.  There at the top center of the screen is a drop down box.  Open it up and you will see...


Just select "Export Maps" and hit the Go button.

2. You will then be presented with a screen that looks like this...

Only without all the lines covering location names.

You can export any or all of your maps.  You can even export map info so that when you import the maps back into Prime all of your access points will be where they are supposed to be.  The one caveat is that you have to add your controllers back to Prime first.  The access points have to be in the Prime database in order for Prime to add them back to the maps.  Also, I have seen Prime 3.0.  The process is essentially the same.  The interface just looks a bit different.  It's all schmancy now.

There you go.  Pretty simple right?

Apple finally decides to play nice.

Well, it appears that Apple finally decided to stop ignoring Cisco's phone calls and play nice.  This is great news for those of us with users who insist on using Apple wireless clients because...pretty....BRAAAAAAAINS!


Tuesday, July 7, 2015

Permitted Data Rates When Using TKIP or WEP

Okay, I get it.  Most enterprise and SOHO environments are probably using WPA2-Personal or WPA2-Enterprise.  This goes for most home users as well.  Why then, would I bother writing a post about data rate limitations for WEP and TKIP, you ask?  Well, the answer is simple.  I work for a hospital, and in healthcare all the normal rules are thrown out the window.  Thanks to heavy restrictions on what can and cannot be used in healthcare, updates tend to come slowly if at all.  I have spoken to other wireless engineers and exchanged horror stories.  I wish I could say my experience is unique.  Unfortunately, it isn't.  So, this is for those of us that still have to use outdated protocols.

The 802.11n and later amendments do not permit the use of WEP or TKIP encryption with the High Throughput (HT) and Very High Throughput (VHT) data rates.  In fact, the Wi-Fi Alliance will only certify 802.11n radios that use CCMP encryption for the higher data rates.  However, newer radios should still support TKIP and WEP using the slower data rates defined for legacy 802.11a/b/g radios.

What does this mean?  Basically, if you have a nice shiny new 802.11ac deployment all pimped out with the latest technology and Joe User decides to connect to your network using WPA, he will only see a maximum data rate of 54 Mbps.

One of these days I will have all the pesky legacy devices off my network.  Of course, by then an entirely new crop of legacy devices will be plaguing me for different reasons.  It's what I call job security.

Wednesday, October 1, 2014

Load Balancing ISE Policy Services Nodes Behind a F5 Big-IP

Well, after having gone through all the trouble to create something that essentially didn't exist for the public, Cisco was nice enough to create something that was better...in PDF format.  Here is the link for their guide...

https://www.cisco.com/c/dam/en/us/td/docs/security/ise/how_to/HowTo-95-Cisco_and_F5_Deployment_Guide-ISE_Load_Balancing_Using_BIG-IP.pdf



Having just completed the process of load balancing nine Cisco ISE Policy Services Nodes (PSNs) behind our F5 load balancer, I found that one of the most frustrating things was the absolute lack of a publicly available step-by-step guide.  This might be something that is simple for someone who is an expert with F5 load balancers, but for a wireless guy with no real F5 or ISE experience it can be a pretty hefty challenge.

With that in mind I have decided to write a step-by-step guide documenting the process I used to get everything working.  It should be noted that F5 can make significant changes from one version of code to the next so this guide may have to be modified slightly depending on code version.  Our Big-IP is currently running code version 11.4.


This is a basic diagram of how the ISE system is connected today.  The IP addresses have been changed to protect the innocent.  ISE has some requirements that must be met in order to put the PSNs behind any load balancer.  First, the ISE nodes have to be configured so that the F5 acts as their default gateway.  This means they must be layer-2 adjacent to the F5.  Second, source NAT does not work with ISE.  ISE uses the source IP of the Network Access Device (NAD) to track RADIUS sessions and to perform Change of Authorization.  Third, RADIUS sessions must be configured for persistence on the F5 through the use of an iRule, which I will provide in the step-by-step instructions.

For this deployment I decided that I did not want the traffic between the admin/monitor nodes and the individual PSNs to go through the load balancer since it really wasn't necessary and I wasn't all that sure how to set it up anyway.  To accomplish this I set static routes on the ISE PSNs for specific hosts to use the VLAN 1 default gateway rather than the IP address of the F5.

Cisco ISE static route:
ip route 192.168.2.10 255.255.255.255 gateway 192.168.1.1

NOTE: If you are already a master of the F5 and would just like some general guidance on how to configure it rather than a step-by-step guide, you can go to the link below.  It also provides some background information, which I didn't go over in this post, on why things are configured the way they are.

https://supportforums.cisco.com/blog/153056/ise-and-load-balancing

Now without further ado...

Step 1:
The ISE PSNs that are to receive load balanced traffic need to be added to the F5 system.  You can do this on the F5 by going to Local Traffic -> Nodes -> Node List and creating an entry for each of your PSN nodes.
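For those who prefer the command line, the same nodes can be created with tmsh.  This is a rough sketch rather than our actual config - the node names and addresses are made up, and the exact syntax can vary between TMOS versions:

create ltm node ise-psn-01 { address 192.168.1.11 }
create ltm node ise-psn-02 { address 192.168.1.12 }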


Step 2:
In order to ensure that the Policy Services Nodes are reachable and available for authentication and accounting, monitoring probes need to be configured on the F5.  These probes require an account (AD in our case) that will allow them to verify that the PSNs are connected to back-end resources such as Active Directory.  These can be created by going to Local Traffic -> Monitors and creating two monitors.  The type for the first will be RADIUS and the type for the second will be RADIUS Accounting.
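For reference, the tmsh equivalent looks something like the sketch below.  The object names, probe account, and shared secret are placeholders, and I make no promises that the option names are identical on every code version:

create ltm monitor radius ise-radius-mon { username probe-user password probe-pass secret radius-secret interval 5 timeout 16 }
create ltm monitor radius-accounting ise-radius-acct-mon { secret radius-secret interval 5 timeout 16 }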



Step 3:
Next the server pools used to load balance RADIUS authentication and accounting traffic need to be created.  You can use either round robin or least connections as your load balancing method.  We went with least connections.  You can create these pools by going to Local Traffic -> Pool List and hitting the Create button.  The screen shots below illustrate the configuration options I have set.
Be sure to add the health monitors created in the last step to their respective pools under the Health Monitors section of each pool.
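In tmsh terms the two pools come out roughly like this.  Again, the names and member addresses are invented for illustration, and the monitors are the ones from the previous step:

create ltm pool ise-radius-auth-pool { load-balancing-mode least-connections-member monitor ise-radius-mon members add { 192.168.1.11:1812 192.168.1.12:1812 } }
create ltm pool ise-radius-acct-pool { load-balancing-mode least-connections-member monitor ise-radius-acct-mon members add { 192.168.1.11:1813 192.168.1.12:1813 } }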






Step 4:
In this step the iRule that will be used to maintain persistence needs to be created.  I have included my iRule below.  ISE uses two RADIUS attributes for session tracking and both of them should be included in the iRule.  These are Calling-Station-Id and Framed-IP-Address.  You create the iRule by going to Local Traffic -> iRules -> iRule List and pressing the Create button.  Feel free to copy and paste it at your leisure.  Just be aware that this iRule may not work with all versions of F5 code.

The iRule:
# ISE persistence iRule based on Calling-Station-Id (Client MAC Address) and Framed-IP-Address (Client IP address)
when CLIENT_DATA {
    set framed_ip [RADIUS::avp 8 ip4]
    set calling_station_id [RADIUS::avp 31 "string"]
    # log local0. "Request from $calling_station_id:$framed_ip"
    persist uie "$calling_station_id:$framed_ip"
}


Step 5:
Once the iRule is created a persistence profile has to be configured.  This persistence profile will be used by the RADIUS virtual servers to maintain persistence based on the criteria in the iRule.  To create the persistence profile go to Local Traffic -> Profiles -> Persistence and press the Create button.  It should be noted that I have seen an alternate version of the persistence configuration that involved applying the iRule directly to the virtual server rather than creating a persistence profile.  I tried it and it didn't work for me.  I can only assume this is something that works differently in different versions of the F5 code.
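As a rough tmsh sketch, with placeholder names and an hour timeout, the persistence profile would look something like this (verify the option names on your code version):

create ltm persistence universal ise-radius-persist { rule ise-persist-irule match-across-services enabled timeout 3600 }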


Lots of stuff to configure on this one.  Be sure to select 'Universal' as the persistence type, check 'Match Across Services', and add the iRule created in the previous step.  The Timeout is more of a personal preference, but I did configure it.  The Custom check box on the far right has to be checked in order to enable all the options below it.

Step 6:
Now that all that other stuff is set up it's time to set up the virtual servers used to load balance RADIUS traffic.  You create virtual servers by going to Local Traffic -> Virtual Servers -> Virtual Server List and pressing the Create button.  Since there are a large number of configuration options I will put some explanations between the screen captures.


In the above section a source of 0.0.0.0/0 is used because the load balancer is supposed to receive RADIUS traffic from all network devices.  Our network has several different network management subnets so this was really the only option, but it could be changed to a specific subnet if so desired.  The destination is the VIP used for load balancing.  The service port in this case is 1812 because this is the authentication virtual server.


It will be necessary to set the configuration mode to Advanced to get all the configuration options needed.  The big thing in the above screen shot is the RADIUS profile.  It should be set to radiusLB_calling_station_id.


In the above example I have All VLANs and Tunnels configured for VLAN and Tunnel Traffic.  This can be restricted to just the VLANs the load balancer uses to pass traffic.  For instance, VLAN 1 on the network diagram could be set here.  In fact, that's exactly how I have it configured.  It's just not VLAN 1, and Photoshopping screen shots is something I'm just not interested in doing.


Under the Resources section the Default Pool and Default Persistence Profile need to be set.  These were both created in previous steps.  Note the iRule section at the bottom.  Remember in the persistence profile step that I mentioned adding the iRule directly to the virtual server?  This is where you would do that.  It didn't work for me, but that could change with the code version.


These next few screen shots are basically the same as the four previous.  The only real difference is that this is the configuration for RADIUS accounting so port 1813 is used instead of 1812.
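Put together, the two RADIUS virtual servers come out to something roughly like the tmsh below.  The VIP address and object names are invented for illustration, and the UDP/RADIUS profile handling in particular may need adjusting for your code version:

create ltm virtual ise-radius-auth-vs { destination 192.168.1.50:1812 ip-protocol udp profiles add { udp radiusLB_calling_station_id } pool ise-radius-auth-pool persist replace-all-with { ise-radius-persist } }
create ltm virtual ise-radius-acct-vs { destination 192.168.1.50:1813 ip-protocol udp profiles add { udp radiusLB_calling_station_id } pool ise-radius-acct-pool persist replace-all-with { ise-radius-persist } }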




Step 7:
Now that the core load balancing is set up there are some optional configurations for things like DHCP, CoA, NMAP and SNMP.  None of these are absolutely required, but it's highly likely that you will use one or more of them with ISE.  I am using all of them.

Policy NAT configuration:

Unlike RADIUS traffic on ports 1812 and 1813, other things such as CoA and SNMP do use source NAT.  The only thing that needs to be configured is the server in the member list.  This should be configured with the host name of the VIP.



DHCP Profiling:

ISE is capable of using DHCP traffic to profile endpoints as they connect to the network.  If you want to use DHCP profiling the load balancer will have to be configured for it.  The next several screen captures illustrate how this is done.

The DHCP server pool list is created by going to Local Traffic -> Pools -> Pool List and pressing the Create button. I chose to use a built in ICMP health monitor to track the health status of the ISE nodes.  The member list includes all the ISE PSNs behind the load balancer and Round Robin is used for the load balancing technique since the ISE PSNs will share all DHCP information with each other.
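The DHCP pool in tmsh form, using the built-in gateway_icmp monitor (the member addresses are placeholders):

create ltm pool ise-dhcp-pool { load-balancing-mode round-robin monitor gateway_icmp members add { 192.168.1.11:67 192.168.1.12:67 } }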



The DHCP virtual server is pretty basic. The configurations are basically the same as those used to configure the RADIUS servers previously.  Be sure to configure the default pool under the Resources tab to use the DHCP pool configured previously. You create the DHCP virtual servers by going to Local Traffic -> Virtual Servers -> Virtual Server List and pressing the Create button.





RADIUS CoA:

Because the CoA communication will initiate from the ISE nodes, this virtual server needs to be configured to accept traffic from the network the ISE PSNs are on.  The service port should be 1700, which is used for CoA.  The remaining configurations are similar to those used in the RADIUS virtual servers.  The two differences are the Source Address Translation method and the SNAT pool.  Both of these are under the advanced configuration options of the virtual server.  You create the CoA virtual servers by going to Local Traffic -> Virtual Servers -> Virtual Server List and pressing the Create button.




SNMP:

SNMP is almost identical to CoA.  The only difference is the Service Port used.  In this case it will be port 161.  The Source Address Translation method and SNAT pool are the same as CoA.  You create the SNMP virtual servers by going to Local Traffic -> Virtual Servers -> Virtual Server List and pressing the Create button.




Step 8:
Finally, a virtual server needs to be configured to handle all other inbound traffic that will go through the load balancer and one that will be used to handle all return traffic.  You create these virtual servers by going to Local Traffic -> Virtual Servers -> Virtual Server List and pressing the Create button.

The return traffic server is intended as a catch-all to handle all other traffic that might pass through the load balancer.  The source network is the ISE PSN network and all ports are allowed.  The only other thing that needs to be configured is the Protocol Profile.  This needs to be set to 'fastL4', which can help improve performance.



 The default forward is a catch-all rule designed to handle all traffic destined for the ISE PSNs not specifically covered already.  The destination network is set for the ISE PSN network and all ports are forwarded.  Once again, the only thing that needs to be configured is the Protocol Profile.




Well, that's about it.  I am pretty sure that I will have to tweak this as time goes on, but this is what is currently being used in production and it's been running solid for a few weeks now.

Thursday, May 15, 2014

CSCue56163 - 12.4(25e)JAL1 AP recovery img does not work as expected

I got hit with this little gem the other day and it proved to be quite frustrating.  It was only through some lucky Google searches that I was able to find a post on the CSC forums by someone else with the same issue.  It seems that this particular recovery image does not like it when you disable proxy-arp.  Something we have done for security reasons.  This is now my favorite problem with out of the box APs.  It's followed closely by the box of APs pre-loaded with mesh code and the box of APs from the wrong regulatory domain.  It only takes top slot because it is a flaw with the code rather than a simple mistake on the part of our reseller.

I was able to see this bug in action by issuing the "debug capwap client event" and "debug capwap client error" commands.  For my trouble I received a repeat of the following messages:

AP4403.a701.a00a>
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: Could not discover any MWAR.
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: Starting Discovery. Initializing discovery latency in discovery responses.
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: CAPWAP State: Discovery.
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: spamResolveStaticGateway - Removing default route for existing gateway 10.119.64.1
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: spamResolveStaticGateway - Adding default route for gateway 10.119.64.1
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: spamResolveStaticGateway  - gateway found 10.119.64.1
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: spamResolveStaticGateway - Removing default route for existing gateway 10.119.64.1
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: spamResolveStaticGateway - Adding default route for gateway 10.119.64.1
*Mar  1 00:27:21.875: %CAPWAP-3-EVENTLOG: spamResolveStaticGateway  - gateway found 10.119.64.1

It was a Google search on some of these messages that led me to the resolution of the issue.

There are three ways to deal with this little problem according to the Cisco Bug page.

They are as follows:

1. Make sure that ip proxy-arp is configured (the default setting for an IOS router) on the default gateway for the AP's subnet.  Also, if ip broadcast-address is defined on the VLAN interface with something other than 255.255.255.255 the AP will not join.  Either "no" this command or set it back to the all-ones broadcast address.

2. If console access is available on the AP, then disable IP routing.  The AP should then be able to join and download the new IOS image:

ap#debug capwap console cli
ap#configure terminal
ap(config)#no ip routing

(wait for it to join)

This setting will not survive a reboot.

3. Install a different recovery (rcvk9w8) or lightweight IOS (k9w8) image on the AP, such as 15.2(2)JA1.

I went with the second option to troubleshoot the issue, but when the 250 APs in the order are deployed I will probably use option 1 and temporarily enable proxy-arp on the AP management subnets.  Below I have included a screen capture directly from Cisco's web site.
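For option 1, the change on the gateway would look something like this.  Vlan64 is a made-up SVI - use whatever interface serves as the default gateway for your AP management subnet:

interface Vlan64
 ip proxy-arp
 no ip broadcast-address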

CSCue56163




Monday, April 14, 2014

Adding Licenses to a Cisco MSE

Recently, I was tasked with migrating our MSE licenses from one MSE to the other as part of converting the MSEs to a HA pair.  We have several licenses and Cisco reissued the licenses so they could be applied to the migration target.  That was the easy part.

As I was adding the licenses to the new MSE the services stopped responding and I was unable to get them running again.  I tried stopping/starting the services, restarting the services, and rebooting the server with no luck.  For most people this wouldn't be a big deal.  In fact, I have run across some posts on the Cisco support forums from people with the same issue.  The typical fix was to wipe out the server and start over.  Unfortunately for me, this wasn't an option if it could be avoided.  Our MSEs run the Aeroscout engine which is used to track, among other things, temperature sensors on freezers full of expensive pharmaceuticals.

I contacted Cisco and once I was put in touch with a TAC engineer we went through several troubleshooting steps including uninstalling/reinstalling the MSE software.  In order to get everything working again we had to uninstall the MSE software and delete the database then reinstall everything.  Once that was done the Aeroscout engine was put back into place and everything was brought back online.

Once everything was working again I asked the TAC engineer for an explanation.  I wanted to know why this happened.  Naturally, she told me that without the logs she had me delete during the troubleshooting process she wouldn't be able to provide me with an explanation.  I got the same thing from the advanced engineer I contacted next.

Having failed at getting an explanation, I asked if TAC at least had a recommendation on adding the licenses to prevent this from happening again.  TAC recommends combining all licenses into a single license where possible, as in our case (we had six different licenses), and using the Cisco Prime interface to add the license rather than copying it directly into the licenses folder on the MSE.

I don't know if this will help anyone else.  I doubt most people need to maintain their MSE's uptime like we do.  This will most likely apply to other medical institutions more than anywhere else.  Still, I figured it would be worth mentioning if it helps others avoid the same frustration.

Wednesday, December 11, 2013

Wifi for the Mobile Blood Bank Buses

A few months ago network engineering was approached by the Blood Bank IT manager.  He was looking for a way to provide wireless access to the mobile blood bank buses while they were parked at the MD Anderson blood bank so they could upload data once they returned, rather than having to store the data on a laptop, bring it into the blood bank building, and upload it over a wired connection.

Originally, the plan was to install Cisco access points in the parking lot that could be used to provide wireless access to the buses as they sat in the parking lot. Eventually, delays getting the access points deployed provided us with an opportunity to provide a solution that is significantly better than providing wireless access in the parking lot.

We came up with a way to provide a mobile wireless solution that delivers institutional wifi access wherever the blood bank buses go.  This was accomplished with two pieces of equipment: a Cisco Aironet 3502 or 1142 access point and a Cisco 819 series router with 4G LTE functionality built in.  The cell modem configuration on the router was fairly straightforward and only required a few simple commands.  The trickiest part was configuring the cellular profile, but even that wasn't overly difficult.  The configuration documentation can be found here.  Once connected, the access point joins a wireless controller in the standard manner, assuming it can reach the controller through the network.  For us, once the VPN tunnel is established the access point is allowed to reach institutional resources as if it were located inside the institution.
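For reference, the cellular side of the 819 config looked roughly like the sketch below.  This is from memory, so treat the APN, chat script, and line number as placeholders rather than a working config:

! exec mode - create the LTE profile with your carrier's APN
cellular 0 lte profile create 1 your.carrier.apn
!
chat-script lte "" "AT!CALL" TIMEOUT 20 "OK"
!
interface Cellular0
 ip address negotiated
 dialer in-band
 dialer string lte
 dialer-group 1
!
dialer-list 1 protocol ip permit
!
line 3
 script dialer lte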

This is what the contraption looks like...



After testing for a month the system works well.  We have had a few issues with weak cell signal, but that is to be expected.  MD Anderson is planning on deploying these things in all the blood bank buses.  We will also be testing them with the remote blood bank units that perform on site blood draws.  The key difference with these is the lack of a bus.  The blood bank personnel go on location and set up inside a building.  A greater number of issues are anticipated for these tests since the routers will be deployed inside buildings that may have cellular connection issues.  I expect I will be updating this blog at some point or adding a new one once that testing is done.

It should be noted that, in addition to the 819/access point combination, we also tried a Cisco 1900 series router with an integrated wireless access point and a cell modem card.  This was abandoned for a number of reasons.  The router is quite large and takes up more space than the other two devices combined.  The cell connection took about fifteen minutes to come up.  Debugging revealed that the connection would repeatedly fail until it finally got a stable connection.  This happened regardless of where we tested it, and it should be noted that we have an AT&T D.A.S. system deployed throughout the institution, so we always have good cell coverage.  Finally, when the integrated access point was added to an AP group it rebooted and failed to come up.  This was problematic because we only wanted to provide one SSID to the blood bank buses.

In the future we are going to test some other solutions to see if there is anything better out there.  We plan to substitute the access point with an Aironet 600 series office extend access point.  Once configured for OEAP a 600 series access point will be able to connect to the institution without the hassle of a VPN tunnel.  Also, early in 2014 Cisco is supposed to release a version of the 819 router that has an integrated cell modem as well as an integrated access point.  We will be testing the new 819 as soon as we can get our hands on one.