Category Archives: Xenserver

XenServer Host Is In Emergency Mode

It’s 8 pm on a Sunday evening, and I get a panicked call from a customer because he cannot connect to his XenServersTM via the XenCenterTM management tool. However, as near as he could tell, all of the hosted virtual machines were up and running and in a healthy state. He had unsuccessfully tried to point the XenCenter management tool at another member of the XenServer pool but was unsuccessful.

So what happened and how do you fix it?

This situation can happen for several reasons but generally it happens when there are only two servers in the XenServer pool, and the pool master suddenly fails. In essence, what happens is the surviving server (let’s just call it the “slave”) can no longer see its peer, the pool master, so it assumes it has been stranded and goes into emergency mode to protect its own VMs. There are other ways this can happen (an incorrectly configured pool with HA turned on for example), but this is the most common reason that I have personally experienced.

Depending upon the situation, you may not be able to ping the master server because it is actually down, or you may be able to ping the server but it is in an inconsistent, “locked up”, state such that it cannot answer requests to it. If you are able to connect to the console of the master server either directly with a monitor, keyboard, and mouse (the old fashioned way) or through a remote management interface (DRAC, ILO, ILOM, etc) the server may appear to be running, but you may not be able to do anything with it.

At this point you may be thinking, “This is no big deal - just reboot the machine and it will be fine.” If you are lucky that may actually solve the problem, but in many cases it will not. What you might see is that after the master reboots you will be able to connect to the master but you will not see the slave. Or it may be that your master is truly broken and you are not able to simply reboot it due to a system or hardware failure. But, of course, you’ve still got to get your pool online and working again regardless.

During this period of time, if you try to use a tool such as Putty to connect to the slave via its management interface, you may not be able to connect to it either. If you try to ping the slave on the management interface you may not get any replies. But if you connect to the console of the slave (again, either the physical console or via a remote management interface) you will probably see that the machine is running, but if you look at XSconsole it will appear that the management interface is gone because there will be no IP address showing. By now you’ll probably be scratching your head because the strange thing is all the VMs are running.

So at this point your master appears to be down, or at least impaired, you’ve got no management interface on the slave, your pool is broken and you cannot manage the VMs. So what do you do?

Well, if this happens to you and your VMs are still up and running the first thing you should do is take a deep breath, because more than likely it is not as bad as you might think. XenServer is a robust platform and if the infrastructure is built correctly (and I’m going to quote a customer), “you can really slam the things around and they still work”.

After you take a deep breath and let it out slowly, from the console of the slave server, you will need to access the command line and start by typing:

xe host-is-in-emergency-mode

If the server returns an answer of “True” then you’ve confirmed that the server has gone into emergency mode in order to protect itself and the VMs running on it. (If the server returns an answer of “False” then you can stop reading, because the rest of this post isn’t going to help you.)

Assuming you receive the answer of “True” the slave server is in emergency mode because it cannot see a master – either because the master is actually down, or because the management interface(s) is(are) not working. Therefore, the next step is to promote the slave to master to get it out of emergency mode. We do this by typing:

xe pool-emergency-transition-to-master

At this point the slave server should take over as the pool master and the management interface should be available again. Now if you type the xe host-is-in-emergency-mode command again you should get an answer of “False”.

Now, open XenCenter again. It will first try to connect to the server that was the master, but after it times out it will then attempt to connect to the new master server. Be patient, because eventually it will connect (it may take several seconds) and you will again see your pool and be able to manage your VM’s. If some of the VMs are down because they were on the server that failed you’ll be able to start them on the remaining server (assuming you have shared backend storage and sufficient processor and memory resources).

Now what about the master if it has totally failed? What do I do after I’ve fixed, say, a hardware problem in order to return it to my pool?

If the following two conditions are true:

  1. You are using shared storage so that your VMs are not stored on the XenServer local drives, and
  2. You have built your XenServers with HBAs (fiber or iSCSI) rather than using Open iSCSI, which means the connectivity information to your backend SAN will be stored within the HBA,

…then it may be much simpler and quicker just to reload the XenServer operating system. (If you do not have shared backend storage, which means your VMs are on local storage, DO NOT DO THIS). I can rebuild my XenServers from scratch in about 20 - 30 minutes and have them back in the pool and running.

If either of those two conditions is not true then, depending upon your situation, recovery may be significantly more difficult. It could be as simple as resetting your Open iSCSI settings and connecting back to your SAN (still easy but takes more time to accomplish) or it could be as painful as rebuilding your VMs because you lost your server drives. (OUCH!)

Real world example: I recently had a NIC fail on the motherboard of my master server. Of course since the NIC was on the motherboard it meant the whole motherboard had to be replaced which significantly modified the hardware configuration for that server.

In this case, when I brought that XenServer back online it still had all the information about the old NICs showing in XenCenter, plus it had all the new NICs from the new hardware. Yes I could have used some PIF forget commands to remove the NICs that no longer existed and reconfigure everything but that would have taken me a bit of time to straighten out. Since I had iSCSI HBAs attached to a Datacore SAN (great product, by the way) for shared storage, all I did was reload XenServer on that machine, modify the multipath-enabled.conf file (that is a different blog topic for another day), and rejoin the server to the pool. Because the HBAs already had all the iSCSI information saved in the card, the storage automatically reconnected all the LUNs, the network interfaces took the configuration of the pool, and I was back online and running in less than 30 minutes.

After you repair the machine that failed and get it back online, you may want it to once again be the master server. To do this type:

xe host-list

You will get a list of available servers with their UUID’s. Record the UUID of the server that you want to designate as the new master and then type:

xe pool-designate-new-master host-uuid=[the uuid of the host you want]

After you type this your pool will again disappear from XenCenter, but after about 20 – 30 seconds (be patient) it will reappear with the new server as the master. Your pool should now be healthy, and you should again be able to manage servers as normal.

XenServer Tips - HBAs, HA, and HOSTDEVSCAN

In this installment of the Moose Logic Video Series, Steve Parlee, our Director of Engineering, talks about:

  • Why we always use iSCSI HBAs in our Citrix XenServer deployments.
  • The possible risks of using HA in a two-server pool. (NOTE: Initial testing indicates that XenServer v5.6 may not present the same problems in a two-server pool as earlier versions. When we have completed our testing, we will post an update here.)
  • A useful utility for XenServer called “hostdevscan.”

Just Sign the Check Right Here - We’ll Fill In the Amount Later

Back in the old days of minicomputers and mainframes, we used to joke about IBM’s ability to, for all intents and purposes, get the customer to sign a blank check. They were better than anybody I’ve ever seen at getting people to commit to a solution when they really had no idea what the ultimate cost would be - and they were successful because of another cliche (which became a cliche because it was so accurate): “Nobody ever got fired for buying from IBM.” The message was basically, “Yes, we may be more expensive than everybody else, but we’ll take care of you.”

For the most part, those days are long gone, which made it all the more amazing to me to read that VMware is adopting per-VM licensing for most of its management products.

The article nails the basic problem with this licensing approach:

You know how many processors you have on a system, and that’s a fixed number. But the number of VMs on one host — let alone throughout your entire infrastructure — is regularly in flux. How do you plan your purchasing around that? And how do you make sure you don’t violate your licensing terms?

Hey, it’s easy - you just let VMware tell you what to put on your check at the end of the year:

You estimate your needs for the next year and buy licenses to meet those needs. Over the course of those 12 months, vCenter Server calculates the average number of concurrently powered-on VMs running the software. And if you end up needing more licenses to cover what you used, you just reconcile with VMware at the end of the year.

And, before you ask, no, you don’t get money back if you use fewer licenses than you originally purchased.

Sounds to me like a sweet deal - for VMware.

By comparison, the most expensive version of XenServer is $5,000 per server (not per processor, not per VM), and all of the management functionality is included. And the basic version of XenServer, which includes live motion, is free, and still includes the XenCenter distributed management software. (Here’s a helpful comparison chart of which features are included in which version of XenServer.)

A number of years ago, I attended a seminar that discussed the product adoption curve, and how products moved from the “innovation” phase to the “commodity” phase. The inflection point for a particular market was referred to as the “point of most” - where most of the products met most of the needs of most of the customers most of the time. When this point is reached, additional feature innovation no longer justifies a premium price.

The fact is that XenServer and Hyper-V are rapidly achieving feature parity with VMware. If we haven’t reached the “point of most” yet, we certainly will before much more time goes by. So even if you have a substantial investment in VMware already, at some point you have to re-examine what it’s costing every year, don’t you? Or are you OK with just signing a check and letting them fill in the amount later?

Citrix Continues to Virtualize Appliances

Five or six years ago, when Citrix first announced the Citrix Access Gateway appliance, I remember scratching my head and thinking, “Wait a minute, Citrix is in the software business. Why in the world do they want to start selling hardware, with all of the warranty, repair, and support issues that come along with being a hardware manufacturer?” The answer, of course, was that in order to build out the complete Application Delivery solution they envisioned, they needed components that, at the time, couldn’t be delivered using software alone.

But the world turns, and time moves on, and today Citrix has a world-class virtualization platform that runs on off-the-shelf server hardware that is itself mind-bogglingly powerful compared to what was available five or six years ago. So it makes all the sense in the world for Citrix to turn all of those hardware devices into virtual appliances as quickly as they can.

Yesterday, they formally announced virtualized versions of both the Access Gateway and the Branch Repeater. We’ll get to the virtual Branch Repeater in another post, because we’ll have our hands full in this one just covering the things you need to know about the Access Gateway VPX.

First, you need to know that the Access Gateway VPX is fundamentally a virtualized version of the 2010 CAG Appliance - with some exceptions that we’ll get into in a moment. You can download it and use XenCenter to import it directly into your XenServer environment. The cost is only $995 (compared to $3,500 for the 2010 hardware appliance), with an ongoing Subscription Advantage cost of $129/year. Here’s where it gets cool:

  • It was difficult to come up with a good solution for redundancy and automatic failover with the 2010 appliance. Unless you wanted to put a load-balancer in front of it (and if you’re going to do that, you may as well just buy a NetScaler in the first place), redundancy depended on putting primary and secondary appliance URLs or IP addresses into the CAG client. And that did you no good at all if you were trying to run it in “CSG-replacement mode” just to provide secure Web access to a XenApp farm. But the VPX virtual appliance fully supports Live Motion, XenServer HA, and NIC bonding. So the VPX doesn’t have to be redundant, because the underlying XenServer infrastructure can provide the resilience you need.
  • If you were using a 2010 appliance, and wanted to use “SmartAccess,” you had to stand up a separate “Advanced Access Control” Web server in your protected network. Obviously, that added to the cost and complexity of the solution. The VPX appliance, on the other hand, supports SmartAccess policies directly.

    Edit July 27, 2010: Not sure now where I originally picked up this information, but it is incorrect. An Advanced Access Control Web server is still required to enable SmartAccess policies with the Access Gateway VPX.

NOTE: SmartAccess, in case you’re not familiar with the term, allows you to control, at a very granular level, what applications and information a user can access, and what they can do with that information, based on the access scenario. The same user, presenting the same authentication credentials, might get a totally different level of access depending on whether s/he is connecting from inside the corporate network, from outside the network using a company-owned laptop, from home using a personal PC, or from a hotel business center using a totally untrusted device. For more information on how SmartAccess works and why it’s cool, check out this video from Citrix TV:

  • The VPX appliance fully supports the latest generation of the Citrix Receiver, and works with Dazzle and the Merchandising Server.
  • You no longer need to buy VPN client licenses to run it in “CSG replacement” mode. This is a biggie. Citrix made it clear some time ago that they would not be putting any more development time and effort into enhancing the software “Citrix Secure Gateway.” But the CSG just wouldn’t die, for one simple reason: it’s free. If you own XenApp or XenDesktop licenses with current Subscription Advantage, you’ve got the rights to use the CSG software, and your only cost is a server to run it on…and that’s pretty low in today’s virtual world. Yes, it could be argued that the CAG appliance was somewhat more secure, since it ran on a hardened Linux-derived kernel. But it cost $3,500 plus roughly $100 per concurrent user. Hmmm… CSG, free, CAG appliance, several thousand dollars. That was an easy decision for a lot of users.

    Co-incident with the release of the VPX appliance, Citrix is announcing that they’re eliminating the Access Gateway Standard User Licenses. They will no longer be sold as of June 30. Instead, when you buy an Access Gateway (physical or virtual), you get a “platform license” that entitles you to use it to secure access to a XenApp or XenDesktop farm (i.e., what’s generally referred to as “CSG Replacement Mode”) at no additional charge. So now the equation is: CSG, free, but I’ve got to put it on a server, and if it’s a Windows Server, the OS is going to cost me $700 - $800 or so. CAG VPX, $995, but I import it directly into my XenServer infrastructure and don’t have to pay for anything else unless I want the advanced access functionality. Suddenly the value proposition looks a lot more attractive.

  • Speaking of the advanced access functionality, Citrix has made some licensing changes there as well. The Access Gateway Universal licensing model has been reduced from three tiers to two, and the prices have been lowered. So now, if you didn’t purchase the XenApp or XenDesktop Platinum Editions (which include Access Gateway Universal licenses), you can purchase the Access Gateway Universal licenses separately for $100/concurrent user in quantities up to 2,500, and $50/concurrent user for 2,500+ users.

What’s the down side? Well, I’m not sure there is one. The VPX appliance isn’t going to work well as a general-purpose SSL/VPN for thousands of concurrent users, but then neither did the 2010 hardware appliance. So if that’s what you need, or if you need the high-end enterprise features like Global Server Load Balancing to enable transparent failover to a Disaster Recovery site, then we need to have a conversation about NetScalers. But for basic CSG-like functionality, or a SmartAccess deployment for a few hundred concurrent users, the virtual appliance looks pretty darned attractive to me.

For more information on the Access Gateway VPX, including a demo of just how easy it is to import it into your XenServer environment and get it running, check out the following video from Citrix TV:

Installing Server 2008 R2 on XenServer 5.0

Recently I had my first opportunity to create a Windows 2008 R2 virtual machine on Citrix XenServer 5.0. When I attempted to install the operating system I ran into an interesting issue where the installation would hang right at the initial Windows install screen and the CPU usage pegged at 100%. Once that happened no matter how long I waited the installation never progressed beyond that point.

Of course what did I do? I turned to Google and quickly found the following article which provided a workaround for the issue I was having.

I followed the advice in the article and after running the “xe vm-param-set uuid= platform:viridian=false” command as outlined in the article was able to install Windows Server 2008 R2.

Two more things are worth mentioning here, which are not specifically addressed in the previously referenced article:

  1. With Windows 2008 R2 I was able to install the XenServer 5.0 tools with none of the problems others people are having with Windows 7 installations.
  2. This issue has been resolved in XenServer v5.5!