Manually Add Desktop to Automated Pool

Situations may arise that require an administrator to add existing desktops to an existing View desktop pool. For example, two environments merge and a group of persistent desktops need to be combined into a new environment. This would also be useful if a desktop gets inadvertently removed from View.

Back up the ADAM database using vdmexport:

  • Launch Command Prompt with elevated credentials
  • Navigate to: C:\Program Files\VMware\VMware View\Server\Tools\bin
  • Run: vdmexport.exe -f ADAMEXPORT.LDF

Figure 1 – Backup of ADAM Database
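For reference, the full sequence from the elevated command prompt looks like this (a minimal sketch assuming the default installation path; adjust the path and export file name for your environment):

  cd "C:\Program Files\VMware\VMware View\Server\Tools\bin"
  vdmexport.exe -f ADAMEXPORT.LDF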

If the desktop is already a member of a desktop pool, remove the desktop from View Administrator (Optional)

Figure 2 – Remove Desktop from Pool

From a connection server, launch ADSI Edit and connect to the View ADAM instance. The connection settings are listed below.
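As a sketch of the typical settings (based on VMware’s published guidance for connecting to the View ADAM instance; verify against your own environment before connecting):

  Connection Point (Distinguished Name):  dc=vdi,dc=vmware,dc=int
  Computer (domain or server):            localhost:389   (when ADSI Edit is run on the connection server itself)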

Figure 3 – Connect to ADAM Database

Navigate to Server Groups, and select the pool to be modified.

Figure 4 – Modify Desktop Pool

Convert the pool from Automated to Manual by modifying the pae-ServerPoolType attribute in the ADAM database.

  • Automated Pool with naming pattern – Attribute Value = 1
  • Manual Pool – Attribute Value = 5
  • Automated Pool with manual naming – Attribute Value = 12

Note: This attribute can hold other values as well, depending on the pool type. Be sure to note the original value so it can be changed back after the desktop has been added.
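For administrators who prefer to script this change rather than edit the attribute by hand in ADSI Edit, a minimal sketch using ldifde (which ships with the AD LDS role on the connection server) is shown below. The pool name Win7Pool is hypothetical; confirm the actual DN of the pool object under OU=Server Groups in ADSI Edit before importing, and record the original value first.

Contents of a change file named pooltype.ldf:

  dn: cn=Win7Pool,ou=Server Groups,dc=vdi,dc=vmware,dc=int
  changetype: modify
  replace: pae-ServerPoolType
  pae-ServerPoolType: 5
  -

Import the change against the local ADAM instance from an elevated command prompt:

  ldifde -i -f pooltype.ldf -s localhost -t 389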

In View Administrator, the pool will now be listed as a manual pool.

Add the desktop to the pool as shown below, and refresh the status.

Figure 5 – Add Desktop to Pool

Return the pool to its previous type by modifying the pae-ServerPoolType attribute.

Reassign user to View desktop (if applicable)

Figure 6 – Assign User to Desktop

Once the desktop has been added back to the pool and the pool has been returned to an automated pool, the desktop virtual machine should become available. In some cases, the desktop may go into a Customizing state and eventually change to Agent Unavailable.

Host-based Caching with QLogic FabricCache

Introduction

In a previous article, the topic of discussion was leveraging flash for a virtual infrastructure.  Some environments today are still not fully virtualized; this is typically because the last workloads to be virtualized are the systems with high horsepower application requirements, such as SQL and Oracle.  These types of workloads can benefit greatly from flash-based caching.  While many hardware/software combo solutions do exist for in-OS caching in non-virtualized environments, they will be covered in a separate article.  This article focuses on the QLogic host bus adapter-based caching solution FabricCache.

How it works

QLogic’s take on using flash technology to accelerate SAN storage is to place the cache directly on the SAN, in its HBAs, treating cache as a shared SAN resource.

Overview

The QLogic FabricCache 10000 Series Adapter consists of an FC HBA and a PCIe flash device. These two adapters are ribbon-connected, giving the HBA direct access to the flash device; the PCIe bus provides only power to the flash card. While some consider this a two-card solution, keep in mind that a host using a PCIe flash adapter to accelerate data traffic still needs an HBA as well. That said, an existing HBA cannot be reused with the FabricCache solution.

Caching

A LUN is pinned to a specific HBA/cache – all hosts that need the cached data for that LUN access it from the FabricCache adapter to which the LUN is pinned. For example: Server_A and Server_B are two ESX hosts in a cluster. Datastore_A is on LUN_A and Datastore_B is on LUN_B. LUN_A is tied to HBA_A on Server_A, while LUN_B is tied to HBA_B on Server_B. All of the cached data for LUN_A will be in the cache on HBA_A, so if a VM on Server_B is running on LUN_A, its cache resides on HBA_A in Server_A. In that case, the host accesses the cache across the SAN on the neighboring host; a cache miss results in a normal storage array access. For this reason, each adapter must be in the same zone to be able to see the other caches in the FabricCache cluster.

Figure – FabricCache cluster cache access

In a FabricCache cluster, all data access is in-band between the HBAs on the fabric, with no cache visibility to the host operating system, the hypervisor, or the HBA driver. QLogic’s FabricCache uses write-through caching: all writes are sent through the cache, and an acknowledgement from the array is required before the write is acknowledged to the operating system as complete. The lack of write-back caching is based on the premise that roughly 80% of SAN traffic is reads, so caching reads accelerates the largest portion of SAN traffic.

The technology can be managed from the command line using the QLogic Host CLI, or through a GUI called QConvergeConsole; a plug-in for VMware’s vCenter is also available. In these interfaces, an administrator can target specific LUNs to cache and assign each LUN to the specific HBA cache that should be used.

When and Why FabricCache

In my opinion, the biggest reason to use FabricCache is the pricing model. Since the solution is all hardware, there is no annual software maintenance required. That being said, there is a hardware warranty that must be purchased for the first year, but typically that cost can be capitalized, resulting in an OpEx-free solution. Ongoing maintenance, at least when I checked, is not required.

Once you reach 5 systems with the adapter, the annual maintenance cost for those 5 systems is equal to the cost of purchasing a stand-by adapter and keeping it on the shelf in case of failure, thus avoiding the recurring maintenance costs.

This technology is a great fit in scenarios where the supportability of the cache solution, the lifespan of the flash, and the recurring cost of the solution are paramount to the decision. These areas are discussed in greater detail in the sections below.

Supportability

  • No software aside from the HBA driver – this is very beneficial in that there is no risk of third-party software acting as a shim in your operating system. Custom-written cache drivers are often supported only by the software’s manufacturer and not thoroughly tested by the operating system/hypervisor vendor, which can result in complications when upgrading or integrating with other pieces of software.
  • Physical Clusters – Since cache is part of the SAN, cache will still be available during failover.
  • Non-VMware Hypervisors – Many hypervisors do not have host-based cache products available. Since this solution is all hardware, with only a simple operating system driver for the HBA, it is a good candidate for platforms that otherwise lack a caching option.

Longer Life / Lower Cost

  • Uses SLC flash – SLC flash typically lasts about five to ten times longer than MLC or eMLC flash, and it typically has about twice the performance. The result is faster cache access and a longer time until failure.
  • Slightly higher CapEx due to the cost of flash, but it gives the option to avoid OpEx.

When Not and Why Not FabricCache

There are also a few cases where I would avoid this solution in favor of an in-OS or in-guest solution.  These cases are listed below.

  • Virtual Environment – Since the HBA owns the LUN, as shown in the diagram earlier in this article, all VMs using that LUN would need to be on the host that owns its cache to get local cache access. After a vMotion, the host needs to access the cache on another host, across the fabric, since the VM has moved.
  • Many organizations use two HBAs in a single host – each HBA connecting to a separate switch, for resiliency in the event of an HBA failure. This configuration would require the purchase of two FabricCache adapters, which makes the price almost unpalatable.

Another notable item with this solution: if a server in the FabricCache cluster becomes unavailable, no cache acceleration will be available for the LUNs serviced by the FabricCache adapter in the failed host.

Conclusion

This article focuses on a host bus adapter-based caching solution, QLogic’s FabricCache: how it works and where I see it being a good fit in an infrastructure.

Troubleshooting Tip – The resource ‘104’ is in use

A strange issue appeared in View Administrator not long before a scheduled upgrade to Horizon 6. Each time Composer attempted to spin up a new desktop, the task would error out in vCenter with the error The resource ‘104’ is in use. One would think that this issue would not have much impact. Unfortunately, it wreaked havoc on a linked clone pool: over a thousand errors were reported in View Administrator overnight when the issue arose, as the desktop pool was not set to Stop provisioning on error. Upon refreshing the ports on the vSwitch, the error went away.

A quick search turned up a VMware KB article explaining that the number 104 refers to a specific port on the virtual distributed switch. Looking at the ports on the virtual distributed switch, it was observed that an older version of a template that had once serviced a full clone pool was occupying the port in question. There seemed to be some sort of confusion on the side of vCenter. Errors like this had been solved in the past by rebooting vCenter, and the next outage window available for a vCenter reboot was the one allocated for the upgrade. Since the first step in the upgrade was to reboot vCenter, this seemed like a good time. After the reboot of vCenter and a successful upgrade, the issue persisted.

It was time to look at the template. Since the template was an older version of the standard desktop that was kept around for fallback purposes, it was determined that a good first step was to convert the template to a virtual machine and disconnect it from the vDS. Upon right-clicking the template, it was discovered that the option to Convert to Virtual Machine was grayed out. A quick search returned a VMware KB article discussing possible permissions issues. Since permissions were not the problem here, the article’s recommendation to remove the VM from inventory was followed. Removing the template from inventory fixed the problem, and it has yet to return. Rather than re-add the VM to inventory, the VM was deleted, as it was no longer in use.

Horizon 6.0 Upgrade Notes – Composer Error 1920

Lesson: Back up machine certificates before upgrading View Composer

When upgrading a View environment, the first place to start is View Composer. In a recent upgrade to Horizon 6, the Composer upgrade failed with Error 1920, shown below:

It did not take long to find this VMware KB article, which then pointed to this Microsoft KB article. The error occurs in environments that do not have Internet connectivity and cannot connect to the appropriate URLs for checking certificate trust lists. Microsoft provides a patch and a policy setting to resolve the issue, and the fix was simple enough. After implementing the fixes provided by Microsoft, the installation was restarted. When reaching the step in the installation where the certificate is to be selected, it was found that the machine certificate for this system had been removed. There is no explanation for this certificate removal. The two places this was seen were a lab environment, where certificates were easy to redeploy, and an environment that was using self-signed certificates. This is why it is always a good idea to back up certificates before upgrading View Composer, or before any such operation for that matter.
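As a precaution, the Composer machine certificate can be exported to a PFX file before starting the upgrade. A minimal sketch using certutil from an elevated command prompt is shown below; the certificate name, password, and output path are placeholders, so substitute the values for your own environment and store the PFX and its password securely.

  rem List the certificates in the local machine Personal store to find the one Composer uses
  certutil -store My

  rem Export the certificate and its private key to a password-protected PFX file
  certutil -p "PlaceholderPassw0rd" -exportPFX My "composer.example.com" C:\Backup\composer-cert.pfx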

Horizon 6.0 Upgrade Notes – View Events Database Unresponsive

Upon upgrading a View 5.2 environment to Horizon 6.0, the previously functional events database in View Administrator became unusable. The first place examined was the SQL server where the View events database resides. Upon logging into the server, it was quickly observed that the CPU of the SQL server was pegged at 100%. After some time, the CPU returned to normal. To confirm the suspicion that the events database issue was related to the pegged SQL CPU, some testing was performed. While watching Task Manager on the SQL server, View Administrator was launched and navigated to Monitoring\Events. Upon attempting to view the events, the CPU on the SQL server spiked dramatically. It was time to find out why.

A little digging into the background of the events database in VMware View explains the table layout. There are a few notable tables, and a quick way to check how large each has grown is sketched after the list:

  • Event – Contains metadata and search optimization data for recent events
  • Event_data – Contains data for recent events
  • Event_historical – Contains metadata and search optimization data for older events
  • Event_data_historical – Contains data for older events
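To see how large each of these tables has grown, their row counts can be queried directly with sqlcmd. This is only a sketch: the server name SQLSERVER01, the database name ViewEvents, and the table prefix VE_ are placeholders for the values chosen when the events database was originally configured.

  sqlcmd -S SQLSERVER01 -d ViewEvents -Q "SELECT COUNT(*) AS recent_events FROM dbo.VE_event"
  sqlcmd -S SQLSERVER01 -d ViewEvents -Q "SELECT COUNT(*) AS recent_event_data FROM dbo.VE_event_data"
  sqlcmd -S SQLSERVER01 -d ViewEvents -Q "SELECT COUNT(*) AS historical_events FROM dbo.VE_event_historical"

Comparing the counts in the recent tables against the 2,000-event display limit described below gives a rough idea of how much work View Administrator is asking the SQL server to do.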

In the Events view, with a functioning events database, the text below is shown:


This article from the VMware KB explains that the View Administrator UI defaults to displaying a maximum of 2,000 events, and it includes the settings to change to increase this value. It notes that this maximum is in place to protect performance: the higher the number of events in the event and event_data tables, the more time and system resources are required to fetch and display the records.

There is also a setting in View Administrator that controls how long to Show events in View Administrator. When Show events in View Administrator is set to 2 months, as shown below, events older than 2 months are moved to the historical tables in the database. The Classify events as new for option specifies how long View considers these events “new.” Environments that generate a lot of events should keep these values low to ensure View Administrator can pull back the events quickly.


Now, getting back to the original problem that this article was written to discuss: looking at the events database under System Health showed the following:


The following items were gleaned from this view: 1) View Administrator can only display 2,000 events for performance reasons; 2) there were 89,964 events in the recent tables; and 3) over 2 million events had been recorded since the initial deployment of the environment. It seemed time to create a new database. After reconfiguring View Administrator to point to the new database, the Events view started working again.
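Creating the replacement database itself is trivial; a minimal sketch with sqlcmd is shown below, again using placeholder server and database names. Once the empty database exists and View Administrator is pointed at it with a SQL login that has rights to it, View creates the event tables on its own.

  sqlcmd -S SQLSERVER01 -Q "CREATE DATABASE ViewEvents2"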

In retrospect, it may have made more sense to lower the Show events in View Administrator time to a week, hopefully pushing many of those events into the historical tables. Either way, the problem was resolved. After a few days, the number of events recorded in the events database was much more reasonable.


One thing to keep in mind during the upgrade is that every time a connection server is rebooted, each desktop on it disconnects from and reconnects to the connection server, and each of those actions registers an event in the events database. Multiply that by however many hundreds of desktops are on that connection server, and by how many times each connection server was rebooted – likely more than once – and that alone generated a lot of events. Also, some Composer issues that occurred not long before the upgrade registered quite a few events: the linked clone pools were not configured with Stop provisioning on error and continued to fail over and over again. That error, “The resource ‘182’ is in use” (covered in another article), generated thousands of events in the events database.

Creating the new database dramatically reduced the number of events that View Administrator was trying to access. The problem View was having was that the tables storing current events had grown too large.