How to Fix Java issues with Citrix Netscaler GUI

We have all encountered the dreaded Java error when trying to connect to the Citrix Netscaler GUI.  In this post I would like to walk through the steps for resolving those Java error messages.  There are a few technical articles that try to walk you through troubleshooting this issue, but I have found the method I use to be the most successful.  For me this is one of the most frustrating error messages, as I am constantly working across different versions of Java, Netscaler firmware, and browsers.


For starters, let's go ahead and uninstall any version of Java you currently have installed.  Most versions of Netscaler 10.1 and above will support the most recent version of Java.  You can download the most recent version here.  For this exercise, we are going to assume you are using Chrome, Firefox, or Internet Explorer.  In my experience, I have had the most success with the Netscaler GUI in the Chrome browser.

After you have successfully installed Java and gone through the confirmation process, browse to your Java configuration applet or go to Control Panel > Java (32-bit).

Once the Java Control Panel pops up, click on the Settings button.


You will now be taken to the Temporary Internet Files dialog.  First, click on the “Delete Files” button.


Once the “Delete Files and Applications” box appears, UNCHECK all of the checkboxes and click OK.


Before clicking out of the Temporary Internet Files dialog, make sure to uncheck “Keep temporary files on my computer” and click OK.  Keeping all of these temporary files around is one of the main causes of applet corruption.


That last set of steps will clear out all the previously downloaded temporary applets, cookies, and certificates you currently have in your configuration.  If you are launching Java for the first time after a fresh install this might be a moot point, but I do it anyway 🙂

Now, stay in the Java Control Panel and at the top, click on the “Security” Tab.  Inside of that tab, click on “Edit Site List” at the bottom.


Once you have clicked on Edit Site List, click on Add.  Here you will be able to add the Netscaler access gateway FQDN as an exception.  Only add websites whose certificates you know you can trust.


After you click Add you will notice a text box appear in the same window.  Go ahead and enter your Netscaler FQDN into that field and click OK, for example:  https://yournetscaler.yourdomain.com


After clicking OK, you will notice your Netscaler FQDN is now in the exceptions list.  Click OK to exit the Java Control Panel and relaunch your browser to test.
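If you find yourself doing this on a lot of machines, the Edit Site List dialog is, as far as I can tell, just writing to a plain-text exception.sites file under your Java deployment directory, so you can script the same change.  Here is a minimal Python sketch; the Windows path below is an assumption based on recent Java releases, so double-check where your version keeps exception.sites before relying on it.

```python
# Sketch: append a NetScaler FQDN to the Java exception site list without the GUI.
# Assumes the per-user path that recent Java releases use on Windows; adjust it
# if your Java version keeps exception.sites somewhere else.
import os

netscaler_url = "https://yournetscaler.yourdomain.com"  # your NetScaler FQDN
sites_file = os.path.expandvars(
    r"%USERPROFILE%\AppData\LocalLow\Sun\Java\Deployment\security\exception.sites"
)

os.makedirs(os.path.dirname(sites_file), exist_ok=True)

# The file is just one URL per line; add ours if it is not already there.
existing = []
if os.path.exists(sites_file):
    with open(sites_file, "r", encoding="utf-8") as f:
        existing = [line.strip() for line in f if line.strip()]

if netscaler_url not in existing:
    with open(sites_file, "a", encoding="utf-8") as f:
        f.write(netscaler_url + "\n")
    print(f"Added {netscaler_url} to the exception site list")
else:
    print(f"{netscaler_url} is already in the exception site list")
```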


This article applies to Netscaler versions 9.3, 10.0, 10.1

Let me know how it goes.  Add your comments below!

Kevin B. Ottomeyer @OttoKnowsBest

Configuring Citrix Storefront Domain Pass-through with Receiver for Windows

I would like to discuss the procedure for configuring and implementing Domain Pass-through with Citrix Storefront and Citrix Receiver.

First things first, let’s get a receiver installed on a test machine.

****Note: this machine, and all subsequent machines, must be members of the domain that your Storefront server is joined to in order for pass-through to work.

Download the Citrix receiver Here

Once downloaded, find the path of your download location.  Now we will need to install the Receiver with its single sign-on switch enabled.

This will install the Receiver and enable and start the single sign-on service on that machine.  After your installation is completed and the machine is rebooted, log back in to your workstation and double-check to make sure the ssonsvr.exe process was installed and is currently running.
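If you want a quick scripted check instead of digging through the GUI, here is a small Python sketch that uses the built-in Windows tasklist command to confirm ssonsvr.exe is running; the process name is the one referenced above, everything else is just illustrative.

```python
# Sketch: confirm the Citrix single sign-on process (ssonsvr.exe) is running on
# the test workstation, using the built-in Windows tasklist command.
import subprocess

result = subprocess.run(
    ["tasklist", "/FI", "IMAGENAME eq ssonsvr.exe"],
    capture_output=True, text=True,
)

if "ssonsvr.exe" in result.stdout.lower():
    print("ssonsvr.exe is running - single sign-on should be available.")
else:
    print("ssonsvr.exe was not found - reinstall Receiver with the single sign-on option.")
```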


Once you have confirmed this, let's move over to your Storefront server.

Launch the Storefront administration console from the Storefront server and, on the left side of the console, click on Authentication.


Once Authentication is selected, move over to the right side of the console screen and, under Actions > Authentication, click on Add/Remove Methods.


After clicking on Add/Remove Methods, a dialog box should appear with options to select which methods you would like to enable in Storefront.  The second option from the top is “Domain pass-through”; click the check box next to that option and click OK.  This enables Storefront to take the credentials from the ssonsvr service on your workstation, pass them through Storefront, and enumerate the app list without authenticating twice.


Depending on your Citrix infrastructure, you might need to propagate the changes to the other Storefront servers in your Server Group.  If you have more than one Storefront server and you do not propagate changes, you might see mixed results in your testing.

To do this, click on “Server Group” on the left side of the console and then, under Actions on the right side, click on “Propagate Changes”.  This action will replicate all the changes you just made to your authentication policies over to the other Storefront servers in your Server Group.

Now that you have all the configuration pieces in place, reboot the workstation you installed the Receiver on and log back in.  Once logged in, you should be able to right-click on the Receiver and click Open.  Receiver will now prompt you for your Storefront FQDN, or your email address if you have email-based discovery enabled.  At this point your application list should enumerate without prompting for credentials.  This also goes for the web portal.  Test both to make sure they are passing those credentials through appropriately.

********If your credentials still do not pass through, below are a few troubleshooting steps you can take.  Of course this all depends on how your environment is set up and what access you have to modify certain components in your Windows infrastructure.

Modifying local Policy to enable pass-through on the workstation

Apply the icaclient.adm template located in C:\Program Files\Citrix\ICA Client\Configuration to the client device through Local or Domain Group Policy.

Once the adm template is imported, navigate to Computer Configuration\Administrative Templates\Classic Administrative Templates\Citrix Components\Citrix Receiver\User authentication\, then double-click on the “Local user name and password” setting.


When the following box appears, make sure to select both “Enable pass-through authentication” and “Allow pass-through authentication for all ICA connections”.


Adding Trusted Sites in your browser

On the same workstation where you are testing pass-through, open IE and navigate to Tools > Internet Options.  Click on Trusted Sites and add your Storefront FQDN (the same address you entered into the Receiver when you set it up).


Also, it wouldn’t hurt to configure pass-through in IE.  In the Internet Options Security tab, with Trusted Sites selected, choose Custom level for that security zone.  Scroll to the bottom of the list and select Automatic logon with current user name and password.


Configure the NIC provider order

On the workstation where you installed the Receiver, launch Control Panel and click on Network Connections, then choose Advanced > Advanced Settings > Provider Order tab and move the Citrix Single Sign-on entry to the top of the Network Providers list.
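If you would rather verify the provider order without clicking through Control Panel, the list lives in the registry.  Below is a small Python sketch that reads it; it assumes the standard ProviderOrder value under HKLM\SYSTEM\CurrentControlSet\Control\NetworkProvider\Order, so treat it as a read-only sanity check and keep making the actual change through the GUI as described above.

```python
# Sketch: read the current network provider order from the registry so you can
# confirm the Citrix Single Sign-on provider is at the front of the list.
# Assumes the standard ProviderOrder value; make the actual change via the GUI.
import winreg

with winreg.OpenKey(
    winreg.HKEY_LOCAL_MACHINE,
    r"SYSTEM\CurrentControlSet\Control\NetworkProvider\Order",
) as key:
    provider_order, _ = winreg.QueryValueEx(key, "ProviderOrder")

# The value is a comma-separated list; providers are tried in this order.
print("Current provider order:", provider_order.split(","))
```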


If you are still having problems with the receiver not passing the credentials, leave a comment with your specific issue.

References:

https://www.citrix.com/downloads/citrix-receiver.html

http://support.citrix.com/article/CTX200157

Kevin B. Ottomeyer @OttoKnowsBest

Disaster Recovery Event was a Great Success

Thank you to all who attended the event last week at Fortrust, where we discussed how we help clients with DR planning, implementation, and ongoing management.  We discussed on-premises solutions as well as DRaaS (DR to our cloud infrastructure) solutions.  Thanks also to our partners Fortrust, Faction, and Zerto for participating.  Many of you have asked for the presentation and I’ve posted it here for your convenience.

DR Preso Setup (customer preso)


Why Hyperconverged Architectures Win

Much has been made recently by the likes of Nutanix, SimpliVity, Atlantis, and even VMware (vSAN, EVO:RAIL) about the benefits of hyper-converged architecture.

I thought I’d take a few moments and weigh in on why I think that these architectures will eventually win in the virtualized datacenter.

First, I encourage you to read my earlier blogs on the evolution of storage technology; from that I’ll make two statements: 1) physical storage has not changed, and 2) what differentiates one storage array vendor from another is not the hardware but the software their arrays run.

Bold statements, I know, but bear with me for the moment and let’s agree that spinning disk is no longer evolving, and that all storage array vendors are basically using the same parts – x86 processors; Seagate, Fujitsu, and Western Digital hard disks; and Intel, Micron, SanDisk, or Samsung flash.   What makes them unique is the way they put the parts together and the software that makes it all work.

This is most easily seen in the many storage companies whose physical product is really a Supermicro chassis (x86 server) with a mix of components inside.   We’ve seen this with Whiptail (Cisco), LeftHand (HP), Compellent (Dell), Nutanix, and many others.  The power of this is evidenced by the fact that the first three were purchased by major server vendors and then transitioned to their own hardware.   Why was this possible?   Because the product was really about software running on the servers and not about the hardware (servers, disks) itself.

Now, let’s consider the economics of storage in the datacenter.  The cheapest disk and thus the cheapest storage in the datacenter are those that go inside the servers.   It’s often a factor of 5-10x less expensive to put a given disk into a server than it is to put it into a dedicated storage array.   This is because of the additional qualification and in some cases custom firmware that goes onto the drives that are certified for the arrays; and the subsequent reduction in volume associated with a given drive only being deployed into a single vendor’s gear.  The net being that the drives for the arrays carry a premium price.

So, storage is about software, and hard disks in servers are cheaper.   It makes sense then to bring these items together.    We see this in products like VMware vSAN, and Atlantis USX.  These products let you choose your own hardware and then add software to create storage.

The problem with a roll-your-own storage solution is that it’s necessary to do validation on the configuration and components you use.   Will it scale?  Do you have the right controllers? Drivers? Firmware? What about the ratios of CPU, Memory, Disk, and Flash?  And of course there is the support question if it doesn’t all work together.  If you want the flexibility to custom configure then the option is there.   But it can be simpler if you want it to be.

So here enters the hyper-converged appliance.   The idea is that vendors combine commodity hardware in validated configurations with software to produce an integrated solution with a single point of contact for support.  If a brick provides 5TB of capacity and you need 15TB, buy 3 bricks.   Need more later?  Add another brick.   It’s like Legos for your datacenter, just snap together the bricks.

This approach removes the need to independently size RAM, Disk, and CPU; it also removes the independent knowledge domains for storage and compute.  It leverages the economy of scale of server components and provides the “easy button” for your server architecture, simplifying the install, configuration, and management of your infrastructure.

Software also has the ability to evolve very rapidly.   Updates to existing deployments do not require new hardware.

Today the economics of hyper-converged appliances have fallen short of delivering on the price point.   While they use inexpensive hardware, the software has always been priced at a premium.

The potential is there, but the software has been sold in low volumes with the vendors emphasizing OpEx savings. As competition in this space heats up we will see the price points come down.   As volumes and competition increase the software companies will be willing to sell for less.

This will drive down the cost, eventually making legacy architectures cost prohibitive due to the use of proprietary (and thus low volume) components.  Traditional storage vendors who are based on commodity components will be more competitive, but being “just storage” will make their solutions more complicated to deploy, scale, and maintain.   The more proprietary the hardware the lower the volume and higher cost.

For these reasons – cost, complexity, and the ability of software to evolve – we will see hyper-converged, building block architectures eventually take over the datacenter.  The change is upon us.

Are you ready to join the next wave?   Reach out to your Lewan account executive and ask about next generation datacenters today.   We’re ready to help.

Benefits of Cisco ACI (SDN) architecture

Cisco ACI, Cisco’s software-defined networking (SDN) architecture, enhances business agility, reduces TCO, automates IT tasks, and accelerates data center application deployments.

Why Today’s Solutions Are Insufficient:

Today’s solutions lack an application-centric approach. The use of virtual overlays on top of physical layers has increased complexity by adding policies, services, and devices.

Traditional SDN solutions are network centric and based on constructs that replicate networking functions that already exist.

ACI Key Benefits:

Centralized Policy-Defined Automation Management

  • Holistic application-based solution that delivers flexibility and automation for agile IT
  • Automatic fabric deployment and configuration with single point of management
  • Automation of repetitive tasks, reducing configuration errors

Real-Time Visibility and Application Health Score

  • Centralized real-time health monitoring of physical and virtual networks
  • Instant visibility into application performance combined with intelligent placement decisions
  • Faster troubleshooting for day-2 operation

Open and Comprehensive End-to-End Security

  • Open APIs, open standards, and open source elements that enable software flexibility for DevOps teams, and firewall and application delivery controller (ADC) ecosystem partner integration
  • Automatic capture of all configuration changes integrated with existing audit and compliance tracking solutions
  • Detailed role-based access control (RBAC) with fine-grained fabric segmentation

Application Agility

  •  Management of application lifecycle from development, to deployment, to decommissioning—in minutes
  • Automatic application deployment and faster provisioning based on predefined profiles
  • Continuous and rapid delivery of virtualized and distributed applications

ACI Technology Benefits

The main purpose of a datacenter fabric is to move traffic from physical and virtualized servers, deliver it to its destination in the best possible way, and while doing so apply meaningful services such as:

  • Traffic optimization that improves application performance
  • Telemetry services that go beyond classic port counters
  • Overall health monitoring for what constitutes an application
  • Applying security rules embedded with forwarding

The main benefits of using a Cisco ACI fabric are the following:

  •  Single point of provisioning either via GUI or via REST API
  • Connectivity for physical and virtual workloads with complete visibility on virtual machine traffic
  • Hypervisor compatibility and integration without the need to add software to the hypervisor
  • Ease (and speed) of deployment
  • Simplicity of automation
  • Multitenancy (network slicing)
  • Capability to create portable configuration templates
  • Hardware-based security
  • Elimination of flooding from the fabric
  • Ease of mapping application architectures into the networking configuration
  • Capability to insert and automate firewalls, load balancers, and other L4-7 services
  • Intuitive and easy configuration process

More information can be found at www.cisco.com/go/aci

Disk Physics

Today’s topic – Disk Performance.  A warning to the squeamish: math ahead.

Throughput refers to the amount of data read or written per unit of time.  It is generally measured in units like Megabytes per second (MB/s) or Gigabytes per hour (GB/h).  Often when dealing with networks we see Kilobits per second (Kb/s) or Megabits per second (Mb/s).   Note that the abbreviations of some of those units look similar; pay attention to the capitalization, because the differences are a factor of at least 8x.

It’s easy to talk about a hard drive, or a backup job, or even a network interface providing 250MB/s throughput and understand that if I have 500GB of data it’s going to take a little over half an hour to transfer. (500GB * 1024MB/GB / 250MB/s / 3600s/h ≈ 0.57h)

Throughput is by far the most talked about and in general most understood measure of performance.   By the same token when people ask about network performance they often go to speedtest.com and tell me that “my network is fast, because I get 15Mb/s download.”  I agree that’s a pretty decent throughput, but that’s not the only measure of  performance that’s important.

A second measure is Response Time (or latency).   This is a measure of how long it takes a request to complete.   In the network world we think about this as how long it takes a message to arrive at its destination after being sent.   In the disk world we think about how long it takes from when we request an IO operation until the system completes it.  Disk latency (and network latency) is often measured in direct units of time – milliseconds (ms) or microseconds (µs), and occasionally in seconds (s).   Hopefully you never see IT technology latency measured in hours or days unless you’re using an RFC1149 circuit.

The combination of request response time and throughput, together with the size of the request (the amount of data moved at a time), yields a metric which amounts to how many requests can be completed per unit of time.  We see this most often in the disk world as I/O operations per second, or IOPS.   We talk about IOPS as a way of thinking about how “fast” a given disk system is, but it’s arguably more of a measure of workload capability than either latency or throughput; however, both latency and throughput contribute to the maximum IO operations per second a given disk can manage.

For example – if we have a hard disk with a maximum physical throughput of 125MB/sec, which is capable of processing requests at a rate of 80 requests per second, what is the throughput of the drive if my workload consists of 4KB reads and writes?  Well, in theory at 125MB/sec throughput the drive could process 125MB/s * 1024KB/MB / 4KB/IO = 32,000 IO/s.   Hold on, the drive is only capable of 80 IOPS, so the maximum physical throughput won’t be achieved: 80IO/s * 4KB/IO = 320KB/s.   If we wanted to maximize this drive’s throughput we would need to increase the size (or payload) of the IO requests.  Ideally we’d perform reads and writes in blocks equal to the maximum throughput divided by the maximum IO rate (125MB/s / 80IO/s = 1.562MB).
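To make the arithmetic concrete, here are the same calculations as a small Python sketch (the numbers are the example figures above, not any particular drive's spec sheet):

```python
# Worked versions of the calculations above.

# 1) Time to move 500 GB at a sustained 250 MB/s.
data_gb = 500
throughput_mb_s = 250
hours = data_gb * 1024 / throughput_mb_s / 3600
print(f"500 GB at 250 MB/s: {hours:.2f} hours")             # ~0.57 h

# 2) A drive rated at 125 MB/s but only 80 IOPS, driven with 4 KB requests.
max_throughput_kb_s = 125 * 1024
io_size_kb = 4
iops_limit = 80

throughput_limited_iops = max_throughput_kb_s / io_size_kb  # 32,000 IO/s in theory
achievable_kb_s = iops_limit * io_size_kb                   # only 320 KB/s in practice
ideal_io_size_mb = 125 / iops_limit                         # ~1.56 MB per request

print(f"Throughput-limited IOPS: {throughput_limited_iops:,.0f}")
print(f"Achievable throughput with 4 KB IOs: {achievable_kb_s} KB/s")
print(f"IO size needed to saturate the drive: {ideal_io_size_mb:.3f} MB")
```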

This last trick, by the way, is what many vendors use to improve the performance of relatively “slow” hard disks; referred to as IO coalescing, they take many small IO operations and buffer them until they can perform one large physical IO.

What governs the drive’s maximum IOPS is actually a function of multiple factors.

Physical media throughput is one of them – which is governed by the physical block size (often 512 bytes), the number of blocks per track on the platter, the number of platters, and the rotational velocity of the drive (typically measured in revolutions per minute – RPM).   The idea here being that the drive can only transfer data to or from the platters at the rate at which the data is moving under the heads.   If we have a drive spinning at 7200RPM, with say 100 blocks per track, 512 bytes/block, and a single platter/head, we have a drive with a maximum physical media transfer rate of 512B/block * 100 blocks/track * 7200 revolutions/minute / 60 seconds/minute / 1024 bytes/KB / 1024KB/MB ≈ 5.86MB/s.  Under no circumstances can the physical throughput of the drive exceed this value, because the data simply isn’t passing under the head any faster.

To improve this value you can add more heads (with two platters and 4 heads this drive could potentially move 23.4MB/s). You can increase the number of blocks per track (with 200 blocks per track and one head the drive would have a throughput of 11.7MB/s). Or you can increase the velocity at which the drive spins (at 36,000 RPM this drive would move 29.25MB/sec).  As you can see though this maximum throughput is governed by the physical characteristics of the drive.

A second factor impacting the IOPS is the question of how long it takes to position the head to read or write a particular block from the disk.  IO operations start at a given block and then proceed to read or write subsequent sequential blocks until the size of the IO request has been fulfilled.  So on our sample drive above, a 4KB request is going to read or write 8 adjacent (sequential) blocks.  We know what the physical transfer rate for the drive is, but how long does it take to physically move the mechanism so that the 8 blocks we care about will pass under the head?   Two things have to happen: first we have to position the head over the right track, and then we have to wait for the right block to pass under the head.  This is the combination of “seek time” and “rotational latency”.   Our 7200RPM drive completes one revolution every 1/120th of a second (7200RPM / 60 seconds/minute = 120 revolutions/second), or every 8.33 milliseconds.  On average, then, every IO operation will wait 4.16ms for the data to start passing under the heads after they are aligned.  Again, we can reduce the rotational latency by spinning the drive faster.  Seek time (how long it takes to align the heads) varies by drive, but if it takes 6ms then the average physical access time for the drive would be about 10.16ms.   Drives which are physically smaller have to move the heads shorter distances and will have lower seek times, and therefore lower access times.  Larger drives, or drives with heavier head assemblies (more heads), will have higher seek times. For a given drive you can typically look up the manufacturer’s specs to see what the average seek time is.  So, let’s say that it typically takes 10ms to position a head and read a block; then our drive could position the head 100 times per second.  That means the maximum IOPS for this drive is 100 per second.
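Again, to keep the math honest, here is the same example worked out in Python; the 6ms seek time is just the assumed figure from the paragraph above:

```python
# Worked versions of the drive-physics numbers above: a 7200 RPM, single-platter
# drive with 100 blocks per track at 512 bytes per block, and a 6 ms average seek.
rpm = 7200
blocks_per_track = 100
block_size_bytes = 512
avg_seek_ms = 6.0  # assumed average seek time from the example above

# Maximum media transfer rate: data only moves as fast as it passes under the head.
media_rate_mb_s = block_size_bytes * blocks_per_track * rpm / 60 / 1024 / 1024
print(f"Media transfer rate: {media_rate_mb_s:.2f} MB/s")        # ~5.86 MB/s

# Rotational latency: one full revolution, and half of that on average.
rev_ms = 60_000 / rpm
avg_rotational_ms = rev_ms / 2
print(f"Revolution: {rev_ms:.2f} ms, average rotational latency: {avg_rotational_ms:.2f} ms")

# Average access time and the IOPS ceiling it implies.
avg_access_ms = avg_seek_ms + avg_rotational_ms
max_iops = 1000 / avg_access_ms
print(f"Average access time: {avg_access_ms:.2f} ms -> about {max_iops:.0f} IOPS")
```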

So, IOPS is governed by physics, and throughput (from media) is governed by physics.   What else is a factor?   There are additional latencies introduced by controllers, interfaces, and other electronics.  Generally these are fairly small, measured in microseconds (µs), and relative to the latencies we’ve talked about they are negligible.   The other side is physical interface throughput.  ATA-133, for instance, had a maximum throughput on the channel of 133MB/s, where Ultra-160 SCSI was limited to 160MB/s.  The maximum throughput of a given drive will be limited to the throughput of the channel to which it’s attached.  The old ATA and SCSI interfaces noted earlier also attached multiple devices to a channel, which limited the sum of all devices to the bandwidth of the channel.  Newer SAS and SATA architectures generally dedicate a channel per device; however, the use of expanders serves to connect multiple devices to the same channel.  The net of this being that if you have 10 devices at 25MB/sec throughput each connected to a channel with a maximum throughput of 125MB/sec, then the maximum throughput you’ll see is 125MB/sec.

So that covers what governs the speed of a hard disk.   Some may ask, “what makes SSDs so fast?”  The short answer is that they aren’t spinning magnetic devices, and therefore don’t have the same physical limits.   The long answer is a topic for another blog.

Why is a smaller number of virtual CPUs better?

Note: This article is designed to serve as a high level introduction to the topic and as such uses a very basic explanation. Papers for those that wish to dive into more technical details of the topic are available elsewhere.

In a virtual environment such as VMware or Hyper-V, multiple virtual machines (VMs) operate on the same physical hardware. In order to make this work, a small piece of software called a hypervisor schedules the virtual resources onto the physical hardware. When a virtual machine enters a state where CPU resources are required, the VM is placed into a CPU ready state until enough physical CPUs are available to match its number of virtual CPUs.

The hypervisor will schedule VMs to available physical resources until all resources that can be scheduled are used.

Each VM will run on the physical CPUs until either it needs to wait for an I/O operation or the VM uses up its time slice. At that point the VM will either be placed into the I/O wait state until the I/O completes or be placed back in the ready queue, waiting for available physical resources.

As physical resources become available, the hypervisor will schedule VMs to run on those resources. In some cases, not all physical resources will be in use, due to the number of virtual CPUs required by the VMs in the ready state.

The process continues as VMs either wait for I/O or use their time slice on the physical CPUs.

In some cases there are no VMs in the ready state, at which point the scheduled VM will not be timed out until another VM requires the resources.

Often a VM with fewer virtual CPUs will be able to be scheduled before one with more virtual CPUs due to resource availability.

In some cases a VM will complete an I/O operation and immediately be scheduled on available physical resources.

Algorithms are in place to ensure that no VM completely starves for CPU resources but the VMs with more virtual CPUs will be scheduled less frequently and will also impact the amount of time the smaller VMs can utilize the physical resources.

A VM with high CPU utilization and little I/O will move between the ready queue and running on the CPUs more frequently. In this case, the operating system will report high CPU utilization, even though the VM may not be running for a majority of the real time involved.

In these situations, operating system tools that run within the VM may indicate that more CPUs are required when, in reality, the opposite is actually the case. A combination of metrics at the hypervisor and at the operating system level is usually required to truly understand the underlying issues.
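To illustrate the idea (and only the idea; this is a toy model, not VMware's or Hyper-V's actual co-scheduler), here is a short Python sketch where a VM can only be dispatched when at least as many physical cores are free as it has virtual CPUs. The wide VM spends far more of its time in the ready state than the small ones:

```python
# Toy illustration (not the real hypervisor scheduler): a VM can only be
# dispatched when at least as many physical cores are free as it has vCPUs,
# so wide VMs tend to accumulate more "CPU ready" time.
import random

PHYSICAL_CORES = 4
TIME_SLICES = 10_000

vms = [
    {"name": "small-1", "vcpus": 1, "ready_waits": 0},
    {"name": "small-2", "vcpus": 1, "ready_waits": 0},
    {"name": "medium", "vcpus": 2, "ready_waits": 0},
    {"name": "wide", "vcpus": 4, "ready_waits": 0},
]

for _ in range(TIME_SLICES):
    free_cores = PHYSICAL_CORES
    # Consider VMs in a random order each slice, roughly like a fair scheduler.
    for vm in random.sample(vms, len(vms)):
        if vm["vcpus"] <= free_cores:
            free_cores -= vm["vcpus"]   # all of its vCPUs get cores this slice
        else:
            vm["ready_waits"] += 1      # stuck in the ready state this slice

for vm in vms:
    pct_ready = 100 * vm["ready_waits"] / TIME_SLICES
    print(f"{vm['name']:8s} {vm['vcpus']} vCPU(s): ready {pct_ready:.0f}% of slices")
```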

A brief history of storage

I’ve told several groups I’ve spoken to recently that “disk storage hasn’t gotten faster in 15 years.” Often that statement is met with some disbelief. I thought I’d take a few paragraphs and explain my reasoning.

First, let’s cover some of the timeline of the evolution of spinning disk storage.

  • 7200 RPM HDD introduced by Seagate in 1992
  • 10,000 RPM HDD introduced by Seagate in 1996
  • 15,000 RPM HDD introduced by Seagate in 2000
  • Serial ATA introduced in 2002
  • Serial Attached SCSI introduced 2004
  • 15,000 RPM SAS HDD ships in 2005

So, my argument starts with the idea that this is 2015, and the “fastest” hard disk I can buy today is still only 15,000 RPM, and those have been shipping since 2000.  Yes, capacities have gotten larger, data densities greater, but they have not increased in rotational speed, and hence have not significantly increased in terms of IOPS.

To be fair, the performance of a drive is a function of several variables, rotational latency (the time for a platter complete one revolution) is just one measure.  Head seek time is another measure.  As is the number of bits which pass under the head(s) in a straight line per second.

Greater data densities will increase the amount of data on a given cylinder for a drive, and thus increase the amount of data that can be read or written per revolution – so you could argue that throughput may have increased as a function of greater density.   But only if you don’t have to re-position the head, and only if you are reading most of a full cylinder.  I also submit that the greater densities have led to drives having fewer platters and thus fewer heads.  This leads to my conclusion that the reduction in drive size mostly offsets any significant increase in throughput due to the greater densities.

Today we’re seeing a tendency towards 2.5″ and sometimes even 1.8″ drives.   These form factors have the potential to increase IO potential by decreasing seek times for the heads.   Basically the smaller drive has a shorter head stroke distance and thus potentially will take less time to move the head between tracks.   The theory is sound, but unfortunately the seek time saved is small relative to the rotational latency that remains; so the head gets there faster, but is still waiting for the proper sector to arrive as the disk spins.

Interestingly some manufacturers used to take advantage of a variable number of sectors per track and recognized that the outer tracks held more sectors.  For this reason they would use the outer 1/3 of the platter for “fast track” operations looking to minimize the head seek time and maximize the sequential throughput.   Again a sound theory, but the move from 3.5″ to 2.5″ drives eliminates this faster 1/3 of the platter.  Again, negating any gains we may have made.

Another interesting trend in disk storage is a movement to phase out 15,000RPM drives.  These disks are much more power hungry, and thus produce more heat than their slower (10,000RPM and 7,200RPM) counterparts.   Heat eventually equates to failure.  Likewise the rest of the tolerances in the faster drives are much tighter.   This results in faster drives having shorter service lives and being more expensive.   For those reasons (and the availability of flash memory) many storage vendors are looking to discontinue shipping of 15,000RPM disks. A 10K drive has only 66% of the IOP potential of a 15K drive.

So I submit that any gains we’ve had in the last 15 years in spinning disk performance have largely been offset by the changes in form factor.   Spinning disk hasn’t gotten faster in 15 years.  The moves towards 2.5″ and 10K drives could arguably suggest that disks are actually getting slower.

So IO demands for performance are getting greater.   VDI, Big Data Analytics, Consolidation and other trends demand more data and faster response times.  How do we address this?  Many would say the answer is flash memory, often in the form of Solid State Disk (SSD).

SSD storage is not exactly new

  • 1991 SanDisk sold a 20MB SSD for $1000
  • 1995 M-Systems introduced Flash based SSD
  • 1999 BiTMICRO announced an 18GB SSD
  • 2007 Fusion IO PCIe @ 320GB and 100,000 IOPS
  • 2008 EMC offers SSD in Symmetrix DMX
  • 2008 SUN Storage 7000 offers SSD storage
  • 2009 OCZ demonstrates a 1TB flash SSD
  • 2010 Seagate offers Hybrid SSD/7.2K HDD
  • 2014 IBM announces X6 with Flash on DIMM

But Flash memory isn’t without its flaws.

We know that a given flash device has a finite lifespan measured in write-cycles.  This means that every time you write to a flash device you’re wearing it out.  Much like turning on a light bulb, each time you change the state of a bit you’ve consumed a cycle. Do it enough and you’ll eventually consume them all.

Worse is that the smaller the storage cells used for flash are (and thus the greater the memory density) the shorter the lifespan.   This means that the highest capacity flash drives will sustain the fewest number of writes per cell.   Of course they have more cells so there is an argument that the drive may actually sustain a larger total number of writes before all the cells are burned out.

But… Flash gives us fantastic performance.   And in terms of dollars per IOP flash has a much lower cost than spinning disk.

DRAM memory (volatile) hasn’t gone anywhere either – in fact it keeps increasing in density and coming down in cost per GB.  DRAM doesn’t have the wear limit issue of Flash, nor the latencies associated with disk. However, it suffers from its inability to store data without power. If DRAM doesn’t have its charges refreshed periodically (every few milliseconds) it will lose whatever it’s storing.

Spinning disk capacities keep growing, and getting cheaper.  In December of 2014 Engadget announced that Seagate was now shipping 8TB hard disks for $260.

So the ultimate answer (for today) is that we need to use Flash or DRAM for performance and spinning disk (which doesn’t wear out from being written to, or forget everything when the lights go out) for capacity and data integrity.  Thus the best overall value comes from solutions which combine technologies to their best use.  The best options don’t ask you to create pools of storage of each type, but allow you to create unified storage pools which automatically store data optimally based on how it’s being used.

This is the future of storage.

Citrix Access via Chrome is Broken

Purpose:
This post explains Google Chrome functionality that can negatively impact the access to any Citrix environment.

Symptom:
After clicking on a published application or desktop icon in StoreFront using Chrome, nothing happens.

or

After logging on to StoreFront using Chrome, StoreFront never detects that Citrix Receiver is installed and offers it for download before I get to see my icons.

or

You get a warning to “Unblock the Citrix plug-in.”

Resolution:
1) Re-enable the plugin using CTX137141.  This workaround will end in November 2015 when Google permanently disables NPAPI.
2) Customize StoreFront to remove the prompt to download Receiver with customized code.
3) Customize StoreFront with a link to download Receiver with customized code.
4) Enable a user setting to always open .ica files using CTX136578.
5) Use another browser not affected by the Chrome changes.

Cause:
Back in November 2014, Google announced it would remove NPAPI support from Chrome.  They are making this change to “improve security, speed, and stability” of the browser.   In April 2015, they will change Chrome’s default settings to disable NPAPI before removing it entirely in September of 2015.

What does this mean for my Citrix users who use Chrome?

Receiver detection.  The NPAPI plugin that Receiver (Windows and Mac) installs allows Receiver for Web (aka StoreFront) to detect whether Citrix Receiver is or is not installed.  Without this plugin, it assumes you do not have Receiver and will offer it for you to download and install.  As an aside, you may have noticed that Internet Explorer has an ActiveX control that does the same thing.  If your user does not have Receiver then they cannot launch their Citrix application or desktop, so this is a good thing. If your user is already running Receiver but gets offered the Receiver download, this will be confusing and could potentially be a bad thing.

Launching applications and desktops.   Let me explain what should happen when you click on the icon for, say, Outlook 2010 in StoreFront (aka Receiver for Web).  StoreFront will talk to a delivery controller to figure out which machine is hosting Outlook 2010 and has the lowest load.  StoreFront will then offer you a .ica file to download.  If you have the plugin, Windows will know that this is a configuration file that should be opened by Receiver.  Receiver will then connect you to your application.  This all happens quickly and seamlessly, making it seem like Outlook 2010 launches immediately.

Without the plugin, you will download an .ica file but Outlook 2010 will not open until you click it.  Chrome does have the option (the arrow on the downloaded file) to “Always open files of this type” as shown in CTX136578.

References:
http://blogs.citrix.com/2015/03/09/preparing-for-npapi-being-disabled-by-google-chrome/
http://blog.chromium.org/2014/11/the-final-countdown-for-npapi.html
http://support.citrix.com/article/CTX141137
http://support.citrix.com/article/CTX136578

Brian Olsen @sagelikebrian

vGPU, vSGA, vDGA, software – Why do I care?

I want to take a moment and talk about an oft-mentioned but little-understood new feature of vSphere 6.  Specifically, NVIDIA’s vGPU technology.

First, we need to know that vGPU is a feature of vSphere Enterprise Plus edition, which means it’s also included in vSphere for Desktops.  But if this sounds like something you need and you’re running Standard or Enterprise, now might be a good time to think about upgrading and taking advantage of the trade-up promotions.

Many folks think, “I don’t need 3D for my environment.  We only run Office.”  If that’s you, then please take a good close look at what your physical desktop’s GPU is doing while you run Office 2013, especially PowerPoint.  Nearly every device sold since ~1992 has had some form of hardware-based graphics acceleration.  Not so your VMs.   Software expects this.  Your users will demand it.

With that, let’s talk about what we’ve had for a while as relates to 3D with vSphere.  Understand you can take advantage of these features regardless of what form of Desktop or Application virtualization you choose to deploy because it’s a feature of the virtual machine.

No 3D Support – I mention this because it is an option.  You can configure a VM where 3D support is disabled.  Here an application that needs 3D either has to provide its own software rendering, or it will simply error out.  If you know your app doesn’t use any 3D rendering at all, this is an option to ensure that no CPU time or memory is taken up trying to provide the support.  No vSphere drivers are required.

Software 3D – Ok, here we recognize that DirectX and OpenGL are part of the stack and that some applications are going to use them.  VMware builds support into their VGA driver (part of VMware Tools) that can render a subset of the APIs (DX9, OpenGL 1.2) in software, using the VM’s CPU.  This works for a set of apps that need a little 3D to run, where we aren’t concerned about the system CPU doing the work.  No hardware ties here as long as you can live with the limited API support and performance. No vSphere drivers are required.  No particular limits on how many VMs can do this beyond running out of CPU.

vSGA – or Virtual Shared Graphics – In this mode the software implementation above gets a boost by putting a supported hardware graphics accelerator into the vSphere host.  The API support is the same, because it’s still the VMware VGA driver in the VM, but it hands off the rendering to an Xorg instance on the host which in turn does the rendering on the physical card.   This mode does require a supported ESXi .vib driver, provided by the card manufacturer.   That means you can’t just use any card, but have to buy one specifically for your server and which has a driver.  NVIDIA and AMD provide these for their server-centric GPU cards.  The upper bound on VMs is determined by the amount of video memory you assign to the VMs and the amount of memory on your card.

vDGA – or Virtual Dedicated Graphics – In this mode we do a PCI pass-through of a GPU card to a given virtual machine.  This means that the driver for the GPU resides inside the virtual machine and VMware’s VGA driver is not used.  This is a double (or triple) edged sword.   Having the native driver in the VM ensures that the VM has the full power and compatibility of the card, including the latest APIs supported by the driver (DX11, OpenGL 4, etc.).  But having the card assigned to a single VM means no other VMs can use it.  It also means that the VM can’t move off its host (no vMotion, no HA, no DRS). This binding between the PCI device and the VM also prevents you from using View Composer or XenDesktop’s MCS, though Citrix’s Provisioning Services (PVS) can be made to work.  So this gives great performance and unmatched compatibility, but at a pretty significant cost.  It also means that we do not want a driver installed for ESXi, since we’re only going to pass through the device.  That means you can use pretty much any GPU you want.  Your limit on how many VMs per host is tied to how many cards you can squeeze into the box.

All of the above are available in vSphere 5.5, with most of it actually working under vSphere 5.1.   I’ve said that if you care about your user experience you want to have vSGA as a minimum requirement, and consider vDGA for anyone who’s running apps that clearly “need” 3D support.   Though vDGA’s downsides have a way of pushing it out of high-volume deployments.

Ok so what’s new?   The answer is NVIDIA vGPU.  The first thing to be aware of is that this is an NVIDIA technology, not VMware.  That means you won’t see vGPU supporting AMD (or anyone else’s) cards any time soon.  Those folks will need to come up with their own version.   NVIDIA also only supports this with their GRID cards (not GeForce or Quadro). So you’ve got to have the right card, in the right server.   Sorry, that’s how it is.  It’s only fair to mention that vGPU first came out for XenServer about two years ago, and came out for vSphere with vSphere 6.0.  So while it’s new to vSphere, it’s not exactly new to the market.

So what makes this different?   vGPU is a combination of an ESXi driver .vib and some additional services that make up the GRID manager.  This allows dynamic partitioning of GPU memory and works in concert with a GRID enabled driver in the virtual machine.   The end result is that the VM runs a native NVIDIA driver with full API support (DX11, OpenGL 4) and has direct access (no Xorg) to the GPU hardware, but is only allowed to use a defined portion of the GPU’s memory.  Shared  access to the GPU’s compute resources is governed by the GRID manager.   Net-Net is that you can get performance nearly identical to vDGA without the PCI pass-through and it’s accompanying downsides.  vMotion remains a ‘no’ but VMware HA and DRS do work.  Composer does work, and MCS works.  And, if you set your VM to use only 1/16th of the GPU’s memory then you have the potential to share the GPU amongst 16 virtual machines.  Set it to 1/2 or 1/4 and get more performance (more video RAM) but at a lower VM density.
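To see how the partitioning math works out, here is a trivial sketch; the 4GB of frame buffer per physical GPU is a made-up figure for illustration, so check your actual GRID card's specs:

```python
# Quick illustration of the vGPU density trade-off described above. The 4 GB of
# frame buffer per physical GPU is a made-up figure; check your GRID card's specs.
gpu_memory_mb = 4096  # assumed frame buffer per physical GPU

for fraction in (1, 2, 4, 8, 16):
    profile_mb = gpu_memory_mb // fraction
    print(f"1/{fraction} of the GPU -> {profile_mb} MB per VM, up to {fraction} VMs per GPU")
```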

So why does this matter?   It means we get performance and compatibility for graphics applications (and PowerPoint!) and an awesome (as in better than physical) user experience, while gaining back much of what drove us to the virtual environment in the first place.  No more choosing between a great experience and management methods, HA, and DR. Now we can have it all.

If you’re using graphics, you want vGPU!  And if you’re running Windows apps, you’re probably using graphics!