Hello, thanks to all who attended my session. For those who have asked, the session presentation can be downloaded here – download presentation.
During a recent deployment of XenApp 7.6 on Windows Server 2012 R2, users who ran an application that exported data to Excel kept getting this error.
The XenApp session host server was sized at 2 vCPU and 8GB of RAM, and with only one user logged in there was plenty of memory available. Launching Excel and then opening a workbook worked fine and did not produce the error, and the error persisted even after patching Office 2010 to the latest patch. Investigation turned up no obvious reason for the error to appear.
This appears to be a bug in Excel 2010 and Excel 2013 running on Windows Server 2012 R2 when AppData\Local is excluded with Citrix Profile Management, which is commonly done to reduce the size of the profile. With this exclusion configured, the Cache folder, which is part of the User Shell Folders in the user's profile, ends up without enough space allocated.
The solution: redirect the user Cache directory to C:\Windows\Temp, without needing to load the hive and hack the default profile’s NTUSER.dat.
First, assign Users Modify rights to C:\Windows\Temp; otherwise they will not have access and this will not work.
Create a GPO Preference Registry Collection named something descriptive, such as Excel Cache Directory.
Create a new Registry Item pointing to: HKEY_CURRENT_USER\SOFTWARE\MICROSOFT\WINDOWS\CURRENTVERSION\EXPLORER\USER SHELL FOLDERS
The Value should be Cache
The Data should be C:\WINDOWS\TEMP
The Type should be REG_EXPAND_SZ
Allow the GPO to replicate, run GPUPDATE /FORCE, and test; you should no longer see the error.
The next time you encounter this issue give this a try. For more information please leave a comment.
Johnny Ma @mrjohnnyma
Cisco says it is adding more sensors to network devices to increase visibility, more control points to strengthen enforcement, and pervasive threat protection to reduce time-to-detection and time-to-response. The plan includes:
- Endpoints: Customers using the Cisco AnyConnect 4.1 VPN client now can deploy threat protection to VPN-enabled endpoints to guard against advanced malware
- Campus and Branch: FirePOWER Services solutions for Cisco Integrated Services Routers (ISR) provide a centrally managed intrusion prevention system and advanced malware protection at the branch office, where dedicated security appliances may not be feasible
- Network as a Sensor and Enforcer: Cisco says it has embedded multiple security technologies into the network infrastructure to provide threat visibility to identify users and devices associated with anomalies, threats and misuse of networks and applications. New capabilities include broader integration between Cisco’s Identity Services Engine (ISE) and Lancope StealthWatch to allow enterprises to identify threat vectors based on ISE’s context of who, what, where, when and how users and devices are connected and access network resources.
StealthWatch can also now block suspicious network devices by initiating segmentation changes in response to identified malicious activity. ISE can then modify access policies for Cisco routers, switches, and wireless LAN controllers embedded with Cisco’s TrustSec role-based technology.
Cisco has also added NetFlow monitoring to its UCS servers to give customers greater visibility into network traffic flow patterns and threat intelligence information in the data center.
Other aspects of the plan include Hosted Identity Services, which is designed to provide a cloud-delivered service for the Cisco Identity Services Engine security policy platform. The new hosted service provides role-based, context-aware identity enforcement of users and devices permitted on the network, Cisco says.
The strategy also includes a pxGrid ecosystem of 11 new partners that plan to develop products for cloud security and network/application performance management for Cisco’s pxGrid security context information exchange fabric. The fabric enables security platforms to share information to better detect and mitigate threats.
The company is also investing heavily in integrating its ASA firewalls with its Application Centric Infrastructure SDN.
More information can be found at http://www.networkworld.com/article/2932547/security0/cisco-plans-to-embed-security-everywhere.html
Interest in Software Defined Networking (SDN) continues to grow because of its ability to make networks more programmable, flexible and agile. This is accomplished by accelerating application deployment and management, simplifying and automating network operations, and creating a more responsive IT model.
Cisco is extending its leadership in SDN and Data Center Automation solutions with the announcement today of Cisco Virtual Topology System (VTS), which improves IT automation and optimizes cloud networks across the entire Nexus switching portfolio. Cisco VTS focuses on the management and automation of VXLAN-based overlay networks, a critical foundation for both enterprise private clouds and service providers. The announcement of the VTS overlay management system follows Cisco’s announcement earlier this year of support for the EVPN VXLAN standard, which underlies the VTS solution.
Cisco VTS extends the Cisco SDN strategy and portfolio, which includes Cisco Application Centric Infrastructure (ACI) as well as Cisco’s programmable NX-OS platforms, to a broader market and to additional use cases, including our massive installed base of Nexus 2000-7000 products and customers whose primary SDN challenge is the automation, management and ongoing optimization of their virtual overlay infrastructure. With support for the EVPN VXLAN standard, VTS furthers Cisco’s commitment to open SDN standards and increases interoperability in heterogeneous switching environments, with third-party controllers, and with cloud automation tools that sit on top of the open northbound APIs of the VTS controller.
Cisco is committed to delivering this degree of interoperability and integration with multi-vendor ecosystems for all of its SDN architectures, as we have previously exhibited with ACI, with the contributions we have made on Group Based Policies (GBP) to open source communities, and with our own Open SDN Controller based on OpenDaylight. With VTS, we now offer the broadest range of SDN approaches across the broadest range of platforms and the broadest ecosystem of partners in the industry.
Programmability | Automation | Policy
Programmable Networks: With Nexus and NX-OS Programmability across the entire portfolio, we deliver value to customers deploying a DevOps model for automating network configuration and management. These customers are able to leverage the same toolsets (such as existing Linux utilities) to manage their compute and networks in a consistent operational model. We continue to modernize the Nexus operating system and enhance the existing NX-APIs by adding a secure SDK with native Linux packaging support, adding further OpenFlow support and delivering an object-driven programming model. This enables speed and efficiency when programming the network while also securely deploying 3rd party applications for enhanced monitoring and visibility, such as Splunk, Nagios and tcollector, natively on the network.
Programmable Fabrics: Overlay networks provide the foundation for scalable multi-tenant cloud networks. VXLAN, developed by Cisco along with other virtualization platform vendors, has emerged as the most widely-adopted multi-vendor overlay technology. In order to advance this technology further, a scalable and standards-based control plane mechanism such as BGP EVPN is required. Using BGP EVPN as a control-plane protocol for VXLAN optimizes forwarding and eliminates the need for inefficient flood-and-learn approaches while improving scale. It also facilitates large-scale deployments of overlay networks by removing complexity, fosters higher interoperability through open-standard control plane solutions, and provides access to a wider range of cloud management platforms.
Application Centric Policy: Cisco will be able to offer the most complete solution on the Nexus 9000 series whether it is ACI policy-based automation or BGP EVPN-based overlay management. Customers will now have a choice for running an EVPN VXLAN controller in a traditional Nexus 9000 “standalone” mode, or to leverage ACI and the APIC controller with the full ACI application policy model, and integrated overlay and physical network visibility, telemetry and health scores. VTS will support EVPN VXLAN technology across a range of topologies (spine-leaf, three-tier aggregation, full mesh) with the full Nexus portfolio, as well as interoperate with a wide range of Top of Rack (ToR) switches and WAN equipment.
VTS Design and Architecture
The Cisco Virtual Topology System (VTS) is a cloud/overlay SDN solution that provides Layer 2 and Layer 3 connectivity to tenant, router and service VMs. Cisco VTS is designed to address the multi-tenant connectivity requirements of virtualized hosts, as well as bare metal servers. VTS consists of the Virtual Topology Controller (VTC), the centralized management and control system, and the Virtual Topology Forwarder (VTF), the host-side virtual networking component and VXLAN tunnel endpoint. Together they implement the controller and forwarding functionality in an SDN context.
The Cisco VTS solution is designed to be hypervisor agnostic. Cisco VTS supports both the VMware ESXi hypervisor and KVM on Red Hat Linux. VTS will support integration with OpenStack and VMware vCenter for integration with other data center and cloud infrastructure automation. VTS also integrates with Cisco Prime Data Center Networking Manager (DCNM) for underlay management. The Cisco VTC, the VTS controller component, will provide a REST-based Northbound API for integration into other systems.
Cisco VTS will be available in August 2015.
Source of Blog post was from Gary Kinghorn @ http://blogs.cisco.com/datacenter/vts
I’ll start by answering the title question first. IOP is an acronym standing for Input Output Operation. It does seem like it should be IOO, but that’s just not the way it worked out.
A related bit of trivia, we generally talk either about total IOPs for a given task, or we talk about a rate – IOPs per second typically, noted as IOPS.
With that the Wikipedia portion of today’s discussion is complete. Let’s move on to why we care about IOPs.
Most frequently the topic comes up in terms of either measuring a disk system’s performance, or attempting to size a disk system for a specific workload or loads. We want to know not how much throughput a given system needs, but how many discrete reads and writes it’s going to generate in a given unit of time.
The reason we want to know is that a given storage system has a discrete number of IOPS it can deliver. You can read my article on Disk Physics to get a better understanding of why.
In the old days this was mostly a math problem. We knew that a 7.2K drive would deliver 60-80 IOPS, a 10K drive would deliver 100-120, and a 15K drive would give us 120-150 IOPS. We also knew that we had to deal with RAID penalties associated with write operations to storage arrays. Typical values were 1 IO penalty for RAID1 and 10, and 4 for RAID5 and 50.
The idea here was fairly simple. If I needed a disk subsystem that would give me 1500 IOPS read, then I needed 10 15K drives to do that (1500/150 = 10). If I needed 1500 IOPS write in a RAID10 config, then I needed 20 15K drives ((1500 + (1500 * 1))/150 = 20). The same 1500 IOPS write in a RAID5 config took more spindles because of the RAID penalties, but it was also easily calculated as 50 drives ((1500+(1500*4))/150 = 50).
That last point, by the way, is why database vendors have always asked that their logs be placed on RAID1 or RAID10 storage. When writing to RAID5 storage it’s necessary to read the entire RAID stripe, recalculate, and re-write it. Thus the 4 penalties.
The math got a bit more complicated when we had a mix of reads and writes. What we have to do there is to calculate the read and write portions separately and then add the result together. Suppose we had a workload of 3000 IOPS, where 50% was read and 50% was write. Thus we’d have 1500 IOPS read and 1500 IOPS write. On a RAID10 system we’d need 10 drives to satisfy the reads, and 20 drives to satisfy the writes. A total of 30 drives then is needed to satisfy the whole 3000 IOPS workload.
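The old-school sizing math above can be written down as a small sketch. The function name and the lookup tables are my own illustration; the per-drive figures are the top of the ranges quoted earlier (an assumption, since real drives vary), and the formula is simply reads plus writes times (1 + RAID penalty), divided by per-drive IOPS.

```python
import math

# Per-drive IOPS (top of the ranges quoted in the text) and RAID write
# penalties. These tables are illustrative assumptions, not vendor specs.
DRIVE_IOPS = {"7.2K": 80, "10K": 120, "15K": 150}
WRITE_PENALTY = {"RAID1": 1, "RAID10": 1, "RAID5": 4, "RAID50": 4}


def drives_needed(read_iops, write_iops, raid_level, drive_type="15K"):
    """Spindle count for a workload, using the old-school disk math.

    Each read costs one physical IO; each write costs one IO plus the
    RAID write penalty, matching the worked examples in the text.
    """
    per_drive = DRIVE_IOPS[drive_type]
    penalty = WRITE_PENALTY[raid_level]
    physical_iops = read_iops + write_iops * (1 + penalty)
    return math.ceil(physical_iops / per_drive)


print(drives_needed(1500, 0, "RAID10"))     # read-only workload
print(drives_needed(0, 1500, "RAID10"))     # write-only on RAID10
print(drives_needed(0, 1500, "RAID5"))      # write-only on RAID5
print(drives_needed(1500, 1500, "RAID10"))  # 3000 IOPS, 50% read / 50% write
```

With the article's numbers this gives 10, 20, 50 and 30 drives respectively, the same results as the hand calculations.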
Those were the old days when we could pretty easily look at a disk subsystem and calculate how much performance it should deliver. Modern disks however have changed the rules some.
How did they change the rules? Well, basically they have a way of making IOPs disappear.
Consider for a moment NetApp’s WAFL configuration. WAFL works by caching write operations to an NVRAM on the controller, and telling the application that the IO is complete. No physical IO operation has actually taken place. Now, thus far this sounds like a write back cache, but here’s the difference. WAFL doesn’t just perform a “lazy write” of the cached data, it actually waits until it has a series of writes which need to be written to the physical disks, and then it looks for a place on disk where it can write all of those blocks down at once in sequence, thereby taking perhaps 4 or 10 (or more) physical IOPs and combining them into one. WAFL actually takes this a step further by looking for places on disk where it doesn’t have to read the stripe before writing it in an attempt to also avoid paying the RAID write penalties. This last is the reason WAFL performance degrades as the disk array becomes very full; it becomes harder to find unused space.
Another example of vanishing IOPs is Nimble’s CASL filesystem, which expands on what WAFL does by doing two additional things. First, it compresses all the data as it comes into the array, which further reduces the number of IOPs necessary to write the data. Second, CASL is based around the idea of having very large FLASH memory based caches so that physical IOPs to spinning disk can be avoided for reads. The net of this is that write IOPs are reduced and read IOPs are nearly eliminated. In testing done by Dan Brinkman while he was at Lewan, a Nimble array with 12 7.2K disks was clocked at over 18,000 IOPS. We know that the physical disks were capable of no more than 960 IOPS (80 * 12 = 960). This is a testament to how effective CASL is at reducing physical IOPs.
A third example of IO reduction is what Atlantis Computing does in their Ilio and USX products when dealing with persistent data (in-memory volumes is a topic for another day). Atlantis takes the idea of caching and compression further still by adding inline data deduplication, wherein data is evaluated before being written to determine if an identical block has already been written. If it’s an identical block then no physical write is actually performed for the block, and the Filesystem pointer for that block is merely updated to reflect an additional reference. Atlantis caches the data (reads and writes) in RAM or on FLASH as well to further reduce physical IO operations.
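To make the deduplication idea concrete, here is a toy sketch of an inline-deduplicating block store. It is not how Atlantis (or any vendor) actually implements it; the class name, the use of SHA-256 as the block fingerprint, and the in-memory dictionaries are all assumptions chosen to illustrate the "identical block means no physical write, just another reference" behaviour described above.

```python
import hashlib


class DedupStore:
    """Toy inline-deduplicating block store (illustration only)."""

    def __init__(self):
        self.blocks = {}          # fingerprint -> block data ("physical" copy)
        self.refcounts = {}       # fingerprint -> number of logical references
        self.physical_writes = 0  # how many blocks actually hit "disk"

    def write(self, data: bytes) -> str:
        # Fingerprint the block before writing it.
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint in self.blocks:
            # Identical block already stored: only add a reference.
            self.refcounts[fingerprint] += 1
        else:
            self.blocks[fingerprint] = data
            self.refcounts[fingerprint] = 1
            self.physical_writes += 1
        return fingerprint


store = DedupStore()
for block in [b"A" * 4096, b"B" * 4096, b"A" * 4096, b"A" * 4096]:
    store.write(block)

# Four logical writes, but only two distinct blocks were physically stored.
print(store.physical_writes)
```

A real product would also have to deal with hash collisions, persistence and reference cleanup on delete, none of which this sketch attempts.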
The extreme case of this is the all-flash storage array (or subsystem), which is available from many vendors these days (Compellent, NetApp, Cisco, Atlantis, VMware vSAN, all offer all flash options and there are many more options as well). All flash arrays eliminate physical disk IO by eliminating the physical disks. They’ve made the FLASH cache tier so large that there is no longer any need to store the data on a spinning drive. There is still an upper bound for these arrays but it’s tied to controllers and bandwidth rather than the physics of the storage medium.
So what’s the net of all this?
The first part is that storage has gotten smarter and more efficient by making better use of CPUs and memory, letting it deliver higher performance and better data density with fewer spinning drives.
The second part of the answer is that the old-school disk math around how many IOPS you need and how many spindles (spinning disks) will be required is largely obsolete. Unless you’re building an old-school storage array or using internal disks in your server, the storage is probably doing something to reduce and/or eliminate physical disk IOPs on your behalf. That makes the idea that you can judge the performance of the storage by the number and type of drives it uses pretty much false. It is a case of not being able to judge the book by its cover.
You’ll need to discuss your workload with your storage vendor and determine how the array is going to handle your data and then rely on the vendor to size their solution properly for your need.
We have all encountered the dreaded Java error when trying to connect to the Citrix Netscaler GUI. In this post I would like to walk through the steps of resolving those Java error messages. There are a few technical articles that TRY to walk you through the process of troubleshooting this issue, but I have found the method that I use to be the most successful. For me this is one of the most frustrating error messages, as I am constantly working in different versions of Java, Netscaler firmware or browser.
For starters, let’s go ahead and uninstall any version of Java you currently have installed. Most versions of Netscaler 10.1 and above will support the most recent version of Java. You can download the most recent version Here. For this exercise, we are going to assume you are using Chrome, Firefox or IE. In my experience, I have had the most success with the Netscaler GUI and the Chrome browser.
After you have successfully installed Java and gone through the confirmation process, go ahead and browse to your Java configuration applet or go to Control Panel > Java (32bit).
Once the Java Control Panel pops up, click on the Settings button.
You will now be redirected to the Temporary Internet files dialog. First, click on the “Delete Files” button
Once the “Delete Files and Applications” box appears, UNCHECK all of the checkboxes and click OK.
Before clicking out of the Temporary Internet Files dialog, make sure to uncheck “Keep temporary files on my computer” and click OK. Having all of these temporary files is one of the main causes of applet corruption.
That last set of steps will clear out all the previously downloaded temporary applets, cookies and certificates you currently have in your configuration. If you are launching java for the first time after the new install this might be a moot point, but I do it anyway 🙂
Now, stay in the Java Control Panel and at the top, click on the “Security” Tab. Inside of that tab, click on “Edit Site List” at the bottom.
Once you have clicked on Edit Site List, click on Add. Here you will be able to add the Netscaler access gateway FQDN as an exception. Only add websites here whose certificates you know you can trust.
After you click Add you will notice a text box appear in the same window. Go ahead and add your Netscaler FQDN into that field and click OK. Example: Https://yournetscaler.yourdomain.com
After clicking OK, you will notice your Netscaler FQDN is now in the exceptions list. Click Ok to exit the Java Control panel and relaunch your browser to test.
This article applies to Netscaler versions 9.3, 10.0, 10.1
Let me know how it goes. Add your comments below!
Kevin B. Ottomeyer @OttoKnowsBest
I would like to discuss the procedure for configuring and implementing Domain Pass-through with Citrix Storefront and Citrix Receiver.
First things first, let’s get a receiver installed on a test machine.
****Note, this machine and all subsequent machines must be a member of the domain that your storefront server is currently attached to in order for the pass-through to work.
Download the Citrix receiver Here
Once downloaded find the path of your download location. Now, we will need to install the receiver with the single sign on switch as follows:
This will install the receiver, enable and start the single sign-on service on that machine. After your installation is completed and the machine is rebooted, log back in to your workstation and double-check to make sure the ssonsvr.exe service was installed and is currently running under services.
Once you have confirmed this, let’s move over to your Storefront server.
Launch the Storefront administration console from the storefront server and on the left side of the console, click on Authentication.
Once authentication is selected move over to the right side of the console screen and under actions > authentication, click on add/remove Methods.
After clicking on Add/Remove Methods, a dialog box should appear with options to select what methods you would like to enable in Storefront. The second option from the top is, “Domain pass-through”, click on the check box next to that option and click OK. This will enable Storefront to take the credentials from the ssonsvr service on your workstation and pass them through Storefront and enumerate the app list without authenticating twice.
Depending on your Citrix infrastructure, you might need to propagate the changes to the other Storefront servers in your Server Group. If you have more than one Storefront server and you do not propagate changes, you might see mixed results in your testing.
To do this, click on “Server Group” on the left side of the console and then on the right side, under Actions, click on “Propagate Changes”. This action will replicate all the changes you just made to your authentication policies over to the other Storefront servers in your Server Group.
Now that you have all the configuration pieces in play, reboot the workstation you installed the receiver on and log back in. Once logged in you should be able to right-click on the receiver and click Open. Receiver will now prompt you for your Storefront FQDN, or email address if you have email-based discovery enabled. At this point your application list should enumerate without prompting for credentials. This also goes for the Web portal. Test both to make sure they are passing those credentials through appropriately.
********If your credentials still do not pass through, below are a few troubleshooting steps you can take. Of course this all depends on how your environment is set up and what access you have to modify certain components in your windows infrastructure.
Modifying local Policy to enable pass-through on the workstation
Apply the icaclient.adm template located in C:\Program Files\Citrix\ICA Client\Configuration to the client device through Local or Domain Group Policy.
Once the adm template is imported, Navigate to Computer Configuration\Administrative Templates\Classic Administrative Templates\Citrix Components\Citrix Receiver\User authentication\, then double-click on the “Local user name and password” setting.
The following box should appear and make sure to select both “Enable pass-through authentication” and “Allow pass-through authentication for all ICA connections”.
Adding Trusted Sites in your browser
On the same workstation where you are testing the pass-through, open IE and navigate to Tools > Internet Options. Click on Trusted Sites and add your Storefront FQDN (the same address you entered into the receiver when you set it up).
Also, it wouldn’t hurt to configure pass-through in IE. In the Internet Options Security tab with Trusted Sites selected, choose Custom level for the security zone. Scroll to the bottom of the list and select Automatic logon with current user name and password.
Configure the NIC provider order
On the workstation where you installed the receiver, launch Control Panel and click on Network Connections, choose Advanced > Advanced Settings > Provider Order tab and move the Citrix Single Sign-on entry to the top of the Network Providers list.
If you are still having problems with the receiver not passing the credentials, leave a comment with your specific issue.
Kevin B. Ottomeyer @OttoKnowsBest
Thank you to all who attended the event last week at Fortrust, where we discussed how we help clients with DR planning, implementation, and ongoing management. We discussed premise solutions and DRaaS (DR to our cloud infrastructure) solutions as well. Thanks also to our partners, Fortrust, Faction and Zerto, for participating. Many of you have asked for the presentation and I’ve posted it here for your convenience.
Much has been made recently by the likes of Nutanix, Simplivity, Atlantis, and even VMware (vSAN, EVO|RAIL) about the benefits of hyper-converged architecture.
I thought I’d take a few moments and weigh in on why I think that these architectures will eventually win in the virtualized datacenter.
First, I encourage you to read my earlier blogs on the evolution of storage technology and from that I’ll make two statements. 1.) physical storage has not changed and 2.) what differentiates one storage array vendor from another is not the hardware but the software their arrays run.
Bold statements I know, but bear with me for the moment and let’s agree that spinning disk is no longer evolving, and that all storage array vendors are basically using the same parts – x86 processors; Seagate, Fujitsu, or Western Digital hard disks; and Intel, Micron, Sandisk or Samsung flash. What makes them unique is the way they put the parts together and the software that makes it all work.
This is most easily seen in the many storage companies whose physical product is really a Supermicro chassis (x86 server) with a mix of components inside. We’ve seen this with Whiptail (Cisco), Lefthand (HP), Compellent (Dell), Nutanix, and many others. The power of this is evidenced by the fact that the first 3 were purchased by major server vendors and then transitioned to their own hardware. Why was this possible? Because the product was really about software running on the servers and not about the hardware (servers, disks) itself.
Now, let’s consider the economics of storage in the datacenter. The cheapest disk and thus the cheapest storage in the datacenter are those that go inside the servers. It’s often a factor of 5-10x less expensive to put a given disk into a server than it is to put it into a dedicated storage array. This is because of the additional qualification and in some cases custom firmware that goes onto the drives that are certified for the arrays; and the subsequent reduction in volume associated with a given drive only being deployed into a single vendor’s gear. The net being that the drives for the arrays carry a premium price.
So, storage is about software, and hard disks in servers are cheaper. It makes sense then to bring these items together. We see this in products like VMware vSAN, and Atlantis USX. These products let you choose your own hardware and then add software to create storage.
The problem with a roll-your-own storage solution is that it’s necessary to do validation on the configuration and components you use. Will it scale? Do you have the right controllers? Drivers? Firmware? What about the ratios of CPU, Memory, Disk, and Flash? And of course there is the support question if it doesn’t all work together. If you want the flexibility to custom configure then the option is there. But it can be simpler if you want it to be.
So here enters the hyper-converged appliance. The idea is that vendors combine commodity hardware in validated configurations with software to produce an integrated solution with a single point of contact for support. If a brick provides 5TB of capacity and you need 15TB, buy 3 bricks. Need more later? Add another brick. It’s like Legos for your datacenter, just snap together the bricks.
This approach removes the need to independently size RAM, Disk, and CPU; it also removes the independent knowledge domains for storage and compute. It leverages the economy of scale of server components and provides the “easy button” for your server architecture, simplifying the install, configuration, and management of your infrastructure.
Software also has the ability to evolve very rapidly. Updates to existing deployments do not require new hardware.
Today the economics for hyper-converged appliances have fallen short of delivering on the price point. While they use the inexpensive hardware the software has always been priced at a premium.
The potential is there, but the software has been sold in low volumes with the vendors emphasizing OpEx savings. As competition in this space heats up we will see the price points come down. As volumes and competition increase the software companies will be willing to sell for less.
This will drive down the cost, eventually making legacy architectures cost prohibitive due to the use of proprietary (and thus low volume) components. Traditional storage vendors who are based on commodity components will be more competitive, but being “just storage” will make their solutions more complicated to deploy, scale, and maintain. The more proprietary the hardware the lower the volume and higher cost.
For these reasons – cost, complexity, and the ability of software to evolve – we will see hyper-converged, building block architectures eventually take over the datacenter. The change is upon us.
Are you ready to join the next wave? Reach out to your Lewan account executive and ask about next generation datacenters today. We’re ready to help.
Today’s topic – Disk Performance. A warning to the squeamish: math ahead.
Throughput refers to the amount of data read or written per unit of time. It is generally measured in units like Megabytes per second (MB/s) or Gigabytes per hour (GB/h). Often when dealing with networks we see Kilobits per second (Kb/s) or Megabits per second (Mb/s). Note that the abbreviations of some of those units look similar; pay attention to the capitalization because the differences are a factor of at least 8x.
It’s easy to talk about a hard drive, or a backup job, or even a network interface providing 250MB/s throughput and understand that if I have 500GB of data, it’s going to take a little over a half hour to transfer the data. (500GB * 1024MB/GB / 250MB/s / 3600s/h = 0.57h)
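The unit arithmetic above is easy to get wrong by a factor of 8 or 1024, so here is a minimal sketch of it. The function names are my own, and binary units (1GB = 1024MB) are assumed to match the calculation in the text.

```python
def megabits_to_megabytes(mbit_per_s):
    """Convert a rate in Mb/s (megabits) to MB/s (megabytes): 8 bits per byte."""
    return mbit_per_s / 8


def transfer_seconds(data_gb, throughput_mb_per_s):
    """Seconds needed to move data_gb at a sustained throughput in MB/s."""
    data_mb = data_gb * 1024  # binary units, as in the text
    return data_mb / throughput_mb_per_s


seconds = transfer_seconds(500, 250)
print(seconds, "seconds, or", seconds / 3600, "hours")

# A 15Mb/s network link expressed in the same units as the disk example.
print(megabits_to_megabytes(15), "MB/s")
```

The 500GB example works out to 2048 seconds, a little over half an hour, while a 15Mb/s link moves fewer than 2MB each second.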
Throughput is by far the most talked about and in general most understood measure of performance. By the same token when people ask about network performance they often go to speedtest.com and tell me that “my network is fast, because I get 15Mb/s download.” I agree that’s a pretty decent throughput, but that’s not the only measure of performance that’s important.
A second measure is Response Time (or latency). This is a measure of how long it takes a request to complete. In the network world we think about this being how long it takes a message to arrive at its destination after being sent. In the disk world we think about how long it takes from when we request that an IO operation happen until the system completes it. Disk latency (and network latency) are often measured in direct units of time – milliseconds (ms) or microseconds (us), and occasionally in seconds (s). Hopefully you never see IT technology latency measured in hours or days unless you’re using an RFC1149 circuit.
The combination of request response time and throughput, together with the size of the request (amount of data moved at a time), yields a metric which amounts to how many requests can be completed per unit of time. We see this most often in the disk world as I/O operations per second or IOPS. We talk about IOPS as a way of thinking about how “fast” a given disk system is, but it’s arguably more of a measure of workload capability than either latency or throughput; however both latency and throughput contribute to the maximum IO operations per second a given disk can manage.
For example – if we have a hard disk with a maximum physical throughput of 125MB/sec, which is capable of processing requests at a rate of 80 requests per second, what is the throughput of the drive if my workload consists of 4KB reads and writes? Well, in theory at 125MB/sec throughput the drive could process 125MB/s * 1024KB/MB / 4KB/IO = 32,000 IO/s. Hold on, the drive is only capable of 80 IOPS, so the maximum physical throughput won’t be achieved. 80IO/s * 4KB/IO = 320KB/s. If we wanted to maximize this drive’s throughput we would need to increase the size (or payload) of the IO requests. Ideally we’d perform reads and writes in blocks equal to the maximum throughput divided by the maximum IO rate (125MB/s / 80IO/s = 1.5625MB).
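The example above boils down to "achieved throughput is the smaller of the bandwidth limit and the IOPS limit". A minimal sketch, with function names of my own choosing and binary units assumed as in the text:

```python
def effective_throughput_kb_s(max_mb_s, max_iops, io_size_kb):
    """Achieved throughput in KB/s: whichever limit is hit first wins."""
    bandwidth_limit = max_mb_s * 1024    # KB/s the media can stream
    iops_limit = max_iops * io_size_kb   # KB/s the IO rate allows
    return min(bandwidth_limit, iops_limit)


def ideal_io_size_kb(max_mb_s, max_iops):
    """IO size at which the IOPS limit and the bandwidth limit coincide."""
    return max_mb_s * 1024 / max_iops


print(effective_throughput_kb_s(125, 80, 4))  # 4KB IOs on the example drive
print(ideal_io_size_kb(125, 80))              # KB per IO to saturate the drive
```

For the example drive, 4KB requests yield 320KB/s, and the break-even request size is 1600KB, a little over 1.5MB.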
This last trick, by the way, is what many vendors use to improve the performance of relatively “slow” hard disks. Referred to as IO coalescing, it takes many small IO operations and buffers them until they can be performed as one large physical IO.
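To make the idea of coalescing concrete, here is a minimal sketch. It is not how any particular vendor implements it; it assumes requests are simple (offset, length) pairs and only merges requests that are exactly contiguous, ignoring overlaps, ordering guarantees, and buffering timeouts that a real implementation would have to handle.

```python
def coalesce(requests):
    """Merge contiguous (offset, length) requests into larger ones.

    `requests` is a list of (offset, length) tuples in any order.
    Returns a new list sorted by offset, with adjacent requests merged.
    """
    merged = []
    for offset, length in sorted(requests):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # This request starts exactly where the previous one ends,
            # so extend the previous request instead of adding a new one.
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


# Three adjacent 4KB requests plus one far away (units are KB).
print(coalesce([(8, 4), (0, 4), (4, 4), (100, 4)]))  # [(0, 12), (100, 4)]
```

Four small physical IOs become two, and the drive spends one positioning delay on 12KB instead of three positioning delays on 4KB each.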
A drive’s maximum IOPS is actually a function of multiple factors.
Physical media throughput is one of them. It is governed by the physical block size (often 512 bytes), the number of blocks per track on the platter, the number of platters, and the rotational velocity of the drive (typically measured in revolutions per minute, or RPM). The idea is that the drive can only transfer data to or from the platters at the rate at which the data is moving under the heads. If we have a drive spinning at 7200RPM with, say, 100 blocks per track, 512 bytes/block, and a single platter and head, then one track passes under the head per revolution and the maximum physical media transfer rate is 512B/block * 100blocks/track * 7200 tracks/minute / 60seconds/minute / 1024bytes/KB / 1024KB/MB = about 5.86MB/s. Under no circumstances can the physical throughput of the drive exceed this value, because the data simply isn’t passing under the head any faster.
To improve this value you can add more heads (with two platters and four heads this drive could potentially move 23.4MB/s). You can increase the number of blocks per track (with 200 blocks per track and one head the drive would have a throughput of 11.7MB/s). Or you can increase the velocity at which the drive spins (at 36,000 RPM this drive would move about 29.3MB/s). As you can see, though, this maximum throughput is governed by the physical characteristics of the drive.
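The media transfer rate calculation is easy to play with in code. This is a rough sketch using the simplified model from this post; the function name is hypothetical, and it assumes every head transfers data in parallel, which is the assumption behind the four-head figure above.

```python
def media_transfer_mb_s(rpm, blocks_per_track, bytes_per_block=512, heads=1):
    """Maximum media transfer rate in MB/s (1MB = 1024 * 1024 bytes).

    One track passes under each head per revolution, so the rate is
    bytes per track * heads * revolutions per second.
    """
    revolutions_per_second = rpm / 60
    bytes_per_second = (
        bytes_per_block * blocks_per_track * heads * revolutions_per_second
    )
    return bytes_per_second / 1024 / 1024


print(media_transfer_mb_s(7200, 100))            # baseline drive
print(media_transfer_mb_s(7200, 100, heads=4))   # two platters, four heads
print(media_transfer_mb_s(7200, 200))            # denser tracks
print(media_transfer_mb_s(36000, 100))           # faster spindle
```

The four calls correspond to the baseline drive and the three improvements described above, and should come out at roughly 5.86, 23.4, 11.7, and 29.3MB/s.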
A second factor impacting IOPS is how long it takes to position the head to read or write a particular block on the disk. IO operations start at a given block and then proceed to read or write subsequent sequential blocks until the size of the IO request has been fulfilled. So on our sample drive above, a 4KB request is going to read or write 8 adjacent (sequential) blocks. We know what the physical transfer rate for the drive is, but how long does it take to physically move the mechanism so that the 8 blocks we care about will pass under the head? Two things have to happen: first we have to position the head over the right track, and then we have to wait for the right block to pass under the head. This is the combination of “seek time” and “rotational latency”.
Our 7200RPM drive spins at 7200RPM / 60 seconds/minute = 120 revolutions/second, so it completes one revolution every 1/120th of a second, or 0.00833 seconds (8.33 milliseconds). On average, then, every IO operation waits half a revolution, about 4.17ms, after the heads are aligned before it can start transferring data. Again, we can reduce the rotational latency by spinning the drive faster. Seek time (how long it takes to align the heads) varies by drive, but if it takes 6ms then the average physical access time for the drive would be about 10.17ms. Drives which are physically smaller have to move the heads shorter distances and will have lower seek times, and therefore lower access times. Larger drives, or drives with heavier head assemblies (more heads), will have higher seek times. For a given drive you can typically look up the manufacturer’s specs to see what the average seek time is.
So, let’s say that it typically takes 10ms to position a head and read a block; then our drive could potentially position the head 100 times per second. That means the maximum for this drive is 100 IOPS.
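The same reasoning can be written down as a short calculation. This is a sketch with made-up function names; it assumes the average rotational latency is half a revolution and, like the estimate above, it ignores the time spent actually transferring the data.

```python
def rotational_latency_ms(rpm):
    """Average rotational latency in ms: half of one revolution."""
    revolution_ms = 60_000 / rpm  # 60,000 ms per minute
    return revolution_ms / 2


def max_iops(seek_ms, rpm):
    """Upper bound on IOPS from seek time plus rotational latency."""
    access_ms = seek_ms + rotational_latency_ms(rpm)
    return 1000 / access_ms


print(round(rotational_latency_ms(7200), 2))  # average wait for the block
print(round(max_iops(6, 7200), 1))            # 6ms seek at 7200RPM
print(round(max_iops(6, 15000), 1))           # same seek, faster spindle
```

With a 6ms seek at 7200RPM this lands a little under the round figure of 100 IOPS used above, and it shows why spinning faster helps: only the rotational part of the access time shrinks, so the gain is real but far from proportional to the RPM.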
So, IOPS is governed by physics, and throughput (from media) is governed by physics. What else is a factor? There are additional latencies introduced by controllers, interfaces, and other electronics. These are generally fairly small, measured in microseconds (us), and relative to the latencies we’ve talked about they are usually negligible. The other factor is physical interface throughput. ATA-133, for instance, had a maximum channel throughput of 133MB/s, while Ultra-160 SCSI was limited to 160MB/s. The maximum throughput of a given drive is limited to the throughput of the channel to which it’s attached. These older ATA and SCSI interfaces also attached multiple devices to a channel, which limited the sum of all devices to the bandwidth of the channel. Newer SAS and SATA architectures generally dedicate a channel per device; however, the use of expanders connects multiple devices to the same channel. The net of this is that if you have 10 devices at 25MB/sec throughput each connected to a channel with a maximum throughput of 125MB/sec, then the maximum throughput you’ll see is 125MB/sec.
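The shared-channel limit is simple enough to express directly. This is only an illustrative sketch with a hypothetical function name; it treats the channel as a hard cap and ignores protocol overhead and contention effects.

```python
def aggregate_throughput_mb_s(device_rates_mb_s, channel_limit_mb_s):
    """Total throughput of devices sharing one channel, in MB/s.

    The devices can never deliver more than the channel can carry.
    """
    return min(sum(device_rates_mb_s), channel_limit_mb_s)


print(aggregate_throughput_mb_s([25] * 10, 125))  # 125 (channel is the bottleneck)
print(aggregate_throughput_mb_s([25] * 3, 125))   # 75 (drives are the bottleneck)
```

With ten 25MB/s devices the drives could supply 250MB/s between them, but only 125MB/s reaches the host; with three devices the channel has headroom to spare.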
So that covers what governs the speed of a hard disk. Some may ask “what makes SSDs so fast?” The short answer is that they aren’t spinning magnetic devices, and therefore don’t have the same physical limits. The long answer is a topic for another blog.