Citrix Virtual App user density on AWS
Login VSI workload testing on AWS EC2 instances
If you are in the End User Computing (EUC) space like me, you are undoubtedly aware of the annual VDI Like A Pro survey. In this survey we continue to see interest in the public cloud growing. However, most respondents cited cost and performance as major concerns when considering cloud options. At Login VSI we are seeing more traction with enterprises adopting cloud strategies for their digital workspaces, and I am personally very excited about this.
When we talk about enterprises in the cloud, we are talking about large migrations: the average published application deployment serves between 1,000 and 5,000 users (another survey data point). At that scale, sizing mistakes quickly become costly, and performance issues can hurt the productivity of a sizeable number of employees within your environment.
Taking into consideration the interest in cloud-based offerings, Citrix partnered with Login VSI using the Login VSI industry standard benchmarking software to formulate not only sizing, but also a method for determining user density of Amazon Web Services EC2 instance types. The benefit of this is that Citrix and AWS customers can have some objective data for sizing, and testing would provide some baseline user performance data as well, thereby increasing the consumer’s likelihood of success with less cost.
At Login VSI we are acutely aware that the success of these types of projects is largely dependent upon the user experience, and therefore we’ve developed our suite of enterprise tools to ensure the best user experience once your configuration leaves the test bed and enters production.
To learn more about Citrix Cloud services, see the following link.
To learn more about the annual VDI Like a Pro survey, an industry-wide independent survey conducted with hundreds of respondents throughout the world, go to the results. These results give IT professionals insight into trends happening within the industry. This enables us to be better prepared and to make intelligent decisions about which technologies we invest our time in learning and deploying.
In our test configuration the Citrix VDA was installed onto a vanilla Windows Server 2016 RDSH installation with the latest patches. No optimization tuning was performed on the OS, which makes the maximum concurrent sessions we report per instance type conservative. This matters because we wanted to provide a baseline that is applicable to the demands most enterprise workloads place on system resources like CPU, memory and storage. Your mileage will vary: this baseline is a great starting point, but you should always test your specific workload to validate your densities and optimizations.
Each test was executed multiple times to ensure that the results were accurate.
Typically, we follow the best practice of rebooting the virtual machines between tests; however, we found no difference in the results whether or not the machines were rebooted. In cloud-based offerings you can reboot the virtual machine, but the physical hardware backing it is not accessible, so we were unable to observe the performance of the underlying hardware or reboot it. We did, however, capture performance monitoring data on the Windows Server VM itself and found the bottleneck to be the CPU in most instances. We’ve also found that this is typical of most modern-day VDI or SBC cloud deployments.
To properly stress the environment, the test duration was set to 48 minutes, the default Login VSI test duration. The test duration determines the interval between logins, so that logins are spread smoothly and evenly across every test.
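As a rough illustration of how the test duration paces logins, the sketch below divides the 48-minute window evenly across the sessions in a test. The function name and the 100-session example are illustrative assumptions, not Login VSI internals.

```python
def login_interval_seconds(test_duration_minutes: float, session_count: int) -> float:
    """Return the evenly spaced delay, in seconds, between consecutive logins."""
    if session_count <= 0:
        raise ValueError("session_count must be positive")
    return test_duration_minutes * 60 / session_count


# Hypothetical example: 100 sessions launched over the default 48-minute window.
interval = login_interval_seconds(48, 100)
print(f"One login every {interval:.1f} seconds")
```

The larger the instance you are testing, the more sessions have to fit in the same window, so the logins arrive closer together.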
For this testing, the default Login VSI Knowledge Worker was used for all sets of test data.
The Knowledge Worker is the most popular of the Login VSI workloads as it most closely represents today’s enterprise worker. It utilizes programs such as Adobe Reader, Freemind/Java, Internet Explorer, Microsoft Excel/Outlook/PowerPoint/Word and Photo Viewer.
It also uses native Windows apps (like Notepad and Zip) to complete the print and zip actions used by the workload meta functions.
The Knowledge Worker is designed for 2(v)CPU environments. This is a well-balanced intensive workload that stresses the system smoothly, resulting in higher CPU, RAM and IO usage.
Finding the maximum user density
Maximum user density is reached when a dynamically created threshold is exceeded. This threshold is established by first determining the baseline (best case) performance of the system. Baseline performance is measured with a proprietary set of actions executed within the workload every 3 to 4 minutes. The baseline is calculated from the first 15 of these response times, with the highest and lowest values removed to reduce variance. The threshold is then set to a response time at which users are no longer expected to receive a good experience. This point represents the “curve” in the hockey stick, where ordinary users start to feel the system departing from their “normal” experience.
It is important to consider performing these tests yourself, as there are many factors that can greatly change the density numbers you will see in production. Examples include running antivirus software like Windows Defender, or using a profile management solution such as Workspace Environment Manager (WEM).
Our Citrix configuration was simple. A single StoreFront server, a single Cloud Connector, and a single Windows Server 2016 RDSH XenApp host were used, in addition to the Login VSI infrastructure (file share and launchers). Authentication and access were controlled by a typical Windows-based domain.
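The baseline-and-threshold idea can be sketched roughly as follows. This is a simplified illustration of the description above, not Login VSI's proprietary calculation: averaging the trimmed samples, the function names, and the fixed `offset_ms` added to the baseline are all assumptions made for the example.

```python
from statistics import mean


def baseline_ms(response_times_ms, sample_count=15):
    """Average the first `sample_count` samples after dropping the highest and lowest."""
    samples = sorted(response_times_ms[:sample_count])
    if len(samples) < 3:
        raise ValueError("need at least three samples to trim outliers")
    return mean(samples[1:-1])


def threshold_ms(response_times_ms, offset_ms=1000):
    """Hypothetical threshold: the baseline plus a fixed offset (an assumed value)."""
    return baseline_ms(response_times_ms) + offset_ms


def max_density(avg_response_by_session_count, threshold):
    """Return the last session count reached before the threshold is first exceeded.

    `avg_response_by_session_count` is a list of (session_count, avg_response_ms)
    pairs ordered by increasing session count.
    """
    density = 0
    for sessions, response in avg_response_by_session_count:
        if response > threshold:
            break
        density = sessions
    return density
```

Given a series of response times recorded as sessions ramp up, `max_density` returns the session count just before the "curve" in the hockey stick under these assumptions.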
In order to test the user density for each AWS instance, the RDSH role was installed on each instance type, along with the Citrix VDA. Here is a list of the test configuration components:
- Citrix Virtual Apps and Desktops service (formerly Citrix Cloud XenApp and XenDesktop Service)
- Citrix Server OS VDA 7.17 (default policies, no optimizations)
- Citrix Cloud Connector
- Windows Server 2016, with current patches as of the time of testing and no optimizations (e.g., no Citrix Optimizer or VMware OSOT)
- Microsoft Office 2016, out of the box configuration (no optimizations)
- Windows Defender (no optimizations or exclusions)
- The Login VSI target setup (Adobe Reader, Freemind, etc.)
The M5 Instance Type
Amazon provides a variety of different instance types for each type of workload. For our testing we focused on the general purpose instances. These instance types are the most popular in use currently, according to several of our inside sources. The instance type you select can have a profound effect on the user density and user experience, as you will see below. For additional information on instance types see: https://aws.amazon.com/ec2/instance-types/
Note: Don’t forget to consider which AWS geographic location(s) you’d like your instances to run from. The closer the AWS datacenter is to your users, the less latency will impact the user experience.
M5 instances are the latest generation of General Purpose Instances. This family provides a balance of compute, memory, and network resources, and it is a good choice for many applications.
- 2.5 GHz Intel Xeon® Platinum 8175 processors with the new Intel Advanced Vector Extensions (AVX-512) instruction set
- New larger instance size, m5.24xlarge, offering 96 vCPUs and 384 GiB of memory
- Up to 25 Gbps network bandwidth using Enhanced Networking
- Requires HVM AMIs that include drivers for ENA and NVMe
- Powered by the new light-weight Nitro system, a combination of dedicated hardware and lightweight hypervisor
- Instance storage offered via EBS or NVMe SSDs that are physically attached to the host server
- With new M5d instances, local NVMe-based SSDs are physically connected to the host server and provide block-level storage that is coupled to the lifetime of the M5 instance
Use Cases: Small and mid-size databases, data processing tasks that require additional memory, caching fleets, and for running backend servers for SAP, Microsoft SharePoint, cluster computing, and other enterprise applications. The M5 instances are also a great choice for Citrix Virtual App workloads.
The table below shows the results of the Login VSI Knowledge Worker scalability tests. This table is intended to help you identify the best starting point for your initiatives.
Remember that we tested a vanilla Windows Server configuration with antivirus. Because no performance tuning optimizations were applied, these results should be fairly conservative and a good starting point for instance selection.
For production workloads, in order to derive more accurate testing with respect to sizing we like to ask ourselves questions like, “What would happen if...”:
- ...I used a different antivirus application? Would it impact CPU utilization more?
- ...I enabled Citrix Workspace Environment Manager (WEM)? Would it improve user experience?
- ...I used the Citrix Optimizer on my session host? Would I get 20% more sessions per host?
- ...I installed Spectre and Meltdown related patches? Would I get 20% fewer sessions per host? Since our testing was recently conducted, we know that some Spectre and Meltdown patches were installed.
- ...I installed monitoring agents? Would I get fewer sessions per host?
These questions illustrate just a few of the things that will influence the scaling and performance you’ll see with your specific workload, in addition to what’s likely a different set of applications and user behavior. Our results here are based on a consistent but synthetic, reproducible workload, configured conservatively. You’ll want to run your own tests, with your specific workload and configuration, to get more directly applicable results, and Login VSI can help!
If you are this type of user, then you may want to start here
Knowledge workers are the most common type of enterprise worker. Based on our testing:
- The most cost-effective option for Knowledge Worker workloads is the M5.2XLarge. This was also confirmed by testing that Citrix published in this blog post: https://www.citrix.com/blogs/2018/08/16/citrix-scalability-in-a-cloud-world-2018-edition/
- The least cost-effective option for Knowledge Worker workloads is the M5.Large
- The configuration offering the most user density for Knowledge Worker workloads is the M5.4Xlarge
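Cost effectiveness here comes down to the hourly instance price divided by the number of sessions the instance can host. A minimal sketch of that arithmetic follows; the instance labels, prices, and densities are placeholder values chosen only to show the calculation, not the measured results or actual AWS pricing.

```python
def cost_per_user_hour(hourly_price_usd: float, max_sessions: int) -> float:
    """Return the hourly cost of one session on a fully loaded instance."""
    if max_sessions <= 0:
        raise ValueError("max_sessions must be positive")
    return hourly_price_usd / max_sessions


def rank_by_cost(instances):
    """Return instance labels ordered from cheapest to most expensive per user."""
    return sorted(instances, key=lambda name: cost_per_user_hour(*instances[name]))


# Placeholder data: label -> (hourly price in USD, max sessions). Not real results.
placeholder_instances = {
    "small": (0.20, 8),
    "medium": (0.80, 40),
    "large": (1.60, 70),
}

for name in rank_by_cost(placeholder_instances):
    price, sessions = placeholder_instances[name]
    print(f"{name}: ${cost_per_user_hour(price, sessions):.4f} per user-hour")
```

Substituting your own measured densities and the current pricing for your region gives a like-for-like comparison across instance types.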
Prior to updating our test results with the M5 instances, we tested the M4 instances. In general, you should use the latest generation your cloud provider offers. With the M4 testing we did find some behavior that seemed counterintuitive. When comparing the M4.10XLarge to the M4.16XLarge, we saw that the larger instance type provided less user density. The M4.16XLarge has 24 more vCPUs and 96 GiB more RAM, yet the M4.10XLarge was able to host 24% more sessions. Knowing this, how much money would you have saved by selecting the cheaper of the two?! While we aren’t sure what caused this, we suspect processor power settings may have played a role. This just reinforces the point that you should always test the platforms you plan to deploy Citrix Virtual Apps on before you go to production.