Thursday, 10 September 2015

Why and how to virtualize business-critical applications


Virtualizing business-critical applications brings many benefits for organizations. This post explains the technical challenges and offers solutions.

When companies deploy virtual infrastructure environments, they achieve immediate savings in data center footprint by consolidating server workloads onto fewer hardware components. Tuning those components achieves higher levels of availability for the applications running on them. But after virtualizing a large share of their workloads, organizations often fail to progress.

Getting all applications migrated to a virtual infrastructure platform requires new skills and new ways of managing capacity. The shift to Software Defined Databases requires a fundamental shift in how applications are developed and deployed. Licensing issues require special attention, as vendors also realize that compute workloads are no longer directly tied to physical hardware components.

The most common problem

As soon as a business-critical application requires a higher level of availability than the available virtual infrastructure can provide, problems arise.

Business-critical applications are understood here as applications such as Microsoft SQL Server, Exchange and SharePoint; SAP; custom Java on Linux; Oracle and Oracle RAC (the most common examples); as well as DB2, Cassandra, Hadoop/HBase, WebLogic and WebSphere; Tibco, RabbitMQ, MQ Services and other message queue systems; and finally in-house, custom-built and maintained “home-grown” applications.

When the application runs slowly or even becomes unstable, it is temporarily moved back to the original physical infrastructure and the virtual environment is blamed. The cause, however, is usually not the virtual environment itself but the way it was configured and deployed on the physical infrastructure. In addition, basic mistakes are often made.

Understand the key issues

Business-critical applications share a number of technological characteristics: they have high compute loads (with heavy math or thread processing), high RAM utilization, specialized I/O (particularly storage), availability configurations (requiring OS or application clustering) and complex networking configurations (public and private networks to support clustering).

Each critical application requires a disproportionate amount of CPU, RAM, disk (both disk space and I/O) and network (both number of connections and bandwidth), as well as higher levels of redundancy, availability and recoverability. Each application’s requirements are unique, but predictable. It is important to translate the resource requirements for running on native hardware into the virtual environment.

Although every application has something unique, it is not necessary to define individual best practices for each application to thrive in a virtual infrastructure environment. Because the virtual environment is an abstraction layer, one set of common practices can apply to all critical applications. Each application can then be tuned further, as on any physical infrastructure.

Solutions

Critical applications are already complex, so keep the design and the solution simple.

-          Avoid adding disks and spreading them across multiple data stores. Keep the number of disks and data stores to a minimum. Avoid splitting out base files that are part of a virtual machine’s core components (vswap and others).

-          Avoid duplicating high-availability or redundancy features through external or home-grown solutions (these features are often already present in the base systems or architecture).

-          Avoid assigning more CPU cores than necessary, as it may slow performance (the hypervisor may try to schedule CPU cores that have nothing to do; heavily threaded applications use more cores, while number crunchers use fewer cores and more cycles).
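The advice against over-allocating CPU cores can be illustrated with a rough sizing sketch. It assumes that you have sampled how many cores the application actually keeps busy on its current system; the function name `suggest_vcpus`, the 95th-percentile cut-off and the 70% target utilization are illustrative assumptions, not values from any vendor guidance.

```python
import math


def suggest_vcpus(busy_core_samples, target_utilization=0.7, percentile=0.95):
    """Suggest a vCPU count from observed busy-core samples.

    busy_core_samples: cores' worth of work measured at each sampling
    interval on the existing system (3.2 means 3.2 cores were busy).
    target_utilization and percentile are assumed defaults for illustration.
    """
    if not busy_core_samples:
        raise ValueError("need at least one sample")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    if not 0 < percentile <= 1:
        raise ValueError("percentile must be in (0, 1]")

    ordered = sorted(busy_core_samples)
    # Nearest-rank percentile: rare spikes above the cut-off are ignored,
    # so a single outlier does not inflate the vCPU count.
    rank = max(1, math.ceil(percentile * len(ordered)))
    demand = ordered[rank - 1]
    # Add headroom so the VM runs near the target utilization at that demand.
    return max(1, math.ceil(demand / target_utilization))


if __name__ == "__main__":
    samples = [1.0, 2.0, 3.0, 4.0]
    print(suggest_vcpus(samples, target_utilization=0.5, percentile=0.5))
```

Starting from a measured figure like this and adding cores only when monitoring shows a need tends to be safer than copying the core count of the old physical server.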

Instead, architect hardware from a total performance perspective.

The virtual environment always depends on the hardware. Therefore, size hardware components appropriately to handle the anticipated loads. Optimize CPU, RAM, disk and network.

-          RAM is almost always exhausted first on virtual infrastructure environments.

-          Spread I/O appropriately across the storage area network (SAN); use solid state drives (SSDs) and cache capabilities to boost performance. Enable jumbo frames as the norm for IP SAN technologies (iSCSI and NFS).

-          Use 10GbE connections for all network connectivity.
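Since RAM tends to run out first, a quick capacity check is a useful starting point when sizing hosts. The sketch below assumes a simple model: the hosts with the most RAM fail, each surviving host loses a fixed amount to the hypervisor, and the VMs' configured RAM must fit in what remains. The function name and the 8 GB overhead figure are assumptions for illustration, and the check ignores whether each individual VM fits on a single host.

```python
def ram_fits_after_failures(host_ram_gb, vm_ram_gb, host_failures=1,
                            hypervisor_overhead_gb=8):
    """Return (fits, spare_gb) for the worst case where the largest hosts fail.

    host_ram_gb: physical RAM per host, in GB.
    vm_ram_gb: configured RAM per VM, in GB.
    hypervisor_overhead_gb: assumed RAM reserved per host (illustrative).
    """
    if host_failures < 0 or host_failures >= len(host_ram_gb):
        raise ValueError("host_failures must leave at least one host running")

    # Worst case: the hosts with the most RAM are the ones that fail.
    survivors = sorted(host_ram_gb)[: len(host_ram_gb) - host_failures]
    usable = sum(max(0, ram - hypervisor_overhead_gb) for ram in survivors)
    spare = usable - sum(vm_ram_gb)
    return spare >= 0, spare


if __name__ == "__main__":
    hosts = [256, 256, 256]
    vms = [128, 128, 64, 64, 64]
    fits, spare = ram_fits_after_failures(hosts, vms)
    print(f"fits={fits}, spare={spare} GB")
```

A negative spare value means the environment cannot absorb the assumed number of host failures without memory overcommitment.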

Storage is perhaps the most complex resource to manage, because it is almost always abstracted in multiple layers and varies with the make and model of the storage system used. It is where application performance problems arise first and most frequently.

-          Storage capabilities should be pushed as low as practical in the hardware stack.

-          Storage should appear as simple, local disks, and networks should appear as simple connections.

-          Make sure that individual components are not easily overwhelmed, just as you would when architecting shared storage for high-capacity I/O systems and applications.

-          Use raw disk mappings (RDMs) as a last resort only (they add no performance advantage over a virtual disk located in a properly configured data store). Instead, and where feasible, use OS-level storage systems such as Automatic Storage Management (ASM) on Oracle.
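One way to check that no single storage component is easily overwhelmed is to add up the peak I/O demand of the virtual disks on each data store and compare it with what the data store can deliver. The following sketch assumes you already know both figures; the function name, the data store names and the 80% safety margin are hypothetical.

```python
from collections import defaultdict


def overloaded_datastores(disks, datastore_iops_limit, max_fraction=0.8):
    """Report data stores whose summed peak IOPS exceed a safety budget.

    disks: iterable of (datastore_name, peak_iops), one entry per virtual disk.
    datastore_iops_limit: dict of datastore_name -> IOPS it can sustain.
    max_fraction: assumed share of the limit that may be planned for.
    Returns a dict of datastore_name -> (total_peak_iops, budget).
    """
    demand = defaultdict(int)
    for datastore, peak_iops in disks:
        demand[datastore] += peak_iops

    report = {}
    for datastore, total in demand.items():
        budget = datastore_iops_limit[datastore] * max_fraction
        if total > budget:
            report[datastore] = (total, budget)
    return report


if __name__ == "__main__":
    disks = [("ds-sql", 6000), ("ds-sql", 3000), ("ds-app", 1500)]
    limits = {"ds-sql": 10000, "ds-app": 5000}
    for name, (total, budget) in overloaded_datastores(disks, limits).items():
        print(f"{name}: peak demand {total} IOPS exceeds budget {budget:.0f}")
```

Peak values are summed deliberately: the result is pessimistic if the peaks never coincide, which suits a first-pass design check.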

Keep networks simple.

-          Avoid virtual network interface controller (vNIC) teaming and bonding inside a VM, as this is already handled by the hypervisor. Use one NIC for each distinct network the VM connects to.

-          Keep virtual machines simple and transparent. Do not install unnecessary services and features, and turn off those that are already present.

-          Follow best practices to harden the OS (to the applications, it should feel like any other optimized environment).

A typical business critical application optimization stack

A typical business critical application optimization stack could look as follows (numbered from bottom to top, so layer 1 is the foundation):

-          Application oriented optimization

o   5b) Java Application

§  Resource Allocation, App Tunables

o   5a) Java Virtual Machine

§  Heap Size, Threads, etc.

o   5) Application

§  Cache, SGA, RM Commitment, App Specific Tunables

o   4) Operating System

§  Para-virtual Drivers, Kernel Parameter Tuning (Linux)

-          Virtual infrastructure oriented optimization

o   3) Virtual Machine Hardware

§  Optimize vCPU, RAM, Storage, Resource Limits & Reservations

o   2) Hypervisor

§  Resource Pools, HA, DRS, Data Stores, Parameter Tuning

o   1) Physical Hardware

§  Server, storage, network

Clustering/final optimizations

Understand when to cluster and when not to.

With a well-engineered virtual infrastructure platform, certain high-availability configurations that system clustering provides in physical infrastructure deployments can often be eliminated. However, clustering still plays an important role for active-active clustered systems: it supports rolling upgrades and regular maintenance, minimizes downtime during patches, etc.

When clustering on top of virtual infrastructure, the high-availability features of each layer should be optimized to complement one another. Avoid clustering techniques that may interfere with the infrastructure layers above and below.

To share disks between individual nodes (voting and quorum drives) of operating system clusters running on VMs, use one of the four available methods: RDM, multi-write virtual disk, iSCSI or NFS on SAN/NAS, or an iSCSI/NFS gateway VM. The gateway VM is gaining traction as it resolves almost all of the limitations of the other three. However, it is also more complex to set up and maintain.

Use anti-affinity policies between the cluster nodes so that two nodes never run on the same physical host at the same time (which would defeat one of the high-availability purposes of clustering).
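An anti-affinity rule is enforced by the virtual infrastructure itself, but it is easy to audit the current placement independently. The sketch below assumes you can export a list of (VM, cluster, host) placements from your management tooling; the function name and all VM, cluster and host names are hypothetical.

```python
from collections import defaultdict


def anti_affinity_violations(placements):
    """Find cluster nodes that share a physical host.

    placements: iterable of (vm_name, cluster_name, host_name).
    Returns a dict of (cluster_name, host_name) -> sorted list of VM names
    for every host that runs more than one node of the same cluster.
    """
    by_cluster_host = defaultdict(list)
    for vm, cluster, host in placements:
        by_cluster_host[(cluster, host)].append(vm)

    return {
        key: sorted(vms)
        for key, vms in by_cluster_host.items()
        if len(vms) > 1
    }


if __name__ == "__main__":
    placements = [
        ("rac-node1", "rac", "esx01"),
        ("rac-node2", "rac", "esx01"),  # same cluster, same host: violation
        ("rac-node3", "rac", "esx02"),
        ("sql-a", "sql", "esx01"),
        ("sql-b", "sql", "esx02"),
    ]
    for (cluster, host), vms in anti_affinity_violations(placements).items():
        print(f"cluster {cluster}: {', '.join(vms)} share host {host}")
```

VMs from different clusters may share a host without being flagged, since only nodes of the same cluster need to be kept apart.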

Use a multi-write virtual disk to keep all data in virtual disk files on a data store, where all cluster nodes can then access it.

Credits & special thanks: This blog incorporates thought leadership and published content of Chris William, director of Cognizant Virtual Solutions.



+++
To share your own thoughts or other best practices about this topic, please email me directly at alexwsteinberg (@) gmail.com.

Alternatively, you also may connect with me and become part of my professional network of Business, Digital, Technology & Sustainability experts at

https://www.linkedin.com/in/alexwsteinberg   or
Xing at https://www.xing.com/profile/Alex_Steinberg   or
Google+ at  https://plus.google.com/u/0/+AlexWSteinberg/posts


 
 
