Objective 3.3 – Create a vSphere 5.x Physical Storage Design from an Existing Logical Design
1. Describe selection criteria for commonly used RAID types.
The IOMEGA comes configured as RAID5 which I don’t intend to change as it gives a decent balance between performance and redundancy.
- RAID0 = stripe across all disks; no redundancy, so any single disk failure loses the set (note: JBOD is a simple concatenation of disks, not RAID0 striping)
- RAID1 = mirror (data copied across both disks); can lose only 1 disk
- RAID3 = dedicated parity disk (min. of 3 disks); can lose only 1 disk
- RAID5 = distributed parity across all RAID disks; data-loss potential during RAID rebuilds (min. of 3 disks); decent reads, but writes carry a penalty of 4 (n*IOPS/4)
- RAID6 = dual distributed parity (n+2), so it can survive a second failure during RAID rebuilds; slightly fewer reads due to 2 disks lost to parity; writes carry a penalty of 6 (n*IOPS/6)
- RAID1+0 = mirrored pairs striped together (min. 4 disks): best performance & most expensive; Read = sum of all disks * IOPS; writes carry a penalty of 2 (n*IOPS/2)
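The read/write arithmetic above can be turned into a quick calculator. This is a hypothetical helper for illustration, not a vendor sizing formula; the write penalties are the standard per-RAID-level values from the list above.

```python
# Functional (front-end) IOPS estimate for a RAID group, using the
# write penalties noted above. Names and numbers are illustrative.
WRITE_PENALTY = {"RAID0": 1, "RAID1": 2, "RAID5": 4, "RAID6": 6, "RAID10": 2}

def functional_iops(disks, iops_per_disk, read_pct, raid):
    """Usable front-end IOPS for a given read/write mix (read_pct in 0..1)."""
    raw = disks * iops_per_disk
    write_pct = 1 - read_pct
    return raw / (read_pct + write_pct * WRITE_PENALTY[raid])

# Example: 8 disks at 150 IOPS each, 70% read workload
print(round(functional_iops(8, 150, 0.7, "RAID5")))   # → 632
print(round(functional_iops(8, 150, 0.7, "RAID10")))  # → 923
```

The same raw 1200 back-end IOPS yields noticeably more usable IOPS on RAID1+0 than RAID5 once the write penalty is factored in, which is the selection trade-off the list describes.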
A good diagram can be found on the VMware troubleshooting storage performance blog.
Skills and Abilities
2. Based on the service catalog and given functional requirements, for each service:
- Determine the most appropriate storage technologies for the design.
- Implement the service based on the required infrastructure qualities.
I intend to use a combination of VSAN and iSCSI storage; VSAN will come at a later date as I’m restricted by budget. I’ll be using an Iomega NAS drive to present 2x 1TB datastores to the HP MicroServers; it provides satisfactory IOPS and comes in at a good price point.
3. Create a physical storage design based on selected storage array capabilities, including but not limited to:
- Active/Active, Active/Passive
- ALUA, VAAI, VASA
- PSA (including PSPs and SATPs)
Obviously I can’t apply most of this to my design but below are some things to think about if it were a real world deployment.
Multipathing policies are largely driven by the storage vendors and they should always be consulted for recommended configurations.
- Active-active storage system – allows access to the LUNs simultaneously through all the available storage ports without significant performance degradation. All the paths are active at all times, unless a path fails.
- Active-passive storage system – a system in which one storage processor is actively providing access to a given LUN. The other processors act as backup for that LUN and can be actively providing access to other LUN I/O. I/O can be successfully sent only to an active port for a given LUN. If access through the active storage port fails, one of the passive storage processors can be activated by the servers accessing it.
- Asymmetrical storage system – supports Asymmetric Logical Unit Access (ALUA). ALUA-compliant storage systems provide different levels of access per port. ALUA allows hosts to determine the states of target ports and prioritize paths. The host uses some of the active paths as primary and others as secondary.
- Most Recently Used (MRU) — Selects the first working path, discovered at system boot time. If this path becomes unavailable, the ESX/ESXi host switches to an alternative path and continues to use the new path while it is available. This is the default policy for Logical Unit Numbers (LUNs) presented from an Active/Passive array. ESX/ESXi does not fail back to the previous path if, or when, it returns; it remains on the working path until that path fails for any reason.
Note: The preferred flag, while sometimes visible, is not applicable to the MRU pathing policy and can be disregarded.
- Fixed (Fixed) — Uses the designated preferred path flag, if it has been configured. Otherwise, it uses the first working path discovered at system boot time. If the ESX/ESXi host cannot use the preferred path or it becomes unavailable, ESX/ESXi selects an alternative available path. The host automatically returns to the previously-defined preferred path as soon as it becomes available again. This is the default policy for LUNs presented from an Active/Active storage array.
- Round Robin (RR) — Uses an automatic path selection rotating through all available paths, enabling the distribution of load across the configured paths. For Active/Passive storage arrays, only the paths to the active controller will be used in the Round Robin policy. For Active/Active storage arrays, all paths will be used in the Round Robin policy.
Note: This policy is not currently supported for Logical Units that are part of a Microsoft Cluster Service (MSCS) virtual machine.
- Fixed path with Array Preference — The VMW_PSP_FIXED_AP policy was introduced in ESX/ESXi 4.1. It works for both Active/Active and Active/Passive storage arrays that support ALUA. This policy queries the storage array for the preferred path based on the array’s preference. If no preferred path is specified by the user, the storage array selects the preferred path based on specific criteria.
Note: The VMW_PSP_FIXED_AP policy has been removed from ESXi 5.0. For ALUA arrays in ESXi 5.0 the PSP MRU is normally selected but some storage arrays need to use Fixed.
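To illustrate how Round Robin behaves, here is a toy path selector in Python. It is only a sketch of the rotate-and-skip-failed-paths idea, not the actual NMP/PSP implementation; all names are invented.

```python
from itertools import cycle

# Toy model of the Round Robin policy: rotate I/O across paths, skipping
# any marked failed. Purely illustrative, not VMware's implementation.
class RoundRobinSelector:
    def __init__(self, paths):
        self.paths = paths
        self._cycle = cycle(paths)   # endless rotation over all paths
        self.failed = set()

    def next_path(self):
        # any window of len(paths) consecutive items covers every path once
        for _ in range(len(self.paths)):
            p = next(self._cycle)
            if p not in self.failed:
                return p
        raise RuntimeError("no working paths remain")

sel = RoundRobinSelector(["vmhba1:C0:T0:L0", "vmhba2:C0:T0:L0"])
sel.failed.add("vmhba2:C0:T0:L0")
print(sel.next_path())  # → vmhba1:C0:T0:L0 (only healthy path)
```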
- Full copy, also called clone blocks or copy offload. Enables the storage arrays to make full copies of data within the array without having the host read and write the data. This operation reduces the time and network load when cloning virtual machines, provisioning from a template, or migrating with Storage vMotion.
- Block zeroing, also called write same. Enables storage arrays to zero out a large number of blocks to provide newly allocated storage, free of previously written data. This operation reduces the time and network load when creating virtual machines and formatting virtual disks.
- Hardware assisted locking, also called atomic test and set (ATS). Supports discrete virtual machine locking without use of SCSI reservations. This operation allows disk locking per sector, instead of the entire LUN as with SCSI reservations.
- Array thin provisioning helps monitor space use on thin-provisioned storage arrays to prevent out-of-space conditions and to perform space reclamation; space reclamation is a manual process and needs to be run from the ESXi CLI.
PSA – a collection of APIs that allows 3rd-party ISVs to design their own load-balancing/failover techniques
PSPs – I/O path selection; MRU (default for A/P), Fixed (default for A/A), RR (either)
Storage Array Type Plug-Ins (SATPs) run in conjunction with the VMware NMP and are responsible for array-specific operations.
ESXi offers a SATP for every type of array that VMware supports. It also provides default SATPs that support non-specific active-active and ALUA storage arrays, and the local SATP for direct-attached devices.
Each SATP accommodates special characteristics of a certain class of storage arrays and can perform the array-specific operations required to detect path state and to activate an inactive path. As a result, the NMP module itself can work with multiple storage arrays without having to be aware of the storage device specifics.
After the NMP determines which SATP to use for a specific storage device and associates the SATP with the physical paths for that storage device, the SATP implements the tasks that include the following:
Monitors the health of each physical path.
Reports changes in the state of each physical path.
Performs array-specific actions necessary for storage fail-over. For example, for active-passive devices, it can activate passive paths.
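The failover task in the last bullet can be sketched with a toy model: on notification that the active path is down, an array-specific plug-in activates a passive path. This is an invented illustration, not VMware’s SATP interface.

```python
# Toy sketch of the active-passive failover task above: when the active
# path fails, promote a passive path. Names invented for illustration.
class ToySATP:
    def __init__(self, active, passive):
        self.active = active
        self.passive = passive  # ordered list of standby paths

    def on_path_down(self, path):
        """Handle a path-state change; fail over if the active path died."""
        if path == self.active and self.passive:
            self.active = self.passive.pop(0)  # activate a passive path
        return self.active

satp = ToySATP("vmhba1:C0:T0:L0", ["vmhba2:C0:T0:L0"])
print(satp.on_path_down("vmhba1:C0:T0:L0"))  # → vmhba2:C0:T0:L0
```

The point of the abstraction is visible even in the toy: the NMP only reports the event, while the array-specific logic decides how to activate a path.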
4. Identify proper combination of media and port criteria for given end-to-end performance requirements.
Refers to tiered storage based on performance type, e.g.
Gold = SSD
Silver = FC 15k SAS
Bronze = 7.2k SATA
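A tier scheme like this can be expressed as a small lookup helper. The IOPS thresholds below are invented purely for illustration; real tier boundaries would come from the service catalog.

```python
# Hypothetical tier-selection helper matching the Gold/Silver/Bronze
# scheme above; the IOPS thresholds are illustrative assumptions.
TIERS = [
    ("Gold", "SSD", 5000),
    ("Silver", "FC 15k SAS", 1500),
    ("Bronze", "7.2k SATA", 0),
]

def pick_tier(required_iops):
    """Return the tier whose performance floor the workload requires."""
    for name, media, min_iops in TIERS:
        if required_iops >= min_iops:
            return name, media

print(pick_tier(2000))  # → ('Silver', 'FC 15k SAS')
```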
5. Specify the type of zoning that conforms to best practices and documentation.
With ESXi hosts, use single-initiator zoning or single-initiator-single-target zoning. The latter is the preferred zoning practice. Using the more restrictive zoning prevents problems and misconfigurations that can occur on the SAN.
Zoning not only prevents a host from unauthorized access of storage assets, but it also stops undesired host-to-host communication and fabric-wide Registered State Change Notification (RSCN) disruptions. RSCNs are managed by the fabric Name Server and notify end devices of events in the fabric, such as a storage node or a switch going offline. Brocade isolates these notifications to only the zones that require the update, so nodes that are unaffected by the fabric change do not receive the RSCN. This is important for non-disruptive fabric operations, because RSCNs have the potential to disrupt storage traffic.
There are two types of Zoning identification: port World Wide Name (pWWN) and Domain,Port (D,P). You can assign aliases to both pWWN and D,P identifiers for easier management. The pWWN, the D,P, or a combination of both can be used in a zone configuration or even in a single zone. pWWN identification uses a globally unique identifier built into storage and host interfaces. Interfaces also have node World Wide Names (nWWNs). As their names imply, pWWN refers to the port on the device, while nWWN refers to the overall device. For example, a dual-port HBA has one nWWN and two pWWNs. Always use pWWN identification instead of nWWN, since a pWWN precisely identifies the host or storage that needs to be zoned.
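The management-overhead trade-off between the two zoning practices is simple arithmetic: single-initiator zoning needs one zone per HBA port, while single-initiator-single-target needs one zone per initiator/target pair. A quick sketch (names invented for illustration):

```python
# Zone-count comparison for the two zoning practices described above.
def zone_counts(initiators, targets):
    """Zones required per practice for a fabric of N initiators, M targets."""
    return {
        "single-initiator": initiators,              # one zone per HBA port
        "single-initiator-single-target": initiators * targets,
    }

# Example: 8 host HBA ports zoned to 4 array ports
print(zone_counts(8, 4))
# → {'single-initiator': 8, 'single-initiator-single-target': 32}
```

This is why the more restrictive practice, while preferred, is often relaxed in large fabrics: the zone count grows multiplicatively.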
6. Based on service level requirements utilize VMware technologies, including but not limited to:
Storage I/O Control
Storage I/O Resource Allocation
VMware vSphere provides mechanisms to dynamically allocate storage I/O resources, allowing critical workloads to maintain their performance even during peak load periods when there is contention for I/O resources. This allocation can be performed at the level of the individual host or for an entire datastore. Both methods are described below.
The storage I/O resources available to an ESXi host can be proportionally allocated to the virtual machines running on that host by using the vSphere Client to set disk shares for the virtual machines (select edit, virtual machine settings, choose the Resources tab, select Disk, then change the Shares field).
The maximum storage I/O resources available to each virtual machine can be set using limits. These limits, set in I/O operations per second (IOPS), can be used to provide strict isolation and control on certain workloads. By default, these are set to unlimited. When set to any other value, ESXi enforces the limits even if the underlying datastores are not fully utilized.
An entire datastore’s I/O resources can be proportionally allocated to the virtual machines accessing that datastore using Storage I/O Control (SIOC). When enabled, SIOC evaluates the disk share values set for all virtual machines accessing a datastore and allocates that datastore’s resources accordingly. SIOC can be enabled using the vSphere Client (select a datastore, choose the Configuration tab, click Properties… (at the far right), then under Storage I/O Control add a checkmark to the Enabled box).
With SIOC disabled (the default), all hosts accessing a datastore get an equal portion of that datastore’s resources. Share values determine only how each host’s portion is divided amongst its virtual machines.
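The proportional-share idea behind this can be sketched in a few lines. This is a simplified model (shares divided over a fixed IOPS pool), not how SIOC’s latency-based throttling actually computes entitlements:

```python
# Simplified proportional-share allocation: each VM's slice of a fixed
# IOPS pool is shares / total shares. Illustrative only, not SIOC itself.
def allocate(pool_iops, shares):
    """Split pool_iops across VMs in proportion to their share values."""
    total = sum(shares.values())
    return {vm: pool_iops * s / total for vm, s in shares.items()}

print(allocate(3000, {"vm-critical": 2000, "vm-normal": 1000}))
# → {'vm-critical': 2000.0, 'vm-normal': 1000.0}
```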
Storage Policies – formerly called virtual machine storage profiles; they ensure that virtual machines are placed on storage that guarantees a specific level of capacity, performance, availability, redundancy, and so on. When you define a storage policy, you specify storage requirements for applications that run on virtual machines. After you apply this storage policy to a virtual machine, the virtual machine is placed on a datastore that can satisfy those storage requirements.
Storage vMotion – used for no-downtime datastore maintenance, transitioning to a new array, and datastore load balancing (SDRS).
Storage DRS – a feature that provides I/O load balancing across datastores within a datastore cluster. This load balancing can avoid storage performance bottlenecks or address them if they occur.
7. Determine use case for virtual storage appliances, including the vSphere Storage Appliance.
VSA provides High Availability and automation capabilities of vSphere to any small environment without shared storage hardware. Get business continuity for all your applications, eliminate planned downtime due to server maintenance, and use policies to prioritize resources for your most important applications. VSA enables you to do all this, without shared storage hardware.
Don’t understand why VSAN isn’t mentioned in the Blueprint??
8. Given the functional requirements, size the storage for capacity, availability and performance:
Virtual Storage (Datastores, RDMs, Virtual Disks)
Physical Storage (LUNs, Storage Tiering)
Some of the info below was borrowed from the Brownbag VCAP DCD Study notes PDF.
Take I/O metrics of guests (VDI) and server workloads. Take into account disk type & the write penalty for the RAID type.
- Capacity – consider overhead for snapshots, vswp, and logging
- Availability – multiple HBAs, multipathing, multiple switches
- Performance – enable read/write cache on the SAN; enable CBRC in VDI; *NOTE: disable write cache if not battery-backed*
- Datastores – segregate high I/O traffic on different DSs
- RDMs – needed for SAN based replication & tasks; required for MSCS
- Virtual Disks – recommended; better provisioning capability than RDMs; more portable; functional with all vSphere features
- LUNs – one VMFS datastore per LUN; can have multiple LUNs on a target or 1 per target
Storage Tiering – based on app SLAs (SSD vs SAS vs SATA); thin provisioning
How Large a LUN?
The best way to configure a LUN for a given VMFS volume is to size for throughput first and capacity second.
That is, you should aggregate the total I/O throughput for all applications or virtual machines that might run on a given shared pool of storage; then make sure you have provisioned enough back-end disk spindles (disk array cache) and appropriate storage service to meet the requirements.
This is actually no different from what most system administrators do in a physical environment. It just requires an extra step, to consider when to consolidate a number of workloads onto a single vSphere host or onto a collection of vSphere hosts that are addressing a shared pool of storage.
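That throughput-first aggregation step can be sketched numerically: convert front-end IOPS and read/write mix into back-end IOPS using the RAID write penalty, then divide by per-spindle IOPS. All figures below are illustrative assumptions, not vendor numbers.

```python
import math

# Throughput-first sizing sketch: how many spindles does the aggregated
# workload need once the RAID write penalty is applied? Illustrative only.
def spindles_needed(front_iops, read_pct, write_penalty, iops_per_disk):
    """Back-end IOPS for the mix, rounded up to whole spindles."""
    backend = front_iops * (read_pct + (1 - read_pct) * write_penalty)
    return math.ceil(backend / iops_per_disk)

# Example: 4000 aggregated front-end IOPS, 60% read, RAID5 (penalty 4),
# 180 IOPS per 15k spindle
print(spindles_needed(4000, 0.6, 4, 180))  # → 49
```

Capacity then becomes the secondary check: if 49 spindles of the chosen disk size exceed the space requirement, the design is throughput-bound, which is the common case this section warns about.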
Each storage vendor likely has its own recommendation for the size of a provisioned LUN, so it is best to check with the vendor. However, if the vendor’s stated optimal LUN capacity is backed with a single disk that has little or no storage array write cache, the configuration might result in low performance in a virtual environment. In this case, a better solution might be a smaller LUN striped within the storage array across many physical disks, with some write cache in the array. The RAID protection level also factors into the I/O throughput performance.
Because there is no single correct answer to the question of how large your LUNs should be for a VMFS volume, the more important question to ask is, “How long would it take one to restore the virtual machines on this datastore if it were to fail?”
The recovery time objective (RTO) is now the major consideration when deciding how large to make a VMFS datastore. This equates to how long it would take an administrator to restore all of the virtual machines residing on a single VMFS volume if there were a failure that caused data loss. With the advent of very powerful storage arrays, including Flash storage arrays, the storage performance has become less of a concern. The main concern now is how long it would take to recover from a catastrophic storage failure.
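The RTO question lends itself to a back-of-envelope calculation: datastore size divided by effective restore throughput. The figures below are illustrative assumptions, not vendor numbers.

```python
# Back-of-envelope RTO estimate for the question above: how long would a
# full restore of this datastore take at a given effective throughput?
def restore_hours(datastore_tb, restore_mb_per_sec):
    """Hours to restore datastore_tb terabytes at restore_mb_per_sec MB/s."""
    megabytes = datastore_tb * 1024 * 1024
    return megabytes / restore_mb_per_sec / 3600

# Example: 2 TB datastore, 200 MB/s sustained restore rate
print(round(restore_hours(2, 200), 1))  # → 2.9
```

Run the same arithmetic against the agreed RTO and it caps the sensible VMFS datastore size directly: if the SLA allows 3 hours and restores run at 200 MB/s, datastores much beyond 2 TB break the objective.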
Another important question to ask is, “How does one determine whether a certain datastore is overprovisioned or underprovisioned?”
There are many performance screens and metrics that can be investigated within vCenter to monitor datastore I/O rates and latency. Monitoring these metrics is the best way to determine whether a LUN is properly sized and loaded. Because workload can vary over time, periodic tracking is an important consideration. vSphere Storage DRS, introduced in vSphere 5.0, can also be a useful feature to leverage for load balancing virtual machines across multiple datastores, from both a capacity and a performance perspective.
9. Based on the logical design, select and incorporate an appropriate storage network into the physical design:
- Plan for failures
- Connect the host and storage ports in such a way as to prevent a single point of failure from affecting redundant paths. For example, if you have a dual-attached host and each HBA accesses its storage through a different storage port, do not place both storage ports for the same server on the same Line Card or ASIC.
- Use two power sources.
- Host and storage layout – to reduce the possibility of congestion and maximize ease of management, connect host and storage port pairs to the same switch where possible.
- Use single-initiator zoning – for Open Systems environments, ideally each initiator will be in a zone with a single target. However, due to the significant management overhead that this can impose, single-initiator zones can contain multiple target ports but should never contain more than 16 target ports.