Azure HPC Cache
What is Azure HPC Cache?
Azure HPC Cache is an Azure service that provides low-latency file access for high-performance computing (HPC) workloads running in Azure. It accelerates access to files in these workloads and presents them through an aggregated namespace. Use Azure HPC Cache to improve productivity through lower latency while reducing infrastructure management overhead. The service also works for workflows where your data is stored across WAN links, such as on network-attached storage (NAS) in your local datacenter.
Creating an Azure HPC Cache in the Azure portal is straightforward, and you can monitor the cache from the same portal. Because Azure HPC Cache presents an aggregated namespace, client access stays simple even if you change the back-end storage target.
An Azure HPC Cache resides in a single region. It can access data stored in other regions if you connect it to Blob containers located there. In this blog, we briefly describe the steps involved in creating an HPC cache in the Azure portal.
Creating an Azure HPC Cache in the Azure portal
Log in to the Azure portal with valid credentials and an active subscription. Choose All services → Storage → HPC caches. Because this is the first Azure HPC Cache in the portal, the list shows “No HPC caches to display”. Click the “Create HPC cache” button. The screen below shows this step.
Creating an Azure HPC Cache in the Azure portal involves five steps: Basic, Cache, Disk Encryption Keys, Tags, and Review & Create.
In the first step, Basic, provide the Subscription and Resource group under the Project details section. Then fill in the Service details section:
- Provide a name for the HPC Cache resource
- Choose the region where this Azure resource should be created
- Select the virtual network and subnet to use for the HPC Cache. You can create a new virtual network and subnet or use an existing one. In our case, we are creating a new virtual network called “VembuDemoHPCCacheVN” with the default subnet 10.0.0.0/24 for learning purposes, as shown in the image below
- Click Next to configure Cache
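Before settling on a subnet, it can help to check how many addresses it actually offers. The following is a minimal sketch that uses Python's standard ipaddress module on the 10.0.0.0/24 example above. The five reserved addresses reflect how Azure handles every subnet. The 64-address minimum is an assumption based on our reading of the HPC Cache prerequisites, so verify it against the current documentation. The function name usable_addresses is our own.

```python
import ipaddress

# Azure reserves five addresses in every subnet.
AZURE_RESERVED_PER_SUBNET = 5
# Assumption: minimum number of addresses HPC Cache expects in its subnet.
ASSUMED_MIN_FOR_HPC_CACHE = 64


def usable_addresses(cidr: str) -> int:
    """Return how many addresses remain after Azure's reserved ones."""
    network = ipaddress.ip_network(cidr, strict=True)
    return max(network.num_addresses - AZURE_RESERVED_PER_SUBNET, 0)


if __name__ == "__main__":
    count = usable_addresses("10.0.0.0/24")
    print(f"usable addresses: {count}")
    print(f"meets assumed minimum: {count >= ASSUMED_MIN_FOR_HPC_CACHE}")
```

A /24 subnet leaves plenty of room. Smaller subnets are worth checking this way before you create the cache.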
In the second step, Cache, choose the maximum throughput and the cache size.
- Maximum throughput – Choose the maximum data transfer rate for the cache, in gigabytes per second. The available rates are up to 2 GB/s, 4 GB/s, and 8 GB/s; choose whichever matches your required data transfer rate
- Cache size – Select the total storage size of the cache, from 3 TB to 48 TB. Note: this value cannot be changed after creation. If you increase the maximum throughput, the minimum cache size also increases. In our case, we are choosing a maximum throughput of up to 2 GB/s and a cache size of 3 TB, as shown in the image below
- Click Next to configure Disk encryption keys
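Because the cache size cannot be changed later, it is worth checking the throughput and size combination before you click through. The sketch below encodes one possible mapping of throughput to selectable sizes. The exact size options per throughput tier are an assumption on our part; they are consistent with the 3 TB to 48 TB range and the rising minimum described above, but the portal is the authority. All names here are hypothetical.

```python
# Assumed mapping: maximum throughput (GB/s) -> selectable cache sizes (TB).
# Confirm these values in the portal before relying on them.
ASSUMED_CACHE_SIZES_TB = {
    2: (3, 6, 12),
    4: (6, 12, 24),
    8: (12, 24, 48),
}


def is_valid_combination(throughput_gbps: int, cache_size_tb: int) -> bool:
    """Return True if the size is offered for the given throughput tier."""
    return cache_size_tb in ASSUMED_CACHE_SIZES_TB.get(throughput_gbps, ())


def minimum_cache_size_tb(throughput_gbps: int) -> int:
    """Return the smallest cache size offered for a throughput tier."""
    return min(ASSUMED_CACHE_SIZES_TB[throughput_gbps])


if __name__ == "__main__":
    # The combination chosen in this walkthrough: up to 2 GB/s with 3 TB.
    print(is_valid_combination(2, 3))
    print(minimum_cache_size_tb(8))
```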
In the third step, Disk Encryption Keys, define the encryption key to use. Azure HPC Cache data is always encrypted.
- Use this section to specify the type of encryption key. You can choose Microsoft-managed encryption keys (the default) or customer-managed keys that are stored in Azure Key Vault
- You cannot change a cache from customer-managed keys back to Microsoft-managed keys. For this walkthrough, we recommend Microsoft-managed keys, and that is what we are choosing. The image below depicts this configuration step
- Click Next to configure Tags
In the fourth step, Tags, you may provide tag names and values. Tags are name/value pairs that let you categorize resources and view consolidated billing by applying the same tag to multiple resources and resource groups.
- You may also skip this configuration if you don’t require it
- Click Next to Review and Create
In the final step, Review & Create, Azure validates the configuration you entered in the previous steps and reports “Validation passed” if everything is in order.
- This page also provides an estimated cost for the HPC Cache instance with the chosen maximum throughput and cache size, as shown in the image below
- Click Create to create the new HPC Cache
You can follow the deployment progress from the notification icon at the top of the portal. After approximately 20 minutes, your new HPC Cache is ready, along with its newly created virtual network and subnet.
Click Go to resource to get more details about the newly created VembuDemoHPCCache.
Monitoring Azure HPC Cache
Once your Azure HPC Cache is ready and has moved into production, you can monitor its activity from the Overview tab of the cache's home page, as shown in the image below.
The graphs show metrics over intervals ranging from one hour to 30 days. The available metrics are:
- Cache throughput
- Cache operations/second
- Cache client-facing latency
Important configuration settings related to Azure HPC Cache:
Here you can set:
- MTU size – Either 1500 or 1400. It denotes the largest packet or frame size, specified in octets (eight-bit bytes), that can be sent in a packet- or frame-based network such as the internet
- NTP Server – By default, Microsoft’s NTP server is used. To use a different NTP server, specify its IP address or FQDN
- DNS Configuration – You can configure DNS settings to use for this Azure HPC Cache by providing a DNS search domain and its DNS server for name resolution
Mount instructions – Use this page to construct an NFS mount command to connect client machines to this Azure HPC Cache. The page lists the client prerequisites for mount access along with the mount details. In the mount details section, you can set the Client path, Cache mount address, and Virtual namespace path. Based on these values, the page shows the complete mount command to run on any connecting machine.
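To illustrate how the three mount details combine into one command, here is a minimal sketch in Python. The mount options, the IP address 10.0.0.28, the namespace path /demo, and the client path /mnt/hpccache are all assumptions for illustration. The command shown in your own portal page is the one to use.

```python
import shlex

# Assumed NFS mount options; copy the real ones from the portal page.
ASSUMED_MOUNT_OPTIONS = "hard,proto=tcp,mountproto=tcp,retry=30"


def build_mount_command(cache_mount_address: str,
                        namespace_path: str,
                        client_path: str,
                        options: str = ASSUMED_MOUNT_OPTIONS) -> str:
    """Assemble an NFS mount command from the three mount details."""
    if not namespace_path.startswith("/"):
        namespace_path = "/" + namespace_path
    source = f"{cache_mount_address}:{namespace_path}"
    parts = ["sudo", "mount", "-o", options, source, client_path]
    # Quote each part so unusual paths stay safe in a shell.
    return " ".join(shlex.quote(part) for part in parts)


if __name__ == "__main__":
    # Hypothetical values standing in for the ones on the portal page.
    print(build_mount_command("10.0.0.28", "/demo", "/mnt/hpccache"))
```

The client path must already exist on the connecting machine before the mount command is run.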
Storage Targets – Here you can add a storage target to this running Azure HPC Cache by providing the storage target name, target type, host name, and usage model.
Note the following before adding a storage target to Azure HPC Cache. To create a storage target successfully, the back-end storage system and its network must be configured to allow access from the Azure HPC Cache:
- The storage system must allow the cache to list exports
- The storage system must permit the cache to access exports as root (UID 0)
- Firewalls between the cache subnet and the data storage system must allow traffic on several ports
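A quick way to test the firewall requirement is to probe the storage system's ports from a machine in the cache subnet. The sketch below is a simple TCP reachability check using Python's standard socket module. The port list (111 for rpcbind and 2049 for NFS) is an assumption; the other ports an NFS system uses vary by vendor, so extend the list for your storage system. The host name in the usage example is hypothetical.

```python
import socket

# Assumed ports: rpcbind (111) and NFS (2049). Other required ports
# depend on the storage system and must be added by the reader.
ASSUMED_NFS_PORTS = (111, 2049)


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_ports(host: str, ports=ASSUMED_NFS_PORTS,
                timeout: float = 3.0) -> dict:
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}


if __name__ == "__main__":
    # "nas.example.internal" is a placeholder for your storage system.
    for port, ok in check_ports("nas.example.internal").items():
        print(f"port {port}: {'open' if ok else 'blocked or closed'}")
```

This only confirms TCP reachability. It does not verify that the storage system lists exports or permits root access, which still need to be checked on the storage side.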
Namespace – Set the virtual namespace paths that clients use to access data from the storage targets.
Client access policies – You can add or edit policies to customize how your storage target exports are protected. You define policies on this page and apply them to storage targets on the Namespace page. By configuring storage targets, the namespace, and client access policies together, you can connect client machines to the Azure HPC Cache securely.
Locks – On this page, you can add a lock to protect the cache resource from accidental changes. Create a lock by providing a name and a lock type, either “Read-only” or “Delete”. A Read-only lock prevents modification of the resource, and a Delete lock prevents it from being deleted.
The main advantage of having Azure HPC Cache in your hybrid cloud is that it reduces latency between Azure and on-premises storage. This matters most for applications whose data is tethered to existing datacenter infrastructure because of dataset sizes and operational scale.
Azure HPC Cache works by automatically caching active data in Azure, whether that data is located on-premises or in Azure, effectively hiding the latency to on-premises network-attached storage environments. It is an ideal solution for cloud-bursting applications and hybrid NAS environments.