Setup Storage Replica in Azure

In my last article we discussed the options for providing SMB shares in Azure given the lack of shared storage. One of those options was a new feature of Server 2016, Storage Replica, and in this article we will take a deep dive into how to set it up in Azure. This Windows Server feature allows you to replicate data between two servers (or two clusters) and could be a great solution for replicating shares in Azure, if you can cope with the limitations. As discussed in the previous article, the key strengths of this solution are:

  • Block level replication, so it can deal with open files and only replicate changes
  • Choice of Synchronous or Asynchronous replication
  • Easy to back up

These features make Storage Replica particularly appealing as a replacement for DFS Replication, another common solution for Azure-based shares, which is limited by being file based rather than block based and by supporting only asynchronous replication.

The main disadvantage of this solution is the lack of automated fail-over. Storage Replica is primarily designed for DR, so fail-over is a manually initiated process. This isn’t really a solution for those looking for instant fail-over in the event of an issue, but for those who can stand some delay, or who want to use it for planned maintenance or DR, it will work well.

Storage Replica is quick and easy to set up and use, so if you want to try it to see whether it will work for you, it should be straightforward.

VM Creation

The first step to implementing Storage Replica is to create your VMs. We will look at server to server replication here, not cluster to cluster, so you will need two virtual machines. These VMs don’t need to be identical, although it can make life easier if they are. What each of them does need is two data disks: one for your data and one for your logs. The data and log drives can be different sizes, but the data drive on server 1 must be the same size as the data drive on server 2, and the same applies to the log drives.
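Mismatched disk sizes are an easy mistake to make, so it can be worth comparing the disks on both servers before going any further. The following is a minimal sketch, assuming the placeholder server names server1 and server2, that remote management is enabled, and that the new data disks are still uninitialized (RAW):

```powershell
# Sketch: list the uninitialized disks on both servers so their sizes can be compared.
# 'server1' and 'server2' are placeholder names; adjust for your environment.
$Servers = 'server1', 'server2'

$Servers | ForEach-Object {
    Get-Disk -CimSession $_ |
        Where-Object PartitionStyle -eq 'RAW' |
        Select-Object PSComputerName, Number, FriendlyName,
            @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 2) } }
}
```

Each server should report one pair of disks whose sizes match the pair on the other server.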

The documentation for Storage Replica recommends using SSD for log files, but given Microsoft’s recommendation to use premium storage for all workloads I suggest using premium storage for all drives. The size of the data drive will obviously depend on how much data you wish to store; the log drive size depends more on the number of expected operations. If required you can combine multiple Azure disks into a single volume using software RAID to create large volumes (although with 4TB volumes now available this hopefully won’t be needed). Because you are using premium storage, you will need to choose one of the DS, FS or GS series VMs.

Each VM needs a minimum of 2GB of RAM and 2 cores and must run Server 2016, so choose your VM size accordingly.

If you want to use this as a DR solution, make sure that you create your second node in another region and that connectivity of suitable performance exists between the virtual networks in the two regions.
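As a quick sanity check of that connectivity, something like the following can be run from the first server. This is only a sketch: server2 is a placeholder name, and the port list assumes the SMB (445) and WS-Man (5985) ports that Storage Replica and PowerShell remoting normally rely on. It checks reachability only, not throughput or latency.

```powershell
# Sketch: run on server1 to confirm server2 is reachable on the expected ports.
# 445 = SMB, 5985 = WS-Man (PowerShell remoting). 'server2' is a placeholder.
$Destination = 'server2'

foreach ($port in 445, 5985) {
    Test-NetConnection -ComputerName $Destination -Port $port |
        Select-Object ComputerName, RemotePort, TcpTestSucceeded
}
```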

Disk Setup

Once your VMs are up and running you will need to initialize and format the data disks on each one. These disks must be initialized as GPT, not MBR. They can then be formatted with NTFS, and I would recommend using the same drive letters and labels on each server for simplicity.
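If you prefer to script this step, the sketch below shows one way to do it. It assumes the two new data disks are the only uninitialized disks on the VM, and the drive letters (F and G, matching the commands later in this article) and the labels are illustrative choices. Check the disk numbers and sizes first, since the order in which Azure presents the disks decides which one becomes the data drive.

```powershell
# Sketch: initialize every RAW disk as GPT and format it with NTFS.
# Assumption: the only RAW disks are the new data and log disks.
$letters = 'F', 'G'          # F = data, G = logs (illustrative)
$labels  = 'Data', 'Logs'    # illustrative labels

$rawDisks = @(Get-Disk | Where-Object PartitionStyle -eq 'RAW' | Sort-Object Number)

for ($i = 0; $i -lt $rawDisks.Count -and $i -lt $letters.Count; $i++) {
    $rawDisks[$i] |
        Initialize-Disk -PartitionStyle GPT -PassThru |
        New-Partition -DriveLetter $letters[$i] -UseMaximumSize |
        Format-Volume -FileSystem NTFS -NewFileSystemLabel $labels[$i] -Confirm:$false
}
```

Run the same script on both servers so the letters and labels line up.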

Install Storage Replica Feature

The Storage Replica feature needs to be installed on both servers. You can do this through Server Manager (Add Roles and Features) or through PowerShell. The following can be run on any machine on the same network, assuming you have not disabled remote PowerShell (and changing the server names to match your own). Note that it will restart each server if the installation requires it.

$Servers = 'server1','server2'
$Servers | ForEach-Object { Install-WindowsFeature -ComputerName $_ -Name Storage-Replica,FS-FileServer -IncludeManagementTools -Restart }
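Once the servers are back up, you may want to confirm that the features actually installed. A minimal sketch, using the same placeholder server names:

```powershell
# Sketch: report the install state of the required features on each server.
$Servers = 'server1', 'server2'   # placeholder names

foreach ($server in $Servers) {
    Get-WindowsFeature -ComputerName $server -Name Storage-Replica, FS-FileServer |
        Select-Object @{ Name = 'Server'; Expression = { $server } }, Name, InstallState
}
```

Both features should show an InstallState of Installed on both servers.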

Test Topology

The Storage Replica feature comes with a command to test your intended replication topology to make sure it is fit for purpose. You can run this either in a requirements-only mode, for a quick test that you meet the minimum requirements, or over an extended period while you throw some expected load at it to see how it would hold up.

The following command will run for 30 minutes, during which time you can send some test data to server1 to test how well it holds up. Make sure the volume of data is representative of your normal workload.

 Test-SRTopology -SourceComputerName server1 -SourceVolumeName f: -SourceLogVolumeName g: -DestinationComputerName server2 -DestinationVolumeName f: -DestinationLogVolumeName g: -DurationInMinutes 30 -ResultPath c:\temp

Once complete, the test will generate an HTML report in C:\temp. Check this to see whether there are any issues and how well the topology performed.
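For the quick requirements-only check, the same cmdlet can be run without the performance tests. The sketch below assumes the -IgnorePerfTests switch, which to the best of my knowledge is how Test-SRTopology skips the load measurement. Check Get-Help Test-SRTopology on your build before relying on it.

```powershell
# Sketch: requirements-only run. -IgnorePerfTests is assumed to skip the
# performance measurement; server names and drive letters are placeholders.
Test-SRTopology -SourceComputerName server1 -SourceVolumeName f: -SourceLogVolumeName g: `
    -DestinationComputerName server2 -DestinationVolumeName f: -DestinationLogVolumeName g: `
    -DurationInMinutes 1 -ResultPath c:\temp -IgnorePerfTests
```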

Enable Replication

Once you have completed the tests you’re ready to enable replication between the two servers. This is done with a simple PowerShell command (there is no GUI option):

New-SRPartnership -SourceComputerName server1 -SourceRGName rg01 -SourceVolumeName f: -SourceLogVolumeName g: -DestinationComputerName server2 -DestinationRGName rg02 -DestinationVolumeName f: -DestinationLogVolumeName g:

This will create a synchronous replication partnership between server 1 and server 2, which is the default. If you wish to use asynchronous replication you can add the -ReplicationMode parameter with a value of Asynchronous or 2:

New-SRPartnership -SourceComputerName server1 -SourceRGName rg01 -SourceVolumeName f: -SourceLogVolumeName g: -DestinationComputerName server2 -DestinationRGName rg02 -DestinationVolumeName f: -DestinationLogVolumeName g: -replicationMode Asynchronous

There are additional options that can be enabled during partnership creation that we won’t discuss here. For more details, see the documentation for the New-SRPartnership command.

Once the command completes it will have set up the partnership and started initial replication. Note that you cannot fail-over to the second node until initial replication has completed. If the drive holds little data this will be quick, but with a lot of data it can take some time.

To view the status of the replication you can run the Get-SRGroup command, which shows the status of the replication group. You are looking for the ReplicationStatus to show “ContinuouslyReplicating”. If it shows “InitialBlockCopy” then it is still in its initial sync.
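Rather than re-running the command by hand, you could poll it. The following is a sketch only: it assumes the placeholder names server2 and rg02 used in this article and a single replicated volume in the group, and it gives up after ten minutes rather than looping forever.

```powershell
# Sketch: poll the destination replication group every 10 seconds, up to 60 times.
# 'server2' and 'rg02' are the placeholder names used in this article.
for ($attempt = 1; $attempt -le 60; $attempt++) {
    $replica = (Get-SRGroup -ComputerName server2 -Name rg02).Replicas
    $replica | Select-Object DataVolume, ReplicationStatus, NumOfBytesRemaining

    # Compare as text so the check works however the status value is typed.
    if ("$($replica.ReplicationStatus)" -eq 'ContinuouslyReplicating') { break }
    Start-Sleep -Seconds 10
}
```

NumOfBytesRemaining gives a rough idea of how much of the initial copy is still outstanding.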

You can check the event log for any errors relating to storage replica during the initial sync.
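The Storage Replica events can also be pulled with PowerShell rather than through Event Viewer. A minimal sketch, with server1 as a placeholder name:

```powershell
# Sketch: show the 20 most recent Storage Replica events from server1.
Get-WinEvent -ComputerName server1 -ProviderName Microsoft-Windows-StorageReplica -MaxEvents 20 |
    Format-List TimeCreated, Id, LevelDisplayName, Message
```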

Once you enable the partnership, you will notice that your two drives are no longer visible in Explorer on server 2. This is expected, as these drives can only be written to on the active server.

Fail-Over

Once your initial replication is complete you are able to fail-over from server 1 to server 2. When you do this you are making server 2 the primary in the partnership and switching replication to go the other way. To do this, run the following PowerShell:

Set-SRPartnership -NewSourceComputerName server2 -SourceRGName rg02 -DestinationComputerName server1 -DestinationRGName rg01

Once this command completes you will see that the data and log drives are no longer available on server 1 and are available on server 2 with all your data.

To go back to server 1 being primary, just reverse the command:

Set-SRPartnership -NewSourceComputerName server1 -SourceRGName rg01 -DestinationComputerName server2 -DestinationRGName rg02
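After switching in either direction, you can confirm which way replication is now flowing by querying the partnership. A minimal sketch, using the placeholder server name from above:

```powershell
# Sketch: show the current source and destination of the partnership.
Get-SRPartnership -ComputerName server1 |
    Select-Object SourceComputerName, SourceRGName, DestinationComputerName, DestinationRGName
```

The server listed as SourceComputerName is the one currently holding the writable volumes.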

 

Creating Shares

We now have Storage Replica set up and replicating, but we don’t yet have any shares. You can go ahead and create folders and share them as required on server1. Once that is done you can access them as normal using \\server1. However, these shares are only configured on server1; when we fail-over to server2 you will notice that while all your data is there, it is no longer shared. During your initial setup you will need to fail-over to server2 and set up your shares a second time on that machine. These can be accessed using \\server2 whilst that server is active.
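Scripting the share creation makes it easier to repeat the same steps on the second server. The sketch below is illustrative only: the folder path, share name and the CONTOSO groups are hypothetical and should be replaced with your own.

```powershell
# Sketch: create a folder on the replicated data drive and share it.
# Run on whichever server is currently active. All names are hypothetical.
$Path = 'F:\Data'

New-Item -ItemType Directory -Path $Path -Force | Out-Null
New-SmbShare -Name 'Data' -Path $Path `
    -FullAccess 'CONTOSO\File Admins' `
    -ChangeAccess 'CONTOSO\Staff'
```

Because the folder itself is replicated with the volume, on the second server New-Item simply finds it already present and only the share definition is new.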

Accessing Shares

You will note from the previous example that you need to access the active node directly by its UNC path, so for a share called “Data” you would go to \\server1\data when server1 is active and \\server2\data when server2 is active. This can be confusing for users, and would break any application that has the server path hard coded. Client caching can also add to this confusion by displaying the share on the secondary node even when it is not available.

To work around this issue you can do one of three things:

  1. If your share is only accessed by applications then you can build your applications to support a list of servers and to try all servers to establish which is the active one
  2. Set up a DNS CNAME for your file server, then when you fail-over update your CNAME to point to the right server
  3. Use DFS namespaces to create a single namespace and add the primary server as a target. In the event of fail-over update the target to point to the new primary
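As an illustration of the second option, the CNAME swap can be scripted as part of your fail-over procedure. This is a sketch under several assumptions: a Windows DNS server with the DnsServer PowerShell module available, and a DNS server name, zone and alias (dc01, contoso.local, files) that are all hypothetical.

```powershell
# Sketch: repoint a file server alias at the newly active node.
# All names below are hypothetical; replace them with your own.
$DnsServer = 'dc01'
$Zone      = 'contoso.local'
$Alias     = 'files'
$NewTarget = 'server2.contoso.local'

# Remove the existing alias (if any), then recreate it pointing at the new primary.
Remove-DnsServerResourceRecord -ComputerName $DnsServer -ZoneName $Zone `
    -RRType CName -Name $Alias -Force -ErrorAction SilentlyContinue

Add-DnsServerResourceRecordCName -ComputerName $DnsServer -ZoneName $Zone `
    -Name $Alias -HostNameAlias $NewTarget -TimeToLive ([TimeSpan]::FromMinutes(5))
```

A short TTL, such as the five minutes used here, limits how long clients keep resolving the alias to the old server after a fail-over.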

We will look at setting up DFS namespaces with Storage Replica in a future article.

You now have a simple server to server Storage Replica setup between two Azure nodes. For maintenance, or in the event of a disaster, you can switch between nodes as required and be sure your data is replicated (assuming you stick with synchronous replication). Should you wish to look at some of the more advanced features of Storage Replica, such as encryption and seeding, take a look at the Storage Replica documentation and PowerShell commands.