I’m starting a new series this week called “Cloud Support”. In this series, I’m going to look at tackling some of the most common issues that people experience when coming to Azure for the first time. These are the sort of questions I see repeated over and over on things like Microsoft Q&A, ServerFault and so on.
This week we are going to tackle one of the most common issues I see. You’ve created a virtual machine in Azure, you run a service on that machine that you need to access from the outside world, and it’s not working. Your requests time out or return an error. This is a common problem, but it can be hard to resolve given the number of different layers between your user and your application. Let’s have a look at some of the troubleshooting steps you can take to investigate and solve this issue.
Step 1: Is the service actually running?
The first thing to check is that your service is actually running, and it is listening on the port you expect it to be. If the service isn’t running, then no amount of network changes are going to help resolve the issue.
Start by checking whether you can access the service from the virtual machine itself, as this removes the network, firewalls etc. from the process. Depending on what your service is doing and how you access it, you can try to reach it on the VM using a browser, the command line or a tool like Postman. You can also use a tool like Telnet just to see if you can connect to the port your service is running on. If you’re unable to get your service to respond, then it may be that it is not running or has failed to start. At this point, you would want to look at the logging provided by your application, the event logs, the list of services etc. to try and determine why it is not running.
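If Telnet isn't installed, a few lines of Python can do the same basic port check. This is only a minimal sketch: the function name `can_connect` and port 8080 are placeholders of mine, and a successful connection only tells you something is listening on that port, not that the application behind it is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and completes the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures
        return False


if __name__ == "__main__":
    # 8080 is a placeholder; use the port your service should be listening on
    print(can_connect("127.0.0.1", 8080))
```

Run it on the VM against 127.0.0.1 first. The same function can be reused later from an outside machine against the VM's public IP, to compare the local and external results.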
Step 2: Is the service listening on the right port and IP?
If your service appears to be running OK, but you can’t connect to it on the expected port, it may be that it is running on a different port. If you can access it on the VM’s localhost address but not externally, it may also be that it is not binding to the right IP address.
On Windows, you can use Resource Monitor. On the Network tab there is a “Listening Ports” section that shows all the ports in use, what is running on them, and which IP they are bound to.
Alternatively, we can use the netstat command at the CLI to see what port and IP applications are using (the -b option needs an elevated command prompt):
netstat -a -b -o
Using either of these tools, we can check which port our service is using, and which IP it is bound to. Make sure it is using the port you expect, is bound to the right IP to be accessible to the outside world (0.0.0.0 or the VM’s private IP, rather than only 127.0.0.1) and isn’t just using IPv6 if you want IPv4. If it is not correct, you’ll need to look at your application configuration to change it.
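To make reading the "Local Address" column a little less error-prone, here is a small Python sketch that classifies a binding. It assumes numeric addresses in the form netstat prints when you add the -n option (for example 0.0.0.0:8080 or [::]:8080). The function name `describe_binding` is my own invention, not part of any tool.

```python
import ipaddress


def describe_binding(local_address: str) -> str:
    """Classify a numeric 'address:port' string such as '0.0.0.0:8080'."""
    host, _, port = local_address.rpartition(":")
    host = host.strip("[]")     # IPv6 addresses are shown as [::]:8080
    host = host.split("%")[0]   # drop any zone id, e.g. fe80::1%4
    ip = ipaddress.ip_address(host)
    family = "IPv6" if ip.version == 6 else "IPv4"
    if ip.is_unspecified:
        scope = "all interfaces"
    elif ip.is_loopback:
        scope = "loopback only, not reachable from outside the VM"
    else:
        scope = "one specific address"
    return f"port {port}, {family}, {scope}"


if __name__ == "__main__":
    for address in ("0.0.0.0:8080", "127.0.0.1:8080", "[::]:8080"):
        print(address, "->", describe_binding(address))
```

A service reported as "loopback only" will work when tested on the VM but will never answer external requests, whatever the firewall and NSG rules say.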
Step 3: Is there a firewall on the VM blocking access?
If we know the app works locally and is on the right port and IP, then we can start looking at what might be blocking access from the outside. The first thing to look at is whether there is a firewall running on the virtual machine itself. If there is a local firewall, we need to make sure that it allows access to the port from outside of the virtual machine.
On a Windows machine, you would want to open Windows Firewall and look in the “advanced settings” section to check there is an inbound rule that allows access to your service from the network.
Step 4: Network Security Group Rules
If your VM firewall allows traffic in, then the next layer that could be preventing access is the Network Security Group (NSG), Azure’s implementation of the network firewall. By default, an NSG does not allow inbound traffic from the internet, so you will need to add a rule that allows traffic inbound to your service.
A key thing to be aware of is that Network Security Groups can be applied at two levels. They can be applied to a VM’s network interface directly and also to a subnet. If you have an NSG applied to both, then inbound traffic is evaluated against both. The subnet rules get applied first for inbound traffic, so if there is a deny at the subnet level but an allow at the VM, the traffic will be rejected. Traffic has to be allowed at both levels to reach the VM. You can see the result of combining the VM and subnet NSGs by using the “Effective Security Rules” option on the network interface of your VM.
Check that any NSGs you have applied to your VM have an allow rule for the destination port your service is listening on. Also, check that the Source is set to either the IP address you are coming from or to Any, and that Source Port Ranges is set to *, because the client’s source port is normally chosen at random.
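The way the two levels of NSG combine can be easier to follow as code. The Python below is a toy model I've written for illustration, not the Azure API. It makes these simplifying assumptions:

- Rules match on a single source IP or * and a single destination port or *.
- The default rules are reduced to one final deny-all.
- An NSG that isn't attached at a level is modelled as `None`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Rule:
    priority: int    # lower number is evaluated first
    source: str      # "*" or a single IP address (simplified)
    dest_port: str   # "*" or a single port as text (simplified)
    allow: bool


# Simplified stand-in for the default deny-all inbound rule
DENY_ALL = Rule(priority=65500, source="*", dest_port="*", allow=False)


def nsg_allows(rules: List[Rule], source_ip: str, dest_port: int) -> bool:
    """The first matching rule in priority order decides the outcome."""
    for rule in sorted(rules + [DENY_ALL], key=lambda r: r.priority):
        source_ok = rule.source in ("*", source_ip)
        port_ok = rule.dest_port in ("*", str(dest_port))
        if source_ok and port_ok:
            return rule.allow
    return False


def inbound_allowed(subnet_nsg: Optional[List[Rule]],
                    nic_nsg: Optional[List[Rule]],
                    source_ip: str, dest_port: int) -> bool:
    """Inbound traffic must pass the subnet NSG first, then the NIC NSG."""
    for nsg in (subnet_nsg, nic_nsg):
        # None means no NSG is attached at this level, so nothing is filtered
        if nsg is not None and not nsg_allows(nsg, source_ip, dest_port):
            return False
    return True


if __name__ == "__main__":
    subnet = [Rule(100, "*", "443", False)]           # deny at the subnet
    nic = [Rule(100, "203.0.113.10", "443", True)]    # allow at the VM
    print(inbound_allowed(subnet, nic, "203.0.113.10", 443))  # False
    print(inbound_allowed(None, nic, "203.0.113.10", 443))    # True
```

The example addresses come from the ranges reserved for documentation. The real rule set on your VM is still best checked with “Effective Security Rules”, which includes details this sketch leaves out, such as service tags and the default allow rules.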
Be careful to make your NSG rules as restrictive as you can to keep your VM secure. Avoid setting your destination port range to * just to test, as it’s not uncommon to forget to remove it!
Step 5: External Gateways
If your VM has a public IP attached directly then you can skip this step; however if you have another service in front of your VM, you want to check this is also configured to allow traffic inbound to the right port. Services might include:
- Azure Load Balancer
- Application Gateway
- Azure Front Door
- Azure Firewall
- Third-party load balancers, reverse proxies etc.
An excellent way to determine if these services are at fault is to temporarily attach a public IP directly to your VM. If you can access the service through this IP but not the gateway, then it would tend to indicate the gateway is at fault.
Step 6: Local Issues
Could the issue have nothing to do with your service in Azure and be something local on your network? Are you on a corporate network with proxies, packet inspection and so on that could be causing issues? The easiest way to test this is to check your service over a different connection, ideally one that is not locked down by corporate policies. Most people have a mobile phone that can be used to test directly, or that can be tethered to a machine so the machine uses the phone’s connection.
If you’re trying to expose a service to the outside world and it’s not working, follow these steps in order, and hopefully you will find the thing blocking your access.
If you’ve got additional tips for resolving this issue, I’d love to hear them and add them to the article. Please do add them in the comments.