Cloud-native apps need a lot of monitoring. It’s not that they’re inherently unstable, but concurrency and consistency issues in distributed application development can give rise to bugs that are hard to trace and reproduce, especially when those applications are built on a multitenant platform you can’t control. Things get harder to manage when you’re dealing with autoscaling or having services start on demand using serverless approaches like Azure Functions.
What’s needed is a native monitoring technology that’s baked into the platform along with the tools needed to work with that data, either analyzing logs and metrics or responding to that data automatically. It’s an approach that gives us the elements necessary to build both an observability platform and the levers to turn that into a modern control framework.
The public cloud is an example of where classical control techniques break down. It’s too complex to control by defining the exact output state based on current inputs, so we can’t build a classical governor around our applications. Instead, we need to move to modern control theory approaches where we use the outputs of a service to determine the state of its internal systems, and then use artificial intelligence to keep those systems operating within a set of boundary conditions.
Introducing Azure Monitor
In Azure, that’s the role of Azure Monitor, a tool for collecting, collating, and storing logs and metrics from across your apps and services. Much of what Azure Monitor does is enabled as soon as you turn on a service and add it to a resource group. You can use tools like App Insights to build Azure Monitor support into your own code, use its agents in your virtual infrastructure, and get data from its touchpoints in the Azure platform services. It helps manage what can be a lot of information, especially when you’re running code at global scale.
Data is collected either as near-real-time metrics or as log files, which also include telemetry data. The result is a mix of data that provides point-in-time and time series information. Azure Monitor provides a dashboard where you can view and analyze your data, as well as APIs that allow it to be a source for other applications, such as automations triggered via Logic Apps or Power BI dashboards for management desktops. If you’re working in the Azure Portal, you can use its Log Analytics tool.
Azure Monitor provides the analytics framework that’s used by Azure Application Insights, VM Insights, and Container Insights. These help you extend it into your devops environment, giving you tools for working with your code, with Kubernetes, and with Linux and Windows virtual machines in an infrastructure-as-a-service (IaaS) environment. Cloud applications are heterogeneous, mixing platform as a service and IaaS, platform applications, and your own code, hosted on that platform, in those VMs, or in that container environment. It’s sensible to have one monitoring environment that can bring in data from everywhere, analyze it, and generate appropriate alerts.
It’s possible to use rules to bring different alerts together to help deliver appropriate alerts for your applications based on specific metrics. You can even direct alerts to specific individuals, so database support engineers get database alerts, and infrastructure alerts go to site reliability engineers. Building alerts into your devops model helps keep an application resilient even when automated systems can’t keep it online. Rules can then be used to automate specific operations, for example, autoscaling services when response times degrade or when load crosses preset limits.
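The routing and automation ideas above can be sketched in a few lines of Python. This is an illustration of the logic only, not Azure Monitor’s own API: the team names, the `Alert` type, and the threshold values are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical routing table: alert category -> team that should be notified.
ROUTES = {
    "database": "database-support",
    "infrastructure": "site-reliability",
}
DEFAULT_TEAM = "on-call"  # assumed fallback for categories with no explicit route


@dataclass
class Alert:
    category: str
    metric: str
    value: float


def route(alert: Alert) -> str:
    """Return the team that should receive this alert."""
    return ROUTES.get(alert.category, DEFAULT_TEAM)


def should_scale_out(response_ms: float, load: float,
                     max_response_ms: float = 500.0,
                     max_load: float = 0.8) -> bool:
    """Scale out when responses are too slow or load crosses its preset limit.

    Both limits are illustrative defaults, not values taken from Azure.
    """
    return response_ms > max_response_ms or load > max_load
```

In Azure itself, this kind of logic would normally live in alert rules and autoscale settings rather than in your own code; the sketch only shows the decisions those rules encode.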
You don’t need to do much to enable Azure Monitor for Azure services. It’s enabled automatically whenever you create an Azure resource of any type. These basic features are free, though you do need to pay for additional log file ingestion and storage. Here you can choose pay-as-you-go pricing at $2.99 per GB ingested or select one of several schemes that commit you to a set amount of data per day, from 100GB at $219.20 per day to 5,000GB at $9,016 per day. Committed ingestion is intended for very large sites generating a lot of log data. Once ingested, data is stored for up to 31 days for free, with longer-term storage billed at $0.13 per GB per month.
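To see where a commitment starts to pay off, it helps to work through the figures quoted above. The sketch below uses only those prices and assumes the pay-as-you-go rate is charged per GB ingested; the function names are made up for this example, and current Azure pricing may differ.

```python
PAYG_PER_GB = 2.99  # pay-as-you-go rate quoted above, in dollars per GB (assumed per GB ingested)

# Commitment tiers quoted above: daily volume in GB -> flat price in dollars per day.
COMMITMENT_TIERS = {100: 219.20, 5000: 9016.00}


def effective_rate(gb_per_day: int) -> float:
    """Per-GB price of a commitment tier when the full commitment is used."""
    return round(COMMITMENT_TIERS[gb_per_day] / gb_per_day, 4)


def break_even_gb(gb_per_day: int) -> float:
    """Daily volume above which the tier costs less than pay-as-you-go."""
    return round(COMMITMENT_TIERS[gb_per_day] / PAYG_PER_GB, 1)


for tier in COMMITMENT_TIERS:
    print(f"{tier}GB tier: ${effective_rate(tier)}/GB, "
          f"cheaper than pay-as-you-go above {break_even_gb(tier)}GB per day")
```

On these numbers the 100GB tier works out to about $2.19 per GB and the 5,000GB tier to about $1.80 per GB, so a commitment only makes sense once you reliably ingest roughly 73GB or 3,015GB a day, respectively.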
There are other costs if you need to add custom metrics or if you need to query more than a million times a month. You’ll also pay if you set up more than 10 alerts or exceed a set number of alerts each month. There are also costs if you choose to use automated SMS or voice alerts for on-call engineers.
Working with Azure Monitor
Using the Azure Monitor portal is easy enough. From the Azure Portal, select Monitor to open its web view. The Overview page shows you new services while giving you a jumping-off point to tools for exploring metrics and logs as well as setting up any alerts.
Exploring metrics can give you quick insights into an application. For example, an Azure-hosted web app running in Azure App Services can be examined to see how much memory and CPU it’s using and what response codes are being generated, among a large set of possible metrics. These can be plotted, filtered, and used to build dashboards for your application. You could look to see if there was any connection between CPU usage and failed responses. Graphs can be plotted using any of a set of chart types, from line and bar charts to a grid of values. Once you’ve created a chart you can add it to your application’s dashboard.
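If you export two metric series, checking for a connection between them is a small calculation. Here is a minimal sketch using the Pearson correlation coefficient; the sample values are invented for illustration and are not real Azure Monitor output.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented samples taken at the same six intervals (illustrative only).
cpu_percent = [20, 35, 50, 65, 80, 95]
failed_responses = [0, 1, 1, 3, 6, 9]

print(f"correlation: {pearson(cpu_percent, failed_responses):.2f}")
```

A value close to 1 suggests failures rise with CPU load, but it doesn’t prove that one causes the other; it’s a prompt to look more closely at the logs.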
Similar tools help you work with log files, using Kusto, Microsoft’s at-scale data query language, to explore your logs. Working with Kusto makes sense: it’s designed for quick queries and analysis of big data using a SQL-like query language. It’s a read-only tool, so you don’t have to worry about inexperienced engineers accidentally deleting or editing data. All Kusto can do is process data, ready for use and display. For example, if you know that a problem occurred between two time stamps, you can use Kusto to filter your log data down to anything relevant in that time period. Log data can be exported to Power BI for better visualizations.
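A time-window filter of that kind is short in Kusto. The Python helper below is a sketch that only builds the query text, which you could then paste into Log Analytics. The helper’s name is hypothetical, and the table name used in the example (`AppRequests`) is an assumption; substitute whichever table holds your logs.

```python
from datetime import datetime, timezone


def time_window_query(table: str, start: datetime, end: datetime) -> str:
    """Build a Kusto query that keeps only rows logged between start and end.

    Pass timezone-aware datetimes; naive ones are interpreted as local time.
    """
    if start > end:
        raise ValueError("start must not be later than end")

    def literal(ts: datetime) -> str:
        # Kusto datetime literals accept ISO 8601 text; normalize to UTC first.
        return ts.astimezone(timezone.utc).strftime("datetime(%Y-%m-%dT%H:%M:%SZ)")

    return (
        f"{table}\n"
        f"| where TimeGenerated between ({literal(start)} .. {literal(end)})\n"
        f"| order by TimeGenerated asc"
    )


print(time_window_query(
    "AppRequests",  # assumed table name
    datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
    datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc),
))
```

Because the query only reads data, running it against production logs carries no risk of changing them.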
Generating alerts from metrics
Creating an alert is relatively simple, using a basic workflow to choose the resource to be monitored and then choosing a condition to be used to generate an alert. Maybe you’re using an Azure static website and want to know when it’s updated from GitHub by a Dependabot action. You can create a rule to detect this and then email the appropriate staff engineers to indicate that an automated update has occurred to a site.
Little extra setup is involved. You’re working with the default actions that are set up when you create an Azure resource, so there’s no need to add custom actions for most operations. Microsoft regularly updates the service with new tools and often has tools ready to be used as soon as a service or technology reaches general availability. It instrumented Azure App Services’ .NET 6 support on day zero.
Azure Monitor is very much a tool for your devops and site reliability engineering teams. Along with the metrics you need to track, the built-in analytics tools help you build more complex tools to understand how your application is running. However, this is only part of Azure’s cloud-native application management suite. Once you’ve used Azure Monitor to collect, collate, and process your data, you can use it with other tools. Data can be exported into a security platform to identify possible breaches or into a Cognitive Services–based tool to predict system demands so you can preemptively scale and avoid transient failures.