Streamlining Error Management: Logging Azure Synapse Pipeline Error Messages into Azure Log Analytics

Adityansnair
4 min read · Oct 19, 2023

As a data engineer, managing pipelines and activities that produce reports or move data in Azure Synapse can be challenging. You need alerts that notify the relevant parties when a pipeline fails, but the complexity of the activities in these pipelines often makes it difficult to narrow down which part of the system is producing the errors. In this article, I will show you how to surface your Azure Synapse pipeline/notebook error messages in Azure Log Analytics and use them to send alerts to specific users.

To demonstrate, we will try to ingest a CSV file from a data lake into an Azure SQL DB. The pipeline uses three activities: a Copy Activity, a Set Variable Activity, and a Fail Activity.

Connect the Set Variable Activity as a successor, on failure, to the activity you wish to monitor and log errors for. In this case, it is a Copy Activity. In a complex pipeline, you will have multiple activities such as Get Metadata, Notebook, etc. All of these can be monitored and logged by adding a Set Variable on failure. Connect all the Set Variable activities to the Fail Activity. In the Set Variable, set the Value to the error message of the predecessor activity. Remember that the expression changes with the name of the predecessor activity. In this case, it is @activity('Copy Data').Error.message. Then use the variable as the Fail message under the Fail Activity.
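To make the wiring concrete, here is a minimal sketch of how the two error-handling activities might look in the pipeline's JSON, built as Python dictionaries. The activity names "Set Error Message" and "Fail Pipeline", the variable name "ErrorMessage", and the error code "500" are my own placeholders, and the JSON shape is an assumption based on the usual pipeline definition layout, so compare it with the code view of your own pipeline before relying on it.

```python
import json


def expression(value: str) -> dict:
    """Wrap a string as a dynamic-content expression."""
    return {"value": value, "type": "Expression"}


def build_error_handling_activities(monitored: str, variable_name: str) -> list:
    """Return a Set Variable (on failure) and a Fail activity for one monitored activity."""
    set_variable = {
        "name": "Set Error Message",  # placeholder name
        "type": "SetVariable",
        # Runs only when the monitored activity fails.
        "dependsOn": [
            {"activity": monitored, "dependencyConditions": ["Failed"]}
        ],
        "typeProperties": {
            "variableName": variable_name,
            # Same expression as in the article; it changes with the predecessor's name.
            "value": expression(f"@activity('{monitored}').Error.message"),
        },
    }
    fail = {
        "name": "Fail Pipeline",  # placeholder name
        "type": "Fail",
        "dependsOn": [
            {"activity": set_variable["name"], "dependencyConditions": ["Succeeded"]}
        ],
        "typeProperties": {
            # The variable becomes the Fail message.
            "message": expression(f"@variables('{variable_name}')"),
            "errorCode": "500",  # placeholder error code
        },
    }
    return [set_variable, fail]


if __name__ == "__main__":
    activities = build_error_handling_activities("Copy Data", "ErrorMessage")
    print(json.dumps(activities, indent=2))
```

Running the script prints the two activity definitions, which you can compare side by side with what the Synapse Studio code view shows for your pipeline.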

This way, we can use the same Fail Activity for all activity failures. We can also use a Web Activity to send emails to designated recipients with the help of Logic Apps. At the enterprise level, it’s good to have a view of all the error categories and associated messages so it’s easier for the team to take action. That’s where we use Log Analytics. You will first need to enable diagnostic settings in Azure Synapse for monitoring. Once the diagnostic settings are in place, it can take a couple of hours for the logs to appear. Also note that pipeline runs completed before the diagnostic setup are not reflected.

With the current configuration, the error details from the Fail Activity do not show up in Log Analytics. But there is a way around this: we can make use of the User Properties tab. Unlike other sections, the fields here do not offer a dynamic content option, but we can simply copy the expression from Settings > Fail message.
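As a rough sketch, the User Properties entry on the Fail Activity might look like the following when built in Python. The variable name "ErrorMessage" and the extra static properties are hypothetical, and I am assuming the pasted expression is stored as a plain string that Synapse evaluates at run time, which is what the copy-and-paste approach above relies on.

```python
from typing import Optional


def build_user_properties(variable_name: str, extras: Optional[dict] = None) -> list:
    """Build the userProperties list for the Fail activity.

    The first entry mirrors the Fail message expression; `extras` holds
    optional static name/value pairs (hypothetical examples only).
    """
    properties = [
        # Expression pasted as plain text, copied from Settings > Fail message.
        {"name": "ErrorMessage", "value": f"@variables('{variable_name}')"}
    ]
    for name, value in (extras or {}).items():
        properties.append({"name": name, "value": value})
    return properties


def attach_user_properties(activity: dict, properties: list) -> dict:
    """Return a copy of the activity definition with userProperties set."""
    return {**activity, "userProperties": list(properties)}


if __name__ == "__main__":
    fail_activity = {"name": "Fail Pipeline", "type": "Fail"}  # placeholder activity
    props = build_user_properties("ErrorMessage", {"ErrorCategory": "Ingestion"})
    print(attach_user_properties(fail_activity, props))
```

The property name matters: whatever you call it here is the key you later read from the UserProperties column in Log Analytics.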

In this example I only used Error Message, but we can add more properties such as Error Code and Error Category. This keeps things organised, and the team can always refer to a mapping sheet that gives more information about the codes and categories. In Log Analytics, we can query the SynapseIntegrationActivityRuns table to get the pipeline run details. Here, I customised the KQL to make the output more readable.

SynapseIntegrationActivityRuns
| where ActivityType == 'Fail' and Level == 'Error'
| extend ErrorMessage = tostring(UserProperties.ErrorMessage),
    PipelineLink = strcat('https://web.azuresynapse.net/en/monitoring/pipelineruns/', PipelineRunId, '?workspace=', _ResourceId),
    PipelineRunTime = datetime_utc_to_local(TimeGenerated, 'Australia/Sydney')
| project PipelineName, ActivityName, PipelineRunTime, ErrorMessage, PipelineLink

Once we are happy with the query, we could make use of the Alerting functionality.

Click on + New alert rule, and it should direct you to the Custom log search signal. Add the relevant Dimensions. If you do not see your dimension, it could be that the column used is not of string type. That is why we converted ErrorMessage to a string in the query.

To complete the alert setup, refer to https://learn.microsoft.com/en-us/azure/azure-monitor/alerts/tutorial-log-alert. It will walk you through the rest of the steps.

Generally, we have an ingestion layer, a staging layer, and a business layer, each with its own pipelines. Different stakeholders will be interested in different pipelines. For example, a source team will only be interested in errors found in the files they send to the platform, while a business team will want to be notified about errors in the business layer. We can write custom Kusto queries for different alerts based on the target audience. Simply join the SynapseIntegrationPipelineRuns table (RunId) to SynapseIntegrationActivityRuns (PipelineRunId) and filter by PipelineName to cater to different users.
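One way to keep the per-audience queries consistent is to generate them from a small mapping. The sketch below does that in Python; the team names and pipeline names are hypothetical, and the generated KQL is my own take on the join described above rather than a tested production query. I use distinct on the pipeline runs side on the assumption that a run can appear in more than one row of that table.

```python
# Hypothetical audience-to-pipeline mapping; replace with your own names.
AUDIENCES = {
    "source-team": ["PL_Ingest_Files"],
    "business-team": ["PL_Business_Sales", "PL_Business_Finance"],
}


def kql_string_list(values: list) -> str:
    """Render Python strings as a comma-separated list of KQL string literals."""
    escaped = (v.replace("\\", "\\\\").replace("'", "\\'") for v in values)
    return ", ".join(f"'{v}'" for v in escaped)


def build_audience_query(pipeline_names: list) -> str:
    """Build a KQL query returning Fail activity errors for the given pipelines."""
    if not pipeline_names:
        raise ValueError("at least one pipeline name is required")
    names = kql_string_list(pipeline_names)
    return f"""SynapseIntegrationPipelineRuns
| where PipelineName in ({names})
| distinct RunId, PipelineName
| join kind=inner (
    SynapseIntegrationActivityRuns
    | where ActivityType == 'Fail' and Level == 'Error'
    | extend ErrorMessage = tostring(UserProperties.ErrorMessage)
    | project PipelineRunId, ActivityName, TimeGenerated, ErrorMessage
  ) on $left.RunId == $right.PipelineRunId
| project PipelineName, ActivityName, TimeGenerated, ErrorMessage"""


if __name__ == "__main__":
    for audience, pipelines in AUDIENCES.items():
        print(f"// Alert query for {audience}")
        print(build_audience_query(pipelines))
        print()
```

Each generated query can then be pasted into its own alert rule, with the action group pointing at the matching team.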

In conclusion, by following these steps, you can log the errors generated by the activities across multiple pipelines. Organizing the errors this way goes a long way in helping you understand the context of each problem and gives the team a structure to act on. This solution will help ensure that your team is notified of errors promptly.
