Azure AI Services includes a service that analyzes text content and assigns a sentiment rating. The service is exposed via high-level SDKs for Java, JavaScript, C#, and Python, and it's also available through its underlying REST APIs.
From a Fabric Jupyter Notebook, we can also access Azure AI sentiment analysis by using the pre-installed SynapseML library. In this post we'll use SynapseML to apply sentiment analysis to each row of a Spark DataFrame, scoring the sentiment of user review content.
Video Tutorial Available
The balance of this tutorial-oriented post is also available in video format on YouTube via the embedded video below. The text walk-through continues after the video.
Input Data
Our input data is stored in a Delta Table within a Fabric Data Lake.
This translated data was the output of a previous post on language translation in Microsoft Fabric notebooks.
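As a point of reference, a minimal read of that table into a Spark DataFrame might look like the following; the table name `translated_reviews` is a placeholder for whatever your lakehouse table is called.

```python
# Load the translated review data from the lakehouse Delta table.
# "translated_reviews" is a placeholder table name.
df = spark.read.table("translated_reviews")

# Quick sanity check of the rows we are about to score.
df.show(5, truncate=False)
```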
Fetching Azure AI Key from Key Vault
The sentiment analysis transformer requires a key for the Azure AI service. Secret keys shouldn't appear in notebook source code, so as a best practice we store them in Azure Key Vault.
Microsoft provides the PyTridentTokenLibrary to securely access Key Vault secrets from Jupyter notebooks.
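A minimal sketch of such a cell is shown below. The vault URL and secret name are placeholders, and the call pattern follows Microsoft's documented PyTridentTokenLibrary example; adjust it if your runtime differs.

```python
# Sketch: fetch the Azure AI key from Azure Key Vault inside a Fabric notebook.
# mssparkutils is pre-loaded in Fabric notebooks; the explicit import is optional.
from notebookutils import mssparkutils
from trident_token_library_wrapper import PyTridentTokenLibrary as tl

key_vault_uri = "https://<your-key-vault>.vault.azure.net/"  # placeholder
secret_name = "azure-ai-key"                                 # placeholder

# Exchange the notebook identity for a Key Vault token, then read the secret.
kv_token = mssparkutils.credentials.getToken("keyvault")
access_token = tl.get_secret_with_token(key_vault_uri, secret_name, kv_token)
```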
When this cell completes, we have the Azure AI access key stored in the access_token session variable.
Create and Use a Synapse Transformer
To process the input DataFrame with Azure AI Services, we create a Synapse TextTransformer and configure it according to our input and output requirements:
- On line 6 we specify that the text column will be the source of data for sentiment analysis.
- On line 7 we specify that the sentiment result will be added to the DataFrame in a column called sentiment_result.
- Lines 8 and 9 specify the region where we created our Azure AI service, and the key for that service.
- Line 10 specifies that if errors are encountered when analyzing text for sentiment, those errors should be added to a new column called error.
Line 13 runs the transformer, providing DataFrame df as the transformer's data source.
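The line numbers above refer to the original notebook cell. A rough equivalent of that cell, sketched here with SynapseML's TextSentiment transformer (the exact class and module path can vary between SynapseML versions), looks like this:

```python
# Sketch of the transformer cell, using TextSentiment from SynapseML's
# cognitive services module; the class/module may differ in your version.
from synapse.ml.cognitive import TextSentiment

sentiment = (
    TextSentiment()
    .setTextCol("text")                # read review text from the "text" column
    .setOutputCol("sentiment_result")  # append results in "sentiment_result"
    .setLocation("westeurope")         # region of the Azure AI service (placeholder)
    .setSubscriptionKey(access_token)  # key fetched from Key Vault earlier
    .setErrorCol("error")              # per-row errors land in a new "error" column
)

# Run the transformer with DataFrame df as its data source.
results = sentiment.transform(df)
```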
Review Transformer Output
The output of the transformer is a Spark DataFrame with the sentiment_result column appended.
The sentiment_result column holds a JSON object containing a wealth of information, including confidence scores, the sentiment of sub-sections of the text, and an overall sentiment evaluation.
Note that the sentiment of each sentence has been calculated, as well as an overall sentiment.
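A couple of standard Spark calls are enough to see that structure for yourself; the nested field names follow the Azure AI Language sentiment response.

```python
# Show the nested structure of the appended column (overall sentiment,
# confidence scores, and per-sentence results).
results.select("sentiment_result").printSchema()

# Peek at one row's full result to see the JSON-style content.
results.select("text", "sentiment_result").show(1, truncate=False)
```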
Extract Sentiment
In this example, we'll simply extract the overall sentiment value and add it to a DataFrame along with the original and translated text.
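A sketch of that extraction is below. The document.sentiment field path and the translated_text column name are assumptions, so check the schema printed earlier if your result column is shaped differently.

```python
from pyspark.sql import functions as F

# Keep the original and translated text, plus the overall sentiment label
# pulled from the nested result. Field path and column names are assumptions.
final_df = results.select(
    "text",
    "translated_text",
    F.col("sentiment_result.document.sentiment").alias("sentiment"),
)

final_df.show(5, truncate=False)
```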
Save Result to the Data Lake
Let's save the result to the Data Lake as a Delta Table.
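Something along these lines will do, with the output table name reviews_sentiment being illustrative:

```python
# Persist the enriched DataFrame to the lakehouse as a Delta table.
# The table name "reviews_sentiment" is illustrative.
(
    final_df.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable("reviews_sentiment")
)
```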
And finally, we'll query the saved output table to ensure it was correctly written to the Data Lake.
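For example, a quick aggregate over the new table confirms rows were written and shows the spread of sentiment labels:

```python
# Read the saved table back and count reviews per sentiment label.
spark.sql(
    "SELECT sentiment, COUNT(*) AS reviews "
    "FROM reviews_sentiment "
    "GROUP BY sentiment"
).show()
```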
Code Available
The Jupyter notebook used in this post is available on GitHub via this link.