Text Analytics – Dutch Support

Recently, Microsoft added Dutch language support to the Text Analytics API within the Cognitive Services stack. In this blog post I will show how to determine the sentiment of a Dutch newsfeed.

The importance of language support

Most development articles are written in English. Most documentation is written in English. I’m Dutch and even I am blogging in English!

With the rise of chatbots and other dialogue-based UX apps, native language support becomes ever more important. Think of an insurance company wanting to let users talk to their chatbot, or a banking app that lets you choose which mortgage you want for your house. In these situations most companies and clients want to use their native language when communicating with the apps.

With the latest update, the Text Analytics API now partly supports my native language, Dutch (along with several other languages).

Creating the sample project

For this example I’ve created a simple console application using the default template in Visual Studio 2017. The target framework is .NET Framework 4.6.1 and it’s an empty Console Application. The goal of this application is to retrieve the latest newsfeed from Nu.nl, a popular Dutch news site, and determine the sentiment of the Dutch text.

Using the Text Analytics Client library

Most examples use the Microsoft Cognitive Services Text Analytics API Client Library, but this library does not support the current preview languages.

This triggered me to check out the source code, where I found that the new preview languages have not yet been added to the client library. To solve this issue, I’ve forked the library and created a pull request, so that future users can simply use the NuGet package. The code in this blog is based on that forked library. To follow along, check out the repository and add a reference to it in your project.

Create API Keys in Azure

Creating API keys is pretty simple. Go to the Cognitive Services blade, select ‘Add’ and find the Text Analytics (Preview) service.

In comparison to other APIs in the Cognitive Services stack, the Text Analytics API is relatively expensive. Fortunately, there is a Free tier that lets us developers get up and running without high costs:

[Image: Text Analytics pricing tiers in Azure]

The lowest paid tier (above the free tier) is S1, which allows 100,000 calls per month for 150 USD. Every 1,000 calls above that limit cost an additional 1.50 USD.
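To get a feel for what this means for a monthly bill, the arithmetic can be sketched as follows. This is a minimal Python sketch (the sample project itself is C#) that uses only the figures quoted above. The function name is made up for illustration, and billing the overage per started block of 1,000 calls is an assumption, not something the pricing page is known to confirm.

```python
import math

# Figures quoted in this post; actual Azure pricing may have changed since.
S1_INCLUDED_CALLS = 100_000
S1_BASE_PRICE_USD = 150.00
OVERAGE_BLOCK_CALLS = 1_000
OVERAGE_BLOCK_PRICE_USD = 1.50


def estimate_s1_monthly_cost(calls):
    """Estimate the monthly S1 cost in USD for a given number of calls.

    Assumption: every started block of 1,000 extra calls is billed in full.
    """
    extra_calls = max(0, calls - S1_INCLUDED_CALLS)
    extra_blocks = math.ceil(extra_calls / OVERAGE_BLOCK_CALLS)
    return S1_BASE_PRICE_USD + extra_blocks * OVERAGE_BLOCK_PRICE_USD


if __name__ == "__main__":
    for calls in (50_000, 100_000, 125_000):
        print(calls, "calls:", estimate_s1_monthly_cost(calls), "USD")
```

Under these assumptions, 125,000 calls come to 150 USD plus 25 blocks of 1.50 USD, so 187.50 USD in total.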

After getting the appropriate API key(s) you are good to go!

RSS Feed

Retrieving the RSS feed is pretty straightforward; the gist further on in this blog shows the code. Because the text of each feed item is pretty short, the analysis will sometimes be off: the more text you put in, the better the analysis can do its work.

At the moment, the output of the RSS feed is as follows:
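For readers who want to see the idea without opening the gist, here is a minimal Python sketch of reading items from an RSS document (the sample project itself is C#, so this is not the code from the gist). The inline feed is a hypothetical two-item stand-in for the real Nu.nl response, and `parse_feed_items` is a name made up for this illustration. It assumes a plain RSS 2.0 layout with `title` and `description` elements per item.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-item feed, standing in for a real RSS response.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Voorbeeldfeed</title>
    <item>
      <title>Eerste bericht</title>
      <description>Korte samenvatting van het eerste bericht.</description>
    </item>
    <item>
      <title>Tweede bericht</title>
      <description>Korte samenvatting van het tweede bericht.</description>
    </item>
  </channel>
</rss>"""


def parse_feed_items(rss_xml, limit=10):
    """Return up to `limit` feed items as dicts with a title and a summary."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in root.iter("item"):
        if len(items) >= limit:
            break
        items.append({
            # findtext returns None for a missing element, hence the `or ""`.
            "title": (item.findtext("title") or "").strip(),
            "summary": (item.findtext("description") or "").strip(),
        })
    return items


if __name__ == "__main__":
    for entry in parse_feed_items(SAMPLE_RSS):
        print(entry["title"], "|", entry["summary"])
```

In a real run the XML string would come from an HTTP request to the feed instead of the inline sample.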

[Image: output of the RSS feed]

To get a better estimate, I’ve concatenated the title and summary of each item before sending them to the Text Analytics API.
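The concatenation step, and the request body it feeds into, can be sketched like this. It is a minimal Python sketch (the sample project itself is C# and goes through the client library, so this is not the gist's code). It assumes the documents-array request shape of the v2.0 REST API, with a `language`, `id` and `text` per document; check the current API reference before relying on it. `build_sentiment_request` and the sample item are made up for this illustration.

```python
import json


def build_sentiment_request(items, language="nl"):
    """Build a sentiment request body from feed items.

    Each item is a dict with "title" and "summary"; both are joined into one
    text so the service has more to work with than the title alone.
    """
    documents = []
    for index, item in enumerate(items, start=1):
        parts = [item.get("title", ""), item.get("summary", "")]
        text = " ".join(part for part in parts if part)
        documents.append({
            "language": language,  # "nl" is the language code for Dutch
            "id": str(index),      # ids only need to be unique per request
            "text": text,
        })
    return {"documents": documents}


if __name__ == "__main__":
    # Hypothetical item, not taken from the actual feed.
    sample_items = [{"title": "Voorbeeldtitel", "summary": "Een korte samenvatting."}]
    print(json.dumps(build_sentiment_request(sample_items), indent=2))
```

The resulting JSON is what would be posted to the sentiment endpoint together with the API key.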

The actual code

The complete example is shown in the gist below:
https://gist.github.com/prombouts/2c723620cb6ea6afc4d7a3df5dc3c40d.js

End result of the application

Running this application gives the latest 10 results, shown here:

[Image: console output with the latest 10 results]

In this example, two items stand out. The first one concerns an accident involving a child; the second one concerns the Swiss women’s soccer team winning a game at the European Football Championship.
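Spotting such outliers can also be done in code. The sentiment score is a number between 0 and 1, where values close to 0 indicate negative and values close to 1 indicate positive sentiment. Below is a minimal Python sketch (the sample project itself is C#). The titles and scores are hypothetical placeholders, not the results shown above, and `find_standouts` is a name made up for this illustration.

```python
def find_standouts(scored_items):
    """Return the most negative and most positive (title, score) pairs."""
    if not scored_items:
        raise ValueError("scored_items must not be empty")
    most_negative = min(scored_items, key=lambda pair: pair[1])
    most_positive = max(scored_items, key=lambda pair: pair[1])
    return most_negative, most_positive


if __name__ == "__main__":
    # Hypothetical placeholder scores, for illustration only.
    sample = [("Bericht A", 0.52), ("Bericht B", 0.08), ("Bericht C", 0.91)]
    negative, positive = find_standouts(sample)
    print("Most negative:", negative)
    print("Most positive:", positive)
```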

Conclusion

Sentiment Analysis is the first feature to work in Dutch. This powerful tool allows you to analyse anything text-related. In this example I’ve used a newsfeed as input, but I could easily add Twitter to find out how the sentiment around an event, or a company brand, is evolving.

I can’t wait until Key Phrase Extraction and Topic Detection get into preview phase to allow us to use the full power of Text Analytics in my native language.