
Jump Start to “AI” in Power BI Dataflows and Desktop


AI in Power BI Dataflows and Desktop

What is the difference between machine learning and AI (artificial intelligence)?

"If it is written in Python, it's probably machine learning. If it is written in PowerPoint, it's probably AI." (Mat Velloso, former Technical Advisor to the CEO at Microsoft)

As is generally known, Power BI currently supports R and Python scripts as well as R and Python visuals in Power BI Desktop and the Power BI Service (partly in preview). In the Dataflow area, a first version of the so-called Cognitive Services has been made available. These are pre-trained ML models from Microsoft, and they can be called from Power Query as well. The Key Influencers visual was also released recently.

Demystifying Power BI Data Flows which are now generally available

Content:

  1. Cognitive Services in Power BI (Preview)
  2. Cluster Analysis with R Script
  3. Visual Clustering R
  4. Python Visual

1 – Cognitive Services in Power BI (Preview)

Microsoft Docs Getting Started

Detect Language

The language detection function evaluates text input, and for each field, returns the language name and ISO identifier. This function is useful for data columns that collect arbitrary text, where language is unknown. The function expects data in text format as input.

Text Analytics recognizes up to 120 languages.

CognitiveServices.DetectLanguageProductName

Expanded CognitiveServices.DetectLanguageProductName

Filtered rows Supported Languages for Image Tagging

Tag Images

The Tag Images function returns tags based on more than 2,000 recognizable objects, living beings, scenery, and actions. When tags are ambiguous or not common knowledge, the output provides ‘hints’ to clarify the meaning of the tag in context of a known setting. Tags are not organized as a taxonomy and no inheritance hierarchies exist. A collection of content tags forms the foundation for an image ‘description’ displayed as human readable language formatted in complete sentences.

After uploading an image or specifying an image URL, Computer Vision algorithms output tags based on the objects, living beings, and actions identified in the image. Tagging is not limited to the main subject, such as a person in the foreground, but also includes the setting (indoor or outdoor), furniture, tools, plants, animals, accessories, gadgets, and so on.

This function requires an image URL or a base-64 field as input. At this time, image tagging supports English, Spanish, Japanese, Portuguese, and Simplified Chinese. For more information, see Supported languages.

  • The supported language codes for image tagging are en (English), es (Spanish), ja (Japanese), pt (Portuguese), and zh (Simplified Chinese).
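When the images are local files rather than URLs, the base-64 input mentioned above can be prepared before the data reaches Power BI. The following is a minimal Python sketch, not part of the original walkthrough: the helper name `to_base64_field` is hypothetical, and the sample bytes are a stand-in (the 8-byte PNG file signature) rather than a real image.

```python
import base64

def to_base64_field(image_bytes: bytes) -> str:
    """Encode raw image bytes as base-64 text (hypothetical helper, not a Power BI API)."""
    return base64.b64encode(image_bytes).decode("ascii")

# Stand-in input: the PNG file signature instead of a full image file.
sample_bytes = b"\x89PNG\r\n\x1a\n"
encoded = to_base64_field(sample_bytes)
print(encoded)
```

For a real file, the bytes would come from something like `open(path, "rb").read()`, and the resulting text would be stored in the column that is passed to the tagging function.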

CognitiveServices.TagImages

Expanded CognitiveServices.TagImages

CognitiveServices.ExtractKeyPhrases

More:

2 – R Script for Cluster


# 'dataset' holds the input data for this script

# c(108,122): input columns 108 and 122
# 4: number of clusters, based on the elbow chart

# Salt and Sugar Cluster
SS <- kmeans(dataset[,c(108,122)], 4)
mydata <- data.frame(dataset, SS$cluster)

# Energy Cluster (Energy, Fat, Carbo)
ENE <- kmeans(dataset[,c(69,71,107)], 3)
mydata1 <- data.frame(dataset, ENE$cluster)

# Fat Cluster (Fat, Saturated Fat)
FAT <- kmeans(dataset[,c(72,71,106)], 3)
mydata1 <- data.frame(dataset, FAT$cluster)

# Salt and Fat Cluster
SF <- kmeans(dataset[,c(122,71)], 3)
mydata1 <- data.frame(dataset, SF$cluster)
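To make the k-means step above less of a black box, here is a small Python sketch of the idea, using only the standard library. It is an illustration under assumptions, not the script used in this post: it implements plain Lloyd's algorithm, whereas R's `kmeans` uses the Hartigan-Wong algorithm by default. The salt and sugar values are made up, and the sketch uses two clusters instead of four because the toy data only has two groups.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # k distinct rows as starting centers
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Made-up (salt, sugar) values per 100 g; not taken from the original dataset.
points = [(0.1, 1.0), (0.2, 1.5), (0.15, 1.2), (2.0, 30.0), (2.2, 28.0), (1.9, 33.0)]
labels, centers = kmeans(points, k=2)

# Counterpart of data.frame(dataset, SS$cluster): append the label to each row.
rows_with_cluster = [(*p, lab) for p, lab in zip(points, labels)]
for row in rows_with_cluster:
    print(row)
```

The labels here start at 0, while R numbers its clusters from 1. Which group receives which number depends on the random starting centers, so only the grouping itself is meaningful.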

Here we can see the clusters assigned to the raw data.

Result in Power BI:

Tagging Details

Key Ingredients phrase Extraction

Analysis

An elbow chart gives an idea of how many clusters you may need as a good starting point:

# The following code to create a dataframe and remove duplicated rows is always executed and acts as a preamble for your script:
# dataset <- data.frame(product_name, salt_100g, sugars_100g, fat_100g, brands, bad.cluster)
# dataset <- unique(dataset)

# Paste or type your script code here:

# Within-group sum of squares for a single cluster
# (columns 1 to 3 must be numeric)
wss <- (nrow(dataset[,1:3])-1)*sum(apply(dataset[,1:3],2,var))

# Within-group sum of squares for 2 to 15 clusters
for (a in 2:15) wss[a] <- sum(kmeans(dataset[,1:3], centers=a)$withinss)

plot(1:15, wss, type="b", xlab="Number of ALL Clusters", ylab="Within grp. sum of squares")
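The first assignment in the script above fills in the value for a single cluster without calling kmeans at all: (n - 1) times the sum of the column variances equals the sum of squared distances of all rows to the overall mean. The following Python sketch checks that identity on made-up numbers. The rows are illustrative only and do not come from the dataset used in this post.

```python
from math import isclose
from statistics import mean, variance

# Made-up rows (salt, sugar, fat per 100 g); illustrative values only.
rows = [(0.1, 1.0, 3.0), (0.2, 1.5, 2.5), (2.0, 30.0, 10.0), (2.2, 28.0, 12.0)]
columns = list(zip(*rows))
n = len(rows)

# Shortcut from the R script: (n - 1) * sum of the sample variances of the columns.
shortcut = (n - 1) * sum(variance(col) for col in columns)

# Direct definition: squared distance of every row to the overall mean (one cluster).
center = [mean(col) for col in columns]
direct = sum((x - m) ** 2 for row in rows for x, m in zip(row, center))

print(isclose(shortcut, direct))
```

`statistics.variance` is the sample variance with an n - 1 denominator, which matches R's `var`. That is why the factor (n - 1) turns the variances back into sums of squares.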

    

Cluster Results

Salt and Sugar Cluster

Energy Cluster (Energy, Fat, Carbo)

Fat Cluster (Fat , Saturated Fat)

Salt and Fat Cluster

Scatter

3 – Visual Clustering R

library(cluster)

# Ward Hierarchical Clustering
#d <- dist(dataset, method = "euclidean") # distance matrix
#H.fit <- hclust(d, method="ward.D")
#plot(H.fit) # display dendrogram
#groups <- cutree(H.fit, k=5) # cut tree into 5 clusters
# draw dendrogram with red borders around the 5 clusters
#rect.hclust(H.fit, k=5, border="red")

# Dissimilarity matrix using the Manhattan metric
D <- daisy(dataset, metric="manhattan")
# Ward clustering; "ward" was renamed to "ward.D" in R 3.1.0
H.fit <- hclust(D, method="ward.D")
row.names(dataset) <- dataset$Geo
#plot(H.fit, labels=dataset$Geo, hang=-1)
groups <- cutree(H.fit, k=5) # cut tree into 5 clusters
#rect.hclust(H.fit, k=5, border="red")
clusplot(dataset, groups, color=TRUE, lines=0, main="Fat", span=FALSE, labels=2, shade=FALSE)
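The hierarchical clustering above starts from a dissimilarity matrix. For purely numeric columns, the Manhattan metric is the sum of the absolute differences between two rows. Here is a minimal Python sketch of that first step, assuming all columns are numeric. The region names and values are hypothetical and only stand in for the Geo labels.

```python
from itertools import combinations

def manhattan(a, b):
    """Sum of absolute coordinate differences between two rows."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical labelled rows; not taken from the original dataset.
regions = {
    "North": (10.0, 3.0),
    "South": (11.0, 3.5),
    "East": (25.0, 9.0),
    "West": (24.0, 8.0),
}

# Pairwise dissimilarities, one entry per unordered pair of rows.
dissimilarity = {
    (a, b): manhattan(regions[a], regions[b])
    for a, b in combinations(regions, 2)
}

# Agglomerative clustering begins by merging the pair with the smallest dissimilarity.
closest_pair = min(dissimilarity, key=dissimilarity.get)
print(closest_pair, dissimilarity[closest_pair])
```

Later merges depend on the linkage method (Ward in the R script), which this sketch does not implement.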

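4 – Python Visual

# 'dataset' is provided by Power BI
# Swarm plot of energy per 100 g by detected language
import seaborn as sns
import matplotlib.pyplot as plt

sns.swarmplot(x="DetectLanguage.Detected.Language.Name", y="energy_100g", data=dataset)
plt.show()

# Scatter matrix of the numeric columns (a second, separate visual)
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

scatter_matrix(dataset)
plt.show()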
