Configure BI Optimizer for Power BI (Beta)

BETA FEATURE: BI Optimizer is a beta feature. Functionality, configuration steps, and supported options are subject to change as the feature evolves.

We recommend reading BI Optimizer Requirements Overview before setting up BI Optimizer. It outlines the required prerequisites, roles, and configuration steps in both Azure and Metric Insights to help ensure the setup is completed correctly and without issues.

IMPORTANT:

Creating Power BI Groups to Users Dataset (Step 5) requires the Microsoft Graph Group.Read.All and User.ReadBasic.All permissions:

  • for Password authentication – under Delegated permissions.
  • for OAuth or Service Principal authentication – under Application permissions.

This article details how to configure the BI Optimizer App for Power BI in Metric Insights.

PREREQUISITES:

  1. Start depending on your needs:
  2. The following files requested from Metric Insights support:

IMPORTANT: If BI Optimizer is already configured for another Data Source, complete Steps 1-6, and Step 10.1, then run the Report Cleanup Custom Script.

1. Set DEFAULT_DATASET_TEXT_COLUMN_SIZE System Variable to "4100"

Access Admin > System > System Variables

DEFAULT_DATASET_TEXT_COLUMN_SIZE: Set the System Variable to "4100".

2. Sync Metadata for the Power BI Data Source

Access Admin > Collection & Storage > Data Sources

  1. Sync Metadata under the Metadata tab for the Power BI Data Source that is to be optimized.
  2. IMPORTANT: Note the Data Source ID for use in Step 10.1.

3. Configure Notification Schedule

Access Admin > Distribution > Notification Schedules

Configure Notification Schedule.

NOTE: The frequency must be set to at least once per day, but it is recommended to configure it to run every two hours.

4. Configure External Application

Access Admin > System > External Application

Configure External Application for Custom Scripts.

IMPORTANT: Make sure "Application Can Manage Users" is enabled.

5. Create Power BI Datasets for Assets

  1. Access +New > Dataset > Create New
  2. Under Data Source, select the name of the Power BI Data Source to be optimized

NOTE: "USAGE, USER_PERMISSIONS" must be listed under the Scope: Types to collect Optional Parameter for the Power BI Data Source for the following objects to be collected.

Create the following Datasets:
It is recommended to use the same naming convention to make the instructions easier to follow.

  1. Power BI Usage Dataset from System/Usage report;
    NOTE:
    • Use the following statement in the MDX / DAX query Editor:
param usage_date_to_collect = :measurement_time
fields = UserId, ReportId, ReportName, WorkSpaceName
filter = ReportId is not null
aggregates = count(*),max(CreationTime)
  • Enable Snapshot Dataset? (keep history) for this Dataset;
  • Make sure Include current Day is disabled;
  • Make sure the DEFAULT_DATASET_TEXT_COLUMN_SIZE System Variable is set to "4100" (see Step 1 above).
  • You can collect the last 30 days historical instances. This is useful for BI Ops calculations.
  • IMPORTANT: Note the Dataset ID to be used in Step 6.1 and Step 6.2.

2. Power BI User Permissions Dataset from System/Users Permissions report. This Dataset maps Power BI Objects to users and their respective access rights.

3. Power BI Groups to Users Dataset from System/Groups to Users report. This Dataset maps all users to their Power BI groups. It must include Group ID and User ID columns.

For more details on how to create Power BI Datasets, refer to Create a Dataset from Microsoft Power BI Cloud.

NOTE: If BI Optimizer is already running for another Data Source, proceed to Step 10.1.

6. Create Derived Datasets from Power BI Usage Dataset:

6.1. Usage by Month Dataset

  1. Select the storage you used to save Power BI Usage Dataset in Step 5 as Dataset Source.
  2. Use the following Query:
    • where dataset_XXX is Power BI Usage Dataset.
SELECT 
CEIL(DATEDIFF(NOW(), ds.measurement_time) / 30) - 1 as bucket_no,
  ds.userid,
  ds.reportid,
  ds.reportname,
  ds.workspacename,
  sum(ds.count___) as viewcount,
  MAX(ds.max_creationtime_) as lastviewed
FROM dataset_XXX ds
GROUP BY 
  bucket_no,
  ds.userid,
  ds.reportid,
  ds.reportname,
  ds.workspacename
order by lastviewed desc

6.2. Power BI Last Viewed by Username Dataset

  1. Select the storage you used to save Power BI Usage Dataset in Step 5 as Dataset Source.
  2. Use the following Query:
    • where dataset_XXX is Power BI Usage Dataset.
SELECT ds.creationtime as last_viewed_time, 
       ds.userid as username, 
       ds.reportid as workbook_guid 
FROM dataset_XXX ds
JOIN (SELECT MAX(max_creationtime_) as mv, reportid FROM dataset_XXX  GROUP BY reportid) lvt ON lvt.mv = ds.max_creationtime_ AND lvt.reportid = ds.reportid 
WHERE ds.reportid IS NOT NULL
GROUP BY ds.reportid;

7. Create Empty CSV Datasets for App Template Entities and "Why Similar Storage" Variable

Access +New > Dataset > CSV Dataset

  1. Provide a descriptive Name. It is recommended to use the same naming convention to make the instructions easier to follow.
    • Datasets to be created for App Template Entities:
      • Report Cleanup access storage – for the accessData Entity;
      • Report Cleanup excluded storage – for the excludedRollups Entity;
      • Report Cleanup group data storage – for the groupToUserData Entity;
      • Report Cleanup processor storage – for the processorData Entity;
      • Report Cleanup unused storage – for the unusedInfo Entity.
    • Datasets to be created for Why Similar Storage variable:
      • Report Cleanup Criteria – for the criteriaDatasetId property;
      • Report Cleanup Criteria Hash – for the reportHashDatasetId property;
      • Report Cleanup Multipliers – for the multipliersDatasetId property.
        • IMPORTANT: Note the Dataset IDs required in Step 10.1.
  2. Choose a Category for the Dataset.
  3. [Save]

8. Create CSV Dataset for Custom Script

Access +New > Dataset > CSV Dataset

  1. Create a CSV Dataset by uploading a CSV file with one column, reportName. This Dataset is only needed if you want to exclude specific Reports from the optimization. If you use it, make sure to include at least one Report Name.
    • As a Dataset Name, it is recommended to use Report Cleanup Excluded names to make the instructions easier to follow.
    • NOTE: Report names are not case sensitive. For example, "sales report" will exclude both "sales report" and "Sales Report".
    • IMPORTANT: Note the Dataset ID for use in Step 12.2. If you don't need to exclude any Reports, the Parameter can be left blank.

For more details on how to create CSV Datasets, refer to Create a Dataset from CSV File.

9. Create App Template

Access Content > Apps > Templates tab > [+New Template]

  1. Provide a descriptive Name such as "Report Cleanup".
  2. Name used in URL must be set to "report-cleanup".
  3. Display without Metric Insights navigation bar must be checked.

9.1. Upload Assets

  1. From the Assets tab, upload the Assets requested from Metric Insights Support as part of the prerequisites.
  2. The uploaded Assets will appear as a list.

10. Create App Based on this Template

Access Content > Apps > [+New App]

  1. Provide a descriptive Name.
  2. Choose the Template created at Step 9.
  3. [Save]
    • IMPORTANT: Note the App ID to be used in Steps 11.2 and 12.2. See below.

10.1. Add Variables

Access the Content tab

  1. [+Add Item], then fill in the following fields where needed:
PropertyDescriptionValue
connectionIdThe ID of the Power BI data source to be optimized 
datasetIdPBI Usage by Month (see Step 6.1) 
workbookLuidKeyColumn name for the Workbook LUID key from the datasetID (above)default: ReportId
rollupNameKeyColumn name for the Rollup name key from the datasetID (above)default: WorkSpaceName
reportNameKeyColumn name for the Report name key from the datasetID (above)default: ReportName
accessListDatasetIdPower BI User Permissions Dataset (see Step 5) 
accessListKeysMapping

Mapping of the Access List Dataset keys to the connection report metadata Dataset keys. The mapping should be in JSON formatting. 

The keys in the object must be the keys (column names) from the accessListDatasetId Dataset. 

The value must be one of the following: username, groupName, rollupNameKey, reportNameKey, permission, or a special format object like 

{
"name": "permission",
 "value":
  {
    "Allow": true,
    "Deny": false,
    "[DEFAULT]": false
  }
}

For help with this variable, contact Metric Insights support.

Copy / Paste from the  accessListKeysMapping code block below.
groupToUserDatasetIdPower BI Groups to Users Dataset (see Step 5) 
groupToUserKeysMapping

Mapping of the group to user Dataset keys to the connection report metadata Dataset keys. The mapping should be in JSON formatting.

The keys in the object must be the keys (column names) from the groupToUserDatasetId Dataset. 

The value must be one of the following: username, fullName, groupName.

See the example on the right for reference:

{
 "User Email": "username",
 "User Name": "fullName",
 "Group Name": "groupName"
}
lastViewedDatasetIdPower BI last viewed by username (see Step 6.2) 
lastViewedKeysMapping 
{
"last_viewed_time":"lastViewedAt", 
"username":"username",
"workbook_guid":"workbookGUID"
}
additionalParams

Remove Report columns from the criteria list and skip them during duplicate calculations for this specific Data Source.

The value must be one of the following: true, false.

NOTE: To remove Report columns for all connections intended for BI optimization, configure the Additional Params variable below. 

{
  "disableColumnsCriteria": true
}


 

Expand or collapse content accessListKeysMapping
{
  "Email Address": "username",
  "Display Name":
  [
    {
      "name": "groupName",
      "$value":
      {
        "type": "ternary",
        "condition":
        {
          "left":
          {
            "type": "field",
            "name": "Principal Type"
          },
          "operator": "==",
          "right":
          {
            "type": "value",
            "value": "Group"
          }
        },
        "ifTrue":
        {
          "type": "field",
          "name": "Display Name"
        },
        "ifFalse": ""
      }
    },
    {
      "name": "fullName",
      "$value":
      {
        "type": "ternary",
        "condition":
        {
          "left":
          {
            "type": "field",
            "name": "Principal Type"
          },
          "operator": "==",
          "right":
          {
            "type": "value",
            "value": "User"
          }
        },
        "ifTrue":
        {
          "type": "field",
          "name": "Display Name"
        },
        "ifFalse": ""
      }
    }
  ],
  "project": "rollupNameKey",
  "workbook": "reportNameKey",
  "Access Right":
  {
    "name": "permission",
    "value":
    {
      "[DEFAULT]": true
    }
  },
  "Report ID": "guid"
}
Click to copy

2. Why Similar Storage: fill in the IDs for the Datasets created in Step 7, strictly following JSON formatting:

  • criteriaDatasetIdReport Cleanup Criteria Dataset ID;
  • reportHashDatasetIdReport Cleanup Criteria Hash Dataset ID;
  • multipliersDatasetIdReport Cleanup Multipliers Dataset ID.

3. Additional Params: set disableColumnsCriteria to "true" to remove Report columns from the criteria list and skip them during duplicate calculations for all connections configured for BI optimization.

  • NOTE: This parameter overrides the additionalParams configuration (see the Connection Report Meta Datasets table above), which removes Report columns only for the specific Data Source.

4. [Save]

10.2. Add Entities

Access the Entities tab

  1. [+Add App Entity], then the following Entities and configure them accordingly.
Entity Name"Type""Access Type""App Dataset"Dataset
accessDataInternalPublicYReport Cleanup access storage
assigneeInternalPublic  
clustersInternalPublic  
dataInfoInternalPublic
excludedRollupsInternalPublicYReport Cleanup excluded storage
groupToUserDataInternalPublicYReport Cleanup group data storage
historyInternalPublic  
processorDataInternalPublicYReport Cleanup processor storage
recommendationsInternalPublic  
unusedInfoInternalPublicYReport Cleanup unused storage

NOTE: If BI Optimizer is already running, test-run the Custom Script.

11. Create Run Sync Metadata Custom Script

Metadata refresh does not run automatically and does not depend on other refresh triggers. To run metadata refresh automatically, you must configure the Run Sync Metadata Custom Script and associate it with a daily Notification Schedule. This section explains how to do that.

Custom Scripts and their Parameter configuration can be handled manually, as described in this article. However, there's a way to automate this process by configuring a dedicated Custom App, Setup CS Helper. This Custom App provides a more efficient and error-free alternative by automating Parameter and Parameter Set population.

Access Admin > System > Custom Scripts > [+New Custom Script]

  1. Provide a descriptive Name such as Run Sync Metadata.
  2. Choose an External Application configured in Step 4.
  3. Specify an Authentication User (Admin or System Admin).
  4. [Save]

11.1. Add Parameters

  1. From the Info tab, [+Add Parameters], then add the following Parameters and configure them accordingly.
NameTypeRequiredDefaultAvailable Values
logLevelStringY info debug,info,warn,error,emergency,silent
pageIdNumber Y 
scriptTimeoutNumber Y600000

11.2. Add Parameter Set

  1. [+Add Parameters Set]
  2. Provide a descriptive Name like Default.
  3. Provide the ID of the App configured at Step 10.
  4. Enable Is Default.
  5. [Save]

11.3. Add Custom Script Code and Run Script

  1. Under the Editor tab, copy and paste the code from the Run Sync Metadata Custom Script file requested from our support team.
  2. Select the Parameter Set.
  3. [Run Script]
    • See the example output below.

11.4. Associate Notification Schedule With Run Sync Metadata Script

  1. Access Admin > Distribution > Notification Schedules
  2. Select the Notification Schedule configured in Step 3.

NOTE: The Script must run at least once per day. Configure the Notification Schedule accordingly.

  1. Under the Run Scripts tab, [+Add Script].
  2. Choose the Custom Scrip created in Step 11.
  3. Choose the Parameter Set.
  4. [Save]

12. Create Report Cleanup Custom Script

Custom Scripts and their Parameter configuration can be handled manually, as described in this article. However, there's a way to automate this process by configuring a dedicated Custom App, Setup CS Helper. This Custom App provides a more efficient and error-free alternative by automating Parameter and Parameter Set population.

Access Admin > System > Custom Scripts > [+New Custom Script]

  1. Provide a descriptive Name such as Report Cleanup Processor.
  2. Choose an External Application configured in Step 4.
  3. Specify an Authentication User (Admin or System Admin).
  4. [Save]

12.1. Add Parameters

  1. From the Info tab, [+Add Parameters], then add the following Parameters and configure them accordingly.
NameTypeRequiredDefaultAvailable Values
pageIdStringY  
storageEntityNameString processorData 
unusedForMonthsNumber 61,2,3,4,5,6
scriptTimeoutNumber 600000 
logLevelString info0,1,10,2,3,4,debug,emergency,error,info,silent,warn
reportColumnsMatchCaseSensitiveString true,true
minColumnCountDuplicatesNumber 2 
worksheetNameMultiplierNumber 0.1 
tableMultiplierNumber 1 
tableColumnMultiplierNumber 1 
reportColumnMultiplierNumber 1 
tagMultiplierNumber 1 
minOverlapInClusterNumber 0.2 
filteredColumnTypesStringYstring,date,DateTime 
clustersEntityNameStringYclusters 
excludedRollupsEntityNameString excludedRollups 
thresholdReportsToBeUsedNumber 0 
skipUnusedCheckForCreatedInDaysNumber 15 
unusedInfoEntityNameString unusedInfo 
removeUnusedFromClustersString  ,true
accessEntityNameString accessData 
groupToUserEntityNameString groupToUserData 
calculateAccessString  ,true
excludedReportNamesDatasetIdNumber   
excludedReportNamesObjectKeyString reportName 
minItemsCountNumber 8 
excludeReportDashboardsString  ,true

12.2. Add Parameter Set

  1. [+Add Parameters Set]
  2. Provide a descriptive Name like Default.
  3. Include the Report Cleanup Excluded names CSV Dataset ID if you are using one. Otherwise this can be left blank.
  4. Provide the ID of the App configured at Step 10.
  5. Enable Is Default.
  6. Fill in the rest of the fields based on the list below and [Save].

Parameter Description:

  • pageId: The ID of the App created in Step 10 above.
  • storageEntityName: The name of the storage entity where the script will store its data. (See Step 10.2) The default is processorData.
  • unusedForMonths: The number of months without activity after which a report is marked as unused. The default is 6.
  • scriptTimeout: The timeout for the script in milliseconds. The default is 600000 (10 minutes).
  • logLevel: The log level for the script. Values can be 0,1,10,2,3,4,debug,emergency,error,info,silent, or warn. The default is info.
  • reportColumnsMatchCaseSensitive: Specify whether the report columns are case-sensitive. The default is true.
  • minColumnCountDuplicates: The minimum number of columns to consider for duplicates. The default is 2.
  • worksheetNameMultiplier: The multiplier for the worksheet name. The default is 0.1.
  • tableMultiplier: The multiplier for the table name. The default is 1.
  • tableColumnMultiplier: The multiplier for the table column name. The default is 1.
  • reportColumnMultiplier: The multiplier for the report column name. The default is 1.
  • tagMultiplier: The multiplier for the tag name. The default is 1.
  • minOverlapInCluster: The threshold percentage of similarity between two reports in a cluster, specified as a decimal. A report cluster will be included in the match results when the similarity percentage between any two reports in the cluster exceeds this threshold. Acceptable values range from 0 to 1 (e.g. 0.2, 0.35, 0.5), where 0 represents 0% similarity and 1 represents 100%. The default is 0.2 (20%).
  • filteredColumnTypes: The column types to be filtered out as a comma-separated list of types with no spaces. The default is string,date,DateTime.
  • clustersEntityName: The name of the clusters entity. (See Step 10.2) The default is clusters.
  • excludedRollupsEntityName: The name of the excluded rollups entity. (See Step 10.2) The default is excludedRollups.
  • thresholdReportsToBeUsed: The minimum number of views to consider a report as used. The default is 0.
  • skipUnusedCheckForCreatedInDays: The number of days after the report was created to skip the unused check. The default is 15.
  • unusedInfoEntityName: The name of the unused info entity. (See Step 10.2) The default is unusedInfo.
  • removeUnusedFromClusters: Specify whether to remove unused reports from clusters. The default is blank.
  • accessEntityName: The name of the access entity. (See Step 10.2) The default is accessData.
  • groupToUserEntityName: The name of the group-to-user entity. (See Step 10.2) The default is groupToUserData.
  • calculateAccess: Specify whether to calculate access information. The default is blank.
  • excludedReportNamesDatasetId: The ID of Report Cleanup Excluded names Dataset. (See Step 6) If not used, this can be left blank.
  • excludedReportNamesObjectKey: If using excludedReportNamesDatasetId, the column name from the dataset that contains the excluded report names. The default is reportName.
  • minItemsCount: The minimum number of items required in the report for processing to begin. The default is 8.
  • excludeReportDashboards: Specify whether the BI Optimizer should exclude Power BI dashboards from duplicate-detection when those dashboards only contain (or represent) the same report included within them. Use when a dashboard embeds a single underlying report and is flagged as a duplicate of that report. Acceptable values: blank(default): dashboards and reports are evaluated normally; true: skip similarity flagging between a dashboard and its included reports.

13. Add Custom Script Code and Run Script

  1. Under the Editor tab, copy and paste the code from the BI Optimizer Custom Script file requested from our support team.
  2. Select the Parameter Set.
  3. [Run Script]
    • See the example output below.

14. Verify Result

Access Content > Apps > Report Cleanup Processor> [View]

15. Associate Notification Schedule with Report Cleanup Custom Script

  1. Access Admin > Distribution > Notification Schedules
  2. Select the Notification Schedule set in Step 3.

NOTE: The Script must run at least once per day, but it is recommended to run it once every two hours. Configure the Notification Schedule accordingly.

  1. Under the Run Scripts tab, [+Add Script].
  2. Choose the Custom Scrip created in Step 13.
  3. Choose the Parameter Set.
  4. [Save]