The April 2019 update to Power BI has many good features, as this blog explains
Part seven of a nine-part series of blogs

The April 2019 update to Power BI introduces a new concept: the ability to make the value of a property dynamic. Other welcome new changes include M Intellisense, data profiling in query editor and cross-report drill-down.

  1. The April 2019 Update to Power BI Desktop
  2. Conditional titles for visuals and conditional URLs
  3. Intellisense in the M Query Editor Language
  4. Data profiling in Query Editor
  5. Cross-report drill-through
  6. Fuzzy merging
  7. Power BI Dataflows (this blog)
  8. Other changes in the April 2019 update
  9. Power BI features waiting in preview, as of April 2019

For a cumulative list of all of the updates to Power BI Desktop in the last few years see this blog, or have a look at the Power BI courses that we run.

Posted by Andy Brown on 18 April 2019

You need a minimum screen resolution of about 700 pixels width to see our blogs. This is because they contain diagrams and tables which would not be viewable easily on a mobile phone or small laptop. Please use a larger tablet, notebook or desktop computer, or change your screen resolution settings.

Power BI Dataflows

The aim of this blog is to give you, gentle reader, an idea of whether you should be using a new feature, but for the full Monty see the dedicated Microsoft dataflow site.

I'd like to think the rest of this blog is reasonably polished, but I must admit that this article is slightly more of a stream-of-consciousness, as I try to work out where Microsoft are going with this new feature.

Creating a dataflow

To do this you must be in Power BI Service, must have a paid account and you must be using an app workspace.  Click on the Create button which should appear:

Creating a dataflow

Click on the button shown at the top right of this screen shot to create a dataflow.

You can then choose to create a dataflow:

Dataflow menu

Choose this option (obviously) on the dropdown menu which appears.

 

You're now invited to start creating a dataflow:

Creating dataflow

The 3 options which appear on the screen.

If you choose to add new entities, you'll see a huge list of all the possible data sources to which you could connect:

Data sources

There are many more data sources listed - these are the only ones I could fit in my screenshot.

I then tried connecting to a number of data sources, but whether you're using SQL Server, an Excel workbook or a website, you'll need to create a gateway to it first.

I did try uploading an Excel workbook and linking to that, but got an unhelpful Unexpected Error message (I think because you need to specify the server drive on which the workbook is stored).

So the heading for this section should really have read "NOT creating a dataflow".  However, having read around it I think I now understood the point of dataflows, and why you might want to use them, and I'll try to explain this now.

Before dataflows 

Let's imagine you're Jo(e) in Data Analytics.  You've been asked to create a report showing average salary trends.  You load two tables:

  • One called Payments from your proprietary accounts system; and
  • One called tblStaff from your HR database. 

You get your basic data model:

Basic data model

Initially the data model doesn't look very promising.

You then spend a while tidying it up in Query Editor and in Power BI to get something a bit nicer-looking:

Better model

The same tables, but with some columns removed, a new Staff name column created and a relationship added.

 

You save this as a PBIX file, happy that you've built up a sensible series of query steps which you can repeat any time you want to reload the data:

The query steps

Typical query steps that you might have created.

 

Exhausted, you go on holiday.  The next day Sam from IT is asked to do a similar exercise.  S/he has 3 choices:

  1. Recreate the wheel by going through the same steps as you went through the day before;
  2. Copy the query steps from your PBIX file, by selecting the corresponding M language instructions and copying them; or
  3. Take a copy of your PBIX file and using that as a basis for analysis.

What dataflows do 

What a dataflow allows you to do is to perform the steps described above, but in Power BI Service (so there is now a version of Query Editor which runs online). 

Here's how this could work.  You create a dataflow to load data from your accounts and HR systems, providing that you have set up a gateway giving access to them:

The two tables

The tables from which you need to load data.

 

You then massage the data using online Query Editor into the right format:

The format

The final format of data that you want.

 

The data is loaded and stored in the Azure Data Lake internally, although you absolutely do NOT need to know this.

Using a dataflow 

When Jo(e) and Sam want to work with this data, they could both link to a dataflow:

Choosing a dataflow

The option to get data from a dataflow.

They can then link to the tables returned from the dataflow:

Dataflow tables

Jo(e) and Sam will see something in this format, although obviously the table and column names will be different.

So the dataflow gives obvious advantages: it avoids duplication of effort, and centralises the process of loading data. 

What I don't get 

Aren't Microsoft competing with themselves?  The process I've described above is surely a different version of:

  • Take a relational database stored in (say) SQL Server;
  • Use Integration Services or some other ETL software to load the data on a regular basis into a data warehouse stored in Analysis Services (called processing); 
  • Create reports in Power BI and other client software applications linking to this data warehouse.

Dataflows provide support for something called the Common Data Model, but otherwise how are they different from the traditional data warehouse?

This blog has 0 threads Add post