Fix your slow Power Query refresh from SharePoint (700% faster)
By Excel Off The Grid
Summary
Topics Covered
- SharePoint.files slows with site growth
- Switch to SharePoint.contents for speed
- Convert existing queries to Contents
Full Transcript
When we use SharePoint as a source for Power Query, it can lead to slow refreshes. So today, I want to show you
refreshes. So today, I want to show you a simple technique that you can use to significantly speed up refresh times when combining files in a folder from
SharePoint. So, if you're ready, let's
SharePoint. So, if you're ready, let's get started.
Here is my SharePoint site, and it contains various subfolders and files.
In total, there are over 32,000 files in this site. However, we only want to
this site. However, we only want to combine the files which are in the data folder and then in the combine folder.
So, you can see that we have just three CSV files. So, we want to combine three
CSV files. So, we want to combine three files out of the total of 32,000 files on this site. I'm going to start by
copying the URL that I need to connect to the site. You can see here we have the base file path. Then we have the word sites. And finally, we have the
word sites. And finally, we have the name of our site. And that is the section that we need to copy so that we can connect to this SharePoint site. So
I'll select that and press Ctrl C to copy. Now let's head into Excel and see
copy. Now let's head into Excel and see how we can combine these three files.
To combine those files using Power Query in Excel, we go to the data ribbon, then get data from file and from SharePoint folder. Now, don't worry if you don't
folder. Now, don't worry if you don't have this option in your menu. Shortly,
I will go over how you can still use this technique. I'll select that and
this technique. I'll select that and then I'm going to paste the site URL I previously copied and then I'll click okay. If you've already connected to
okay. If you've already connected to this site before, you won't need to sign in. But for this example, I haven't
in. But for this example, I haven't signed in before, so I will go to Microsoft account and then go to sign in. And if necessary, I will enter my
in. And if necessary, I will enter my username and password. I'm now signed into my site, so I can click connect.
This now brings up a preview of the files which are on the site. We're going
to click transform data. This now
connects to the SharePoint site. Now, if
you didn't have from SharePoint folder in your menu, we can now see the M code that we need to make that same connection without using the user
interface. I will copy that. I could, if
interface. I will copy that. I could, if I didn't have that connector, right click, go to new query, other sources, blank query, and then simply enter that
same piece of code. So,
SharePoint.files,
then the URL. After that, we want the API version 15. When we commit that, that would connect to that SharePoint site, and we did that without using our
from SharePoint site option from the menu. So if you didn't have that option
menu. So if you didn't have that option inside your get data dropdown, it's not an issue. You can create it from a blank
an issue. You can create it from a blank query. Now I don't need this query. So I
query. Now I don't need this query. So I
will select it and press delete. Let's
delete query 2. Let's come back to query one. And I'm going to rename this to SP
one. And I'm going to rename this to SP folder.
Our current connector is sharepoint.files.
sharepoint.files.
that will list all of the files on our SharePoint site. But we don't want to
SharePoint site. But we don't want to combine all of these files. We only want to combine the files which are in our combine folder. I will search for that
combine folder. I will search for that and then I'll click okay.
That now filters down to just those three files. We're now ready to combine
three files. We're now ready to combine these three files. And to do that, I'll click on the combine files icon. Power
Query now launches the combine files dialogue box and shows us the first file from our list. We're not going to make any changes in here. So I will click
okay. This is when Power Query now adds
okay. This is when Power Query now adds all of the helper queries that we need to combine all of the files in the folder. Now for this example, we don't
folder. Now for this example, we don't need to make any changes to these queries. They are CSV files which have
queries. They are CSV files which have been loaded directly into Power Query and they are already in the right format. So now we can close and load
format. So now we can close and load this back to an Excel table. From the
home ribbon, I'll click on the close and load dropdown and then close and load two. We want to load this as a table on
two. We want to load this as a table on an existing worksheet in cell B3. And
then I'll click okay. Power Query will then connect to the SharePoint site. it
will filter down to the relevant files and combine those files together and load them as a table.
Now, that refresh took quite a long time. So, now let's carry out a test and
time. So, now let's carry out a test and let's time that query to find out how long it actually takes to refresh. Over
a sample of 10 refreshes using Sharepoint.files,
Sharepoint.files, the average time was 44.6 seconds. with a difference between the
seconds. with a difference between the fastest and slowest times of 6.8 seconds. Now, 45 seconds is an
seconds. Now, 45 seconds is an exceptionally slow refresh, especially when we consider that we only loaded 1,000 rows into our table. The thing is
that if we use SharePoint files, the refresh time is impacted by the number of files on the SharePoint site. Here we
can see the timed refreshes based on different numbers of files on the site.
They all combine the same number of files. So this is purely impacted by the
files. So this is purely impacted by the number of files and these are files which don't even appear in the output.
Power query doesn't use these files at all in returning the result. That means
that over time as SharePoint sites grow with more and more files, our queries will get slower and slower. So, let's go and try another method.
Here we are back in Excel and we haven't yet loaded any of the data from our SharePoint site. Let's go through the
SharePoint site. Let's go through the same process. Data, get data from file
same process. Data, get data from file from SharePoint folder. I will now paste in the URL of my SharePoint site and
then click okay. Once again, we get a preview of our data and we're going to click transform data. As before, this now lists all of the files in our
SharePoint site. Let's start by renaming
SharePoint site. Let's start by renaming our query to SP folder.
And now we're going to change the code that we've used. Currently, we've used SharePoint. else. Let's change this to
SharePoint. else. Let's change this to SharePoint.
And as before, if we needed to, we could enter this M code into a blank query.
I'll now commit this code. And rather
than connecting to all the files on the SharePoint site, it now only includes each of the folders. Therefore, we can now navigate to the folder that we want
to combine the files from. We have
shared documents. I'll click into that table. That now gives us the subfolder.
table. That now gives us the subfolder.
Next, I'll click into the next subfolder. And finally, I'll click into
subfolder. And finally, I'll click into the table of our combine subfolder. And
here you can see that we have the same three files as we combined previously.
Let's continue with the same steps. I'll
click combine files. We get the same dialogue box as before. And in there, we can simply click okay.
Once again, Power Query creates all of the helper queries. And as before in this example, we don't need to make any further changes. So from the home
further changes. So from the home ribbon, we can go to the close and load dropdown. Select close and load two. We
dropdown. Select close and load two. We
want this as a table on the existing worksheet in cell B3. And then I'll click okay. Power Query now goes to the
click okay. Power Query now goes to the SharePoint site. It reduces that down to
SharePoint site. It reduces that down to only get the contents of each individual folder. We then navigate to the folder
folder. We then navigate to the folder that we want and combine the files together and that now loads as a table.
So how does SharePoint contents compare?
Over a sample of 10 refreshes using SharePoint.contents, contents. The
SharePoint.contents, contents. The
average refresh time was 6.1 seconds with the difference between the fastest and the slowest being 1.8 seconds. As
you can see in this scenario, SharePoint.Contents is significantly
SharePoint.Contents is significantly faster than SharePoint files. The reason
this occurs is because SharePoint.files
returns information about all the files on the site. That means it retrieves information about all 32,000 files and only then lets us filter down to the
files that we need. But
SharePoint.contents
only returns the contents of each individual folder. It doesn't look in
individual folder. It doesn't look in subfolders. So depending on how we've
subfolders. So depending on how we've structured our folders, it means it might only need to read the files that we actually want to combine. Therefore,
this gives us a much faster refresh time. So, if you're setting up a new
time. So, if you're setting up a new SharePoint folder connection, you should change the M code from SharePoint.files
to SharePoint.Contents and you will get a faster refresh. But what happens if we already have a query set up using SharePoints?
How can we convert that to SharePoint contents? Let's go take a look.
contents? Let's go take a look.
We are now back in Power Query. And if I click on the source step, you can see that we currently have the Sharepoint.files
Sharepoint.files connector, which is the slower connector. And we want to change this to
connector. And we want to change this to SharePoint.
To do that, we're going to start by right-clicking and duplicating our query. Now, the reason we do this is
query. Now, the reason we do this is because we are going to undertake some destructive actions.
Let's start by changing SharePoint.files
files to shareepoint.contents.
That gives us just the items inside each folder. I'll now drill into my shared
folder. I'll now drill into my shared documents. And when we do this, it tells
documents. And when we do this, it tells us that we're about to create a step that will remove the subsequent steps.
And this is the destructive action.
Let's click continue. You'll notice that all of our applied steps have now been removed. But let's continue drilling
removed. But let's continue drilling into our folder structure. We're going
to go into data and then into combine.
This is the point that we now have the three files that we want to combine together. We're going to go up to view
together. We're going to go up to view and then advanced editor. We're going to select everything before the word in and
we're going to copy that code. I'll then
click done. Let's now come back to our original query and let's click through our steps. We have our source step.
our steps. We have our source step.
Next, we have our filtered rows step and this gives us exactly the same output as our new temporary query. This step is
called filtered rows. I will come up to advanced editor and we want to remove everything before filtered rows. I will
select all of that and paste the code that we previously copied. We now want to blend these two queries together.
I need to add a comma after the pasted code. And then rather than referencing
code. And then rather than referencing filtered rows, which was the name of our previous step before we pasted the code, we want to change that so that it now
references combine.
And then I'll click done because we replaced like with like our query now continues to flow correctly. That means
we no longer need our temporary query that we created earlier. I'll select
that and press delete. And yes, we do want to delete that query. So now when we go back to home and then close and load, it will load the
Sharepoint.contents contents version,
Sharepoint.contents contents version, which will give us a faster refresh.
Now, this only works because the files are contained in a single folder. If
they're contained in subfolders, it won't work. So, depending on the refresh
won't work. So, depending on the refresh times you're currently experiencing, you may want to change your folder structure to make use of the refresh benefit from SharePoint. Contents. If you like this
SharePoint. Contents. If you like this video and learn something new, all I ask in return is that you subscribe and get notifications so you don't miss any of our future videos. And also share this
video with somebody else so that they too can stop wasting time waiting for Power Query to refresh. Thanks for
watching and I'll catch you next time.
Loading video analysis...