Importing XML data into Excel allows you to have a readily available data source that can be refreshed on Demand
Once that data loaded into Excel you can use Excels features to explore, filter, sort or augment that data from other sources
What is an XML file?
An XML file is a plain text file that uses custom tags to describe the structure and features of the document. It can be opened and edited with text editors, web browsers, online editors, and application-specific programs. XML files are often used to store and exchange data between different systems or applications. For example, XML files can be used for web feeds, configuration files, office documents, and databases. If you want to learn more about XML, its uses, and how to view or edit XML files, you can check out these resources:
Typically a website uses an xml file for its sitemap, this contains details of each url within a website
Here is an example from an SEO consulting website
If we look directly at the XML data in its raw form it looks like this
Its not simple to work with at this point
We have saved a copy of this XML file and we will use Power Query to import this static file into Excel
Using Power Query to import data from an XML file
Go to the Data tab and click on Get Data -> From File -> From XML.
Select the XML file that you want to query and click Import.
In the Navigator window, select the table or element that you want to import.
In the Power Query Editor, you can expand, filter, or modify the data as needed.
Click Close & Load to load the data into a worksheet.
We can now see the data as an easy to read data table in Excel, Excel has parsed the data into neat columns.
Although this data can be refreshed it only refreshes from the file it was loaded from. As this is only an example this is fine, but ideally we want to create a refreshable connection
Connect Excel to a sitemap on a website
For this example we will use our own sitemap from the Select Distinct Website
In Excel we will create a connection to this web based data source
On the data ribbon, click on the From Web icon
Then, in the URL field paste in the URL of the sitemap XML location
Click OK and the data is loaded into the navigator
Then click Load to import the data to the spreadsheet
A Connection is created to the URL and a table is created in Excel that can be refreshed on demand
This can then be used to have a convenient list of all pages on our website
Use cases for importing a sitemap in Excel
We can export data from Google Search Console to see traffic to each webpage, but, this does not show us pages that have no traffic
By having a list of all of the URLs we can then simply use a look up to return the traffic for each URL using the URL as a lookup reference
Once we can see which pages have seen no impressions, we can then look into them in more detail
(Or ask Fisher SEO: Freelance eCommerce Consultancy to take a look on our behalf
Conclusion
Now you can create a refreshable dataset from a website sitemap and then go on to use that to analyse web page performance
Subscribe to our channel to see more tips and timesavers
Select Distinct YouTube Channel
Or find other useful SQL, Power BI or other business analytics timesavers in our Blog
Our Business Analytics Timesavers are selected from our day to day analytics consultancy work. They are the everyday things we see that really help analysts, SQL developers, BI Developers and many more people.
Our blog has something for everyone, from tips for improving your SQL skills to posts about BI tools and techniques. We hope that you find these helpful!