Excel web scraping and connecting to database. Dynamic updating of Excel imported data
After writing the previous post on web scraping and the example given of using Google Spreadsheet to obtain a web table I started wondering about Excel. I was aware that it had a web connection so decided to explore those.
This first video shows how to automate the data scraping from websites into Excel:
Then there is this video on how to dynamically update the data that has been imported:
You can also import data to Excel from an API, there are a couple of videos on YouTube on this. Here is one of them. The only thing that i can see that is slightly different from the standard URL web request is that the data could come back in an XML or JSON format and this needs to be adjusted when coming into Excel.
So I thought I would play with the basic import of data from the web to a Table in Excel.
Importing Sharemarket Table from Web to Excel and allow for Automatic update
The first document below is not the data we want, so we’ll check the table
So, the steps to the above image import the table.
Now we want to be able to refresh that data when we reopen the file and I have also set it to refresh every 5 minutes. Just to make sure that the data is current.
Importing Table from Database to Excel and allow for Query of Data
Here is a video that shows the process of importing a table from MS SQL to Excel:
My Process
First In the Data Tab I click on Get Data and Choose From Database & then from SQL Server Database. (There does not seem to be very many non- microsoft options here, like MySQL or PostGresSQL etc)
Choose Advanced options:
Choose server, and database as you most probably have an idea of what you want to query in the database, and this shows you the SQL statement box.
Go to your SQL Management Studio to get your Server Name & Database name
For querying a specific table ( you may want to query several tables with INNER, OUTER JOINS etc) just do a basic query on the table
You can see the SQL query, so you can copy/paste to the Excel Query Box
I have altered it slightly by using select * (all) anmd using a WHERE statement to only choose one buildings Room Data.
After pressing OK it has a pop-up in Excel showing the tabulated data. You can check that you have the right information (if not, go back and alter query- I personally would develop the Query in SQL Management Studio, as it would be quicker- then paste it into Excel)
Use the Load To.. as this will give you options on where you want to place the data (New/Existing Worksheets and which Cell)
And the data comes in. If the database is being constantly updated you can use the REFRESH ALL button to update it , and if you want to modify the QUERY click on the highlighted QUERY on the right
Then Click on the SOURCE text in the box on the right hand side and this brings up the query that is for the table. You can alter it ( eg change project ID to another) press the OK button and update the table.
End Comments
The web scraping and dynamically updating tables/data is interesting. A good way to keep tables current if they are using web data that is changing over time.
The Excel from Database is also quite useful. I have loaded CSV files that have a phenomenal amount of rows and so Excel just slows to a crawl as its trying to manage that data. Putting the data into a SQL table and then using Excel to extract only the data it needs to analyse is definitely an alternative method. This will allow all the Excel tools to be used rapidly on a much smaller data set, which it does very well.
Definitely both processes are useful.