Python web scraping: weather and tide data, and information about tides

I have been working on my bespoke ephemeris data, using either Python libraries or APIs to fetch it, so I thought I’d explore getting data from the web by scraping instead.

This is not something I’m keen on, as I’ve had a lot of frustration with the method in the past, but I thought it was time for another attempt.

Scraping a canvas PNG image from a website

Whilst doing the video on scraping data, I came across a weather page that used the canvas element to render its data as a PNG image so that it couldn’t be scraped. I ended up diving down a rabbit hole to see whether I could extract data from a PNG file that JavaScript generates to try to prevent scraping. This is the video on using an OCR (Optical Character Recognition) library to try to get the data from the image.
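As a rough sketch of the first step, this is one way to turn a canvas export into PNG bytes that an OCR library could then read. It assumes the canvas contents are available as a base64 data URL, which is what the browser's canvas.toDataURL() returns. The function name and the stand-in sample data are my own, for illustration only, and not necessarily what the video does.

```python
import base64

# Every PNG file starts with these eight bytes.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_canvas_data_url(data_url: str) -> bytes:
    """Return the raw PNG bytes from a 'data:image/png;base64,...' string."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:image/png") or not header.endswith(";base64"):
        raise ValueError("not a base64-encoded PNG data URL")
    png_bytes = base64.b64decode(payload)
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data is not a PNG")
    return png_bytes


# Stand-in data so the sketch runs without a browser: the PNG signature
# plus a few filler bytes (not a real image).
sample_url = "data:image/png;base64," + base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii")
png = decode_canvas_data_url(sample_url)
print(len(png), "bytes decoded")
```

With Selenium, the data URL could probably be fetched with something like driver.execute_script("return document.querySelector('canvas').toDataURL('image/png');"), assuming the page has a single canvas element. The decoded bytes can then be written to a .png file and handed to whichever OCR library you are using.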

Information about tides when scraping tide data

Further to getting tide data via an API, where high tides were reported as positive integers and low tides as negative integers (which I think assumes the datum is Mean Sea Level), I decided to look into tidal issues: datums (or data?), where charts are set from, and other information about how tides are set around New Zealand. I also used Python to scrape a website for tide data.
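To show why the datum matters, here is a small sketch of converting heights measured from Mean Sea Level into heights above chart datum. Chart datum usually sits near the lowest tide level, so heights above it are mostly positive. The offset and the tide heights below are made-up illustrative numbers in metres, not real values for any New Zealand port, and the function name is my own.

```python
def msl_to_chart_datum(height_msl_m: float, msl_above_cd_m: float) -> float:
    """Convert a height relative to Mean Sea Level into a height above chart datum.

    msl_above_cd_m is how far Mean Sea Level sits above chart datum at the port.
    """
    return height_msl_m + msl_above_cd_m


# Hypothetical offset: MSL is 1.9 m above chart datum (illustrative only).
MSL_ABOVE_CD_M = 1.9

# Hypothetical API-style readings: high tide positive, low tide negative.
tides_msl = [("high", 1.4), ("low", -1.3)]

tides_cd = [
    (kind, round(msl_to_chart_datum(height, MSL_ABOVE_CD_M), 2))
    for kind, height in tides_msl
]
print(tides_cd)
```

The same low tide that shows as a negative number against Mean Sea Level becomes a small positive height against chart datum, which is the form tide tables for mariners normally use.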

End comment

After going through the web scraping process with Python, I like the idea of getting data from the back end of a dynamic web page wherever possible.

I found the Selenium and Beautiful Soup method hard work and very time-consuming. I know it’s there, but I will try to use APIs or other methods of gathering information for automation instead.

I’m pleased to have revisited the web scraping process. There are a few new things there (e.g. PNG and canvas elements), and Python has a variety of tools to deal with them.