Automatic web pages download with C# browser control

After a long (longer than one day 🙂) and very hard search on the internet, I decided to write a short post on how to automate page downloads from C# when working with the browser control in Visual Studio. A couple of days ago I started a new personal project, an offline feed reader named ‘Crocodile’ (more on this in the near future), and after googling here and there I decided to write the app in C#. The platform and language decision was mainly based on the fact that I couldn’t find anything like the browser control in Java, and mastering another language was also very tempting.
The browser control in VS is just a stripped-down version of IE and gives easy access to many features of the normal IE browser straight from C# code. Everything looked very nice and things went smoothly for a couple of development hours, until I tried to implement automatic page saving. Using only the methods exposed by the browser control, we can save a page only by first displaying the “Save As...” dialog, which is definitely not what I wanted.
I decided that my shiny new ‘Crocodile’ would download pages together with all the goodness/badness like CSS, JS etc., so that they look exactly as they do online, and I definitely don’t want to click ‘OK’ for every single feed item I want to download (about 100 a day).
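For reference, the dialog-based route looks roughly like the sketch below. It assumes a plain Windows Forms app; the form name SaveAsDemo and the example.com URL are placeholders of mine, not part of Crocodile.

```csharp
using System;
using System.Windows.Forms;

// Minimal sketch: load a page in the WebBrowser control and ask it to save itself.
// The only saving method the control offers opens the interactive "Save As" dialog.
class SaveAsDemo : Form
{
    private readonly WebBrowser browser = new WebBrowser();

    SaveAsDemo()
    {
        browser.Dock = DockStyle.Fill;
        Controls.Add(browser);

        // Wait until the document has loaded, then show the dialog.
        // Note: on pages with frames this event can fire more than once.
        browser.DocumentCompleted += (sender, e) => browser.ShowSaveAsDialog();

        browser.Navigate("http://example.com/"); // placeholder URL
    }

    [STAThread] // the WebBrowser control needs a single-threaded apartment
    static void Main()
    {
        Application.Run(new SaveAsDemo());
    }
}
```

Every page saved this way needs a click from the user, which is exactly the problem with about 100 items a day.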
I found a simple solution on one of the VB or C# forums (I don’t remember where exactly):

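// i.Link is presumably the URL of the feed item being saved.
CDO.MessageClass message = new CDO.MessageClass();
// Fetch the page and pack it into one MHTML message; the last two arguments are user name and password.
message.CreateMHTMLBody(i.Link, CDO.CdoMHTMLFlags.cdoSuppressObjects, "", "");
ADODB.Stream st = message.GetStream();
// Write the message to disk, overwriting any existing file.
st.SaveToFile("file.mht", ADODB.SaveOptionsEnum.adSaveCreateOverWrite);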

All you have to do is add a reference to Microsoft CDO and you are ready to go! There are a number of CdoMHTMLFlags to experiment with to suppress CSS stylesheets, images etc.
The drawback of this approach is the MHT format, which can be opened only by IE.
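To save a whole batch of feed items without any dialogs, the same calls can be wrapped in a loop. This is only a sketch under a few assumptions: the class PageSaver, the helper ToFileName and the way file names are derived from URLs are my own illustration, and the project is assumed to reference the CDO and ADODB COM libraries. I use cdoSuppressNone here so that nothing is stripped from the page.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

static class PageSaver
{
    // Hypothetical helper: turn a URL into something usable as a file name
    // by replacing characters that are not allowed in file names.
    static string ToFileName(string url)
    {
        foreach (char c in Path.GetInvalidFileNameChars())
            url = url.Replace(c, '_');
        return url + ".mht";
    }

    // Saves every link as a separate .mht file in the given folder.
    public static void SaveAll(IEnumerable<string> links, string folder)
    {
        foreach (string link in links)
        {
            try
            {
                CDO.MessageClass message = new CDO.MessageClass();
                // cdoSuppressNone keeps images, stylesheets, frames etc. in the archive.
                message.CreateMHTMLBody(link, CDO.CdoMHTMLFlags.cdoSuppressNone, "", "");

                ADODB.Stream stream = message.GetStream();
                stream.SaveToFile(Path.Combine(folder, ToFileName(link)),
                                  ADODB.SaveOptionsEnum.adSaveCreateOverWrite);
                stream.Close();
            }
            catch (COMException ex)
            {
                // One unreachable page should not stop the rest of the batch.
                Console.WriteLine("Could not save " + link + ": " + ex.Message);
            }
        }
    }
}
```

Calling PageSaver.SaveAll(links, folder) with the list of feed item URLs and an existing target folder would then write one file per item. Very long URLs may still produce file names that are too long, so a real implementation would probably want a shorter naming scheme.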

3 responses on “Automatic web pages download with C# browser control”

  1. Frank September 6, 2008 at 9:53 pm

    Another nice and easy way to automate file downloads is to use the iMacros software. Even the free versions can automate file downloads:
    http://www.iopus.com/imacros/compare/all/

  2. Ra February 25, 2009 at 9:56 pm

    I recently decided to write my own little application scheduler in C#, since Windows Vista’s native scheduler kept crashing my programs on execution. In any case, I got the scheduler working and want to add more functionality to it. Have you perhaps found another little code snippet like the one above that will allow me to automatically download a file from a website?

  3. Asp.Net September 10, 2010 at 6:43 am

    I tried to download a page automatically this way, but it did not work 🙁
