This is the special program of convertation of the list of URL-s into the list of rows with text and links from webpages. Tool to update and expand of links.
Written in VBScript. MS Internel Explorer and MS Excel are used also.
The main idea: how to extract the information from webpages? Both text and links are needed.
The manual way:
Our script do it. And what we have? Many rows are empty! Therefore the script delete all empty rows.
Well now we have the set of rows from the specified URL.
We can put them into the other worksheet – “Allpages”. But “Allpages” already have the others rows. The script scan the existing rows to try find the equal. If no equals – it add the row into the Allpages sheet (rows are equal if the text in the cells from “E” to “O” columns are the same).
This thing the script do for all rows.
But how to avoid this process for links to do the optimization? Local html-files. Each time the script save all downloaded files on the hard disk. Next time it compare the new webpage with one saved. If it have the differents — the row equal checking will be run. If the old file is the same – no updates in “AllPages”.
To do the equal checking manually you'll need to delete all html-files.
More simplest way: you have own website and can to change it manually.
When the script download a webpage – it create the work file “work.txt” and the worksheet “work” in the MS Escel workbook. Later it delete this worksheet.
Before the script running the workbook “allpages.xls” should exists and have the worksheet “Links” inside.
When the script is running you should not do anything on the computer. Because the script uses Sendkeys operation to emulate the human work.
How to kill vbscript if any errors was:
When a webpage is not found or any other network errors – the script not support this problem.
When copying into the “Allpages” worksheet – the webpage will truncated into first 10 columns. Seems it's not problem because most webpages have 2-4 columns in the excel worksheet.
The script remove the background from cells, pictures. But the font color not changed (because the links have the blue color in MS Escel).