Status
Not open for further replies.

Website Crawl Question

522 views 2 replies 3 participants last post by  NaderHussain 
#1 ·
Does anyone know a simple way to crawl a website to extract text data, with the number in the URL increasing by one for each page checked, across a range? For example, website.com/1000 through website.com/1999. Thanks in advance.
 
#3 ·
This would involve web scripting.
At the scripting level, the script would need at least the following.
A routine that parses the content of each loaded web page.
A search routine that finds the numerical component of the subpage name in the URL's text.
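As a rough illustration of the second requirement, here is a minimal Python sketch that pulls the numerical component out of a URL. The function name numeric_page_id is made up for this example, and it assumes the number is the last segment of the path, as in website.com/1000.

```python
import re
from urllib.parse import urlparse


def numeric_page_id(url):
    """Return the trailing numeric path segment of `url` as an int, or None.

    Hypothetical helper: assumes the page number is the last path segment.
    """
    path = urlparse(url).path
    match = re.search(r"/(\d+)/?$", path)
    return int(match.group(1)) if match else None
```

Calling numeric_page_id("https://website.com/1000") gives 1000, while a URL with no trailing number gives None.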


I think the pages would also need to be opened without displaying them on the screen, for example in a hidden frame or a headless browser if the script runs inside a browser; a script that fetches the pages directly over HTTP avoids this. The process is not all that simple. You may want to consider using a web scripting language for this. JavaScript, PHP and Python are some examples of web scripting languages. It is better to keep web scraping scripts separate from a website's own scripts.
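For what it's worth, a minimal Python sketch of the range crawl from the original question might look like the following. It uses only the standard library. The names TextExtractor, extract_text, fetch and crawl_range are invented for this example, website.com is the placeholder from the question, and the one-second delay between requests is an assumed courtesy setting, not something the site specifies.

```python
import time
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Return the text of an HTML string, joined with single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch(url):
    """Download a page and decode it as UTF-8 (assumed encoding)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_range(base_url, start, end, fetch_page=fetch, delay=1.0):
    """Fetch base_url/start .. base_url/end and map each URL to its text."""
    results = {}
    for n in range(start, end + 1):
        url = f"{base_url}/{n}"
        try:
            results[url] = extract_text(fetch_page(url))
        except OSError as err:  # urllib's URLError and HTTPError derive from OSError
            results[url] = None
            print(f"skipped {url}: {err}")
        time.sleep(delay)  # pause between requests to avoid hammering the site
    return results
```

Something like crawl_range("https://website.com", 1000, 1999) would then walk the whole range. The fetch_page parameter is there so the download step can be swapped out, for instance with a stub while testing. Pages that build their content with JavaScript would need a headless browser instead of a plain HTTP fetch.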
 