Many tools provide the ability to map out websites, but often you are limited to style of output or the location in which the results are provided. This base plate for a spidering script allows you to map out websites in short order with the ability to alter them as you please.
In order for this script to work, you'll need the BeautifulSoup
library, which is installable from the apt
command with apt-get install python-bs4
or alternatively pip install beautifulsoup4
. It's as easy as that.
This is the script that we will be using:
import urllib2 from bs4 import BeautifulSoup import sys urls = [] urls2 = [] tarurl = sys.argv[1] url = urllib2.urlopen(tarurl).read() soup = BeautifulSoup(url) for line in soup.find_all('a'): newline = line.get('href') try: if newline[:4] == "http": if tarurl in newline: urls.append(str(newline)) elif newline[:1] == "/": combline = tarurl+newline urls.append...