I read a very interesting post a few days ago about TOR. For those who don't know about it, have a read at https://www.torproject.org/ . It's a very good anonymity system, but it has gotten a lot of flak in the media because a lot of dodgy things go on in the deep web. One thing I wanted to do was search through the hidden services. However, that's not really possible. Of the several articles I found online, most suggest configuring an exit node and then waiting a few days until you get a listing of all the hidden services, much as a DNS server builds up its records after it has been online for a while. However, I find the idea of having an active exit node running from a service registered to my name a little daunting, because of said dodgy activities on the network. So I set up a TOR proxy on one of my home servers and wrote a very basic trawler/web crawler.
TOR's hidden service addresses are not like your usual www.hotmail.com addresses. They are 16-character hashes followed by .onion, e.g. 2mzqwfzoidbeqxj5.onion . I created a very quick PHP script that randomly generates a 16-character .onion address, grabs the index page from that address if it is valid, and dumps it to a SQL database. I realize this is a very long-winded and unintelligent way of doing it, but hey, maybe I'll find something interesting…
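To make the address-guessing step concrete, here is a minimal sketch in Python rather than the PHP of the actual script. It assumes the 16 characters come from the base32 alphabet (lowercase a-z plus the digits 2-7), which matches the example address above. The names ONION_ALPHABET, random_onion_address and SEARCH_SPACE are made up for illustration and are not taken from tordiver.

```python
import secrets

# Assumption: 16-character .onion labels use the base32 alphabet,
# i.e. lowercase a-z plus the digits 2-7 (32 symbols in total).
ONION_ALPHABET = "abcdefghijklmnopqrstuvwxyz234567"


def random_onion_address(length: int = 16) -> str:
    """Return a random candidate address such as 'xxxxxxxxxxxxxxxx.onion'."""
    label = "".join(secrets.choice(ONION_ALPHABET) for _ in range(length))
    return label + ".onion"


# Number of possible 16-character labels: 32**16, which equals 2**80.
SEARCH_SPACE = len(ONION_ALPHABET) ** 16

if __name__ == "__main__":
    print(random_onion_address())
    print(f"possible addresses: {SEARCH_SPACE:,}")
```

With 32 possible symbols in each of 16 positions there are 2^80 candidate addresses, which is why random guessing really is the long-winded way and a hit would be pure luck.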
You can download the code here -> tordiver
I'm running this on a Windows system with the following prerequisites:
- GNU wget tool -> https://www.gnu.org/software/wget/
- Make sure PHP is allowed to run system() calls (the function must not be listed under disable_functions in php.ini)
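
For anyone who wants to see the fetch-and-store step without downloading the script, the following is a rough Python sketch of the same idea: shell out to wget, and if a page comes back, save it to a database. It is not the tordiver code. It assumes an HTTP proxy that forwards to TOR (for example Privoxy on 127.0.0.1:8118) and uses SQLite in place of whatever SQL server you prefer. The function names, the table layout and the tordiver.sqlite filename are all hypothetical.

```python
import sqlite3
import subprocess

# Assumption: an HTTP proxy that forwards traffic into TOR listens here
# (8118 is Privoxy's default port). Adjust to match your own setup.
HTTP_PROXY = "127.0.0.1:8118"


def build_wget_command(address, proxy=HTTP_PROXY, timeout=30):
    """Build the wget argument list that fetches the index page via the proxy."""
    return [
        "wget", "-q", "-O", "-",          # quiet, write the page to stdout
        "--tries=1", f"--timeout={timeout}",
        "-e", "use_proxy=on",
        "-e", f"http_proxy={proxy}",
        f"http://{address}/",
    ]


def fetch_index(address):
    """Return the index page as text, or None if the address did not answer."""
    try:
        result = subprocess.run(
            build_wget_command(address), capture_output=True, timeout=120
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # wget missing, or the request hung
    if result.returncode != 0 or not result.stdout:
        return None
    return result.stdout.decode("utf-8", errors="replace")


def open_store(path="tordiver.sqlite"):
    """Open (or create) the database holding the pages found so far."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "address TEXT PRIMARY KEY, html TEXT NOT NULL)"
    )
    return conn


def save_page(conn, address, html):
    """Insert or update the stored page for one address."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO pages (address, html) VALUES (?, ?)",
            (address, html),
        )
```

A crawler loop would then generate a candidate address, call fetch_index on it, and call save_page only when the result is not None. Passing the arguments to wget as a list avoids building a shell command string by hand.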