Advanced Recon Techniques or How We Find All The Things (part 1)

By Nate Robb | March 1, 2019

After the rules of engagement are finalized but before we actually start a penetration test, we perform no-touch reconnaissance, or “recon”, without sending a single packet to the target organization’s environment. The goal is to identify all in-scope assets and data exposures: the more of a target environment we cover, the more thorough our penetration testing can be. This information can include IP addresses, subdomains, code repositories, employee data, or anything else that tells us more about the company and exposes a crack in its external defenses. Whether a development server was stood up under an obscure subdomain name, an employee committed credentials to a public GitHub repository, or a sysadmin posted server logs to Pastebin, our goal is to find these exposures and assess the potential risk to the environment before an attacker does.

DNS Enumeration

A common technique for identifying internet-facing assets is through DNS (Domain Name System). We can use various techniques to identify the subdomains mapped to servers within a target’s environment. Two great DNS enumeration tools, which won’t be discussed in depth here but which use many of the techniques covered in this post, are OWASP Amass and SubFinder. While these tools are a great starting point, it is important to understand how they work so that the techniques can be customized to extend coverage and scope even further. (NOTE: All domains, IP addresses, and results in the examples below are made up so as not to expose any company’s infrastructure data.)

Subdomain brute-force

Subdomain brute-forcing is the act of prepending each entry in a wordlist to the target domain and attempting to resolve the resulting name. With this technique we are limited by our wordlists: a subdomain won’t be discovered if it is not in the list.

One tool that generates excellent wordlists is Commonspeak2. Commonspeak2 uses publicly available datasets from Google BigQuery to generate subdomain wordlists from HackerNews and the latest HTTP Archive scans. This means the wordlists are drawn from live sites currently seen in the wild. Additionally, bug bounty hunter Jason Haddix has consolidated wordlists from various DNS discovery tools and posted the result on GitHub as all.txt. Combining these two lists and removing duplicates gives a nice list of several million words to start with.
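
The merge-and-deduplicate step can be sketched in a few lines of Python. This is only an illustration: the function name and the sample entries are assumptions made for the example, not part of Commonspeak2 or all.txt.

```python
from typing import Iterable, List


def merge_wordlists(*wordlists: Iterable[str]) -> List[str]:
    """Merge wordlists, lowercasing entries and dropping blanks and duplicates."""
    seen = set()
    merged = []
    for wordlist in wordlists:
        for raw in wordlist:
            word = raw.strip().lower()
            if word and word not in seen:
                seen.add(word)
                merged.append(word)
    return merged


# Hypothetical sample entries standing in for the two real wordlists.
commonspeak = ["api", "dev", "Staging\n"]
all_txt = ["dev", "mail", "", "api"]
print(merge_wordlists(commonspeak, all_txt))
```

Keeping first-seen order (rather than sorting) preserves any frequency ordering the source lists may have, so the most common words are still tried first.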

In the command below SubFinder is used to brute-force subdomains for the example.com domain with the CommonSpeak2 and all.txt wordlists, using 100 threads.

$ subfinder -v -o results.txt -d example.com --no-passive -b -w commonspeakandall.txt -t 100
===============================================
-=Subfinder v1.1.3 github.com/subfinder/subfinder
===============================================

Running enumeration on example.com
	
Starting Bruteforcing of example.com with 2563249 words
[BRUTE] mfa.example.com : 192.168.2.1
[BRUTE] auth.example.com : 192.168.3.23
[BRUTE] forums.example.com : 192.168.2.56
...	
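
To make the mechanics concrete, here is a minimal Python sketch of what a brute-force pass does under the hood. It is not SubFinder’s implementation: the function names are hypothetical, and the resolver is passed in as a parameter so the example can run offline against made-up records.

```python
import socket
from typing import Callable, Dict, Iterable, Optional


def resolve(hostname: str) -> Optional[str]:
    """Return an IPv4 address for hostname, or None if it does not resolve."""
    try:
        return socket.gethostbyname(hostname)
    except (OSError, UnicodeError):
        return None


def brute_force(domain: str, words: Iterable[str],
                resolver: Callable[[str], Optional[str]] = resolve) -> Dict[str, str]:
    """Try word.domain for every word and keep the names that resolve."""
    found = {}
    for word in words:
        candidate = f"{word}.{domain}"
        address = resolver(candidate)
        if address is not None:
            found[candidate] = address
    return found


# Stand-in resolver backed by made-up records, so no DNS queries are sent here.
fake_records = {"mfa.example.com": "192.168.2.1", "auth.example.com": "192.168.3.23"}
print(brute_force("example.com", ["mfa", "www", "auth"], fake_records.get))
```

A real tool layers concurrency on top of this loop, which is what the 100 threads in the command above provide.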

Subdomain Permutation

Next, we will create subdomain permutations based on patterns in the discovered subdomains to identify even more subdomains. Companies generally name their assets according to some predefined naming scheme, so if we identify a subdomain named “dev-1.example.com”, there may well be an asset named “staging-1.example.com” or “test-2.example.com”.

A fantastic tool to automate this process is altdns. Altdns takes a list of words that could be present in the target’s subdomains, such as test, dev, and staging, and creates permutations of these words with the previously discovered subdomains. This allows coverage of even more subdomains that wouldn’t be discovered otherwise.

The command below uses altdns to create possible subdomain permutations by combining the discovered subdomains with words.txt, a wordlist containing words like dev, test, staging, etc. The results are then fed into SubFinder to identify which of the permutations actually exist.

$ python altdns.py -i results.txt -o permutations.txt -w words.txt
	
$ cat permutations.txt
mfa-dev.1.example.com
prodmfa.example.com
auth.northamerica.example.com
12-auth.example.com
forums-admin.example.com
forums.aws.example.com
...
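
The following Python sketch shows the general idea behind permutation generation. It is a simplified illustration and not altdns’s actual algorithm: the function name and the particular set of combinations (insert the word as a new label, or attach it to an existing label with and without a dash) are assumptions chosen for the example.

```python
from typing import Iterable, Set


def permute(subdomain: str, domain: str, words: Iterable[str]) -> Set[str]:
    """Generate candidate names by combining words with a known subdomain."""
    suffix = "." + domain
    if not subdomain.endswith(suffix):
        raise ValueError(f"{subdomain} is not under {domain}")
    labels = subdomain[: -len(suffix)].split(".")
    candidates = set()
    for word in words:
        # Insert the word as a new label at every position.
        for i in range(len(labels) + 1):
            new_labels = labels[:i] + [word] + labels[i:]
            candidates.add(".".join(new_labels) + suffix)
        # Attach the word to each existing label, with and without a dash.
        for i, label in enumerate(labels):
            for variant in (f"{word}-{label}", f"{label}-{word}",
                            f"{word}{label}", f"{label}{word}"):
                new_labels = labels[:i] + [variant] + labels[i + 1:]
                candidates.add(".".join(new_labels) + suffix)
    return candidates


for name in sorted(permute("mfa.example.com", "example.com", ["dev"])):
    print(name)
```

Each word yields several candidates per discovered subdomain, so the output grows quickly; this is why a short, targeted words.txt tends to work better here than the multi-million-word brute-force list.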

Multi-level Recursive DNS Brute Force

Finally, we will perform multi-level recursive DNS brute-forcing to find 4th-, 5th-, and 6th-level domains. For example, a subdomain like “admin.mfa.example.com” would be found by brute-forcing the 4th-level domain once “mfa.example.com” has been discovered.

The command below is a for loop that iterates over our discovered subdomains and brute-forces the 4th-level domain of each with SubFinder.

$ for subdomain in $(cat 3rd-level-results.txt); do subfinder -v -o "${subdomain}-results.txt" -d "$subdomain" --no-passive -b -w commonspeakandall.txt -t 100; done
===============================================
-=Subfinder v1.1.3 github.com/subfinder/subfinder
===============================================

Running enumeration on mfa.example.com
	
Starting Bruteforcing of mfa.example.com with 2563249 words
[BRUTE] admin.mfa.example.com : 192.168.2.11
[BRUTE] dev-1.mfa.example.com : 192.168.3.233
[BRUTE] test-2.mfa.example.com : 192.168.2.34
...
	
===============================================
-=Subfinder v1.1.3 github.com/subfinder/subfinder
===============================================

Running enumeration on auth.example.com
	
Starting Bruteforcing of auth.example.com with 2563249 words
[BRUTE] admin.auth.example.com : 192.168.2.12
[BRUTE] dev-3.auth.example.com : 192.168.3.223
[BRUTE] test-4.auth.example.com : 192.168.2.44
...
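
The shell loop covers one extra level per run. As a rough sketch of how the same idea extends to several levels, the hypothetical Python function below feeds each round’s hits back in as the next round’s parents. The names, the depth limit, and the dictionary-backed resolver are all assumptions for illustration.

```python
from typing import Callable, Dict, Optional, Sequence


def recursive_brute_force(domain: str, words: Sequence[str],
                          resolver: Callable[[str], Optional[str]],
                          max_depth: int = 3) -> Dict[str, str]:
    """Brute-force word.parent level by level, recursing into every hit."""
    found = {}
    frontier = [domain]
    for _ in range(max_depth):
        next_frontier = []
        for parent in frontier:
            for word in words:
                candidate = f"{word}.{parent}"
                address = resolver(candidate)
                if address is not None:
                    found[candidate] = address
                    next_frontier.append(candidate)
        if not next_frontier:  # nothing new at this level, stop early
            break
        frontier = next_frontier
    return found


# Made-up records; a real run would pass a resolver that performs DNS lookups.
fake_records = {
    "mfa.example.com": "192.168.2.1",
    "admin.mfa.example.com": "192.168.2.11",
}
print(recursive_brute_force("example.com", ["mfa", "admin"], fake_records.get))
```

The number of lookups is roughly the wordlist size multiplied by the number of hits at each level, so a depth limit and a smaller wordlist for the deeper levels keep the run time manageable.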
	

Internet-wide Scan Data

By using services that scan all internet-connected devices and index the results, it is possible to identify in-scope IP addresses, subdomains, open ports, and services without sending any packets to the target.

Censys

Censys is a search engine that scans and indexes internet-connected devices, designed by researchers at the University of Michigan and the creators of the ZMap scanner. Censys performs weekly scans of many protocols for IPv4 hosts and daily scans for popular websites.

A variety of queries can be used to discover assets; below are a couple of favorites using the Censys API. The first searches for ASNs associated with the company name and outputs the IP addresses of all indexed hosts belonging to those ASNs. The second outputs the IP addresses of all indexed hosts with TLS certificates associated with a given domain name.

$ curl -X POST -H "Content-Type: application/json" \
-u Censys-API-ID:Secret \
-d '{"query":"autonomous_system.description:company name","page":1,"fields":["ip"],"flatten":false}' \
https://censys.io/api/v1/search/ipv4 \
| jq . \
| grep ip \
| cut -d '"' -f 4
	
192.68.11.45
192.68.79.57
192.68.79.75
192.68.79.17
192.68.79.43
192.68.79.64
...

$ curl -X POST -H "Content-Type: application/json" \
-u Censys-API-ID:Secret \
-d '{"query":"443.https.tls.certificate.parsed.extensions.subject_alt_name.dns_names:example.com","page":1,"fields":["ip"],"flatten":false}' \
https://censys.io/api/v1/search/ipv4 \
| jq . \
| grep ip \
| cut -d '"' -f 4
	
192.168.135.116
192.168.94.214
192.168.249.93
192.168.57.229
192.168.80.57
192.168.41.81
...
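
Picking IP addresses out of pretty-printed JSON with grep and cut works, but it is brittle if other fields happen to contain the string “ip”. The Python sketch below parses the response instead. It assumes the response body carries a top-level "results" list whose entries each hold the requested "ip" field; the helper names and the sample response are made up for illustration.

```python
import json
from typing import List


def build_query(query: str, page: int = 1) -> str:
    """Serialise a request body like the ones passed to curl -d above."""
    return json.dumps({"query": query, "page": page, "fields": ["ip"], "flatten": False})


def extract_ips(response_body: str) -> List[str]:
    """Collect the ip field from each entry in the assumed results list."""
    data = json.loads(response_body)
    return [hit["ip"] for hit in data.get("results", []) if "ip" in hit]


# Made-up response in the assumed shape; a real one would carry more metadata.
sample = '{"status": "ok", "results": [{"ip": "192.168.135.116"}, {"ip": "192.168.94.214"}]}'
print(build_query("autonomous_system.description:company name"))
print(extract_ips(sample))
```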

Shodan

Shodan is another search engine similar to Censys; it collects banners from specific services while scanning internet-connected devices. Shodan’s crawlers scan around the clock, the database is updated in real time, and the data includes IPv6 hosts (which Censys’s data does not).

Shodan has its own query syntax and command-line interface. Below are CLI queries to find all web server assets owned by an organization that return 200 OK responses, and all indexed devices within a CIDR IP range.

$ shodan search --fields ip_str,port,org,hostnames 'org:"Company Name" "200 OK"'

192.168.63.14    8443    Company   auth.example.com
192.168.103.171  8443    Company   login.example.com
192.168.74.19    8081    Company   private.example.com
192.168.227.225  80      Company   mfa.example.com
...

$ shodan search --fields ip_str,port,org,hostnames net:192.168.0.0/16

192.168.144.112  80      Company   admin.example.com
192.168.157.53   445     Company   forums.example.com
192.168.148.110  8008    Company   dev01.test.example.com
192.168.176.92   443     Company   blog.example.com
...
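
Once results like these are saved to a file, they are easier to work with as structured records. The Python sketch below parses the CLI output under two assumptions that should be checked against your own output: that fields are separated by tabs, and that multiple hostnames for one host are joined with semicolons. The Service type and function name are hypothetical.

```python
from typing import List, NamedTuple


class Service(NamedTuple):
    ip: str
    port: int
    org: str
    hostnames: List[str]


def parse_shodan_output(text: str, separator: str = "\t") -> List[Service]:
    """Turn lines of ip, port, org, hostnames into Service records."""
    services = []
    for line in text.splitlines():
        parts = [part.strip() for part in line.split(separator)]
        # Skip blank lines, truncation markers, and anything malformed.
        if len(parts) < 4 or not parts[1].isdigit():
            continue
        ip, port, org, hostnames = parts[:4]
        names = [name for name in hostnames.split(";") if name]
        services.append(Service(ip, int(port), org, names))
    return services


# Made-up sample in the assumed tab-separated layout.
sample = (
    "192.168.63.14\t8443\tCompany\tauth.example.com\n"
    "192.168.227.225\t80\tCompany\tmfa.example.com;www.example.com\n"
    "...\n"
)
for service in parse_shodan_output(sample):
    print(service.ip, service.port, service.hostnames)
```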

Project Sonar

Project Sonar is another security research project that conducts internet-wide scans and publishes the resulting data sets. This data includes SSL/TLS certificates visible on public IPv4 HTTPS web servers, the HTML content (index page) of all public IPv4 web servers, reverse DNS records for all IPv4 addresses, the results of DNS “ANY” record requests for domain names gathered from sources including TLD zone files, and scans of a variety of TCP and UDP services.

A great data set for identifying subdomains belonging to a target is the Forward DNS data set. Below is a bash command that pulls down the 30 GB compressed file via curl, decompresses it, greps for the domain we are searching for, and does some formatting magic to keep only the unique subdomains from the data.

$ curl -L https://opendata.rapid7.com/sonar.fdns_v2/2019-03-29-1553817918-fdns_any.json.gz \
| pigz -dc \
| grep -F ".example.com" \
| jq -r .name \
| grep -E '\.example\.com$' \
| sort -u

auth.example.com
autodiscover.example.com
blog.example.com
forums.example.com
feedback.example.com
...
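
The same filtering can be done in Python while streaming, so the file never has to be fully decompressed to disk. This is a sketch under the assumption that the data set is gzipped JSON with one record per line and a "name" field per record; the function name and sample records are made up.

```python
import gzip
import io
import json
from typing import IO, Iterator


def subdomains_from_fdns(stream: IO[bytes], domain: str) -> Iterator[str]:
    """Yield unique record names under domain from gzipped JSON-lines data."""
    suffix = "." + domain
    needle = suffix.encode()
    seen = set()
    with gzip.open(stream, "rb") as lines:
        for line in lines:
            if needle not in line:  # cheap pre-filter before parsing JSON
                continue
            try:
                record = json.loads(line)
            except ValueError:
                continue
            if not isinstance(record, dict):
                continue
            name = str(record.get("name", "")).lower().rstrip(".")
            # Match on the record name only, not on values such as CNAME targets.
            if name.endswith(suffix) and name not in seen:
                seen.add(name)
                yield name


# Made-up records compressed in memory to stand in for the real download.
records = [
    {"name": "auth.example.com", "type": "a", "value": "192.168.3.23"},
    {"name": "auth.example.com", "type": "aaaa", "value": "fd00::23"},
    {"name": "www.other.test", "type": "cname", "value": "cdn.example.com"},
    {"name": "blog.example.com", "type": "a", "value": "192.168.2.9"},
]
raw = gzip.compress("\n".join(json.dumps(r) for r in records).encode())
print(list(subdomains_from_fdns(io.BytesIO(raw), "example.com")))
```

Checking the suffix of the name field, rather than matching anywhere in the line, avoids picking up unrelated hosts whose record value merely points at the target domain.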

Certificate Transparency Search Engines

To curb certificate-based security threats, certificate transparency was created as an open framework to allow for the monitoring of TLS/SSL certificates. This can be helpful during reconnaissance to enumerate any web servers with issued TLS/SSL certificates belonging to an organization.

crt.sh

crt.sh is a certificate transparency search engine with publicly accessible results and output that can be neatly formatted as JSON. Below is a command to extract all of the subdomains from issued TLS/SSL certificates for a given target.

$ curl 'https://crt.sh/?q=%.example.com&output=json' \
| jq '.[].name_value' \
| sed 's/\"//g' \
| sed 's/\*\.//g' \
| sort -u
	
ar.example.com
blog.example.com
el.example.com
helpdesk.example.com
...
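
For anything beyond a quick look, the response is easier to handle as parsed JSON. The Python sketch below works on an already-downloaded response body and assumes each entry has a "name_value" field that may hold several names separated by newlines; the function name and sample body are illustrative.

```python
import json
from typing import List


def names_from_crtsh(body: str, domain: str) -> List[str]:
    """Extract unique hostnames under domain from a crt.sh style JSON body."""
    names = set()
    for entry in json.loads(body):
        # name_value is assumed to hold one or more names, one per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):  # drop the wildcard prefix
                name = name[2:]
            if name == domain or name.endswith("." + domain):
                names.add(name)
    return sorted(names)


# Made-up response body in the assumed shape.
sample = (
    '[{"name_value": "*.example.com"},'
    ' {"name_value": "blog.example.com\\nhelpdesk.example.com"},'
    ' {"name_value": "ar.example.com"}]'
)
print(names_from_crtsh(sample, "example.com"))
```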

Certspotter

Certspotter is another certificate transparency search engine that provides an API and returns results formatted as JSON. As the certificates contained within each search engine might differ, it is wise to pull data from multiple sources so as not to miss anything valuable. Below is a command to extract all of the subdomains from issued TLS/SSL certificates for a given target.

$ curl -s https://certspotter.com/api/v0/certs\?domain\=example.com \
| jq '.[].dns_names[]' \
| sed 's/\"//g' \
| sed 's/\*\.//g' \
| sort -u

auth.example.com
blog.example.com
admin.example.com
test.example.com
...
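
Because each source sees a slightly different set of certificates, it helps to merge the lists and keep track of where each name came from. The Python sketch below is one hypothetical way to do that; the function name and the sample inputs are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def merge_sources(results: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Map each normalised hostname to the sorted list of sources that saw it."""
    merged = defaultdict(set)
    for source, names in results.items():
        for name in names:
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):  # treat wildcard entries as their base name
                name = name[2:]
            if name:
                merged[name].add(source)
    return {name: sorted(sources) for name, sources in sorted(merged.items())}


# Made-up per-source results.
combined = merge_sources({
    "crt.sh": ["blog.example.com", "ar.example.com"],
    "certspotter": ["*.example.com", "Blog.example.com", "auth.example.com"],
})
for name, sources in combined.items():
    print(name, sources)
```

Names reported by only one source are worth a second look, since they are exactly the ones that would have been missed by relying on a single search engine.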

Stay Tuned

Check back for part 2, where we discuss scraping passive data sets like archive.org, finding secrets in public code repositories, and locating user passwords from deleted Pastebin pages.