MetaGooFil: Scraping Google For File MetaData So You Don’t Have To

MetaGooFil is a Python script written by Christian Martorella, who also wrote theHarvester for collecting email addresses.  MetaGooFil automates Google queries that look for files (PDF, Word, etc.) associated with a specified domain name, downloads them, and then extracts metadata from those files.
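Under the hood, this boils down to Google dork queries.  If you were doing one of these searches by hand (an illustrative example, not MetaGooFil's literal syntax), it would look something like:

site:example.com filetype:pdf

MetaGooFil just issues a query like that for each file type you ask for, scrapes the result links, and downloads the files.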

This metadata can be useful for pen testing.  It often includes the file name and size, the username of the author, and the path where the file was saved on the author's machine, all of which hint at internal usernames and naming conventions.
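If you want to see this kind of metadata on a single file yourself, a general-purpose tool like exiftool (not part of MetaGooFil, just an illustration) will dump whatever tags it can find:

exiftool some_downloaded_report.pdf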

How do I get it?

As with many other tools, if you have Kali Linux, it’s already installed.  Otherwise, check out the MetaGooFil repository.  On a Debian-based system, you can also type this to get it:

apt-get install metagoofil
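To confirm it landed on your system and see where it lives, you can run:

which metagoofil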

How do I use MetaGooFil?

The script lives in the /usr/bin/ directory, which is already on your PATH, so strictly speaking you can run it from anywhere.  To follow along the same way, navigate there:

cd /usr/bin/

And then invoke the script:

./metagoofil [options here]
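Run with no options at all, the script should just print its usage banner and the list of switches, which makes for a quick sanity check that everything works:

./metagoofil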

To specify the target domain, use -d followed by the domain name (in practice you’ll combine this with the other switches below):

./metagoofil -d example.com

To restrict the search to particular file types, use -t followed by a comma-separated list.  The options are:  pdf,doc,xls,ppt,odp,ods,docx,xlsx,pptx.  You can also cap the number of files downloaded by using the -n switch, followed by a number.  For example:

./metagoofil -d example.com -t pdf,pptx -n 30
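There is also an -l switch that limits how many Google search results are examined in the first place (it defaults to a couple hundred in the versions I’ve seen).  A fuller search-tuning example:

./metagoofil -d example.com -t pdf -l 100 -n 30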

Lastly, use -o to specify a local directory for saving the downloaded files, and -f to name the report file MetaGooFil generates:

./metagoofil -d example.com -t pdf,pptx -n 30 -o results_dir -f results.html

Once you’ve downloaded a batch of files, you can drop the -d switch and domain name entirely and pass -h yes instead, which tells MetaGooFil to analyze files already sitting in the output directory rather than hitting Google again.
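A minimal sketch of that local-analysis run, assuming the files from the earlier example are already in results_dir:

./metagoofil -h yes -o results_dir -f results.html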

Results

Test runs against microsoft.com and syngress.com didn’t turn up anything, but you may have better luck with other targets.  When a target site does host files of the specified types, MetaGooFil strips out the useful information (path names, usernames, etc.) and collects it in the report file for you.
