Whether Google Webmaster Tools link data is reliable enough for a successful link cleanup is not a new topic in the SEO community.
Last year John Mueller explained that GWT data is enough, and pointed out that links missing from the report weren’t significant anyway. Later, John added that using 3rd party tools is not necessarily a bad idea, as they give us a greater ability to process link data and find problematic areas (e.g. anchor text abuse and other spammy patterns).
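To make John’s point concrete, here’s a minimal sketch of the kind of processing a 3rd party export makes possible: spotting anchor text abuse by flagging anchors that make up an unusually large share of the link profile. The CSV-style data, domain names and the 25% threshold are all hypothetical assumptions, not anything Google or a specific tool prescribes.

```python
from collections import Counter

def flag_spammy_anchors(links, threshold=0.25):
    """Flag anchor texts that account for more than `threshold` of all
    backlinks. An unusually concentrated (often commercial) anchor is a
    classic sign of manipulative link building."""
    counts = Counter(anchor.strip().lower() for _, anchor in links)
    total = sum(counts.values())
    return {a: n / total for a, n in counts.items() if n / total > threshold}

# Hypothetical (source URL, anchor text) pairs from a backlink export
links = [
    ("http://blog-a.example/post", "cheap payday loans"),
    ("http://blog-b.example/post", "cheap payday loans"),
    ("http://blog-c.example/post", "cheap payday loans"),
    ("http://news.example/story", "Acme Finance"),
    ("http://forum.example/thread", "http://acme.example"),
    ("http://dir.example/listing", "click here"),
]
print(flag_spammy_anchors(links))  # → {'cheap payday loans': 0.5}
```

Half the profile pointing at one exact-match money phrase would stand out immediately, which is exactly the kind of pattern a thin sample can hide.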
I’d like to highlight one very important point that most webmasters are missing here.
Why is Google showing us examples of manipulative links?
The purpose of the examples on Google’s Manual Actions page is to give us an idea of what types of links breach their guidelines. It’s an educational device to help us recognise bad links when we see them. It was never meant to be a case of “fix these 14 links and you’ll be sweet.” More than once, Google’s search quality team has lifted a manual penalty for websites that demonstrated good faith and significant effort in their cleanup process.
Document your link cleanup process and show evidence of genuine effort to Google’s webspam team when filing a reconsideration request. One or two links missing from the sample link report will not be a problem for them.
The Real Issue
One problem that remains unsolved, however, is link data for very large websites. As I write this, one of my guys is bullet-proofing a big corporate website, making sure its links are squeaky clean. What they’re finding is that the sample of links Google provides is far too thin in proportion to the site’s true link profile, the kind of picture we can only get using 3rd party tools.
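For a site this size, you can actually quantify how thin the sample is. A rough sketch, assuming two URL lists (one exported from GWT, one from a 3rd party backlink tool, both with made-up example domains here), measures what share of the known linking domains Google’s sample covers:

```python
from urllib.parse import urlparse

def linking_domains(urls):
    """Collapse a list of linking URLs to their unique domains."""
    return {urlparse(u).netloc for u in urls}

def gwt_coverage(gwt_sample, full_profile):
    """Share of the full (3rd party) link profile, by linking domain,
    that the GWT sample actually covers."""
    gwt = linking_domains(gwt_sample)
    full = linking_domains(full_profile)
    return len(gwt & full) / len(full) if full else 0.0

# Hypothetical exports: GWT sample vs. a fuller 3rd party crawl
gwt_sample = [
    "http://blog-a.example/post",
    "http://dir.example/listing",
]
full_profile = [
    "http://blog-a.example/post",
    "http://blog-b.example/post",
    "http://dir.example/listing",
    "http://forum.example/thread",
]
print(gwt_coverage(gwt_sample, full_profile))  # → 0.5
```

On a gigantic site, a coverage number like this makes the gap between the sample and the real profile plain, which is precisely the argument for scaling the report.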
So, Googlers, if you’re reading this, here’s an idea: scale the number of links shown in Google Webmaster Tools in line with the size of a website’s link profile, so that we have enough data to work with.