A question in a comment prompted an answer that I think deserves a wider audience, because it explains how to remove a page from Google.
If you’ve ever wanted to do that, here’s how.
But before I get into that, here’s how this post started:
FortitudoX asked on Header tags and how to use them in SEO Copy:
Quick (Well it started quick and then grew) question for you; I use archive pages as main pages and thus have roughly 25 H1s for each author page. I know keyword density isn’t a huge factor but does using my key word 25 times in a H1 setting on 1 page affect my SEO negatively? My keyword pops up roughly 75 times per page, 1/3 from H1, 1/3 from body and 1/3 from URL. Example page can be found here:http://www.brilliantlifequotes…
This is one of those questions that I’ve never got an absolutely clear answer to – either from personal experience, or others’ recommendations.
Let’s try to get to the bottom of your situation, though.
There are two options open to you:
1. Leave well alone!
Writing For SEO has category pages such as this one. I don’t normally worry about pages like that getting spidered.
Why? I have no reason to believe that Google doesn’t understand a standard blog structure – or, more to the point, a standard WordPress structure. After all, over 20% of the top 10 million sites on the Internet are supposed to be hosted on WordPress, so there are lots of sites doing this.
And from what I’ve seen across many, many sites I’m convinced Google ignores duplicate content issues arising from things like category pages. However, the issue isn’t duplication on your pages, but over-optimisation – high key phrase densities.
If you look at the content on my Category pages and compare it to your Bertrand Russell page, the key phrase densities are much lower for my less focused content. I don’t have the same key phrase repeated time after time.
So if you’re finding your natural search engine results are getting worse, or you’ve never ranked for some of these terms, then you may want to try removing the author pages from Google’s index.
2. Get Google to ignore the pages
This may look a bit scary, but it’s really not that difficult – particularly if you’re using WordPress, as I think you are. So here goes…
1. Stop the pages getting spidered
The first thing to do is to put a meta tag in the header of each of the pages (or posts) that you want Google to forget about.
<meta name="robots" content="noindex,follow"/>
It tells search engine robots not to index the page, but to follow any links to other pages.
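If you'd like to confirm the tag really is in a page's header, here is a minimal Python sketch that reads the robots directives out of some HTML. It uses only the standard library. The `RobotsMetaParser` class, the `robots_directives` helper and the sample page are names and content I've made up for illustration; they aren't part of WordPress or any SEO tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta .../> also end up here.
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in attr_map.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page source, standing in for one of your own pages.
sample_page = """<html><head>
<title>Example author page</title>
<meta name="robots" content="noindex,follow"/>
</head><body><h1>Quotes</h1></body></html>"""

directives = robots_directives(sample_page)
print("noindex" in directives)
print("follow" in directives)
```

Both checks should come back true for the sample page. To test a live page you would fetch its HTML first and pass that to `robots_directives` instead.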
If you’re using WordPress, you can get this done much more simply by installing the WordPress SEO plug-in – the one everyone calls Yoast.
To insert the Robots meta tag into a Category, click on Categories under Posts on the WordPress dashboard. Then click on Edit under whichever category you want to stop being spidered. Use the Noindex this category pull-down to choose Always Noindex. You can also exclude the page from the sitemap by using the other pull-down. Click Update and you’re done.
If you want to Noindex a post, you’ll find the settings under the Advanced tab on the post edit screen. And there’s an identical tab for Noindexing pages.
2. Head over to Google Webmaster Tools
Choose Remove URLs:
Click on Create a new removal request, and paste in your URL. Then choose Remove from Index and Cache and submit the URL.
The URL will then show as Pending.
Come back in a few days and check if its status has changed.
Once it has, confirm your page has really been removed. Type:
cache:your page’s URL
into Google. A 404 error confirms your page is no longer in Google’s cache.
3. Just to be safe
OK. So Google has removed your URL.
There’s one more thing to do to remove a page from Google, just to be safe: add a rule to your robots.txt file.
I set it up using the Files section in WordPress SEO, but you can create it as a text file and FTP it to your server.
To make doubly sure that my category pages aren’t spidered, I add this line:
and save my robots.txt file.
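Before uploading the file, you can sanity-check a rule with a few lines of Python. This is only a sketch: the `Disallow: /category/` rule, the example.com URLs and the `is_blocked` helper are assumptions of mine for illustration, not the exact line from my own file. It relies on the standard library's `urllib.robotparser`, whose matching is simpler than Google's, so treat the result as a rough guide.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. The /category/ path is an assumption;
# use whatever path your own category or author pages live under.
robots_txt = """User-agent: *
Disallow: /category/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())


def is_blocked(url, agent="Googlebot"):
    """Return True if the rules above stop this agent fetching the URL."""
    return not parser.can_fetch(agent, url)


# Hypothetical URLs: one category page and one ordinary post.
print(is_blocked("https://example.com/category/seo-copywriting/"))
print(is_blocked("https://example.com/a-normal-post/"))
```

With the rule above, the category URL should be reported as blocked and the ordinary post as allowed.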
As I wrote this post, Google updated its robots.txt tester. You can find it in Google Webmaster Tools, where you click robots.txt Tester under Crawl.
The tool highlights any problems you have with the file.
There’s lots more you can do with robots.txt. Visit this site for chapter and verse.
FortitudoX: Let me know if you decide to remove your author pages from Google, and if it helps your site’s performance.
Indeed, have any other readers had success by removing potentially problematic pages from Google?
Thanks to Vox Efx for his great photo!