In case you missed the news, Google’s Webmaster Central announced that in accordance with the new Sitemaps.org protocol, you no longer have to submit your XML sitemap to Google or other search engines. Instead, you can make your sitemap “autodiscoverable” by directing the visiting web crawler to the location of the sitemap.
A sitemap is an XML file on your blog’s server with a “roadmap” of all the posts on your blog and how to move through your blog. While blogging programs haven’t caught up with XML sitemaps as a built-in feature…yet…you can add an XML sitemap to your WordPress blog with the Google Sitemap Generator WordPress Plugin.
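If you are curious what such a file looks like inside, here is a minimal sketch in Python that builds one by hand. It assumes the basic `urlset`/`url`/`loc` layout from the Sitemaps.org protocol; the function name `build_sitemap` and the example URLs are made up for illustration, and a plugin will normally add more detail (dates, priorities) than this does.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body

# Hypothetical post addresses, for illustration only.
sitemap_xml = build_sitemap([
    "http://example.com/",
    "http://example.com/2007/06/hello-world/",
])
print(sitemap_xml)
```

Each post gets its own `url` entry with its address in `loc`, which is all a crawler strictly needs to find the page.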
The new instructions for directing the search engine web crawler to the sitemap are:
- With a text editor, edit your site’s robots.txt file.
- Add the following line with your sitemap location, such as:
Sitemap: http://example.com/sitemap.xml
- Save the robots.txt file.
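If you would rather script the edit than open a text editor, the steps above can be sketched roughly like this in Python. The helper name `add_sitemap_line` is hypothetical, and the sketch works on the file's text rather than the file itself, so reading and saving robots.txt on your own server is left to you.

```python
def add_sitemap_line(robots_text, sitemap_url):
    """Return robots.txt content with a Sitemap line for sitemap_url, added only once."""
    lines = robots_text.splitlines()
    # Leave the text alone if this sitemap is already listed.
    # The field name is compared case-insensitively.
    for line in lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_text
    lines.append(f"Sitemap: {sitemap_url}")
    return "\n".join(lines) + "\n"

# A simple "allow everything" robots.txt, used here as sample input.
existing = "User-agent: *\nDisallow:\n"
updated = add_sitemap_line(existing, "http://example.com/sitemap.xml")
print(updated)
```

Running the helper a second time on its own output changes nothing, so it should be safe to call whenever you are unsure whether the line is already there.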
Search engine web crawlers will find the file and use it as a map to your blog’s contents.
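To get a feel for what the crawler does on its end, here is a small sketch using Python's standard `urllib.robotparser` module (the `site_maps()` method needs Python 3.8 or later). This is only an illustration of reading the Sitemap line out of a robots.txt file, not how any particular search engine actually implements it, and the robots.txt content is sample text.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content with a Sitemap line added.
robots_txt = """\
User-agent: *
Disallow:

Sitemap: http://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```

The Sitemap line is independent of any User-agent section, so it does not matter where in the file you put it.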
So far, Google, Yahoo, Microsoft Live Search, and Ask.com all support the new autodiscovery protocol.
For information on what a robots.txt file is, how to make one, and other benefits, see:
- Create A Robots.txt File And Increase Your Search Engine Rankings
- Daily Blog Tips - Create a robots.txt file
- Enginemage’s Robots Listing
- Database of Search Engine Robots
- Crawler Alert: Be notified by Email when your web page has been spidered
- The User Agent Database (spiders, robots, harvesters, etc.)
- List of Robot Agent Strings
- The Web Robots Database
- Site Map Tool checks your spidered pages with Google.com’s database. Good for verifying titles and descriptions.
- Robots.txt Syntax Checker
- Search Engine Optimization Tools - Search Engine Spider Simulator
- How to block spambots, ban spybots, and tell unwanted robots to go to hell
- Using Apache to Stop Bad Robots
- Search Engine Robots that Search Your Site
- HTML Author’s Guide to the Robots Exclusion Protocol
- A Standard for Robots Inclusion
- All About Search Indexing Robots and Spiders
- Generating Simple URLs for Search Engines Robots and Spiders
Related Articles
- Search Engine Friendly: Helping Googlebot Crawl Your Blog
- How Search Engines See, Search, and Visit Your Website
- How People Search the Web and How They Can Find Your Blog
- Website Development - Search Engine Submission Preparation
- RSSTop55 - Best Blog Directory And RSS Submission Sites
- Secret Out - How Google Ranks Websites
- More Than You Want to Know - Search Engine Articles, Information, and Resources
- New Search Engines Help Users Find Blogs
- Do-It-Yourself Search Engine Optimization Guide
- Search Engine Site Submission Secrets
- Website Hammered by Hotlinking, Spammers, and Free Loaders?


Site Search Tags: sitemaps, google sitemaps, autodiscovery sitemaps, robots.txt, seo, search engine optimization, page rank, pagerank, ranking, search engines, getting found on the web, blogging tips, seo tips
Copyright Lorelle VanFossen, member of the 9Rules Network
18 Comments
Hi Lorelle,
I love your site and ordered your book this past Friday. Do you have an idea as to when I might expect it in the mail?
A couple of other questions.
Regarding the robots.txt file, I have seen conflicting guidance on what is best … some feel that Google’s algorithms sort everything out and that comments are okay to include, and obviously some think not. In fact, I’ve looked at the robots file for some popular sites and some of them implement what you outline in the robots.txt discussions and some of them just have a two-liner that allows everything and disallows nothing (which is what I am presently using). What way should I really go on this?
In a related vein, if I try to exclude comments from the robots spidering, is the /wp-admin/comments really the right thing to use in all circumstances, or can it depend on the WP theme? For example, here is the link to some comments on one of my recent posts:
http://www.dkeener.com/keenstuff/blog/2007/05/31/three-huge-tech-developments-this-week/#comments
Note the /#comments is after the blog name and after the title and so on, and does not even reference wp-admin. Additionally, there does not seem to be a directory of /wp-admin/comments/ within wordpress …
I am a bit new to this, only having been at it for a few months, so it is all a bit confusing to me.
My final question, and thank you for your patience, is: should I delete posts that I think are junk? For example, I have several Site News posts where I talk about trying out new themes and so on, and I know for a fact that my readers just don’t care about that stuff … they just want me to stick to the blog’s focus, and I am finally beginning to understand that. But, seems like I read somewhere that Scoble recommends never deleting a post. I am inclined to do it anyway, as there are probably 15 entries out of my 150 that are “junking up my site.” But, your guidance in this area will be of paramount importance to me in deciding on this.
Best regards. I look forward to reading (studying) your book.
Bruce
I’ll check on the status of your book order and let you know.
On the questions, you are thinking too much. As someone (including me) recommended, narrow your focus to your content and quit tweaking.
If Google indexes your comments, it’s just more keyword connections for you if the comment content has value. I’m not writing in my blog post on this topic, but the answer, if someone else was seeking it, would be in this comment, right? So why rule out the value of comment contributions?
You do understand that this has nothing to do with the nofollow issue, right? Completely different point. Just let your comments be indexed and get more coverage and stop thinking so much about this new toy you are playing with.
Next, define “junk”. What is junk to some is value to others. If you don’t like the posts and they don’t add value, then sure, get rid of them, but put your energy into producing even better, more focused and helpful content. Use them to learn from and grow. Concentrate on what you will do, not what you did do, except to learn from it and move forward.
Remember, even if you delete them, they are not “gone”. They are there forever. And when you have reached 1500 posts, that’s the time to really look at what kind of body of work you’ve created and where your focus is. If you want to start focusing your blog content, do it from the moment fingers touch keyboard, not after publishing.
And you will find those tips and more in the book.
Thank you, Lorelle, for the great advice.
Actually I don’t understand the difference between the /wp-admin/comments exclusion in the robots.txt file and the nofollow. I assume it is covered in your book, though, and, even if not I will research it.
Thank you again!
The usage of rel="nofollow" in a link is supposed to indicate to the search engine, specifically Google as I never heard of any other search engines adopting this, that the link was to get no credit. No link juice. No page rank influence. Google was to ignore the link.
Many believed this meant that the link would also not be indexed, added to Google’s database. This was never true.
I don’t know why you want to exclude comments in your robots.txt file. Why bother? As I said, comments are content.
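For anyone who still wants to see what such a rule would and would not catch, here is a rough sketch with Python's standard `urllib.robotparser`. The URLs are made-up examples and the helper name `allowed` is hypothetical. It follows the simple prefix matching of the robots exclusion convention; real crawlers may differ in the details.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, url, agent="*"):
    """Return True if robots_txt lets the given agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# The "two-liner" that allows everything and disallows nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# A rule that blocks only paths starting with /wp-admin/comments.
BLOCK_ADMIN_COMMENTS = "User-agent: *\nDisallow: /wp-admin/comments\n"

# Hypothetical URLs, for illustration only.
post_comments = "http://example.com/2007/05/31/some-post/#comments"
admin_comments = "http://example.com/wp-admin/comments"

for name, rules in [("allow all", ALLOW_ALL), ("block admin comments", BLOCK_ADMIN_COMMENTS)]:
    print(name, allowed(rules, post_comments), allowed(rules, admin_comments))
```

In this sketch the Disallow rule never touches the comments shown on a post page, because that address does not begin with /wp-admin/comments.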
Thanks, again, Lorelle. I finally understand. I’ll not mess with the comments. Good points you brought out (of course).
Although I’ve heard about auto discovery of sitemap a long time back, I didn’t feel like changing because there was much controversy over the subject. Take that article on QuickOnlineTips which found that a single line could produce errors.
Anyway I’ll try implementing that today and will be checking through the webmasters tools if there are any errors.
A robots.txt file should always exist on the system. As Google, MSN, and Yahoo have agreed to use the same Sitemap protocol, it makes sense to make the most of robots.txt intelligently.
Thanks! this site has been very helpful to me ^_^
Hi,
Thanks for this post. I have not done my sitemap yet on my site. Maybe that is the answer to another question that has been bothering me. I posted on your Do It Yourself SEO post way back. You helped me and my SEO has been great…on Yahoo. I recently realized that while I come up first or on the first page for related searches on Yahoo, I am nowhere to be found for the same searches on Google. Any thoughts on why that is?
Google and Yahoo use very different methods for establishing page ranking. When I write my SEO instructions, they are typically geared towards Google as that is the favorite flavor of the month.
There are many reasons. It could be that you have less competition on Yahoo. It could mean that something you are doing is costing you points or you aren’t doing enough to earn points on Google. Or it could be as simple as you aren’t using the right keywords to search for your blog on Google. See Blog Challenge: What Keywords Make You Number One in Google for some tips on how to find out what makes you number one.
Excellent Information Lorelle!
Thanks
Chris
“Many believed this meant that the link would also not be indexed, added to Google’s database. This was never true. ”
Hey Lorelle,
Does this mean that, although comments never share PageRank or ranking juice of any kind, the linked page may still be “spidered” and then indexed? If that’s the case, wouldn’t this still be a significant source of SEO, as it would let Google continue to crawl the site and add it to relevant searches? If this is the case, then is “nofollow” a threat to SEO at all?
Thanks,
Bryan
@ Bryan Miller - SEO “Guru” Exploiter:
Any link can be “spidered” (crawled is the better term) no matter where it is. Comments are added and indexed by Google, whether or not there is a nofollow. As I understand it, no other web crawler signed on for the nofollow tag.
The point in what you are asking is “added to relevant searches”. If the search terms are in ANY content, comments, post content, site content, or otherwise, it will appear in the search results. As for its relevancy, that’s the mysterious algorithm for Google.
Nofollow is dead. It never worked and it plays no part in SEO any more. People keep bringing it up, but it’s a lost cause. If the search terms are narrow enough, nofollow enabled links will rise to the top.
Remember, Google ain’t the only game in town. Also, Google gets their information through other search engines and directories, not just their own web crawler.
And as an SEO Guru, you really should know this. It’s old news.
Yes but regardless of them being autodiscoverable if you change your website and google has a cached version of your website; they will use the old one until they update it. It is useful to submit a new sitemap when you update your page.
The ‘no follow’ tags are said to be included in site maps so that they are not indexed by Google, but Lorelle seems to think otherwise. Whatever it is, their algorithms are never disclosed, be it for PageRank or indexing.
@ Internet:
I haven’t said anything about nofollow in XML sitemaps (which are different from site maps - I wish I was in charge of naming things) but Google and other search engines did not follow through on the nofollow tag. Google alone uses it and indexes all links with the nofollow but doesn’t apply it to PageRank points. Their algorithms are disclosed in their patent applications, with some things held back publicly or wrapped around complex legal language, but on this issue, even Google admits that nofollow was a great idea but didn’t work.
Hi Lorelle! I really appreciate the post. I at first thought a site map had to be created for any webpage. Maybe that is only the case with a wordpress blog? Thanks again.
Chris
@ Chris, Success4uTeam!:
A sitemap is a hidden file in your root directory. There are WordPress Plugins that will automatically update the XML sitemap when you add a new post, which makes it nice and simple - and forgettable. Hopefully, a built-in sitemap will be added to an upcoming version of WordPress. It was just added to WordPress.com so it could be coming to the full version. It’s becoming a standard as it was accepted by the major search engine players a year ago.
As for non-WordPress blogs, I don’t know what’s available to create a sitemap automatically, but if you have to create one manually and upload it every time to your server…I’d switch to something else or forget about it.
If you have a well designed website with strong intrasite navigation, and/or a site map, which is the name of a table of contents on your site, you don’t need one. It just helps. It doesn’t give you brownie points or anything. It just speeds up the process a bit and makes sure everything gets indexed.
15 Trackbacks/Pingbacks
[...] Sitemaps: Lorelle explains the new “autodiscovery” feature around sitemaps. It includes a vast collection of links related to the topic. [...]
[…] Lorelle on WordPress has advice on using the Google Sitemap Generator for WordPress. […]
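[...] 1 view. News found via Lorelle: search engines can pick up the sitemap’s address from the robots file, so there is no need to go to each search engine specifically to submit the sitemap’s address. [...]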
[...] by Bob Morris on June 5th, 2007 Lorelle has the details. The plug-in installs easily and quickly, and automatically gives Google, Yahoo, and MSN more info [...]
[...] Read the full article [...]
[...] SEO Sitemaps Now Autodiscoverable: Easy and Automatic Roadmaps to Your Blog Content Tags: blog, Google, motori di ricerca, ottimizzazione, ranking, robots.txt, seo, sitemap [...]
[...] robots.txt file should exist in the root directory of your website. For example, if your domain is [...]
[...] Create or use a sitemap with a WordPress Plugin to build a roadmap/table of contents of your blog for search engine web crawlers. [...]
[...] how to submit your sitemap to Google, but you can thank Lorelle VanHousen for her post informing me sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]
[...] how to submit your sitemap to Google, but you can thank Lorelle VanFossen for her post informing me sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]
[...] you how to submit your sitemap to Google, but a little bird named Lorelle VanFossen let me know sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]