
SEO Sitemaps Now Autodiscoverable: Easy and Automatic Roadmaps to Your Blog Content

In case you missed the news, Google’s Webmaster Central announced that, in accordance with the new autodiscovery protocol, you no longer have to submit your XML sitemap to Google or other search engines. Instead, you can make your sitemap “autodiscoverable” by directing the visiting web crawler to the location of the sitemap.

A sitemap is an XML file on your blog’s server with a “roadmap” of all the posts on your blog and how to move through your blog. While blogging programs haven’t yet caught up with XML sitemaps as a built-in feature, you can add an XML sitemap to your WordPress blog with the Google Sitemap Generator WordPress Plugin.
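
To give a rough idea of what such a file holds, here is a minimal sketch in Python that builds a bare-bones sitemap. The post URLs and the `build_sitemap` helper are made up for illustration, and a plugin-generated sitemap will typically carry more detail than this.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML text with one <url><loc> entry per address."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical post addresses, for illustration only.
posts = [
    "http://example.com/2007/06/01/first-post/",
    "http://example.com/2007/06/02/second-post/",
]
print(build_sitemap(posts))
```

Each `<loc>` entry is one stop on the roadmap the crawler follows.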

The new instructions for directing the search engine web crawler to the sitemap are:

  1. With a text editor, edit your site’s robots.txt file.
  2. Add a line pointing to your sitemap’s location, such as:
    Sitemap: http://example.com/sitemap.xml
  3. Save the robots.txt file.
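
If you would rather script the three steps above than edit by hand, a small sketch along these lines might do. The `add_sitemap_line` helper is hypothetical, and it works on the text of the file so you can check the result before saving it back.

```python
def add_sitemap_line(robots_text, sitemap_url):
    """Return robots.txt text with a Sitemap line, added only if missing."""
    for line in robots_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "sitemap" and value.strip() == sitemap_url:
            return robots_text  # this sitemap is already listed
    # Make sure the new line starts on its own line.
    if robots_text and not robots_text.endswith("\n"):
        robots_text += "\n"
    return robots_text + f"Sitemap: {sitemap_url}\n"

# A simple allow-everything robots.txt, used here as sample input.
existing = "User-agent: *\nDisallow:\n"
print(add_sitemap_line(existing, "http://example.com/sitemap.xml"))
```

Running it a second time on its own output changes nothing, so it is safe to repeat.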

Search engine web crawlers will find the file and use it as a map to your blog’s contents.
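
To see the discovery step from the crawler’s side, here is a sketch using Python’s standard `urllib.robotparser` module (the `site_maps()` method needs Python 3.8 or later). It is only a stand-in for a real crawler, not a description of how any search engine works internally, and the robots.txt content is sample text.

```python
from urllib.robotparser import RobotFileParser

# Sample robots.txt content; a real crawler would fetch this from the site.
robots_txt = """\
User-agent: *
Disallow:
Sitemap: http://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)
```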

So far, Google, Yahoo, Microsoft Live Search, and Ask.com all support the new autodiscovery protocol.


Copyright Lorelle VanFossen, member of the 9Rules Network

21 Comments

  1. Posted June 3, 2007 at 8:12 am | Permalink

    Hi Lorelle,

    I love your site and ordered your book this past Friday. Do you have an idea as to when I might expect it in the mail?

    A couple of other questions.

    Regarding the robots.txt file, I have seen conflicting guidance on what is best … some feel that Google’s algorithms sort everything out and that comments are okay to include, and obviously some think not. In fact, I’ve looked at the robots file for some popular sites and some of them implement what you outline in the robots.txt discussions and some of them just have a two-liner that allows everything and disallows nothing (which is what I am presently using). What way should I really go on this?

    In a related vein, if I try to exclude comments from the robots spidering, is /wp-admin/comments really the right thing to use in all circumstances, or can it depend on the WP theme? For example, here is the link to some comments on one of my recent posts:

    http://www.dkeener.com/keenstuff/blog/2007/05/31/three-huge-tech-developments-this-week/#comments

    Note that the /#comments comes after the blog name, the title, and so on, and does not even reference wp-admin. Additionally, there does not seem to be a /wp-admin/comments/ directory within WordPress …

    I am a bit new to this, only having been at it for a few months, so it is all a bit confusing to me.

    My final question, and thank you for your patience, is: should I delete posts that I think are junk? For example, I have several Site News posts where I talk about trying out new themes and so on, and I know for a fact that my readers just don’t care about that stuff … they just want me to stick to the blog’s focus, and I am finally beginning to understand that. But, seems like I read somewhere that Scoble recommends never deleting a post. I am inclined to do it anyway, as there are probably 15 entries out of my 150 that are “junking up my site.” But, your guidance in this area will be of paramount importance to me in deciding on this.

    Best regards. I look forward to reading (studying) your book.

    Bruce

  2. Posted June 3, 2007 at 10:10 am | Permalink

    I’ll check on the status of your book order and let you know.
    On the questions, you are thinking too much. As someone (including me) recommended, narrow your focus to your content and quit tweaking.
    If Google indexes your comments, it’s just more keyword connections for you if the comment content has value. I’m not writing in my blog post on this topic, but the answer, if someone else was seeking it, would be in this comment, right? So why rule out the value of comment contributions?
    You do understand that this has nothing to do with the nofollow issue, right? Completely different point. Just let your comments be indexed and get more coverage and stop thinking so much about this new toy you are playing with.
    Next, define “junk”. What is junk to some is value to others. If you don’t like the posts and they don’t add value, then sure, get rid of them, but put your energy into producing even better, more focused and helpful content. Use them to learn from and grow. Concentrate on what you will do, not what you did, except to learn from it and move forward.
    Remember, even if you delete them, they are not “gone”. They are there forever. And when you have reached 1500 posts, that’s the time to really look at what kind of body of work you’ve created and where your focus is. If you want to start focusing your blog content, do it from the moment fingers touch keyboard, not after publishing.
    And you will find those tips and more in the book.

  3. Posted June 3, 2007 at 10:17 am | Permalink

    Thank you, Lorelle, for the great advice.

    Actually I don’t understand the difference between the /wp-admin/comments exclusion in the robots.txt file and the nofollow. I assume it is covered in your book, though, and, even if not I will research it.

    Thank you again!

  4. Posted June 3, 2007 at 9:25 pm | Permalink

    The usage of rel="nofollow" in a link is supposed to indicate to the search engine, specifically Google as I never heard of any other search engines adopting this, that the link was to get no credit. No link juice. No page rank influence. Google was to ignore the link.

    Many believed this meant that the link would also not be indexed, added to Google’s database. This was never true.

    I don’t know why you want to exclude comments in your robots.txt file. Why bother? As I said, comments are content.

  5. Posted June 4, 2007 at 5:43 am | Permalink

    Thanks, again, Lorelle. I finally understand. I’ll not mess with the comments. Good points you brought out (of course).

  6. Posted June 6, 2007 at 1:53 am | Permalink

    Although I’ve heard about auto discovery of sitemap a long time back, I didn’t feel like changing because there was much controversy over the subject. Take that article on QuickOnlineTips which found that a single line could produce errors.

    Anyway I’ll try implementing that today and will be checking through the webmasters tools if there are any errors.

  7. Posted June 16, 2007 at 12:57 am | Permalink

    A robots.txt file should always exist on your site. Since Google, MSN, and Yahoo have agreed on the same Sitemap standard, it makes sense to make intelligent use of robots.txt.

  8. Posted June 19, 2007 at 10:19 am | Permalink

    Thanks! this site has been very helpful to me ^_^

  9. Posted June 20, 2007 at 3:53 am | Permalink

    Hi,
    Thanks for this post. I have not done my sitemap yet on my site. Maybe that is the answer to another question that has been bothering me. I posted on your Do It Yourself SEO post way back. You helped me and my SEO has been great…on Yahoo. I recently realized that while I come up first or on the first page for related searches on Yahoo, I am nowhere to be found for the same searches on Google. Any thoughts on why that is?

  10. Posted June 20, 2007 at 11:22 am | Permalink

    Google and Yahoo use very different methods for establishing page ranking. When I write my SEO instructions, they are typically geared towards Google as that is the favorite flavor of the month.

    There are many possible reasons. It could be that you have less competition on Yahoo. It could mean that something you are doing is costing you points or you aren’t doing enough to earn points on Google. Or it could be as simple as you aren’t using the right keywords to search for your blog on Google. See Blog Challenge: What Keywords Make You Number One in Google for some tips on how to find out what makes you number one.

  11. Posted June 24, 2007 at 9:08 am | Permalink

    Excellent Information Lorelle!

    Thanks
    Chris

  12. Posted December 24, 2007 at 11:27 am | Permalink

    “Many believed this meant that the link would also not be indexed, added to Google’s database. This was never true. ”

    Hey Lorelle,
    Does this mean that, although comments never share pagerank, ranking juice of any kind, that the linked page may still be “spidered” and then indexed? If that’s the case, wouldn’t this also still be used as a significant source of SEO as it would let Google continue to crawl the site and then added to relevant searches? If this is the case, then is “nofollow” a threat to SEO at all?

    Thanks,
    Bryan

  13. Posted December 24, 2007 at 2:49 pm | Permalink

    @ Bryan Miller – SEO “Guru” Exploiter:

    Any link can be “spidered” (crawled is the better term) no matter where it is. Comments are added and indexed by Google, whether or not there is a nofollow. As I understand it, no other web crawler signed on for the nofollow tag.

    The point in what you are asking is “added to relevant searches”. If the search terms are in ANY content, comments, post content, site content, or otherwise, it will appear in the search results. As for its relevancy, that’s the mysterious algorithm for Google.

    Nofollow is dead. It never worked and it plays no part in SEO any more. People keep bringing it up, but it’s a lost cause. If the search terms are narrow enough, nofollow enabled links will rise to the top.

    Remember, Google ain’t the only game in town. Also, Google gets their information through other search engines and directories, not just their own web crawler.

    And as an SEO Guru, you really should know this. It’s old news. ;-)

  14. NYC
    Posted July 2, 2008 at 11:38 am | Permalink

    Yes, but regardless of sitemaps being autodiscoverable, if you change your website and Google has a cached version of it, they will use the old one until they update it. It is useful to submit a new sitemap when you update your page.

  15. Internet
    Posted July 2, 2008 at 10:45 pm | Permalink

    The ‘no follow’ tags are said to be included in site maps so that they are not indexed by Google, but Lorelle seems to think otherwise. Whatever it is, their algorithms are never disclosed, be it for PageRank or indexing.

  16. Posted July 3, 2008 at 8:26 am | Permalink

    @ Internet:

    I haven’t said anything about nofollow in XML sitemaps (which are different from site maps – I wish I was in charge of naming things :D ) but Google and other search engines did not follow through on the nofollow tag. Google alone uses it and indexes all links with the nofollow but doesn’t apply it to PageRank points. Their algorithms are disclosed in their patent applications, with some things held back publicly or wrapped around complex legal language, but on this issue, even Google admits that nofollow was a great idea but didn’t work.

  17. Chris, Success4uTeam!
    Posted July 3, 2008 at 9:07 am | Permalink

    Hi Lorelle! I really appreciate the post. I at first thought a site map had to be created for any webpage. Maybe that is only the case with a WordPress blog? Thanks again.

    Chris

  18. Posted July 3, 2008 at 6:30 pm | Permalink

    @ Chris, Success4uTeam!:

    A sitemap is a hidden file in your root directory. There are WordPress Plugins that will automatically update the XML sitemap when you add a new post, which makes it nice and simple – and forgettable. Hopefully, a built-in sitemap will be added to an upcoming version of WordPress. It was just added to WordPress.com so it could be coming to the full version. It’s becoming a standard as it was accepted by the major search engine players a year ago.

    As for non-WordPress blogs, I don’t know what’s available to create a sitemap automatically, but if you have to create one manually and upload it every time to your server…I’d switch to something else or forget about it.

    If you have a well designed website with strong intrasite navigation, and/or a site map, which is the name of a table of contents on your site, you don’t need one. It just helps. It doesn’t give you brownie points or anything. It just speeds up the process a bit and makes sure everything gets indexed.

  19. Posted September 6, 2008 at 5:32 am | Permalink

    Hi,

    Nice article you’ve got. I wonder what the point is of having a sitemap when there’s already an RSS feed available. Please clear up my confusion. Thanks.

  20. Posted September 6, 2008 at 8:25 am | Permalink

    @ mcaronan:

    For someone with SEO in their domain name, I’m confused on why this should be explained. A feed is an apple. A sitemap is an orange. A site map is a tomato. Are you talking about an XML sitemap or a “table of contents on your site” site map?

    A feed is a version of your site’s content that can be read through feed readers. Your feed may be indexed by search engines, but not all search engines index feeds.

    An XML sitemap is a “version” of your site recognized and used by several major search engines to index your site. It is not visible to the reader. It works behind the scenes. Those search engines recognizing XML sitemaps use it to index your site, connecting all the pieces together to ensure there are no page orphans – pages not indexed by the search engines.

    A site map is a table of contents listing of your site’s web pages. It may be categorized by categories (subject matter), an alphabetical listing of post titles, or by archives (chronological) order. It is displayed on a pseudo-static page for blogs and a static web page for static sites. While these help search engines track down all the pages within your site, the site map is mostly used by readers trying to find specific content on your site.

    Does that clarify things? I know it’s confusing, and I wish I were in charge of naming things to help clarify much of the confusion over names.

  21. David
    Posted October 17, 2010 at 8:09 pm | Permalink

    The term “sitemap” is indeed confusing; it appears in too many places. I have Google Custom Search Engine employed for my site and there is an option in my Google CSE account to submit a sitemap to direct Google to index my site for the purpose of the CSE; this is separate from the indexing of my site for normal Google Search. When I direct Google to index my site for CSE, it eats into my 50 free on-demand pages. After the freebies are gone, I can always pay to continue getting on-demand pages. And as far as I can tell, each time I ask for an index update, it eats into my on-demand pages quota. This is another way for Google to monetize their search engine technology.


17 Trackbacks/Pingbacks

  1. [...] Sitemaps: Lorelle explains the new “autodiscovery” feature around sitemaps. It includes a vast collection of links related to the topic. [...]

  2. […] Lorelle on WordPress has advice on using the Google Sitemap Generator for WordPress. […]

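  3. [...] 1 view. News found via Lorelle: search engines can extract the sitemap’s address from the robots file, so you no longer need to go to each search engine specifically to submit the sitemap’s address. [...]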

  4. [...] by Bob Morris on June 5th, 2007 Lorelle has the details. The plug-in installs easily and quickly, and automatically gives Google, Yahoo, and MSN more info [...]

  5. [...] Read the full article [...]

  6. [...] SEO Sitemaps Now Autodiscoverable: Easy and Automatic Roadmaps to Your Blog Content Tags: blog, Google, motori di ricerca, ottimizzazione, ranking, robots.txt, seo, sitemap [...]

  7. [...] robots.txt file should exist in the root directory of your website. For example, if your domain is [...]

  8. [...] Create or use a sitemap with a WordPress Plugin to build a roadmap/table of contents of your blog for search engine web crawlers. [...]

  9. [...] how to submit your sitemap to Google, but you can thank Lorelle VanHousen for her post informing me sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]

  10. [...] how to submit your sitemap to Google, but you can thank Lorelle VanFossen for her post informing me sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]

  11. [...] how to submit your sitemap to Google, but you can thank Lorelle VanFossen for her post informing me sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]

  12. [...] how to submit your sitemap to Google, but you can thank Lorelle VanFossen for her post informing me sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]

  13. [...] how to submit your sitemap to Google, but you can thank Lorelle VanFossen for her post informing me sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]

  14. [...] you how to submit your sitemap to Google, but a little bird named Lorelle VanFossen let me know sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]

  15. [...] you how to submit your sitemap to Google, but a little bird named Lorelle VanFossen let me know sitemaps are now autodiscoverable. She said: “In case you missed the news, Google’s Webmaster Central announced that in [...]

  16. […] SEO Sitemaps Now Autodiscoverable: Easy and Automatic Roadmaps to Your Blog Content […]

  17. […] SEO Sitemaps Now Autodiscoverable: Easy and Automatic Roadmaps to Your Blog Content […]
