Google Sitemaps: A Little 411 For All
Oops... meant to start this as a new thread (kind of a 6K post thing...). Anyhow, a big read...
Here are a few tips and observations regarding Google Sitemaps. If you don't know about it, or haven't heard of it, I'd highly recommend getting up to speed. (Btw, this is not the same as having a sitemap.html page on your site!) After a lot of investigating, testing, frustration and some success, I've come up with a few things to point out. Feel free to dispute, add, etc.:

1) Sitemaps, done properly, is a great tool for getting your site indexed. It doesn't guarantee anything in terms of rankings, but you can get sub-pages indexed more quickly and have far more control over what appears in Google's index.

2) Use the method Google recommends: build the sitemap with the Python script they provide, then ping them every time you update it. (Full details are in the Google Sitemaps FAQ, and there's a ping sketch after this post.) From my experience, the other options are far less likely to help, and may actually hurt you. (I know.)

3) This from a Google engineer I talked to in New Orleans last year: your robots.txt will override the sitemap file you register if there are any conflicting details. For example, if your robots.txt disallows access to a page that appears in your sitemap, Google won't include it. (See the robots.txt example after this post.)

4) The Google Sitemaps control panel is useful for much more than monitoring your sitemap. It verifies your robots.txt file, gives you one simple place to see links to your site, indexed pages, etc., and once you verify your site there is a wealth of detail available.

5) Big thing to note about the robots.txt file (especially if you find you cannot verify your site with the file Google provides): if you are using an ErrorDocument directive in your .htaccess for 404s - redirecting to another page, something most of us do - there's a fair chance Google isn't reading your robots.txt file properly. This will also prevent Google from verifying your site. Use the Sitemaps control panel to see whether Google is actually fetching your robots.txt; just because your logs say Googlebot is reading it doesn't mean it is. If you see a "file not found" in Sitemaps for your robots.txt, it's a good bet that's why. Solution? Remove the ErrorDocument directive - though, of course, for many of us that's not a fair trade-off.

6) There will be some overhead in terms of time, but if you don't update your site daily it won't be a big chore. A programmer or admin can set up a cron job on your server to automatically rebuild the sitemap whenever the site is updated (pages added, deleted, etc.) - see the crontab example after this post. With a little coding you can make this work very well. If you do it manually with the Python script, it should only be a minor inconvenience.

If anyone wants to discuss, get some advice (I'm no guru, but if I can't help directly I can at least point you somewhere that can) or get tips on starting out, feel free to hit me up. ICQ 312104564. I'll also be in Phoenix - you'll see me at the poker tables getting trounced, probably wearing a red "Canada" hat. Hey, I'm a homer. In any event, if anyone wants to discuss, I'd love to.

Lastly, before doing *anything*, go through the Sitemaps FAQ thoroughly, or get a techie to. You don't want to screw things up - any reasonable mistake can be corrected, but it will be a pain until it is.

Good luck
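Since point 2 comes up a lot, here's a minimal sketch of the ping step in Python (the same language as Google's script). The sitemap URL is made up - swap in your own - and the ping address is the one documented in the Sitemaps FAQ, so double-check it there before relying on this:

Code:
import urllib.parse
import urllib.request

# Made-up example URL - point this at your own sitemap.
SITEMAP_URL = "http://www.example.com/sitemap.xml"

# Google's sitemap ping address (verify against the Sitemaps FAQ).
PING_URL = ("http://www.google.com/webmasters/sitemaps/ping?sitemap="
            + urllib.parse.quote(SITEMAP_URL, safe=""))

# A 200 response means Google received the ping.
with urllib.request.urlopen(PING_URL) as resp:
    print("Ping status:", resp.status)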
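And to make point 3 concrete, a hypothetical conflict (the path is made up):

Code:
# robots.txt
User-agent: *
Disallow: /private/

With that in place, any URL under /private/ listed in your sitemap - say http://www.example.com/private/page.html - gets skipped. The robots.txt disallow wins over the sitemap entry.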
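For point 6, a hypothetical crontab entry - the script path and schedule are placeholders, adjust for your own setup:

Code:
# Rebuild the sitemap nightly at 2am; the build script path is made up.
# Have the script regenerate sitemap.xml and then ping Google (see above).
0 2 * * * /usr/local/bin/rebuild_sitemap.sh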
thx for the article dude
interesting read, thanks for the post!
Oops, one thing I forgot: The link to the FAQ:
http://www.google.com/webmasters/sit...cs/en/faq.html :)
Nice post. :thumbsup
One more thing to add:
If you use the Google Sitemaps feature and verify your site (once you've set up your sitemap, there will be tabs for Sitemap, Stats, Errors and Verify), you will have access to three additional, very helpful stats:

1) Query Stats. Shows the top queries used to find your site and top query clicks - the searches that resulted in the most click-throughs to your site. The first also shows your average top position for each keyword/phrase.

2) Page Analysis. Shows the most common words found in your content as well as in external link text pointing to your site. This will give you a very good idea of why you are ranking well (or poorly) for particular terms. Although there are sites with these types of tools out there, this is from Google itself, so it is obviously the most relevant!

3) Crawl Stats. Shows PageRank distribution within your site. A low distribution suggests your top page(s) - usually your home page - has most of the PR and you have not effectively spread it out to your sub-pages. The first column also shows errors, timeouts and other potential issues.

Seriously, people, if you care even a little about SEO then I highly suggest investing a bit of time in getting a sitemap up and running. There are so many advantages, and it will almost certainly give you far greater insight into why your site is ranking the way it is.

See (some of) you in Phoenix! :)
Thanks, good info
bump for bumping's sake
Kevsh - excellent compilation of the info. And although it's kinda hidden in the article, that ErrorDocument issue is incredibly important to a lot of people, even without using the Sitemaps app. But like you say, there is the trade-off (damn fuskers) :)
I haven't tested whether Google keeps tracking these stats once you put the directive back, but I don't see why it wouldn't - the Verify function only exists to check that you own the site. Of course, scratch all this if you really need Google to read your robots.txt on an ongoing basis; in that case you'd obviously have to keep the ErrorDocument directive out. Damn, I should be sleeping now... :Oh crap
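For anyone who hasn't run into it, the directive being discussed looks something like this (the URL is made up - yours will differ). This is just a sketch of the comment-out-verify-restore workaround, not an official fix:

Code:
# .htaccess - a custom 404 redirect like this can stop Google from reading
# robots.txt properly and from verifying your site:
ErrorDocument 404 http://www.example.com/notfound.html

To verify, comment that line out (put a # in front of it), verify in the Sitemaps control panel, then uncomment it.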
I was doing some sitemap work, and this helped out a lot!
dude, that was an interesting post. thanks for sharing.
TY for the tutorial. Definitely bookmarked this thread... I never fucked much with sitemaps, but it definitely is a good idea from the SE standpoint.
You can also use a sitemap generator like the one found at http://www.xml-sitemaps.com
I always worry that they can see all your sites. Even if they are on different C-classes, now they know it's the same person?
I don't think they care one bit if you're the same person - plus, if you're "honest" about your whois, they already know :)
But if you're worried about one account for all your sites, you can always register each under a different account - they only need a unique email addy for each, as they don't ask for any personal details.
This one might be of interest for those who don't want to tackle a sitemap just yet:
13. Do I have to add a Sitemap in order to see statistics for my site?
No, you can simply add a site to your account and then verify ownership in order to see statistics for it.

If you don't have an existing Google account, you can create one here: https://www.google.com/accounts/NewAccount
Very good info thx!!!
nice stuff!
10x
thanks, might take a look at sitemaps properly after your tutorial
For the robots.txt and 404 redirect problem: couldn't you just have a plain robots.txt that allows all and still use the 404 redirect?
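For reference - this is just standard robots.txt syntax, not something from the thread - an allow-all file is as simple as it gets; an empty Disallow blocks nothing:

Code:
User-agent: *
Disallow: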