Irish SEO, Marketing & Webmaster Discussion

 

XML Sitemap

#1 php.allstar, 08-07-2009, 10:33 AM

Hi,

We have been advised to create/add an XML Sitemap for our site.

Can anyone recommend a good program to automatically run, generate the sitemap and upload it to our server?

I was thinking about GSiteCrawler but I'm not sure if that can be scheduled.

Any thoughts?

Thanks.
#2 Kieran, 08-07-2009, 12:04 PM

I would be interested to see such a tool, but I have never been comfortable with offline services doing the work.

For a WordPress blog you can use a plugin that does it automatically, and I guess there are similar tools for other CMSs.

How often will you be adding pages to the site? If only once a week or so, it is often easier to just do it manually and upload the sitemap. So when you add a page you just update the XML and away you go. Easy peasy.. (as he sneaks off to update a couple of his that this post reminded him to do :-)
__________________
All the best

Kieran
Cork Website Design | Design Blog
#3 php.allstar, 08-07-2009, 12:50 PM

Thanks Kieran,

I'll be adding brand new pages and mini-apps about once or twice a week. But seeing that this is a dynamic site with 4000+ pages, we have pages that are modified/removed/added on a daily basis, so I think the sitemap has to be created and uploaded daily? Am I wrong on this?

It turns out GSiteCrawler can be automatically scheduled to create and upload XML sitemaps.

I would have liked to run it from our bare-bones Linux development box, but it seems I'll have to run it on my Windows workstation as it's a Windows exe.

I don't want it to consume my CPU and RAM on my workstation while I'm working during the day so I'll have to schedule it for every night, which means leaving my workstation powered on, something that I never do.

Maybe if I put my workstation on Stand By mode, Windows scheduler will wake it, run GSiteCrawler, save a log file on my desktop for the following morning and then power my workstation off.
#4 hydrosylator, 08-07-2009, 01:26 PM

I can't actually post URLs just yet, but I highly recommend looking at Google's list of sitemap generators.
Enter "Sitemap Generators A collection of links to tools and code snippets that generate Sitemap files" into Google, and click on the first link, which should be on the code[dot]google[dot]com site.

There are free and commercial ones that can be used online and offline. I'd be inclined to use a server-side one.
#5 Kieran, 08-07-2009, 01:31 PM

If only the content of the pages is being updated then you shouldn't have to update daily / nightly, as the sitemap just says "I have a page on my site and this is the name of it".
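To make the "this is the name of it" point concrete, here is a minimal sketch of a sitemap that lists page URLs and nothing else. It assumes Python and the sitemaps.org 0.9 namespace; the function name `build_sitemap` and the example.com URLs are placeholders, not anything from the poster's site.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a str) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # <loc> is the only required child; ElementTree escapes "&" etc.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Placeholder URLs for illustration only.
    print(build_sitemap([
        "http://www.example.com/",
        "http://www.example.com/courses/some-course/",
    ]))
```

Editing the text on a page does not change this file; only adding or removing a URL does.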

In my schizophrenic posting-response style: of course it would be ideal to have it done automatically if you are the forgetful type, but again only if the new page has something that you want to have immediate

Hope this helps

Kieran
#6 link8r, 08-07-2009, 03:02 PM

If you have so many pages that creating the sitemaps will chew up lots of processor time, then I suggest using multiple sitemaps: close off different sections of the site, or if you have sub-sites within your site for different languages/regions, put them into different sitemaps.
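With several sitemaps, the sitemaps.org protocol lets one sitemap index file point at each of them, so only the index needs to be submitted. A rough sketch in Python, where `build_sitemap_index` and the file URLs are made up for illustration:

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemap index document pointing at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical per-section sitemap files.
    print(build_sitemap_index([
        "http://www.example.com/sitemap-courses.xml",
        "http://www.example.com/sitemap-info.xml",
    ]))
```

Each child file is an ordinary sitemap, and the protocol caps a single sitemap at 50,000 URLs, so a 4000-page site fits in one file and the split is a matter of convenience rather than necessity.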

Bear in mind that sitemaps are meant to help both Google's bandwidth and your hosting bandwidth, so having a really huge sitemap that takes a while to download may defeat the purpose.

While you are at it, include a custom 404 error page with the Google widget code so that it can remove missing URLs too.

You could also create a custom map crawler on the server and use a cron job to schedule it?
#7 php.allstar, 08-07-2009, 04:10 PM

Thanks guys

Quote:
Originally Posted by Kieran
If only the content of the pages is being updated then you shouldn't have to update daily / nightly, as the sitemap just says "I have a page on my site and this is the name of it".

This is a UK golfing website. We have about 6 different information pages for around 320 golf courses in the UK.

Now this may be flawed in terms of SEO, but if a course becomes inactive on our web services it becomes inactive on our site (the 6 unique pages for that course are no longer available) and the user requesting the page is given a custom 404.

When that course is set to active again, the 6 unique pages for that course are available again, no more 404 for the user.

This also happens when courses are removed from our site or have just joined.

This is so sporadic: 1 course a month might leave, 2 courses a week might be set to inactive, 3 courses a week may join. Because this activity is all over the place I feel as if I have to run GSiteCrawler every night. I don't want to have to generate and upload the file on an as-it-happens basis; I think this would be too much work (I'm a developer, not an SEO'er!)

Quote:
Originally Posted by link8r
If you have so many pages that creating the sitemaps will chew up lots of processor time, then I suggest using multiple sitemaps: close off different sections of the site, or if you have sub-sites within your site for different languages/regions, put them into different sitemaps.

Bear in mind that sitemaps are meant to help both Google's bandwidth and your hosting bandwidth, so having a really huge sitemap that takes a while to download may defeat the purpose.

I could let it run during the day, like I have done today on my first run, which took about 30 mins. You know yourself, I'm greedy with my CPU and RAM, I just don't like other applications slowing my workstation down. (Not that it was too noticeable today!) Running at night was just an idea, but in hindsight, that would be bad for the environment!

I wouldn't call the sitemap huge. It's just the time that it takes to generate the XML Sitemap and the fact that it consumes some RAM and CPU on me that are the issues! (Beggars can't be choosers!) The raw XML version of the sitemap (4000+ pages) is 858KB, the GZipped version is 41KB.
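For anyone who wants to reproduce that kind of raw-versus-gzipped comparison, a small sketch using Python's standard `gzip` module follows. The `sample_sitemap` helper just fabricates placeholder example.com URLs, so the sizes it prints will not match the figures above.

```python
import gzip


def gzip_sitemap(xml_text):
    """Compress sitemap XML and return the gzip bytes."""
    return gzip.compress(xml_text.encode("utf-8"))


def sample_sitemap(page_count):
    """Build a throwaway sitemap with placeholder URLs, for size tests only."""
    entries = "".join(
        "<url><loc>http://www.example.com/page-%d/</loc></url>" % i
        for i in range(page_count)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        + entries + "</urlset>"
    )


if __name__ == "__main__":
    raw = sample_sitemap(4000)
    packed = gzip_sitemap(raw)
    print("raw bytes:", len(raw.encode("utf-8")))
    print("gzipped bytes:", len(packed))
    # Write the compressed file that would be uploaded (hypothetical name).
    with open("sitemap.xml.gz", "wb") as handle:
        handle.write(packed)
```

Sitemap XML is very repetitive, which is why it compresses so well.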

Does Google use the GZipped version?

Is there an optimum file size for an XML sitemap that won't banjax Google and our bandwidth?
#8 link8r, 08-07-2009, 05:53 PM

The idea is that lots of small files download easier than a big file - simple timeout concept.

Making them inactive: is that because they are no longer a client or because the course isn't accessible? Why not just keep the URL/page and forward to the home page, or display a message that the course is no longer active?

Why not group the courses by region for the purposes of a sitemap?

That way your trigger is when a course becomes active/inactive: you rebuild the sitemap.
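One way to picture the region idea is sketched below in Python. Everything in it is an assumption for illustration: the course records are plain dicts with hypothetical `region`, `active` and `urls` keys, and the file naming is invented. It works out which regional sitemaps need rebuilding after a change in course status.

```python
from collections import defaultdict


def group_by_region(courses):
    """Map region -> list of page URLs, counting active courses only."""
    grouped = defaultdict(list)
    for course in courses:
        if course["active"]:
            grouped[course["region"]].extend(course["urls"])
    return dict(grouped)


def sitemap_filename(region):
    """Hypothetical naming scheme for one sitemap file per region."""
    return "sitemap-%s.xml" % region.lower().replace(" ", "-")


def regions_to_rebuild(old_courses, new_courses):
    """Return the regions whose list of active URLs has changed."""
    before = group_by_region(old_courses)
    after = group_by_region(new_courses)
    return sorted(
        region for region in set(before) | set(after)
        if before.get(region) != after.get(region)
    )


if __name__ == "__main__":
    old = [
        {"region": "Scotland", "active": True,
         "urls": ["http://www.example.com/course-a/"]},
        {"region": "Wales", "active": True,
         "urls": ["http://www.example.com/course-b/"]},
    ]
    # The Wales course is switched to inactive.
    new = [dict(old[0]), dict(old[1], active=False)]
    for region in regions_to_rebuild(old, new):
        print("rebuild", sitemap_filename(region))
```

Only the file for the affected region is regenerated, so a single course flipping state does not force a rebuild of all 4000+ entries.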

BUT REMEMBER: Just because you create a sitemap doesn't mean Google will index your site. Your site index is set to a Google-dictated crawl cycle, which could be weekly or monthly, so you could be generating 4 sitemaps for every 1 that Google actually reads, hence why you need that 404 widget so much...

Official Google Webmaster Central Blog: Make your 404 pages more useful
#9 jmcc, 08-07-2009, 08:05 PM

Quote:
Originally Posted by php.allstar
Does Google use the GZipped version?

Yes. So does Yahoo (I think). You could write your own script for a sitemap generator instead of using an off-the-shelf one. If you think that a 4000 page sitemap is bad, I've just finished working on a preliminary one for 9.39 million pages. Google has downloaded it, but spidering it will take a while. Yahoo is downloading the gzipped sitemap files at the moment. Microsoft's Bing is missing in action as usual.

It might be possible to set up a database table with the page name, page URL and state (active/deleted) and use this to generate your sitemap via a PHP script or similar. I think that WordPress might have the last modified date of a page in its database schema. Most of the server load is probably due to all the database calls made by WordPress for each individual page. This is a very inefficient way of generating a sitemap, and some of those online sitemap generators are better suited to simple, static websites.
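A rough sketch of that database-driven approach, written in Python with SQLite rather than PHP purely for illustration. The `pages` table, its column names and the demo rows are all hypothetical, not the actual schema of any site or of WordPress.

```python
import sqlite3
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_from_db(conn):
    """Build sitemap XML from active rows in the hypothetical `pages` table."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    rows = conn.execute(
        "SELECT url, last_modified FROM pages "
        "WHERE state = 'active' ORDER BY url"
    )
    for url, last_modified in rows:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        if last_modified:
            # <lastmod> is optional; YYYY-MM-DD is an accepted format.
            ET.SubElement(entry, "lastmod").text = last_modified
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def demo_connection():
    """Create an in-memory database with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (name TEXT, url TEXT, state TEXT, last_modified TEXT)"
    )
    conn.executemany("INSERT INTO pages VALUES (?, ?, ?, ?)", [
        ("Alpha", "http://www.example.com/courses/alpha/", "active", "2009-07-08"),
        ("Beta", "http://www.example.com/courses/beta/", "deleted", "2009-07-01"),
        ("Gamma", "http://www.example.com/courses/gamma/", "active", None),
    ])
    return conn


if __name__ == "__main__":
    print(sitemap_from_db(demo_connection()))
```

One query replaces a crawl of every page, and a script like this can run on the server from a nightly cron job instead of on a workstation.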

Quote:
Is there an optimum file size for an XML sitemap that won't banjax Google and our bandwidth?

41KB is smaller than a lot of webpages these days.

Regards...jmcc
__________________
http://www.hosterstats.com
Hoster Stats and Domain Hosting History.
Hoster Stats for over 2.9M hosters. Domain history for over 236M active/deleted domains.
#10 blacknight, 09-07-2009, 12:23 AM

Quote:
Originally Posted by php.allstar
Is there an optimum file size for an XML sitemap that won't banjax Google and our bandwidth?

Bandwidth won't be an issue - sitemaps are tiny.

Tags
sitemap, xml



All times are GMT +1.

