
Thread: XML Sitemap

  1. #1
    php.allstar (Wannabe Geek)
    Join Date: Apr 2009
    Location: Monamolin, Gorey, Co. Wexford
    Posts: 209

    XML Sitemap

    Hi,

    We have been advised to create/add an XML Sitemap for our site.

    Can anyone recommend a good program to automatically run, generate the sitemap and upload it to our server?

    I was thinking about GSiteCrawler but I'm not sure if that can be scheduled.

    Any thoughts?

    Thanks.

  2. #2
    Kieran (Wannabe Geek)
    Join Date: Oct 2008
    Location: Cork
    Posts: 164

    Would be interested to see such a tool but have never been comfortable with offline services doing the work.

    For a WordPress blog you can use a plugin that does it automatically, and I guess there are similar tools for other CMSs.

    How often will you be adding pages to the site? If only once a week or so, it is often easier to just do it manually and upload the sitemap. So when you add a page you just update the XML and away you go. Easy peasy... (as he sneaks off to update a couple of his that this post reminded him to do :-)
    All the best

    Kieran
    Cork Website Design | Design Blog

  3. #3
    php.allstar (Wannabe Geek)
    Join Date: Apr 2009
    Location: Monamolin, Gorey, Co. Wexford
    Posts: 209

    Thanks Kieran,

    I'll be adding brand new pages and mini-apps about once or twice a week. But seeing that this is a dynamic site with 4000+ pages, we have pages that are modified/removed/added on a daily basis, so I think the sitemap has to be created and uploaded daily? Am I wrong on this?

    It turns out GSiteCrawler can be automatically scheduled to create and upload XML sitemaps.

    I would have liked to run it from our bare-bones Linux development box, but it seems I'll have to run it on my Windows workstation as it's a Windows exe.

    I don't want it to consume CPU and RAM on my workstation while I'm working during the day, so I'll have to schedule it for every night, which means leaving my workstation powered on, something that I never do.

    Maybe if I put my workstation on Stand By, the Windows scheduler will wake it, run GSiteCrawler, save a log file on my desktop for the following morning, and then power my workstation off.

  4. #4
    hydrosylator (Frontpage User)
    Join Date: Jul 2009
    Posts: 4

    I can't actually post URLs just yet, but I highly recommend looking at Google's list of sitemap generators.
    Enter "Sitemap Generators A collection of links to tools and code snippets that generate Sitemap files" into Google and click on the first link, which should be on the code[dot]google[dot]com site.

    There are free and commercial ones that can be used online and offline. I'd be inclined to use a server-side one.

  5. #5
    Kieran (Wannabe Geek)
    Join Date: Oct 2008
    Location: Cork
    Posts: 164

    If only the content of the pages is being updated, then you shouldn't have to update the sitemap daily / nightly, as the sitemap just says "I have a page on my site and this is the name of it".
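To make that point concrete, here is a minimal sketch of what a sitemap file contains: one `<url><loc>` entry per page and nothing about the page body. It uses Python's standard library purely for illustration; the example.com URLs are made up and are not from the site discussed here. The sitemap protocol also allows an optional `<lastmod>` element per URL, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as bytes) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical URLs, for illustration only.
xml_bytes = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/courses/sample-course/overview",
])
print(xml_bytes.decode("utf-8"))
```

Editing the text of a page leaves this file unchanged; only adding or removing a URL changes it.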

    In my schizophrenic posting response method: of course it would be ideal to have it done automatically if you are the forgetful type, but again only if the new page has something that you want to have immediate

    hope this helps

    Kieran
    All the best

    Kieran
    Cork Website Design | Design Blog

  6. #6
    link8r (Hardcore Geek)
    Join Date: Nov 2008
    Location: Limerick
    Posts: 729

    If you have so many pages that creating the sitemap chews up lots of processor time, then I suggest using multiple sitemaps: split off different sections of the site, or if you have sub-sites within your site for different languages/regions, put them into different sitemaps.

    Bear in mind that sitemaps are meant to help both Google and your hosting bandwidth, so having a really huge sitemap that takes a while to download may defeat the purpose.
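When a site is split into several sitemaps, the sitemap protocol lets you list them in a single sitemap index file. The following is a rough sketch of building one in Python; the two per-section file URLs are hypothetical names chosen for the example, not anything from this thread.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a <sitemapindex> document (bytes) pointing at each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)


# Hypothetical per-section sitemap files.
index_xml = build_sitemap_index([
    "http://www.example.com/sitemap-section-a.xml.gz",
    "http://www.example.com/sitemap-section-b.xml.gz",
])
print(index_xml.decode("utf-8"))
```

Only the index needs to be submitted; each section file can then be regenerated on its own.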

    While you are at it, include a custom 404 error page with the Google widget code for letting it remove missing URLs too.

    You could also create a custom sitemap crawler on the server and use a cron job to schedule it?

  7. #7
    php.allstar (Wannabe Geek)
    Join Date: Apr 2009
    Location: Monamolin, Gorey, Co. Wexford
    Posts: 209

    Thanks guys

    Quote: Originally posted by Kieran
        If only the content of the pages is being updated, then you shouldn't have to update the sitemap daily / nightly, as the sitemap just says "I have a page on my site and this is the name of it".

    This is a UK golfing website. We have about 6 different information pages for around 320 golf courses in the UK.

    Now this may be flawed in terms of SEO, but if a course becomes inactive on our web services it becomes inactive on our site (the 6 unique pages for that course are no longer available) and the user requesting the page is given a custom 404.

    When that course is set to active again, the 6 unique pages for that course are available again, no more 404 for the user.

    This also happens when courses are removed from our site or have just joined.

    This is so sporadic: 1 course a month might leave, 2 courses a week might be set to inactive, 3 courses a week may join. Because this activity is all over the place, I feel as if I have to run GSiteCrawler every night. I don't want to have to generate and upload the file on an as-it-happens basis; I think this would be too much work (I'm a developer, not an SEO'er!)

    Quote: Originally posted by link8r
        If you have so many pages that creating the sitemap chews up lots of processor time, then I suggest using multiple sitemaps: split off different sections of the site, or if you have sub-sites within your site for different languages/regions, put them into different sitemaps.

        Bear in mind that sitemaps are meant to help both Google and your hosting bandwidth, so having a really huge sitemap that takes a while to download may defeat the purpose.

    I could let it run during the day, like I have done today on my first run, which took about 30 mins. You know yourself, I'm greedy with my CPU and RAM, I just don't like other applications slowing my workstation down. (Not that it was too noticeable today!) Running at night was just an idea, but in hindsight, that would be bad for the environment!

    I wouldn't call the sitemap huge. It's just the time that it takes to generate the XML Sitemap and the fact that it consumes some RAM and CPU on me that are the issues! (Beggars can't be choosers!) The raw xml version of the sitemap (4000+ pages) is 858KB, the GZipped version is 41KB.
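For anyone wanting to check the raw versus gzipped sizes of their own file, a quick sketch with Python's standard `gzip` module follows. The 4000 URLs here are synthetic placeholders, so the printed sizes will not match the 858KB / 41KB figures above; the point is only how to measure.

```python
import gzip


def gzip_report(xml_bytes):
    """Compress sitemap bytes and return (raw size, gzipped size, gzipped bytes)."""
    compressed = gzip.compress(xml_bytes)
    return len(xml_bytes), len(compressed), compressed


# Synthetic sitemap with 4000 placeholder URLs (not real site data).
entries = "".join(
    f"<url><loc>http://www.example.com/page/{i}</loc></url>" for i in range(4000)
)
xml_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    + entries
    + "</urlset>"
).encode("utf-8")

raw_size, gz_size, gz_bytes = gzip_report(xml_bytes)
print(f"raw: {raw_size} bytes, gzipped: {gz_size} bytes")
```

Sitemap XML is very repetitive, which is why it compresses so well.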

    Does Google use the gzipped version?

    Is there an optimum file size for an XML sitemap that won't banjax Google and our bandwidth?

  8. #8
    link8r (Hardcore Geek)
    Join Date: Nov 2008
    Location: Limerick
    Posts: 729

    The idea is that lots of small files download easier than a big file - simple timeout concept.

    Making them inactive: is that because they are no longer a client, or because the course isn't accessible? Why not just keep the URL/page and forward to the home page, or display a message that the course is no longer active?

    Why not group the courses by region for the purposes of the sitemap?

    That way your trigger is when a course becomes active/inactive: you rebuild the sitemap for its region.
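One way to read the region idea, sketched in Python under assumptions of my own: each course is a `(region, url, active)` tuple, active URLs are grouped per region, and only regions whose URL list changed need their sitemap rebuilt. The region names and URLs are invented for the example.

```python
from collections import defaultdict


def group_by_region(courses):
    """Map region -> list of URLs, keeping only active courses."""
    grouped = defaultdict(list)
    for region, url, active in courses:
        if active:
            grouped[region].append(url)
    return dict(grouped)


def changed_regions(before, after):
    """Return the regions whose active URL list differs between two snapshots."""
    regions = set(before) | set(after)
    return sorted(r for r in regions if before.get(r) != after.get(r))


# Hypothetical data: one course in "scotland" is switched to inactive.
yesterday = group_by_region([
    ("scotland", "http://www.example.com/courses/a", True),
    ("scotland", "http://www.example.com/courses/b", True),
    ("wales", "http://www.example.com/courses/c", True),
])
today = group_by_region([
    ("scotland", "http://www.example.com/courses/a", True),
    ("scotland", "http://www.example.com/courses/b", False),
    ("wales", "http://www.example.com/courses/c", True),
])
to_rebuild = changed_regions(yesterday, today)
print(to_rebuild)
```

Here only the "scotland" sitemap would be regenerated; the "wales" file is left alone.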

    BUT REMEMBER: just because you create a sitemap doesn't mean Google will index your site. Your site is indexed on a crawl cycle dictated by Google, which could be weekly or monthly, so you could be generating 4 sitemaps for every 1 that Google actually reads. That is why you need that 404 widget so much...

    Official Google Webmaster Central Blog: Make your 404 pages more useful

  9. #9
    jmcc (Wannabe Geek)
    Join Date: Feb 2006
    Posts: 462

    Quote: Originally posted by php.allstar
        Does Google use the gzipped version?

    Yes. So does Yahoo (I think). You could write your own script for a sitemap generator instead of using an off-the-shelf one. If you think that a 4000 page sitemap is bad, I've just finished working on a preliminary one for 9.39 million pages. Google has downloaded it, but spidering it will take a while. Yahoo is downloading the gzipped sitemap files at the moment. Microsoft's Bing is missing in action as usual.

    It might be possible to set up a database table with the page name, page URL and state (active/deleted) and use this to generate your sitemap via a PHP script or similar. I think that WordPress might have the last modified date of a page in its database schema. Most of the server load is probably due to all the database calls made by WordPress for each individual page. This is a very inefficient way of generating a sitemap, and some of those online sitemap generators are better suited to simple, static websites.
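A rough sketch of the database-table approach, written in Python with SQLite rather than PHP just to keep it self-contained. The `pages` table, its column names and the sample rows are assumptions made for this example, not the real schema of any site in this thread.

```python
import sqlite3
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def sitemap_from_db(conn):
    """Build sitemap XML (bytes) from rows whose state is 'active'."""
    rows = conn.execute(
        "SELECT url FROM pages WHERE state = 'active' ORDER BY url"
    )
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for (url,) in rows:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical table and rows, held in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (name TEXT, url TEXT PRIMARY KEY, state TEXT)")
conn.executemany(
    "INSERT INTO pages VALUES (?, ?, ?)",
    [
        ("Course A overview", "http://www.example.com/courses/a", "active"),
        ("Course B overview", "http://www.example.com/courses/b", "deleted"),
    ],
)
sitemap_xml = sitemap_from_db(conn)
print(sitemap_xml.decode("utf-8"))
```

Because this is a single query rather than a crawl of every page, a script like this could run on the server from a scheduled job without touching a workstation.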

    Quote:
        Is there an optimum file size for an XML sitemap that won't banjax Google and our bandwidth?

    41k is smaller than a lot of webpages these days.

    Regards...jmcc
    http://www.hosterstats.com
    Hoster Stats and Domain Hosting History.
    Hoster Stats for over 2.9M hosters. Domain history for over 236M active/deleted domains.

  10. #10
    blacknight (Web Slave)
    Join Date: Jan 2006
    Location: Ireland
    Posts: 7,890

    Quote: Originally posted by php.allstar
        Is there an optimum file size for an XML sitemap that won't banjax Google and our bandwidth?

    Bandwidth won't be an issue: sitemaps are tiny.
