This is a discussion on XML Sitemap within the Search Engine Optimisation forums, part of the Online Marketing category.
| Hi, We have been advised to create/add an XML Sitemap for our site. Can anyone recommend a good program to automatically run, generate the sitemap and upload it to our server? I was thinking about GSiteCrawler but I'm not sure if that can be scheduled. Any thoughts? Thanks.
__________________ Pedigree dogs for sale in Ireland | Dogzone.ie |
| |||||
| Would be interested to see such a tool, but I have never been comfortable with offline services doing the work. For a WordPress blog you can use a plugin that does it automatically, and I guess there are similar tools for other CMSs. How often will you be adding pages to the site? If it's only once a week or so, it is often easier to just do it manually and upload the sitemap. So when you add a page you just update the XML and away you go. Easy peasy... (as he sneaks off to update a couple of his that this post reminded him to do :-) |
| |||||
| Thanks Kieran, I'll be adding brand new pages and mini-apps about once or twice a week. But seeing that this is a dynamic site with 4000+ pages, we have pages that are modified/removed/added on a daily basis, so I think the sitemap has to be created and uploaded daily? Am I wrong on this? It turns out GSiteCrawler can be automatically scheduled to create and upload XML sitemaps. I would have liked to run it from our bare-bones Linux development box, but it seems I'll have to run it on my Windows workstation as it's a Windows exe. I don't want it to consume CPU and RAM on my workstation while I'm working during the day, so I'll have to schedule it for every night, which means leaving my workstation powered on, something that I never do. Maybe if I put my workstation in Stand By mode, the Windows scheduler will wake it, run GSiteCrawler, save a log file on my desktop for the following morning and then power my workstation off. |
| ||||
| I can't actually post URLs just yet, but I highly recommend looking at Google's list of sitemap generators. Enter "Sitemap Generators A collection of links to tools and code snippets that generate Sitemap files" into Google and click on the first link, which should be on the code[dot]google[dot]com site. There are free and commercial ones that can be used online and offline. I'd be inclined to use a server-side one. |
| |||||
| If it is only the content of the pages that is being updated, then you shouldn't have to update daily/nightly, as the sitemap just says "I have a page on my site and this is the name of it". In my schizophrenic posting response method: of course it would be ideal to have it done automatically if you are the forgetful type, but again only if the new page has something that you want picked up immediately. Hope this helps, Kieran |
| |||||
| Thanks guys Quote:
Now this may be flawed in terms of SEO, but if a course becomes inactive on our web services it becomes inactive on our site (6 unique pages for that course are no longer available) and the user requesting the page is given a custom 404. When that course is set to active again, the 6 unique pages for that course are available again, and there is no more 404 for the user. This also happens when courses are removed from our site or have just joined. This is so sporadic: 1 course a month might leave, 2 courses a week might be set to inactive, 3 courses a week may join. Because this activity is all over the place, I feel as if I have to run GSiteCrawler every night. I don't want to have to generate and upload the file on an as-it-happens basis; I think this would be too much work (I'm a developer, not an SEO'er!) Quote:
I wouldn't call the sitemap huge. It's just the time that it takes to generate the XML sitemap and the fact that it consumes some RAM and CPU on me that are the issues! (Beggars can't be choosers!) The raw XML version of the sitemap (4000+ pages) is 858KB, the gzipped version is 41KB. Does Google use the gzipped version? Is there an optimum file size for an XML sitemap that won't banjax Google and our bandwidth? |
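A rough way to answer the file-size question for yourself is to build the sitemap in memory, gzip it, and compare both sizes against the limits in the sitemaps.org protocol (50,000 URLs and 50MB uncompressed per file). This is only a sketch: the `example.com` URLs and the function names `build_sitemap` and `check_sitemap` are made up for illustration, and your real figures will depend on URL length and any optional tags you include.

```python
import gzip
from xml.sax.saxutils import escape

# Per-file limits from the sitemaps.org protocol.
MAX_URLS = 50_000
MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024


def build_sitemap(urls):
    """Return a minimal sitemap document as UTF-8 bytes."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url in urls:
        # escape() turns &, < and > into XML entities.
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines).encode("utf-8")


def check_sitemap(urls):
    """Report raw and gzipped sizes and whether one file is enough."""
    raw = build_sitemap(urls)
    packed = gzip.compress(raw)
    return {
        "urls": len(urls),
        "raw_bytes": len(raw),
        "gzip_bytes": len(packed),
        "within_limits": len(urls) <= MAX_URLS
        and len(raw) <= MAX_UNCOMPRESSED_BYTES,
    }


if __name__ == "__main__":
    # Hypothetical URLs standing in for a 4000-page site.
    demo = [f"https://www.example.com/course/{i}/overview" for i in range(4000)]
    print(check_sitemap(demo))
```

At 4000 pages you are far below either limit, so a single gzipped file should be fine.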
| |||||
| The idea is that lots of small files download more easily than one big file - simple timeout concept. Making them inactive - is that because they are no longer a client or because the course isn't accessible? Why not just keep the URL/page and forward to the home page, or display a message that the course is no longer active? Why not group the courses by region for purposes of a sitemap? That way your trigger is when a course becomes active/inactive: you rebuild the sitemap then. BUT REMEMBER: Just because you create a sitemap doesn't mean Google will index your site - your site index is set to a Google-dictated crawl cycle, which could be weekly or monthly... so you could be generating 4 sitemaps for every 1 that Google actually reads, which is why you need that 404 widget so much... Official Google Webmaster Central Blog: Make your 404 pages more useful |
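The grouping-by-region idea maps onto the sitemap index format from the sitemaps.org protocol: one small sitemap per region plus an index file that lists them. Below is a hedged sketch, not a finished tool. The `(region, url, active)` tuples, the `sitemap-<region>.xml` file names and the `example.com` base URL are all assumptions for illustration, and region names are assumed to already be safe to use in a file name.

```python
from collections import defaultdict
from xml.sax.saxutils import escape

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
HEADER = '<?xml version="1.0" encoding="UTF-8"?>'


def group_by_region(courses):
    """Map region -> list of URLs, skipping inactive courses."""
    groups = defaultdict(list)
    for region, url, active in courses:
        if active:
            groups[region].append(url)
    return dict(groups)


def render_urlset(urls):
    """Render one regional sitemap."""
    lines = [HEADER, f'<urlset xmlns="{NS}">']
    for url in urls:
        lines.append(f"  <url><loc>{escape(url)}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)


def render_index(base_url, regions):
    """Render the index file pointing at each regional sitemap."""
    lines = [HEADER, f'<sitemapindex xmlns="{NS}">']
    for region in sorted(regions):
        loc = f"{base_url}/sitemap-{region}.xml"
        lines.append(f"  <sitemap><loc>{escape(loc)}</loc></sitemap>")
    lines.append("</sitemapindex>")
    return "\n".join(lines)


def build_files(base_url, courses):
    """Return {file name: XML text} for every regional sitemap plus the index."""
    groups = group_by_region(courses)
    files = {
        f"sitemap-{region}.xml": render_urlset(urls)
        for region, urls in groups.items()
    }
    files["sitemap-index.xml"] = render_index(base_url, groups)
    return files


if __name__ == "__main__":
    demo = [
        ("dublin", "https://www.example.com/course/1", True),
        ("cork", "https://www.example.com/course/2", True),
        ("cork", "https://www.example.com/course/3", False),  # inactive, left out
    ]
    for name, body in build_files("https://www.example.com", demo).items():
        print(name, len(body), "bytes")
```

With this layout, a course changing state only requires regenerating its own region's file; the index changes only when a region appears or disappears.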
| |||||
| Yes. So does Yahoo (I think). You could write your own script for a sitemap generator instead of using an off-the-shelf one. If you think that a 4000 page sitemap is bad, I've just finished working on a preliminary one for 9.39 million pages. Google has downloaded it, but spidering it will take a while. Yahoo is currently downloading the gzipped sitemap files. Microsoft's Bing is missing in action as usual. It might be possible to set up a database table with the page name, page URL and state (active/deleted) and use this to generate your sitemap via a PHP script or similar. I think that WordPress might have the last modified date of a page in its database schema. Most of the server load is probably due to all the database calls made by WordPress for each individual page. This is a very inefficient way of generating a sitemap, and some of those online sitemap generators are better suited to simple, static websites. Quote:
Regards...jmcc
__________________ http://www.hosterstats.com Hoster Stats and Domain Hosting History. Hoster Stats for over 2.9M hosters. Domain history for over 236M active/deleted domains. |
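The database-table suggestion above could look something like the following. It is a sketch in Python with SQLite rather than PHP, and the `pages` table, its column names and the sample rows are all hypothetical; the point is only that one query over a name/URL/state table replaces crawling every page.

```python
import sqlite3
from xml.sax.saxutils import escape


def create_demo_db():
    """Build an in-memory table shaped like the one described (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (name TEXT, url TEXT, state TEXT, last_modified TEXT)"
    )
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?, ?)",
        [
            ("Course A", "https://www.example.com/course-a", "active", "2009-10-19"),
            ("Course B", "https://www.example.com/course-b", "active", None),
            ("Old course", "https://www.example.com/deleted-course", "deleted", "2009-01-01"),
        ],
    )
    return conn


def sitemap_from_db(conn):
    """Render a sitemap from active rows only, using a single query."""
    rows = conn.execute(
        "SELECT url, last_modified FROM pages WHERE state = 'active' ORDER BY url"
    )
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for url, last_modified in rows:
        lines.append("  <url>")
        lines.append(f"    <loc>{escape(url)}</loc>")
        if last_modified:  # <lastmod> is optional in the protocol
            lines.append(f"    <lastmod>{escape(last_modified)}</lastmod>")
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(sitemap_from_db(create_demo_db()))
```

Because it runs on the server against the database, a script like this can be scheduled on the Linux box or triggered whenever a course changes state, with no Windows workstation left on overnight.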
| |||||
| Bandwidth won't be an issue - sitemaps are tiny |