Status
Not open for further replies.

jmcc

Active Member
I had to write a sitemap generator for HosterStats.com because it is a large site and would easily break most of the ordinary generator scripts. Even the limited page count, covering just the hoster stats and history pages, is around 10 million pages. I haven't included any domain records in that, as they would add another 250 million pages, of which 124 million or so would be dropped/deleted domains. The good thing is that the site is database driven, so it is possible to generate a list of the hoster records to include. The yearly aspect of the hosting market also means that it is necessary to generate a set of yearly sitemap files. As the sitemap generator I use was written at zero dark thirty, I used Tcl as the scripting language, since it is insanely flexible when it comes to this kind of stuff.
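As a rough illustration of the yearly-files idea (not the Tcl generator described above), here is a minimal Python sketch. It assumes the URLs have already been pulled from the database and grouped by year; the file naming scheme `sitemap-<year>-<n>.xml` and the function names are hypothetical. The 50,000 figure is the per-file URL limit in the Sitemap protocol, which is why a site this size needs many files plus a sitemap index.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file URL limit in the Sitemap protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk(urls, size=MAX_URLS_PER_FILE):
    """Yield successive lists of at most `size` URLs."""
    batch = []
    for url in urls:
        batch.append(url)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def render_sitemap(urls):
    """Return the XML text of one sitemap file."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', f'<urlset xmlns="{NS}">']
    lines += [f"  <url><loc>{escape(u)}</loc></url>" for u in urls]
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def render_index(sitemap_urls):
    """Return the XML text of a sitemap index pointing at each sitemap file."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>', f'<sitemapindex xmlns="{NS}">']
    lines += [f"  <sitemap><loc>{escape(u)}</loc></sitemap>" for u in sitemap_urls]
    lines.append("</sitemapindex>")
    return "\n".join(lines) + "\n"


def build_yearly_sitemaps(urls_by_year, base_url, size=MAX_URLS_PER_FILE):
    """Return ({filename: xml}, index_xml) for a {year: iterable of URLs} mapping.

    The filename pattern is an assumption made for this sketch.
    """
    files = {}
    for year in sorted(urls_by_year):
        for n, batch in enumerate(chunk(urls_by_year[year], size), start=1):
            files[f"sitemap-{year}-{n}.xml"] = render_sitemap(batch)
    index = render_index(f"{base_url}/{name}" for name in files)
    return files, index
```

For a real site of this size you would stream each batch straight to disk rather than hold the XML in memory; the dictionary here just keeps the sketch short.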

One important aspect of writing your own sitemap generator is to have a good model of your website in mind. You have to be able to sort the pages that change continually from those that change yearly or are historically frozen.
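To make the "model of your website" point concrete, here is a small Python sketch of sorting pages into those three groups and mapping each to a `<changefreq>` value. The class names (`live`, `yearly`, `frozen`), the rule based on the year a page belongs to, and the choice of `daily` for live pages are all assumptions for illustration; only the `changefreq` values themselves come from the Sitemap protocol.

```python
from xml.sax.saxutils import escape

# Hypothetical page classes mapped to valid Sitemap <changefreq> values.
CHANGEFREQ = {"live": "daily", "yearly": "yearly", "frozen": "never"}


def classify(page_year, current_year):
    """Classify a page by the year its data belongs to.

    page_year is None for pages that are not tied to any year.
    """
    if page_year is None:
        return "live"      # changes continually
    if page_year >= current_year:
        return "yearly"    # still being updated for the current year
    return "frozen"        # historical, no longer changes


def url_entry(loc, page_year, current_year):
    """Return one <url> element with a changefreq chosen from the page class."""
    freq = CHANGEFREQ[classify(page_year, current_year)]
    return f"<url><loc>{escape(loc)}</loc><changefreq>{freq}</changefreq></url>"
```

Frozen pages only need their sitemap files generated once, which is what makes the yearly split worthwhile on a large site.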

Regards...jmcc
 

andywozhere

Member
Wow, my definition of large looks rather small now. I was thinking of thousands or tens of thousands of pages. Interesting to see you can write your own sitemap generator, although I think I'll probably be going for an off-the-shelf generator for convenience's sake.
 

CMDublin

New Member
GSiteCrawler is what I have been using for the past few years. Never had a problem with it - gsitecrawler.com
 

andywozhere

Member
I tried GSiteCrawler and things appear to have gone pretty well so far.

Anyone out there using Windows Vista should note that you need to start it with the “Run as Administrator” option (took me a while to work this out).
 

maco

New Member
I have used this XML sitemap generator as well. You can try the free version first, though it only supports 500 pages, I think.

I do agree it is a bit slow for a bigger site, but it's perfect for a smaller one.
 

nanotriffid

New Member
WebCEO free version

Personally I love the sitemap generator that comes as part of the WebCEO software. Fortunately the free version gives you access to the sitemap generation tool. What I love about it is the customisability of the XML sitemaps and the fact that you can generate multiple formats at the same time.

Typically I would generate the XML, HTML, ROR (for RSS feeds), and urllist.txt (for Yahoo) all at the same time. My largest client had approx 16,000 static HTML pages with just under 500 blog posts, and the average generation would take 8 hours (we used a shared internet connection that was none too hot).
 

ndrewstrauss

New Member
Sitemap generators create a sitemap that complies with the Sitemap Protocol. Common inputs for the generators include access logs, URL lists, and webserver directories.
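As an example of the access-log input, here is a minimal Python sketch that pulls successfully served pages out of log lines and turns them into a de-duplicated URL list. It assumes the log is in the common log format with a quoted `"GET /path HTTP/x.y"` request followed by the status code; the function name and `base_url` parameter are made up for this example.

```python
import re
from urllib.parse import urljoin

# Matches the quoted request and the status code that follows it.
REQUEST = re.compile(r'"GET (\S+) HTTP/[0-9.]+" (\d{3}) ')


def urls_from_access_log(lines, base_url):
    """Return unique absolute URLs for GET requests answered with 200, in first-seen order."""
    seen = set()
    urls = []
    for line in lines:
        match = REQUEST.search(line)
        if not match or match.group(2) != "200":
            continue  # skip non-GET requests, errors and redirects
        url = urljoin(base_url, match.group(1))
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

The resulting list can be written out one URL per line as a urllist.txt file, or fed into whatever produces the XML sitemap.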
 