Robot Text

Discussion in 'Coding Help' started by Caz, Jun 17, 2006.

  1. Caz

    Caz New Member

    I need a little help understanding the robot text analysis of my site. Does anyone here specialise in or understand this stuff? I am completely lost.

    Thanks Caz.
     
  2. mneylon

    mneylon Administrator Staff Member

    Caz

    I've a fairly good understanding of it.

    What do you need to know?

    Michele
     
  3. Caz

    Caz New Member

    Hi Michele,

    The robots.txt validator I am using is showing a series of errors, all more or less the same as this:

    ERROR Invalid Line:
    <!doc type html public "-//W3C//DTD HTML 4.01 transitional// EN">

    I did not build the site myself and have no coding experience whatsoever. However, I've learnt a lot over the last couple of months and have found some things about our site that concern me a little. I am convinced that the search engines are having trouble reading the pages. I know this could be due to an inherent problem with osCommerce, but it might also be caused by these errors.

    Any help with clarification is greatly appreciated, thanks C.
     
  4. mneylon

    mneylon Administrator Staff Member

    Could you give me a link to the page you are checking?

    It looks like you may have made an error with your document type declaration...

    Though that has nothing to do with robots :)
     
  5. Caz

    Caz New Member

  6. mneylon

    mneylon Administrator Staff Member

    That's giving a 404, i.e. there is no robots.txt file.
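
One likely explanation for the "Invalid Line" errors, assuming the validator simply parses whatever the server sends back: when robots.txt is missing, the server returns an HTML error page, and none of its lines look like robots.txt directives. A minimal Python sketch of that idea follows. The field list and the error page text are assumptions for illustration, not the actual validator's rules or the actual page.

```python
from typing import List

# Assumed set of directive names; a real validator may accept more or fewer.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def invalid_lines(text: str) -> List[str]:
    """Return the lines that do not look like 'Field: value' directives."""
    bad = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # blank lines and comment-only lines are fine
        field, sep, _value = line.partition(":")
        if not sep or field.strip().lower() not in KNOWN_FIELDS:
            bad.append(raw.strip())
    return bad


# Hypothetical HTML error page returned in place of a missing robots.txt.
ERROR_PAGE = """<!doctype html public "-//W3C//DTD HTML 4.01 transitional//EN">
<html><head><title>404 Not Found</title></head>
<body>Not Found</body></html>
"""

# A well-formed robots.txt for comparison.
GOOD_FILE = """# keep robots out of one folder
User-agent: *
Disallow: /secure/
"""

for bad_line in invalid_lines(ERROR_PAGE):
    print("ERROR Invalid Line:", bad_line)
print("Problems in the well-formed file:", len(invalid_lines(GOOD_FILE)))
```

If that is what is happening here, the errors say nothing about the HTML of the site's pages. They only show that the file being checked is not a robots.txt file.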
     
  7. Caz

    Caz New Member

    Hi Michele,

    So what I'm typing in is wrong? Do you know what I need to put in? Sorry, I just can't seem to find the answer anywhere within the programme I'm working on and, as you're probably aware by now, I'm not exactly an expert.

    Many thanks C.
     
  8. mneylon

    mneylon Administrator Staff Member

    Caz

    Maybe if you told us exactly what you were trying to do, we would be able to help you.

    Michele
     
  9. louie

    louie New Member

    A robots.txt file should be fairly simple.
    It tells the robots that follow the rules which directories to spider and which not to.
    It is not a very good idea for protecting anything, though, as you give the secret directories away to visitors who aren't really just visitors.
    In the robots.txt file you can also choose which spiders may index your website and which may not.
    An example:

    User-agent: *
    Disallow: /secure/
    Disallow: /images/

    That tells all robots not to spider the /secure and /images folders. Everything else is OK.
    A better way of protecting a directory is to have it based on login and a valid session.
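
The example rules above can be checked locally before uploading the file. Below is a minimal sketch using Python's standard-library urllib.robotparser. The bot name ExampleBot, the www.example.com host and the page paths are placeholders, not anything from the site being discussed.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above, one directive per line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /secure/
Disallow: /images/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt content held in a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Placeholder paths: two inside the disallowed folders, one outside them.
for path in ("/secure/orders.html", "/images/logo.gif", "/index.html"):
    allowed = parser.can_fetch("ExampleBot", "http://www.example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

This only shows what a well-behaved robot would do with the rules. It does not stop anyone from requesting the folders directly.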
     
  10. grandad

    grandad Moderator

    Correct me if I'm wrong but shouldn't this be
    <!doctype html public "-//W3C//DTD HTML 4.01 transitional// EN"> ??

    And, as Michele says, there is no such file as "robots.txt", so it can't be validated. Are you talking about HTML validation?
     
  11. louie

    louie New Member

    It looks right to me, but he probably made the modification himself.
     
  12. Caz

    Caz New Member

    Guys,

    Many thanks for all your responses. Someone has been able to calm my concerns by PM. Caz.
     
  13. louie

    louie New Member
