What Robots.txt Should Block: A List Of What Shouldn't Be Crawled?


This topic is: not resolved


This topic contains 2 replies, has 1 voice, and was last updated by jcohen 1 year, 10 months ago.

Viewing 3 posts - 1 through 3 (of 3 total)
  • #3452

    jcohen
    Participant
    Post count: 1

    Here’s a partial list of paths that probably shouldn’t be crawled, which I’ve put in my robots.txt.

    Is there anything else that should be added?

    User-agent: *
    Disallow: /blog/wp-admin
    Disallow: /blog/wp-includes
    Disallow: /blog/wp-content/plugins
    Disallow: /blog/wp-content/cache
    Disallow: /blog/wp-content/themes
    Disallow: /blog/trackback
    Disallow: /cgi-bin
    Disallow: /search
    Disallow: /blog/feed
    Disallow: /blog/rss
    Disallow: /blog/comments/feed
    Disallow: /blog/feed/$
    Disallow: /blog/*/feed/$
    Disallow: /blog/*/feed/rss/$
    Disallow: /blog/*/trackback/$
    We generally like to put blogs in a
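
One way to sanity-check a rule list like the one above is to run a few sample paths through a small matcher. The sketch below is only an illustration, not how any particular crawler works internally: it assumes the common wildcard convention in which `*` matches any run of characters, a trailing `$` anchors the end of the path, and every other rule is a plain prefix match. The helper names (`pattern_to_regex`, `matching_rules`, `is_blocked`) and the sample paths are made up for the example.

```python
import re

# Disallow rules copied from the post above (single "User-agent: *" group).
DISALLOW = [
    "/blog/wp-admin",
    "/blog/wp-includes",
    "/blog/wp-content/plugins",
    "/blog/wp-content/cache",
    "/blog/wp-content/themes",
    "/blog/trackback",
    "/cgi-bin",
    "/search",
    "/blog/feed",
    "/blog/rss",
    "/blog/comments/feed",
    "/blog/feed/$",
    "/blog/*/feed/$",
    "/blog/*/feed/rss/$",
    "/blog/*/trackback/$",
]


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and anything else is a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def matching_rules(path, rules=DISALLOW):
    """Return every rule that matches the path, in file order."""
    # re.match anchors at the start, which gives the prefix behaviour.
    return [rule for rule in rules if pattern_to_regex(rule).match(path)]


def is_blocked(path, rules=DISALLOW):
    """True if at least one Disallow rule matches the path."""
    return bool(matching_rules(path, rules))


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise the rules.
    for path in ["/blog/feed/", "/blog/my-post/feed/", "/blog/my-post/",
                 "/blog/feeds-explained", "/searching"]:
        print(path, "->", matching_rules(path) or "allowed")
```

Under those assumptions, `/blog/feed/$` looks redundant, since the plain prefix rule `/blog/feed` already matches `/blog/feed/`, while `/blog/*/feed/$` does add coverage for per-post feeds. The prefix rules without a trailing slash also catch lookalike paths such as `/blog/feeds-explained` or `/searching`, which may or may not be intended. The matcher is hand-rolled because, as far as I know, Python's built-in `urllib.robotparser` does plain prefix matching and does not interpret `*` or `$` inside paths; wildcard support also varies between crawlers.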

    #3514

    jcohen
    Participant
    Post count: 1

    Disregard the last line: “We generally like to put blogs in a..”

    #3618

    jcohen
    Participant
    Post count: 1

    Any chance of some help with this?

    Thanks.

    Joel
