#1 |
Registered User
Join Date: May 2010
Posts: 14
I have a few sites on shared hosting, each with a 1 GB transfer limit per month. Those are super low-traffic sites, so I thought that wouldn't be an issue, but take a look at this:
https://dl.dropbox.com/u/8346986/robots.jpg

That's just for the first 5 days of this month, and that's after I put up a robots.txt like this last month: Code:
# Begin block Bad-Robots from robots.txt
User-agent: robot
Disallow: /
User-agent: bot
Disallow: /
User-agent: spider
Disallow: /
User-agent: crawl
Disallow: /
User-agent: asterias
Disallow: /
User-agent: BackDoorBot/1.0
Disallow: /
User-agent: Black Hole
Disallow: /
User-agent: BlowFish/1.0
Disallow: /
User-agent: BotALot
Disallow: /
User-agent: BuiltBotTough
Disallow: /
User-agent: Bullseye/1.0
Disallow: /
User-agent: BunnySlippers
Disallow: /
User-agent: Cegbfeieh
Disallow: /
User-agent: CheeseBot
Disallow: /
User-agent: CherryPicker
Disallow: /
User-agent: CherryPickerElite/1.0
Disallow: /
User-agent: CherryPickerSE/1.0
Disallow: /
User-agent: CopyRightCheck
Disallow: /
User-agent: cosmos
Disallow: /
User-agent: Crescent
Disallow: /
User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
Disallow: /
User-agent: DittoSpyder
Disallow: /
User-agent: EmailCollector
Disallow: /
User-agent: EmailSiphon
Disallow: /
User-agent: EmailWolf
Disallow: /
User-agent: EroCrawler
Disallow: /
User-agent: ExtractorPro
Disallow: /
User-agent: Foobot
Disallow: /
User-agent: Harvest/1.5
Disallow: /
User-agent: hloader
Disallow: /
User-agent: httplib
Disallow: /
User-agent: humanlinks
Disallow: /
User-agent: InfoNaviRobot
Disallow: /
User-agent: JennyBot
Disallow: /
User-agent: Kenjin Spider
Disallow: /
User-agent: Keyword Density/0.9
Disallow: /
User-agent: LexiBot
Disallow: /
User-agent: libWeb/clsHTTP
Disallow: /
User-agent: LinkextractorPro
Disallow: /
User-agent: LinkScan/8.1a Unix
Disallow: /
User-agent: LinkWalker
Disallow: /
User-agent: LNSpiderguy
Disallow: /
User-agent: lwp-trivial
Disallow: /
User-agent: lwp-trivial/1.34
Disallow: /
User-agent: Mata Hari
Disallow: /
User-agent: Microsoft URL Control - 5.01.4511
Disallow: /
User-agent: Microsoft URL Control - 6.00.8169
Disallow: /
User-agent: MIIxpc
Disallow: /
User-agent: MIIxpc/4.2
Disallow: /
User-agent: Mister PiX
Disallow: /
User-agent: moget
Disallow: /
User-agent: moget/2.1
Disallow: /
User-agent: NetAnts
Disallow: /
User-agent: NICErsPRO
Disallow: /
User-agent: Offline Explorer
Disallow: /
User-agent: Openfind
Disallow: /
User-agent: Openfind data gathere
Disallow: /
User-agent: ProPowerBot/2.14
Disallow: /
User-agent: ProWebWalker
Disallow: /
User-agent: QueryN Metasearch
Disallow: /
User-agent: RepoMonkey
Disallow: /
User-agent: RepoMonkey Bait & Tackle/v1.01
Disallow: /
User-agent: RMA
Disallow: /
User-agent: SiteSnagger
Disallow: /
User-agent: SpankBot
Disallow: /
User-agent: spanner
Disallow: /
User-agent: suzuran
Disallow: /
User-agent: Szukacz/1.4
Disallow: /
User-agent: Teleport
Disallow: /
User-agent: TeleportPro
Disallow: /
User-agent: Telesoft
Disallow: /
User-agent: The Intraformant
Disallow: /
User-agent: TheNomad
Disallow: /
User-agent: TightTwatBot
Disallow: /
User-agent: Titan
Disallow: /
User-agent: toCrawl/UrlDispatcher
Disallow: /
User-agent: True_Robot
Disallow: /
User-agent: True_Robot/1.0
Disallow: /
User-agent: turingos
Disallow: /
User-agent: URLy Warning
Disallow: /
User-agent: VCI
Disallow: /
User-agent: VCI WebViewer VCI WebViewer Win32
Disallow: /
User-agent: Web Image Collector
Disallow: /
User-agent: WebAuto
Disallow: /
User-agent: WebBandit
Disallow: /
User-agent: WebBandit/3.50
Disallow: /
User-agent: WebCopier
Disallow: /
User-agent: WebEnhancer
Disallow: /
User-agent: WebmasterWorldForumBot
Disallow: /
User-agent: WebSauger
Disallow: /
User-agent: Website Quester
Disallow: /
User-agent: Webster Pro
Disallow: /
User-agent: WebStripper
Disallow: /
User-agent: WebZip
Disallow: /
User-agent: WebZip/4.0
Disallow: /
User-agent: Wget
Disallow: /
User-agent: Wget/1.5.3
Disallow: /
User-agent: Wget/1.6
Disallow: /
User-agent: WWW-Collector-E
Disallow: /
User-agent: Xenu's
Disallow: /
User-agent: Xenu's Link Sleuth 1.1c
Disallow: /
User-agent: Zeus
Disallow: /
User-agent: Zeus 32297 Webster Pro V2.9 Win32
Disallow: /
# Begin Exclusion From Directories from robots.txt
Disallow: /cgi-bin/

As you can see, "robot", "bot", "spider" and "crawl" are not respecting that at all. So, a few questions:

1. Does anybody know who runs those bots, and why don't they respect robots.txt?
2. What's up with Googlebot consuming 660 MB in 5 days? Aren't they supposed to NOT be aggressive like that? There was a video where Matt Cutts explains how they are extra careful not to crawl sites too fast or too aggressively, since that can cause problems for smaller hosts.
3. If I add the lines

User-agent: *bot
Disallow: /

since "bot" is the ID of one of the bots, will that also disallow "Googlebot"? Or is * in the robots.txt user-agent field a literal *, not a catch-all symbol?

An answer to any of the 3 questions will be appreciated.
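Regarding question 3, the answer depends on the parser. As a concrete (but not authoritative) data point, here is how Python's built-in urllib.robotparser behaves; other crawlers implement user-agent matching differently, so this only demonstrates one implementation:

```python
# Demo of robots.txt user-agent matching with Python's stdlib parser.
# Note: this shows ONE parser's behavior, not how Googlebot itself matches.
from urllib.robotparser import RobotFileParser

# A plain "bot" record: Python's parser does a case-insensitive
# substring match on the agent name, so "bot" catches "Googlebot" here.
rp = RobotFileParser()
rp.parse(["User-agent: bot", "Disallow: /"])
print(rp.can_fetch("Googlebot", "/"))   # False -> Googlebot would be blocked

# A "*bot" record: the asterisk is NOT a wildcard in the user-agent
# field for this parser (only a bare "*" means "everyone"), so it is
# treated literally and "Googlebot" is not matched.
rp2 = RobotFileParser()
rp2.parse(["User-agent: *bot", "Disallow: /"])
print(rp2.can_fetch("Googlebot", "/"))  # True -> Googlebot still allowed
```

So under this parser a bare `User-agent: bot` group is actually broader than `*bot`. Either way, robots.txt is purely advisory; the bots shown in the stats above are free to ignore it entirely.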
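For questions 1 and 2, the most direct evidence is the raw access log rather than the host's summary graph: totaling response bytes per user-agent string shows exactly who is consuming the quota (and the reverse-DNS of the offending IPs shows who runs them). A minimal sketch, assuming the common Apache/NGINX "combined" log format; the sample lines are made up for illustration:

```python
# Tally transferred bytes per user agent from a "combined"-format access log.
# The log format and sample entries are assumptions for illustration.
import re
from collections import defaultdict

# combined format ends with: "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(r'" (\d{3}) (\d+|-) "[^"]*" "([^"]*)"\s*$')

def bytes_by_agent(lines):
    totals = defaultdict(int)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        status, size, agent = m.groups()
        if size != "-":                 # "-" means no body was sent
            totals[agent] += int(size)
    return totals

sample = [
    '1.2.3.4 - - [05/Jul/2012:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '5.6.7.8 - - [05/Jul/2012:10:00:01 +0000] "GET /a HTTP/1.1" 200 2048 "-" "SomeBadBot/1.0"',
]
for agent, total in sorted(bytes_by_agent(sample).items(), key=lambda kv: -kv[1]):
    print(f"{total:>10}  {agent}")
```

Run against the real log (e.g. the month-to-date access_log the host exposes), this separates genuine Googlebot traffic from impostors faking its user agent, which is worth checking before blaming Google for the 660 MB.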
#2 |
Registered User
Join Date: Dec 2013
Posts: 942
ohhhhhhhhhhhh