# robots.txt generated at http://www.viaguru.com
User-agent: Googlebot
User-agent: Slurp
User-agent: msnbot*
User-agent: Mediapartners-Google*
User-agent: Googlebot-Image
User-agent: Yahoo-MMCrawler
User-agent: AOL
User-agent: Appie
User-agent: Ask Jeeves
User-agent: Infoseek SideWinder
User-agent: Inktomi slurp
User-agent: Larbin
User-agent: Linknzbot
User-agent: Microsoft URL Control
User-agent: NuSearch Spider
User-agent: PipeLiner
User-agent: Scrubby
User-agent: SpeedySpider
User-agent: Szukacz
User-agent: WebSearch
User-agent: Webwombat
User-agent: WWWeasel
User-agent: FAST-WebCrawler
User-agent: Fluffy the spider
User-agent: Overture
User-agent: 123 India
User-agent: 100 Hot Sites
User-agent: 2Ants Search Engine
User-agent: 321 Webmaster
User-agent: 777-Mall.com
User-agent: 777Media
User-agent: Aaspaas
User-agent: Abacho
User-agent: Abraham Search
User-agent: Access New Zealand
User-agent: Accessfp
User-agent: Accoona
User-agent: Achei
User-agent: Acoon Search
User-agent: AdmCity
User-agent: AdNet
User-agent: Aeiwi
User-agent: Aesop
User-agent: Ah-ha
User-agent: AIGAM.com
User-agent: AiwaGulf(Persian Gulf)
User-agent: Alexa
User-agent: All the Sites
User-agent: AllofEarth
User-agent: AllTopWebsites
User-agent: Alluna.de
User-agent: Amfibi
User-agent: Amidalla
User-agent: Anadir Web
User-agent: Ananzi South Africa
User-agent: Anazitisis
User-agent: Anoox
User-agent: Antena
User-agent: Arianna(IT)
User-agent: Arianna(Italy)
User-agent: Artbit Krakow (PL)
User-agent: Ascentdirectory
User-agent: AT-LA
User-agent: AtSearch
User-agent: AU Search Engine
User-agent: AuctiOn
User-agent: AustroNaut(AT AltaVista)
User-agent: Auyantepui(ve)
User-agent: Avalonsearch
User-agent: AxxaSearch
User-agent: Balla(China)
User-agent: Balinesia
User-agent: Beamed
User-agent: Belport(DE)
User-agent: Bestyellow
User-agent: BigFinder
User-agent: Biveroo
User-agent: Bivlo
User-agent: Blue Windo(CH-AltaVista)
User-agent: Bluelu
User-agent: Boitho
User-agent: Boitho.com
User-agent: Boston Search Engine Directory
User-agent: Breizhoo
User-agent: Brujula
User-agent: Burf.com
User-agent: Busca Popular(br)
User-agent: Busca Real
User-agent: BuscaArtigo(br)
User-agent: Buscapique
User-agent: Busco.com
User-agent: Business-Inc.net
User-agent: Businessweek Biz
User-agent: Buzzle.com
User-agent: CA Web Search
User-agent: Caloweb.de
User-agent: Camana
User-agent: CambSearch(uk)
User-agent: Canada One(ca)
User-agent: Canadian Content
User-agent: Canadopedia
User-agent: Cavarzano.com
User-agent: Cdnbusinessdirectory
User-agent: CdNet
User-agent: Cerain italia(IT)
User-agent: Chevere(ve)
User-agent: Cipinet
User-agent: Classified2000.net
User-agent: Claymont.com
User-agent: Clixtore
User-agent: Columus-Finder(de)
User-agent: Comoestamos Search
User-agent: CompletePlanet
User-agent: CoolHomepages
User-agent: Cool Site
User-agent: CoolFishy
User-agent: Cowleys Austrlia Business Directory(AU)
User-agent: Cranik(br)
User-agent: Crawler(de)
User-agent: Curioso(br)
User-agent: CyngoSeek.com
User-agent: CyperZip(ca)
User-agent: Deepindex
User-agent: Demon
User-agent: DIABOLOS(IT)
User-agent: DinoSearch
User-agent: Direct Hit
User-agent: Directory Home
User-agent: DWInfoServer
User-agent: Ecila(FR)
User-agent: Elcano
User-agent: Enter UK
User-agent: EntireWeb
User-agent: Entireweb(USA)
User-agent: e-sysgoing
User-agent: EuBusco(br)
User-agent: Eureka
User-agent: EuroNaut(AT)
User-agent: EuroNet
User-agent: eVisum
User-agent: Evreka(FI)
User-agent: Exalead
User-agent: Exploora(br)
User-agent: FAST Search
User-agent: Femina
User-agent: Find Once
User-agent: Find Once(UK)
User-agent: Forcehigh
User-agent: FoundaYa(UK)
User-agent: FreshLinks
User-agent: Friday Night Online
User-agent: FyberSearch
User-agent: Gais
User-agent: Gaylifeuk
User-agent: GB Search(uk)
User-agent: Genesis(br)
User-agent: GeoWeb
User-agent: Ghetosearch
User-agent: Gigabusca(br)
User-agent: Gibiblast
User-agent: Go2Directory
User-agent: Google
User-agent: Google (BR)
User-agent: Google (ES)
User-agent: Google (FR)
User-agent: Google (IT)
User-agent: Google (NO)
User-agent: Google (SE)
User-agent: Google (NO)
User-agent: Google (UK)
User-agent: Google (AU)
User-agent: Google.de
User-agent: Gooru (Poland)
User-agent: GoSnoop
User-agent: Great British Pages (uk)
User-agent: GuiaCade(br)
User-agent: Hannover Web
User-agent: HomerWeb
User-agent: Hotbot
User-agent: Hotbot (UK)
User-agent: Howzat
User-agent: Howzat(au)
User-agent: Iconnic
User-agent: Igwanna.com
User-agent: In.gr(Greece)
User-agent: Infignos
User-agent: Infohighway
User-agent: Infoseek.pl
User-agent: Infotiger
User-agent: Infotiger(Germany)
User-agent: InnerMobiles
User-agent: IntelSeek
User-agent: Inter Lap
User-agent: Intersearch Europe
User-agent: ISAmillionaire Search
User-agent: Jadoo
User-agent: JDGO
User-agent: Jonga.co.za
User-agent: Jopinet
User-agent: Jungle-Spider.de
User-agent: Jyxo Group
User-agent: Kerplop
User-agent: Kgogo(KR)
User-agent: KHOJ(India)
User-agent: Kor-Seek(kr)
User-agent: Kvasir (NO)
User-agent: Lets Find It Now
User-agent: Life2Web Search
User-agent: LimeSearch(UK)
User-agent: LinksReference
User-agent: Local(UK)
User-agent: Lycos
User-agent: Magellan Info
User-agent: Maple Square(ca)
User-agent: Megaglobe
User-agent: Mirago
User-agent: Mixcat
User-agent: MSN Search
User-agent: Bling
User-agent: MyCanadaSearch
User-agent: MyIdeaz.com
User-agent: NationalDirectory
User-agent: Net scan
User-agent: Netcelsius
User-agent: Netsprint (Poland)
User-agent: NetWhat
User-agent: Nublo
User-agent: NZ Explorer
User-agent: OfficialSearch
User-agent: Onde Achar Sites(br)
User-agent: Online Shopping Directory
User-agent: Online-Favoriten(Germany)
User-agent: OnlinePilot.de
User-agent: Paginas Croquer(IT)
User-agent: Parai(br)
User-agent: Peskisa(br)
User-agent: Pesquisando(br)
User-agent: Phone Warehouse (uk)
User-agent: PlanetSearch
User-agent: PleaseRetrieve
User-agent: Poddys
User-agent: Ponteiro(br)
User-agent: Portal-Internet(br)
User-agent: PowerSearch
User-agent: Publicitalia(IT)
User-agent: QuestFinder
User-agent: Radar UOL (BR)
User-agent: Radix(br)
User-agent: Rambler(Russia)
User-agent: Read2Com Free Links
User-agent: REX
User-agent: Rocketinfo
User-agent: Sandseekers
User-agent: Sapo(br)
User-agent: Sawaal Network
User-agent: Scrub The Web
User-agent: Sear
User-agent: Search Around The World
User-agent: Search ch
User-agent: Search Hound
User-agent: Search King
User-agent: Search Sight
User-agent: Search ZA
User-agent: Search.NL (nl)
User-agent: SearchAve.com
User-agent: SearchBlue
User-agent: SearchIT
User-agent: Search-O-Rama.com
User-agent: SearchPower
User-agent: SearchRamp
User-agent: SearchTheWorld
User-agent: SearchUK
User-agent: SearchUK(uk)
User-agent: Searchwarp
User-agent: SearchWho
User-agent: Seek.de
User-agent: Searchwiz
User-agent: SeekItOut.com
User-agent: SentenceSeek
User-agent: SEO7
User-agent: Shoula!
User-agent: Shoula! Search
User-agent: Slider.com
User-agent: SOL(es)
User-agent: Somuch
User-agent: SonicQuest
User-agent: SonicRun
User-agent: Spark Search
User-agent: Speedfind (de)
User-agent: SplatSearch
User-agent: StopDog
User-agent: SubBrain
User-agent: Subjex Search
User-agent: Subjex.com
User-agent: Submit-It
User-agent: Suchmaschine
User-agent: SuperSites(br)
User-agent: Superwap
User-agent: SurfGizmo
User-agent: SurfGopher
User-agent: SusySearch
User-agent: Swiss Search
User-agent: Szigg
User-agent: Szukacz
User-agent: Taqui-Tche
User-agent: Te Pierdes
User-agent: Tecnet
User-agent: Tein Cool Links
User-agent: Telepolis
User-agent: Terra
User-agent: The Biz UK
User-agent: The Business Dir
User-agent: The Search King
User-agent: The Swiss Search Engine
User-agent: Tiggr
User-agent: Top 50 Web Sites
User-agent: Topmart
User-agent: Trovator
User-agent: True Search
User-agent: Try America
User-agent: TurnPike
User-agent: U2Search
User-agent: Ugabula
User-agent: Ugo Moi
User-agent: UK Plus
User-agent: UK Search
User-agent: UK Top 100
User-agent: Um.es
User-agent: Unasked.com
User-agent: Uwad.com
User-agent: Velendi.com
User-agent: Voida
User-agent: VoyagerSearch
User-agent: Walhello
User-agent: Walhello Internet Search
User-agent: Web 100
User-agent: Web De Links
User-agent: Web Top
User-agent: Web Wombat Global
User-agent: Webatola.com
User-agent: Webbizfind
User-agent: WebEstate
User-agent: WebSearch.com.au
User-agent: WebSquash
User-agent: Webwatch
User-agent: WebWizard.at
User-agent: Whatuseek
User-agent: Whichone
User-agent: Wholesale Engines
User-agent: World Man
User-agent: World Trawler Search
User-agent: WorldHot.com
User-agent: WorldLight Network
User-agent: Woyaa! Africa Search
User-agent: WWW.Ru
User-agent: Xland Web Directory
User-agent: YooZee
User-agent: Your Weblog Here
User-agent: YourBizz Search
User-agent: Yupi Spain
User-agent: ZenSearch
User-agent: Advalvas Yellow Pages
User-agent: Belgian Yellow Pages
User-agent: Busca Direta
User-agent: Busca Site
User-agent: Cari Search
User-agent: dmoz
User-agent: Huifa
User-agent: iBound Search
User-agent: EurNet Online
User-agent: Emcontrar
User-agent: Information Marketplace
User-agent: Jayde Online
User-agent: Latin World
User-agent: LinXdirectory
User-agent: Max South African Web
User-agent: Meu Jovem
User-agent: Mex Search Yellow Pages
User-agent: Network22 Business Directory
User-agent: MyPage Web Directory
User-agent: Nos Achamos
User-agent: Ohio Sites
User-agent: OzSearch
User-agent: Rex Directory
User-agent: RoadHouse
User-agent: SBEL
User-agent: SearchUp
User-agent: ShopInternet
User-agent: SiteInclusion
User-agent: SoftwareShop
User-agent: solascope
User-agent: SurfChina
User-agent: Thailand WWW Directory
User-agent: TheYellowPages.com
User-agent: Top 50 Web Sites
User-agent: Top Online Shopping
User-agent: Total Finder
User-agent: Where2Go Business Directory
User-agent: WorldBazaarOnline
User-agent: Yehey
User-agent: Yellow Pages SuperHighway
User-agent: World Shopping Internet Directory
User-agent: Web Shopping Internet Directory
User-agent: Netmall
User-agent: Select Stores
User-agent: Top 50 Web Sites
User-agent: Web 100
User-agent: 100 Hot Sites
Disallow:

User-agent: *
Disallow: /

User-agent: googlebot-image
Disallow: /

User-agent: psbot
Disallow: /

User-agent: asterias
User-agent: *
Disallow: /admin/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /wp-
Disallow: /wp-*
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /gurugrid/wp-admin/
Disallow: /comments/
Disallow: /author/
Disallow: /trackback/
Disallow: /gurugrid/wp-includes/
Disallow: /gurugrid/wp-content/plugins/
Disallow: /gurugrid/wp-content/cache/
Disallow: /gurugrid/wp-content/themes/
Disallow: /gurugrid/wp-
Disallow: /gurugrid/category/
Disallow: /gurugrid/comments/
Disallow: /gurugrid/tag/
Disallow: /gurugrid/author/
Disallow: /gurugrid/trackback/
Disallow: /gurugrid/twitbig/followers/
Disallow: /globalbusiness/administrator/
Disallow: /globalbusiness/components/
Disallow: /globalbusiness/administrator/
Disallow: /bonus/
Disallow: /Images-Root-Viaguru/

# NOTE: The directives below are Apache mod_rewrite rules, not robots.txt syntax.
# Crawlers ignore them in robots.txt; they take effect only in an Apache
# server configuration or .htaccess file.

# Robots known or highly suspected of collecting email addresses for spam
RewriteCond %{HTTP_USER_AGENT} ^(autoemailspider|Bullseye|CherryPicker|Crescent|ecollector|EmailCollector|Email.Extractor|EmailSiphon|EmailWolf|ExtractorPro|fastlwspider|.*LWP|Digger|.*hhjhj@yahoo|Microsoft.URL|Mozilla/3.Mozilla/2.01|Mozilla.*NEWT|NICErsPRO|SurfWalker|Telesoft|WebBandit|WebEMailExtrac|Zeus.*Webster) [NC,OR]

# Robots (sometimes called spiders) which regularly violate robots.txt
RewriteCond %{HTTP_USER_AGENT} ^(ADSARobot|.*almaden\.ibm|ASSORT|big.brother|bumblebee|Digimarc|FavOrg|FAST|.*fluffy|.*Girafabot|HomePageSearch|IncyWincy|Ingelin|NPBot|Openfind|OpenTextSiteCrawler|OrangeBot|Robozilla|ScoutAbout|.*searchhippo|searchterms\.it|sitecheck|UIowaCrawler|.*webcraft@bea\.com|WEBMASTERS|WhosTalking|WISEbot|Yandex) [NC,OR]

# Agents used for both good and bad purposes, such as sucking up bandwidth
# by downloading entire sites, or probing servers for security exploits.
RewriteCond %{HTTP_USER_AGENT} ^(ASPSeek|Deweb|Fetch|FlashGet|Getleft|GetURL|GetWebPage|.*HTTrack|KWebGet|libwww-perl|Mirror|NetAnts|NetCarta|netprospector|Net.Vampire|pavuk|PSurf|PushSite|reget|Rsync|Shai|SpiderBot|SuperBot|tarspider|Templeton|w3mir|web.by.mail|WebCopier|WebCopy|WebMiner|WebReaper|WebSnake|WebStripper|webvac|webwalk|WebZIP|Wget|XGET) [NC,OR]

# Miscellaneous (suspicious -- more information would be appreciated)
RewriteCond %{HTTP_USER_AGENT} ^(ah-ha|aktuelles|amzn_assoc|ATHENS|attache|bew|disco|.*DTS.Agent|Favorites.Sweeper|FEZhead|Generic|GetRight|go-ahead-got-it|.*Harvest|IBM_Planetwide|leech|MCspider|NetResearchServer|nost\.info|OpaL|PackRat|RepoMonkey|.*Rover|Spegla|SqWorm|.*TrueRobot|UtilMind|vspider|.*WUMPUS) [NC,OR]

# Blank or 10-letter user agent
RewriteCond %{HTTP_USER_AGENT} ^(-?|[A-Z]{10})$ [OR]

# A host which tries to hide itself in reverse DNS lookup
RewriteCond %{REMOTE_HOST} ^private$ [NC,OR]

# Web surveying sites (may require using ipchains)
RewriteCond %{HTTP_REFERER} (traffixer|netfactual|netcraft)\.com [NC,OR]
RewriteCond %{REMOTE_HOST} \.netcraft\.com$ [NC,OR]

# A fake referrer that's often used -- use this unless your pages are related
# in some way to atomic energy and could really be linked to from www.iaea.org
RewriteCond %{HTTP_REFERER} ^[^?]*iaea\.org [NC,OR]

# "addresses.com" is a referrer used by an email address extractor
RewriteCond %{HTTP_REFERER} ^[^?]*addresses\.com [NC,OR]

# A fake referrer that's used in conjunction with formmail exploits
RewriteCond %{HTTP_REFERER} ^[^?]*\.ideography\.co\.uk [NC]

# The rule which blocks out further access from the host
RewriteRule .* /cgi-bin/bad.pl [L,T=application/x-httpd-cgi]

RewriteCond %{HTTP_USER_AGENT} ^webcollage
RewriteRule .* - [L,F]

# Bad requests which look like attacks (these have all been seen in real attacks)
RewriteRule ^[^?]*/(owssvr|strmver|orders|Auth_data|redirect\.adp|MSOffice|DCShop|msadc|winnt|system32|script|autoexec|formmail\.pl|_mem_bin|NULL\.) /cgi-bin/bad.pl [NC,L,T=application/x-httpd-cgi]

# Filter out bad requests (may need to be adjusted to your needs)
RewriteCond %{THE_REQUEST} "^((GET|POST|HEAD) [^/]|CONNECT)" [NC]
RewriteRule .* /cgi-bin/bad.pl [L,T=application/x-httpd-cgi]

RewriteEngine on
# Patterns that contain spaces are quoted so that Apache reads each one as a
# single CondPattern; literal "+", "(", ")" and "?" characters are escaped.
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
RewriteCond %{HTTP_USER_AGENT} ^[Ww]eb[Bb]andit [OR]
RewriteCond %{HTTP_USER_AGENT} ^WebEMailExtrac.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
RewriteCond %{HTTP_USER_AGENT} ^Telesoft [OR]
RewriteCond %{HTTP_USER_AGENT} ^Zeus.*Webster [OR]
RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
RewriteCond %{HTTP_USER_AGENT} "^ESurf15a 15" [OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack [OR]
RewriteCond %{HTTP_USER_AGENT} "^Franklin Locator" [OR]
RewriteCond %{HTTP_USER_AGENT} "^FSurf15a 01" [OR]
RewriteCond %{HTTP_USER_AGENT} "^8484 Boston Project v 1.0" [OR]
RewriteCond %{HTTP_USER_AGENT} ^atSpider/1.0 [OR]
RewriteCond %{HTTP_USER_AGENT} "^China Local Browse 2.6" [OR]
RewriteCond %{HTTP_USER_AGENT} "^EmeraldShield.com WebBot" [OR]
RewriteCond %{HTTP_USER_AGENT} "^EmeraldShield spam and web filtration services" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Franklin Locator 1.8" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Guestbook Auto Submitter" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Hi! I'm CsCrawler my homepage: http://www.kde.cs.uni-kassel.de/lehre/ss2005/googlespam/crawler.html RPT-HTTPClient/0.3-3" [OR]
RewriteCond %{HTTP_USER_AGENT} ^http://www.innovation.ch/java/HTTPClient/ [OR]
RewriteCond %{HTTP_USER_AGENT} ^infoConveraCrawler/0.8 [OR]
RewriteCond %{HTTP_USER_AGENT} "^INGRID/3.0 MT" [OR]
RewriteCond %{HTTP_USER_AGENT} "^ISC Systems iRc Search 2.1" [OR]
RewriteCond %{HTTP_USER_AGENT} "^IUPUI Research Bot v 1.9a" [OR]
RewriteCond %{HTTP_USER_AGENT} "^LetsCrawl.com/1.0 \+http://letscrawl.com/" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Lincoln State Web Browser" [OR]
RewriteCond %{HTTP_USER_AGENT} ^LWP::Simple/5.803 [OR]
RewriteCond %{HTTP_USER_AGENT} "^Mac Finder 1.0.xx" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Microsoft URL Control - 6.00.8xxx" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Missauga Locate 1.0.0" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Missigua Locator 1.9" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Missouri College Browse" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Mizzu Labs 2.2" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Mo College 1.9" [OR]
RewriteCond %{HTTP_USER_AGENT} "^Mozilla/3.0 \(INGRID/3.0 MT; webcrawler@NOSPAMexperimental.net; http://aanmelden.ilse.nl/\?aanmeld_mode=webhints\)"