Friday, September 14, 2012

Protect your website from bots!

The goal of this post is to give you a ready-made way to protect your website from bots that may scrape its content or attack it.

In a .htaccess file at the root of your site, add the following lines:

# Enable mod_rewrite for this directory
RewriteEngine On
# Each condition is an unanchored, case-insensitive ([NC]) substring match
# on the User-Agent header; [OR] chains the conditions together.
RewriteCond %{HTTP_USER_AGENT} WebsecurifyScanner [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Morfeus [NC,OR]
RewriteCond %{HTTP_USER_AGENT} updownerbot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} python [NC,OR]
RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Filangy [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BackWeb [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BackStreet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Bandit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} BatchFTP [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Bullseye [NC,OR]
RewriteCond %{HTTP_USER_AGENT} bumblebee [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CherryPicker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CherryPickrElite [NC,OR]
RewriteCond %{HTTP_USER_AGENT} CherryPickerSE [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ChinaClaw [NC,OR]
RewriteCond %{HTTP_USER_AGENT} clipping [NC,OR]
RewriteCond %{HTTP_USER_AGENT} collage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Collector [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Crescent [NC,OR]
RewriteCond %{HTTP_USER_AGENT} eCatch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} EirGrabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} EmeraldShield [NC,OR]
RewriteCond %{HTTP_USER_AGENT} FlashGet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} FlickBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} FrontPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GetRight [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GetSmart [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GetWeb [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GetWebPage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} gigabaz [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GornKer [NC,OR]
RewriteCond %{HTTP_USER_AGENT} gotit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Grabber [NC,OR]
RewriteCond %{HTTP_USER_AGENT} GrabNet [NC,OR]
RewriteCond %{HTTP_USER_AGENT} kapere [NC,OR]
RewriteCond %{HTTP_USER_AGENT} larbin [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Missigua [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Vampire [NC,OR]
RewriteCond %{HTTP_USER_AGENT} PycURL [NC,OR]
RewriteCond %{HTTP_USER_AGENT} RealDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Reaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Recorder [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SearchExpress [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SlySearch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} SmartDownload [NC,OR]
RewriteCond %{HTTP_USER_AGENT} snagger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Snake [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Stripper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Sucker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Telesoft [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebAuto [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebBandit [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCapture [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Webclipping [NC,OR]
RewriteCond %{HTTP_USER_AGENT} webcollage [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebCopier [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebEMailExtrac [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebFetch [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebLeacher [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebMiner [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebMirror [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebReaper [NC,OR]
RewriteCond %{HTTP_USER_AGENT} WebWhacker [NC,OR]
RewriteCond %{HTTP_USER_AGENT} Whacker [NC,OR]
# The last condition carries no [OR]
RewriteCond %{HTTP_USER_AGENT} whizbang [NC]
# Answer any matching request with 403 Forbidden ([F] implies [L])
RewriteRule ^ - [F]

This list of bots is not exhaustive, but updating it at regular intervals should still give you reasonably effective basic protection.
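To make those regular updates easier, the same kind of rule can be written as a single condition with an alternation, so that adding a bot means adding one name between the parentheses. The snippet below is only a minimal sketch of that idea, not a replacement for the full list: it reuses three names from the list above, and "ExampleBadBot" is a hypothetical placeholder for whatever new user agent you spot in your access logs.

```apache
RewriteEngine On
# One case-insensitive condition; separate the names with "|".
# "ExampleBadBot" is a hypothetical placeholder, not a real crawler name.
RewriteCond %{HTTP_USER_AGENT} (HTTrack|WebCopier|larbin|ExampleBadBot) [NC]
# Answer any matching request with 403 Forbidden
RewriteRule ^ - [F]
```

Because the match is an unanchored substring, a short name such as "Grabber" already covers longer ones that contain it, such as "EirGrabber", so the alternation can stay fairly compact.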