Last edited by 追影 on 2023-5-15 20:41
# Block scraping by Scrapy and similar tools
if ($http_user_agent ~* (Scrapy|Curl|HttpClient)) {
    return 403;
}
# Block the listed user agents, and requests with an empty User-Agent
if ($http_user_agent ~* "YandexBot|Bytespider|FeedDemon|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|Feedly|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|oBot|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|heritrix|Ezooms|^$") {
    return 403;
}
# Block requests that use any method other than GET, HEAD, or POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}

On Apache, the same thing is done with rewrite rules:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} ^.*(spider|bot|slurp).*$ [NC]
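
A RewriteCond has no effect on its own; it only qualifies the RewriteRule that follows it, and the post does not show that rule. A minimal sketch of a complete rule set might look like the following. It assumes mod_rewrite is enabled and that the lines go in a virtual host or .htaccess file, and the RewriteRule line is an assumption added here to mirror the 403 returned in the nginx rules, not something taken from the post.

```apache
# Sketch only: assumes mod_rewrite is loaded and rewriting is allowed in this context
RewriteEngine On
# Case-insensitive match on any User-Agent containing "spider", "bot", or "slurp"
RewriteCond %{HTTP_USER_AGENT} (spider|bot|slurp) [NC]
# Assumed action (not in the original post): "-" leaves the URL untouched, [F] answers 403 Forbidden
RewriteRule ^ - [F]
```

Note that a pattern as broad as "bot" also matches search engine crawlers such as Googlebot, so it may be worth narrowing the list if the site should stay indexed.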