DamiCMS Official Forum — Dami webmaster alliance, Dami webmaster home, Dami developer community
Title: Using multiple items and multiple pipelines in Python Scrapy
Author: 追影
Date: 2023-6-30 17:28
Multiple items
This part is simple: define the corresponding item classes in items.py, then import them in the spider.
items.py
import scrapy

class MymultispiderItem(scrapy.Item):
    # define the fields for your item here like:
    # name = scrapy.Field()
    pass

class Myspd1spiderItem(scrapy.Item):
    name = scrapy.Field()

class Myspd2spiderItem(scrapy.Item):
    name = scrapy.Field()

class Myspd3spiderItem(scrapy.Item):
    name = scrapy.Field()
Use the corresponding item inside the spider:
import scrapy
from mymultispider.items import Myspd1spiderItem

class Myspd1Spider(scrapy.Spider):
    name = 'myspd1'
    allowed_domains = ['sina.com.cn']
    start_urls = ['http://sina.com.cn/']

    def parse(self, response):
        print('myspd1')
        item = Myspd1spiderItem()
        item['name'] = 'pipelines for myspd1'
        yield item
4. Specifying pipelines
1. There are also two ways to do this. Way one: define multiple pipeline classes.
In pipelines.py:
class Myspd1spiderPipeline:
    def process_item(self, item, spider):
        print(item['name'])
        return item

class Myspd2spiderPipeline:
    def process_item(self, item, spider):
        print(item['name'])
        return item

class Myspd3spiderPipeline:
    def process_item(self, item, spider):
        print(item['name'])
        return item
1.1 Enable the pipelines in settings.py:
ITEM_PIPELINES = {
    'mymultispider.pipelines.Myspd1spiderPipeline': 300,
    'mymultispider.pipelines.Myspd2spiderPipeline': 300,
    'mymultispider.pipelines.Myspd3spiderPipeline': 300,
}
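Note that enabling all three pipelines globally like this means every item from every spider passes through all three classes. One common way to filter inside a shared pipeline is to check spider.name in process_item — a minimal sketch (the pipeline name and the uppercase transformation are hypothetical, not from the original post):

```python
# Hypothetical shared pipeline: Scrapy passes the running spider instance
# as the second argument to process_item, so we can branch on its name.
class NameFilteringPipeline:
    def process_item(self, item, spider):
        if spider.name == 'myspd1':
            # Only items yielded by the myspd1 spider get this handling
            item['name'] = item['name'].upper()
        return item
```

Pipelines are plain Python classes, so items from myspd2 and myspd3 simply pass through unchanged.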
1.2 Specify the pipeline in the spider itself:
import scrapy
from mymultispider.items import Myspd1spiderItem

class Myspd1Spider(scrapy.Spider):
    name = 'myspd1'
    allowed_domains = ['sina.com.cn']
    start_urls = ['http://sina.com.cn/']
    custom_settings = {
        'ITEM_PIPELINES': {'mymultispider.pipelines.Myspd1spiderPipeline': 300},
    }

    def parse(self, response):
        print('myspd1')
        item = Myspd1spiderItem()
        item['name'] = 'pipelines for myspd1'
        yield item
The code that specifies the pipeline:
custom_settings = {
    'ITEM_PIPELINES': {'mymultispider.pipelines.Myspd1spiderPipeline': 300},
}
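The post says there are two ways to specify pipelines but only shows way one; a second pattern commonly used for multiple items is a single pipeline that branches on the item's class. A sketch, not taken from the original post — plain dict subclasses stand in for the scrapy.Item classes so it is self-contained:

```python
# Stand-ins for the scrapy.Item subclasses defined in items.py
# (simplified to dict subclasses so this sketch runs without Scrapy).
class Myspd1spiderItem(dict):
    pass

class Myspd2spiderItem(dict):
    pass

# One pipeline that routes items by type instead of one class per spider.
class TypeRoutingPipeline:
    def process_item(self, item, spider):
        if isinstance(item, Myspd1spiderItem):
            item['source'] = 'myspd1'  # type-specific handling
        elif isinstance(item, Myspd2spiderItem):
            item['source'] = 'myspd2'
        return item
```

This keeps a single ITEM_PIPELINES entry in settings.py while still treating each item type differently.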
Welcome to the DamiCMS Official Forum — Dami webmaster and developer community (https://www.damicms.com/bbs/)
Powered by Discuz! X3.1