Scraping Amazon product titles returns no content

阳关的香味 posted on 2015/04/29 15:17

@明天以后 Hello,



import scrapy
from tutorial.items import AmazonItem


# A plain Spider is used because CrawlSpider reserves parse() for its own rule handling.
class AmazonSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["amazon.com"]
    start_urls = ["http://www.amazon.com/dp/B00117STKE/"]

    def parse(self, response):
        items = []
        # Note: if no centerCol node matches, the loop body never runs and no items are returned.
        sites = response.xpath('//div[starts-with(@id,"centerCol") and starts-with(@class,"centerColAlign")]')
        for site in sites:
            item = AmazonItem()
            # First layout; the leading "." keeps the query relative to this node.
            title = site.xpath('.//div[contains(@id,"title_feature_div")]//div[contains(@id,"titleSection")]/h1[contains(@id,"title")]/span/text()').extract()
            if not title:
                # Second layout, searched across the whole page.
                title = response.xpath('//div[contains(@class,"buying")]//h1[contains(@class,"parseasinTitle")]/span/text()').extract()
            # Store one string, or 0 when neither layout matched (the key was misspelled 'slus').
            item['slug'] = title[0] if title else 0
            items.append(item)
        return items


I'd like to ask you a question: crawling http://www.amazon.com/dp/B00N9Z5Z80/ works, so why does http://www.amazon.com/dp/B00117STKE/ fail?
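
One thing visible in the posted spider is that the second title XPath is only tried inside the loop over the centerCol nodes, so a page without that container yields no items at all. Whether that is what happens for B00117STKE is not confirmed in this thread. As a rough sketch, the two title XPaths could be tried in order against the whole page. The helper name first_title, the TITLE_XPATHS list and the fake_page stand-in are hypothetical and only serve the illustration.

```python
def first_title(query, xpaths):
    """Try each XPath in order and return (xpath, title) for the first one
    that yields text, or (None, None) when none of them match."""
    for xp in xpaths:
        results = query(xp)
        if results:
            return xp, results[0].strip()
    return None, None


# The two title XPaths from the spider, in the order it tries them.
TITLE_XPATHS = [
    '//div[contains(@id,"title_feature_div")]//div[contains(@id,"titleSection")]'
    '/h1[contains(@id,"title")]/span/text()',
    '//div[contains(@class,"buying")]//h1[contains(@class,"parseasinTitle")]/span/text()',
]

# Hypothetical stand-in for a page where only the second layout matches.
fake_page = {TITLE_XPATHS[1]: ["  Example Product  "]}

matched, title = first_title(lambda xp: fake_page.get(xp, []), TITLE_XPATHS)
print(matched == TITLE_XPATHS[1], title)
```

Inside parse(), the same helper could be called as first_title(lambda xp: response.xpath(xp).extract(), TITLE_XPATHS), which also tells you which layout matched.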

明天以后

I'll take a look tonight when I get home; I'm at work right now.

Or add me on QQ: 944898186. Happy to chat.

fromdtor

Run scrapy shell http://www.amazon.com/dp/B00117STKE/

Then run print response._get_body()

The body has content! Take a look at the error message.
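
If the body has content but the spider still returns nothing, a quick follow-up check is whether the ids and class names the XPaths rely on appear in that body at all. The following is a minimal sketch under that assumption. The helper find_markers and the sample_body string are hypothetical; in scrapy shell you would pass response.body instead.

```python
def find_markers(body, markers):
    """Return a dict mapping each marker string to True or False depending
    on whether it occurs anywhere in the page body."""
    if isinstance(body, bytes):
        # response.body is a byte string; decode leniently for searching.
        body = body.decode("utf-8", errors="replace")
    return {marker: (marker in body) for marker in markers}


# Marker strings taken from the XPath expressions in the spider.
MARKERS = ["centerCol", "title_feature_div", "titleSection", "parseasinTitle"]

# Hypothetical stand-in for a page body, used only for illustration.
sample_body = b'<div class="buying"><h1 class="parseasinTitle"><span>Example</span></h1></div>'

report = find_markers(sample_body, MARKERS)
for marker in MARKERS:
    print(marker, report[marker])
```

A marker that is reported as missing means the XPath built on it cannot match that page, which narrows down which expression needs changing.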

fromdtor
@阳关的香味 Please post the error message.
阳关的香味
I still can't get it to work. Please help.