Using Splash to Crawl Pages That Require JavaScript Execution

While writing a Scrapy spider, I ran into a page that redirects via JavaScript. Does anyone have a good solution?

The answer is:

splash

Splash is a javascript rendering service with an HTTP API. It’s a lightweight browser with an HTTP API, implemented in Python 3 using Twisted and QT5.

It’s fast, lightweight and state-less which makes it easy to distribute.

Documentation is available here: https://splash.readthedocs.io/
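Because Splash is driven over HTTP, a rendered page can be fetched with nothing more than a URL. The following is a minimal sketch, not an official client: it assumes a Splash instance is listening on http://localhost:8050 (the address is an assumption, adjust it to your deployment) and uses the `render.html` endpoint with its `url` and `wait` arguments. The helper names `build_render_url` and `fetch_rendered_html` are made up for this illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of a locally running Splash instance.
SPLASH_BASE = "http://localhost:8050"


def build_render_url(page_url, wait=0.5, splash_base=SPLASH_BASE):
    """Build a render.html request URL for the given page.

    `wait` is the number of seconds Splash waits after the page loads,
    giving JavaScript (including JS redirects) time to run.
    """
    query = urlencode({"url": page_url, "wait": wait})
    return f"{splash_base}/render.html?{query}"


def fetch_rendered_html(page_url, wait=0.5, timeout=30):
    """Return the HTML of the page after Splash has executed its JavaScript."""
    with urlopen(build_render_url(page_url, wait), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Requires a running Splash instance at SPLASH_BASE.
    print(fetch_rendered_html("https://example.com/")[:200])
```

If a page redirects through JavaScript, increasing `wait` is usually the first thing to try, since the redirect has to happen before Splash takes its snapshot.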

scrapy-splash

This library provides Scrapy and JavaScript integration using Splash. The license is BSD 3-clause.

