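Development setup: installing the Scrapy framework

Scrapy is a fast, high-level screen-scraping and web-crawling framework written in Python. It crawls websites and extracts structured data from their pages. It has a wide range of uses, including data mining, monitoring, and automated testing.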

**1. Set up the base environment: install the latest version of Python (omitted)**

Detailed installation instructions

**2. Install Scrapy**

pip install scrapy
(or)
pip3 install scrapy

Installation output


PS C:\WINDOWS\system32> pip install scrapy
Collecting scrapy
  Using cached https://files.pythonhosted.org/packages/29/4b/585e8e111ffb01466c59281f34febb13ad1a95d7fb3919fd57c33fc732a5/Scrapy-1.7.3-py2.py3-none-any.whl
Collecting lxml; python_version != "3.4" (from scrapy)
  Using cached https://files.pythonhosted.org/packages/bc/87/c3cecadcb5d7924cd71724b177343149cfc3609a89b197a991ac8593ed8c/lxml-4.4.1-cp37-cp37m-win_amd64.whl
Collecting w3lib>=1.17.0 (from scrapy)
  Using cached https://files.pythonhosted.org/packages/6a/45/1ba17c50a0bb16bd950c9c2b92ec60d40c8ebda9f3371ae4230c437120b6/w3lib-1.21.0-py2.py3-none-any.whl
Collecting queuelib (from scrapy)
  Using cached https://files.pythonhosted.org/packages/4c/85/ae64e9145f39dd6d14f8af3fa809a270ef3729f3b90b3c0cf5aa242ab0d4/queuelib-1.5.0-py2.py3-none-any.whl
Collecting service-identity (from scrapy)
  Using cached https://files.pythonhosted.org/packages/e9/7c/2195b890023e098f9618d43ebc337d83c8b38d414326685339eb024db2f6/service_identity-18.1.0-py2.py3-none-any.whl
Collecting Twisted>=13.1.0; python_version != "3.4" (from scrapy)
  Using cached https://files.pythonhosted.org/packages/ee/d9/5b79fef4a7d7dc4d526151904eae5dd207f80433ae646a258b32abbe77d4/Twisted-19.7.0-cp37-cp37m-win_amd64.whl
  ...
    Using cached https://files.pythonhosted.org/packages/ea/cd/35485615f45f30a510576f1a56d1e0a7ad7bd8ab5ed7cdc600ef7cd06222/asn1crypto-0.24.0-py2.py3-none-any.whl
Requirement already satisfied: setuptools in c:\program files (x86)\microsoft visual studio\shared\python37_64\lib\site-packages (from zope.interface>=4.4.2->Twisted>=13.1.0; python_version != "3.4"->scrapy) (40.8.0)
Collecting idna>=2.5 (from hyperlink>=17.1.1->Twisted>=13.1.0; python_version != "3.4"->scrapy)
  Using cached https://files.pythonhosted.org/packages/14/2c/cd551d81dbe15200be1cf41cd03869a46fe7226e7450af7a6545bfc474c9/idna-2.8-py2.py3-none-any.whl
Collecting pycparser (from cffi!=1.11.3,>=1.8->cryptography->service-identity->scrapy)
  Using cached https://files.pythonhosted.org/packages/68/9e/49196946aee219aead1290e00d1e7fdeab8567783e83e1b9ab5585e6206a/pycparser-2.19.tar.gz
Installing collected packages: lxml, six, w3lib, queuelib, attrs, pyasn1, pyasn1-modules, pycparser, cffi, asn1crypto, cryptography, service-identity, Automat, zope.interface, incremental, constantly, PyHamcrest, idna, hyperlink, Twisted, PyDispatcher, cssselect, parsel, pyOpenSSL, scrapy
  Running setup.py install for pycparser ... done
  Running setup.py install for PyDispatcher ... done
Successfully installed Automat-0.7.0 PyDispatcher-2.0.5 PyHamcrest-1.9.0 Twisted-19.7.0 asn1crypto-0.24.0 attrs-19.1.0 cffi-1.12.3 constantly-15.1.0 cryptography-2.7 cssselect-1.1.0 hyperlink-19.0.0 idna-2.8 incremental-17.5.0 lxml-4.4.1 parsel-1.5.2 pyOpenSSL-19.0.0 pyasn1-0.4.6 pyasn1-modules-0.2.6 pycparser-2.19 queuelib-1.5.0 scrapy-1.7.3 service-identity-18.1.0 six-1.12.0 w3lib-1.21.0 zope.interface-4.6.0

At this point Scrapy has been installed successfully.
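To confirm the result from Python rather than from the pip log, a minimal sketch like the following can be used. It assumes Python 3.8 or newer (for `importlib.metadata`); the helper name `installed_version` is hypothetical and not part of Scrapy.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; it only reads package metadata
    for the interpreter that runs this script.
    """
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("Scrapy")
    if version is None:
        print("Scrapy is not installed for this interpreter")
    else:
        print(f"Scrapy {version} is installed")
```

If the script reports that Scrapy is missing even though pip reported success, pip and the script are probably using different Python installations.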

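A first-time installation may fail because some dependencies may be missing on your machine. If it does, install whichever of the following packages the error message points to:
pyOpenSSL: download the wheel file from the official site.
Twisted: download the wheel file from the official site.
PyWin32: download the wheel file from the official site.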
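Before downloading any wheel files, it can help to see which of these three packages the current interpreter can already import. The sketch below is one way to check. The import names (`OpenSSL`, `twisted`, `win32api`) are the modules these packages normally provide, and the function name `missing_dependencies` is hypothetical. PyWin32 exists only on Windows, so it will always be reported as missing elsewhere.

```python
from importlib.util import find_spec

# pip package name -> module name used to import it.
# The import name differs from the pip name for pyOpenSSL and PyWin32.
DEPENDENCY_MODULES = {
    "pyOpenSSL": "OpenSSL",
    "Twisted": "twisted",
    "PyWin32": "win32api",
}


def missing_dependencies(modules=DEPENDENCY_MODULES):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in modules.items()
        if find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable")
```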

Further reading
XPath syntax
Database operations with pymysql

###### Look up the JSON and CSV file formats (for example, on Baidu)
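As a quick illustration of how the two formats differ, the sketch below writes the same records as JSON and as CSV using only the Python standard library. The records are hypothetical values invented for this example, not output from a real crawl.

```python
import csv
import io
import json

# Hypothetical scraped records, invented for this illustration.
records = [
    {"title": "Example A", "price": 10},
    {"title": "Example B", "price": 20},
]

# JSON: a list of objects with named fields; value types are preserved.
json_text = json.dumps(records, ensure_ascii=False)

# CSV: a flat table with one header row, then one line per record.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["title", "price"])
writer.writeheader()
writer.writerows(records)
csv_text = buffer.getvalue()

print(json_text)
print(csv_text)
```

JSON suits nested or typed data, while CSV is convenient when every record has the same flat set of fields and the file will be opened in a spreadsheet.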

Reposted from: https://blog.csdn.net/cetd123/article/details/102485466
