
11.2. The scrapy Command

neo@MacBook-Pro ~/Documents/crawler % scrapy
Scrapy 1.4.0 - project: crawler

  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  check         Check spider contracts
  crawl         Run a spider
  edit          Edit spider
  fetch         Fetch a URL using the Scrapy downloader
  genspider     Generate new spider using pre-defined templates
  list          List available spiders
  parse         Parse URL (using its spider) and print the results
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

Use "scrapy <command> -h" to see more info about a command


11.2.1. Creating a Project

neo@MacBook-Pro ~/Documents % scrapy startproject crawler
New Scrapy project 'crawler', using template directory '/usr/local/lib/python3.6/site-packages/scrapy/templates/project', created in:

You can start your first spider with:
    cd crawler
    scrapy genspider example example.com
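The new project uses the standard Scrapy layout; the tree below is what Scrapy 1.4 generates (minor details may vary between versions):

```
crawler/
    scrapy.cfg            # deploy configuration file
    crawler/              # the project's Python module
        __init__.py
        items.py          # item definitions
        middlewares.py    # spider and downloader middlewares
        pipelines.py      # item pipelines
        settings.py       # project settings
        spiders/          # directory where spiders are placed
            __init__.py
```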

11.2.2. Creating a Spider

neo@MacBook-Pro ~/Documents/crawler % scrapy genspider netkiller netkiller.cn
Created spider 'netkiller' using template 'basic' in module:

11.2.3. Listing Available Spiders

neo@MacBook-Pro ~/Documents/crawler % scrapy list

11.2.4. Running a Spider

neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller

Export the crawl results to a JSON file:

neo@MacBook-Pro ~/Documents/crawler % scrapy crawl netkiller -o output.json
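With -o and a .json target, Scrapy's feed exporter writes all yielded items into the file as a single JSON array. The sketch below fakes such a file and then reads it back with only the standard library; the 'title' and 'link' field names are hypothetical and depend entirely on what the spider yields:

```python
import json

# A sample of what `scrapy crawl netkiller -o output.json` might produce:
# the .json feed exporter serializes every yielded item into one JSON array.
# The 'title' and 'link' fields are illustrative, not Scrapy-defined.
sample = '[{"title": "Netkiller Linux", "link": "http://netkiller.cn/linux/"}]'
with open('output.json', 'w', encoding='utf-8') as f:
    f.write(sample)

# Consuming the exported file afterwards:
with open('output.json', encoding='utf-8') as f:
    items = json.load(f)

for item in items:
    print(item['title'])
```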