[Docker] Building a custom Docker image to run Scrapy ~ 度估記事本

Wednesday, July 3, 2019

[Docker] Building a custom Docker image to run Scrapy

Posted by 劉老克 at 4:57 PM. Labels: Docker, Linux, Python, Scrapy, Ubuntu

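Environment used for this walkthrough:
Ubuntu 18.04 (server edition)
Docker 18.06
Python 3.6
pip 19.11

A few things need to be installed first; I'll go through them quickly.
1. After installing Ubuntu, set the system time.
2. Update the system and install Python 3 with pip.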



sudo apt-get update

sudo apt-get upgrade

sudo apt-get install python3-pip

pip3 install --upgrade pip

pip3 --version  # check the pip version

python3 --version  # check the Python version

3. Install Scrapy.



sudo apt-get install build-essential libssl-dev libffi-dev python3-dev   # install the build dependencies first

sudo pip3 install scrapy

(This produced the pip error "ImportError: cannot import name main"; I fixed it by editing /usr/bin/pip3 with vim.)
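Before moving on, it can help to confirm that the interpreter is recent enough and that Scrapy is actually visible to it. The following is only a minimal sketch using the standard library; the helper names is_installed and check_environment are hypothetical, and the minimum version of 3.6 is simply the one used in this post.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def check_environment(required=("scrapy",), min_python=(3, 6)):
    """Return a list of problems found; an empty list means nothing is missing."""
    problems = []
    if sys.version_info < min_python:
        found = ".".join(str(part) for part in sys.version_info[:3])
        wanted = ".".join(str(part) for part in min_python)
        problems.append("Python {} found, {} or newer expected".format(found, wanted))
    for name in required:
        if not is_installed(name):
            problems.append("module '{}' is not installed".format(name))
    return problems


if __name__ == "__main__":
    # Prints one line per problem; prints nothing when the environment is fine.
    for problem in check_environment():
        print(problem)
```

Saved under any file name and run with python3, it prints nothing when Scrapy can be found. Passing required=("scrapy", "influxdb") would also cover the second package that appears later in requirements.txt.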
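4. Create a Scrapy project first to check that everything runs correctly.



scrapy startproject books

Put the spider code into the spiders directory (/home/user/books/books/spiders), then, from the project directory (/home/user/books), run:

scrapy crawl books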

5. Next, prepare to build the image.
For this setup, building the image takes two files:
Dockerfile
requirements.txt
First, the Dockerfile:



# Scrapy runs on Python, so start from the official Python 3 image.
FROM python:3

# Maintainer (LABEL replaces the deprecated MAINTAINER instruction).
LABEL maintainer="Daimom"

# Set the working directory to /home/user/books.
WORKDIR /home/user/books

# Copy requirements.txt from the host into the working directory of the image.
COPY requirements.txt ./

# Install the packages listed in requirements.txt (Scrapy among them).
RUN pip3 install --no-cache-dir -r requirements.txt

# Copy the project source code from the host into the working directory.
COPY . .

# Run the crawler when the container launches.
#CMD scrapy crawl books
#CMD ["python3","./SchedulerSpider_2.py"]
CMD ["python3","./SchedulerSpider.py"]

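A brief explanation:
FROM selects the base image to build on.
WORKDIR sets the directory in which the following instructions and the program run.
COPY copies your files into the image's filesystem.
CMD is the command executed when the container starts. There are three ways to run the spider; I'll cover the other two when I have time.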
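The active CMD launches SchedulerSpider.py, which this post does not show. Purely as an illustration, here is a minimal sketch of what such a script might look like, assuming it just re-runs "scrapy crawl books" at a fixed interval. The function names, the interval, and the max_runs parameter are all hypothetical and are not taken from the original script.

```python
import subprocess
import time


def build_crawl_command(spider_name):
    """Return the same command line you would type by hand in the project directory."""
    return ["scrapy", "crawl", spider_name]


def run_repeatedly(spider_name, interval_seconds, max_runs=None,
                   runner=subprocess.run, sleep=time.sleep):
    """Run the spider, wait, and repeat; stop after max_runs runs (None = never stop).

    runner and sleep are parameters only so they can be swapped out when testing.
    Returns the number of runs that were started.
    """
    runs = 0
    while max_runs is None or runs < max_runs:
        # check=False: a failed crawl should not stop the scheduling loop.
        runner(build_crawl_command(spider_name), check=False)
        runs += 1
        if max_runs is None or runs < max_runs:
            sleep(interval_seconds)
    return runs


# Hypothetical entry point for the container, e.g. one crawl per hour:
# run_repeatedly("books", interval_seconds=3600)
```

Because the command is run as a subprocess, the script must be started from the project directory, which is what WORKDIR takes care of inside the container. Scrapy also provides scrapy.crawler.CrawlerProcess for running a spider inside the Python process itself; whether the original script uses it is not stated here.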
Next, the contents of requirements.txt:



scrapy

influxdb

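Once both files are written, put them in the project directory.
In my case that is the project root, /home/user/books,
so that COPY . . also pulls the project files into the image.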
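Since a misplaced file only shows up as a failed build, a quick pre-flight check can save a round trip. This is a minimal sketch, assuming the two file names used in this post; missing_build_files is a hypothetical helper, not part of Docker or Scrapy.

```python
from pathlib import Path

# The two files this post's setup expects in the build directory.
REQUIRED_FILES = ("Dockerfile", "requirements.txt")


def missing_build_files(context_dir, required=REQUIRED_FILES):
    """Return the names of required files that are absent from context_dir."""
    root = Path(context_dir)
    return [name for name in required if not (root / name).is_file()]


if __name__ == "__main__":
    # An empty list means both files are in place.
    print(missing_build_files("/home/user/books"))
```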
6. Build the image.
(The image name must be lowercase.)



sudo docker build -t bookscrawl .

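Don't forget the trailing "." at the end. It is the build context: the directory (here, the current one) whose contents are sent to Docker and where the Dockerfile is looked up by default.

Once the build finishes, run the image directly: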



sudo docker run bookscrawl

