Although this article was written on a Mac, the procedure is the same on Windows. Please adapt the environment-dependent parts as you go.
After reading this article to the end, you will be able to:
| No. | Overview | keyword |
|---|---|---|
| 1 | coding | Python |
| 2 | WEB scraping | Selenium,chromedriver,headless-chromium |
| 3 | Lambda settings | Lambda |

The environment used in this article is as follows:

| environment | Ver. |
|---|---|
| macOS Catalina | 10.15.3 |
| Python | 3.7.3 |
| selenium | 3.141.0 |
You will get more out of this article if you follow the implementation steps and the source code as you read. Please give it a try.
Note that AWS is billed on a pay-as-you-go basis.
app/lambda_function.py
"""app/lambda_function.py
"""
import os
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By


def lambda_handler(event, context):
    """lambda_handler
    """
    print('event: {}'.format(event))
    print('context: {}'.format(context))

    # Paths to the two binaries come from the Lambda environment variables.
    headless_chromium = os.getenv('HEADLESS_CHROMIUM', '')
    chromedriver = os.getenv('CHROMEDRIVER', '')

    options = Options()
    options.binary_location = headless_chromium
    options.add_argument('--headless')
    options.add_argument('--no-sandbox')
    options.add_argument('--single-process')
    options.add_argument('--disable-dev-shm-usage')

    driver = webdriver.Chrome(executable_path=chromedriver, options=options)
    driver.get('https://info.finance.yahoo.co.jp/fx/')
    # Read the USD/JPY bid price shown on the page.
    usd_jpy = driver.find_element(By.ID, 'USDJPY_top_bid').text

    driver.close()
    driver.quit()

    return {
        'status_code': 200,
        'usd_jpy': usd_jpy
    }


if __name__ == '__main__':
    print(lambda_handler(event=None, context=None))
To run this on Lambda, you need to configure `Options()` as in the code above.
- Change the version and path of chromedriver and headless-chromium to suit your environment.
- Operation confirmed as of May 2020.
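
The handler above calls `driver.close()` and `driver.quit()` only when every step before them succeeds. The following is a minimal sketch of one way to release the browser even when the page lookup raises. `fetch_element_text` and its `driver_factory` parameter are hypothetical names introduced here for illustration and are not part of the original code; the locator string `'id'` is the value behind selenium's `By.ID` constant.

```python
def fetch_element_text(driver_factory, url, element_id):
    """Open url and return the text of the element with the given id.

    driver_factory is a hypothetical zero-argument callable that returns a
    started WebDriver (for example a lambda wrapping webdriver.Chrome(...)).
    """
    driver = driver_factory()
    try:
        driver.get(url)
        # 'id' is the locator strategy string that By.ID stands for.
        return driver.find_element('id', element_id).text
    finally:
        # quit() ends the session and the browser process, even on errors.
        driver.quit()
```

Inside `lambda_handler`, this could be called with a factory that builds the Chrome driver from the same `Options()` shown above, so the browser setup stays unchanged.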
make_upload.sh
rm -f upload.zip
rm -rf upload/
rm -rf download/
mkdir -p download/bin
curl -L https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip -o download/chromedriver.zip
curl -L https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-55/stable-headless-chromium-amazonlinux-2017-03.zip -o download/headless-chromium.zip
unzip download/chromedriver.zip -d download/bin
unzip download/headless-chromium.zip -d download/bin
mkdir upload
cp -r download/bin upload/bin
cp app/lambda_function.py upload/
pip install -r app/requirements.txt -t upload/
cd upload/
zip -r ../upload.zip --exclude=__pycache__/* .
cd ../
rm -r upload/
rm -r download/
command_line.sh
sh make_upload.sh
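
Before uploading, it can help to confirm that the archive has the layout the handler expects. The sketch below assumes that `make_upload.sh` produced `upload.zip` with `lambda_function.py` at the top level and both binaries under `bin/`; the function name `missing_entries` and the `REQUIRED_ENTRIES` tuple are hypothetical and only illustrate the check.

```python
import zipfile

# Entries that the handler and the environment variables in this article rely on.
REQUIRED_ENTRIES = (
    'lambda_function.py',
    'bin/chromedriver',
    'bin/headless-chromium',
)


def missing_entries(zip_path, required=REQUIRED_ENTRIES):
    """Return the required entries that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = set(archive.namelist())
    return [name for name in required if name not in names]
```

Calling `missing_entries('upload.zip')` after the build returns an empty list when all three entries are present.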
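Prerequisites:
- An AWS account has been created
- A role for the Lambda function has been created

Create the Lambda function:
- Select `Functions` from the submenu and click `Create function`.
- Select `Author from scratch`, enter the function name, runtime, and execution role, and click `Create function`.

Upload `upload.zip`:
I uploaded it via S3 because it exceeded 10 MB.
- Via S3: in the Function code section of your Lambda function, choose the upload from S3 and click `Save`.
- Alternatively, upload the .zip file in the Function code section of your Lambda function and click `Save`.

To keep the binaries in a Layer instead, the steps are:
- Remove `bin` from `upload.zip` and create a zip with `bin` alone
- Register `bin` in Layers
- Add the Layer to your Lambda function
- Use `Upload a .zip file` to upload `upload.zip`

Remove `bin` from `upload.zip` and create a zip with `bin` alone

bin.sh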
bin.zip
├── chromedriver
└── headless-chromium
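
The split can also be scripted. The following is a rough sketch, not the procedure used in the original article: `split_package` is a hypothetical helper that copies every entry under `bin/` into a layer archive and everything else into a function archive. It assumes the `bin/` prefix is kept inside the layer archive, so that the files end up at the `/opt/bin/...` paths used in this article.

```python
import zipfile


def split_package(src_zip, function_zip, layer_zip, layer_prefix='bin/'):
    """Copy entries under layer_prefix into layer_zip, the rest into function_zip."""
    with zipfile.ZipFile(src_zip) as src, \
            zipfile.ZipFile(function_zip, 'w', zipfile.ZIP_DEFLATED) as func, \
            zipfile.ZipFile(layer_zip, 'w', zipfile.ZIP_DEFLATED) as layer:
        for info in src.infolist():
            target = layer if info.filename.startswith(layer_prefix) else func
            # Passing the ZipInfo keeps the entry name and its permission bits,
            # so the binaries stay executable after extraction.
            target.writestr(info, src.read(info))
```

For example, `split_package('upload.zip', 'function.zip', 'bin.zip')` writes the two archives next to the original; the output file names here are only placeholders.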
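
Register `bin` in Layers
- Select `Layers` in the submenu and click `Create layer`.
- Enter a name.
- Click `Upload` and select `bin.zip`.
- Select a runtime under `Compatible runtimes - optional` and click `Create`.

Add the Layer to your Lambda function
- Click `Layers`, shown in the center of the Designer section of your Lambda function.
- Click `Add a layer` in the Layers panel displayed at the bottom.
- Select the name under which `bin.zip` was registered in Layers and click `Add`.

Finally, use `Upload a .zip file` to upload `upload.zip`.

Set the following in the Environment variables section of your Lambda function. When `bin` is bundled inside `upload.zip`, the deployment package is extracted under `/var/task`:

| Key | value |
|---|---|
| CHROMEDRIVER | /var/task/bin/chromedriver |
| HEADLESS_CHROMIUM | /var/task/bin/headless-chromium |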
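
When `bin` is provided by a Layer, set the following in the Environment variables section of your Lambda function instead. Layers are extracted under `/opt`:

| Key | value |
|---|---|
| CHROMEDRIVER | /opt/bin/chromedriver |
| HEADLESS_CHROMIUM | /opt/bin/headless-chromium |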
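
The two tables differ only in the directory prefix, so the handler could also fall back to a lookup when a variable is not set. The sketch below is an assumption-laden illustration: `resolve_binary`, `CANDIDATE_DIRS`, and the injectable `exists` parameter are hypothetical names, and the original handler simply reads the environment variables.

```python
import os
import posixpath

# Lookup order: a Layer is extracted under /opt, the deployment package under /var/task.
CANDIDATE_DIRS = ('/opt/bin', '/var/task/bin')


def resolve_binary(env_key, file_name, environ=None,
                   dirs=CANDIDATE_DIRS, exists=os.path.isfile):
    """Return the configured path, else the first candidate path that exists."""
    environ = os.environ if environ is None else environ
    configured = environ.get(env_key, '')
    if configured:
        return configured
    for directory in dirs:
        # Lambda runs on Linux, so the paths are joined with forward slashes.
        candidate = posixpath.join(directory, file_name)
        if exists(candidate):
            return candidate
    raise FileNotFoundError(
        '{} is not set and {} was not found in {}'.format(env_key, file_name, dirs))
```

With this, `resolve_binary('CHROMEDRIVER', 'chromedriver')` returns the environment variable when it is set and otherwise checks the Layer location before the bundled one.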

Add a trigger:
- In the Designer section of your Lambda function, click `Add trigger`.
- Select `CloudWatch Events / EventBridge`.
- Enter a rule name.
- Enter a schedule expression such as `cron(0 17 ? * MON-FRI *)` and click `Add`.

Examples of schedule expressions:

| frequency | formula |
|---|---|
| Every day at 10:15 am (UTC) | cron(15 10 * * ? *) |
| Monday to Friday at 6:00 pm | cron(0 18 ? * MON-FRI *) |
| First day of every month at 8:00 am | cron(0 8 1 * ? *) |
| Every 10 minutes on weekdays | cron(0/10 * ? * MON-FRI *) |
| Every 5 minutes from 8:00 am to 5:55 pm, Monday to Friday | cron(0/5 8-17 ? * MON-FRI *) |
| First Monday of every month at 9:00 am | cron(0 9 ? * 2#1 *) |
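
The expressions in the table above all have six space-separated fields (minutes, hours, day of month, month, day of week, year), and a value or `*` in one of the two day fields requires `?` in the other. As a rough sketch, a small checker can catch typos such as a missing space before pasting an expression into the console. `parse_schedule_expression` is a hypothetical helper; it checks only the shape of the expression, not the allowed values of each field.

```python
import re

_CRON_RE = re.compile(r'^cron\((.+)\)$')
FIELD_NAMES = ('minutes', 'hours', 'day_of_month', 'month', 'day_of_week', 'year')


def parse_schedule_expression(expression):
    """Split a cron(...) schedule expression into its six named fields."""
    match = _CRON_RE.match(expression.strip())
    if match is None:
        raise ValueError('expected cron(...): {!r}'.format(expression))
    fields = match.group(1).split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(
            'expected 6 fields, got {}: {!r}'.format(len(fields), expression))
    parsed = dict(zip(FIELD_NAMES, fields))
    # A value or * in one day field requires ? in the other day field.
    if '?' not in (parsed['day_of_month'], parsed['day_of_week']):
        raise ValueError("day-of-month or day-of-week must be '?'")
    return parsed
```

For instance, `parse_schedule_expression('cron(0 9 ? * 2#1 *)')` returns a dictionary whose `day_of_week` entry is `'2#1'`, while an expression with only five fields raises `ValueError`.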