Installing Airflow 2.0 on AWS EC2 (Ubuntu)
How to install Airflow 2.0 on Ubuntu
Reference: https://github.com/keeyong/data-engineering-batch5/blob/main/docs/Airflow%202%20Installation.md
1. Install Python
sudo apt-get update
sudo apt-get install -y python3-pip
python3 --version
Python 3.8.10
2. Install Airflow and other modules
sudo apt-get install -y postgresql-server-dev-all
sudo apt-get install -y postgresql-common
sudo pip3 install apache-airflow
sudo pip3 install apache-airflow-providers-postgres[amazon]==2.0.0
sudo pip3 install cryptography psycopg2-binary boto3 botocore
sudo pip3 install SQLAlchemy==1.3.23
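The commands above install whatever apache-airflow release is current. As an optional alternative, the official Airflow installation docs recommend pinning the release and passing a matching constraints file. The sketch below only prints such a command for review. The version 2.1.3 is an assumption (the post does not say which 2.x release was installed), and constraint_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Assumption: the exact Airflow version is not stated in the post; 2.1.3 is only an example.
AIRFLOW_VERSION="${AIRFLOW_VERSION:-2.1.3}"
# Derive "major.minor" (for example 3.8) from the interpreter that pip3 installs into.
PYTHON_VERSION="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"

# Hypothetical helper: build the constraints-file URL for a given Airflow and Python version.
constraint_url() {
  echo "https://raw.githubusercontent.com/apache/airflow/constraints-$1/constraints-$2.txt"
}

# Print the command instead of running it, so it can be reviewed first.
echo sudo pip3 install "apache-airflow==${AIRFLOW_VERSION}" \
  --constraint "$(constraint_url "$AIRFLOW_VERSION" "$PYTHON_VERSION")"
```

A constraints file may pin a different SQLAlchemy version than the 1.3.23 used above, so treat this as a replacement for the manual pins rather than something to combine with them.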
3. Create the airflow account
Rather than working as Ubuntu's root account, create a dedicated airflow user and do the rest of the work as that user.
sudo groupadd airflow
sudo useradd -s /bin/bash airflow -g airflow -d /var/lib/airflow -m
Home directory of the airflow user: /var/lib/airflow/
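To confirm the account was created as intended, a small check like the following can help. It is a sketch that reads /etc/passwd, where useradd records local accounts. home_and_shell is a hypothetical helper name, not part of any tool.

```shell
#!/usr/bin/env bash
# Look up a user's home directory and login shell in a passwd-format file.
# $1 = user name, $2 = passwd file (defaults to /etc/passwd).
home_and_shell() {
  awk -F: -v user="$1" '$1 == user { print $6, $7 }' "${2:-/etc/passwd}"
}

result="$(home_and_shell airflow)"
if [ "$result" = "/var/lib/airflow /bin/bash" ]; then
  echo "airflow account looks as expected: $result"
else
  echo "airflow account missing or different: '${result}'"
fi
```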
4. Install PostgreSQL
sudo apt-get install -y postgresql postgresql-contrib
Log in as the postgres user, then create the PostgreSQL USER and DATABASE for Airflow.
# Log in as the postgres user
ubuntu@ip-172-31-50-243:~$ sudo su postgres
# Create the user and database
postgres@ip-172-31-50-243:/home/ubuntu$ psql
psql (10.18 (Ubuntu 10.18-0ubuntu0.18.04.1))
Type "help" for help.
postgres=# CREATE USER airflow PASSWORD 'airflow';
CREATE ROLE
postgres=# CREATE DATABASE airflow;
CREATE DATABASE
postgres=# \q
postgres@ip-172-31-50-243:/home/ubuntu$ exit
exit
# Restart PostgreSQL
ubuntu@ip-172-31-50-243:~$ sudo service postgresql restart
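The same two statements can also be run without an interactive psql session. The sketch below prints the SQL so it can be inspected, and the comment shows how it could be piped into psql. airflow_db_sql is a hypothetical helper name, and the role name and password are the ones used in this post.

```shell
#!/usr/bin/env bash
# Print the SQL that creates the role and the database.
# $1 = role and database name, $2 = password
airflow_db_sql() {
  cat <<SQL
CREATE USER $1 PASSWORD '$2';
CREATE DATABASE $1;
SQL
}

# Dry run: only print the statements.
# To apply them, pipe the output into psql as the postgres user:
#   airflow_db_sql airflow airflow | sudo -u postgres psql
airflow_db_sql airflow airflow
```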
5. Initialize Airflow
# Switch to the airflow user
ubuntu@ip-172-31-50-243:~$ sudo su airflow
airflow@ip-172-31-50-243:/home/ubuntu$ cd /var/lib/airflow
# Create the dags folder
airflow@ip-172-31-50-243:~$ pwd
/var/lib/airflow
airflow@ip-172-31-50-243:~$ mkdir dags
airflow@ip-172-31-50-243:~$ ls
dags
# Initialize Airflow
airflow@ip-172-31-50-243:~$ AIRFLOW_HOME=/var/lib/airflow airflow db init
airflow@ip-172-31-50-243:~$ ls
airflow.cfg airflow.db dags logs webserver_config.py
5. Edit the Airflow config (airflow.cfg file)
# executor = LocalExecutor
# sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow
The user ID, password, and database name are all airflow, and the host name is localhost.
# Set "load_examples" to False
# Re-initialize Airflow, this time against the PostgreSQL database
airflow@ip-172-31-50-243:~$ AIRFLOW_HOME=/var/lib/airflow airflow db init
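The three settings can also be changed with sed instead of an editor. This is a minimal sketch, assuming the generated airflow.cfg contains uncommented "executor = ...", "sql_alchemy_conn = ..." and "load_examples = ..." lines, as the default file did in Airflow 2.0. apply_airflow_settings is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Rewrite the three settings in place, keeping the original as <file>.bak.
# '|' is used as the sed delimiter because the connection string contains '/'.
apply_airflow_settings() {
  sed -i.bak \
    -e 's|^executor = .*|executor = LocalExecutor|' \
    -e 's|^sql_alchemy_conn = .*|sql_alchemy_conn = postgresql+psycopg2://airflow:airflow@localhost:5432/airflow|' \
    -e 's|^load_examples = .*|load_examples = False|' \
    "$1"
}

CFG="${AIRFLOW_HOME:-/var/lib/airflow}/airflow.cfg"
if [ -w "$CFG" ]; then
  apply_airflow_settings "$CFG"
  # Show the resulting lines for a quick visual check.
  grep -E '^(executor|sql_alchemy_conn|load_examples) = ' "$CFG"
else
  echo "skip: $CFG not found or not writable (run this as the airflow user)"
fi
```

Run it as the airflow user, then run airflow db init again as shown above.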
6. Register the Airflow webserver and scheduler as services
# Switch back to the ubuntu account
airflow@ip-172-31-50-243:~$ exit
exit
ubuntu@ip-172-31-50-243:~$
# Register the Airflow webserver as a background service
ubuntu@ip-172-31-50-243:~$ sudo vi /etc/systemd/system/airflow-webserver.service
[Unit]
Description=Airflow webserver
After=network.target
[Service]
Environment=AIRFLOW_HOME=/var/lib/airflow
User=airflow
Group=airflow
Type=simple
ExecStart=/usr/local/bin/airflow webserver -p 8080
Restart=on-failure
RestartSec=10s
[Install]
WantedBy=multi-user.target
# Register the Airflow scheduler as a background service
ubuntu@ip-172-31-50-243:~$ sudo vi /etc/systemd/system/airflow-scheduler.service
[Unit]
Description=Airflow scheduler
After=network.target
[Service]
Environment=AIRFLOW_HOME=/var/lib/airflow
User=airflow
Group=airflow
Type=simple
ExecStart=/usr/local/bin/airflow scheduler
Restart=on-failure
RestartSec=10s
[Install]
WantedBy=multi-user.target
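The two unit files differ only in their description and ExecStart line, so they can be generated from one template instead of being typed into vi twice. The sketch below prints a unit file to standard output. airflow_unit is a hypothetical helper name, and the paths are the ones used in this post.

```shell
#!/usr/bin/env bash
# Print a systemd unit for one Airflow component.
# $1 = component name ("webserver" or "scheduler"), $2 = full ExecStart command
airflow_unit() {
  cat <<EOF
[Unit]
Description=Airflow $1
After=network.target

[Service]
Environment=AIRFLOW_HOME=/var/lib/airflow
User=airflow
Group=airflow
Type=simple
ExecStart=$2
Restart=on-failure
RestartSec=10s

[Install]
WantedBy=multi-user.target
EOF
}

# Dry run: print the webserver unit. To install a unit, pipe it through sudo tee:
#   airflow_unit scheduler "/usr/local/bin/airflow scheduler" \
#     | sudo tee /etc/systemd/system/airflow-scheduler.service
airflow_unit webserver "/usr/local/bin/airflow webserver -p 8080"
```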
# Enable the services
ubuntu@ip-172-31-50-243:~$ sudo systemctl daemon-reload
ubuntu@ip-172-31-50-243:~$ sudo systemctl enable airflow-webserver
Created symlink /etc/systemd/system/multi-user.target.wants/airflow-webserver.service → /etc/systemd/system/airflow-webserver.service.
ubuntu@ip-172-31-50-243:~$ sudo systemctl enable airflow-scheduler
Created symlink /etc/systemd/system/multi-user.target.wants/airflow-scheduler.service → /etc/systemd/system/airflow-scheduler.service.
# Start the services
ubuntu@ip-172-31-50-243:~$ sudo systemctl start airflow-webserver
ubuntu@ip-172-31-50-243:~$ sudo systemctl start airflow-scheduler
# Check the service status
ubuntu@ip-172-31-50-243:~$ sudo systemctl status airflow-webserver
ubuntu@ip-172-31-50-243:~$ sudo systemctl status airflow-scheduler
7. Create a login account for the Airflow webserver
ubuntu@ip-172-31-50-243:~$ AIRFLOW_HOME=/var/lib/airflow airflow users create --role Admin --username admin --email admin --firstname admin --lastname admin --password admin
[2021-09-03 13:36:03,043] {filesystemcache.py:224} ERROR - set key '\x1b[1m__wz_cache_count\x1b[22m' -> [Errno 1] Operation not permitted: '/tmp/tmpplecjlbc.__wz_cache' -> '/tmp/2029240f6d1128be89ddc32729463129'
[2021-09-03 13:36:03,079] {manager.py:788} WARNING - No user yet created, use flask fab command to do it.
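The "Operation not permitted" line in the log refers to a cache file under /tmp. One possible cause, stated here as an assumption rather than a confirmed diagnosis, is that the command ran as ubuntu while the webserver that owns the cache file runs as airflow. The sketch below runs the same CLI command as the airflow user and then lists the users to confirm the account exists. run and as_airflow are hypothetical helper names, and DRY_RUN defaults to 1 so that nothing is executed until it is set to 0.

```shell
#!/usr/bin/env bash
DRY_RUN="${DRY_RUN:-1}"

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# Run an airflow CLI subcommand as the airflow user with AIRFLOW_HOME set.
as_airflow() {
  run sudo -u airflow env AIRFLOW_HOME=/var/lib/airflow airflow "$@"
}

as_airflow users create --role Admin --username admin --email admin \
  --firstname admin --lastname admin --password admin
as_airflow users list
```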
8. Access Airflow
Because this setup runs on an EC2 Ubuntu instance, open [EC2 hostname]:8080 in a browser to check that the webserver is up.
from http://pearlluck.tistory.com/678 by ccl(A) rewrite - 2021-09-03 23:26:32