To download a file from Amazon S3 to your local system using Apache Airflow, the most flexible and widely used approach is to use the `S3Hook` within a `PythonOperator`.
Key components:

- `S3Hook`: simplifies S3 interactions without managing raw `boto3` connections.
- `local_path`: the file is downloaded to the specific worker node that executes the task, not to the scheduler or your laptop.
- `S3KeySensor`: for automated workflows, pair the download task with an `S3KeySensor` that waits for the key to appear in the bucket.
```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

# 1. Define the Python function for the S3 download
def download_from_s3(key: str, bucket_name: str, local_path: str):
    hook = S3Hook('aws_default')
    # Downloads the file to the specified local path
    return hook.download_file(
        key=key,
        bucket_name=bucket_name,
        local_path=local_path,
        preserve_file_name=True
    )

# 2. Define the DAG
with DAG(
    dag_id='s3_to_local_download',
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False
) as dag:
    task_download = PythonOperator(
        task_id='download_file_task',
        python_callable=download_from_s3,
        op_kwargs={
            'key': 'data/input.csv',
            'bucket_name': 'my-bucket',
            'local_path': '/tmp/'  # Downloads to the worker's tmp folder
        }
    )
```
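The `aws_default` connection referenced by `S3Hook` must exist before the DAG runs. One common way to define it, sketched here with placeholder credentials (the key, secret, and region are assumptions, not real values), is Airflow's `AIRFLOW_CONN_<CONN_ID>` environment-variable convention:

```shell
# Sketch: define the aws_default connection as a connection URI via env var.
# AKIAEXAMPLE / supersecret are placeholders, not real credentials.
export AIRFLOW_CONN_AWS_DEFAULT='aws://AKIAEXAMPLE:supersecret@/?region_name=us-east-1'
```

Connections can equally be created in the Airflow UI or with `airflow connections add`; the env-var form is convenient for containerized deployments.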