When you need to pull data from a public or private URL, you have two primary options:
For major cloud providers, using specific transfer operators is the most efficient and scalable method.
Amazon S3: Use the S3Hook, typically from inside a PythonOperator task, to trigger a download. The S3Hook.download_file() method takes the bucket_name, the key (file path in S3), and a local_path on the worker where the file will be saved.
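The S3 download described above can be sketched as a small Python callable. This is a minimal sketch, not a definitive implementation: the function name download_from_s3, the optional hook parameter, and the use of the aws_default connection ID are assumptions made for illustration, and the Amazon provider package is assumed to be installed on the worker.

```python
import os


def download_from_s3(key, bucket_name, local_path, aws_conn_id="aws_default", hook=None):
    """Download one S3 object into a directory on the worker and return its path.

    `hook` is a hypothetical injection point: pass a ready-made hook, or leave
    it as None to build an S3Hook from the given Airflow connection ID.
    """
    if hook is None:
        # Imported lazily so this module can be loaded without the provider installed.
        from airflow.providers.amazon.aws.hooks.s3 import S3Hook

        hook = S3Hook(aws_conn_id=aws_conn_id)

    # local_path is treated as a directory on the worker; create it if missing.
    os.makedirs(local_path, exist_ok=True)

    # download_file() returns the path of the file it wrote on the worker.
    return hook.download_file(key=key, bucket_name=bucket_name, local_path=local_path)
```

In a DAG, a function like this would typically be passed as the python_callable of a PythonOperator, with key, bucket_name, and local_path supplied through op_kwargs. Downstream tasks should rely on the returned path rather than assume a file name, since the hook may write the object under a generated name.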