Copying Data From GCP Server (Linux) To External AWS S3 Bucket

Introduction

 
In one of my projects, I had a requirement to transfer GBs of data from one cloud platform to another (i.e., from Google Cloud Platform to an Amazon Web Services S3 bucket). I was working on a Linux virtual machine, the data was in one of its directories, and I had to transfer it into an external AWS S3 bucket.
 

Options to implement the above scenario

 
Gsutil
 
You will find several articles on the internet about transferring data with gsutil, but the problem with this approach is that you are also required to create a GCP bucket, because gsutil transfers data from a GCP bucket to an AWS bucket. I found a very useful article for understanding this topic: How to transfer files from Google Cloud Storage (GCS) into Amazon S3 bucket without downloading the files - Pro Analyst.
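For reference only, a rough sketch of what that approach looks like (it assumes the data has already been staged in a GCP bucket and that your AWS keys have been added under the [Credentials] section of ~/.boto; the bucket names below are placeholders):

gsutil -m rsync -r gs://[yourgcpbucketname] s3://[yourawsbucketname]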
 
We don’t want the extra baggage of creating a GCP bucket, so I decided not to pursue this approach and looked for other options, which are described below.
 
AWS CLI
 
After evaluating my first approach and finding that it did not fit my requirement, I discovered that the AWS CLI can also be used to transfer data. Hence, I will not talk further about gsutil; our prime focus is the AWS CLI. I have read several articles that show a very simple approach of using commands to transfer data with the Access Key ID, Secret Access Key, and region of the AWS S3 bucket. Refer to the article: cp — AWS CLI 1.19.5 Command Reference (amazon.com).
 

Problem

 
In spite of working with the AWS CLI commands, I was not able to access the external AWS S3 bucket. There are several UI tools available for connecting to cloud environments, such as CloudBerry Explorer and Cyberduck. I tried them as well: I could not see the AWS S3 bucket with CloudBerry Explorer, and although I could access the bucket through Cyberduck, I could not use that tool to transfer the data because the data is not on my local system. Now, after explaining all my hurdles, I will move on to the solution to this problem.
 
Prerequisites
 
To implement this functionality, below are the prerequisites:
  1. Install the AWS CLI on the machine from which you want to transfer data. Below are the links for installing the AWS CLI on Windows and Linux machines (a sample Linux installation is sketched after this list).

    For Windows - Installing, updating, and uninstalling the AWS CLI version 2 on Windows - AWS Command Line Interface (amazon.com)
    For Linux - Installing, updating, and uninstalling the AWS CLI version 2 on Linux - AWS Command Line Interface (amazon.com)

  2. Access Key ID, Secret Access Key, and region name of the AWS S3 bucket.
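For a Linux machine, the AWS CLI version 2 installation from the documentation linked above generally boils down to the commands below (a sketch for an x86_64 machine; check the documentation for your architecture):

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
/usr/local/bin/aws --version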

Commands

 
I had a Linux machine, so I ran the commands below with the help of PuTTY. To transfer data, three steps are required, as shown below:
 
Configure AWS
 
[root@gcp6 testfolder]# sudo /usr/local/bin/aws configure
AWS Access Key ID [****************PYGA]: [AccessKeyID]
AWS Secret Access Key [****************Adly]: [SecretAccessKey]
Default region name [regionname]: [Your Region Name]
Default output format [json]: json
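To double-check what was saved before moving on, you can list the active configuration (a sketch; since the configure command above was run with sudo, the credentials are stored under root's ~/.aws/ directory):

[root@gcp6 testfolder]# sudo /usr/local/bin/aws configure list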
 
Test AWS S3 bucket connection
 
[sgupt@gcp6 ~]$ sudo /usr/local/bin/aws s3 ls [yourawsbucketname] --region [awsregionname]
 
Output
 
DummyFolder/
TestFolder/
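The same ls command also accepts an s3:// path, which is handy if you want to check the contents of a particular folder rather than the bucket root (the folder name below is a placeholder):

[sgupt@gcp6 ~]$ sudo /usr/local/bin/aws s3 ls s3://[yourawsbucketname]/[targetdirectoryname]/ --region [awsregionname]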
 
Upload using cp command for specific files
 
[root@gcp03 testfolder]# sudo /usr/local/bin/aws s3 cp /home/sourcedirectorypath/test.txt s3://[yourawsbucketname]/[targetdirectoryname]/
 
Output
 
upload: ./test.txt to s3://[yourawsbucketname]/[targetdirectoryname]/test.txt
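If you need to copy several files matching a pattern rather than naming each one, cp also supports --recursive together with --exclude/--include filters; a sketch with placeholder paths and pattern:

[root@gcp03 testfolder]# sudo /usr/local/bin/aws s3 cp /home/sourcedirectorypath/ s3://[yourawsbucketname]/[targetdirectoryname]/ --recursive --exclude "*" --include "*.txt"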
 
Upload using sync command for a complete directory
 
[root@gcp03 testfolder]# sudo /usr/local/bin/aws s3 sync /home/sourcedirectorypath/testfolder s3://[yourawsbucketname]/[targetdirectoryname]/
 
Output
 
upload: ./test.txt to s3://[yourawsbucketname]/[targetdirectoryname]/test.txt
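Because sync uploads everything that differs between the source directory and the bucket, it can be worth previewing the transfer first; adding --dryrun prints what would be uploaded without actually sending anything:

[root@gcp03 testfolder]# sudo /usr/local/bin/aws s3 sync /home/sourcedirectorypath/testfolder s3://[yourawsbucketname]/[targetdirectoryname]/ --dryrun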