Performing Bulk Transfers from On-Prem or Other Clouds- completed Flashcards

1
Q

What is Storage Transfer Service (STS) in GCP?

A

Storage Transfer Service (STS) is a fully managed solution in GCP that enables efficient, secure movement of large datasets from on-premises systems, other cloud providers like AWS or Azure, or even other GCS buckets into Google Cloud Storage.

2
Q

What are the key features that STS provides for data migration?

A

STS offers high-performance bulk transfers, incremental syncs after initial migration, metadata preservation, encryption during transit, integrity validation, event-driven transfers, and built-in monitoring through Cloud Logging and Cloud Monitoring.

3
Q

Explain how STS ensures data integrity and security during transfers.

A

STS encrypts all data in transit and automatically validates integrity with file checksums after each transfer. For on-prem transfers, agents communicate with Google Cloud over HTTPS (port 443).

4
Q

What is incremental transfer in STS? How does it help in large migrations?

A

Incremental transfer means that after the first bulk migration, subsequent runs move only newly added or modified files. This reduces bandwidth usage, speeds up syncs, and keeps the destination close to the current state of the source.

5
Q

How would you set up an on-premises to GCS transfer using STS?

A

First, I would enable the Storage Transfer API. Then, I’d install Docker and configure STS agents on a Linux machine with access to the local file system. I’d make sure the agent can connect to GCP over HTTPS, configure the source and destination in the STS job settings, and finally create a transfer job via Console or gcloud.
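
As a rough sketch of the first steps above, assuming the gcloud CLI is installed and authenticated and Docker is present on the agent host. The project ID, pool name, and directory are placeholders, and the `GCLOUD` variable is a convenience added here so the script only prints the commands unless it is set to `gcloud`.

```bash
#!/bin/sh
# Dry run by default: prints each command. Set GCLOUD=gcloud to execute them.
GCLOUD="${GCLOUD:-echo gcloud}"

# Placeholder values (assumptions); replace with your own.
PROJECT_ID="my-project-id"
POOL="my-agent-pool"
SOURCE_DIR="/local/folder/path"

prepare_onprem_agents() {
  # 1. Enable the Storage Transfer API in the project.
  $GCLOUD services enable storagetransfer.googleapis.com --project="$PROJECT_ID"
  # 2. Create an agent pool to group the on-prem agents.
  $GCLOUD transfer agent-pools create "$POOL" --project="$PROJECT_ID"
  # 3. Start agents (Docker containers) that can read the source directory.
  $GCLOUD transfer agents install --pool="$POOL" --count=3 \
    --mount-directories="$SOURCE_DIR"
}

prepare_onprem_agents
```

Once the agents report in to the pool, the transfer job itself can be created against that pool.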

6
Q

Can you create a sample gcloud command to trigger a transfer job?

A

Sure.

```bash
# Positional arguments: source (posix:// path on the agent host), then destination.
gcloud transfer jobs create \
  posix:///local/folder/path \
  gs://my-destination-bucket \
  --source-agent-pool=MY_AGENT_POOL \
  --project=my-project-id \
  --description="On-Prem-to-GCS-Transfer"
```

7
Q

What ports must be opened on the firewall for the STS agent to work?

A

Port 443 (HTTPS) must be open to allow the STS agent to securely communicate with Google Cloud.

8
Q

If a file gets updated after it has been transferred, how does STS handle it?

A

During scheduled incremental syncs, STS compares source and destination metadata (such as size and last-modified time) to identify updated files, and re-transfers only those files to GCS.
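
To make the idea concrete, here is a local illustration of metadata-based change detection. It is not how STS works internally; it only mimics the principle by listing files whose modification time is newer than a marker file left by the previous sync. The function name and the marker-file convention are assumptions for this sketch.

```bash
#!/bin/sh
# Illustration only: list files modified since the last sync, judged by mtime.
changed_since() {
  # $1 = source directory
  # $2 = marker file whose modification time records the previous sync
  find "$1" -type f -newer "$2" | sort
}

# Example usage (hypothetical paths):
#   changed_since /local/folder/path /var/tmp/last-sync.marker
#   touch /var/tmp/last-sync.marker   # record the new sync point afterwards
```

Only the files printed by `changed_since` would need to be copied again, which is the saving an incremental run provides over a full re-transfer.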

9
Q

What happens if the transfer is interrupted mid-way? Does STS handle retries?

A

Yes, STS has built-in retry mechanisms. It resumes the transfer intelligently without re-uploading already completed files.

10
Q

You have frequent data updates on your on-premises system. How would you design the transfer to GCS for minimal lag?

A

I would configure a scheduled transfer job to run at frequent intervals, as often as hourly (the shortest repeat interval an STS schedule allows), depending on the business need, leveraging STS’s incremental transfer capability.

11
Q

If you need to transfer from AWS S3 to GCS, what specific access setups are required?

A

I’d either provide an AWS access key ID and secret access key with read access to the S3 bucket, or configure identity federation so that STS assumes an AWS IAM role, making sure that role or user has the permissions needed to list and read the S3 source.
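
A minimal sketch of the access-key route, assuming the keys are already exported in the standard AWS environment variables. The function names and bucket names are hypothetical, and `GCLOUD` is a convenience added here so the job command is only printed unless it is set to `gcloud`. The JSON field names follow the AWS access key format that `--source-creds-file` expects, to the best of my knowledge.

```bash
#!/bin/sh
# Dry run by default: prints the command. Set GCLOUD=gcloud to execute it.
GCLOUD="${GCLOUD:-echo gcloud}"

write_s3_creds() {
  # $1 = output path. Keys are read from AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY.
  # The subshell keeps the restrictive umask from leaking into the caller.
  (
    umask 077
    printf '{"accessKeyId": "%s", "secretAccessKey": "%s"}\n' \
      "$AWS_ACCESS_KEY_ID" "$AWS_SECRET_ACCESS_KEY" > "$1"
  )
}

create_s3_job() {
  # $1 = S3 bucket, $2 = GCS bucket, $3 = credentials file
  $GCLOUD transfer jobs create "s3://$1" "gs://$2" --source-creds-file="$3"
}

# Example usage (hypothetical names):
#   write_s3_creds ./s3-creds.json
#   create_s3_job my-source-bucket my-destination-bucket ./s3-creds.json
```

The credentials file holds a secret, so it is worth deleting once the job has been created.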

12
Q

How would you optimize large-scale data transfers using STS?

A

I would deploy multiple STS agents for parallelism, use agent pools for balancing workloads, configure chunked uploads, and schedule transfers during off-peak network hours to maximize throughput.

13
Q

What monitoring mechanisms are available for a Storage Transfer job?

A

We can monitor transfer jobs in the GCP Console under Storage Transfer, and also integrate with Cloud Logging and Cloud Monitoring for detailed logs, alerts, and dashboards.
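
From the command line, the same information can be pulled roughly as follows. This is a sketch: the job name is a placeholder, the log filter assumes logging is enabled on the job and that entries use the `storage_transfer_job` resource type, and `GCLOUD` is a convenience added here so the commands are only printed unless it is set to `gcloud`.

```bash
#!/bin/sh
# Dry run by default: prints each command. Set GCLOUD=gcloud to execute them.
GCLOUD="${GCLOUD:-echo gcloud}"

JOB_NAME="my-transfer-job"   # placeholder job name

show_job_activity() {
  # List the operations (individual runs) that belong to the job.
  $GCLOUD transfer operations list --job-names="$JOB_NAME"
  # Read recent transfer log entries from Cloud Logging.
  $GCLOUD logging read 'resource.type="storage_transfer_job"' --limit=20
}

show_job_activity
```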

14
Q

How can you schedule and automate recurring transfers using STS?

A

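While creating the transfer job, we can define a schedule with a start time and a repeat interval (for example hourly or daily) so that the transfer happens automatically at set intervals. STS schedules are interval-based rather than CRON-based; for CRON-style timing, an external scheduler such as Cloud Scheduler can trigger job runs.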
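
A minimal sketch of a schedule defined at job creation, using a bucket-to-bucket job so no extra credentials are involved. The bucket names, start time, and function name are placeholders, and `GCLOUD` is a convenience added here so the command is only printed unless it is set to `gcloud`.

```bash
#!/bin/sh
# Dry run by default: prints the command. Set GCLOUD=gcloud to execute it.
GCLOUD="${GCLOUD:-echo gcloud}"

SCHEDULE_START="2030-01-01T00:00:00Z"   # placeholder first-run time (UTC)

create_scheduled_job() {
  # $1 = source URL, $2 = destination URL, $3 = repeat interval (e.g. 1h, 1d)
  $GCLOUD transfer jobs create "$1" "$2" \
    --schedule-starts="$SCHEDULE_START" \
    --schedule-repeats-every="$3"
}

create_scheduled_job gs://my-source-bucket gs://my-destination-bucket 1d
```

Adding `--schedule-repeats-until` with an end time would stop the recurrence after a cutover date.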

15
Q

What IAM roles are required to perform a bulk transfer with STS?

A

At minimum, a role like Storage Admin or Storage Object Admin on the destination bucket, plus a role like Storage Transfer Admin (roles/storagetransfer.admin) for managing transfer jobs, is needed.
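
One way these grants might look in practice, assuming the user manages jobs and the Google-managed STS service agent is the identity that writes to the destination bucket. All argument values are placeholders, the function names are my own, and `GCLOUD` is a convenience added here so the commands are only printed unless it is set to `gcloud`.

```bash
#!/bin/sh
# Dry run by default: prints each command. Set GCLOUD=gcloud to execute them.
GCLOUD="${GCLOUD:-echo gcloud}"

sts_service_agent() {
  # $1 = project NUMBER (not ID). Builds the STS service agent's email address.
  printf 'project-%s@storage-transfer-service.iam.gserviceaccount.com\n' "$1"
}

grant_transfer_roles() {
  # $1 = project ID, $2 = project number, $3 = user email, $4 = destination bucket
  # Let the user create and manage transfer jobs.
  $GCLOUD projects add-iam-policy-binding "$1" \
    --member="user:$3" --role=roles/storagetransfer.admin
  # Let the STS service agent write to the destination bucket.
  $GCLOUD storage buckets add-iam-policy-binding "gs://$4" \
    --member="serviceAccount:$(sts_service_agent "$2")" --role=roles/storage.admin
}

# Example usage (hypothetical values):
#   grant_transfer_roles my-project-id 123456789012 me@example.com my-destination-bucket
```

A narrower bucket role than Storage Admin may be sufficient; the broad role is used here only because it is the one named in the answer.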

16
Q

When would you choose STS over tools like gsutil or gcloud storage?

A

I would prefer STS when handling very large-scale migrations (hundreds of terabytes or petabytes), needing incremental syncs, or when a managed, scheduled, fault-tolerant solution is required.

17
Q

What are the differences between STS transfers from on-premises vs cloud-to-cloud?

A

For on-premises, STS requires setting up agents that scan and upload data securely to GCS. For cloud-to-cloud, STS connects directly using credentials, without needing on-prem agents, and pulls data over secure APIs.

18
Q

If downtime is critical, how does STS help minimize disruptions during migration?

A

By supporting incremental transfers and scheduling syncs, STS ensures that after the initial heavy migration, only delta changes are synced frequently, thus minimizing downtime and allowing a quick cutover.
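
At cutover time, the last delta sync could be triggered by hand along these lines. This sketch assumes writes to the source have already been frozen and that a transfer job exists; the job name and function name are placeholders, and `GCLOUD` is a convenience added here so the commands are only printed unless it is set to `gcloud`.

```bash
#!/bin/sh
# Dry run by default: prints each command. Set GCLOUD=gcloud to execute them.
GCLOUD="${GCLOUD:-echo gcloud}"

final_delta_sync() {
  # $1 = transfer job name
  # Start one more run to copy changes made since the last scheduled sync.
  $GCLOUD transfer jobs run "$1"
  # Follow the job's latest operation until it finishes.
  $GCLOUD transfer jobs monitor "$1"
}

final_delta_sync my-transfer-job
```

Because earlier runs already moved the bulk of the data, this final run only has the remaining delta to copy, which keeps the cutover window short.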