Add the -m flag to parallelize your transfer and observe greatly increased speed:
BUCKET_NAME=my-bucket-4312
gsutil -m cp -r gs://gcp-public-data-landsat/LC08/01/044/034/* gs://$BUCKET_NAME
Adding -m to a gsutil command causes most operations to run in parallel, using a combination of multithreading and multiprocessing. The number of threads and processes is set by parallel_thread_count and parallel_process_count. These can be set in your .boto configuration file or on the command line with the -o option.
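As a rough sketch of the command-line route, the snippet below passes both settings with -o, which takes values in the form Section:name=value (both settings belong to the GSUtil section). The counts 4 and 8 are illustrative assumptions, not recommendations, and the run helper is a hypothetical wrapper added here so the command is printed first and only executed when RUN=1 is set.

```shell
#!/bin/sh
# Illustrative values only; tune them for your machine and network.
BUCKET_NAME=my-bucket-4312
PROCESS_COUNT=4   # assumed value for parallel_process_count
THREAD_COUNT=8    # assumed value for parallel_thread_count

# Hypothetical helper: print the command, and run it only when RUN=1.
run() {
  printf '%s\n' "$*"
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  fi
}

# Equivalent .boto entries would go under the [GSUtil] section:
#   parallel_process_count = 4
#   parallel_thread_count = 8
# The source URL is quoted so gsutil, not the shell, expands the wildcard.
run gsutil -m \
  -o "GSUtil:parallel_process_count=${PROCESS_COUNT}" \
  -o "GSUtil:parallel_thread_count=${THREAD_COUNT}" \
  cp -r "gs://gcp-public-data-landsat/LC08/01/044/034/*" "gs://${BUCKET_NAME}"
```

Values passed with -o apply only to that single invocation, which makes them convenient for experimenting before committing a setting to the .boto file.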
In general, if you are moving a couple of files, you won’t need this flag. However, if you are doing batch uploads and are OK saturating your CPU and even your network link, you can consider setting it. Just note that this can starve other processes or devices on the network of resources.
If you are trying to do this in code, it is usually more performant to use the appropriate client library rather than call the gsutil command-line tool, which is in turn just calling a Python library.