Using s3cmd
A quick memo on s3cmd, which I use from time to time.
This time I'm using s3cmd-1.5.0-beta1 on Amazon Linux.
Preparation
First, prepare a user that can access S3 and an S3 bucket.
I'll assume a user with permission to access S3 has already been created.
You can create buckets with s3cmd itself, but this time I created the bucket in advance.
I also enabled logging on the bucket.
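For reference, a minimal IAM policy for such a user might look like the following. This is only a sketch: the bucket name think-t is taken from the examples below, and "s3:*" is broader than strictly necessary, so narrow the actions to fit your use case.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::think-t",
        "arn:aws:s3:::think-t/*"
      ]
    }
  ]
}
```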
Installation and configuration
Installation is easy.
# wget http://downloads.sourceforge.net/project/s3tools/s3cmd/1.5.0-beta1/s3cmd-1.5.0-beta1.tar.gz
# tar -xvzf s3cmd-1.5.0-beta1.tar.gz
# cd s3cmd-1.5.0-beta1
# python setup.py install
Next, create the configuration file.
$ s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3
Access Key: <Your Access Key>
Secret Key: <Your Secret Key>

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password: <Your Encryption password>
Path to GPG program [/usr/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP and can't be used if you're behind a proxy
Use HTTPS protocol [No]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't conect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: <Your Access Key>
  Secret Key: <Your Secret Key>
  Encryption password: <Your Encryption password>
  Path to GPG program: /usr/bin/gpg
  Use HTTPS protocol: False
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] Y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Success. Encryption and decryption worked fine :-)

Save settings? [y/N] y
Configuration saved to '/home/ec2-user/.s3cfg'
Trying it out (on Linux)
File operations are intuitive.
$ s3cmd ls s3://think-t/work/1M.img
2014-05-25 01:23   1024000   s3://think-t/work/1M.img

$ s3cmd put 1M.img s3://think-t/work/
1M.img -> s3://think-t/work/1M.img  [1 of 1]
 1024000 of 1024000   100% in 0s 5.01 MB/s done

$ s3cmd get s3://think-t/work/1M.img 1M.img
s3://think-t/work/1M.img -> 1M.img  [1 of 1]
 1024000 of 1024000   100% in 0s 6.05 MB/s done

$ s3cmd del s3://think-t/work/1M.img
File s3://think-t/work/1M.img deleted
You can also use it much like rsync.
$ s3cmd sync work s3://think-t/
work/tmp1/1M.img -> s3://think-t/work/tmp1/1M.img  [1 of 9]
 1024000 of 1024000   100% in 0s 6.13 MB/s done
work/tmp1/2M.img -> s3://think-t/work/tmp1/2M.img  [2 of 9]
 2048000 of 2048000   100% in 0s 10.40 MB/s done
work/tmp1/5M.img -> s3://think-t/work/tmp1/5M.img  [3 of 9]
 5120000 of 5120000   100% in 0s 13.51 MB/s done
work/tmp2/10M.img -> s3://think-t/work/tmp2/10M.img  [4 of 9]
 10240000 of 10240000   100% in 0s 18.02 MB/s done
work/tmp2/20M.img -> s3://think-t/work/tmp2/20M.img  [part 1 of 2, 15MB]
 15728640 of 15728640   100% in 0s 22.55 MB/s done
work/tmp2/20M.img -> s3://think-t/work/tmp2/20M.img  [part 2 of 2, 4MB]
 4751360 of 4751360   100% in 0s 18.15 MB/s done
work/tmp2/50M.img -> s3://think-t/work/tmp2/50M.img  [part 1 of 4, 15MB]
 15728640 of 15728640   100% in 0s 20.92 MB/s done
(snip)
work/tmp2/50M.img -> s3://think-t/work/tmp2/50M.img  [part 4 of 4, 3MB]
 4014080 of 4014080   100% in 0s 18.28 MB/s done
work/tmp3/100M.img -> s3://think-t/work/tmp3/100M.img  [part 1 of 7, 15MB]
 15728640 of 15728640   100% in 0s 23.07 MB/s done
(snip)
work/tmp3/100M.img -> s3://think-t/work/tmp3/100M.img  [part 7 of 7, 7MB]
 8028160 of 8028160   100% in 0s 8.07 MB/s done
work/tmp3/200M.img -> s3://think-t/work/tmp3/200M.img  [part 1 of 14, 15MB]
 15728640 of 15728640   100% in 1s 9.13 MB/s done
(snip)
work/tmp3/200M.img -> s3://think-t/work/tmp3/200M.img  [part 14 of 14, 320kB]
 327680 of 327680   100% in 0s 4.67 MB/s done
work/tmp3/500M.img -> s3://think-t/work/tmp3/500M.img  [part 1 of 33, 15MB]
 15728640 of 15728640   100% in 1s 8.82 MB/s done
(snip)
work/tmp3/500M.img -> s3://think-t/work/tmp3/500M.img  [part 33 of 33, 8MB]
 8683520 of 8683520   100% in 1s 5.42 MB/s done
Process files that was not remote copied
Done. Uploaded 909312000 bytes in 95.0 seconds, 9.12 MB/s.  Copied 0 files saving 0 bytes transfer.
By default, files larger than 15 MB are automatically split into a multipart upload.
(This comes from "enable_multipart = True" and "multipart_chunk_size_mb = 15" in .s3cfg.)
$ s3cmd put 6G.img s3://think-t/work/
6G.img -> s3://think-t/work/6G.img  [part 1 of 391, 15MB]
 15728640 of 15728640   100% in 1s 12.16 MB/s done
6G.img -> s3://think-t/work/6G.img  [part 2 of 391, 15MB]
 15728640 of 15728640   100% in 1s 8.11 MB/s done
6G.img -> s3://think-t/work/6G.img  [part 3 of 391, 15MB]
(snip)
6G.img -> s3://think-t/work/6G.img  [part 391 of 391, 9MB]
 9830400 of 9830400   100% in 1s 7.38 MB/s done
Since S3 charges per request, when handling large files it seems best to control
the chunk size, either with the "--multipart-chunk-size-mb" option or by
adjusting the setting in .s3cfg.
$ s3cmd put --multipart-chunk-size-mb=5120 6G.img s3://think-t/work/
6G.img -> s3://think-t/work/6G.img  [part 1 of 2, 5GB]
 5368709120 of 5368709120   100% in 624s 8.20 MB/s done
6G.img -> s3://think-t/work/6G.img  [part 2 of 2, 739MB]
 775290880 of 775290880   100% in 90s 8.14 MB/s done
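The number of parts, and hence the number of PUT requests, is just the file size divided by the chunk size, rounded up. A quick sanity check in Python, using the 6G.img size implied by the transcripts above (391 parts at the 15 MB default works out to 6,144,000,000 bytes):

```python
import math

def multipart_parts(file_size: int, chunk_size_mb: int) -> int:
    """Number of multipart upload parts for a file of file_size bytes."""
    chunk = chunk_size_mb * 1024 * 1024
    return math.ceil(file_size / chunk)

file_size = 6_144_000_000  # 6G.img, from the transcript above

print(multipart_parts(file_size, 15))    # default 15 MB chunks -> 391 parts
print(multipart_parts(file_size, 5120))  # --multipart-chunk-size-mb=5120 -> 2 parts
```

Note that 5120 MB (5 GB) is also the maximum part size S3 accepts, so that is as few requests as a 6 GB upload can take.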
The info command gives more detailed information about a file.
$ s3cmd info s3://think-t/work/1M.img
s3://think-t/work/1M.img (object):
   File size: 1024000
   Last mod:  Sun, 25 May 2014 01:29:15 GMT
   MIME type: application/octet-stream; charset=binary
   MD5 sum:   80ec129d645c70cf0de45b1a5a682235
   SSE:       NONE
   policy:    none
   ACL:       ----: FULL_CONTROL
If you only need the hash, the "--list-md5" option is enough.
$ s3cmd ls --list-md5 s3://think-t/work/1M.img
2014-05-25 01:29   1024000   80ec129d645c70cf0de45b1a5a682235   s3://think-t/work/1M.img
Listing all the directories:
$ s3cmd la --bucket-location=US s3://think-t/work/
                       DIR   s3://think-t/logs/
                       DIR   s3://think-t/work/
Showing the size of the bucket:
$ s3cmd du s3://think-t/
1166628   s3://think-t/
As handy as ever.
That's it for today.