The cback-amazons3-sync tool is used for synchronizing entire directories of files up to an Amazon S3 cloud storage bucket, outside of the normal Cedar Backup process.
This might be a good option for some types of data, as long as you understand the limitations around retrieving previous versions of objects that get modified or deleted as part of a sync. S3 does support versioning, but it won't be quite as easy to get at those previous versions as with an explicit incremental backup like cback provides. Cedar Backup does not provide any tooling that would help you retrieve previous versions.
The underlying functionality relies on the AWS CLI toolset. Before you use this tool, you need to set up your Amazon S3 account and configure AWS CLI as detailed in Amazon's setup guide. The aws command will be executed as the same user that is executing the cback-amazons3-sync command, so make sure you configure it as the proper user. (This is different from the amazons3 extension, which is designed to execute as root and switches over to the configured backup user to execute AWS CLI commands.)
You can use whichever Amazon-supported authentication mechanism you would like when setting up connectivity for the AWS CLI. It's best to set up a separate user in the IAM Console rather than using your main administrative user.
You probably want to lock down this user so that it can only take backup-related actions in the AWS infrastructure. One option is to apply the AmazonS3FullAccess policy, which grants full access to the S3 infrastructure. If you would like to lock down the user even further, this appears to be the minimum set of permissions required for the aws s3 sync action, written as a JSON policy statement:
   {
      "Version": "2012-10-17",
      "Statement": [
         {
            "Effect": "Allow",
            "Action": [
               "s3:ListBucket",
               "s3:PutObject",
               "s3:PutObjectAcl",
               "s3:DeleteObject"
            ],
            "Resource": [
               "arn:aws:s3:::your-bucket",
               "arn:aws:s3:::your-bucket/*"
            ]
         }
      ]
   }
In the Resource section, be sure to list the name of your S3 bucket instead of your-bucket.
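As a minimal sketch, the policy above can be generated for a specific bucket rather than edited by hand. The function name build_sync_policy and the bucket name example.com-backup are illustrative assumptions, not part of Cedar Backup or the AWS CLI; the action list is copied from the policy statement above.

```python
import json


def build_sync_policy(bucket_name):
    """Return the minimal sync policy shown above, as a dict, for one bucket.

    Hypothetical helper for illustration; not part of Cedar Backup.
    """
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket",
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:DeleteObject",
                ],
                # One entry for the bucket itself, one for the objects in it
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }


if __name__ == "__main__":
    # "example.com-backup" is a placeholder bucket name
    print(json.dumps(build_sync_policy("example.com-backup"), indent=2))
```

The printed JSON can then be pasted into the policy editor in the IAM Console.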
The cback-amazons3-sync command has the following syntax:
   Usage: cback-amazons3-sync [switches] sourceDir s3bucketUrl

   Cedar Backup Amazon S3 sync tool.

   This Cedar Backup utility synchronizes a local directory to an Amazon S3
   bucket.  After the sync is complete, a validation step is taken.  An
   error is reported if the contents of the bucket do not match the
   source directory, or if the indicated size for any file differs.
   This tool is a wrapper over the AWS CLI command-line tool.

   The following arguments are required:

      sourceDir            The local source directory on disk (must exist)
      s3BucketUrl          The URL to the target Amazon S3 bucket

   The following switches are accepted:

      -h, --help           Display this usage/help listing
      -V, --version        Display version information
      -b, --verbose        Print verbose output as well as logging to disk
      -q, --quiet          Run quietly (display no output to the screen)
      -l, --logfile        Path to logfile (default: /var/log/cback.log)
      -o, --owner          Logfile ownership, user:group (default: root:adm)
      -m, --mode           Octal logfile permissions mode (default: 640)
      -O, --output         Record some sub-command (i.e. aws) output to the log
      -d, --debug          Write debugging information to the log (implies --output)
      -s, --stack          Dump Python stack trace instead of swallowing exceptions
      -D, --diagnostics    Print runtime diagnostics to the screen and exit
      -v, --verifyOnly     Only verify the S3 bucket contents, do not make changes
      -w, --ignoreWarnings Ignore warnings about problematic filename encodings

   Typical usage would be something like:

      cback-amazons3-sync /home/myuser s3://example.com-backup/myuser

   This will sync the contents of /home/myuser into the indicated bucket.
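If you want to drive the tool from a script, the syntax above can be wrapped roughly as follows. This is a sketch under stated assumptions: the helper names build_sync_command and run_sync are hypothetical, and it assumes the tool signals failure with a non-zero exit status, which the usage listing does not state. Only switches that appear in the usage listing are used.

```python
import subprocess


def build_sync_command(source_dir, bucket_url, verify_only=False, ignore_warnings=False):
    """Build the argument list for cback-amazons3-sync (hypothetical helper)."""
    command = ["cback-amazons3-sync"]
    if verify_only:
        command.append("--verifyOnly")      # check the bucket, make no changes
    if ignore_warnings:
        command.append("--ignoreWarnings")  # proceed despite filename encoding warnings
    # Switches come first, then the two required arguments
    command.extend([source_dir, bucket_url])
    return command


def run_sync(source_dir, bucket_url, **options):
    """Run the tool and return its exit status.

    Assumption: a non-zero status means the sync or validation failed.
    """
    result = subprocess.run(build_sync_command(source_dir, bucket_url, **options), check=False)
    return result.returncode


if __name__ == "__main__":
    # Prints the command for the typical usage shown above without running it
    print(build_sync_command("/home/myuser", "s3://example.com-backup/myuser"))
```

Passing the arguments as a list avoids shell quoting problems when the source directory contains spaces.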
-h, --help
Display usage/help listing.
-V, --version
Display version information.
-b, --verbose
Print verbose output to the screen as well as writing to the logfile. When this option is enabled, most information that would normally be written to the logfile will also be written to the screen.
-q, --quiet
Run quietly (display no output to the screen).
-l, --logfile
Specify the path to an alternate logfile. The default logfile is /var/log/cback.log.
-o, --owner
Specify the ownership of the logfile, in the form user:group. The default ownership is root:adm, to match the Debian standard for most logfiles. This value will only be used when creating a new logfile. If the logfile already exists when the cback-amazons3-sync command is executed, it will retain its existing ownership and mode. Only user and group names may be used, not numeric uid and gid values.
-m, --mode
Specify the permissions for the logfile, using the numeric mode as in chmod(1). The default mode is 0640 (-rw-r-----). This value will only be used when creating a new logfile. If the logfile already exists when the cback-amazons3-sync command is executed, it will retain its existing ownership and mode.
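To see how a numeric mode maps to the symbolic form shown above, the conversion can be sketched with Python's standard stat module. The helper name describe_mode is hypothetical and is not part of Cedar Backup.

```python
import stat


def describe_mode(mode_text):
    """Convert an octal mode string such as "640" to its ls-style form."""
    mode = int(mode_text, 8)  # parse the digits as octal, as chmod(1) does
    # S_IFREG marks a regular file, so the first character is "-"
    return stat.filemode(stat.S_IFREG | mode)


if __name__ == "__main__":
    print(describe_mode("640"))
```

For the default mode, this prints -rw-r-----: read and write for the owner, read-only for the group, and no access for others.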
-O, --output
Record some sub-command output to the logfile. When this option is enabled, all output from system commands will be logged. This might be useful for debugging or just for reference.
-d, --debug
Write debugging information to the logfile. This option produces a high volume of output, and would generally only be needed when debugging a problem. This option implies the --output option, as well.
-s, --stack
Dump a Python stack trace instead of swallowing exceptions. This forces Cedar Backup to dump the entire Python stack trace associated with an error, rather than just propagating the last message it received back up to the user interface. Under some circumstances, this is useful information to include along with a bug report.
-D, --diagnostics
Display runtime diagnostic information and then exit. This diagnostic information is often useful when filing a bug report.
-v, --verifyOnly
Only verify the S3 bucket contents against the directory on disk. Do not make any changes to the S3 bucket or transfer any files. This is intended as a quick check to see whether the sync is up-to-date.
Although no files are transferred, the tool will still execute the source filename encoding check, discussed below along with --ignoreWarnings.
-w, --ignoreWarnings
The AWS CLI S3 sync process is very picky about filename encoding. Files that the Linux filesystem handles with no problems can cause problems in S3 if the filename cannot be encoded properly in your configured locale. As of this writing, filenames like this will cause the sync process to abort without transferring all files as expected.
To avoid confusion, the cback-amazons3-sync tool tries to guess which files in the source directory will cause problems, and refuses to execute the AWS CLI S3 sync if any problematic files exist. If you'd rather proceed anyway, use --ignoreWarnings.
If problematic files are found, then you have basically two options: either correct your locale (e.g. if you have set LANG=C) or rename the file so it can be encoded properly in your locale. The error messages will tell you the expected encoding (from your locale) and the actual detected encoding for the filename.
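To look for problematic filenames yourself before running a sync, a rough approximation of this kind of check is sketched below. It is not the check that cback-amazons3-sync actually performs: it only tests whether each name can be encoded in the locale's preferred encoding, and the function names find_unencodable_names and scan_directory are hypothetical.

```python
import locale
import os


def find_unencodable_names(names, encoding):
    """Return the names that cannot be encoded in the given encoding."""
    problems = []
    for name in names:
        try:
            # Strict encoding fails on characters outside the encoding and on
            # the surrogate escapes Python uses for undecodable filename bytes
            name.encode(encoding)
        except UnicodeEncodeError:
            problems.append(name)
    return problems


def scan_directory(source_dir):
    """Walk source_dir and report paths whose names fail the check."""
    encoding = locale.getpreferredencoding(False)  # encoding implied by the locale
    problems = []
    for root, dirs, files in os.walk(source_dir):
        for name in find_unencodable_names(dirs + files, encoding):
            problems.append(os.path.join(root, name))
    return encoding, problems


if __name__ == "__main__":
    # "/home/myuser" is the placeholder source directory from the usage example
    encoding, problems = scan_directory("/home/myuser")
    print("Locale encoding:", encoding)
    for path in problems:
        print("Problematic:", ascii(path))
```

Names are printed with ascii() so that the report itself does not fail on the characters it is describing.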