# frozen_string_literal: true

module BackupRestore
  class S3BackupStore < BackupStore
FEATURE: Direct S3 multipart uploads for backups (#14736)
This PR introduces a new `enable_experimental_backup_uploads` site setting (default false and hidden), which when enabled alongside `enable_direct_s3_uploads` will allow for direct S3 multipart uploads of backup .tar.gz files.
To make multipart external uploads work with both the S3BackupStore and the S3Store, I've had to move several methods out of S3Store and into S3Helper, including:
* presigned_url
* create_multipart
* abort_multipart
* complete_multipart
* presign_multipart_part
* list_multipart_parts
Then, S3Store and S3BackupStore either delegate directly to S3Helper or have their own special methods to call S3Helper for these methods. FileStore.temporary_upload_path has also removed its dependence on upload_path, and can now be used interchangeably between the stores. A similar change was made in the frontend as well, moving the multipart related JS code out of ComposerUppyUpload and into a mixin of its own, so it can also be used by UppyUploadMixin.
Some changes to ExternalUploadManager had to be made here as well. The backup direct uploads do not need an Upload record made for them in the database, so they can be moved to their final S3 resting place when completing the multipart upload.
This changeset is not perfect; it introduces some special cases in UploadController to handle backups that was previously in BackupController, because UploadController is where the multipart routes are located. A subsequent pull request will pull these routes into a module or some other sharing pattern, along with hooks, so the backup controller and the upload controller (and any future controllers that may need them) can include these routes in a nicer way.
2021-11-11 06:25:31 +08:00
    UPLOAD_URL_EXPIRES_AFTER_SECONDS ||= 6.hours.to_i

    delegate :abort_multipart, :presign_multipart_part, :list_multipart_parts,
             :complete_multipart, to: :s3_helper
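=begin
The `delegate ... to: :s3_helper` call forwards the multipart methods to the
helper object. Below is a minimal sketch of the same pattern, assuming Ruby's
standard-library `Forwardable` in place of ActiveSupport's `delegate`.
`HelperStub`, `StoreSketch`, and the positional argument lists are hypothetical
and are not meant to reflect the real S3Helper signatures.

```ruby
require "forwardable"

# Hypothetical stand-in for S3Helper: each method just reports what it received.
class HelperStub
  def abort_multipart(key, upload_id)
    [:abort_multipart, key, upload_id]
  end

  def list_multipart_parts(key, upload_id)
    [:list_multipart_parts, key, upload_id]
  end
end

# Hypothetical store that forwards the multipart calls to its helper.
class StoreSketch
  extend Forwardable

  # Calls to these methods on the store are passed straight to #s3_helper.
  def_delegators :s3_helper, :abort_multipart, :list_multipart_parts

  def s3_helper
    @s3_helper ||= HelperStub.new
  end
end

store = StoreSketch.new
p store.abort_multipart("temp/backup.tar.gz", "upload-1")
# => [:abort_multipart, "temp/backup.tar.gz", "upload-1"]
```

The store never implements the multipart logic itself; it only decides which
helper instance receives the call.
=end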

    def initialize(opts = {})
      @s3_options = S3Helper.s3_options(SiteSetting)
      @s3_options.merge!(opts[:s3_options]) if opts[:s3_options]
    end

    def s3_helper
      @s3_helper ||= S3Helper.new(s3_bucket_name_with_prefix, '', @s3_options.clone)
    end

    def remote?
      true
    end

    def file(filename, include_download_source: false)
      obj = s3_helper.object(filename)
      create_file_from_object(obj, include_download_source) if obj.exists?
    end

    def delete_file(filename)
      obj = s3_helper.object(filename)

      if obj.exists?
        obj.delete
        reset_cache
      end
    end

    def download_file(filename, destination_path, failure_message = nil)
      s3_helper.download_file(filename, destination_path, failure_message)
    end

    def upload_file(filename, source_path, content_type)
      obj = s3_helper.object(filename)
      raise BackupFileExists.new if obj.exists?

      obj.upload_file(source_path, content_type: content_type)
      reset_cache
    end

    def generate_upload_url(filename)
      obj = s3_helper.object(filename)
      raise BackupFileExists.new if obj.exists?

      # TODO (martin) We can remove this at a later date when we move this
      # ensure CORS for backups and direct uploads to a post-site-setting
      # change event, so the rake task doesn't have to be run manually.
      @s3_helper.ensure_cors!([S3CorsRulesets::BACKUP_DIRECT_UPLOAD])

      presigned_url(obj, :put, UPLOAD_URL_EXPIRES_AFTER_SECONDS)
    rescue Aws::Errors::ServiceError => e
      Rails.logger.warn("Failed to generate upload URL for S3: #{e.message.presence || e.class.name}")
      raise StorageError.new(e.message.presence || e.class.name)
    end

    def signed_url_for_temporary_upload(file_name, expires_in: S3Helper::UPLOAD_URL_EXPIRES_AFTER_SECONDS, metadata: {})
      obj = object_from_path(file_name)
      raise BackupFileExists.new if obj.exists?
      key = temporary_upload_path(file_name)
      s3_helper.presigned_url(
        key,
        method: :put_object,
        expires_in: expires_in,
        opts: {
          metadata: metadata,
          acl: "private"
        }
      )
    end

    def temporary_upload_path(file_name)
      FileStore::BaseStore.temporary_upload_path(file_name, folder_prefix: temporary_folder_prefix)
    end

    def temporary_folder_prefix
      folder_prefix = s3_helper.s3_bucket_folder_path.nil? ? "" : s3_helper.s3_bucket_folder_path

      if Rails.env.test?
        folder_prefix = File.join(folder_prefix, "test_#{ENV['TEST_ENV_NUMBER'].presence || '0'}")
      end

      folder_prefix
    end

    def create_multipart(file_name, content_type, metadata: {})
      obj = object_from_path(file_name)
      raise BackupFileExists.new if obj.exists?
      key = temporary_upload_path(file_name)
      s3_helper.create_multipart(key, content_type, metadata: metadata)
    end
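=begin
`create_multipart` refuses to start an upload when a backup with the final name
already exists, and otherwise starts it under a temporary key. The flow can be
sketched without S3 as follows. Everything here is hypothetical and for
illustration only: `InMemoryHelper`, `BackupFileExistsSketch`,
`start_backup_upload`, and the "temp/" prefix (the real temporary key comes
from `FileStore::BaseStore.temporary_upload_path`, whose format is not shown in
this file).

```ruby
# Hypothetical error class standing in for BackupFileExists.
class BackupFileExistsSketch < StandardError; end

# Hypothetical in-memory helper: knows which keys exist and records the
# temporary keys for which an upload was started.
class InMemoryHelper
  attr_reader :started

  def initialize(existing_keys)
    @existing_keys = existing_keys
    @started = []
  end

  def exists?(key)
    @existing_keys.include?(key)
  end

  def create_multipart(key, content_type)
    @started << key
    { key: key, content_type: content_type }
  end
end

# Refuse to overwrite a stored backup; otherwise start the upload under a
# temporary key. The "temp" folder is an assumption made for this sketch.
def start_backup_upload(helper, file_name, content_type)
  raise BackupFileExistsSketch if helper.exists?(file_name)

  helper.create_multipart(File.join("temp", file_name), content_type)
end

helper = InMemoryHelper.new(["old.tar.gz"])
start_backup_upload(helper, "new.tar.gz", "application/gzip")
# helper.started now contains "temp/new.tar.gz";
# asking for "old.tar.gz" would raise BackupFileExistsSketch.
```

Checking the final name but writing to a temporary key means an interrupted
upload cannot leave a partial file under the name a restore would look for.
=end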

    def move_existing_stored_upload(
      existing_external_upload_key:,
      original_filename: nil,
      content_type: nil
    )
      s3_helper.copy(
        existing_external_upload_key,
        File.join(s3_helper.s3_bucket_folder_path, original_filename),
        options: { acl: "private", apply_metadata_to_destination: true }
      )
      s3_helper.delete_object(existing_external_upload_key)
    end

    def object_from_path(path)
      s3_helper.object(path)
    end

    private

    def unsorted_files
      objects = []

      s3_helper.list.each do |obj|
        if obj.key.match?(file_regex)
          objects << create_file_from_object(obj)
        end
      end

      objects
    rescue Aws::Errors::ServiceError => e
      Rails.logger.warn("Failed to list backups from S3: #{e.message.presence || e.class.name}")
      raise StorageError.new(e.message.presence || e.class.name)
    end

    def create_file_from_object(obj, include_download_source = false)
      expires = SiteSetting.s3_presigned_get_url_expires_after_seconds
      BackupFile.new(
        filename: File.basename(obj.key),
        size: obj.size,
        last_modified: obj.last_modified,
        source: include_download_source ? presigned_url(obj, :get, expires) : nil
      )
    end

    def presigned_url(obj, method, expires_in_seconds)
      obj.presigned_url(method, expires_in: expires_in_seconds)
    end

    def cleanup_allowed?
      !SiteSetting.s3_disable_cleanup
    end

    def s3_bucket_name_with_prefix
      File.join(SiteSetting.s3_backup_bucket, RailsMultisite::ConnectionManagement.current_db)
    end

    def file_regex
      @file_regex ||= begin
        path = s3_helper.s3_bucket_folder_path || ""

        if path.present?
          path = "#{path}/" unless path.end_with?("/")
          path = Regexp.quote(path)
        end

        /^#{path}[^\/]*\.t?gz$/i
      end
    end
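=begin
`file_regex` only accepts `.gz` / `.tgz` objects that sit directly inside the
configured folder, not in nested folders. A self-contained sketch of that
behaviour follows. `backup_file_regex` is a hypothetical helper that takes the
folder path as an argument and uses plain Ruby (`empty?`) instead of
ActiveSupport's `present?`; the folder and file names are made up.

```ruby
# Build the backup-matching regex for a folder path (nil or "" means bucket root).
def backup_file_regex(folder_path)
  path = folder_path || ""

  unless path.empty?
    # Anchor on the folder itself, with exactly one trailing slash.
    path = "#{path}/" unless path.end_with?("/")
    # Escape regex metacharacters such as "." or "-" in the folder name.
    path = Regexp.quote(path)
  end

  # After the folder: any characters except "/", then ".gz" or ".tgz".
  /^#{path}[^\/]*\.t?gz$/i
end

regex = backup_file_regex("backups/default")

regex.match?("backups/default/site-2021.tar.gz")   # direct child, matches
regex.match?("backups/default/SITE.TGZ")           # case-insensitive, matches
regex.match?("backups/default/nested/site.tar.gz") # nested folder, no match
regex.match?("backups/default/readme.txt")         # wrong extension, no match
```

Because `[^\/]*` cannot cross a slash, objects under a temporary sub-folder of
the same prefix are never listed as backups.
=end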

    def free_bytes
      nil
    end
  end
end