GCS gives NotImplemented errors for multi-part server side copies. The
threshold for these is currently set just below 5G so any files bigger
than 5G that rclone attempts to server side copy will fail.
This patch works around the problem by adding a quirk for GCS raising
--s3-copy-cutoff to the maximum. This means that rclone will never use
multi-part copies for files in GCS. This includes files bigger than
5GB which (according to AWS documentation) must be copied with
multi-part copy. However this seems to work with GCS.
See: https://forum.rclone.org/t/chunker-uploads-to-gcs-s3-fail-if-the-chunk-size-is-greater-than-the-max-part-size/44349/
See: https://issuetracker.google.com/issues/323465186
A seafile server can be configured to use a relative URL as
FILE_SERVER_ROOT in order to support more than one hostname/ip. (see
https://github.com/haiwen/seahub/issues/3398#issuecomment-506920360 )
The previous backend implementation always expected an absolute
download/upload URL, resulting in an "unsupported protocol scheme"
error.
With this commit it supports both absolute and relative.
It was reported that rclone copy occasionally uploaded corrupted data
to azure blob.
This turned out to be a race condition updating the block count which
caused blocks to be duplicated.
This bug was introduced in this commit in v1.64.0 and will be fixed in v1.65.2
0427177857 azureblob: implement OpenChunkWriter and multi-thread uploads #7056
This race only seems to happen if `--checksum` is used but can happen otherwise.
Unfortunately Azure blob does not check the MD5 that we send them so
despite sending incorrect data this corruption is not detected. The
corruption is detected when rclone tries to download the file, so
attempting to copy the files back to local disk will result in errors
such as:
ERROR : file.pokosuf5.partial: corrupted on transfer: md5 hash differ "XXX" vs "YYY"
This adds a check to test the blocklist we upload is as we expected
which would have caught the problem had it been in place earlier.
Similar to
acf1e2df84,
go1.21.4 appears to have broken sync.MoveDir on Windows because
filepath.VolumeName() returns `\\?` instead of `\\?\C:` in cleanRootPath. It
looks like the Go team is aware of the issue and planning a fix, so this may
only be needed temporarily.
When f.opt.MaxAge == 0, f.db is never set, however several methods later assume
it is set and attempt to access it, causing an invalid memory address error.
This change fixes the issue in a few spots (there may still be others I haven't
yet encountered.)
This fixes the Root() returned by the backend when it has returned
fs.ErrorIsFile.
Before this change it returned a root which included the file path.
Because Root() was wrong this caused the detection of the file being
moved over itself check to fail.
This adds an integration test to check it for all backends.
See: https://forum.rclone.org/t/rclone-move-chunker-dir-file-chunker-dir-deletes-all-file-chunks/43333/
Since `tokenRenewer` adds a Shutdown method, we should call it to
clean up resources.
changes backends:
onedrive,box,pcloud,amazonclouddrive,hidrive,jottacloud,sharefile
,premiumizeme
Signed-off-by: rkonfj <rkonfj@gmail.com>
This error was introduced in this commit when refactoring the list
routine.
b8591b230d onedrive: implement ListR method which gives --fast-list support
The error was caused by OneNote files not being skipped properly.
Before this change the IP address of the server was used in the SMB
connect request (see CloudSoda/go-smb2#18).
The updated library now can pass the hostname instead.
The update requires a small change in the dial method call.
Fixes rclone#6672
Before this change ListR was unconditionally enabled on onedrive.
This caused performance problems for some uses, so now the
--onedrive-delta flag has to be supplied.
Fixes#7362
Before this change PartialUploads was not set. This is clearly wrong
since incoming files are visible on the smb server.
Setting PartialUploads fixes the multithread upload modtime problem as
it uses the PartialUploads flag as an indication that it needs to set
the modtime explicitly.
This problem was detected by the new TestMultithreadCopy integration
tests
Fixes#7411
Before this change smb drives sometimes showed a fraction of the
correct size using `rclone about`.
This fixes the problem by switching the upstream library from
github.com/hirochachacha/go-smb2 to github.com/cloudsoda/go-smb2 which
has a fix for the problem.
The new library passes the integration tests.
Fixes#6733
Before this change, streaming files an exact multiple of the chunk
size would cause rclone to attempt to stream a 0 sized chunk which was
rejected by the b2 servers.
This bug was noticed by the new integration tests for chunked streaming.
Before this change the b2 servers would complain as this was only a
single part transfer.
This was noticed by the new integration tests for server side chunked copy.
This commit fixed the problem but made the integration tests fail.
33376bf399 dropbox: fix missing encoding for rclone purge
This fixes the problem properly by making sure we send the encoded or
non encoded root to the right places.
Before this change, the drive backend only used metadata if it was
created with Metadata enabled.
This patch changes it so the Metadata support is enabled dynamically
if it is set in the context.
This fixes the metadata tests in the integration tests which have been
changed to make sure Metadata is enabled.
Google drive doesn't allow the btime (created time) metadata to be
updated when updating an existing object.
This changes skips btime metadata if we are updating an existing
object but allows it otherwise.
- fetch metadata with listings and fetch permissions in parallel
- only write permissions out if they are not inherited.
- make setting labels, owner and permissions work controlled by flags
- `--drive-metadata-labels`, `--drive-metadata-owner`, `--drive-metadata-permissions`
- convert to directoryCache - makes backend much more efficient
- don't force --low-level-retries to 2
- don't wrap paced calls in pacer
- fix shouldRetry
- fix file list searching mechanism
- use rclone's http Transport
- fix handling of 0 length files
- combine into one file and remove uneeded abstraction
- make `chunk_size` and `upload_concurrency` settable
- make auth the same as azureblob
- set the Features correctly
- implement `--azurefiles-max-stream-size`
- remove arbitrary sleep on Mkdir
- implement `--header-upload`
- implement read and write MimeType for objects
- implement optional methods
- About
- Copy
- DirMove
- Move
- OpenWriterAt
- PutStream
- finish documentation
- disable build on plan9 and js
Fixes#365Fixes#7378
- Changes
- Rename `--s3-authkey` to `--auth-key` to get it out of the s3 backend namespace
- Enable `Content-MD5` integrity checks
- Remove locking after code audit
- Documentation
- Factor out documentation into seperate file
- Add Quickstart to docs
- Add Bugs section to docs
- Add experimental tag to docs
- Add rclone provider to s3 backend docs
- Fixes
- Correct quirks in s3 backend
- Change fmt.Printlns into fs.Logs
- Make metadata storage per backend not global
- Log on startup if anonymous access is enabled
- Coding style fixes
- rename fs to vfs to save confusion with the rest of rclone code
- rename db to b for *s3Backend
Fixes#7062
Users can now input a comma separated list of namenodes when writing
config for hdfs remotes.
This is required when you have multiple namenodes in your hdfs cluster
and cannot be certain which namenodes will be in 'standby' or 'active'
states.
This was available before but wasn't documented and didn't use the
correct rclone interfaces.
On a 404 error, b2 returns an empty body which, before this change,
caused the error handler to try to parse an empty string and give the
following DEBUG message:
Couldn't decode error response: EOF
This is confusing as it is expected in normal operations and isn't an
error.
This change reads the body of an error response first then tries to
decode it only if it isn't empty, which avoids the confusing DEBUG
message.
This also upgrades failure to read the body or failure to decode the
JSON to ERROR messages as now we are certain that we should have
something to read and decode.
For some files the Windows Volume Shadow Service (VSS) advertises the
file size as X in the directory listing but returns a different number
Y on stat-ing the file. If the file is opened and read there are Y
bytes available for reading.
Existing copy tools copy Y bytes rather than X so for consistency
rclone should do the same.
This fixes the problem by stat-ing the file immediately before opening
it. This will also reduce the unnecessary occurrence of "can't copy -
source file is being updated" errors; if the file has finished
changing by the time we come to copy it then we now can copy it
successfully.
See: https://forum.rclone.org/t/consistently-getting-corrupted-on-transfer-sizes-differ-syncing-to-an-smb-share/42218/
Streaming uploads are used by rclone rcat and rclone mount
--vfs-cache-mode off.
After the multipart chunker refactor the multipart chunked streaming
upload was accidentally mixing the first and the second parts up which
was causing corrupted uploads.
This was caused by a simple off by one error in the refactoring where
we went from 1 based part number counting to 0 based part number
counting.
Fixing this revealed that the metadata wasn't being re-read for the
copied object either.
This fixes both of those issues and adds an integration tests so it
won't happen again.
Fixes#7367
After the multipart chunker refactor the multipart chunked server side
copy was accidentally sending one part too many. The last part was 0
length which was rejected by b2.
This was caused by a simple off by one error in the refactoring where
we went from 1 based part number counting to 0 based part number
counting.
Fixing this revealed that the metadata wasn't being re-read for the
copied object either.
This fixes both of those issues and adds an integration tests so it
won't happen again.
See: https://forum.rclone.org/t/large-server-side-copy-in-b2-fails-due-to-bad-byte-range/42294
Before this change if you tried to create a bucket that already
existed, but someone else owned then rclone did not return an error.
This now will return an error on providers that return the
AlreadyOwnedByYou error code or no error on bucket creation of an
existing bucket owned by you.
This introduces a new provider quirk and this has been set or cleared
for as many providers as can be tested. This can be overridden by the
--s3-use-already-exists flag.
Fixes#7351
Before this change, attempting to server side copy a google form would
give this error
No export formats found for "application/vnd.google-apps.form"
Adding this flag allows the form to be server side copied but not
downloaded.
Fixes#6302
In v1.63 memory usage in the b2 backend was limited to `--transfers` *
`--b2-chunk-size`
However in v1.64 this was raised to `--transfers` * `--b2-chunk-size`
* `--b2-upload-concurrency`.
The default value for this was accidently set quite high at 16 which
means by default rclone could use up to 6.4GB of memory!
The new default sets a more reasonable (but still high) max memory of 1.6GB.
Before this change, the lock was held while the upload URL was being
fetched from the server.
This meant that any other threads were blocked from getting upload
URLs unecessarily.
It also increased the potential for deadlock.
Most useful is the addition of the file created timestamp, but also a timestamp for
when the file was uploaded.
Currently supporting a rather minimalistic set of metadata items, see PR #6359 for
some thoughts about possible extensions.
In this commit:
5f938fb9ed s3: fix "Entry doesn't belong in directory" errors when using directory markers
We checked that the remote has the prefix and then changed the remote
before removing the prefix. This sometimes causes:
panic: runtime error: slice bounds out of range [56:55]
The fix is to do the modification of the remote after removing the
prefix.
See: https://forum.rclone.org/t/cryptcheck-panic-runtime-error-slice-bounds-out-of-range/41977
Before this change the b2 backend listed all the buckets to turn a
single bucket name into an ID.
However in July 26, 2018 a parameter was added to the list buckets API
to make listing all the buckets unecessary.
This code sets the bucketName parameter so that only the results for
the desired bucket are returned.
The improved upload logic is active by default in uplink v1.12.0, so the
`testuplink.WithConcurrentSegmentUploadsDefaultConfig(ctx)` is not
required anymore.
See https://github.com/rclone/rclone/pull/7198
Before this change uploaded files could return the error "replication
in progress".
This error is harmless though and means the Close should be retried
which is what this patch does.
In this commit we discovered a problem with objects being uploaded to
the incorrect object name. It added an integration test for the
problem.
65b2e378e0 drive: fix incorrect remote after Update on object
This test was tripped by the hdfs backend and this patch fixes the
problem.
Sometimes opendrive reports "403 Folder is already deleted" on
directories which should exist.
This might be a bug in opendrive or in rclone however we work-around
here sufficient to get the tests passing.
ChangeNotify has been broken on the compress backend for a long time!
Before this change it was wrapping the file names received rather than
unwrapping them to discover the original names.
It is likely ChangeNotify was working adequately though for users as
the VFS just uses the directories rather than the file names.
Before this change the concurrency used for an upload was rather
inconsistent.
- if size below `--backend-upload-cutoff` (default 200M) do single part upload.
- if size below `--multi-thread-cutoff` (default 256M) or using streaming
uploads (eg `rclone rcat) do multipart upload using
`--backend-upload-concurrency` to set the concurrency used by the uploader.
- otherwise do multipart upload using `--multi-thread-streams` to set the
concurrency.
This change makes the default for the concurrency used be the
`--backend-upload-concurrency`. If `--multi-thread-streams` is set and larger
than the `--backend-upload-concurrency` then that will be used instead.
This means that if the user sets `--backend-upload-concurrency` then it will be
obeyed for all multipart/multi-thread transfers and the user can override them
all with `--multi-thread-streams`.
See: #7056
Before this change, b2 would return an error when opening a link
generated by `rclone link`. The following error occurs when the object
path contains an ampersand that is not percent encoded:
{
"code": "bad_request",
"message": "Bad character in percent-encoded string: 38 (0x26)",
"status": 400
}
Before this change the box backend could make errors like
Error "not_found" (404): On-Behalf-Of User not found ([123 34 105 110
118 97 108 105 100 95 117 115 101 114 95 105 100 34 58 123 34 105 100
34 58 34 48 48 48 48 48 48 48 48 48 48 48 34 125 125])
This fixes it to produce this instead
Error "not_found" (404): On-Behalf-Of User not found ({"invalid_user_id":{"id":"00000000000"}})
This implements the OpenChunkWriter interface for b2 which
enables multi-thread uploads.
This makes the memory controls of the s3 backend inoperative; they are
replaced with the global ones.
--b2-memory-pool-flush-time
--b2-memory-pool-use-mmap
By using the buffered reader this fixes excessive memory use when
uploading large files as it will share memory pages between all
readers.
This implements the OpenChunkWriter interface for azureblob which
enables multi-thread uploads.
This makes the memory controls of the s3 backend inoperative; they are
replaced with the global ones.
--azureblob-memory-pool-flush-time
--azureblob-memory-pool-use-mmap
By using the buffered reader this fixes excessive memory use when
uploading large files as it will share memory pages between all
readers.
This makes the memory controls of the s3 backend inoperative and
replaced with the global ones.
--s3-memory-pool-flush-time
--s3-memory-pool-use-mmap
By using the buffered reader this fixes excessive memory use when
uploading large files as it will share memory pages between all
readers.
Fixes#7141
As an extra security feature some FTP servers (eg FileZilla) require
that the data connection re-use the same TLS connection as the control
connection. This is a good thing for security.
The message "TLS session of data connection not resumed" means that it
was not done.
The problem turned out to be that rclone was re-using the TLS session
cache between concurrent connections so the resumed TLS data
connection could from any of the control connections.
This patch makes each TLS connection have its own session cache which
should fix the problem.
This also reverts the ftp library to the upstream version which now
contains all of our patches.
Fixes#7234