Before this change the webdav backend didn't truncate Range requests
to the size of the object. Most webdav providers are OK with this (it
is RFC compliant), but it causes 4shared to return 500 internal error.
Because Range requests are used in mounting, this meant that mounting
didn't work for 4shared.
This change truncates the Range request to the size of the object.
See: https://forum.rclone.org/t/cant-copy-use-files-on-webdav-mount-4shared-that-have-foreign-characters/21334/
Before this change, attempting to update an archive tier blob failed
with a 409 error message:
409 This operation is not permitted on an archived blob.
This change detects if we are overwriting a blob and either generates
the error (if `--azureblob-archive-tier-delete` is not set):
can't update archive tier blob without --azureblob-archive-tier-delete
Or deletes the blob first before uploading it again (if
`--azureblob-archive-tier-delete` is set).
Fixes#4819
Before this change if you attempted to list a remote set up with a SAS
URL outside its container then it would crash the Azure SDK.
A check is done to make sure the root is inside the container when
starting the backend which is usually enough, but when two SAS URL
based remotes are mounted in a union, the union backend attempts to
read paths outside the named container. This was causing a mysterious
crash in the Azure SDK.
This fixes the problem by checking to see if the container in the
listing is the one in the SAS URL before listing the directory and
returning directory not found if it isn't.
Before this change when NewObject was called the b2 backend would list
the directory that the object was in in order to find it.
Unfortunately list calls are Class C transactions and cost more.
This patch switches to using HEAD requests instead. These are Class B
transactions. It is then necessary to parse the headers from response
back into the data that we get from the listing. However B2 returns
exactly the same data, just in a different form.
Rclone will use the old directory listing method when looking for
files with versions as these can't be found via a HEAD request.
This change will particularly benefit --files-from, rclone serve
restic but most operations will see some benefit.
Starting September 30th, 2021, the Dropbox OAuth flow will no longer
return long-lived access tokens. It will instead return short-lived
access tokens, and optionally return refresh tokens.
This patch adds the token_access_type=offline parameter which causes
dropbox to return short lived tokens now.
Before this change rclone would upload the whole of multipart files
before receiving a message from dropbox that the path was too long.
This change hard codes the 255 rune limit and checks that before
uploading any files.
Fixes#4805
Before this change, rclone would retry files with filenames that were
too long again and again.
This changed recognises the malformed_path error that is returned and
marks it not to be retried which stops unnecessary retrying of the file.
See #4805
Before this change rclone was using the copy endpoint to copy large objects.
This can fail for large objects with this error:
Error 413: Copy spanning locations and/or storage classes could
not complete within 30 seconds. Please use the Rewrite method
This change makes Copy use the Rewrite method as suggested by the
error message which should be good for any size of copy.
Yandex appears to ignore mime types set as part of the PUT request or
as part of a PATCH request.
The docs make no mention of being able to set a mime type, so set
WriteMimeType=false indicating the backend can't set mime types on
uploaded files.
This is done by making fs.Config private and attaching it to the
context instead.
The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
Before this change, small objects uploaded with SSE-AWS/SSE-C would
not have MD5 sums.
This change adds metadata for these objects in the same way that the
metadata is stored for multipart uploaded objects.
See: #1824#2827
If rclone is configured for server side encryption - either aws:kms or
sse-c (but not sse-s3) then don't treat the ETags returned on objects
as MD5 hashes.
This fixes being able to upload small files.
Fixes#1824
This shouldn't be read as encouraging the use of math/rand instead of
crypto/rand in security sensitive contexts, rather as a safer default
if that does happen by accident.
For some reason the API started returning some integers as strings in
JSON. This is probably OK in Javascript but it upsets Go.
This is easily fixed with the `json:"name,size"` struct tag.
Before this change a circular symlink would cause rclone to error out from the listings.
After this change rclone will skip a circular symlink and carry on the listing,
producing an error at the end.
Fixes#4743
This adds a context.Context parameter to NewFs and related calls.
This is necessary as part of reading config from the context -
backends need to be able to read the global config.
It seems that when doing chunked uploads to onedrive, if the chunks
take more than 3 minutes or so to upload then they may timeout with
error 504 Gateway Timeout.
This change produces an error (just once) suggesting lowering
`--onedrive-chunk-size` or decreasing `--transfers`.
This is easy to replicate with:
rclone copy -Pvv --bwlimit 0.05M 20M onedrive:20M
See: https://forum.rclone.org/t/default-onedrive-chunk-size-does-not-work/20010/
Minor wording change to help for explicit and implicit FTPS flags. More consistent between flags. Add 's' to request because only one 'client' mentioned.
As reported in
https://github.com/rclone/rclone/issues/4660#issuecomment-705502792
After switching to a password callback function, if the ssh connection
aborts and needs to be reconnected then the user is-reprompted for their
password. Instead we now remember the password they entered and just give
that back. We do lose the ability for them to correct mistakes, but that's
the situation from before switching to callbacks. We keep the benefits
of not asking for passwords until the SSH connection succeeds (right
known_hosts entry, for example).
This required a small refactor of how `f := &Fs{}` was built, so we can
store the saved password in the Fs object
Before this change rclone returned the size from the Stat call of the
link. On Windows this reads as 0 always, however on unix it reads as
the length of the text in the link. This caused errors like this when
syncing:
Failed to copy: corrupted on transfer: sizes differ 0 vs 13
This change causes Windows platforms to read the link and use that as
the size of the link instead of 0 which fixes the problem.
Based on Issue 4087
https://github.com/rclone/rclone/issues/4087
Current behaviour is insecure. If the user specifies this value then we
switch to validating the server hostkey and so can detect server changes
or MITM-type attacks.
This allows files to be copied by ID from google drive. These can be
copied to any rclone remote and if the remote is a google drive then
server side copy will be attempted.
Fixes#3625
This type of error is unlikely to be an error that can be resolved by a retry,
and is triggered in #2296 by files with a timestamp before the unix epoch.
The maximum value for the --s3--copy-cutoff should be 5GiB as tested
with AWS S3.
However b2 have implemented this as 5GB rather than 5GiB so having the
default at 5 GiB makes the b2s3 server side copy of a large file by
default.
This patch sets the default to 4768 MiB which is slightly less than
5GB.
This should have very little effect on anything.
If in future rclone can lower this limit more if Copy can multithread.
See: https://forum.rclone.org/t/copying-files-within-a-b2-bucket/16680/76
Before this change the s3 multipart server side copy was not
preserving the metadata of the object. This was most noticeable
because the modtime was not preserved.
This change fetches the metadata from the object before starting the
copy and overwrites it if requires.
It will also mean any other metadata is preserved.
See: https://forum.rclone.org/t/copying-files-within-a-b2-bucket/16680/70
Before this change, when the above backends created a new backend they
didn't put it into the backend cache.
This meant that rc commands acting on those backends did not work.
This was fixed by making sure the backends use the backend cache.
See: https://forum.rclone.org/t/rclone-rc-backend-command-not-working-as-expected/18834
Before this change writing with the all policy deadlocked while
uploading.
This change fixes the problem by fixing the multi reader, closing the
pipes at the correct time with the correct error. This is factored
into a new function as it was used twice.
This patch also adds a new test which tests the all policies.
Before this fix we were reading the hash from the upload using the
string "ETag", however the go runtime normalises the tag into "Etag"
so we were in fact always reading an empty string.
This bug was introduced in
aeea4430d5 swift: efficiency: slim Object and reduce requests on upload
It was spotted by the integration tests.
The fix was just to use the canonical form "Etag" instead of "ETag".
In this commit
a2afa9aadd fs: Add directory to optional Purge interface
We failed to encrypt the directory name so the Purge failed.
This was spotted by the integration tests.
In this commit:
cbf3d43561 drive: fix missing items when listing using --fast-list / ListR
We introduced a bug where under specific circumstances it could cause
a "panic: send on closed channel".
This was caused by:
- rclone engaging the workaround from the commit above
- one of the listing routines returning an error
- this caused the `in` channel to be closed to stop the readers
- however the workaround was recycling stuff into the `in` channel at the time
- hence the panic on closed channel
This fix factors out the sending to the `in` channel into `sendJob`
and calls this both from the master go routine and the list
runners. `sendJob` detects the `in` channel being closed properly and
also deals correctly with contention on the `in` channel.
Fixes#4511
When using `rclone authorize` the hostname doesn't get set in the
config file.
This commit allows it to be set in the configurator and gives the user
a hint that it needs setting.
This reverts part of
151f03378f s3: fix upload of single files into buckets without create permission
This erroneously assumed that a HEAD request on a non existent object
would return "NotFound" if the bucket was found. In fact it returns
"NotFound" when the bucket isn't found also.
This will break the fix for #4297 - however that can be made to work
using the new --s3-assume-bucket-exists flag
Before this change, rclone was looking for the file without the
extension to see if it existed which meant that it never did.
This change checks the destination file exists firsts, before removing
the extension.
Google drive appears to no longer be copying the modification time of
google docs.
Setting the mod time immediately after the copy doesn't work either,
so this patch copies the object, waits for 1 second and then sets the
modtime.
Fixes#4517
This was only working for files in the root directory and wasn't
looking at the encoding.
This is fixed to use NewObject which takes both things into account
and it makes the share by ID instead of by path.
This problem was spotted by the integration tests.
Before this change we errored out if one upstream errored in Purge or
About.
This change checks for fs.ErrorDirNotFound and skips that backend in
this case.
1. adds SharedOptions data structure to oauthutil
2. adds config.ConfigToken option to oauthutil.SharedOptions
3. updates the backends that have oauth functionality
Fixes#2849
After uploading a multipart object, rclone deletes any unused parts.
Probably as part of the listing unification, the detection of the
parts beloning to the current upload was failing and calling Update
was deleting the parts for the current object.
This change fixes the detection and deletes all the old parts but none
of the new ones now.
Fixes#4075
Previous to this change rclone cached the looked up root_folder_id in
the root_folder_id config variable.
This has caused a lot of confusion and a few attempts at workarounds
and ultimately was a mistake.
This reverts rclone attempting to cache anything in root_folder_id and
returns that variable to be entirely user modified.
It gives a little hint in the debug that rclone could be sped up
slightly by setting it, but it is up to the user to think about
whether that would be OK or not.
Google drive root '': root_folder_id = "XXX" - save this in the config to speed up startup
It does not change root_folder_id itself, leaving this to the user.
See: https://forum.rclone.org/t/rclone-gdrive-no-longer-returning-anything/17215
- add a directory to the optional Purge interface
- fix up all the backends
- add an additional integration test to test for the feature
- use the new feature in operations.Purge
Many of the backends had been prepared in advance for this so the
change was trivial for them.
If this option is enabled, rclone will not set modtime of uploaded files and
the backend will return ModTimeNotSupported as its Precision.
Normally rclone updates modification time of files after they are done
uploading. This can cause permissions issues on Linux platforms when
rclone is copying to a CIFS mount where the user rclone is
running as does not own the file uploaded. If this option is enabled,
rclone will no longer update the modtime after copying a file.
See: https://forum.rclone.org/t/chtimes-error-on-local-mounted-copy/17784
This implements `rclone cleanup` to remove multipart uploads over 24
hours old. It also implements the backend command
`list-multipart-uploads` to see which ones are available and `cleanup`
to delete them with a configurable expiry interval.
See #4302
Before this fix, if an object had ID set and download_url was in use,
downloading the object would give this error:
failed to open for download: bucket example_bucket does not have file: /b2api/v1/b2_download_file_by_id (404 not_found)
After this fix we only download by ID if download_url is not set
See: https://forum.rclone.org/t/correct-format-for-rclone-b2-download-url-variable/15498
When we run MKCOL on 4shared on a directory that already exists, this
returns a 409/Conflict error. However this error code usually means
that the intermediate collections need creating.
The actual error code to return when trying to create a directory that
already exists isn't specified in the RFC, only that an error MUST be
returned and there are already 3 statuses checked in the code.
However using 409 makes rclone's usual strategy for making directories
fail and return the 409 error.
This patch tries the MKCOL and if it returns an unrecognised error
code, then calls PROPFIND on the directory to discover whether the
directory really exists or not.
This should also cover other WebDAV servers returning other error
messages we haven't accounted for in the code yet.
Before this change the cache backend contained its own routines for
mounting testing on that mount.
These tests are never run on the CI and cause a maintenance burden.
This commit removes the tests.
Previous to this fix if Region was not set and Endpoint was not set
then we set the endpoint to "https://s3.amazonaws.com/".
This is unecessary because if the Region alone isn't set then we set
it to "us-east-1" which has the same endpoint.
Having the endpoint set breaks the bucket region auto detection with
the error "Failed to update region for bucket: can't set region to
"xxx" as endpoint is set".
This fix removes that check.
At some point Purge stopped deleting directory markers. We don't have
an integration test for this so it went unnoticed.
This patch fixes the problem but doesn't introduce an integration test
as we don't have a framework for making directory markers yet.
Before this change, large objects which had had their contents deleted
would return "Object not found" and break the listing.
This change makes these objects appear as 0 sized entities so they can
be listed and deleted.
Pcloud appears to have opened up a new region and they are returning
the hostname in the oauth callback, thus
GET /?code=XXX&locationid=1&hostname=api.pcloud.com&state=XXX HTTP/1.1
GET /?code=XXX&locationid=2&hostname=eapi.pcloud.com&state=XXX HTTP/1.1
This isn't documented yet, however pCloud have confirmed that this is
the correct interpretation.
Rclone now reads the "hostname" parameter in the oauth callback and
stores it in the config file. It uses it for all subequent API calls.
Previous to this a dangling shortcut would error the directory
listing.
This patch makes dangling shortcuts appear as 0 sized objects in the
directory listing so they can be deleted. These objects can't be read
though.
For some objects the onedrive backend has been doing a server side
copy and a delete when a server side move would have worked OK.
This was caused by not detecting the home drive correctly (when it was
an empty string) and assuming that these transfers were cross drive.
This is fixed by comparing canonicalizing drive IDs before comparing them.
Currently credentials are required to download a public bucket file
which is not really necessary and makes automated usage more complex.
Add a new option "anonymous" which when enabled configures the gcs
backend to use an anonymous HTTP client. This of course only works
for read access and trying to write will lead to errors like that:
"googleapi: Error 401: Anonymous caller does not not have
storage.objects.create access to the Google Cloud Storage object.",
as expected. By default the anonymous access option is disabled so that
the GCS Application Default Credentials are still used by default as
before and an error is given if they can't be found.
Before this change rclone used the relative path from the current
working directory.
It appears that WS FTP doesn't like this and the openssh sftp tool
also uses absolute paths which is a good reason for switching to
absolute paths.
This change reads the current working directory at startup and bases
all file requests from there.
See: https://forum.rclone.org/t/sftp-ssh-fx-failure-directory-not-found/17436
Before this change the --local-no-updated flag would not error if the
files changed in size during the transfer. The file could still be
read beyond the size advertised though which caused problems with
certain backends.
After this change we attempt to provide a consistent view of the file
once it has been opened.
Once the file has had stat() called on it for the first time we
- Only transfer the size that stat gave
- Only checksum the size that stat gave
- Don't update the stat info for the file
This means that files that are extending can be transferred - rclone
will transfer the length it saw the first time it listed the file.
See: https://forum.rclone.org/t/transport-connection-broken/16494/21
In this commit
5c5ad6220 drive: fix --drive-impersonate with cached root_folder_id
We disabled the use of root_folder_id with --drive-impersonate to fix
a problem with a cached root_folder_id giving the wrong results.
This, alas, broke one users setup with a root_folder_id of
appDataFolder. Since this is identifiable and definitely couldn't have
been cached, we can safely skip this check in this case.
See: https://forum.rclone.org/t/rclone-gdrive-no-longer-returning-anything/17215/10
Before this change there was lots of duplicated code in all the
dircache using backends to support DirMove.
This change factors this code into the dircache library.
Dircache was changed to:
- Remove special cases for the root directory
- Remove Fatal errors
- Call FindRoot on behalf of the user wherever possible
- Bring up to modern Go standards
Backends were changed to:
- Remove calls to FindRoot
- Change calls to FindRootAndPath to FindPath
- Don't make special cases for the root
This fixes several corner cases, for example removing a non existent
directory if FindRoot hasn't been called.
Before this fix rclone would continually try to delete non empty
segment containers which made deleting lots of files very slow.
This fix makes rclone just try the delete once and then carry on which
was the original intent of the code before the retry logic got put in.
Before this change if the server sent us xml like this
```
<D:propstat>
<D:prop>
<g0:quota-available-bytes/>
<g0:quota-used-bytes/>
</D:prop>
<D:status>HTTP/1.1 404 Not Found</D:status>
</D:propstat>
```
Rclone would read the empty XML items as containing 0
After this fix we make sure that we have a value before using it.
Before this fix rclone v1.51 and 1.52 would incorrectly use the cached
root_folder_id when the --drive-impersonate flag was in use. This
meant that rclone could be looking up the wrong directory ID with
unpredictable results - usually all files apparently being missing.
This fix makes rclone look up the root_folder_id always when using
--drive-impersonate. It does this by clearing the root_folder_id and
making a NOTICE message that it is ignoring the cached value.
It also stops rclone caching the root_folder_id when using
--drive-impersonate.
See: https://forum.rclone.org/t/rclone-gdrive-no-longer-returning-anything/17215
Adding the expires parameter gives settings_error/not_authorized/.. errors.
The expires setting isn't in the documentation so this commit removes
it for now.
For SSH authentication, `key_pem` should both override `key_file`
and not require other SSH authentication methods to be set.
Prior to this fix, rclone would attempt to use an ssh-agent
when `key_pem` was the only SSH authentication method set.
Fixes#4240
Before this change we were setting the headers on the PUT
request for normal and multipart uploads. For normal uploads this caused the error
403 Forbidden: There were headers present in the request which were not signed
After this fix we set the headers in the object upload request itself
as the s3 SDK expects.
This means that we only support a limited range of headers
- Cache-Control
- Content-Disposition
- Content-Encoding
- Content-Language
- Content-Type
- X-Amz-Tagging
- X-Amz-Meta-
Note for the last of those are for setting custom metadata in the form
"X-Amz-Meta-Key: value".
This now works for multipart uploads and single part uploads
See also #59
This provides two things:
* It gives Storj insight into which uplink clients are using the
network.
* It facilitate rclone participating in the Tardigrade Open Source
Partner Program https://tardigrade.io/partner/
* s3: add `max_upload_parts` support
This allows to configure a maximum amount of chunks used to upload file:
- Support Scaleway which has a limit of 1k chunks currently
- Reduce a cost on S3 when each request costs some money at the expense of memory used
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
This adds expire and unlink fields to the PublicLink interface.
This fixes up the affected backends and removes unlink parameters
where they are present.
This factors copy out of SetModTime and Copy so it can be called from
both places.
This also reworks all the multipart uploading to use sync.Errgroup and
memory pooling like the other backends. This makes it more memory
efficient and handle errors better.
See: https://forum.rclone.org/t/copying-files-within-a-b2-bucket/16680/10
Before this change, attempting to upload a single file into an s3
bucket which did not have create permission gave AccessDenied: Access
Denied error when it tried to create the bucket.
This was masked until e2bf91452a was
fixed.
This fix marks the bucket as OK if a fetch on an object indicates it
is OK. This stops rclone thinking it has to create the bucket in the
first place.
Fixes#4297
This is caused by a bug in Google drive where, in some circumstances
querying for "(A in parents) or (B in parents)" returns nothing
whereas querying for "A in parents" and "B in parents" separately
works fine.
This has been reported here:
https://issuetracker.google.com/issues/149522397
This workaround detects this condition by seeing if a listing for more
than one directory at once returns nothing.
If it does then it retries each one individually.
This can potentially have a false positive if the user has multiple
empty directories which are queried at once. The consequence of this
will be that ListR is disabled for a while until the directories are
found to be actually empty in which case it will be re-enabled.
Fixes#3114 and Fixes#4289
This reverts commit 9e4b68a364.
This does not work as intended - it only changes docs files and to
make it change drive files would take an extra roundtrip.
I think the sematics of server side copy are now correct - additional
features should be added with a new flag.
See #4230
When wrapping a backend that supports Server Side Copy (e.g. `b2`, `s3`)
and configuring the `tmp_upload_path` option, the `cache` backend would
erroneously report that Server Side Copy/Move was not supported, causing
operations such as file moves to fail. This change fixes this issue
under these circumstances such that Server Side Copy will now be used
when the wrapped backend supports it.
Fixes#3206
Before this change we early exited the SetModTime call which means we
skipped reading the info about the file.
This change reads info about the file in the SetModTime call even if
we are skipping setting the modtime.
See: https://forum.rclone.org/t/sftp-and-set-modtime-false-error/16362
This commit changes the output of the rclone backend encode crypt: and
decode commands to output a plain list of decoded or encoded file
names.
This makes the command much more useful for command line scripting.
Enable fast list functions for union backend when:
- at least one of the upstreams supports fast list
- upstreams only consist of backends that support fast list and local backend.
Fixes#3000
When server side copying Google docs files we attempt to preserve the
description.
This patch makes it so that we use the default description if the
original description was empty.
See: 6fdd7149c1 (commitcomment-38008638)
Before this change, for some operations, eg rcat or copyto (of a file)
rclone would attempt to create the container when using a SAS URL
limited to a container.
After this change we assume the container does not need creating when
using a container SAS URL.
See: https://forum.rclone.org/t/rclone-rcat-azure-blob-container-sas-token-403-error/16286
This also fixes typo in the name of the function, and allows making
shortcuts from the root directory which are useful in cross drive
shortcut creation.
This also adds a basic suite of tests for creating listing, removing
shortcuts.
This means that we can return ErrorNotAFile when there is an object
with the same name as a directory rather than potentially creating a
duplicate name.
Before this code we were settig the headers on the PUT request. However this isn't where GCS needs them.
After this fix we set the headers in the object upload request itself.
This means that we only support a limited range of headers
- Cache-Control
- Content-Disposition
- Content-Encoding
- Content-Language
- Content-Type
- X-Goog-Meta-
Note for the last of those are for setting custom metadata in the form
"X-Goog-Meta-Key: value".
Before this change the local backend was returning file not found
errors for post transfer hashes for files which were moved. This was
caused by the routine which checks for the object being changed.
After this change we ignore file not found errors while checking to
see if the object has changed. If the hash has to be computed then a
file not found error will be thrown when it is opened, otherwise the
cached hash will be returned.
Before this change rclone would skip all shortcuts with a message
Ignoring unknown document type "application/vnd.google-apps.shortcut"
After this message rclone resolves the shortcuts by default to the
actual files that they point to. See the docs for more info.
The --drive-skip-shortcuts flag can be used to skip shortcuts.
Before this change the newObject* functions could return object=nil
with err=nil. The result of these functions are passed outside of the
backend code (eg in Copy, Move) and returning a nil object with a nil
error leads to crashes elsewhere as it breaks expectations.
After this change we return (nil, fs.ErrorObjectNotFound) in these
cases. The one place this is actually needd internally (when turning
items into listings) we detect that error and use it to mean skip the
directory item.
This problem was noticed while testing the shortcuts code. It
shouldn't happen normally but it is conceivable it could.
Apparently some tools (eg duplicati) upload the SHA1 in uppercase to
b2 to be stored in the `large_file_sha1` metadata. This patch forces
it to lower case.
According to Microsoft support this error can be caused by
> A timing/concurrency issue where the PUT operations are happening
> about the same time for a single blob. The Put Block List operation
> writes a blob by specifying the list of block IDs that make up the
> blob. In order to be written as part of a blob, a block must have
> been successfully written to the server in a prior Put Block
> operation.
>
> Documentation reference:
>
> https://docs.microsoft.com/en-us/rest/api/storageservices/put-block
>
> This error can happen when doing concurrent upload commits after you
> have started the upload but before you commit. In that case, the
> upload fails. The application can retry this error or attempt some
> other recovery action based on the required scenario.
See: https://forum.rclone.org/t/error-while-syncing-with-azure-blob-storage-x-ms-error-code-invalidbloborblock/15561
For a certain class of broken or missing image Google Photos puts an
image in the error message.
Before this fix we blindly chucked it into the error message.
After this fix we replace it with some sensible text.
Before this change crypt would not calculate hashes for files it was
uploading. This is because, in the general case, they have to be
downloaded, encrypted and hashed which is too resource intensive.
However this causes backends which need the hash first before
uploading (eg s3/b2 when uploading chunked files) not to have a hash
of the file. This causes cryptcheck to complain about missing hashes
on large files uploaded via s3/b2.
This change calculates hashes for the upload if the upload is coming
from a local filesystem. It does this by encrypting and hashing the
local file re-using the code used by cryptcheck. For a local disk this
is not a lot more intensive than calculating the hash.
See: https://forum.rclone.org/t/strange-output-for-cryptcheck/15437Fixes: #2809
Previously we had a map of pools for different chunk sizes.
In practice the mapping is not very useful and requires a lock.
Pools of size other that ChunkSize can only happen when we have a huge file (over 10k * ChunkSize).
We need to have a bunch of identically sized huge files.
In such case most likely ChunkSize should be increased.
The mapping and its lock is replaced with a single initialised pool for ChunkSize, in other cases pool is allocated and freed on per file basis.
Rclone can't safely delete files with multiple parents without
PATCHing the parents list. This can be done, but since multiple
parents are going away to be replaced by drive shortcuts we return an
error for now.
See #4013
Before this change we queries /me/drives for a list of the users
drives and asked the user to choose. Sometimes this does not return
the users main drive for reasons unknown.
After this change we query /me/drives first then /me/drive and add
that to the list of drives if it wasn't already there.
In 5470d34740 "backend/s3: use low-level-retries as the number
of SDK retries" we switched over to using the AWS SDK low level
retries instead of rclone's low level retry logic.
This had the unfortunate attempt that retrying listings to correct XML
Syntax errors failed on non S3 backends such as CEPH. The AWS SDK was
also retrying the XML Syntax error request which doesn't make sense.
This change turns off the AWS SDK retries in favour of just using
rclone's retry logic.
If chunk size is more than 250M (262,144,000 bytes) then API throws the following error:
Microsoft.SharePoint.Client.InvalidClientQueryException: The request message is too big. The server does not allow messages larger than 262144000 bytes.
Before this change rclone didn't use sparse files on Windows. This
means that when you downloaded a file with multithread download it
wrote the entire file with zeros first on the first write not at the
start of the file.
This change makes the file be sparse on Windows. Linux/macOS files
were already sparse.
Before this change shared with me items with multiple parents (ie most
of them that aren't in the root) would appear twice in the directory
listings.
This fixes the problem by doing an early exit for shared with me
items.
This bug was introduced here by removing some necessary code detecting
shared with me items at the root with no parents.
4453fa4ba6 "drive: fix --fast-list when using appDataFolder"
This fix reverts that part of the patch.
Fixes#4018
This adds a bit of missed locking around the uploaded info to fix the
concurrent map write.
All the other accesses have locking - this one must have got missed.
pureftpd has a bug where it sends messages like this
```
150-Accepted data connection\r\n
Response code: File status okay; about to open data connection (150)
Response arg: Accepted data connection
150 32768.0 kbytes to download\r\n
150 0.014 seconds (measured here), 1665.27 Mbytes per second\r\n
```
The last `150` is treated as a new response - the previous `150` should have been `150-`.
This means that rclone sees the `150 0.014 seconds (measured here),
1665.27 Mbytes per second` as a reply to the next message and reports
it as an error.
This fix ignores that specific message when it is received in the
`Close` method. It dumps the FTP connection after as it is out of
sync.
See: #3984Fixes#3445
Before this change if rclone failed to close a file download for some
reason it would leak a concurrency token. When all the tokens were
leaked then rclone would lock up.
This fix returns the concurrency token regardless of the error status.
Before this change if rclone failed to upload a file for some reason
it would leak a concurrency token. When all the tokens were leaked
then rclone would lock up.
The fix returns the concurrency token regardless of the error state.
Before this change if rclone failed to make an FTP connection for some
reason it would leak a concurrency token. When all the tokens were
leaked then rclone would lock up.
The fix returns the concurrency token if creating the FTP connection
returns an error.
Amazon S3 is built to handle different kinds of workloads.
In rare cases where S3 is not able to scale for whatever reason users
will face status 500 errors.
Main mechanism for handling these errors are retries.
Amount of needed retries varies for each different use case.
This change is making retries for s3 backend configurable by using
--low-level-retries option.
Currently each multipart upload allocated his own buffers, which after
file upload was garbaged. Next files couldn't leverage already allocated
memory which resulted in inefficent memory management. This change
introduces backend memory pool keeping memory chunks which can be
used during object operations.
Fixes#3967
The error code 500 Internal Error indicates that Amazon S3 is unable to handle the request at that time. The error code 503 Slow Down typically indicates that the requests to the S3 bucket are very high, exceeding the request rates described in Request Rate and Performance Guidelines.
Because Amazon S3 is a distributed service, a very small percentage of 5xx errors are expected during normal use of the service. All requests that return 5xx errors from Amazon S3 can and should be retried, so we recommend that applications making requests to Amazon S3 have a fault-tolerance mechanism to recover from these errors.
https://aws.amazon.com/premiumsupport/knowledge-center/http-5xx-errors-s3/
This removes the unused functions run.writeRemoteRandomBytes() run.writeObjectRandomBytes() run.listPath() Directory.parentRemote() and Persistent.dumpRoot().
Before this change, when uploading multipart files, onedrive would
sometimes return an unexpected 416 error and rclone would abort the
transfer.
This is usually after a 500 error which caused rclone to do a retry.
This change checks the upload position on a 416 error and works how
much of the current chunk to skip, then retries (or skips) the current
chunk as appropriate.
If the position is before the current chunk or after the current chunk
then rclone will abort the transfer.
See: https://forum.rclone.org/t/fragment-overlap-error-with-encrypted-onedrive/14001Fixes#3131
This hides:
- "use_created_date"
- "use_shared_date"
- "size_as_quota"
from the configurator (rclone config) as they interfere with normal
operations and shouldn't be set in a backend config.
They can still be put in the config file by hand and will still work
as variables, etc.
This adds some more docs to "size_as_quota" also.
Fixes#3912
Before this change we used non multipart uploads for files of unknown
size (streaming and uploads in mount). This is slower and less
reliable and is not recommended by Google for files smaller than 5MB.
After this change we use multipart resumable uploads for all files of
unknown length. This will use an extra transaction so is less
efficient for files under the chunk size, however the natural
buffering in the operations.Rcat call specified by
`--streaming-upload-cutoff` will overcome this.
See: https://forum.rclone.org/t/upload-behaviour-and-speed-when-using-vfs-cache/9920/
This error started happening after updating golang/x/crypto which was
done as a side effect of:
3801b8109 vendor: update termbox-go to fix ncdu command on FreeBSD
This turned out to be a deliberate policy of making
ssh.ParsePrivateKeyWithPassphrase fail if the passphrase was empty.
See: https://go-review.googlesource.com/c/crypto/+/207599
This fix calls ssh.ParsePrivateKey if the passphrase is empty and
ssh.ParsePrivateKeyWithPassphrase otherwise which fixes the problem.
If the --drive-stop-on-upload-limit flag is in effect this checks the
error string from Google Drive to see if it is the error you get when
you've breached your 750GB a day limit.
If so then it turns this error into a Fatal error which should stop
the sync.
Fixes#3857
In listings if the ID `appDataFolder` is used to list a directory the
parents of the items returned have the actual ID instead the alias
`appDataFolder`. This confused the ListR routine into ignoring all
these items.
This change makes the listing routine accept all parent IDs returned
if there was only one ID in the query. This fixes the `appDataFolder`
problem. This means we are relying on Google Drive to only return the
items we asked for which is probably OK.
Fixes#3851
The S3 ListObject API returns paginated bucket listings, with
"MaxKeys" items for each GET call.
The default value is 1000 entries, but for buckets with millions of
objects it might make sense to request more elements per request, if
the backend supports it. This commit adds a "list_chunk" option for
the user to specify a lower or higher value.
This commit does not add safe guards around this value - if a user
decides to request a too large list, it might result in connection
timeouts (on the server or client).
In AWS S3, there is a fixed limit of 1000, some other services might
have one too. In Ceph, this can be configured in RadosGW.
Before this patch we were failing to URL decode the NextMarker when
url encoding was used for the listing.
The result of this was duplicated listings entries for directories
with >1000 entries where the NextMarker was a file containing a space.
Before this change we used the same (relatively low limits) for server
side copy as we did for multipart uploads. It doesn't make sense to
use the same limits since no data is being downloaded or uploaded for
a server side copy.
This change introduces a new parameter --s3-copy-cutoff to control
when the switch from single to multipart server size copy happens and
defaults it to the maximum 5GB.
This makes server side copies much more efficient.
It also fixes the erroneous error when trying to set the modification
time of a file bigger than 5GB.
See #3778
Before this change multipart copies were giving the error
Range specified is not valid for source object of size
This was due to an off by one error in the range source introduced in
7b1274e29a "s3: support for multipart copy"
Before this change rclone used "Authorization: BEARER token". However
according the the RFC this should be "Bearer"
https://tools.ietf.org/html/rfc6750#section-2.1
This changes it to "Authorization: Bearer token"
Fixes#3751 and interop with Salesforce Webdav server
When using nextcloud, before this change we only uploaded one of SHA1
or MD5 checksum in the OC-Checksum header with preference to SHA1 if
both were set.
This makes the MD5 checksums read as empty string which makes syncing
with checksums less useful than they should be as all the MD5
checksums are blank.
This change makes it so that we only upload the SHA1 to nextcloud.
The behaviour of owncloud is unchanged as owncloud uses the checksum
as an upload integrity check only and calculates its own checksums.
See: https://forum.rclone.org/t/how-to-specify-hash-method-to-checksum/13055
This also corrects the symlink detection logic to only check symlink
files. Previous to this it was checking all directories too which was
making it do more stat calls than was necessary.
Before this change we forgot to URL decode the X-Object-Manifest in a dynamic large object.
This problem was introduced by 2fe8285f89 "swift: reserve
segments of dynamic large object when delete objects in container what
was enabled versioning."
For few commands, RClone counts a error multiple times. This was fixed by
creating a new error type which keeps a flag to remember if the error has
already been counted or not. The CountError function now wraps the original
error eith the above new error type and returns it.
Before this change rclone used the team_drive ID as the root if set
even if the root_folder_id was set too.
This change uses the root_folder_id in preference over the team_drive
which restores the functionality.
This problem was introduced by ba7c2ac443Fixes#3742
We attempt to find the ID of the root folder by doing a GET on the
folder ID "root". With scope "drive.files" this fails with a 404
message.
After this change if we get the 404 message, we just carry on using
"root" as the root folder ID and we cache that for future lookups.
This means that changenotify messages will not work correctly in the
root folder but otherwise has minor consequences.
See: https://forum.rclone.org/t/fresh-raspberry-pi-build-google-drive-404-error-failed-to-ls-googleapi-error-404-file-not-found/12791
Before this change rclone would allow the user to stream (eg with
rclone mount, rclone rcat or uploading google photos or docs) 5TB
files. This meant that rclone allocated 4 * 525 MB buffers per
transfer which is way too much memory by default.
This change makes rclone use the configured chunk size for streamed
uploads. This is 5MB by default which means that rclone can stream
upload files up to 48GB by default staying below the 10,000 chunks
limit.
This can be increased with --s3-chunk-size if necessary.
If rclone detects that a file is being streamed to s3 it will make a
single NOTICE level log stating the limitation.
This fixes the enormous memory usage.
Fixes#3568
See: https://forum.rclone.org/t/how-much-memory-does-rclone-need/12743
Before this fix we neglected to add the shared drive ID to the request
when asking for an initial change notify token and this caused a lot
more results to be returned than was necessary.
When we changed recursive lists to use --fast-list by default this
broke listing with --drive-shared-with-me from the root.
This turned out to be an unwarranted assumption in the ListR code that
all items would have a parent folder that we had searched for - this
isn't true for shared with me items.
This was fixed when using --drive-shared-with-me to give items that
didn't have any parents a synthetic parent.
Fixes#3639
Before this change we used the id "root" as an alias for the root drive ID.
However this causes problems when we receive IDs back from drive which
are not in this format and have been expanded to their canonical ID.
This change looks up the ID "root" and stores it in the
"drive_folder_id" parameter in the config file.
This helps with
- Notifying changes at the root
- Files shared with me at the root
See #3639
Before this change when rclone was compiled with go1.13 it used HTTP/2
to contact drive by default.
This causes lockups and INTERNAL_ERRORs from the HTTP/2 code.
This is a workaround disabling the HTTP/2 code on an option.
It can be re-enabled with `--drive-disable-http2=false`
See #3631
Before this change we silently skipped uploads to dropbox of
disallowed file names. However this then caused "corrupted on
transfer" errors because the sizes were wrong.
After this change we return an no retry error which will mean that the
sync fails (as it should - not all files were uploaded) but no
unecessary retries happened.
This works around a bug in Ceph which doesn't encode CommonPrefixes
when using URL encoded directory listings.
See: https://tracker.ceph.com/issues/41870
changes:
- chunker: remove GetTier and SetTier
- remove wdmrcompat metaformat
- remove fastopen strategy
- make hash_type option non-advanced
- adverise hash support when possible
- add metadata field "ver", run strict checks
- describe internal behavior in comments
- improve documentation
note:
wdmrcompat used to write file name in the metadata, so maximum metadata
size was 1K; removing it allows to cap size by 200 bytes now.
Note: chunker implements many irrelevant methods (UserInfo, Disconnect etc),
but they are required by TestIntegration/FsCheckWrap and cannot be removed.
Dropped API methods: MergeDirs DirCacheFlush PublicLink UserInfo Disconnect OpenWriterAt
Meta formats:
- renamed old simplejson format to wdmrcompat.
- new simplejson format supports hash sums and verification of chunk size/count.
Change list:
- split-chunking overlay for mailru
- add to all
- fix linter errors
- fix integration tests
- support chunks without meta object
- fix package paths
- propagate context
- fix formatting
- implement new required wrapper interfaces
- also test large file uploads
- simplify options
- user friendly name pattern
- set default chunk size 2G
- fix building with golang 1.9
- fix ci/cd on a separate branch
- fix updated object name (SyncUTFNorm failed)
- fix panic in Box overlay
- workaround: Box rename failed if name taken
- enhance comments in unit test
- fix formatting
- embed wrapped remote rather than inherit
- require wrapped remote to support move (or copy)
- implement 3 (keep fstest)
- drop irrelevant file system interfaces
- factor out Object.mainChunk
- refactor TestLargeUpload as InternalTest
- add unit test for chunk name formats
- new improved simplejson meta format
- tricky case in test FsIsFile (fix+ignore)
- remove debugging print
- hide temporary objects from listings
- fix bugs in chunking reader:
- return EOF immediately when all data is sent
- handle case when wrapped remote puts by hash (bug detected by TestRcat)
- chunked file hashing (feature)
- server-side copy across configs (feature)
- robust cleanup of temporary chunks in Put
- linear download strategy (no read-ahead, feature)
- fix unexpected EOF in the box multipart uploader
- throw error if destination ignores data
When used with v2_auth = true, PresignRequest doesn't return
signed headers, so remote dest authentication would be fail.
This commit copying back HTTPRequest.Header to headers.
Tested with RiakCS v2.1.0.
Signed-off-by: Anthony Rusdi <33247310+antrusd@users.noreply.github.com>
- Read the storage class for each object
- Implement SetTier/GetTier
- Check the storage class on the **object** before using SetModTime
This updates the fix in 1a2fb52 so that SetModTime works when you are
using objects which have been migrated to GLACIER but you aren't using
GLACIER as a storage class.
Fixes#3522
Before this change we used PATCH on the object to update the metadata.
Apparently this requires the "full_control" scope which Google were
unhappy with in their oauth review.
This changes it to update the metadata by copying the object ontop of
itself (which is the way s3 works). This can be done with normal
permissions.
This fixes a crash on the google photos backend when an error is
returned from the rest.Call function.
This turned out to be a mis-understanding of the rest docs so
- improved rest.Call docs
- fixed mis-understanding in google photos backend
- fixed similar mis-understading in onedrive backend
- change the interface of listBuckets() removing dir parameter and adding context
- add makeBucket() and use in place of Mkdir("")
- this fixes some corner cases in Copy/Update
- mark all the listed buckets OK in ListR
Thanks to @yparitcher for the review.
Before this change, if the caller didn't provide a hint, we would
calculate all hashes for reads and writes.
The new whirlpool hash is particularly expensive and that has become noticeable.
Now we don't calculate any hashes on upload or download unless hints are provided.
This means that some operations may run slower and these will need to be discovered!
It does not affect anything calling operations.Copy which already puts
the corrects hints in.
When using the VFS with swift and --swift-no-chunk, PutStream was
returning objects with size -1 which was causing corrupted transfer
messages.
This was fixed by counting the bytes transferred in a streamed file
and updating the metadata with that.
This was factored from fstest as we were including the testing
enviroment into the main binary because of it.
This was causing opening the browser to fail because of 8243ff8bc8.
In 53a1a0e3ef we started returning non nil from NewObject when
an object isn't found. This breaks the integration tests and the API
expected of a backend.
This fixes it.
Introduce stats groups that will isolate accounting for logically
different transferring operations. That way multiple accounting
operations can be done in parallel without interfering with each other
stats.
Using groups is optional. There is dedicated global stats that will be
used by default if no group is specified. This is operating mode for CLI
usage which is just fire and forget operation.
For running rclone as rc http server each request will create it's own
group. Also there is an option to specify your own group.
Configuration time option to disable the above for if using Dropbox (does not
allow setting mtime on copy) or Amazon Drive (neither on upload nor on copy).
Before this change rclone was sending a MimeType in the requests for
server side Move and Copy.
The conjecture is that if you attempt to set the MimeType to something
different in a Copy then Google Drive has to do an actual copy of the
file data. This takes a very long time (since it is large) and fails
after a 90s timeout.
After the change we no longer set the MimeType in Move or Copy and the
copies happen instantly and correctly.
Many thanks to @darthShadow for discovering that this was causing the
problem.
Fixes#3070Fixes#3033Fixes#3300Fixes#3155
This was started by Fionera, finished off by Laura with fixes and more
docs from Nick.
Co-authored-by: Fionera <fionera@fionera.de>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
* azureblob - Add support for Azure Storage Emulator to test things locally.
Testing - Verified changes by testing manually.
* docs: update azureblob docs to reflect support of storage emulator
This bug was introduced as part of adding context to the backends and
slipped through the net because the About call did not have an
interface assertion in the sftp backend.
I checked there were no other missing interface assertions on all the
optional methods on all the backends.
- Change rclone/fs interfaces to accept context.Context
- Update interface implementations to use context.Context
- Change top level usage to propagate context to lover level functions
Context propagation is needed for stopping transfers and passing other
request-scoped values.
Before this change rclone attempted to set the "updated" field in
uploaded objects to the modification time.
However when this modification time was before 1970, google drive
would return the rather cryptic error:
googleapi: Error 400: Invalid value for UnsignedLong: -42000, invalid
However API docs: https://cloud.google.com/storage/docs/json_api/v1/objects#resource
state the "updated" field is read only and tests confirm that. Even
though the field is read only, it looks like Google parses it.
This change therefore removes the attempt to set the "updated" field
(which was doing nothing anyway) and fixes the problem uploading pre
1970 files.
See #3196 and https://forum.rclone.org/t/invalid-value-for-unsignedlong-file-missing-date-modified/3466
In #2728 and 55b9a4e we decided to allow server side operations
between google drives with different configurations.
This works in some cases (eg between teamdrives) but does not work in
the general case, and this caused breakage in quite a number of
people's workflows.
This change makes the feature conditional on the
--drive-server-side-across-configs flag which defaults to off.
See: https://forum.rclone.org/t/gdrive-to-gdrive-error-404-file-not-found/9621/10Fixes#3119
Under Linux, rclone attempts to preallocate files for efficiency.
Before this change, pre-allocation would fail on ZFS with the error
Failed to pre-allocate: operation not supported
After this change rclone tries a different flag combination for ZFS
then disables pre-allocate if that doesn't work.
Fixes#3066
Before this change rclone would fail with
Failed to set modification time: InvalidObjectState: Operation is not valid for the source object's storage class
when attempting to set the modification time of an object in GLACIER.
After this change rclone will re-upload the object as part of a sync if it needs to change the modification time.
See: https://forum.rclone.org/t/suspected-bug-in-s3-or-compatible-sync-logic-to-glacier/10187
Before this change, rclone would return an error from the listing if
there was an unreadable directory, or if there was a problem stat-ing
a directory entry. This was frustrating because the command
completely aborts at that point when there is work it could do.
After this change rclone lists the directories and reports ERRORs for
unreadable directories or problems stat-ing files, but does return an
error from the listing. It does set the error flag which means the
command will fail (and objects won't be deleted with `rclone sync`).
This brings rclone's behaviour exactly in to line with rsync's
behaviour. It does as much as possible, but doesn't let the errors
pass silently.
Fixes#3179
Before this change we calculated all possible hashes for the file when
the `Hashes` method was called.
After we only calculate the Hash requested.
Almost all uses of `Hash` just need one checksum. This will slow down
`rclone lsjson` with the `--hash` flag. Perhaps lsjson should have a
`--hash-type` flag.
However it will speed up sync/copy/move/check/md5sum/sha1sum etc.
Before it took 12.4 seconds to md5sum a 1GB file, after it takes 3.1
seconds which is the same time the md5sum utility takes.
This fixes rclone returning `listing failed: strconv.ParseInt` errors
when listing files which have a malformed `src_last_modified_millis`.
This is uploaded by the client so care is needed in interpreting it as
it can be malformed.
Fixes#3065
In as many methods as possible we attempt to obey the Retry-After
header where it is provided.
This means that when objects are being requested from OVH cold storage
rclone will sleep the correct amount of time before retrying.
If the sleeps are short it does them immediately, if long then it
returns an ErrorRetryAfter which will cause the outer retry to sleep
before retrying.
Fixes#3041
This implements the Expiry interface so token expiry works properly
This change makes sure that this change from the swift library works
correctly with rclone's custom authenticator.
> Renew the token 60s before the expiry time
>
> The v2 and v3 auth schemes both return the expiry time of the token,
> so instead of waiting for a 401 error, renew the token 60s before this
> time.
>
> This makes transfers more efficient and also works around a bug in
> CEPH which returns 403 instead of 401 when the token expires.
>
> http://tracker.ceph.com/issues/22223
Some WebDAV servers return an empty Available and Used which parses as 0.
This caused About to return the Total as 0 which can confused mounted
file systems.
After this change we ignore the result if Available and Used are both 0.
See: https://forum.rclone.org/t/windows-mounted-webdav-drive-has-no-free-space/8938
Before this change a race condition existed in mkdir
- the directory was attempted to be created
- the parent didn't exist so it failed
- the parent was created
- the directory was created again
The last step failed as the directory was created in a different thread.
This was fixed by checking the error messages of MKCOL for both
directory creations, rather than only the first.
Before this change a range request on a 0 length file would fail
$ rclone cat --head 128 drive:test/emptyfile
ERROR : open file failed: googleapi: Error 416: Request range not satisfiable, requestedRangeNotSatisfiable
To fix this we remove Range: headers on requests for zero length files.
This introduces a new config variable bucket_policy_only. If this is
set then rclone:
- ignores ACLs set on buckets
- ignores ACLs set on objects
- creates buckets with Bucket Policy Only set
Fall back to default application credentials when all other credentials sources fail
This change allows users with default application credentials
configured (notably when running on google compute instances) to
dispense with explicitly configuring google cloud storage credentials
in rclone's own configuration.
This enables MD5 checksum calculation and publication when uploading file above the "Cutoff" limit.
It was explictely ignored in case of multi-block (a.k.a. multipart) uploads to Azure Blob Storage.
Make the pacer package more flexible by extracting the pace calculation
functions into a separate interface. This also allows to move features
that require the fs package like logging and custom errors into the fs
package.
Also add a RetryAfterError sentinel error that can be used to signal a
desired retry time to the Calculator.
Bitrix Site Manager emits `<D:resourcetype><collection/></D:resourcetype>`
missing the namespace on the `collection` tag. This causes the item
to be identified as a file instead of a directory.
To work around this look at the Microsoft extension prop
`iscollection` which seems to be emitted as well.
Before this change any attempt to access a google doc in an rclone
mount would give the error "partial downloads are not supported while
exporting Google Documents" as the mount uses ranged requests to read
data.
This implements ranged requests for a limited number of scenarios,
just enough so that Google docs can be cat-ed from an rclone mount.
When they are cat-ed then they receive their correct size also.
Before this change the union remote was using whether the writable
union could poll for changes to decide whether the union mount could
poll for changes.
The fix causes the union backend to signal it can poll for changes if
**any** of the remotes can poll for changes.
Before this change it was setting the modification times of the things
that the symlinks pointed to.
Note that this is only implemented for unix style OSes. Other OSes
will not attempt to set the modification time of a symlink.
If the upload concurrency is set > 1 then the hash becomes corrupted.
The upload is fine, and can be downloaded fine, however the hash is no
longer the md5sum of the object. It is not known whether this is
rclone's fault or a bug at QingStor.
Before this change if ContentLength was set in the options but 0 then
we would upload using chunked encoding. Fix this to always upload
with a "Content-Length" header even if the size is 0.
Remove workarounds for this from b2 and onedrive backends.
This fixes the issue for the webdav backend described here:
https://forum.rclone.org/t/code-500-errors-with-webdav-nextcloud/8440/
Before this change azureblob would attempt to create already existing
containers. This causes problems with limited permissions keys.
This change checks the container exists before trying to create it in
the same way the s3 backend does. This uses no more requests in the
usual case of the container existing.
See: https://forum.rclone.org/t/copying-individual-files-to-azure-blob-storage/8397
Before this change buckets were created with the same ACL as objects.
After this change, the user can set just --s3-acl to set the ACL of
buckets and objects, or use --s3-bucket-acl as well to have a
different ACL used for bucket creation.
This also logs at INFO level the creation and deletion of buckets.
* drive: don't run teamdrive config if auto confirm set
* onedrive: don't run extra config if auto confirm set
* make Confirm results customisable by config
Fixes#1010
The existing s3 backend passed all integration tests with OSS provided
`force_path_style = false`.
This makes sure that is so and adds documentation and configuration
for OSS.
Thanks to @luolibin for their work on the OSS backend which we ended
up not needing.
Fixes#1641Fixes#1237
The time format provided by webdav servers seems to vary wildly from
that specified in the RFC - rclone already parses times in 5 different
formats!
If an unparseable time is found, then fail softly logging an ERROR
(just once) but returning the epoch.
This will mean that webdav servers with bad time formats will still be
usable by rclone.
Before this fix rclone would just use the authorised bucket regardless
of what bucket you put on the command line.
This uses the new `bucketName` response in the API and checks that the
user is using the correct bucket name to avoid accidents.
Fixes#2839
Before this fix the http backend was returning the wrong error code
when files were not found. This was causing --files-from to error on
missing files instead of skipping them like it should.
The `cleanup` command will delete unfinished large file uploads that
were started more than a day ago (to avoid deleting uploads that are
potentially still in progress).
Fixes#2617
Increasing the --s3-upload-concurrency to 4 (from 2) gives an
additional 45% throughput at the cost of 10MB extra memory per transfer.
After testing the upload perfoc
Before this change rclone would use multipart uploads for any size of
file. However multipart uploads are less efficient for smaller files
and don't have MD5 checksums so it is advantageous to use single part
uploads if possible.
This implements single part uploads for all files smaller than the
upload_cutoff size. Streamed files must be uploaded as multipart
files though.
Before this change we used Remove to remove directories. This works
fine on Unix based systems but not so well on Windows based ones.
Swap to using RemoveDirectory instead.
When a container is deleted, a container with the same name cannot be
created for at least 30 seconds; the container may not be available
for more than 30 seconds if the service is still processing the
request.
We sleep so that we wait at most 60 seconds. This is mostly useful in
the integration tests where containers get deleted and remade
immediately.
Get rid of the api client and use rest/pacer for all API calls
Add Copy, Move, DirMove, PublicLink, About optional interfaces
Improve general error handling
Remove ListR for now due to inconsitent behaviour
fixes#2586, progress on #2740 and #2178
Before this change backend integration tests depended on each other,
so tests could not be retried.
After this change we nest tests to ensure that tests are provided with
the starting state they expect.
Tell the integration test runner that it can retry backend tests also.
This also includes bin/test_independence.go which runs each test
individually for a backend to prove that they are independent.
Wasabi has two location, US East and US West, with different endpoint URLs.
When configuring S3 to use Wasabi, provide the endpoint information for both
locations.
Before this change Rmdir would check the root rather than the
directory specified for being empty and return "directory not empty"
when it shouldn't have done.
When the env_auth option is enabled, the AWS SDK's session constructor
now loads configuration from ~/.aws/config and environment variables,
and credentials per the selected (or default) AWS_PROFILE's settings.
This is accomplished by **NOT** including any Credential provider in the
aws.Config passed to the session constructor: If the Config.Credentials
is non-nil, that will always be used and the user's configuration re
role_arn, credential_source, source_profile, etc... from the shared
config will be completely ignored.
(The conditional creation and configuration of the stscreds Credential
provider is complicated enough that it is not worth re-creating that
logic.)
Before this change the ACL for objects which were server side copied
was left at the default "private" settings. S3 doesn't copy the ACL
from the source when you copy an object, you have to set it afresh
which is what this does.
Until https://github.com/Azure/azure-storage-blob-go/pull/75 is merged
the SDK can't upload a single blob of exactly the chunk size, so
upload files of this size with a multpart upload as a work around.
The previous fix for this 6a773289e7 turned out to cause problems
uploading files with maximum chunk size so needed to be redone.
Fixes#2653
Before this change the Features() method would return a different Fs
to that the Features() method was called on if the remote was
instantiated on a file.
The practical effect of this is that optional features, eg `rclone
about` wouldn't work properly when called on a file, and likely this
has been causing low level problems for users of these backends for
ages.
Ideally there would be a test for this, but it turns out that this is
really hard, so instead of that all the backends have been converted
to not copy the Fs and a big warning comment inserted for future
readers.
Fixes#2182
Use the same function to join the root paths for the wrapping remotes
alias, cache and crypt.
The new function fspath.JoinRootPath is equivalent to path.Join, but if
the first non empty element starts with "//", this is preserved to allow
Windows network path to be used in these remotes.
Implement optional interfaces
- Purge
- PutStream
- Copy
- Move
- DirMove
- DirCacheFlush
- ChangeNotify
- About
Make Hashes() return the intersection of all the hashes supported by the remotes
When moving a directory in drive, most of the time only a notification
for the directory itself is created, not the old or new parents.
This tires to find the old path in the dirCache and the new path with
the dirCache of the new parent, which can result in two notifications
for a moved directory.
Add a new flag to the drive backend to allow document conversions oni upload.
The existing --drive-formats flag has been renamed to --drive-export-formats.
The old flag is still working to be backward compatible.
Make use of the mime package to find matching extensions and mime types.
For simplicity, all extensions are now prefixed with "." to match the
mime package requirements.
Parsed extensions get converted if needed.
Before this change on Windows, files copied locally could become
heavily fragmented (300+ fragments for maybe 100 MB), no matter how
much contiguous free space there was (even if it's over 1TiB). This
can needlessly yet severely adversely affect performance on hard
disks.
This changes uses NtSetInformationFile to pre-allocate the space to
avoid this.
It does nothing on other OSes other than Windows.
Add --drive-v2-download-min-size flag to allow downloading files via the
drive v2 API. If files are greater than this flag, a download link is
generated when needed. The flag is disabled by default.
When combining the remote value and the root path, preserve the absence
or presence of the / at the beginning of the wrapped remote path.
e.g. a remote "cloud:" and root path "dir" becomes "cloud:dir" instead
of "cloud:/dir".
Fixes#2553
When combining the remote value and the root path, preserve the absence
or presence of the / at the beginning of the wrapped remote path.
e.g. a remote "cloud:" and root path "dir" becomes "cloud:dir" instead
of "cloud:/dir".
* Fix error handling in List and NewObject
* Fix Precision in case we have precision > time.Second
* Fix Features - all binary features are possible
* Fix integration tests using new test facilities
Uploads were broken because chunk size was set to zero. This was a
consequence of the backend config re-organization which meant that
chunk size had lost its default.
Sharing some backend config between swift and hubic fixes the problem
and means hubic gains its own --hubic-chunk-size flag.
The initial work on this was done by Oliver Heyme with updates from
Cnly.
Oliver Heyme:
* Changed to Microsoft graph
* Enable writing
* Added more options for adding a OneDrive Remote
* Better error handling
* Send modDate at create upload session and fix list children
Cnly:
* Simple upload API only supports max 4MB files
* Fix supported hash types for different drive types
* Fix unchecked err
Co-authored-by: Oliver Heyme <olihey@googlemail.com>
Co-authored-by: Cnly <minecnly@gmail.com>
Sometimes pcloud will leave a half uploaded file when the transfer
actually failed. This patch deletes the file if it exists.
This problem was spotted by the integration tests.
Before this change we were using the ChildCount in the Folder facet to
determine if a directory was empty or not. However this seems to be
unreliable, or updated asynchronously which meant that `rclone rmdir`
sometimes deleted directories that had files in.
This problem was spotted by the integration tests.
Listing the directory instead of relying on the ChildCount fixes the
problem and the integration tests, without changing the cost (one http
transaction).
This was causing errors which looked like this when copying a file to
the root of a drive:
mkdir \\?: The filename, directory name, or volume label syntax is incorrect.
This was caused by an incorrect path splitting routine which was
removing \ of the end of UNC paths when it shouldn't have been. Fixed
by using the standard library `filepath.Dir` instead.
Before this change if only one of storage_url or auth_token were
supplied then rclone would overwrite both of them when authenticating.
This effectively meant you could supply both of them or none of them
only.
Now rclone still does the authentication to read the missing
storage_url or auth_token then afterwards re-writes the auth_token or
storage_url back to what the user desired.
Fixes#2464
* Add docs for --jottacloud-md5-memory-limit
* Factor out readMD5 function and add tests
* Fix accounting
* Make sure temp file is deleted at the start (not Windows)
In e52ecba295 we forgot to unwrap and
re-wrap the accounting which mean the the accounting was no longer
first in the chain of readers. This lead to accounting inaccuracies
in remotes which wrap and unwrap the reader again.
Some webdav backends (eg rclone serve webdav) leave behind half
written files on error. This causes the integration tests to
fail. Here we remove the file if it exists.
Sometimes it takes many more commit retries than expected to commit a
multipart file, so split this number into its own config variable and
default it to 100 which should always be enough.
this change adds the depth parameter to listAll and readMetaDataForPath.
this allows recursive calls of these methods with a different depth
header.
Sharepoint won't list files if the depth header is != 0. If that is the
case, it will just return a error 404 although the file exists.
Since it is not possible to determine if a path should be a file or a
directory, rclone has to make a request with depth = 1 first. On success
we are sure that the path is a directory and the listing will work.
If this request returns error 404, the path either doesn't exist or it
is a file.
To be sure, we can try again with depth set to 0. If it still fails, the
path really doesn't exist, else we found our file.
Before this change the Part structure had an int for the Offset and
uploading large files would produce this error
json: cannot unmarshal number 2147483648 into Go struct field Part.offset of type int
Changing the field to an int64 fixes the problem.
This unifies the 3 methods of reading config
* command line
* environment variable
* config file
And allows them all to be configured in all places. This is done by
making the []fs.Option in the backend registration be the master
source of what the backend options are.
The backend changes are:
* Use the new configmap.Mapper parameter
* Use configstruct to parse it into an Options struct
* Add all config to []fs.Option including defaults and help
* Remove all uses of pflag
* Remove all uses of config.FileGet
This change includes removing older azureblob storage SDK, and getting
parity to existing code with latest blob storage SDK.
This change is also pre-req for addressing #2091
Go can't redirect PROPFIND requests properly, it changes the method to
GET, so we disable redirects when reading the metadata and assume the
object does not exist if we receive a redirect.
This is to work-around the qnap redirecting requests for directories
without /.
Previously this was reading a stale hash from the object leading to
broken integration tests.
This fixes these integration tests TestSyncDoesntUpdateModtime,
TestSyncAfterChangingFilesSizeOnly, TestSyncAfterChangingContentsOnly,
TestSyncWithUpdateOlder, TestSyncUTFNorm.
Now that https://issuetracker.google.com/issues/64468406 has been
fixed, we can remove part of the workaround which fixed#1675 -
019adc35609c2136
This will make queries marginally more efficient. We still need the
other part of the workaround since the `=` operator is case
insensitive.
Before this commit rclone's handling of symlinks and junction points
under Windows was broken. rclone treated them as files and attempted
to transfer them which gave the error "The handle is invalid".
Ultimately the cause of this was 3e43ff7414 which was a
workaround so files with reparse points (which are a kind of symlink)
would transfer correctly.
The solution implemented is to revert the above commit which will mean
that #614 will break again. However there is now a work-around (which
will be signaled by rclone) to use the -L flag which wasn't available
when the original commit was made.
Fixes#2336
In the drive v3 conversion we forgot the IncludeTeamDriveItems
parameter when calling the changes API. Adding it fixes the changes
polling with team drives.
This implementation hopefully can handle all error requests from the
onedrive for business authentication.
I have only tested it with the "domain in unmanaged state" error.
This was caused by using the sftp.File.Read method which resets the
streaming window after each call. Replacing it with sftp.File.WriteTo
and an io.Pipe fixes the problem bringing the speed to the same as the
sftp binary.
This checks the checksum of the streamed encrypted data against the
checksum of the encrypted object returned from the remote and returns
an error if it is different.
* Fix errcheck and golint warnings
* Remove unused constants and fix comments
* Parse error responses properly
* Fix Open with RangeOption
* Fix Move, Copy and DirMove
* Implement DirCacheFlush
* Check interfaces are correct
* Remove debugs and update overview
* Correct feature flags
* Pare replacement characters down to the minimum set
* Add to the integration tests
* Add Mkdir, Rmdir, Purge, Delete, SetModTime, Copy, Move, DirMove
* Update file size after upload
* Add Open seek
* Set private permission for new folder and uploaded file
* Add docs
* Update List function
* Fix UserSessionInfo struct
* Fix socket leaks
* Don’t close resp.Body in Open method
* Get hash when listing files
The official drive APIs seem to have trouble downloading large
documents sometimes.
This commit adds a --drive-alternate-export flag to use an different,
unofficial set of export URLS which seem to download large files OK.
Google cloud storage doesn't normally need retries, however certain
things (eg bucket creation and removal) are rate limited and do
generate 429 errors.
Before this change the integration tests would regularly blow up with
errors from GCS rate limiting bucket creation and removal.
After this change we low level retry all operations using the same
exponential backoff strategy as used in the google drive backend.
Before this fix team drives would return the drive quota which is
incorrect and mis-leading.
Team drives don't appear to have an API for reading the bytes used or
the quota so we now return that the quota and usage are unknown.
If md5sum/sha1sum fails we debug what it outputed on stderr and return
an empty hash indicating we didn't have a hash, rather than
hash.ErrUnsupported indicating that we don't support this hash type.
This fixes lots of ERROR messages for sftp and synology NAS which,
while it supports md5sum the SFTP paths and the SSH paths are
different so md5sum doesn't work.
We also stop disabling md5sum/sha1sum on errors since typically Hashes
is only checked at the start of a sync run and isn't expected to
change dynamically.
This enables the use of the SharePoint webdav endpoint provided by
OneDrive for Business or Office365 Education Accounts. It enables
unverified accounts to be accessed with rclone via webdav as it isn't
possible through the normal onedrive backend.
This integrates the https://github.com/hensur/onedrive-cookie-test
package to fetch the required cookies to authorize against the
SharePoint webdav endpoint.
Very large directories can have their sizes returned as floating point
numbers, eg `1.0034576985781e+14` from the box API.
Before this change this would fail to parse as an int64.
This change parses the size as a float64 instead which will be
perfectly accurate for sizes up to 2**56 which is about 9 PB.
It is unknown whether box themselves use a float64 as an intermediate
representation in the API or not - it seems likely.
Fixes#2261
* Implement about for:
* local, crypt, cache, drive, swift, hubic, onedrive, pcloud, dropbox
* Implement `--json` and `---full` flag for `rclone about`
* change About interface to return a Usage structure
* Remove operations.About as it is too thin an interface
* Implement Integration test
Relates to #1138 and #1564
This bug was introduced by the v3 API conversion in 07f20dd1fd.
The problem was that dircache.FindPath doesn't work for the root directory.
This adds an internal error for dircache.FindPath being called with
the root directory. This makes a failing test, which the fix to the
drive backend fixes.
This also improves the DirCache integration test.
These are AWS, Ceph, Dreamhost, IBM COS S3, Minio, Wasabi and Other.
This configures endpoints where known and makes sure config doesn't
appear where it isn't valid where possible.
This introduces a method of making provider specific configuration
within a remote. This is useful particularly in s3.
This commit does the basic configuration in S3 for IBM COS.
Before this change we lowercased the dropbox root directory. This was
likely a leftover from when we used to build a dictionary to translate
the cases of dropbox files. Now with the v2 API we can rely on
dropbox to do that for us, so we no longer need to lowercase the root.
This fixes issues using crypt with name obfuscation on dropbox.
Before this change asynchronous closes in cmount could cause sharing
violations under Windows on Remove which manifest themselves
frequently as test failures.
This change lets the Remove be retried on a sharing violation under
Windows.
Unfortunately multi part upload can't upload zero length files so
bring back the single part upload for zero length files only.
This was broken when we made all uploads multipart uploads.
The sftp library delivers the attributes of the symlink rather than
the object pointed to in directory listings, however when we use Stat
from the library it points to the objects.
Previous to this fix this caused items pointed to by symlinks to be
unusable.
After the fix both symlinked files and directories work as expected.
From testing it appears that CEPH no longer works properly with v2
auth and neither does Dreamhost, so update the docs anc configuration
to recommend v4 auth.
This is a problem when syncing a file which just needed its modtime
set with dropbox which can't set the mod time of a file without
re-uploading it.
Before this change we would delete the file, then the server side move
would fail moving the file to the backup-dir because it no longer
existed.
After this change the destination file is moved to the backup-dir
instead of being deleted and the new file is uploaded.
Fixes#2134
* All remotes now support RangeOption so remove SeekOption
* Correct off by one error as RangeOption arguments are inclusive.
* Use RangeSeek in preference to Seek if available
In a typical rclone copy to a bucket/container based remote, before
this change we were doing a list, followed by a HEAD of the bucket to
check it existed before doing the copy. The fact the list succeeded
means the bucket exists so mark it OK at that point.
Issue #1421
Before this change `rclone move localdir /mnt/different-fs` would
error. Now it falls back to moving individual files, which in turn
falls back to copying individual files across the filesystem boundary.
Because of a bug in the Onedrive API it will sometime report the wrong
size. If the size is wrong other remotes that depend on the size might
fail. To fix this we overwrite the objects size with the real size
from ContentLength header.
This was caused by inconsistent escaping of the URL in the prefix
check, so check the URL links back to the correct host and scheme
instead of the prefix check.
The decoded path check will catch any URLs which are outside of the
root.
This removes the old system of part accounting and replaces it with a
system of popping off the accounting reader and wrapping up new ones
as necessary.
This makes it much easier to carry the context down the chain of
wrapped readers and get the limiting as near as possible to the
output. This makes the accounting more accurate and the bandwidth
limiting smoother.
Fixes#2029 and Fixes#1443
This fixes uploads to existing files for Google Drive introduced by #2007.
Instead of updating the old file a new "Untitled" file would be created
in the root folder.
A Range request can never request 0 bytes however this change was made
to make a clearer signal that the limit means read to the end.
Add test and more documentation and fixup uses
It is unnecessary to notify the node.Parents, because a cahnge event is
generated for all involved files and folders in a move from d1/f1 to
d2/f1. There will be a event for d1, d2 and f1.
Additionally a duplicate notification is resolved when them empty string
is in pathsToClear.
Related to #2006
The purpose of this is to make it easier to maintain and eventually to
allow the rclone backends to be re-used in other projects without
having to use the rclone configuration system.
The new code layout is documented in CONTRIBUTING.