This is a very large change which turns the post Config function in
backends into a state based call and response system so that
alternative user interfaces can be added.
The existing config logic has been converted, but it is quite
complicated and folloup commits will likely be needed to fix it!
Follow up commits will add a command line and API based way of using
this configuration system.
It was discovered on some Android systems, the stat size of a symlink
is different to the size that readlink returns.
This was giving errors like this
transport connection broken: http: ContentLength=30 with Body length 28
There are enough exceptions to the size of readlink being different to
the size of stat that this patch now always does readlink to work out
the size of a symlink.
Since symlinks are relatively uncommon this shouldn't affect
performance too much and will mean that the size is always correct.
This deprecates the --local-zero-size-links flag which is now
effectively always enabled.
See: https://forum.rclone.org/t/problem-with-symlinks-and-links/23840/
Includes adding support for additional size input suffix Mi and MiB, treated equivalent to M.
Extends binary suffix output with letter i, e.g. Ki and Mi.
Centralizes creation of bit/byte unit strings.
v1.4.6 of uplink allows us to do a negative offset from the end of the
file. This removes a round trip when requesting the last N bytes of a
file.
Previous to v1.4.6 of uplink it wasn't possible to do a negative offset
on download. This meant that to fulfill the semantics of http range
headers it was necessary to first fetch the size of the object via a
stat call and compute absolute offset and length.
Restructuring of config code in v1.55 resulted in config
file being loaded early at process startup. If configuration
file is encrypted this means user will need to supply the password,
even when running commands that does not use config.
This also lead to an issue where mount with --deamon failed to
decrypt the config file when it had to prompt user for passord.
Fixes#5236Fixes#5228
Including the bucket name as part of the `fileNamePrefix` passed to
`b2_get_download_authorization` results in a link valid for objects that
have the bucket name as part of the object path; e.g.,
rclone link :b2:some-bucket/some-file
would result in a public link valid for the object
`some-bucket/some-file` in the `some-bucket` bucket (in rclone-remote
parlance, `:b2:some-bucket/some-bucket/some-file`). This will almost
certainly result in a broken link.
The B2 docs don't explicitly specify this behavior, but the example
given for `fileNamePrefix` provides some clarification.
See https://www.backblaze.com/b2/docs/b2_get_download_authorization.html.
This code removes the code added in
15d19131bd s3: use aws web identity role provider
This code no longer works because it doesn't initialise the
tokenFetcher - leading to a nil pointer crash.
The proper way to initialise this is with the
NewWebIdentityCredentials but it isn't clear where to get the other
parameters: roleARN, roleSessionName, path.
In the linked issue a user reports rclone working with EKS anyway, so
perhaps this code is no longer needed.
If it is needed, hopefully someone who knows AWS better will come
along and fix it!
See: https://forum.rclone.org/t/add-support-for-aws-sso/23569
Betweeen rclone v1.54 and v1.55 there was an approx 3x performance
regression when transferring to distant SFTP servers (in particular
rsync.net).
This turned out to be due to the library github.com/pkg/sftp rclone
uses. Concurrent writes used to be enabled in this library by default
(for v1.12.0 as used in rclone v1.54) but they are no longer enabled
(for v1.13.0 as used in rclone v1.55) for safety reasons and it is
necessary to enable them specifically.
The safety concerns are due to the uncertainty as to whether writes
come in order and whether a half completed file might have holes in
it. This isn't a problem for rclone since a) it doesn't restart
uploads and b) it has a post-transfer checksum test.
This change introduces a new flag `--sftp-disable-concurrent-writes`
to control the feature which defaults to false, meaning that
concurrent writes are enabled as in v1.54.
However this isn't quite enough to fix the problem as the sftp library
needs to be able to sniff the size of the stream from the reader
passed in, so this also adds a `Size` interface to the reader to
enable this. This involved a patch to the library.
The library was reverted to v1.12.0 for v1.55.1 - this patch installs
v1.13.0+master to fix the Size interface problem.
See: https://github.com/pkg/sftp/issues/426
Before this change, rclone checked to see if an object existed before
doing an upload by listing the destination directory. This was very
inefficient, especially with large directories.
After this change rclone uses the pre upload check API call which
checks to see if it is OK to upload an object, and also returns the ID
of an existing object which saves rclone having to do a directory
listing.
OneDrive randomly returns the error message: "InvalidAuthenticationToken: Unable to initialize RPS". These unexpected errors typically caused the entire rclone command to fail.
This work around recognizes these errors and marks them for a low level retry, that mostly succeeds. This will make rclone commands complete without being noticeable affected.
Fixes: #5270
With the file version format standardized in lib/version, `crypt` can
now treat the version strings separately from the encrypted/decrypted
file names. This allows --b2-versions to work with `crypt`.
Fixes#1627
Co-authored-by: Luc Ritchie <luc.ritchie@gmail.com>
Before this change rclone would auth over https even when the server
was configured with http.
Authing over http obviously isn't ideal, however this type of server
is on-premise and doesn't work over https.
PR #4266 modified ftpConnection to make ftp library into using
a custom dial function which is QoS aware and takes care of TLS.
However the ServerConn.Login function from the ftp library also needs
TLS config passed explicitly as a trigger for sending PSBZ and PROT
options to FTP server. This was not taken care of resulting in
failure to connect via FTP with implicit TLS.
This PR fixes that.
Fixes#5210
In
a3fcadddc8 sftp: close idle connections after --sftp-idle-timeout (1m by default)
Idle SFTP connections were closed after 1 minute. However due to the
way SSH multiplexes connections over a single SSH connection this
meant that if uploads or downloads went on for more than one minute
they failed with "EOF errors" as their underlying connection was
closed.
This fixes the problem by not clearing idle connections if there are
any transfers in progress.
Fixes#5197
This reverts the library update done in this commit.
713f8f357d sftp: fix "file not found" errors for read once servers
Reverting this commit triples the performance to a far away sftp server.
See: https://github.com/pkg/sftp/issues/426
Before this change when the context was cancelled (due to
--max-duration for example) this could deadlock when uploading
multipart uploads.
This change fixes the problem by introducing another go routine to
monitor the context and close the pipe with an error when the context
errors.
When reading files from B2 via cloudflare using --b2-download-url
cloudflare strips the Content-Length headers (presumably so it can
inject stuff into the body).
This caused rclone to think the file was corrupted as the length
didn't match.
The patch uses the old length read from the listing if there is no
Content-Length.
See: https://forum.rclone.org/t/b2-cloudflare-error-directory-not-found/23026
This commit broke the initialisation of the union backend
f17d7c0012 union: refactor to use fspath.SplitFs instead of fs.ParseRemote #4996
This patch fixes it.
Box recently changed their API, changing the case of returned API items
> On May 10th, 2021, as part of our continued infrastructure upgrade,
> Box's API response headers will standardize to return in a case
> insensitive manner, in line with industry best practices and our API
> documentation. Applications that are using these headers, such as
> "location" and "retry-after", will need to verify that their
> applications are checking for these headers in a case-insensitive
> fashion.
Rclone was reading the raw headers from the `http.Header` and not
using the `Get` accessor method which meant that it was sensitive to
case changes.
This fixes the problem by using the `Get` accessor method.
See: https://forum.rclone.org/t/box-backend-incompatible-with-box-api-changes-being-deployed/22972
If you exceed rate limits, dropbox tells you to wait for 300 seconds -
this is rather a long time for the user to be waiting for rclone to
finish, so emit a NOTICE level log instead of a DEBUG.
Some sftp servers don't allow the user to access the file after upload.
In this case the error message indicates that using
--sftp-set-modtime=false would fix the problem. However it doesn't
because SetModTime does a stat call which can't be disabled.
Update SetModTime failed: SetModTime stat failed: object not found
After upload this patch checks for an `object not found` error if
set_modtime == false and ignores it, returning the expected size of
the object instead.
It also makes SetModTime do nothing if set_modtime = false
https://forum.rclone.org/t/sftp-update-setmodtime-failed/22873
These were added by accident in
d9959b0271 drive: pass context on to drive SDK - this will help with cancellation
Which added lots of new Context() calls but duplicated some existing
ones.
Before this we just failed if the ftp connection or login failed.
This change adds a pacer just for the ftp connect and retries if the
connection failed to Dial or the login returns a 421 error.
This is implemented as a state machine parser so it can emit sensible
error messages.
It does not use the connection strings elsewhere in rclone yet - see
subsequent commits.
An optional fuzzer is implemented for the Parse function.
Before this change, when using an all create method with one of the
upstreams being read only, if there was an existing file on the read
only remote, it was impossible to update it.
This change detects that situation and creates the file on a
read/write upstream. This file will shadow the file on the read/only
upstream. If it is deleted the read only upstream file will be visible
again.
Fixes#4929
Before this fix using the epff policy could double close a channel.
The fix refactors the code to make that impossible and cancels any
running queries when the first query is found.
Before this change the config file needed to be explicitly reloaded.
This coupled the config file implementation with the backends
needlessly.
This change stats the config file to see if it needs to be reloaded on
every config file operation.
This allows us to remove calls to
- config.SaveConfig
- config.GetFresh
Which now makes the the only needed interface to the config file be
that provided by configmap.Map when rclone is not being configured.
This also adds tests for configfile
It introduces a new flag --sftp-disable-concurrent-reads to stop the
problematic behaviour in the SFTP library for read-once servers.
This upgrades the sftp library to v1.13.0 which has the fix.
This change checks the context whenever rclone might retry, and
doesn't retry if the current context has an error.
This fixes the pathological behaviour of `--max-duration` refusing to
exit because all the context deadline exceeded errors were being
retried.
This unfortunately meant changing the shouldRetry logic in every
backend and doing a lot of context propagation.
See: https://forum.rclone.org/t/add-flag-to-exit-immediately-when-max-duration-reached/22723
This change makes dedupe recursively count elements in same-named directories
and make the largest one primary. This allows to minimize the amount of data
moved (or at least the amount of API calls) when dedupe merges them.
It also adds a new fs.Object interface `ParentIDer` with function `ParentID` and
implements it for the drive and opendrive backends. This function returns
parent directory ID for objects on filesystems that allow same-named dirs.
We use it to correctly count sizes of same-named directories.
Fixes#2568
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
If you are using rclone a library you can decide to use the rclone
config file system or not by calling
configfile.LoadConfig(ctx)
If you don't you will need to set `config.Data` to an implementation
of `config.Storage`.
Other changes
- change interface of config.FileGet to remove unused default
- remove MustValue from config.Storage interface
- change GetValue to return string or bool like elsewhere in rclone
- implement a default config file system which panics with helpful error
- implement getWithDefault to replace the removed MustValue
- don't embed goconfig.ConfigFile so we can change the methods
This fixes the polling implementation for Dropbox, particularly
when using a scoped app. This also adds a lower end check for the
timeout, as I forgot to include that in the original implementation.
In this commit
fc5b14b620 s3: Added `--s3-disable-http2` to disable http/2
We created our own transport so we could disable http/2. However the
added function is called twice meaning that we create two HTTP
transports. This didn't happen with the original code because the
default transport is cached by fshttp.
Rclone normally does a PUT followed by a HEAD request to check an
upload has been successful.
With the two transports, the PUT and the HEAD were being done on
different HTTP transports. This means that it wasn't re-using the same
HTTP connection, so the HEAD request showed the previous object value.
This caused rclone to declare the upload was corrupted, delete the
object and try again.
This patch makes sure we only create one transport and use it for both
PUT and HEAD requests which fixes the problem with Wasabi.
See: https://forum.rclone.org/t/each-time-rclone-is-run-1-3-fails-2-3-succeeds/22545
Some storage providers e.g. S3 don't have an efficient rename operation.
Before this change, when chunker finished an upload, the server-side copy
and delete operations that renamed temporary chunks to their final names
could take a significant amount of time.
This PR records transaction identifier (versioning) in the metadata of
chunker composite objects striving to remove the need for rename
operations on such backends.
This approach will be triggered be the new "transactions" configuration
option, which can be "rename" (the default) or "norename".
We implement the new approach for uploads (Put operations).
The chunker Move operation still uses the rename operation of
underlying backend. Filling this gap is left for a later PR.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
Before this change, if folder level access permissions policy was in
use, with trailing `/` marking the folders then rclone would HEAD the
path without a trailing `/` to work out if it was a file or a folder.
This returned a permission denied error, which rclone returned to the
user.
Failed to create file system for "s3:bucket/path/": Forbidden: Forbidden
status code: 403, request id: XXXX, host id:
Previous to this change
53aa03cc44 s3: complete sse-c implementation
rclone would assume any errors when HEAD-ing the object implied it
didn't exist and this test would not fail.
This change reverts the functionality of the test to work as it did
before, meaning any errors on HEAD will make rclone assume the object
does not exist and the path is referring to a directory.
Fixes#4990
This implements polling support for the Dropbox backend. The Dropbox SDK dependency had to be updated due to an auth issue, which was fixed on Jan 12 2021. A secondary internal Dropbox service was created to handle unauthorized SDK requests, as is necessary when using the ListFolderLongpoll function/endpoint. The config variable was renamed to cfg to avoid potential conflicts with the imported config package.
Sharepoint 2016 returns status 204 to the purge request
even if the directory to purge does not really exist.
This change adds an extra check to detect this condition
and returns a proper error code.
The go-ntlmssp NTLM negotiator has to try various authentication methods.
Intermediate responses from Sharepoint have status code 401, only the
final one is different. When rclone runs a large operation in parallel
goroutines according to --checkers or --transfers, one of threads can
receive intermediate 401 response targeted for another one and returns
the 401 authentication error to the user.
This patch fixes that.
On-premises Sharepoint returns HTTP errors 400 or 500 in
reply to attempts to use file names with special characters
like hash, percent, tilde, invalid UTF-7 and so on.
This patch activates transparent encoding of such characters.
As per Microsoft documentation, Windows authentication
(NTLM/Kerberos/Negotiate) is not supported with HTTP/2.
This patch disables transparent HTTP/2 support when the
vendor setting is "sharepoint-ntlm". Otherwise connections
to IIS/10.0 can fail with HTTP_1_1_REQUIRED.
Co-authored-by: Georg Neugschwandtner <georg.neugschwandtner@gmx.net>
The most popular keyword for the Sharepoint in-house or company
installations is "On-Premises".
"Microsoft OneDrive account" is in fact just a Microsoft account.
Co-authored-by: Georg Neugschwandtner <georg.neugschwandtner@gmx.net>
Add new option option "sharepoint-ntlm" for the vendor setting.
Use it when your hosted Sharepoint is not tied to the OneDrive
accounts and uses NTLM authentication.
Also add documentation and integration test.
Fixes: #2171
S3 backend shared_credentials_file option wasn't working neither from
config option nor from command line option. This was caused cause
shared_credentials_file_provider works as part of chain provider, but in
case user haven't specified access_token and access_key we had removed
(set nil) to credentials field, that may contain actual credentials got
from ChainProvider.
AWS_SHARED_CREDENTIALS_FILE env varible as far as i understood worked,
cause aws_sdk code handles it as one of default auth options, when
there's not configured credentials.
This change adds the scopes rclone wants during the oauth request.
Previously rclone left these blank to get a default set.
This allows rclone to add the "members.read" scope which is necessary
for "impersonate" to work, but only when it is in use as it require
authorisation from a Team Admin.
See: https://forum.rclone.org/t/dropbox-no-members-read/22223/3
Some virtual filesystems (such as Google Drive File Stream) may
incorrectly set the actual file size equal to the preallocated space,
causing checksum and file size checks to fail.
This flag can be used to disable preallocation for local backends of
this type.
Before this change, if an application key limited to a prefix was in
use, with trailing `/` marking the folders then rclone would HEAD the
path without a trailing `/` to work out if it was a file or a folder.
This returned a permission denied error, which rclone returned to the
user.
Failed to create file system for "b2:bucket/path/":
failed to HEAD for download: Unknown 401 (401 unknown)
This change assumes any errors on HEAD will make rclone assume the
object does not exist and the path is referring to a directory.
See: https://forum.rclone.org/t/b2-error-on-application-key-limited-to-a-prefix/22159/
Before this change, if --b2-chunk-size was raised above 200M then this
error would be produced:
b2: upload cutoff: 200M is less than chunk size 1G
This change automatically reaises --b2-upload-cutoff to be the value
of --b2-chunk-size if it is below it, which stops this error being
generated.
Fixes#4475
Before this change, running
rclone backend copyid drive: ID file.txt
Failed with the error
command "copyid" failed: failed copying "ID" "file.txt": can't use empty string as a path
This fixes the problem.
This makes sure that partially uploaded large files are removed
unless the `--swift-leave-parts-on-error` flag is supplied.
- refactor swift.go
- add unit test for swift with chunk
- add unit test for large object with fail case
- add "-" to white list char during encode.
Assume the Stat size of links is zero (and read them instead)
On some virtual filesystems (such ash LucidLink), reading a link size via a
Stat call always returns 0.
However, on unix it reads as the length of the text in the link. This may
cause errors like this when syncing:
Failed to copy: corrupted on transfer: sizes differ 0 vs 13
Setting this flag causes rclone to read the link and use that as the size of
the link instead of 0 which in most cases fixes the problem.
Fixes#4950
Signed-off-by: Riccardo Iaconelli <riccardo@kde.org>
This patch changes to using the default page limit for listing
unfinished multpart uploads rather than 1000. 1000 is the maximum
specified in the docs, but setting anything larger than 200 gives an
error.
- fix test case FsNewObjectCaseInsensitive (PR #4830)
- continue PR #4917, add comments in metadata detection code
- add warning about metadata detection in user documentation
- change metadata size limits, make room for future development
- hide critical chunker parameters from command line
Before this patch chunker required that there is at least one
data chunk to start checking for a composite object.
Now if chunker finds at least one potential temporary or control
chunk, it marks found files as a suspected composite object.
When later rclone tries a concrete operation on the object,
it performs postponed metadata read and decides: is this a native
composite object, incompatible metadata or just garbage.
This includes an HDFS docker image to use with the integration tests.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
The current authentication scheme works without creating
a public download endpoint for a private bucket as in the B2 official blog.
On the contrary, if the existing authorization header gets duplicated
in the Cloudflare Workers script, one might receive 401 Unauthorized errors.
Before this change the webdav backend didn't truncate Range requests
to the size of the object. Most webdav providers are OK with this (it
is RFC compliant), but it causes 4shared to return 500 internal error.
Because Range requests are used in mounting, this meant that mounting
didn't work for 4shared.
This change truncates the Range request to the size of the object.
See: https://forum.rclone.org/t/cant-copy-use-files-on-webdav-mount-4shared-that-have-foreign-characters/21334/
Before this change, attempting to update an archive tier blob failed
with a 409 error message:
409 This operation is not permitted on an archived blob.
This change detects if we are overwriting a blob and either generates
the error (if `--azureblob-archive-tier-delete` is not set):
can't update archive tier blob without --azureblob-archive-tier-delete
Or deletes the blob first before uploading it again (if
`--azureblob-archive-tier-delete` is set).
Fixes#4819
Before this change if you attempted to list a remote set up with a SAS
URL outside its container then it would crash the Azure SDK.
A check is done to make sure the root is inside the container when
starting the backend which is usually enough, but when two SAS URL
based remotes are mounted in a union, the union backend attempts to
read paths outside the named container. This was causing a mysterious
crash in the Azure SDK.
This fixes the problem by checking to see if the container in the
listing is the one in the SAS URL before listing the directory and
returning directory not found if it isn't.
Before this change when NewObject was called the b2 backend would list
the directory that the object was in in order to find it.
Unfortunately list calls are Class C transactions and cost more.
This patch switches to using HEAD requests instead. These are Class B
transactions. It is then necessary to parse the headers from response
back into the data that we get from the listing. However B2 returns
exactly the same data, just in a different form.
Rclone will use the old directory listing method when looking for
files with versions as these can't be found via a HEAD request.
This change will particularly benefit --files-from, rclone serve
restic but most operations will see some benefit.
Starting September 30th, 2021, the Dropbox OAuth flow will no longer
return long-lived access tokens. It will instead return short-lived
access tokens, and optionally return refresh tokens.
This patch adds the token_access_type=offline parameter which causes
dropbox to return short lived tokens now.
Before this change rclone would upload the whole of multipart files
before receiving a message from dropbox that the path was too long.
This change hard codes the 255 rune limit and checks that before
uploading any files.
Fixes#4805
Before this change, rclone would retry files with filenames that were
too long again and again.
This changed recognises the malformed_path error that is returned and
marks it not to be retried which stops unnecessary retrying of the file.
See #4805
Before this change rclone was using the copy endpoint to copy large objects.
This can fail for large objects with this error:
Error 413: Copy spanning locations and/or storage classes could
not complete within 30 seconds. Please use the Rewrite method
This change makes Copy use the Rewrite method as suggested by the
error message which should be good for any size of copy.
Yandex appears to ignore mime types set as part of the PUT request or
as part of a PATCH request.
The docs make no mention of being able to set a mime type, so set
WriteMimeType=false indicating the backend can't set mime types on
uploaded files.
This is done by making fs.Config private and attaching it to the
context instead.
The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
Before this change, small objects uploaded with SSE-AWS/SSE-C would
not have MD5 sums.
This change adds metadata for these objects in the same way that the
metadata is stored for multipart uploaded objects.
See: #1824#2827
If rclone is configured for server side encryption - either aws:kms or
sse-c (but not sse-s3) then don't treat the ETags returned on objects
as MD5 hashes.
This fixes being able to upload small files.
Fixes#1824
This shouldn't be read as encouraging the use of math/rand instead of
crypto/rand in security sensitive contexts, rather as a safer default
if that does happen by accident.
For some reason the API started returning some integers as strings in
JSON. This is probably OK in Javascript but it upsets Go.
This is easily fixed with the `json:"name,size"` struct tag.
Before this change a circular symlink would cause rclone to error out from the listings.
After this change rclone will skip a circular symlink and carry on the listing,
producing an error at the end.
Fixes#4743
This adds a context.Context parameter to NewFs and related calls.
This is necessary as part of reading config from the context -
backends need to be able to read the global config.