In as many methods as possible we attempt to obey the Retry-After
header where it is provided.
This means that when objects are being requested from OVH cold storage
rclone will sleep the correct amount of time before retrying.
If the sleeps are short it does them immediately, if long then it
returns an ErrorRetryAfter which will cause the outer retry to sleep
before retrying.
Fixes#3041
This implements the Expiry interface so token expiry works properly
This change makes sure that this change from the swift library works
correctly with rclone's custom authenticator.
> Renew the token 60s before the expiry time
>
> The v2 and v3 auth schemes both return the expiry time of the token,
> so instead of waiting for a 401 error, renew the token 60s before this
> time.
>
> This makes transfers more efficient and also works around a bug in
> CEPH which returns 403 instead of 401 when the token expires.
>
> http://tracker.ceph.com/issues/22223
Some WebDAV servers return an empty Available and Used which parses as 0.
This caused About to return the Total as 0 which can confused mounted
file systems.
After this change we ignore the result if Available and Used are both 0.
See: https://forum.rclone.org/t/windows-mounted-webdav-drive-has-no-free-space/8938
Before this change a race condition existed in mkdir
- the directory was attempted to be created
- the parent didn't exist so it failed
- the parent was created
- the directory was created again
The last step failed as the directory was created in a different thread.
This was fixed by checking the error messages of MKCOL for both
directory creations, rather than only the first.
Before this change a range request on a 0 length file would fail
$ rclone cat --head 128 drive:test/emptyfile
ERROR : open file failed: googleapi: Error 416: Request range not satisfiable, requestedRangeNotSatisfiable
To fix this we remove Range: headers on requests for zero length files.
This introduces a new config variable bucket_policy_only. If this is
set then rclone:
- ignores ACLs set on buckets
- ignores ACLs set on objects
- creates buckets with Bucket Policy Only set
Fall back to default application credentials when all other credentials sources fail
This change allows users with default application credentials
configured (notably when running on google compute instances) to
dispense with explicitly configuring google cloud storage credentials
in rclone's own configuration.
This enables MD5 checksum calculation and publication when uploading file above the "Cutoff" limit.
It was explictely ignored in case of multi-block (a.k.a. multipart) uploads to Azure Blob Storage.
Make the pacer package more flexible by extracting the pace calculation
functions into a separate interface. This also allows to move features
that require the fs package like logging and custom errors into the fs
package.
Also add a RetryAfterError sentinel error that can be used to signal a
desired retry time to the Calculator.
Bitrix Site Manager emits `<D:resourcetype><collection/></D:resourcetype>`
missing the namespace on the `collection` tag. This causes the item
to be identified as a file instead of a directory.
To work around this look at the Microsoft extension prop
`iscollection` which seems to be emitted as well.
Before this change any attempt to access a google doc in an rclone
mount would give the error "partial downloads are not supported while
exporting Google Documents" as the mount uses ranged requests to read
data.
This implements ranged requests for a limited number of scenarios,
just enough so that Google docs can be cat-ed from an rclone mount.
When they are cat-ed then they receive their correct size also.
Before this change the union remote was using whether the writable
union could poll for changes to decide whether the union mount could
poll for changes.
The fix causes the union backend to signal it can poll for changes if
**any** of the remotes can poll for changes.
Before this change it was setting the modification times of the things
that the symlinks pointed to.
Note that this is only implemented for unix style OSes. Other OSes
will not attempt to set the modification time of a symlink.
If the upload concurrency is set > 1 then the hash becomes corrupted.
The upload is fine, and can be downloaded fine, however the hash is no
longer the md5sum of the object. It is not known whether this is
rclone's fault or a bug at QingStor.
Before this change if ContentLength was set in the options but 0 then
we would upload using chunked encoding. Fix this to always upload
with a "Content-Length" header even if the size is 0.
Remove workarounds for this from b2 and onedrive backends.
This fixes the issue for the webdav backend described here:
https://forum.rclone.org/t/code-500-errors-with-webdav-nextcloud/8440/
Before this change azureblob would attempt to create already existing
containers. This causes problems with limited permissions keys.
This change checks the container exists before trying to create it in
the same way the s3 backend does. This uses no more requests in the
usual case of the container existing.
See: https://forum.rclone.org/t/copying-individual-files-to-azure-blob-storage/8397
Before this change buckets were created with the same ACL as objects.
After this change, the user can set just --s3-acl to set the ACL of
buckets and objects, or use --s3-bucket-acl as well to have a
different ACL used for bucket creation.
This also logs at INFO level the creation and deletion of buckets.
* drive: don't run teamdrive config if auto confirm set
* onedrive: don't run extra config if auto confirm set
* make Confirm results customisable by config
Fixes#1010
The existing s3 backend passed all integration tests with OSS provided
`force_path_style = false`.
This makes sure that is so and adds documentation and configuration
for OSS.
Thanks to @luolibin for their work on the OSS backend which we ended
up not needing.
Fixes#1641Fixes#1237
The time format provided by webdav servers seems to vary wildly from
that specified in the RFC - rclone already parses times in 5 different
formats!
If an unparseable time is found, then fail softly logging an ERROR
(just once) but returning the epoch.
This will mean that webdav servers with bad time formats will still be
usable by rclone.
Before this fix rclone would just use the authorised bucket regardless
of what bucket you put on the command line.
This uses the new `bucketName` response in the API and checks that the
user is using the correct bucket name to avoid accidents.
Fixes#2839
Before this fix the http backend was returning the wrong error code
when files were not found. This was causing --files-from to error on
missing files instead of skipping them like it should.
The `cleanup` command will delete unfinished large file uploads that
were started more than a day ago (to avoid deleting uploads that are
potentially still in progress).
Fixes#2617
Increasing the --s3-upload-concurrency to 4 (from 2) gives an
additional 45% throughput at the cost of 10MB extra memory per transfer.
After testing the upload perfoc
Before this change rclone would use multipart uploads for any size of
file. However multipart uploads are less efficient for smaller files
and don't have MD5 checksums so it is advantageous to use single part
uploads if possible.
This implements single part uploads for all files smaller than the
upload_cutoff size. Streamed files must be uploaded as multipart
files though.
Before this change we used Remove to remove directories. This works
fine on Unix based systems but not so well on Windows based ones.
Swap to using RemoveDirectory instead.
When a container is deleted, a container with the same name cannot be
created for at least 30 seconds; the container may not be available
for more than 30 seconds if the service is still processing the
request.
We sleep so that we wait at most 60 seconds. This is mostly useful in
the integration tests where containers get deleted and remade
immediately.
Get rid of the api client and use rest/pacer for all API calls
Add Copy, Move, DirMove, PublicLink, About optional interfaces
Improve general error handling
Remove ListR for now due to inconsitent behaviour
fixes#2586, progress on #2740 and #2178
Before this change backend integration tests depended on each other,
so tests could not be retried.
After this change we nest tests to ensure that tests are provided with
the starting state they expect.
Tell the integration test runner that it can retry backend tests also.
This also includes bin/test_independence.go which runs each test
individually for a backend to prove that they are independent.
Wasabi has two location, US East and US West, with different endpoint URLs.
When configuring S3 to use Wasabi, provide the endpoint information for both
locations.
Before this change Rmdir would check the root rather than the
directory specified for being empty and return "directory not empty"
when it shouldn't have done.
When the env_auth option is enabled, the AWS SDK's session constructor
now loads configuration from ~/.aws/config and environment variables,
and credentials per the selected (or default) AWS_PROFILE's settings.
This is accomplished by **NOT** including any Credential provider in the
aws.Config passed to the session constructor: If the Config.Credentials
is non-nil, that will always be used and the user's configuration re
role_arn, credential_source, source_profile, etc... from the shared
config will be completely ignored.
(The conditional creation and configuration of the stscreds Credential
provider is complicated enough that it is not worth re-creating that
logic.)
Before this change the ACL for objects which were server side copied
was left at the default "private" settings. S3 doesn't copy the ACL
from the source when you copy an object, you have to set it afresh
which is what this does.
Until https://github.com/Azure/azure-storage-blob-go/pull/75 is merged
the SDK can't upload a single blob of exactly the chunk size, so
upload files of this size with a multpart upload as a work around.
The previous fix for this 6a773289e7 turned out to cause problems
uploading files with maximum chunk size so needed to be redone.
Fixes#2653
Before this change the Features() method would return a different Fs
to that the Features() method was called on if the remote was
instantiated on a file.
The practical effect of this is that optional features, eg `rclone
about` wouldn't work properly when called on a file, and likely this
has been causing low level problems for users of these backends for
ages.
Ideally there would be a test for this, but it turns out that this is
really hard, so instead of that all the backends have been converted
to not copy the Fs and a big warning comment inserted for future
readers.
Fixes#2182
Use the same function to join the root paths for the wrapping remotes
alias, cache and crypt.
The new function fspath.JoinRootPath is equivalent to path.Join, but if
the first non empty element starts with "//", this is preserved to allow
Windows network path to be used in these remotes.
Implement optional interfaces
- Purge
- PutStream
- Copy
- Move
- DirMove
- DirCacheFlush
- ChangeNotify
- About
Make Hashes() return the intersection of all the hashes supported by the remotes
When moving a directory in drive, most of the time only a notification
for the directory itself is created, not the old or new parents.
This tires to find the old path in the dirCache and the new path with
the dirCache of the new parent, which can result in two notifications
for a moved directory.
Add a new flag to the drive backend to allow document conversions oni upload.
The existing --drive-formats flag has been renamed to --drive-export-formats.
The old flag is still working to be backward compatible.
Make use of the mime package to find matching extensions and mime types.
For simplicity, all extensions are now prefixed with "." to match the
mime package requirements.
Parsed extensions get converted if needed.
Before this change on Windows, files copied locally could become
heavily fragmented (300+ fragments for maybe 100 MB), no matter how
much contiguous free space there was (even if it's over 1TiB). This
can needlessly yet severely adversely affect performance on hard
disks.
This changes uses NtSetInformationFile to pre-allocate the space to
avoid this.
It does nothing on other OSes other than Windows.
Add --drive-v2-download-min-size flag to allow downloading files via the
drive v2 API. If files are greater than this flag, a download link is
generated when needed. The flag is disabled by default.
When combining the remote value and the root path, preserve the absence
or presence of the / at the beginning of the wrapped remote path.
e.g. a remote "cloud:" and root path "dir" becomes "cloud:dir" instead
of "cloud:/dir".
Fixes#2553
When combining the remote value and the root path, preserve the absence
or presence of the / at the beginning of the wrapped remote path.
e.g. a remote "cloud:" and root path "dir" becomes "cloud:dir" instead
of "cloud:/dir".
* Fix error handling in List and NewObject
* Fix Precision in case we have precision > time.Second
* Fix Features - all binary features are possible
* Fix integration tests using new test facilities
Uploads were broken because chunk size was set to zero. This was a
consequence of the backend config re-organization which meant that
chunk size had lost its default.
Sharing some backend config between swift and hubic fixes the problem
and means hubic gains its own --hubic-chunk-size flag.
The initial work on this was done by Oliver Heyme with updates from
Cnly.
Oliver Heyme:
* Changed to Microsoft graph
* Enable writing
* Added more options for adding a OneDrive Remote
* Better error handling
* Send modDate at create upload session and fix list children
Cnly:
* Simple upload API only supports max 4MB files
* Fix supported hash types for different drive types
* Fix unchecked err
Co-authored-by: Oliver Heyme <olihey@googlemail.com>
Co-authored-by: Cnly <minecnly@gmail.com>
Sometimes pcloud will leave a half uploaded file when the transfer
actually failed. This patch deletes the file if it exists.
This problem was spotted by the integration tests.
Before this change we were using the ChildCount in the Folder facet to
determine if a directory was empty or not. However this seems to be
unreliable, or updated asynchronously which meant that `rclone rmdir`
sometimes deleted directories that had files in.
This problem was spotted by the integration tests.
Listing the directory instead of relying on the ChildCount fixes the
problem and the integration tests, without changing the cost (one http
transaction).
This was causing errors which looked like this when copying a file to
the root of a drive:
mkdir \\?: The filename, directory name, or volume label syntax is incorrect.
This was caused by an incorrect path splitting routine which was
removing \ of the end of UNC paths when it shouldn't have been. Fixed
by using the standard library `filepath.Dir` instead.
Before this change if only one of storage_url or auth_token were
supplied then rclone would overwrite both of them when authenticating.
This effectively meant you could supply both of them or none of them
only.
Now rclone still does the authentication to read the missing
storage_url or auth_token then afterwards re-writes the auth_token or
storage_url back to what the user desired.
Fixes#2464
* Add docs for --jottacloud-md5-memory-limit
* Factor out readMD5 function and add tests
* Fix accounting
* Make sure temp file is deleted at the start (not Windows)
In e52ecba295 we forgot to unwrap and
re-wrap the accounting which mean the the accounting was no longer
first in the chain of readers. This lead to accounting inaccuracies
in remotes which wrap and unwrap the reader again.
Some webdav backends (eg rclone serve webdav) leave behind half
written files on error. This causes the integration tests to
fail. Here we remove the file if it exists.
Sometimes it takes many more commit retries than expected to commit a
multipart file, so split this number into its own config variable and
default it to 100 which should always be enough.
this change adds the depth parameter to listAll and readMetaDataForPath.
this allows recursive calls of these methods with a different depth
header.
Sharepoint won't list files if the depth header is != 0. If that is the
case, it will just return a error 404 although the file exists.
Since it is not possible to determine if a path should be a file or a
directory, rclone has to make a request with depth = 1 first. On success
we are sure that the path is a directory and the listing will work.
If this request returns error 404, the path either doesn't exist or it
is a file.
To be sure, we can try again with depth set to 0. If it still fails, the
path really doesn't exist, else we found our file.
Before this change the Part structure had an int for the Offset and
uploading large files would produce this error
json: cannot unmarshal number 2147483648 into Go struct field Part.offset of type int
Changing the field to an int64 fixes the problem.
This unifies the 3 methods of reading config
* command line
* environment variable
* config file
And allows them all to be configured in all places. This is done by
making the []fs.Option in the backend registration be the master
source of what the backend options are.
The backend changes are:
* Use the new configmap.Mapper parameter
* Use configstruct to parse it into an Options struct
* Add all config to []fs.Option including defaults and help
* Remove all uses of pflag
* Remove all uses of config.FileGet
This change includes removing older azureblob storage SDK, and getting
parity to existing code with latest blob storage SDK.
This change is also pre-req for addressing #2091
Go can't redirect PROPFIND requests properly, it changes the method to
GET, so we disable redirects when reading the metadata and assume the
object does not exist if we receive a redirect.
This is to work-around the qnap redirecting requests for directories
without /.
Previously this was reading a stale hash from the object leading to
broken integration tests.
This fixes these integration tests TestSyncDoesntUpdateModtime,
TestSyncAfterChangingFilesSizeOnly, TestSyncAfterChangingContentsOnly,
TestSyncWithUpdateOlder, TestSyncUTFNorm.
Now that https://issuetracker.google.com/issues/64468406 has been
fixed, we can remove part of the workaround which fixed#1675 -
019adc35609c2136
This will make queries marginally more efficient. We still need the
other part of the workaround since the `=` operator is case
insensitive.
Before this commit rclone's handling of symlinks and junction points
under Windows was broken. rclone treated them as files and attempted
to transfer them which gave the error "The handle is invalid".
Ultimately the cause of this was 3e43ff7414 which was a
workaround so files with reparse points (which are a kind of symlink)
would transfer correctly.
The solution implemented is to revert the above commit which will mean
that #614 will break again. However there is now a work-around (which
will be signaled by rclone) to use the -L flag which wasn't available
when the original commit was made.
Fixes#2336
In the drive v3 conversion we forgot the IncludeTeamDriveItems
parameter when calling the changes API. Adding it fixes the changes
polling with team drives.
This implementation hopefully can handle all error requests from the
onedrive for business authentication.
I have only tested it with the "domain in unmanaged state" error.
This was caused by using the sftp.File.Read method which resets the
streaming window after each call. Replacing it with sftp.File.WriteTo
and an io.Pipe fixes the problem bringing the speed to the same as the
sftp binary.
This checks the checksum of the streamed encrypted data against the
checksum of the encrypted object returned from the remote and returns
an error if it is different.
* Fix errcheck and golint warnings
* Remove unused constants and fix comments
* Parse error responses properly
* Fix Open with RangeOption
* Fix Move, Copy and DirMove
* Implement DirCacheFlush
* Check interfaces are correct
* Remove debugs and update overview
* Correct feature flags
* Pare replacement characters down to the minimum set
* Add to the integration tests
* Add Mkdir, Rmdir, Purge, Delete, SetModTime, Copy, Move, DirMove
* Update file size after upload
* Add Open seek
* Set private permission for new folder and uploaded file
* Add docs
* Update List function
* Fix UserSessionInfo struct
* Fix socket leaks
* Don’t close resp.Body in Open method
* Get hash when listing files
The official drive APIs seem to have trouble downloading large
documents sometimes.
This commit adds a --drive-alternate-export flag to use an different,
unofficial set of export URLS which seem to download large files OK.
Google cloud storage doesn't normally need retries, however certain
things (eg bucket creation and removal) are rate limited and do
generate 429 errors.
Before this change the integration tests would regularly blow up with
errors from GCS rate limiting bucket creation and removal.
After this change we low level retry all operations using the same
exponential backoff strategy as used in the google drive backend.
Before this fix team drives would return the drive quota which is
incorrect and mis-leading.
Team drives don't appear to have an API for reading the bytes used or
the quota so we now return that the quota and usage are unknown.
If md5sum/sha1sum fails we debug what it outputed on stderr and return
an empty hash indicating we didn't have a hash, rather than
hash.ErrUnsupported indicating that we don't support this hash type.
This fixes lots of ERROR messages for sftp and synology NAS which,
while it supports md5sum the SFTP paths and the SSH paths are
different so md5sum doesn't work.
We also stop disabling md5sum/sha1sum on errors since typically Hashes
is only checked at the start of a sync run and isn't expected to
change dynamically.
This enables the use of the SharePoint webdav endpoint provided by
OneDrive for Business or Office365 Education Accounts. It enables
unverified accounts to be accessed with rclone via webdav as it isn't
possible through the normal onedrive backend.
This integrates the https://github.com/hensur/onedrive-cookie-test
package to fetch the required cookies to authorize against the
SharePoint webdav endpoint.
Very large directories can have their sizes returned as floating point
numbers, eg `1.0034576985781e+14` from the box API.
Before this change this would fail to parse as an int64.
This change parses the size as a float64 instead which will be
perfectly accurate for sizes up to 2**56 which is about 9 PB.
It is unknown whether box themselves use a float64 as an intermediate
representation in the API or not - it seems likely.
Fixes#2261
* Implement about for:
* local, crypt, cache, drive, swift, hubic, onedrive, pcloud, dropbox
* Implement `--json` and `---full` flag for `rclone about`
* change About interface to return a Usage structure
* Remove operations.About as it is too thin an interface
* Implement Integration test
Relates to #1138 and #1564
This bug was introduced by the v3 API conversion in 07f20dd1fd.
The problem was that dircache.FindPath doesn't work for the root directory.
This adds an internal error for dircache.FindPath being called with
the root directory. This makes a failing test, which the fix to the
drive backend fixes.
This also improves the DirCache integration test.
These are AWS, Ceph, Dreamhost, IBM COS S3, Minio, Wasabi and Other.
This configures endpoints where known and makes sure config doesn't
appear where it isn't valid where possible.
This introduces a method of making provider specific configuration
within a remote. This is useful particularly in s3.
This commit does the basic configuration in S3 for IBM COS.
Before this change we lowercased the dropbox root directory. This was
likely a leftover from when we used to build a dictionary to translate
the cases of dropbox files. Now with the v2 API we can rely on
dropbox to do that for us, so we no longer need to lowercase the root.
This fixes issues using crypt with name obfuscation on dropbox.
Before this change asynchronous closes in cmount could cause sharing
violations under Windows on Remove which manifest themselves
frequently as test failures.
This change lets the Remove be retried on a sharing violation under
Windows.
Unfortunately multi part upload can't upload zero length files so
bring back the single part upload for zero length files only.
This was broken when we made all uploads multipart uploads.
The sftp library delivers the attributes of the symlink rather than
the object pointed to in directory listings, however when we use Stat
from the library it points to the objects.
Previous to this fix this caused items pointed to by symlinks to be
unusable.
After the fix both symlinked files and directories work as expected.
From testing it appears that CEPH no longer works properly with v2
auth and neither does Dreamhost, so update the docs anc configuration
to recommend v4 auth.
This is a problem when syncing a file which just needed its modtime
set with dropbox which can't set the mod time of a file without
re-uploading it.
Before this change we would delete the file, then the server side move
would fail moving the file to the backup-dir because it no longer
existed.
After this change the destination file is moved to the backup-dir
instead of being deleted and the new file is uploaded.
Fixes#2134
* All remotes now support RangeOption so remove SeekOption
* Correct off by one error as RangeOption arguments are inclusive.
* Use RangeSeek in preference to Seek if available
In a typical rclone copy to a bucket/container based remote, before
this change we were doing a list, followed by a HEAD of the bucket to
check it existed before doing the copy. The fact the list succeeded
means the bucket exists so mark it OK at that point.
Issue #1421
Before this change `rclone move localdir /mnt/different-fs` would
error. Now it falls back to moving individual files, which in turn
falls back to copying individual files across the filesystem boundary.
Because of a bug in the Onedrive API it will sometime report the wrong
size. If the size is wrong other remotes that depend on the size might
fail. To fix this we overwrite the objects size with the real size
from ContentLength header.
This was caused by inconsistent escaping of the URL in the prefix
check, so check the URL links back to the correct host and scheme
instead of the prefix check.
The decoded path check will catch any URLs which are outside of the
root.
This removes the old system of part accounting and replaces it with a
system of popping off the accounting reader and wrapping up new ones
as necessary.
This makes it much easier to carry the context down the chain of
wrapped readers and get the limiting as near as possible to the
output. This makes the accounting more accurate and the bandwidth
limiting smoother.
Fixes#2029 and Fixes#1443
This fixes uploads to existing files for Google Drive introduced by #2007.
Instead of updating the old file a new "Untitled" file would be created
in the root folder.
A Range request can never request 0 bytes however this change was made
to make a clearer signal that the limit means read to the end.
Add test and more documentation and fixup uses
It is unnecessary to notify the node.Parents, because a cahnge event is
generated for all involved files and folders in a move from d1/f1 to
d2/f1. There will be a event for d1, d2 and f1.
Additionally a duplicate notification is resolved when them empty string
is in pathsToClear.
Related to #2006
The purpose of this is to make it easier to maintain and eventually to
allow the rclone backends to be re-used in other projects without
having to use the rclone configuration system.
The new code layout is documented in CONTRIBUTING.