Backends for which additional config is detected (in the config string
or on the command line or as environment variables) will gain a suffix
`{XXXXX}` where `XXXX` is a base64 encoded md5hash of the config
string.
This fixes backend caching with config string remotes.
This much requested feature now works properly:
rclone copy -vv drive,shared_with_me:file.txt drive:
This is implemented as a state machine parser so it can emit sensible
error messages.
It does not use the connection strings elsewhere in rclone yet - see
subsequent commits.
An optional fuzzer is implemented for the Parse function.
If you are using rclone a library you can decide to use the rclone
config file system or not by calling
configfile.LoadConfig(ctx)
If you don't you will need to set `config.Data` to an implementation
of `config.Storage`.
Other changes
- change interface of config.FileGet to remove unused default
- remove MustValue from config.Storage interface
- change GetValue to return string or bool like elsewhere in rclone
- implement a default config file system which panics with helpful error
- implement getWithDefault to replace the removed MustValue
- don't embed goconfig.ConfigFile so we can change the methods
Some storage providers e.g. S3 don't have an efficient rename operation.
Before this change, when chunker finished an upload, the server-side copy
and delete operations that renamed temporary chunks to their final names
could take a significant amount of time.
This PR records transaction identifier (versioning) in the metadata of
chunker composite objects striving to remove the need for rename
operations on such backends.
This approach will be triggered be the new "transactions" configuration
option, which can be "rename" (the default) or "norename".
We implement the new approach for uploads (Put operations).
The chunker Move operation still uses the rename operation of
underlying backend. Filling this gap is left for a later PR.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
Currently if container under test has multiple IP addresses,
the `docker_ip` function from `docker.sh` will return a gibberish.
This patch makes it return the first address found.
Additionally, I apply shellcheck on `docker.sh`.
This includes an HDFS docker image to use with the integration tests.
Co-authored-by: Ivan Andreev <ivandeex@gmail.com>
Co-authored-by: Nick Craig-Wood <nick@craig-wood.com>
This is done by making fs.Config private and attaching it to the
context instead.
The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
Before this change the tests were run against the previous stable
rclone/rclone docker image.
This unfortunately masked errors in the integration test server.
This change uses the currently installed rclone to run "rclone serve
ftp" etc. This is installed out of the current code by the integration
test server so will make a better test.
This adds a context.Context parameter to NewFs and related calls.
This is necessary as part of reading config from the context -
backends need to be able to read the global config.
Occasionally the b2 tests fail because the integration tests don't
retry hard enough with their new setting of -list-retries 3. Override
this setting to 5 for the b2 tests only.
Running all the tests for 1fichier takes too long due to the directory
reading rate limiter.
The backend tests do complete in a reasonable time (21 mins).
This is what I wrote to Digital Ocean support on July 10, 2020 - alas
it didn't result in the rate limits dropping, so reluctantly I'm going
to remove DO from the integration tests since they never pass and have
no hope of ever passing while this rate limit is in effect.
----
Somewhere towards the end of June 2020 or the start of July 2020 my
integration tests between rclone ( https://rclone.org ) and Digital
Ocean started failing.
I tried moving the tests to different regions (currently they are
using AMS1 because I'm in Europe) with no improvement.
Rclone seems to be hitting this rate limit as documented here:
https://www.digitalocean.com/docs/spaces/#limits
- 2 COPYs per 5 minutes on any individual object in a Space
Rclone creates small objects about 100 bytes in size and renames them
a few times - this involves using the COPY call as S3 does not have a
rename API. The tests do this more than twice per object so hit the 5
minute timeout I think. Rclone does exponential backoff and fails
after 10 retries not having reached 5 minutes delay after 10 retries.
Having a 5 minute lockout on an S3 compatible API is surprising!
Rclone integration tests with about 30 other providers, none of which
have a rate limit like this.
I understand the need for a COPY rate limit as server side copying
large files can be resource intensive. However a 5 minute lockout for
copying 100 byte files seems excessive!
Might I humbly suggest that you reduce or eliminate this rate limit
for small files?
----
This was the reply
Unfortunately it is not possible to raise this limit or remove it
currently on our platform. I do see how this would interfere with type
of applications that need to copy many small files and will be happy
to take the feedback to our engineering team to see how we can improve
the spaces system in the future
- add a directory to the optional Purge interface
- fix up all the backends
- add an additional integration test to test for the feature
- use the new feature in operations.Purge
Many of the backends had been prepared in advance for this so the
change was trivial for them.
This adds expire and unlink fields to the PublicLink interface.
This fixes up the affected backends and removes unlink parameters
where they are present.
These commands are for implementing backend specific
functionality. They have documentation which is placed automatically
into the backend doc.
There is a simple test for the feature in the backend tests.
This has been reported to Wasabi and they've confirmed as a known
issue that multipart uploads can't be 0 sized even though that is
incompatible with AWS S3.
changes:
- chunker: remove GetTier and SetTier
- remove wdmrcompat metaformat
- remove fastopen strategy
- make hash_type option non-advanced
- adverise hash support when possible
- add metadata field "ver", run strict checks
- describe internal behavior in comments
- improve documentation
note:
wdmrcompat used to write file name in the metadata, so maximum metadata
size was 1K; removing it allows to cap size by 200 bytes now.
Note: chunker implements many irrelevant methods (UserInfo, Disconnect etc),
but they are required by TestIntegration/FsCheckWrap and cannot be removed.
Dropped API methods: MergeDirs DirCacheFlush PublicLink UserInfo Disconnect OpenWriterAt
Meta formats:
- renamed old simplejson format to wdmrcompat.
- new simplejson format supports hash sums and verification of chunk size/count.
Change list:
- split-chunking overlay for mailru
- add to all
- fix linter errors
- fix integration tests
- support chunks without meta object
- fix package paths
- propagate context
- fix formatting
- implement new required wrapper interfaces
- also test large file uploads
- simplify options
- user friendly name pattern
- set default chunk size 2G
- fix building with golang 1.9
- fix ci/cd on a separate branch
- fix updated object name (SyncUTFNorm failed)
- fix panic in Box overlay
- workaround: Box rename failed if name taken
- enhance comments in unit test
- fix formatting
- embed wrapped remote rather than inherit
- require wrapped remote to support move (or copy)
- implement 3 (keep fstest)
- drop irrelevant file system interfaces
- factor out Object.mainChunk
- refactor TestLargeUpload as InternalTest
- add unit test for chunk name formats
- new improved simplejson meta format
- tricky case in test FsIsFile (fix+ignore)
- remove debugging print
- hide temporary objects from listings
- fix bugs in chunking reader:
- return EOF immediately when all data is sent
- handle case when wrapped remote puts by hash (bug detected by TestRcat)
- chunked file hashing (feature)
- server-side copy across configs (feature)
- robust cleanup of temporary chunks in Put
- linear download strategy (no read-ahead, feature)
- fix unexpected EOF in the box multipart uploader
- throw error if destination ignores data
Before this change we didn't calculate any hashes for test files
created in the Run framework.
This means that files were uploaded to S3 without a `Content-MD5`
header. This in turn caused minio to disengage `--compat` mode which
in turn caused the `TestSyncAfterChangingModtimeOnlyWithNoUpdateModTime`
test to fail in `fs/sync`.
After this change we supply all hashes supported by the destination Fs
on the upload object.
This means that the `Content-MD5` is set and minio engages `--compat`
mode to fix the problem. Using `--compat` on the command line also
fixes the problem.
This much better replicates how objects are actually uploaded with
operations.Copy so should improve the integration tests.
Before this change it was possible to make a remote with an invalid
name in the config file, either manually or with `rclone config
create` (but not with `rclone config`).
When this remote was used, because it was invalid, rclone would
presume this remote name was a local directory for a very suprising
user experience!
This change checks remote names more carefully and returns errors
- when the user tries to use an invalid remote name on the command line
- when an invalid remote name is used in `rclone config create/update/password`
- when the user tries to enter an invalid remote name in `rclone config`
This does not prevent the user entering a remote name with invalid
characters in the config manually, but such a remote will fail
immediately when it is used on the command line.
The fs cache makes test runs no longer independent and this can cause
a problem with some tests.
Clearing the fs cache between tests runs fixes the problem.
This was spotted by @cenkalti as part of merging #3469