Michał Matczuk f396550934 backend/local: Avoid polluting page cache when uploading local files to remote backends
This patch makes rclone keep linux page cache usage under control when
uploading local files to remote backends. When opening a file it issues
FADV_SEQUENTIAL to configure read ahead strategy. While reading
the file it issues FADV_DONTNEED every 128kB to free page cache from
already consumed pages.

```
fadvise64(5, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(5, "\324\375\251\376\213\361\240\224>\t5E\301\331X\274^\203oA\353\303.2'\206z\177N\27fB"..., 32768) = 32768
read(5, "\361\311\vW!\354_\317hf\276t\307\30L\351\272T\342C\243\370\240\213\355\210\v\221\201\177[\333"..., 32768) = 32768
read(5, ":\371\337Gn\355C\322\334 \253f\373\277\301;\215\n\240\347\305\6N\257\313\4\365\276ANq!"..., 32768) = 32768
read(5, "\312\243\360P\263\242\267H\304\240Y\310\367sT\321\256\6[b\310\224\361\344$Ms\234\5\314\306i"..., 32768) = 32768
fadvise64(5, 0, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "m\251\7a\306\226\366-\v~\"\216\353\342~0\fht\315DK0\236.\\\201!A#\177\320"..., 32768) = 32768
read(5, "\7\324\207,\205\360\376\307\276\254\250\232\21G\323n\255\354\234\257P\322y\3502\37\246\21\334^42"..., 32768) = 32768
read(5, "e{*\225\223R\320\212EG:^\302\377\242\337\10\222J\16A\305\0\353\354\326P\336\357A|-"..., 32768) = 32768
read(5, "n\23XA4*R\352\234\257\364\355Y\204t9T\363\33\357\333\3674\246\221T\360\226\326G\354\374"..., 32768) = 32768
fadvise64(5, 131072, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "SX\331\251}\24\353\37\310#\307|h%\372\34\310\3070YX\250s\2269\242\236\371\302z\357_"..., 32768) = 32768
read(5, "\177\3500\236Y\245\376NIY\177\360p!\337L]\2726\206@\240\246pG\213\254N\274\226\303\357"..., 32768) = 32768
read(5, "\242$*\364\217U\264]\221Y\245\342r\t\253\25Hr\363\263\364\336\322\t\325\325\f\37z\324\201\351"..., 32768) = 32768
read(5, "\2305\242\366\370\203tM\226<\230\25\316(9\25x\2\376\212\346Q\223 \353\225\323\264jf|\216"..., 32768) = 32768
fadvise64(5, 262144, 131072, POSIX_FADV_DONTNEED) = 0
```

Page cache consumption per file can be checked with tools like [pcstat](https://github.com/tobert/pcstat).

This patch does not have a performance impact. Please find below results
of an experiment comparing local copy of 1GB file with and without this
patch.

With the patch:

```
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
(mmt/fadvise)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 13.19
        System time (seconds): 1.12
        Percent of CPU this job got: 96%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.81
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27660
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2212
        Voluntary context switches: 5755
        Involuntary context switches: 9782
        Swaps: 0
        File system inputs: 4155264
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
```

Without the patch:

```
(master)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 14.46
        System time (seconds): 0.81
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.41
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27600
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2228
        Voluntary context switches: 7190
        Involuntary context switches: 1980
        Swaps: 0
        File system inputs: 2097152
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(master)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 262144    | 100.000 |
+-----------+----------------+------------+-----------+---------+
```
2019-08-08 23:41:52 +01:00
2019-08-06 11:43:42 +01:00
2019-06-19 11:59:46 +01:00
2019-07-28 12:05:50 +01:00
2019-08-06 10:31:32 +01:00
2019-04-15 20:12:56 +01:00

Logo

Website | Documentation | Download | Contributing | Changelog | Installation | Forum |

Build Status Windows Build Status Build Status CircleCI Go Report Card GoDoc

Rclone

Rclone ("rsync for cloud storage") is a command line program to sync files and directories to and from different cloud storage providers.

Storage providers

Please see the full list of all storage providers and their features

Features

  • MD5/SHA-1 hashes checked at all times for file integrity
  • Timestamps preserved on files
  • Partial syncs supported on a whole file basis
  • Copy mode to just copy new/changed files
  • Sync (one way) mode to make a directory identical
  • Check mode to check for file hash equality
  • Can sync to and from network, e.g. two different cloud accounts
  • Optional encryption (Crypt)
  • Optional cache (Cache)
  • Optional FUSE mount (rclone mount)
  • Multi-threaded downloads to local disk
  • Can serve local or remote files over HTTP/WebDav/FTP/SFTP/dlna

Installation & documentation

Please see the rclone website for:

Downloads

License

This is free software under the terms of MIT the license (check the COPYING file included in this package).

Description
"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
Readme 350 MiB
Languages
Go 98.6%
Shell 0.4%
HTML 0.3%
Python 0.3%
JavaScript 0.2%
Other 0.1%