rclone/backend
Michał Matczuk f396550934 backend/local: Avoid polluting page cache when uploading local files to remote backends
This patch makes rclone keep Linux page cache usage under control when
uploading local files to remote backends. When opening a file it issues
FADV_SEQUENTIAL to configure the readahead strategy. While reading
the file it issues FADV_DONTNEED every 128kB to drop already consumed
pages from the page cache.

```
fadvise64(5, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
read(5, "\324\375\251\376\213\361\240\224>\t5E\301\331X\274^\203oA\353\303.2'\206z\177N\27fB"..., 32768) = 32768
read(5, "\361\311\vW!\354_\317hf\276t\307\30L\351\272T\342C\243\370\240\213\355\210\v\221\201\177[\333"..., 32768) = 32768
read(5, ":\371\337Gn\355C\322\334 \253f\373\277\301;\215\n\240\347\305\6N\257\313\4\365\276ANq!"..., 32768) = 32768
read(5, "\312\243\360P\263\242\267H\304\240Y\310\367sT\321\256\6[b\310\224\361\344$Ms\234\5\314\306i"..., 32768) = 32768
fadvise64(5, 0, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "m\251\7a\306\226\366-\v~\"\216\353\342~0\fht\315DK0\236.\\\201!A#\177\320"..., 32768) = 32768
read(5, "\7\324\207,\205\360\376\307\276\254\250\232\21G\323n\255\354\234\257P\322y\3502\37\246\21\334^42"..., 32768) = 32768
read(5, "e{*\225\223R\320\212EG:^\302\377\242\337\10\222J\16A\305\0\353\354\326P\336\357A|-"..., 32768) = 32768
read(5, "n\23XA4*R\352\234\257\364\355Y\204t9T\363\33\357\333\3674\246\221T\360\226\326G\354\374"..., 32768) = 32768
fadvise64(5, 131072, 131072, POSIX_FADV_DONTNEED) = 0
read(5, "SX\331\251}\24\353\37\310#\307|h%\372\34\310\3070YX\250s\2269\242\236\371\302z\357_"..., 32768) = 32768
read(5, "\177\3500\236Y\245\376NIY\177\360p!\337L]\2726\206@\240\246pG\213\254N\274\226\303\357"..., 32768) = 32768
read(5, "\242$*\364\217U\264]\221Y\245\342r\t\253\25Hr\363\263\364\336\322\t\325\325\f\37z\324\201\351"..., 32768) = 32768
read(5, "\2305\242\366\370\203tM\226<\230\25\316(9\25x\2\376\212\346Q\223 \353\225\323\264jf|\216"..., 32768) = 32768
fadvise64(5, 262144, 131072, POSIX_FADV_DONTNEED) = 0
```

Page cache consumption per file can be checked with tools like [pcstat](https://github.com/tobert/pcstat).

This patch has no measurable negative impact on performance. Below are the
results of an experiment comparing a local copy of a 1GB file with and without
this patch: the copy took 0:14.81 with the patch and 0:16.41 without it.

With the patch:

```
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
(mmt/fadvise)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 13.19
        System time (seconds): 1.12
        Percent of CPU this job got: 96%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:14.81
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27660
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2212
        Voluntary context switches: 5755
        Involuntary context switches: 9782
        Swaps: 0
        File system inputs: 4155264
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(mmt/fadvise)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 0         | 000.000 |
+-----------+----------------+------------+-----------+---------+
```

Without the patch:

```
(master)$ taskset -c 0 /usr/bin/time -v ./rclone copy 1GB.bin.1 /var/empty/rclone
        Command being timed: "./rclone copy 1GB.bin.1 /var/empty/rclone"
        User time (seconds): 14.46
        System time (seconds): 0.81
        Percent of CPU this job got: 93%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 0:16.41
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 27600
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 2228
        Voluntary context switches: 7190
        Involuntary context switches: 1980
        Swaps: 0
        File system inputs: 2097152
        File system outputs: 2097152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
(master)$ pcstat 1GB.bin.1
+-----------+----------------+------------+-----------+---------+
| Name      | Size (bytes)   | Pages      | Cached    | Percent |
|-----------+----------------+------------+-----------+---------|
| 1GB.bin.1 | 1073741824     | 262144     | 262144    | 100.000 |
+-----------+----------------+------------+-----------+---------+
```
2019-08-08 23:41:52 +01:00

| Name | Last commit | Date |
|---|---|---|
| alias | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| all | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| amazonclouddrive | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| azureblob | azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files | 2019-08-06 15:18:08 +01:00 |
| b2 | azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files | 2019-08-06 15:18:08 +01:00 |
| box | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| cache | lib/random: unify random string generation into random.String | 2019-08-06 12:44:08 +01:00 |
| crypt | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| drive | drive: update API for teamdrive use - fixes #3348 | 2019-08-02 16:06:23 +01:00 |
| dropbox | azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files | 2019-08-06 15:18:08 +01:00 |
| fichier | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| ftp | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| googlecloudstorage | azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files | 2019-08-06 15:18:08 +01:00 |
| googlephotos | lib/random: unify random string generation into random.String | 2019-08-06 12:44:08 +01:00 |
| http | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| hubic | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| jottacloud | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| koofr | azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files | 2019-08-06 15:18:08 +01:00 |
| local | backend/local: Avoid polluting page cache when uploading local files to remote backends | 2019-08-08 23:41:52 +01:00 |
| mega | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| onedrive | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| opendrive | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| pcloud | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| qingstor | azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files | 2019-08-06 15:18:08 +01:00 |
| s3 | azureblob/b2/dropbox/gcs/koofr/qingstor/s3: fix 0 length files | 2019-08-06 15:18:08 +01:00 |
| sftp | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| swift | swift: fix upload when using no_chunk to return the correct size | 2019-08-08 12:41:46 +01:00 |
| union | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| webdav | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |
| yandex | build: fix up package paths after repo move | 2019-07-28 18:47:38 +01:00 |