mirror of https://github.com/trapexit/mergerfs.git
synced 2024-11-25 17:57:41 +08:00

Misc README updates

parent bd02bfd54c
commit 5152c63480
273 README.md
@@ -65,9 +65,10 @@ A + B = C
 mergerfs does **not** support the copy-on-write (CoW) or whiteout
 behaviors found in **aufs** and **overlayfs**. You can **not** mount a
 read-only filesystem and write to it. However, mergerfs will ignore
-read-only drives when creating new files so you can mix read-write and
-read-only drives. It also does **not** split data across drives. It is
-not RAID0 / striping. It is simply a union of other filesystems.
+read-only filesystems when creating new files so you can mix
+read-write and read-only filesystems. It also does **not** split data
+across filesystems. It is not RAID0 / striping. It is simply a union of
+other filesystems.
 
 
 # TERMINOLOGY
@@ -178,7 +179,7 @@ These options are the same regardless of whether you use them with the
 policy of `create` (read below). Enabling this will cause rename and
 link to always use the non-path preserving behavior. This means
 files, when renamed or linked, will stay on the same
-drive. (default: false)
+filesystem. (default: false)
 * **security_capability=BOOL**: If false return ENOATTR when xattr
 security.capability is queried. (default: true)
 * **xattr=passthrough|noattr|nosys**: Runtime control of
@@ -191,7 +192,7 @@ These options are the same regardless of whether you use them with the
 copy-on-write function similar to cow-shell. (default: false)
 * **statfs=base|full**: Controls how statfs works. 'base' means it
 will always use all branches in statfs calculations. 'full' is in
-effect path preserving and only includes drives where the path
+effect path preserving and only includes branches where the path
 exists. (default: base)
 * **statfs_ignore=none|ro|nc**: 'ro' will cause statfs calculations to
 ignore available space for branches mounted or tagged as 'read-only'
@@ -324,9 +325,9 @@ you're using. Not all features are available in older releases. Use
 
 The 'branches' argument is a colon (':') delimited list of paths to be
 pooled together. It does not matter if the paths are on the same or
-different drives nor does it matter the filesystem (within
+different filesystems nor does it matter the filesystem type (within
 reason). Used and available space will not be duplicated for paths on
-the same device and any features which aren't supported by the
+the same filesystem and any features which aren't supported by the
 underlying filesystem (such as file attributes or extended attributes)
 will return the appropriate errors.
 
@@ -334,7 +335,7 @@ Branches currently have two options which can be set. A type which
 impacts whether or not the branch is included in a policy calculation
 and a individual minfreespace value. The values are set by prepending
 an `=` at the end of a branch designation and using commas as
-delimiters. Example: /mnt/drive=RW,1234
+delimiters. Example: `/mnt/drive=RW,1234`
 
 
 #### branch mode
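The hunk above touches the branch designation example `/mnt/drive=RW,1234`. As a minimal sketch of how such a string breaks down into a path, a mode, and a minfreespace value, the snippet below uses a hypothetical helper named `parse_branch`. It is not part of mergerfs and it assumes the path itself contains no `=`.

```bash
#!/bin/bash
# parse_branch is a hypothetical helper, not a mergerfs tool.
# It splits "PATH=MODE,MINFREESPACE" and prints "PATH|MODE|MINFREESPACE".
# Assumption: the path contains no '=' character.
parse_branch() {
  local spec="$1"
  local path="${spec%%=*}"       # text before the first '='
  local opts=""
  local mode=""
  local minfree=""
  case "$spec" in
    *=*) opts="${spec#*=}" ;;    # text after the first '=', if any
  esac
  mode="${opts%%,*}"             # first option: the branch mode
  case "$opts" in
    *,*) minfree="${opts#*,}" ;; # second option: minfreespace
  esac
  printf '%s|%s|%s\n' "$path" "$mode" "$minfree"
}

parse_branch "/mnt/drive=RW,1234"   # prints /mnt/drive|RW|1234
```

A branch given without options, such as `/mnt/disk2`, yields empty mode and minfreespace fields in this sketch.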
@@ -590,10 +591,10 @@ something to keep in mind.
 
 **WARNING:** Some backup solutions, such as CrashPlan, do not backup
 the target of a symlink. If using this feature it will be necessary to
-point any backup software to the original drives or configure the
-software to follow symlinks if such an option is
-available. Alternatively create two mounts. One for backup and one for
-general consumption.
+point any backup software to the original filesystems or configure the
+software to follow symlinks if such an option is available.
+Alternatively create two mounts. One for backup and one for general
+consumption.
 
 
 ### nullrw
@@ -750,11 +751,11 @@ All policies which start with `ep` (**epff**, **eplfs**, **eplus**,
 **epmfs**, **eprand**) are `path preserving`. `ep` stands for
 `existing path`.
 
-A path preserving policy will only consider drives where the relative
+A path preserving policy will only consider branches where the relative
 path being accessed already exists.
 
 When using non-path preserving policies paths will be cloned to target
-drives as necessary.
+branches as necessary.
 
 With the `msp` or `most shared path` policies they are defined as
 `path preserving` for the purpose of controlling `link` and `rename`'s
@@ -775,15 +776,15 @@ but it makes things a bit more uniform.
 | all | Search: For **mkdir**, **mknod**, and **symlink** it will apply to all branches. **create** works like **ff**. |
 | epall (existing path, all) | For **mkdir**, **mknod**, and **symlink** it will apply to all found. **create** works like **epff** (but more expensive because it doesn't stop after finding a valid branch). |
 | epff (existing path, first found) | Given the order of the branches, as defined at mount time or configured at runtime, act on the first one found where the relative path exists. |
-| eplfs (existing path, least free space) | Of all the branches on which the relative path exists choose the drive with the least free space. |
-| eplus (existing path, least used space) | Of all the branches on which the relative path exists choose the drive with the least used space. |
-| epmfs (existing path, most free space) | Of all the branches on which the relative path exists choose the drive with the most free space. |
+| eplfs (existing path, least free space) | Of all the branches on which the relative path exists choose the branch with the least free space. |
+| eplus (existing path, least used space) | Of all the branches on which the relative path exists choose the branch with the least used space. |
+| epmfs (existing path, most free space) | Of all the branches on which the relative path exists choose the branch with the most free space. |
 | eppfrd (existing path, percentage free random distribution) | Like **pfrd** but limited to existing paths. |
 | eprand (existing path, random) | Calls **epall** and then randomizes. Returns 1. |
-| ff (first found) | Given the order of the drives, as defined at mount time or configured at runtime, act on the first one found. |
-| lfs (least free space) | Pick the drive with the least available free space. |
-| lus (least used space) | Pick the drive with the least used space. |
-| mfs (most free space) | Pick the drive with the most available free space. |
+| ff (first found) | Given the order of the branches, as defined at mount time or configured at runtime, act on the first one found. |
+| lfs (least free space) | Pick the branch with the least available free space. |
+| lus (least used space) | Pick the branch with the least used space. |
+| mfs (most free space) | Pick the branch with the most available free space. |
 | msplfs (most shared path, least free space) | Like **eplfs** but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. |
 | msplus (most shared path, least used space) | Like **eplus** but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. |
 | mspmfs (most shared path, most free space) | Like **epmfs** but if it fails to find a branch it will try again with the parent directory. Continues this pattern till finding one. |
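The selection rules in the table rows above can be illustrated with a rough sketch. The helpers `pick_mfs`, `pick_lfs`, and `pick_epmfs` below are hypothetical and only mimic the "most / least free space" and "existing path" rules. They ignore branch modes and `minfreespace`, and they assume branch paths contain no spaces. mergerfs implements its policies internally, so this is not its code.

```bash
#!/bin/bash
# Hypothetical helpers illustrating the selection rule only.
# Input on stdin: one "BRANCH AVAILABLE" pair per line, separated by a
# single space. Assumption: branch paths contain no spaces.

# mfs: branch with the most available space.
pick_mfs() {
  sort -t' ' -k2,2nr | head -n 1 | cut -d' ' -f1
}

# lfs: branch with the least available space.
pick_lfs() {
  sort -t' ' -k2,2n | head -n 1 | cut -d' ' -f1
}

# epmfs: like mfs, but only among branches where the relative path exists.
pick_epmfs() {
  local rel="$1"
  local branch avail
  while read -r branch avail; do
    if [ -e "$branch/$rel" ]; then
      printf '%s %s\n' "$branch" "$avail"
    fi
  done | pick_mfs
}
```

For example, `printf '%s\n' "/mnt/a 100" "/mnt/b 300" "/mnt/c 200" | pick_mfs` prints `/mnt/b`, while the same input piped to `pick_lfs` prints `/mnt/a`.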
@@ -832,7 +833,7 @@ filesystem. `rename` only works within a single filesystem or
 device. If a rename can't be done atomically due to the source and
 destination paths existing on different mount points it will return
 **-1** with **errno = EXDEV** (cross device / improper link). So if a
-`rename`'s source and target are on different drives within the pool
+`rename`'s source and target are on different filesystems within the pool
 it creates an issue.
 
 Originally mergerfs would return EXDEV whenever a rename was requested
@@ -850,25 +851,25 @@ work while still obeying mergerfs' policies. Below is the basic logic.
 * Using the **rename** policy get the list of files to rename
 * For each file attempt rename:
 * If failure with ENOENT (no such file or directory) run **create** policy
-* If create policy returns the same drive as currently evaluating then clone the path
+* If create policy returns the same branch as currently evaluating then clone the path
 * Re-attempt rename
 * If **any** of the renames succeed the higher level rename is considered a success
 * If **no** renames succeed the first error encountered will be returned
 * On success:
-* Remove the target from all drives with no source file
-* Remove the source from all drives which failed to rename
+* Remove the target from all branches with no source file
+* Remove the source from all branches which failed to rename
 * If using a **create** policy which does **not** try to preserve directory paths
 * Using the **rename** policy get the list of files to rename
 * Using the **getattr** policy get the target path
 * For each file attempt rename:
-* If the source drive != target drive:
-* Clone target path from target drive to source drive
+* If the source branch != target branch:
+* Clone target path from target branch to source branch
 * Rename
 * If **any** of the renames succeed the higher level rename is considered a success
 * If **no** renames succeed the first error encountered will be returned
 * On success:
-* Remove the target from all drives with no source file
-* Remove the source from all drives which failed to rename
+* Remove the target from all branches with no source file
+* Remove the source from all branches which failed to rename
 
 The the removals are subject to normal entitlement checks.
 
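The non-path-preserving half of the rename logic in the hunk above can be sketched roughly as follows. `rename_in_pool` is a hypothetical function, not mergerfs code. It assumes the caller already knows the target branch (the step the README assigns to the **getattr** policy), uses `mkdir -p` as a stand-in for cloning the target path, and leaves out the removal steps listed under "On success".

```bash
#!/bin/bash
# rename_in_pool SRC DST TARGET_BRANCH BRANCH...
# Hypothetical sketch of the non-path-preserving case only.
# SRC and DST are paths relative to each branch root.
rename_in_pool() {
  local src="$1"
  local dst="$2"
  local target="$3"
  shift 3
  local branch
  local status=1
  for branch in "$@"; do
    # Only branches that actually hold the source file take part.
    [ -e "$branch/$src" ] || continue
    if [ "$branch" != "$target" ]; then
      # Stand-in for "clone target path": create the destination's
      # parent directory on the source branch.
      mkdir -p "$branch/$(dirname "$dst")"
    fi
    # The rename stays within one branch, so it never crosses filesystems.
    if mv "$branch/$src" "$branch/$dst"; then
      status=0   # any success makes the whole rename a success
    fi
  done
  return "$status"
}
```

Because every `mv` happens inside a single branch, the sketch never triggers the EXDEV case. That is the point of cloning the path onto the source branch first.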
@@ -894,11 +895,11 @@ the source of the metadata you see in an **ls**.
 #### statfs / statvfs ####
 
 [statvfs](http://linux.die.net/man/2/statvfs) normalizes the source
-drives based on the fragment size and sums the number of adjusted
+filesystems based on the fragment size and sums the number of adjusted
 blocks and inodes. This means you will see the combined space of all
 sources. Total, used, and free. The sources however are dedupped based
-on the drive so multiple sources on the same drive will not result in
-double counting its space. Filesystems mounted further down the tree
+on the filesystem so multiple sources on the same drive will not result in
+double counting its space. Other filesystems mounted further down the tree
 of the branch will not be included when checking the mount's stats.
 
 The options `statfs` and `statfs_ignore` can be used to modify
@@ -1211,8 +1212,8 @@ following:
 * mergerfs.fsck: Provides permissions and ownership auditing and the ability to fix them
 * mergerfs.dedup: Will help identify and optionally remove duplicate files
 * mergerfs.dup: Ensure there are at least N copies of a file across the pool
-* mergerfs.balance: Rebalance files across drives by moving them from the most filled to the least filled
-* mergerfs.consolidate: move files within a single mergerfs directory to the drive with most free space
+* mergerfs.balance: Rebalance files across filesystems by moving them from the most filled to the least filled
+* mergerfs.consolidate: move files within a single mergerfs directory to the filesystem with most free space
 * https://github.com/trapexit/scorch
 * scorch: A tool to help discover silent corruption of files and keep track of files
 * https://github.com/trapexit/bbf
@@ -1324,37 +1325,18 @@ of sizes below the FUSE message size (128K on older kernels, 1M on
 newer).
 
 
-#### policy caching
-
-Policies are run every time a function (with a policy as mentioned
-above) is called. These policies can be expensive depending on
-mergerfs' setup and client usage patterns. Generally we wouldn't want
-to cache policy results because it may result in stale responses if
-the underlying drives are used directly.
-
-The `open` policy cache will cache the result of an `open` policy for
-a particular input for `cache.open` seconds or until the file is
-unlinked. Each file close (release) will randomly chose to clean up
-the cache of expired entries.
-
-This cache is really only useful in cases where you have a large
-number of branches and `open` is called on the same files repeatedly
-(like **Transmission** which opens and closes a file on every
-read/write presumably to keep file handle usage low).
-
-
 #### statfs caching
 
 Of the syscalls used by mergerfs in policies the `statfs` / `statvfs`
 call is perhaps the most expensive. It's used to find out the
-available space of a drive and whether it is mounted
+available space of a filesystem and whether it is mounted
 read-only. Depending on the setup and usage pattern these queries can
 be relatively costly. When `cache.statfs` is enabled all calls to
 `statfs` by a policy will be cached for the number of seconds its set
 to.
 
 Example: If the create policy is `mfs` and the timeout is 60 then for
-that 60 seconds the same drive will be returned as the target for
+that 60 seconds the same filesystem will be returned as the target for
 creates because the available space won't be updated for that time.
 
 
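The statfs caching behavior described in the hunk above amounts to a time-limited cache. The sketch below is only an illustration of that idea and not how mergerfs stores its cache. `cached_avail` and `cache_dir` are hypothetical names, and the value is read with `df` instead of a direct `statvfs` call.

```bash
#!/bin/bash
# Hypothetical illustration of a time-limited statfs cache. Entries are
# kept as small files under a temp directory purely for demonstration.
cache_dir=$(mktemp -d)

# cached_avail PATH TTL: print available 1K blocks for PATH, reusing a
# value younger than TTL seconds instead of querying the filesystem again.
cached_avail() {
  local path="$1"
  local ttl="$2"
  local key now stamp avail
  key="$cache_dir/$(printf '%s' "$path" | cksum | cut -d' ' -f1)"
  now=$(date +%s)
  if [ -f "$key" ]; then
    read -r stamp avail < "$key"
    if [ $(( now - stamp )) -lt "$ttl" ]; then
      printf '%s\n' "$avail"   # cache hit: the value may be stale
      return 0
    fi
  fi
  # Cache miss or expired entry: ask the filesystem and record the time.
  avail=$(df -P -k "$path" | awk 'NR==2 {print $4}')
  printf '%s %s\n' "$now" "$avail" > "$key"
  printf '%s\n' "$avail"
}
```

With a TTL of 60, repeated calls such as `cached_avail /mnt/disk1 60` return the same number for up to a minute even if the filesystem fills in the meantime. That is why an `mfs` create policy would keep choosing the same target during that window.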
@@ -1392,42 +1374,42 @@ for instance.
 MergerFS does not natively support any sort of tiered caching. Most
 users have no use for such a feature and its inclusion would
 complicate the code. However, there are a few situations where a cache
-drive could help with a typical mergerfs setup.
+filesystem could help with a typical mergerfs setup.
 
-1. Fast network, slow drives, many readers: You've a 10+Gbps network
-with many readers and your regular drives can't keep up.
-2. Fast network, slow drives, small'ish bursty writes: You have a
+1. Fast network, slow filesystems, many readers: You've a 10+Gbps network
+with many readers and your regular filesystems can't keep up.
+2. Fast network, slow filesystems, small'ish bursty writes: You have a
 10+Gbps network and wish to transfer amounts of data less than your
-cache drive but wish to do so quickly.
+cache filesystem but wish to do so quickly.
 
 With #1 it's arguable if you should be using mergerfs at all. RAID
 would probably be the better solution. If you're going to use mergerfs
 there are other tactics that may help: spreading the data across
-drives (see the mergerfs.dup tool) and setting `func.open=rand`, using
-`symlinkify`, or using dm-cache or a similar technology to add tiered
-cache to the underlying device.
+filesystems (see the mergerfs.dup tool) and setting `func.open=rand`,
+using `symlinkify`, or using dm-cache or a similar technology to add
+tiered cache to the underlying device.
 
 With #2 one could use dm-cache as well but there is another solution
 which requires only mergerfs and a cronjob.
 
-1. Create 2 mergerfs pools. One which includes just the slow drives
-and one which has both the fast drives (SSD,NVME,etc.) and slow
-drives.
-2. The 'cache' pool should have the cache drives listed first.
+1. Create 2 mergerfs pools. One which includes just the slow devices
+and one which has both the fast devices (SSD,NVME,etc.) and slow
+devices.
+2. The 'cache' pool should have the cache filesystems listed first.
 3. The best `create` policies to use for the 'cache' pool would
 probably be `ff`, `epff`, `lfs`, or `eplfs`. The latter two under
-the assumption that the cache drive(s) are far smaller than the
-backing drives. If using path preserving policies remember that
+the assumption that the cache filesystem(s) are far smaller than the
+backing filesystems. If using path preserving policies remember that
 you'll need to manually create the core directories of those paths
 you wish to be cached. Be sure the permissions are in sync. Use
-`mergerfs.fsck` to check / correct them. You could also tag the
-slow drives as `=NC` though that'd mean if the cache drives fill
-you'd get "out of space" errors.
+`mergerfs.fsck` to check / correct them. You could also set the
+slow filesystems mode to `NC` though that'd mean if the cache
+filesystems fill you'd get "out of space" errors.
 4. Enable `moveonenospc` and set `minfreespace` appropriately. To make
 sure there is enough room on the "slow" pool you might want to set
 `minfreespace` to at least as large as the size of the largest
-cache drive if not larger. This way in the worst case the whole of
-the cache drive(s) can be moved to the other drives.
+cache filesystem if not larger. This way in the worst case the
+whole of the cache filesystem(s) can be moved to the other drives.
 5. Set your programs to use the cache pool.
 6. Save one of the below scripts or create you're own.
 7. Use `cron` (as root) to schedule the command at whatever frequency
@@ -1442,15 +1424,15 @@ rather than days. May want to use the `fadvise` / `--drop-cache`
 version of rsync or run rsync with the tool "nocache".
 
 *NOTE:* The arguments to these scripts include the cache
-**drive**. Not the pool with the cache drive. You could have data loss
-if the source is the cache pool.
+**filesystem** itself. Not the pool with the cache filesystem. You
+could have data loss if the source is the cache pool.
 
 
 ```
 #!/bin/bash
 
 if [ $# != 3 ]; then
-echo "usage: $0 <cache-drive> <backing-pool> <days-old>"
+echo "usage: $0 <cache-fs> <backing-pool> <days-old>"
 exit 1
 fi
 
@@ -1469,15 +1451,15 @@ Move the oldest file from the cache to the backing pool. Continue till
 below percentage threshold.
 
 *NOTE:* The arguments to these scripts include the cache
-**drive**. Not the pool with the cache drive. You could have data loss
-if the source is the cache pool.
+**filesystem** itself. Not the pool with the cache filesystem. You
+could have data loss if the source is the cache pool.
 
 
 ```
 #!/bin/bash
 
 if [ $# != 3 ]; then
-echo "usage: $0 <cache-drive> <backing-pool> <percentage>"
+echo "usage: $0 <cache-fs> <backing-pool> <percentage>"
 exit 1
 fi
 
@@ -1506,7 +1488,7 @@ FUSE filesystem working from userspace there is an increase in
 overhead relative to kernel based solutions. That said the performance
 can match the theoretical max but it depends greatly on the system's
 configuration. Especially when adding network filesystems into the mix
-there are many variables which can impact performance. Drive speeds
+there are many variables which can impact performance. Device speeds
 and latency, network speeds and latency, general concurrency,
 read/write sizes, etc. Unfortunately, given the number of variables it
 has been difficult to find a single set of settings which provide
@@ -1528,7 +1510,7 @@ understand what behaviors it may impact
 * disable `async_read`
 * test theoretical performance using `nullrw` or mounting a ram disk
 * use `symlinkify` if your data is largely static and read-only
-* use tiered cache drives
+* use tiered cache devices
 * use LVM and LVM cache to place a SSD in front of your HDDs
 * increase readahead: `readahead=1024`
 
@@ -1567,9 +1549,9 @@ the order listed (but not combined).
 2. Mount mergerfs over `tmpfs`. `tmpfs` is a RAM disk. Extremely high
 speed and very low latency. This is a more realistic best case
 scenario. Example: `mount -t tmpfs -o size=2G tmpfs /tmp/tmpfs`
-3. Mount mergerfs over a local drive. NVMe, SSD, HDD, etc. If you have
-more than one I'd suggest testing each of them as drives and/or
-controllers (their drivers) could impact performance.
+3. Mount mergerfs over a local device. NVMe, SSD, HDD, etc. If you
+have more than one I'd suggest testing each of them as drives
+and/or controllers (their drivers) could impact performance.
 4. Finally, if you intend to use mergerfs with a network filesystem,
 either as the source of data or to combine with another through
 mergerfs, test each of those alone as above.
@@ -1579,7 +1561,7 @@ further testing with different options to see if they impact
 performance. For reads and writes the most relevant would be:
 `cache.files`, `async_read`. Less likely but relevant when using NFS
 or with certain filesystems would be `security_capability`, `xattr`,
-and `posix_acl`. If you find a specific system, drive, filesystem,
+and `posix_acl`. If you find a specific system, device, filesystem,
 controller, etc. that performs poorly contact trapexit so he may
 investigate further.
 
||||||
|
@ -1632,7 +1614,7 @@ echo 3 | sudo tee /proc/sys/vm/drop_caches
|
||||||
* If you don't see some directories and files you expect, policies
|
* If you don't see some directories and files you expect, policies
|
||||||
seem to skip branches, you get strange permission errors, etc. be
|
seem to skip branches, you get strange permission errors, etc. be
|
||||||
sure the underlying filesystems' permissions are all the same. Use
|
sure the underlying filesystems' permissions are all the same. Use
|
||||||
`mergerfs.fsck` to audit the drive for out of sync permissions.
|
`mergerfs.fsck` to audit the filesystem for out of sync permissions.
|
||||||
* If you still have permission issues be sure you are using POSIX ACL
|
* If you still have permission issues be sure you are using POSIX ACL
|
||||||
compliant filesystems. mergerfs doesn't generally make exceptions
|
compliant filesystems. mergerfs doesn't generally make exceptions
|
||||||
for FAT, NTFS, or other non-POSIX filesystems.
|
for FAT, NTFS, or other non-POSIX filesystems.
|
||||||
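
As a rough illustration of what "out of sync permissions" means, the sketch below compares the mode, owner, and group of the same relative path on each branch. It is not `mergerfs.fsck` and makes no claim about how that tool works; the function name `permission_mismatches` and the branch paths in the usage note are hypothetical.

```python
import os
import stat


def permission_mismatches(branches, relative_path):
    """Return {branch: (mode, uid, gid)} if the branches disagree, else {}."""
    seen = {}
    for branch in branches:
        # Strip a leading "/" so os.path.join keeps the branch prefix.
        full = os.path.join(branch, relative_path.lstrip("/"))
        try:
            st = os.lstat(full)
        except FileNotFoundError:
            continue  # Path is absent on this branch; nothing to compare.
        seen[branch] = (stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)
    # More than one distinct tuple means the branches are out of sync.
    return seen if len(set(seen.values())) > 1 else {}
```

For example, `permission_mismatches(["/mnt/disk1", "/mnt/disk2"], "media")` would return the differing values per branch, or an empty dict when they match.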
|
@ -1684,7 +1666,7 @@ outdated.
|
||||||
The reason this is the default is because any other policy would be
|
The reason this is the default is because any other policy would be
|
||||||
more expensive and for many applications it is unnecessary. To always
|
more expensive and for many applications it is unnecessary. To always
|
||||||
return the directory with the most recent mtime or a faked value based
|
return the directory with the most recent mtime or a faked value based
|
||||||
on all found would require a scan of all drives.
|
on all found would require a scan of all filesystems.
|
||||||
|
|
||||||
If you always want the directory information from the one with the
|
If you always want the directory information from the one with the
|
||||||
most recent mtime then use the `newest` policy for `getattr`.
|
most recent mtime then use the `newest` policy for `getattr`.
|
||||||
|
@ -1709,9 +1691,9 @@ then removing the source. Since the source **is** the target in this
|
||||||
case, depending on the unlink policy, it will remove the just copied
|
case, depending on the unlink policy, it will remove the just copied
|
||||||
file and other files across the branches.
|
file and other files across the branches.
|
||||||
|
|
||||||
If you want to move files to one drive just copy them there and use
|
If you want to move files to one filesystem just copy them there and
|
||||||
mergerfs.dedup to clean up the old paths or manually remove them from
|
use mergerfs.dedup to clean up the old paths or manually remove them
|
||||||
the branches directly.
|
from the branches directly.
|
||||||
|
|
||||||
|
|
||||||
#### cached memory appears greater than it should be
|
#### cached memory appears greater than it should be
|
||||||
|
@ -1772,15 +1754,14 @@ Please read the section above regarding [rename & link](#rename--link).
|
||||||
|
|
||||||
The problem is that many applications do not properly handle `EXDEV`
|
The problem is that many applications do not properly handle `EXDEV`
|
||||||
errors which `rename` and `link` may return even though they are
|
errors which `rename` and `link` may return even though they are
|
||||||
perfectly valid situations which do not indicate actual drive or OS
|
perfectly valid situations which do not indicate actual device,
|
||||||
errors. The error will only be returned by mergerfs if using a path
|
filesystem, or OS errors. The error will only be returned by mergerfs
|
||||||
preserving policy as described in the policy section above. If you do
|
if using a path preserving policy as described in the policy section
|
||||||
not care about path preservation simply change the mergerfs policy to
|
above. If you do not care about path preservation simply change the
|
||||||
the non-path preserving version. For example: `-o category.create=mfs`
|
mergerfs policy to the non-path preserving version. For example: `-o
|
||||||
|
category.create=mfs` Ideally the offending software would be fixed and
|
||||||
Ideally the offending software would be fixed and it is recommended
|
it is recommended that if you run into this problem you contact the
|
||||||
that if you run into this problem you contact the software's author
|
software's author and request proper handling of `EXDEV` errors.
|
||||||
and request proper handling of `EXDEV` errors.
|
|
||||||
|
|
||||||
|
|
||||||
#### my 32bit software has problems
|
#### my 32bit software has problems
|
||||||
|
@ -1887,9 +1868,10 @@ Users have reported running mergerfs on everything from a Raspberry Pi
|
||||||
to dual socket Xeon systems with >20 cores. I'm aware of at least a
|
to dual socket Xeon systems with >20 cores. I'm aware of at least a
|
||||||
few companies which use mergerfs in production. [Open Media
|
few companies which use mergerfs in production. [Open Media
|
||||||
Vault](https://www.openmediavault.org) includes mergerfs as its sole
|
Vault](https://www.openmediavault.org) includes mergerfs as its sole
|
||||||
solution for pooling drives. The author of mergerfs had it running for
|
solution for pooling filesystems. The author of mergerfs had it
|
||||||
over 300 days managing 16+ drives with reasonably heavy 24/7 read and
|
running for over 300 days managing 16+ devices with reasonably heavy
|
||||||
write usage. Stopping only after the machine's power supply died.
|
24/7 read and write usage. Stopping only after the machine's power
|
||||||
|
supply died.
|
||||||
|
|
||||||
Most serious issues (crashes or data corruption) have been due to
|
Most serious issues (crashes or data corruption) have been due to
|
||||||
[kernel
|
[kernel
|
||||||
|
@ -1897,14 +1879,14 @@ bugs](https://github.com/trapexit/mergerfs/wiki/Kernel-Issues-&-Bugs). All
|
||||||
of which are fixed in stable releases.
|
of which are fixed in stable releases.
|
||||||
|
|
||||||
|
|
||||||
#### Can mergerfs be used with drives which already have data / are in use?
|
#### Can mergerfs be used with filesystems which already have data / are in use?
|
||||||
|
|
||||||
Yes. MergerFS is a proxy and does **NOT** interfere with the normal
|
Yes. MergerFS is a proxy and does **NOT** interfere with the normal
|
||||||
form or function of the drives / mounts / paths it manages.
|
form or function of the filesystems / mounts / paths it manages.
|
||||||
|
|
||||||
MergerFS is **not** a traditional filesystem. MergerFS is **not**
|
MergerFS is **not** a traditional filesystem. MergerFS is **not**
|
||||||
RAID. It does **not** manipulate the data that passes through it. It
|
RAID. It does **not** manipulate the data that passes through it. It
|
||||||
does **not** shard data across drives. It merely shards some
|
does **not** shard data across filesystems. It merely shards some
|
||||||
**behavior** and aggregates others.
|
**behavior** and aggregates others.
|
||||||
|
|
||||||
|
|
||||||
|
@ -1920,8 +1902,8 @@ best off using `mfs` for `category.create`. It will spread files out
|
||||||
across your branches based on available space. Use `mspmfs` if you
|
across your branches based on available space. Use `mspmfs` if you
|
||||||
want to try to colocate the data a bit more. You may want to use `lus`
|
want to try to colocate the data a bit more. You may want to use `lus`
|
||||||
if you prefer a slightly different distribution of data if you have a
|
if you prefer a slightly different distribution of data if you have a
|
||||||
mix of smaller and larger drives. Generally though `mfs`, `lus`, or
|
mix of smaller and larger filesystems. Generally though `mfs`, `lus`,
|
||||||
even `rand` are good for the general use case. If you are starting
|
or even `rand` are good for the general use case. If you are starting
|
||||||
with an imbalanced pool you can use the tool **mergerfs.balance** to
|
with an imbalanced pool you can use the tool **mergerfs.balance** to
|
||||||
redistribute files across the pool.
|
redistribute files across the pool.
|
||||||
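
To make the difference between `mfs` and `lus` concrete, here is a simplified sketch of the selection step only. It assumes the policy names mean what the policy table says (most free space, least used space) and ignores everything else mergerfs considers, such as branch modes and `minfreespace`. The helper names `free_and_used` and `pick_branch` are hypothetical.

```python
import os


def free_and_used(path):
    """Return (available_bytes, used_bytes) for the filesystem holding path."""
    st = os.statvfs(path)
    available = st.f_bavail * st.f_frsize
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    return available, used


def pick_branch(branches, policy="mfs", stats=free_and_used):
    """Pick a branch for a new file: 'mfs' = most free, 'lus' = least used."""
    if policy == "mfs":
        return max(branches, key=lambda b: stats(b)[0])
    if policy == "lus":
        return min(branches, key=lambda b: stats(b)[1])
    raise ValueError(f"unsupported policy: {policy}")
```

With a large, mostly full filesystem next to a small, mostly empty one, `mfs` may keep choosing the large one while `lus` prefers the small one, which is the "slightly different distribution" mentioned above.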
|
|
||||||
|
@ -1929,8 +1911,8 @@ If you really wish to try to colocate files based on directory you can
|
||||||
set `func.create` to `epmfs` or similar and `func.mkdir` to `rand` or
|
set `func.create` to `epmfs` or similar and `func.mkdir` to `rand` or
|
||||||
`eprand` depending on if you just want to colocate generally or on
|
`eprand` depending on if you just want to colocate generally or on
|
||||||
specific branches. Either way the *need* to colocate is rare. For
|
specific branches. Either way the *need* to colocate is rare. For
|
||||||
instance: if you wish to remove the drive regularly and want the data
|
instance: if you wish to remove the device regularly and want the data
|
||||||
to predictably be on that drive or if you don't use backup at all and
|
to predictably be on that device or if you don't use backup at all and
|
||||||
don't wish to replace that data piecemeal. In which case using path
|
don't wish to replace that data piecemeal. In which case using path
|
||||||
preservation can help but will require some manual
|
preservation can help but will require some manual
|
||||||
attention. Colocating after the fact can be accomplished using the
|
attention. Colocating after the fact can be accomplished using the
|
||||||
|
@ -1965,29 +1947,29 @@ That said, for the average person, the following should be fine:
|
||||||
`cache.files=off,dropcacheonclose=true,category.create=mfs`
|
`cache.files=off,dropcacheonclose=true,category.create=mfs`
|
||||||
|
|
||||||
|
|
||||||
#### Why are all my files ending up on 1 drive?!
|
#### Why are all my files ending up on 1 filesystem?!
|
||||||
|
|
||||||
Did you start with empty drives? Did you explicitly configure a
|
Did you start with empty filesystems? Did you explicitly configure a
|
||||||
`category.create` policy? Are you using an `existing path` / `path
|
`category.create` policy? Are you using an `existing path` / `path
|
||||||
preserving` policy?
|
preserving` policy?
|
||||||
|
|
||||||
The default create policy is `epmfs`. That is a path preserving
|
The default create policy is `epmfs`. That is a path preserving
|
||||||
algorithm. With such a policy for `mkdir` and `create` with a set of
|
algorithm. With such a policy for `mkdir` and `create` with a set of
|
||||||
empty drives it will select only 1 drive when the first directory is
|
empty filesystems it will select only 1 filesystem when the first
|
||||||
created. Anything, files or directories, created in that first
|
directory is created. Anything, files or directories, created in that
|
||||||
directory will be placed on the same branch because it is preserving
|
first directory will be placed on the same branch because it is
|
||||||
paths.
|
preserving paths.
|
||||||
|
|
||||||
This catches a lot of new users off guard but changing the default
|
This catches a lot of new users off guard but changing the default
|
||||||
would break the setup for many existing users. If you do not care
|
would break the setup for many existing users. If you do not care
|
||||||
about path preservation and wish your files to be spread across all
|
about path preservation and wish your files to be spread across all
|
||||||
your drives change to `mfs` or similar policy as described above. If
|
your filesystems change to `mfs` or similar policy as described
|
||||||
you do want path preservation you'll need to perform the manual act of
|
above. If you do want path preservation you'll need to perform the
|
||||||
creating paths on the drives you want the data to land on before
|
manual act of creating paths on the filesystems you want the data to
|
||||||
transferring your data. Setting `func.mkdir=epall` can simplify
|
land on before transferring your data. Setting `func.mkdir=epall` can
|
||||||
managing path preservation for `create`. Or use `func.mkdir=rand` if
|
simplify managing path preservation for `create`. Or use
|
||||||
you're interested in just grouping together directory content by
|
`func.mkdir=rand` if you're interested in just grouping together
|
||||||
drive.
|
directory content by filesystem.
|
||||||
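
The "manual act of creating paths" can be scripted. The sketch below creates the same relative directory directly on each branch you name so that a path preserving policy has somewhere to place data. It is only an illustration: `precreate_path` is a made-up helper and the branch paths in the usage note are placeholders for your own mount points.

```python
import os


def precreate_path(branches, relative_path):
    """Create relative_path on every listed branch, returning the full paths."""
    created = []
    for branch in branches:
        # Strip a leading "/" so os.path.join keeps the branch prefix.
        full = os.path.join(branch, relative_path.lstrip("/"))
        os.makedirs(full, exist_ok=True)  # No error if it already exists.
        created.append(full)
    return created
```

For example, `precreate_path(["/mnt/disk1", "/mnt/disk2"], "media/tv")` would create `media/tv` on both branches. Run it against the branches themselves, not the mergerfs mount point.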
|
|
||||||
|
|
||||||
#### Do hardlinks work?
|
#### Do hardlinks work?
|
||||||
|
@ -2058,8 +2040,8 @@ such, mergerfs always changes its credentials to that of the
|
||||||
caller. This means that if the user does not have access to a file or
|
caller. This means that if the user does not have access to a file or
|
||||||
directory then neither will mergerfs. However, because mergerfs is
|
directory then neither will mergerfs. However, because mergerfs is
|
||||||
creating a union of paths it may be able to read some files and
|
creating a union of paths it may be able to read some files and
|
||||||
directories on one drive but not another resulting in an incomplete
|
directories on one filesystem but not another resulting in an
|
||||||
set.
|
incomplete set.
|
||||||
|
|
||||||
Whenever you run into a split permission issue (seeing some but not
|
Whenever you run into a split permission issue (seeing some but not
|
||||||
all files) try using
|
all files) try using
|
||||||
|
@ -2153,9 +2135,10 @@ overlayfs have.
|
||||||
#### Why use mergerfs over unionfs?
|
#### Why use mergerfs over unionfs?
|
||||||
|
|
||||||
UnionFS is more like aufs than mergerfs in that it offers overlay /
|
UnionFS is more like aufs than mergerfs in that it offers overlay /
|
||||||
CoW features. If you're just looking to create a union of drives and
|
CoW features. If you're just looking to create a union of filesystems
|
||||||
want flexibility in file/directory placement then mergerfs offers that
|
and want flexibility in file/directory placement then mergerfs offers
|
||||||
whereas unionfs is more for overlaying RW filesystems over RO ones.
|
that whereas unionfs is more for overlaying RW filesystems over RO
|
||||||
|
ones.
|
||||||
|
|
||||||
|
|
||||||
#### Why use mergerfs over overlayfs?
|
#### Why use mergerfs over overlayfs?
|
||||||
|
@ -2179,8 +2162,8 @@ without the single point of failure.
|
||||||
#### Why use mergerfs over ZFS?
|
#### Why use mergerfs over ZFS?
|
||||||
|
|
||||||
MergerFS is not intended to be a replacement for ZFS. MergerFS is
|
MergerFS is not intended to be a replacement for ZFS. MergerFS is
|
||||||
intended to provide flexible pooling of arbitrary drives (local or
|
intended to provide flexible pooling of arbitrary filesystems (local
|
||||||
remote), of arbitrary sizes, and arbitrary filesystems. For `write
|
or remote), of arbitrary sizes, and arbitrary filesystems. For `write
|
||||||
once, read many` usecases such as bulk media storage. Where data
|
once, read many` usecases such as bulk media storage. Where data
|
||||||
integrity and backup is managed in other ways. In that situation ZFS
|
integrity and backup is managed in other ways. In that situation ZFS
|
||||||
can introduce a number of costs and limitations as described
|
can introduce a number of costs and limitations as described
|
||||||
|
@ -2200,6 +2183,29 @@ There are a number of UnRAID users who use mergerfs as well though I'm
|
||||||
not entirely familiar with the use case.
|
not entirely familiar with the use case.
|
||||||
|
|
||||||
|
|
||||||
|
#### Why use mergerfs over StableBit's DrivePool?
|
||||||
|
|
||||||
|
DrivePool works only on Windows so not as common an alternative as
|
||||||
|
other Linux solutions. If you want to use Windows then DrivePool is a
|
||||||
|
good option. Functionally the two projects work a bit
|
||||||
|
differently. DrivePool always writes to the filesystem with the most
|
||||||
|
free space and later rebalances. mergerfs does not offer rebalance but
|
||||||
|
chooses a branch at file/directory create time. DrivePool's
|
||||||
|
rebalancing can be done differently in any directory and has file
|
||||||
|
pattern matching to further customize the behavior. mergerfs, not
|
||||||
|
having rebalancing does not have these features, but similar features
|
||||||
|
are planned for mergerfs v3. DrivePool has builtin file duplication
|
||||||
|
which mergerfs does not natively support (but can be done via an
|
||||||
|
external script.)
|
||||||
|
|
||||||
|
There are a lot of misc differences between the two projects but most
|
||||||
|
features in DrivePool can be replicated with external tools in
|
||||||
|
combination with mergerfs.
|
||||||
|
|
||||||
|
Additionally DrivePool is a closed source commercial product vs
|
||||||
|
mergerfs, an ISC licensed OSS project.
|
||||||
|
|
||||||
|
|
||||||
#### What should mergerfs NOT be used for?
|
#### What should mergerfs NOT be used for?
|
||||||
|
|
||||||
* databases: Even if the database stored data in separate files
|
* databases: Even if the database stored data in separate files
|
||||||
|
@ -2214,7 +2220,7 @@ not entirely familiar with the use case.
|
||||||
availability you should stick with RAID.
|
availability you should stick with RAID.
|
||||||
|
|
||||||
|
|
||||||
#### Can drives be written to directly? Outside of mergerfs while pooled?
|
#### Can filesystems be written to directly? Outside of mergerfs while pooled?
|
||||||
|
|
||||||
Yes, however it's not recommended to use the same file from within the
|
Yes, however it's not recommended to use the same file from within the
|
||||||
pool and from without at the same time (particularly
|
pool and from without at the same time (particularly
|
||||||
|
@ -2244,7 +2250,7 @@ was asked of it: filtering possible branches due to those
|
||||||
settings. Only one error can be returned and if one of the reasons for
|
settings. Only one error can be returned and if one of the reasons for
|
||||||
filtering a branch was **minfreespace** then it will be returned as
|
filtering a branch was **minfreespace** then it will be returned as
|
||||||
such. **moveonenospc** is only relevant to writing a file which is too
|
such. **moveonenospc** is only relevant to writing a file which is too
|
||||||
large for the drive its currently on.
|
large for the filesystem it's currently on.
|
||||||
|
|
||||||
It is also possible that the filesystem selected has run out of
|
It is also possible that the filesystem selected has run out of
|
||||||
inodes. Use `df -i` to list the total and available inodes per
|
inodes. Use `df -i` to list the total and available inodes per
|
||||||
|
@ -2336,7 +2342,8 @@ away by using realtime signals to inform all threads to change
|
||||||
credentials. Taking after **Samba**, mergerfs uses
|
credentials. Taking after **Samba**, mergerfs uses
|
||||||
**syscall(SYS_setreuid,...)** to set the caller's credentials for that
|
**syscall(SYS_setreuid,...)** to set the caller's credentials for that
|
||||||
thread only. Jumping back to **root** as necessary should escalated
|
thread only. Jumping back to **root** as necessary should escalated
|
||||||
privileges be needed (for instance: to clone paths between drives).
|
privileges be needed (for instance: to clone paths between
|
||||||
|
filesystems).
|
||||||
|
|
||||||
For non-Linux systems mergerfs uses a read-write lock and changes
|
For non-Linux systems mergerfs uses a read-write lock and changes
|
||||||
credentials only when necessary. If multiple threads are to be user X
|
credentials only when necessary. If multiple threads are to be user X
|
||||||
|
|
263
man/mergerfs.1
|
@ -77,9 +77,9 @@ A + B = C
|
||||||
mergerfs does \f[B]not\f[R] support the copy-on-write (CoW) or whiteout
|
mergerfs does \f[B]not\f[R] support the copy-on-write (CoW) or whiteout
|
||||||
behaviors found in \f[B]aufs\f[R] and \f[B]overlayfs\f[R].
|
behaviors found in \f[B]aufs\f[R] and \f[B]overlayfs\f[R].
|
||||||
You can \f[B]not\f[R] mount a read-only filesystem and write to it.
|
You can \f[B]not\f[R] mount a read-only filesystem and write to it.
|
||||||
However, mergerfs will ignore read-only drives when creating new files
|
However, mergerfs will ignore read-only filesystems when creating new
|
||||||
so you can mix read-write and read-only drives.
|
files so you can mix read-write and read-only filesystems.
|
||||||
It also does \f[B]not\f[R] split data across drives.
|
It also does \f[B]not\f[R] split data across filesystems.
|
||||||
It is not RAID0 / striping.
|
It is not RAID0 / striping.
|
||||||
It is simply a union of other filesystems.
|
It is simply a union of other filesystems.
|
||||||
.SH TERMINOLOGY
|
.SH TERMINOLOGY
|
||||||
|
@ -210,7 +210,8 @@ Typically rename and link act differently depending on the policy of
|
||||||
\f[C]create\f[R] (read below).
|
\f[C]create\f[R] (read below).
|
||||||
Enabling this will cause rename and link to always use the non-path
|
Enabling this will cause rename and link to always use the non-path
|
||||||
preserving behavior.
|
preserving behavior.
|
||||||
This means files, when renamed or linked, will stay on the same drive.
|
This means files, when renamed or linked, will stay on the same
|
||||||
|
filesystem.
|
||||||
(default: false)
|
(default: false)
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
\f[B]security_capability=BOOL\f[R]: If false return ENOATTR when xattr
|
\f[B]security_capability=BOOL\f[R]: If false return ENOATTR when xattr
|
||||||
|
@ -233,7 +234,7 @@ to cow-shell.
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
\f[B]statfs=base|full\f[R]: Controls how statfs works.
|
\f[B]statfs=base|full\f[R]: Controls how statfs works.
|
||||||
`base' means it will always use all branches in statfs calculations.
|
`base' means it will always use all branches in statfs calculations.
|
||||||
`full' is in effect path preserving and only includes drives where the
|
`full' is in effect path preserving and only includes branches where the
|
||||||
path exists.
|
path exists.
|
||||||
(default: base)
|
(default: base)
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
|
@ -442,10 +443,10 @@ POLICY = mergerfs function policy
|
||||||
.PP
|
.PP
|
||||||
The `branches' argument is a colon (`:') delimited list of paths to be
|
The `branches' argument is a colon (`:') delimited list of paths to be
|
||||||
pooled together.
|
pooled together.
|
||||||
It does not matter if the paths are on the same or different drives nor
|
It does not matter if the paths are on the same or different filesystems
|
||||||
does it matter the filesystem (within reason).
|
nor does it matter the filesystem type (within reason).
|
||||||
Used and available space will not be duplicated for paths on the same
|
Used and available space will not be duplicated for paths on the same
|
||||||
device and any features which aren\[cq]t supported by the underlying
|
filesystem and any features which aren\[cq]t supported by the underlying
|
||||||
filesystem (such as file attributes or extended attributes) will return
|
filesystem (such as file attributes or extended attributes) will return
|
||||||
the appropriate errors.
|
the appropriate errors.
|
||||||
.PP
|
.PP
|
||||||
|
@ -454,7 +455,7 @@ A type which impacts whether or not the branch is included in a policy
|
||||||
calculation and an individual minfreespace value.
|
calculation and an individual minfreespace value.
|
||||||
The values are set by appending an \f[C]=\f[R] at the end of a branch
|
The values are set by appending an \f[C]=\f[R] at the end of a branch
|
||||||
designation and using commas as delimiters.
|
designation and using commas as delimiters.
|
||||||
Example: /mnt/drive=RW,1234
|
Example: \f[C]/mnt/drive=RW,1234\f[R]
|
||||||
.SS branch mode
|
.SS branch mode
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
RW: (read/write) - Default behavior.
|
RW: (read/write) - Default behavior.
|
||||||
|
@ -748,8 +749,8 @@ This is unlikely to occur in practice but is something to keep in mind.
|
||||||
\f[B]WARNING:\f[R] Some backup solutions, such as CrashPlan, do not
|
\f[B]WARNING:\f[R] Some backup solutions, such as CrashPlan, do not
|
||||||
backup the target of a symlink.
|
backup the target of a symlink.
|
||||||
If using this feature it will be necessary to point any backup software
|
If using this feature it will be necessary to point any backup software
|
||||||
to the original drives or configure the software to follow symlinks if
|
to the original filesystems or configure the software to follow symlinks
|
||||||
such an option is available.
|
if such an option is available.
|
||||||
Alternatively create two mounts.
|
Alternatively create two mounts.
|
||||||
One for backup and one for general consumption.
|
One for backup and one for general consumption.
|
||||||
.SS nullrw
|
.SS nullrw
|
||||||
|
@ -939,11 +940,11 @@ All policies which start with \f[C]ep\f[R] (\f[B]epff\f[R],
|
||||||
\f[C]path preserving\f[R].
|
\f[C]path preserving\f[R].
|
||||||
\f[C]ep\f[R] stands for \f[C]existing path\f[R].
|
\f[C]ep\f[R] stands for \f[C]existing path\f[R].
|
||||||
.PP
|
.PP
|
||||||
A path preserving policy will only consider drives where the relative
|
A path preserving policy will only consider branches where the relative
|
||||||
path being accessed already exists.
|
path being accessed already exists.
|
||||||
.PP
|
.PP
|
||||||
When using non-path preserving policies paths will be cloned to target
|
When using non-path preserving policies paths will be cloned to target
|
||||||
drives as necessary.
|
branches as necessary.
|
||||||
.PP
|
.PP
|
||||||
With the \f[C]msp\f[R] or \f[C]most shared path\f[R] policies they are
|
With the \f[C]msp\f[R] or \f[C]most shared path\f[R] policies they are
|
||||||
defined as \f[C]path preserving\f[R] for the purpose of controlling
|
defined as \f[C]path preserving\f[R] for the purpose of controlling
|
||||||
|
@ -990,19 +991,19 @@ T}
|
||||||
T{
|
T{
|
||||||
eplfs (existing path, least free space)
|
eplfs (existing path, least free space)
|
||||||
T}@T{
|
T}@T{
|
||||||
Of all the branches on which the relative path exists choose the drive
|
Of all the branches on which the relative path exists choose the branch
|
||||||
with the least free space.
|
with the least free space.
|
||||||
T}
|
T}
|
||||||
T{
|
T{
|
||||||
eplus (existing path, least used space)
|
eplus (existing path, least used space)
|
||||||
T}@T{
|
T}@T{
|
||||||
Of all the branches on which the relative path exists choose the drive
|
Of all the branches on which the relative path exists choose the branch
|
||||||
with the least used space.
|
with the least used space.
|
||||||
T}
|
T}
|
||||||
T{
|
T{
|
||||||
epmfs (existing path, most free space)
|
epmfs (existing path, most free space)
|
||||||
T}@T{
|
T}@T{
|
||||||
Of all the branches on which the relative path exists choose the drive
|
Of all the branches on which the relative path exists choose the branch
|
||||||
with the most free space.
|
with the most free space.
|
||||||
T}
|
T}
|
||||||
T{
|
T{
|
||||||
|
@ -1019,23 +1020,23 @@ T}
|
||||||
T{
|
T{
|
||||||
ff (first found)
|
ff (first found)
|
||||||
T}@T{
|
T}@T{
|
||||||
Given the order of the drives, as defined at mount time or configured at
|
Given the order of the branches, as defined at mount time or configured
|
||||||
runtime, act on the first one found.
|
at runtime, act on the first one found.
|
||||||
T}
|
T}
|
||||||
T{
|
T{
|
||||||
lfs (least free space)
|
lfs (least free space)
|
||||||
T}@T{
|
T}@T{
|
||||||
Pick the drive with the least available free space.
|
Pick the branch with the least available free space.
|
||||||
T}
|
T}
|
||||||
T{
|
T{
|
||||||
lus (least used space)
|
lus (least used space)
|
||||||
T}@T{
|
T}@T{
|
||||||
Pick the drive with the least used space.
|
Pick the branch with the least used space.
|
||||||
T}
|
T}
|
||||||
T{
|
T{
|
||||||
mfs (most free space)
|
mfs (most free space)
|
||||||
T}@T{
|
T}@T{
|
||||||
Pick the drive with the most available free space.
|
Pick the branch with the most available free space.
|
||||||
T}
|
T}
|
||||||
T{
|
T{
|
||||||
msplfs (most shared path, least free space)
|
msplfs (most shared path, least free space)
|
||||||
|
@ -1141,8 +1142,8 @@ If a rename can\[cq]t be done atomically due to the source and
|
||||||
destination paths existing on different mount points it will return
|
destination paths existing on different mount points it will return
|
||||||
\f[B]-1\f[R] with \f[B]errno = EXDEV\f[R] (cross device / improper
|
\f[B]-1\f[R] with \f[B]errno = EXDEV\f[R] (cross device / improper
|
||||||
link).
|
link).
|
||||||
So if a \f[C]rename\f[R]\[cq]s source and target are on different drives
|
So if a \f[C]rename\f[R]\[cq]s source and target are on different
|
||||||
within the pool it creates an issue.
|
filesystems within the pool it creates an issue.
|
||||||
.PP
|
.PP
|
||||||
Originally mergerfs would return EXDEV whenever a rename was requested
|
Originally mergerfs would return EXDEV whenever a rename was requested
|
||||||
which was cross directory in any way.
|
which was cross directory in any way.
|
||||||
|
@ -1169,7 +1170,7 @@ For each file attempt rename:
|
||||||
If failure with ENOENT (no such file or directory) run \f[B]create\f[R]
|
If failure with ENOENT (no such file or directory) run \f[B]create\f[R]
|
||||||
policy
|
policy
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
If create policy returns the same drive as currently evaluating then
|
If create policy returns the same branch as currently evaluating then
|
||||||
clone the path
|
clone the path
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
Re-attempt rename
|
Re-attempt rename
|
||||||
|
@ -1184,9 +1185,9 @@ returned
|
||||||
On success:
|
On success:
|
||||||
.RS 2
|
.RS 2
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
Remove the target from all drives with no source file
|
Remove the target from all branches with no source file
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
Remove the source from all drives which failed to rename
|
Remove the source from all branches which failed to rename
|
||||||
.RE
|
.RE
|
||||||
.RE
|
.RE
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
|
@ -1201,10 +1202,10 @@ Using the \f[B]getattr\f[R] policy get the target path
|
||||||
For each file attempt rename:
|
For each file attempt rename:
|
||||||
.RS 2
|
.RS 2
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
If the source drive != target drive:
|
If the source branch != target branch:
|
||||||
.RS 2
|
.RS 2
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
Clone target path from target drive to source drive
|
Clone target path from target branch to source branch
|
||||||
.RE
|
.RE
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
Rename
|
Rename
|
||||||
|
@ -1219,9 +1220,9 @@ returned
|
||||||
On success:
|
On success:
|
||||||
.RS 2
|
.RS 2
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
Remove the target from all drives with no source file
|
Remove the target from all branches with no source file
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
Remove the source from all drives which failed to rename
|
Remove the source from all branches which failed to rename
|
||||||
.RE
|
.RE
|
||||||
.RE
|
.RE
|
||||||
.PP
|
.PP
|
||||||
|
@ -1247,14 +1248,14 @@ file/directory which is the source of the metadata you see in an
|
||||||
.SS statfs / statvfs
|
.SS statfs / statvfs
|
||||||
.PP
|
.PP
|
||||||
statvfs (http://linux.die.net/man/2/statvfs) normalizes the source
|
statvfs (http://linux.die.net/man/2/statvfs) normalizes the source
|
||||||
drives based on the fragment size and sums the number of adjusted blocks
|
filesystems based on the fragment size and sums the number of adjusted
|
||||||
and inodes.
|
blocks and inodes.
|
||||||
This means you will see the combined space of all sources.
|
This means you will see the combined space of all sources.
|
||||||
Total, used, and free.
|
Total, used, and free.
|
||||||
The sources however are deduped based on the drive so multiple sources
|
The sources however are deduped based on the filesystem so multiple
|
||||||
on the same drive will not result in double counting its space.
|
sources on the same drive will not result in double counting its space.
|
||||||
Filesystems mounted further down the tree of the branch will not be
|
Other filesystems mounted further down the tree of the branch will not
|
||||||
included when checking the mount\[cq]s stats.
|
be included when checking the mount\[cq]s stats.
|
||||||
.PP
|
.PP
|
||||||
The options \f[C]statfs\f[R] and \f[C]statfs_ignore\f[R] can be used to
|
The options \f[C]statfs\f[R] and \f[C]statfs_ignore\f[R] can be used to
|
||||||
modify \f[C]statfs\f[R] behavior.
|
modify \f[C]statfs\f[R] behavior.
|
||||||
|
@ -1611,11 +1612,11 @@ mergerfs.dedup: Will help identify and optionally remove duplicate files
|
||||||
mergerfs.dup: Ensure there are at least N copies of a file across the
|
mergerfs.dup: Ensure there are at least N copies of a file across the
|
||||||
pool
|
pool
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
mergerfs.balance: Rebalance files across drives by moving them from the
|
mergerfs.balance: Rebalance files across filesystems by moving them from
|
||||||
most filled to the least filled
|
the most filled to the least filled
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
mergerfs.consolidate: move files within a single mergerfs directory to
|
mergerfs.consolidate: move files within a single mergerfs directory to
|
||||||
the drive with most free space
|
the filesystem with most free space
|
||||||
.RE
|
.RE
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
https://github.com/trapexit/scorch
|
https://github.com/trapexit/scorch
|
||||||
|
@ -1746,40 +1747,21 @@ Note that if an application is properly sizing writes then writeback
|
||||||
caching will have little or no effect.
|
caching will have little or no effect.
|
||||||
It will only help with writes of sizes below the FUSE message size (128K
|
It will only help with writes of sizes below the FUSE message size (128K
|
||||||
on older kernels, 1M on newer).
|
on older kernels, 1M on newer).
|
||||||
.SS policy caching
|
|
||||||
.PP
|
|
||||||
Policies are run every time a function (with a policy as mentioned
|
|
||||||
above) is called.
|
|
||||||
These policies can be expensive depending on mergerfs\[cq] setup and
|
|
||||||
client usage patterns.
|
|
||||||
Generally we wouldn\[cq]t want to cache policy results because it may
|
|
||||||
result in stale responses if the underlying drives are used directly.
|
|
||||||
.PP
|
|
||||||
The \f[C]open\f[R] policy cache will cache the result of an
|
|
||||||
\f[C]open\f[R] policy for a particular input for \f[C]cache.open\f[R]
|
|
||||||
seconds or until the file is unlinked.
|
|
||||||
Each file close (release) will randomly chose to clean up the cache of
|
|
||||||
expired entries.
|
|
||||||
.PP
|
|
||||||
This cache is really only useful in cases where you have a large number
|
|
||||||
of branches and \f[C]open\f[R] is called on the same files repeatedly
|
|
||||||
(like \f[B]Transmission\f[R] which opens and closes a file on every
|
|
||||||
read/write presumably to keep file handle usage low).
|
|
||||||
.SS statfs caching
|
.SS statfs caching
|
||||||
.PP
|
.PP
|
||||||
Of the syscalls used by mergerfs in policies the \f[C]statfs\f[R] /
|
Of the syscalls used by mergerfs in policies the \f[C]statfs\f[R] /
|
||||||
\f[C]statvfs\f[R] call is perhaps the most expensive.
|
\f[C]statvfs\f[R] call is perhaps the most expensive.
|
||||||
It\[cq]s used to find out the available space of a drive and whether it
|
It\[cq]s used to find out the available space of a filesystem and
|
||||||
is mounted read-only.
|
whether it is mounted read-only.
|
||||||
Depending on the setup and usage pattern these queries can be relatively
|
Depending on the setup and usage pattern these queries can be relatively
|
||||||
costly.
|
costly.
|
||||||
When \f[C]cache.statfs\f[R] is enabled all calls to \f[C]statfs\f[R] by
|
When \f[C]cache.statfs\f[R] is enabled all calls to \f[C]statfs\f[R] by
|
||||||
a policy will be cached for the number of seconds its set to.
|
a policy will be cached for the number of seconds its set to.
|
||||||
.PP
|
.PP
|
||||||
Example: If the create policy is \f[C]mfs\f[R] and the timeout is 60
|
Example: If the create policy is \f[C]mfs\f[R] and the timeout is 60
|
||||||
then for that 60 seconds the same drive will be returned as the target
|
then for that 60 seconds the same filesystem will be returned as the
|
||||||
for creates because the available space won\[cq]t be updated for that
|
target for creates because the available space won\[cq]t be updated for
|
||||||
time.
|
that time.
|
||||||
.SS symlink caching
|
.SS symlink caching
|
||||||
.PP
|
.PP
|
||||||
As of version 4.20 Linux supports symlink caching.
|
As of version 4.20 Linux supports symlink caching.
|
||||||
|
@ -1815,54 +1797,55 @@ NVMe, SSD, Optane in front of traditional HDDs for instance.
|
||||||
MergerFS does not natively support any sort of tiered caching.
|
MergerFS does not natively support any sort of tiered caching.
|
||||||
Most users have no use for such a feature and its inclusion would
|
Most users have no use for such a feature and its inclusion would
|
||||||
complicate the code.
|
complicate the code.
|
||||||
However, there are a few situations where a cache drive could help with
|
However, there are a few situations where a cache filesystem could help
|
||||||
a typical mergerfs setup.
|
with a typical mergerfs setup.
|
||||||
.IP "1." 3
|
.IP "1." 3
|
||||||
Fast network, slow drives, many readers: You\[cq]ve a 10+Gbps network
|
Fast network, slow filesystems, many readers: You\[cq]ve a 10+Gbps
|
||||||
with many readers and your regular drives can\[cq]t keep up.
|
network with many readers and your regular filesystems can\[cq]t keep
|
||||||
|
up.
|
||||||
.IP "2." 3
|
.IP "2." 3
|
||||||
Fast network, slow drives, small\[cq]ish bursty writes: You have a
|
Fast network, slow filesystems, small\[cq]ish bursty writes: You have a
|
||||||
10+Gbps network and wish to transfer amounts of data less than your
|
10+Gbps network and wish to transfer amounts of data less than your
|
||||||
cache drive but wish to do so quickly.
|
cache filesystem but wish to do so quickly.
|
||||||
.PP
|
.PP
|
||||||
With #1 it\[cq]s arguable if you should be using mergerfs at all.
|
With #1 it\[cq]s arguable if you should be using mergerfs at all.
|
||||||
RAID would probably be the better solution.
|
RAID would probably be the better solution.
|
||||||
If you\[cq]re going to use mergerfs there are other tactics that may
|
If you\[cq]re going to use mergerfs there are other tactics that may
|
||||||
help: spreading the data across drives (see the mergerfs.dup tool) and
|
help: spreading the data across filesystems (see the mergerfs.dup tool)
|
||||||
setting \f[C]func.open=rand\f[R], using \f[C]symlinkify\f[R], or using
|
and setting \f[C]func.open=rand\f[R], using \f[C]symlinkify\f[R], or
|
||||||
dm-cache or a similar technology to add tiered cache to the underlying
|
using dm-cache or a similar technology to add tiered cache to the
|
||||||
device.
|
underlying device.
|
||||||
.PP
|
.PP
|
||||||
With #2 one could use dm-cache as well but there is another solution
|
With #2 one could use dm-cache as well but there is another solution
|
||||||
which requires only mergerfs and a cronjob.
|
which requires only mergerfs and a cronjob.
|
||||||
.IP "1." 3
|
.IP "1." 3
|
||||||
Create 2 mergerfs pools.
|
Create 2 mergerfs pools.
|
||||||
One which includes just the slow drives and one which has both the fast
|
One which includes just the slow devices and one which has both the fast
|
||||||
drives (SSD,NVME,etc.) and slow drives.
|
devices (SSD,NVME,etc.) and slow devices.
|
||||||
.IP "2." 3
|
.IP "2." 3
|
||||||
The `cache' pool should have the cache drives listed first.
|
The `cache' pool should have the cache filesystems listed first.
|
||||||
.IP "3." 3
|
.IP "3." 3
|
||||||
The best \f[C]create\f[R] policies to use for the `cache' pool would
|
The best \f[C]create\f[R] policies to use for the `cache' pool would
|
||||||
probably be \f[C]ff\f[R], \f[C]epff\f[R], \f[C]lfs\f[R], or
|
probably be \f[C]ff\f[R], \f[C]epff\f[R], \f[C]lfs\f[R], or
|
||||||
\f[C]eplfs\f[R].
|
\f[C]eplfs\f[R].
|
||||||
The latter two under the assumption that the cache drive(s) are far
|
The latter two under the assumption that the cache filesystem(s) are far
|
||||||
smaller than the backing drives.
|
smaller than the backing filesystems.
|
||||||
If using path preserving policies remember that you\[cq]ll need to
|
If using path preserving policies remember that you\[cq]ll need to
|
||||||
manually create the core directories of those paths you wish to be
|
manually create the core directories of those paths you wish to be
|
||||||
cached.
|
cached.
|
||||||
Be sure the permissions are in sync.
|
Be sure the permissions are in sync.
|
||||||
Use \f[C]mergerfs.fsck\f[R] to check / correct them.
|
Use \f[C]mergerfs.fsck\f[R] to check / correct them.
|
||||||
You could also tag the slow drives as \f[C]=NC\f[R] though that\[cq]d
|
You could also set the slow filesystems mode to \f[C]NC\f[R] though
|
||||||
mean if the cache drives fill you\[cq]d get \[lq]out of space\[rq]
|
that\[cq]d mean if the cache filesystems fill you\[cq]d get \[lq]out of
|
||||||
errors.
|
space\[rq] errors.
|
||||||
.IP "4." 3
|
.IP "4." 3
|
||||||
Enable \f[C]moveonenospc\f[R] and set \f[C]minfreespace\f[R]
|
Enable \f[C]moveonenospc\f[R] and set \f[C]minfreespace\f[R]
|
||||||
appropriately.
|
appropriately.
|
||||||
To make sure there is enough room on the \[lq]slow\[rq] pool you might
|
To make sure there is enough room on the \[lq]slow\[rq] pool you might
|
||||||
want to set \f[C]minfreespace\f[R] to at least as large as the size of
|
want to set \f[C]minfreespace\f[R] to at least as large as the size of
|
||||||
the largest cache drive if not larger.
|
the largest cache filesystem if not larger.
|
||||||
This way in the worst case the whole of the cache drive(s) can be moved
|
This way in the worst case the whole of the cache filesystem(s) can be
|
||||||
to the other drives.
|
moved to the other drives.
|
||||||
.IP "5." 3
|
.IP "5." 3
|
||||||
Set your programs to use the cache pool.
|
Set your programs to use the cache pool.
|
||||||
.IP "6." 3
|
.IP "6." 3
|
||||||
|
@ -1880,8 +1863,8 @@ May want to use the \f[C]fadvise\f[R] / \f[C]--drop-cache\f[R] version
|
||||||
of rsync or run rsync with the tool \[lq]nocache\[rq].
|
of rsync or run rsync with the tool \[lq]nocache\[rq].
|
||||||
.PP
|
.PP
|
||||||
\f[I]NOTE:\f[R] The arguments to these scripts include the cache
|
\f[I]NOTE:\f[R] The arguments to these scripts include the cache
|
||||||
\f[B]drive\f[R].
|
\f[B]filesystem\f[R] itself.
|
||||||
Not the pool with the cache drive.
|
Not the pool with the cache filesystem.
|
||||||
You could have data loss if the source is the cache pool.
|
You could have data loss if the source is the cache pool.
|
||||||
.IP
|
.IP
|
||||||
.nf
|
.nf
|
||||||
|
@ -1889,7 +1872,7 @@ You could have data loss if the source is the cache pool.
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
|
|
||||||
if [ $# != 3 ]; then
|
if [ $# != 3 ]; then
|
||||||
echo \[dq]usage: $0 <cache-drive> <backing-pool> <days-old>\[dq]
|
echo \[dq]usage: $0 <cache-fs> <backing-pool> <days-old>\[dq]
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
@ -1907,8 +1890,8 @@ Move the oldest file from the cache to the backing pool.
|
||||||
Continue till below percentage threshold.
|
Continue till below percentage threshold.
|
||||||
.PP
|
.PP
|
||||||
\f[I]NOTE:\f[R] The arguments to these scripts include the cache
|
\f[I]NOTE:\f[R] The arguments to these scripts include the cache
|
||||||
\f[B]drive\f[R].
|
\f[B]filesystem\f[R] itself.
|
||||||
Not the pool with the cache drive.
|
Not the pool with the cache filesystem.
|
||||||
You could have data loss if the source is the cache pool.
|
You could have data loss if the source is the cache pool.
|
||||||
.IP
|
.IP
|
||||||
.nf
|
.nf
|
||||||
|
@ -1916,7 +1899,7 @@ You could have data loss if the source is the cache pool.
|
||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
|
|
||||||
if [ $# != 3 ]; then
|
if [ $# != 3 ]; then
|
||||||
echo \[dq]usage: $0 <cache-drive> <backing-pool> <percentage>\[dq]
|
echo \[dq]usage: $0 <cache-fs> <backing-pool> <percentage>\[dq]
|
||||||
exit 1
|
exit 1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
@ -1946,7 +1929,7 @@ That said the performance can match the theoretical max but it depends
|
||||||
greatly on the system\[cq]s configuration.
|
greatly on the system\[cq]s configuration.
|
||||||
Especially when adding network filesystems into the mix there are many
|
Especially when adding network filesystems into the mix there are many
|
||||||
variables which can impact performance.
|
variables which can impact performance.
|
||||||
Drive speeds and latency, network speeds and latency, general
|
Device speeds and latency, network speeds and latency, general
|
||||||
concurrency, read/write sizes, etc.
|
concurrency, read/write sizes, etc.
|
||||||
Unfortunately, given the number of variables it has been difficult to
|
Unfortunately, given the number of variables it has been difficult to
|
||||||
find a single set of settings which provide optimal performance.
|
find a single set of settings which provide optimal performance.
|
||||||
|
@ -1982,7 +1965,7 @@ disk
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
use \f[C]symlinkify\f[R] if your data is largely static and read-only
|
use \f[C]symlinkify\f[R] if your data is largely static and read-only
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
use tiered cache drives
|
use tiered cache devices
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
use LVM and LVM cache to place a SSD in front of your HDDs
|
use LVM and LVM cache to place a SSD in front of your HDDs
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
|
@ -2029,7 +2012,7 @@ Extremely high speed and very low latency.
|
||||||
This is a more realistic best case scenario.
|
This is a more realistic best case scenario.
|
||||||
Example: \f[C]mount -t tmpfs -o size=2G tmpfs /tmp/tmpfs\f[R]
|
Example: \f[C]mount -t tmpfs -o size=2G tmpfs /tmp/tmpfs\f[R]
|
||||||
.IP "3." 3
|
.IP "3." 3
|
||||||
Mount mergerfs over a local drive.
|
Mount mergerfs over a local device.
|
||||||
NVMe, SSD, HDD, etc.
|
NVMe, SSD, HDD, etc.
|
||||||
If you have more than one I\[cq]d suggest testing each of them as drives
|
If you have more than one I\[cq]d suggest testing each of them as drives
|
||||||
and/or controllers (their drivers) could impact performance.
|
and/or controllers (their drivers) could impact performance.
|
||||||
|
@ -2046,7 +2029,7 @@ For reads and writes the most relevant would be: \f[C]cache.files\f[R],
|
||||||
Less likely but relevant when using NFS or with certain filesystems
|
Less likely but relevant when using NFS or with certain filesystems
|
||||||
would be \f[C]security_capability\f[R], \f[C]xattr\f[R], and
|
would be \f[C]security_capability\f[R], \f[C]xattr\f[R], and
|
||||||
\f[C]posix_acl\f[R].
|
\f[C]posix_acl\f[R].
|
||||||
If you find a specific system, drive, filesystem, controller, etc.
|
If you find a specific system, device, filesystem, controller, etc.
|
||||||
that performs poorly contact trapexit so he may investigate further.
|
that performs poorly contact trapexit so he may investigate further.
|
||||||
.PP
|
.PP
|
||||||
Sometimes the problem is really the application accessing or writing
|
Sometimes the problem is really the application accessing or writing
|
||||||
|
@ -2109,7 +2092,7 @@ exibit incorrect behavior if run otherwise..
|
||||||
If you don\[cq]t see some directories and files you expect, policies
|
If you don\[cq]t see some directories and files you expect, policies
|
||||||
seem to skip branches, you get strange permission errors, etc.
|
seem to skip branches, you get strange permission errors, etc.
|
||||||
be sure the underlying filesystems\[cq] permissions are all the same.
|
be sure the underlying filesystems\[cq] permissions are all the same.
|
||||||
Use \f[C]mergerfs.fsck\f[R] to audit the drive for out of sync
|
Use \f[C]mergerfs.fsck\f[R] to audit the filesystem for out of sync
|
||||||
permissions.
|
permissions.
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
If you still have permission issues be sure you are using POSIX ACL
|
If you still have permission issues be sure you are using POSIX ACL
|
||||||
|
@ -2165,7 +2148,7 @@ appear outdated.
|
||||||
The reason this is the default is because any other policy would be more
|
The reason this is the default is because any other policy would be more
|
||||||
expensive and for many applications it is unnecessary.
|
expensive and for many applications it is unnecessary.
|
||||||
To always return the directory with the most recent mtime or a faked
|
To always return the directory with the most recent mtime or a faked
|
||||||
value based on all found would require a scan of all drives.
|
value based on all found would require a scan of all filesystems.
|
||||||
.PP
|
.PP
|
||||||
If you always want the directory information from the one with the most
|
If you always want the directory information from the one with the most
|
||||||
recent mtime then use the \f[C]newest\f[R] policy for \f[C]getattr\f[R].
|
recent mtime then use the \f[C]newest\f[R] policy for \f[C]getattr\f[R].
|
||||||
|
@ -2191,7 +2174,7 @@ Since the source \f[B]is\f[R] the target in this case, depending on the
|
||||||
unlink policy, it will remove the just copied file and other files
|
unlink policy, it will remove the just copied file and other files
|
||||||
across the branches.
|
across the branches.
|
||||||
.PP
|
.PP
|
||||||
If you want to move files to one drive just copy them there and use
|
If you want to move files to one filesystem just copy them there and use
|
||||||
mergerfs.dedup to clean up the old paths or manually remove them from
|
mergerfs.dedup to clean up the old paths or manually remove them from
|
||||||
the branches directly.
|
the branches directly.
|
||||||
.SS cached memory appears greater than it should be
|
.SS cached memory appears greater than it should be
|
||||||
|
@ -2253,16 +2236,15 @@ Please read the section above regarding rename & link.
|
||||||
The problem is that many applications do not properly handle
|
The problem is that many applications do not properly handle
|
||||||
\f[C]EXDEV\f[R] errors which \f[C]rename\f[R] and \f[C]link\f[R] may
|
\f[C]EXDEV\f[R] errors which \f[C]rename\f[R] and \f[C]link\f[R] may
|
||||||
return even though they are perfectly valid situations which do not
|
return even though they are perfectly valid situations which do not
|
||||||
indicate actual drive or OS errors.
|
indicate actual device, filesystem, or OS errors.
|
||||||
The error will only be returned by mergerfs if using a path preserving
|
The error will only be returned by mergerfs if using a path preserving
|
||||||
policy as described in the policy section above.
|
policy as described in the policy section above.
|
||||||
If you do not care about path preservation simply change the mergerfs
|
If you do not care about path preservation simply change the mergerfs
|
||||||
policy to the non-path preserving version.
|
policy to the non-path preserving version.
|
||||||
For example: \f[C]-o category.create=mfs\f[R]
|
For example: \f[C]-o category.create=mfs\f[R] Ideally the offending
|
||||||
.PP
|
software would be fixed and it is recommended that if you run into this
|
||||||
Ideally the offending software would be fixed and it is recommended that
|
problem you contact the software\[cq]s author and request proper
|
||||||
if you run into this problem you contact the software\[cq]s author and
|
handling of \f[C]EXDEV\f[R] errors.
|
||||||
request proper handling of \f[C]EXDEV\f[R] errors.
|
|
||||||
.SS my 32bit software has problems
|
.SS my 32bit software has problems
|
||||||
.PP
|
.PP
|
||||||
Some software have problems with 64bit inode values.
|
Some software have problems with 64bit inode values.
|
||||||
|
@ -2373,24 +2355,24 @@ to dual socket Xeon systems with >20 cores.
|
||||||
I\[cq]m aware of at least a few companies which use mergerfs in
|
I\[cq]m aware of at least a few companies which use mergerfs in
|
||||||
production.
|
production.
|
||||||
Open Media Vault (https://www.openmediavault.org) includes mergerfs as
|
Open Media Vault (https://www.openmediavault.org) includes mergerfs as
|
||||||
its sole solution for pooling drives.
|
its sole solution for pooling filesystems.
|
||||||
The author of mergerfs had it running for over 300 days managing 16+
|
The author of mergerfs had it running for over 300 days managing 16+
|
||||||
drives with reasonably heavy 24/7 read and write usage.
|
devices with reasonably heavy 24/7 read and write usage.
|
||||||
Stopping only after the machine\[cq]s power supply died.
|
Stopping only after the machine\[cq]s power supply died.
|
||||||
.PP
|
.PP
|
||||||
Most serious issues (crashes or data corruption) have been due to kernel
|
Most serious issues (crashes or data corruption) have been due to kernel
|
||||||
bugs (https://github.com/trapexit/mergerfs/wiki/Kernel-Issues-&-Bugs).
|
bugs (https://github.com/trapexit/mergerfs/wiki/Kernel-Issues-&-Bugs).
|
||||||
All of which are fixed in stable releases.
|
All of which are fixed in stable releases.
|
||||||
.SS Can mergerfs be used with drives which already have data / are in use?
|
.SS Can mergerfs be used with filesystems which already have data / are in use?
|
||||||
.PP
|
.PP
|
||||||
Yes.
|
Yes.
|
||||||
MergerFS is a proxy and does \f[B]NOT\f[R] interfere with the normal
|
MergerFS is a proxy and does \f[B]NOT\f[R] interfere with the normal
|
||||||
form or function of the drives / mounts / paths it manages.
|
form or function of the filesystems / mounts / paths it manages.
|
||||||
.PP
|
.PP
|
||||||
MergerFS is \f[B]not\f[R] a traditional filesystem.
|
MergerFS is \f[B]not\f[R] a traditional filesystem.
|
||||||
MergerFS is \f[B]not\f[R] RAID.
|
MergerFS is \f[B]not\f[R] RAID.
|
||||||
It does \f[B]not\f[R] manipulate the data that passes through it.
|
It does \f[B]not\f[R] manipulate the data that passes through it.
|
||||||
It does \f[B]not\f[R] shard data across drives.
|
It does \f[B]not\f[R] shard data across filesystems.
|
||||||
It merely shards some \f[B]behavior\f[R] and aggregates others.
|
It merely shards some \f[B]behavior\f[R] and aggregates others.
|
||||||
.SS Can mergerfs be removed without affecting the data?
|
.SS Can mergerfs be removed without affecting the data?
|
||||||
.PP
|
.PP
|
||||||
|
@ -2402,7 +2384,8 @@ probably best off using \f[C]mfs\f[R] for \f[C]category.create\f[R].
|
||||||
It will spread files out across your branches based on available space.
|
It will spread files out across your branches based on available space.
|
||||||
Use \f[C]mspmfs\f[R] if you want to try to colocate the data a bit more.
|
Use \f[C]mspmfs\f[R] if you want to try to colocate the data a bit more.
|
||||||
You may want to use \f[C]lus\f[R] if you prefer a slightly different
|
You may want to use \f[C]lus\f[R] if you prefer a slightly different
|
||||||
distribution of data if you have a mix of smaller and larger drives.
|
distribution of data if you have a mix of smaller and larger
|
||||||
|
filesystems.
|
||||||
Generally though \f[C]mfs\f[R], \f[C]lus\f[R], or even \f[C]rand\f[R]
|
Generally though \f[C]mfs\f[R], \f[C]lus\f[R], or even \f[C]rand\f[R]
|
||||||
are good for the general use case.
|
are good for the general use case.
|
||||||
If you are starting with an imbalanced pool you can use the tool
|
If you are starting with an imbalanced pool you can use the tool
|
||||||
|
@ -2413,8 +2396,8 @@ set \f[C]func.create\f[R] to \f[C]epmfs\f[R] or similar and
|
||||||
\f[C]func.mkdir\f[R] to \f[C]rand\f[R] or \f[C]eprand\f[R] depending on
|
\f[C]func.mkdir\f[R] to \f[C]rand\f[R] or \f[C]eprand\f[R] depending on
|
||||||
if you just want to colocate generally or on specific branches.
|
if you just want to colocate generally or on specific branches.
|
||||||
Either way the \f[I]need\f[R] to colocate is rare.
|
Either way the \f[I]need\f[R] to colocate is rare.
|
||||||
For instance: if you wish to remove the drive regularly and want the
|
For instance: if you wish to remove the device regularly and want the
|
||||||
data to predictably be on that drive or if you don\[cq]t use backup at
|
data to predictably be on that device or if you don\[cq]t use backup at
|
||||||
all and don\[cq]t wish to replace that data piecemeal.
|
all and don\[cq]t wish to replace that data piecemeal.
|
||||||
In which case using path preservation can help but will require some
|
In which case using path preservation can help but will require some
|
||||||
manual attention.
|
manual attention.
|
||||||
|
@ -2451,9 +2434,9 @@ the documentation will be improved.
|
||||||
That said, for the average person, the following should be fine:
|
That said, for the average person, the following should be fine:
|
||||||
.PP
|
.PP
|
||||||
\f[C]cache.files=off,dropcacheonclose=true,category.create=mfs\f[R]
|
\f[C]cache.files=off,dropcacheonclose=true,category.create=mfs\f[R]
|
||||||
.SS Why are all my files ending up on 1 drive?!
|
.SS Why are all my files ending up on 1 filesystem?!
|
||||||
.PP
|
.PP
|
||||||
Did you start with empty drives?
|
Did you start with empty filesystems?
|
||||||
Did you explicitly configure a \f[C]category.create\f[R] policy?
|
Did you explicitly configure a \f[C]category.create\f[R] policy?
|
||||||
Are you using an \f[C]existing path\f[R] / \f[C]path preserving\f[R]
|
Are you using an \f[C]existing path\f[R] / \f[C]path preserving\f[R]
|
||||||
policy?
|
policy?
|
||||||
|
@ -2461,23 +2444,23 @@ policy?
|
||||||
The default create policy is \f[C]epmfs\f[R].
|
The default create policy is \f[C]epmfs\f[R].
|
||||||
That is a path preserving algorithm.
|
That is a path preserving algorithm.
|
||||||
With such a policy for \f[C]mkdir\f[R] and \f[C]create\f[R] with a set
|
With such a policy for \f[C]mkdir\f[R] and \f[C]create\f[R] with a set
|
||||||
of empty drives it will select only 1 drive when the first directory is
|
of empty filesystems it will select only 1 filesystem when the first
|
||||||
created.
|
directory is created.
|
||||||
Anything, files or directories, created in that first directory will be
|
Anything, files or directories, created in that first directory will be
|
||||||
placed on the same branch because it is preserving paths.
|
placed on the same branch because it is preserving paths.
|
||||||
.PP
|
.PP
|
||||||
This catches a lot of new users off guard but changing the default would
|
This catches a lot of new users off guard but changing the default would
|
||||||
break the setup for many existing users.
|
break the setup for many existing users.
|
||||||
If you do not care about path preservation and wish your files to be
|
If you do not care about path preservation and wish your files to be
|
||||||
spread across all your drives change to \f[C]mfs\f[R] or similar policy
|
spread across all your filesystems change to \f[C]mfs\f[R] or similar
|
||||||
as described above.
|
policy as described above.
|
||||||
If you do want path preservation you\[cq]ll need to perform the manual
|
If you do want path preservation you\[cq]ll need to perform the manual
|
||||||
act of creating paths on the drives you want the data to land on before
|
act of creating paths on the filesystems you want the data to land on
|
||||||
transferring your data.
|
before transferring your data.
|
||||||
Setting \f[C]func.mkdir=epall\f[R] can simplify managing path
|
Setting \f[C]func.mkdir=epall\f[R] can simplify managing path
|
||||||
preservation for \f[C]create\f[R].
|
preservation for \f[C]create\f[R].
|
||||||
Or use \f[C]func.mkdir=rand\f[R] if you\[cq]re interested in just
|
Or use \f[C]func.mkdir=rand\f[R] if you\[cq]re interested in just
|
||||||
grouping together directory content by drive.
|
grouping together directory content by filesystem.
|
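
The manual step mentioned above, creating the paths on each filesystem you want data to land on, might look like the sketch below. The function name \f[C]replicate_dir\f[R] and the branch paths in the usage line are hypothetical. It only creates directories (with permissions taken from the current umask) and does not sync ownership or permissions between branches, so a later check with \f[C]mergerfs.fsck\f[R] may still be worthwhile.

```shell
#!/bin/bash
# replicate_dir (hypothetical helper): create the relative path REL
# underneath every branch given after it.
#   usage: replicate_dir REL BRANCH...
replicate_dir() {
  rel=$1
  shift
  for branch in "$@"; do
    # mkdir -p is a no-op for branches where the path already exists.
    mkdir -p "$branch/$rel" || return 1
  done
}
```

Assuming branches mounted at \f[C]/mnt/disk1\f[R] and \f[C]/mnt/disk2\f[R], running \f[C]replicate_dir media/tv /mnt/disk1 /mnt/disk2\f[R] directly against the branches (not the pool) would make both eligible targets for a path preserving \f[C]create\f[R] policy under \f[C]media/tv\f[R].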
||||||
.SS Do hardlinks work?
|
.SS Do hardlinks work?
|
||||||
.PP
|
.PP
|
||||||
Yes.
|
Yes.
|
||||||
|
@ -2546,8 +2529,8 @@ of the caller.
|
||||||
This means that if the user does not have access to a file or directory
|
This means that if the user does not have access to a file or directory
|
||||||
than neither will mergerfs.
|
than neither will mergerfs.
|
||||||
However, because mergerfs is creating a union of paths it may be able to
|
However, because mergerfs is creating a union of paths it may be able to
|
||||||
read some files and directories on one drive but not another resulting
|
read some files and directories on one filesystem but not another
|
||||||
in an incomplete set.
|
resulting in an incomplete set.
|
||||||
.PP
|
.PP
|
||||||
Whenever you run into a split permission issue (seeing some but not all
|
Whenever you run into a split permission issue (seeing some but not all
|
||||||
files) try using
|
files) try using
|
||||||
|
@ -2644,7 +2627,7 @@ features which aufs and overlayfs have.
|
||||||
.PP
|
.PP
|
||||||
UnionFS is more like aufs than mergerfs in that it offers overlay / CoW
|
UnionFS is more like aufs than mergerfs in that it offers overlay / CoW
|
||||||
features.
|
features.
|
||||||
If you\[cq]re just looking to create a union of drives and want
|
If you\[cq]re just looking to create a union of filesystems and want
|
||||||
flexibility in file/directory placement then mergerfs offers that
|
flexibility in file/directory placement then mergerfs offers that
|
||||||
whereas unionfs is more for overlaying RW filesystems over RO ones.
|
whereas unionfs is more for overlaying RW filesystems over RO ones.
|
||||||
.SS Why use mergerfs over overlayfs?
|
.SS Why use mergerfs over overlayfs?
|
||||||
|
@ -2664,8 +2647,9 @@ without the single point of failure.
|
||||||
.SS Why use mergerfs over ZFS?
|
.SS Why use mergerfs over ZFS?
|
||||||
.PP
|
.PP
|
||||||
MergerFS is not intended to be a replacement for ZFS.
|
MergerFS is not intended to be a replacement for ZFS.
|
||||||
MergerFS is intended to provide flexible pooling of arbitrary drives
|
MergerFS is intended to provide flexible pooling of arbitrary
|
||||||
(local or remote), of arbitrary sizes, and arbitrary filesystems.
|
filesystems (local or remote), of arbitrary sizes, and arbitrary
|
||||||
|
filesystems.
|
||||||
For \f[C]write once, read many\f[R] usecases such as bulk media storage.
|
For \f[C]write once, read many\f[R] usecases such as bulk media storage.
|
||||||
Where data integrity and backup is managed in other ways.
|
Where data integrity and backup is managed in other ways.
|
||||||
In that situation ZFS can introduce a number of costs and limitations as
|
In that situation ZFS can introduce a number of costs and limitations as
|
||||||
|
@ -2683,6 +2667,29 @@ open source is important.
|
||||||
.PP
|
.PP
|
||||||
There are a number of UnRAID users who use mergerfs as well though
|
There are a number of UnRAID users who use mergerfs as well though
|
||||||
I\[cq]m not entirely familiar with the use case.
|
I\[cq]m not entirely familiar with the use case.
|
||||||
|
.SS Why use mergerfs over StableBit\[cq]s DrivePool?
|
||||||
|
.PP
|
||||||
|
DrivePool works only on Windows so not as common an alternative as other
|
||||||
|
Linux solutions.
|
||||||
|
If you want to use Windows then DrivePool is a good option.
|
||||||
|
Functionally the two projects work a bit differently.
|
||||||
|
DrivePool always writes to the filesystem with the most free space and
|
||||||
|
later rebalances.
|
||||||
|
mergerfs does not offer rebalance but chooses a branch at file/directory
|
||||||
|
create time.
|
||||||
|
DrivePool\[cq]s rebalancing can be done differently in any directory and
|
||||||
|
has file pattern matching to further customize the behavior.
|
||||||
|
mergerfs, not having rebalancing does not have these features, but
|
||||||
|
similar features are planned for mergerfs v3.
|
||||||
|
DrivePool has builtin file duplication which mergerfs does not natively
|
||||||
|
support (but can be done via an external script.)
|
||||||
|
.PP
|
||||||
|
There are a lot of misc differences between the two projects but most
|
||||||
|
features in DrivePool can be replicated with external tools in
|
||||||
|
combination with mergerfs.
|
||||||
|
.PP
|
||||||
|
Additionally DrivePool is a closed source commercial product vs mergerfs
|
||||||
|
a ISC licensed OSS project.
|
||||||
.SS What should mergerfs NOT be used for?
|
.SS What should mergerfs NOT be used for?
|
||||||
.IP \[bu] 2
|
.IP \[bu] 2
|
||||||
databases: Even if the database stored data in separate files (mergerfs
|
databases: Even if the database stored data in separate files (mergerfs
|
||||||
|
@ -2698,7 +2705,7 @@ much latency (if it works at all).
|
||||||
As replacement for RAID: mergerfs is just for pooling branches.
|
As replacement for RAID: mergerfs is just for pooling branches.
|
||||||
If you need that kind of device performance aggregation or high
|
If you need that kind of device performance aggregation or high
|
||||||
availability you should stick with RAID.
|
availability you should stick with RAID.
|
||||||
.SS Can drives be written to directly? Outside of mergerfs while pooled?
|
.SS Can filesystems be written to directly? Outside of mergerfs while pooled?
|
||||||
.PP
|
.PP
|
||||||
Yes, however it\[cq]s not recommended to use the same file from within
|
Yes, however it\[cq]s not recommended to use the same file from within
|
||||||
the pool and from without at the same time (particularly writing).
|
the pool and from without at the same time (particularly writing).
|
||||||
|
@ -2729,7 +2736,7 @@ those settings.
|
||||||
Only one error can be returned and if one of the reasons for filtering a
|
Only one error can be returned and if one of the reasons for filtering a
|
||||||
branch was \f[B]minfreespace\f[R] then it will be returned as such.
|
branch was \f[B]minfreespace\f[R] then it will be returned as such.
|
||||||
\f[B]moveonenospc\f[R] is only relevant to writing a file which is too
|
\f[B]moveonenospc\f[R] is only relevant to writing a file which is too
|
||||||
large for the drive its currently on.
|
large for the filesystem it\[cq]s currently on.
|
||||||
.PP
|
.PP
|
||||||
It is also possible that the filesystem selected has run out of inodes.
|
It is also possible that the filesystem selected has run out of inodes.
|
||||||
Use \f[C]df -i\f[R] to list the total and available inodes per
|
Use \f[C]df -i\f[R] to list the total and available inodes per
|
||||||
|
@ -2824,7 +2831,7 @@ Taking after \f[B]Samba\f[R], mergerfs uses
|
||||||
\f[B]syscall(SYS_setreuid,\&...)\f[R] to set the callers credentials for
|
\f[B]syscall(SYS_setreuid,\&...)\f[R] to set the callers credentials for
|
||||||
that thread only.
|
that thread only.
|
||||||
Jumping back to \f[B]root\f[R] as necessary should escalated privileges
|
Jumping back to \f[B]root\f[R] as necessary should escalated privileges
|
||||||
be needed (for instance: to clone paths between drives).
|
be needed (for instance: to clone paths between filesystems).
|
||||||
.PP
|
.PP
|
||||||
For non-Linux systems mergerfs uses a read-write lock and changes
|
For non-Linux systems mergerfs uses a read-write lock and changes
|
||||||
credentials only when necessary.
|
credentials only when necessary.
|
||||||
|
|