10 KiB
date | title | slug | url |
---|---|---|---|
2017-01-02T15:29:14Z | rclone dedupe | rclone_dedupe | /commands/rclone_dedupe/ |
rclone dedupe
Interactively find duplicate files delete/rename them.
Synopsis
By default dedup
interactively finds duplicate files and offers to
delete all but one or rename them to be different. Only useful with
Google Drive which can have duplicate file names.
The dedupe
command will delete all but one of any identical (same
md5sum) files it finds without confirmation. This means that for most
duplicated files the dedupe
command will not be interactive. You
can use --dry-run
to see what would happen without doing anything.
Here is an example run.
Before - with duplicates
$ rclone lsl drive:dupes
6048320 2016-03-05 16:23:16.798000000 one.txt
6048320 2016-03-05 16:23:11.775000000 one.txt
564374 2016-03-05 16:23:06.731000000 one.txt
6048320 2016-03-05 16:18:26.092000000 one.txt
6048320 2016-03-05 16:22:46.185000000 two.txt
1744073 2016-03-05 16:22:38.104000000 two.txt
564374 2016-03-05 16:22:52.118000000 two.txt
Now the dedupe
session
$ rclone dedupe drive:dupes
2016/03/05 16:24:37 Google drive root 'dupes': Looking for duplicates using interactive mode.
one.txt: Found 4 duplicates - deleting identical copies
one.txt: Deleting 2/3 identical duplicates (md5sum "1eedaa9fe86fd4b8632e2ac549403b36")
one.txt: 2 duplicates remain
1: 6048320 bytes, 2016-03-05 16:23:16.798000000, md5sum 1eedaa9fe86fd4b8632e2ac549403b36
2: 564374 bytes, 2016-03-05 16:23:06.731000000, md5sum 7594e7dc9fc28f727c42ee3e0749de81
s) Skip and do nothing
k) Keep just one (choose which in next step)
r) Rename all to be different (by changing file.jpg to file-1.jpg)
s/k/r> k
Enter the number of the file to keep> 1
one.txt: Deleted 1 extra copies
two.txt: Found 3 duplicates - deleting identical copies
two.txt: 3 duplicates remain
1: 564374 bytes, 2016-03-05 16:22:52.118000000, md5sum 7594e7dc9fc28f727c42ee3e0749de81
2: 6048320 bytes, 2016-03-05 16:22:46.185000000, md5sum 1eedaa9fe86fd4b8632e2ac549403b36
3: 1744073 bytes, 2016-03-05 16:22:38.104000000, md5sum 851957f7fb6f0bc4ce76be966d336802
s) Skip and do nothing
k) Keep just one (choose which in next step)
r) Rename all to be different (by changing file.jpg to file-1.jpg)
s/k/r> r
two-1.txt: renamed from: two.txt
two-2.txt: renamed from: two.txt
two-3.txt: renamed from: two.txt
The result being
$ rclone lsl drive:dupes
6048320 2016-03-05 16:23:16.798000000 one.txt
564374 2016-03-05 16:22:52.118000000 two-1.txt
6048320 2016-03-05 16:22:46.185000000 two-2.txt
1744073 2016-03-05 16:22:38.104000000 two-3.txt
Dedupe can be run non interactively using the --dedupe-mode
flag or by using an extra parameter with the same value
--dedupe-mode interactive
- interactive as above.--dedupe-mode skip
- removes identical files then skips anything left.--dedupe-mode first
- removes identical files then keeps the first one.--dedupe-mode newest
- removes identical files then keeps the newest one.--dedupe-mode oldest
- removes identical files then keeps the oldest one.--dedupe-mode rename
- removes identical files then renames the rest to be different.
For example to rename all the identically named photos in your Google Photos directory, do
rclone dedupe --dedupe-mode rename "drive:Google Photos"
Or
rclone dedupe rename "drive:Google Photos"
rclone dedupe [mode] remote:path
Options
--dedupe-mode string Dedupe mode interactive|skip|first|newest|oldest|rename. (default "interactive")
Options inherited from parent commands
--acd-templink-threshold int Files >= this size will be downloaded via their tempLink. (default 9G)
--acd-upload-wait-per-gb duration Additional time per GB to wait after a failed complete upload to see if it appears. (default 3m0s)
--ask-password Allow prompt for password for encrypted configuration. (default true)
--b2-chunk-size int Upload chunk size. Must fit in memory. (default 96M)
--b2-test-mode string A flag string for X-Bz-Test-Mode header.
--b2-upload-cutoff int Cutoff for switching to chunked upload (default 190.735M)
--b2-versions Include old versions in directory listings.
--bwlimit int Bandwidth limit in kBytes/s, or use suffix b|k|M|G
--checkers int Number of checkers to run in parallel. (default 8)
-c, --checksum Skip based on checksum & size, not mod-time & size
--config string Config file. (default "/home/ncw/.rclone.conf")
--contimeout duration Connect timeout (default 1m0s)
--cpuprofile string Write cpu profile to file
--delete-after When synchronizing, delete files on destination after transfering
--delete-before When synchronizing, delete files on destination before transfering
--delete-during When synchronizing, delete files during transfer (default)
--delete-excluded Delete files on dest excluded from sync
--drive-auth-owner-only Only consider files owned by the authenticated user. Requires drive-full-list.
--drive-chunk-size int Upload chunk size. Must a power of 2 >= 256k. (default 8M)
--drive-formats string Comma separated list of preferred formats for downloading Google docs. (default "docx,xlsx,pptx,svg")
--drive-full-list Use a full listing for directory list. More data but usually quicker. (obsolete)
--drive-upload-cutoff int Cutoff for switching to chunked upload (default 8M)
--drive-use-trash Send files to the trash instead of deleting permanently.
--dropbox-chunk-size int Upload chunk size. Max 150M. (default 128M)
-n, --dry-run Do a trial run with no permanent changes
--dump-auth Dump HTTP headers with auth info
--dump-bodies Dump HTTP headers and bodies - may contain sensitive info
--dump-filters Dump the filters to the output
--dump-headers Dump HTTP headers - may contain sensitive info
--exclude stringArray Exclude files matching pattern
--exclude-from stringArray Read exclude patterns from file
--files-from stringArray Read list of source-file names from file
-f, --filter stringArray Add a file-filtering rule
--filter-from stringArray Read filtering patterns from a file
--ignore-existing Skip all files that exist on destination
--ignore-size Ignore size when skipping use mod-time or checksum.
-I, --ignore-times Don't skip files that match size and time - transfer all files
--include stringArray Include files matching pattern
--include-from stringArray Read include patterns from file
--log-file string Log everything to this file
--low-level-retries int Number of low level retries to do. (default 10)
--max-age string Don't transfer any file older than this in s or suffix ms|s|m|h|d|w|M|y
--max-depth int If set limits the recursion depth to this. (default -1)
--max-size int Don't transfer any file larger than this in k or suffix b|k|M|G (default off)
--memprofile string Write memory profile to file
--min-age string Don't transfer any file younger than this in s or suffix ms|s|m|h|d|w|M|y
--min-size int Don't transfer any file smaller than this in k or suffix b|k|M|G (default off)
--modify-window duration Max time diff to be considered the same (default 1ns)
--no-check-certificate Do not verify the server SSL certificate. Insecure.
--no-gzip-encoding Don't set Accept-Encoding: gzip.
--no-traverse Don't traverse destination file system on copy.
--no-update-modtime Don't update destination mod-time if files identical.
-x, --one-file-system Don't cross filesystem boundaries.
--onedrive-chunk-size int Above this size files will be chunked - must be multiple of 320k. (default 10M)
--onedrive-upload-cutoff int Cutoff for switching to chunked upload - must be <= 100MB (default 10M)
-q, --quiet Print as little stuff as possible
--retries int Retry operations this many times if they fail (default 3)
--s3-acl string Canned ACL used when creating buckets and/or storing objects in S3
--s3-storage-class string Storage class to use when uploading S3 objects (STANDARD|REDUCED_REDUNDANCY|STANDARD_IA)
--size-only Skip based on size only, not mod-time or checksum
--stats duration Interval between printing stats, e.g 500ms, 60s, 5m. (0 to disable) (default 1m0s)
--stats-unit string Show data rate in stats as either 'bits' or 'bytes'/s (default "bytes")
--swift-chunk-size int Above this size files will be chunked into a _segments container. (default 5G)
--timeout duration IO idle timeout (default 5m0s)
--transfers int Number of file transfers to run in parallel. (default 4)
-u, --update Skip files that are newer on the destination.
-v, --verbose Print lots more stuff
SEE ALSO
- rclone - Sync files and directories to and from local and remote object stores - v1.35-DEV