Commit Graph

114 Commits

Author SHA1 Message Date
Nick Craig-Wood 8f47b6746d b2: fix streaming chunked files an exact multiple of chunk size
Before this change, streaming files an exact multiple of the chunk
size would cause rclone to attempt to stream a 0 sized chunk which was
rejected by the b2 servers.

This bug was noticed by the new integration tests for chunked streaming.
2023-11-24 14:32:01 +00:00
Nick Craig-Wood cc2a4c2e20 fstest: factor chunked streaming tests from b2 and use in all backends 2023-11-24 12:58:40 +00:00
Nick Craig-Wood fabeb8e44e b2: fix server side chunked copy when file size was exactly --b2-copy-cutoff
Before this change the b2 servers would complain as this was only a
single part transfer.

This was noticed by the new integration tests for server side chunked copy.
2023-11-24 12:37:11 +00:00
Nick Craig-Wood c27977d4d5 fstest: factor chunked copy tests from b2 and use them in s3 and oos 2023-11-24 12:37:11 +00:00
Nick Craig-Wood e8fcde8de1 fs: add ChunkWriterDoesntSeek feature flag and set it for b2 2023-11-20 18:07:05 +00:00
Nick Craig-Wood bf21db0ac4 b2: fix multi-thread upload with copyto going to wrong name
See: https://forum.rclone.org/t/errors-and-failure-with-big-file-upload-to-b2/42522/
2023-10-28 15:18:00 +01:00
Nick Craig-Wood adfb1f7c7d b2: fix error handler to remove confusing DEBUG messages
On a 404 error, b2 returns an empty body which, before this change,
caused the error handler to try to parse an empty string and give the
following DEBUG message:

    Couldn't decode error response: EOF

This is confusing as it is expected in normal operations and isn't an
error.

This change reads the body of an error response first then tries to
decode it only if it isn't empty, which avoids the confusing DEBUG
message.

This also upgrades failure to read the body or failure to decode the
JSON to ERROR messages as now we are certain that we should have
something to read and decode.
2023-10-28 15:18:00 +01:00
Nick Craig-Wood 5fa68e9ca5 b2: fix chunked streaming uploads
Streaming uploads are used by rclone rcat and rclone mount
--vfs-cache-mode off.

After the multipart chunker refactor the multipart chunked streaming
upload was accidentally mixing the first and the second parts up which
was causing corrupted uploads.

This was caused by a simple off by one error in the refactoring where
we went from 1 based part number counting to 0 based part number
counting.

Fixing this revealed that the metadata wasn't being re-read for the
copied object either.

This fixes both of those issues and adds an integration tests so it
won't happen again.

Fixes #7367
2023-10-13 15:46:36 +01:00
Nick Craig-Wood d8d76ff647 b2: fix server side copies greater than 4GB
After the multipart chunker refactor the multipart chunked server side
copy was accidentally sending one part too many. The last part was 0
length which was rejected by b2.

This was caused by a simple off by one error in the refactoring where
we went from 1 based part number counting to 0 based part number
counting.

Fixing this revealed that the metadata wasn't being re-read for the
copied object either.

This fixes both of those issues and adds an integration tests so it
won't happen again.

See: https://forum.rclone.org/t/large-server-side-copy-in-b2-fails-due-to-bad-byte-range/42294
2023-10-12 11:19:56 +01:00
Nick Craig-Wood cb43e86d16 b2: reduce default --b2-upload-concurrency to 4 to reduce memory usage
In v1.63 memory usage in the b2 backend was limited to `--transfers` *
`--b2-chunk-size`

However in v1.64 this was raised to `--transfers` * `--b2-chunk-size`
* `--b2-upload-concurrency`.

The default value for this was accidently set quite high at 16 which
means by default rclone could use up to 6.4GB of memory!

The new default sets a more reasonable (but still high) max memory of 1.6GB.
2023-10-01 12:30:26 +01:00
Nick Craig-Wood 5c48102ede b2: fix locking window when getting mutipart upload URL
Before this change, the lock was held while the upload URL was being
fetched from the server.

This meant that any other threads were blocked from getting upload
URLs unecessarily.

It also increased the potential for deadlock.
2023-10-01 12:30:26 +01:00
Nick Craig-Wood 6072d314e1 b2: fix multipart upload: corrupted on transfer: sizes differ XXX vs 0
Before this change the b2 backend wasn't writing the metadata to the
object properly after a multipart upload.

The symptom of this was that sometimes it would give the error:

    corrupted on transfer: sizes differ XXX vs 0

This was fixed by returning the metadata in the chunk writer and setting it in Update.

See: https://forum.rclone.org/t/multipart-upload-to-b2-sometimes-failing-with-corrupted-on-transfer-sizes-differ/41829
2023-09-18 20:41:31 +01:00
Nick Craig-Wood 9277ca1e54 b2: implement --b2-lifecycle to control lifecycle when creating buckets 2023-09-16 17:01:43 +01:00
Nick Craig-Wood d6722607cb b2: implement "rclone backend lifecycle" to read and set bucket lifecycles 2023-09-16 16:44:28 +01:00
Nick Craig-Wood 4ef30db209 b2: fix listing all buckets when not needed
Before this change the b2 backend listed all the buckets to turn a
single bucket name into an ID.

However in July 26, 2018 a parameter was added to the list buckets API
to make listing all the buckets unecessary.

This code sets the bucketName parameter so that only the results for
the desired bucket are returned.
2023-09-16 16:04:50 +01:00
Nick Craig-Wood be17f1523a b2: fix ChunkWriter size return 2023-09-03 13:53:11 +01:00
Nick Craig-Wood 2db0e23584 backends: change OpenChunkWriter interface to allow backend concurrency override
Before this change the concurrency used for an upload was rather
inconsistent.

- if size below `--backend-upload-cutoff` (default 200M) do single part upload.

- if size below `--multi-thread-cutoff` (default 256M) or using streaming
  uploads (eg `rclone rcat) do multipart upload using
  `--backend-upload-concurrency` to set the concurrency used by the uploader.

- otherwise do multipart upload using `--multi-thread-streams` to set the
  concurrency.

This change makes the default for the concurrency used be the
`--backend-upload-concurrency`. If `--multi-thread-streams` is set and larger
than the `--backend-upload-concurrency` then that will be used instead.

This means that if the user sets `--backend-upload-concurrency` then it will be
obeyed for all multipart/multi-thread transfers and the user can override them
all with `--multi-thread-streams`.

See: #7056
2023-09-03 11:47:05 +01:00
Alishan Ladhani 7821cb884d
b2: fix rclone link when object path contains special characters
Before this change, b2 would return an error when opening a link
generated by `rclone link`. The following error occurs when the object
path contains an ampersand that is not percent encoded:

{
  "code": "bad_request",
  "message": "Bad character in percent-encoded string: 38 (0x26)",
  "status": 400
}
2023-09-02 18:31:14 +01:00
Nick Craig-Wood d69cdb79f7 b2: fix accounting for multpart uploads 2023-08-25 16:31:31 +01:00
Nick Craig-Wood ab803d1278 b2: implement OpenChunkWriter and multi-thread uploads #7056
This implements the OpenChunkWriter interface for b2 which
enables multi-thread uploads.

This makes the memory controls of the s3 backend inoperative; they are
replaced with the global ones.

    --b2-memory-pool-flush-time
    --b2-memory-pool-use-mmap

By using the buffered reader this fixes excessive memory use when
uploading large files as it will share memory pages between all
readers.
2023-08-24 12:39:27 +01:00
Nick Craig-Wood d0d41fe847 rclone config redacted: implement support mechanism for showing redacted config
This introduces a new fs.Option flag, Sensitive and uses this along
with IsPassword to redact the info in the config file for support
purposes.

It adds this flag into backends where appropriate. It was necessary to
add oauthutil.SharedOptions to some backends as they were missing
them.

Fixes #5209
2023-07-07 16:25:14 +01:00
Nick Craig-Wood 019a486d5b accounting: Make checkers show what they are doing
Before this change, all types of checkers showed "checking" after the
file name despite the fact that not all of them were checking.

After this change, they can show

- checking
- deleting
- hashing
- importing
- listing
- merging
- moving
- renaming

See: https://forum.rclone.org/t/what-is-rclone-checking-during-a-purge/35931/
2023-03-01 11:10:38 +00:00
Nick Craig-Wood 351fc609b1 b2: fix uploading files bigger than 1TiB
Before this change when uploading files bigger than 1TiB, the chunk
calculator would work out that the chunk size needed to be bigger than
the default 100 MiB to fit within the 10,000 parts limit.

However the uploader was still using the memory pool for the old chunk
size and this caused errors like

    panic: runtime error: slice bounds out of range [:122683392] with capacity 100663296

The fix for this is to make a temporary pool with the larger chunk
size and use it during the upload of the large file.

See: https://forum.rclone.org/t/rclone-cannot-complete-upload-to-b2-restarts-upload-frequently/35617/
2023-01-22 12:46:23 +00:00
Josh Soref ce3b65e6dc all: fix spelling across the project
* abcdefghijklmnopqrstuvwxyz
* accounting
* additional
* allowed
* almost
* already
* appropriately
* arise
* bandwidth
* behave
* bidirectional
* brackets
* cached
* characters
* cloud
* committing
* concatenating
* configured
* constructs
* current
* cutoff
* deferred
* different
* directory
* disposition
* dropbox
* either way
* error
* excess
* experiments
* explicitly
* externally
* files
* github
* gzipped
* hierarchies
* huffman
* hyphen
* implicitly
* independent
* insensitive
* integrity
* libraries
* literally
* metadata
* mimics
* missing
* modification
* multipart
* multiple
* nightmare
* nonexistent
* number
* obscure
* ourselves
* overridden
* potatoes
* preexisting
* priority
* received
* remote
* replacement
* represents
* reproducibility
* response
* satisfies
* sensitive
* separately
* separator
* specifying
* string
* successful
* synchronization
* syncing
* šenfeld
* take
* temporarily
* testcontents
* that
* the
* themselves
* throttling
* timeout
* transaction
* transferred
* unnecessary
* using
* webbrowser
* which
* with
* workspace

Signed-off-by: Josh Soref <2119212+jsoref@users.noreply.github.com>
2022-08-30 11:16:26 +02:00
albertony 555def2da7 build: add package comments to silence revive linter 2022-08-28 13:43:51 +02:00
Nick Craig-Wood 0501773db1 azureblob,b2,s3: fix chunksize calculations producing too many parts
Before this fix, the chunksize calculator was using the previous size
of the object, not the new size of the object to calculate the chunk
sizes.

This meant that uploading a replacement object which needed a new
chunk size would fail, using too many parts.

This fix fixes the calculator to take the size explicitly.
2022-08-09 12:57:38 +01:00
Nick Craig-Wood 6fd9e3d717 build: reformat comments to pass go1.19 vet
See: https://go.dev/doc/go1.19#go-doc
2022-08-05 16:35:41 +01:00
albertony ec117593f1 Fix lint issues reported by staticcheck
Used staticcheck 2022.1.2 (v0.3.2)

See: staticcheck.io
2022-06-13 21:13:50 +02:00
Derek Battams 8e5e230b81 b2: use chunksize lib to determine chunksize dynamically
Fixes #4643
2022-05-13 09:25:48 +01:00
SwazRGB 4cebade95d b2: Add b2-version-at flag to show file versions at time
Uses b2_list_file_versions to retrieve all file versions, and returns
the one that was active at the specified time

This is especially useful in combination with other backup tools, such
as restic, which may use rclone as a backend.
2022-04-28 16:29:13 +01:00
GGG KILLER cd4d8d55ec docs: add a note about the B2 download_url format
Currently the B2 docs don't specify which format the download_url
setting should have, and if you input it wrong, there is nothing
in the verbose logs or anywhere else that can let you know that.
2021-11-23 17:57:34 +00:00
Nick Craig-Wood e43b5ce5e5 Remove github.com/pkg/errors and replace with std library version
This is possible now that we no longer support go1.12 and brings
rclone into line with standard practices in the Go world.

This also removes errors.New and errors.Errorf from lib/errors and
prefers the stdlib errors package over lib/errors.
2021-11-07 11:53:30 +00:00
albertony e2f47ecdeb docs: punctuation cleanup
See #5538
2021-10-20 22:56:19 +02:00
albertony 2925e1384c Use binary prefixes for size and rate units
Includes adding support for additional size input suffix Mi and MiB, treated equivalent to M.
Extends binary suffix output with letter i, e.g. Ki and Mi.
Centralizes creation of bit/byte unit strings.
2021-04-27 02:25:52 +03:00
Jeffrey Tolar 7511b6f4f1 b2: don't include the bucket name in public link file prefixes
Including the bucket name as part of the `fileNamePrefix` passed to
`b2_get_download_authorization` results in a link valid for objects that
have the bucket name as part of the object path; e.g.,

    rclone link :b2:some-bucket/some-file

would result in a public link valid for the object
`some-bucket/some-file` in the `some-bucket` bucket (in rclone-remote
parlance, `:b2:some-bucket/some-bucket/some-file`). This will almost
certainly result in a broken link.

The B2 docs don't explicitly specify this behavior, but the example
given for `fileNamePrefix` provides some clarification.

See https://www.backblaze.com/b2/docs/b2_get_download_authorization.html.
2021-04-26 16:56:41 +01:00
Dominik Mydlil c163e6b250 b2: factor version handling into lib/version
Standardizes the filename version tagging so that it can be used by any
backend.
2021-04-12 15:59:18 +01:00
Nick Craig-Wood d042f3194f b2: fix html files downloaded via cloudflare
When reading files from B2 via cloudflare using --b2-download-url
cloudflare strips the Content-Length headers (presumably so it can
inject stuff into the body).

This caused rclone to think the file was corrupted as the length
didn't match.

The patch uses the old length read from the listing if there is no
Content-Length.

See: https://forum.rclone.org/t/b2-cloudflare-error-directory-not-found/23026
2021-03-24 17:06:59 +00:00
Nick Craig-Wood 4013bc4a4c Fix excessive retries missing --max-duration timeout - fixes #4504
This change checks the context whenever rclone might retry, and
doesn't retry if the current context has an error.

This fixes the pathological behaviour of `--max-duration` refusing to
exit because all the context deadline exceeded errors were being
retried.

This unfortunately meant changing the shouldRetry logic in every
backend and doing a lot of context propagation.

See: https://forum.rclone.org/t/add-flag-to-exit-immediately-when-max-duration-reached/22723
2021-03-13 09:25:44 +00:00
Nick Craig-Wood 53aa4b87fd b2: fix failed to create file system with application key limited to a prefix
Before this change, if an application key limited to a prefix was in
use, with trailing `/` marking the folders then rclone would HEAD the
path without a trailing `/` to work out if it was a file or a folder.
This returned a permission denied error, which rclone returned to the
user.

    Failed to create file system for "b2:bucket/path/":
        failed to HEAD for download: Unknown 401  (401 unknown)

This change assumes any errors on HEAD will make rclone assume the
object does not exist and the path is referring to a directory.

See: https://forum.rclone.org/t/b2-error-on-application-key-limited-to-a-prefix/22159/
2021-02-11 15:13:19 +00:00
Nick Craig-Wood 9710ded60f b2: automatically raise upload cutoff to avoid spurious error
Before this change, if --b2-chunk-size was raised above 200M then this
error would be produced:

    b2: upload cutoff: 200M is less than chunk size 1G

This change automatically reaises --b2-upload-cutoff to be the value
of --b2-chunk-size if it is below it, which stops this error being
generated.

Fixes #4475
2021-02-03 16:29:32 +00:00
lluuaapp 35b2ca642c b2: fixed possible crash when accessing Backblaze b2 remote 2021-01-25 17:48:40 +00:00
Kerry Su add7a35e55 b2: docs for download_url with private buckets
The current authentication scheme works without creating
a public download endpoint for a private bucket as in the B2 official blog.
On the contrary, if the existing authorization header gets duplicated
in the Cloudflare Workers script, one might receive 401 Unauthorized errors.
2021-01-02 11:33:48 +00:00
Nick Craig-Wood cb16f42075 b2: Make NewObject use less expensive API calls
Before this change when NewObject was called the b2 backend would list
the directory that the object was in in order to find it.

Unfortunately list calls are Class C transactions and cost more.

This patch switches to using HEAD requests instead. These are Class B
transactions. It is then necessary to parse the headers from response
back into the data that we get from the listing. However B2 returns
exactly the same data, just in a different form.

Rclone will use the old directory listing method when looking for
files with versions as these can't be found via a HEAD request.

This change will particularly benefit --files-from, rclone serve
restic but most operations will see some benefit.
2020-12-09 20:00:22 +00:00
Nick Craig-Wood 9d574c0d63 fshttp: read config from ctx not passed in ConfigInfo #4685 2020-11-26 16:40:12 +00:00
Nick Craig-Wood 2e21c58e6a fs: deglobalise the config #4685
This is done by making fs.Config private and attaching it to the
context instead.

The Config should be obtained with fs.GetConfig and fs.AddConfig
should be used to get a new mutable config that can be changed.
2020-11-26 16:40:12 +00:00
Nick Craig-Wood 1fb6ad700f accounting: add context.Context #3257 #4685 2020-11-09 18:05:54 +00:00
Nick Craig-Wood 8b96933e58 fs: Add context to fs.Features.Fill & fs.Features.Mask #3257 #4685 2020-11-09 18:05:54 +00:00
Nick Craig-Wood d846210978 fs: Add context to NewFs #3257 #4685
This adds a context.Context parameter to NewFs and related calls.

This is necessary as part of reading config from the context -
backends need to be able to read the global config.
2020-11-09 18:05:54 +00:00
Josh Soref e4a87f772f docs: spelling: e.g.
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00
Josh Soref bbe7eb35f1 docs: spelling: server-side
Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
2020-10-28 18:16:23 +00:00