Commit Graph

777 Commits

Author SHA1 Message Date
Milos Gajdos
7d74ee7186
Fix S3 driver loglevel param (#4617) 2025-08-11 14:27:41 +01:00
Wang Yan
c54bcb3770
s3-aws: fix build for 386 (#4642) 2025-08-11 14:14:53 +08:00
Raj Siva-Rajah
b559f27a08 Switch to UUIDv7
Signed-off-by: Raj Siva-Rajah <raj@zapzap.cloud>
2025-07-31 06:54:27 +00:00
Chen Qi
6970080b10 s3-aws: fix build for 386
When building for 386, we got the following build error:

  registry/storage/driver/s3-aws/s3.go:312:99: cannot use
  maxChunkSize (untyped int constant 5368709120) as int value
  in argument to getParameterAsInteger (overflows)

This is because the s3_64bit.go is used. Adjust the build tag matching
in s3_32bit.go and s3_64bit.go to fix this issue.

Signed-off-by: Chen Qi <Qi.Chen@windriver.com>
2025-05-29 11:35:59 +08:00
Oded Porat
dde1e49f23 Changes:
Append a UUID to ensure uniqueness
Join delete error

Signed-off-by: Oded Porat <onporat@gmail.com>
2025-05-04 10:43:19 +03:00
Oded Porat
a5a6f1ba3d To address the issue where empty files are created when the write process is interrupted, the solution involves writing to a temporary file first and then atomically renaming it to the target file. This ensures that the target file is only updated if the write completes successfully, preventing empty or partially written files.
**Explanation:**

1. **Temporary File Creation:** The content is first written to a temporary file (appending `.tmp` to the original path). This ensures that the original file remains intact until the write is complete.

2. **Write to Temporary File:** Using the existing `Writer` with truncation (`false`), the content is written to the temporary file. If the write fails, the temporary file is closed and deleted.

3. **Commit and Rename:** After successfully writing to the temporary file, it is committed. Then, the temporary file is atomically renamed to the target path using `Move`, which is handled by the filesystem's rename operation (atomic on most systems).

4. **Cleanup on Failure:** If any step fails, the temporary file is cleaned up to avoid leaving orphaned files.

Signed-off-by: Oded Porat <onporat@gmail.com>
2025-04-23 11:37:55 +03:00
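The four steps in this message can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the driver's actual code: `writeFileAtomic` and `roundTrip` are hypothetical names, and `os.CreateTemp` stands in for the `.tmp` suffix plus UUID naming mentioned in the commits.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeFileAtomic stages data in a temporary file next to path and renames it
// over path only after the write has fully succeeded. The name and signature
// are illustrative, not the driver's actual API.
func writeFileAtomic(path string, data []byte) (err error) {
	// os.CreateTemp replaces "*" with a random string, so names stay unique.
	tmp, err := os.CreateTemp(filepath.Dir(path), filepath.Base(path)+".*.tmp")
	if err != nil {
		return err
	}
	// On any failure below, close and delete the temporary file.
	defer func() {
		if err != nil {
			tmp.Close()
			os.Remove(tmp.Name())
		}
	}()

	if _, err = tmp.Write(data); err != nil {
		return err
	}
	if err = tmp.Sync(); err != nil { // flush before the rename publishes it
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	// Rename within one directory is atomic on most filesystems.
	return os.Rename(tmp.Name(), path)
}

// roundTrip writes content atomically into a fresh directory and reads it back.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	target := filepath.Join(dir, "blob")
	if err := writeFileAtomic(target, []byte(content)); err != nil {
		return "", err
	}
	got, err := os.ReadFile(target)
	return string(got), err
}

func main() {
	got, err := roundTrip("hello")
	fmt.Println(got, err)
}
```

A reader of the target path sees either the old content or the complete new content, never a half-written file.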
Oded Porat
78456caf46 Fix: resolve issue #4478 by using a temporary file for non-append writes
To address the issue where a failed write operation results in an empty file, we can use a temporary file for non-append writes. This ensures that the original file is only replaced once the new content is fully written and committed.

**Key Changes:**

1. **Temporary File Handling:**
   - For non-append writes, a temporary file is created in the same directory as the target file.
   - All write operations are performed on the temporary file first.

2. **Atomic Commit:**
   - The temporary file is only renamed to the target path during `Commit()`, ensuring atomic replacement.
   - If `Commit()` fails, the temporary file is cleaned up.

3. **Error Handling:**
   - `Cancel()` properly removes temporary files if the operation is aborted.
   - `Close()` is made idempotent to handle multiple calls safely.

4. **Data Integrity:**
   - Directory sync after rename ensures metadata persistence.
   - Proper file flushing and syncing before rename operations.

Signed-off-by: Oded Porat <onporat@gmail.com>
2025-04-16 10:30:20 +03:00
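The writer lifecycle listed under "Key Changes" might look like the sketch below. It assumes a hypothetical `stagedWriter` type built on the standard library; the real filesystem driver's types, names, and error handling are not shown in this log and will differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// stagedWriter is a hypothetical writer that stages data in a temporary file
// and publishes it to target only on Commit.
type stagedWriter struct {
	f      *os.File
	target string
	closed bool
}

func newStagedWriter(target string) (*stagedWriter, error) {
	// Same directory as the target, so the rename stays on one filesystem.
	f, err := os.CreateTemp(filepath.Dir(target), ".staged-*")
	if err != nil {
		return nil, err
	}
	return &stagedWriter{f: f, target: target}, nil
}

func (w *stagedWriter) Write(p []byte) (int, error) { return w.f.Write(p) }

// Close is idempotent: calls after the first return nil.
func (w *stagedWriter) Close() error {
	if w.closed {
		return nil
	}
	w.closed = true
	return w.f.Close()
}

// Cancel discards the staged data and leaves the target untouched.
func (w *stagedWriter) Cancel() error {
	w.Close()
	return os.Remove(w.f.Name())
}

// Commit flushes the data, renames it over the target, then syncs the
// directory so the rename itself is persisted.
func (w *stagedWriter) Commit() error {
	err := w.f.Sync()
	if err == nil {
		err = w.Close()
	}
	if err == nil {
		err = os.Rename(w.f.Name(), w.target)
	}
	if err != nil {
		w.Cancel() // remove the temporary file on any failure
		return err
	}
	dir, err := os.Open(filepath.Dir(w.target))
	if err != nil {
		return err
	}
	defer dir.Close()
	return dir.Sync()
}

// demo commits "v1", cancels a second write, and reports the final content
// and the number of files left in the directory.
func demo() (string, int, error) {
	tmpDir, err := os.MkdirTemp("", "staged-demo")
	if err != nil {
		return "", 0, err
	}
	defer os.RemoveAll(tmpDir)
	target := filepath.Join(tmpDir, "blob")

	for i, content := range []string{"v1", "partial"} {
		w, err := newStagedWriter(target)
		if err != nil {
			return "", 0, err
		}
		if _, err := w.Write([]byte(content)); err != nil {
			return "", 0, err
		}
		finish := w.Commit
		if i == 1 {
			finish = w.Cancel // abort the second write
		}
		if err := finish(); err != nil {
			return "", 0, err
		}
	}
	got, err := os.ReadFile(target)
	if err != nil {
		return "", 0, err
	}
	entries, err := os.ReadDir(tmpDir)
	return string(got), len(entries), err
}

func main() {
	content, files, err := demo()
	fmt.Println(content, files, err)
}
```

Syncing a directory handle is a POSIX convention; on platforms where it is unsupported, that last step would need to be skipped.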
closeobserve
a6ce1a7995 chore: make function comment match function name
Signed-off-by: closeobserve <pingcap@yahoo.com>
2025-04-13 17:40:27 +08:00
Lucas Melchior
ea6ab3652c fix newClient in azure storage provider
it can now return a client using default azure credentials
updated docs to include information on Azure Workload Identity

Signed-off-by: Lucas Melchior <lucasmelchior@flywheel.io>

fix anchor link in docs

Signed-off-by: Lucas Melchior <lucasmelchior@flywheel.io>
2025-04-08 10:22:34 -05:00
Milos Gajdos
369663e4be
Fix S3 driver loglevel param
Unfortunately YAML struck us hard in this one.
It interprets "off" as a boolean, so setting loglevel to off sets it
to false.

This commit makes sure we set the loglevel to off if the param is
marshalled into false and if it's not a string.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2025-04-01 08:15:26 -07:00
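The normalization described in this message could be sketched as below. It assumes the parameter arrives as an untyped value from a YAML 1.1 style decoder, where an unquoted `off` becomes the boolean `false`; `logLevelParam` is a hypothetical helper, and rejecting other non-string values is a choice made for this sketch, not necessarily what the driver does.

```go
package main

import "fmt"

// logLevelParam normalizes a decoded "loglevel" parameter (hypothetical helper).
// Strings pass through unchanged; the boolean false, which is what an
// unquoted `off` decodes to, is mapped back to the string "off".
func logLevelParam(v any) (string, error) {
	switch val := v.(type) {
	case string:
		return val, nil
	case bool:
		if !val {
			return "off", nil
		}
	}
	// Assumption of this sketch: any other value is reported as an error.
	return "", fmt.Errorf("invalid loglevel value: %v", v)
}

func main() {
	for _, v := range []any{false, "debug", "off", true} {
		level, err := logLevelParam(v)
		fmt.Printf("%v -> %q (err: %v)\n", v, level, err)
	}
}
```

Quoting the value in the configuration file (`loglevel: "off"`) avoids the boolean conversion in the first place.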
Milos Gajdos
ebd20d3be7
Azure driver retry fix (#4576) 2025-03-14 10:20:25 -07:00
Milos Gajdos
2ffa1171c2
Azure driver fix
* Make copy poll max retry a global driver max retry
* Get support for etags in Azure
* Fix storage driver tests
* Fix auth mess and update docs
* Refactor Azure client and enable Azure storage tests

We use Azurite for integration testing which requires TLS,
so we had to figure out how to skip TLS verification when running tests locally:
this required updating testsuites Driver and constructor due to TestRedirectURL
sending GET and HEAD requests to remote storage which in this case is Azurite.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2025-03-14 10:03:09 -07:00
Oleg Gnusarev
b30274f26c use cached blob statter in ManifestService if available
Signed-off-by: Oleg Gnusarev <ognusarev@mts.ru>
2025-03-11 19:41:25 +03:00
Milos Gajdos
7884c71297
Add code comment
Adding a code comment that explains setting MD5 Sum field.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2025-03-01 07:35:41 -08:00
Milos Gajdos
e20645c050
Enable MD5 check on GCS driver
Apparently you can upload 0-size content without GCS reporting any errors
back to you.

This is something a lot of our users experienced and reported. See here
for at least one example:
github.com/distribution/distribution/issues/3018

This sets the MD5 sum on the uploaded content which should rectify
things according to the docs:
https://pkg.go.dev/cloud.google.com/go/storage#ObjectAttrs

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2025-02-28 07:20:48 -08:00
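Computing the checksum is the standard-library half of this change and can be sketched as below. `contentMD5` and `md5Hex` are hypothetical helper names. In the Google Cloud Storage Go client the resulting bytes would be assigned to the `MD5` field of the writer's `ObjectAttrs` before writing; that step is left out so the sketch stays self-contained.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// contentMD5 returns the MD5 digest of content as a byte slice, the form
// used by the MD5 field of storage.ObjectAttrs.
func contentMD5(content []byte) []byte {
	sum := md5.Sum(content)
	return sum[:]
}

// md5Hex renders the digest in hexadecimal for logging or comparison.
func md5Hex(content []byte) string {
	return hex.EncodeToString(contentMD5(content))
}

func main() {
	fmt.Println(md5Hex([]byte("hello"))) // prints 5d41402abc4b2a76b9719d911017c592
}
```

With the digest set, an upload whose received bytes do not match, such as a body truncated to zero length, is rejected instead of being stored silently.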
Rafael Fonseca
a032989bf9 registry/storage: add option to quiet GC output.
Consumers might not want GC output to be displayed (e.g., if you have
your own logging system).

Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>
2025-02-02 10:18:45 +01:00
Thomas Way
5ee5aaa058
fix(registry/storage/driver/s3-aws): use a consistent multipart chunk size
Some S3 compatible object storage systems like R2 require that all
multipart chunks are the same size. This was mostly true before, except
the final chunk was larger than the requested chunk size which causes
uploads to fail.

In addition, the two byte slices have been replaced with a single
*bytes.Buffer and the surrounding code simplified significantly.

Fixes: #3873

Signed-off-by: Thomas Way <thomas@6f.io>
2024-10-30 21:46:36 +00:00
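The chunking rule described here, where every part is exactly the chunk size and only the final part may be smaller, can be illustrated with a single `bytes.Buffer`. `partSizes` is a hypothetical helper that only reports the sizes that would be uploaded; it is not the driver's writer.

```go
package main

import (
	"bytes"
	"fmt"
)

// partSizes drains data through one buffer in chunkSize steps and returns the
// size of each part. Every part is exactly chunkSize bytes except the last,
// which holds whatever remains and is never larger than chunkSize.
func partSizes(data []byte, chunkSize int) []int {
	var sizes []int
	buf := bytes.NewBuffer(data)
	for buf.Len() > 0 {
		part := buf.Next(chunkSize) // returns at most chunkSize bytes
		sizes = append(sizes, len(part))
	}
	return sizes
}

func main() {
	fmt.Println(partSizes(make([]byte, 10), 4)) // prints [4 4 2]
}
```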
Milos Gajdos
bce9fcd135
avoid appending directory as file path in s3 driver Walk (#4485) 2024-10-16 21:14:56 +01:00
Flavian Missi
2e7482cb89 avoid appending directory as file path in s3 driver Walk
when a directory is empty, the s3 api lists it with a trailing slash.
this causes the path to be appended twice to the walkInfo slice, causing
purge uploads path transformations to panic when the `_uploads` directory is
empty.

this adds a check for file paths ending in a slash, and does not append
those as regular files to the walkInfo slice.

fixes #4358

Signed-off-by: Flavian Missi <fmissi@redhat.com>
2024-10-14 14:53:31 +02:00
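The check described above amounts to a suffix test on each listed key. A minimal sketch, with `fileKeys` as a hypothetical helper operating on plain strings rather than the driver's walkInfo values:

```go
package main

import (
	"fmt"
	"strings"
)

// fileKeys keeps only keys that name objects and drops the placeholder
// entries that S3 lists with a trailing slash for empty directories.
func fileKeys(keys []string) []string {
	var files []string
	for _, key := range keys {
		if strings.HasSuffix(key, "/") {
			continue // a directory marker, not a regular file
		}
		files = append(files, key)
	}
	return files
}

func main() {
	listed := []string{"repo/_uploads/", "repo/_layers/sha256/abc/link"}
	fmt.Println(fileKeys(listed)) // prints [repo/_layers/sha256/abc/link]
}
```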
Flavian Missi
e44d9317d0 test s3 driver walk of empty dir
Signed-off-by: Flavian Missi <fmissi@redhat.com>
2024-10-14 14:53:26 +02:00
Sebastiaan van Stijn
0ab7f326e6
replace uses of Descriptor alias
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-10-07 13:07:47 +02:00
Milos Gajdos
a940e61623
Fix silly testing format mistakes
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2024-08-30 11:18:18 +01:00
Milos Gajdos
170ac07a5e
chore: bump golangci-lint and fix govet issues
The latest golangci-lint spits out some govet issues.
This commit fixes them. We are also bumping the linter version.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2024-08-30 10:28:24 +01:00
Milos Gajdos
d8199f451b
chore: fix typo in rewrite storage middleware init
https://github.com/distribution/distribution/pull/4146 introduced a new
rewrite storage middleware but somehow missed updating the init logging
message. This commit fixes that.

Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2024-08-15 08:59:30 +01:00
Liang Zheng
db5c303e7e fix: skip removing layer's link file when '--dry-run' option specified
Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
2024-07-31 23:21:45 +08:00
Milos Gajdos
91eda593ef
chore: fix typos returned in some errors
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2024-07-21 10:12:15 +01:00
Jan-Otto Kröpke
8619a11f73
fix nil pointer in s3 list api
Signed-off-by: Jan-Otto Kröpke <github@jkroepke.de>
2024-07-19 15:12:54 +02:00
Milos Gajdos
252619876a
fix logic for handling regionEndpoint (#4341) 2024-07-18 22:56:58 +01:00
Sebastiaan van Stijn
1e89cf780c
deprecate Versioned in favor of oci.Versioned
Update the Manifest types to use the oci implementation of the Versioned
struct.

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-07-18 18:38:32 +02:00
Milos Gajdos
a18cc8a656
S3 driver: Attempt HeadObject on Stat first, fail over to List
Stat always calls ListObjects when stat-ing an S3 key.
Unfortunately ListObjects is not a free call - both in terms of egress
and actual AWS costs (likely because of the egress).

This changes the behaviour of Stat such that we always attempt the
HeadObject call first and only ever fall through to ListObjects if the
HeadObject returns an AWS API error.

Note that the official docs mention that the only error returned by
HEAD is NoSuchKey; experiments show that this is demonstrably wrong and
the AWS docs are simply outdated at the time of this commit.

HeadObject actually returns the following errors:
* NotFound: if the queried key does not exist
* NotFound: if the queried key contains subkeys i.e. it's a prefix
* BucketRegionError: if the bucket does not exist
* Forbidden: if the Head operation is not allowed via IAM/ACLs

Co-authored-by: Cory Snider <corhere@gmail.com>
Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Milos Gajdos <milosthegajdos@gmail.com>
2024-07-17 10:16:54 +01:00
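The head-first, list-on-error flow can be sketched independently of the AWS SDK. Everything below is hypothetical scaffolding: `objectInfo`, `statWith`, and the `head` and `list` callbacks stand in for the driver's FileInfo, Stat, HeadObject, and ListObjects calls.

```go
package main

import (
	"errors"
	"fmt"
)

// objectInfo is a hypothetical stand-in for the driver's FileInfo.
type objectInfo struct {
	Path  string
	IsDir bool
}

// errNotFound imitates the API error a head call returns for a prefix.
var errNotFound = errors.New("NotFound")

// statWith tries the cheap head call first and falls through to the more
// expensive list call only when head returns an error.
func statWith(path string, head, list func(string) (objectInfo, error)) (objectInfo, error) {
	info, err := head(path)
	if err == nil {
		return info, nil
	}
	return list(path)
}

func main() {
	head := func(p string) (objectInfo, error) {
		if p == "blobs/data" {
			return objectInfo{Path: p}, nil
		}
		return objectInfo{}, errNotFound
	}
	list := func(p string) (objectInfo, error) {
		return objectInfo{Path: p, IsDir: true}, nil
	}
	fmt.Println(statWith("blobs/data", head, list))
	fmt.Println(statWith("blobs", head, list))
}
```

Keys that exist as objects never reach the list call, which is where the savings described in the message come from.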
Cory Snider
671184e910
Remove ManifestBuilder interface
Defining an interface on the implementer side is generally not best
practice in Go code. There is no code in the distribution module which
consumes a ManifestBuilder value so there is no need to define the
interface in the distribution module. Export the concrete
ManifestBuilder types and modify the constructors to return concrete
values.

Co-authored-by: Sebastiaan van Stijn <github@gone.nl>
Signed-off-by: Cory Snider <csnider@mirantis.com>
2024-07-16 11:16:06 +02:00
Sebastiaan van Stijn
9ba7340601
vendor: github.com/opencontainers/image-spec v1.1.0
full diff: https://github.com/opencontainers/image-spec/compare/v1.0.2...v1.1.0

Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
2024-07-10 14:58:09 -05:00
Milos Gajdos
4dd0ac977e
feat: implement 'rewrite' storage middleware (#4146) 2024-07-04 16:16:29 +01:00
Milos Gajdos
306f4ff71e
Replace custom Redis config struct with go-redis UniversalOptions (adds sentinel & cluster support) (#4306) 2024-07-04 16:00:37 +01:00
Andrey Smirnov
558ace1391
feat: implement 'rewrite' storage middleware
This allows rewriting 'URLFor' of the storage driver to use a specific
host/trim the base path.

It is different from the 'redirect' middleware, as it still calls the
storage driver URLFor.

For example, with Azure storage provider, this allows transforming the
SAS Azure Blob Storage URL into the URL compatible with Azure Front
Door.

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
2024-07-04 18:49:25 +04:00
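The host replacement and base-path trimming could be sketched with `net/url` as below. `rewriteURL` and its parameters are hypothetical and the hostnames are made up; the middleware's real option names are not given in this log.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// rewriteURL swaps the host of rawURL and trims trimPrefix from its path,
// keeping the query string (for example a SAS signature) untouched.
// Empty host or trimPrefix values leave that part unchanged.
func rewriteURL(rawURL, host, trimPrefix string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if host != "" {
		u.Host = host
	}
	if trimPrefix != "" {
		u.Path = strings.TrimPrefix(u.Path, trimPrefix)
	}
	return u.String(), nil
}

func main() {
	// Made-up URLs for illustration only.
	out, err := rewriteURL(
		"https://account.blob.example.net/container/blobs/sha256?sig=abc",
		"cdn.example.com", "/container")
	fmt.Println(out, err) // prints https://cdn.example.com/blobs/sha256?sig=abc <nil>
}
```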
Liang Zheng
d9050bb917 remove layer's link file by gc
Garbage collection should remove unused layer link files

P.S. This was originally contributed by @m-masataka; I would now like to take it over.
Thanks to @m-masataka for the efforts in PR https://github.com/distribution/distribution/pull/2288

Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
2024-07-03 00:16:11 +08:00
Anders Ingemann
b63cbb3318
Replace custom Redis config struct with go-redis UniversalOptions
Huge help from @milosgajdos who figured out how to do the entire
marshalling/unmarshalling for the configs

Signed-off-by: Anders Ingemann <aim@orbit.online>
2024-06-14 10:31:09 +02:00
James Hewitt
c40c4b289a
Enable configuration of index dependency validation
Enable configuration options that can selectively disable validation
that dependencies exist within the registry before the image index
is uploaded.

This enables sparse indexes, where a registry holds a manifest index that
could be signed (so the digest must not change) but does not hold every
referenced image in the index. The use case for this is when a registry
mirror does not need to mirror all platforms, but does need to maintain
the digests of all manifests either because they are signed or because
they are pulled by digest.

The registry administrator can also select specific image architectures
that must exist in the registry, enabling a registry operator to select
only the platforms they care about and ensure all image indexes uploaded
to the registry are valid for those platforms.

Signed-off-by: James Hewitt <james.hewitt@uk.ibm.com>
2024-05-28 09:56:14 +01:00
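The selective validation described here might be modelled as below. This is a loose sketch with hypothetical names (`manifestRef`, `missingRequired`) and a simplified notion of platform that only looks at the architecture; the registry's actual configuration keys and validation code are not shown in this log.

```go
package main

import "fmt"

// manifestRef is a simplified, hypothetical view of one image index entry.
type manifestRef struct {
	Digest       string
	Architecture string
}

// missingRequired returns the digests of index entries whose architecture is
// in required but whose manifest is absent from the registry. Entries for
// other architectures may be missing, which is what makes the index sparse.
func missingRequired(index []manifestRef, required map[string]bool, exists func(digest string) bool) []string {
	var missing []string
	for _, ref := range index {
		if required[ref.Architecture] && !exists(ref.Digest) {
			missing = append(missing, ref.Digest)
		}
	}
	return missing
}

func main() {
	index := []manifestRef{
		{Digest: "sha256:aaa", Architecture: "amd64"},
		{Digest: "sha256:bbb", Architecture: "arm64"},
	}
	stored := map[string]bool{"sha256:aaa": true}
	exists := func(d string) bool { return stored[d] }

	// Only amd64 is required, so the absent arm64 manifest is tolerated.
	fmt.Println(missingRequired(index, map[string]bool{"amd64": true}, exists))
}
```

An upload would be accepted when the returned slice is empty, so the index digest stays unchanged even though some referenced manifests are not mirrored.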
Ankur Kothiwal
eb6123f5ed fix logic for handling regionEndpoint
With the current logic we only verify the region and return if it's
empty; we were not validating the regionEndpoint parameter.

Signed-off-by: Ankur Kothiwal <ankur.kothiwal@cern.com>
2024-05-07 17:03:12 +02:00
Sylvain DESGRAIS
f1875862cf Set readStartAtFile context aware for purge uploads
Signed-off-by: Sylvain DESGRAIS <sylvain.desgrais@gmail.com>
2024-05-02 11:06:39 +02:00
Liang Zheng
a2afe23f38 add concurrency limits for tag lookup and untag
Harbor uses distribution for its (harbor-registry) registry component.
The harbor GC will call into the registry to delete the manifest, which in turn
then does a lookup for all tags that reference the deleted manifest.
To find the tag references, the registry will iterate every tag in the repository
and read its link file to check if it matches the deleted manifest (i.e. to see
if it uses the same sha256 digest). So, the more tags in the repository, the worse the
performance will be (as there will be more s3 API calls occurring for the tag
directory lookups and tag file reads).

Therefore, we can use concurrent lookup and untag to optimize performance as described in https://github.com/goharbor/harbor/issues/12948.

P.S. This optimization was originally contributed by @Antiarchitect; I would now like to take it over.
Thanks for @Antiarchitect's efforts with PR https://github.com/distribution/distribution/pull/3890.

Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
2024-04-26 22:32:21 +08:00
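A bounded concurrent tag lookup of the kind described above can be sketched with a buffered channel used as a semaphore. `lookupTags` and the `readLink` callback are hypothetical; the registry's real implementation and its configuration knob for the limit are not shown in this log.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// lookupTags returns the tags whose link digest equals target, reading at
// most limit links at the same time. readLink is a hypothetical stand-in for
// fetching a tag's link file from storage.
func lookupTags(tags []string, target string, limit int, readLink func(tag string) (string, error)) ([]string, error) {
	if limit < 1 {
		limit = 1
	}
	var (
		mu       sync.Mutex
		wg       sync.WaitGroup
		matches  []string
		firstErr error
	)
	sem := make(chan struct{}, limit) // one slot per in-flight lookup
	for _, tag := range tags {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit lookups are running
		go func(tag string) {
			defer wg.Done()
			defer func() { <-sem }()
			digest, err := readLink(tag)
			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = err
				}
				return
			}
			if digest == target {
				matches = append(matches, tag)
			}
		}(tag)
	}
	wg.Wait()
	if firstErr != nil {
		return nil, firstErr
	}
	sort.Strings(matches) // goroutine completion order is not deterministic
	return matches, nil
}

func main() {
	links := map[string]string{"v1": "sha256:aaa", "v2": "sha256:bbb", "latest": "sha256:bbb"}
	readLink := func(tag string) (string, error) { return links[tag], nil }
	fmt.Println(lookupTags([]string{"v1", "v2", "latest"}, "sha256:bbb", 2, readLink))
}
```

The limit caps how many storage requests are outstanding at once, which matters when each lookup is a remote S3 call.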
Liang Zheng
112156321f fix: ignore error of manifest tag path not found in gc
it is reasonable to ignore the error that the manifest tag path does not exist when querying
all tags of the specified repository when executing gc.

Signed-off-by: Liang Zheng <zhengliang0901@gmail.com>
2024-04-25 17:13:06 +08:00
Milos Gajdos
e6d1d182bf
Allow setting s3 forcepathstyle without regionendpoint (#4291) 2024-04-24 08:34:01 +01:00
Milos Gajdos
e8ea4e5951
chore: fix some typos in comments (#4332) 2024-04-23 09:03:51 +01:00
goodactive
e0a1ce14a8 chore: fix some typos in comments
Signed-off-by: goodactive <goodactive@qq.com>
2024-04-23 12:04:03 +08:00
Anthony Ramahay
601b37d98b Handle OCI image index and V2 manifest list during garbage collection
Signed-off-by: Anthony Ramahay <thewolt@gmail.com>
2024-04-20 16:41:50 +02:00
Benjamin Schanzel
8654a0ee45
Allow setting s3 forcepathstyle without regionendpoint
Currently, the `forcepathstyle` parameter for the s3 storage driver is
considered only if the `regionendpoint` parameter is set. Since setting
a region endpoint explicitly is discouraged with AWS s3, it is not clear
how to enforce path style URLs with AWS s3.
This also means that the default value (true) only applies if a region
endpoint is configured.

This change makes sure we always forward the `forcepathstyle` parameter
to the aws-sdk if present in the config. This is a breaking change where
a `regionendpoint` is configured but no explicit `forcepathstyle` value
is set.

Signed-off-by: Benjamin Schanzel <benjamin.schanzel@bmw.de>
2024-04-08 12:45:26 +02:00
xiaoxiangxianzi
2446e1102d chore: remove repetitive words in comments
Signed-off-by: xiaoxiangxianzi <zhaoyizheng@outlook.com>
2024-03-27 17:34:22 +08:00
Tadeusz Dudkiewicz
de450c903a update: support redirects in gcs storage with default credentials
Signed-off-by: Tadeusz Dudkiewicz <tadeusz.dudkiewicz@rtbhouse.com>
2024-03-11 21:05:03 +01:00
gotgelf
f690b3ebe2 Added Open Telemetry Tracing to Filesystem package
Signed-off-by: gotgelf <gotgelf@gmail.com>
2024-03-04 13:31:22 +01:00