This reverts commit 5a00a041f1.
Implementation is currently not correct. Multiple uploads can share the
same etag but have different paths in the S3 bucket.
A "bad upload" in this context is a upload with a mismatched URL. This can happen when changing the S3 bucket used for uploads and the upload records in the database have not been remapped correctly.
Badges can have their associated image uploads deleted. When this happens, any user who has that badge will have their profile page error out.
After this fix, when deleting an upload that's associated with a badge, we nullify the foreign key ID on the badge. This makes the existing safeguard work correctly.
This commit introduces the following changes which allows a site
administrator to mark `Upload` records with the `s3_file_missing`
verification status which will result in the `Upload` record being ignored when
`Discourse.store.list_missing_uploads` is ran on a site where S3 uploads
are enabled and `SiteSetting.enable_s3_inventory` is set to `true`.
1. Introduce `s3_file_missing` to `Upload.verification_statuses`
2. Introduce `Upload.mark_invalid_s3_uploads_as_missing` which updates
`Upload#verification_status` of all `Upload` records from `invalid_etag` to `s3_file_missing`.
3. Introduce `rake uploads:mark_invalid_s3_uploads_as_missing` Rake task
which allows a site administrator to change `Upload` records with
`invalid_etag` verification status to the `s3_file_missing`
verificaton_status.
4. Update `S3Inventory` to ignore `Upload` records with the
`s3_file_missing` verification status.
The `-ping` option significantly speeds up the ImageMagick `identify` command per our testing and the [documentation](https://imagemagick.org/script/command-line-options.php#ping):
> -ping
Efficiently determine these image characteristics: image number, the file name, the width and height of the image, whether the image is colormapped or not, the number of colors in the image, the number of bytes in the image, the format of the image (JPEG, PNM, etc.). Use +ping to ensure accurate image properties.
We already pass the `-ping` option in other places where the `identify` command is used, so it makes sense to use the option everywhere.
Internal topic: t/121431.
When rebaking and in various other places for posts, we
run through the uploads and call `update_secure_status` on
each of them.
However, if the secure status didn't change, we were still
calling S3 to change the ACL, which would have been a noop
in many cases and takes ~1 second per call, slowing things
down a lot.
Also, we didn't account for the s3_acls_enabled site setting
being false here, and in the specs doing an assertion
that `Discourse.store.update_ACL` is not called doesn't
work; `Discourse.store` isn't a singleton, it re-initializes
`FileStore::S3Store.new` every single time.
There are cases where a user can copy image markdown from a public
post (such as via the discourse-templates plugin) into a PM which
is then sent via an email. Since a PM is a secure context (via the
.with_secure_uploads? check on Post), the image will get a secure
URL in the PM post even though the backing upload is not secure.
This fixes the bug in that case where the image would be stripped
from the email (since it had a /secure-uploads/ URL) but not re-attached
further down the line using the secure_uploads_allow_embed_images_in_emails
setting because the upload itself was not secure.
The flow in Email::Sender for doing this is still not ideal, but
there are chicken and egg problems around when to strip the images,
how to fit in with other attachments and email size limits, and
when to apply the images inline via Email::Styles. It's convoluted,
but at least this fixes the Template use case for now.
In #21498, we split `BaseStore#download` into a "safe" version which returns nil on errors, and an "unsafe" version which raises an exception, which was the old behaviour of `#download`.
This change updates call sites that used the old `#download`, which raised exceptions, to use the new `#download!` to preserve behaviour (and silence deprecation warnings.)
It also silences the deprecation warning in tests.
16 bit images were not returning the correct dominant color due truncation
The routine expected an 8bit color eg: #FFAA00, but ended up getting a 16bit one eg: #FFFAAA000. This caused a truncation, which leads to wildly off colors.
We've had the UploadReference table for some time now in core,
but it was added after ChatUpload was and chat was just never
moved over to this new system.
This commit changes all chat code dealing with uploads to create/
update/delete/query UploadReference records instead of ChatUpload
records for consistency. At a later date we will drop the ChatUpload
table, but for now keeping it for data backup.
The migration + post migration are the same, we need both in case
any chat uploads are added/removed during deploy.
The previous fix in e83d35d6 was incorrect, and the stub in the test was never actually hit. This commit moves the error handling to the right place and updates the specs to ensure the stub is always used.
These errors tend to indicate that the upload is missing on the remote store. This is bad, but we don't want it to block the dominant-color calculation process. This commit catches errors when there is an HTTP error, and fixes the `base_store.rb` implementation when `FileHelper.download` returns nil.
This commit renames all secure_media related settings to secure_uploads_* along with the associated functionality.
This is being done because "media" does not really cover it, we aren't just doing this for images and videos etc. but for all uploads in the site.
Additionally, in future we want to secure more types of uploads, and enable a kind of "mixed mode" where some uploads are secure and some are not, so keeping media in the name is just confusing.
This also keeps compatibility with the `secure-media-uploads` path, and changes new
secure URLs to be `secure-uploads`.
Deprecated settings:
* secure_media -> secure_uploads
* secure_media_allow_embed_images_in_emails -> secure_uploads_allow_embed_images_in_emails
* secure_media_max_email_embed_image_size_kb -> secure_uploads_max_email_embed_image_size_kb
The `add_column` `limit` parameter has no effect on a postgres `text` column. Instead we can perform the check in ActiveRecord.
We never expect this condition to be hit - users cannot control this value. It's just a safety net.
We previously had a system which would generate a 10x10px preview of images and add their URLs in a data-small-upload attribute. The client would then use that as the background-image of the `<img>` element. This works reasonably well on fast connections, but on slower connections it can take a few seconds for the placeholders to appear. The act of loading the placeholders can also break or delay the loading of the 'real' images.
This commit replaces the placeholder logic with a new approach. Instead of a 10x10px preview, we use imagemagick to calculate the average color of an image and store it in the database. The hex color value then added as a `data-dominant-color` attribute on the `<img>` element, and the client can use this as a `background-color` on the element while the real image is loading. That means no extra HTTP request is required, and so the placeholder color can appear instantly.
Dominant color will be calculated:
1. When a new upload is created
2. During a post rebake, if the dominant color is missing from an upload, it will be calculated and stored
3. Every 15 minutes, 25 old upload records are fetched and their dominant color calculated and stored. (part of the existing PeriodicalUpdates job)
Existing posts will continue to use the old 10x10px placeholder system until they are next rebaked
We do not zero-pad our base62 short URLs, so there is no guarantee that the length is 27. Instead, let's greedily match all consecutive base62 characters and look for a matching upload.
This reverts bd32656157 and 36f5d5eada.
This table holds associations between uploads and other models. This can be used to prevent removing uploads that are still in use.
* DEV: Create upload_references
* DEV: Use UploadReference instead of PostUpload
* DEV: Use UploadReference for SiteSetting
* DEV: Use UploadReference for Badge
* DEV: Use UploadReference for Category
* DEV: Use UploadReference for CustomEmoji
* DEV: Use UploadReference for Group
* DEV: Use UploadReference for ThemeField
* DEV: Use UploadReference for ThemeSetting
* DEV: Use UploadReference for User
* DEV: Use UploadReference for UserAvatar
* DEV: Use UploadReference for UserExport
* DEV: Use UploadReference for UserProfile
* DEV: Add method to extract uploads from raw text
* DEV: Use UploadReference for Draft
* DEV: Use UploadReference for ReviewableQueuedPost
* DEV: Use UploadReference for UserProfile's bio_raw
* DEV: Do not copy user uploads to upload references
* DEV: Copy post uploads again after deploy
* DEV: Use created_at and updated_at from uploads table
* FIX: Check if upload site setting is empty
* DEV: Copy user uploads to upload references
* DEV: Make upload extraction less strict
This will make future changes to the 'pull hotlinked images' system easier. This commit should not introduce any functional change.
For now, the old post_custom_field data is kept in the database. This will be dropped in a future commit.
Due to default CSP web workers instantiated from CDN based assets are still
treated as "same-origin" meaning that we had no way of safely instansiating
a web worker from a theme.
This limits the theme system and adds the arbitrary restriction that WASM
based components can not be safely used.
To resolve this limitation all js assets in about.json are also cached on
local domain.
{
"name": "Header Icons",
"assets" : {
"worker" : "assets/worker.js"
}
}
This can then be referenced in JS via:
settings.theme_uploads_local.worker
local_js_assets are unconditionally served from the site directly and
bypass the entire CDN, using the pre-existing JavascriptCache
Previous to this change this code was completely dormant on sites which
used s3 based uploads, this reuses the very well tested and cached asset
system on s3 based sites.
Note, when creating local_js_assets it is highly recommended to keep the
assets lean and keep all the heavy working in CDN based assets. For example
wasm files can still live on the CDN but the lean worker that loads it can
live on local.
This change unlocks wasm in theme components, so wasm is now also allowed
in `theme_authorized_extensions`
* more usages of upload.content
* add a specific test for upload.content
* Adjust logic to ensure that after upgrades we still get a cached local js
on save
This commit introduces two new APIs for handling unused uploads, one
can be used to exclude uploads in bulk when the data model allow and
the other one excludes uploads one by one.
This is necessary to allow for large file uploads via
the direct S3 upload mechanism, as we convert the external
file to an Upload record via ExternalUploadManager once
it is complete.
This will allow for files larger than 2,147,483,647 bytes (2.14GB)
to be referenced in the uploads table.
This is a table locking migration, but since it is not as highly
trafficked as posts, topics, or users, the disruption should be minimal.
When secure media is enabled or when upload secure status
is updated, we also try and update the upload ACL. However
if the object storage provider does not implement this we
get an Aws::S3::Errors::NotImplemented error. This PR handles
this error so the update_secure_status method does not error
out and still returns whether the secure status changed.
Previously certain images may lead to convert / identify to run for unreasonable
amounts of time
This adds a maximum amount of time these commands can run prior to forcing
them to stop
SVG files can have dimensions expressed in inches, centimeters, etc., which may lead to the dimensions being misinterpreted (e.g. “8in” ends up as 8 pixels).
If the file type is `svg`, ask ImageMagick to work out what size the SVG file should be rendered on screen.
NOTE: The `pencil.svg` file was obtained from https://freesvg.org/1534028868, which has placed the file in to the public domain.
This PR adds security_last_changed_at and security_last_changed_reason to uploads. This has been done to make it easier to track down why an upload's secure column has changed and when. This necessitated a refactor of the UploadSecurity class to provide reasons why the upload security would have changed.
As well as this, a source is now provided from the location which called for the upload's security status to be updated as they are several (e.g. post creator, topic security updater, rake tasks, manual change).
* FEATURE - Add SiteSettings to control JPEG image quality
`recompress_original_jpg_quality` - the maximum quality of a newly
uploaded file.
`image_preview_jpg_quality` - the maximum quality of OptimizedImages
Dependency on gifsicle, allow_animated_avatars and allow_animated_thumbnails
site settings were all removed. Animated GIF images are still allowed, but
the generated optimized images are no longer animated for those (which were
used for avatars and thumbnails).
The added 'animated' is populated by extracting information using FastImage.
This field was used to selectively reoptimize old animations. This process
happens in the background.
Upload.secure_media_url? raised an exceptions when the URL was invalid,
which was a issue in some situations where secure media URLs must be
removed.
For example, sending digests used PrettyText.strip_secure_media,
which used Upload.secure_media_url? to replace secure media with
placeholders. If the URL was invalid, then an exception would be raised
and left unhandled.
Now instead in UrlHelper.rails_route_from_url we return nil if there is something wrong with the URL.
Co-authored-by: Bianca Nenciu <nenciu.bianca@gmail.com>
There is an fk to user_profile that can make destroying uploads fail
if they happen to be set as user profile.
This ensures we clear this information when destroying uploads.
There are more relationships, but this makes some more progress.
The previous fix (f43c0a5d85) wasn't working for images that were already uploaded.
The "metadata" (eg. 'for_*' and 'secure' attributes) were not added to existing uploads.
Also used 'Upload.get_from_url' is the admin/site_setting controller to properly retrieve
an upload from its URL.
Fixed the Upload::URL_REGEX to use the \h (hexadecimal) for the SHA
Follow-up-to: f43c0a5d85
This reverts commit 20780a1eee.
* SECURITY: re-adds accidentally reverted commit:
03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
instead of the 03d26cd6 parent (which contains security fixes)