Upload.secure_media_url? raised an exceptions when the URL was invalid,
which was a issue in some situations where secure media URLs must be
removed.
For example, sending digests used PrettyText.strip_secure_media,
which used Upload.secure_media_url? to replace secure media with
placeholders. If the URL was invalid, then an exception would be raised
and left unhandled.
Now instead in UrlHelper.rails_route_from_url we return nil if there is something wrong with the URL.
Co-authored-by: Bianca Nenciu <nenciu.bianca@gmail.com>
Extracted commonly used spec helpers into spec/support/uploads_helpers.rb, removed unused stubs and let definitions. Makes it easier to write new S3-related specs without copy and pasting setup steps from other specs.
There is an fk to user_profile that can make destroying uploads fail
if they happen to be set as user profile.
This ensures we clear this information when destroying uploads.
There are more relationships, but this makes some more progress.
* When copying the markdown for an image between posts, we were not adding the srcset and data-small-image attributes which are done by calling optimize_image! in cooked post processor
* Refactored the code which was confusing in its current state (the consider_for_reuse method was super confusing) and fixed the issue
If the “secure media” site setting is enabled then ALL files uploaded to Discourse (images, video, audio, pdf, txt, zip etc. etc.) will follow the secure media rules. The “prevent anons from downloading files” setting will no longer have any bearing on upload security. Basically, the feature will more appropriately be called “secure uploads” instead of “secure media”.
This is being done because there are communities out there that would like all attachments and media to be secure based on category rules but still allow anonymous users to download attachments in public places, which is not possible in the current arrangement.
* Attachments (non media files) were being marked as secure if just
SiteSetting.prevent_anons_from_downloading_files was enabled. this
was not correct as nothing should be marked as actually "secure" in
the DB without that site setting enabled
* Also add a proper standalone spec file for the upload security class
Further on from my earlier PR #8973 also reject upload as secure if its origin URL contains images/emoji. We still check Emoji.all first to try and be canonical.
This may be a little heavy handed (e.g. if an external URL followed this same path it would be a false positive), but there are a lot of emoji aliases where the actual Emoji url is something, but you can have another image that should not be secure that that thing is an alias for. For example slight_smile.png does not show up in Emoji.all BUT slightly_smiling_face does, and it aliases slight_smile e.g. /images/emoji/twitter/slight_smile.png?v=9 and /images/emoji/twitter/slightly_smiling_face.png?v=9 are equivalent.
Sometimes PullHotlinkedImages pulls down a site emoji and creates a new upload record for it. In the cases where these happen the upload is not created via the normal path that custom emoji follows, so we need to check in UploadSecurity whether the origin of the upload is based on a regular site emoji. If it is we never want to mark it as secure (we don't want emoji not accessible from other posts because of secure media).
This only became apparent because the uploads:ensure_correct_acl rake task uses UploadSecurity to check whether an upload should be secure, which would have marked a whole bunch of regular-old-emojis as secure.
* Because custom emoji count as post "uploads" we were
marking them as secure when updating the secure status for post uploads.
* We were also giving them an access control post id, which meant
broken image previews from 403 errors in the admin custom emoji list.
* We now check if an upload is used as a custom emoji and do not
assign the access control post + never mark as secure.
Basically, say you had already downloaded a certain image from a certain URL
using pull_hotlinked_images and the onebox. The upload would be stored
by its sha as an upload record. Whenever you linked to the same URL again
in a post (e.g. in our case an og:image on review.discourse) we would
would reuse the original upload record because of the sha1.
However when you turned on secure media this could cause problems as
the first post that uses that upload after secure media is enabled
will set the access control post for the upload to the new post.
Then if the post is deleted every single onebox/link to that same image
URL will fail forever with 403 as the secure-media-uploads URL fails
if the access control post has been deleted.
To fix this when cooking posts and pulling hotlinked images, we only
allow using an original upload by URL if its access control post
matches the current post, and if the original_sha1 is filled in,
meaning it was uploaded AFTER secure media was enabled. otherwise
we just redownload the media again to be safe, as the URL will always
be new then.
### General Changes and Duplication
* We now consider a post `with_secure_media?` if it is in a read-restricted category.
* When uploading we now set an upload's secure status straight away.
* When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file.
* Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is).
* When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload.
### Viewing Secure Media
* The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions
* If there is no `access_control_post` then we just deliver the media. This should be a rare occurrance and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor`
### Removed
We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled.
* We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context.
* We no longer have to set `secure: false` for uploads when uploading for a theme component.
When uploading a file to a theme component, and that file is existing and has already been marked as secure, we now automatically mark the file as secure: false, change the ACL, and log the action as the user (also rebake the posts for the upload)
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access.
A few notes:
- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
This change both speeds up specs (less strings to allocate) and helps catch
cases where methods in Discourse are mutating inputs.
Overall we will be migrating everything to use #frozen_string_literal: true
it will take a while, but this is the first and safest move in this direction
This is for backwards compatibility purposes. Even if `Upload#url` has a
format that we don't recognize, we should still return the upload object
as long as the upload record is present.
Previously if upload had missing width and height we would calculate
on first use BUT we (me) forgot to save this to the database
This was particularly bad on home page cause category images (when old)
miss dimensions.
For a normal upload url
Before
```
Warming up --------------------------------------
264.000 i/100ms
Calculating -------------------------------------
2.754k (± 8.4%) i/s - 13.728k in 5.022066s
```
After
```
Warming up --------------------------------------
341.000 i/100ms
Calculating -------------------------------------
3.435k (±11.6%) i/s - 17.050k in 5.045676s
```
Previously we used width and height for thumbnails, new code ensures
1. We auto correct width and height
2. We added extra columns for thumbnail_width and height, this is determined
by actual upload and no longer passed in as a side effect
3. Optimized Image now stores filesize which can be used for analysis, decisions
Also
- fixes Android image manifest as a side effect
- fixes issue where a thumbnail generated that is smaller than the upload is no longer used