Commit Graph

44 Commits

Author SHA1 Message Date
Jarek Radosz
3d55f2e3b7
FIX: Improvements and fixes to the image downsizing script (#9950)
Fixed bugs, added specs, extracted the upload downsizing code to a class, added support for non-S3 setups, changed it so that images aren't downloaded twice.

This code has been tested on production and successfully resized ~180k uploads.

Includes:

* DEV: Extract upload downsizing logic
* DEV: Add support for non-S3 uploads
* DEV: Process only images uploaded by users
* FIX: Incorrect usage of `count` and `exist?` typo
* DEV: Spec S3 image downsizing
* DEV: Avoid downloading images twice
* DEV: Update filesizes earlier in the process
* DEV: Return false on invalid upload
* FIX: Download images that currently above the limit (If the image size limit is decreased, then there was no way to resize those images that now fall outside the allowed size range)
* Update script/downsize_uploads.rb (Co-authored-by: Régis Hanol <regis@hanol.fr>)
2020-06-11 14:47:59 +02:00
Jarek Radosz
27ad562ff5 DEV: Rubocop fix 2020-06-01 06:07:07 +02:00
Jarek Radosz
7df688d108 FIX: Handle files removed between glob and mtime 2020-06-01 05:50:50 +02:00
Sam Saffron
0cbaa8d813
FEATURE: extend duration allowed for download
Previously we would raise a warning in the logs if downloading
a file (from s3) takes longer than 60 seconds.

At scale this happens reasonably frequently.

1. Raised the duration to 3 minutes

2. Pulled the resizing mutex out of the downloading mutex
so we have less and clearer error logs
2020-05-15 12:45:47 +10:00
Sam Saffron
d0d5a138c3
DEV: stop freezing frozen strings
We have the `# frozen_string_literal: true` comment on all our
files. This means all string literals are frozen. There is no need
to call #freeze on any literals.

For files with `# frozen_string_literal: true`

```
puts %w{a b}[0].frozen?
=> true

puts "hi".frozen?
=> true

puts "a #{1} b".frozen?
=> true

puts ("a " + "b").frozen?
=> false

puts (-("a " + "b")).frozen?
=> true
```

For more details see: https://samsaffron.com/archive/2018/02/16/reducing-string-duplication-in-ruby
2020-04-30 16:48:53 +10:00
Jarek Radosz
c1c211365a
FIX: Improve clearing store cache (#9568)
1. Shorter
2. Simpler
3. Doesn't depend on external binaries
4. Doesn't fail on large amounts of files
5. Hopefully eliminates flaky spec errors
2020-04-28 17:24:04 +02:00
David Taylor
ba616ffb50
DEV: Use a tmp directory for storing uploads in tests (#9554)
This avoids development-mode upload files from polluting the test environment
2020-04-28 14:03:04 +01:00
Jarek Radosz
63a4aa65ff
DEV: Ignore ls errors when clearing FileStore cache (#8780)
A race condition issue is possible when multiple thread/processes are calling this method.
`ls` prints out to stderr "cannot access '...': No such file or directory" if any of the files it's currently trying to list are being removed by the `xargs rm -rf` in an another process. That doesn't affect the result, but it did raise an error before this change.

Tested on a production instance where the original issue was observed.

Co-Authored-By: Régis Hanol <regis@hanol.fr>
2020-01-27 02:59:54 +01:00
Martin Brennan
7c32411881
FEATURE: Secure media allowing duplicated uploads with category-level privacy and post-based access rules (#8664)
### General Changes and Duplication

* We now consider a post `with_secure_media?` if it is in a read-restricted category.
* When uploading we now set an upload's secure status straight away.
* When uploading if `SiteSetting.secure_media` is enabled, we do not check to see if the upload already exists using the `sha1` digest of the upload. The `sha1` column of the upload is filled with a `SecureRandom.hex(20)` value which is the same length as `Upload::SHA1_LENGTH`. The `original_sha1` column is filled with the _real_ sha1 digest of the file. 
* Whether an upload `should_be_secure?` is now determined by whether the `access_control_post` is `with_secure_media?` (if there is no access control post then we leave the secure status as is).
* When serializing the upload, we now cook the URL if the upload is secure. This is so it shows up correctly in the composer preview, because we set secure status on upload.

### Viewing Secure Media

* The secure-media-upload URL will take the post that the upload is attached to into account via `Guardian.can_see?` for access permissions
* If there is no `access_control_post` then we just deliver the media. This should be a rare occurrance and shouldn't cause issues as the `access_control_post` is set when `link_post_uploads` is called via `CookedPostProcessor`

### Removed

We no longer do any of these because we do not reuse uploads by sha1 if secure media is enabled.

* We no longer have a way to prevent cross-posting of a secure upload from a private context to a public context.
* We no longer have to set `secure: false` for uploads when uploading for a theme component.
2020-01-16 13:50:27 +10:00
Vinoth Kannan
3b7f5db5ba
FIX: parallel spec system needs a dedicated upload folder for each worker. (#8547) 2019-12-18 11:21:57 +05:30
Vinoth Kannan
d3e7768ea8 Revert "FIX: parallel spec system needs needs a dedicated upload folder for each worker. (#8372)"
This reverts commit 42e5176bc3.
2019-11-19 15:02:18 +05:30
Vinoth Kannan
42e5176bc3
FIX: parallel spec system needs needs a dedicated upload folder for each worker. (#8372) 2019-11-19 13:16:20 +05:30
Penar Musaraj
102909edb3 FEATURE: Add support for secure media (#7888)
This PR introduces a new secure media setting. When enabled, it prevent unathorized access to media uploads (files of type image, video and audio). When the `login_required` setting is enabled, then all media uploads will be protected from unauthorized (anonymous) access. When `login_required`is disabled, only media in private messages will be protected from unauthorized access. 

A few notes: 

- the `prevent_anons_from_downloading_files` setting no longer applies to audio and video uploads
- the `secure_media` setting can only be enabled if S3 uploads are already enabled and configured
- upload records have a new column, `secure`, which is a boolean `true/false` of the upload's secure status
- when creating a public post with an upload that has already been uploaded and is marked as secure, the post creator will raise an error
- when enabling or disabling the setting on a site with existing uploads, the rake task `uploads:ensure_correct_acl` should be used to update all uploads' secure status and their ACL on S3
2019-11-18 11:25:42 +10:00
David Taylor
1998be3b27
DEV: Raise errors when cleaning the download cache, and fix for macOS (#8319)
POSIX's `head` specification states: "The application shall ensure that the number option-argument is a positive decimal integer"

Negative values are supported on GNU `head`, so this works in the discourse docker image. However, in some environments (e.g. macOS), the system `head` version fails with a negative `n` parameter.

This commit does two things:

Checks the status at each stage of the pipe, so it cannot fail silently
Flip the `ls` command to list in descending time order, and use `tail -n +501` instead of `head -n -500`.

The visible result is that macOS users no longer see head: illegal line count -- -500 printed throughout the test suite.
2019-11-08 15:34:03 +00:00
Vinoth Kannan
b7830680b6 DEV: use cdn url to download the external uploads to local. 2019-06-06 19:17:19 +05:30
David Taylor
ef660d5a3e FIX: Return consistent character encodings when downloading S3 uploads
Net::HTTP always returns ASCII-8BIT encoding. File.read auto-detects the encoding. This leads to an encoding inconsistency between a fresh download, and a cached download. This commit ensures all downloaded files are treated equally, by always returning the cached version from the filesystem, even during initial download.

One symptom of this problem is during theme exports: https://meta.discourse.org/t/116907

Related ruby ticket: https://bugs.ruby-lang.org/issues/2567
2019-05-17 11:27:00 +01:00
Sam Saffron
30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
Guo Xiang Tan
b0c8fdd7da FIX: Properly support defaults for upload site settings. 2019-03-13 16:36:57 +08:00
Vinoth Kannan
f94c0283b2
FIX: Use correct version when generating file path for optimized image (#6871) 2019-01-11 18:35:38 +05:30
Régis Hanol
5381096bfd PERF: new 'migrate_to_s3' rake task 2018-12-26 17:34:49 +01:00
Rishabh
05a4f3fb51 FEATURE: Multisite support for S3 image stores (#6689)
* FEATURE: Multisite support for S3 image stores

* Use File.join to concatenate all paths & fix linting on multisite/s3_store_spec.rb
2018-11-29 12:11:48 +08:00
Vinoth Kannan
fd272eee44 FEATURE: Make uploads:missing task compatible with s3 uploads 2018-11-27 00:54:51 +05:30
Régis Hanol
448e2fe1a2 FIX: properly delete files in the download cache 2018-07-04 18:18:39 +02:00
Régis Hanol
5f4f617689 FIX: cache_file storage cleanup logic was wrong
https://meta.discourse.org/t/68296
2018-01-18 17:00:04 +01:00
Guo Xiang Tan
8cc8010564 Maintain backwards compatibility before Jobs::MigrateUploadExtensions runs. 2017-08-03 11:56:55 +09:00
Neil Lalonde
83011045c8 fix rubocop offenses 2017-07-31 11:59:16 -04:00
Jakub Macina
677267ae78 Add onceoff job for uploads migration of column extension. Simplify filetype search and related rspec tests. 2017-07-12 17:19:27 +02:00
Jakub Macina
8c445e9f17 Fix backend code for searching by a filetype as a combination of uploads and topic links. Add rspec test for extracting file extension in upload. 2017-07-06 19:19:31 +02:00
Robin Ward
cdbe027c1c Refactor FileHelper to use keyword arguments. 2017-05-24 13:54:26 -04:00
Guo Xiang Tan
3378ee223f FIX: Incorrect path being passed to S3Store#remove_file. 2016-08-15 11:35:30 +08:00
Guo Xiang Tan
1779a9634a FIX: Make sure we raise an error when method is not implemented. 2016-08-12 11:43:57 +08:00
Régis Hanol
5169bcdb6e FIX: httpshttps ultra secure URLs 2016-06-30 16:55:01 +02:00
Robin Ward
f155ff8270 FIX: Tests would fail if your test db's optimized image ids were high 2015-10-16 17:08:41 -04:00
Régis Hanol
81a699e2b0 better support for mixed content 2015-06-01 17:49:58 +02:00
Régis Hanol
56f077db69 FIX: optimized images fail if source is remote and S3 is disabled 2015-06-01 11:13:56 +02:00
Régis Hanol
5a143c0c6e storage engines refactor 2015-05-29 18:39:47 +02:00
Régis Hanol
8e7bfd0f29 FIX: automatically growing uploads tree 2015-05-28 01:03:24 +02:00
Sam
a988cd5abe FIX: redirect to CDN avatar for s3 avatars 2015-05-27 12:02:57 +10:00
Régis Hanol
31e9cafe0e FEATURE: use original filename when clicking the download link in the lightbox 2014-10-15 19:20:04 +02:00
Régis Hanol
652cc3efba FEATURE: new rake task to clean up uploads & thumbnails 2014-09-29 18:31:53 +02:00
Régis Hanol
542d54e6bf BUGFIX: uploads to S3 2014-04-15 13:04:14 +02:00
Régis Hanol
e732aa8a86 BUGFIX: we should not store absolute urls for locally uploaded avatar templates
Highly recommended to run: `RAILS_ENV=production bundle exec rake avatars:regenerate` to fix the avatar templates stored in the database.
2014-01-07 17:45:06 +01:00
Régis Hanol
52160179f8 add a tombstone for extra safety 2013-11-27 22:05:11 +01:00
Régis Hanol
37fd7ab574 pull hotlinked images 2013-11-05 19:07:29 +01:00