To add an extra layer of security, we sanitize settings before shipping them to the client. We don't sanitize those that have the "html" type.
The CookedPostProcessor already uses Loofah for sanitization, so I chose to also use it for this. I added it to our gemfile since we installed it as a transitive dependency.
Version 2.8 brings some changes to how address fields are handled and
this commits updates that and should also include a fix which handles
encoded attachment filenames.
The fork contains a bugfix to correctly decode mail attachments.
* DEV: Add schema checking to api doc testing
This commit improves upon rswag which lacks schema checking. rswag
really only checks that the https status matches, but this change adds
in the json-schema_builder gem which also has schema validation.
Now we can define schemas for each of our requests/responses in the
`spec/requests/api/schemas` directory which will make our documentation
specs a lot cleaner.
If we update a serializer by either adding or removing an attribute the
tests will now fail (this is a good thing!). Also if you change the type
of an attribute say from an array to a string the tests will now fail.
This will help significantly with keeping the docs in sync with actual
code changes! Now if you change how an endpoint will respond you will
have to update the docs too in order for the tests to pass. :D
This PR is inspired by:
https://www.tealhq.com/post/how-teal-keeps-their-api-tests-and-documentation-in-sync
* Swap out json schema validator gem
Swapped out the outdated json-schema_builder gem with the json_schemer
gem.
* Add validation fields to schema
In order to have "strict" validation we need to add
`additionalProperties: false` to the schema, and we need to specify
which attributes are required.
Updated the debugging test output to print out the error details if
there are any.
We are switching over to a fork because we are currently on a pinned
version of ember-rails 0.18.5 which is pretty old. Upgrading to the
latest version causes many things to break which isn't really worth the
time to debug while we plan to completely switch over to ember-cli
somewhat soonish. Our fork contains a single cherry-pick commit
https://github.com/emberjs/ember-rails/pull/534
which will fix an issue when running the `rails g migration` command and
it spits out a bunch of deprecation warnings.
We are no longer directly referencing the rb-inotify gem directly in
code. This was just a spec level dependency anyways.
Using `git log -S "Inotify"` resulted in these two commits as usages of
`Inotify`:
- b56b11d96a
- 9cf03b352c
both from 2013, but we no longer are using inotify in
https://github.com/discourse/discourse/blob/master/lib/tasks/autospec.rake
which appears to be the only file that was using it.
Based on this info we can safely remove rb-inotify from the Gemfile.
Just as a side note we still do have a couple of gems that do have
rb-inotify as a dependency: listen, and lru_redux.
* DEV: Switch our fast_xor gem for xorcist
We use the `xor` function as part of password hashing and we want to use
a faster version than the native ruby xor'ing feature so we use a gem
for this.
fast_xor has been abandoned, and xorcist fixed our initial holdup for
switching in https://github.com/fny/xorcist/issues/4
xorcist also has jruby support so we can remove our jruby fallback
logic.
* Move using statement inside of class
The rotp gem is currently pinned to version 5.1.0 and this will bump it
up to version 6.0.1.
Follow up to: 85d4370f79
because this issue we were waiting on is now closed:
https://github.com/mdp/rotp/issues/98
Because version 6 is now encoding the params I needed to update the
tests as well.
Currently we have pinned highline to version 1.7.0. This is the gem that
we use to have an interactive command line for tasks like `rake
admin:create`.
Upgrading to the latest version 2.0.3 will remove ruby 2.7 deprecation
warnings.
I'm not sure why *this* gem was pinned. I manually executed a couple of
our rake tasks that use this and everything seems fine.
Not ready for an upgrade due to: https://github.com/mdp/rotp/issues/98
The policy here is that for cases like this we pin the version and add
a comment explaining why it is pinned.
We can revisit in a few months depending on upstream.
This is very minor, see: https://github.com/advisories/GHSA-j6w9-fv6q-3q52
An attacker can elevate own cookie usage to bypass server cookie restrictions
Technically this is a security commit, but the surface area is extremely
low, we do not expect any real world impact.
This includes a fix for CVE-2020-8185 we are not vulnerable as we do not use
the impacted middleware. However it still makes sense to stay upgraded, other
small fixes exist in this release.
It adds a somewhat unnecessary middleware before `ActionDispatch::DebugExceptions` and totally bypasses it. Apps that register exception interceptors with `ActionDispatch::DebugExceptions` would therefore stop working if better_errors is used.
We removed pry-nav a while back because it is not up to date with pry but it is super useful. Luckily pry-byebug is here to save us all from Satan's power.
To get this to work you need to add the following to your $HOME/.pryrc file.
```
if defined?(PryByebug)
Pry.commands.alias_command 'c', 'continue'
Pry.commands.alias_command 's', 'step'
Pry.commands.alias_command 'n', 'next'
Pry.commands.alias_command 'f', 'finish'
end
Pry::Commands.command /^$/, "repeat last command" do
pry_instance.run_command Pry.history.to_a.last
end
```
The require-ing of pry, pry-rails, and pry-byebug in specs is controlled by the IMPROVED_SPEC_DEBUGGING flag (disabled by default).
This reverts commit 20780a1eee.
* SECURITY: re-adds accidentally reverted commit:
03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
instead of the 03d26cd6 parent (which contains security fixes)
Upgrades Rails to latest, this version has better compatibility
with Ruby 2.7
During the upgrade we needed a new cleaner mechanism for configuring
message bus.
All tests are green.
If anything weird pops up please revert.
This reverts commit e23f1a9071.
Reverting as this currently breaks our plugin linting job in GithHub Action and Jenkins. Will re-revert after all the plugins get the latest rubocop config and/or a (potential) rubocop issue is fixed.
rspec-rails 4.0 was released so we no longer need to depend on a
beta version. Also updates minor on a bunch of rspec gems.
Thanks to @ryanwi for raising this.
TLDR; this commit vastly improves how whitespaces are handled when converting from HTML to Markdown.
It also adds support for converting HTML <tables> to markdown tables.
The previous 'remove_whitespaces!' method was traversing the whole HTML tree and used a heuristic to remove
leading and trailing whitespaces whenever it was appropriate (ie. mostly before and after HTML block elements)
It was a good idea, but it was very limited and leaded to bad conversion when the html had leading whitespaces on several lines for example.
One such example can be found [here](https://meta.discourse.org/t/86782).
For various reasons, most of the whitespaces in a HTML file is ignored when the page is being displayed in a browser.
The rules that the browsers follow are the [CSS' White Space Processing Rules](https://www.w3.org/TR/css-text-3/#white-space-rules).
They can be quite complicated when you take into account RTL languages and other various tidbits but they boils down to the following:
- Collapse whitespaces down to one space (0x20) inside an inline context (ie. nodes/tags that are being displaying on the same line)
- Remove any leading/trailing whitespaces inside an inline context
One quick & dirty way of getting this 90% solved would be to do 'HTML.gsub!(/[[:space:]]+/, " ")'.
We would also need to hoist <pre> elements in order to not mess with their whitespaces.
Unfortunately, this solution let some whitespaces creep around HTML tags which leads to more '.strip!' calls than I can bear.
I decided to "emulate" the browser's handling of whitespaces and came up with a solution in 4 parts
1. remove_not_allowed!
The HtmlToMarkdown library is recursively "visiting" all the nodes in the HTML in order to convert them to Markdown.
All the nodes that aren't handled by the library (eg. <script>, <style> or any non-textual HTML tags) are "swallowed".
In order to reduce the number of nodes visited, the method 'remove_not_allowed!' will automatically delete all the nodes
that have no "visitor" (eg. a 'visit_<tag>' method) defined.
2. remove_hidden!
Similar purpose as the previous method (eg. reducing number of nodes visited), there's no point trying to convert something that is hidden.
The 'remove_hidden!' method removes any nodes that was hidden using the "hidden" HTML attribute, some CSS or with a width or height equal to 0.
3. hoist_line_breaks!
The 'hoist_line_breaks!' method is there to handle <br> tags. I know those tiny <br> don't do much but they can be quite annoying.
The <br> tags are inline elements but they visually work like a block element (ie. they create a new line).
If you have the following HTML "<i>Foo<br>Bar</i>", it ends up visually similar to "<i>Foo</i><br><i>Bar</i>".
The latter being much more easy to process than the former, so that's what this method is doing.
The "hoist_line_breaks" will hoist <br> tags out of inline tags until their parent is a block element.
4. remove_whitespaces!
The "remove_whitespaces!" is where all the whitespace removal is happening. It's broken down into 4 methods as well
- remove_whitespaces!
- is_inline?
- collapse_spaces!
- remove_trailing_space!
The 'remove_whitespace!' method is recursively walking the HTML tree (skipping <pre> tags).
If a node has any children, they will be chunked into groups of inline elements vs block elements.
For each chunks of inline elements, it will call the "collapse_space!" and "remove_trailing_space!" methods.
For each chunks of block elements, it will call "remote_whitespace!" to keep walking the HTML tree recursively.
The "is_inline?" method determines whether a node is part of a inline context.
A node is inline iif it's a text node or it's an inline tag, but not <br>, and all its children are also inline.
The "collapse_spaces!" method will collapse any kind of (white) space into a single space (" ") character, even accros tags.
For example, if we have " Foo \n<i> Bar </i>\t42", it will return "Foo <i>Bar </i>42".
Finally, the "remove_trailing_space!" method is there to remove any trailing space that might creep in at the end of the inline chunk.
This solution is not 100% bullet-proof.
It does not support RTL languages at all and has some caveats that I felt were not worth the work to get properly fixed.
FIX: better detection of hidden elements when converting HTML to Markdown
FIX: take into account the 'allowed_href_schemes' site setting when converting HTML <a> to Markdown
FIX: added support for 'mailto:' scheme when converting <a> from HTML to Markdown
FIX: added support for <img> dimensions when converting from HTML to Markdown
FIX: added support for <dl>, <dd> and <dt> when converting from HTML to Markdown
FIX: added support for multilines emphases, strongs and strikes when converting from HTML to Markdown
FIX: added support for <acronym> when converting from HTML to Markdown
DEV: remove unused 'sanitize' gem
Wow, did you just read all that?! Congratz, here's a cookie: 🍪.
pry-nav is not yet supported on latest pry, this holds off on
upgrading pry, which in turn holds off on upgrading deps
Stripping pry-nav for now till it works with latest pry
Latest version of Rails contains compatibility fixes for Ruby 2.7 and some
minor security fixes we would like to have
It also broke some of the multisite tests.
Rails tries to use the same connection for reading from a replica as writing
to the leader during tests, because, with everything happening in a
transaction, changes to the DB wouldn't otherwise be reflected in the
replica connection.
The difference now is that Rails tries to do this for connections opened
after the test has started which affected rails multisite connections.
The upshot of this is that, as things stand, you are likely to
experience problems if you try to connect to a different multisite DB in
a test when the `current_db` is not 'default'.
This adds rubocop-rspec, and enables some cops that were either already passing or are passing now, after fixing them in this commit.
Some new cops are disabled for now, with annotation: "TODO" or "To be decided". Those either need to be discussed first, or require manual changes, or the number of found and fixed offenses is too large to bundle them up in a single PR.
Includes:
* DEV: Update rubocop's `TargetRubyVersion` to 2.6
* DEV: Enable RSpec/VoidExpect
* DEV: Enable RSpec/SharedContext
* DEV: Enable RSpec/EmptyExampleGroup (Removed an obsolete empty spec file)
* DEV: Enable RSpec/ItBehavesLike
* DEV: Remove RSpec/ScatteredLet (It's too strict, as it doesn't recognize fab! as a let-like)
* DEV: Remove RSpec/MultipleExpectations
json is shipped out of sync with Ruby. Even though we use OJ for many things
we still use the json gem sometimes, this ensures we use the latest
b8b29e79ad/config/initializers/100-oj.rb (L9-L9)
* Remove some `.es6` from comments where it does not matter
* Use a post processor for transpilation
This will allow us to eventually use the directory structure to
transpile rather than the extension.
* FIX: Some errors and clean up in confirm-new-email
It would throw an error if the webauthn element wasn't present.
Also I changed things so that no-module is not explicitly
referenced.
* Remove `no-module`
Instead we allow a magic comment: `// discourse-skip-module` to prevent
the asset pipeline from creating a module.
* DEV: Enable babel transpilation based on directory
If it's in `app/assets/javascripts/dicourse` it will be transpiled
even without the `.es6` extension.
* REFACTOR: Remove Tilt/ES6ModuleTranspiler
byebug, ruby-prof, better_errors and rbtrace are very MRI specific, flag
them as such
This helps move forward on potential jruby and truffleruby experiments
This is not used in core or official plugins, and has been printing a deprecation notice since v2.3.0beta4. All OpenID 2.0 code and dependencies have been dropped. The user_open_ids table remains for now, in case anyone has missed the deprecation notice, and needs to migrate their data.
Context at https://meta.discourse.org/t/-/113249
* DEV: Use Ember 3.12.2
* Add Ember version to ThemeField's DEPENDENT_CONSTANTS
* DEV: Use `id` instead of `elementId` (See: https://github.com/emberjs/ember.js/issues/18147)
* FIX: Don't leak event listeners (bug introduced in 999e2ff)
We can not upgrade rack cause it breaks Sidekiq web.
I can not find a trivial fix short of disabling sessions in Sidekiq which
is a security concern.
We need to figure out how to reuse sessions with our Rails application in
Sidekiq.
This gets extra complex cause we use a special cookie store for sessions.
9e399b42b9/lib/discourse_cookie_store.rb (L3-L21)
The ROTP gem is only used in a very small amount of places in the app, we don't need to globally require it.
Also set the Addressable gem to not have a specific version range, as it has not been a problem yet.
Some slight refactoring of UserSecondFactor here too to use SecondFactorManager to avoid code repetition
The following methods have long been deprecated in ruby due to flaws in their implementation per http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/29293?29179-31097:
URI.escape
URI.unescape
URI.encode
URI.unencode
escape/encode are just aliases for one another. This PR uses the Addressable gem to replace these methods with its own encode, unencode, and encode_component methods where appropriate.
I have put all references to Addressable::URI here into the UrlHelper to keep them corralled in one place to make changes to this implementation easier.
Addressable is now also an explicit gem dependency.
Previously it was unclear why certain gems are being held back cause Gemfile
had no comment explaining it.
I tried to add some explanation from memory and remove some exceptions that
seemed to be superfluous.
This upgrades shoulda to latest, it appears to work once a couple of assertions
are removed
Also update http accept language used to auto detect language from http header
this is tested
Zeitwerk small update seems fine
This reverts commit ab74a50d85.
We really want to upgrade redis, but discovered some edge cases
around failover we need to test.
Holding off on the upgrade till a bit more testing happens
Previously some local micro-benchmarks revealed it was not giving any perf
benefits.
Now that we upgraded to 2.6.5 we are seeing some segfaults.
No need to carry this dependency around anymore.
We can re-evaluate in future if it improves perf and fix the segfaults.
Bump onebox version, and add new styling
Commit, PR and Issue oneboxes are updated with a new design. Timestamps are now localized using local-dates (if installed).
Bump onebox version to include new github rendering, and add relevant CSS
Avatars are reduced in size significantly, and icons are added to easily differentiate PRs and commits. The 'Issue:' prefix is removed from issue oneboxes, to make them consistent with commits and PRs.
We preload to ensure as much memory as possible is reused from unicorn master
to various workers using copy-on-write (sidekiq, unicorn)
This migrates the preloading code into the Discourse module for easier
reuse and adds 3 notable preloading changes
1. We attempt to localize a string on each site, ensuring we warmup
the i18n
2. We preload all our templates (compiling .erb to class)
3. We warm-up our search tokenizer which uses cppjieba which is a large
memory consumer, this will only cause a warmup on CJK sites or sites with
the special site setting enabled.
Adds 2 factor authentication method via second factor security keys over [web authn](https://developer.mozilla.org/en-US/docs/Web/API/Web_Authentication_API).
Allows a user to authenticate a second factor on login, login-via-email, admin-login, and change password routes. Adds registration area within existing user second factor preferences to register multiple security keys. Supports both external (yubikey) and built-in (macOS/android fingerprint readers).
* Adjustments to pass specs on Rails 6.0.0
* Use classic autoloader instead of Zeitwerk
* Update Rails 6.0.0 deprecated methods
* Rails 6.0.0 not allowing column with integer name
* Drop freedom_patches/rails6.rb
* Default value for trigger_transactional_callbacks? is true
* Bump rspec-rails version to 4.0.0.beta2
This commit introduces 2 features:
1. DISCOURSE_COMPRESS_ANON_CACHE (true|false, default false): this allows
you to optionally compress the anon cache body entries in Redis, can be
useful for high load sites with Redis that lives on a separate server to
to webs
2. DISCOURSE_ANON_CACHE_STORE_THRESHOLD (default 2), only pop entries into
redis if we observe them more than N times. This avoids situations where
a crawler can walk a big pile of topics and store them all in Redis never
to be used. Our default anon cache time for topics is only 60 seconds. Anon
cache is in place to avoid the "slashdot" effect where a single topic is
hit by 100s of people in one minute.
This feature adds the ability to customize the HTML part of all emails using a custom HTML template and optionally some CSS to style it. The CSS will be parsed and converted into inline styles because CSS is poorly supported by email clients. When writing the custom HTML and CSS, be aware of what email clients support. Keep customizations very simple.
Customizations can be added and edited in Admin > Customize > Email Style.
Since the summary email is already heavily styled, there is a setting to disable custom styles for summary emails called "apply custom styles to digest" found in Admin > Settings > Email.
As part of this work, RTL locales are now rendered correctly for all emails.
* Revert "Revert "FEATURE: admin/user exports are compressed using the zip format (#7784)""
This reverts commit f89bd55576.
* Replace .tar.zip with .zip
* FEATURE: admin/user exports are compressed using the zip format
* Update translations. Theme exporter now exports .zip file. Theme importer supports .zip and .gz files
* Fix controller test, updated locale and skip saving the csv export to disk