For Topic Embeds, we would prefer <article> to be the main article in a topic, rather than a table cell <td> with potentially a lot of data. However, in an example URL like here, the table cell (the very large code snippet) is seen as the Topic Embed's article due to the determined content weight by the Readability library we use.
In the newly released 0.7.1 cantino/ruby-readability#94, the library has a new option to exclude the library's default <td> element into content weighting. This is more in line with the original library where they only weighted <p>. So this PR excludes the td, as seen in the tests, to allow the actual article to be seen as the article. This PR also adds the details tag into the allow-list.
* DEV: Add `topic_embed_import_create_args` plugin modifier
This modifier allows a plugin to change the arguments used when creating
a new topic for an imported article.
For example: let's say you want to prepend "Imported: " to the title of
every imported topic. You could use this modifier like so:
```ruby
# In your plugin's code
plugin.register_modifier(:topic_embed_import_create_args) do |args|
args[:title] = "Imported: #{args[:title]}"
args
end
```
In this example, the modifier is prepending "Imported: " to the `title` in the `create_args` hash. This modified title would then be used when the new topic is created.
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_level_to_tag_topics site setting to tag_topic_allowed_groups.
We're changing the implementation of trust levels to use groups. Part of this is to have site settings that reference trust levels use groups instead. It converts the min_trust_level_to_tag_topics site setting to tag_topic_allowed_groups.
* FEATURE: Cache embed contents in the database
This will be useful for features that rely on the semantic content of topics, like the many AI features
Co-authored-by: Roman Rizzi <rizziromanalejandro@gmail.com>
The most common thing that we do with fab! is:
fab!(:thing) { Fabricate(:thing) }
This commit adds a shorthand for this which is just simply:
fab!(:thing)
i.e. If you omit the block, then, by default, you'll get a `Fabricate`d object using the fabricator of the same name.
Many blog posts use these to illustrate and images were previously omitted
Additionally strip superfluous HTML and BODY tags from embed HTML.
This was incorrectly returned from server.
The content from the remote site and the footer get cached for 10 minutes, so Discourse should use the default locale instead of the user locale for the footer. Otherwise Discourse might cache the message in a different language.
It's very easy to forget to add `require 'rails_helper'` at the top of every core/plugin spec file, and omissions can cause some very confusing/sporadic errors.
By setting this flag in `.rspec`, we can remove the need for `require 'rails_helper'` entirely.
This allows text editors to use correct syntax coloring for the heredoc sections.
Heredoc tag names we use:
languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
FinalDestination now supports the `follow_canonical` option, which will perform an initial GET request, parse the canonical link if present, and perform a HEAD request to it.
We use this mode during embeds to avoid treating URLs with different query parameters as different topics.
DEV: Allow passing cook_method to TopicEmbed.import to override default
This will be used in the rss-polling plugin when we want to have
oneboxes on feed content, like youtube for example.
If the associated page of a remote url passed to `TopicEmber.new(remote_url)` contained a malformed link like: `<a href="(http://foo.bar)">Baz</a>` it would raise an uncaught exception:
```
Job exception: Invalid scheme format: (http
```
* DEV: More robust processing of URLs
The previous `UrlHelper.encode_component(CGI.unescapeHTML(UrlHelper.unencode(uri))` method would naively process URLs, which could result in a badly formed response.
`Addressable::URI.normalized_encode(uri)` appears to deal with these edge-cases in a more robust way.
* DEV: onebox should use UrlHelper
* DEV: fix spec
* DEV: Escape output when rendering local links
Previously this site setting `embed unlisted` defaulted to false and
empty topics would be generated for embed, but those topics tend to take
up a lot of room on the topic lists.
This new default creates invisible topics by default until they receive
their first reply.
This reverts commit 20780a1eee.
* SECURITY: re-adds accidentally reverted commit:
03d26cd6: ensure embed_url contains valid http(s) uri
* when the merge commit e62a85cf was reverted, git chose the 2660c2e2 parent to land on
instead of the 03d26cd6 parent (which contains security fixes)
When `SiteSetting.embed_truncate` is enabled (by default), the truncated
string is mutatable and does not raise an error.
However, when the setting is disabled, the `contents` string is frozen
and immutable, and will raise a `FrozenError`.
* Introduced fab!, a helper that creates database state for a group
It's almost identical to let_it_be, except:
1. It creates a new object for each test by default,
2. You can disable it using PREFABRICATION=0
This change both speeds up specs (less strings to allocate) and helps catch
cases where methods in Discourse are mutating inputs.
Overall we will be migrating everything to use #frozen_string_literal: true
it will take a while, but this is the first and safest move in this direction