This commit fixes the follow quality issue with `PostSearchData#raw_data`:
1. URLs are being tokenized and links with similar href and characters
are being duplicated in the raw data.
`Post#cooked`:
```
<p><a href=\"https://meta.discourse.org/some.png\" class=\"onebox\" target=\"_blank\" rel=\"nofollow noopener\">https://meta.discourse.org/some.png</a></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png discourse org/some png https://meta.discourse.org/some.png discourse org/some png
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized https://meta.discourse.org/some.png meta discourse org
```
2. Ligthbox being included in search pollutes the
`PostSearchData#raw_data` unncessarily.
From 28 March 2018 to 28 March 2019, searches for the term `image` on
`meta.discourse.org` had a click through rate of 2.1%. Non-lightboxed images are not included in indexing for search yet we were indexing content within a lightbox. Also, search for terms like `image` was affected we were using `Pasted image` as the filename for
uploads that were pasted.
`Post#cooked`
```
<p>Let me see how I can fix this image<br>\n<div class=\"lightbox-wrapper\"><a class=\"lightbox\" href=\"https://meta.discourse.org/some.png\" title=\"some.png\" rel=\"nofollow noopener\"><img src=\"https://meta.discourse.org/some.png\" width=\"275\" height=\"299\"><div class=\"meta\">\n<svg class=\"fa d-icon d-icon-far-image svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#far-image\"></use></svg><span class=\"filename\">some.png</span><span class=\"informations\">1750×2000</span><svg class=\"fa d-icon d-icon-discourse-expand svg-icon\" aria-hidden=\"true\"><use xlink:href=\"#discourse-expand\"></use></svg>\n</div></a></div></p>
```
`PostSearchData#raw_data` Before:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image some.png png https://meta.discourse.org/some.png discourse org/some png some.png png 1750×2000
```
`PostSearchData#raw_data` After:
```
This is a test topic 0 Uncategorized Let me see how I can fix this image
```
In terms of indexing performance, we now have to parse the given HTML
through nokogiri twice. However performance is not a huge worry here since a string length of 194170 takes only 30ms
to scrub plus the indexing takes place in a background job.
When the S3 store was enabled, we were only applying the S3 CDN.
So all images stored locally, like the emojis, were never put on the local CDN.
Fixed a bunch of CookedPostProcessor test by adding a call to 'optimize_urls'
in order to get final URLs.
I also removed the unnecessary PrettyText.add_s3_cdn method since this is already
handled in the CookedPostProcessor.
This generates a 10x10 PNG thumbnail for each lightboxed image.
If Image Lazy Loading is enabled (IntersectionObserver API) then
we'll load the low res version when offscreen. As the image scrolls
in we'll swap it for the high res version.
We use a WeakMap to track the old image attributes. It's much less
memory than storing them as `data-*` attributes and swapping them
back and forth all the time.
Previously, the site setting was only effective on the client side of
things. Once the site setting was been reached, all oneboxes are not
rendered. This commit changes it such that the site setting is respected
both on the client and server side. The first N oneboxes are rendered and
once the limit has been reached, subsequent oneboxes will not be
rendered.
* Add missing icons to set
* Revert FA5 revert
This reverts commit 42572ff
* use new SVG syntax in locales
* Noscript page changes (remove login button, center "powered by" footer text)
* Cast wider net for SVG icons in settings
- include any _icon setting for SVG registry (offers better support for plugin settings)
- let themes store multiple pipe-delimited icons in a setting
- also replaces broken onebox image icon with SVG reference in cooked post processor
* interpolate icons in locales
* Fix composer whisper icon alignment
* Add support for stacked icons
* SECURITY: enforce hostname to match discourse hostname
This ensures that the hostname rails uses for various helpers always matches
the Discourse hostname
* load SVG sprite with pre-initializers
* FIX: enable caching on SVG sprites
* PERF: use JSONP for SVG sprites so they are served from CDN
This avoids needing to deal with CORS for loading of the SVG
Note, added the svg- prefix to the filename so we can quickly tell in
dev tools what the file is
* Add missing SVG sprite JSONP script to CSP
* Upgrade to FA 5.5.0
* Add support for all FA4.7 icons
- adds complete frontend and backend for renamed FA4.7 icons
- improves performance of SvgSprite.bundle and SvgSprite.all_icons
* Fix group avatar flair preview
- adds an endpoint at /svg-sprites/search/:keyword
- adds frontend ajax call that pulls icon in avatar flair preview even when it is not in subset
* Remove FA 4.7 font files
When creating lightboxes we will attempt to create 1.5x and 2x thumbnails
for retina screens, this can be controlled with a new hidden site setting
called responsice_post_image_sizes, if you wish to create 3x images run
SiteSetting.responsive_post_image_sizes = "1|1.5|2|3"
The default should be good for most of the setups as it balances filesize
with quality. 3x thumbs can get big.
Previously we used width and height for thumbnails, new code ensures
1. We auto correct width and height
2. We added extra columns for thumbnail_width and height, this is determined
by actual upload and no longer passed in as a side effect
3. Optimized Image now stores filesize which can be used for analysis, decisions
Also
- fixes Android image manifest as a side effect
- fixes issue where a thumbnail generated that is smaller than the upload is no longer used
Introduce new patterns for direct sql that are safe and fast.
MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and very a convenient API
- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder
See more at: https://github.com/discourse/mini_sql
This refactors handling of s3 so it can be specified via GlobalSetting
This means that in a multisite environment you can configure s3 uploads
without actual sites knowing credentials in s3
It is a critical setting for situations where assets are mirrored to s3.