With this change the script:
* Actually removes original large-sized images
* Doesn't save processed files if their size has increased
* Prevents inconsistent state
We pick the first topic with 30 responses as our bench topic.
Previously we simply picked the last topic, but hand no guarantee on ordering.
This also attempts to correct previous runs of the bench.
The following methods have long been deprecated in ruby due to flaws in their implementation per http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/29293?29179-31097:
URI.escape
URI.unescape
URI.encode
URI.unencode
escape/encode are just aliases for one another. This PR uses the Addressable gem to replace these methods with its own encode, unencode, and encode_component methods where appropriate.
I have put all references to Addressable::URI here into the UrlHelper to keep them corralled in one place to make changes to this implementation easier.
Addressable is now also an explicit gem dependency.
We like to stay as close as possible to latest with rubocop cause the cops
get better.
This update required some code changes, specifically the default is to avoid
explicit returns where implicit is done
Also this renames a few rules
This is a bottom up rewrite of Discourse cache to support faster performance
and a limited surface area.
ActiveSupport::Cache::Store accepts many options we do not use, this partial
implementation only picks the bits out that we do use and want to support.
Additionally params are named which avoids typos such as "expires_at" vs "expires_in"
This also moves a few spots in Discourse to use Discourse.cache over setex
Performance of setex and Discourse.cache.write is similar.
Discourse.cache is a more consistent method to use and offers clean fallback
if you are skipping redis
This is part of a larger change that both optimizes Discoruse.cache and omits
use of setex on $redis in favor of consistently using discourse cache
Bench does reveal that use of Rails.cache and Discourse.cache is 1.25x slower
than redis.setex / get so a re-implementation will follow prior to porting
`FileUtils.cd` and `Dir.chdir` cause the working directory to change for the entire process. We run sidekiq jobs, hijacked requests and deferred jobs in threads, which can make working directory changes have unintended side-effects.
- Add a rubocop rule to warn about usage of Dir.chdir and FileUtils.cd
- Added rubocop:disable for scripts used outside the app
- Refactored code using cd to use alternative methods
- Temporarily skipped the rubocop check for lib/backup_restore. This will require more complex refactoring, so I will create a separate PR for review
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit cause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:
pluck_first, equivalent to limit(1).pluck(*columns).first
and pluck_first! which, like other finder methods, raises an exception
when no record is found
The script will now correct all width/height and thumbnail_width/thumbnail_height properties of all the uploaded images.
The script now uses width * height to filter out all unaffected images.
Also handled the case where a downsized image was already an uploaded record.
These scripts are somewhat rough but I needed them to help debug a memory
leak we have noticed in rails 6.
The biggest object script finds all the biggest objects we have in memory
after boot.
The test memory leak runs a very simple iteration through all multisites
and observed memory.
I introduced DemonBase because I had got some conflict between `demon/base.rb` and `jobs/base.rb`, however, to not rename base class, it is possible to use regex on absolute path in Zeitwerk custom inflector.
Zeitwerk simplifies working with dependencies in dev and makes it easier reloading class chains.
We no longer need to use Rails "require_dependency" anywhere and instead can just use standard
Ruby patterns to require files.
This is a far reaching change and we expect some followups here.
Trying to automate the login into a Google account is quite hard. This makes the crawler use the content of a cookies.txt file instead. It also removes a couple of deprecation warnings and adds some color to the output.
This is a temporary workaround for the issue in https://github.com/rails/rails/pull/36949
Discussing a proper fix in Rails with the Rails team.
Prior to this fix we were spinning up a thread every time we closed a connection
to the db.
* REFACTOR: Rename SiteSetting.disable_edit_notifications to disable_system_edit_notifications
- The older name could cause some confusion because the setting does not disable all edit notifications, only system ones.
* FIX: Add frozen_string_literal: true in the migration
* DEV: Deprecate 'disable_edit_notifications'
This also corrects FileHelper.download so it supports "follow_redirect"
correctly (it used to always follow 1 redirect) and adds a `validate_url`
param that will bypass all uri validation if set to false (default is true)
* FEATURE: Add attachment support to XenForo importer
If `ATTACHMENT_DIR` is provided, importer will scan each imported post
for `[GALLERY]` and `[ATTACH]` tags, attempt to import the referenced files
as Discourse uploads and replace the tags with Discourse markup.
References to files which cannot be imported are stripped.
NOTE: This only imports attachments which are referenced in imported
posts. Any XenForo media or files which are not referenced in any post
using `[ATTACH]` or `[GALLERY]` tags will not be imported. The goal is to
ensure that we don't have posts with missing images and unsightly
markup, NOT to ensure that all attachments are migrated.
* FEATURE: Add attachment support to XenForo importer
If `ATTACHMENT_DIR` is provided, importer will scan each imported post
for `[GALLERY]` and `[ATTACH]` tags, attempt to import the referenced files
as Discourse uploads and replace the tags with Discourse markup.
References to files which cannot be imported are stripped.
NOTE: This only imports attachments which are referenced in imported
posts. Any XenForo media or files which are not referenced in any post
using `[ATTACH]` or `[GALLERY]` tags will not be imported. The goal is to
ensure that we don't have posts with missing images and unsightly
markup, NOT to ensure that all attachments are migrated.
* FEATURE: Add attachment support to XenForo importer
If `ATTACHMENT_DIR` is provided, importer will scan each imported post
for `[GALLERY]` and `[ATTACH]` tags, attempt to import the referenced files
as Discourse uploads and replace the tags with Discourse markup.
References to files which cannot be imported are stripped.
NOTE: This only imports attachments which are referenced in imported
posts. Any XenForo media or files which are not referenced in any post
using `[ATTACH]` or `[GALLERY]` tags will not be imported. The goal is to
ensure that we don't have posts with missing images and unsightly
markup, NOT to ensure that all attachments are migrated.
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
This removes all uses of both `send` and `public_send` from consumers of
SiteSetting and instead introduces a `get` helper for dynamic lookup
This leads to much cleaner and safer code long term as we are always explicit
to test that a site setting is really there before sending an arbitrary
string to the class
It also removes a couple of risky stubs from the auth provider test
`Upload#url` is more likely and can change from time to time. When it
does changes, we don't want to have to look through multiple tables to
ensure that the URLs are all up to date. Instead, we simply associate
uploads properly to `UserProfile` so that it does not have to replicate
the URLs in the table.
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.
Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
This script can be used to flip Ruby to a patched Ruby version
or a different major version from inside the container
It is used to test and compare different Ruby versions
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replace by
* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
The library used to generate random text changed, this caused the title
of the topic used for testing to change, which meant the slug changed, so
a hit to the topic was a redirect
This fix gives the topic used for performance testing a static name to avoid
this issue in future
It is not a setting, and only relevant in specs. The new API is:
```
Jobs.run_later! # jobs will be thrown on the queue
Jobs.run_immediately! # jobs will run right away, avoid the queue
```
Improves the generic database used by some import scripts:
* Adds additional columns for users
* Adds support for attachments
* Allows setting the data type for keys (numeric or string) to ensure correct sorting
* Log errors when mapping of posts, messages, etc. fails
* Allow permalink normalizations for old subfolder installation
* Disable importing of polls for now. It's broken.
Treating TIFF and BMP as images cause us to add them to IMG tags, this is very inconsistent across browsers.
You can still upload these files they will simply not be displayed in IMG tags.
Changes to functionality
- Removed syncing of user metadata including gender, location etc.
These are no longer available to standard Facebook applications.
- Removed the remote 'revoke' functionality. No other providers have
it, and it does not appear to be standard practice in other apps.
- The 'facebook_no_email' event is no longer logged. The system can
cope fine with a missing email address.
Data is migrated to the new user_associated_accounts table.
facebook_user_infos can be dropped once we are confident the data has
been migrated successfully.
This is useful for cases where you want to add resiliency to DNS lookups
for redis and postgres, so they will continue to work even if there is
a DNS outage
This restores the previous functionality. The script now allows the following options:
* `discourse backup` (uses the system generated filename)
* `discourse backup <some_filename>` (uses the provided filename)
* `discourse backup </some/path/to/filename>` (moves the backup to the provided path with the given filename)
Remote backup stores do not support the last option.
Some file extensions (like `.tar.gz`) are automatically removed from the provided filename.
- By default, behaviour is not changed: tags are made lowercase upon creation and edit.
- If force_lowercase_tags is disabled, then mixed case tags are allowed.
- Tags must remain case-insensitively unique. This is enforced by ActiveRecord and Postgres.
- A migration is added to provide a `UNIQUE` index on `lower(name)`. Migration includes a safety to correct any current tags that do not meet the criteria.
- A `where_name` scope is added to `models/tag.rb`, to allow easy case-insensitive lookups. This is used instead of `Tag.where(name: "blah")`.
- URLs remain lowercase. Mixed case URLs are functional, but have the lowercase equivalent as the canonical.
* IPB import script replace PHP code tags with proper markdown
remove excess newlines in code blocks
decode HTML entities in code blocks
add replacement for list items
proper handling of attachments that are not images
fix typo
improved quote handling
fix code style complaint from travis-ci build
* emails weren't sorted in correct order
* better default regex for splitting mbox files
* output Message-ID if email is skipped because it doesn't have a Date
* Try multiple filenames and do lots of guessing when searching for attachments
* Unescape HTML in filenames and replace invalid characters in filenames
* Existing permalinks prevented resuming of import
* Prevent duplicate attachments in same post
Introduce new patterns for direct sql that are safe and fast.
MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and very a convenient API
- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder
See more at: https://github.com/discourse/mini_sql
* supports resources created with Transifex's YML handler version 3
* uses translations-manager gem
* makes sure that the locales supported by translations-manager are not out of sync
* update the lang_map in tx client config before pulling translations
Ignores the site setting "find_related_post_with_key" and always tries to honor the `In-Reply-To` and `References` header for emails sent to a group.
The senders email address must be included in the `To` or `CC` header of a previous email sent to the group and the `Message-ID` of that email must be included in the current email's `In-Reply-To` or `References` header.
* Works with the current column layout exported as Excel file
* Tries to fix invalid CSV when it wasn't exported from Excel
* Imports categories
* Imports topics into the correct category
* Allows skipping archived topics
* Allows skipping private topics
* Makes use of the latest features from the base importer
* Some minor fixes and documentation updates
* Make sure the category description is imported correctly
(the about topic usually had the wrong excerpt).
* Allow import scripts to mark topics as closed or archived.
* Allow import scripts to store the topic's original id.
It will be stored in topic_custom_fields as import_topic_id.
In the past we used suppress_from_homepage, it had mixed semantics
it would remove from category list if category list was on home and
unconditionally remove from latest.
New setting explicitly only removes from latest list but leaves the
category list alond
- Remove thin which is no longer supported
- Bypass admin api rate limiting in profile environment
- Admin password was too short
- Run by default in concurrency 1 mode
- A skip bundle assets flag to speed up local testing
* store time it took to index message in DB (to find performance issues)
* ignore listserv specific files
* better examples for split_regex
* first email in mbox shouldn't contain the split string
* always lock the DB in exclusive mode
* save email within transaction
* messages can be grouped by subject and use original order (for Listserv)
* adds option to index emails without running the import
* ignores dot-files and empty emails
* new setting to prefer HTML over plaintext emails during import
* restore original site settings at the end of import
* elided content of HTML mails was not put inside details block