Commit Graph

1146 Commits

Author SHA1 Message Date
David Taylor
2f40d9b07b
DEV: Correct ember-5 lockfile generation (#24983)
The regen_ember_5_lockfile script was actually just duplicating the ember3 lockfile without changes 🤦‍♂️. This commit fixes that, and updates the ember-version-enforcement workflow to detect lockfile issues in future.
2023-12-20 11:45:01 +00:00
Jarek Radosz
6d7dd658a4
DEV: Update rubocop-discourse to 3.6.0 (#24945) 2023-12-18 13:44:36 +01:00
Alan Guo Xiang Tan
fc8075c169
DEV: Fix flaky tests report artifacts not using the right job_id (#24939)
Why this change?

`github.job` returns the `job_id` per the docs but it doesn't actually
return the id of the job but instead returns the job's name strangely.

Per https://github.com/orgs/community/discussions/8945, there is no way
to get the `job_id` from the existing contexts in the actions run.
Therefore, we have to hit Github's API to fetch it. Not ideal but no
way around this.
2023-12-18 15:59:41 +08:00
Kelv
2477bcc32e
DEV: lint against Layout/EmptyLineBetweenDefs (#24914) 2023-12-15 23:46:04 +08:00
Leonardo Mosquera
5417c4fac0
FIX: discourse remap: fix output to avoid UX issue (#24905)
Before this commit, this output is possible:

```
Rewriting all occurrences of STRING1 to STRING2

THIS TASK WILL REWRITE DATA, ARE YOU SURE (type YES)
WILL RUN ON ALL 1 DBS
```

Which, when run from a script, might lead one to believe that YES was
automatically inserted into STDIN and the script is continuing.

Turns out this isn't the case so the obvious expectation is broken.

This commit swaps the order of those last lines to make it clear that
the script is blocked on input.
2023-12-14 16:30:14 -03:00
Jarek Radosz
607c530252
DEV: Remove ruby 1.9/2.0 benchmark (#24862) 2023-12-13 07:22:15 +08:00
Gerhard Schlager
dc8c6b8958 DEV: Lots of improvements to the generic_bulk import script
Notable changes:
* Imports a lot more tables from core and plugins
  * site settings
  * uploads with necessary upload references
  * groups and group members
  * user profiles
  * user options
  * user fields & values
  * muted users
  * user notes (plugin)
  * user followers (plugin)
  * user avatars
  * tag groups and tags
  * tag users (notification settings for tags / user)
  * category permissions
  * polls with options and votes
  * post votes (plugin)
  * solutions (plugin)
  * gamification scores (plugin)
  * events (plugin)
  * badges and badge groupings
  * user badges
  * optimized images
  * topic users (notification settings for topics)
  * post custom fields
  * permalinks and permalink normalizations

* It creates the `migration_mappings` table which is used to store the mapping for a handful of imported tables

* Detects duplicate group names and renames them

* Pre-cooking for attachments, images and mentions

* Outputs instructions when gems are missing

* Supports importing uploads from a DB generated by `uploads_importer.rb`

* Checks that all required plugins exists and enables them if needed

* A couple of optimizations and additions in `import.rake`
2023-12-11 16:23:07 +01:00
Gerhard Schlager
d725b3ca9e DEV: Add script for preprocessing uploads as part of a migration
This script preprocesses all uploads within a intermediate DB (output of converters) and uploads those files to S3. It does the same for optimized images. This speeds up migrations when you have to run them multiple times, because you only have to preprocess and upload the files once.

This script is very hacky and mostly undocumented for now. That will change in the future.
2023-12-11 16:23:07 +01:00
Jarek Radosz
694b5f108b
DEV: Fix various rubocop lints (#24749)
These (21 + 3 from previous PRs) are soon to be enabled in rubocop-discourse:

Capybara/VisibilityMatcher
Lint/DeprecatedOpenSSLConstant
Lint/DisjunctiveAssignmentInConstructor
Lint/EmptyConditionalBody
Lint/EmptyEnsure
Lint/LiteralInInterpolation
Lint/NonLocalExitFromIterator
Lint/ParenthesesAsGroupedExpression
Lint/RedundantCopDisableDirective
Lint/RedundantRequireStatement
Lint/RedundantSafeNavigation
Lint/RedundantStringCoercion
Lint/RedundantWithIndex
Lint/RedundantWithObject
Lint/SafeNavigationChain
Lint/SafeNavigationConsistency
Lint/SelfAssignment
Lint/UnreachableCode
Lint/UselessMethodDefinition
Lint/Void

Previous PRs:
Lint/ShadowedArgument
Lint/DuplicateMethods
Lint/BooleanSymbol
RSpec/SpecFilePathSuffix
2023-12-06 23:25:00 +01:00
Keegan George
d2b53ccac2
DEV: Port discourse-table-builder theme component to core (#24441) 2023-11-30 10:54:29 -08:00
David Taylor
16b6e86932 DEV: Introduce feature-flag for Ember 5 upgrade
This commit introduces the scaffolding for us to easily switch between Ember 3.28 and Ember 5 on the `main` branch of Discourse. Unfortunately, there is no built-in system to apply this kind of flagging within yarn / ember-cli. There are projects like `ember-try` which are designed for running against multiple version of a dependency, but they do not allow us to 'lock' dependency/sub-dependency versions, and are therefore unsuitable for our use in production.

Instead, we will be maintaining two root `package.json` files, and two `yarn.lock` files. For ember-3, they remain as-is. For ember5, we use a yarn 'resolution' to override the version for ember-source across the entire yarn workspace.

To allow for easy switching with minimal diff against the repository, `package.json` and `yarn.lock` are symlinks which point to `package-ember3.json` and `yarn-ember3.lock` by default. To switch to Ember 5, we can run `script/switch ember version 5` to update the symlinks to point to `package-ember5.json` and `package-ember3.json` respectively. In production, and when using `bin/ember-cli` for development, the ember version can also be upgraded using the `EMBER_VERSION=5` environment variable.

When making changes to dependencies, these should be made against the default `ember3` versions, and then `script/regen_ember_5_lockfile` should be used to regenerate `yarn-ember5.lock` accordingly. A new 'Ember Version Lockfiles' GitHub workflow will automate this process on Dependabot PRs.

When running a local environment against Ember 5, the two symlink changes will show up as git diffs. To avoid us accidentally committing/pushing that change, another GitHub workflow is introduced which checks the default Ember version and raises an error if it is greater than v3.

Supporting two ember versions simultaneously obviously carries significant overhead, so our aim will be to get themes/plugins updated as quickly as possible, and then drop this flag.
2023-11-27 16:40:22 +00:00
Jarek Radosz
24532653e6
FIX: A typo bug in an import script (#24553) 2023-11-25 18:10:42 +01:00
Michael Fitz-Payne
2389186155 DEV(cache_critical_dns): sort resolved SRV targets by priority
The priority field in an SRV RR indicates a preferential order at which
the underlying targets should be utilised. We need to prefer healthy
services in order of priority, where 0 is highest.

Prior to this commit, we relied on whatever order the
dnsclient.getresources method returned. As it turns out, this assumption
is incorrect. The order returned is likely whatever order the system
resolver received DNS responses in, which may not be ordered according
to the spec.

This introduces a ResolvedAddress type which holds the priority value
for SRV targets, or a stand-in priority of zero for A/AAAA RRs. This
type is used as a return value from the underlying name resolution
routines in Name and SRVName.

In this manner, all ordering by priority and resolved time can be
performed directly within the ResolverCache class and calling code can
continue to be none-the-wiser.

Before sorting, we still ensure that we only consider targets with a
priority within the given threshold as previously implemented.

See t/115911.
2023-11-22 08:26:00 +10:00
David Taylor
9449a0e0ed
DEV: Silence successful database migration output in github actions (#24416)
The output of db:migrate for a new database is 20k+ lines. We only need the output when an error occurs.
2023-11-16 15:55:41 +00:00
Constanza
28f27b2490
DEV: Adding polls, solutions, upload references and other improvements to the Discourse merger script (#23689) 2023-11-16 14:32:53 +01:00
David Taylor
93c67eeb4f
DEV: Consolidate and update jsconfig, and add types packages (#23824)
These updates significantly improve IDE tooling for imports across the Discourse core codebase, and also for framework packages. The `@types/ember-*` packages are a temporary solution until we get onto Ember 5, which ships its types in the main package.

The previous approach of having jsconfig files in each package directory did work, but once you start adding all the possible interlinks between them, we hit the file count limit of VSCode's tooling (because it counts every file for every jsconfig its referenced in). Having one file at the root means that a single file can apply to all core packages and plugins.

Long-term, to get the same functionality for all themes/plugins, we may need to look at building/publishing a Discourse types package which can be added to theme/plugin package.json files for development purposes.
2023-10-18 12:13:20 +01:00
David Taylor
8a5d97ef3f
DEV: Update importers from PostUpload to UploadReference (#23681)
Discourse stopped using PostUpload in 9db8f00b3d. Since then, these importers have been writing to the table, but any data was totally unused. This commit updates the easy cases to use UploadReference, and adds an error to the discourse_merger import script, which needs more significant work.
2023-09-27 15:01:04 +01:00
Godfrey Chan
e1373c3e84
DEV: introduce Embroider behind a flag, and start testing in CI (#23005)
Discourse core now builds and runs with Embroider! This commit adds
the Embroider-based build pipeline (`USE_EMBROIDER=1`) and start
testing it on CI.

The new pipeline uses Embroider's compat mode + webpack bundler to
build discourse code, and leave everything else (admin, wizard,
markdown-it, plugins, etc) exactly the same using the existing
Broccoli-based build as external bundles (<script> tags), passed
to the build as `extraPublicTress` (which just means they get
placed in the `/public` folder).

At runtime, these "external" bundles are glued back together with
`loader.js`. Specifically, the external bundles are compiled as
AMD modules (just as they were before) and registered with the
global `loader.js` instance. They expect their `import`s (outside
of whatever is included in the bundle) to be already available in
the `loader.js` runtime registry.

In the classic build, _every_ module gets compiled into AMD and
gets added to the `loader.js` runtime registry. In Embroider,
the goal is to do this as little as possible, to give the bundler
more flexibility to optimize modules, or omit them entirely if it
is confident that the module is unused (i.e. tree-shaking).

Even in the most compatible mode, there are cases where Embroider
is confident enough to omit modules in the runtime `loader.js`
registry (notably, "auto-imported" non-addon NPM packages). So we
have to be mindful of that an manage those dependencies ourselves,
as seen in #22703.

In the longer term, we will look into using modern features (such
as `import()`) to express these inter-dependencies.

This will only be behind a flag for a short period of time while we
perform some final testing. Within the next few weeks, we intend
to enable by default and remove the flag.

---------

Co-authored-by: David Taylor <david@taylorhq.com>
2023-09-07 13:15:43 +01:00
Alan Guo Xiang Tan
f165c99d77
DEV: Fix typo in docker_test.rb script (#23456)
Follow up to 9caba30d5c
2023-09-07 15:36:27 +08:00
Alan Guo Xiang Tan
9caba30d5c
DEV: Add docker:test:setup Rake task (#23430)
## What is the context here?

The `docker.rake` Rakefile contains Rake tasks that are meant to be run
in the `discourse/discourse_test:release` Docker image. For example, we
have the `docker:test` Rake task that makes it easier to run the test
suite for a particular Discourse commit.

Why are we introducing a `docker:test:setup` Rake task?

While we have the `docker:test` Rake task, it is very limited in the
test commands that can be executed. It is very useful for automated
testing but not very useful for running tests in the development
environment. Therefore, we are introducing a `docker:test:setup` rake
task that can be used to set up the test environment for running tests.

The envisioned example usage is something like this:

```
docker run -d --name=discourse_test --entrypoint=/sbin/boot discourse/discourse_test:release
docker exec -u discourse:discourse discourse_test ruby script/docker_test.rb --no-tests
docker exec -u discourse:discourse discourse_test bundle exec rake docker:test:setup
docker exec -u discourse:discourse discourse_test bundle exec rspec <path to file>
```
2023-09-07 13:46:23 +08:00
Martin Brennan
cf42466dea
DEV: Add S3 upload system specs using minio (#22975)
This commit adds some system specs to test uploads with
direct to S3 single and multipart uploads via uppy. This
is done with minio as a local S3 replacement. We are doing
this to catch regressions when uppy dependencies need to
be upgraded or we change uppy upload code, since before
this there was no way to know outside manual testing whether
these changes would cause regressions.

Minio's server lifecycle and the installed binaries are managed
by the https://github.com/discourse/minio_runner gem, though the
binaries are already installed on the discourse_test image we run
GitHub CI from.

These tests will only run in CI unless you specifically use the
CI=1 or RUN_S3_SYSTEM_SPECS=1 env vars.

For a history of experimentation here see https://github.com/discourse/discourse/pull/22381

Related PRs:

* https://github.com/discourse/minio_runner/pull/1
* https://github.com/discourse/minio_runner/pull/2
* https://github.com/discourse/minio_runner/pull/3
2023-08-23 11:18:33 +10:00
Jarek Radosz
94649565ce
DEV: Correct Style/RedundantReturn rubocop issues (#23052) 2023-08-10 02:03:38 +02:00
Gerhard Schlager
0b29dc5d38 DEV: Add experimental generic bulk import script 2023-08-09 20:56:14 +02:00
Penar Musaraj
26a8e7da28
DEV: Remove redundant case in import script (#22882) 2023-07-31 14:52:06 -04:00
Selase Krakani
c90e399003
FIX: user_id arg override in Slack import (#22713)
Fixes `user_id` arg override during mention parsing.
2023-07-20 11:22:43 +00:00
Gerhard Schlager
e09ce99884
DEV: Slack import script (#22386)
It's very simple import script and currently imports only the following content:
* Users
* Messages as Discourse topics/posts
* Attachments

Each channel can be mapped to a category and tags. It uses regular expressions to convert formatted messages ("rich text") into Markdown used by Discourse. In the future we could convert the `blocks` attribute from each message into Markdown instead of applying regular expressions on the `text` attribute.
2023-07-04 21:37:45 +02:00
Leonardo Mosquera
c83914e2e5
FIX: fix normalize_raw method for nil inputs in migration scripts (#22304)
Various migration scripts define a normalize_raw method to do custom processing of post contents before storing it in the Post.raw and other fields.

They normally do not handle nil inputs, but it's a relatively common occurrence in data dumps.

Since this method is used from various points in the migration script, as it stands, the experience of using a migration script is that it will fail multiple times at different points, forcing you to fix the data or apply logic hacks every time then restarting.

This PR generalizes handling of nil input by returning a <missing> string.

Pros:

    no more messy repeated crashes + restarts
    consistency

Cons:

    it might hide data issues
        OTOH we can't print a warning on that method because it will flood the console since it's called from inside loops.

* FIX: zendesk import script: support nil inputs in normalize_raw
* FIX: return '<missing>' instead of empty string; do it for all methods
2023-06-29 13:22:47 -03:00
Constanza
911539c1b2
FIX: Removing arbitrary limit in a Discuz importer script query (#21686) 2023-05-23 17:07:09 -04:00
Jarek Radosz
00630e4c74
DEV: Remove RUBY_GLOBAL_METHOD_CACHE_SIZE (#21249)
It doesn't do anything since ruby 3.0.0.preview1. It was removed in https://github.com/ruby/ruby/pull/2888
2023-04-26 10:39:39 +02:00
Krzysztof Kotlarek
1e08afa8d4
FIX: require date db_timestamps_mover script (#21248)
Before `pg` gem version 1.4.6 was loading `date` as dependency.

Looks like version 1.5.1 is not doing that anymore. Update was done here: d32709a74f

Therefore, we have to load `date` explicitly.
2023-04-26 17:59:20 +10:00
Gerhard Schlager
b24c35d887
DEV: phpBB3 importer should get quoted username from actual post (#20979)
The actual code in the import script didn't match the expected behavior as defined by the specs. This fixes it and is a follow-up to ad32fa56
2023-04-05 15:43:20 +02:00
ftc2
ad32fa5690
FIX: phpBB3 importer created invalid quote for posts (#20646)
for proper quotes (those including a valid reference to a source post), the
importer failed to yield a username, and the imported quote was broken.
2023-04-05 13:50:01 +02:00
Jarek Radosz
29e2e3ff3b
DEV: Fix random typos (#20937) 2023-04-03 19:27:32 +02:00
David Taylor
7288bc277b
DEV: Update docker_test to checkout specific branch by default (#20684)
Previously, FETCH_HEAD would always point to tests-passed because our base docker image was configured to only fetch the tests-passed branch. Since https://github.com/discourse/discourse_docker/commit/53bbacc882, we switched to a partial clone which means that `git fetch; git checkout FETCH_HEAD` will checkout whichever remote branch is the first alphabetically. This commit makes the checkout more specific to avoid this issue.
2023-03-15 10:54:47 +00:00
Michael Fitz-Payne
5624dbaa9b FIX(cache_critical_dns): use DB port sourced from environment
Fixes an assumption that the PostgreSQL port will always be reachable at
the discovered host on the default port when performing the healthcheck.

Instead we should be sourcing this from the same environment variable
that the application will be using to connect to.

Defaults to the standard PG port, 5432.
2023-03-10 10:09:07 +10:00
Michael Fitz-Payne
f38779adf4 DEV(cache_critical_dns): improve error reporting for failures
There are two failure modes that can be expected - no target SRV DNS RRs
found or no healthy service available at target addresses. Prior to this
patch, there was no way to differentiate from log messages between the
two cases.

Introduce an EmptyCache exception which may be raised by either the
ResolverCache or HealthyCache. The exception message contains enough
information about where the exception occurred to troubleshoot issues.

An existing bug was fixed in this commit. Previously if a target address
changed during runtime, an old cached (healthy) address would be
returned.. The behaviour has been corrected to return the newly cached
address.
2023-03-09 14:30:44 +10:00
Bianca Nenciu
ccb345bd88
FEATURE: Update topic/comment embedding parameters (#20181)
This commit implements many changes to topic and comments embedding. It
deprecates the class_name field from EmbeddableHost and suggests using
the className parameter. discourse_username parameter has been
deprecated and it will fetch it from embedded site from the author or
discourse-username meta.

See the updated code sample from Admin > Customize > Embedding page.

* FEATURE: Add className parameter for Discourse embed

* DEV: Hide class_name from EmbeddableHost

* DEV: Deprecate class_name field of EmbeddableHost

* FEATURE: Use either author or discourse-username meta tag

* DEV: Deprecate discourse_username parameter

* DEV: Improve embed code sample
2023-02-28 14:31:59 +02:00
Jarek Radosz
f63d18d6ca
DEV: Update syntax_tree to 6.0.1 (#20466) 2023-02-27 17:20:00 +01:00
Loïc Guitaut
f7c57fbc19 DEV: Enable unless cops
We discussed the use of `unless` internally and decided to enforce
available rules from rubocop to restrict its most problematic uses.
2023-02-21 10:30:48 +01:00
Gerhard Schlager
7ef482a292
REFACTOR: Fix pluralized strings in chat plugin (#20357)
* FIX: Use pluralized string

* REFACTOR: Fix misuse of pluralized string

* REFACTOR: Fix misuse of pluralized string

* DEV: Remove linting of `one` key in MessageFormat string, it doesn't work

* REFACTOR: Fix misuse of pluralized string

This also ensures that the URL works on subfolder and shows the site setting link only for admins instead of staff. The string is quite complicated, so the best option was to switch to MessageFormat.

* REFACTOR: Fix misuse of pluralized string

* FIX: Use pluralized string

This also ensures that the URL works on subfolder and shows the site setting link only for admins instead of staff.

* REFACTOR: Correctly pluralize reaction tooltips in chat

This also ensures that maximum 5 usernames are shown and fixes the number of "others" which was off by 1 if the current user reacted on a message.

* REFACTOR: Use translatable string as comma separator

* DEV: Add comment to translation to clarify the meaning of `%{identifier}`

* REFACTOR: Use translatable comma separator and use explicit interpolation keys

* REFACTOR: Don't interpolate lowercase channel status

* REFACTOR: Fix misuse of pluralized string

* REFACTOR: Don't interpolate channel status

* REFACTOR: Use %{count} interpolation key

* REFACTOR: Fix misuse of pluralized string

* REFACTOR: Correctly pluralize DM chat channel titles
2023-02-20 10:31:02 +01:00
Ted Johansson
25a226279a
DEV: Replace #pluck_first freedom patch with AR #pick in core (#19893)
The #pluck_first freedom patch, first introduced by @danielwaterworth has served us well, and is used widely throughout both core and plugins. It seems to have been a common enough use case that Rails 6 introduced it's own method #pick with the exact same implementation. This allows us to retire the freedom patch and switch over to the built-in ActiveRecord method.

There is no replacement for #pluck_first!, but a quick search shows we are using this in a very limited capacity, and in some cases incorrectly (by assuming a nil return rather than an exception), which can quite easily be replaced with #pick plus some extra handling.
2023-02-13 12:39:45 +08:00
GeckoLinux
d1e844841d
Fix occasional bug in order of imported comments (#20204)
This bug is actually a Drupal issue where some edited posts have their `created` and `changed` timestamps set to the same value. But even when that happens in Drupal it still maintains the correct post order in an affected thread. This PR makes the Discourse importer also maintain the original Drupal comment order by sorting comments in the source DB by their `cid`, which is sequential and never changes. More details from this post onward:
https://meta.discourse.org/t/large-drupal-forum-migration-importer-errors-and-limitations/246939/24?u=rahim123
2023-02-08 22:20:46 -05:00
Gerhard Schlager
c3e978ada9
DEV: Add import script for Yammer (#20074)
Co-authored-by: Jay Pfaffman <jay@literatecomputing.com>
2023-01-31 10:12:01 +01:00
Michael Fitz-Payne
df4a9f96ae DEV(cache_critical_dns): add additional service runtime variable
We'd like to lean on the DNS caching service for more than the standard
DB and Redis hosts, but without having to add additional code each time.
Define a new environment variable
DISCOURSE_DNS_CACHE_ADDITIONAL_SERVICE_NAMES (admittedly a mouthful)
which is a list of service names to be added to the static list at
process execution time.

For example, plugin foo may reference two services that you want to
cache the address of. By specifying the following two variables in the
process environment, cache_critical_dns will perform the lookup
alongside the DB and Redis host variables.

```
DISCOURSE_DNS_CACHE_ADDITIONAL_SERVICE_NAMES='FOO_SERVICE1,FOO_SERVICE2'
FOO_SERVICE1='foo.service1.example.com'
FOO_SERVICE1_SRV='foo._tcp.example.com'
FOO_SERVICE2='foo.service2.example.com'
```

The behaviour when it comes to SRV record lookup is the same as
previously implemented for the `DISCOURSE_DB_..` and
`DISCOURSE_REDIS_..` variables.

For the purposes of the health checks, services defined in the list _are
always considered healthy_. This is a compromise for conveniences sake.
Defining a dynamic method for health checks at runtime is not practical.

See t/88457/32.
2023-01-20 10:03:08 +10:00
David Taylor
436b3b392b
DEV: Apply syntax_tree formatting to script/* 2023-01-09 11:13:22 +00:00
David Taylor
d5491b13f5
DEV: Fix syntax/formatting in xenforo import script (#19761)
Followup to 7dfe85fc
2023-01-05 12:47:05 +00:00
Alan Guo Xiang Tan
0da79561c3
DEV: Improve/Fix script/bench.rb (#19646)
1. Fix bug where we were not waiting for all unicorn workers to start up
before running benchmarks.

2. Fix a bug where headers were not used when benchmarking. Admin
benchmarks were basically running as anon user.

3. Disable rate limits when in profile env. We're pretty much going to
hit the rate limit every time as a normal user.

4. Benchmark against topic with a fixed posts count of 100. Previously profiling script was just randomly creating posts
and we would benchmark against a topic with a fixed posts count of 30.
Sometimes, the script fails because no topics with a posts count of 30
exists.

5. Benchmarks are not run against a normal user on top of anon and
admin.

6. Add script option to select tests that should be run.
2022-12-30 07:25:11 +08:00
Bianca Nenciu
c358151a6c
DEV: Promote historic post_deploy migrations (#19492)
This commit promotes all post_deploy migrations which existed in
Discourse v2.8.0 (timestamp <= 20220107014925).

This commit includes a fix to the promote_migrations script to promote
all migrations of the first version of the previous stable version. For
example, if the current stable version is v2.8.13, the version used as
a cutoff for promoting migrations is v2.8.0.
2022-12-16 13:36:30 +02:00
GeckoLinux
cc5b4cd49a
FIX: change drupal permalink creation to use /node/
Drupal URL scheme for nodes begins with `/node/` , not `/topic/` .
2022-12-02 16:03:00 +11:00
Alan Guo Xiang Tan
7c321d3aad
PERF: Update Group#user_count counter cache outside DB transaction (#19256)
While load testing our user creation code path in production, we
identified that executing the DB statement to update the `Group#user_count` column within a
transaction is creating a bottleneck for us. This is because the
creation of a user and addition of the user to the relevant groups are
done in a transaction. When we execute the DB statement to update
`Group#user_count` for the relevant group, a row level lock is held
until the transaction completes. This row level lock acts like a global
lock when the server is creating users that will be added to the same
group in quick succession.

Instead of updating the counter cache within a transaction which the
default ActiveRecord `counter_cache` option does, we simply update the
counter cache outside of the committing transaction.

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2022-11-30 11:52:08 -03:00
Leonardo Mosquera
bfecbde837
Fixes for vBulletin bulk importer (#17618)
* Allow taking table prefix from env var

* FIX: remove unused column references

The columns `filedata` and `extension` are not present in a v4.2.4
database, and they aren't used in the method anyways.

* FIX: report progress for tables without imported_id

* FIX: effectively check for AR validation errors

NOTE: other migration scripts also have this problem; see /t/58202

* FIX: properly count Posts when importing attachments

* FIX: improve logging

* Remove leftover comment

* FIX: show progress when exporting Permalink file

* PERF: stream Permalink file

The current way results in tons of memory usage; write once per line instead

* Document fixes needed

* WIP - deduplicate category names

* Ignore non alphanumeric chars for grouping

* FIX: properly deduplicate user emails by merging accounts

* FIX: don't merge empty UserEmails

* Improve logging

* Merge users AFTER fixing primary key sequences

* Parallelize user merging

* Save duplicated users structure for debugging purposes

* Add progress logging for the (multiple hour) user merging step
2022-11-28 16:30:19 -03:00
communiteq
7dfe85fcc7
DEV: Xenforo importer improvements (#18457)
* Fix: make expressions non-greedy
* Feature: import Xenforo avatars
* Feature: import Xenforo likes
* Feature: import Xenforo private messages
* Feature: Xenforo create permalinks
* Feature: Xenforo migrate view counts
* Fix: Xenforo list regexes
* Fix: Xenforo import all attachments
2022-11-28 16:42:39 +01:00
Pierre Ozoux
9e9235ca62
FEATURE: Add import script for Elgg (#19140) 2022-11-28 16:28:08 +01:00
David Taylor
84bec1cbae
DEV: Cleanup legacy asset compilation gems and code (#19177)
We now use Ember CLI (core/plugins) and DiscourseJSProcessor (themes) for all Ember and template compilation. This commit removes the remnants of the legacy Sprockets-based Ember compilation system.

Sprockets, and its DiscourseJSProcess-based Babel transformations, is still in use for a few assets. Ideally that will be removed/replaced in the near future.
2022-11-24 12:13:59 +00:00
Jarek Radosz
bc22fe4fdf
DEV: Convert the downsizing script to a rake task (#18976)
…to make it testable!
2022-11-11 13:00:44 +01:00
Michael Fitz-Payne
5fdbbe3045 DEV(cache_critical_dns): add caching for MessageBus Redis hostname
We are already caching any DB_HOST and REDIS_HOST (and their
accompanying replicas), we should also cache the resolved addresses for
the MessageBus specific Redis. This is a noop if no MB redis is defined
in config. A side effect is that the MB will also support SRV lookup and
priorities, following the same convention as the other cached services.

The port argument was added to redis_healthcheck so that the script
supports a setup where Redis is running on a non-default port.

Did some minor refactoring to improve readability when filtering out the
CRITICAL_HOST_ENV_VARS. The `select` block was a bit confusing, so the
sequence was made easier to follow.

We were coercing an environment variable to an int in a few places, so
the `env_as_int` method was introduced to do that coercion in one place and
for convenience purposes default to a value if provided.

See /t/68301/30.
2022-10-12 10:11:22 +10:00
Constanza
067c4deb4c
Fix comment to include phpbb 3.3, which is now supported (#18006) 2022-08-19 16:42:32 -04:00
Constanza
ef842a4b29
FEATURE: Adding a simple CSV importer (#17993) 2022-08-19 13:09:30 -04:00
Constanza
8836c8bcdf
FIX: the phpbbb import script was not parsing youtube tags (#17787) 2022-08-05 15:20:32 -04:00
communiteq
603f36ca4a
DEV: Support phpBB 3.3 imports (#17641)
* handle polls with duplicate items
* handle polls with incorrect poll_option_total values
* handle group IDs in personal messages
* support for version 3.3
2022-07-25 22:07:03 +02:00
Jay Pfaffman
7ab5dcf82f
FEATURE: my_bb import supports avatars (#17617) 2022-07-25 15:22:25 +02:00
Constanza
b9ac8e5748
Adding 3.2 to the versions of phpbb supported by the migration script (#17483) 2022-07-14 18:06:47 +05:30
Michael Fitz-Payne
1867202a4d DEV(cache_critical_dns): add option to run once and exit
There are situations where a container running Discourse may want to
cache the critical DNS services without running the cache_critical_dns
service, for example running migrations prior to running a full bore
application container.

Add a `--once` argument for the cache_critical_dns script that will
only execute the main loop once, and return the status code for the
script to use when exiting. 0 indicates no errors occured during SRV
resolution, and 1 indicates a failure during the SRV lookup.

Nothing is reported to prometheus in run_once mode. Generally this
mode of operation would be a part of a unix pipeline, in which the exit
status is a more meaningful and immediate signal than a prometheus metric.

The reporting has been moved into it's own method that can be called
only when the script is running as a service.

See /t/69597.
2022-07-06 14:53:02 +10:00
Michael Fitz-Payne
aabbc9e63e DOC(cache_critical_dns): add program description
Describes the behaviour and configuration of the cache_critical_dns
script, mainly cribbed from commit messages. Tries to make this program
a bit less of an enigma.
2022-05-26 14:26:57 +10:00
Michael Fitz-Payne
0553788d3b DEV(cache_critical_dns): improve postgres_healthcheck
The `PG::Connection#ping` method is only reliable for checking if the
given host is accepting connections, and not if the authentication
details are valid.

This extends the healthcheck to confirm that the auth details are
able to both create a connection and execute queries against the
database.

We expect the empty query to return an empty result set, so we can
assert on that. If a failure occurs for any reason, the healthcheck will
return false.
2022-05-24 08:20:10 +10:00
Martin Brennan
fcc2e7ebbf
FEATURE: Promote polymorphic bookmarks to default and migrate (#16729)
This commit migrates all bookmarks to be polymorphic (using the
bookmarkable_id and bookmarkable_type) columns. It also deletes
all the old code guarded behind the use_polymorphic_bookmarks setting
and changes that setting to true for all sites and by default for
the sake of plugins.

No data is deleted in the migrations, the old post_id and for_topic
columns for bookmarks will be dropped later on.
2022-05-23 10:07:15 +10:00
Gabe Pacuilla
4284ba9c27
FIX(cache_critical_dns): use correct DISCOURSE_DB_USERNAME envvar (#16862) 2022-05-18 13:01:18 -04:00
Gabe Pacuilla
9f246e6969
FIX(cache_critical_dns): use discourse database name and user by default (#16856) 2022-05-17 16:09:32 -04:00
Michael Fitz-Payne
35d5c29e10 DEV(cache_critical_dns): add SRV priority tunables
An SRV RR contains a priority value for each of the SRV targets that
are present, ranging from 0 - 65535. When caching SRV records we may want to
filter out any targets above or below a particular threshold.

This change adds support for specifying a lower and/or upper bound on
target priorities for any SRV RRs. Any targets returned when resolving
the SRV RR whose priority does not fall between the lower and upper
thresholds are ignored.

For example: Let's say we are running two Redis servers, a primary and
cold server as a backup (but not a replica). Both servers would pass health
checks, but clearly the primary should be preferred over the backup
server. In this case, we could configure our SRV RR with the primary
target as priority 1 and backup target as priority 10. The
`DISCOURSE_REDIS_HOST_SRV_LE` could then be set to 1 and the target with
priority 10 would be ignored.

See /t/66045.
2022-05-12 08:08:56 +10:00
Loïc Guitaut
ab6ca78486 FIX: Use proper ActiveRecord method in import scripts
`ActiveRecord::Base.connection_config` has been deprecated since Rails
6.1 and was completely removed from Rails 7.
Instead we need to use
`ActiveRecord::Base.connection_db_config.configuration_hash`.

Import scripts were forgotten when we did the Rails 7 upgrade, this
patch fixes them.
2022-05-09 11:09:27 +02:00
Martin Brennan
222c8d9b6a
FEATURE: Polymorphic bookmarks pt. 3 (reminders, imports, exports, refactors) (#16591)
A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are:

* ExportUserArchive compatibility
* SyncTopicUserBookmarked job compatibility
* Sending different notifications for the bookmark reminders based on the bookmarkable type
* Import scripts compatibility
* BookmarkReminderNotificationHandler compatibility

This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well.

Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.
2022-05-09 09:37:23 +10:00
Leonardo Mosquera
3e5faffb0d
DEV: mbox importer improvements (#16557)
* FIX: support specifying parent_category_id in mbox import metadata
* FIX: elide tabs from topic titles
* FIX: optionally fix Mailman from: addresses
* DEV: optionally elide anything up to the last = in email addresses
* Fix Mailmain broken from: detection
2022-04-29 13:24:29 -03:00
Michael Fitz-Payne
1acc4751ff
FIX: remove refresh seconds override on cache_critical_dns (#16572)
This removes the option to override the sleep time between caching of
DNS records. The override was invalid because `''.to_i` is 0 in Ruby,
causing a tight loop calling the `run` method.
2022-04-27 12:42:35 +08:00
Michael Fitz-Payne
0784c28702 FIX: cache_critical_dns - add TLS support for Redis healthcheck
For Redis connections that operate over TLS, we need to ensure that we
are setting the correct arguments for the Redis client. We can utilise
the existing environment variable `DISCOURSE_REDIS_USE_SSL` to toggle
this behaviour.

No SSL verification is performed for two reasons:
- the Discourse application will perform a verification against any FQDN
  as specified for the Redis host
- the healthcheck is run against the _resolved_ IP address for the Redis
  hostname, and any SSL verification will always fail against a direct
  IP address

If no SSL arguments are provided, the IP address is never cached against
the hostname as no healthy address is ever found in the HealthyCache.
2022-04-27 12:27:58 +10:00
Michael Fitz-Payne
c4ea439cc3 DEV: refactor cache_critical_dns for SRV RR awareness
Modify the cache_critical_dns script for SRV RR awareness. The new
behaviour is only enabled when one or more of the following environment
variables are present (and only for a host where the `DISCOURSE_*_HOST_SRV`
variable is present):
- `DISCOURSE_DB_HOST_SRV`
- `DISCOURSE_DB_REPLICA_HOST_SRV`
- `DISCOURSE_REDIS_HOST_SRV`
- `DISCOURSE_REDIS_REPLICA_HOST_SRV`

Some minor changes in refactor to original script behaviour:
- add Name and SRVName classes for storing resolved addresses for a hostname
- pass DNS client into main run loop instead of creating inside the loop
- ensure all times are UTC
- add environment override for system hosts file path and time between DNS
  checks mainly for testing purposes

The environment variable for `BUNDLE_GEMFILE` is set to enables Ruby to
load gems that are installed and vendored via the project's Gemfile.
This script is usually not run from the project directory as it is
configured as a system service (see
71ba9fb7b5/templates/cache-dns.template.yml (L19))
and therefore cannot load gems like `pg` or `redis` from the default
load paths. Setting this environment variable configures bundler to look
in the correct project directory during it's setup phase.

When a `DISCOURSE_*_HOST_SRV` environment variable is present, the
decision for which target to cache is as follows:
- resolve the SRV targets for the provided hostname
- lookup the addresses for all of the resolved SRV targets via the
  A and AAAA RRs for the target's hostname
- perform a protocol-aware healthcheck (PostgreSQL or Redis pings)
- pick the newest target that passes the healthcheck

From there, the resolved address for the SRV target is cached against
the hostname as specified by the original form of the environment
variable.

For example: The hostname specified by the `DISCOURSE_DB_HOST` record
is `database.example.com`, and the `DISCOURSE_DB_HOST_SRV` record is
`database._postgresql._tcp.sd.example.com`. An SRV RR lookup will return
zero or more targets. Each of the targets will be queried for A and AAAA
RRs. For each of the addresses returned, the newest address that passes
a protocol-aware healthcheck will be cached. This address is cached so
that if any newer address for the SRV target appears we can perform a
health check and prefer the newer address if the check passes.

All resolved SRV targets are cached for a minimum of 30 minutes in memory
so that we can prefer newer hosts over older hosts when more than one target
is returned. Any host in the cache that hasn't been seen for more than 30
minutes is purged.

See /t/61485.
2022-04-27 10:14:33 +10:00
David Taylor
d81359246a
DEV: Be more lenient in CLI confirmation (#16290)
If someone types `yes` rather than `YES`, continue anyway.

The chance of typing `yes`, when you actually want to stop, is non-existent. The chance of typing `yes` when you meant `YES` is  high, and it's very frustrating when the script quite because you got the case wrong!
2022-03-25 20:14:41 +00:00
David Taylor
f3aab19829
DEV: Promote historic post_deploy migrations (#16288)
This commit promotes all post_deploy migrations which existed in Discourse v2.7.13 (timestamp <= 20210328233843)

This reduces the likelihood of issues relating to migration run order

Also fixes a couple of typos in `script/promote_migrations`
2022-03-25 15:48:20 +00:00
Jarek Radosz
2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
Jarek Radosz
6f6406ea03
DEV: Fix random typos (#16066) 2022-02-28 10:20:58 +08:00
David Taylor
5374e587a3
DEV: Add message-bus analysis script (#15979)
This will count how many messages are published per-channel and produce a table of channels ordered by 'most messages'
2022-02-18 20:21:17 +00:00
Michael Brown
3bf3b9a4a5 DEV: pull email address validation out to a new EmailAddressValidator
We validate the *format* of email addresses in many places with a match against
a regex, often with very slightly different syntax.

Adding a separate EmailAddressValidator simplifies the code in a few spots and
feels cleaner.

Deprecated the old location in case someone is using it in a plugin.

No functionality change is in this commit.

Note: the regex used at the moment does not support using address literals, e.g.:
* localpart@[192.168.0.1]
* localpart@[2001:db8::1]
2022-02-17 21:49:22 -05:00
Gerhard Schlager
6394d7cddf
DEV: Improve phpBB3 import script (#15956)
* Optional import of custom user fields from phpBB 3.1+
* Optional import of likes from phpBB3
  Requires the phpBB "Thanks for posts" extension
* Fix import of bookmarks from phpBB3
* Update `created_at` of existing user
* Support mapping of phpBB forums to existing Discourse categories
  This is in addition to the ability of merging phpBB forums and importing into newly created Discourse categories.
2022-02-16 13:04:31 +01:00
Gerhard Schlager
33d6ed60a4
DEV: Don't import year of birth (#15937)
The cakeday plugin doesn't use the year.
2022-02-14 18:10:35 +01:00
Gerhard Schlager
6a41ec179c
FIX: Default settings for phpBB3 import were broken (#15913) 2022-02-11 18:18:54 +01:00
David Taylor
9e43f0303d
DEV: Include DISCOURSE_REDIS_REPLICA_HOST in cache_critical_dns (#15877)
This is the replacement for DISCOURSE_REDIS_SLAVE_HOST
2022-02-09 14:41:26 +00:00
Canapin
ea2fd75d10
DEV: Fix some regexes in phpBB3 import script (#15829)
1. bbcode hashes don't always have exactly 8 characters.

2. colors aren't always hex values, it can be a color string ("red", "blue", etc).

3. The closing tag of smileys doesn't always include a `:` character (the start of the regex was already right for this particular issue)
2022-02-07 16:16:46 +01:00
David Taylor
ed2f700440
DEV: Wait for initdb to complete in docker.rake (#15614)
On slower hardware it can take a while to init the database. If we don't wait, the `rake db:create` step will fail.
2022-01-17 17:45:39 +00:00
Peter Zhu
c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
David Taylor
0e87f882a7
DEV: Use discourse image for postgres in GitHub Actions (#15291)
The discourse base image already contains a postgres installation, so pulling a separate postgres image is a little wasteful. Using the copy of Postgres in the discourse image saves about 20 seconds on every GitHub actions run.

This commit sets up Postgres with a few performance-improving flags, which we were already using for the `rake docker:test` task (used on our internal CI system).
2021-12-14 17:20:06 +00:00
Jarek Radosz
cfabdb72bc
FIX: Ambiguous column in downsize_uploads (#14972) 2021-11-16 16:23:32 +01:00
Leonardo Mosquera
48a08cc397
FIX: Vanilla importer fixes (#14699)
Import script was out of date
2021-10-27 14:22:37 +02:00
Dan Ungureanu
69f0f48dc0
DEV: Fix rubocop issues (#14715) 2021-10-27 11:39:28 +03:00
David Taylor
46d96c9feb
DEV: Apply rubocop to script/import_scripts/phorum.rb (#14727)
Followup to b24002018a
2021-10-26 19:16:52 +01:00
Jeremy Waters
b24002018a Update phorum.rb
Add attachment/file/upload handling to bring them in from phorum to discourse
2021-10-26 12:41:50 -04:00
Jarek Radosz
451cd4ec3f
DEV: Fix thor deprecation warning (#14680)
```
Deprecation warning: Thor exit with status 0 on errors. To keep this behavior, you must define `exit_on_failure?` in `DiscourseCLI`
```
2021-10-21 21:01:05 +02:00
Theodore Diamantidis
97178cd777
FIX: phpbb import - attachments not embedded in posts (#14570) 2021-10-11 14:27:54 +02:00
Constanza
a413a1e015
DEV: process image uploads in the Zendesk API import script (#14524) 2021-10-06 12:24:12 -04:00
Gerhard Schlager
a4d0d866aa
DEV: Bulk imports should find existing users by email (#14468)
Without this change, bulk imports unconditionally create new user records even when a user with the same email address exists.
2021-09-29 00:20:06 +02:00
Andrei Prigorshnev
1656b7ed01
DEV: Make db_timestamp_mover work with tables with unique constraints (#14027)
Some tables in the database have constraints on columns with dates. Because of them, the script for moving timestamps can fail from time to time. This PR makes the script work with such tables.

In general, in PostgreSQL it is not always possible to defer constraint checks to the transaction commit (Primary Keys and Unique Constraints can be deferred, but them should be declared as DEFERRABLE to make it possible. Indices created with CREATE UNIQUE INDEX can't be deferred at all).

Since we can't defer constraint checks, I've made it work using a little hack. For example, if we need to move all timestamps by one day, the script will move timestamps by 1000 years and one day, and then return timestamps back by 1000 years. The script use this hack only for columns that have unique constraints.
2021-08-12 19:24:21 +04:00
Vinoth Kannan
cd9262b7d3
DEV: minor improvements in the vanilla import script. (#14026)
We're parsing the post raw based on the record format now.
2021-08-12 15:07:44 +05:30