Commit Graph

1021 Commits

Author SHA1 Message Date
Michael Fitz-Payne
29f0bc4ea2 DEV(cache_critical_dns): add caching for MessageBus Redis hostname
We are already caching any DB_HOST and REDIS_HOST (and their
accompanying replicas), we should also cache the resolved addresses for
the MessageBus specific Redis. This is a noop if no MB redis is defined
in config. A side effect is that the MB will also support SRV lookup and
priorities, following the same convention as the other cached services.

The port argument was added to redis_healthcheck so that the script
supports a setup where Redis is running on a non-default port.

Did some minor refactoring to improve readability when filtering out the
CRITICAL_HOST_ENV_VARS. The `select` block was a bit confusing, so the
sequence was made easier to follow.

We were coercing an environment variable to an int in a few places, so
the `env_as_int` method was introduced to do that coercion in one place and
for convenience purposes default to a value if provided.

See /t/68301/30.
2022-10-26 09:22:20 +10:00
Michael Fitz-Payne
bc8ad2a5d2 DEV(cache_critical_dns): add option to run once and exit
There are situations where a container running Discourse may want to
cache the critical DNS services without running the cache_critical_dns
service, for example running migrations prior to running a full bore
application container.

Add a `--once` argument for the cache_critical_dns script that will
only execute the main loop once, and return the status code for the
script to use when exiting. 0 indicates no errors occured during SRV
resolution, and 1 indicates a failure during the SRV lookup.

Nothing is reported to prometheus in run_once mode. Generally this
mode of operation would be a part of a unix pipeline, in which the exit
status is a more meaningful and immediate signal than a prometheus metric.

The reporting has been moved into it's own method that can be called
only when the script is running as a service.

See /t/69597.
2022-10-26 09:22:20 +10:00
Michael Fitz-Payne
7804eb44e6 DOC(cache_critical_dns): add program description
Describes the behaviour and configuration of the cache_critical_dns
script, mainly cribbed from commit messages. Tries to make this program
a bit less of an enigma.
2022-10-26 09:22:20 +10:00
Michael Fitz-Payne
cf86886448 DEV(cache_critical_dns): improve postgres_healthcheck
The `PG::Connection#ping` method is only reliable for checking if the
given host is accepting connections, and not if the authentication
details are valid.

This extends the healthcheck to confirm that the auth details are
able to both create a connection and execute queries against the
database.

We expect the empty query to return an empty result set, so we can
assert on that. If a failure occurs for any reason, the healthcheck will
return false.
2022-10-26 09:22:20 +10:00
Gabe Pacuilla
5340b09ad1 FIX(cache_critical_dns): use correct DISCOURSE_DB_USERNAME envvar (#16862) 2022-10-26 09:22:20 +10:00
Gabe Pacuilla
ab88591536 FIX(cache_critical_dns): use discourse database name and user by default (#16856) 2022-10-26 09:22:20 +10:00
Michael Fitz-Payne
924ef90eed DEV(cache_critical_dns): add SRV priority tunables
An SRV RR contains a priority value for each of the SRV targets that
are present, ranging from 0 - 65535. When caching SRV records we may want to
filter out any targets above or below a particular threshold.

This change adds support for specifying a lower and/or upper bound on
target priorities for any SRV RRs. Any targets returned when resolving
the SRV RR whose priority does not fall between the lower and upper
thresholds are ignored.

For example: Let's say we are running two Redis servers, a primary and
cold server as a backup (but not a replica). Both servers would pass health
checks, but clearly the primary should be preferred over the backup
server. In this case, we could configure our SRV RR with the primary
target as priority 1 and backup target as priority 10. The
`DISCOURSE_REDIS_HOST_SRV_LE` could then be set to 1 and the target with
priority 10 would be ignored.

See /t/66045.
2022-10-26 09:22:20 +10:00
Michael Fitz-Payne
cbf7d16d7c FIX: remove refresh seconds override on cache_critical_dns (#16572)
This removes the option to override the sleep time between caching of
DNS records. The override was invalid because `''.to_i` is 0 in Ruby,
causing a tight loop calling the `run` method.
2022-10-26 09:22:20 +10:00
Michael Fitz-Payne
8293f11f53 FIX: cache_critical_dns - add TLS support for Redis healthcheck
For Redis connections that operate over TLS, we need to ensure that we
are setting the correct arguments for the Redis client. We can utilise
the existing environment variable `DISCOURSE_REDIS_USE_SSL` to toggle
this behaviour.

No SSL verification is performed for two reasons:
- the Discourse application will perform a verification against any FQDN
  as specified for the Redis host
- the healthcheck is run against the _resolved_ IP address for the Redis
  hostname, and any SSL verification will always fail against a direct
  IP address

If no SSL arguments are provided, the IP address is never cached against
the hostname as no healthy address is ever found in the HealthyCache.
2022-10-26 09:22:20 +10:00
Michael Fitz-Payne
971409741f DEV: refactor cache_critical_dns for SRV RR awareness
Modify the cache_critical_dns script for SRV RR awareness. The new
behaviour is only enabled when one or more of the following environment
variables are present (and only for a host where the `DISCOURSE_*_HOST_SRV`
variable is present):
- `DISCOURSE_DB_HOST_SRV`
- `DISCOURSE_DB_REPLICA_HOST_SRV`
- `DISCOURSE_REDIS_HOST_SRV`
- `DISCOURSE_REDIS_REPLICA_HOST_SRV`

Some minor changes in refactor to original script behaviour:
- add Name and SRVName classes for storing resolved addresses for a hostname
- pass DNS client into main run loop instead of creating inside the loop
- ensure all times are UTC
- add environment override for system hosts file path and time between DNS
  checks mainly for testing purposes

The environment variable for `BUNDLE_GEMFILE` is set to enables Ruby to
load gems that are installed and vendored via the project's Gemfile.
This script is usually not run from the project directory as it is
configured as a system service (see
71ba9fb7b5/templates/cache-dns.template.yml (L19))
and therefore cannot load gems like `pg` or `redis` from the default
load paths. Setting this environment variable configures bundler to look
in the correct project directory during it's setup phase.

When a `DISCOURSE_*_HOST_SRV` environment variable is present, the
decision for which target to cache is as follows:
- resolve the SRV targets for the provided hostname
- lookup the addresses for all of the resolved SRV targets via the
  A and AAAA RRs for the target's hostname
- perform a protocol-aware healthcheck (PostgreSQL or Redis pings)
- pick the newest target that passes the healthcheck

From there, the resolved address for the SRV target is cached against
the hostname as specified by the original form of the environment
variable.

For example: The hostname specified by the `DISCOURSE_DB_HOST` record
is `database.example.com`, and the `DISCOURSE_DB_HOST_SRV` record is
`database._postgresql._tcp.sd.example.com`. An SRV RR lookup will return
zero or more targets. Each of the targets will be queried for A and AAAA
RRs. For each of the addresses returned, the newest address that passes
a protocol-aware healthcheck will be cached. This address is cached so
that if any newer address for the SRV target appears we can perform a
health check and prefer the newer address if the check passes.

All resolved SRV targets are cached for a minimum of 30 minutes in memory
so that we can prefer newer hosts over older hosts when more than one target
is returned. Any host in the cache that hasn't been seen for more than 30
minutes is purged.

See /t/61485.
2022-10-26 09:22:20 +10:00
David Taylor
9524c1320a DEV: Include DISCOURSE_REDIS_REPLICA_HOST in cache_critical_dns (#15877)
This is the replacement for DISCOURSE_REDIS_SLAVE_HOST
2022-10-26 09:22:20 +10:00
David Taylor
ed2f700440
DEV: Wait for initdb to complete in docker.rake (#15614)
On slower hardware it can take a while to init the database. If we don't wait, the `rake db:create` step will fail.
2022-01-17 17:45:39 +00:00
Peter Zhu
c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
David Taylor
0e87f882a7
DEV: Use discourse image for postgres in GitHub Actions (#15291)
The discourse base image already contains a postgres installation, so pulling a separate postgres image is a little wasteful. Using the copy of Postgres in the discourse image saves about 20 seconds on every GitHub actions run.

This commit sets up Postgres with a few performance-improving flags, which we were already using for the `rake docker:test` task (used on our internal CI system).
2021-12-14 17:20:06 +00:00
Jarek Radosz
cfabdb72bc
FIX: Ambiguous column in downsize_uploads (#14972) 2021-11-16 16:23:32 +01:00
Leonardo Mosquera
48a08cc397
FIX: Vanilla importer fixes (#14699)
Import script was out of date
2021-10-27 14:22:37 +02:00
Dan Ungureanu
69f0f48dc0
DEV: Fix rubocop issues (#14715) 2021-10-27 11:39:28 +03:00
David Taylor
46d96c9feb
DEV: Apply rubocop to script/import_scripts/phorum.rb (#14727)
Followup to b24002018a
2021-10-26 19:16:52 +01:00
Jeremy Waters
b24002018a Update phorum.rb
Add attachment/file/upload handling to bring them in from phorum to discourse
2021-10-26 12:41:50 -04:00
Jarek Radosz
451cd4ec3f
DEV: Fix thor deprecation warning (#14680)
```
Deprecation warning: Thor exit with status 0 on errors. To keep this behavior, you must define `exit_on_failure?` in `DiscourseCLI`
```
2021-10-21 21:01:05 +02:00
Theodore Diamantidis
97178cd777
FIX: phpbb import - attachments not embedded in posts (#14570) 2021-10-11 14:27:54 +02:00
Constanza
a413a1e015
DEV: process image uploads in the Zendesk API import script (#14524) 2021-10-06 12:24:12 -04:00
Gerhard Schlager
a4d0d866aa
DEV: Bulk imports should find existing users by email (#14468)
Without this change, bulk imports unconditionally create new user records even when a user with the same email address exists.
2021-09-29 00:20:06 +02:00
Andrei Prigorshnev
1656b7ed01
DEV: Make db_timestamp_mover work with tables with unique constraints (#14027)
Some tables in the database have constraints on columns with dates. Because of them, the script for moving timestamps can fail from time to time. This PR makes the script work with such tables.

In general, in PostgreSQL it is not always possible to defer constraint checks to the transaction commit (Primary Keys and Unique Constraints can be deferred, but them should be declared as DEFERRABLE to make it possible. Indices created with CREATE UNIQUE INDEX can't be deferred at all).

Since we can't defer constraint checks, I've made it work using a little hack. For example, if we need to move all timestamps by one day, the script will move timestamps by 1000 years and one day, and then return timestamps back by 1000 years. The script use this hack only for columns that have unique constraints.
2021-08-12 19:24:21 +04:00
Vinoth Kannan
cd9262b7d3
DEV: minor improvements in the vanilla import script. (#14026)
We're parsing the post raw based on the record format now.
2021-08-12 15:07:44 +05:30
Andrei Prigorshnev
b66674fec2
DEV: ignore the given_daily_likes table when moving timestamps on Try (#13971)
This will fix the try-reset build that failed today. Probably this going to happen again with other tables that have constraints on date columns. I'm going to modify the script to make it work without ignoring such tables. After that, the only table we're going to need to ignore will be the 2FA table.

Before I fixed that, don't hesitate to tag me if the try-reset build fail again.
2021-08-06 18:27:23 +04:00
Ruoxin Wang
f9aaed7020
FIX: MyBB importer exposes deleted posts (#13700)
The MyBB importer exposes posts marked as deleted in MyBB. This may lead to a privacy issue.
2021-07-22 09:55:02 +02:00
Joffrey JAFFEUX
a10fd95a7e
DEV: removes unused version_bump script (#13811) 2021-07-21 10:35:24 -04:00
Joffrey JAFFEUX
265e32e3e2
DEV: uses main instead of master in version_bump script (#13805) 2021-07-21 10:40:35 +02:00
Andrei Prigorshnev
efc8d5f134
DEV: ignore the 2FA table when moving timestamps (#13793)
or 2FA will be broken after moving. We use this script for moving timestamps when restoring Try.
2021-07-20 15:49:20 +04:00
Andrei Prigorshnev
7fe3e4bab8
DEV: fix joining in the script for moving timestamps (#13721)
Without checking if t.table_schema = '#{@schema}' the SELECT with JOIN in the script were returning every column twice in case there is a 'backup' scheme with exactly the same tables as in the 'public scheme'
2021-07-13 17:21:15 +04:00
Andrei Prigorshnev
8a0349a01f
DEV: ignore some tables when updating timestamps using db_timestamps_mover.rb (#13714)
Some tables have constraints on columns with a date which can cause problems when moving timestamps. By now I think it's enough to just ignore them.
2021-07-13 12:36:41 +04:00
Andrei Prigorshnev
696bd0bf05
DEV: add a script for moving timestamps in database (#13682)
We're going to use this script for updating timestamps on Try, but it can be used with a local database during development as well.

Usage:

Commands:
  ruby db_timestamp_updater.rb yesterday <date> move all timestamps by x days so that <date> will be moved to yesterday
  ruby db_timestamp_updater.rb 100              move all timestamps forward by 100 days
  ruby db_timestamp_updater.rb -100             move all timestamps backward by 100 days
The script moves all timestamps in the database by the same amount of days forward or backward. No need to change the script if we add a new column in the future.

The more simple solution would be just to move timestamps in several tables (topics, posts, and so on). I didn't want to go that way because it could generate additional work in the future. For example, if we add a new column with a timestamp and users can see that timestamp we'd need to add that column to the script. Or, for example, if we move a post's timestamp to the future but forget to move a timestamp of topic timer or user action it can cause weird bugs.
2021-07-09 20:04:28 +04:00
Alan Guo Xiang Tan
37b8ce79c9
FEATURE: Add last visit indication to topic view page. (#13471)
This PR also removes grey old unread bubble from the topic badges by
dropping `TopicUser#highest_seen_post_number`.
2021-07-05 14:17:31 +08:00
Dan Ungureanu
6ea4bbd2ec
DEV: Prefer .pluck_first over .pluck.first (#13607) 2021-07-02 10:03:54 +08:00
Ikko Ashimine
88da06cba0 FIX: typo in discourse
occurences -> occurrences
2021-06-30 11:08:29 -04:00
David Taylor
20070f5089
DEV: Update script/promote_migrations (#13513)
Introduces the --plugins-base flag for updating plugins in a different directory. Followup to 49f39434c4
2021-06-24 13:57:23 +01:00
Jarek Radosz
046a875222
DEV: Improve script/downsize_uploads.rb (#13508)
* Only shrink images that are used in Posts and no other models
* Don't save the upload if the size is the same
2021-06-24 00:09:40 +02:00
David Taylor
49f39434c4 DEV: Introduce script/promote_migrations tool
Post-deploy migrations exist to allow for seamless Discourse upgrades. By design, they cause migrations to run out of numerical order. This has the potential to cause some unexpected edge cases. To reduce the likelihood of these edge cases, we will promote historical post_deploy migrations to regular migrations after a full Discourse stable release cycle.

This script is intended to be run at least during every Discourse release cycle.

This means that truly seamless upgrades will not be possible between non-consecutive Discourse versions. (Upgrades will still work, but may cause some server errors for users during the upgrade)
2021-06-23 17:43:38 +01:00
Bianca Nenciu
5efed91128 FIX: Set random values for digest_attempted_at
Setting a random value in the interval 1 week ago ... now works better
because this spreads digest scheduling over a week because digests are
sent one week from the date of the last digest.
2021-06-22 12:05:15 +08:00
Arpit Jalan
365d339985
DEV: fix Flarum import script (#13385) 2021-06-15 19:08:55 +05:30
Jarek Radosz
3bb765ac92
DEV: Remove the remaining Travis code (#13255)
The second attempt at #10041 now that all our plugins use GitHub Actions CI instead.
2021-06-02 20:29:47 +02:00
Lecter
4bf195d502
FEATURE: Flarum import script (#13139) 2021-05-27 02:30:50 +02:00
Bianca Nenciu
d0779a87bb
FIX: Use max_category_nesting when importing categories (#13105)
It allowed for a parent category and a sub-category.
2021-05-26 12:40:26 +03:00
Josh Soref
59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base. 

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change 

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Blake Erickson
4b7b586861 DEV: Remove unused rswag script
We no longer use this script for helping generate rswag formatted docs because we can use json schema now.
2021-05-07 09:36:55 +08:00
Justin DiRose
c1517e428e
DEV: Add vBulletin5 bulk importer (#12904)
This is a pretty straightforward bulk importer, just tailored to the vBulletin 5 database structure.

Also made a few minor improvements to the base importer -- should be self explanatory in the code.
2021-04-30 11:03:33 -05:00
David Taylor
8b87cb07c3
DEV: Return a non-zero exit code when discourse remap is aborted (#12873)
This makes it much easier to integrate the script with external tools
2021-04-28 14:10:35 +01:00
Pilou
e7892df10d
FIX: handle charset=windows-1252 in mbox import script (#12832)
Co-authored-by: Loïc Dachary <loic@dachary.org>
2021-04-27 15:43:31 +02:00
Michael Maroszek
5bec0e5763
fix vbulletin importer to import unreferenced attachments (#12187) 2021-04-19 21:05:16 +02:00