Commit Graph

917 Commits

Author SHA1 Message Date
Gerhard Schlager
739430c01e FIX: mbox import failed if no tags were configured 2020-03-26 16:41:11 +01:00
Gerhard Schlager
d216483c53 FIX: Importing with pgbouncer failed
Checking if all records have been imported uses a temp table in PostgreSQL. This fails when pgbouncer is used unless the temp table is created inside a transaction.
2020-03-26 16:41:09 +01:00
Gerhard Schlager
c94b63bc75 DEV: Improve import of attachments from Telligent 2020-03-26 16:37:55 +01:00
Jarek Radosz
d21d80198c
DEV: Update rubocop-discourse (#9270)
Includes:
* DEV: Use `eq_time` matcher
2020-03-26 16:32:41 +01:00
Robin Ward
eaa324ecbd Revert "Move the widget-hbs compiler to js from es6"
This reverts commit 5d66a2c16e.
2020-03-25 16:13:26 -04:00
Robin Ward
5d66a2c16e Move the widget-hbs compiler to js from es6 2020-03-25 15:03:21 -04:00
Gerhard Schlager
5b2b769eb7 DEV: Ensure uploads aren't deleted during imports
Sidekiq might delete uploads if you, for some reason, create upload records before using them in posts.
2020-03-24 17:14:16 +01:00
Gerhard Schlager
445b35381d Improve Telligent import script
* Detects mostly all attachments and it's a lot faster
* Parses user properties in Ruby instead of the DB, because that's less errorprone
* Imports user avatars
* Imports topic views by users
* Better handling of quotes and YouTube links
2020-03-23 09:18:12 +01:00
Gerhard Schlager
d27ece9ded FIX: Method from Telligent import script was deleted by accident 2020-03-14 22:10:40 +01:00
Gerhard Schlager
e825f47daa DEV: Better handling of incremental scrapes for Google Groups 2020-03-14 00:00:36 +01:00
Gerhard Schlager
0a88232e87 DEV: Improve mbox import script
* Better documentation of settings
* Add option to exclude trimmed parts of emails (enabled by default) to not revail email addresses
2020-03-14 00:00:36 +01:00
Gerhard Schlager
36062f43c8 DEV: Improve Telligent import script
* Adds ability to map forums to categories and tags as well as ignore forums.
* Fixes regular expression for detecting attachments in posts.
* Handles "remote attachments" 😮 by inserting a link.
* Imports view counts for topics.
* Handles incorrect references of parent posts.
* Better handling of quotes.
* Finds a lot more attachments by trying to replace various Unicode characters in filenames.
2020-03-14 00:00:36 +01:00
Gerhard Schlager
ba1b840816 DEV: Don't deactivate suspended users during import
Otherwise a cleanup job might delete those deactivated users.
2020-03-14 00:00:36 +01:00
Justin DiRose
6c948f27ea
FIX: Missing constant in SMF2 importer (#9178) 2020-03-11 10:19:59 -05:00
Gerhard Schlager
0e752db411 DEV: Improve mbox import script
* Customizable email subject prefixes to remove "Re" and "Fwd" as well as localized prefixes.
* Configuration option for prefixes like [FOO] or (BAR) which can be replaced with tags during import.
* Bugfix: Import script might have skipped some users due to missing ORDER BY.
2020-03-09 10:26:45 +01:00
Gerhard Schlager
edc8d58ac3 FEATURE: Add site setting to disable staged user cleanup
... and disabled the cleanup during imports, otherwise a running Sidekiq might delete users before posts are created
2020-03-09 10:26:41 +01:00
Jarek Radosz
f7ea2fdea5
FIX: Import posts of missing users from phpbb3 (#9085)
Posts without a user probably shouldn't happen unless there was some direct database tampering, but data like that has been seen in the wild.

The importer will assign those posts to the "system" user.
2020-03-06 22:54:40 +01:00
Gerhard Schlager
d7ccb58559 FIX: Google Groups scraper failed to login 2020-03-02 17:24:48 +01:00
Brad Morrical
ff5ff8d0d2
fix invalid byte sequence in UTF-8 (ArgumentError) (#9077) 2020-02-28 10:26:18 -05:00
Justin DiRose
f35ee5e887
DEV: Improvements to SMF2 script (#9006) 2020-02-24 12:51:45 -06:00
Sam Saffron
28292d2759
PERF: avoid shelling to get hostname aggressively
Previously we had many places in the app that called `hostname` to get
hostname of a server. This commit replaces the pattern in 2 ways

1. We cache the result in `Discourse.os_hostname` so it is only ever called once

2. We prefer to use Socket.gethostname which avoids making a shell command

This improves performance as we are not spawning hostname processes throughout
the app lifetime
2020-02-18 15:13:19 +11:00
David Taylor
5919618a87
DEV: Drop legacy OpenID 2.0 support (#8894)
This is not used in core or official plugins, and has been printing a deprecation notice since v2.3.0beta4. All OpenID 2.0 code and dependencies have been dropped. The user_open_ids table remains for now, in case anyone has missed the deprecation notice, and needs to migrate their data.

Context at https://meta.discourse.org/t/-/113249
2020-02-07 17:32:35 +00:00
Régis Hanol
0843e3e6ce FIX: add support for sub-sub-categories in base_importer
Also delegates 'post_already_imported?' and 'user_already_imported?' to the base importer.
2020-02-05 10:40:28 +01:00
Jarek Radosz
8a82ceb3bc
FIX: Improve downsize_uploads (#8409)
With this change the script:
* Actually removes original large-sized images
* Doesn't save processed files if their size has increased
* Prevents inconsistent state
2020-01-27 03:31:11 +01:00
Gerhard Schlager
ab07b945c2
Merge pull request #8736 from gschlager/rename_reply_id_column
REFACTOR: Rename `post_replies.reply_id` column to `post_replies.reply_post_id`
2020-01-17 17:24:49 +01:00
Gerhard Schlager
e474cda321 REFACTOR: Restoring of backups and migration of uploads to S3 2020-01-14 11:41:35 +01:00
Sam Saffron
710eafdd35 FIX: ensure we consistently pick the same topic for bench
We pick the first topic with 30 responses as our bench topic.

Previously we simply picked the last topic, but hand no guarantee on ordering.

This also attempts to correct previous runs of the bench.
2020-01-08 16:33:45 +11:00
David Taylor
fd6fbaa4ae DEV: Update bench.rb for core changes (#8670)
- Use new api key rake task
- Switch to header-based API auth
- Stop hard-coding topic id
2020-01-08 16:23:29 +11:00
Michael Brown
7200653e16 FIX: cache_critical_dns was erroring without IPAddr
* sometimes cache_critical_dns would error out since "IPAddr" was
  undefined
* sometimes it autoloaded, so no error
2019-12-27 12:39:08 -05:00
AlexP11223
1e4a83cc2a FEATURE: add mybb.ru import script (#8609) 2019-12-20 11:10:18 -05:00
Martin Brennan
edbc356593
FIX: Replace deprecated URI.encode, URI.escape, URI.unescape and URI.unencode (#8528)
The following methods have long been deprecated in ruby due to flaws in their implementation per http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/29293?29179-31097:

URI.escape
URI.unescape
URI.encode
URI.unencode
escape/encode are just aliases for one another. This PR uses the Addressable gem to replace these methods with its own encode, unencode, and encode_component methods where appropriate.

I have put all references to Addressable::URI here into the UrlHelper to keep them corralled in one place to make changes to this implementation easier.

Addressable is now also an explicit gem dependency.
2019-12-12 12:49:21 +10:00
David Taylor
e5ce2d97f6 DEV: Simplify Rubocop runner for GitHub actions
Once we are happy with basic behavior, we can try adding annotations again
2019-12-11 11:49:27 +00:00
Joffrey JAFFEUX
bd17a3a8e7
DEV: introduces Github Actions for CI (#8441)
Co-Authored-By: David Taylor <david@taylorhq.com>
2019-12-10 14:45:47 +01:00
Sam Saffron
0c52537f10 DEV: update rubocop to version 0.77
We like to stay as close as possible to latest with rubocop cause the cops
get better.

This update required some code changes, specifically the default is to avoid
explicit returns where implicit is done

Also this renames a few rules
2019-12-10 11:48:39 +11:00
Joffrey JAFFEUX
0d3d2c43a0
DEV: s/\$redis/Discourse\.redis (#8431)
This commit also adds a rubocop rule to prevent global variables.
2019-12-03 10:05:53 +01:00
Gerhard Schlager
c218036107 FIX: Make Google Groups scraper work for G Suite users 2019-11-28 02:09:51 +01:00
Sam Saffron
88ecb650a9 DEV: Implement a faster Discourse.cache
This is a bottom up rewrite of Discourse cache to support faster performance
and a limited surface area.

ActiveSupport::Cache::Store accepts many options we do not use, this partial
implementation only picks the bits out that we do use and want to support.

Additionally params are named which avoids typos such as "expires_at" vs "expires_in"

This also moves a few spots in Discourse to use Discourse.cache over setex
Performance of setex and Discourse.cache.write is similar.
2019-11-27 16:11:49 +11:00
Sam Saffron
0fb497eb23 DEV: use Discourse.cache over Rails.cache
Discourse.cache is a more consistent method to use and offers clean fallback
if you are skipping redis

This is part of a larger change that both optimizes Discoruse.cache and omits
use of setex on $redis in favor of consistently using discourse cache

Bench does reveal that use of Rails.cache and Discourse.cache is 1.25x slower
than redis.setex / get so a re-implementation will follow prior to porting
2019-11-27 12:36:19 +11:00
Vinoth Kannan
3bb7ad4be1
FEATURE: remove support for 'suppress_from_latest' category setting. (#8308) 2019-11-18 12:28:35 +05:30
Penar Musaraj
067696df8f DEV: Apply Rubocop redundant return style 2019-11-14 15:10:51 -05:00
David Taylor
9fea43e46a
DEV: Remove use of cd in the app (#8337)
`FileUtils.cd` and `Dir.chdir` cause the working directory to change for the entire process. We run sidekiq jobs, hijacked requests and deferred jobs in threads, which can make working directory changes have unintended side-effects.

- Add a rubocop rule to warn about usage of Dir.chdir and FileUtils.cd
- Added rubocop:disable for scripts used outside the app
- Refactored code using cd to use alternative methods
- Temporarily skipped the rubocop check for lib/backup_restore. This will require more complex refactoring, so I will create a separate PR for review
2019-11-13 09:57:39 +00:00
Sam Saffron
bf0ef73286 DEV: correct rake task used to grab admin key
We amended it so "api_key:get" is no longer supported and instead we are
more explicit. This matches that change and fixes the bench.
2019-11-11 10:23:14 +11:00
David Taylor
54fe887c44 DEV: Remove prototype theme-watcher script
This has been superseded by the Theme CLI: https://meta.discourse.org/t/82950
2019-11-07 17:22:54 +00:00
romanrizzi
d76d0e75ec DEV: Move warmup inside docker rake task 2019-10-25 16:31:05 -03:00
romanrizzi
4f452f0205 DEV: Add variable to warmup tmp folder and obtain accurate results when profiling specs 2019-10-25 10:52:23 -03:00
Krzysztof Kotlarek
f34a0141c7 FIX: Correct path to ImportExport module (#8227)
During the move from Classic autoloader to Zeitwerk import_export module was moved to correct file name convention.
427d54b2b0 (diff-d896ec33b95afb7fae9f8bfe73d0580b)

Problem is that export/import is still using old path to require that module

Meta: https://meta.discourse.org/t/topic-and-category-export-import/38930/40
2019-10-23 17:27:14 +11:00
Daniel Waterworth
1352a5b5fa DEV: undo pluck_first changes to micro benchmark
and add pluck_first benchmark
2019-10-21 12:21:24 +01:00
Daniel Waterworth
55a1394342 DEV: pluck_first
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit cause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:

pluck_first, equivalent to limit(1).pluck(*columns).first

and pluck_first! which, like other finder methods, raises an exception
when no record is found
2019-10-21 12:08:20 +01:00
Régis Hanol
e1998ef244 FIX: downsize_uploads script
The script will now correct all width/height and thumbnail_width/thumbnail_height properties of all the uploaded images.

The script now uses width * height to filter out all unaffected images.

Also handled the case where a downsized image was already an uploaded record.
2019-10-10 16:37:55 +02:00
Régis Hanol
4fdad12998 FIX: downsize_uploads script to support external storage
Also ensured we update the sha1 property of the upload record to match the actual file.
2019-10-08 17:54:39 +02:00