Commit Graph

1119 Commits

Author SHA1 Message Date
Gerhard Schlager
2b2584912a Improve Telligent import script
* Imports private messages
* Replaces internal links for topics and replies
* Allows incremental import of accepted answers
2020-04-03 18:10:52 +02:00
Robin Ward
b2b7afd310 Rename the server side widget hbs compiler 2020-03-27 12:06:14 -04:00
Gerhard Schlager
739430c01e FIX: mbox import failed if no tags were configured 2020-03-26 16:41:11 +01:00
Gerhard Schlager
d216483c53 FIX: Importing with pgbouncer failed
Checking if all records have been imported uses a temp table in PostgreSQL. This fails when pgbouncer is used unless the temp table is created inside a transaction.
2020-03-26 16:41:09 +01:00
Gerhard Schlager
c94b63bc75 DEV: Improve import of attachments from Telligent 2020-03-26 16:37:55 +01:00
Jarek Radosz
d21d80198c
DEV: Update rubocop-discourse (#9270)
Includes:
* DEV: Use `eq_time` matcher
2020-03-26 16:32:41 +01:00
Robin Ward
eaa324ecbd Revert "Move the widget-hbs compiler to js from es6"
This reverts commit 5d66a2c16e.
2020-03-25 16:13:26 -04:00
Robin Ward
5d66a2c16e Move the widget-hbs compiler to js from es6 2020-03-25 15:03:21 -04:00
Gerhard Schlager
5b2b769eb7 DEV: Ensure uploads aren't deleted during imports
Sidekiq might delete uploads if you, for some reason, create upload records before using them in posts.
2020-03-24 17:14:16 +01:00
Gerhard Schlager
445b35381d Improve Telligent import script
* Detects nearly all attachments and is a lot faster
* Parses user properties in Ruby instead of the DB, because that's less error-prone
* Imports user avatars
* Imports topic views by users
* Better handling of quotes and YouTube links
2020-03-23 09:18:12 +01:00
Gerhard Schlager
d27ece9ded FIX: Method from Telligent import script was deleted by accident 2020-03-14 22:10:40 +01:00
Gerhard Schlager
e825f47daa DEV: Better handling of incremental scrapes for Google Groups 2020-03-14 00:00:36 +01:00
Gerhard Schlager
0a88232e87 DEV: Improve mbox import script
* Better documentation of settings
* Add option to exclude trimmed parts of emails (enabled by default) to not reveal email addresses
2020-03-14 00:00:36 +01:00
Gerhard Schlager
36062f43c8 DEV: Improve Telligent import script
* Adds ability to map forums to categories and tags as well as ignore forums.
* Fixes regular expression for detecting attachments in posts.
* Handles "remote attachments" 😮 by inserting a link.
* Imports view counts for topics.
* Handles incorrect references of parent posts.
* Better handling of quotes.
* Finds a lot more attachments by trying to replace various Unicode characters in filenames.
2020-03-14 00:00:36 +01:00
Gerhard Schlager
ba1b840816 DEV: Don't deactivate suspended users during import
Otherwise a cleanup job might delete those deactivated users.
2020-03-14 00:00:36 +01:00
Justin DiRose
6c948f27ea
FIX: Missing constant in SMF2 importer (#9178) 2020-03-11 10:19:59 -05:00
Gerhard Schlager
0e752db411 DEV: Improve mbox import script
* Customizable email subject prefixes to remove "Re" and "Fwd" as well as localized prefixes.
* Configuration option for prefixes like [FOO] or (BAR) which can be replaced with tags during import.
* Bugfix: Import script might have skipped some users due to missing ORDER BY.
2020-03-09 10:26:45 +01:00
Gerhard Schlager
edc8d58ac3 FEATURE: Add site setting to disable staged user cleanup
... and disabled the cleanup during imports, otherwise a running Sidekiq might delete users before posts are created
2020-03-09 10:26:41 +01:00
Jarek Radosz
f7ea2fdea5
FIX: Import posts of missing users from phpbb3 (#9085)
Posts without a user probably shouldn't happen unless there was some direct database tampering, but data like that has been seen in the wild.

The importer will assign those posts to the "system" user.
2020-03-06 22:54:40 +01:00
Gerhard Schlager
d7ccb58559 FIX: Google Groups scraper failed to login 2020-03-02 17:24:48 +01:00
Brad Morrical
ff5ff8d0d2
fix invalid byte sequence in UTF-8 (ArgumentError) (#9077) 2020-02-28 10:26:18 -05:00
Justin DiRose
f35ee5e887
DEV: Improvements to SMF2 script (#9006) 2020-02-24 12:51:45 -06:00
Sam Saffron
28292d2759
PERF: avoid shelling to get hostname aggressively
Previously we had many places in the app that called `hostname` to get
hostname of a server. This commit replaces the pattern in 2 ways

1. We cache the result in `Discourse.os_hostname` so it is only ever called once

2. We prefer to use Socket.gethostname which avoids making a shell command

This improves performance as we are not spawning hostname processes throughout
the app lifetime
2020-02-18 15:13:19 +11:00
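The caching described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: the `HostInfo` module is a hypothetical stand-in for `Discourse.os_hostname`, and the fallback to the `hostname` command is an assumption about what "prefer" means here.

```ruby
require "socket"

# Hypothetical stand-in for Discourse.os_hostname; the real method may differ.
module HostInfo
  def self.os_hostname
    # Memoize so the lookup runs at most once per process.
    @os_hostname ||= begin
      Socket.gethostname # no subprocess is spawned
    rescue SocketError, SystemCallError
      `hostname`.strip   # assumed fallback: shell out only if the call fails
    end
  end
end

puts HostInfo.os_hostname
```

Later calls return the memoized string, so no further system call or subprocess is involved.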
David Taylor
5919618a87
DEV: Drop legacy OpenID 2.0 support (#8894)
This is not used in core or official plugins, and has been printing a deprecation notice since v2.3.0beta4. All OpenID 2.0 code and dependencies have been dropped. The user_open_ids table remains for now, in case anyone has missed the deprecation notice, and needs to migrate their data.

Context at https://meta.discourse.org/t/-/113249
2020-02-07 17:32:35 +00:00
Régis Hanol
0843e3e6ce FIX: add support for sub-sub-categories in base_importer
Also delegates 'post_already_imported?' and 'user_already_imported?' to the base importer.
2020-02-05 10:40:28 +01:00
Jarek Radosz
8a82ceb3bc
FIX: Improve downsize_uploads (#8409)
With this change the script:
* Actually removes original large-sized images
* Doesn't save processed files if their size has increased
* Prevents inconsistent state
2020-01-27 03:31:11 +01:00
Gerhard Schlager
ab07b945c2
Merge pull request #8736 from gschlager/rename_reply_id_column
REFACTOR: Rename `post_replies.reply_id` column to `post_replies.reply_post_id`
2020-01-17 17:24:49 +01:00
Gerhard Schlager
e474cda321 REFACTOR: Restoring of backups and migration of uploads to S3 2020-01-14 11:41:35 +01:00
Sam Saffron
710eafdd35 FIX: ensure we consistently pick the same topic for bench
We pick the first topic with 30 responses as our bench topic.

Previously we simply picked the last topic, but had no guarantee on ordering.

This also attempts to correct previous runs of the bench.
2020-01-08 16:33:45 +11:00
David Taylor
fd6fbaa4ae DEV: Update bench.rb for core changes (#8670)
- Use new api key rake task
- Switch to header-based API auth
- Stop hard-coding topic id
2020-01-08 16:23:29 +11:00
Michael Brown
7200653e16 FIX: cache_critical_dns was erroring without IPAddr
* sometimes cache_critical_dns would error out since "IPAddr" was
  undefined
* sometimes it autoloaded, so no error
2019-12-27 12:39:08 -05:00
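The failure mode in the commit above is typical of code that relies on another library having loaded `ipaddr` first. A minimal sketch of the explicit form, where `parse_ip` is a hypothetical helper and not part of cache_critical_dns:

```ruby
require "ipaddr" # explicit require; do not rely on it being loaded elsewhere

# Hypothetical helper: returns an IPAddr, or nil if the string is not an address.
def parse_ip(text)
  IPAddr.new(text)
rescue IPAddr::InvalidAddressError
  nil
end

puts parse_ip("192.0.2.10").ipv4?
puts parse_ip("not-an-ip").inspect
```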
AlexP11223
1e4a83cc2a FEATURE: add mybb.ru import script (#8609) 2019-12-20 11:10:18 -05:00
Martin Brennan
edbc356593
FIX: Replace deprecated URI.encode, URI.escape, URI.unescape and URI.unencode (#8528)
The following methods have long been deprecated in ruby due to flaws in their implementation per http://blade.nagaokaut.ac.jp/cgi-bin/vframe.rb/ruby/ruby-core/29293?29179-31097:

URI.escape
URI.unescape
URI.encode
URI.unencode
escape/encode are just aliases for one another. This PR uses the Addressable gem to replace these methods with its own encode, unencode, and encode_component methods where appropriate.

I have put all references to Addressable::URI here into the UrlHelper to keep them corralled in one place to make changes to this implementation easier.

Addressable is now also an explicit gem dependency.
2019-12-12 12:49:21 +10:00
David Taylor
e5ce2d97f6 DEV: Simplify Rubocop runner for GitHub actions
Once we are happy with basic behavior, we can try adding annotations again
2019-12-11 11:49:27 +00:00
Joffrey JAFFEUX
bd17a3a8e7
DEV: introduces Github Actions for CI (#8441)
Co-Authored-By: David Taylor <david@taylorhq.com>
2019-12-10 14:45:47 +01:00
Sam Saffron
0c52537f10 DEV: update rubocop to version 0.77
We like to stay as close as possible to latest with rubocop cause the cops
get better.

This update required some code changes, specifically the default is to avoid
explicit returns where an implicit return suffices

Also this renames a few rules
2019-12-10 11:48:39 +11:00
Joffrey JAFFEUX
0d3d2c43a0
DEV: s/\$redis/Discourse\.redis (#8431)
This commit also adds a rubocop rule to prevent global variables.
2019-12-03 10:05:53 +01:00
Gerhard Schlager
c218036107 FIX: Make Google Groups scraper work for G Suite users 2019-11-28 02:09:51 +01:00
Sam Saffron
88ecb650a9 DEV: Implement a faster Discourse.cache
This is a bottom up rewrite of Discourse cache to support faster performance
and a limited surface area.

ActiveSupport::Cache::Store accepts many options we do not use, this partial
implementation only picks the bits out that we do use and want to support.

Additionally params are named which avoids typos such as "expires_at" vs "expires_in"

This also moves a few spots in Discourse to use Discourse.cache over setex
Performance of setex and Discourse.cache.write is similar.
2019-11-27 16:11:49 +11:00
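The point about named params can be illustrated with Ruby keyword arguments. The sketch below is a hypothetical in-memory cache, not the Discourse.cache implementation; `TinyCache` and its methods are assumptions made for illustration only.

```ruby
# Hypothetical in-memory cache, not Discourse's implementation.
class TinyCache
  def initialize
    @store = {}
  end

  # expires_in is a keyword argument, so a misspelling such as
  # expires_at: raises ArgumentError instead of being silently ignored.
  def write(name, value, expires_in: nil)
    expires_at = expires_in ? Time.now + expires_in : nil
    @store[name] = [value, expires_at]
    value
  end

  def read(name)
    value, expires_at = @store[name]
    return nil if expires_at && Time.now >= expires_at
    value
  end
end

cache = TinyCache.new
cache.write("greeting", "hello", expires_in: 60)
puts cache.read("greeting")

begin
  cache.write("greeting", "hi", expires_at: 60) # typo in the option name
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

An options hash would accept the misspelled key without complaint, which is the class of typo the commit message mentions.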
Sam Saffron
0fb497eb23 DEV: use Discourse.cache over Rails.cache
Discourse.cache is a more consistent method to use and offers clean fallback
if you are skipping redis

This is part of a larger change that both optimizes Discourse.cache and omits
use of setex on $redis in favor of consistently using discourse cache

Bench does reveal that use of Rails.cache and Discourse.cache is 1.25x slower
than redis.setex / get so a re-implementation will follow prior to porting
2019-11-27 12:36:19 +11:00
Vinoth Kannan
3bb7ad4be1
FEATURE: remove support for 'suppress_from_latest' category setting. (#8308) 2019-11-18 12:28:35 +05:30
Penar Musaraj
067696df8f DEV: Apply Rubocop redundant return style 2019-11-14 15:10:51 -05:00
David Taylor
9fea43e46a
DEV: Remove use of cd in the app (#8337)
`FileUtils.cd` and `Dir.chdir` cause the working directory to change for the entire process. We run sidekiq jobs, hijacked requests and deferred jobs in threads, which can make working directory changes have unintended side-effects.

- Add a rubocop rule to warn about usage of Dir.chdir and FileUtils.cd
- Added rubocop:disable for scripts used outside the app
- Refactored code using cd to use alternative methods
- Temporarily skipped the rubocop check for lib/backup_restore. This will require more complex refactoring, so I will create a separate PR for review
2019-11-13 09:57:39 +00:00
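One common alternative to `Dir.chdir` is to pass the directory to the child process only. The sketch below assumes a Unix-like system with a `pwd` command; `run_in` is a hypothetical helper and is not taken from the commit.

```ruby
require "open3"
require "tmpdir"

# Runs a command in `dir` without changing the parent process's working
# directory. The :chdir option applies only to the spawned child.
def run_in(dir, *command)
  output, status = Open3.capture2(*command, chdir: dir)
  raise "command failed: #{command.join(' ')}" unless status.success?
  output
end

before = Dir.pwd
Dir.mktmpdir do |dir|
  puts run_in(dir, "pwd")
end
puts Dir.pwd == before
```

Because the parent's working directory never changes, other threads in the same process are unaffected.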
Sam Saffron
bf0ef73286 DEV: correct rake task used to grab admin key
We amended it so "api_key:get" is no longer supported and instead we are
more explicit. This matches that change and fixes the bench.
2019-11-11 10:23:14 +11:00
David Taylor
54fe887c44 DEV: Remove prototype theme-watcher script
This has been superseded by the Theme CLI: https://meta.discourse.org/t/82950
2019-11-07 17:22:54 +00:00
romanrizzi
d76d0e75ec DEV: Move warmup inside docker rake task 2019-10-25 16:31:05 -03:00
romanrizzi
4f452f0205 DEV: Add variable to warmup tmp folder and obtain accurate results when profiling specs 2019-10-25 10:52:23 -03:00
Krzysztof Kotlarek
f34a0141c7 FIX: Correct path to ImportExport module (#8227)
During the move from the Classic autoloader to Zeitwerk, the import_export module was moved to match the correct file name convention.
427d54b2b0 (diff-d896ec33b95afb7fae9f8bfe73d0580b)

The problem is that export/import still uses the old path to require that module.

Meta: https://meta.discourse.org/t/topic-and-category-export-import/38930/40
2019-10-23 17:27:14 +11:00
Daniel Waterworth
1352a5b5fa DEV: undo pluck_first changes to micro benchmark
and add pluck_first benchmark
2019-10-21 12:21:24 +01:00
Daniel Waterworth
55a1394342 DEV: pluck_first
Doing .pluck(:column).first is a very common pattern in Discourse and in
most cases, a limit clause isn't being added. Instead of adding a limit
clause to all these callsites, this commit adds two new methods to
ActiveRecord::Relation:

pluck_first, equivalent to limit(1).pluck(*columns).first

and pluck_first! which, like other finder methods, raises an exception
when no record is found
2019-10-21 12:08:20 +01:00
Régis Hanol
e1998ef244 FIX: downsize_uploads script
The script will now correct all width/height and thumbnail_width/thumbnail_height properties of all the uploaded images.

The script now uses width * height to filter out all unaffected images.

Also handled the case where a downsized image was already an uploaded record.
2019-10-10 16:37:55 +02:00
Régis Hanol
4fdad12998 FIX: downsize_uploads script to support external storage
Also ensured we update the sha1 property of the upload record to match the actual file.
2019-10-08 17:54:39 +02:00
jelle van der Waa
2d4c9bbaac import_scripts: add fluxbb prefix to missing query (#8163)
Signed-off-by: Jelle van der Waa <jelle@archlinux.org>
2019-10-08 11:46:00 +11:00
Sam Saffron
1d5c2b36f6 DEV: improve diagnostics on mem leak checker
This adds mwrap logging to each iteration so we can see how much
memory leaks per iteration and where it is coming from
2019-10-04 09:47:33 +10:00
Sam Saffron
038a38ae1c DEV: add debugging scripts for memory leaks
These scripts are somewhat rough but I needed them to help debug a memory
leak we have noticed in rails 6.

The biggest object script finds all the biggest objects we have in memory
after boot.

The test memory leak runs a very simple iteration through all multisites
and observes memory.
2019-10-03 16:36:31 +10:00
Krzysztof Kotlarek
35b1185a08 FIX: Revert Demon::DemonBase back to Demon::Base (#8132)
I introduced DemonBase because of a conflict between `demon/base.rb` and `jobs/base.rb`. However, the base class does not need to be renamed: a Zeitwerk custom inflector can apply a regex to the absolute path instead.
2019-10-02 14:54:08 +10:00
Krzysztof Kotlarek
427d54b2b0 DEV: Upgrading Discourse to Zeitwerk (#8098)
Zeitwerk simplifies working with dependencies in dev and makes it easier to reload class chains.

We no longer need to use Rails "require_dependency" anywhere and instead can just use standard 
Ruby patterns to require files.

This is a far reaching change and we expect some followups here.
2019-10-02 14:01:53 +10:00
Gerhard Schlager
b48ca9dee9 DEV: Simplify username validation in base importer
The `UsernameValidator` does already all the hard work. No need to do any additional checks in the import script. The checks were out-of-date anyway.
2019-10-01 20:33:09 +02:00
Gerhard Schlager
ed1e5ef6cc FIX: By default, don't abort Google Groups crawling on error 2019-09-18 18:14:04 +02:00
Gerhard Schlager
ab96239f2a FIX: Google Groups crawler failed to login
Trying to automate the login into a Google account is quite hard. This makes the crawler use the content of a cookies.txt file instead. It also removes a couple of deprecation warnings and adds some color to the output.
2019-09-18 13:09:20 +02:00
Sam Saffron
cd1ab206d9 DEV: add missing ultra low queue to mwrap sidekiq
note: mwrap is used for analysis of memory bloat and leaks of processes
2019-09-18 11:18:35 +10:00
Sam Saffron
015051ecaf PERF: avoid spinning a thread each time we close a connection
This is a temporary workaround for the issue in https://github.com/rails/rails/pull/36949

Discussing a proper fix in Rails with the Rails team.

Prior to this fix we were spinning up a thread every time we closed a connection
to the db.
2019-09-12 17:34:04 +10:00
Krzysztof Kotlarek
1d73754e84 FIX: Modify frozen String and profile_db_generator uses category id (#8080) 2019-09-09 17:38:37 +10:00
Dan Ungureanu
ab7038bfc2
DEV: User simulator tried to modify frozen string. 2019-08-16 17:32:17 +03:00
Gerhard Schlager
888b635cfc Import avatars and likes in the Zendesk API importer
Co-authored-by: Justin DiRose <justin@justindirose.com>
2019-08-14 10:42:52 +02:00
Gerhard Schlager
4ed517a344 DEV: Make Rubocop happy
Follow-up to 6cc9fe42
2019-08-12 23:10:58 +02:00
Mohamad Abras
6cc9fe42ce add mongo adapter to nodebb importer (#8000) 2019-08-12 14:15:11 -04:00
Régis Hanol
19dda59932 FIX: add back verbose option to DbHelper.remap 2019-07-31 17:30:08 +02:00
Rishabh
dcb47d902b
REFACTOR: Rename SiteSetting.disable_edit_notifications to disable_system_edit_notifications (#7958)
* REFACTOR: Rename SiteSetting.disable_edit_notifications to disable_system_edit_notifications

- The older name could cause some confusion because the setting does not disable all edit notifications, only system ones.

* FIX: Add frozen_string_literal: true in the migration

* DEV: Deprecate 'disable_edit_notifications'
2019-07-31 20:20:41 +05:30
Régis Hanol
89fce2ce71 DEV: remove duplicate Remap class and use DbHelper.remap instead
Follow-up to 9cd3f96dee
2019-07-29 18:43:40 +02:00
Gerhard Schlager
fd12c414e7 DEV: Refactor helper methods for upload markdown
Follow-up to a61ff167
2019-07-25 16:36:35 +02:00
Gerhard Schlager
a61ff16740 DEV: Make attachment markdown reusable 2019-07-25 14:04:18 +02:00
Joffrey JAFFEUX
cc46de8f46
s/discourse-staff-notes/discourse-user-notes (#7936) 2019-07-24 20:04:27 +02:00
Gerhard Schlager
f0fea5991f FIX: Latest Selenium gem broke Google Groups import script
Selenium uses Keep-Alive since version 3.141, so the net-http-persistent gem shouldn't be needed anymore.
2019-07-10 09:45:33 +02:00
Dan Ungureanu
ab6ad220c7
DEV: Fix user simulator script. 2019-07-09 18:52:08 +03:00
Arpit Jalan
6d30be1f94 Improve XenForo import script.
- ensure only active, unbanned users are imported.
- ensure only visible threads/posts are imported.
2019-06-18 15:52:34 +05:30
Arpit Jalan
77f5577e30 DEV: Improvements to AnswerHub import script. 2019-06-13 11:46:17 +05:30
Guo Xiang Tan
36c0cfa890 FIX: Use new attachment markdown format in ImportScripts::Uploader. 2019-06-11 14:49:28 +08:00
Blake Erickson
0955d9ece9 create answerhub importer (#7671) 2019-06-03 12:17:22 +10:00
Gerhard Schlager
0f3c3bc309 Make import scripts work with frozen strings 2019-05-30 22:22:24 +02:00
Gerhard Schlager
c70d0c6659 Use an invalid domain for fake email addresses in importers 2019-05-30 22:22:24 +02:00
Gerhard Schlager
d3ba338144 Make Telligent import script more generic 2019-05-30 22:22:24 +02:00
Joffrey JAFFEUX
630e9814bc
datetime is not available at this point (#7630) 2019-05-29 14:06:32 +02:00
Joffrey JAFFEUX
6439004161
DEV: do not use STDERR to print tests timestamps (#7629) 2019-05-29 13:28:02 +02:00
Joffrey JAFFEUX
6be9a6eb2e
DEV: adds time logging to docker_test script (#7627) 2019-05-29 12:06:43 +02:00
Sam Saffron
7429700389 FIX: ensure we can download maxmind without redis or db config
This also corrects FileHelper.download so it supports "follow_redirect"
correctly (it used to always follow 1 redirect) and adds a `validate_url`
param that will bypass all uri validation if set to false (default is true)
2019-05-28 10:28:57 +10:00
Sam Saffron
2bcc3ef46b correct type 2019-05-22 12:28:17 +10:00
Sam Saffron
12264747f7 DEV: script to analyze status of sidekiq queue
This returns a proper count of all queued jobs and finds potential dupes
2019-05-22 12:27:11 +10:00
Gerhard Schlager
b788948985 FEATURE: English locale with international date formats
Makes en_US the new default locale
2019-05-20 13:47:20 +02:00
Sam Saffron
678a9a61c4 DEV: lint importer
commit #f490ed3b introduced a few linting issues, resolved now
2019-05-17 16:37:08 +10:00
Edmond Lepedus
f490ed3bbc FEATURE: Add attachment support to xenforo importer (#7548)
* FEATURE: Add attachment support to XenForo importer

If `ATTACHMENT_DIR` is provided, importer will scan each imported post
for `[GALLERY]` and `[ATTACH]` tags, attempt to import the referenced files
as Discourse uploads and replace the tags with Discourse markup.

References to files which cannot be imported are stripped.

NOTE: This only imports attachments which are referenced in imported
posts. Any XenForo media or files which are not referenced in any post
using `[ATTACH]` or `[GALLERY]` tags will not be imported. The goal is to
ensure that we don't have posts with missing images and unsightly
markup, NOT to ensure that all attachments are migrated.

2019-05-17 16:18:28 +10:00
Sam Saffron
b3d42b3f18 DEV: remove unmaintained script
osxdev script has not been maintained for a while, keeping it around is
only causing confusion
2019-05-17 11:47:48 +10:00
Sam Saffron
30990006a9 DEV: enable frozen string literal on all files
This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.

Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
2019-05-13 09:31:32 +08:00
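The effect of frozen string literals on code that mutates its inputs can be sketched as follows. To keep the example independent of file placement, it uses an explicit `.freeze` rather than the `# frozen_string_literal: true` magic comment, which applies the same freezing to every literal in a file; `shout` is a hypothetical method.

```ruby
# Explicit .freeze stands in for what the magic comment does to all literals.
GREETING = "hello".freeze

def shout(text)
  # Copy before mutating so a frozen argument is never modified in place.
  result = text.dup
  result << "!"
  result
end

begin
  GREETING << "!" # in-place mutation of a frozen string
rescue FrozenError => e
  puts "mutation rejected: #{e.class}"
end

puts shout(GREETING)
puts GREETING
```

Consumers that need a mutable string take a copy, so the caller's string stays intact.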
Guo Xiang Tan
152238b4cf DEV: Prefer public_send over send. 2019-05-07 09:33:21 +08:00
Sam Saffron
9be70a22cd DEV: introduce new API to look up dynamic site setting
This removes all uses of both `send` and `public_send` from consumers of
SiteSetting and instead introduces a `get` helper for dynamic lookup

This leads to much cleaner and safer code long term as we are always explicit
to test that a site setting is really there before sending an arbitrary
string to the class

It also removes a couple of risky stubs from the auth provider test
2019-05-07 11:00:30 +10:00
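The idea of an explicit lookup helper can be sketched in plain Ruby. The `Settings` class and its `DEFAULTS` table below are hypothetical and only illustrate the pattern; they are not Discourse's `SiteSetting` implementation.

```ruby
# Hypothetical settings holder; not Discourse's SiteSetting.
class Settings
  DEFAULTS = { title: "My forum", max_tags: 5 }.freeze

  def self.get(name)
    key = name.to_sym
    # Check the name against known settings before using it, rather than
    # passing an arbitrary string to send or public_send.
    raise ArgumentError, "unknown setting: #{name}" unless DEFAULTS.key?(key)
    DEFAULTS[key]
  end
end

puts Settings.get("title")

begin
  Settings.get("instance_eval") # a method name, not a setting
rescue ArgumentError => e
  puts e.message
end
```

With `send`, the second call would have invoked a method on the class; with an explicit lookup it is rejected.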
Gerhard Schlager
74ca49d7cd FIX: Importing of polls from phpBB3 was broken
Follow-up to 24369a81
2019-05-06 12:37:19 +02:00
Robin Ward
3cb0d27d38 DEV: Upgrade our widget handlebars compiler
Now supports subexpressions such as i18n and concat, plus automatic
attaching of widgets similar to ember.
2019-05-02 15:47:57 -04:00
Guo Xiang Tan
24347ace10 FIX: Properly associate user_profiles background urls via upload id.
`Upload#url` can change from time to time. When it
does change, we don't want to have to look through multiple tables to
ensure that the URLs are all up to date. Instead, we simply associate
uploads properly to `UserProfile` so that it does not have to replicate
the URLs in the table.
2019-05-02 14:58:24 +08:00
Michael Brown
7b1783bae8 FIX: cache_critical_dns was never caching pg replica (#7461)
* it's DISCOURSE_DB_REPLICA_HOST not DISCOURSE_DB_BACKUP_HOST
2019-04-30 08:42:51 +08:00
MMX
5d4aa256be FIX: category logo upload error in Discuz importer.(#7453) 2019-04-29 17:01:15 +02:00
Michael K Johnson
9fc3de01bb FEATURE: Add import script for Friends+Me Google+ Exporter JSON archives (#7334)
This script has been used to import over 50,000 Google+ posts
and over 300,000 comments from 29 communities into a single
Discourse instance, as well as for at least three other
imports.  Google+ has closed for the public, but it is still
available at this time for GSuite customers. If GSuite customers
decide to migrate from Google+ to Discourse, or if Google
"sunsets" Google+ for GSuite customers, this importer may be
useful.
https://www.reddit.com/r/FMGE_Support/comments/b8sa5h/fmge_for_gsuite/

Development and use of this script has been discussed in detail:
https://meta.discourse.org/t/bounty-google-private-communities-export-screenscraper-importer/108029
2019-04-23 14:04:09 +10:00
Arpit Jalan
110512d4d0 Improvements to vBulletin bulk import script
- import attachments
- import avatars
- import user signatures
- create permalink file
- reconnect to MySQL db in case of failure
2019-04-11 12:35:19 +05:30
Arpit Jalan
a20f58554b IMPORT: create category definitions in import:ensure_consistency task 2019-04-11 12:06:37 +05:30
Robin Ward
b58867b6e9 FEATURE: New 'Reviewable' model to make reviewable items generic
Includes support for flags, reviewable users and queued posts, with REST API
backwards compatibility.

Co-Authored-By: romanrizzi <romanalejandro@gmail.com>
Co-Authored-By: jjaffeux <j.jaffeux@gmail.com>
2019-03-28 12:45:10 -04:00
Gerhard Schlager
453ba2da7b Make Google Groups scraper work with latest chromedriver 2019-03-25 16:11:22 +01:00
Joe
ec2123809f FEATURE: user and group cards on mobile (#7246) 2019-03-25 13:37:17 +01:00
Sam Saffron
3f35315391 DEV: add script to switch ruby version from inside container
This script can be used to flip Ruby to a patched Ruby version
or a different major version from inside the container

It is used to test and compare different Ruby versions
2019-03-25 17:41:24 +11:00
Gerhard Schlager
2349ba3bc4 Improve Google Groups scraper
* Better error detection during login phase
* Experimental support for 2FA and SMS codes
* Detect missing permissions to scrape email addresses
2019-03-24 23:15:13 +01:00
Penar Musaraj
0db2846a5b Add user bios to NodeBB importer 2019-03-20 16:40:26 -04:00
Penar Musaraj
b6a7b851c7 Nodebb importer: add permalinks, exclude disabled categories 2019-03-18 21:59:02 -04:00
Penar Musaraj
9334d2f4f7
FEATURE: add more granular user option levels for email notifications (#7143)
Migrates email user options to a new data structure, where `email_always`, `email_direct` and `email_private_messages` are replaced by

* `email_messages_level`, with options: `always`, `only_when_away` and `never` (defaults to `always`)
* `email_level`, with options: `always`, `only_when_away` and `never` (defaults to `only_when_away`)
2019-03-15 10:55:11 -04:00
Sam
819d4facda FIX: ruby bench script no longer working
The library used to generate random text changed, this caused the title
of the topic used for testing to change, which meant the slug changed, so
a hit to the topic was a redirect

This fix gives the topic used for performance testing a static name to avoid
this issue in future
2019-03-15 11:31:08 +11:00
Robin Ward
fa5a158683 REFACTOR: Move queue_jobs out of SiteSetting
It is not a setting, and only relevant in specs. The new API is:

```
Jobs.run_later!        # jobs will be thrown on the queue
Jobs.run_immediately!  # jobs will run right away, avoid the queue
```
2019-03-14 10:47:38 -04:00
Gerhard Schlager
78f8114989 FEATURE: Allow discourse script to skip disabling of emails after restore 2019-03-07 21:49:33 +01:00
David Taylor
fc7938f7e0
REFACTOR: Migrate GoogleOAuth2Authenticator to use ManagedAuthenticator (#7120)
https://meta.discourse.org/t/future-social-authentication-improvements/94691/3
2019-03-07 11:31:04 +00:00
Gerhard Schlager
941e096df4 Fix error in base import script
Follow-up to 655a08dbbd
2019-03-06 21:58:25 +01:00
maulkin
655a08dbbd FIX: Return actual errors if PostCreator fails (#7096) 2019-03-06 21:29:37 +01:00
Penar Musaraj
b1035cc691 FIX: NodeBB import details
- mark imported users as active

- do not strip @ from usernames in post content

- improve uploads path matching
2019-03-06 12:30:36 -05:00
Joffrey JAFFEUX
703c724cf3
REFACTOR: Migrate InstagramAuthenticator to use ManagedAuthenticator (#7081) 2019-03-04 14:54:28 +01:00
Bianca Nenciu
714f6cde79 FIX: Remove duplicate definition of create_categories. 2019-03-04 10:32:09 +02:00
Gerhard Schlager
c36c9c2ee5 FEATURE: Import script for AnswerBase
Improves the generic database used by some import scripts:
* Adds additional columns for users
* Adds support for attachments
* Allows setting the data type for keys (numeric or string) to ensure correct sorting
2019-02-28 22:08:12 +01:00
Gerhard Schlager
24369a8166 Improve phpBB3 importer
* Log errors when mapping of posts, messages, etc. fails
* Allow permalink normalizations for old subfolder installation
* Disable importing of polls for now. It's broken.
2019-02-17 23:20:20 +01:00
Gerhard Schlager
8d5dfe1e01 FIX: Don't import parts of the email address as name 2019-02-17 22:59:18 +01:00
Penar Musaraj
c50db76f5d FIX: do not treat TIFF, BMP, WEBP as images
Treating TIFF and BMP as images causes us to add them to IMG tags, and support for these formats is very inconsistent across browsers.

You can still upload these files; they will simply not be displayed in IMG tags.
2019-02-11 16:28:43 +11:00
Jeff Atwood
444bc466b0 for docs, normalize on space after code fence when specifying lang 2019-01-21 01:19:28 -08:00
Régis Hanol
1e67bcb456
PERF: bulk feature topic users & reset topic counters after an import 2019-01-17 21:48:23 +01:00
Régis Hanol
788719d271 DEV: speed up posts base imports 2019-01-04 15:30:17 +01:00
Arpit Jalan
71a5369fef FIX: do not convert quote tags to markdown 2018-12-11 20:09:46 +05:30
Arpit Jalan
735a48415d FEATURE: option to use ruby-bbcode-to-md in bulk import script
ruby-bbcode-to-md provides better bbcode to markdown conversion
2018-12-10 10:28:07 +05:30
Arpit Jalan
0365d50797 Improve vBulletin bulk import script to support table prefix.
Improve base bulk import script to convert list tags to ul/li.
2018-12-10 10:10:44 +05:30
David Taylor
160d29b18a
REFACTOR: Migrate TwitterAuthenticator to use ManagedAuthenticator (#6739)
No changes to functionality. TwitterAuthenticator goes from 136 lines to 24, and all twitter-specific logic elsewhere has been deleted 🎉
2018-12-07 15:39:06 +00:00
Régis Hanol
3c9c95ac83 Update Rubocop to 0.60 2018-12-04 10:48:16 +01:00
David Taylor
9248ad1905 DEV: Enable Style/SingleLineMethods and Style/Semicolon in Rubocop (#6717) 2018-12-04 11:48:13 +08:00
David Taylor
208005f9c9 REFACTOR: Migrate FacebookAuthenticator to use ManagedAuthenticator
Changes to functionality
  - Removed syncing of user metadata including gender, location etc.
    These are no longer available to standard Facebook applications.
  - Removed the remote 'revoke' functionality. No other providers have
    it, and it does not appear to be standard practice in other apps.
  - The 'facebook_no_email' event is no longer logged. The system can
    cope fine with a missing email address.

Data is migrated to the new user_associated_accounts table.
facebook_user_infos can be dropped once we are confident the data has
been migrated successfully.
2018-11-30 11:18:11 +00:00
Sam
6acabec423 FIX: script was missing newlines when generating hosts 2018-11-28 15:18:08 +11:00
Sam
6d9d904df5 add missing newline to end of file 2018-11-23 15:43:27 +11:00
Sam
d7b0f0069c no need to double strip this line 2018-11-23 14:48:02 +11:00
Sam
4c6eeaac15 Followup on 0739c3b1d1
This corrects some minor style issues
2018-11-23 14:43:52 +11:00
Sam
0739c3b1d1 DEV: this introduces a script capable of caching critical DNS locally
This is useful for cases where you want to add resiliency to DNS lookups
for redis and postgres, so they will continue to work even if there is
a DNS outage
2018-11-22 18:46:59 +11:00
Régis Hanol
a0f0bac752
Add a comment to run the 'import:ensure_consistency' rake task after a bulk import 2018-11-21 16:28:35 +01:00
Guo Xiang Tan
5076487eaf Update discuz_x import script to not use Category#logo_url. 2018-11-09 14:15:31 +08:00
Gerhard Schlager
77fedaba88 DEV: Add script for pushing translations to Transifex 2018-11-08 23:31:05 +00:00
Gerhard Schlager
d6f89a85ef Make Rubocop happy 2018-10-31 01:30:14 +01:00
Gerhard Schlager
65db9326b4 FEATURE: Add download script for Google Groups 2018-10-31 01:12:05 +01:00
Gerhard Schlager
efa265cbc8 Rename mbox import script 2018-10-31 01:12:05 +01:00
Gerhard Schlager
edbc004a9a Remove old mbox import script 2018-10-31 01:12:05 +01:00
Régis Hanol
c39a1022cc PERF: user imports would slow down the more users were imported 2018-10-22 11:14:13 +02:00
Régis Hanol
afa22a0c6f REFACTOR: move 'fake_email' to base importer 2018-10-22 11:12:40 +02:00
Régis Hanol
8b20e2500a Remove unnecessary line 2018-10-19 15:48:48 +02:00
Régis Hanol
637123ff6f Merge users based on their email in vBulletin importer 2018-10-19 15:16:45 +02:00
Régis Hanol
53aa0344bf FIX: properly import vBulletin's hashed password 2018-10-18 10:22:55 +02:00
Régis Hanol
5f2fb0fe33 Show original options when an error happens while importing a user 2018-10-18 10:21:12 +02:00
Gerhard Schlager
cc27d61f9e FIX: discourse script didn't allow backups with paths anymore
This restores the previous functionality. The script now allows the following options:

* `discourse backup` (uses the system generated filename)
* `discourse backup <some_filename>` (uses the provided filename)
* `discourse backup </some/path/to/filename>` (moves the backup to the provided path with the given filename)

Remote backup stores do not support the last option.
Some file extensions (like `.tar.gz`) are automatically removed from the provided filename.
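
The three argument forms above could be told apart roughly as follows. This is a sketch under stated assumptions, not the code from the `discourse` script: the helper name `parse_backup_argument` and the exact list of stripped extensions are hypothetical (the commit only names `.tar.gz` as an example).

```ruby
# Assumed list; only ".tar.gz" is mentioned in the commit message.
STRIPPED_EXTENSIONS = [".tar.gz", ".tgz", ".gz"].freeze

# Splits the optional CLI argument into a filename and a target directory.
# Both values are nil when no argument is given (system generated filename).
def parse_backup_argument(arg)
  return { filename: nil, directory: nil } if arg.nil? || arg.empty?

  # A path separator means the backup should be moved to that directory.
  directory = arg.include?("/") ? File.dirname(arg) : nil

  filename = File.basename(arg)
  extension = STRIPPED_EXTENSIONS.find { |ext| filename.end_with?(ext) }
  filename = filename.delete_suffix(extension) if extension

  { filename: filename, directory: directory }
end
```

A caller would reject a non-nil `directory` when a remote backup store is configured, since the commit states that remote stores do not support the path form.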
2018-10-17 18:33:44 +02:00
Gerhard Schlager
341836eb42 Fix the rake task and importer instead 2018-10-17 16:48:09 +02:00
Gerhard Schlager
ee18d9ace0 FIX: mbox importer and rake task were broken 2018-10-17 16:34:18 +02:00
Guo Xiang Tan
84d4c81a26 FEATURE: Support backup uploads/downloads directly to/from S3.
This reverts commit 3c59106bac.
2018-10-15 09:43:31 +08:00
Neil Lalonde
a68032835a FEATURE: XenForo importer can import categories from the xf_node table and convert sub-categories beyond second level to tags 2018-10-11 12:04:15 -04:00
Guo Xiang Tan
3c59106bac Revert "FEATURE: Support backup uploads/downloads directly to/from S3."
This reverts commit c29a4dddc1.

We're doing a beta bump soon so un-revert this after that is done.
2018-10-11 11:08:23 +08:00
Gerhard Schlager
c29a4dddc1 FEATURE: Support backup uploads/downloads directly to/from S3. 2018-10-11 10:38:43 +08:00
David Taylor
9bf522f227 FEATURE: Mixed case tagging (#6454)
- By default, behaviour is not changed: tags are made lowercase upon creation and edit.

- If force_lowercase_tags is disabled, then mixed case tags are allowed.

- Tags must remain case-insensitively unique. This is enforced by ActiveRecord and Postgres.

- A migration is added to provide a `UNIQUE` index on `lower(name)`. Migration includes a safety to correct any current tags that do not meet the criteria.

- A `where_name` scope is added to `models/tag.rb`, to allow easy case-insensitive lookups. This is used instead of `Tag.where(name: "blah")`.

- URLs remain lowercase. Mixed case URLs are functional, but have the lowercase equivalent as the canonical.
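
The rules listed above can be illustrated with a small in-memory model. This is only a sketch: the real change relies on ActiveRecord validations and a Postgres `UNIQUE` index on `lower(name)`, whereas the `TagRegistry` class below is a hypothetical stand-in that borrows the `force_lowercase_tags` and `where_name` names from the commit message.

```ruby
class TagRegistry
  def initialize(force_lowercase_tags: true)
    @force_lowercase_tags = force_lowercase_tags
    @tags_by_lowered_name = {}
  end

  # Stores a tag and returns the name as stored.
  # Uniqueness is always checked on the lowercased name.
  def create(name)
    stored = @force_lowercase_tags ? name.downcase : name
    key = stored.downcase
    raise ArgumentError, "tag already exists: #{name}" if @tags_by_lowered_name.key?(key)

    @tags_by_lowered_name[key] = stored
  end

  # Case-insensitive lookup; returns the stored name or nil.
  def where_name(name)
    @tags_by_lowered_name[name.downcase]
  end
end
```

With `force_lowercase_tags: false`, creating "Ruby" keeps its case, a later "ruby" is rejected as a duplicate, and `where_name("RUBY")` still finds the original.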
2018-10-05 10:23:52 +01:00
Penar Musaraj
9e008047db reset before running docker tests 2018-10-03 10:32:16 -04:00
Neil Lalonde
8af6d81891 FIX: improved category merging in discourse_merger. Use full paths to look for uniqueness instead of category names. 2018-09-20 12:33:58 -04:00
Neil Lalonde
b9891c2641 FIX: error because last_id is nil in discourse_merger script 2018-09-17 09:57:11 -04:00
David Taylor
26bd67a865 DEV: Add travis_fold statements to docker_test 2018-09-12 17:52:58 +01:00
Guo Xiang Tan
71185c13b5 Merge pull request #6377 from tgxworld/remove_tif_tiff
Drop `tif`, `tiff`, `webp` and `bmp` from supported images.
2018-09-12 09:32:32 +08:00
Guo Xiang Tan
e1b16e445e Rename FileHelper.is_image? -> FileHelper.is_supported_image?. 2018-09-12 09:22:28 +08:00
Carsten Brandt
921e2213b8 FEATURE: Updated IPB import script
* IPB import script replaces PHP code tags with proper markdown
* remove excess newlines in code blocks
* decode HTML entities in code blocks
* add replacement for list items
* proper handling of attachments that are not images
* fix typo
* improved quote handling
* fix code style complaint from travis-ci build
2018-09-12 11:12:28 +10:00
Neil Lalonde
4653627a40 update plugin-translations.rb script to update .tx/config file in plugins when languages are added or removed 2018-09-10 14:22:45 -04:00
Guo Xiang Tan
434035f167 FIX: Link post to uploads in PostCreator.
* This ensures that uploads are linked to their post on creation
  instead of a background job which may be delayed if Sidekiq
  is facing difficulties.
2018-09-06 11:18:11 +08:00
Gerhard Schlager
44922b0c25 zh_TW isn't broken anymore 2018-09-05 00:47:39 +02:00
Guo Xiang Tan
8dc1463ab3 Enable Lint/ShadowingOuterLocalVariable for Rubocop. 2018-09-04 10:16:42 +08:00
Neil Lalonde
15f657309a FEATURE: Zendesk importer that uses its API to get data 2018-08-28 10:21:39 -04:00
Neil Lalonde
30722240e4 add discourse-checklist to plugin-translations.rb 2018-08-23 10:00:27 -04:00
Gerhard Schlager
ac743dab10 Improve mbox import script
* emails weren't sorted in correct order
* better default regex for splitting mbox files
* output Message-ID if email is skipped because it doesn't have a Date
2018-08-23 09:46:28 +02:00
Neil Lalonde
3fddbb603c omit zh_TW which breaks the build 2018-08-21 11:17:42 -04:00
Neil Lalonde
0ada6b81c2 DEV: add a way to skip locales with problems that break Discourse and need to be fixed in Transifex 2018-08-21 10:36:48 -04:00
Arpit Jalan
7af0da9498 Fix Vanilla bulk import script 2018-08-16 22:12:26 +05:30
Arpit Jalan
0e04e3990e Improve Vanilla bulk import script 2018-08-16 22:00:26 +05:30
Neil Lalonde
ac3b0f0164 REFACTOR: move remap out of script into a class 2018-08-15 12:37:52 -04:00
Gerhard Schlager
7f4ef3db9e Improve Telligent importer
* Try multiple filenames and do lots of guessing when searching for attachments
* Unescape HTML in filenames and replace invalid characters in filenames
* Existing permalinks prevented resuming of import
* Prevent duplicate attachments in same post
2018-08-13 15:28:11 +02:00
Gerhard Schlager
8513605421 Fix the import of avatars and attachments
This time for real ;-)
2018-08-12 22:26:07 +02:00
Gerhard Schlager
6d813c2b52 FIX: Importers failed to import avatars 2018-08-12 22:02:17 +02:00
Gerhard Schlager
1794aea939 FEATURE: Add import script for Telligent 2018-08-12 22:01:23 +02:00
Neil Lalonde
f7f24a5399 FIX: discourse_merger: skip collisions on join models when both objects were merged 2018-08-02 16:05:55 -04:00
Mohammad AlTawil
64f533db99 Add display name to user (#6198) 2018-07-31 14:43:16 +10:00
Sam
e4208113a8 improve report and add regular logging 2018-07-27 16:22:14 +10:00
Sam
5e262265a2 update script to provide more mem stats 2018-07-27 12:51:23 +10:00
Godfrey Chan
5affdcbd59 Bump Ruby version in some docs 2018-07-25 14:38:10 -07:00
Vinoth Kannan
1390eb2957 Disable bootstrap mode before starting the import 2018-07-25 12:12:26 +05:30
Sam
f0a23d50b4 DEV: add script for testing memory usage in sidekiq 2018-07-24 17:57:02 +10:00
Neil Lalonde
bf7ebecb76 FIX: discourse_merger: many foreign keys were not being updated 2018-07-22 22:05:07 -04:00
Neil Lalonde
4e09206061 FIX: set uploads sequence after copying uploads in discourse_merger 2018-07-19 11:07:15 -04:00
Régis Hanol
e8e9b5cea4 FIX: clean URLs in SMF1 importer 2018-07-19 13:17:43 +02:00
Régis Hanol
63e5349209 FIX: [img] BBCode tags might have parameters 2018-07-19 13:11:01 +02:00
Régis Hanol
5434cf02a3 FIX: smf1 importer was swallowing some data 2018-07-19 10:29:54 +02:00
Neil Lalonde
def2653fc8 FIX: discourse_merger: copied topic_link records had wrong url, and update all internal links to use new topic URLs in copied posts 2018-07-18 16:45:48 -04:00
Neil Lalonde
24da2940a7 FIX: copy uploads quickly in discourse_merger.rb, and fix user avatar upload id for copied users 2018-07-18 16:42:59 -04:00
Neil Lalonde
dbfa491ee2 FIX: avatars in discourse_merger.rb 2018-07-17 21:40:24 -04:00
Neil Lalonde
f146f94ef6 FIX: errors when copying post_uploads in discourse_merger.rb 2018-07-17 16:47:23 -04:00
Neil Lalonde
04077a7df6 WIP: a fast method of copying uploads in discourse_merger.rb. not working yet. 2018-07-17 16:46:32 -04:00
Neil Lalonde
2786c79354 another check to avoid unique index error in discourse_merger.rb 2018-07-16 13:34:41 -04:00
Neil Lalonde
8d11df6971 FIX: support amazon S3 upload urls in discourse_merger.rb 2018-07-13 16:10:31 -04:00
Neil Lalonde
71814009bd FIX: badges for merged users don't get merged by discourse_merger.rb 2018-07-12 17:43:21 -04:00
Neil Lalonde
cba292cb56 FIX: personal messages not being copied by discourse_merger.rb 2018-07-12 17:41:16 -04:00
Régis Hanol
c818550172 Support custom avatar in SMF1 importer 2018-07-12 17:38:07 +02:00
Régis Hanol
5c4534d895 Update SMF1 import
- Properly import avatar when they use an external image
- Don't import the same attachment twice
2018-07-12 16:55:30 +02:00
Neil Lalonde
c33ee13c4c FIX: discourse_merger halts when topic has nil category 2018-06-29 12:21:25 -04:00
Sam
f4f95ce956 correct linting 2018-06-29 16:04:38 +10:00
David Lee
8f43872bff Add Question2Answer import script 2018-06-29 15:48:01 +10:00
Arpit Jalan
c73f98c289 FIX: invert from and to user id in smf1 import script 2018-06-28 12:30:28 +05:30
Gerhard Schlager
fb022098f6 Base importer: Calculate category colors depending on parent category 2018-06-27 20:27:11 +02:00
Vinoth Kannan
652b32484f Assign default value for message template matches 2018-06-26 05:16:03 +05:30
Vinoth Kannan
f3011c709b Extract html content from lithium message template 2018-06-26 05:07:32 +05:30
Neil Lalonde
a1c0d0e6e5 fixes to discourse_merger: failures for Uploads, UserBadges, PostUploads hack 2018-06-21 12:16:05 -04:00
Neil Lalonde
b9cb97df7f add support for badges in discourse_merger 2018-06-19 15:11:48 -04:00
Neil Lalonde
dbcbd8d939 close connections in discourse_merger 2018-06-19 10:34:05 -04:00
Sam
5f64fd0a21 DEV: remove exec_sql and replace with mini_sql
Introduce new patterns for direct sql that are safe and fast.

MiniSql is not prone to memory bloat that can happen with direct PG usage.
It also has an extremely fast materializer and a very convenient API.

- DB.exec(sql, *params) => runs sql returns row count
- DB.query(sql, *params) => runs sql returns usable objects (not a hash)
- DB.query_hash(sql, *params) => runs sql returns an array of hashes
- DB.query_single(sql, *params) => runs sql and returns a flat one dimensional array
- DB.build(sql) => returns a sql builder

See more at: https://github.com/discourse/mini_sql
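
To make the difference between the return shapes listed above concrete, here is a stand-in that mimics them for a fixed result set. It is not mini_sql and runs no SQL; the `ResultShapes` class and its sample rows are hypothetical, and only the method names mirror the `DB.query`, `DB.query_hash` and `DB.query_single` calls from the commit message.

```ruby
# Illustrative stand-in: columns and rows are supplied directly instead of
# coming from a database.
class ResultShapes
  def initialize(columns, rows)
    @columns = columns
    @rows = rows
  end

  # Like DB.query_hash: an array of hashes keyed by column name.
  def query_hash
    @rows.map { |row| @columns.zip(row).to_h }
  end

  # Like DB.query_single: one flat, one-dimensional array of values.
  def query_single
    @rows.flatten
  end

  # Like DB.query: objects with one reader per column (not hashes).
  def query
    row_class = Struct.new(*@columns.map(&:to_sym))
    @rows.map { |row| row_class.new(*row) }
  end
end

shapes = ResultShapes.new(%w[id username], [[1, "sam"], [2, "neil"]])
first_user = shapes.query.first # responds to .id and .username
```

The object form is handy when rows are passed around in application code, while the flat form suits single-column queries such as a list of ids.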
2018-06-19 16:13:36 +10:00
OsamaSayegh
91b73e0c2d FIX: remap shouldn't fail silently when an error occurs 2018-06-19 14:49:43 +10:00
Vinoth Kannan
4ffa4a28b0 FIX: duplicate_emails get overridden in new batch of import 2018-06-19 00:21:48 +05:30
Vinoth Kannan
750367007c REFACTOR: Import user visits from lithium database 2018-06-18 20:38:57 +05:30
Vinoth Kannan
ef4a86456b Add attachment folder name in prefix for lithium import 2018-06-18 18:29:14 +05:30
Gerhard Schlager
3f167ae5ce Use short upload URL in import scripts 2018-06-17 22:57:32 +02:00
Gerhard Schlager
88ca838e02 Create avatar from file in base importer 2018-06-17 22:57:31 +02:00
Gerhard Schlager
84d9b2e473 Use correct post id in zendesk importer 2018-06-17 22:57:31 +02:00
Vinoth Kannan
2a0f409b9d Use lowercased email addresses to check duplicates 2018-06-16 20:34:37 +05:30
Vinoth Kannan
ac44374a59 Import user visits from user_log table 2018-06-16 19:10:55 +05:30
Neil Lalonde
20ceadffaf FEATURE: script to merge two discourse sites 2018-06-15 17:13:36 -04:00
Sam
c56bd2ac16 add memory analysis script 2018-06-14 12:18:36 +10:00
discoursehosting
fc973f9363 Improve the VBulletin importer (#5922) 2018-06-12 20:41:21 +02:00
Neil Lalonde
1ba8e8948d FIX: add support for string avatar_type values in PHPBB3 importer 2018-06-07 18:14:11 -04:00
Arpit Jalan
b4e0cddcc9 disable all outgoing emails in base importer 2018-06-07 22:49:38 +05:30
Vinoth Kannan
620a1524cb Use plus addressing email address for duplicates 2018-06-07 19:11:55 +05:30
Arpit Jalan
f9ab3848ed FEATURE: support disabling emails for non-staff users 2018-06-07 18:31:08 +05:30
Guo Xiang Tan
ad5082d969 Make rubocop happy again. 2018-06-07 13:28:18 +08:00
Régis Hanol
127398c68e FIX: import comments of 1st post in SE importer 2018-06-05 18:22:42 +02:00
Régis Hanol
685083491e FEATURE: StackOverflow importer 2018-06-04 16:57:12 +02:00
Régis Hanol
6862194255 extract configuration variables from SMF1 importer 2018-05-30 15:53:57 +02:00
Gerhard Schlager
bf30f74f60 Pulling translations for a new language didn't work 2018-05-29 20:57:32 +02:00
Gerhard Schlager
bdeae17d32 Automatically create locale.js.erb file when adding new locale 2018-05-29 12:58:31 +02:00
Régis Hanol
aeb511e8ff FEATURE: SMF1 importer 2018-05-28 11:02:19 +02:00
Gerhard Schlager
2f0e230dba Adds import script for Zendesk
It also adds a generic SQLite database that can be used when the data needs some transformation before the actual import.
2018-05-22 21:55:54 +02:00
Gerhard Schlager
eceeef8413 Imported categories use colors from settings instead of brown 2018-05-22 21:55:54 +02:00
Vinoth Kannan
bb12fa3fdc Migrate user mentions in lithium import 2018-05-21 18:19:22 +05:30
Vinoth Kannan
b229c112f6 FIX: variable name typo 2018-05-21 13:47:30 +05:30
Vinoth Kannan
09151190f9 FIX: Use avatar_dir to import user avatars 2018-05-21 13:43:23 +05:30
Vinoth Kannan
c9c3a83261 Importing lithium post images and attachments 2018-05-21 13:34:52 +05:30
Vinoth Kannan
f3385a74cb Importing lithium topic tags 2018-05-19 11:24:48 +05:30
Vinoth Kannan
ba0dd5889d Improvements in importing the lithium pms 2018-05-18 22:57:15 +05:30
Vinoth Kannan
9f92fdded0 Improvements in lithium topic and post import 2018-05-18 18:53:18 +05:30
Vinoth Kannan
9d4d6276b7 Import user profile fields and avatars 2018-05-18 17:11:20 +05:30