Commit Graph

624 Commits

Author SHA1 Message Date
David Taylor
436b3b392b
DEV: Apply syntax_tree formatting to script/* 2023-01-09 11:13:22 +00:00
David Taylor
d5491b13f5
DEV: Fix syntax/formatting in xenforo import script (#19761)
Followup to 7dfe85fc
2023-01-05 12:47:05 +00:00
GeckoLinux
cc5b4cd49a
FIX: change drupal permalink creation to use /node/
Drupal URL scheme for nodes begins with `/node/`, not `/topic/`.
2022-12-02 16:03:00 +11:00
Alan Guo Xiang Tan
7c321d3aad
PERF: Update Group#user_count counter cache outside DB transaction (#19256)
While load testing our user creation code path in production, we
identified that executing the DB statement to update the `Group#user_count` column within a
transaction is creating a bottleneck for us. This is because the
creation of a user and addition of the user to the relevant groups are
done in a transaction. When we execute the DB statement to update
`Group#user_count` for the relevant group, a row level lock is held
until the transaction completes. This row level lock acts like a global
lock when the server is creating users that will be added to the same
group in quick succession.

Instead of updating the counter cache within a transaction which the
default ActiveRecord `counter_cache` option does, we simply update the
counter cache outside of the committing transaction.

Co-authored-by: Rafael dos Santos Silva <xfalcox@gmail.com>
2022-11-30 11:52:08 -03:00
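The idea in commit 7c321d3aad, deferring the counter update until after the transaction commits, can be modeled with a toy sketch. This is plain Ruby with no ActiveRecord or database; `ToyTransaction` and the logged SQL strings are hypothetical names used only to show the ordering, not the code from the commit.

```ruby
# Toy model only: no ActiveRecord or database is involved.
class ToyTransaction
  def initialize(log)
    @log = log
    @after_commit = []
  end

  # Queue work to run once the transaction has committed.
  def after_commit(&block)
    @after_commit << block
  end

  def run_after_commit
    @after_commit.each(&:call)
  end

  def self.run(log)
    txn = new(log)
    log << "BEGIN"
    yield txn
    log << "COMMIT" # row locks taken inside the transaction end here
    txn.run_after_commit
  end
end

log = []
ToyTransaction.run(log) do |txn|
  log << "INSERT user"
  log << "INSERT group_user"
  # Deferred: the counter update no longer holds a lock on the group row
  # while the rest of the transaction is still running.
  txn.after_commit { log << "UPDATE groups SET user_count = user_count + 1" }
end

puts log
```

In this model the `UPDATE` is logged after `COMMIT`, so concurrent user creations would not queue behind one another on the same group row for the full length of the transaction.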
Leonardo Mosquera
bfecbde837
Fixes for vBulletin bulk importer (#17618)
* Allow taking table prefix from env var

* FIX: remove unused column references

The columns `filedata` and `extension` are not present in a v4.2.4
database, and they aren't used in the method anyways.

* FIX: report progress for tables without imported_id

* FIX: effectively check for AR validation errors

NOTE: other migration scripts also have this problem; see /t/58202

* FIX: properly count Posts when importing attachments

* FIX: improve logging

* Remove leftover comment

* FIX: show progress when exporting Permalink file

* PERF: stream Permalink file

The current way results in tons of memory usage; write once per line instead

* Document fixes needed

* WIP - deduplicate category names

* Ignore non alphanumeric chars for grouping

* FIX: properly deduplicate user emails by merging accounts

* FIX: don't merge empty UserEmails

* Improve logging

* Merge users AFTER fixing primary key sequences

* Parallelize user merging

* Save duplicated users structure for debugging purposes

* Add progress logging for the (multiple hour) user merging step
2022-11-28 16:30:19 -03:00
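The "PERF: stream Permalink file" item in commit bfecbde837 describes writing one line at a time instead of building the whole file in memory. A minimal sketch of that pattern follows; the row data and the CSV-like line format are assumptions for illustration, not the importer's actual output.

```ruby
require "tempfile"

# Hypothetical stand-in for rows read lazily from the source database.
rows = (1..5).lazy.map { |id| ["/showthread.php?t=#{id}", id] }

out = Tempfile.new(["permalinks", ".csv"])

# Write each row as soon as it is produced, so memory use stays flat
# no matter how many rows there are.
File.open(out.path, "w") do |f|
  rows.each { |url, topic_id| f.puts("#{url},#{topic_id}") }
end

lines = File.readlines(out.path, chomp: true)
puts lines.first
```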
communiteq
7dfe85fcc7
DEV: Xenforo importer improvements (#18457)
* Fix: make expressions non-greedy
* Feature: import Xenforo avatars
* Feature: import Xenforo likes
* Feature: import Xenforo private messages
* Feature: Xenforo create permalinks
* Feature: Xenforo migrate view counts
* Fix: Xenforo list regexes
* Fix: Xenforo import all attachments
2022-11-28 16:42:39 +01:00
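The "make expressions non-greedy" fix in commit 7dfe85fcc7 addresses a common regex pitfall. The sketch below uses a made-up `[b]` BBCode pattern, not one of the importer's actual expressions, to show the difference.

```ruby
text = "[b]one[/b] and [b]two[/b]"

# Greedy: .* runs to the last closing tag and swallows everything between.
greedy = text.scan(/\[b\](.*)\[\/b\]/).flatten

# Non-greedy: .*? stops at the first closing tag, giving one match per pair.
non_greedy = text.scan(/\[b\](.*?)\[\/b\]/).flatten

p greedy
p non_greedy
```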
Pierre Ozoux
9e9235ca62
FEATURE: Add import script for Elgg (#19140) 2022-11-28 16:28:08 +01:00
Constanza
067c4deb4c
Fix comment to include phpbb 3.3, which is now supported (#18006) 2022-08-19 16:42:32 -04:00
Constanza
ef842a4b29
FEATURE: Adding a simple CSV importer (#17993) 2022-08-19 13:09:30 -04:00
Constanza
8836c8bcdf
FIX: the phpBB import script was not parsing youtube tags (#17787) 2022-08-05 15:20:32 -04:00
communiteq
603f36ca4a
DEV: Support phpBB 3.3 imports (#17641)
* handle polls with duplicate items
* handle polls with incorrect poll_option_total values
* handle group IDs in personal messages
* support for version 3.3
2022-07-25 22:07:03 +02:00
Jay Pfaffman
7ab5dcf82f
FEATURE: my_bb import supports avatars (#17617) 2022-07-25 15:22:25 +02:00
Constanza
b9ac8e5748
Adding 3.2 to the versions of phpbb supported by the migration script (#17483) 2022-07-14 18:06:47 +05:30
Martin Brennan
fcc2e7ebbf
FEATURE: Promote polymorphic bookmarks to default and migrate (#16729)
This commit migrates all bookmarks to be polymorphic (using the
bookmarkable_id and bookmarkable_type columns). It also deletes
all the old code guarded behind the use_polymorphic_bookmarks setting
and changes that setting to true for all sites and by default for
the sake of plugins.

No data is deleted in the migrations, the old post_id and for_topic
columns for bookmarks will be dropped later on.
2022-05-23 10:07:15 +10:00
Martin Brennan
222c8d9b6a
FEATURE: Polymorphic bookmarks pt. 3 (reminders, imports, exports, refactors) (#16591)
A bit of a mixed bag, this addresses several edge areas of bookmarks and makes them compatible with polymorphic bookmarks (hidden behind the `use_polymorphic_bookmarks` site setting). The main ones are:

* ExportUserArchive compatibility
* SyncTopicUserBookmarked job compatibility
* Sending different notifications for the bookmark reminders based on the bookmarkable type
* Import scripts compatibility
* BookmarkReminderNotificationHandler compatibility

This PR also refactors the `register_bookmarkable` API so it accepts a class descended from a `BaseBookmarkable` class instead. This was done because we kept having to add more and more lambdas/properties inline and it was very messy, so a factory pattern is cleaner. The classes can be tested independently as well.

Some later PRs will address some other areas like the discourse narrative bot, advanced search, reports, and the .ics endpoint for bookmarks.
2022-05-09 09:37:23 +10:00
Leonardo Mosquera
3e5faffb0d
DEV: mbox importer improvements (#16557)
* FIX: support specifying parent_category_id in mbox import metadata
* FIX: elide tabs from topic titles
* FIX: optionally fix Mailman from: addresses
* DEV: optionally elide anything up to the last = in email addresses
* Fix Mailman broken from: detection
2022-04-29 13:24:29 -03:00
Jarek Radosz
2fc70c5572
DEV: Correctly tag heredocs (#16061)
This allows text editors to use correct syntax coloring for the heredoc sections.

Heredoc tag names we use:

languages: SQL, JS, RUBY, LUA, HTML, CSS, SCSS, SH, HBS, XML, YAML/YML, MF, ICS
other: MD, TEXT/TXT, RAW, EMAIL
2022-02-28 20:50:55 +01:00
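As an illustration of the tagging convention in commit 2fc70c5572, here is a heredoc tagged `SQL`. The query and table name are invented for the example.

```ruby
table = "users"

# The SQL tag lets editors highlight the body as SQL;
# <<~ strips the common leading indentation.
query = <<~SQL
  SELECT id, username
  FROM #{table}
  WHERE active = true
SQL

puts query
```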
Jarek Radosz
6f6406ea03
DEV: Fix random typos (#16066) 2022-02-28 10:20:58 +08:00
Michael Brown
3bf3b9a4a5
DEV: pull email address validation out to a new EmailAddressValidator
We validate the *format* of email addresses in many places with a match against
a regex, often with very slightly different syntax.

Adding a separate EmailAddressValidator simplifies the code in a few spots and
feels cleaner.

Deprecated the old location in case someone is using it in a plugin.

No functionality change is in this commit.

Note: the regex used at the moment does not support using address literals, e.g.:
* localpart@[192.168.0.1]
* localpart@[2001:db8::1]
2022-02-17 21:49:22 -05:00
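The note in commit 3bf3b9a4a5 about address literals can be seen with a format-only check. The sketch below uses Ruby's standard-library `URI::MailTo::EMAIL_REGEXP` as a stand-in; it is not the regex Discourse uses, and `ToyEmailFormat` is a hypothetical name. It shows the same kind of limitation: an ordinary address passes, while bracketed IP literals do not.

```ruby
require "uri"

# Hypothetical helper: one place for the format check instead of
# scattering slightly different regexes around the code.
module ToyEmailFormat
  def self.valid?(address)
    URI::MailTo::EMAIL_REGEXP.match?(address.to_s)
  end
end

puts ToyEmailFormat.valid?("user@example.com")
puts ToyEmailFormat.valid?("localpart@[192.168.0.1]")
puts ToyEmailFormat.valid?("localpart@[2001:db8::1]")
```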
Gerhard Schlager
6394d7cddf
DEV: Improve phpBB3 import script (#15956)
* Optional import of custom user fields from phpBB 3.1+
* Optional import of likes from phpBB3
  Requires the phpBB "Thanks for posts" extension
* Fix import of bookmarks from phpBB3
* Update `created_at` of existing user
* Support mapping of phpBB forums to existing Discourse categories
  This is in addition to the ability of merging phpBB forums and importing into newly created Discourse categories.
2022-02-16 13:04:31 +01:00
Gerhard Schlager
33d6ed60a4
DEV: Don't import year of birth (#15937)
The cakeday plugin doesn't use the year.
2022-02-14 18:10:35 +01:00
Gerhard Schlager
6a41ec179c
FIX: Default settings for phpBB3 import were broken (#15913) 2022-02-11 18:18:54 +01:00
Canapin
ea2fd75d10
DEV: Fix some regexes in phpBB3 import script (#15829)
1. bbcode hashes don't always have exactly 8 characters.

2. colors aren't always hex values; they can be color names ("red", "blue", etc).

3. The closing tag of smileys doesn't always include a `:` character (the start of the regex was already right for this particular issue)
2022-02-07 16:16:46 +01:00
Peter Zhu
c5fd8c42db
DEV: Fix methods removed in Ruby 3.2 (#15459)
* File.exists? is deprecated and removed in Ruby 3.2 in favor of
File.exist?
* Dir.exists? is deprecated and removed in Ruby 3.2 in favor of
Dir.exist?
2022-01-05 18:45:08 +01:00
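The replacement methods named in commit c5fd8c42db work as follows. This is a small sketch using a temporary directory; the file name is arbitrary.

```ruby
require "tmpdir"
require "fileutils"

dir = Dir.mktmpdir
path = File.join(dir, "import.log")
File.write(path, "ok")

# The singular forms are the supported spelling.
file_found = File.exist?(path)
dir_found = Dir.exist?(dir)

FileUtils.remove_entry(dir)

puts file_found
puts dir_found
```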
Leonardo Mosquera
48a08cc397
FIX: Vanilla importer fixes (#14699)
Import script was out of date
2021-10-27 14:22:37 +02:00
Dan Ungureanu
69f0f48dc0
DEV: Fix rubocop issues (#14715) 2021-10-27 11:39:28 +03:00
David Taylor
46d96c9feb
DEV: Apply rubocop to script/import_scripts/phorum.rb (#14727)
Followup to b24002018a
2021-10-26 19:16:52 +01:00
Jeremy Waters
b24002018a
Update phorum.rb
Add attachment/file/upload handling to bring them in from phorum to discourse
2021-10-26 12:41:50 -04:00
Theodore Diamantidis
97178cd777
FIX: phpbb import - attachments not embedded in posts (#14570) 2021-10-11 14:27:54 +02:00
Constanza
a413a1e015
DEV: process image uploads in the Zendesk API import script (#14524) 2021-10-06 12:24:12 -04:00
Vinoth Kannan
cd9262b7d3
DEV: minor improvements in the vanilla import script. (#14026)
We're parsing the post raw based on the record format now.
2021-08-12 15:07:44 +05:30
Ruoxin Wang
f9aaed7020
FIX: MyBB importer exposes deleted posts (#13700)
The MyBB importer exposes posts marked as deleted in MyBB. This may lead to a privacy issue.
2021-07-22 09:55:02 +02:00
Alan Guo Xiang Tan
37b8ce79c9
FEATURE: Add last visit indication to topic view page. (#13471)
This PR also removes grey old unread bubble from the topic badges by
dropping `TopicUser#highest_seen_post_number`.
2021-07-05 14:17:31 +08:00
Dan Ungureanu
6ea4bbd2ec
DEV: Prefer .pluck_first over .pluck.first (#13607) 2021-07-02 10:03:54 +08:00
Bianca Nenciu
5efed91128
FIX: Set random values for digest_attempted_at
Setting a random value in the interval 1 week ago ... now works better:
digests are sent one week from the date of the last digest, so this
spreads digest scheduling over a week.
2021-06-22 12:05:15 +08:00
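The randomization described in commit 5efed91128 can be sketched in plain Ruby. The helper name and sample count are assumptions; the importer itself may compute the value differently.

```ruby
WEEK_IN_SECONDS = 7 * 24 * 60 * 60

# Pick a uniformly random time between one week ago and now.
def random_time_in_last_week(now = Time.now)
  now - rand * WEEK_IN_SECONDS
end

now = Time.now
samples = Array.new(1000) { random_time_in_last_week(now) }

puts samples.min
puts samples.max
```

Because each user's next digest falls one week after this value, random values in that window spread the first round of digests across the following week instead of sending them all at once.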
Arpit Jalan
365d339985
DEV: fix Flarum import script (#13385) 2021-06-15 19:08:55 +05:30
Lecter
4bf195d502
FEATURE: Flarum import script (#13139) 2021-05-27 02:30:50 +02:00
Bianca Nenciu
d0779a87bb
FIX: Use max_category_nesting when importing categories (#13105)
It allowed for a parent category and a sub-category.
2021-05-26 12:40:26 +03:00
Josh Soref
59097b207f
DEV: Correct typos and spelling mistakes (#12812)
Over the years we accrued many spelling mistakes in the code base.

This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change:

- comments
- test descriptions
- other low risk areas
2021-05-21 11:43:47 +10:00
Pilou
e7892df10d
FIX: handle charset=windows-1252 in mbox import script (#12832)
Co-authored-by: Loïc Dachary <loic@dachary.org>
2021-04-27 15:43:31 +02:00
Michael Maroszek
5bec0e5763
fix vbulletin importer to import unreferenced attachments (#12187) 2021-04-19 21:05:16 +02:00
Justin DiRose
e302e32a3f
DEV: Add Higher Logic import script (#12623)
Wrote up a new script to import from Higher Logic. Nothing too crazy going on here. Two major things about this script:

1. It requires you to convert a Microsoft SQL file to a format MySQL can read.
2. Higher Logic stores posts (at least in the case of the import I ran) with the email thread shown in the post body. The script does its best to truncate this out, but the logic may need to be improved on future imports. For the import I ran, it worked just fine as is. 🤷‍♂️
2021-04-06 16:53:55 -05:00
Justin DiRose
1aaf588fb7
DEV: Improvements to vanilla_mysql importer (#12308)
Made some improvements to the Vanilla MySQL script -- mainly because not all SQL imports require use of the VanillaBodyParser. Still left it as an option to turn on and use if so desired. Also added subcategory support, importing of likes, and solve status.
2021-03-11 10:21:56 -06:00
Blake Erickson
dbcda617b3
DEV: Add a CSV importer for restoring deleted users (#12147)
This is an importer I wrote to restore some users that were
accidentally deleted for being purged as old staged users or old
unactivated users.

It reads from CSV files exported from a discourse sql backup.
2021-02-19 13:46:54 -07:00
Blake Erickson
ed0e4582a1
DEV: If disabled do not change setting after import (#12142)
When running an import script there are many site settings that are
changed but we reset them back to where they were originally before the
import. However, there are two settings that we don't roll back:

```
purge_unactivated_users_grace_period_days
purge_deleted_uploads_grace_period_days
```

which could have some unintended consequences.

My first question is do we *really* have to change these settings? I'm
not a huge fan of changing someone's settings without them really knowing
they were changed.

If we really do have to change these settings here is my proposed PR
where we don't alter the `purge_unactivated_users_grace_period_days` if
it has been disabled.

As I'm writing this another change we could make is that we don't change
either of these site settings if we detect that they aren't set to the
default values.

The drive behind this PR is that there is a discourse instance which
relies on staged users as part of their workflow and this setting was
changed by accident via the import script causing users to be deleted
that shouldn't have been.
2021-02-19 09:33:35 -07:00
Michael Maroszek
144584aacb
fix vbulletin importer to hide soft-deleted posts (#12057)
Like threads, posts can be soft-deleted, which results in a `visible = 2` state. At the moment those posts are imported fully visible.
2021-02-12 14:29:05 +01:00
Bianca Nenciu
a71b219c9a
Improvements to phpBB3 import script (#10999)
* FEATURE: Import attachments

* FEATURE: Add support for importing multiple forums in one

* FEATURE: Add support for category and tag mapping

* FEATURE: Import groups

* FIX: Add spaces around images

* FEATURE: Custom mapping of user rank to trust levels

* FIX: Do not fail import if it cannot import polls

* FIX: Optimize existing records lookup

Co-authored-by: Gerhard Schlager <mail@gerhard-schlager.at>
Co-authored-by: Jarek Radosz <jradosz@gmail.com>
2021-01-14 21:44:43 +02:00
Arpit Jalan
bd7cbcd8f8
Improve Vanilla import script. (#11701)
- import groups and group users
- import uploads/attachments
- improved code tag parsing
- improved text formatting
- mark topics as solved
2021-01-13 23:10:00 +05:30
Régis Hanol
a85d5edbf1
DEV: set digest_attempted_at during migrations (#11369) 2020-12-14 10:58:14 +11:00
Jarek Radosz
2f4a1ff61b
DEV: Update rubocop-discourse from 2.3.2 to 2.4.0 (#11079)
Also fixes whitespace related issues raised by rubocop.
2020-10-30 15:04:29 +01:00