19 Commits

Author SHA1 Message Date
Gerhard Schlager
65db9326b4 FEATURE: Add download script for Google Groups 2018-10-31 01:12:05 +01:00
Gerhard Schlager
341836eb42 Fix the rake task and importer instead 2018-10-17 16:48:09 +02:00
Gerhard Schlager
ee18d9ace0 FIX: mbox importer and rake task were broken 2018-10-17 16:34:18 +02:00
Gerhard Schlager
ac743dab10 Improve mbox import script
* emails weren't sorted in correct order
* better default regex for splitting mbox files
* output Message-ID if email is skipped because it doesn't have a Date
2018-08-23 09:46:28 +02:00
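The three points in this commit message (splitting on a regex, sorting by date, and reporting the Message-ID of skipped emails) can be illustrated with a minimal Python sketch. It is not the importer's actual code: `SPLIT_REGEX` and `split_and_sort` are hypothetical names, and the pattern assumes each message starts with a classic `From ` separator line.

```python
import re
from email import message_from_string
from email.utils import mktime_tz, parsedate_tz

# Hypothetical split pattern: any line starting with "From " begins a new message.
SPLIT_REGEX = re.compile(r"^From .+\n", re.MULTILINE)


def split_and_sort(mbox_text):
    """Split raw mbox text, drop emails without a usable Date, sort the rest."""
    chunks = [c for c in SPLIT_REGEX.split(mbox_text) if c.strip()]
    dated, skipped_ids = [], []
    for chunk in chunks:
        msg = message_from_string(chunk)
        raw_date = msg["Date"]
        parsed = parsedate_tz(raw_date) if raw_date else None
        if parsed is None:
            # Report the Message-ID so the skipped email can be found later.
            skipped_ids.append(msg.get("Message-ID", "<unknown>"))
            continue
        # mktime_tz yields a UTC timestamp, so different time zones compare correctly.
        dated.append((mktime_tz(parsed), msg))
    dated.sort(key=lambda pair: pair[0])
    return [msg for _, msg in dated], skipped_ids
```

Sorting on a UTC timestamp rather than on the raw header text avoids misordering emails sent from different time zones.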
Gerhard Schlager
f2d00e5eff FEATURE: Use Message-ID for detecting email replies to group
Ignores the site setting "find_related_post_with_key" and always tries to honor the `In-Reply-To` and `References` headers for emails sent to a group.

The sender's email address must be included in the `To` or `CC` header of a previous email sent to the group and the `Message-ID` of that email must be included in the current email's `In-Reply-To` or `References` header.
2018-04-05 11:00:38 +02:00
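The matching rule described in this commit message can be sketched in Python as follows. This only illustrates the rule and is not the code from the commit: `find_parent`, `referenced_ids`, `recipients`, and the `previous_by_id` lookup (a dict from Message-ID to an earlier email) are all hypothetical.

```python
import re
from email import message_from_string
from email.utils import getaddresses, parseaddr

MESSAGE_ID_RE = re.compile(r"<[^<>\s]+>")


def referenced_ids(msg):
    """Message-IDs listed in the In-Reply-To and References headers."""
    ids = []
    for header in ("In-Reply-To", "References"):
        ids.extend(MESSAGE_ID_RE.findall(msg.get(header, "") or ""))
    return ids


def recipients(msg):
    """Lower-cased addresses from the To and CC headers."""
    pairs = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
    return {addr.lower() for _, addr in pairs if addr}


def find_parent(reply, previous_by_id):
    """Return the earlier email this one replies to, or None.

    previous_by_id is a hypothetical dict mapping Message-ID to an
    email previously sent to the group.
    """
    sender = parseaddr(reply.get("From", ""))[1].lower()
    for message_id in referenced_ids(reply):
        parent = previous_by_id.get(message_id)
        # Both conditions must hold: the ID matches and the sender was a recipient.
        if parent is not None and sender in recipients(parent):
            return parent
    return None
```

Requiring the sender to appear among the recipients of the earlier email means that knowing a Message-ID alone is not enough to attach a reply to a thread.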
Gerhard Schlager
9b651adadb FIX: mbox importer should ignore emails without date 2018-03-13 13:42:57 +01:00
Gerhard Schlager
dc32ee5cbf Improvements to mbox import script
* Ignore errors during indexing and show information about the message causing the problem
* Always activate imported users if they aren't staged
2018-03-06 11:32:12 +01:00
Gerhard Schlager
479f7ed18f Ignore case when removing mailing list name from subject 2018-02-12 21:41:58 +01:00
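A case-insensitive removal of the list name might look like the Python sketch below. It assumes the name appears in the subject as a bracketed tag such as `[foo-list]`; the commit message does not state the format, and `strip_list_name` is a hypothetical helper.

```python
import re


def strip_list_name(subject, list_name):
    """Remove a bracketed mailing list tag from a subject, ignoring case."""
    # re.escape keeps characters such as "-" or "." in the list name literal.
    pattern = re.compile(r"\[" + re.escape(list_name) + r"\]\s*", re.IGNORECASE)
    return pattern.sub("", subject).strip()
```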
Gerhard Schlager
6500343431 FIX: mbox importer didn't detect already indexed files 2018-01-17 17:03:53 +01:00
Gerhard Schlager
bb54eb1192 Improvements to mbox importer
* store time it took to index message in DB (to find performance issues)
* ignore listserv specific files
* better examples for split_regex
* first email in mbox shouldn't contain the split string
* always lock the DB in exclusive mode
* save email within transaction
* messages can be grouped by subject and use original order (for Listserv)
* adds option to index emails without running the import
2018-01-17 12:04:57 +01:00
Yaw Anokwa
77a92e8878 Allow user staging via setting (#5468) 2018-01-04 09:17:35 +01:00
Gerhard Schlager
cafe69caac Refactor mbox import script 2017-12-13 22:03:31 +01:00
Arpit Jalan
3190c13c22 import staged users as inactive in mbox import 2017-12-13 08:45:43 +05:30
Gerhard Schlager
16738cfb1b FEATURE: convert plain text emails to markdown 2017-12-06 01:47:51 +01:00
Gerhard Schlager
32dd1e66be improvements to the mbox import script
* ignores dot-files and empty emails
* new setting to prefer HTML over plaintext emails during import
* restore original site settings at the end of import
* elided content of HTML mails was not put inside details block
2017-11-18 17:16:44 +01:00
Gerhard Schlager
06a6ddc3ba handle plaintext and HTML emails in mbox importer 2017-11-15 20:22:11 +01:00
Gerhard Schlager
6c829c24d7 escaping the subject isn't needed in the mbox importer 2017-10-19 15:25:20 +02:00
Gerhard Schlager
c41880ab19 Improvements to the experimental mbox importer
* Disable journaling to improve performance in Docker
* Use the email cooking method
* Store IncomingEmail in order to find related posts by Message-ID
* Escape HTML in imported messages
2017-10-19 14:27:40 +02:00
Gerhard Schlager
8299e7e8c3 Add new, experimental version of mbox importer 2017-05-29 20:59:18 +02:00