When receiving emails sent with Exchange, we look for some markers to identify the body of the mail and the reply (aka. previous email).
For some reasons, those markers aren't 100% reliable and sometimes, only one of them is present.
The commit 20ba54d536 introduced the bug because the `HTML_EXTRACTERS` regex for exchange looks for either `messageBodySection` or `messageReplySection` but we were only using the `reply` section. So if an email had only the `body` section, it would not be correctly extracted.
This commit handle the cases where either one of them is missing and use the other one as the actual "reply". When both are present, it correctly elides the "reply" section.
This fix handles the case where an In-Reply-To mail header
can contain multiple Message-IDs. We use this header to
try look up an EmailLog record to find the post to reply
to in the group email inbox flow.
Since the case where multiple In-Reply-To Message-IDs is
rare (we've only seen a couple of instances of this causing
errors in the wild), we are just going to use the first one
in the array.
Also, Discourse does not support replying to multiple posts
at once, so it doesn't really make sense to use multiple
In-Reply-To Message-IDs anyway.
Introduced back in 2022 in
e3d495850d,
our new more specific message-id format for inbound and
outbound emails has now been in use for a very long time,
we can remove the support for the old formats:
`topic/:topic_id/:post_id.:random@:host`
`topic/:topic_id@:host`
`topic/:topic_id.:random@:host`
## What?
Depending on the email software used, when you reply to an email that has some attachments, they will be sent along, since they're part of the embedded (replied to) email.
When Discourse processes the reply as an incoming email, it will automatically add all the (valid) attachments at the end of the post. Including those that were sent as part of the "embedded reply".
This generates posts in Discourse with duplicate attachments 🙁
## How?
When processing attachments of an incoming email, before we add it to the bottom of the post, we check it against all the previous uploads in the same topic. If there already is an `Upload` record, it means that it's a duplicate and it is _therefore_ skipped.
All the inline attachments are left untouched since they're more likely new attachments added by the sender.
https://meta.discourse.org/t/improving-mailman-email-parsing/253041
When mirroring a public mailling list which uses mailman, there were some cases where the incoming email was not associated to the proper user.
As it happens, for various (undertermined) reasons, the email from the sender is often not in the `From` header but can be in any of the following headers: `Reply-To`, `CC`, `X-Original-From`, `X-MailFrom`.
It might be in other headers as well, but those were the ones we found the most reliable.
This header is used by Microsoft Exchange to indicate when certain types of
autoresponses should not be generated for an email.
It triggers our "is this mail autogenerated?" detection, but should not be used
for this purpose.
Usually, when an email is received a user lookup is performed using the
email address found in the `From` header. When an email has an
`X-Original-From` header, if it is equal to `Reply-To` then it uses that
one instead. The comparison was sensitive to whitespaces and other
insignificant characters such as quotes because it reconstructed the
`From` header.
For the fixture added in this commit, it compared the reconstructed
`From` header `John Doe <johndoe@example.com>` with the `Reply-To`
header `"John Doe" <johndoe@example.com>`.
The display name can have quotes around it, which does not work
with our current comparison of a from field (in this case Reply-To)
and another header (X-Original-From), because we are not comparing
the two values in the same way. This causes an issue where the
commit here: b88d8c8 will not
work properly; the forwarded email gets the From address instead
of the Reply-To address as intended.
When forwarding emails into the group inbox, we now use the
original sender email as the from_address since
2ac9fd9dff. However, we have not
been saving the original CC addresses of the forwarded email,
which are needed to include those recipients in on the conversation
when replying via the group inbox.
This commit captures the CC addresses on the incoming email, and
makes sure the emails are created as staged users and added to the
list of topic allowed users so they are included on CC's sent by
the GroupSmtpEmail and other jobs.
When 2ac9fd9dff was done, this
affected the small post that is created when forwarding an email
into the group inbox. Instead of using the name and the email of
the user who forwarded the email, it used the original from email
and name to create the small post. So instead of something like
"Discourse Team forwarded the above email" we ended up with
"John Smith forwarded the above email" which is incorrect.
This fixes the issue by creating a staged user for the forwarding
email address (if such a user does not yet exist) and uses that
for the "forwarded" small post instead.
When emails were forwarded to a group inbox by the email address
of the group, for example when an email ends up in spam and must
be manually forwarded to the group+site@discoursemail.com address,
the OP of the topic ended up being the group's email address instead
of the sender who originally sent the email to the group inbox.
This commit detects that an email has been forwarded using existing
tools, and if the from address matches one of the group incoming
email addresses, then we look at the forwarded email's from address
and use that instead for the incoming email from address as well as
the staged/regular user used for the Topic.user.
This will make it much cleaner to forward emails into a group inbox,
and will prevent issues with PostAlerter where the OP is double-notified
for these emails.
Emails can include the marker in a different language, depending on
site and user settings. The email receiver always looked for the marker
in default language.
When the Reply-To header is present for incoming emails we
want to use it instead of the from address. This is usually the
case when forwarding an email via a mailing list into Discourse.
For now we are only using the Reply-To header if the email has
been forwarded via Google Groups, which is why we are checking the
X-Original-From header too. In future we may want to use the Reply-To
header in more cases.
When replying to a user_private_message email originating from
a group PM that does _not_ have a reply key (e.g. when replying
directly to the group's SMTP address), we were mistakenly linking
the new post created from the reply to the OP and the user who
created the topic, based on the first IncomingEmail message ID in
the topic, rather than using the correct reply to user and post number
that the user actually replied to.
We now use the In-Reply-To header to look up the corresponding EmailLog
record when the user who replied was sent a user_private_message email,
and use the post from that as the reply_to_user/post.
This also removes superfluous filtering of incoming_email records. After
already filtering by message_id and then addressed_to_user (which only
returns incoming emails where the to, from, or cc address includes any
of the user's emails), we were filtering again but in the ruby code for
the exact same conditions. After removing this all existing tests still
pass.
This PR changes the `UserNotification` class to send outbound `user_private_message` using the group's SMTP settings, but only if:
* The first allowed_group on the topic has SMTP configured and enabled
* SiteSetting.enable_smtp is true
* The group does not have IMAP enabled, if this is enabled the `GroupSMTPMailer` handles things
The email is sent using the group's `email_username` as both the `from` and `reply-to` address, so when the user replies from their email it will go through the group's SMTP inbox, which needs to have email forwarding set up to send the message on to a location (such as a hosted site email address like meta@discoursemail.com) where it can be POSTed into discourse's handle_mail route.
Also includes a fix to `EmailReceiver#group_incoming_emails_regex` to include the `group.email_username` so the group does not get a staged user created and invited to the topic (which was a problem for IMAP), as well as updating `Group.find_by_email` to find using the `email_username` as well for inbound emails with that as the TO address.
#### Note
This is safe to merge without impacting anyone seriously. If people had SMTP enabled for a group they would have IMAP enabled too currently, and that is a very small amount of users because IMAP is an alpha product, and also because the UserNotification change has a guard to make sure it is not used if IMAP is enabled for the group. The existing IMAP tests work, and I tested this functionality by manually POSTing replies to the SMTP address into my local discourse.
There will probably be more work needed on this, but it needs to be tested further in a real hosted environment to continue.
Over the years we accrued many spelling mistakes in the code base.
This PR attempts to fix spelling mistakes and typos in all areas of the code that are extremely safe to change
- comments
- test descriptions
- other low risk areas
Some emails coming in via the mail receiver can still end up
with bad encoding when trying to enqueue the job. This catches
the last encoding issue and forces iso-8559-1 and encodes to
UTF-8 to circumvent the issue.
Version 2.8 brings some changes to how address fields are handled and
this commits updates that and should also include a fix which handles
encoded attachment filenames.
The fork contains a bugfix to correctly decode mail attachments.
If you reply to an email with the word "mute" a topic will be muted
If you reply to an email with the word "track" a topic will be tracked
If you reply to an email with the word "watch" a topic will be watched
These ninja command can help advanced mailing list ex-users, saves a trip
to the website
* Add possibility to add hidden posts with PostCreator
* FEATURE: Create hidden posts for received spam emails
Spamchecker usually have 3 results: HAM, SPAM and PROBABLY_SPAM
SPAM gets usually directly rejected and needs no further handling.
HAM is good message and usually gets passed unmodified.
PROBABLY_SPAM gets an additional header to allow further processing.
This change addes processing capabilities for such headers and marks
new posts created as hidden when received via email.