Commit Graph

10 Commits

Author SHA1 Message Date
Gerhard Schlager
7bdf47b864 FIX: HtmlToMarkdown should keep HTML entities for <, > and & within HTML elements
Not all HTML elements are converted into Markdown. Some are kept as HTML.
Without this fix XML/HTML entities that are formatted as text instead of code are swallowed by Discourse.
This also fixes quotes in the `title` attribute of the `<abbr>` tag.
2024-06-10 16:03:30 +02:00
Gerhard Schlager
3c9d61d302 FIX: HtmlToMarkdown didn't keep text from within <center> tag
It should ignore the `<center>` tag, but keep the text from within the element.
2024-06-10 16:03:30 +02:00
Gerhard Schlager
b01905c724 FIX: HtmlToMarkdown didn't support tfoot in tables 2024-06-10 16:03:30 +02:00
Gerhard Schlager
52e81582b4 FEATURE: Use basic HTML table if it can't be converted to Markdown
Previously `HtmlToMarkdown` always converted HTML tables into Markdown tables. That lead to some badly formatted Markdown tables, e.g. when the table contained `rowspan` or `colspan`. This solves the issue by using very basic HTML tables in those cases.
2024-06-10 16:03:30 +02:00
Gerhard Schlager
b8f2cbf41c DEV: Add additional_allowed_tags to HtmlToMarkdown
Import script often use subclasses of `HtmlToMarkdown` and might need to allow additional tags that can be used within the custom class.
2024-06-10 16:03:30 +02:00
Martin Brennan
575bc4af73
FIX: Remove newlines from img alt & title in HTML to markdown parser (#25473)
We were having a minor issue with emails with embedded images
that had newlines in the alt string; for example:

```
<p class="MsoNormal"><span style="font-size:11.0pt"><img width="898"
height="498" style="width:9.3541in;height:5.1875in" id="Picture_x0020_5"
src="cid:image003.png@01DA4EBA.0400B610" alt="A screenshot of a computer
program

Description automatically generated"></span><span
style="font-size:11.0pt"><o:p></o:p></span></p>
```

Once this was parsed and converted to markdown (or directly to HTML
in some cases), this caused an issue in the composer and the post
UI, where the markdown parser didn't know how to deal with this,
making the HTML show directly instead of showing an image.

The easiest way to deal with this is to just strip \n from image
alt and title attrs in the HTMLToMarkdown class.
2024-01-31 10:23:09 +10:00
David Taylor
cb932d6ee1
DEV: Apply syntax_tree formatting to spec/* 2023-01-09 11:49:28 +00:00
Phil Pirozhkov
493d437e79
Add RSpec 4 compatibility (#17652)
* Remove outdated option

04078317ba

* Use the non-globally exposed RSpec syntax

https://github.com/rspec/rspec-core/pull/2803

* Use the non-globally exposed RSpec syntax, cont

https://github.com/rspec/rspec-core/pull/2803

* Comply to strict predicate matchers

See:
 - https://github.com/rspec/rspec-expectations/pull/1195
 - https://github.com/rspec/rspec-expectations/pull/1196
 - https://github.com/rspec/rspec-expectations/pull/1277
2022-07-28 10:27:38 +08:00
David Taylor
c9dab6fd08
DEV: Automatically require 'rails_helper' in all specs (#16077)
It's very easy to forget to add `require 'rails_helper'` at the top of every core/plugin spec file, and omissions can cause some very confusing/sporadic errors.

By setting this flag in `.rspec`, we can remove the need for `require 'rails_helper'` entirely.
2022-03-01 17:50:50 +00:00
Jarek Radosz
45cc16098d
DEV: Move spec/components to spec/lib (#15987)
Lib specs were inexplicably split into two directories (`lib` and `components`)

This moves them all into `lib`.
2022-02-18 19:41:54 +01:00