turndown not escaping links properly #459

@shaipetel

I ran into this scenario and think it would be useful for others if turndown handled it better.

Background:
( and ) are valid characters in a URL, and won't get escaped in normal HTML.
In Markdown, a link destination is wrapped in ( and ), so if the URL itself contains ( or ), you need to escape them with \

For this input HTML:

<a href="https://google.com/file(1).jpg">link</a>

The valid Markdown output should be:

[link](https://google.com/file\(1\).jpg)

However, turndown returns:

[link](https://google.com/file(1).jpg)

For now I'm pre-fixing image and link URLs in my HTML before calling the turndown service, but I really think this should be handled in the parser.
When you do, be sure not to double-escape URLs if they were already escaped (meaning, if the URL already has "\(1\).jpg", there is no need to escape it again).
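
To illustrate the escaping I have in mind, here is a minimal sketch. `escapeLinkDestination` is a hypothetical helper name, not part of turndown's API, and it assumes that a backslash followed by any character is an existing escape that should be left alone:

```typescript
// Hypothetical helper (not part of turndown): escape bare parentheses in a
// URL so it can be used as a Markdown link destination.
function escapeLinkDestination(url: string): string {
  // The regex matches either an existing backslash escape (two characters,
  // returned unchanged) or a bare "(" or ")" (one character, gets a "\" prefix).
  return url.replace(/\\.|[()]/g, (match) =>
    match.length === 2 ? match : "\\" + match
  );
}

// Bare parentheses are escaped.
console.log(escapeLinkDestination("https://google.com/file(1).jpg"));
// prints https://google.com/file\(1\).jpg

// Already-escaped parentheses are not escaped a second time.
console.log(escapeLinkDestination("https://google.com/file\\(1\\).jpg"));
// prints https://google.com/file\(1\).jpg
```

Something like this could probably be applied to the href inside a custom rule registered with `turndownService.addRule`, though I have not checked how that interacts with turndown's built-in link rules.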
