Skip to content

Conversation

@lumirlumir
Copy link
Member

@lumirlumir lumirlumir commented Sep 14, 2025

Prerequisites checklist

What is the purpose of this pull request?

Which language are you using?

CommonMark and GFM.

What did you do?

I've tried using consecutive backslashes in the no-reference-like-urls rule, but it reported false positives as shown below:

What did you expect to happen?

First Problem

The code below doesn't report, which is the correct behavior.

\[Mercury](mercury)

[mercury]: https://example.com/mercury

However, the code below does report, which are false positives:

\\\[Mercury](mercury) <!-- False Positive -->

\\\\\[Mercury](mercury) <!-- False Positive -->

\\\\\\\[Mercury](mercury) <!-- False Positive -->

[mercury]: https://example.com/mercury

Second Problem

The code below is reported as a link type, which is the correct behavior.

\![Mercury](mercury)

[mercury]: https://example.com/mercury

However, The code below is reported as an image type, which are incorrect behaviors.

The code below should be reported as a link type, since the ! is escaped with a backslash.

\\\![Mercury](mercury) <!-- Incorrect -->

\\\\\![Mercury](mercury) <!-- Incorrect -->

\\\\\\\![Mercury](mercury) <!-- Incorrect -->

[mercury]: https://example.com/mercury

Link to minimal reproducible Example

Adding the following test cases would help identify the issues:

(Test cases not mentioned here but added were included just to verify the behavior.)

  • valid:
		`${"\\".repeat(3)}[Mercury](mercury)\n\n[mercury]: https://example.com/mercury`,
		`${"\\".repeat(5)}[Mercury](mercury)\n\n[mercury]: https://example.com/mercury`,
		`${"\\".repeat(7)}[Mercury](mercury)\n\n[mercury]: https://example.com/mercury`,
  • invalid
		{
			code: `${"\\".repeat(3)}![Mercury](mercury)\n\n[mercury]: https://example.com/mercury`,
			output: `${"\\".repeat(3)}![Mercury][mercury]\n\n[mercury]: https://example.com/mercury`,
			errors: [
				{
					messageId: "referenceLikeUrl",
					data: { type: "link", prefix: "" },
					line: 1,
					column: 5,
					endLine: 1,
					endColumn: 23,
				},
			],
		},
		{
			code: `${"\\".repeat(5)}![Mercury](mercury)\n\n[mercury]: https://example.com/mercury`,
			output: `${"\\".repeat(5)}![Mercury][mercury]\n\n[mercury]: https://example.com/mercury`,
			errors: [
				{
					messageId: "referenceLikeUrl",
					data: { type: "link", prefix: "" },
					line: 1,
					column: 7,
					endLine: 1,
					endColumn: 25,
				},
			],
		},
		{
			code: `${"\\".repeat(7)}![Mercury](mercury)\n\n[mercury]: https://example.com/mercury`,
			output: `${"\\".repeat(7)}![Mercury][mercury]\n\n[mercury]: https://example.com/mercury`,
			errors: [
				{
					messageId: "referenceLikeUrl",
					data: { type: "link", prefix: "" },
					line: 1,
					column: 9,
					endLine: 1,
					endColumn: 27,
				},
			],
		},

What changes did you make? (Give an overview)

This PR is a follow-up to #490.

In this PR, I've addressed false positives in no-reference-like-urls rule that were caused by consecutive backslash handling.

Previously, we used the following regex pattern to avoid matching backslash escapes:

  • (?<!(?<!\\)\\)

However, this pattern is incorrect because they don't properly handle consecutive backslash escapes metioned above.

So, I've updated them to use the following regex correctly:

  • (?<=(?<!\\)(?:\\{2})*)
(?<=(?<!\\)(?:\\{2})*)a
a           // even
\a          // odd
\\a         // even
\\\a        // odd
\\\\a       // even
\\\\\a      // odd
\\\\\\a     // even
image image

Related Issues

Ref: #490

Is there anything you'd like reviewers to focus on?

In JavaScript, \\\\\\\\ (8 backslashes) is interpreted as \\\\ (4 backslashes) in Markdown, and is ultimately recognized as \\ (2 backslashes). In this case, an even number of backslashes is not treated as an escape character and is recognized as a normal backslash.

@eslint-github-bot
Copy link

Hi @lumirlumir!, thanks for the Pull Request

The pull request title isn't properly formatted. We ask that you update the pull request title to match this format, as we use it to generate changelogs and automate releases.

  • The commit message tag wasn't recognized. Did you mean "docs", "fix", or "feat"?
  • There should be a space following the initial tag and colon, for example 'feat: Message'.
  • The first letter of the tag should be in lowercase

To Fix: You can fix this problem by clicking 'Edit' next to the pull request title at the top of this page.

Read more about contributing to ESLint here

@eslintbot eslintbot added this to Triage Sep 14, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in Triage Sep 14, 2025
@lumirlumir lumirlumir changed the title Fix handle consecutive backslashes in no reference like urls fix: handle consecutive backslashes in no-reference-like-urls Sep 14, 2025
/**
* @import { SourceRange } from "@eslint/core"
* @import { Heading, Node, Paragraph, TableCell } from "mdast";
* @import { Heading, Paragraph, TableCell } from "mdast";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Node type was not being used, so I removed it.

@lumirlumir lumirlumir marked this pull request as ready for review September 14, 2025 11:50
@lumirlumir lumirlumir requested review from a team and mdjermanovic September 14, 2025 11:50
/** Pattern to match both inline links: `[text](url)` and images: `![alt](url)`, with optional title */
const linkOrImagePattern =
/(?<!(?<!\\)\\)(?<imageBang>!)?\[(?<label>(?:\\.|[^()\\]|\([\s\S]*\))*?)\]\((?<destination>[ \t]*(?:\r\n?|\n)?(?<![ \t])[ \t]*(?:<[^>]*>|[^ \t()]+))(?:[ \t]*(?:\r\n?|\n)?(?<![ \t])[ \t]*(?<title>"[^"]*"|'[^']*'|\([^)]*\)))?[ \t]*(?:\r\n?|\n)?(?<![ \t])[ \t]*\)(?!\()/gu;
/(?<=(?<!\\)(?:\\{2})*)(?<imageBang>!)?\[(?<label>(?:\\.|[^()\\]|\([\s\S]*\))*?)\]\((?<destination>[ \t]*(?:\r\n?|\n)?(?<![ \t])[ \t]*(?:<[^>]*>|[^ \t()]+))(?:[ \t]*(?:\r\n?|\n)?(?<![ \t])[ \t]*(?<title>"[^"]*"|'[^']*'|\([^)]*\)))?[ \t]*(?:\r\n?|\n)?(?<![ \t])[ \t]*\)(?!\()/gu;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this rule parses paragraphs and other nodes to search for links and images instead of checking link and image nodes directly?

Copy link
Member Author

@lumirlumir lumirlumir Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I remember, there were some tricky edge cases that prevented us from directly visiting link and image nodes.

One edge case was:

dedent`
[link](<uri>)
[uri]: https://example.com/uri
`,
dedent`
![image](<uri>)
[uri]: https://example.com/uri
`,

{
code: dedent`
![Venus](<venus>)
[<venus>]: https://example.com/venus.jpg
`,
output: dedent`
![Venus][<venus>]
[<venus>]: https://example.com/venus.jpg
`,
errors: [
{
messageId: "referenceLikeUrl",
data: { type: "image", prefix: "!" },
line: 1,
column: 1,
endLine: 1,
endColumn: 18,
},
],
},

The link and image nodes can have links wrapped in brackets <>, but the AST just represents them as links without the brackets.

It wasn’t easy to handle this case by directly using the link and image nodes.


There’s also an edge case with auto-link literals in CommonMark and GFM mode:

{
code: dedent`
<http://foo>
[http://foo]: hi
`,
language: "markdown/gfm",
},

{
code: dedent`
http://foo
[http://foo]: hi
`,
language: "markdown/gfm",
},

These cases shouldn’t be reported either, but if we look at the link node directly, it can be pretty tricky to handle all of the edge cases correctly.


It would be nice to refactor this code to visit link and image nodes directly, but in my experience, it hasn’t been easy to implement.

So, I've thought the current approach is reasonable, especially since no-reversed-media-syntax and other rules for detecting error-prone syntax use similar strategies.

Copy link
Member

@mdjermanovic mdjermanovic Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we then visit link and image nodes directly and use the regex on their text to check if they're using the syntax this rule wants to check? I think it would be less error-prone then parsing whole raw paragraphs.

So, I've thought the current approach is reasonable, especially since no-reversed-media-syntax and other rules for detecting error-prone syntax use similar strategies.

I think no-reversed-media-syntax is a different case, because text with reversed syntax is not parsed as a link. Here, in no-reference-like-urls, we could use the fact that text we want to check is already parsed as link/image by the parser, if that's always the case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your suggestion could help make this rule more reliable 👍

I agree with your idea, but the scope seems a bit large for this particular fix. It will also take me some time to figure out the best approach, so if it’s alright with you, I’d like to create a separate PR to refactor the code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, then let's fix the problem this way. I could try to simplify rule to visit link and image nodes directly after this is merged.

Copy link
Member

@mdjermanovic mdjermanovic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@mdjermanovic mdjermanovic merged commit 762712d into main Sep 15, 2025
24 checks passed
@mdjermanovic mdjermanovic deleted the fix-handle-consecutive-backslashes-in-no-reference-like-urls branch September 15, 2025 14:17
@github-project-automation github-project-automation bot moved this from Needs Triage to Complete in Triage Sep 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: Complete

Development

Successfully merging this pull request may close these issues.

3 participants