PlaywrightCrawler extract_links doesn't account for base href. #1589

@phughesion-h3

Description

As the title says: if the page contains a <base href="anything"> element, that URL should be used as the base for resolving all relative URLs, instead of the current page's URL.

https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/Elements/base
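
To illustrate the expected behaviour, here is a minimal sketch using only the Python standard library. It is not the crawler's actual implementation; `resolve_links` and `_LinkParser` are hypothetical names invented for this example, and the URLs are placeholders. The idea is to pick up the first <base href>, resolve it against the page URL (since the base itself may be relative), and then resolve every link against that result.

```python
from __future__ import annotations

from html.parser import HTMLParser
from urllib.parse import urljoin


class _LinkParser(HTMLParser):
    """Hypothetical helper: collects the first <base href> and every <a href>."""

    def __init__(self) -> None:
        super().__init__()
        self.base_href: str | None = None
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if href is None:
            return
        if tag == "base" and self.base_href is None:
            # Only the first <base> carrying an href is used; later ones are ignored.
            self.base_href = href
        elif tag == "a":
            self.hrefs.append(href)


def resolve_links(page_url: str, html: str) -> list[str]:
    """Return absolute URLs for all <a href> values, honouring <base href>."""
    parser = _LinkParser()
    parser.feed(html)
    # The base href may itself be relative, so resolve it against the page URL first.
    base_url = urljoin(page_url, parser.base_href) if parser.base_href else page_url
    return [urljoin(base_url, href) for href in parser.hrefs]


if __name__ == "__main__":
    html = (
        '<head><base href="https://cdn.example.org/assets/"></head>'
        '<body><a href="a.html">A</a><a href="/b">B</a></body>'
    )
    for url in resolve_links("https://example.com/docs/page.html", html):
        print(url)
```

With the placeholder page above, the relative link a.html should resolve under https://cdn.example.org/assets/ rather than under https://example.com/docs/. In a live browser page, the DOM property document.baseURI already reflects any <base> element, so reading it from the page may be a simpler source for the base URL than parsing the HTML again.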


Labels: bug (Something isn't working.), t-tooling (Issues with this label are in the ownership of the tooling team.)
