Currently, the arguments that control how different HTTP status codes are handled are available only on the static HTTP-based crawlers. They can be passed to the crawler's __init__, but PlaywrightCrawler does not accept them. For example, to ignore a 403 error with ParselCrawler, a user can write:
crawler = ParselCrawler(..., ignore_http_error_status_codes={403})
With PlaywrightCrawler, however, the same user has to do something like this:
crawler = PlaywrightCrawler(...)
crawler._http_client._ignore_http_error_status_codes = {403}
This is confusing: it relies on private attributes, and most users will never discover it. PlaywrightCrawler's behavior should be aligned with the other crawlers, and these options should be settable in __init__.
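To illustrate the requested behavior, here is a minimal, self-contained sketch of a constructor that accepts the option directly and forwards it to an internal status-code check. It does not use the library's real classes: StatusCodePolicy, SketchBrowserCrawler and should_fail are hypothetical names invented for this example, and the default rule (4xx and 5xx count as errors) is an assumption, not a statement about the current implementation.

```python
from collections.abc import Iterable


class StatusCodePolicy:
    """Hypothetical helper that decides whether a status code is an error."""

    def __init__(self, ignore_http_error_status_codes: Iterable[int] = ()) -> None:
        # Copy into a set so membership checks are cheap and the caller's
        # collection is never mutated.
        self._ignored = set(ignore_http_error_status_codes)

    def is_error(self, status_code: int) -> bool:
        # Codes the user asked to ignore are never treated as errors.
        if status_code in self._ignored:
            return False
        # Assumption for this sketch: 4xx and 5xx are errors by default.
        return 400 <= status_code <= 599


class SketchBrowserCrawler:
    """Hypothetical stand-in for a browser-based crawler (not PlaywrightCrawler)."""

    def __init__(self, *, ignore_http_error_status_codes: Iterable[int] = ()) -> None:
        # The option is a public constructor argument, so no private
        # attribute has to be patched after construction.
        self._policy = StatusCodePolicy(ignore_http_error_status_codes)

    def should_fail(self, status_code: int) -> bool:
        # Called with the status of the navigation response.
        return self._policy.is_error(status_code)


crawler = SketchBrowserCrawler(ignore_http_error_status_codes={403})
```

With something along these lines, the PlaywrightCrawler call would look the same as the ParselCrawler one, and the workaround through _http_client would no longer be needed.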