Conversation
Add more tests.
Isn't that a bit... weird name? :] |
janbuchar
left a comment
There was a problem hiding this comment.
This seems sound to me, I have just some minor comments. Also please fix typos in the PR title 🙂
| # Increased detection_probability_recommendation | ||
| prediction = predictor.predict(Request.from_url(url='http://www.aaa.com/some/stuffa', label=label)) | ||
| assert prediction.rendering_type == 'static' | ||
| assert prediction.detection_probability_recommendation == detection_ratio * 4 |
There was a problem hiding this comment.
These exact checks are probably too implementation-dependent. Could we just check that the recommendation is, like, higher than detection_ratio * 2 for the first couple of results so that we don't have to update the tests each time we fine-tune some constant by a tiny bit?
There was a problem hiding this comment.
This test is just describing how it actually behaves. Is it requirement or implementation detail? Impossible to say without actually having the requirements. I think it acts nicely as documentation of current behavior and is very easy to change if needed.
If we fine-tune some constant and this test fails, it will be very nice feedback in seeing what those constants actually influence and we can then do informed decision about whether we want to change the test or not.
There was a problem hiding this comment.
Not a hill I want to die on, but this particular example seems like an overspecified test. Almost to the point where if it breaks, you just adjust it without a second thought, which undermines the value of such test.
Co-authored-by: Jan Buchar <[email protected]>
I thought about naming my future kid like this. But maybe it is not so great name after all. It is some implementation of abstract |
vdusek
left a comment
There was a problem hiding this comment.
Nice! And thanks for taking this out from AdaptiveCrawler PR.
src/crawlee/crawlers/_adaptive_playwright/_rendering_type_predictor.py
Outdated
Show resolved
Hide resolved
src/crawlee/crawlers/_adaptive_playwright/_rendering_type_predictor.py
Outdated
Show resolved
Hide resolved
…ictor.py Co-authored-by: Vlada Dusek <[email protected]>
| from jaro import jaro_winkler_metric | ||
| from sklearn.linear_model import LogisticRegression |
There was a problem hiding this comment.
We should wrap this in a way that the exception tells the user to install the adaptive-playwright extra. But since this is a private subpackage, maybe it can wait until the adaptive playwirght functionality is made public.
Co-authored-by: Jan Buchar <[email protected]>
vdusek
left a comment
There was a problem hiding this comment.
Just a few nits, LGTM otherwise.
src/crawlee/crawlers/_adaptive_playwright/_rendering_type_predictor.py
Outdated
Show resolved
Hide resolved
Co-authored-by: Vlada Dusek <[email protected]>
Description
Add
LogisticalRegressionPredictorwhich is going to be defaultRenderingTypePredictorforAdaptivePlaywrightCrawler.Issues
Split from: #249