Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Splitting KV calls in smaller batches issue #1335

Open
fabricegaignier opened this issue Nov 13, 2024 · 10 comments
Open

Splitting KV calls in smaller batches issue #1335

fabricegaignier opened this issue Nov 13, 2024 · 10 comments

Comments

@fabricegaignier
Copy link

When we set a value for maxTrustedBiddingSignalsURLLength we do observe a split in the KV calls in smaller batches (for the ones with many tbs keys). And this is expected.
We would then expect that generateBid() is called for each of these batches independently of the others as they return.
Is this assumption correct?
This seems to be described here: https://github.com/WICG/turtledove/blob/main/PA_implementation_overview.md#trusted-signals-fetches--reprioritization ( we are setting enableBiddingSignalsPrioritization to false)

However we conducted a test and it does seems to wait for all the calls to the KV server to finish before starting generateBid()

Could you please advise on the expected behaviour?

@fabricegaignier
Copy link
Author

On that topic, I have one additional question. Could you please confirm that the order of the tbsKeys we receive in the KV server is determined by the priority vector as stated here: https://github.com/WICG/turtledove/blob/main/PA_implementation_overview.md#phase-1-load-interest-groups

For each buyer origin, Chrome determines a tentative priority amongst all interest groups for that buyer - using each interest group's fixed priority, or priority vector, if present. Chrome then sorts the interest groups based on priority, removing any with priority vectors and negative priorities. Interest groups with matching priorities are grouped by joining origin (to improve performance in case executionMode is group-by-origin). Groups with matching joining origins and priority are randomly shuffled.

@peiwenhu
Copy link
Contributor

On that topic, I have one additional question. Could you please confirm that the order of the tbsKeys we receive in the KV server is determined by the priority vector as stated here: https://github.com/WICG/turtledove/blob/main/PA_implementation_overview.md#phase-1-load-interest-groups

For each buyer origin, Chrome determines a tentative priority amongst all interest groups for that buyer - using each interest group's fixed priority, or priority vector, if present. Chrome then sorts the interest groups based on priority, removing any with priority vectors and negative priorities. Interest groups with matching priorities are grouped by joining origin (to improve performance in case executionMode is group-by-origin). Groups with matching joining origins and priority are randomly shuffled.

Hi, I can't answer your questions but I would like you ask you on this one if you only care about the order of keys between interest groups, not within the same interest group? It seems to me to be the case since the priority is on IG level.

@fabricegaignier
Copy link
Author

Hello @peiwenhu we do care about the order of keys between interest groups (and not within the same interest group)

@MattMenke2
Copy link
Contributor

When we set a value for maxTrustedBiddingSignalsURLLength we do observe a split in the KV calls in smaller batches (for the ones with many tbs keys). And this is expected. We would then expect that generateBid() is called for each of these batches independently of the others as they return. Is this assumption correct? This seems to be described here: https://github.com/WICG/turtledove/blob/main/PA_implementation_overview.md#trusted-signals-fetches--reprioritization ( we are setting enableBiddingSignalsPrioritization to false)

I'm not sure what you mean by "independently" here. generateBid() is called once per InterestGroup (modulo an extra call for k-anonymity reasons), and each call only gets the signals values for keys in that were in that interest group. Those values all come from a single network fetch for the signals data. In the case of group-by-origin execution mode, all those calls may or may not share a Javascript execution context, including requests that received their signals from different fetches. We run each generateBid() call as soon as we have all the data we need to run it.

However we conducted a test and it does seems to wait for all the calls to the KV server to finish before starting generateBid()

Could you please advise on the expected behaviour?

If you're using "enableBiddingSignalsPrioritization" for any of your interest groups, we can't determine the relatively priority of Interest Groups until we've received signals, so we have to wait until all trusted signals fetches for all Interest Groups with the same owner to complete, so we can determine the order to generate the bids, and apply any limits to the number of interest groups for that origin to generate bids. This performance cost is the reason "enableBiddingSignalsPrioritization" exists in the first place. We also have to wait for all promises that affect the generateBid() calls to resolve before we can start running scripts.

On that topic, I have one additional question. Could you please confirm that the order of the tbsKeys we receive in the KV server is determined by the priority vector as stated here: https://github.com/WICG/turtledove/blob/main/PA_implementation_overview.md#phase-1-load-interest-groups

Network latency can vary, some requests may be served from the network disk cache, and even latency within Chrome may affect order we send requests over the network (e.g., if we have a cache entry that we may be able to use for the first request, and none for the second, even if it turns out we can't use the cache for the first request, checking that may make it hit the network after the second request). So you cannot rely on request order. While we attempt to send trusted signals requests in priority order for all interest groups with the same order, it's best-effort only. Whether we run scripts in priority order, or in the order the signals results are received, can also vary (if enableBiddingSignalsPrioritization is true, we run in priority order, otherwise, I believe we run in order we receive the responses). There are enough processes and asynchronous events that execution order is very much best-effort. If there are limits on number of interest groups in an auction, we do perfectly respect those, it's just when things are being done in parallel that order of execution may change.

@fabricegaignier
Copy link
Author

Lets take examples to illustrate better my questions :

KV call split
Say a browser participates to an on device auction:

  1. It contains 6 IG (and we have only 1 TBS key per IG )
  2. we are setting enableBiddingSignalsPrioritization to false in each IG
  3. There are 3 KV server calls du to maxTrustedBiddingSignalsURLLength value. Each with 2 TBS Keys (so 2 Interest Groups): KV1([TBS_IG1 , TBS_IG2]), KV2([TBS_IG3 , TBS_IG4]), KV3([TBS_IG5 , TBS_IG6])

Say KV2 returns first. So we should have two calls to generateBid
generateBid(IG3,…, TBS_result(TBS_IG3))
generateBid(IG4,…, TBS_result(TBS_IG4))
And this without awaiting responses for the calls KV1 and KV3.
Is this correct?
The reason I am asking is that we see a negative correlation between the number of TBS Keys in the KV call and the execution of generateBid. So we thought that splitting the KV call in several smaller ones would help, but it does not.

Order of TBSKeys in the KV server
Here I meant: is the order preserved within a batch.
Imagine we have 2 IG on the browser with each only 1 TBS Key (TBS_IG1 , TBS_IG2) and priority such that IG2 > IG1
We only have one call to the KV server KV1 with both TBS Keys
We expect to receive the keys in this order KV1([TBSIG2 , TBSIG1 ])
Is that correct ?

@dmdabbs
Copy link
Contributor

dmdabbs commented Nov 18, 2024

Might this PA Implementation Overview caveat apply...

Also, interest groups that share an executor are guaranteed to be run in the priority order they were sorted into (this may change if trusted signals fetches are split up, though).

@MattMenke2
Copy link
Contributor

Lets take examples to illustrate better my questions :

KV call split Say a browser participates to an on device auction:

  1. It contains 6 IG (and we have only 1 TBS key per IG )
  2. we are setting enableBiddingSignalsPrioritization to false in each IG
  3. There are 3 KV server calls du to maxTrustedBiddingSignalsURLLength value. Each with 2 TBS Keys (so 2 Interest Groups): KV1([TBS_IG1 , TBS_IG2]), KV2([TBS_IG3 , TBS_IG4]), KV3([TBS_IG5 , TBS_IG6])

Say KV2 returns first. So we should have two calls to generateBid generateBid(IG3,…, TBS_result(TBS_IG3)) generateBid(IG4,…, TBS_result(TBS_IG4)) And this without awaiting responses for the calls KV1 and KV3. Is this correct? The reason I am asking is that we see a negative correlation between the number of TBS Keys in the KV call and the execution of generateBid. So we thought that splitting the KV call in several smaller ones would help, but it does not.

Yes, without enableBiddingSignalsPrioritization, each generateBid() call should be made as soon as the KV server returns, and we have everything else we need (script file was received and compiled, all promises we need resolved). We try to preserve priority order, but there's a pair of non-order-preserving process hops in there, so it's possible we're run the generateBid() call of the lower priority one first (lower priority of IG3 vs IG4).

Order of TBSKeys in the KV server Here I meant: is the order preserved within a batch. Imagine we have 2 IG on the browser with each only 1 TBS Key (TBS_IG1 , TBS_IG2) and priority such that IG2 > IG1 We only have one call to the KV server KV1 with both TBS Keys We expect to receive the keys in this order KV1([TBSIG2 , TBSIG1 ]) Is that correct ?

There's no guarantee of the order of keys when sent to the server. I think we currently sort them in lexical order, since we sort them to remove duplicates, but that's not guaranteed.

@MattMenke2
Copy link
Contributor

Might this PA Implementation Overview caveat apply...

Also, interest groups that share an executor are guaranteed to be run in the priority order they were sorted into (this may change if trusted signals fetches are split up, though).

I think that is actually no longer true - there's an unconditional process hop when the signals are received back to the browser process, and those aren't guaranteed to preserve order. I think even without enableBiddingSignalsPrioritization, we recalculate priority if the signals have a priority, and cancel the request if it's negative, and we currently only do that in the main process. There may be other reasons for it. It's been a while since I looked at that code.

@dmdabbs
Copy link
Contributor

dmdabbs commented Nov 18, 2024

I think even without enableBiddingSignalsPrioritization, we recalculate priority if the signals have a priority, and cancel the request if it's negative, and we currently only do that in the main process.

Cancel the request if it's negative means not call generateBid when an IG's signals response contains a priorityVector and the dot product with that IG's prioritySignals is negative.

@MattMenke2
Copy link
Contributor

I think even without enableBiddingSignalsPrioritization, we recalculate priority if the signals have a priority, and cancel the request if it's negative, and we currently only do that in the main process.

Cancel the request if it's negative means not call generateBid when an IG's signals response contains a priorityVector and the dot product with that IG's prioritySignals is negative.

Correct - should have used clearer language there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants