Spec: filtering IDs #123

alexmturner · 2024-03-29T18:42:35Z

Specs the ability to set a filtering ID (and modify the default ID space). See https://github.com/patcg-individual-drafts/private-aggregation-api/blob/main/flexible_filtering.md#proposal-filtering-id-in-the-encrypted-payload and issue #92.

To support this new functionality, we increase the report version. Note that this also requires aggregation service versions to support the new version.

Preview | Diff

alexmturner · 2024-04-22T14:15:52Z

@linnan-github, could you PTAL? Thanks!

spec.bs

linnan-github · 2024-04-22T16:01:48Z

spec.bs

+1. If |preSpecifiedParams|' [=pre-specified report parameters/context ID=] is
+    not null, return true.
+1. If |preSpecifiedParams|' [=pre-specified report parameters/filtering ID byte
+    size=] is not 1, return true.


Shall we define an implemention-defined value for the default filtering ID byte size?

My instinct here is to avoid an implementation-defined value here as I don't think we'd expect another implementer to change this value. But maybe we can follow up with the spec mentor

Even if it’s not implementation defined, maybe we can follow the ARA spec which defines constant values for readability?

Yep, makes sense! Added.

spec.bs

linnan-github · 2024-04-22T16:06:48Z

spec.bs

+    larger than 64.
+1. Let |filteringIdByteSize| be |params|' [=pre-specified report parameters/
+    filtering ID byte size=].
+1. [=Assert=]: |filteringIdByteSize| is an integer in the range [1, 8].


Probably consider defining implementation-defined values for the valid range of filtering ID byte size.

spec.bs

alexmturner · 2024-04-22T18:19:18Z

Thanks for the review, Nan!

alexmturner · 2024-04-22T20:47:09Z

@arichiv, could you PTAL as well? Thanks!

alexmturner · 2024-04-22T21:10:06Z

Two additional things I'm seeking feedback on:

I'm a little unsure what to name filteringIdByteSize. For context, this allows for using a custom size for the filtering ID space, e.g. you could specify 4 to allow for a 32-bit ID. (We only allow integer bytes which is why we went with ByteSize not BitSize.) Other options we were thinking of is filteringIdByteLength or filteringIdMaxBytes (maybe leaning towards this).
The default filteringIdByteSize is 1. When a non-default value is used, a report is sent even if no contributions were sent (a "null report"). This is done to avoid a privacy concern from the different payload length. However, it might be easier to describe the behavior if we activate this null reporting logic whenever a filteringIdByteSize is explicitly specified, even if it's for the default value. Personally, I'm probably leaning against this as it's not needed for privacy.

alexmturner · 2024-04-30T19:13:33Z

@linnan-github @arichiv, I've updated this PR to no longer have an explicit opt-in or require debug mode. This should simplify the change substantially, albeit potentially delay the launch slightly (as we will now want all current aggregation service releases to support the new format before launching to Chrome Stable).

spec.bs

arichiv · 2024-04-30T21:51:44Z

spec.bs

-    1. [=Set the context ID for a batching scope=] given |contextId| and
-        |batchingScope|.
-    1. Run the following steps [=in parallel=]:
+1. If |preSpecifiedParams| is not null:


This is very close to line 1174, should there be an algo to setup some of this?

This one was a mistake -- we shouldn't be repeating this. Fixed this in #128.

arichiv · 2024-04-30T21:52:48Z

spec.bs

-    1. [=Set the context ID for a batching scope=] given |contextId| and
-        |batchingScope|.
-    1. Run the following steps [=in parallel=]:
+1. If |preSpecifiedParams| is not null:


Same question about repetition, especially given it's a long series of things nested in complex ways

This one I think I might leave for now -- agreed that it's duplicate, but I don't think this PR changes that too much. I'm hoping to do a better job when we move this into the Shared Storage spec.

spec.bs

arichiv · 2024-04-30T22:14:10Z

It might be worth splitting this change into two PRs:

replacing context id with a pre-specified report parameters struct that contains only a context id
adding a filtering ID byte size to the pre-specified report parameters struct

Doing both at once when so many spots are touched by 1 but only a few feel the impact of 2 is risky from a comprehensive review perspective.

alexmturner

Thanks for the review! Pulled out the no-op pre-specified report parameter changes into their own PR (#128). I've left comments about other issues unresolved for now and will address them in this PR once its rebased onto the other one.

spec.bs

alexmturner · 2024-05-01T18:24:38Z

spec.bs

-    1. [=Set the context ID for a batching scope=] given |contextId| and
-        |batchingScope|.
-    1. Run the following steps [=in parallel=]:
+1. If |preSpecifiedParams| is not null:


This one was a mistake -- we shouldn't be repeating this. Fixed this in #128.

alexmturner · 2024-05-01T18:25:59Z

spec.bs

-    1. [=Set the context ID for a batching scope=] given |contextId| and
-        |batchingScope|.
-    1. Run the following steps [=in parallel=]:
+1. If |preSpecifiedParams| is not null:


This one I think I might leave for now -- agreed that it's duplicate, but I don't think this PR changes that too much. I'm hoping to do a better job when we move this into the Shared Storage spec.

alexmturner

This is ready for review again. I think I've addressed the remaining comments and rebased appropriately.

spec.bs

alexmturner · 2024-05-01T18:42:38Z

spec.bs

    |options|["{{SharedStorageRunOperationMethodOptions/privateAggregationConfig}}"]["{{SharedStoragePrivateAggregationConfig/contextId}}"].
 1. If |contextId|'s [=string/length=] is greater than 64, return a new
    {{DOMException}} with name "`DataError`".
-1. Return |contextId|.
+1. Let |filteringIdByteSize| be the [=default filtering ID byte size=].
+1. If |options|["{{SharedStorageRunOperationMethodOptions/privateAggregationConfig}}"]["{{SharedStoragePrivateAggregationConfig/filteringIdByteSize}}"]


Had a go simplifying slightly. It's a little weird as we have to check if the entry exists first, but I think this is a little better.

spec.bs

arichiv

Note: I have not done a detailed review of the entire spec but the change just here look good to me. Thanks for splitting the PR into multiple parts, that made it a lot easier to read through

arichiv · 2024-05-02T10:31:35Z

spec.bs

@@ -1349,9 +1408,16 @@ event, PAExtendedHistogramContribution contribution)</dfn> method steps are:
        throw=] a {{TypeError}}.
 1. Otherwise, if |contribution|["{{PAHistogramContribution/value}}"] is
    negative, [=exception/throw=] a {{TypeError}}.
+1. If |contribution|["{{PAExtendedHistogramContribution/filteringId}}"] is


This is the third time we do something similar, it might be worth making an algo but up to you

alexmturner · 2024-05-02T14:29:58Z

Thanks for the review, @arichiv! I was also wondering if you had any feedback on the questions in #123 (comment).

arichiv · 2024-05-02T14:44:50Z

Two additional things I'm seeking feedback on:

I'm a little unsure what to name filteringIdByteSize. For context, this allows for using a custom size for the filtering ID space, e.g. you could specify 4 to allow for a 32-bit ID. (We only allow integer bytes which is why we went with ByteSize not BitSize.) Other options we were thinking of is filteringIdByteLength or filteringIdMaxBytes (maybe leaning towards this).

I like filteringIdMaxBytes given the usage I see.

The default filteringIdByteSize is 1. When a non-default value is used, a report is sent even if no contributions were sent (a "null report"). This is done to avoid a privacy concern from the different payload length. However, it might be easier to describe the behavior if we activate this null reporting logic whenever a filteringIdByteSize is explicitly specified, even if it's for the default value. Personally, I'm probably leaning against this as it's not needed for privacy.

Is there a design doc I could review? I'm not sure I understand the implications from what I've reviewed alone

alexmturner · 2024-05-02T15:17:18Z

The explainer here might help, but it is a bit terse: https://github.com/patcg-individual-drafts/private-aggregation-api/blob/main/flexible_filtering.md#small-id-space-by-default-but-configurable.

To elaborate on the privacy concern, when reports are not deterministically sent, the count of reports is potentially sensitive, i.e. based on cross-site data. For example, a script could choose whether or not to make contributions in a shared storage worklet (and a report would only be sent if there are contributions). The filtering ID feature allows scripts to also affect payload size (although only based on first-party data), which could allow an adversary to "partition" its counts -- i.e. it could separately count the number of reports with each different payload size to gain more information. Thus, this additional metadata would amplify the existing counting attack.

We prevent this by ensuring that, whenever a non-default byte size is chosen (and so a non-default payload size will occur) reports are sent deterministically, i.e. even if the script makes no contributions. This means that the count of reports with that payload size can no longer be affected by cross-site data.

The question here is whether we should also extend this protection to cases where a script has explicitly chosen the default byte size. There is no privacy reason to do this -- these reports are indistinguishable from reports that got the default due to not specifying anything. However, it arguably might be easier for developers to grok the "use the param => deterministic report" logic over "specify a non-default value => deterministic report" logic. There is also a small performance downside to this -- a report will be needlessly sent if no contributions are made in this case. I'm leaning against this due to this small performance impact and because it feels more natural to align the logic with the privacy requirement; it also means we can define the WebIDL in the form type filteringIdMaxBytes = 1 and not have special "not present" logic.

(And no worries if you don't really have opinions here -- happy to resolve internally if so)

arichiv · 2024-05-02T15:28:28Z

I get it now, thanks for the elaboration. I would suggest something slightly more elaborate: modifying SharedStoragePrivateAggregationConfig to take both a filteringIdMaxBytes and a new enum called something like reportingStrategy which is a string that can be wait-for-contributions or always-send (open to other naming).

filteringIdMaxBytes has the same current properties (defaults to 1).

reportingStrategy defaults to wait-for-contributions when filteringIdMaxBytes is 1 and always-send otherwise (doesn't distinguish between filteringIdMaxBytes being 1 explicitly or by default). If reportingStrategy is set to wait-for-contributions when filteringIdMaxBytes isn't 1 an error is thrown. If reportingStrategy is set to always-send when filteringIdMaxBytes is 1 that's fine.

arichiv

Based on last comment

alexmturner · 2024-05-02T16:54:09Z

Just switched the spec over to using "max bytes".

Oh interesting! I do see the appeal of making this property a bit more explicit, but I don't think any developer will want to opt-in to additional null reports if they can avoid it. Chatted briefly internally and, given that, we were thinking that leaving the logic as is should be simple enough for developers (and aligns with this idea that they want to minimize null reports when possible).

That being said, we could consider adopting randomized null reporting for the non-deterministic case in the future, similar to ARA's proposal. If we do end up pursuing that, we might want to return to this idea as there would be more of a meaningful tradeoff.

Thanks again both of you for all the reviews!

arichiv

It sounds like if the default or an explicit 1 is used then empty reports won't be sent? I'm fine with that. I'm just against having defaults behave differently from explicit settings identical to the defaults.

alexmturner · 2024-05-02T18:02:29Z

Yep, that's right!

linnan-github · 2024-05-04T15:07:24Z

spec.bs

@@ -1038,6 +1085,7 @@ partial interface SharedStorageWorkletGlobalScope {
 dictionary SharedStoragePrivateAggregationConfig {
  USVString aggregationCoordinatorOrigin;
  USVString contextId;
+  [EnforceRange] unsigned long long filteringIdMaxBytes;


Do we use unsigned long long to match `size_t in the implementation? Do we know whether there's similar API behavior?

That's a good point, doesn't JS max out at 53 bits for ints because that's the non-exponent part of a double (and all JS numbers are doubles)? Is an unsigned long enough?

If you really need an unsigned 64 bit int then you might need to accept it as a string and have the algo convert it internally.

This is just following the guidance in the Web Platform Design Principles here. We unfortunately didn't follow this for some of the other existing fields -- we may want to change those at some point.

This follows discussion on the spec PR: patcg-individual-drafts/private-aggregation-api#123 (comment). This change should be a no-op (and the feature relying on this has never been enabled by default anyway). Bug: 330744610 Change-Id: Ia06439b9a717ce8283fbc71f90a296bf9501d848 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5506763 Reviewed-by: Nan Lin <[email protected]> Commit-Queue: Alex Turner <[email protected]> Cr-Commit-Position: refs/heads/main@{#1296864}

SHA: a857522 Reason: push, by alexmturner Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Adds support for the new filtering_id and filtering_id_max_bytes fields. This support is not currently hooked up to either Shared Storage or Protected Audience. That support will be added in separate cls. The spec is available here: patcg-individual-drafts/private-aggregation-api#123 Bug: 330744610 Change-Id: Iade0c91a6ad018b7bd4b709a4a7615e783d8d9b0 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5443375 Reviewed-by: Qingxin Wu <[email protected]> Commit-Queue: Alex Turner <[email protected]> Reviewed-by: Yao Xiao <[email protected]> Reviewed-by: Mike Taylor <[email protected]> Reviewed-by: Nan Lin <[email protected]> Cr-Commit-Position: refs/heads/main@{#1297012}

linnan-github reviewed Apr 22, 2024

View reviewed changes

arichiv suggested changes Apr 30, 2024

View reviewed changes

alexmturner commented May 1, 2024

View reviewed changes

alexmturner mentioned this pull request May 1, 2024

Spec: Wrap context IDs in a "pre-specified report parameters" object #128

Merged

alexmturner force-pushed the filtering_ids branch from 90658fd to cc3ad57 Compare May 1, 2024 19:51

alexmturner changed the base branch from main to split_off_params_changes May 1, 2024 19:51

alexmturner commented May 1, 2024

View reviewed changes

arichiv approved these changes May 2, 2024

View reviewed changes

alexmturner force-pushed the filtering_ids branch from cc3ad57 to 8eaac44 Compare May 2, 2024 16:23

alexmturner force-pushed the split_off_params_changes branch from 36da89a to d3f4c99 Compare May 2, 2024 16:24

Base automatically changed from split_off_params_changes to main May 2, 2024 16:26

alexmturner added 3 commits May 2, 2024 16:26

Spec: filtering IDs

f587671

feedback

afec04b

byte size -> max bytes

3df2e53

alexmturner force-pushed the filtering_ids branch from 8eaac44 to 3df2e53 Compare May 2, 2024 16:29

arichiv suggested changes May 2, 2024

View reviewed changes

arichiv approved these changes May 2, 2024

View reviewed changes

linnan-github reviewed May 4, 2024

View reviewed changes

linnan-github approved these changes May 6, 2024

View reviewed changes

alexmturner merged commit a857522 into main May 6, 2024
1 check passed

alexmturner deleted the filtering_ids branch May 6, 2024 16:26

github-actions bot added a commit that referenced this pull request May 6, 2024

Spec: filtering IDs (#123)

bd26c0c

SHA: a857522 Reason: push, by alexmturner Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Spec: filtering IDs #123

Spec: filtering IDs #123

alexmturner commented Mar 29, 2024 •

edited by pr-preview bot

Loading

alexmturner commented Apr 22, 2024

linnan-github Apr 22, 2024

alexmturner Apr 22, 2024

linnan-github Apr 23, 2024

alexmturner Apr 23, 2024 •

edited

Loading

linnan-github Apr 22, 2024

alexmturner Apr 22, 2024

alexmturner commented Apr 22, 2024

alexmturner commented Apr 22, 2024

alexmturner commented Apr 22, 2024

alexmturner commented Apr 30, 2024

arichiv Apr 30, 2024

alexmturner May 1, 2024

arichiv Apr 30, 2024

alexmturner May 1, 2024

arichiv commented Apr 30, 2024 •

edited

Loading

alexmturner left a comment

alexmturner May 1, 2024

alexmturner May 1, 2024

alexmturner left a comment

alexmturner May 1, 2024

arichiv left a comment

arichiv May 2, 2024

alexmturner commented May 2, 2024

arichiv commented May 2, 2024

alexmturner commented May 2, 2024 •

edited

Loading

arichiv commented May 2, 2024 •

edited

Loading

arichiv left a comment

alexmturner commented May 2, 2024

arichiv left a comment

alexmturner commented May 2, 2024

linnan-github May 4, 2024 •

edited

Loading

arichiv May 4, 2024 •

edited

Loading

alexmturner May 6, 2024

Spec: filtering IDs #123

Spec: filtering IDs #123

Conversation

alexmturner commented Mar 29, 2024 • edited by pr-preview bot Loading

alexmturner commented Apr 22, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexmturner Apr 23, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexmturner commented Apr 22, 2024

alexmturner commented Apr 22, 2024

alexmturner commented Apr 22, 2024

alexmturner commented Apr 30, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arichiv commented Apr 30, 2024 • edited Loading

alexmturner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexmturner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

arichiv left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexmturner commented May 2, 2024

arichiv commented May 2, 2024

alexmturner commented May 2, 2024 • edited Loading

arichiv commented May 2, 2024 • edited Loading

arichiv left a comment

Choose a reason for hiding this comment

alexmturner commented May 2, 2024

arichiv left a comment

Choose a reason for hiding this comment

alexmturner commented May 2, 2024

linnan-github May 4, 2024 • edited Loading

Choose a reason for hiding this comment

arichiv May 4, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alexmturner commented Mar 29, 2024 •

edited by pr-preview bot

Loading

alexmturner Apr 23, 2024 •

edited

Loading

arichiv commented Apr 30, 2024 •

edited

Loading

alexmturner commented May 2, 2024 •

edited

Loading

arichiv commented May 2, 2024 •

edited

Loading

linnan-github May 4, 2024 •

edited

Loading

arichiv May 4, 2024 •

edited

Loading