Implement batch API with changeset, upsert, and DataFrame integration#129
Conversation
Pull request overview
Implements a deferred-execution $batch API (client.batch) for Dataverse Web API operations, including transactional changesets and upsert support via UpsertMultiple, while refactoring OData request construction into reusable _build_* helpers.
Changes:
- Added `client.batch` namespace with `BatchRequest` builder, record/table/query batch operation namespaces, and `changeset()` transactional grouping.
- Introduced internal batch intent types + multipart serializer/parser, plus public `BatchResult`/`BatchItemResponse` models.
- Refactored `_ODataClient` to build requests via `_build_*` returning `_RawRequest`, enabling shared payload generation for direct and batch execution (including the UpsertMultiple alternate-key body fix).
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/PowerPlatform/Dataverse/client.py | Exposes new `client.batch` namespace. |
| src/PowerPlatform/Dataverse/operations/batch.py | Adds public batch builder + operation namespaces + changeset API. |
| src/PowerPlatform/Dataverse/models/batch.py | Adds public result models for batch responses. |
| src/PowerPlatform/Dataverse/data/_raw_request.py | Introduces internal `_RawRequest` dataclass for deferred request construction. |
| src/PowerPlatform/Dataverse/data/_batch.py | Implements batch intent resolution, multipart serialization, and response parsing. |
| src/PowerPlatform/Dataverse/data/_odata.py | Refactors CRUD/metadata/query to use `_build_*` + `_execute_raw`; adds `_build_upsert_multiple` fix and `_build_sql` encoding. |
| tests/unit/test_batch_operations.py | Unit tests for the `client.batch` API surface and result models. |
| tests/unit/data/test_batch_serialization.py | Unit tests for multipart serialization and response parsing, plus intent-resolution routing. |
| tests/unit/data/test_odata_internal.py | Adds tests for `_build_upsert_multiple` payload rules and conflict detection. |
| tests/unit/data/test_sql_parse.py | Adds tests for `_build_sql` URL encoding/round-tripping. |
| tests/unit/data/test_format_key.py | Adds tests for `_format_key` behavior. |
| README.md | Documents the batch feature with examples. |
| examples/advanced/batch.py | Adds a full batch usage example script. |
| examples/advanced/walkthrough.py | Adds a batch section to the walkthrough. |
| examples/basic/functional_testing.py | Adds a live-environment batch functional test routine. |
| src/PowerPlatform/Dataverse/claude_skill/dataverse-sdk-use/SKILL.md | Updates usage skill docs with the batch API. |
| src/PowerPlatform/Dataverse/claude_skill/dataverse-sdk-dev/SKILL.md | Updates dev skill docs to include the new namespace/module. |
| .claude/skills/dataverse-sdk-use/SKILL.md | Mirrors the usage skill docs update. |
| .claude/skills/dataverse-sdk-dev/SKILL.md | Mirrors the dev skill docs update. |
### Batch API Design

**Dataverse Python SDK — Overview**

The batch API lets callers bundle multiple Dataverse operations into a single HTTP POST to `$batch`. The public surface mirrors the existing zero-learning-curve design:
| Capability | How to use |
|---|---|
| Record CRUD (create / update / delete / get) | `batch.records.*` |
| Upsert by alternate key | `batch.records.upsert(...)` |
| Table metadata (create / delete / columns / relationships) | `batch.tables.*` |
| SQL queries | `batch.query.sql(...)` |
| Atomic write groups | `batch.changeset()` |
| Continue past failures | `batch.execute(continue_on_error=True)` |
`batch.records.create("account", {...})` accepts the same arguments as `client.records.create("account", {...})`. The only difference: batch methods return `None` immediately; results are available via `BatchResult` after `execute()`.

**Correlation scope.** Pre-resolution metadata GETs and the final `batch.execute()` form one correlation scope — all HTTP calls within a single `execute()` share the same `correlation-id`:

    GET  EntityDefinitions — MetadataId pre-resolution   correlation-id: shared   client-request-id: unique-A
    GET  EntityDefinitions — MetadataId pre-resolution   correlation-id: shared   client-request-id: unique-B
    POST /$batch — the entire batch payload              correlation-id: shared   client-request-id: unique-C
    └─ multipart body
       ├─ GET  accounts(guid)   ← bundled part, no SDK tracking ID
       ├─ POST accounts         ← bundled part, no SDK tracking ID
       └─ DEL  contacts(guid)   ← bundled part, no SDK tracking ID

**Limitations**
**Examples**

**Record CRUD in a single batch**
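A minimal, self-contained sketch of the deferred-execution pattern, using stub classes to show the queue-then-execute flow. The `Stub*` names are illustrative stand-ins, not the real SDK types:

```python
# Minimal sketch of the deferred-execution shape: operations queue locally
# and nothing is sent until execute(). Stub classes, NOT the real SDK.
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StubItemResponse:
    status_code: int
    data: Optional[Dict[str, Any]] = None


@dataclass
class StubBatchResult:
    responses: List[StubItemResponse] = field(default_factory=list)

    @property
    def succeeded(self) -> List[StubItemResponse]:
        return [r for r in self.responses if r.status_code < 400]


class StubBatch:
    """Queue operations now; nothing is sent until execute()."""

    def __init__(self) -> None:
        self._items: List[Dict[str, Any]] = []

    def add(self, method: str, url: str, body: Optional[dict] = None) -> None:
        # Batch methods return None immediately, mirroring the real API.
        self._items.append({"method": method, "url": url, "body": body})

    def execute(self) -> StubBatchResult:
        # The real SDK serializes self._items into one multipart POST $batch.
        return StubBatchResult([StubItemResponse(204) for _ in self._items])


batch = StubBatch()
batch.add("POST", "accounts", {"name": "Contoso"})
batch.add("PATCH", "accounts(<guid>)", {"telephone1": "555-0100"})
batch.add("DELETE", "contacts(<guid>)")
result = batch.execute()
# One response per queued operation, in submission order.
```

The design point the stub demonstrates: callers get no return value at queue time, and all results arrive together on the `BatchResult`.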
**Transactional changeset with content-ID chaining**
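A sketch of the mechanics behind content-ID chaining inside a changeset: each part gets a `Content-ID` header, and a later part may reference an earlier one via `$1`-style values in `@odata.bind`. The function names here are illustrative, not the SDK's internals:

```python
# Sketch: content-ID assignment and backward-reference validation inside a
# changeset. Illustrative helper names, not the SDK's internals.
from typing import Dict, List, Tuple


def assign_content_ids(ops: List[Dict]) -> List[Tuple[int, Dict]]:
    # Each changeset part carries a Content-ID header (1-based, in order).
    return [(i + 1, op) for i, op in enumerate(ops)]


def validate_refs(body: Dict, earlier_ids: set) -> None:
    # A value like "$1" refers to the part with Content-ID: 1; only
    # earlier parts in the same changeset may be referenced.
    for value in body.values():
        if isinstance(value, str) and value.startswith("$"):
            ref = int(value[1:].split("/")[0])
            if ref not in earlier_ids:
                raise ValueError(f"Content-ID {ref} is not an earlier part")


ops = [
    {"name": "Contoso"},                            # → Content-ID: 1
    {"lastname": "Doe",
     "parentcustomerid_account@odata.bind": "$1"},  # references part 1
]
seen: set = set()
for cid, body in assign_content_ids(ops):
    validate_refs(body, seen)  # raises on forward or unknown references
    seen.add(cid)
```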
**Upsert by alternate key**
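A sketch of the two mechanics involved: alternate-key addressing in the URL, and the `If-Match` header difference between update and upsert. Function and parameter names are illustrative, not the SDK's:

```python
# Sketch: alternate-key URL addressing and the If-Match header rule.
# Illustrative names, not the SDK's request builders.
from typing import Dict


def patch_headers(is_upsert: bool) -> Dict[str, str]:
    headers = {"Content-Type": "application/json"}
    if not is_upsert:
        # update: require the record to already exist
        headers["If-Match"] = "*"
    # upsert: no If-Match, so the server creates or updates as needed
    return headers


def alternate_key_url(entity_set: str, alt_key: Dict[str, str]) -> str:
    # Alternate-key addressing, e.g. accounts(accountnumber='A-100')
    key = ",".join(f"{name}='{value}'" for name, value in alt_key.items())
    return f"{entity_set}({key})"


url = alternate_key_url("accounts", {"accountnumber": "A-100"})
# → accounts(accountnumber='A-100')
```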
Upsert uses `PATCH` without an `If-Match` header — the server creates the record if it does not exist, or updates it if it does. This differs from `records.update`, which includes `If-Match: *` and fails if the record is absent.

**Table metadata operations**
**Error handling**
SDK-level errors (…)
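Partitioning per-part responses into succeeded/failed, as `BatchResult` exposes, can be sketched as follows. `PartResponse` is an illustrative stand-in, not the SDK's `BatchItemResponse`:

```python
# Sketch: splitting part responses into succeeded/failed, mirroring the
# .succeeded / .failed split on BatchResult. Illustrative stand-in class.
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartResponse:
    status_code: int
    error: Optional[dict] = None


parts = [
    PartResponse(201),
    PartResponse(400, {"code": "0x80040203", "message": "Invalid payload"}),
    PartResponse(204),
]
succeeded = [p for p in parts if 200 <= p.status_code < 300]
failed = [p for p in parts if p.status_code >= 400]
for p in failed:
    print(p.status_code, p.error["message"])  # prints: 400 Invalid payload
```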
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.
Comments suppressed due to low confidence (1)
tests/unit/data/test_batch_serialization.py:1
- This test is in the wrong file.
`_targets` calls `self.od._build_upsert_multiple`, which is an `_ODataClient` method. This test class (`TestBuildUpsertMultiple`) belongs in `test_odata_internal.py` and is already defined there (lines 344–421). The copy in `test_batch_serialization.py` is a duplicate that validates the same logic. More importantly, the test's docstring says "it passes through to body" when the alternate key value matches — but looking at `_build_upsert_multiple` in `_odata.py`, keys present in both `alt_key_lower` and `record_processed` with equal values are not explicitly excluded; however, the code does not explicitly include them either unless they were already in `record_processed`. The test does not actually assert that `accountnumber` is present in `target` (the body), so the description "passes through to body" is not verified. The test should either assert `assertIn("accountnumber", target)` or be renamed to reflect what is actually checked.
# Copyright (c) Microsoft Corporation.
- Add 202/207 to expected batch response status codes
- Fix return type annotations: `List[tuple]` -> `List[Tuple[Dict[str, str], str]]`
- Fix OptionSet check: use dict key lookup instead of string search in JSON body
- Lowercase select column names in `_build_get` for consistency with `_get_multiple`
- Add select/filter params to batch `tables.list` (parity with PR #112)
- Update `_build_list_entities` to accept filter/select parameters
- Add docstring note on RFC 3986 %20 encoding in `_build_sql`
- Fix merge-related test failures: parse `data` bytes instead of `json` kwarg
- Add `_TableList` dataclass fields for filter/select
Pull request overview
Copilot reviewed 22 out of 22 changed files in this pull request and generated 6 comments.
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 7 comments.
src/PowerPlatform/Dataverse/claude_skill/dataverse-sdk-use/SKILL.md
Outdated
Pull request overview
Copilot reviewed 23 out of 23 changed files in this pull request and generated 3 comments.
Add unit tests for batch serialization, OData key formatting, SQL parsing, and batch operations

- Implemented unit tests for internal batch multipart serialization and response parsing in `test_batch_serialization.py`.
- Added tests for `_ODataClient._format_key` functionality in `test_format_key.py`.
- Enhanced SQL parsing tests in `test_sql_parse.py` to cover URL encoding scenarios.
- Created comprehensive tests for batch operations, including record and table operations, in `test_batch_operations.py`.
… batch processing components
…into the body; add unit tests for _ODataClient._build_upsert_multiple validation
…hance related tests
… fixes

Spec compliance and correctness:

- Skip empty changesets in `_resolve_all` instead of producing invalid multipart
- Extract content-id from non-changeset response parts (was passing None)

Edge case tests (40 new tests in test_batch_edge_cases.py):

- Empty changeset handling (skipped silently)
- Changeset error/rollback response parsing
- Content-ID in standalone and changeset response parts
- Mixed batch: changeset writes + standalone GETs
- Multiple changesets with globally unique content IDs
- Batch size limit counting across changesets
- Top-level batch error handling (JSON, non-JSON, empty body)
- Batch without continue-on-error (first failure stops)
- Batch with continue-on-error (mixed success/failure)
- OData multipart serialization compliance (CRLF, boundaries, headers)
- BatchResult computed properties edge cases
- Multipart response parsing edge cases (REQ_ID header, GUID formats)
- Content-ID reference format and usage (in @odata.bind, update, delete)
- Intent validation for unknown types
- Batch boundary format validation

DataFrame + Batch integration:

- New BatchDataFrameOperations class (batch.dataframe namespace)
- batch.dataframe.create(table, df) -- DataFrame rows to CreateMultiple
- batch.dataframe.update(table, df, id_column) -- DataFrame to updates
- batch.dataframe.delete(table, ids_series) -- pandas Series to deletes
- 18 new tests in test_batch_dataframe.py covering all operations

Total: 579 tests passing (58 new tests added)
BatchResult.created_ids now extracts GUIDs from two sources:
- entity_id from OData-EntityId header (individual POST creates)
- data['Ids'] array from CreateMultiple/UpsertMultiple action responses
Previously, bulk creates via CreateMultiple returned 200 OK with
{'Ids': [...]} in the body but created_ids only looked at the
OData-EntityId header (which is absent for action responses).
Added 7 new unit tests covering:
- CreateMultiple response body parsing
- Mixed single + bulk creates in one batch
- Non-string ID filtering
- Failed CreateMultiple exclusion
- Full multipart response simulation
- Add DataFrame integration section to README batch docs
- Add Example 7 (DataFrame batch operations) to examples/advanced/batch.py
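The DataFrame-to-batch conversion mentioned here can be sketched as below. This assumes pandas; `rows_for_create_multiple` is an illustrative helper (the `Targets` collection is the parameter shape CreateMultiple expects), not the SDK's implementation:

```python
# Sketch: turning DataFrame rows into a CreateMultiple-style payload,
# dropping NaN cells per row. Illustrative helper, not the SDK's code.
import math

import pandas as pd


def rows_for_create_multiple(df: pd.DataFrame) -> list:
    rows = []
    for record in df.to_dict(orient="records"):
        # Omit NaN cells so absent values are not sent as nulls.
        rows.append({k: v for k, v in record.items()
                     if not (isinstance(v, float) and math.isnan(v))})
    return rows


df = pd.DataFrame({"name": ["A", "B"], "revenue": [100.0, float("nan")]})
payload = {"Targets": rows_for_create_multiple(df)}
```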
Design improvement: created_ids now ONLY returns entity_id values from
OData-EntityId response headers (the standard OData response mechanism).
It no longer auto-extracts IDs from CreateMultiple/UpsertMultiple response
body data['Ids'].
Rationale: A batch can contain heterogeneous operations (creates, updates,
deletes, queries, table ops). Mixing two different response formats
(OData-EntityId header vs action body) into one property was non-standard
and could mislead callers. The OData Web API pattern is for callers to
iterate result.responses and handle each response by type.
For CreateMultiple/UpsertMultiple bulk IDs, callers access them via:

    for resp in result.succeeded:
        if resp.data and 'Ids' in resp.data:
            bulk_ids = resp.data['Ids']
This aligns with the .NET SDK batch response model (BatchResponse returns
raw HttpResponseMessages for caller iteration) and follows OData spec
conventions.
Updated tests to verify the correct access pattern.
- Use VALIDATION_SQL_EMPTY constant instead of string literal in batch sql()
- Use VALIDATION_UNSUPPORTED_COLUMN_TYPE constant in all 3 _odata.py locations
- Import the constant from _error_codes.py
- Add input validation to _build_create_multiple (all items must be dicts)
- Narrow test assertions from Exception to ValidationError in test_odata_internal
The OData-EntityId header is returned by both POST (create) and PATCH (update) operations, not just creates. The name created_ids was misleading. entity_ids accurately reflects that it collects GUIDs from the standard OData-EntityId response header across all successful operations that return it (creates and updates). GET and DELETE operations do not return this header. For CreateMultiple/UpsertMultiple bulk action responses, callers access IDs via response.data['Ids'] as documented in the property's docstring. Updated all references across src, tests, examples, README, and SKILL files.
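The `OData-EntityId` extraction that `entity_ids` relies on can be sketched as follows; the regex and helper name are illustrative, not necessarily the SDK's `_GUID_RE`:

```python
# Sketch: pulling a GUID out of an OData-EntityId response header, the
# standard mechanism behind entity_ids. Illustrative helper, not SDK code.
import re

_GUID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}")


def entity_id_from_header(header_value: str):
    # Header looks like:
    #   https://org.crm.dynamics.com/api/data/v9.2/accounts(<guid>)
    m = _GUID.search(header_value)
    return m.group(0) if m else None


hdr = ("https://org.crm.dynamics.com/api/data/v9.2/"
       "accounts(11111111-2222-3333-4444-555555555555)")
print(entity_id_from_header(hdr))  # → 11111111-2222-3333-4444-555555555555
```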
New test_batch_scenarios.py (20 tests) - executable documentation covering:

- Response ordering matches operation order
- CreateMultiple IDs in data['Ids'], not entity_ids
- Update responses also return entity_id
- GET response: data, not entity_id
- DELETE response: no data, no entity_id
- SQL query result rows in data['value']
- Empty batch returns empty result (no HTTP call)
- Double execute is safe (sends two requests)
- Content-ID scope: only within same changeset
- add_columns: N columns = N responses
- tables.create returns 204, no metadata
- continue_on_error: without vs with
- Changeset rollback error shape
- DataFrame create: IDs in data['Ids']
- Mixed batch: changeset responses then standalone
- Individual response status checking pattern
- Batch max size validation (pre-flight)
- Error response field availability

Updated examples/advanced/batch.py:

- Example 8: Response data patterns -- shows how to handle each response type (single create, bulk create, query, delete)
- Remove unused imports in test_batch_scenarios.py (json, _RecordCreate, _RecordDelete, _RecordUpdate, _QuerySql, _CRLF, BatchRequest)
- Remove unused imports in test_batch_dataframe.py (patch, BatchRecordOperations)
- Remove unused import in test_batch_edge_cases.py (patch)
- Fix BatchRequest._items type annotation: list -> List[Any]
- Fix example GUID: nonexistent-guid -> 00000000-0000-0000-0000-000000000000
- Fix SKILL.md docs: entity_ids includes creates AND updates, not just creates
7 new tests in TestRobustnessEdgeCases covering:

- Malformed JSON body in batch response (silently handled)
- Truncated JSON body (silently handled)
- Exception in changeset context manager (changeset still in items)
- Empty string table name (accepted, validated downstream)
- Single-quote escaping in OData filter (_escape_odata_quotes)
- Non-dict JSON body (list) in response (data stays None)
- Boundary strings with special characters (+, /)

Verified by systematic audit:

- All user values in OData filters use _escape_odata_quotes
- No bare except in batch code (only except Exception with fallback)
- No mutable default arguments
- json.dumps handles serialization safely
- requests library auto-encodes URLs for non-batch path
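The quote-escaping rule the audit mentions can be sketched as below, assuming `_escape_odata_quotes` follows the standard OData convention of doubling single quotes inside string literals:

```python
# Sketch: single-quote escaping for OData filter string literals.
# In OData, a quote inside a string literal is escaped by doubling it.
def escape_odata_quotes(value: str) -> str:
    return value.replace("'", "''")


name = escape_odata_quotes("O'Brien")
flt = f"name eq '{name}'"
# → name eq 'O''Brien'
```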
- Fix pip install command in batch example (PowerPlatform-Dataverse-Client)
- Hoist entity_set call above if-check in _resolve_record_create
- Extract _require_entity_metadata helper for duplicate table lookups
- Add inline comment explaining 400 in expected status codes
- Add annotation explaining why example 5 is commented out
- Populate __all__ in operations/batch.py with public classes
- Use _GUID_RE from _odata.py for consistent GUID regex
- Add _resolve_record_update tests (single, multiple, invalid changes)
- Add delete tests (multiple IDs, no-bulk, empty list, empty strings)
Force-pushed from 33a45bc to be47e9e
…ne regex

- Remove unused imports in test_batch_edge_cases.py (_RecordCreate, _RecordUpdate, _QuerySql, _TableList)
- Remove unused BatchResult import in test_batch_serialization.py
- Remove dead _mock_batch_response function in test_batch_scenarios.py
- Compile boundary regex to module-level _BOUNDARY_RE constant in _batch.py
Pull request overview
Copilot reviewed 24 out of 24 changed files in this pull request and generated 1 comment.
…ext manager

- Add :type: directives to all batch docstrings (Microsoft Learn compatibility)
- Fix _resolve_record_update: raise TypeError (not ValidationError) to match RecordOperations.update() convention
- Add :raises: directive to BatchQueryOperations.sql()
- Fix :type sql: format from double backticks to :class:`str`
- Use context manager in batch example (`with DataverseClient(...) as client:`)
Summary
- `client.batch` namespace -- a deferred-execution batch API that packs multiple Dataverse Web API operations into a single `POST $batch` HTTP request
- `client.batch.dataframe` namespace -- pandas DataFrame wrappers for batch operations
- `client.records.upsert()` and `client.batch.records.upsert()` backed by the `UpsertMultiple` bound action with alternate-key support
- Fix for alternate-key values in the `UpsertMultiple` body, which caused `400 Bad Request` on the create path

### Batch API Design
Implements the Batch API Design spec from @sagebree:

- `batch.records.*`
- `batch.records.upsert(...)`
- `batch.tables.*`
- `batch.query.sql(...)`
- `batch.changeset()`
- `batch.execute(continue_on_error=True)`
- `batch.dataframe.create/update/delete`

Design constraints enforced:

- `records.get` paginated overload not supported -- single-record only
- `tables.create` returns no table metadata on success (HTTP 204)
- `tables.add_columns` / `tables.remove_columns` do not flush the picklist cache
- `client.flush_cache()` not supported in batch (client-side operation)

### What's included
New: `client.batch` API

- `batch.records.create / get / update / delete / upsert`
- `batch.tables.create / get / list / add_columns / remove_columns / delete`
- `batch.tables.list(filter=..., select=...)` -- parity with `client.tables.list()` from "Add filter and select parameters to client.tables.list()" #112
- `batch.tables.create_one_to_many_relationship / create_many_to_many_relationship / delete_relationship / get_relationship / create_lookup_field`
- `batch.query.sql`
- `batch.changeset()` context manager for transactional (all-or-nothing) operations
- `execute(continue_on_error=True)` for mixed success/failure batches
- `BatchResult` with `.responses`, `.succeeded`, `.failed`, `.created_ids`, `.has_errors`

New: `client.batch.dataframe` API

- `batch.dataframe.create(table, df)` -- DataFrame rows to a CreateMultiple batch item
- `batch.dataframe.update(table, df, id_column)` -- DataFrame rows to update batch items
- `batch.dataframe.delete(table, ids_series)` -- pandas Series to delete batch items

Existing: Refactored existing APIs

- `_build_*` / `_RawRequest` pattern
- `execute()`

### OData $batch spec compliance

- `Content-Transfer-Encoding: binary` per part
- `Content-Type: application/http` per part
- `Content-Type: application/json; type=entry` for POST/PATCH bodies
- `HttpError` with parsed Dataverse error details
- `200`, `202 Accepted`, `207 Multi-Status`, and `400` batch response codes

### Review comment fixes

- Expected status codes include `202`/`207` for all Dataverse environments
- `_split_multipart` / `_parse_mime_part` return type annotations: `List[Tuple[Dict[str, str], str]]`
- `_build_get` lowercases select column names (consistency with `_get_multiple`)
- `%20` encoding documentation in the `_build_sql` docstring
- Parse `data` bytes instead of the `json` kwarg
- `batch.records.upsert()` raises `TypeError` (matching `client.records.upsert()`)

### Testing
Unit tests -- 579 tests passing:
- `test_batch_operations.py` -- BatchRequest, BatchRecordOperations, BatchTableOperations, BatchQueryOperations, ChangeSet, BatchItemResponse, BatchResult
- `test_batch_serialization.py` -- multipart serialization, response parsing, intent resolution, upsert dispatch, batch size limit, content-ID uniqueness, top-level error handling
- `test_batch_edge_cases.py` -- 40 edge case tests: empty changeset, changeset rollback, content-ID in standalone parts, mixed batch, multiple changesets, batch size limits, top-level errors, continue-on-error, serialization compliance, multipart parsing, content-ID references, intent validation
- `test_batch_dataframe.py` -- 18 tests: DataFrame create/update/delete, validation, NaN handling, empty series, bulk delete
- `test_odata_internal.py` -- `_build_upsert_multiple` body exclusion, conflict detection, URL/method correctness

E2E tests -- 14 tests passing against live Dataverse (`crm10.dynamics.com`):

- … (`$ref` content-ID)
- … (`$ref`)

### Examples & docs

- `examples/advanced/batch.py` -- reference examples for all batch operation types
- `examples/advanced/walkthrough.py` -- batch section added (section 11)
- `examples/basic/functional_testing.py` -- `test_batch_all_operations()` covering all operation categories against a live environment
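The spec-compliance points listed in the summary (CRLF line endings, `application/http` parts, `Content-Transfer-Encoding: binary`, `type=entry` JSON bodies) can be sketched as a single-part serializer. This is illustrative, not the SDK's `_batch.py` implementation:

```python
# Sketch: serializing one part of an OData $batch multipart body with the
# per-part headers the spec requires. Illustrative, not the SDK's code.
import json

CRLF = "\r\n"  # multipart bodies use CRLF line endings per the spec


def serialize_part(boundary: str, method: str, url: str, body=None) -> str:
    lines = [
        f"--{boundary}",
        "Content-Type: application/http",
        "Content-Transfer-Encoding: binary",
        "",
        f"{method} {url} HTTP/1.1",
    ]
    if body is not None:
        lines += ["Content-Type: application/json; type=entry",
                  "", json.dumps(body)]
    else:
        lines += [""]
    # A full payload would end with the closing delimiter: --{boundary}--
    return CRLF.join(lines) + CRLF


part = serialize_part("batch_abc123", "POST", "accounts", {"name": "Contoso"})
```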