feat: Add ability to disable layout processing by maxdswain · Pull Request #3168 · docling-project/docling

maxdswain · 2026-03-22T19:27:05Z

Overview
Add the ability to disable layout processing of documents with the do_layout option in PdfPipelineOptions.

Description of Changes
The do_layout option is added to PdfPipelineOptions and passed to every class that inherits from BaseLayoutModel, as well as to the factories in StandardPdfPipeline and LegacyStandardPdfPipeline. When making these changes I followed how do_ocr was implemented.
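
The flag-gating pattern described here can be sketched in a few lines. This is a minimal, self-contained illustration and not the PR's actual code: `ToyPdfPipelineOptions`, `ToyPage`, `ToyLayoutModel`, and `build_layout_model` are hypothetical stand-ins, and the sketch assumes the option is forwarded to the model as an `enabled` flag that turns the stage into a pass-through when it is false.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class ToyPdfPipelineOptions:
    """Hypothetical stand-in for PdfPipelineOptions; not the real class."""
    do_ocr: bool = True
    do_layout: bool = True  # the flag this PR proposes


@dataclass
class ToyPage:
    page_no: int
    clusters: List[str] = field(default_factory=list)


class ToyLayoutModel:
    """Hypothetical layout stage gated by an `enabled` flag."""

    def __init__(self, enabled: bool) -> None:
        self.enabled = enabled

    def __call__(self, pages: Iterable[ToyPage]) -> Iterator[ToyPage]:
        for page in pages:
            if self.enabled:
                # Placeholder for a real layout prediction.
                page.clusters.append("text")
            # When disabled, the page passes through untouched.
            yield page


def build_layout_model(options: ToyPdfPipelineOptions) -> ToyLayoutModel:
    # The factory forwards the option to the model as its `enabled` flag.
    return ToyLayoutModel(enabled=options.do_layout)


pages_on = list(build_layout_model(ToyPdfPipelineOptions())([ToyPage(1)]))
pages_off = list(
    build_layout_model(ToyPdfPipelineOptions(do_layout=False))([ToyPage(1)])
)
print(pages_on[0].clusters, pages_off[0].clusters)
```

With the flag off, the toy page leaves the stage with no clusters at all, which is the situation any downstream stage would then have to handle.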

Issue resolved by this Pull Request:
Resolves #3011

Checklist:

Documentation has been updated, if necessary.
Examples have been added, if necessary.
Tests have been added, if necessary.

Signed-off-by: Max Swain <[email protected]>

github-actions · 2026-03-22T19:27:15Z

✅ DCO Check Passed

Thanks @maxdswain, all your commits are properly signed off. 🎉

mergify · 2026-03-22T19:27:41Z

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:
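
To see what this rule accepts, the pattern can be tried locally with Python's `re` module. This is only a sketch: the helper name `is_conventional` is made up for illustration, and it assumes the rule behaves like a regular-expression search on the PR title.

```python
import re

# Pattern copied verbatim from the merge-protection rule.
TITLE_RE = re.compile(
    r"^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:"
)


def is_conventional(title: str) -> bool:
    """Return True if the title starts with a conventional-commit prefix."""
    # Assumption: the rule is applied as a regex search; the leading ^ anchors it.
    return TITLE_RE.search(title) is not None


for title in (
    "feat: Add ability to disable layout processing",
    "fix(cli)!: drop flag",
    "Add ability to disable layout processing",
):
    print(title, "->", is_conventional(title))
```

The first two titles match (a bare type, and a type with scope and breaking-change marker); the third has no type prefix and does not.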

dosubot · 2026-03-22T19:29:20Z

Related Documentation

2 document(s) may need updating based on files changed in this PR:

Docling

How to properly enable `enable_remote_services` in Docling Serve (CPU image) to use an external OpenAI-compatible API for picture description and formula enrichment, and what is the correct config format?

View Suggested Changes

@@ -89,5 +89,5 @@
 
 - Set `UVICORN_WORKERS=1` (required at 8 GB RAM)
 - Set `DOCLING_NUM_THREADS=4` and `OMP_NUM_THREADS=4`
-- Disable unused features in requests (`do_ocr=false`, `do_table_structure=false`, etc.) unless explicitly needed
+- Disable unused features in requests (`do_ocr=false`, `do_table_structure=false`, `do_layout=false`, etc.) unless explicitly needed
 - Use external APIs for formula and picture enrichment instead of local models


What are the detailed pipeline options and processing behaviors for PDF, DOCX, PPTX, and XLSX files in the Python SDK?

View Suggested Changes

@@ -8,6 +8,7 @@
     - `force_ocr`: Replace existing text with OCR-generated text
     - `ocr_engine`, `ocr_lang`: OCR engine and language options
     - `image_export_mode`: `placeholder`, `embedded`, `referenced`
+    - `do_layout` (default True): Enable document layout analysis to detect and classify page regions such as text blocks, headings, figures, tables, and other structural elements. Required for accurate content segmentation and reading-order reconstruction. Can be disabled to skip layout processing for faster performance when layout information is not needed.
     - `do_table_structure`, `table_mode`, `table_cell_matching`: Table extraction options (see Table Structure Models section below for details on TableFormer V1 and V2)
     - `do_code_enrichment`, `do_formula_enrichment`: Code/formula recognition
     - `vlm_pipeline_preset`, `vlm_pipeline_custom_config`, `picture_description_preset`, `picture_description_custom_config`, `code_formula_preset`, `code_formula_custom_config`: New model inference engine and preset options for VLM, picture description, and code/formula extraction
@@ -59,7 +60,7 @@
 
 ### PDF (continued)
 
-- **Pipeline Option Overrides**: The Python API allows you to override pipeline options at conversion time for a given format using the `format_options` argument. Only `do_*` flags (such as `do_ocr`, `do_table_structure`, `do_code_enrichment`, `do_formula_enrichment`, etc.) can be changed, and only from `True` to `False`. All other options must remain identical to those used at pipeline initialization. Attempting to enable a do_* flag or change other fields will result in an error. This enables per-call disabling of enrichment features without reinitializing the pipeline.
+- **Pipeline Option Overrides**: The Python API allows you to override pipeline options at conversion time for a given format using the `format_options` argument. Only `do_*` flags (such as `do_ocr`, `do_layout`, `do_table_structure`, `do_code_enrichment`, `do_formula_enrichment`, etc.) can be changed, and only from `True` to `False`. All other options must remain identical to those used at pipeline initialization. Attempting to enable a do_* flag or change other fields will result in an error. This enables per-call disabling of enrichment features without reinitializing the pipeline.
 - **Exporting Scanned/Image-Based PDFs**: When processing scanned or image-based PDFs with `force_full_page_ocr=True`, the layout model classifies full-page scans as `PictureItem` and OCR text is stored as children of those picture nodes. To export this OCR text via `export_to_markdown()` or `export_to_text()`, you must set the `traverse_pictures=True` parameter. Without this parameter, export functions will return empty results even though OCR text exists in the document.
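
The override rule stated in the "Pipeline Option Overrides" entry (only `do_*` flags may change, and only from `True` to `False`) can be expressed as a small check. This is a hedged sketch over plain dictionaries, not Docling's own validation code: `check_override` and the option values below are hypothetical, and `do_layout` appears only because this PR proposes it.

```python
from typing import Any, Dict, List


def check_override(initial: Dict[str, Any], override: Dict[str, Any]) -> List[str]:
    """Return violations of the override rule; an empty list means allowed."""
    errors: List[str] = []
    for key, new_value in override.items():
        old_value = initial.get(key)
        if new_value == old_value:
            continue  # unchanged options are always fine
        if not key.startswith("do_"):
            errors.append(f"{key}: only do_* flags may change")
        elif not (old_value is True and new_value is False):
            errors.append(f"{key}: may only change from True to False")
    return errors


# Hypothetical options used at pipeline initialization.
initial = {
    "do_ocr": True,
    "do_layout": True,
    "do_table_structure": False,
    "ocr_lang": ["en"],
}

print(check_override(initial, {"do_layout": False}))
print(check_override(initial, {"do_table_structure": True}))
print(check_override(initial, {"ocr_lang": ["de"]}))
```

Disabling a flag passes, while enabling a flag or changing a non-flag option is reported as a violation.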
 


cau-git · 2026-03-23T09:43:54Z

@maxdswain while this can technically be done the way you propose, the consequence will simply be zero output, which is obviously not useful. The layout model is critical for discovering the structure that the subsequent pipeline stages require to process a document at all. A proper solution would be to implement an ultra-cheap, no-AI layout detection algorithm, which can be plugged into the layout model factory we already provide.
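
To make the point concrete, here is a rough sketch of what an "ultra-cheap, no-AI" layout step might look like, together with a toy downstream stage that shows why an empty layout yields no output. Everything here is an assumption for illustration: `Cell`, `Cluster`, `heuristic_layout`, and `assemble` are hypothetical names, the vertical-gap heuristic is just one possible choice, and none of this reflects Docling's actual layout factory interface.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    """A text cell with a top-left-origin bounding box (hypothetical type)."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


@dataclass
class Cluster:
    label: str
    cells: List[Cell]


def heuristic_layout(cells: List[Cell], max_gap: float = 6.0) -> List[Cluster]:
    """Group cells into 'text' blocks wherever the vertical gap is small."""
    clusters: List[Cluster] = []
    prev_bottom: Optional[float] = None
    for cell in sorted(cells, key=lambda c: (c.top, c.left)):
        if prev_bottom is not None and cell.top - prev_bottom <= max_gap:
            # Close enough to the previous cell: same block.
            clusters[-1].cells.append(cell)
            prev_bottom = max(prev_bottom, cell.bottom)
        else:
            # Large gap (or first cell): start a new block.
            clusters.append(Cluster(label="text", cells=[cell]))
            prev_bottom = cell.bottom
    return clusters


def assemble(clusters: List[Cluster]) -> List[str]:
    """Toy downstream stage: one string per cluster, so no clusters, no output."""
    return [" ".join(c.text for c in cl.cells) for cl in clusters]


cells = [
    Cell("Title", 10, 10, 100, 20),
    Cell("First line", 10, 40, 200, 50),
    Cell("second line", 10, 52, 200, 62),
]
print(assemble([]))                       # layout skipped entirely
print(assemble(heuristic_layout(cells)))  # cheap heuristic layout
```

Skipping layout leaves the toy assembly stage with nothing to emit, whereas even this crude grouping gives it blocks to work with. A real replacement would also need to assign labels beyond plain text.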

maxdswain · 2026-03-23T09:58:46Z

> @maxdswain while this can be technically done the way you propose, the consequence will be simply zero output, which is obviously not useful. The layout model is critical to discover structure that the subsequent pipeline stages requires to process a document at all. A proper solution would be to implement an ultra-cheap, no-AI layout detection algorithm, which can be plugged into the layout model factory we already provide.

My bad then, I'll close this PR if this is the wrong approach. Thanks for the feedback!

maxdswain added 3 commits March 22, 2026 19:05

feat: Add ability to disable layout processing

6c842c1

Signed-off-by: Max Swain <[email protected]>

feat: Add support for disabling layout in docling cli

9f757b5

Signed-off-by: Max Swain <[email protected]>

docs: Add examples in custom_convert

360d22e

Signed-off-by: Max Swain <[email protected]>

maxdswain closed this Mar 23, 2026

maxdswain wants to merge 3 commits into docling-project:main from maxdswain:disable-layout-model
