document extractor with unstructured io for pptx does not function as expected #10956

@fdb02983rhy

Description

Self Checks

  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.11.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Use the doc extractor with Unstructured.io to process a PPTX file.

✔️ Expected Behavior

The file is processed successfully.

❌ Actual Behavior

api-1         |   File "/app/api/.venv/lib/python3.10/site-packages/gunicorn/workers/base_async.py", line 115, in handle_request
api-1         |     for item in respiter:
api-1         |   File "/app/api/.venv/lib/python3.10/site-packages/werkzeug/wsgi.py", line 256, in __next__
api-1         |     return self._next()
api-1         |   File "/app/api/.venv/lib/python3.10/site-packages/werkzeug/wrappers/response.py", line 32, in _iter_encoded
api-1         |     for item in iterable:
api-1         |   File "/app/api/.venv/lib/python3.10/site-packages/flask/helpers.py", line 113, in generator
api-1         |     yield from gen
api-1         |   File "/app/api/libs/helper.py", line 186, in generate
api-1         |     yield from response
api-1         |   File "/app/api/core/app/features/rate_limiting/rate_limit.py", line 115, in __next__
api-1         |     return next(self.generator)
api-1         |   File "/app/api/core/app/apps/base_app_generate_response_converter.py", line 25, in _generate_full_response
api-1         |     for chunk in cls.convert_stream_full_response(response):
api-1         |   File "/app/api/core/app/apps/advanced_chat/generate_response_converter.py", line 67, in convert_stream_full_response
api-1         |     for chunk in stream_response:
api-1         |   File "/app/api/core/app/apps/advanced_chat/generate_task_pipeline.py", line 187, in _to_stream_response
api-1         |     for stream_response in generator:
api-1         |   File "/app/api/core/app/apps/advanced_chat/generate_task_pipeline.py", line 218, in _wrapper_process_stream_response
api-1         |     for response in self._process_stream_response(tts_publisher=tts_publisher, trace_manager=trace_manager):
api-1         |   File "/app/api/core/app/apps/advanced_chat/generate_task_pipeline.py", line 319, in _process_stream_response
api-1         |     workflow_node_execution = self._handle_workflow_node_execution_failed(event)
api-1         |   File "/app/api/core/app/task_pipeline/workflow_cycle_manage.py", line 339, in _handle_workflow_node_execution_failed
api-1         |     WorkflowNodeExecution.process_data: json.dumps(event.process_data) if event.process_data else None,
api-1         |   File "/usr/local/lib/python3.10/json/__init__.py", line 231, in dumps
api-1         |     return _default_encoder.encode(obj)
api-1         |   File "/usr/local/lib/python3.10/json/encoder.py", line 199, in encode
api-1         |     chunks = self.iterencode(o, _one_shot=True)
api-1         |   File "/usr/local/lib/python3.10/json/encoder.py", line 257, in iterencode
api-1         |     return _iterencode(o, 0)
api-1         |   File "/app/api/.venv/lib/python3.10/site-packages/frozendict/__init__.py", line 32, in default
api-1         |     return BaseJsonEncoder.default(
api-1         |   File "/usr/local/lib/python3.10/json/encoder.py", line 179, in default
api-1         |     raise TypeError(f'Object of type {o.__class__.__name__} '
api-1         | TypeError: Object of type File is not JSON serializable
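The last frames of the traceback show `json.dumps(event.process_data)` raising because `process_data` contains a `File` object. The sketch below reproduces that failure mode in isolation and shows one possible workaround using the `default` hook of `json.dumps`. The `File` dataclass, its fields, the `documents` key, and the `to_jsonable` helper are all hypothetical stand-ins for illustration; they are not Dify's actual classes or its fix.

```python
import json
from dataclasses import dataclass


@dataclass
class File:
    """Hypothetical stand-in for the File object named in the traceback."""
    filename: str
    extension: str


def to_jsonable(obj):
    """Fallback passed to json.dumps: convert a File into a plain dict."""
    if isinstance(obj, File):
        return {"filename": obj.filename, "extension": obj.extension}
    # Anything else is still rejected, matching the encoder's default behavior.
    raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")


# Assumed shape of the payload: a dict that still holds a raw File object.
process_data = {"documents": [File(filename="slides.pptx", extension=".pptx")]}

# Without a fallback, the encoder raises the same TypeError as in the log.
try:
    json.dumps(process_data)
    failed = False
    error_message = ""
except TypeError as exc:
    failed = True
    error_message = str(exc)

# With the fallback, the File is converted before encoding.
serialized = json.dumps(process_data, default=to_jsonable)
```

If this reading of the traceback is right, the error comes from recording the failed node execution rather than from the PPTX parsing itself, so the original extraction error may be hidden behind it.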

Labels: 🐞 bug (Something isn't working)
