OCR_ERROR : OCR failed on image / Details: socket hang up (hosted version) #494

Open
@oliverlukesch

Describe the bug
We currently (and consistently) need to wait a long time (~10 minutes or longer) for even simple files to finish. This is a new phenomenon and does not happen when using the vendor multimodal model option (for example GPT-4o).

When looking at the related job details inside cloud.llamaindex.ai, we see the following error messages:

Page 3 [warning] - OCR_ERROR : OCR failed on image /home/user/dist/worker/pipeline/../../../tmp/8c1b928c-9066-44b4-a7cd-795dc09edbe8/img/img_p2_2.png. Details: socket hang up
Page 3 [warning] - OCR_ERROR : OCR failed on image /home/user/dist/worker/pipeline/../../../tmp/8c1b928c-9066-44b4-a7cd-795dc09edbe8/img/img_p2_1.png. Details: socket hang up
Page 6 [warning] - OCR_ERROR : OCR failed on image /home/user/dist/worker/pipeline/../../../tmp/8c1b928c-9066-44b4-a7cd-795dc09edbe8/img/img_p5_5.png. Details: socket hang up

Files
One of the documents we've been testing with is the public AirBnB pitch deck; please find it attached.

AirBnB-Deck.pdf

Job ID

  • 31513454-e41c-41e9-8117-4aed16203ea5
  • b7964a02-c44e-435e-99a3-6231d98dff8f
  • 5ed77e71-a68b-421b-9856-809569f84311
    (and many more)

Client:

  • API

Additional context
Does not happen in multimodal mode.

This is the data we submit to your API via POST:

const formData = createFormData({
  file,
  parsing_instruction,
  do_not_cache: true,
  invalidate_cache: true,
  ...(richContent && {
    use_vendor_multimodal_model: true,
    vendor_multimodal_model_name: 'openai-gpt4o',
    vendor_multimodal_api_key: OPENAI_API_KEY,
  }),
})
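For anyone trying to reproduce this, the conditional spread is the only thing that differs between the failing (default) mode and the working (multimodal) mode. Below is a minimal sketch of which fields end up in the request in each case. `buildFields` and its parameter names are hypothetical and only mirror the snippet above; the actual `createFormData` helper is not shown in this report, so the sketch stops at the plain field object.

```typescript
// Hypothetical helper mirroring the payload above; not part of the reported code.
// It returns the plain field map that would be handed to createFormData.
function buildFields(
  file: unknown,
  parsingInstruction: string,
  richContent: boolean,
  openAiApiKey: string,
): Record<string, unknown> {
  return {
    file,
    parsing_instruction: parsingInstruction,
    do_not_cache: true,
    invalidate_cache: true,
    // Spreading `false` adds no keys, so the vendor fields are only
    // present when richContent is true.
    ...(richContent && {
      use_vendor_multimodal_model: true,
      vendor_multimodal_model_name: 'openai-gpt4o',
      vendor_multimodal_api_key: openAiApiKey,
    }),
  };
}

// Default mode (the one that shows the OCR_ERROR warnings): four fields.
console.log(Object.keys(buildFields('<file>', '<instruction>', false, '<key>')));

// Multimodal mode (the one that works): the same four plus three vendor fields.
console.log(Object.keys(buildFields('<file>', '<instruction>', true, '<key>')));
```

So in the slow case the request carries only `file`, `parsing_instruction`, `do_not_cache` and `invalidate_cache`; none of the vendor multimodal fields are sent.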

Labels

bug: Something isn't working
