Releases: Unstructured-IO/unstructured

0.18.28

09 Jan 19:24
82532ca

Enhancement

  • Optimize clean_extra_whitespace_with_index_run (codeflash)
  • Optimize recursive_xy_cut_swapped (codeflash)
  • Optimize _DocxPartitioner._parse_category_depth_by_style_name (codeflash)
  • Optimize VertexAIEmbeddingEncoder._add_embeddings_to_elements (codeflash)
  • Optimize ngrams (codeflash)
  • Optimize stage_for_datasaur (codeflash)

0.18.27

08 Jan 00:01
e3c4b52

Fixes

  • Comment no-ops in zoom_image (codeflash)
  • Fix an issue where elements with partially filled extracted text are marked as extracted

Enhancement

  • Optimize sentence_count (codeflash)
  • Optimize _PartitionerLoader._load_partitioner (codeflash)
  • Optimize detect_languages (codeflash)
  • Optimize contains_verb (codeflash)
  • Optimize get_bbox_thickness (codeflash)
  • Upgrade pdfminer.six to 20260107 to fix a ~15-18% performance regression caused by eager f-string evaluation
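
The eager f-string evaluation mentioned in the last item can be illustrated with a small sketch. It does not reproduce the pdfminer.six code; it assumes the cost came from strings being built for log calls that are then discarded, and the Expensive class and the "demo" logger name are hypothetical.

```python
import logging

logger = logging.getLogger("demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)    # DEBUG records are discarded


class Expensive:
    """Hypothetical object that counts how often it is converted to str."""

    def __init__(self):
        self.str_calls = 0

    def __str__(self):
        self.str_calls += 1
        return "expensive"


eager = Expensive()
lazy = Expensive()

# Eager: the f-string is fully built before logger.debug() is even called,
# so the conversion cost is paid although the record is thrown away.
logger.debug(f"value={eager}")

# Lazy: logging interpolates the arguments only if the record is emitted.
logger.debug("value=%s", lazy)

print(eager.str_calls, lazy.str_calls)
```

In a hot loop the eager form pays the formatting cost on every iteration, which is how a logging statement that never prints anything can still slow a parser down.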

0.18.26

05 Jan 21:41
ae0efca

Fixes

  • Pin deltalake<1.3.0 to fix ARM64 Docker builds (1.3.0 missing Linux ARM64 wheels)

0.18.25

Fixes

  • Security update: Removed the pdfminer.six version constraint and bumped pdfminer.six and urllib3 to address high-severity CVEs

0.18.24

30 Dec 17:54
7f2cb4c

Enhancement

  • Optimize OCRAgentTesseract.extract_word_from_hocr (codeflash)

Fixes

  • Security update: Bumped dependencies to address security vulnerabilities

0.18.22

10 Dec 17:56
afd9118

Fixes

  • fix(deps): Bump fonttools to address cve by @CyMule in #4125

Full Changelog: 0.18.21...0.18.22

0.18.21

24 Nov 14:55
91a9888

Enhancement

  • Update save_elements unit test to check crop box padding behavior

Fixes

  • Update unstructured-inference to 1.1.2 to address CVEs

0.18.20

15 Nov 00:14
7c4d0b9

Enhancement

  • Improve the VoyageAI integration
  • Add voyage-context-3 support
  • Flag extracted elements as such in the metadata for downstream use

0.18.18

07 Nov 01:05
b01d35b

Fixes

  • Prevent path traversal in email MSG attachment filenames. Fixed a security vulnerability (GHSA-gm8q-m8mv-jj5m) where malicious attachment filenames containing path traversal sequences could cause files to be written outside the intended directory. The fix normalizes both Unix and Windows path separators before sanitizing filenames, preventing cross-platform path traversal attacks in the partition_msg functions.
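
The approach described in this fix, normalizing both separator styles before sanitizing, can be sketched roughly as follows. This is not the library's actual code: sanitize_attachment_filename, its fallback argument, and the "attachments" directory are hypothetical names chosen for illustration.

```python
import os


def sanitize_attachment_filename(raw_name: str, fallback: str = "unknown") -> str:
    """Return a bare filename that cannot escape the output directory.

    Hypothetical helper; a sketch of the technique, not the library's API.
    """
    # Treat backslashes as separators on every platform, so Windows-style
    # paths are also neutralized when the code runs on Linux or macOS.
    normalized = raw_name.replace("\\", "/")
    # Keep only the final path component and drop NUL bytes.
    base = normalized.split("/")[-1].replace("\x00", "")
    # Reject names that are empty or point at the current/parent directory.
    if base in ("", ".", ".."):
        return fallback
    return base


output_dir = "attachments"  # hypothetical target directory
for name in ["report.pdf", "../../etc/passwd", "..\\..\\Windows\\evil.dll", ".."]:
    safe = sanitize_attachment_filename(name)
    print(os.path.join(output_dir, safe))
```

Normalizing first matters because os.path.basename on a POSIX host does not treat a backslash as a separator, so a Windows-style name would otherwise pass through intact.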

0.18.17

0.18.16

Enhancement

  • Speed up function _assign_hash_ids by 34% (codeflash)

0.18.15

17 Sep 14:27
2d44d73

What's Changed

New Contributors

Full Changelog: 0.18.14...0.18.15

0.18.14

26 Aug 13:25
fed8942

Enhancements

  • Speed up function sentence_count by 59% (codeflash)

  • Speed up function check_for_nltk_package by 111% (codeflash)

  • Speed up function under_non_alpha_ratio by 76% (codeflash)
