0.16.6

@ryannikolaidis ryannikolaidis released this 22 Nov 02:09
626f73a

Enhancements

  • Every <table> tag is treated as ontology.Table: Added special handling for tables in HTML partitioning to improve the accuracy of table extraction from HTML documents.
  • Every HTML tag has a default ontology class assigned: When parsing HTML to ontology, each HTML tag defined in the ontology now has a default ontology class. This allows assigning an ontology class instead of UncategorizedText when the HTML tag is predicted correctly but has no class assigned.
  • Use an average weighted by the actual number of tables for table metrics: When evaluating table metrics, the mean aggregation now weights each document's metric scores by the actual number of tables in that document.
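
To illustrate the weighted aggregation described in the last item, here is a minimal sketch. It is not the library's evaluation code: the function name `weighted_mean` and the sample scores and table counts are assumptions made up for this example.

```python
from typing import Sequence


def weighted_mean(scores: Sequence[float], table_counts: Sequence[int]) -> float:
    """Average per-document scores, weighting each by its number of tables.

    Hypothetical helper for illustration; not part of the library's API.
    """
    if len(scores) != len(table_counts):
        raise ValueError("scores and table_counts must have the same length")
    total_tables = sum(table_counts)
    if total_tables == 0:
        # No tables anywhere: the weighted mean is undefined.
        return float("nan")
    return sum(s * n for s, n in zip(scores, table_counts)) / total_tables


# Made-up data: document A has 1 table, document B has 3 tables.
scores = [1.0, 0.5]
table_counts = [1, 3]

unweighted = sum(scores) / len(scores)          # each document counts equally
weighted = weighted_mean(scores, table_counts)  # each table counts equally
print(unweighted, weighted)
```

With a plain mean, a document with a single table influences the result as much as a document with many; weighting by table count lets each table contribute equally.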

Features

  • None added in this release.

Fixes

  • ElementMetadata consolidation: text_as_html metadata is now combined across all elements in a CompositeElement when chunking HTML output.
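
The idea behind this fix can be sketched as follows. This is a simplified stand-in, not the library's implementation: the `Piece` class and `combine_text_as_html` function are hypothetical, and joining the fragments by plain concatenation in document order is an assumption, since the notes do not say how the values are combined.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Piece:
    """Hypothetical stand-in for a pre-chunk element and its metadata."""
    text: str
    text_as_html: Optional[str] = None


def combine_text_as_html(pieces: List[Piece]) -> Optional[str]:
    """Combine the HTML of every piece that has one, in document order.

    Assumption: fragments are simply concatenated.
    """
    parts = [p.text_as_html for p in pieces if p.text_as_html]
    return "".join(parts) if parts else None


pieces = [
    Piece("Title", "<h1>Title</h1>"),
    Piece("plain text without HTML"),
    Piece("Body", "<p>Body</p>"),
]
combined = combine_text_as_html(pieces)
print(combined)
```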