0.16.6
Enhancements
Every <table> tag is treated as ontology.Table: Added special handling for tables in HTML partitioning to improve the accuracy of table extraction from HTML documents.
Every HTML tag has a default ontology class assigned: When parsing HTML to ontology, each HTML tag defined in the ontology now has a default ontology class. An element whose HTML tag is predicted correctly but has no class assigned therefore gets that default ontology class instead of UncategorizedText.
Weight table metrics by the actual number of tables: When evaluating table metrics, the mean aggregation now weights each document's metric scores by the actual number of tables in that document.
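To illustrate the weighted aggregation described in the last enhancement, here is a minimal sketch of a table-count-weighted mean. It is not the library's evaluation code: the function name weighted_table_metric and the sample scores are hypothetical, chosen only to show how the weighting changes the result compared with a plain mean.

```python
from typing import Optional, Sequence


def weighted_table_metric(
    scores: Sequence[float], table_counts: Sequence[int]
) -> Optional[float]:
    """Mean of per-document scores, weighted by each document's table count.

    Hypothetical helper for illustration; not part of the library's API.
    Returns None when no document contains any tables.
    """
    if len(scores) != len(table_counts):
        raise ValueError("scores and table_counts must have the same length")
    total_tables = sum(table_counts)
    if total_tables == 0:
        return None
    weighted_sum = sum(score * count for score, count in zip(scores, table_counts))
    return weighted_sum / total_tables


# Two documents: the first has 3 tables, the second has 1 (made-up values).
doc_scores = [0.5, 1.0]
doc_table_counts = [3, 1]

plain_mean = sum(doc_scores) / len(doc_scores)
weighted_mean = weighted_table_metric(doc_scores, doc_table_counts)
print(plain_mean, weighted_mean)
```

With these sample values the document holding three tables pulls the aggregate toward its own score, so the weighted result is lower than the plain mean.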
Features
None added in this release.
Fixes
ElementMetadata consolidation: text_as_html metadata is now combined across all elements in a CompositeElement when chunking HTML output.