warcio check does not raise error when GZip records are truncated #138

Open
@anjackson

Description

One of the most likely problems we see is a failed transfer leaving a truncated WARC.GZ file. We can spot this with gunzip -t, but it would be good if warcio check also reported it as a validation error. My tests so far indicate that warcio, cdxj-indexer and related tools all skip over these errors silently.
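For reference, here is a minimal sketch of the kind of check being asked for, using only the Python standard library. It is not how warcio is implemented and does not use any warcio API. The function name `gzip_is_intact` and the `chunk_size` parameter are hypothetical. It reads every gzip member to the end, roughly what gunzip -t does, and treats any decompression failure as a failed check.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1024 * 1024):
    """Return True if every gzip member in `path` decompresses fully.

    Hypothetical helper for illustration only; not part of warcio.
    """
    try:
        # gzip.open reads concatenated members, as found in WARC.GZ files.
        with gzip.open(path, "rb") as stream:
            # Discard the data; we only care whether reading succeeds.
            while stream.read(chunk_size):
                pass
    except (EOFError, OSError, zlib.error):
        # EOFError: the stream ended before the end-of-stream marker.
        # OSError (includes gzip.BadGzipFile) and zlib.error: corrupt data.
        return False
    return True
```

Note that a file cut off exactly on a member boundary still passes this check, as it would with gunzip -t, because the remaining members are all complete.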
