proper handling of database duplicates #252

Open
@johntruckenbrodt

Description

The SQLite database created via drivers.Archive maintains two tables, data and duplicates. The latter contains every scene whose outname_base attribute (used as the unique ID) matches that of a scene already in data. At the moment, the first scene with a given ID is put into data, and later scenes sharing that ID are not compared against it.
One major deficiency of outname_base (different products can have the same ID, e.g. S1 SLCs and GRDs) was recently described in #251.
Furthermore, the scene in data and the scene to be inserted need to be compared to decide which of the two belongs in data. Scenes are often reprocessed/republished, and the one with the latest processing time should be kept in data. This could mean that the scene currently in data is moved to duplicates when a scene with a later processing time is inserted into the database.
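
A minimal sketch of the proposed comparison, using only the standard sqlite3 module. It is not the current drivers.Archive code: the table layout, the proc_time column, and the function names create_tables and insert_scene are hypothetical and only illustrate the idea. Processing times are assumed to be ISO 8601 strings, so that plain string comparison orders them chronologically.

```python
import sqlite3


def create_tables(con):
    # Hypothetical, simplified schema; the real Archive tables hold more columns.
    con.execute("CREATE TABLE data (outname_base TEXT PRIMARY KEY, "
                "scene TEXT, proc_time TEXT)")
    con.execute("CREATE TABLE duplicates (outname_base TEXT, "
                "scene TEXT PRIMARY KEY, proc_time TEXT)")


def insert_scene(con, outname_base, scene, proc_time):
    """Insert a scene and keep the one processed last in the data table."""
    with con:  # commit on success, roll back on error
        row = con.execute(
            "SELECT scene, proc_time FROM data WHERE outname_base = ?",
            (outname_base,)).fetchone()
        if row is None:
            # first scene with this ID
            con.execute("INSERT INTO data VALUES (?, ?, ?)",
                        (outname_base, scene, proc_time))
        elif scene == row[0]:
            # this exact scene is already registered in data
            return
        elif proc_time > row[1]:
            # new scene is more recent: demote the current one, promote the new one
            con.execute("INSERT OR IGNORE INTO duplicates VALUES (?, ?, ?)",
                        (outname_base, row[0], row[1]))
            con.execute("DELETE FROM duplicates WHERE scene = ?", (scene,))
            con.execute("UPDATE data SET scene = ?, proc_time = ? "
                        "WHERE outname_base = ?",
                        (scene, proc_time, outname_base))
        else:
            # new scene is older or equally old: register it as a duplicate
            con.execute("INSERT OR IGNORE INTO duplicates VALUES (?, ?, ?)",
                        (outname_base, scene, proc_time))
```

With this logic the order of insertion no longer decides which scene ends up in data. The ID collision between different products described in #251 is not addressed by this sketch.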
