GISAID’s EPI_SET functionality permits the aggregation of GISAID Accession Numbers into a single, permanent dataset identifier called an EPI_SET ID. This unique identifier serves both to acknowledge data contributors and to let registered users retrieve all the records encompassed in the EPI_SET ID, thereby aiding the reproducibility of an analysis.
Registered Users may generate an EPI_SET ID in several ways via the “Search” interface in the EpiCoV™ database. Clicking the “EPI-SET” button displays the EPI-SET generator, where users may type, paste, or upload (available formats: .tsv, .csv, .xml, and .json) any number of GISAID Accession Numbers. It is also possible to pre-populate the EPI-SET generator by first ticking the checkbox(es) of the desired sample(s). Available search filters may be applied beforehand to narrow the set of suitable records. Clicking “Generate” generates an EPI_SET ID.
Newly minted EPI_SET IDs are sent together with a corresponding digital object identifier (“DOI”) to the User’s email address on file, i.e. the address in the User profile. A Supplemental Table is attached to the email and contains the relevant information on data availability that is required, for example, when submitting a manuscript for publication in a peer-reviewed scientific journal.
Each EPI_SET ID and the corresponding DOI can be used in any publication to fulfill the essential requirement of acknowledging the contributors of the data on which any research or analyses are based. Either by clicking on the DOI or by pasting the EPI_SET ID into the "Data Acknowledgement Locator" on the GISAID homepage (under Resources), anyone, including those without GISAID Access Credentials, may instantly access the details of the data contributors.
EPI_SET IDs can also be used by registered Users in the Search filter of the EpiCoV™ database to retrieve all records that are part of the dataset used in an analysis.
Merely adding an EPI_SET ID to a manuscript or other form of publication does not fulfill the requirements to make best efforts to collaborate with and acknowledge the contributors of data (see Database Access Agreement at §2.e). To learn more about how to acknowledge data contributors, see the Author's Guide for Citing and Acknowledging Data Contributors.
In 2003, an article titled "It’s a scoop" asked whether the pressure on scientists to “be the first to publish ‘hot’ results” was “distorting scientific progress” and negatively impacting the quality of scientific work.
Researchers were reluctant to share data before publishing their own analyses for fear of being “scooped,” and there was little incentive for researchers and countries to share genomic data. This hindered the timely development of the vaccines and therapeutics necessary to respond to epidemics.
To overcome this reluctance, the GISAID data sharing mechanism was devised. The goal was to encourage and incentivize rapid sharing of data, particularly data related to high-impact pathogens, with a primary focus on public health. Much has been written about GISAID’s mission and the positive effect GISAID has had on the timely exchange of pathogen information, e.g. Elbe et al (2017); Shu et al (2017); Khare et al (2021).
When requesting GISAID access credentials, users agree that using data obtained from GISAID in an analysis requires the acknowledgement of all data contributors, i.e. the Authors and their Originating laboratories responsible for obtaining the specimens, and their Submitting laboratories responsible for generating and uploading the genetic sequence and associated metadata.
Making such an analysis accessible to the public, whether for sale or for free, or publishing it as a preprint or in a peer-reviewed scientific journal, requires compliance with this acknowledgement requirement.
GISAID Accession Numbers are used as unique and permanent identifiers for each data record, beginning with the letters EPI and followed by a sequence of numbers. GISAID Accession Numbers are accepted without limitation by peer-reviewed journals and are required at the time of submission of manuscripts for publication.
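When assembling a list of Accession Numbers to type, paste, or upload into the EPI-SET generator, it can help to extract and de-duplicate them from working notes first. The following is a minimal Python sketch, not a GISAID tool. It assumes identifiers consist of "EPI", an optional "_ISL" segment, an optional underscore, and then digits; that pattern is an assumption derived from the description above and should be checked against the actual records. The function name extract_accessions is hypothetical.

```python
import re

# Assumed pattern (not an official specification): "EPI", optionally "_ISL",
# optionally one underscore, then a run of digits.
ACCESSION_RE = re.compile(r"\bEPI(?:_ISL)?_?\d+\b")


def extract_accessions(text):
    """Return accession-like tokens found in text, de-duplicated, in first-seen order."""
    seen = set()
    ordered = []
    for token in ACCESSION_RE.findall(text):
        if token not in seen:
            seen.add(token)
            ordered.append(token)
    return ordered


if __name__ == "__main__":
    notes = "Used EPI_ISL_402124 and EPI_ISL_402125; EPI_ISL_402124 was the reference."
    # One identifier per line, ready to paste into a text field or save to a file.
    print("\n".join(extract_accessions(notes)))
```

Because the pattern requires digits directly after the prefix, strings that merely start with EPI (such as an EPI_SET ID) are not picked up as Accession Numbers.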