MedGen FAQ
Searching MedGen
How can I find the names of disorders that are caused by a particular gene?
By submitting a query based on gene symbol.
If you enter a gene symbol in MedGen's query box, the results will include all records with that term anywhere in the text. To limit the results to disorders thought to be affected by altered function of that gene, click the link at the top of the page that reads See GENE SYMBOL in MedGen, where GENE SYMBOL is the symbol you entered (e.g. CFTR). The number in parentheses at the end of that phrase is the number of records in MedGen reported as being caused by altered function of that gene. Alternatively, you can enter the gene symbol followed by [gene] in the search box (e.g. "CFTR[gene]").
By starting with the Gene database
Within the phenotype section of a Gene record, there is a list of disorder names with links to MedGen. Alternatively, follow the MedGen link in the Related Information section at the right of the Gene record.
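The gene-symbol query described above can also be written as a URL for scripted use. The following is a minimal sketch, assuming that NCBI's E-utilities ESearch endpoint accepts the same [gene] field syntax as the MedGen search box; the function name and the retmax default are illustrative choices, not part of MedGen.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint (an assumption: this FAQ only
# describes the web search box, not programmatic access).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def medgen_gene_query_url(gene_symbol: str, retmax: int = 20) -> str:
    """Build an ESearch URL for MedGen records tied to a gene symbol."""
    # Same syntax as typing CFTR[gene] into the MedGen search box.
    term = f"{gene_symbol.strip()}[gene]"
    params = {"db": "medgen", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Prints the URL only; fetching it is left to the caller.
    print(medgen_gene_query_url("CFTR"))
```

The sketch only constructs the URL, so it can be checked without network access. The identifiers that ESearch returns would still need to be resolved to MedGen records in a separate step.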
When I search by a MIM number, why do I sometimes get multiple records?
There are two major data flows that manage relationships between MIM numbers and records in MedGen. One is the daily update provided by GTR- and ClinVar-related data flows from OMIM. The second is the semi-annual update from UMLS to MedGen. In the former data flow, the relationship of MedGen record to MIM number is 1:1. In the latter data flow, the MIM number may be reported for more than one concept UID or CUI, so a search by that MIM number can return multiple records.
PubMed results in MedGen
How are references chosen for the Recent clinical studies section in MedGen?
The citations listed in the Recent clinical studies section are not curated; they are retrieved computationally using the Clinical Queries tool maintained by PubMed. The query used is the preferred name of the record.
How are relationships between MedGen and PubMed computed?
The links between records in MedGen and PubMed are generated by a combination of curation and computation. For those that are computed, the preferred term in MedGen is used to query PubMed, either limiting to matches in the title and abstract of the paper or, once articles have been indexed with MeSH terms, limiting to articles indexed as having a genetic component. When non-informative terms are identified, they are added to a 'stop list' to prevent future false positives.
Professional practice guidelines
What are the search criteria that identify Practice Guidelines from PubMed?
MedGen offers a detailed PubMed-based search pre-built for clinicians to find disease-specific practice guidelines. On each condition page, you will find the top results from our curated query of PubMed articles that has been specifically tailored to capture a wide range of practice guidelines for that condition. Additionally, there is a link to see the full search results in PubMed. If the PubMed search does not return any articles, we provide a search for a related broader concept to assist you in finding the most relevant information.
The query itself is given in full below, where the "Condition name" is MedGen's preferred name for the specific record in MedGen. It uses PubMed's Proximity Search feature to allow for minor deviations in word order for the condition name within the Title or Abstract of the publication. The search filters are selected to return articles indexed as practice guidelines, to return only articles available in English, and to exclude articles that are case reports, clinical studies or randomized controlled trials. The additional terms in the search query include commonly used phrases from published practice guidelines, for example: "Genetic screening" or "Evidence-based guideline." This complex query has been designed to be broad enough to cover a variety of phrases and concepts used in Practice Guideline publications. As a result, it may capture articles that do not fully conform to the expectation of a practice guideline, and it will not identify all published practice guidelines.
("[condition name]"[tiab:~0]) AND ("english and humans"[Filter]) AND ( ("practice guideline"[Filter]) OR (practice*[titl] AND (guideline[titl] OR parameter[titl]
OR resource[titl] OR bulletin[titl] OR best[titl])) OR (genetic*[titl] AND (evaluation[titl] OR counseling[titl] OR screening[titl] OR test*[titl])) OR (clinical[titl]
AND ((expert[titl] AND consensus[titl]) OR utility[titl] OR guideline*[titl])) OR (management[titl] AND (clinical[titl] OR diagnos*[titl] OR recommendation[titl]
OR pain[titl] OR surveillance[titl] OR emergency[titl] OR guideline*[titl] OR therap*)) OR (treatment[titl] AND ((evaluation[titl] AND diagnosis[titl])
OR (assessment[titl] AND prevention[titl]) OR therap*)) OR (Diagnos*[titl] AND (prenatal[titl] OR treatment[titl] OR follow-up[titl] OR statement[titl]
OR criteria[titl] OR newborn[titl] OR differential[titl] OR neonatal[titl] OR neonate[titl])) OR (guideline*[titl] AND (pharmacogenetic*[titl]
OR recommendation[titl] OR therap*[titl] OR evidence-based[titl] OR consensus[titl] OR (technical[titl] AND standard*[titl]) OR (molecular[titl] AND testing[titl])))
OR (risk[titl] AND assessment[titl]) OR (recommendation*[titl] AND (statement[titl] OR Evidence-based[titl] OR Consensus[titl]))
OR (care AND ((Patient[titl] AND standard*[titl]) OR primary[titl] OR psychosocial[titl])) OR (Health[titl] AND supervision[titl])
OR (statement[titl] AND (policy[titl] OR position[titl] OR Consensus[titl])) OR (pharmacogenetics[titl] AND (Dosing[titl] OR therap*[titl] OR genotype*[titl] OR drug*[titl]))
OR (Chemotherapy[titl] AND decision*[titl]) OR (screening[titl] AND (newborn[titl] OR neonat*[titl] OR detection[titl] OR diagnos*[titl]))
OR (criteria[titl] OR genotype*[titl]) ) NOT ("Case reports"[Publication type] OR "clinical study"[Publication Type] OR "randomized controlled trial"[Publication Type])
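The outer structure of the query above (proximity clause, language and species filter, guideline phrases, exclusions) can be sketched as follows. This is an illustrative assembly, not the code MedGen runs: the function names are hypothetical, and the guideline-phrase block is passed in as a parameter rather than reproduced. The usage line supplies only the first alternative from the full query, as an abbreviated stand-in.

```python
# Exclusion clause copied from the end of the query shown above.
EXCLUSIONS = (
    '("Case reports"[Publication type] OR "clinical study"[Publication Type] '
    'OR "randomized controlled trial"[Publication Type])'
)


def proximity_clause(condition_name: str, distance: int = 0) -> str:
    """Quote the condition name as a Title/Abstract proximity search."""
    # Drop embedded double quotes and collapse whitespace so the phrase
    # stays a single quoted term.
    cleaned = " ".join(condition_name.replace('"', " ").split())
    return f'"{cleaned}"[tiab:~{distance}]'


def wrap_guideline_query(condition_name: str, guideline_block: str) -> str:
    """Combine the proximity clause, filter, phrase block and exclusions."""
    return (
        f"({proximity_clause(condition_name)}) "
        'AND ("english and humans"[Filter]) '
        f"AND ({guideline_block}) "
        f"NOT {EXCLUSIONS}"
    )


if __name__ == "__main__":
    # Abbreviated phrase block: only the first alternative of the full query.
    print(wrap_guideline_query("Cystic fibrosis", '("practice guideline"[Filter])'))
```

With a distance of 0, the proximity clause requires the words of the condition name to appear next to each other but in any order, which is how the query tolerates minor deviations in word order.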
What are the search criteria that identify practice guidelines on Bookshelf?
The query for titles and chapters on NCBI Bookshelf that are reported as Practice Guidelines uses the Publication Type and Resource Type fields. If there is a match on the preferred condition name from MedGen in any Bookshelf publication that is a "clinical guideline" or "practice guideline", it will be returned by this search. The publication type is assigned by NLM Cataloguers, and the resource type is selected by the provider of the resource. The search may return resources that are broader in scope, and it may not capture all publications on the Bookshelf that could be considered practice guidelines.
(("clinical guidelines"[Resource Type]) OR "practice guideline"[Publication Type]) AND "[condition name]"
Are the queries for PubMed and Bookshelf Practice Guidelines comprehensive of all possible publications?
No. These queries rely on criteria such as how articles are indexed and the disease names used. Although we cannot guarantee that these searches will capture all potentially relevant practice guidelines, we rigorously tested various search criteria to ensure that the queries return highly relevant publications. If there are specific practice guidelines that are not returned by these results, you can help us improve MedGen by reporting them to us at [email protected].
What are the Curated Practice Guidelines?
The team of Medical Genetics Curators at NCBI has identified professional and medical societies that issue practice guidelines which are not included in PubMed or Bookshelf. We add these manually to display under the section "Curated Practice Guidelines". To ensure current guidelines are available to the MedGen community, we review these organizations' websites regularly to identify new, updated, or retired guidelines and update them in this section of MedGen. If there are specific practice guidelines that are not listed in this section, you can help us improve MedGen by reporting them to us at [email protected].
Data and scope of concepts in MedGen
Why aren't all terms from SNOMED CT in MedGen?
MedGen includes terms and their identifiers from SNOMED CT based only on the semi-annual releases from UMLS. Thus MedGen may be up to 6 months out of date. MedGen also limits its scope to concepts of interest to Medical Genetics. Thus there are some SNOMED CT terms that are not included, no matter how long they have been established, because they are out of scope, e.g. immunologic factors.
How can I extract a report of MedGen identifiers and their relationships to MIM numbers and HPO identifiers?
There are multiple ways to access these data.
MIM. If the starting point is the MIM number, this file on Gene's ftp site reports the MedGen concept identifiers that match the phenotype records. Not all MIM numbers have a corresponding record in MedGen; genes are out of scope. ftp://ftp.ncbi.nih.gov/gene/DATA/mim2gene_medgen
HPO. If the focus is data from HPO, there are two files on MedGen's ftp site:
- ftp://ftp.ncbi.nih.gov/pub/medgen/MedGen_HPO_Mapping.txt.gz
- ftp://ftp.ncbi.nih.gov/pub/medgen/MedGen_HPO_OMIM_Mapping.txt.gz
The README files at both sites provide all the details.
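As an illustration of working with the MIM file, the sketch below groups MedGen concept identifiers by MIM number. The column layout (MIM number in the first column, MedGen CUI in the fifth, "-" for missing values, a header line starting with "#") is an assumption that should be checked against the README, and the sample rows are made up for the example rather than copied from the file.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the ASSUMED layout of mim2gene_medgen; verify the real
# column order against the README before relying on these indices.
SAMPLE = (
    "#MIM number\tGeneID\ttype\tSource\tMedGenCUI\tComment\n"
    "100001\t-\tphenotype\t-\tC0000001\t-\n"
    "100002\t11\tgene\t-\t-\t-\n"
    "100003\t-\tphenotype\t-\tC0000002\t-\n"
    "100003\t-\tphenotype\t-\tC0000003\t-\n"
)

MIM_COL = 0  # assumed position of the MIM number
CUI_COL = 4  # assumed position of the MedGen concept identifier


def mim_to_medgen(lines):
    """Return {MIM number: set of MedGen CUIs}, skipping rows without a CUI."""
    mapping = defaultdict(set)
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines, header/comment lines and short rows.
        if not row or row[0].startswith("#") or len(row) <= CUI_COL:
            continue
        mim, cui = row[MIM_COL], row[CUI_COL]
        if cui and cui != "-":
            mapping[mim].add(cui)
    return dict(mapping)


if __name__ == "__main__":
    for mim, cuis in sorted(mim_to_medgen(io.StringIO(SAMPLE)).items()):
        print(mim, sorted(cuis))
```

Collecting the identifiers in a set per MIM number accommodates cases where one MIM number is reported for more than one concept, and rows without a MedGen identifier (for example, gene entries) are simply left out.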
How can I extract a report of MedGen identifiers and their relationships to other concepts (such as hierarchies)?
MedGen's ftp site provides a compressed file named MGREL.RRF.gz, or in the csv folder, a series of files (split to make them manageable) named MGREL_(number).csv. The fourth column, REL, includes the values PAR for parent, CHD for child, and SIB for sibling. These can be used in conjunction with the CUI1 and CUI2 values to construct hierarchies. The usage in MedGen for the REL column is consistent with that of UMLS; see https://www.nlm.nih.gov/research/umls/knowledge_sources/metathesaurus/release/abbreviations.html
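A hierarchy can be assembled from these rows roughly as follows. This is a sketch under stated assumptions: the file is pipe-delimited with CUI1 in the first column and CUI2 in the fifth (only the position of REL is given above), and REL is read in the UMLS direction, that is, as the relationship CUI2 has to CUI1. Both points should be confirmed against the README and the UMLS documentation. The sample rows are invented and truncated to the columns used.

```python
from collections import defaultdict

# Invented rows; only the first five fields matter for this sketch.
SAMPLE_ROWS = [
    "#CUI1|AUI1|STYPE1|REL|CUI2|",
    "C0000001|A0000001|AUI|CHD|C0000002|",  # C0000002 is a child of C0000001
    "C0000003|A0000003|AUI|PAR|C0000001|",  # C0000001 is a parent of C0000003
    "C0000002|A0000002|AUI|SIB|C0000003|",  # siblings add no parent/child edge
]

CUI1_COL, REL_COL, CUI2_COL = 0, 3, 4  # CUI1/CUI2 positions are assumptions


def build_children(lines):
    """Return {parent CUI: set of child CUIs} from MGREL-style rows."""
    children = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("|")
        if len(fields) <= CUI2_COL:
            continue
        cui1, rel, cui2 = fields[CUI1_COL], fields[REL_COL], fields[CUI2_COL]
        if rel == "CHD":    # assumed reading: CUI2 is a child of CUI1
            children[cui1].add(cui2)
        elif rel == "PAR":  # assumed reading: CUI2 is a parent of CUI1
            children[cui2].add(cui1)
    return dict(children)


if __name__ == "__main__":
    for parent, kids in sorted(build_children(SAMPLE_ROWS).items()):
        print(parent, "->", sorted(kids))
```

Folding PAR and CHD rows into a single parent-to-children map means each relationship is stored once, whichever direction the file reports it in. If the README states the opposite direction for REL, the two branches should be swapped.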