Posts

Showing posts with the label FAQ

Pathogen data in ChEMBL

Infectious disease is a leading cause of death globally, and bioactivity data against pathogens (fungi, bacteria, viruses and parasites) is an important category in ChEMBL, especially in light of the ongoing pandemic. In ChEMBL version 29, there are over 2 M bioactivity data points against fungal, bacterial or viral targets (for 460 K compounds) available for pathogen-related research.

How can I find pathogen data?

On the ChEMBL interface, the organism taxonomy is available as a filter that can be applied to bioactivity data. A sunburst visualisation of the organism taxonomy is also provided as an easy starting point to explore targets according to their taxonomy. In the full database, the organism_classification table holds the underlying data and can be used in bespoke SQL queries. For example, queries may be performed to extract high level pathogen data such as all bioactivity data for small molecules screened against bacterial targets (example below) or more specific subsets focus...
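As a rough illustration of the kind of bespoke query described here, the sketch below joins organism_classification to bioactivity data through the target and assay tables. It runs against a tiny in-memory SQLite stand-in rather than a real ChEMBL download. The reduced column set, the 'Bacteria' and 'Small molecule' filter values and every row of data are assumptions made for illustration and should be checked against the schema of the release you are using.

```python
import sqlite3

# Toy stand-in for a local ChEMBL download. Only the columns needed here are
# created, and every row is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organism_classification (tax_id INTEGER, l1 TEXT, l2 TEXT);
CREATE TABLE target_dictionary (tid INTEGER, tax_id INTEGER, pref_name TEXT);
CREATE TABLE assays (assay_id INTEGER, tid INTEGER);
CREATE TABLE molecule_dictionary (molregno INTEGER, molecule_type TEXT);
CREATE TABLE activities (activity_id INTEGER, assay_id INTEGER, molregno INTEGER,
                         standard_type TEXT, standard_value REAL,
                         standard_units TEXT);

INSERT INTO organism_classification VALUES
    (1, 'Bacteria', 'Gram-Negative'), (2, 'Eukaryotes', 'Mammalia');
INSERT INTO target_dictionary VALUES
    (10, 1, 'Hypothetical bacterial target'),
    (20, 2, 'Hypothetical human target');
INSERT INTO assays VALUES (100, 10), (200, 20);
INSERT INTO molecule_dictionary VALUES
    (1000, 'Small molecule'), (2000, 'Antibody');
INSERT INTO activities VALUES
    (1, 100, 1000, 'MIC', 4.0, 'ug.mL-1'),
    (2, 100, 2000, 'MIC', 8.0, 'ug.mL-1'),
    (3, 200, 1000, 'IC50', 50.0, 'nM');
""")

# Small-molecule activities whose target organism is classified as bacterial.
BACTERIAL_ACTIVITIES = """
SELECT act.activity_id, td.pref_name, act.standard_type,
       act.standard_value, act.standard_units
FROM activities act
JOIN assays a                   ON a.assay_id = act.assay_id
JOIN target_dictionary td       ON td.tid = a.tid
JOIN organism_classification oc ON oc.tax_id = td.tax_id
JOIN molecule_dictionary md     ON md.molregno = act.molregno
WHERE oc.l1 = 'Bacteria'
  AND md.molecule_type = 'Small molecule'
ORDER BY act.activity_id
"""

bacterial_rows = conn.execute(BACTERIAL_ACTIVITIES).fetchall()
for row in bacterial_rows:
    print(row)
```

Narrowing the WHERE clause (for example, filtering on a lower taxonomy level such as l2) would give a more specific subset in the same way.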

Sequence similarity searches in ChEMBL

The ChEMBL database contains bioactivity data that links compounds to their biological targets. Most ChEMBL targets are proteins (~70% in version 27) and these are mapped to their UniProt accessions. On the ChEMBL interface, searches can be performed with either protein names or accessions... but did you know that protein similarity searches are also possible? Here’s an example using human Phospholipase DDHD2, a target not found in ChEMBL.

1. On the ChEMBL interface, click 'Enter a Sequence'.
2. Input the FASTA sequence corresponding to human Phospholipase DDHD2 and click 'Search in ChEMBL'.
3. Review the BLAST results, select targets of interest and browse bioactivity data.

The BLAST search identifies the mouse Phospholipase DDHD2 homologue alongside a small number of bioactivity data points and active compounds. ChE...

Data checks

ChEMBL contains a broad range of binding, functional and ADMET type assays in formats ranging from in vitro single protein assays to antiproliferative cell-based assays. Some variation is expected, even for very similar assays, since these are often performed by different groups and institutes. ChEMBL includes references for all bioactivity values so that full assay details can be reviewed if needed. However, there are a number of other data checks that can be used to identify potentially problematic results.

1) Data validity comments: The data validity column was first included in ChEMBL v15 and flags activities with potential validity issues, such as a non-standard unit for the activity type or values outside of the expected range. Users can review flagged activities and decide how these should be handled. The data validity column can be viewed on the interface (click 'Show/Hide columns' and select 'data validity comments') and can be found in the activities ...
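For users working with a database download, a data validity check could look something like the sketch below, which separates flagged activities from unflagged ones. It uses an in-memory SQLite table rather than a real download. The column name data_validity_comment, the example comment strings and all rows are assumptions for illustration, so confirm them against the schema documentation for your release.

```python
import sqlite3

# Toy activities table; rows and flag texts are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (activity_id INTEGER, standard_type TEXT,
                         standard_value REAL, standard_units TEXT,
                         data_validity_comment TEXT);
INSERT INTO activities VALUES
    (1, 'IC50', 25.0, 'nM', NULL),
    (2, 'IC50', 1e12, 'nM', 'Outside typical range'),
    (3, 'IC50', 3.0,  'ug', 'Non standard unit for type');
""")

# Activities carrying a validity flag, for manual review.
flagged = conn.execute("""
    SELECT activity_id, data_validity_comment
    FROM activities
    WHERE data_validity_comment IS NOT NULL
    ORDER BY activity_id
""").fetchall()

# Activities with no flag, which one might keep by default.
clean_ids = [row[0] for row in conn.execute("""
    SELECT activity_id
    FROM activities
    WHERE data_validity_comment IS NULL
    ORDER BY activity_id
""")]

print("flagged:", flagged)
print("unflagged ids:", clean_ids)
```

Whether flagged rows are dropped, corrected or kept is a judgment call that depends on the analysis; the query only makes them easy to find.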

Molecule hierarchy

During drug development, active pharmaceutical ingredients are often formulated as salts to provide the final pharmaceutical product. ChEMBL includes parent molecules and their salts (approved and investigational) as well as other alternative forms such as hydrates and radioisotopes. These alternative forms are linked to their parent compound through the molecule hierarchy.

Using the molecule hierarchy

The molecule hierarchy can be used to retrieve and display connected compounds and to aggregate activity data that has been mapped to any member of a compound family. On the interface, related compounds are automatically displayed in the ‘Alternative forms’ section of the ChEMBL compound report card. Bioactivity data can easily be aggregated in the activity summary by using the 'Include/Exclude Alternative Forms' filter.

Finding the molecule hierarchy

On the interface, we include alternative forms as shown above. The downloaded database contains the molecule_hiera...
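To make the aggregation idea concrete, here is a minimal sketch that rolls activity counts up from salts and other alternative forms to their parent compound. It assumes a hierarchy table with molregno and parent_molregno columns in which a parent maps to itself. The table is created in memory with invented rows, so treat the column names as assumptions to verify against your download.

```python
import sqlite3

# Toy hierarchy: compound 1 is a parent, 2 and 3 are its salt and hydrate,
# compound 4 is an unrelated parent. All rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE molecule_hierarchy (molregno INTEGER, parent_molregno INTEGER);
CREATE TABLE activities (activity_id INTEGER, molregno INTEGER);

INSERT INTO molecule_hierarchy VALUES (1, 1), (2, 1), (3, 1), (4, 4);
INSERT INTO activities VALUES (1, 1), (2, 2), (3, 2), (4, 3), (5, 4);
""")

# Count activities per compound family by grouping on the parent compound.
family_counts = conn.execute("""
    SELECT mh.parent_molregno, COUNT(*) AS n_activities
    FROM activities act
    JOIN molecule_hierarchy mh ON mh.molregno = act.molregno
    GROUP BY mh.parent_molregno
    ORDER BY mh.parent_molregno
""").fetchall()

for parent, n in family_counts:
    print(f"parent {parent}: {n} activities across all forms")
```

Grouping on act.molregno instead would keep each form separate, which roughly corresponds to excluding alternative forms.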

Using ChEMBL activity comments

We’re sometimes asked what the ‘activity_comments’ in the ChEMBL database mean. In this blog post, we’ll use aspirin as an example to explain some of the more common activity comments. First, let’s review the bioactivity data included in ChEMBL. We extract bioactivity data directly from seven core medicinal chemistry journals. Some common activity types, such as IC50s, are standardised to allow broad comparisons across assays; the standardised data can be found in the standard_value, standard_relation and standard_units fields. Original data is retained in the database downloads in the value, relation and units fields. However, we extract all data from a publication, including non-numerical bioactivity and ADME data. In these cases, the activity comments may be populated during the ChEMBL extraction-curation process in order to capture the author's overall conclusions. Similarly, for depos...
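The relationship between the standardised fields, the original fields and the activity comments can be sketched with a small query. The example below uses an in-memory SQLite table with invented rows. The comment column name (activity_comment) and the example comment strings are assumptions for illustration, not values taken from this post.

```python
import sqlite3

# Toy activities table: one numeric result and two comment-only results.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activities (
    activity_id INTEGER, standard_type TEXT,
    standard_relation TEXT, standard_value REAL, standard_units TEXT,
    relation TEXT, value REAL, units TEXT,
    activity_comment TEXT);
INSERT INTO activities VALUES
    (1, 'IC50',     '=',  5000.0, 'nM', '=',  5.0,  'uM', NULL),
    (2, 'Activity', NULL, NULL,   NULL, NULL, NULL, NULL, 'Active'),
    (3, 'Activity', NULL, NULL,   NULL, NULL, NULL, NULL, 'Not Active');
""")

# Numeric results: the standardised fields sit alongside the original ones.
numeric = conn.execute("""
    SELECT activity_id, standard_value, standard_units, value, units
    FROM activities
    WHERE standard_value IS NOT NULL
    ORDER BY activity_id
""").fetchall()

# Results with no numeric value, where the comment carries the conclusion.
comment_only = conn.execute("""
    SELECT activity_id, activity_comment
    FROM activities
    WHERE standard_value IS NULL AND activity_comment IS NOT NULL
    ORDER BY activity_id
""").fetchall()

print("numeric:", numeric)
print("comment only:", comment_only)
```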

FAQ: I've done some virtual screening using ChEMBL, can you send me the compounds?

We do not have physical samples of any of the compounds in ChEMBL, so we cannot supply any samples to you. Sorry. If you want to obtain samples of ChEMBL compounds, you do have a number of options:

1) Often the underlying literature contains a synthetic route and reagents for the compounds, which greatly helps resynthesis.
2) About 5-10% of ChEMBL compounds are reported to be available from compound vendors (for example, you could search databases such as the excellent ZINC to find available compounds). However, the turnover of stock from compound vendors is quite high, and often a significant fraction of compounds reported to be available for purchase will be out of stock when you want them.

FAQ: Is there a license agreement I need to sign for ChEMBL?

There is no need to sign a license agreement for any of the ChEMBL data or applications, nor is any payment required. The data and software are covered by a Creative Commons licence: the Creative Commons Attribution-Share Alike 3.0 Unported License. If you have any detailed questions about licensing, please contact us.

FAQ: Where can I download StARlite?

Bad news and good news: you cannot download StARlite. StARlite was a registered trademark for a database developed and marketed by Inpharmatica Ltd. Some of the databases and intellectual property of Inpharmatica Ltd. were licensed to the EMBL-EBI. What used to be known as StARlite is now part of ChEMBL. ChEMBL downloads are accessible from here.