Posts

Showing posts with the label myChEMBL

LSH-based similarity search in MongoDB is faster than postgres cartridge.

TL;DR: In his excellent blog post, Matt Swain described the implementation of compound similarity searches in MongoDB. Unfortunately, Matt's approach had suboptimal (polynomial) time complexity with respect to decreasing similarity thresholds, which renders it unsuitable for production environments. In this article, we improve on the method by enhancing it with the Locality Sensitive Hashing (LSH) algorithm, which significantly reduces query time and outperforms the RDKit PostgreSQL cartridge.

myChEMBL 21 - NoSQL edition

Given that NoSQL technologies applied to computational chemistry and cheminformatics are gaining traction and popularity, we decided to include a taster in future myChEMBL releases. Two especially appealing technologies are Neo4j and MongoDB. The former is a graph database and the latter is a BSON document store. We would like to provide IPython notebook-based tutorials explaining how to use this software to deal with common cheminformat...
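To make the LSH idea from the TL;DR more concrete, here is a minimal, self-contained sketch of one common LSH scheme for Tanimoto (Jaccard) similarity: MinHash signatures split into bands. It is not the implementation from the post and does not touch MongoDB. The class `LSHIndex`, the helper names, the band and row counts, and the toy fingerprints (sets of on-bit positions) are all assumptions made for illustration.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # 2^31 - 1, modulus for the illustrative hash family


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two sets of on-bit positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def make_hashes(n, seed=42):
    """Build n random (a, b) pairs for hashes of the form (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME)) for _ in range(n)]


def minhash(bits, hashes):
    """MinHash signature: the minimum hash value of the set under each hash."""
    return tuple(
        min(((a * x + b) % PRIME for x in bits), default=PRIME)
        for a, b in hashes
    )


class LSHIndex:
    """Hypothetical in-memory index: banded MinHash buckets plus exact re-check."""

    def __init__(self, bands=20, rows=5, seed=42):
        self.bands, self.rows = bands, rows
        self.hashes = make_hashes(bands * rows, seed)
        self.buckets = defaultdict(set)   # (band number, band slice) -> keys
        self.fingerprints = {}            # key -> frozenset of on bits

    def _band_keys(self, bits):
        sig = minhash(bits, self.hashes)
        for band in range(self.bands):
            start = band * self.rows
            yield (band, sig[start:start + self.rows])

    def add(self, key, bits):
        bits = frozenset(bits)
        self.fingerprints[key] = bits
        for band_key in self._band_keys(bits):
            self.buckets[band_key].add(key)

    def query(self, bits, threshold):
        bits = frozenset(bits)
        # Candidates are molecules sharing at least one band bucket with the query.
        candidates = set()
        for band_key in self._band_keys(bits):
            candidates |= self.buckets.get(band_key, set())
        # Exact Tanimoto is computed only for the candidates, not the whole set.
        hits = [(k, tanimoto(bits, self.fingerprints[k])) for k in candidates]
        return sorted((h for h in hits if h[1] >= threshold), key=lambda h: -h[1])


index = LSHIndex()
index.add("mol_a", {1, 5, 9, 12})
index.add("mol_b", {100, 200, 300})
hits = index.query({1, 5, 9, 12}, threshold=0.7)
```

The point of the banding step is that only molecules colliding with the query in at least one band are scored exactly, so the expensive comparison runs on a small candidate set. The trade-off is that LSH is approximate: a molecule above the threshold can occasionally be missed, and the band and row counts tune that risk.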

myChEMBL 20 has landed

We are very pleased to announce that the latest myChEMBL release, based on the ChEMBL 20 database, is now available to download. In addition to the ChEMBL upgrade, you will also find a number of changes and new features:

- Updates to system and Python libraries, including the IPython notebook server
- An upgrade of the web services (data and utils) to match the new functionality provided by the main ChEMBL ones
- The current stable version of RDKit (2015.03)
- Two brand new notebooks, namely an RDKit tutorial and a tutorial on SureChEMBL data mining, increasing the total number of notebooks to 14
- Updates to several other IPython notebooks and the KNIME workflow, in order to take advantage of the new data, models and web services functionality
- Several bug fixes
- A CentOS 7 VM version, in addition to the existing Ubuntu 14.04 one
- New virtualisation technologies, as explained in the section below

Lots of flavours

This new myChEMBL release is tec...

myChEMBL 19 Released

We are very pleased to announce that the latest myChEMBL release, based on the ChEMBL 19 database, is now available to download. In addition to the extra data, you will also find a number of great new features.

So what's new then?

More core chemoinformatics tools

We have included OSRA (Optical Structure Recognition), which is useful for extracting compound structures from images. OSRA can be accessed from the command line or via a very convenient web interface provided by Beaker (described below). We've also added OpenBabel, another great open source cheminformatics toolkit. This means you can now experiment with both RDKit and OpenBabel and use whichever you prefer.

ChEMBL Beaker

myChEMBL now ships with a local instance of the ChEMBL Beaker service. For those not familiar with Beaker, the service provides users with an array of chemoinformatics utilities via a RESTful API. Under the h...

myChEMBL on Bare Metal

myChEMBL is distributed as a Virtual Machine (VM), which is good because you can treat it like another file on your filesystem. It can be transmitted, copied, renamed, deleted, etc. The myChEMBL VM behaves like a sandbox, so software installed there can't harm your computer. But there are sometimes costs associated with using a VM; for example, VMs are usually several percent slower than the host they are running on. There are also a number of scenarios where using a VM may not be optimal or even possible, for example:

- You just want to enrich your existing machine with chemistry-related software
- The only machine you have is itself virtual (VM provisioning software often prevents you from installing a VM within a VM)
- Performance is critical

In these cases you may not want the whole myChEMBL VM, only the software that it ships with. Fortunately, we have a script that automates the process of creating our customized VM. But not only that - we keep it publicly avail...

How to install myChEMBL using two Vagrant commands

TL;DR

- install Vagrant and VirtualBox
- run vagrant init chembl/myChEMBL && vagrant up
- wait a bit...
- go to http://127.0.0.1:8000/
- enjoy!

What have I just done?

Vagrant is a tool for building and deploying complete development environments, and myChEMBL 18 now supports installation via Vagrant. We achieved this by first creating a myChEMBL Vagrant Box, which we then registered on the Vagrant Cloud. This allows users to install myChEMBL on their local system using two simple commands*:

    vagrant init chembl/myChEMBL
    vagrant up

*assumes you have Vagrant installed

It's that simple! After you type this into your system console, the expected output should be similar to the one below:

What are those two commands doing?

The first command initializes the current directory as a Vagrant environment by creating an initial Vagrantfile and prepopulating the config.vm.box setting in it. This happens immediately. T...

myChEMBL LaunchPad......Launched!

We are pleased to announce that the latest myChEMBL release (based on ChEMBL_18) is available to download. For users not familiar with myChEMBL, the aim of the project is to create an open platform which combines public domain bioactivity data with open source web, database and cheminformatics technologies. More details about the project can be found in this paper, and more details about a recent award it helped pick up can be found here. Like the previous release, once you have installed the myChEMBL virtual machine, you will have access to an Ubuntu Linux machine which comes preloaded with the ChEMBL data in an RDKit-enabled PostgreSQL database and the original myChEMBL web application. We have added a lot of new features and enhancements to the new myChEMBL release, which include: A local copy of the ChEMBL Web Services, which uses the local PostgreSQL database as a backend. A suite of interactive tutorials, created using IPython Notebooks. Topics covered include int...