ChemAxon are a software company that produce a variety of cheminformatics applications and software development modules. Key drivers for development have been portability across operating systems and web-based integration, so they have made extensive use of Java. Many of the tools are free to academics or for evaluation, and they have a reputation for being very active partners in collaborations.
Whilst ChemAxon have developed a variety of software components, I thought I'd start with a review of the chemical editor/viewer Marvin. MarvinView is a Java-based chemical viewer for single and multiple chemical structures, queries and reactions, whilst MarvinSketch is a chemical editor for drawing chemical structures, queries and reactions. Both are available as Java applets for embedding into webpages and as Java beans to provide desktop applications.
MarvinSketch is provided as a double-clickable application and, despite being a Java application, is not particularly un-Maclike: the scroll bars, icons etc. all seem fine. Whilst in older versions the keyboard shortcuts all used the control key rather than the 'apple' key, the latest version now uses the 'command' or 'apple' key, and it is fair to say ChemAxon have made great efforts to give a more Maclike look and feel to the application. It is a fairly intuitive application to use since everything is pretty much point and click. The top palette provides a selection of predrawn templates which the user can click to activate; clicking on the drawing area then draws the selected template. Templates can be linked, with the pink guides identifying the atoms about to be attached. Atom types can be changed by choosing the required atom from the top palette or by keyboard input. Keyboard input also provides rapid access to templates, e.g. 'ph' for phenyl. If a template is drawn as an abbreviation, e.g. 'Ph' for a benzene ring, it can be expanded to display the full structure by choosing 'Expand' from the 'Group' menu.
A structure can be moved by selecting all atoms (command-A) and then moving the cursor to the center of the structure: a blue square appears and the structure can be dragged sideways. Move the cursor to the edge of the structure and, when the blue rotate icon appears, the structure can be rotated in 2D. Alternatively, after selecting the structure, click shift once to move sideways, twice to rotate in 2D or three times to rotate in 3D.
The 'More' button provides access to a periodic table and a variety of tools for annotating structures but, perhaps more importantly, a wide variety of options for building structural queries for database searching, including the ability to add user-defined SMARTS queries. Right-clicking on a structure offers a dropdown menu with a number of options, perhaps the most useful of which is 'Copy as SMILES' (SMILES is a simple yet comprehensive chemical language in which molecules and reactions can be specified using ASCII characters representing atom and bond symbols). Marvin supports a variety of different file formats (MOL, MOL2, SDF, RXN, RDF (V2000/V3000), SMILES, SMARTS/SMIRKS (recursive), MRV, InChi, CML, PDB).
MarvinSketch also provides access to a variety of property calculations by means of a series of dynamically loaded plugins available from the tools menu. These include Elemental Analysis, IUPAC Naming Plugin, pKa Plugin, Major Microspecies Plugin, Isoelectric Point Plugin, Partitioning, logP Plugin, logD Plugin, Charge Plugin, Polarizability Plugin, Orbital Electronegativity Plugin, Tautomerization Plugin, Resonance Plugin, Stereoisomer Plugin, Conformer Plugin, Molecular Dynamics Plugin, Topology Analysis Plugin, Geometry Plugin, Polar Surface Area Plugin (2D), Molecular Surface Area Plugin (3D), Hydrogen Bond Donor-Acceptor Plugin, Huckel Analysis Plugin and Refractivity Plugin. Many of the plugins provide detailed information; for example, the pKa plugin displays the individual pKas for the ionisable groups and also shows the microspecies distribution curves by pH. The free version allows calculation on only a single molecule at a time, whereas the commercial versions can be used to calculate any number.
A particularly useful property calculation for drug discovery is the logD Plugin. Below you can see the logD(pH) profiles for a series of compounds; one might use this information to predict the likely sites of absorption from the GI tract, as the pH varies along the length of the tract. In fact logD may well be the most valuable predicted property for drug discovery. The accuracy of the many available tools for chemical property calculations is often debated. For the majority of 'regular' chemical structures I've found Marvin to be pretty good; however, I would strongly suggest that you never undertake a structure-activity analysis where you combine values calculated using different packages. Choose one and stick to it: you are usually interested in the relative values, not the absolute number.
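To make the pH dependence concrete, here is a minimal Python sketch of the textbook relationship between logP, pKa and logD for a compound with a single ionisable group, assuming that only the neutral species partitions into the organic phase. It is not a description of how Marvin's plugin works (which handles multiple ionisable groups and microspecies); the function name `log_d` and the example values are purely illustrative.

```python
import math

def log_d(log_p, pka, ph, acid=True):
    """Approximate logD for a compound with one ionisable group.

    Assumption: only the neutral form partitions. This is a textbook
    simplification, not Marvin's algorithm.
    """
    # The ionised fraction grows with (pH - pKa) for an acid
    # and with (pKa - pH) for a base.
    exponent = (ph - pka) if acid else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)

# Hypothetical acid (logP 3.0, pKa 4.5) at a few GI-tract-like pH values.
for ph in (1.5, 4.5, 6.5, 7.4):
    print(f"pH {ph:>4}: logD = {log_d(3.0, 4.5, ph):.2f}")
```

For an acid the value stays close to logP well below the pKa and falls by roughly one log unit per pH unit above it, which is the shape one looks for in the profiles.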
The topology analysis plugin provides a wide range of topological descriptors including atom, bond and ring counts, together with a number of distance-based indices. These, together with PSA, HBD and HBA, provide all of the most commonly used properties for cheminformatics analysis.
A few limitations: MarvinSketch is not a direct replacement for ChemDraw. Whilst MarvinSketch is a great cheminformatics tool and can be used to build web portals, it is not a desktop publishing application. Structures can be saved in a variety of image formats and embedded in documents, but they then of course lose any 'chemical' content and cannot be subsequently edited. For writing publication-quality manuscripts ChemDraw still sets the standard. If you want to use the clipboard to transfer information between Marvin and other applications, the best option is probably to stick to using SMILES.
MarvinView is a viewer for single and multiple molecules, and will open and display a variety of file types (MOL, MOL2, SDF, RXN, RDF (V2000 / V3000), SMILES, SMARTS/SMIRKS (recursive), MRV, InChi, CML, PDB). When opening a multimolecule file the structures are displayed in a grid (you can choose the number of structures to display), double click on a structure to see a larger version.
Marvin has just (11 Jan 2008) been updated to version 5.0.0. This brings several improvements to the GUI, shown below; I've also shown the ability to add your own templates to the templates toolbar at the bottom of the window. The GUI is customisable, so you can design your own layout or use one of the four prebuilt layout designs. Other new features include IUPAC name generation, improvements to the calculation plugins, a new enumerate Markush structures plugin, and several surface area calculations.
ChemAxon continue to update their applications and I just thought I'd mention this new feature. If you select a structure you get a menu option to search either PubChem or ChemSpider.
It seems to use the InChI to do the search, and the results appear in your web browser.
Accuracy
AutoDock Vina significantly improves the average accuracy of the binding mode predictions compared to AutoDock 4, judging by our tests on the training set used in AutoDock 4 development.[*] Additionally and independently, AutoDock Vina has been tested against a virtual screening benchmark called the Directory of Useful Decoys by the Watowich group, and was found to be 'a strong competitor against the other programs, and at the top of the pack in many cases'. It should be noted that all six of the other docking programs to which it was compared are distributed commercially.
Figure: Binding mode prediction accuracy on the test set. 'AutoDock' refers to AutoDock 4, and 'Vina' to AutoDock Vina 1.
AutoDock Tools Compatibility
For its input and output, Vina uses the same PDBQT molecular structure file format used by AutoDock. PDBQT files can be generated (interactively or in batch mode) and viewed using MGLTools. Other files, such as the AutoDock and AutoGrid parameter files (GPF, DPF) and grid map files, are not needed.
Ease of Use
Vina's design philosophy is not to require the user to understand its implementation details, tweak obscure search parameters, cluster results or know advanced algebra (quaternions). All that is required is the structures of the molecules being docked and the specification of the search space including the binding site. Calculating grid maps and assigning atom charges is not needed. The usage summary can be printed with 'vina --help'. The summary automatically remains in sync with the possible usage scenarios.
Implementation Quality
Flexible Side Chains
Like in AutoDock 4, some receptor side chains can be chosen to be treated as flexible during docking.
Speed
AutoDock Vina tends to be faster than AutoDock 4 by orders of magnitude.[*]
Figure: Average time per receptor-ligand pair on the test set. 'AutoDock' refers to AutoDock 4, and 'Vina' to AutoDock Vina 1.
Multiple CPUs/Cores
Additionally, Vina can take advantage of multiple CPUs or CPU cores on your system to significantly shorten its running time.
World Community Grid
Qualified projects can run AutoDock Vina calculations for free on the massively parallel World Community Grid. Existing projects using AutoDock Vina there include those targeting AIDS, Malaria, Leishmaniasis and Schistosomiasis. Some of these projects average over 50 years' worth of computation per day.
How to get started learning to use Vina?
Watching the video tutorial might be the best way to do that.
What is the meaning or significance of the name 'Vina'? Why was it developed?
Please see this mailing list post.
How accurate is AutoDock Vina?
See Features
It should be noted that the predictive accuracy varies a lot depending on the target, so it makes sense to evaluate AutoDock Vina against your particular target first, if you have known actives or a bound native ligand structure, before ordering compounds. While evaluating any docking engine in a retrospective virtual screen, it might make sense to select decoys of similar size, and perhaps other physical characteristics, to your known actives.
What is the difference between AutoDock Vina and AutoDock 4?
AutoDock 4 (and previous versions) and AutoDock Vina were both developed in the Molecular Graphics Lab at The Scripps Research Institute. AutoDock Vina inherits some of the ideas and approaches of AutoDock 4, such as treating docking as a stochastic global optimization of the scoring function, precalculating grid maps (Vina does that internally), and some other implementation tricks, such as precalculating the interaction between every atom type pair at every distance. It also uses the same type of structure format (PDBQT) for maximum compatibility with auxiliary software.
However, the source code, the scoring function and the actual algorithms used are brand new, so it's more correct to think of AutoDock Vina as a new 'generation' rather than 'version' of AutoDock. The performance was compared in the original publication [*], and on average, AutoDock Vina did considerably better, both in speed and accuracy. However, for any given target, either program may provide a better result, even though AutoDock Vina is more likely to do so. This is due to the fact that the scoring functions are different, and both are inexact.
What is the difference between AutoDock Vina and AutoDock Tools?
AutoDock Tools is a module within the MGL Tools software package specifically for generating input (PDBQT files) for AutoDock or Vina. It can also be used for viewing the results.
Can I dock two proteins with AutoDock Vina?
You might be able to do that, but AutoDock Vina is designed only for receptor-ligand docking. There are better programs for protein-protein docking.
Will Vina run on my 64-bit machine?
Yes. By design, modern 64-bit machines can run 32-bit binaries natively.
Why do I get 'can not open conf.txt' error? The file exists!
Oftentimes, file browsers hide the file extension, so while you think you have a file 'conf.txt', it's actually called 'conf.txt.txt'. This setting can be changed in the control panel or system preferences.
You should also make sure that the file path you are providing is correct with respect to the directory (folder) you are in, e.g. if you are referring simply to conf.txt in the command line, make sure you are in the same directory (folder) as this file. You can use ls or dir commands on Linux/MacOS and Windows, respectively, to list the contents of your directory.
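The two checks above (is the file in this directory, and does it secretly carry a doubled extension) can also be scripted. The following Python sketch is only an illustration; `find_config` is a hypothetical helper, not part of Vina.

```python
from pathlib import Path

def find_config(name, directory="."):
    """Return the path to `name` in `directory`, or to a doubled-extension
    twin such as 'conf.txt.txt', or None if neither file exists."""
    folder = Path(directory)
    exact = folder / name
    if exact.is_file():
        return exact
    # A hidden extension often means conf.txt is really conf.txt.txt.
    doubled = folder / (name + Path(name).suffix)
    if doubled.is_file():
        return doubled
    return None

found = find_config("conf.txt")
print(found if found else "conf.txt not found in the current directory")
```

If the function returns a name ending in '.txt.txt', either rename the file or pass that name to Vina.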
Why do I get 'usage errors' when I try to follow the video tutorial?
The command line options have changed somewhat since the tutorial was recorded. In particular, '--out' replaced '--all'.
Vina runs well on my machine, but when I run it on my exotic Linux cluster, I get a 'boost thread resource' error. Why?
Your Linux cluster is [inadvertently] configured in such a way as to disallow spawning threads. Therefore, Vina can not run. Contact your system administrator.
Why is my docked conformation different from what you get in the video tutorial?
The docking algorithm is non-deterministic. Even though with this receptor-ligand pair the minimum of the scoring function corresponds to the correct conformation, the docking algorithm sometimes fails to find it. Try several times and see for yourself. Note that the probability of failing to find the minimum may be different with a different system.
My docked conformation is the same, but my energies are different from what you get in the video tutorial. Why?
The scoring function has changed since the tutorial was recorded, but only in the part that is independent of the conformation: the ligand-specific penalty for flexibility has changed.
Why do my results look weird in PyMOL?
PDBQT is not a standard molecular structure format. The version of PyMOL used in the tutorial (0.99rc6) happens to display it well (because PDBQT is somewhat similar to PDB). This might not be the case for newer versions of PyMOL.
Any other way to view the results?
You can also view PDBQT files in PMV (part of MGL Tools), or convert them into a different file format (e.g. using AutoDock Tools, or with 'save as' in PMV).
How big should the search space be?
As small as possible, but not smaller. The smaller the search space, the easier it is for the docking algorithm to explore it. On the other hand, it will not explore ligand and flexible side chain atom positions outside the search space. You should probably avoid search spaces bigger than 30 x 30 x 30 Angstrom, unless you also increase '--exhaustiveness'.
Why am I seeing a warning about the search space volume being over 27000 Angstrom^3?
This is probably because you intended to specify the search space sizes in 'grid points' (0.375 Angstrom), as in AutoDock 4. The AutoDock Vina search space sizes are given in Angstroms instead. If you really intended to use an unusually large search space, you can ignore this warning, but note that the search algorithm's job may be harder. You may need to increase the value of the exhaustiveness to make up for it. This will lead to longer run time.
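As a quick arithmetic check of the two answers above, this Python sketch converts a size given in AutoDock 4 style grid points to Angstroms and tests a box against the 27000 Angstrom^3 threshold (30 x 30 x 30). The helper names are hypothetical, and the 0.375 Angstrom spacing is simply the value quoted above; adjust it if your grid used a different spacing.

```python
GRID_SPACING = 0.375    # Angstrom per grid point, as quoted in the FAQ
VOLUME_WARNING = 27000  # Angstrom^3, i.e. 30 x 30 x 30

def grid_points_to_angstrom(points, spacing=GRID_SPACING):
    """Convert a box edge from grid points to Angstroms."""
    return points * spacing

def box_volume(size_x, size_y, size_z):
    """Volume of the search box in Angstrom^3 (sizes in Angstroms)."""
    return size_x * size_y * size_z

def is_oversized(size_x, size_y, size_z):
    """True if the box volume exceeds the warning threshold."""
    return box_volume(size_x, size_y, size_z) > VOLUME_WARNING

# A box of 60 grid points per edge, mistakenly passed as Angstroms,
# versus the same box after conversion.
edge = grid_points_to_angstrom(60)
print(is_oversized(60, 60, 60), edge, is_oversized(edge, edge, edge))
```

A box entered as 60 x 60 x 60 would trigger the warning, while the intended 22.5 Angstrom edge is comfortably below the threshold.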
The bound conformation looks reasonable, except for the hydrogens. Why?
AutoDock Vina actually uses a united-atom scoring function, i.e. one that involves only the heavy atoms. Therefore, the positions of the hydrogens in the output are arbitrary. The hydrogens in the input file are used to decide which atoms can be hydrogen bond donors or acceptors though, so the correct protonation of the input structures is still important.
What does 'exhaustiveness' really control, under the hood?
In the current implementation, the docking calculation consists of a number of independent runs, starting from random conformations. Each of these runs consists of a number of sequential steps. Each step involves a random perturbation of the conformation followed by a local optimization (using the Broyden-Fletcher-Goldfarb-Shanno algorithm) and a selection in which the step is either accepted or not. Each local optimization involves many evaluations of the scoring function as well as its derivatives in the position-orientation-torsions coordinates. The number of evaluations in a local optimization is guided by convergence and other criteria. The number of steps in a run is determined heuristically, depending on the size and flexibility of the ligand and the flexible side chains. However, the number of runs is set by the exhaustiveness parameter. Since the individual runs are executed in parallel, where appropriate, exhaustiveness also limits the parallelism. Unlike in AutoDock 4, in AutoDock Vina, each run can produce several results: promising intermediate results are remembered. These are merged, refined, clustered and sorted automatically to produce the final result.
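The structure described above (independent runs, each a sequence of perturb, locally optimize, accept-or-reject steps) can be illustrated on a toy problem. The Python sketch below is not Vina's code: the one-dimensional `score` function is invented, plain gradient descent stands in for BFGS, a Metropolis-style rule is assumed for the selection step, the runs execute serially rather than in parallel, and only the best point of each run is kept.

```python
import math
import random

def score(x):
    # Toy 1-D "scoring function" with two minima; purely illustrative.
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    return 4.0 * x * (x * x - 1.0) + 0.3

def local_optimize(x, step=0.01, iterations=200):
    # Plain gradient descent, standing in for BFGS.
    for _ in range(iterations):
        x -= step * gradient(x)
    return x

def single_run(rng, steps=20, temperature=0.2):
    # One run: start from a random point, then perturb and re-optimize.
    current = local_optimize(rng.uniform(-2.0, 2.0))
    best = current
    for _ in range(steps):
        candidate = local_optimize(current + rng.uniform(-1.0, 1.0))
        delta = score(candidate) - score(current)
        # Assumed Metropolis-style selection: always accept improvements,
        # sometimes accept a worse candidate.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current = candidate
        if score(current) < score(best):
            best = current
    return best

def dock(exhaustiveness=8, seed=0):
    # The number of independent runs plays the role of exhaustiveness.
    rng = random.Random(seed)
    results = [single_run(rng) for _ in range(exhaustiveness)]
    return sorted(results, key=score)

print([round(score(x), 3) for x in dock()])
```

Calling `dock` twice with the same seed returns identical results, while changing the seed or the number of runs can change which minimum each run ends in.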
Why do I not get the correct bound conformation?
It can be any of a number of things:
How can I tweak the scoring function?
You can change the weights easily, by specifying them in the configuration file, or in the command line. For example
doubles the strength of all hydrogen bonds. Functionality that would allow the users to create new atom and pseudo-atom types, and specify their own interaction functions, is planned for the future.
This should make it easier to adapt the scoring function to specific targets, model covalent docking and macro-cycle flexibility, experiment with new scoring functions, and, using pseudo-atoms, create directional interaction models.
Stay tuned to the AutoDock mailing list, if you wish to be notified of any beta-test releases.
Why don't I get as many binding modes as I specify with '--num_modes'?
This option specifies the maximum number of binding modes to output. The docking algorithm may find fewer 'interesting' binding modes internally. The number of binding modes in the output is also limited by the 'energy_range', which you may want to increase.
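How the two limits interact can be shown with a few lines of Python. This is only a sketch of the filtering logic as described above, assuming the energy range is measured from the best (lowest) energy; `select_modes` and its default values are illustrative, so check `vina --help` for the actual defaults of your version.

```python
def select_modes(energies, num_modes=9, energy_range=3.0):
    """Keep at most `num_modes` energies (kcal/mol) that lie within
    `energy_range` of the best one. Illustrative only."""
    ranked = sorted(energies)  # most negative (best) first
    if not ranked:
        return []
    best = ranked[0]
    within = [e for e in ranked if e - best <= energy_range]
    return within[:num_modes]

# Four hypothetical modes: the one at -5.0 is 4 kcal/mol above the best,
# so it is dropped even though num_modes would allow it.
print(select_modes([-9.0, -8.5, -5.0, -7.0]))
```

Raising `energy_range` in this example to 5.0 lets the fourth mode through, which mirrors the advice to increase the option when too few modes are reported.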
Why don't the results change when I change the partial charges?
AutoDock Vina ignores the user-supplied partial charges. It has its own way of dealing with the electrostatic interactions through the hydrophobic and the hydrogen bonding terms. See the original publication [*] for details of the scoring function.
I changed something, and now the docking results are different. Why?
Firstly, had you not changed anything, some results could have been different anyway, due to the non-deterministic nature of the search algorithm. Exact reproducibility can be assured by supplying the same random seed to both calculations, but only if all other inputs and parameters are the same as well. Even minor changes to the input can have an effect similar to a new random seed. What does make sense to discuss are the statistical properties of the calculations: e.g. 'with the new protonation state, Vina is much less likely to find the correct docked conformation'.
How do I use flexible side chains?
You split the receptor into two parts: rigid and flexible, with the latter represented somewhat similarly to how the ligand is represented. See the section 'Flexible Receptor PDBQT Files' of the AutoDock4.2 User Guide (page 14) for how to do this in AutoDock Tools. Then, you can issue this command: vina --config conf --receptor rigid.pdbqt --flex side_chains.pdbqt --ligand ligand.pdbqt. Also see this write-up on this subject.
How do I do virtual screening?
Please see the relevant section of the manual.
Please note that a variety of docking management applications exist to assist you in this task.
I don't have sufficient computing resources to run a virtual screen. What are my options?
You may be able to run your project on the World Community Grid, or use DrugDiscovery@TACC. See Other Software.
I have ideas for new features and other suggestions.
For proposed new features, we like there to be a wide consensus, resulting from a public discussion, regarding their necessity. Please consider starting or joining a discussion on the AutoDock mailing list.
Will you answer my questions about Vina if I email or call you?
No. Vina is community-supported. There is no obligation on the authors to help others with their projects. Please see this page for how to get help.
PATH, you can just type 'vina --help' instead. See the Video Tutorial for details. Don't forget to check out Other Software for GUIs, etc.
lib, main and split, with the source code from the appropriate subdirectories. lib must be a library that the other projects depend on, and main and split must be console applications. For optimal performance, remember to compile using the Release mode.
On OS X and Linux, you may want to navigate to the appropriate build subdirectory, customize the Makefile by setting the paths and the Boost version, and then type
vina --help':
For example:
exhaustiveness, the time spent on the search is already varied heuristically depending on the number of atoms, flexibility, etc. Normally, it does not make sense to spend extra time searching to reduce the probability of not finding the global minimum of the scoring function beyond what is significantly lower than the probability that the minimum is far from the native conformation. However, if you feel that the automatic trade-off made between exhaustiveness and time is inadequate, you can increase the exhaustiveness level. This should increase the time linearly and decrease the probability of not finding the minimum exponentially.
kcal/mol.
rmsd/lb (RMSD lower bound) and rmsd/ub (RMSD upper bound), differing in how the atoms are matched in the distance calculation:
rmsd/ub matches each atom in one conformation with itself in the other conformation, ignoring any symmetry
rmsd' matches each atom in one conformation with the closest atom of the same element type in the other conformation (rmsd' can not be used directly, because it is not symmetric)
rmsd/lb is defined as follows: rmsd/lb(c1, c2) = max(rmsd'(c1, c2), rmsd'(c2, c1))
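The three definitions translate directly into code. The Python sketch below follows the wording above literally; the data layout (a conformation as a list of `(element, (x, y, z))` tuples with atoms in the same order in both conformations) and the function names are assumptions made for illustration, and it requires Python 3.8 or later for `math.dist`.

```python
import math

def _rmsd_from_squares(squares):
    return math.sqrt(sum(squares) / len(squares))

def rmsd_ub(c1, c2):
    """Match each atom with itself (same index) in the other conformation."""
    return _rmsd_from_squares(
        [math.dist(a[1], b[1]) ** 2 for a, b in zip(c1, c2)]
    )

def rmsd_prime(c1, c2):
    """Match each atom of c1 with the closest same-element atom of c2."""
    squares = []
    for element, xyz in c1:
        squares.append(min(
            math.dist(xyz, other) ** 2
            for other_element, other in c2 if other_element == element
        ))
    return _rmsd_from_squares(squares)

def rmsd_lb(c1, c2):
    """Symmetrized lower bound: max(rmsd'(c1, c2), rmsd'(c2, c1))."""
    return max(rmsd_prime(c1, c2), rmsd_prime(c2, c1))

# Hypothetical symmetric fragment: the two oxygens swap places.
c1 = [("C", (0, 0, 0)), ("O", (1, 0, 0)), ("O", (-1, 0, 0))]
c2 = [("C", (0, 0, 0)), ("O", (-1, 0, 0)), ("O", (1, 0, 0))]
print(rmsd_lb(c1, c2), rmsd_ub(c1, c2))
```

In this example the swap is a pure symmetry operation, so rmsd/lb is zero while rmsd/ub, which ignores symmetry, is not.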
The advanced options allow
--score_only'; see the paper[2] for what the terms are)
The examples below assume that Bash is your shell. They will need to be adapted to your specific needs.
Windows
To perform virtual screening on Windows, you can either use Cygwin and the Bash scripts below, or, alternatively, adapt them for the Windows scripting language.
Linux, Mac
Suppose you are in a directory containing your receptor receptor.pdbqt and a set of ligands named ligand_01.pdbqt, ligand_02.pdbqt, etc.
You can create a configuration file conf.txt, such as
vina is in your PATH. Otherwise, modify it accordingly.
PBS Cluster
If you have a Linux Beowulf cluster, you can perform the individual dockings in parallel.
Continuing with our example, instead of executing all the dockings in a loop locally, we will write one *.job script per ligand, and use qsub (a PBS command) to schedule these scripts to be executed by the cluster.
Run this shell script to do it. The script assumes that vina and qsub are in your PATH. Otherwise, modify it accordingly.
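The one-job-per-ligand idea can also be sketched in Python. This is an illustration under stated assumptions, not the manual's script: the per-ligand output layout (`ligand_01/out.pdbqt`, `ligand_01/log.txt`) and the helper names are hypothetical, and the `--log` option belongs to the Vina 1.1 series, so confirm the available options with `vina --help`. The sketch only writes the job files; submitting them is left to the caller.

```python
import shlex
from pathlib import Path

def job_script(ligand, workdir, config="conf.txt"):
    """Return the text of a one-ligand job script (hypothetical layout)."""
    base = Path(ligand).stem  # ligand_01.pdbqt -> ligand_01
    return (
        "#!/bin/bash\n"
        f"cd {shlex.quote(str(workdir))}\n"
        f"mkdir -p {base}\n"
        f"vina --config {config} --ligand {ligand} "
        f"--out {base}/out.pdbqt --log {base}/log.txt\n"
    )

def write_jobs(directory="."):
    """Write one <ligand>.job per ligand_*.pdbqt and return the job paths."""
    folder = Path(directory).resolve()
    jobs = []
    for ligand in sorted(folder.glob("ligand_*.pdbqt")):
        job = folder / f"{ligand.stem}.job"
        job.write_text(job_script(ligand.name, folder))
        jobs.append(job)
    return jobs
```

Each returned file could then be handed to the scheduler, for example with `subprocess.run(["qsub", str(job)], check=True)`, assuming `qsub` is in your PATH.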
Once the jobs have been scheduled, you can monitor their status with
Selecting Best Results
If you are on Unix and in a directory that contains directories with PDBQT files, all of which are AutoDock Vina results, you may find this Python script useful for selecting the top results. Run it as:
to get the file names of the top 10 hits, which can then be easily copied.
O. Trott, A. J. Olson, AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization and multithreading, Journal of Computational Chemistry 31 (2010) 455-461
ls conf.txt to see if the file is really there)