org.openscience.cdk.similarity
Class Tanimoto

java.lang.Object
  extended by org.openscience.cdk.similarity.Tanimoto

@TestClass(value="org.openscience.cdk.similarity.TanimotoTest")
public class Tanimoto
extends java.lang.Object

Calculates the Tanimoto coefficient for a pair of fingerprint bitsets or real-valued feature vectors. The Tanimoto coefficient is one way to quantify the similarity (or "distance") of two chemical structures.

You can use the Fingerprinter class to obtain two fingerprint bitsets. Assuming that you have two structures stored in cdk.Molecule objects, the Tanimoto coefficient can then be calculated as follows:

   BitSet fingerprint1 = Fingerprinter.getFingerprint(molecule1);
   BitSet fingerprint2 = Fingerprinter.getFingerprint(molecule2);
   float tanimoto_coefficient = Tanimoto.calculate(fingerprint1, fingerprint2);
  

The Fingerprinter assumes that hydrogens are given explicitly, if they are to be considered.

Note that the continuous Tanimoto coefficient does not lead to a metric space.

Author:
steinbeck
Keywords:
jaccard, similarity, tanimoto
Created on:
2005-10-19
Belongs to CDK module:
fingerprint
Source code:
HEAD

Constructor Summary
Tanimoto()
           
 
Method Summary
static float calculate(java.util.BitSet bitset1, java.util.BitSet bitset2)
          Evaluates the Tanimoto coefficient for two bit sets.
static float calculate(double[] features1, double[] features2)
          Evaluates the continuous Tanimoto coefficient for two real-valued vectors.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Tanimoto

public Tanimoto()
Method Detail

calculate

@TestMethod(value="testTanimoto1,testTanimoto2")
public static float calculate(java.util.BitSet bitset1,
                              java.util.BitSet bitset2)
                       throws CDKException
Evaluates the Tanimoto coefficient for two bit sets.

Parameters:
bitset1 - A bitset (such as a fingerprint) for the first molecule
bitset2 - A bitset (such as a fingerprint) for the second molecule
Returns:
The Tanimoto coefficient
Throws:
CDKException - if the bitsets are not of the same length
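
For orientation, the Tanimoto (Jaccard) coefficient of two bit sets is commonly defined as the number of bits set in both, divided by the number of bits set in at least one of them. The following is a minimal, stand-alone sketch of that definition using only java.util.BitSet. It is not the CDK source: the class name BitSetTanimotoSketch and the method name tanimoto are hypothetical, the length check that leads to a CDKException is left out, and returning 0 for two empty bit sets is an assumption of this sketch.

```java
import java.util.BitSet;

// Hypothetical illustration class; not part of the CDK.
class BitSetTanimotoSketch {

    /**
     * Tanimoto coefficient of two bit sets: the number of bits set in both,
     * divided by the number of bits set in at least one of them.
     */
    static float tanimoto(BitSet a, BitSet b) {
        // Work on copies so the caller's bit sets are not modified.
        BitSet common = (BitSet) a.clone();
        common.and(b);
        BitSet either = (BitSet) a.clone();
        either.or(b);

        int unionCount = either.cardinality();
        if (unionCount == 0) {
            // Assumption of this sketch: two empty bit sets are reported as 0.
            return 0f;
        }
        return (float) common.cardinality() / unionCount;
    }

    public static void main(String[] args) {
        BitSet fp1 = new BitSet(8);
        fp1.set(0);
        fp1.set(1);
        fp1.set(2);

        BitSet fp2 = new BitSet(8);
        fp2.set(1);
        fp2.set(2);
        fp2.set(3);

        // Two bits are shared and four bits are set in the union.
        System.out.println(tanimoto(fp1, fp2));
    }
}
```

With this definition, identical non-empty bit sets give 1 and bit sets without any common bit give 0.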

calculate

@TestMethod(value="testTanimoto3")
public static float calculate(double[] features1,
                              double[] features2)
                       throws CDKException
Evaluates the continuous Tanimoto coefficient for two real-valued vectors.

Parameters:
features1 - The first feature vector
features2 - The second feature vector
Returns:
The continuous Tanimoto coefficient
Throws:
CDKException - if the features are not of the same length
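
For real-valued vectors x and y, the continuous Tanimoto coefficient is usually written as (x·y) / (|x|^2 + |y|^2 - x·y), where x·y is the dot product. The sketch below illustrates that formula in plain Java. It is an assumption-laden illustration rather than the CDK implementation: the class name ContinuousTanimotoSketch and the method name tanimoto are hypothetical, it returns a double instead of a float, it throws IllegalArgumentException instead of CDKException, and it reports 0 when both vectors are all zeros.

```java
// Hypothetical illustration class; not part of the CDK.
class ContinuousTanimotoSketch {

    /**
     * Continuous Tanimoto coefficient: dot(x, y) / (|x|^2 + |y|^2 - dot(x, y)).
     */
    static double tanimoto(double[] x, double[] y) {
        if (x.length != y.length) {
            // The documented method throws CDKException here; this sketch
            // uses a standard exception to stay free of CDK dependencies.
            throw new IllegalArgumentException("Vectors must have the same length");
        }

        double dot = 0.0; // sum of x[i] * y[i]
        double xx = 0.0;  // sum of x[i]^2
        double yy = 0.0;  // sum of y[i]^2
        for (int i = 0; i < x.length; i++) {
            dot += x[i] * y[i];
            xx += x[i] * x[i];
            yy += y[i] * y[i];
        }

        double denominator = xx + yy - dot;
        if (denominator == 0.0) {
            // Only reached when both vectors are all zeros (or empty).
            // Assumption of this sketch: report 0 in that case.
            return 0.0;
        }
        return dot / denominator;
    }

    public static void main(String[] args) {
        double[] features1 = {1.0, 2.0, 3.0};
        double[] features2 = {1.0, 2.0, 4.0};
        // dot = 17, |x|^2 = 14, |y|^2 = 21, so the result is 17 / 18.
        System.out.println(tanimoto(features1, features2));
    }
}
```

For vectors that contain only 0 and 1, this formula gives the same value as the bit set definition.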