An exchange format for QSAR data sets

QSAR-ML is an XML-based format for exchanging complete Quantitative Structure-Activity Relationship (QSAR/QSPR) data sets, implemented as a W3C XML Schema. The XML Schema defines the structure and data types of a QSAR-ML document and makes it possible to check whether a given instance is valid. Descriptors are defined by the Blue Obelisk Descriptor Ontology (BODO). For more information about the architecture and implementation, see the article cited below.
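As a rough illustration of what validating an instance against an XML Schema involves, the sketch below uses the standard Java javax.xml.validation API. The schema in it is a made-up toy, not the QSAR-ML schema: the element names dataset and structure, the id attribute, and the class name SchemaCheck are all assumptions chosen for the example. To check a real data set, the QSAR-ML schema file and a QSAR-ML document would take the place of the two strings.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

class SchemaCheck {
    // Hypothetical toy schema (NOT the QSAR-ML schema): a <dataset> element
    // containing one or more <structure> elements with a required "id" attribute.
    static final String TOY_XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='dataset'>"
      + "<xs:complexType><xs:sequence>"
      + "<xs:element name='structure' maxOccurs='unbounded'>"
      + "<xs:complexType>"
      + "<xs:attribute name='id' type='xs:string' use='required'/>"
      + "</xs:complexType>"
      + "</xs:element>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:element>"
      + "</xs:schema>";

    // Returns true if the XML text validates against the XSD text.
    // A malformed schema or malformed XML also yields false.
    static boolean isValid(String xsd, String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, validation errors are thrown as SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Conforms to the toy schema.
        System.out.println(isValid(TOY_XSD, "<dataset><structure id='mol1'/></dataset>"));
        // Missing the required "id" attribute, so validation fails.
        System.out.println(isValid(TOY_XSD, "<dataset><structure/></dataset>"));
    }
}
```

The same pattern applies to files on disk: pass a StreamSource built from a File for both the schema and the instance document.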

Bioclipse implementation

A set of plugins is available for Bioclipse that makes QSAR-ML easy to work with. Simple graphical editors and wizards guide users through setting up a QSAR-ML-compliant data set. For more information, see the Bioclipse-QSAR wiki page.

Available QSAR-ML data sets

A list of data sets available in QSAR-ML can be found here. Future projects include a public repository and curation tools.


Please direct questions to one of the Bioclipse Mailing Lists.


If you use QSAR-ML in your scientific projects, please cite:

Towards interoperable and reproducible QSAR analyses: Exchange of data sets
Ola Spjuth, Egon L Willighagen, Rajarshi Guha, Martin Eklund, Jarl ES Wikberg
Journal of Cheminformatics 2010, 2:5 doi:10.1186/1758-2946-2-5


The QSAR-ML schema is available under the Eclipse Public License (EPL).


Version: 1.0.0.v20090919
QSAR-ML XML Schema
QSAR-ML XML Schema documentation (HTML)
QSAR-ML XML Schema documentation (PDF)