International Journal of Biometrics and Bioinformatics (IJBB) Volume 2, Issue 1, February 2008.


Editor in Chief Professor João Manuel R. S. Tavares

International Journal of Biometrics and
Bioinformatics (IJBB)

Book: 2008 Volume 2, Issue 1
Publishing Date: 28-02-2008
Proceedings
ISSN (Online): 1985-2347

This work is subject to copyright. All rights are reserved, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting,
re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any
other way, and storage in data banks. Duplication of this publication or parts
thereof is permitted only under the provisions of the copyright law 1965, in its
current version, and permission for use must always be obtained from CSC
Publishers. Violations are liable to prosecution under the copyright law.

IJBB Journal is a part of CSC Publishers
http://www.cscjournals.org

©IJBB Journal
Published in Malaysia

Typesetting: Camera-ready by author, data conversion by CSC Publishing
Services – CSC Journals, Malaysia


CSC Publishers




Table of Contents

Volume 2, Issue 1, February 2008.

Pages 1-16: Inference Networks for Molecular Database Similarity Searching.
Ammar Abdo, Naomie Salim.

International Journal of Biometrics and Bioinformatics, (IJBB), Volume (2) : Issue (1)

Ammar Abdo, Naomie Salim

Inference Networks for Molecular Database Similarity Searching



Ammar Abdo*
ammar_utm@yahoo.com
Faculty of Computer Science and Information Systems
Universiti Teknologi Malaysia
Johor Bahru, Skudai, 81310, Malaysia
*Corresponding author: Tel: +6-0143123054, +6-07-5532637, Fax: +6-07-5532210

Naomie Salim
naomie@utm.my
Faculty of Computer Science and Information Systems
Universiti Teknologi Malaysia
Johor Bahru, Skudai, 81310, Malaysia


Abstract

Molecular similarity searching is the process of finding chemical compounds that are
similar to a target compound. The concept of molecular similarity plays an
important role in modern computer-aided drug design methods and has been
successfully applied in the optimization of lead series. It is used for chemical
database searching and the design of combinatorial libraries. In this paper, we
explore the possibility and effectiveness of using a Bayesian inference network for
similarity searching. The topology of the network represents the dependence
relationships between molecular descriptors and molecules, together with the
quantitative knowledge of the probabilities that encode the strength of these
relationships, mined from our compound collection. The retrieval of compounds that
are active with respect to a given target structure is obtained by means of an inference
process through a network of dependences. The new approach is tested by its
ability to retrieve seven sets of active molecules seeded in the MDDR. Our
empirical results suggest that a similarity method based on Bayesian networks
provides a promising and encouraging alternative to existing similarity searching
methods.

Keywords: Bayesian networks, molecular similarity searching, chemical databases, inference network,
drug discovery.


1. INTRODUCTION
The term chemoinformatics was coined only a few years ago, but it rapidly gained widespread
use. Chemoinformatics is the use of informatics methods to solve chemical problems [42], and it
is now used extensively by pharmaceutical and agrochemical companies. The pressure to find new
active compounds and bring them to market as quickly as possible has led many of these
companies to use information technology in their product discovery and development processes.
Database searching can be divided into three distinct classes of problem: exact-match searching
for the database record that is identical to the query record, partial-match searching for those
database records that contain the query, and best-match searching for those database records
that are most similar to the query record. In chemoinformatics, the first two classes correspond
to structure searching and substructure searching, respectively. The provision of best-match
searching facilities for chemical databases is normally referred to as similarity searching, which
involves quantifying the similarity of a target molecule with all others in the chemical database
in terms of a chosen descriptor or set of descriptors. It is used whenever a potential drug
compound, a lead, has been found. The lead can be further optimised by finding compounds
similar to it, in the hope that a similar but better drug can be synthesised.

Virtual screening (VS) is widely used to enhance the cost-effectiveness of drug-discovery
programmes by ranking a database of chemical structures in decreasing probability of activity.
This prioritisation means that biological testing can be focused on just those few molecules that
have significant a priori probabilities of activity. There are many different ways in which a
database can be prioritised; here we focus on similarity searching, which is one of the most
widely used VS approaches. The basic idea underlying similarity-based VS is the similar property
principle, which states that structurally similar molecules tend to have similar properties [1].
According to this principle, any molecule that has not been tested for biological activity but is
structurally similar to a target molecule that exhibits the activity of interest is also expected
to be active. The database molecules are therefore ranked in decreasing order of similarity to the
target, so that the first molecule is the most likely to be active, the second the next most
likely, and so on.

One objective of the computational tools applied in chemoinformatics is to find leads
early in a drug discovery project. The effectiveness of any similarity method can vary greatly from
one biological activity to another in a way that is difficult to predict. Moreover, any two similarity
methods tend to select different subsets of actives from a database; consequently, it is advisable
to use several similarity search methods where possible [2].

In essence, most of the molecular similarity measures in use originate from areas outside
chemoinformatics, particularly from text retrieval. Although chemical structures differ greatly from
other entities that are commonly stored in databases, some parallels can be drawn between
chemical database searches and searches on words or documents [3]. The many similarities
between information retrieval and chemoinformatics that have already been identified suggest
that chemoinformatics is a domain of which information retrieval researchers should be aware
when considering the applicability of new techniques that they have developed [4]. During the last
two decades, much research has been done to develop different textual information retrieval
techniques. Currently, the Bayesian network is regarded as the best approach to managing
probability and handling uncertainty in textual information retrieval.

2. MOLECULAR SIMILARITY SEARCHING
In similarity searching, a query involves the specification of an entire structure of a molecule. This
specification is in the form of one or more structural descriptors, which are compared with the
corresponding set of descriptors for each molecule in the database [5]. A measure of similarity is
then calculated between the target structure and every database structure. Similarity measures
quantify the relatedness of two molecules, giving a large number (or one) if their molecular
descriptions are closely related and a small number (large negative or zero) if their
molecular descriptions are unrelated. The results of the similarity measure are used to sort the
database structures in order of decreasing similarity with the target, and the resulting ranked list
of structures is returned to the user. There is an extensive and continuing debate about
what sorts of measures are most appropriate [46]. To date, the most common similarity measure is
based on the number of substructural fragments common to a pair of molecules, combined with a
simple association coefficient [46]. The performance of different similarity coefficients in
molecular similarity searching has been analysed in earlier studies. Several methods
have been used to further optimise the measures of similarity between molecules, including
weighting [49], standardisation [47] and data fusion [46, 48]. Probability-based similarity
searching [50] has also been developed on top of the industry-standard vector-space models
(VSM).

A common application of similarity searching is in the rational design of new drugs and pesticides
where the nearest neighbours for an initial lead compound are sought in order to find better
compounds. Similarity searching is also used for property prediction purposes [7], where the
properties of an unknown compound are estimated from those of its nearest neighbours.
Underpinning these applications of molecular similarity measures is the similar property principle
[1], which states that structurally similar molecules will exhibit similar physicochemical and
biological properties. Related to the similar property principle is the concept of neighbourhood
behavior [8], which states that compounds within the same neighbourhood or similarity region
have the same activity. Unknown biological or physicochemical properties of a molecule can be
predicted from the properties of molecules that lie within the same neighbourhood region. In lead
finding, selection of compounds whose neighbourhood regions overlap one another should be
avoided. In lead optimisation, if a particular compound is found to be active, compounds that lie in
the same neighbourhood region can be tested to find the one with optimal activity.
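
The nearest-neighbour property prediction described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the use of the mean over the k nearest neighbours, and the toy fingerprints and property values are assumptions made here, not details given in the paper.

```python
from statistics import mean


def tanimoto(fp_k, fp_l):
    """Binary Tanimoto similarity on sets of 'on' bit positions."""
    common = len(fp_k & fp_l)
    union = len(fp_k | fp_l)
    return common / union if union else 0.0


def predict_property(target_fp, known, k=3):
    """Estimate a property as the mean over the k most similar known molecules.

    `known` is a list of (fingerprint, property_value) pairs.
    Averaging is an assumed choice; other estimators are possible.
    """
    ranked = sorted(known, key=lambda item: tanimoto(target_fp, item[0]), reverse=True)
    return mean(value for _, value in ranked[:k])


# Hypothetical toy data: fingerprints as sets of bit positions, made-up property values.
known = [
    ({1, 2, 3, 4}, 2.0),
    ({1, 2, 3, 5}, 4.0),
    ({7, 8, 9}, 10.0),
]
estimate = predict_property({1, 2, 3}, known, k=2)
print(estimate)
```

Only the two molecules inside the target's neighbourhood contribute to the estimate; the dissimilar third molecule is ignored.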

The first reports on similarity searches appeared in the mid-1980s, based on the work carried out
at Lederle Laboratories [7] and Pfizer [9]. In the Lederle study, molecules were represented by
their constituent atom pairs, where an atom pair is a substructural fragment comprising two
non-hydrogen atoms together with the number of intervening bonds. The similarity search allowed users
to request either some number of the top-ranked molecules or all those that had a similarity with
the target structure greater than a minimal value. In the Pfizer system, together with a
conventional substructural query, a user can submit a target molecule typical of the type of the
structure that was required. The conventional screen search and atom-by-atom search were used
to identify matches in the substructure searching, after which a similarity measure based on the
screens common to the target and the matches was used to rank the substructure search output.
The subsequent development of a faster, inverted-file-based, nearest neighbour search algorithm
allowed the ranking of the entire database against the target structure in real time, without the
need for the specification of the initial substructural query. Since the Lederle and Pfizer systems,
similarity searching has undergone further development. An example is Hagadone’s work on
substructure similarity searching [10]. Substructure similarity searching is used to identify
molecules containing a substructure similar to a target structure or substructure. Another
extension of similarity search was described by Fisanick et al. [11] on facilities developed for
Chemical Abstracts Service (CAS) Registry File. It focuses on different types of similarity
relationships that can be identified between a structure in the query and a database structure.
This study found that different representations could give different measures of structural
resemblance between compounds, which suggests that further analysis of a combined
approach could give a more comprehensive similarity measure between them. Similarity
calculations between molecules have since been used not only in similarity searching,
but also in applications such as compound selection [12, 13] and molecular diversity analysis [14, 15,
16]. Three principal tools used for similarity calculations are the representation that is used to
characterize the molecules that are being compared, the weighting scheme that is used to assign
differing degrees of importance to the various components of these representations, and the
coefficient that is used to determine the degree of relatedness between two structural
representations [17].

2.1 Molecular descriptors
Molecular descriptors are vectors of numbers, each of which is based on some pre-defined
attributes. They are generated from a machine-readable structure representation like a 2D
connection table or a set of experimental or calculated 3D co-ordinates. Molecular descriptors
can be classified into 1D descriptors, 2D descriptors and 3D descriptors. 2D descriptors are
based on information derived from the traditional 2D structure diagram. Examples of 2D
descriptors are 2D fingerprints and topological indices, which are our focus as they play a
prominent role in the experimental work of this paper.

2D fingerprints are the most commonly used descriptors. These descriptors were initially
developed to provide a fast screening step in substructure search systems in which bit strings are
used to represent molecules. They have also proved very useful for similarity searching. There
are two different types of 2D fingerprints: dictionary-based bit strings and hashed fingerprints. In
dictionary-based bit strings, a molecule is split up into fragments of specific functional groups or
substructures. The fragments used are recorded in a predefined fragment dictionary that specifies
the corresponding bit positions of the fragments in the bit string. Bits either individually or as a
group represent the absence or presence of fragments. Examples of dictionary-based
assignment are the CAS ONLINE Screen Dictionary for substructure searching [18], Barnard
Chemical Information system [19, 20] and MDL MACCS key system [21, 22]. In hashed
fingerprints, all the unique fragments that exist in a molecule are hashed using some hashing
function to fit into the length of the bit string. This approach allows for more generalisations
because it does not depend on a predefined list of structural fragments. The fingerprints
generated are characterised by the nature of the chemical structures in the database rather than
by the fragments in some predefined list. This approach is used in the Daylight Chemical
Information Systems [24] and Tripos systems [23].
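
The difference between the two fingerprint types can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration: the four-entry fragment dictionary, the fragment labels, the 64-bit length and the use of CRC32 as the hashing function are not taken from any of the systems cited above.

```python
import zlib

# Hypothetical fragment dictionary: fragment label -> fixed bit position.
FRAGMENT_DICTIONARY = {"C=O": 0, "N-H": 1, "c1ccccc1": 2, "O-H": 3}


def dictionary_fingerprint(fragments, dictionary=FRAGMENT_DICTIONARY):
    """Set one bit per dictionary fragment present; unknown fragments are ignored."""
    bits = [0] * len(dictionary)
    for fragment in fragments:
        position = dictionary.get(fragment)
        if position is not None:
            bits[position] = 1
    return bits


def hashed_fingerprint(fragments, n_bits=64):
    """Hash every fragment to a bit position; collisions are possible by design."""
    bits = [0] * n_bits
    for fragment in fragments:
        position = zlib.crc32(fragment.encode("utf-8")) % n_bits
        bits[position] = 1
    return bits


fragments = ["C=O", "O-H", "C-Cl"]  # "C-Cl" is not in the dictionary
dict_fp = dictionary_fingerprint(fragments)
hash_fp = hashed_fingerprint(fragments)
print(dict_fp)
print(sum(hash_fp), "bits set out of", len(hash_fp))
```

The dictionary-based fingerprint silently drops the fragment that is missing from the predefined list, whereas the hashed fingerprint encodes every fragment at the cost of possible bit collisions.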

Topological indices characterise the bonding pattern of a molecule by a single-valued integer or
real number, obtained from mathematical algorithms applied to the chemical graph representation
of the molecules. Each index thus contains information not about fragments or some locations on
the molecule, but rather about the molecule as a whole. Simpler descriptors include the number
of atoms and bonds and the number of rotatable bonds.
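
As one concrete example of a topological index, the sketch below computes the Wiener index, the sum of shortest-path bond counts over all pairs of heavy atoms in the hydrogen-suppressed molecular graph. The Wiener index is chosen here purely for illustration; the paper does not state which indices it uses, and the adjacency-list encoding is an assumption.

```python
from collections import deque


def wiener_index(adjacency):
    """Sum of shortest-path distances over all unordered atom pairs.

    `adjacency` maps each atom label to the list of atoms bonded to it.
    """
    total = 0
    for source in adjacency:
        # Breadth-first search gives bond-count distances from `source`.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            atom = queue.popleft()
            for neighbour in adjacency[atom]:
                if neighbour not in distance:
                    distance[neighbour] = distance[atom] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted from both ends


# Carbon skeletons of n-butane (a chain) and isobutane (a branched isomer).
n_butane = {"C1": ["C2"], "C2": ["C1", "C3"], "C3": ["C2", "C4"], "C4": ["C3"]}
isobutane = {"C1": ["C2"], "C2": ["C1", "C3", "C4"], "C3": ["C2"], "C4": ["C2"]}
print(wiener_index(n_butane), wiener_index(isobutane))
```

The two isomers have the same atoms and bond counts but different index values, which shows how a single number can capture information about the molecule as a whole.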

Similarity measures based on bit strings are currently the most widely used approach for
database searching [25]. One of the principal applications of bit string based searching is in the
selection of compounds for inclusion in biological screening programs. This is largely due to the
low processing requirements needed to calculate the similarities between a target structure and a
large number of structures.

2.2 Weighting schemes
A weighting scheme is used to differentiate between different features in a molecule, based on
how important they are in determining the similarity of that molecule with another molecule.
Certain molecular features can be emphasised by associating higher weights with them when
calculating similarity. Different types of statistical information can be extracted from computerised
representations of molecules to form the basis for a fragment weighting scheme. These are
as follows: (a) Fragment frequency (ff) is the number of occurrences of a particular fragment within a
molecule, with frequently occurring fragments being given a greater weight than those that
occur less frequently. (b) Inverse fragment frequency (iff) is based on the frequency of the fragment in the
molecule collection, with less frequently occurring fragments being given a greater weight than
those that occur frequently throughout the collection. (c) Molecule size (mz) is the
number of fragments assigned to a molecule, with a fragment in a small molecule being
assigned a greater weight than the same fragment in a large molecule. A further weighting
scheme can be used whenever active and inactive molecules can be distinguished within the
dataset. Unfortunately, only limited studies have been carried out on the effect of weighting
schemes on molecular similarity searching methods. All of the above-mentioned considerations
have been used for assigning weights at the National Cancer Institute [26]. Willett and Winterman
found that giving more weight to fragments that occur more frequently in a molecule did
seem to give good results, but that other weighting schemes had little significance [27].
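
The three statistics can be combined in many ways, and the paper does not prescribe one. The Python sketch below uses an assumed combination, (ff / mz) multiplied by a logarithmic inverse fragment frequency borrowed from text retrieval practice, purely to show how the quantities interact; the function name and the toy collection are hypothetical.

```python
import math
from collections import Counter


def fragment_weights(molecule, collection):
    """Return {fragment: weight} for one molecule given as a list of fragment labels.

    Assumed formula: weight = (ff / mz) * (log(N / n_f) + 1), where N is the
    collection size and n_f the number of molecules containing the fragment.
    `molecule` is expected to be a member of `collection`, so n_f >= 1.
    """
    ff = Counter(molecule)   # fragment frequency within the molecule
    mz = len(molecule)       # molecule size: number of fragments assigned
    n_molecules = len(collection)
    weights = {}
    for fragment, count in ff.items():
        containing = sum(1 for other in collection if fragment in other)
        iff = math.log(n_molecules / containing) + 1.0  # rarer fragment, larger value
        weights[fragment] = (count / mz) * iff
    return weights


collection = [
    ["C=O", "C=O", "O-H"],
    ["C=O", "N-H"],
    ["C=O", "c1ccccc1", "c1ccccc1", "N-H"],
]
weights = fragment_weights(collection[0], collection)
print(weights)
```

In this toy collection the rare O-H fragment ends up with a slightly higher weight than C=O, even though C=O occurs twice in the molecule, because C=O is present in every molecule of the collection.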

2.3 Similarity Coefficients
Similarity coefficients are used to obtain a numeric quantification of the degree of similarity
between a pair of structures [28]. There are four main types of similarity coefficients [29, 30, 31]:
distance coefficients, association coefficients, correlation coefficients and probabilistic
coefficients. Association coefficients are commonly used with binary representations and are
often normalized to lie within the range of zero (no features in common) to unity
(identical representations). However, they can be used with non-binary representations, in which
case the range may be different. Correlation coefficients measure the degree of correlation
between sets of values characterizing a pair of objects. Distance coefficients quantify the degree
of dissimilarity between two objects and, when normalized and using binary data, range between
zero (identity) and unity (no features in common). Probabilistic coefficients, whilst not
much used in measuring molecular similarity, focus on the distribution of the frequencies of
descriptors over the members of a data set, giving more importance to a match on an infrequently
occurring variable. Examples of these coefficients can be found elsewhere [29].

Assume $S_{K,L}$ is the similarity between molecules K and L, both described by binary
representations. For bit string descriptors, n is the total number of bit positions in the bit
strings representing the two molecules compared; a is the number of bit positions set in both
molecules; b is the number of bit positions set only in K, whilst c is the number set only in L;
and d is the number of bit positions set in neither molecule. Thus, n = a + b + c + d. The
origins of the coefficients can be found in a review paper by Ellis et al. [31]. Examples of some of
the coefficients that were used are listed in Table 1.
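
The counts a, b, c and d can be obtained directly from two equal-length bit strings. The following Python sketch assumes fingerprints stored as lists of 0/1 values; the function name and the example bit strings are illustrative only.

```python
def bit_counts(fp_k, fp_l):
    """Return (a, b, c, d) for two equal-length binary fingerprints.

    a: bits set in both, b: set only in K, c: set only in L, d: set in neither.
    """
    if len(fp_k) != len(fp_l):
        raise ValueError("fingerprints must have the same length")
    a = sum(1 for x, y in zip(fp_k, fp_l) if x and y)
    b = sum(1 for x, y in zip(fp_k, fp_l) if x and not y)
    c = sum(1 for x, y in zip(fp_k, fp_l) if y and not x)
    d = sum(1 for x, y in zip(fp_k, fp_l) if not x and not y)
    return a, b, c, d


fp_k = [1, 1, 0, 1, 0, 0, 1, 0]
fp_l = [1, 0, 0, 1, 1, 0, 0, 0]
a, b, c, d = bit_counts(fp_k, fp_l)
print(a, b, c, d, "n =", a + b + c + d)
```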

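| Coefficient | Continuous formula | Continuous range | Binary formula | Binary range |
|---|---|---|---|---|
| Tanimoto | $\dfrac{\sum_{j=1}^{M} w_{jk} w_{jl}}{\sum_{j=1}^{M} w_{jk}^{2} + \sum_{j=1}^{M} w_{jl}^{2} - \sum_{j=1}^{M} w_{jk} w_{jl}}$ | -0.3 to 1 | $\dfrac{a}{a + b + c}$ | 0 to 1 |
| Cosine | $\dfrac{\sum_{j=1}^{M} w_{jk} w_{jl}}{\sqrt{\sum_{j=1}^{M} w_{jk}^{2} \sum_{j=1}^{M} w_{jl}^{2}}}$ | 0 to 1 | $\dfrac{a}{\sqrt{(a + b)(a + c)}}$ | 0 to 1 |
| Forbes | $\dfrac{n \sum_{j=1}^{M} w_{jk} w_{jl}}{\sum_{j=1}^{M} w_{jk}^{2} \sum_{j=1}^{M} w_{jl}^{2}}$ | -∞ to ∞ | $\dfrac{n \times a}{(a + b)(a + c)}$ | 0 to ∞ |
| Russell-Rao | $\dfrac{\sum_{j=1}^{M} w_{jk} w_{jl}}{n}$ | -∞ to ∞ | $\dfrac{a}{n}$ | 0 to 1 |
| Dice | $\dfrac{2 \sum_{j=1}^{M} w_{jk} w_{jl}}{\sum_{j=1}^{M} w_{jk}^{2} + \sum_{j=1}^{M} w_{jl}^{2}}$ | 0 to 1 | $\dfrac{2a}{2a + b + c}$ | 0 to 1 |

TABLE 1: Examples of Association Coefficients.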
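
The binary forms in Table 1 translate directly into code. The Python sketch below is a plain transcription of those formulas, plus the continuous Tanimoto form for comparison; the function names and the example counts are chosen here for illustration, and no guard is included for empty fingerprints, where the denominators would be zero.

```python
import math


def tanimoto(a, b, c):
    return a / (a + b + c)


def cosine(a, b, c):
    return a / math.sqrt((a + b) * (a + c))


def forbes(a, b, c, n):
    return n * a / ((a + b) * (a + c))


def russell_rao(a, n):
    return a / n


def dice(a, b, c):
    return 2 * a / (2 * a + b + c)


def tanimoto_continuous(w_k, w_l):
    """Continuous Tanimoto form on two equal-length weight vectors."""
    dot = sum(x * y for x, y in zip(w_k, w_l))
    return dot / (sum(x * x for x in w_k) + sum(y * y for y in w_l) - dot)


# Example counts: 2 bits in common, 2 only in K, 1 only in L, 3 in neither.
a, b, c, d = 2, 2, 1, 3
n = a + b + c + d
scores = {
    "Tanimoto": tanimoto(a, b, c),
    "Cosine": cosine(a, b, c),
    "Forbes": forbes(a, b, c, n),
    "Russell-Rao": russell_rao(a, n),
    "Dice": dice(a, b, c),
}
for name, value in scores.items():
    print(f"{name}: {value:.4f}")

# On 0/1 vectors with the same counts, the continuous form matches the binary one.
w_k = [1, 1, 0, 1, 0, 0, 1, 0]
w_l = [1, 0, 0, 1, 1, 0, 0, 0]
continuous_score = tanimoto_continuous(w_k, w_l)
```

The same pair of fingerprints receives noticeably different scores under different coefficients, which is why the choice of coefficient matters for ranking.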

The Tanimoto coefficient in Eq. 1 is the most popular coefficient used by similarity methods. If two
molecules K and L have b and c bits, respectively, set only in their own fragment bit strings, and
a bits set in both fingerprints, then the similarity between the two molecules using the Tanimoto
coefficient is defined to be:

$$S_{K,L} = \frac{a}{a + b + c} \qquad (1)$$

The Tanimoto coefficient gives values in the range of zero (no bits in common) to unity (all bits
the same). It has generally been reported to give better results than the other coefficients.
Currently, the Tanimoto coefficient is widely used in molecular similarity methods and has become
the preferred choice in both in-house and commercial software systems for chemical information
management.
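
A conventional Tanimoto-based similarity search, the baseline against which the network model is later compared, can be sketched as follows. The fingerprints are stored as sets of "on" bit positions, and the compound names and bit patterns are invented for illustration.

```python
def tanimoto(fp_k, fp_l):
    """Tanimoto coefficient a / (a + b + c) on sets of 'on' bit positions."""
    a = len(fp_k & fp_l)   # bits set in both
    b = len(fp_k - fp_l)   # bits set only in K
    c = len(fp_l - fp_k)   # bits set only in L
    total = a + b + c
    return a / total if total else 0.0


def rank_database(target, database, top_n=None):
    """Return (name, score) pairs sorted by decreasing similarity to the target."""
    scored = [(name, tanimoto(target, fp)) for name, fp in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_n is None else scored[:top_n]


# Hypothetical toy database.
database = {
    "mol_A": frozenset({1, 2, 3, 4}),
    "mol_B": frozenset({1, 2}),
    "mol_C": frozenset({5, 6}),
    "mol_D": frozenset({1, 2, 3}),
}
target = frozenset({1, 2, 3})
ranking = rank_database(target, database)
for name, score in ranking:
    print(name, round(score, 3))
```

The `top_n` argument mirrors the option of requesting only some number of top-ranked molecules; a similarity threshold could be applied to the returned scores in the same way.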

3. BAYESIAN NETWORKS
Recent research in information retrieval has shown that retrieval models based on Bayesian
networks give significant improvements in retrieval performance compared to conventional models
[36, 37, 38, 43]. It is therefore likely that a Bayesian network is able to represent the main
(in)dependence relationships between molecular descriptors as conditional probabilities, with the
degree of resemblance between pairs of such descriptors computed to represent the probability.
Molecular similarity will be regarded as an inference or evidential reasoning process in which the
probability that a given compound meets the requirements of a query is estimated and used as
evidence. Network representations have shown promise as mechanisms for inferring these kinds
of relationships. In this paper, we explore the possibility and effectiveness of using such
networks for similarity searching.

A Bayesian network (BN) is a graphical model of a probability distribution [33]. A Bayesian network
is a directed acyclic graph (DAG) in which the nodes represent random variables and the arcs
show causality, relevance or dependency relationships between them. The variables and their
relationships comprise the qualitative knowledge stored in a Bayesian network. The strength of
the relationships, measured by means of probability distributions, is also stored in the DAG.
Associated with each node is a set of conditional probability distributions, one for each possible
combination of values that its parents can take. A Bayesian network can be considered an
efficient representation of a joint probability distribution that takes into account the set of
independence relationships represented in the graphical component of the model. In general terms,
given a set of variables {X1, . . . , Xn} and a Bayesian network G, the joint probability distribution in
terms of local conditional probabilities is obtained as follows:

$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid \pi(X_i))$$


where π(Xi) is any combination of the values of the parent set of Xi. If Xi has no parents, then the
set π(Xi) is empty, and therefore P(Xi|π(Xi)) is just P(Xi). Once completed, a Bayesian network
can be used to derive the posterior probability distribution of one or more variables in the network,
or to update previous conclusions when new evidence reaches the system.
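
The factorisation of the joint distribution can be checked numerically on a small network. The Python sketch below uses three Boolean variables with an arbitrary, hypothetical structure and made-up conditional probabilities; it only illustrates the product over local conditional probabilities and is not a model from the paper.

```python
from itertools import product

# Hypothetical DAG: X1 has no parents, X2 depends on X1, X3 depends on X1 and X2.
PARENTS = {"X1": (), "X2": ("X1",), "X3": ("X1", "X2")}

# P(X = True | parent values); all numbers are made up for illustration.
CPT = {
    "X1": {(): 0.2},
    "X2": {(True,): 0.01, (False,): 0.4},
    "X3": {
        (True, True): 0.99,
        (True, False): 0.8,
        (False, True): 0.9,
        (False, False): 0.0,
    },
}


def joint_probability(assignment):
    """P(X1, ..., Xn) as the product of P(Xi | parents(Xi))."""
    probability = 1.0
    for variable, parents in PARENTS.items():
        parent_values = tuple(assignment[p] for p in parents)
        p_true = CPT[variable][parent_values]
        probability *= p_true if assignment[variable] else 1.0 - p_true
    return probability


example = joint_probability({"X1": True, "X2": False, "X3": True})
total = sum(
    joint_probability(dict(zip(PARENTS, values)))
    for values in product([True, False], repeat=len(PARENTS))
)
print(example, total)
```

Summing the product over all eight assignments gives one, which confirms that the local conditional tables define a valid joint distribution.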

4. SIMILARITY INFERENCE NETWORK MODEL
The basic model for the similarity inference network, shown in Fig. 1, consists of two component
networks: a compound network and a query network. The compound network represents the
compound collection. It is built once for a given collection and its structure
does not change during query processing. The query network consists of a single node that
represents the target molecule, together with one or several query molecules that express the
target molecule. A query network is built for each target molecule and is modified during query
processing as the query is refined or additional representations are added in an attempt to better
characterize the target molecule. The compound and query networks are connected through links
between their descriptor nodes.

4.1 Compound Network

The compound network shown in Fig. 1 is a simple directed acyclic graph (DAG) consisting of
compound nodes (cj) as roots and descriptor nodes (di) as leaves. Each compound node
represents a compound in the collection and has an associated prior probability
that describes the probability of observing that compound. This prior probability will
generally be set to 1/(collection size) and will therefore be small for real collections.

Compound nodes have one or more descriptor nodes as children. The descriptor nodes can be
divided into several subsets, each corresponding to a single descriptor technique that has been
applied to the compound. For example, when 1052 bits are used to describe the compounds using the
BCI fingerprint, 1052 nodes are used to represent these bits; if 10 topological indices are used to
describe the compounds, 10 nodes are used to represent these numerical values. We represent
the assignment of a specific descriptor to a compound by drawing a directed arc to the descriptor
node from each compound node to which that descriptor is assigned. Each descriptor node
contains a specification of the conditional probability associated with the node given its set of
parent compound nodes. This specification incorporates the effect of any weighting scheme
associated with the descriptor node.
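
As a rough illustration of this structure, the following Python sketch builds a toy compound network from binary fingerprints. The function name, the dictionary layout and the three example compounds are hypothetical; only the uniform prior 1/(collection size) and the compound-to-descriptor arcs follow the description above, and no weighting scheme is modelled.

```python
def build_compound_network(fingerprints):
    """Build priors and arcs for a toy compound network.

    `fingerprints` maps a compound name to a list of 0/1 descriptor bits.
    """
    collection_size = len(fingerprints)
    # Uniform prior probability of observing each compound: 1 / (collection size).
    priors = {name: 1.0 / collection_size for name in fingerprints}
    # One directed arc from a compound node to every descriptor node whose bit is set.
    children = {
        name: [i for i, bit in enumerate(bits) if bit]
        for name, bits in fingerprints.items()
    }
    # Parent sets of the descriptor nodes, on which their conditional
    # probabilities would be specified.
    parents = {}
    for name, descriptor_ids in children.items():
        for i in descriptor_ids:
            parents.setdefault(i, []).append(name)
    return priors, children, parents


toy_collection = {
    "c1": [1, 0, 1, 0],
    "c2": [0, 0, 1, 1],
    "c3": [1, 1, 0, 0],
}
priors, children, parents = build_compound_network(toy_collection)
print(priors)
print(children)
print(parents)
```

Because the network depends only on the collection, it can be built once and reused unchanged for every query.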




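[Figure 1: a layer of compound nodes C1, C2, ..., Cj, ..., CM is linked to a layer of descriptor
nodes d1, d2, d3, ..., di, ..., dN, which are linked in turn to the query node Q.]

FIGURE 1: Similarity inference network model.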

4.2 Query Network
The query network is an “inverted” DAG with a single leaf that corresponds to the target molecule
and multiple roots that correspond to the descriptors that express the target. If there is only one
query molecule, the target molecule node and the query molecule node coincide. In addition, the
query network is intended to allow us to combine several query molecules to form a single query
molecule. The roots of the query network are query descriptors; they correspond to the
descriptors used to express the target molecule. Each query descriptor node has a single
compound descriptor node as its parent and contains a specification of its
dependence on that parent. The query descriptor nodes define
the mapping between the descriptor layer used to represent the compound collection and the
descriptor layer used to describe the target molecule. In our model, the relation between query and
compound descriptors is 1:1 and each query descriptor depends completely on its compound
descriptor. Thus, in order to simplify and reduce our
model, the query descriptors are the same as the compound descriptors. The attachment of the
query descriptor nodes to the compound network has no effect on the basic structure of the
compound network: none of the existing links needs to change and none of the conditional
probability specifications stored in the nodes is modified.

To produce a ranking of the compounds in the collection with respect to a given target molecule
T, we compute the probability that this target molecule is satisfied given that compound cj has
been observed, P(T|cj). This is referred to as instantiating cj and corresponds to attaching
evidence to the network by stating that cj = true, whereas the rest of the compound nodes are set
to false. Once the probability P(T|cj) has been computed, this evidence is removed and a new
compound ci, i ≠ j, is instantiated. By repeating this computation for the rest of the compounds
in the collection, the ranking is produced.
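
The instantiate-and-propagate loop can be sketched in Python as follows. The probability estimates are placeholders assumed here, since this part of the paper does not give them: the belief in a descriptor node given an instantiated compound is taken to be 1 if the corresponding bit is set and 0 otherwise, and the belief in the target is taken to be the plain average of the beliefs in its query descriptors. All function names and toy data are hypothetical.

```python
def descriptor_belief(compound_bits, i):
    """Assumed P(d_i = true | c_j = true, all other compounds false).

    Placeholder estimate: 1.0 if descriptor i is assigned to the compound,
    else 0.0. No weighting scheme is applied.
    """
    return 1.0 if i in compound_bits else 0.0


def target_belief(query_bits, compound_bits):
    """Assumed P(T | c_j): average belief over the query descriptor nodes."""
    if not query_bits:
        return 0.0
    beliefs = [descriptor_belief(compound_bits, i) for i in query_bits]
    return sum(beliefs) / len(beliefs)


def rank_compounds(query_bits, collection):
    """Instantiate each compound in turn and rank by decreasing P(T | c_j)."""
    beliefs = {}
    for name, compound_bits in collection.items():
        # Evidence: this compound is true and every other compound is false,
        # so only its own descriptor arcs contribute to the belief in T.
        beliefs[name] = target_belief(query_bits, compound_bits)
    return sorted(beliefs.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy collection: compounds as sets of assigned descriptor indices.
collection = {"c1": {0, 1, 2}, "c2": {2, 3}, "c3": {4}}
ranking = rank_compounds({0, 1, 2}, collection)
for name, belief in ranking:
    print(name, round(belief, 3))
```

Replacing the two placeholder functions with weighted estimates would change the scores but not the overall loop, which is the part described in the text.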
