Journal of Bioinformatics and Computational Biology

This page shows you the latest items in this publication.

304 records returned

Search similar protein structures with classification, sequence and 3D alignments.
We have developed an algorithm and web tool to search similar protein structures in the PDB (Protein Data Bank). The algorithm is a combination of a series of methods including protein classification, geometric feature extraction, sequence alignment, and 3D structure alignment. Given a protein structure, the tool can efficiently discover similar structures from hundreds of thousands of structures stored in the PDB. Our experimental results show that it is more accurate than other well-known protein search systems including PSI-BLAST, 3D-BLAST, and SSM in finding proteins that are structurally similar to the query prote...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Lu Z, Zhao Z, Garcia S, Krishnaswamy K, Fu B Tags: J Bioinform Comput Biol Source Type: journals

Protein fold classification with genetic algorithms and feature selection.
Protein fold classification is a key step to predicting protein tertiary structures. This paper proposes a novel approach based on genetic algorithms and feature selection to classifying protein folds. Our dataset is divided into a training dataset and a test dataset. Each individual for the genetic algorithms represents a selection function of the feature vectors of the training dataset. A support vector machine is applied to each individual to evaluate the fitness value (fold classification rate) of each individual. The aim of the genetic algorithms is to search for the best individual that produces the highest fold ...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Chen P, Liu C, Burge L, Mahmood M, Southerland W, Gloster C Tags: J Bioinform Comput Biol Source Type: journals

Predicting local quality of a sequence-structure alignment.
We present two complementary techniques, FragQA and PosQA, to accurately predict local quality of a sequence-structure (i.e. sequence-template) alignment generated by comparative modeling (i.e. homology modeling and threading). FragQA and PosQA predict local quality from two different perspectives. Different from existing methods, FragQA directly predicts cRMSD between a continuously aligned fragment determined by an alignment and the corresponding fragment in the native structure, while PosQA predicts the quality of an individual aligned position. Both FragQA and PosQA use an SVM (Support Vector Machine) regression method...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Gao X, Xu J, Li SC, Li M Tags: J Bioinform Comput Biol Source Type: journals

Efficient simulation of ligand-receptor binding processes using the conformation dynamics approach.
The understanding of biological ligand-receptor binding processes is relevant for a variety of research topics and assists the rational design of novel drug molecules. Computer simulation can help to advance this understanding but, owing to the high dimensionality of the systems involved, suffers from severe computational cost. Based on the framework provided by conformation dynamics and transition state theory, a novel heuristic approach to simulating ligand-receptor binding processes is introduced, which does not depend on calculating lengthy molecular dynamics trajectories. First, the relevant portion of conformati...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Bujotzek A, Weber M Tags: J Bioinform Comput Biol Source Type: journals

Iterative two-pass algorithm for missing data imputation in SNP arrays.
Although the quality of high-throughput genotyping techniques keeps improving, missing data remain fairly common. Studies have shown that even a low percentage of missing SNPs is detrimental to the reliability of downstream analyses such as SNP-disease association tests. This paper investigates the potential for improving the accuracy of an SNP inference method based on the algorithm formerly designed by Roberts and co-workers (NPUTE, 2007). This initial algorithm performs a single scan of an SNP array, inferring missing SNPs in the context of sliding windows. We have first designed a variant, KNNWinOpti, which fu...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Sinoquet C Tags: J Bioinform Comput Biol Source Type: journals

A novel coherence measure for discovering scaling biclusters from gene expression data.
Biclustering methods are used to identify a subset of genes that are co-regulated in a subset of experimental conditions in microarray gene expression data. Many biclustering algorithms rely on optimizing mean squared residue to discover biclusters from a gene expression dataset. Recently it has been proved that mean squared residue is only good in capturing constant and shifting biclusters. However, scaling biclusters cannot be detected using this metric. In this article, a new coherence measure called scaling mean squared residue (SMSR) is proposed. Theoretically it has been proved that the proposed new measure is ab...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Mukhopadhyay A, Maulik U, Bandyopadhyay S Tags: J Bioinform Comput Biol Source Type: journals
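
The abstract above states that mean squared residue captures constant and shifting biclusters but not scaling ones. A minimal sketch of that claim, assuming the usual Cheng and Church definition of mean squared residue (residue = entry - row mean - column mean + overall mean); the toy matrices are made up for illustration, and the authors' SMSR measure is not reproduced here because its formula is not given in the abstract:

```python
def mean_squared_residue(matrix):
    """Mean squared residue of a bicluster given as a list of equal-length rows."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [sum(matrix[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = matrix[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical toy data: one base expression profile, shifted or scaled per gene.
base = [1.0, 2.0, 4.0]
shifting = [[value + shift for value in base] for shift in (0.0, 3.0, 5.0)]
scaling = [[value * factor for value in base] for factor in (1.0, 2.0, 3.0)]

print("MSR of shifting bicluster:", mean_squared_residue(shifting))
print("MSR of scaling bicluster:", mean_squared_residue(scaling))
```

The shifting pattern scores zero (up to rounding), while the perfectly coherent scaling pattern does not, which is the gap a scaling-aware measure is meant to close.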

Asymptotics of canonical and saturated RNA secondary structures.
It is a classical result of Stein and Waterman that the asymptotic number of RNA secondary structures is $1.104366 \cdot n^{-3/2} \cdot 2.618034^n$. In this paper, we study combinatorial asymptotics for two special subclasses of RNA secondary structures - canonical and saturated structures. Canonical secondary structures are defined to have no lonely (isolated) base pairs. This class of secondary structures was introduced by Bompfünewerer et al., who noted that the run time of Vienna RNA Package is substantially reduced when restricting computations to canonical structures. Here we provide an explanation for the speed-up, b...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Clote P, Kranakis E, Krizanc D, Salvy B Tags: J Bioinform Comput Biol Source Type: journals
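
The Stein and Waterman asymptotic quoted in the abstract above can be checked numerically. The sketch below assumes the classical counting recurrence for secondary structures in which at least one unpaired base separates any two paired bases, S(n+1) = S(n) + sum over k = 1..n-1 of S(k) S(n-1-k) with S(0) = S(1) = S(2) = 1; the function names are illustrative, and the canonical and saturated subclasses studied in the paper are not covered:

```python
def count_secondary_structures(n_max):
    """Return [S(0), ..., S(n_max)], the number of secondary structures on n bases."""
    counts = [1, 1, 1]
    for n in range(2, n_max):
        # Either base n+1 is unpaired, or it pairs with an earlier base and
        # splits the sequence into two independent parts.
        paired = sum(counts[k] * counts[n - 1 - k] for k in range(1, n))
        counts.append(counts[n] + paired)
    return counts[: n_max + 1]


def stein_waterman_asymptotic(n):
    """Asymptotic estimate quoted in the abstract."""
    return 1.104366 * n ** -1.5 * 2.618034 ** n


counts = count_secondary_structures(200)
for n in (10, 50, 200):
    ratio = counts[n] / stein_waterman_asymptotic(n)
    print(f"n={n}: exact/asymptotic = {ratio:.4f}")
```

The ratio of the exact count to the estimate approaches 1 as n grows, which is what an asymptotic of this form promises.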

A note on the calculation of N-statistics.
A class of statistics suitable for testing against equality of multivariate distributions was described by Klebanov and co-workers in 2007. Referred to as N-statistics, their discriminating ability is based on various forms of distance kernels in $\mathbb{R}^d$, the intention being to capture distinct forms of deviation from equality. This makes them particularly suitable for large-scale genomic screening applications, in which such variety of alternatives can be anticipated. One of these kernels, denoted as L(4), introduces weighting by directional densities, hence the evaluation of L(4) requires integration on the unit sphere i...
Source: Journal of Bioinformatics and Computational Biology - September 29, 2009 Category: Bioinformatics Authors: Almudevar A Tags: J Bioinform Comput Biol Source Type: journals

Uniqueness, intractability and exact algorithms: reflections on level-k phylogenetic networks.
Phylogenetic networks provide a way to describe and visualize evolutionary histories that have undergone so-called reticulate evolutionary events such as recombination, hybridization or horizontal gene transfer. The level k of a network determines how non-treelike the evolution can be, with level-0 networks being trees. We study the problem of constructing level-k phylogenetic networks from triplets, i.e. phylogenetic trees for three leaves (taxa). We give, for each k, a level-k network that is uniquely defined by its triplets. We demonstrate the applicability of this result by using it to prove that (1) for all k >...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: VAN Iersel L, Kelk S, Mnich M Tags: J Bioinform Comput Biol Source Type: journals

The NET-HMM approach: phylogenetic network inference by combining maximum likelihood and hidden Markov models.
We describe the properties of the NET-HMM, devise efficient algorithms for solving a set of problems related to it, and implement them in software. We also provide a novel complementary significance test for evaluating the fitness of a model (NET-HMM) to a given dataset. Using NET-HMM, we are able to answer interesting biological questions, such as inferring the length of partial HGTs and the affected nucleotides in the genomic sequences, as well as inferring the exact location of HGT events along the tree branches. These advantages are demonstrated through the analysis of synthetic inputs and three different biological...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: Snir S, Tuller T Tags: J Bioinform Comput Biol Source Type: journals

Curve-based clustering of time course gene expression data using self-organizing maps.
There is an increasing interest in clustering time course gene expression data to investigate a wide range of biological processes. However, developing a clustering algorithm ideal for time course gene expression data is still challenging. As timing is an important factor in defining true clusters, a clustering algorithm should explore expression correlations between time points in order to achieve a high clustering accuracy. Moreover, inter-cluster gene relationships are often desired in order to facilitate the computational inference of biological pathways and regulatory networks. In this paper, a new clustering algorith...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: Chen X Tags: J Bioinform Comput Biol Source Type: journals

Comparing Pearson, Spearman and Hoeffding's D measure for gene expression association analysis.
DNA microarrays have become a powerful tool to describe gene expression profiles associated with different cellular states, various phenotypes and responses to drugs and other extra- or intra-cellular perturbations. In order to cluster co-expressed genes and/or to construct regulatory networks, definition of distance or similarity between measured gene expression data is usually required, the most common choices being Pearson's and Spearman's correlations. Here, we evaluate these two methods and also compare them with a third one, namely Hoeffding's D measure, which is used to infer nonlinear and non-monotonic associat...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: Fujita A, Sato JR, Demasi MA, Sogayar MC, Ferreira CE, Miyano S Tags: J Bioinform Comput Biol Source Type: journals
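
To illustrate why the abstract above looks beyond Pearson's and Spearman's correlations, the sketch below computes both on two made-up profiles: a monotonic but nonlinear one, and a U-shaped (non-monotonic) one. The helper names and data are assumptions for illustration; Hoeffding's D itself is not implemented here:

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression profiles over seven conditions.
x = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
monotone = [math.exp(v) for v in x]   # nonlinear but monotonic in x
u_shaped = [v * v for v in x]         # fully determined by x, yet non-monotonic

print("monotone:", pearson(x, monotone), spearman(x, monotone))
print("u-shaped:", pearson(x, u_shaped), spearman(x, u_shaped))
```

Spearman recovers the monotonic dependence that Pearson understates, but both report no association for the U-shaped profile even though it depends entirely on x. That is the kind of relationship a non-monotonic measure is designed to detect.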

Aliasing in gene feature detection by projective methods.
Because of measurements obtained under limited experimental conditions or time points compared to the presence of many genes, also known as the "large dimension, small sample size" problem, dimensionality reduction techniques are a common practice in statistical bioinformatics involving microarray analysis. However, in order to improve the performance of reverse engineering and statistical inference procedures aimed to estimate gene-gene connectivity links, some kind of regularization is usually needed to reduce the overall data complexities, together with ad hoc feature selection to uncover biologically relevant gene ...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: Capobianco E Tags: J Bioinform Comput Biol Source Type: journals

Clustering-based approach for predicting motif pairs from protein interaction data.
Predicting motif pairs from a set of protein sequences based on the protein-protein interaction data is an important, but difficult computational problem. Tan et al. proposed a solution to this problem. However, the scoring function (using $\chi^2$ testing) used in their approach is not adequate and their approach is also not scalable. It may take days to process a set of 5000 protein sequences with about 20,000 interactions. Later, Leung et al. proposed an improved scoring function and faster algorithms for solving the same problem. But, the model used in Leung et al. is complicated. The exact value of the scoring funct...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: Leung HC, Siu MH, Yiu SM, Chin FY, Sung KW Tags: J Bioinform Comput Biol Source Type: journals

Inference of large-scale gene regulatory networks using regression-based network approach.
In this study, we propose a simple procedure for constructing large-scale gene regulatory networks using a regression-based network approach. We determine the optimal out-degree of network structure by using the sum of squared coefficients which are obtained from all appropriate regression models. Through the simulated data, accuracy of estimation and robustness against noise are computed in order to compare with the vector autoregressive regression model. Our method shows high accuracy and robustness for inferring large-scale gene networks. Also it is applied to Caulobacter crescentus cell cycle data consisting of 1472 gen...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: Kim H, Lee JK, Park T Tags: J Bioinform Comput Biol Source Type: journals

A tutorial of techniques for improving standard hidden Markov model algorithms.
In this tutorial, we discuss two main algorithms for Hidden Markov Models or HMMs: the Viterbi algorithm and the expectation phase of the Baum-Welch algorithm, and we describe ways to improve their naïve implementations. For the Baum-Welch algorithm we first present an implementation of the expectation computations using constant space. We then discuss the classical implementation of this calculation and describe ways to reduce its space usage to logarithmic and $O(\sqrt n)$, with their respective CPU costs. We also note where each respective algorithm can be parallelized. For the Viterbi algorithm, we describe $O...
Source: Journal of Bioinformatics and Computational Biology - July 28, 2009 Category: Bioinformatics Authors: Golod D, Brown DG Tags: J Bioinform Comput Biol Source Type: journals
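
For reference, the naive Viterbi implementation that the tutorial above sets out to improve can be sketched as follows. This is the textbook version, which stores a full table of back-pointers, so memory grows with the sequence length; it is not the authors' reduced-space variant. The two-state model is a made-up toy, and the code assumes all probabilities are non-zero so that logarithms are defined:

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable state path for an observation sequence (log-space, naive memory use)."""
    log = math.log
    # scores[t][s]: log-probability of the best path that ends in state s at time t.
    scores = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back_pointers = [{}]
    for obs in observations[1:]:
        previous = scores[-1]
        row, pointers = {}, {}
        for s in states:
            best_prev = max(states, key=lambda r: previous[r] + log(trans_p[r][s]))
            row[s] = previous[best_prev] + log(trans_p[best_prev][s]) + log(emit_p[s][obs])
            pointers[s] = best_prev
        scores.append(row)
        back_pointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(back_pointers[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical toy model, for illustration only.
states = ["Healthy", "Fever"]
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}

print(viterbi(["normal", "cold", "dizzy"], states, start_p, trans_p, emit_p))
```

The `back_pointers` list is what makes this version expensive on long sequences: it keeps one entry per state per position.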

A fast and accurate algorithm for comparative analysis of metabolic pathways.
Pathways show how different biochemical entities interact with one another to perform vital functions for the survival of an organism. Comparative analysis of pathways is crucial in identifying functional similarities that are difficult to identify by comparing the individual entities that make up these pathways. When the interacting entities are of a single type, the problem of identifying similarities by aligning the pathways can be reduced to the graph isomorphism problem. For pathways with varying types of entities, such as metabolic pathways, the alignment problem is even more challenging. In order to simplify this problem, existin...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Ay F, Kahveci T, DE Crécy-Lagard V Tags: J Bioinform Comput Biol Source Type: journals

Efficient computation of kinship and identity coefficients on large pedigrees.
With the rapidly expanding field of medical genetics and genetic counseling, genealogy information is becoming increasingly abundant. An important computation on pedigree data is the calculation of identity coefficients, which provide a complete description of the degree of relatedness of a pair of individuals. The areas of application of identity coefficients are numerous and diverse, from genetic counseling to disease tracking, and thus, the computation of identity coefficients merits special attention. However, the computation of identity coefficients is not done directly, but rather as the final step after computin...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Cheng E, Elliott B, Ozsoyoglu ZM Tags: J Bioinform Comput Biol Source Type: journals
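
As background to the entry above, the classical two-individual kinship coefficient can be computed by a short recursion over the pedigree. The sketch below is that textbook recursion, not the authors' algorithm for identity coefficients. It assumes founders are unrelated and non-inbred, and that individual ids are integers numbered so that parents always precede their children; the pedigree is a made-up example:

```python
from functools import lru_cache


def make_kinship(parents):
    """Build a kinship function from a dict: id -> (father, mother), or None for founders."""

    @lru_cache(maxsize=None)
    def kinship(i, j):
        # Always recurse on the later-numbered individual, who cannot be an
        # ancestor of the other under the numbering assumption.
        if i < j:
            i, j = j, i
        if parents[i] is None:
            return 0.5 if i == j else 0.0
        father, mother = parents[i]
        if i == j:
            return 0.5 * (1.0 + kinship(father, mother))
        return 0.5 * (kinship(father, j) + kinship(mother, j))

    return kinship


# Hypothetical pedigree: 1 and 2 are founders, 3 and 4 are their children,
# and 5 is the child of the sibling pair 3 and 4.
pedigree = {1: None, 2: None, 3: (1, 2), 4: (1, 2), 5: (3, 4)}
kinship = make_kinship(pedigree)

print("full siblings:", kinship(3, 4))
print("parent and child:", kinship(1, 3))
print("inbred individual with itself:", kinship(5, 5))
```

Memoisation via `lru_cache` keeps the recursion from recomputing the same pair, although on large pedigrees the number of distinct pairs is itself the bottleneck.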

An ORFome assembly approach to metagenomics sequences analysis.
Metagenomics is an emerging methodology for the direct genomic analysis of a mixed community of uncultured microorganisms. The current analyses of metagenomics data largely rely on the computational tools originally designed for microbial genomics projects. The challenge of assembling metagenomic sequences arises mainly from the short reads and the high species complexity of the community. Alternatively, individual (short) reads will be searched directly against databases of known genes (or proteins) to identify homologous sequences. The latter approach may have low sensitivity and specificity in identifying homologous...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Ye Y, Tang H Tags: J Bioinform Comput Biol Source Type: journals

Graph wavelet alignment kernels for drug virtual screening.
In this paper, we introduce a novel statistical modeling technique for target property prediction, with applications to virtual screening and drug design. In our method, we use graphs to model chemical structures and apply a wavelet analysis of graphs to summarize features capturing graph local topology. We design a novel graph kernel function to utilize the topology features to build predictive models for chemicals via Support Vector Machine classifier. We call the new graph kernel a graph wavelet-alignment kernel. We have evaluated the efficacy of the wavelet-alignment kernel using a set of chemical structure-activit...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Smalter A, Huan J, Lushington G Tags: J Bioinform Comput Biol Source Type: journals

Gene loss under neighborhood selection following whole genome duplication and the reconstruction of the ancestral Populus genome.
We develop criteria to detect neighborhood selection effects on gene loss following whole genome duplication, and apply them to the recently sequenced poplar (Populus trichocarpa) genome. We improve on guided genome halving algorithms so that several thousand gene sets, each containing two paralogs in the descendant T of the doubling event and their single ortholog from an undoubled reference genome R, can be analyzed to reconstruct the ancestor A of T at the time of doubling. At the same time, large numbers of defective gene sets, either missing one paralog from T or missing their ortholog in R, may be incorporated in...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Zheng C, Kerr Wall P, Leebens-Mack J, DE Pamphilis C, Albert VA, Sankoff D Tags: J Bioinform Comput Biol Source Type: journals

An almost linear time algorithm for a general haplotype solution on tree pedigrees with no recombination and its extensions.
We study the haplotype inference problem from pedigree data under the zero recombination assumption, which is well supported by real data for tightly linked markers (i.e. single nucleotide polymorphisms (SNPs)) over a relatively large chromosome segment. We solve the problem in a rigorous mathematical manner by formulating genotype constraints as a linear system of inheritance variables. We then utilize disjoint-set structures to encode connectivity information among individuals, to detect constraints from genotypes, and to check consistency of constraints. On a tree pedigree without missing data, our algorithm can out...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Li X, Li J Tags: J Bioinform Comput Biol Source Type: journals

Peak detection in mass spectrometry by Gabor filters and envelope analysis.
Mass Spectrometry (MS) is increasingly being used to discover disease-related proteomic patterns. The peak detection step is one of the most important steps in the typical analysis of MS data. Recently, many new algorithms have been proposed to increase the true positive rate while keeping a low false discovery rate in peak detection. Most of them follow one of two approaches: the denoising approach or the decomposing approach. In previous studies, the decomposing approach has shown more potential than the denoising approach. In this paper, we propose two novel methods, named GaborLocal and GaborEnvelop, both of whic...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Nguyen N, Huang H, Oraintara S, Vo A Tags: J Bioinform Comput Biol Source Type: journals

Iterative non-sequential protein structural alignment.
Structural similarity between proteins gives us insights into their evolutionary relationships when there is low sequence similarity. In this paper, we present a novel approach called SNAP for non-sequential pair-wise structural alignment. Starting from an initial alignment, our approach iterates over a two-step process consisting of a superposition step and an alignment step, until convergence. We propose a novel greedy algorithm to construct both sequential and non-sequential alignments. The quality of SNAP alignments was assessed by comparing against the manually curated reference alignments in the challenging SISY...
Source: Journal of Bioinformatics and Computational Biology - June 1, 2009 Category: Bioinformatics Authors: Salem S, Zaki MJ, Bystroff C Tags: J Bioinform Comput Biol Source Type: journals

Supervised ensembles of prediction methods for subcellular localization.
In the past decade, many automated prediction methods for the subcellular localization of proteins have been proposed, utilizing a wide range of principles and learning approaches. Based on an experimental evaluation of different methods and their theoretical properties, we propose to combine a well-balanced set of existing approaches into new, ensemble-based prediction methods. The experimental evaluation shows that our ensembles improve substantially over the underlying base methods. PMID: 19340915 [PubMed - in process] (Source: Journal of Bioinformatics and Computational Biology)
Source: Journal of Bioinformatics and Computational Biology - April 1, 2009 Category: Bioinformatics Authors: Assfalg J, Gong J, Kriegel HP, Pryakhin A, Wei T, Zimek A Tags: J Bioinform Comput Biol Source Type: journals

Alignment of minisatellite maps based on run-length encoding scheme.
Subsequent duplication events are responsible for the evolution of the minisatellite maps. Alignment of two minisatellite maps should therefore take these duplication events into account, in addition to the well-known edit operations. All algorithms for computing an optimal alignment of two maps, including the one presented here, first deduce the costs of optimal duplication scenarios for all substrings of the given maps. Then, they incorporate the pre-computed costs in the alignment recurrence. However, all previous algorithms addressing this problem are dependent on the number of distinct map units (map alphabet) and...
Source: Journal of Bioinformatics and Computational Biology - April 1, 2009 Category: Bioinformatics Authors: Abouelhoda MI, Giegerich R, Behzadi B, Steyaert JM Tags: J Bioinform Comput Biol Source Type: journals

Automatic modeling of signaling pathways by network flow model.
Signal transduction is an important process that controls cell proliferation, metabolism, differentiation, and so on. Effective computational models that unravel such a process by taking advantage of high-throughput genomic and proteomic data are in high demand for understanding the essential mechanisms underlying signal transduction. Since protein-protein interaction (PPI) plays an important role in signal transduction, in this paper, we present a novel method for modeling signaling pathways from PPI networks automatically. Given an undirected weighted protein interaction network, finding signaling pathways is treated a...
Source: Journal of Bioinformatics and Computational Biology - April 1, 2009 Category: Bioinformatics Authors: Zhao XM, Wang RS, Chen L, Aihara K Tags: J Bioinform Comput Biol Source Type: journals

Symbolic approaches for finding control strategies in Boolean Networks.
We present an exact algorithm, based on techniques from the field of Model Checking, for finding control policies for Boolean Networks (BN) with control nodes. Given a BN, a set of starting states, I, a set of goal states, F, and a target time, t, our algorithm automatically finds a sequence of control signals that deterministically drives the BN from I to F at, or before time t, or else guarantees that no such policy exists. Despite recent hardness-results for finding control policies for BNs, we show that, in practice, our algorithm runs in seconds to minutes on over 13,400 BNs of varying sizes and topologies, including ...
Source: Journal of Bioinformatics and Computational Biology - April 1, 2009 Category: Bioinformatics Authors: Langmead CJ, Jha SK Tags: J Bioinform Comput Biol Source Type: journals

Simultaneously segmenting multiple gene expression time courses by analyzing cluster dynamics.
We present a new approach to segmenting multiple time series by analyzing the dynamics of cluster formation and rearrangement around putative segment boundaries. This approach finds application in distilling large numbers of gene expression profiles into temporal relationships underlying biological processes. By directly minimizing information-theoretic measures of segmentation quality derived from Kullback-Leibler (KL) divergences, our formulation reveals clusters of genes along with a segmentation such that clusters show concerted behavior within segments but exhibit significant regrouping across segmentation boundaries....
Source: Journal of Bioinformatics and Computational Biology - April 1, 2009 Category: Bioinformatics Authors: Tadepalli S, Ramakrishnan N, Watson LT, Mishra B, Helm RF Tags: J Bioinform Comput Biol Source Type: journals

Genome halving with double cut and join.
The genome halving problem, previously solved by El-Mabrouk for inversions and reciprocal translocations, is here solved in a more general context allowing transpositions and block interchange as well, for genomes including multiple linear and circular chromosomes. We apply this to several datasets and compare the results to the previous algorithm. PMID: 19340920 [PubMed - in process]
Source: Journal of Bioinformatics and Computational Biology - April 1, 2009 Category: Bioinformatics Authors: Warren R, Sankoff D Tags: J Bioinform Comput Biol Source Type: journals

Finding non-coding RNAs through genome-scale clustering.
We present an efficient method for finding potential ncRNAs in bacteria by clustering genomic sequences based on homology inferred from both primary sequence and secondary structure. We evaluate our approach using a set of predominantly Firmicutes sequences. Our results showed that, though primary sequence based-homology search was inaccurate for diverged ncRNA sequences, through our clustering method, we were able to infer motifs that recovered nearly all members of most known ncRNA families. Hence, our method shows promise for discovering new families of ncRNA. PMID: 19340921 [PubMed - in process] (Source: Journal of...
Source: Journal of Bioinformatics and Computational Biology - April 1, 2009 Category: Bioinformatics Authors: Tseng HH, Weinberg Z, Gore J, Breaker RR, Ruzzo WL Tags: J Bioinform Comput Biol Source Type: journals

Co-expression among constituents of a motif in the protein-protein interaction network.
Almost all cellular functions are the results of well-coordinated interactions between various proteins. A more connected hub or motif in the interaction network is expected to be more important, and any perturbation in this motif would be more damaging to the smooth performance of the related functions. Thus, some coherent robustness of these hubs has to be derived. Here, we provide the global evidence that interaction hubs obtain their robustness against uneven protein concentrations through co-expression of the constituents, and that the degree of co-expression correlates strongly with the complexity of the embedded...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Bhardwaj N, Lu H Tags: J Bioinform Comput Biol Source Type: journals

A universal operon predictor for prokaryotic genomes.
Identification of operons at the genome scale of prokaryotic organisms represents a key step in deciphering their transcriptional regulation machinery, biological pathways, and networks. While numerous computational methods have been shown to be effective in predicting operons for well-studied organisms such as Escherichia coli K12 and Bacillus subtilis 168, these methods generally do not generalize well to genomes other than the ones used to train the methods, or closely related genomes, because they rely on organism-specific information. Several methods have been explored to address this problem through utilizing o...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Li G, Che D, Xu Y Tags: J Bioinform Comput Biol Source Type: journals

Counting of oligomers in sequences generated by Markov chains for DNA motif discovery.
By means of the technique of the imbedded Markov chain, an efficient algorithm is proposed to exactly calculate the first and second moments of word counts and the probability for a word to occur at least once in random texts generated by a Markov chain. A generating function is introduced directly from the imbedded Markov chain to derive asymptotic approximations for the problem. Two Z-scores, one based on the number of sequences with hits and the other on the total number of word hits in a set of sequences, are examined for discovery of motifs on a set of promoter sequences extracted from the A. thaliana genome. Source code is ...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Shan G, Zheng WM Tags: J Bioinform Comput Biol Source Type: journals
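The simplest of the quantities named in the oligomer-counting abstract above, the first moment of a word count, can be sketched directly. The Python below is not the imbedded Markov chain algorithm from the paper. It uses only linearity of expectation and assumes a first-order chain that starts in its stationary distribution; the function name and the dictionary layout of the parameters are hypothetical.

```python
def expected_word_count(word, text_length, stationary, transition):
    """Expected number of (possibly overlapping) occurrences of `word`.

    Assumes a first-order Markov chain started in its stationary
    distribution, so every window of len(word) letters has the same
    probability of spelling the word.

    stationary: dict letter -> stationary probability
    transition: dict letter -> dict letter -> transition probability
    """
    word_length = len(word)
    if word_length == 0 or text_length < word_length:
        return 0.0
    # Probability that a fixed window spells the word.
    probability = stationary[word[0]]
    for previous, current in zip(word, word[1:]):
        probability *= transition[previous][current]
    # One indicator variable per window; expectations add up.
    return (text_length - word_length + 1) * probability


if __name__ == "__main__":
    letters = "ACGT"
    uniform = {a: 0.25 for a in letters}
    independent = {a: dict(uniform) for a in letters}
    print(expected_word_count("ACG", 10, uniform, independent))
```

The second moment and the probability of at least one occurrence depend on how the word overlaps with itself, which this sketch does not address.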

Constant time clash detection in protein folding.
Applications for the manipulation of molecular structures are usually computationally intensive. Problems like protein docking or ab-initio protein folding need to frequently determine if two atoms in the structure collide. Therefore, an efficient algorithm for this problem, usually referred to as clash detection, can greatly improve the application efficiency. This work focuses mainly on the ab-initio protein folding problem. A naive approach to the clash problem, the one most commonly used by molecular structure programs, consists of calculating the distance between every pair of atoms. We propose an efficient data structure...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Bugalho MM, Oliveira AL Tags: J Bioinform Comput Biol Source Type: journals
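The naive baseline mentioned in the clash-detection abstract above, checking the distance between every pair of atoms, can be sketched as follows. This illustrates only the quadratic baseline, not the data structure the authors propose (the abstract is truncated before describing it). The function name, the coordinate tuples, and the single `min_distance` threshold are assumptions.

```python
from itertools import combinations


def has_clash(atoms, min_distance):
    """Naive O(n^2) clash check over all atom pairs.

    atoms: iterable of (x, y, z) coordinates.
    min_distance: two atoms closer than this are considered to collide
    (a single global threshold is an illustrative simplification).
    """
    limit_squared = min_distance ** 2
    for (x1, y1, z1), (x2, y2, z2) in combinations(atoms, 2):
        distance_squared = (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        # Compare squared distances to avoid a square root per pair.
        if distance_squared < limit_squared:
            return True
    return False


if __name__ == "__main__":
    structure = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
    print(has_clash(structure, 2.0))
    print(has_clash(structure + [(0.5, 0.0, 0.0)], 2.0))
```

For n atoms this examines n(n-1)/2 pairs, which is why repeated checks during folding simulations become a bottleneck.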

Rank-based clustering analysis for the time-course microarray data.
Microarray technology allows the monitoring of expression levels for thousands of genes simultaneously. In time-course microarray experiments in which gene expression is monitored over time, we are interested in clustering genes that show similar temporal profiles and identifying genes that show a pre-specified candidate profile. Unfortunately, many traditional clustering methods used for analyzing microarray data do not effectively detect temporal profiles for the time-course microarray data. We propose a rank-based clustering analysis for the time-course microarray data. Our clustering method consists of two steps: t...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Yi SG, Joo YJ, Park T Tags: J Bioinform Comput Biol Source Type: journals

Transient response analysis of the eukaryotic chemosensory system to intra-cellular fluctuations.
Like prokaryotic cells, those of eukaryotes are also subjected to noise from within the cells. While the cells have a built-in mechanism to attenuate the noise, conditions may arise where this is beyond the cell's ability to regulate. Start-up perturbations and those induced by metabolic shifts are examples of such situations. Then, it becomes useful to understand how the cells respond. For a eukaryotic chemosensory system, this has been studied by applying response coefficient analysis to a recent model. With even three dependent variables - an activator, an inhibitor, and a response element - the response coefficient...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Patnaik PR Tags: J Bioinform Comput Biol Source Type: journals

Hybrid modeling in biochemical systems theory by means of functional Petri nets.
In this study, we extend BST to hybrid modeling within the framework of Hybrid Functional Petri Nets (HFPN). First, we show how the canonical GMA and S-system models in BST can be directly implemented in a standard Petri Net framework. In a second step we demonstrate how to account for different types of time delays as well as for discrete, stochastic, and switching effects. Using representative test cases, we validate the hybrid modeling approach through comparative analyses and simulations with other approaches and highlight the feasibility, quality, and efficiency of the hybrid method. PMID: 19226663 [PubMed - in pr...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Wu J, Voit E Tags: J Bioinform Comput Biol Source Type: journals

Analyzing microarray data with transitive directed acyclic graphs.
Post hoc assignment of patterns determined by all pairwise comparisons in microarray experiments with multiple treatments has been proven to be useful in assessing treatment effects. We propose the usage of transitive directed acyclic graphs (tDAG) as the representation of these patterns and show that such representation can be useful in clustering treatment effects, annotating existing clustering methods, and analyzing sample sizes. Advantages of this approach include: (1) unique and descriptive meaning of each cluster in terms of how genes respond to all pairs of treatments; (2) insensitivity of the observed patterns...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Phan V, Olusegun George E, Tran QT, Goodwin S, Bodreddigari S, Sutter TR Tags: J Bioinform Comput Biol Source Type: journals

Evaluation of inter-laboratory and cross-platform concordance of DNA microarrays through discriminating genes and classifier transferability.
Microarray technology has great potential for improving our understanding of biological processes, medical conditions, and diseases. Often, microarray datasets are collected using different microarray platforms (provided by different companies) under different conditions in different laboratories. The cross-platform and cross-laboratory concordance of the microarray technology needs to be evaluated before it can be successfully and reliably applied in biological/clinical practice. New measures and techniques are proposed for comparing and evaluating the quality of microarray datasets generated from different platforms/...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Mao S, Wang C, Dong G Tags: J Bioinform Comput Biol Source Type: journals

Genetic network analysis by quasi-Bayesian method.
Genetic network analysis provides an important statistical strategy for the study of gene-gene interactions. Although existing methods work well in practice, several opportunities for improvement remain. For example, the regulation coefficients of some of the existing methods are not easy to solve, nor are the solutions they provide unique. Also, as genetic network analysis is typically applied to small datasets with a large number of parameters, having prior knowledge about the parameters is valuable and should be incorporated into the analysis. The uniqueness of the parameter estimate and computational simplicity are ...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Yuan A, Chen G, Rotimi C Tags: J Bioinform Comput Biol Source Type: journals

Clustering of gene expression data and end-point measurements by simulated annealing.
Most clustering techniques do not incorporate phenotypic data. Limited biological interpretation is garnered from the informal process of clustering biological samples and then labeling groups with the phenotypes of the samples. A more formal approach of clustering samples is presented. The method utilizes simulated annealing of the Modk-prototypes objective function. Separate weighting terms are used for microarray, clinical chemistry, and histopathology measurements to control the influence of each data domain on the clustering of the samples. The weights are adapted during the clustering process. A cluster's prototy...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Bushel PR Tags: J Bioinform Comput Biol Source Type: journals

Clustering algorithms for detecting functional modules in protein interaction networks.
Protein-Protein Interaction (PPI) networks are believed to be important sources of information related to biological processes and complex metabolic functions of the cell. When studying the workings of a biological cell, it is useful to be able to detect known and predict still undiscovered protein complexes within the cell's PPI networks. Such predictions may be used as an inexpensive tool to direct biological experiments. The increasing amount of available PPI data necessitates a fast, accurate approach to biological complex identification. Because of its importance in the studies of protein interaction networks, there...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Gao L, Sun PG, Song J Tags: J Bioinform Comput Biol Source Type: journals

Can complex cellular processes be governed by simple linear rules?
Complex living systems have shown remarkably well-orchestrated, self-organized, robust, and stable behavior under a wide range of perturbations. However, despite the recent generation of high-throughput experimental datasets, basic cellular processes such as division, differentiation, and apoptosis still remain elusive. One of the key reasons is the lack of understanding of the governing principles of complex living systems. Here, we have reviewed the success of perturbation-response approaches, where without the requirement of detailed in vivo physiological parameters, the analysis of temporal concentration or activat...
Source: Journal of Bioinformatics and Computational Biology - February 1, 2009 Category: Bioinformatics Authors: Selvarajoo K, Tomita M, Tsuchiya M Tags: J Bioinform Comput Biol Source Type: journals

A linear-time algorithm for predicting functional annotations from PPI networks.
We present a maximum likelihood formulation of the problem and the corresponding learning and inference algorithms. The time complexity of both algorithms is linear in the size of the PPI network, and our experimental results show that their accuracy in functional prediction outperforms existing methods. PMID: 19090017 [PubMed - in process]
Source: Journal of Bioinformatics and Computational Biology - December 1, 2008 Category: Bioinformatics Authors: Wu Y, Lonardi S Tags: J Bioinform Comput Biol Source Type: journals

Mining overrepresented 3D patterns of secondary structures in proteins.
We consider the problem of finding overrepresented arrangements of secondary structure elements (SSEs) in a given dataset of representative protein structures. While most papers in the literature study the distribution of geometrical properties, in particular angles and distances, between pairs of interacting SSEs, in this paper we focus on the distribution of angles of all quartets of SSEs and on the extraction of overrepresented angular patterns. We propose a variant of the Apriori method that obtains overrepresented arrangements of quartets of SSEs by combining arrangements of triplets of SSEs. This specific case wi...
Source: Journal of Bioinformatics and Computational Biology - December 1, 2008 Category: Bioinformatics Authors: Comin M, Guerra C, Zanotti G Tags: J Bioinform Comput Biol Source Type: journals

Visualizing microarray data for biomarker discovery by matrix reordering and replicator dynamics.
In most microarray data sets, there are often multiple sample classes, which are categorized into the normal or diseased type. Traditional feature selection methods consider multiple classes equally without paying attention to the upregulation/downregulation across the normal and diseased classes; while the specific gene selection methods for biomarker discovery particularly consider differential gene expressions across the normal and diseased classes, but ignore the existence of multiple classes. More importantly, there are few visualization algorithms to assist biomarker discovery from microarray data. In this paper,...
Source: Journal of Bioinformatics and Computational Biology - December 1, 2008 Category: Bioinformatics Authors: Liu Y, Li W Tags: J Bioinform Comput Biol Source Type: journals

An integrative domain-based approach to predicting protein-protein interactions.
Protein-protein interactions (PPIs) are intrinsic to almost all cellular processes. Different computational methods offer new chances to study PPIs. To predict PPIs, while the integrative methods use multiple data sources instead of a single source, the domain-based methods often use only protein domain features. Integration of both protein domain features and genomic/proteomic features from multiple databases can more effectively predict PPIs. Moreover, it allows discovering the reciprocal relationships between PPIs and biological features of their interacting partners. We developed a novel integrative domain-based me...
Source: Journal of Bioinformatics and Computational Biology - December 1, 2008 Category: Bioinformatics Authors: Nguyen TP, Ho TB Tags: J Bioinform Comput Biol Source Type: journals

Development of an affinity evaluation and prediction system by using the shape complementarity characteristic between proteins.
A system was developed to evaluate and predict the interaction between protein pairs by using the widely used shape complementarity search method as the algorithm for docking simulations between the proteins. This system, which we call the affinity evaluation and prediction (AEP) system, was used to evaluate the interaction between 20 protein pairs. The system first executes a "round robin" shape complementarity search of the target protein group, and evaluates the interaction of the complex structures obtained by shape complementarity search. These complex structures are selected by using a statistical procedure that ...
Source: Journal of Bioinformatics and Computational Biology - December 1, 2008 Category: Bioinformatics Authors: Tsukamoto K, Yoshikawa T, Hourai Y, Fukui K, Akiyama Y Tags: J Bioinform Comput Biol Source Type: journals

Duplicated RNA genes in teleost fish genomes.
We report here on a computational survey of structured non-coding RNAs (ncRNAs) in teleost genomes, focusing on the fate of fish-specific duplicates. As in other metazoan groups, we find evidence of a large number (11,543) of structured RNAs, most of which (~86%) are clade-specific or evolve so fast that their tetrapod homologs cannot be detected. In surprising contrast to protein-coding genes, the fish-specific genome duplication did not lead to a large number of paralogous ncRNAs: only 188 candidates, mostly microRNAs, appear in a larger copy number in teleosts than in tetrapods, suggesting that large-scale gene duplicat...
Source: Journal of Bioinformatics and Computational Biology - December 1, 2008 Category: Bioinformatics Authors: Rose D, Jöris J, Hackermüller J, Reiche K, Li Q, Stadler PF Tags: J Bioinform Comput Biol Source Type: journals