<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
    <channel>
        <title>International Journal of Data Mining and Bioinformatics via MedWorm.com</title>
        <description>MedWorm.com provides a medical RSS filtering service. Over 6000 RSS medical sources are combined and output via different filters. This feed contains the latest items from the 'International Journal of Data Mining and Bioinformatics' source.</description>
        <link><![CDATA[http://www.medworm.com/rss/search.php?qu=International+Journal+of+Data+Mining+and+Bioinformatics&t=International+Journal+of+Data+Mining+and+Bioinformatics&s=Search&f=source]]></link>
        <lastBuildDate>Thu, 09 Feb 2012 22:12:28 +0100</lastBuildDate>
        <item>
            <title>Identification of true EST alignments for recognising transcribed regions.</title>
            <link>http://www.medworm.com/index.php?rid=5492506&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D22145529%26dopt%3DAbstract</link>
            <description>In this study, three measures, Direction Check, Identity Check and Terminal Check, were introduced to eliminate spurious EST alignments more effectively. On the basis of these measures and other widely used measures, a computational tool, named ESTCleanser, has been developed to identify true EST alignments for obtaining reliable transcribed regions. The performance of ESTCleanser has been evaluated on the well-annotated human ENCyclopedia of DNA Elements (ENCODE) regions using human ESTs in the dbEST database. The evaluation results show that ESTCleanser is markedly more accurate at the exon and intron levels than UCSC-spliced EST alignments. This work would be helpful to EST-based research on finding new genes, complementing genome annotation, r...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5492506</comments>
            <pubDate>Sun, 11 Dec 2011 09:18:02 +0100</pubDate>
            <guid isPermaLink="false">5492506</guid>
        </item>
        <item>
            <title>Multi-platform gene-expression mining and marker gene analysis.</title>
            <link>http://www.medworm.com/index.php?rid=5492505&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D22145530%26dopt%3DAbstract</link>
            <description>Authors: Xu Q, Xue H, Yang Q
    Abstract
    Gene-expression data are now widely available and used for a wide range of clinical and diagnostic purposes. A key challenge is to select a few significant marker genes for biological studies. While it is feasible to find important genes from a single gene-expression data set, it is often more meaningful to compare the results from different but related data sets together, especially for multiple gene-expression data sets arising from different studies of a common organism or phenotype. In this paper, we present a novel framework to exploit the commonalities across different data sets by jointly learning from different data sets simultaneously through multi-task feature learning. By identifying a common subspace of genes, we can help biologists...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5492505</comments>
            <pubDate>Sun, 11 Dec 2011 09:18:02 +0100</pubDate>
            <guid isPermaLink="false">5492505</guid>
        </item>
        <item>
            <title>Robust classification ensemble method for microarray data.</title>
            <link>http://www.medworm.com/index.php?rid=5492504&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D22145531%26dopt%3DAbstract</link>
            <description>The objective of this study is to develop an accurate and robust classification ensemble method suitable for noisy microarray data. We proposed an algorithm, pattern match (PM)-bagging, which performs well in accuracy and is robust to noise variables and noise observations. In experiments with a real data set, the performance of the proposed method was found to be quite comparable and not much degraded even when the data set had noise variables or noise observations, while some other ensemble methods showed degraded performance. A bias and variance decomposition showed that the success of the proposed method is due to an effective reduction of both bias and variance.
    PMID: 22145531 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5492504</comments>
            <pubDate>Sun, 11 Dec 2011 09:18:02 +0100</pubDate>
            <guid isPermaLink="false">5492504</guid>
        </item>
        <item>
            <title>Computational identification of potential microRNA network biomarkers for the progression stages of gastric cancer.</title>
            <link>http://www.medworm.com/index.php?rid=5492503&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D22145532%26dopt%3DAbstract</link>
            <description>In this study, a novel concept, the miRNA network biomarker, was proposed for the selection of biomarkers. Each miRNA network biomarker contains miRNA targets, as well as Transcription Factors (TFs), that affect the miRNA expression. The obtained biomarkers were applied to classifying expression data sets in different progression stages from chronic gastritis to gastric cancer. Furthermore, these biomarkers could accurately (94%) discriminate gastric cancer samples from normal samples in another data set. Angiogenesis-related pathways and genes were found to be enriched in these network biomarkers.
    PMID: 22145532 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5492503</comments>
            <pubDate>Sun, 11 Dec 2011 09:18:02 +0100</pubDate>
            <guid isPermaLink="false">5492503</guid>
        </item>
        <item>
            <title>MentalSquares: a generic bipolar Support Vector Machine for psychiatric disorder classification, diagnostic analysis and neurobiological data mining.</title>
            <link>http://www.medworm.com/index.php?rid=5492502&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D22145533%26dopt%3DAbstract</link>
            <description>Authors: Zhang WR, Pandurangi AK, Peace KE, Zhang YQ, Zhao Z
    Abstract
    MentalSquares (MSQs), an equilibrium-based dimensional approach, is presented for the classification and diagnostic analysis of psychological conditions, with Bipolar Disorders (BPDs) as an example. While a Support Vector Machine (SVM) is defined in Hilbert space, an MSQ can be considered as a generic SVM for improved classification. Different from the traditional categorical model of BPDs, the generic approach focuses on the balance of two poles of mental equilibrium. Preliminary results show that this new approach has a number of advantages over existing models. The generic model is analytically illustrated with public domain clinical examples and well-known empirical clinical knowledge. Its clinical and computeri...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5492502</comments>
            <pubDate>Sun, 11 Dec 2011 09:18:02 +0100</pubDate>
            <guid isPermaLink="false">5492502</guid>
        </item>
        <item>
            <title>CarGene: characterisation of sets of genes based on metabolic pathways analysis.</title>
            <link>http://www.medworm.com/index.php?rid=5492501&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D22145534%26dopt%3DAbstract</link>
            <description>Authors: Aguilar-Ruiz JS, Rodriguez-Baena DS, Diaz-Diaz N, Nepomuceno-Chamorro IA
    Abstract
    The great amount of biological information provides scientists with an incomparable framework for testing the results of new algorithms. Several tools have been developed for analysing gene-enrichment and most of them are Gene Ontology-based tools. We developed a Kyoto Encyclopedia of Genes and Genomes (KEGG)-based tool that provides a friendly graphical environment for analysing gene-enrichment. The tool integrates two statistical corrections and simultaneously analyses the information about many groups of genes in both a visual and a textual manner. We tested the usefulness of our approach on a previous analysis (Huttenshower et al.). Furthermore, our tool is freely available (http://www.upo.e...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5492501</comments>
            <pubDate>Sun, 11 Dec 2011 09:18:02 +0100</pubDate>
            <guid isPermaLink="false">5492501</guid>
        </item>
        <item>
            <title>Complete coding sequence, sequence analysis and transmembrane topology modelling of Trypanosoma brucei rhodesiense putative oligosaccharyl transferase (TbOST II).</title>
            <link>http://www.medworm.com/index.php?rid=5492500&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D22145535%26dopt%3DAbstract</link>
            <description>Authors: Baticados WN, Inoue N, Sugimoto C, Nagasawa H, Baticados AM
    Abstract
    The partial nucleotide sequence of the putative Trypanosoma brucei rhodesiense oligosaccharyl transferase gene was previously reported. Here, we describe the determination of its full-length nucleotide sequence by Inverse PCR (IPCR), subsequent biological sequence analysis and transmembrane topology modelling. The full-length DNA sequence has an Open Reading Frame (ORF) of 2406 bp and encodes a polypeptide of 801 amino acid residues. Protein and DNA sequence analyses revealed that homologues exist in the genomes of other kinetoplastids and of organisms of various origins. Protein topology analysis predicted that Trypanosoma brucei rhodesiense putative oligosaccharyl transferase clone II (TbOST II) is a transmembrane protei...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5492500</comments>
            <pubDate>Sun, 11 Dec 2011 09:18:02 +0100</pubDate>
            <guid isPermaLink="false">5492500</guid>
        </item>
        <item>
            <title>A novel approach in discovering significant interactions from TCM patient prescription data.</title>
            <link>http://www.medworm.com/index.php?rid=5275629&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21954669%26dopt%3DAbstract</link>
            <description>We present an exploratory analysis of clinical records using a pattern mining approach called Interaction Rules Mining.
    PMID: 21954669 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5275629</comments>
            <pubDate>Mon, 03 Oct 2011 12:35:02 +0100</pubDate>
            <guid isPermaLink="false">5275629</guid>
        </item>
        <item>
            <title>Study on intelligent syndrome differentiation in traditional Chinese medicine based on multiple information fusion methods.</title>
            <link>http://www.medworm.com/index.php?rid=5275628&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21954670%26dopt%3DAbstract</link>
            <description>Objective detection instruments for the four diagnostic methods are applied to collect objective four-diagnosis information from 506 clinical cases of heart-system patients. Multiple information fusion methods are then adopted to establish a recognition model of syndromes. The results of our experiments show that the recognition rates of the six syndromes using multi-label learning are better than those of the OCON artificial neural network and the multiple support vector machine.
    PMID: 21954670 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5275628</comments>
            <pubDate>Mon, 03 Oct 2011 12:35:02 +0100</pubDate>
            <guid isPermaLink="false">5275628</guid>
        </item>
        <item>
            <title>MAPLSC: a novel multi-class classifier for medical diagnosis.</title>
            <link>http://www.medworm.com/index.php?rid=5275626&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21954671%26dopt%3DAbstract</link>
            <description>Authors: You M, Zhao RW, Li GZ, Hu X
    Abstract
    Analysis of clinical records contributes to expanding experience and promoting techniques in Traditional Chinese Medicine (TCM). The presence of more than two diagnostic classes (diagnostic syndromes) in the clinical records raises a popular data mining problem: multi-value classification. In this paper, we propose a novel multi-class classifier, named Multiple Asymmetric Partial Least Squares Classifier (MAPLSC). MAPLSC attempts to be robust to imbalanced data distributions in multi-value classification. Elaborate comparisons with seven other state-of-the-art methods on two TCM clinical datasets and four public microarray datasets demonstrate MAPLSC's remarkable improvements.
    PMID: 21954671 [PubMed - in process] (Source: International Jour...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5275626</comments>
            <pubDate>Mon, 03 Oct 2011 12:35:02 +0100</pubDate>
            <guid isPermaLink="false">5275626</guid>
        </item>
        <item>
            <title>Microarray data classification by multi-information based gene scoring integrated with gene ontology.</title>
            <link>http://www.medworm.com/index.php?rid=5275624&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21954672%26dopt%3DAbstract</link>
            <description>Authors: Tseng VS, Yu HH
    Abstract
    Selecting informative genes is one of the most important issues for deciphering biological information hidden in gene expression data. However, due to the characteristics of microarray data, with small samples and a large number of genes, general feature selection methods that are not biologically relevant become questionable. In this paper, we propose a novel classification method for microarray data by integrating the multi-information based gene scoring method with biological information. Through experimental evaluation, our proposed method is shown to deliver good accuracy in classification and provide biologists with deeper insights into the relations between genes and gene function categories.
    PMID: 21954672 [PubMed - in process] (Source: In...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5275624</comments>
            <pubDate>Mon, 03 Oct 2011 12:35:02 +0100</pubDate>
            <guid isPermaLink="false">5275624</guid>
        </item>
        <item>
            <title>Applications of Self-Organising Map (SOM) for prioritisation of endemic zones of filariasis in Andhra Pradesh, India.</title>
            <link>http://www.medworm.com/index.php?rid=5275623&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21954673%26dopt%3DAbstract</link>
            <description>Authors: Murty US, Rao MS, Sriram K, Rao KM
    Abstract
    Entomological and epidemiological data on Lymphatic Filariasis (LF) were collected from 120 villages in four districts of Andhra Pradesh, India. Self-Organising Maps (SOMs), a data-mining technique, were used to classify and prioritise the endemic zones of filariasis. The results show that SOMs classified all the villages into three major clusters by considering the data on Microfilaria (MF) rate, infection, infectivity rate and Per Man Hour (PMH). By considering the cluster patterns, appropriate decisions can be drawn for each parameter that is responsible for the transmission of filariasis. Hence, SOM will certainly be a suitable tool for management of filariasis. The detailed application of SOM is discussed in this paper.
...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5275623</comments>
            <pubDate>Mon, 03 Oct 2011 12:35:02 +0100</pubDate>
            <guid isPermaLink="false">5275623</guid>
        </item>
        <item>
            <title>A heuristic for gene selection and visual prediction of sample type.</title>
            <link>http://www.medworm.com/index.php?rid=5275622&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21954674%26dopt%3DAbstract</link>
            <description>Authors: Zhou J, Grinstein G, Marx K
    Abstract
    In this paper, we introduce a heuristic method for gene selection. We target this method, coupled with RadViz visualisation, to the visual prediction of tissue samples which may exist in normal and disease states. As a result of this coupling, the gene selection process, predictive model training and evaluation as well as the model's application for tissue sample prediction can all be intuitively visualised. Such integrated visual analytics enhance the insight provided by classical statistics and machine learning methods. The case study shows our proposed method is cost effective and achieves competitive performance when compared with several widely used techniques.
    PMID: 21954674 [PubMed - in process] (Source: International Journal...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5275622</comments>
            <pubDate>Mon, 03 Oct 2011 12:35:02 +0100</pubDate>
            <guid isPermaLink="false">5275622</guid>
        </item>
        <item>
            <title>Prediction of the disulphide bonding state of cysteines in proteins using conditional random fields.</title>
            <link>http://www.medworm.com/index.php?rid=5275621&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21954675%26dopt%3DAbstract</link>
            <description>Authors: Shoombuatong W, Traisathit P, Prasitwattanaseree S, Tayapiwatana C, Cutler R, Chaijaruwanich J
    Abstract
    The formation of disulphide bonds between cysteines plays a major role in protein folding, structure, function and evolution. Many computational approaches have been used to predict the disulphide bonding state of cysteines. In our work, we developed a novel method based on Conditional Random Fields (CRFs) to predict the disulphide bonding state from protein primary sequence, predicted secondary structures and predicted relative solvent accessibilities (all-state information). Our experiments obtain 84% accuracy, 88% precision and 94% recall, using all-state information. However, our results show essentially identical results when using protein sequence and predicted rela...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5275621</comments>
            <pubDate>Mon, 03 Oct 2011 12:35:02 +0100</pubDate>
            <guid isPermaLink="false">5275621</guid>
        </item>
        <item>
            <title>How bioinformatics techniques can be used for various aspects of biological research.</title>
            <link>http://www.medworm.com/index.php?rid=5095647&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805820%26dopt%3DAbstract</link>
            <description>Authors: Kim S
    
    PMID: 21805820 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5095647</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095647</guid>
        </item>
        <item>
            <title>Cancer progression analysis based on ordinal relationship of cancer stages and co-expression network modularity.</title>
            <link>http://www.medworm.com/index.php?rid=5095641&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805821%26dopt%3DAbstract</link>
            <description>Authors: Pyon YS, Li X, Li J
    A comprehensive understanding of cancer progression may shed light on genetic and molecular mechanisms of oncogenesis, and provide important information for effective diagnosis and prognosis. We propose a multicategory logit model to identify genes that show significant correlations across multiple cancer stages. We have applied the approach to a Prostate Cancer (PCA) progression data set and obtained a set of genes that show consistent trends across multiple stages. Further analysis based on multiple lines of evidence demonstrates that our candidate list includes not only some well-known prostate-cancer-related genes, but also novel genes that have been confirmed very recently.
    PMID: 21805821 [PubMed - in process] (Source: International Journal of Data Mining and ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5095641</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095641</guid>
        </item>
        <item>
            <title>Building a standards-based and collaborative e-prescribing tool: MyRxPad.</title>
            <link>http://www.medworm.com/index.php?rid=5095632&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805822%26dopt%3DAbstract</link>
            <description>We present our experience in applying RxNorm in an e-prescribing setting: using standard names and codes to capture prescribed medication as well as extracting information from RxNorm to support medication-related clinical decisions.
    PMID: 21805822 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5095632</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095632</guid>
        </item>
        <item>
            <title>Image segmentation of biofilm structures using optimal multi-level thresholding.</title>
            <link>http://www.medworm.com/index.php?rid=5095605&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805823%26dopt%3DAbstract</link>
            <description>Authors: Rojas D, Rueda L, Ngom A, Hurrutia H, Cárcamo G
    The appreciation of biofilm structures in digital images can be subjective to the observer, and hence it is necessary to analyse the underlying images in useful parameters by means of quantification that is, ideally, free of errors. This paper proposes a combination of techniques for segmentation of biofilm images through an optimal multi-level thresholding algorithm and a set of clustering validity indices, including the determination of the best number of thresholds. The results, which are validated through Rand Index and a quantification process performed in a laboratory, are similar to the quantification and segmentation done by an expert.
    PMID: 21805823 [PubMed - in process] (Source: International Journal of Data Mining...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5095605</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095605</guid>
        </item>
        <item>
            <title>Biomarkers associated with metastasis of lung cancer to brain predict patient survival.</title>
            <link>http://www.medworm.com/index.php?rid=5095587&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805824%26dopt%3DAbstract</link>
            <description>Authors: Nasser S, Ranade AR, Sridhar S, Haney L, Korn RL, Gotway MB, Weiss GJ
    MicroRNAs influence cell physiology; alteration in miRNA regulation can be implicated in carcinogenesis and disease progression. Generally, one miRNA is predicted to regulate several hundred genes, and as a result, miRNAs could serve as a better classifier than gene expression. We combine validated miRNA expression values with imaging features to classify NSCLC brain mets from non-brain mets and identify possible biomarkers of brain mets. This research involves comprehensive miRNA expression profiling, evaluation of normalisation techniques and combination of miRNA with imaging features FDG-PET/CT and CT Scan. The biomarkers were validated on an independent data set to predict potential brain mets.
    PMID:...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5095587</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095587</guid>
        </item>
        <item>
            <title>Pancreas modelling by a deterministic optimisation method.</title>
            <link>http://www.medworm.com/index.php?rid=5095529&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805825%26dopt%3DAbstract</link>
            <description>Authors: Lv D, Goodwine B
    Diabetes mellitus is a disease characterised by abnormally high glucose concentration, insulin dysfunction and resistance which may lead to health problems such as cardiovascular disease. This paper presents a mechanistic pancreas model of insulin dynamics which incorporates experimental physiological data. This model will provide an efficient and accurate way to determine the specifics of a metabolic problem in the pancreas. We use Intravenous Glucose Tolerance Test (IVGTT) data from the literature to identify the model parameters by implementing a deterministic optimisation method called DIRECT (Dividing RECTangles). Different data sets are used for optimisation and validation.
    PMID: 21805825 [PubMed - in process] (Source: International Journal of Data M...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5095529</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095529</guid>
        </item>
        <item>
            <title>Activated germinal centre B cells undergo directed migration.</title>
            <link>http://www.medworm.com/index.php?rid=5095497&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805826%26dopt%3DAbstract</link>
            <description>Authors: O'Connor SJ, Hauser AE, Haberman AM, Kleinstein SH
    Affinity maturation, the fundamental basis for adaptive immunity, is accomplished through somatic hypermutation of B-cell receptors followed by expansion of rare mutants with higher affinity for the immunising antigen. This process occurs over a period of weeks in unique micro-anatomic sites known as germinal centres. Two-photon microscopy has recently made it possible to track individual cells moving within germinal centres in living animals. Here we apply statistical approaches to test the hypothesis that B-cell motion is random. Our results show that activated B cells move in a directed manner that sharply contrasts with the behaviour of naïve B cells.
    PMID: 21805826 [PubMed - in process] (Source: International Journal...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
            <comments>http://www.medworm.com/rss/comments.php?id=5095497</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095497</guid>
        </item>
        <item>
            <title>Combining multiple perspective as intelligent agents into robust approach for biomarker detection in gene expression data.</title>
            <link>http://www.medworm.com/index.php?rid=5095466&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21805827%26dopt%3DAbstract</link>
            <description>Authors: Alshalalfa M, Naji G, Qabaja A, Alhajj R, Rokne J
    Locating exceptional, abnormal or unusual trends in gene expression data to identify disease biomarkers is the vital problem tackled in this paper. We developed a comprehensive framework that incorporates different perspectives, each realised by an agent. Each agent applies its method to analyse the gene expression data and to come up with some candidate genes as potential cancer biomarkers. Further, gene enrichment, protein interaction, and miRNA regulation are given weight; they are used to confirm the discoveries by the major agents. We conducted experiments on two data sets; the obtained results are very encouraging with a high classification rate.
    PMID: 21805827 [PubMed - in process] (Source: International Journal of...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=5095466</comments>
            <pubDate>Fri, 05 Aug 2011 12:00:12 +0100</pubDate>
            <guid isPermaLink="false">5095466</guid>        </item>
        <item>
            <title>Integrated analysis of pharmacologic, clinical and SNP microarray data using Projection Onto the Most Interesting Statistical Evidence with Adaptive Permutation Testing.</title>
            <link>http://www.medworm.com/index.php?rid=4802516&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21516175%26dopt%3DAbstract</link>
            <description>Authors: Pounds S, Cao X, Cheng C, Yang JJ, Campana D, Pui CH, Evans WE, Relling MV
    We recently developed the Projection Onto the Most Interesting Statistical Evidence (PROMISE) procedure that uses prior biological knowledge to guide an integrated analysis of gene expression data with multiple biological and clinical endpoints. Here, PROMISE is adapted to the integrated analysis of pharmacologic, clinical and genome-wide genotype data. An efficient permutation-testing algorithm is introduced so that PROMISE is computationally feasible in this higher-dimension setting. In the analysis of a paediatric leukaemia data set, PROMISE effectively identifies genomic features that exhibit a biologically meaningful pattern of association with multiple endpoint variables.
    PMID: 21516175 [PubMe...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4802516</comments>
            <pubDate>Tue, 10 May 2011 07:45:03 +0100</pubDate>
            <guid isPermaLink="false">4802516</guid>        </item>
        <item>
            <title>Predicting disease phenotypes based on the molecular networks with condition-responsive correlation.</title>
            <link>http://www.medworm.com/index.php?rid=4802515&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21544951%26dopt%3DAbstract</link>
            <description>Authors: Lee S, Lee E, Lee KH, Lee D
    Network-based methods using molecular interaction networks integrated with gene expression profiles have been proposed to solve problems, which arose from smaller number of samples compared with the large number of predictors. However, previous network-based methods, which have focused only on expression levels of proteins, nodes in the network through the identification of condition-responsive interactions. We propose a novel network-based classification, which focuses on both nodes with discriminative expression levels and edges with Condition-Responsive Correlations (CRCs) across two phenotypes. We found that modules with condition-responsive interactions provide candidate molecular models for diseases and show improved performances compared conv...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4802515</comments>
            <pubDate>Tue, 10 May 2011 07:45:03 +0100</pubDate>
            <guid isPermaLink="false">4802515</guid>        </item>
        <item>
            <title>Semi-automatic 3D segmentation of brain structures from MRI.</title>
            <link>http://www.medworm.com/index.php?rid=4802514&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21544952%26dopt%3DAbstract</link>
            <description>We present a semi-automatic 3D segmentation method for brain structures from Magnetic Resonance Imaging (MRI). There are three main contributions. First, our method combines boundary-based and region-based approaches but differs from previous hybrid methods in that we perform them in two separate phases. This allows for more efficient segmentation. Second, a probability map is generated and used throughout the segmentation to account for the brain structures with low-intensity contrast to the background. Third, we develop a set of tools for manual adjustment after the segmentation. This is particularly important in clinical research because the reliability of the results can be ensured. The experimental results and validations on different data sets are shown.
    PMID: 21544952 [PubMed - ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4802514</comments>
            <pubDate>Tue, 10 May 2011 07:45:03 +0100</pubDate>
            <guid isPermaLink="false">4802514</guid>        </item>
        <item>
            <title>Predictions of flexible C-terminal tethers of bacterial proteins with the FLEXTAIL bioinformatics pipeline.</title>
            <link>http://www.medworm.com/index.php?rid=4802513&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21544953%26dopt%3DAbstract</link>
            <description>Authors: Lu Y, Ram JL
    Proteins use conserved binding motifs associated with relatively unconserved flexible amino acid sequences as mobile tethers for interacting molecules, as exemplified by C-terminal sequences of bacterial chemotaxis receptors. The FLEXTAIL bioinformatics pipeline predicts flexible tethers and their binding motifs based on the properties of flexibility and sequence conservation. In four groups of bacterial genomes, the algorithm identified &amp;gt; 100 putative binding domains, including verifying the known bacterial chemotaxis receptor-- NWETF binding motif. Some potential C-terminal flexible regions that have not previously been recognised to function as protein tethers were found and should be investigated further for binding targets and flexibility.
    PMID: 215449...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4802513</comments>
            <pubDate>Tue, 10 May 2011 07:45:03 +0100</pubDate>
            <guid isPermaLink="false">4802513</guid>        </item>
        <item>
            <title>Improving accuracy of microarray classification by a simple multi-task feature selection filter.</title>
            <link>http://www.medworm.com/index.php?rid=4802512&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21544954%26dopt%3DAbstract</link>
            <description>Authors: Lan L, Vucetic S
    Leveraging information from the publicly accessible data repositories can be very useful when training a classifier from a small-sample microarray data. To achieve this, we proposed a multi-task feature selection filter that borrows strength from auxiliary microarray data. It uses Kruskal-Wallis test on auxiliary data and ranks genes based on their aggregated p-values. The top-ranked genes are selected as features for the target task classifier. The multi-task filter was evaluated on microarray data related to nine different types of cancers. The results showed that the multi-task feature selection is very successful when applied in conjunction with both single-task and multi-task classifiers.
    PMID: 21544954 [PubMed - in process] (Source: International Jou...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4802512</comments>
            <pubDate>Tue, 10 May 2011 07:45:03 +0100</pubDate>
            <guid isPermaLink="false">4802512</guid>        </item>
        <item>
            <title>Modelling gene and protein regulatory networks with answer set programming.</title>
            <link>http://www.medworm.com/index.php?rid=4802511&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21544955%26dopt%3DAbstract</link>
            <description>Authors: Fayruzov T, Janssen J, Vermeir D, Cornelis C, De Cock M
    Recently, many approaches to model regulatory networks have been proposed in the systems biology domain. However, the task is far from being solved. In this paper, we propose an Answer Set Programming (ASP)-based approach to model interaction networks. We build a general ASP framework that describes the network semantics and allows modelling specific networks with little effort. ASP provides a rich and flexible toolbox that allows expanding the framework with desired features. In this paper, we tune our framework to mimic Boolean network behaviour and apply it to model the Budding Yeast and Fission Yeast cell cycle networks. The obtained steady states of these networks correspond to those of the Boolean networks.
    PMID...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4802511</comments>
            <pubDate>Tue, 10 May 2011 07:45:03 +0100</pubDate>
            <guid isPermaLink="false">4802511</guid>        </item>
        <item>
            <title>Improvement in protein-coding region identification based on sliding window trigonometric fast transforms using singular value decomposition.</title>
            <link>http://www.medworm.com/index.php?rid=4748475&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21491847%26dopt%3DAbstract</link>
            <description>Authors: Hota MK, Srivastava VK
    In this paper, the performance of various sliding window trigonometric fast transforms for identification of protein coding regions has been analysed at the nucleotide level. It is found that the Short-Time Discrete Fourier Transform (ST-DFT) gives better identification accuracy in comparison with the Short-Time Discrete Cosine Transform (ST-DCT), Short-Time Discrete Sine Transform (ST-DST) and Short-Time Discrete Hartley Transform (ST-DHT). In the proposed method, identification accuracy of protein coding regions has been improved by applying Singular Value Decomposition (SVD) on the DNA spectrum obtained using sliding window trigonometric fast transforms. The results show that, in the proposed method, all trigonometric fast transforms give almost similar results ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4748475</comments>
            <pubDate>Tue, 26 Apr 2011 09:15:16 +0100</pubDate>
            <guid isPermaLink="false">4748475</guid>        </item>
        <item>
            <title>Meta analysis algorithms for microarray gene expression data using gene regulatory networks.</title>
            <link>http://www.medworm.com/index.php?rid=4250220&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133037%26dopt%3DAbstract</link>
            <description>We present a system that uses extra knowledge in published gene regulation relationships to examine findings in a microarray experiment and to aid biologists in generating hypotheses. Two algorithms are developed to highlight consistencies as well as inconsistencies between the data. We demonstrate that consistent as well as inconsistent subnetworks found in this manner are important in the discovery of active pathways and novel findings.
    PMID: 21133037 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250220</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250220</guid>        </item>
        <item>
            <title>Learning Bayesian networks with integration of indirect prior knowledge.</title>
            <link>http://www.medworm.com/index.php?rid=4250219&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133038%26dopt%3DAbstract</link>
            <description>In this study, we propose an approach to efficiently integrate global ordering information into model learning, where the ordering information specifies the indirect relationships among genes. We demonstrate that, compared with a traditional Bayesian network model that uses only local prior knowledge, utilising additional global ordering knowledge can significantly improve the model's performance. The magnitude of this improvement depends on abundance of global ordering information and data quality.
    PMID: 21133038 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250219</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250219</guid>        </item>
        <item>
            <title>Using gene ontology to enhance effectiveness of similarity measures for microarray data.</title>
            <link>http://www.medworm.com/index.php?rid=4250218&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133039%26dopt%3DAbstract</link>
            <description>Authors: Chen Z, Tang J
    Reducing redundancy is an important goal for most feature selection methods. Almost all methods for redundancy reduction are based on the correlation between gene expression levels. In this paper, we utilise the knowledge in Gene Ontology to provide a new model for measuring redundancy among genes. We propose a novel similarity measure, which incorporates semantic and expression level similarities. We compare our method with traditional expression value-only similarity model on several public microarray datasets. The experimental results show that our approach is capable of offering higher or the same classification accuracy while providing a smaller gene feature.
    PMID: 21133039 [PubMed - in process] (Source: International Journal of Data Mining and Bioinfor...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250218</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250218</guid>        </item>
        <item>
            <title>Assessment of length distributions between non-coding and coding sequences amongst two model organisms.</title>
            <link>http://www.medworm.com/index.php?rid=4250217&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133040%26dopt%3DAbstract</link>
            <description>Authors: Caldwell R, Lin YX, Zhang R
    The availability of genomic DNA and cDNA sequence data has escalated the data mining and genomics era. We aim to investigate the length distributions of the non-coding and coding regions of protein genes of two model organisms, Arabidopsis thaliana and Drosophila melanogaster. A non-linear functional relationship model was applied and strong correlation was found between the Coding Sequence (CDS) and non-coding sequence regions, conditional on the 5' UTR data. Significant differences were found between the protein functional classes and each gene region. Examination of the non-coding and coding regions of these organisms has revealed possible correlations.
    PMID: 21133040 [PubMed - in process] (Source: International Journal of Data Mining and Bio...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250217</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250217</guid>        </item>
        <item>
            <title>Invariance kernel of biological regulatory networks.</title>
            <link>http://www.medworm.com/index.php?rid=4250216&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133041%26dopt%3DAbstract</link>
            <description>Authors: Ahmad J, Roux O
    The analysis of Biological Regulatory Network (BRN) leads to the computing of the set of the possible behaviours of the biological components. These behaviours are seen as trajectories and we are specifically interested in cyclic trajectories since they stand for stability. The set of cycles is given by the so-called invariance kernel of a BRN. This paper presents a method for deriving symbolic formulae for the length, volume and diameter of a cylindrical invariance kernel. These formulae are expressed in terms of delay parameters expressions and give the existence of an invariance kernel and a hint of the number of cyclic trajectories.
    PMID: 21133041 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250216</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250216</guid>        </item>
        <item>
            <title>bcnQL: a query language for biochemical networks.</title>
            <link>http://www.medworm.com/index.php?rid=4250215&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133042%26dopt%3DAbstract</link>
            <description>Authors: Yang H, Sunderraman R, Tian H
    This paper proposes a graph data model that can represent information present in Biochemical Networks. The study presented in this paper also proposes a query language, called bcnQL, which empowers users to query entities, interactions, processes and pathways with arbitrary conditions. We then discuss the query-processing techniques, more specifically, the translation of bcnQL queries into G-algebra and a set of algebraic operators on graph objects. Some query examples are presented to demonstrate the applicability of the language for this specific domain. Finally, we provide details of a prototype implementation for the query language.
    PMID: 21133042 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250215</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250215</guid>        </item>
        <item>
            <title>Genome-wide DNA-binding specificity of PIL5, an Arabidopsis basic Helix-Loop-Helix (bHLH) transcription factor.</title>
            <link>http://www.medworm.com/index.php?rid=4250214&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133043%26dopt%3DAbstract</link>
            <description>In this study, we investigated if in-vivo PIL5 binding sites can be explained by any other attributes extracted from various sources. Our results showed that PIL5 binding sites can be explained by attributes such as neighbouring motif composition, nucleosome density, DNA methylation and distance from transcription start site in addition to G-box.
    PMID: 21133043 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250214</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250214</guid>        </item>
        <item>
            <title>A hybrid clustering algorithm for identifying modules in Protein-Protein Interaction networks.</title>
            <link>http://www.medworm.com/index.php?rid=4250213&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D21133044%26dopt%3DAbstract</link>
            <description>Authors: Yu L, Gao L, Sun PG
    Identifying modules in Protein-Protein Interaction (PPI) networks is important to understand the organisation of the cellular processes. In this paper, we present a novel algorithm combining Molecular Complex Detection (MCODE) with Girvan-Newman (GN) to identify modules in PPI networks. Our algorithm can accurately discover denser modules in large-scale protein interaction networks. We applied it to S. cerevisiae PPI networks and obtained high matching rate between the predicted modules and the known protein complexes in Munich Information Center for Protein Sequences (MIPS). The simulation results show that our algorithm provides an effective, reliable and scalable method of identifying modules in PPI networks.
    PMID: 21133044 [PubMed - in process] (Sou...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4250213</comments>
            <pubDate>Sat, 11 Dec 2010 18:20:03 +0100</pubDate>
            <guid isPermaLink="false">4250213</guid>        </item>
        <item>
            <title>Genome-wide functional annotation by integrating multiple microarray datasets using meta-analysis.</title>
            <link>http://www.medworm.com/index.php?rid=3955230&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20815137%26dopt%3DAbstract</link>
            <description>We present a new statistical method of integrating multiple microarray datasets for gene function prediction. We tested the performance of our model using yeast and human datasets. Our results show that combining multiple datasets improves the accuracy over the best function prediction of any single dataset significantly. We also compared performance of the meta p-value and meta correlation methods for function prediction. Supplementary results and code are available at http://digbio.missouri.edu/metaanalyses.
    PMID: 20815137 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3955230</comments>
            <pubDate>Fri, 10 Sep 2010 22:39:02 +0100</pubDate>
            <guid isPermaLink="false">3955230</guid>        </item>
        <item>
            <title>Synthetic gene design with a large number of hidden stops.</title>
            <link>http://www.medworm.com/index.php?rid=3955229&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20815138%26dopt%3DAbstract</link>
            <description>Authors: Phan V, Saha S, Pandey A, Wong TY
    Hidden stops are nucleotide triples TAA, TAG and TGA that appear on the second and third reading frames of a protein coding gene. Recent studies suggested the important role of hidden stops in preventing misreading of mRNA. We study the problem of designing protein-encoding genes with a large number of hidden stops under several biological constraints. With simple constraints, redesigned genes have a provably maximal number of hidden stops. With more complex constraints, redesigned genes still have many more hidden stops than wild-type genes. We showed that redesigned genes have a distinct positional advantage in assisting early termination of frame-shifts.
    PMID: 20815138 [PubMed - in process] (Source: International Journal of Data Mining and Bio...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3955229</comments>
            <pubDate>Fri, 10 Sep 2010 22:39:02 +0100</pubDate>
            <guid isPermaLink="false">3955229</guid>        </item>
        <item>
            <title>Detecting duplicate biological entities using Shortest Path Edit Distance.</title>
            <link>http://www.medworm.com/index.php?rid=3955228&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20815139%26dopt%3DAbstract</link>
            <description>Authors: Rudniy A, Song M, Geller J
    Duplicate entity detection in biological data is an important research task. In this paper, we propose a novel and context-sensitive Shortest Path Edit Distance (SPED) extending and supplementing our previous work on Markov Random Field-based Edit Distance (MRFED). SPED transforms the edit distance computational problem to the calculation of the shortest path among two selected vertices of a graph. We produce several modifications of SPED by applying Levenshtein, arithmetic mean, histogram difference and TFIDF techniques to solve subtasks. We compare SPED performance to other well-known distance algorithms for biological entity matching. The experimental results show that SPED produces competitive outcomes.
    PMID: 20815139 [PubMed - in process] (S...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3955228</comments>
            <pubDate>Fri, 10 Sep 2010 22:39:02 +0100</pubDate>
            <guid isPermaLink="false">3955228</guid>        </item>
        <item>
            <title>Prediction of alternatively spliced exons using support vector machines.</title>
            <link>http://www.medworm.com/index.php?rid=3955227&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20815140%26dopt%3DAbstract</link>
            <description>Authors: Xia J, Caragea D, Brown SJ
    Alternative splicing is a mechanism for generating different gene transcripts (called isoforms) from the same genomic sequence. In this paper, we explore the predictive power of a large set of diverse gene features that have been experimentally shown to have effect on alternative splicing. We use such features to build support vector machine classifiers for predicting alternatively spliced exons. Experimental results show that classifiers built from the diverse set of features give better results than those that consider only basic sequence features. Furthermore, we use feature selection methods to identify the most informative features for the prediction problem at hand.
    PMID: 20815140 [PubMed - in process] (Source: International Journal of Data...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3955227</comments>
            <pubDate>Fri, 10 Sep 2010 22:39:02 +0100</pubDate>
            <guid isPermaLink="false">3955227</guid>        </item>
        <item>
            <title>A data mining approach to dinoflagellate clustering according to sterol composition: correlations with evolutionary history.</title>
            <link>http://www.medworm.com/index.php?rid=3955226&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20815141%26dopt%3DAbstract</link>
            <description>This study examined the sterol compositions of 102 dinoflagellates using clustering and cluster validation techniques, as a means of determining the relatedness of the organisms. In addition, dinoflagellate sterol-based relationships were compared statistically to 18S rDNA-based phylogenetic relationships using the Mantel test. Our results indicated that the examined dinoflagellates formed six clusters based on sterol composition and that several, but not all, dinoflagellate genera, which formed discrete clusters in the 18S rDNA-based phylogeny, shared similar sterol compositions. This and other correspondences suggest that the sterol compositions of dinoflagellates are explained, to a certain extent, by the evolutionary history of this lineage.
    PMID: 20815141 [PubMed - in process] (So...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3955226</comments>
            <pubDate>Fri, 10 Sep 2010 22:39:02 +0100</pubDate>
            <guid isPermaLink="false">3955226</guid>        </item>
        <item>
            <title>Towards site-based protein functional annotations.</title>
            <link>http://www.medworm.com/index.php?rid=3955225&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20815142%26dopt%3DAbstract</link>
            <description>Authors: Lei SF, Huan J
    The exact relationship between protein active centres and protein functions is unclear even after decades of intensive study. To improve functional prediction ability based on the local structures, we proposed three different methods. 1. We used Markov Random Field (MRF) to describe protein active region. 2. We developed filtering method that considers the local environment around the active sites. 3. We created multiple structure motifs by extending the motif to neighbouring residues. Our experiment results with enzyme families &amp;lt; 40% sequence identity demonstrated that our methods reduced random matches and could improve up to 70% of the functional annotation ability (using area under curve).
    PMID: 20815142 [PubMed - in process] (Source: International Jo...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3955225</comments>
            <pubDate>Fri, 10 Sep 2010 22:39:02 +0100</pubDate>
            <guid isPermaLink="false">3955225</guid>        </item>
        <item>
            <title>Robust QTL analysis by minimum beta-divergence method.</title>
            <link>http://www.medworm.com/index.php?rid=3955224&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20815143%26dopt%3DAbstract</link>
            <description>Authors: Mollah MN, Eguchi S
    Robustness has received too little attention in Quantitative Trait Loci (QTL) analysis in experimental crosses. This paper discusses a robust QTL mapping algorithm based on Composite Interval Mapping (CIM) model by minimising beta-divergence using the EM like algorithm. We investigate the robustness performance of the proposed method in a comparison of Interval Mapping (IM) and CIM algorithms using both synthetic and real datasets. Experimental results show that the proposed method significantly improves the performance over the traditional IM and CIM methods for QTL analysis in presence of outliers; otherwise, it keeps equal performance.
    PMID: 20815143 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3955224</comments>
            <pubDate>Fri, 10 Sep 2010 22:39:02 +0100</pubDate>
            <guid isPermaLink="false">3955224</guid>        </item>
        <item>
            <title>Discovering breast cancer drug candidates from biomedical literature.</title>
            <link>http://www.medworm.com/index.php?rid=3824161&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20681478%26dopt%3DAbstract</link>
            <description>Authors: Li J, Zhu X, Chen JY
    We developed a new paradigm with the ultimate goal of enabling disease-specific drug candidate discovery with molecular-level evidences generated from literature and prior knowledge. We showed how to implement the paradigm by building a prototype literature-mining framework and performing drug-protein association mining for breast cancer drug discovery. In a molecular pharmacology study of breast cancer, 79.2% of 729 enriched drugs in 'Organic Chemicals' category were validated to be disease-related, and the remaining 20.8% were also investigated as potential for future molecular therapeutics studies. 'Doxorubicin', 'Etoposide' and 'Paclitaxel' were identified as having similar pharmacological profiles to treat breast cancer.
    PMID: 20681478 [PubMed - i...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3824161</comments>
            <pubDate>Thu, 05 Aug 2010 21:57:03 +0100</pubDate>
            <guid isPermaLink="false">3824161</guid>        </item>
        <item>
            <title>Detecting distant homologies on protozoans metabolic pathways using scientific workflows.</title>
            <link>http://www.medworm.com/index.php?rid=3824160&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20681479%26dopt%3DAbstract</link>
            <description>Authors: da Cruz SM, Batista V, Silva E, Tosta F, Vilela C, Cuadrat R, Tschoeke D, Dávila AM, Campos ML, Mattoso M
    Bioinformatics experiments are typically composed of programs in pipelines manipulating an enormous quantity of data. An interesting approach for managing those experiments is through workflow management systems (WfMS). In this work we discuss WfMS features to support genome homology workflows and present some relevant issues for typical genomic experiments. Our evaluation used Kepler WfMS to manage a real genomic pipeline, named OrthoSearch, originally defined as a Perl script. We show a case study detecting distant homologies on trypanosomatids metabolic pathways. Our results reinforce the benefits of WfMS over script languages and point out challenges to WfMS in distri...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3824160</comments>
            <pubDate>Thu, 05 Aug 2010 21:57:03 +0100</pubDate>
            <guid isPermaLink="false">3824160</guid>        </item>
        <item>
            <title>Mining the protein data bank with CReF to predict approximate 3-D structures of polypeptides.</title>
            <link>http://www.medworm.com/index.php?rid=3824159&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20681480%26dopt%3DAbstract</link>
            <description>Authors: Dorn M, de Souza ON
    In this paper we describe CReF, a Central Residue Fragment-based method to predict approximate 3-D structures of polypeptides by mining the Protein Data Bank (PDB). The approximate predicted structures are good enough to be used as starting conformations in refinement procedures employing state-of-the-art molecular mechanics methods such as molecular dynamics simulations. CReF is very fast and we illustrate its efficacy in three case studies of polypeptides whose sizes vary from 34 to 70 amino acids. As indicated by the RMSD values, our initial results show that the predicted structures adopt the expected fold, similar to the experimental ones.
    PMID: 20681480 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3824159</comments>
            <pubDate>Thu, 05 Aug 2010 21:57:03 +0100</pubDate>
            <guid isPermaLink="false">3824159</guid>        </item>
        <item>
            <title>On a novel coalescent model for genome-wide evolution of copy number variations.</title>
            <link>http://www.medworm.com/index.php?rid=3824158&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20681481%26dopt%3DAbstract</link>
            <description>Authors: Mitrofanova A, Mishra B
    Since we are limited in our knowledge of human demographic history and variations of recombination and mutation rates, large-scale computer simulation is a necessary tool in genetics. Here we propose and computationally simulate a model of evolution for unique and segmentally duplicated regions of human genome. Since such segmentally duplicated regions show a complex behaviour of copy number changes, our model will hopefully lead to a better understanding of the evolutionary developments of CNVs, algorithms for associations studies with CNV markers, and finally, for characterising parameters for stochastic diffusion models, describing asymptotic behaviour of evolutionary processes.
    PMID: 20681481 [PubMed - in process] (Source: International Journal ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3824158</comments>
            <pubDate>Thu, 05 Aug 2010 21:57:03 +0100</pubDate>
            <guid isPermaLink="false">3824158</guid>        </item>
        <item>
            <title>Using hybrid hierarchical K-means (HHK) clustering algorithm for protein sequence motif super-rule-tree (SRT) structure construction.</title>
            <link>http://www.medworm.com/index.php?rid=3824157&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20681482%26dopt%3DAbstract</link>
            <description>Authors: Chen B, He J, Pellicer S, Pan Y
    Many algorithms or techniques to discover motifs require a predefined fixed window size in advance. Because of the fixed size, these approaches often deliver a number of similar motifs simply shifted by some bases or including mismatches. To confront the mismatched motifs problem, we use the super-rule concept to construct a Super-Rule-Tree (SRT) by a modified Hybrid Hierarchical K-means (HHK) clustering algorithm, which requires no parameter set-up to identify the similarities and dissimilarities between the motifs. By analysing the motif results generated by our approach, they are significant not only in sequence area but also in secondary structure similarity.
    PMID: 20681482 [PubMed - in process] (Source: International Journal of Data Min...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3824157</comments>
            <pubDate>Thu, 05 Aug 2010 21:57:03 +0100</pubDate>
            <guid isPermaLink="false">3824157</guid>        </item>
        <item>
            <title>A weighted local least squares imputation method for missing value estimation in microarray gene expression data.</title>
            <link>http://www.medworm.com/index.php?rid=3824156&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20681483%26dopt%3DAbstract</link>
            <description>Authors: Ching WK, Li L, Tsing NK, Tai CW, Ng TW, Wong AS, Cheng KW
    Many clustering techniques and classification methods for analysing microarray data require a complete dataset. However, very often gene expression datasets contain missing values due to various reasons. In this paper, we first propose to use vector angle as a measurement for the similarity between genes. We then propose the Weighted Local Least Square Imputation (WLLSI) method for missing values estimation. Numerical results on both synthetic data and real microarray data indicate that WLLSI method is more robust. The imputation methods are then applied to a breast cancer dataset and interesting results are obtained.
    PMID: 20681483 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformat...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3824156</comments>
            <pubDate>Thu, 05 Aug 2010 21:57:03 +0100</pubDate>
            <guid isPermaLink="false">3824156</guid>        </item>
        <item>
            <title>LIBGS: a MATLAB software package for gene selection.</title>
            <link>http://www.medworm.com/index.php?rid=3824155&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20681484%26dopt%3DAbstract</link>
            <description>Authors: Zhang Y, Wang D, Li T
    Many gene selection algorithms have been applied in gene expression data analysis successfully. To solve different developing environments of these toolkits, such as rankgene (Su et al., 2003), and mRMR (http://research.janelia.org/peng/proj/mrmr/index.htm), perform data analysis and make algorithm comparison more flexible, we have developed a software package LIBGS including: 1) Seven new gene selection algorithms implemented using MATLAB. 2) MATLAB interface for Rankgene. 3) MATLAB interface for LIBSVM and WEKA. 4) Programs for converting data formats. 5) A collection of six popular gene expression data sets. These features make LIBGS a useful tool in gene expression analysis and feature selection.
    PMID: 20681484 [PubMed - in process] (Source: Inter...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3824155</comments>
            <pubDate>Thu, 05 Aug 2010 21:57:03 +0100</pubDate>
            <guid isPermaLink="false">3824155</guid>        </item>
        <item>
            <title>Efficient and exact maximum likelihood quantisation of genomic features using dynamic programming.</title>
            <link>http://www.medworm.com/index.php?rid=3515694&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20423016%26dopt%3DAbstract</link>
            <description>Authors: Song M, Haralick RM, Boissinot S
    An efficient and exact dynamic programming algorithm is introduced to quantise a continuous random variable into a discrete random variable that maximises the likelihood of the quantised probability distribution for the original continuous random variable. Quantisation is often useful before statistical analysis and modelling of large discrete network models from observations of multiple continuous random variables. The quantisation algorithm is applied to genomic features including the recombination rate distribution across the chromosomes and the non-coding transposable element LINE-1 in the human genome. The association pattern is studied between the recombination rate, obtained by quantisation at genomic locations around LINE-1 elements, an...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3515694</comments>
            <pubDate>Thu, 29 Apr 2010 17:46:02 +0100</pubDate>
            <guid isPermaLink="false">3515694</guid>        </item>
        <item>
            <title>Cross-platform microarray data integration using the normalised linear transform.</title>
            <link>http://www.medworm.com/index.php?rid=3515693&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20423017%26dopt%3DAbstract</link>
            <description>Authors: Xiong H, Zhang Y, Chen XW, Yu J
    Small sample size is one of the biggest challenges in microarray data analysis. With microarray data being dramatically accumulated, integrating data from related studies represents a natural way to increase sample size so that more reliable statistical analysis may be performed. In this paper, we present a simple and effective integration scheme, called Normalised Linear Transform (NLT), to combine data from different microarray platforms. The NLT scheme is compared with three other integration schemes for two tasks: classification analysis and gene marker selection. Our experiments demonstrate that the NLT scheme performs best in terms of classification accuracy, and leads to more biologically significant marker genes.
    PMID: 20423017 [PubM...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3515693</comments>
            <pubDate>Thu, 29 Apr 2010 17:46:02 +0100</pubDate>
            <guid isPermaLink="false">3515693</guid>        </item>
        <item>
            <title>Medical informatics: transition from data acquisition to data analysis by means of bioinformatics tools and resources.</title>
            <link>http://www.medworm.com/index.php?rid=3515692&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20423018%26dopt%3DAbstract</link>
            <description>Authors: Mahdavi MA
    Medical informatics has shifted its focus from acquisition and storage of healthcare data by integrating computational, informational, cognitive and organisational sciences to semantic analysis of the data for problem solving and clinical decision-making. In this transition, bioinformatics tools and resources are the most appropriate means to improve the analysis, as major biological databases are now containing clinical data alongside genomics, proteomics and other biological data. This paper briefly reviews bioinformatics tools and resources and then discusses their applications in analysing clinical data for diagnostics.
    PMID: 20423018 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3515692</comments>
            <pubDate>Thu, 29 Apr 2010 17:46:02 +0100</pubDate>
            <guid isPermaLink="false">3515692</guid>        </item>
        <item>
            <title>Protein structural classification using orthogonal transformation and class-association rules.</title>
            <link>http://www.medworm.com/index.php?rid=3515691&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20423019%26dopt%3DAbstract</link>
            <description>Authors: Dua S, Kidambi PC
    Protein structure classification and comparison is a central area in the field of bioinformatics. Rapidly increasing protein structure databases commonly suffer from the 'curse of dimensionality', necessitating the development of the dimensionality reduction of structural information prior to its classification. We propose a novel automated algorithmic framework for three-dimensional structure-based classification of proteins using orthogonal transformation of the geometric shape descriptors derived from protein structures, and then employing an association rule-based supervised clustering approach. The proposed computational framework demonstrates, on two different data sets, the applicability of association rule discovery-based classification of structural ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3515691</comments>
            <pubDate>Thu, 29 Apr 2010 17:46:02 +0100</pubDate>
            <guid isPermaLink="false">3515691</guid>        </item>
        <item>
            <title>Hierarchical classification of G-protein-coupled receptors with data-driven selection of attributes and classifiers.</title>
            <link>http://www.medworm.com/index.php?rid=3515690&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20423020%26dopt%3DAbstract</link>
            <description>Authors: Secker A, Davies MN, Freitas AA, Clark EB, Timmis J, Flower DR
    We address the important bioinformatics problem of predicting protein function from a protein's primary sequence. We consider the functional classification of G-Protein-Coupled Receptors (GPCRs), whose functions are specified in a class hierarchy. We tackle this task using a novel top-down hierarchical classification system where, for each node in the class hierarchy, the predictor attributes to be used in that node and the classifier to be applied to the selected attributes are chosen in a data-driven manner. Compared with a previous hierarchical classification system selecting classifiers only, our new system significantly reduced processing time without significantly sacrificing predictive accuracy.
    PMID: 20...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3515690</comments>
            <pubDate>Thu, 29 Apr 2010 17:46:02 +0100</pubDate>
            <guid isPermaLink="false">3515690</guid>        </item>
        <item>
            <title>Prediction of protein-protein interactions from primary sequences.</title>
            <link>http://www.medworm.com/index.php?rid=3515689&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20423021%26dopt%3DAbstract</link>
            <description>In this study, an efficient method is presented to predict protein-protein interactions with sequence composition information. Four kinds of basic building blocks of protein sequences are investigated. The experimental results show that there is only a minor difference in prediction performance among the four kinds of building blocks. The method based on a combination of all building blocks outperforms any of the individual building blocks. We also demonstrate that the use of Latent Semantic Analysis (LSA) can efficiently remove noise and improve the prediction efficiency without significantly degrading the performance.
    PMID: 20423021 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3515689</comments>
            <pubDate>Thu, 29 Apr 2010 17:46:02 +0100</pubDate>
            <guid isPermaLink="false">3515689</guid>        </item>
        <item>
            <title>Feature selection for genomic data sets through feature clustering.</title>
            <link>http://www.medworm.com/index.php?rid=3515688&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20423022%26dopt%3DAbstract</link>
            <description>Authors: Zheng F, Shen X, Fu Z, Zheng S, Li G
    A subset selected by a supervised feature selection method may not be a good one for unsupervised learning and vice versa. We propose a novel Feature Selection algorithm through Feature Clustering, FSFC. FSFC does not need the class label information in the data set and is suitable for both supervised learning and unsupervised learning. We test FSFC on some biological data sets for both clustering and classification analysis and the results indicate that the FSFC algorithm can significantly reduce the original data sets without sacrificing the quality of clustering and classification.
    PMID: 20423022 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3515688</comments>
            <pubDate>Thu, 29 Apr 2010 17:46:02 +0100</pubDate>
            <guid isPermaLink="false">3515688</guid>        </item>
        <item>
            <title>Computing approximate solutions of the protein structure determination problem using global constraints on discrete crystal lattices.</title>
            <link>http://www.medworm.com/index.php?rid=3457112&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20376920%26dopt%3DAbstract</link>
            <description>The objective is to enhance the efficiency of lattice solvers in dealing with the construction of approximate solutions of the protein structure determination problem. Some of them (e.g., self-avoiding-walk) have been explicitly or implicitly already used in previous approaches, while others (e.g., the density constraint) are new. The intrinsic complexities of all of them are studied and preliminary experimental results are discussed.
    PMID: 20376920 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3457112</comments>
            <pubDate>Sat, 10 Apr 2010 23:10:03 +0100</pubDate>
            <guid isPermaLink="false">3457112</guid>        </item>
        <item>
            <title>Improved bayesian network inference using relaxed gene ordering.</title>
            <link>http://www.medworm.com/index.php?rid=3457111&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20376921%26dopt%3DAbstract</link>
            <description>Authors: Zhu D, Li H
    Bayesian Networks (BNs) have become one of the most powerful means of reconstructing signalling pathways in silico. Excessive computational loads limit the applications of BNs to learn larger sized network structures. Recent bioinformatics research found that signalling pathways are likely hierarchically organised. Genes resident in hierarchical layers constitute biological constraint, which can be readily used by BN structural learning algorithms to substantially reduce the computational load. We propose a constrained BN structural learning algorithm that solves the NP-complete computational problem in a heuristic manner. We demonstrate the utility of our algorithm in constructing two important signalling pathways in S. cerevisiae.
    PMID: 20376921 [PubMed - in ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3457111</comments>
            <pubDate>Sat, 10 Apr 2010 23:10:03 +0100</pubDate>
            <guid isPermaLink="false">3457111</guid>        </item>
        <item>
            <title>Alignment of multiple proteins with an ensemble of hidden Markov models.</title>
            <link>http://www.medworm.com/index.php?rid=3457110&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20376922%26dopt%3DAbstract</link>
            <description>Authors: Song J, Liu C, Song Y, Qu J, Hura GS
    In this paper, we developed a new method that progressively constructs and updates a set of alignments by adding sequences in a certain order to each of the existing alignments. Each of the existing alignments is modelled with a profile Hidden Markov Model (HMM) and an added sequence is aligned to each of these profile HMMs. We introduced an integer parameter for the number of profile HMMs. The profile HMMs are then updated based on the alignments with leading scores. Our experiments on BaliBASE showed that our approach could efficiently explore the alignment space and significantly improve the alignment accuracy.
    PMID: 20376922 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3457110</comments>
            <pubDate>Sat, 10 Apr 2010 23:10:03 +0100</pubDate>
            <guid isPermaLink="false">3457110</guid>        </item>
        <item>
            <title>Matrix factorisation methods applied in microarray data analysis.</title>
            <link>http://www.medworm.com/index.php?rid=3457109&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20376923%26dopt%3DAbstract</link>
            <description>Authors: Kossenkov AV, Ochs MF
    Numerous methods have been applied to microarray data to group genes into clusters that show similar expression patterns. These methods assign each gene to a single group, which does not reflect the widely held view among biologists that most, if not all, genes in eukaryotes are involved in multiple biological processes and therefore will be multiply regulated. Here, we review several methods of matrix factorisation that identify patterns of behaviour in transcriptional response and assign genes to multiple patterns. We focus on these methods rather than traditional clustering methods applied to microarray data, which assign one gene to one cluster.
    PMID: 20376923 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3457109</comments>
            <pubDate>Sat, 10 Apr 2010 23:10:03 +0100</pubDate>
            <guid isPermaLink="false">3457109</guid>        </item>
        <item>
            <title>Identifying the overlapping complexes in protein interaction networks.</title>
            <link>http://www.medworm.com/index.php?rid=3457108&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20376924%26dopt%3DAbstract</link>
            <description>Authors: Li M, Wang J, Chen J, Cai Z, Chen G
    Identification of protein complexes in large interaction networks is crucial to understanding principles of cellular organisation and predicting protein functions. In this paper, a new algorithm of Identifying Protein Complexes based on Maximal Clique Extension (IPC-MCE) is proposed. The maximal clique is considered as the core of the protein complex. Proteins in a complex are classed into core vertices and peripheral vertices. The relation between the core vertices and peripheral vertices is measured by the Interaction Probability. The algorithm IPC-MCE is applied to the protein interaction network of Saccharomyces cerevisiae. Many well-known protein complexes are detected.
    PMID: 20376924 [PubMed - in process] (Source: International Journa...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3457108</comments>
            <pubDate>Sat, 10 Apr 2010 23:10:03 +0100</pubDate>
            <guid isPermaLink="false">3457108</guid>        </item>
        <item>
            <title>Integrating flexibility and interactivity in bioinformatics visual programming tools with Focus+Context algorithm.</title>
            <link>http://www.medworm.com/index.php?rid=3457107&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20376925%26dopt%3DAbstract</link>
            <description>Authors: Shen X, Gu J
    An improved bioinformatics visual programming prototype system VBP that aims to visualise highly complicated bioinformatical data is described. In this paper, we describe the integration of Focus+Context algorithm and bio-visual programming to show the dynamic adjusting of Focusing on details without losing the context simultaneously. Because of this flexible and interactive architecture, VBP makes an ideal bio-visual programming tool for future bioinformatics or systems biology research.
    PMID: 20376925 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3457107</comments>
            <pubDate>Sat, 10 Apr 2010 23:10:03 +0100</pubDate>
            <guid isPermaLink="false">3457107</guid>        </item>
        <item>
            <title>Struct-NB: Predicting Protein-RNA Binding Sites Using Structural Features.</title>
            <link>http://www.medworm.com/index.php?rid=3386157&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D20300450%26dopt%3DAbstract</link>
            <description>Authors: Towfic F, Caragea C, Gemperline DC, Dobbs D, Honavar V
    We explore whether protein-RNA interfaces differ from non-interfaces in terms of their structural features and whether structural features vary according to the type of the bound RNA (e.g., mRNA, siRNA, etc.), using a non-redundant dataset of 147 protein chains extracted from protein-RNA complexes in the Protein Data Bank. Furthermore, we use machine learning algorithms for training classifiers to predict protein-RNA interfaces using information derived from the sequence and structural features. We develop the Struct-NB classifier that takes into account structural information. We compare the performance of Na&#xEF;ve Bayes and Gaussian Na&#xEF;ve Bayes with that of Struct-NB classifiers on the 147 protein-RNA dataset usin...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=3386157</comments>
            <pubDate>Thu, 14 Jan 2010 00:00:00 +0100</pubDate>
            <guid isPermaLink="false">3386157</guid>        </item>
        <item>
            <title>A semi-supervised approach to projected clustering with applications to microarray data.</title>
            <link>http://www.medworm.com/index.php?rid=2640291&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623769%26dopt%3DAbstract</link>
            <description>Authors: Yip KY, Cheung L, Cheung DW, Jing L, Ng MK
    Recent studies have suggested that extremely low dimensional projected clusters exist in real datasets. Here, we propose a new algorithm for identifying them. It combines object clustering and dimension selection, and allows the input of domain knowledge in guiding the clustering process. Theoretical and experimental results show that even a small amount of input knowledge could already help detect clusters with only 1% of the relevant dimensions. We also show that this semi-supervised algorithm can perform knowledge-guided selective clustering when there are multiple meaningful object groupings. The algorithm is also shown effective in analysing a microarray dataset.
    PMID: 19623769 [PubMed - in process] (Source: International Jou...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640291</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640291</guid>        </item>
        <item>
            <title>Clustering sequences by overlap.</title>
            <link>http://www.medworm.com/index.php?rid=2640290&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623770%26dopt%3DAbstract</link>
            <description>Authors: Dorr DH, Denton AM
    A clustering algorithm is introduced that combines the strengths of clustering and motif finding techniques. Clusters are identified based on unambiguously defined sequence sections as in motif finding algorithms. The definition of similarity within clusters allows transitive matches and, thereby, enables the discovery of remote homologies that cannot be found through motif-finding algorithms. Directed Acyclic Graph (DAG) structures are constructed that link short clusters to the longer ones. We compare the clustering results to the corresponding domains in the InterPro database. A second comparison shows that annotations based on our domains are inherently more consistent than those based on InterPro domains.
    PMID: 19623770 [PubMed - in process] (Source...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640290</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640290</guid>        </item>
        <item>
            <title>Stroma classification for neuroblastoma on graphics processors.</title>
            <link>http://www.medworm.com/index.php?rid=2640289&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623771%26dopt%3DAbstract</link>
            <description>Authors: Ruiz A, Sertel O, Ujald&#xF3;n M, Catalyurek U, Saltz J, Gurcan MN
    Neuroblastoma is one of the most common childhood cancers. We are developing an image analysis system to assist pathologists in their prognosis. Since this system operates on relatively large-scale images and requires sophisticated algorithms, computerised analysis takes a long time to execute. In this paper, we propose a novel approach to benefit from high memory bandwidth and strong floating-point capabilities of graphics processing units. The proposed approach achieves a promising classification accuracy of 99.4% and an execution performance with a gain factor up to 45 times compared to hand-optimised C++ code running on the CPU.
    PMID: 19623771 [PubMed - in process] (Source: International Journal of Data...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640289</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640289</guid>        </item>
        <item>
            <title>Clinical text classification under the Open and Closed Topic Assumptions.</title>
            <link>http://www.medworm.com/index.php?rid=2640288&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623772%26dopt%3DAbstract</link>
            <description>Authors: Sasaki Y, Rea B, Ananiadou S
    This paper investigates multi-topic aspects in automatic classification of clinical free text in comparison with general text. In this paper, we facilitate two different views on multi-topics: the Closed Topic Assumption (CTA) and the Open Topic Assumption (OTA). Experimental results show that the characteristics of multi-topic assignments in the Computational Medicine Centre (CMC) Medical NLP Challenge Data are strongly OTA-oriented, whereas the general text Reuters-21578 is characterised in the middle of the OTA and CTA spectrum.
    PMID: 19623772 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640288</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640288</guid>        </item>
        <item>
            <title>Tracking multiple interacting subcellular structure by sequential Monte Carlo method.</title>
            <link>http://www.medworm.com/index.php?rid=2640287&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623773%26dopt%3DAbstract</link>
            <description>Authors: Wen Q, Luby-Phelps K, Gao J
    With the wide application of Green Fluorescent Proteins (GFP) in the study of live cells, there is a surging need for computer-aided analysis on the huge amount of image sequence data acquired by the advanced microscopy devices. In this paper, a framework based on Sequential Monte Carlo (SMC) is proposed for multiple interacting object tracking. The distribution of the dimension varying joint state is sampled efficiently by a Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm with a novel height swap move. Experiments were performed on synthetic and real confocal microscopy image sequences.
    PMID: 19623773 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640287</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640287</guid>        </item>
        <item>
            <title>Semantic similarity based feature extraction from microarray expression data.</title>
            <link>http://www.medworm.com/index.php?rid=2640286&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623774%26dopt%3DAbstract</link>
            <description>Authors: Cho YR, Zhang A, Xu X
    Previous studies have proven that it is feasible to build sample classifiers using gene expression profiles. To build an effective sample classifier, dimension reduction process is necessary since classic pattern recognition algorithms do not work well in high dimensional space. In this paper, we present a novel feature extraction algorithm by integrating microarray expression data with Gene Ontology (GO). Applying semantic similarity measures, we identify the groups of genes, called virtual genes, which potentially interact with each other for a biological function. The correlation in expressions of virtual genes is used to classify samples. For colon cancer data, this approach significantly improved the classification accuracy by more than 10%.
    PMID...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640286</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640286</guid>        </item>
        <item>
            <title>An effective convergence independent loop closure method using Forward-Backward Cyclic Coordinate Descent.</title>
            <link>http://www.medworm.com/index.php?rid=2640285&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623775%26dopt%3DAbstract</link>
            <description>Authors: Al-Nasr K, He J
    Cyclic Coordinate Descent (CCD) is a popular robotic approach to generate a possible loop that closes the gap between two constrained portions of a protein chain (Canutescu and Dunbrack 2003). In this paper, we describe an effective Forward-Backward CCD (FBCCD) method to connect the two constrained portions of a protein chain without requiring the loop to converge. A test of 30 loops of length 4, 8 and 12 suggests that our method takes fewer cycles to produce loops of comparable accuracy and a more accurate second portion of the chain than the CCD method.
    PMID: 19623775 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640285</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640285</guid>        </item>
        <item>
            <title>22nd annual ACM symposium on applied computing.</title>
            <link>http://www.medworm.com/index.php?rid=2473318&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432372%26dopt%3DAbstract</link>
            <description>Authors: Palakal M
    
    PMID: 19432372 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473318</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473318</guid>        </item>
        <item>
            <title>A cube framework for incorporating inter-gene information into biological data mining.</title>
            <link>http://www.medworm.com/index.php?rid=2473315&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432373%26dopt%3DAbstract</link>
            <description>Authors: Lin KM, Kang J, Shin H, Lee J
    Large volumes of microarray data are registered daily in public repositories such as SMD (Belkin and Niyogi, 2003) and GEO (Ashburner et al., 2000). Such repositories have quickly become a community resource. However, due to the inherent heterogeneity of the microarray experiments, the data generated from different experiments could not be directly integrated and hence the resources have not been fully utilised. To address this problem, we propose a new microarray integration framework that achieves high-quality integration through exploiting invariant features such as relative information among genes. We also show how the proposed approach generalises the previous frameworks.
    PMID: 19432373 [PubMed - indexed for MEDLINE] (Source: Internationa...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473315</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473315</guid>        </item>
        <item>
            <title>Finding new core promoter elements using backward-looking strategies.</title>
            <link>http://www.medworm.com/index.php?rid=2473312&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432374%26dopt%3DAbstract</link>
            <description>Authors: Huang YF, Jhan YC, Liou SW
    Core Promoter Elements (CPEs) are key players in transcription initiation. Identifying CPEs is crucial for understanding gene expression. In this paper, a framework for finding new CPEs is proposed. An experiment was performed on the sequences of the Eukaryotic Promoter Database (EPD). The known CPEs were all recovered; in addition, five new motifs were discovered in Drosophila and three in human. By comparing the results with currently known CPEs, it is shown that the proposed system is feasible and reliable, and these new CPEs are worthy of further exploration.
    PMID: 19432374 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473312</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473312</guid>        </item>
        <item>
            <title>An on demand data integration model for biological databases.</title>
            <link>http://www.medworm.com/index.php?rid=2473309&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432375%26dopt%3DAbstract</link>
            <description>Authors: Palakal M, Naidu P
    This paper presents a user-centric biological query system for information integration and knowledge acquisition from distributed, semantically heterogeneous data sources. The proposed system, BioXBase, extracts user requested query information over the internet from multiple biological sources and organises this information into a homogeneous unified view to the user. This entire process is done in real time on-the-fly. The BioXBase system has improved the results retrieved by 30% compared to a system that has only a local database. The BioXBase system is further enhanced by 20% while combining the results with a local database, making the results more significant in biological domain.
    PMID: 19432375 [PubMed - indexed for MEDLINE] (Source: International...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473309</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473309</guid>        </item>
        <item>
            <title>Predicting protein-protein interfaces as clusters of optimal docking area points.</title>
            <link>http://www.medworm.com/index.php?rid=2473306&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432376%26dopt%3DAbstract</link>
            <description>Authors: Arafat Y, Kamruzzaman J, Karmakar GC, Fernandez-Recio J
    The desolvation property is used here to predict protein-protein binding sites, exploiting the fact that lower-valued 'optimal docking area' ODA (Fernandez-Recio et al., 2005) points form clusters at the interface. The proposed method involves two steps: clustering the ODA points and representing ODA points by average ODA values. On 51 nonredundant proteins, results show the success rate improved considerably. Considering only significant ODA, the previous ODA method has obtained a success rate of 65% with overall success rate of 39%. The proposed method improved the overall success rate to 61%. Further, comparable results were found for X-ray and NMR structures.
    PMID: 19432376 [PubMed - indexed for MEDLINE] (Source: Intern...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473306</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473306</guid>        </item>
        <item>
            <title>A hybrid graph-theoretic method for mining overlapping functional modules in large sparse protein interaction networks.</title>
            <link>http://www.medworm.com/index.php?rid=2473303&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432377%26dopt%3DAbstract</link>
            <description>Authors: Zhang S, Liu HW, Ning XM, Zhang XS
    Modular architecture, which encompasses groups of genes/proteins involved in elementary biological functional units, is a basic form of the organisation of interacting proteins. Here, we propose a method that combines the Line Graph Transformation (LGT) and clique percolation-clustering algorithm to detect network modules, which may overlap each other in large sparse PPI networks. The modules produced by the present method show a high coverage in the yeast, fly, and worm PPI networks. Our analysis of the yeast PPI network suggests that most of these modules are biologically significant in the context of protein localisation, function annotation, and protein complexes.
    PMID: 19432377 [PubMed - indexed for MEDLINE] (Source: I...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473303</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473303</guid>        </item>
        <item>
            <title>Irrelevant gene elimination for partial least squares based dimension reduction by using feature probes.</title>
            <link>http://www.medworm.com/index.php?rid=2473300&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432378%26dopt%3DAbstract</link>
            <description>Authors: Zeng XQ, Li GZ, Wu GF, Yang JY, Yang MQ
    It is hard to analyse gene expression data which has only a few observations but with thousands of measured genes. Partial Least Squares based Dimension Reduction (PLSDR) is superior for handling such high dimensional problems, but irrelevant features will introduce errors into the dimension reduction process. Here, feature selection is applied to filter the data and an algorithm named PLSDRg is described by integrating PLSDR with gene elimination, which is performed by the indication of t-statistic scores on standardised probes. Experimental results on six microarray data sets show that PLSDRg is effective and reliable to improve generalisation performance of classifiers.
    PMID: 19432378 [PubMed - indexed for MEDLINE] (Source: Intern...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473300</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473300</guid>        </item>
        <item>
            <title>Discovering implicit associations among critical biological entities.</title>
            <link>http://www.medworm.com/index.php?rid=2473297&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517984%26dopt%3DAbstract</link>
            <description>Authors: Seki K, Mostafa J
    We propose an approach to predicting implicit gene-disease associations based on the inference network, whereby genes and diseases are represented as nodes and are connected via two types of intermediate nodes: gene functions and phenotypes. To estimate the probabilities involved in the model, two learning schemes are compared; one baseline using co-annotations of keywords and the other taking advantage of free text. Additionally, we explore the use of domain ontologies to complement data sparseness and examine the impact of full text documents. The validity of the proposed framework is demonstrated on the benchmark data set created from real-world data.
    PMID: 19517984 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473297</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473297</guid>        </item>
        <item>
            <title>Double iterative optimisation for metabolic network-based drug target identification.</title>
            <link>http://www.medworm.com/index.php?rid=2473294&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517985%26dopt%3DAbstract</link>
            <description>We present novel and scalable algorithms for finding a set of enzymes, whose inhibition stops the production of a given set of target compounds, while eliminating minimal number of non-target compounds. Experimental results are presented for the E. coli metabolic network to demonstrate the accuracy and efficiency of our iterative method.
    PMID: 19517985 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473294</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473294</guid>        </item>
        <item>
            <title>Study of microarray time series data based on Forward-Backward Linear Prediction and Singular Value Decomposition.</title>
            <link>http://www.medworm.com/index.php?rid=2473292&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517986%26dopt%3DAbstract</link>
            <description>Authors: Choong MK, Levy D, Yan H
    We propose a method to analyse the periodicities of gene expression profiles based on the spectral domain approach. Our spectral reconstruction method outperforms three other recently proposed methods, which do not require any prior knowledge. It is proven that an alternative method for studying cell-cycle regulation is possible even where very little prior knowledge is available. We also investigate the potential of combining signals with similar frequency components to form an overdetermined system of equations, and use least squares solution to estimate the spectral frequency. Results show that this new method is able to estimate the peak frequency more accurately.
    PMID: 19517986 [PubMed - in process] (Source: International Journal of Data Minin...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473292</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473292</guid>        </item>
        <item>
            <title>Computational identification of protein-coding sequences by comparative analysis.</title>
            <link>http://www.medworm.com/index.php?rid=2473289&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517987%26dopt%3DAbstract</link>
            <description>Authors: Fontaine A, Touzet H
    Gene prediction is an essential step in understanding the genome of a species once it has been sequenced. A promising direction in current research on gene finding is the comparative genomics approach. In this paper, we present a novel approach to identifying evolutionarily conserved protein-coding sequences in genomes. The method takes advantage of the specific substitution pattern of coding sequences together with the consistency of reading frames. It has been implemented in a software tool called PROTEA. Large-scale experimentation shows good results. PROTEA is intended to be a useful complement to existing tools based on homology search or statistical properties of the sequences.
    PMID: 19517987 [PubMed - in process] (Source: International Journa...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473289</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473289</guid>        </item>
        <item>
            <title>Feature cluster selection for high-throughput data analysis.</title>
            <link>http://www.medworm.com/index.php?rid=2473287&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517988%26dopt%3DAbstract</link>
            <description>Authors: Yu L
    Feature selection is effective in selecting predictive gene sets for microarray classification. However, the large number of predictive gene sets and the disparity among them present a challenge for identifying potential biomarkers. To facilitate biomarker identification, we present a new data mining task, feature cluster selection, which selects from a full set of features a small number of coherent and predictive feature clusters. We provide both a theoretical definition and an empirical formulation for the new problem, and propose an efficient 3M algorithm. Experiments on microarray data have shown that the 3M algorithm can select predictive and statistically significant gene clusters.
    PMID: 19517988 [PubMed - in process] (Source: International Journal of Data Mining a...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473287</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473287</guid>        </item>
        <item>
            <title>A space-efficient algorithm for three sequence alignment and ancestor inference.</title>
            <link>http://www.medworm.com/index.php?rid=2473286&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517989%26dopt%3DAbstract</link>
            <description>Authors: Yue F, Tang J
    We propose a novel algorithm to simultaneously align three biological sequences with affine gap model and infer their common ancestral sequence. It applies the divide-and-conquer strategy to reduce the memory usage from O(n^3) to O(n^2). At the same time, it is based on dynamic programming and thus the optimal alignment is guaranteed. We implemented the algorithm and tested it extensively with both BAliBASE dataset and simulation data generated by Random Model of Sequence Evolution (ROSE). Compared with other popular multiple sequence alignment tools such as ClustalW and T-Coffee, our program produces not only better alignment, but also better ancestral sequence.
    PMID: 19517989 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformati...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473286</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473286</guid>        </item>
        <item>
            <title>Spherical-harmonic decomposition for molecular recognition in electron-density maps.</title>
            <link>http://www.medworm.com/index.php?rid=2473280&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517990%26dopt%3DAbstract</link>
            <description>Authors: DiMaio FP, Soni AB, Phillips GN, Shavlik JW
    Several methods for automatically constructing a protein model from an electron-density map require searching for many small protein-fragment templates in the density. We propose to use the spherical-harmonic decomposition of the template and the map's density to speed this matching. Unlike other template-matching approaches, this allows us to eliminate large portions of the map unlikely to match any templates. We train several first-pass filters for this elimination task. We show that our new template-matching method improves accuracy and reduces running time, compared to previous approaches. Finally, we extend our method to produce a structural-homology detection algorithm using electron density.
    PMID: 19517990 [PubMed - in process] ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473280</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473280</guid>        </item>
        <item>
            <title>Word Sense Disambiguation in biomedical ontologies with term co-occurrence analysis and document clustering.</title>
            <link>http://www.medworm.com/index.php?rid=1991660&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024494%26dopt%3DAbstract</link>
            <description>Authors: Andreopoulos B, Alexopoulou D, Schroeder M
    With more and more genomes being sequenced, a lot of effort is devoted to their annotation with terms from controlled vocabularies such as the GeneOntology. Manual annotation based on relevant literature is tedious, but automation of this process is difficult. One particularly challenging problem is word sense disambiguation. Terms such as 'development' can refer to developmental biology or to the more general sense. Here, we present two approaches to address this problem by using term co-occurrences and document clustering. To evaluate our method we defined a corpus of 331 documents on development and developmental biology. Term co-occurrence analysis achieves an F-measure of 77%. Additionally, applying document clustering improves p...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991660</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991660</guid>        </item>
        <item>
            <title>Scoring and summarising gene product clusters using the Gene Ontology.</title>
            <link>http://www.medworm.com/index.php?rid=1991659&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024495%26dopt%3DAbstract</link>
            <description>Authors: Denaxas SC, Tjortjis C
    We propose an approach for quantifying the biological relatedness between gene products, based on their properties, and measure their similarities using exclusively statistical NLP techniques and Gene Ontology (GO) annotations. We also present a novel similarity figure of merit, based on the vector space model, which assesses gene expression analysis results and scores gene product clusters' biological coherency, making sole use of their annotation terms and textual descriptions. We define query profiles which rapidly detect a gene product cluster's dominant biological properties. Experimental results validate our approach, and illustrate a strong correlation between our coherency score and gene expression patterns.
    PMID: 19024495 [PubMed - in proces...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991659</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991659</guid>        </item>
        <item>
            <title>Sparse p-norm Nonnegative Matrix Factorization for clustering gene expression data.</title>
            <link>http://www.medworm.com/index.php?rid=1991658&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024496%26dopt%3DAbstract</link>
            <description>Authors: Liu W, Yuan K
    Nonnegative Matrix Factorization (NMF) is a powerful tool for gene expression data analysis as it reduces thousands of genes to a few compact metagenes, especially in clustering gene expression samples for cancer class discovery. Enhancing sparseness of the factorisation can find only a few dominantly coexpressed metagenes and improve the clustering effectiveness. Sparse p-norm (p &gt; 1) Nonnegative Matrix Factorization (Sp-NMF) is a more sparse representation method using high order norm to normalise the decomposed components. In this paper, we investigate the benefit of high order normalisation for clustering cancer-related gene expression samples. Experimental results demonstrate that Sp-NMF leads to robust and effective clustering in both automatically deter...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991658</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991658</guid>        </item>
        <item>
            <title>A Bayesian framework for knowledge driven regression model in micro-array data analysis.</title>
            <link>http://www.medworm.com/index.php?rid=1991657&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024497%26dopt%3DAbstract</link>
            <description>We presented a full Bayesian framework to effectively exploit the similarity information of the input variables for linear regression. Empirical studies with gene expression data show that the regression errors can be reduced significantly by incorporating the similarity information derived from gene ontology.
    PMID: 19024497 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991657</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991657</guid>        </item>
        <item>
            <title>Classification techniques with minimal labelling effort and application to medical reports.</title>
            <link>http://www.medworm.com/index.php?rid=1991656&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024498%26dopt%3DAbstract</link>
            <description>Authors: Saad FH, Bell GD, de la Iglesia B
    There are a number of approaches to classify text documents. Here, we use Partially Supervised Classification (PSC) and argue that it is an effective and efficient approach for real-world problems. PSC uses a two-step strategy to cut down on the labelling effort. There are a number of methods that have been proposed for each step. An evaluation of various methods is conducted using real-world medical documents. The results show that using EM to build the classifier yields better results than SVM. We also experimentally show that careful selection of a subset of features to represent the documents can improve performance.
    PMID: 19024498 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991656</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991656</guid>        </item>
        <item>
            <title>Message Passing Clustering (MPC): a knowledge-based framework for clustering under biological constraints.</title>
            <link>http://www.medworm.com/index.php?rid=1769191&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767349%26dopt%3DAbstract</link>
            <description>Authors: Geng H, Deng X, Ali HH
    A new clustering algorithm, Message Passing Clustering (MPC), is proposed. MPC employs the concept of message passing to describe a parallel and spontaneous clustering process by allowing data objects to communicate with each other. MPC also provides an extensible framework to accommodate additional features in clustering, such as adaptive feature weight scaling, stochastic cluster merging, and semi-supervised constraint guiding. Extensive experiments were performed using both simulated and real microarray gene expression and phylogenetic data. The results showed that MPC compared favourably with other popular clustering algorithms, and that MPC with the additional features integrated gave an even higher accuracy rate than MPC alone.
    PMID: 18767349 [PubMed - ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769191</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769191</guid>        </item>
        <item>
            <title>Identification of Intrinsically Unstructured Proteins using hierarchical classifier.</title>
            <link>http://www.medworm.com/index.php?rid=1769190&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767350%26dopt%3DAbstract</link>
            <description>Authors: Yang JY, Yang MQ
    It is suggested that a protein functions only when folded into a particular 3-D structure. Recently, many protein regions and some entire proteins have been identified with no definite tertiary structure, but presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured regions and Proteins (IUP). We constructed a Recursive Maximum Contrast Tree (RMCT) based classifier to identify IUP. The classifier has been benchmarked against the industry-standard PONDR VLXT on out-of-sample data by external evaluators. The IUP predictor is a viable alternative software tool for identifying intrinsically unstructured regions and proteins.
    PMID: 18767350 [PubMed - in process] (So...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769190</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769190</guid>        </item>
        <item>
            <title>Handling gene redundancy in microarray data using Grey Relational Analysis.</title>
            <link>http://www.medworm.com/index.php?rid=1769189&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767351%26dopt%3DAbstract</link>
            <description>Authors: Zhang LJ, Li ZJ, Chen HW
    Gene selection is one of the important and frequently used techniques for microarray data classification. In this paper, we introduce a new metric to measure gene-class relevance and gene-gene redundancy. The new metric, called Grey Relational Grade (GRG), is based on Grey Relational Analysis (GRA) and has not been used in gene selection before. Based on the GRG, we develop a new gene selection method, which uses GRG to group similar genes into clusters and then selects informative genes from each cluster to avoid redundancy. Experiments on public data sets demonstrate the effectiveness of the proposed method.
    PMID: 18767351 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769189</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769189</guid>        </item>
        <item>
            <title>Large-scale Protein-Protein Interaction prediction using novel kernel methods.</title>
            <link>http://www.medworm.com/index.php?rid=1769188&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767352%26dopt%3DAbstract</link>
            <description>Authors: Chen XW, Han B, Fang J, Haasl RJ
    Knowledge of Protein-Protein Interactions (PPIs) can give us new insights into molecular mechanisms and properties of the cell. In this paper, we propose a novel domain-based kernel method to predict PPIs. A new kernel that measures the similarity between protein pairs based on a new feature representation is developed and applied to a large scale PPI database. Experimental results demonstrate its effectiveness. Furthermore, we evaluate the problem of cross-species PPI prediction and the effect of the number of negative samples on the performance of PPI predictions, which are two fundamental problems in most in silico PPI methods.
    PMID: 18767352 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769188</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769188</guid>        </item>
        <item>
            <title>Protein homology detection with biologically inspired features and interpretable statistical models.</title>
            <link>http://www.medworm.com/index.php?rid=1769187&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767353%26dopt%3DAbstract</link>
            <description>Authors: Huang PH, Pavlovic V
    Computational classification of proteins using methods such as string kernels and Fisher-SVM has demonstrated great success. However, the resulting models do not offer an immediate interpretation of the underlying biological mechanisms. In this work, we propose a biologically motivated feature set combined with a sparse classifier, based on a small subset of positions and residues in protein sequences, for protein superfamily detection and show the performance of our models is comparable to that of the state-of-the-art methods on a benchmark dataset. The set of sparse critical features discovered by the models is consistent with the confirmed biological findings.
    PMID: 18767353 [PubMed - in process] (Source: International Journal of Data Mining and Bio...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769187</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769187</guid>        </item>
        <item>
            <title>Discovery of metabolite features for the modelling and analysis of high-resolution NMR spectra.</title>
            <link>http://www.medworm.com/index.php?rid=1769186&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767354%26dopt%3DAbstract</link>
            <description>This study presents three feature selection methods for identifying the metabolite features in nuclear magnetic resonance spectra that contribute to the distinction of samples among varying nutritional conditions. Principal component analysis, Fisher discriminant analysis, and Partial Least Square Discriminant Analysis (PLS-DA) were used to calculate the importance of individual metabolite features in the spectra. Moreover, an Orthogonal Signal Correction (OSC) filter was used to eliminate unnecessary variations in the spectra. We evaluated the presented methods by comparing the classification ability of the features selected by each method. The results showed that the best classification was achieved with an OSC-PLS-DA model.
    PMID: 18767354 [PubMed - in process] (Source: International ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769186</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769186</guid>        </item>
        <item>
            <title>Gene Regulatory Network modelling: a state-space approach.</title>
            <link>http://www.medworm.com/index.php?rid=1527151&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399325%26dopt%3DAbstract</link>
            <description>This study proposes a state-space model with a control portion for inferring Gene Regulatory Networks (GRNs). The proposed model views genes as the observation variables, whose expression values depend on the current internal state variables and control variables, and views the means of clusters of gene expression as the control variables of the internal state equation. Bayesian Information Criterion (BIC) and Probabilistic Principal Component Analysis (PPCA) are used to estimate the internal states from observation data. The proposed approach is applied to two gene expression datasets. Computational results show that the inferred GRNs possess the characteristics of real-life GRNs.
    PMID: 18399325 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinform...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527151</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527151</guid>        </item>
        <item>
            <title>Segmentation of short human exons based on spectral features of double curves.</title>
            <link>http://www.medworm.com/index.php?rid=1527150&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399326%26dopt%3DAbstract</link>
            <description>Authors: Jiang R, Yan H
    This paper presents a new segmentation method based on spectral analysis to locate borders between short protein coding regions and non-coding regions. We formulate the innovative double curve representation of a DNA sequence and apply local three-codon measurement on the discrete Fourier spectral features at 1/3 frequency to identify short protein coding regions. The proposed spectral segmentation method based on double curves requires no prior knowledge of the DNA data. Our simulation results show that the proposed spectral method greatly improves the accuracy of identifying short coding regions in DNA sequences compared with the results obtained from the other methods that analyse DNA sequences directly.
    PMID: 18399326 [PubMed - indexed for MEDLINE] (Sour...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527150</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527150</guid>        </item>
        <item>
            <title>Temporal representation for gene networks: towards a qualitative temporal data mining.</title>
            <link>http://www.medworm.com/index.php?rid=1527149&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399327%26dopt%3DAbstract</link>
            <description>Authors: Turenne N, Schwer SR
    Processing literature (i.e., text corpora) to capture gene regulation events is not easy and can be driven by the final data representation. We propose to build, manually, an example of temporal representation (whole gene networks for coat formation in Bacillus subtilis). Our temporal representation is based on a generalised formal language theory (S-languages). We propose an algorithm to link bags of relations with representation, by ordering interactions. In this paper, starting from the network made manually from text data, we show that S-languages are quite relevant to encapsulate gene properties, and infer knowledge across timestamped gene relations found in texts.
    PMID: 18399327 [PubMed - indexed for MEDLINE] (Source: International Journal of Dat...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527149</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527149</guid>        </item>
        <item>
            <title>An integrative approach for biological data mining and visualisation.</title>
            <link>http://www.medworm.com/index.php?rid=1527148&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399328%26dopt%3DAbstract</link>
            <description>We present a system to integrate data across multiple bioinformatics databases and enable mining across various conceptual levels of biological information. The results are represented as complex networks. Context dependent mining of these networks is achieved by use of distances. Our approach is demonstrated with three applications: full metabolic network retrieval with network topology study, exploration of properties and relationships of a set of selected proteins, and combined visualisation and exploration of gene expression data with related pathways and ontologies.
    PMID: 18399328 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527148</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527148</guid>        </item>
        <item>
            <title>A rule-based approach for RNA pseudoknot prediction.</title>
            <link>http://www.medworm.com/index.php?rid=1527147&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399329%26dopt%3DAbstract</link>
            <description>Authors: Fu XZ, Wang H, Harrison RW, Harrison WL
    RNA plays a critical role in mediating every step of cellular information transfer from genes to functional proteins. Pseudoknots are functionally important and widely occurring structural motifs found in all types of RNA. Therefore predicting their structures is an important problem. In this paper, we present a new RNA pseudoknot structure prediction method based on term rewriting. The method is implemented using the Mfold RNA/DNA folding package and the term rewriting language Maude. In our method, RNA structures are treated as terms and rules are discovered for predicting pseudoknots. Our method was tested on 211 pseudoknots in PseudoBase and achieves an average accuracy of 74.085% compared to the experimentally determined structure. ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527147</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527147</guid>        </item>
        <item>
            <title>Simulation study in Probabilistic Boolean Network models for genetic regulatory networks.</title>
            <link>http://www.medworm.com/index.php?rid=1527156&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399072%26dopt%3DAbstract</link>
            <description>Authors: Zhang SQ, Ching WK, Ng MK, Akutsu T
    The Probabilistic Boolean Network (PBN) is widely used to model genetic regulatory networks. A PBN evolves according to its transition probability matrix. Steady-state (long-run behaviour) analysis is a key aspect in studying the dynamics of genetic regulatory networks. In this paper, an efficient method to construct the sparse transition probability matrix is proposed, and the power method based on the sparse matrix-vector multiplication is applied to compute the steady-state probability distribution. Such methods provide a tool for us to study the sensitivity of the steady-state distribution to the influence of input genes, gene connections and Boolean networks. Simulation results based on a real network are given to illustrate the m...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527156</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527156</guid>        </item>
        <item>
            <title>A parallel edge-betweenness clustering tool for Protein-Protein Interaction networks.</title>
            <link>http://www.medworm.com/index.php?rid=1527155&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399073%26dopt%3DAbstract</link>
            <description>Authors: Yang Q, Lonardi S
    The increasing availability of protein-protein interaction (PPI) graphs requires new efficient tools capable of extracting valuable biological knowledge from these networks. Among the wide range of clustering algorithms, Girvan and Newman's edge betweenness algorithm showed remarkable performance in discovering clustering structures in several real-world networks. Unfortunately, their algorithm suffers from high computational cost and is impractical for inputs of the size of large PPI networks. Here we report on a novel parallel implementation of Girvan and Newman's clustering algorithm that achieves almost linear speed-up for up to 32 processors. The tool is available in the public domain from the authors' website.
    PMID: 18399073 [PubMed - indexed fo...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527155</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527155</guid>        </item>
        <item>
            <title>Prediction of Protein Secondary Structure with two-stage multi-class SVMs.</title>
            <link>http://www.medworm.com/index.php?rid=1527154&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399074%26dopt%3DAbstract</link>
            <description>Authors: Nguyen MN, Rajapakse JC
    Bioinformatics techniques for Protein Secondary Structure (PSS) prediction mostly depend on the information available in amino acid sequences. In this paper, we propose a two-stage Multi-class Support Vector Machine (MSVM) approach, where the second MSVM predictor is introduced at the output of the first stage MSVM to capture the contextual relationship among secondary structure elements in order to minimise the generalisation error in the prediction. By using position-specific scoring matrices generated by PSI-BLAST, the two-stage MSVM approach achieves Q3 accuracies of 78.0% and 76.3% on the RS126 dataset of 126 non-homologous globular proteins and the CB396 dataset of 396 non-homologous proteins, respectively, which are better than the scores reported...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527154</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527154</guid>        </item>
        <item>
            <title>Granular kernel trees with parallel genetic algorithms for drug activity comparisons.</title>
            <link>http://www.medworm.com/index.php?rid=1527153&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399075%26dopt%3DAbstract</link>
            <description>Authors: Jin B, Zhang YQ, Wang B
    With the growing interest in biological and chemical data prediction, more powerful and flexible kernels need to be designed so that the prior knowledge and relationships within data can be expressed effectively in kernel functions. In this paper, Granular Kernel Trees (GKTs) are proposed and parallel Genetic Algorithms (GAs) are used to optimise the parameters of GKTs. In applications, SVMs with the new kernel trees are employed for drug activity comparisons. The experimental results show that GKTs and evolutionary GKTs can achieve better performance than traditional RBF kernels in terms of prediction accuracy.
    PMID: 18399075 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527153</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527153</guid>        </item>
        <item>
            <title>Exploring alternative knowledge representations for protein secondary-structure prediction.</title>
            <link>http://www.medworm.com/index.php?rid=1527152&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399076%26dopt%3DAbstract</link>
            <description>Authors: Midic U, Dunker AK, Obradovic Z
    Methods for 3-class secondary-structure prediction are thought to be reaching the highest achievable accuracy. Their accuracy on beta-sheet residue class is considerably lower than for the other two classes. We analysed the relevance of 315 individual input attributes for a predictor with the usual framework of using sequence-profile based data with an input window of fixed size. We propose two alternative knowledge representations with significantly smaller sets of input attributes. We also investigated the possibility of exploiting the prediction of connected pairs of beta-sheet residues and the prediction of residue contact maps for the improvement of accuracy of secondary-structure prediction.
    PMID: 18399076 [PubMed - indexed for MEDLINE...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527152</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527152</guid>        </item>
        <item>
            <title>Simulating the cellular passive transport of glucose using a time-dependent extension of Gillespie algorithm for stochastic pi-calculus.</title>
            <link>http://www.medworm.com/index.php?rid=1527141&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402045%26dopt%3DAbstract</link>
            <description>Authors: Lecca P
    Realistic simulations of the biological systems evolution require a mathematical model of the stochasticity of the involved processes and a formalism for specifying the concurrent nature of the biochemical interactions. A time-dependent extension of the Gillespie algorithm implementing the race condition of the stochastic pi-calculus formalism satisfies both these requirements. This paper formulates those modifications to the original Gillespie algorithm necessary when the time dependence of the reaction propensity is due to changes either of volume or temperature. This re-formulation has been incorporated in the framework of stochastic pi-calculus and has been applied to simulate the passive glucose cellular transport.
    PMID: 18402045 [PubMed - indexed for MEDLINE]...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527141</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527141</guid>        </item>
        <item>
            <title>Transductive learning with EM algorithm to classify proteins based on phylogenetic profiles.</title>
            <link>http://www.medworm.com/index.php?rid=1527140&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402046%26dopt%3DAbstract</link>
            <description>Authors: Craig RA, Liao L
    We proposed a novel method for protein classification based on phylogenetic profiles. Each protein's profile was extended with extra bits encoding the phylogenetic tree structure and the likelihood, in the form of weights on profile indices, of the protein's functional family membership in each of the reference genomes. The extended profiles were then integrated as part of a kernel of a support vector machine, which was trained in a transductive learning scheme using the EM algorithm to update the weights. Classification accuracy was greatly increased when tested on the proteome of Saccharomyces cerevisiae using the MIPS classification as a benchmark.
    PMID: 18402046 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinforma...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527140</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527140</guid>        </item>
        <item>
            <title>A constraint logic programming approach to associate 1D and 3D structural components for large protein complexes.</title>
            <link>http://www.medworm.com/index.php?rid=1527139&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402047%26dopt%3DAbstract</link>
            <description>Authors: Dal Pal&#xF9; A, Pontelli E, He J, Lu Y
    The paper describes a novel framework, constructed using Constraint Logic Programming (CLP) and parallelism, to determine the association between parts of the primary sequence of a protein and alpha-helices extracted from 3D low-resolution descriptions of large protein complexes. The association is determined by extracting constraints from the 3D information, regarding length, relative position and connectivity of helices, and solving these constraints with the guidance of a secondary structure prediction algorithm. Parallelism is employed to enhance performance on large proteins. The framework provides a fast, inexpensive alternative to determine the exact tertiary structure of unknown proteins.
    PMID: 18402047 [PubMed - indexed for ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527139</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527139</guid>        </item>
        <item>
            <title>A Merge-Decoupling Dead End Elimination algorithm for protein side-chain conformation.</title>
            <link>http://www.medworm.com/index.php?rid=1527138&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402048%26dopt%3DAbstract</link>
            <description>We present a Merge-Decoupling DEE (MD-DEE) that further reduces the number of rotamers after SG-DEE. MD-DEE works by forming residue-pairs but is fast and, like SG-DEE, is practical even for large proteins. Our experiments show that MD-DEE achieves further reduction in residue elimination (up to 25%) after SG-DEE.
    PMID: 18402048 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527138</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527138</guid>        </item>
        <item>
            <title>Biomedical text summarisation using concept chains.</title>
            <link>http://www.medworm.com/index.php?rid=1527137&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402049%26dopt%3DAbstract</link>
            <description>Authors: Reeve LH, Han H, Brooks AD
    BioChainSumm is a biomedical text summariser utilising concept chaining (called BioChain) to link semantically-related concepts within biomedical text together. The BioChain process is adapted from existing lexical chaining approaches which chain semantically-related terms rather than concepts. The BioChain concept chains are used to identify salient candidate sentences which are extracted to produce a summary of the biomedical text. The Unified Medical Language System Metathesaurus and Semantic Network semantic resources identify related biomedical concepts. BioChainSumm is evaluated using the ROUGE system along with several existing, publicly-available summarisers. Our results show BioChain provides a promising methodology for biomedical text summa...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527137</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527137</guid>        </item>
        <item>
            <title>Dynamic algorithm for inferring qualitative models of Gene Regulatory Networks.</title>
            <link>http://www.medworm.com/index.php?rid=1527162&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399066%26dopt%3DAbstract</link>
            <description>Authors: Zheng Y, Kwoh CK
    We introduce a novel algorithm, DFL (Discrete Function Learning), for reconstructing qualitative models of Gene Regulatory Networks (GRNs) from gene expression data in this paper. We analyse its complexity of O(k × N × n^2) on the average and its data requirements. The experiments of synthetic Boolean networks show that the DFL algorithm is more efficient than current algorithms without loss of prediction performances. The results of yeast cell cycle gene expression data show that the DFL algorithm can identify biologically significant models with reasonable accuracy, sensitivity and high precision with respect to the literature evidences.
    PMID: 18399066 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527162</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527162</guid>        </item>
        <item>
            <title>Improving domain-based protein interaction prediction using biologically significant negative datasets.</title>
            <link>http://www.medworm.com/index.php?rid=1527161&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399067%26dopt%3DAbstract</link>
            <description>Authors: Li XL, Tan SH, Ng SK
    We propose a domain-based classification method to predict protein-protein interactions using probabilities of putative interacting domain pairs derived from both experimentally-determined interacting protein pairs and carefully-chosen non-interacting protein pairs. Multi-species comparative results for protein interaction prediction show that such careful generation of biologically meaningful negative training data can improve classification performance.
    PMID: 18399067 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527161</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527161</guid>        </item>
        <item>
            <title>Spectral similarity for analysis of DNA microarray time-series data.</title>
            <link>http://www.medworm.com/index.php?rid=1527160&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399068%26dopt%3DAbstract</link>
            <description>Authors: Yan H, Pham T
    This paper proposes a new similarity measurement for comparison and analysis of DNA microarray time-series data. In this method, a gene expression time series is decomposed into frequency components and the correlation between the data from a pair of genes is measured in the frequency domain. The method effectively solves the phase delay problem and provides a more accurate metric for microarray time-series classification.
    PMID: 18399068 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527160</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527160</guid>        </item>
        <item>
            <title>Transitive closure and metric inequality of weighted graphs: detecting protein interaction modules using cliques.</title>
            <link>http://www.medworm.com/index.php?rid=1527159&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399069%26dopt%3DAbstract</link>
            <description>Authors: Ding C, He X, Xiong H, Peng H, Holbrook SR
    We study transitivity properties of edge weights in complex networks. We show that enforcing transitivity leads to a transitivity inequality which is equivalent to ultra-metric inequality. This can be used to define transitive closure on weighted undirected graphs, which can be computed using a modified Floyd-Warshall algorithm. These new concepts are extended to dissimilarity graphs and triangle inequalities. From this, we extend the clique concept from unweighted graph to weighted graph. We outline several applications and present results of detecting protein functional modules in a protein interaction network.
    PMID: 18399069 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527159</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527159</guid>        </item>
        <item>
            <title>BAG: a graph theoretic sequence clustering algorithm.</title>
            <link>http://www.medworm.com/index.php?rid=1527158&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399070%26dopt%3DAbstract</link>
            <description>Authors: Kim S, Lee J
    In this paper, we first discuss issues in clustering biological sequences with graph properties, which inspired the design of our sequence clustering algorithm BAG. BAG recursively utilises several graph properties: biconnectedness, articulation points, p-quasi-completeness, and domain knowledge specific to biological sequence clustering. To reduce the fragmentation issue, we have developed a new metric called cluster utility to guide cluster splitting. Clusters are then merged back with less stringent constraints. Experiments with the entire COG database and other sequence databases show that BAG can cluster a large number of sequences accurately while keeping the number of fragmented clusters significantly low.
    PMID: 18399070 [PubMed - indexed for MEDLINE] (S...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527158</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527158</guid>        </item>
        <item>
            <title>An efficient motif discovery algorithm with unknown motif length and number of binding sites.</title>
            <link>http://www.medworm.com/index.php?rid=1527157&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399071%26dopt%3DAbstract</link>
            <description>Authors: Leung HC, Chin FY
    Most motif discovery algorithms from DNA sequences require the motif's length as input. Styczynski et al. introduced the Extended (l,d)-Motif Problem (EMP) where the motif's length is not an input parameter. Unfortunately, their algorithm takes an unacceptably long time to run, e.g. over 3 months to discover a length-14 motif. Since the best motif may not be the longest nor have the largest number of binding sites, in this paper we further eliminate another input parameter about the minimum number of binding sites in order to provide more realistic/robust results. We also develop an efficient algorithm to solve EMP and this redefined problem.
    PMID: 18399071 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527157</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527157</guid>        </item>
        <item>
            <title>Adaptive Fuzzy Association Rule mining for effective decision support in biomedical applications.</title>
            <link>http://www.medworm.com/index.php?rid=1527146&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402040%26dopt%3DAbstract</link>
            <description>Authors: He Y, Tang Y, Zhang YQ, Sunderraman R
    Due to the complexity of biomedical classification problems, it is impossible to build a perfect classifier with 100% prediction accuracy. Hence a more realistic target is to build an effective Decision Support System (DSS). Here 'effective' means that a DSS should not only predict unseen samples accurately, but also work in a human-understandable way. In this paper, we propose a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, to build such a DSS for binary classification problems in the biomedical domain. In the training phase, four steps are executed to mine FARs, which are thereafter used to predict unseen samples in the testing phase. The new FARM-DS algorithm is evaluated on two publicly available medical da...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527146</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527146</guid>        </item>
        <item>
            <title>Bi-level clustering of mixed categorical and numerical biomedical data.</title>
            <link>http://www.medworm.com/index.php?rid=1527145&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402041%26dopt%3DAbstract</link>
            <description>We present the BILCOM algorithm for 'Bi-Level Clustering of Mixed categorical and numerical data types'. BILCOM performs a pseudo-Bayesian process, where the prior is categorical clustering. BILCOM partitions biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, more accurately than if using one type alone.
    PMID: 18402041 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527145</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527145</guid>        </item>
        <item>
            <title>Kernel design for RNA classification using Support Vector Machines.</title>
            <link>http://www.medworm.com/index.php?rid=1527144&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402042%26dopt%3DAbstract</link>
            <description>Authors: Wang JT, Wu X
    Support Vector Machines (SVMs) are a state-of-the-art machine learning tool widely used in speech recognition, image processing and biological sequence analysis. An essential step in SVMs is to devise a kernel function to compute the similarity between two data points. In this paper we review recent advances of using SVMs for RNA classification. In particular we present a new kernel that takes advantage of both global and local structural information in RNAs and uses the information together to classify RNAs. Experimental results demonstrate the good performance of the new kernel and show that it outperforms existing kernels when applied to classifying non-coding RNA sequences.
    PMID: 18402042 [PubMed - indexed for MEDLINE] (Source: International Journal of Da...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527144</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527144</guid>        </item>
        <item>
            <title>State-space approach with the maximum likelihood principle to identify the system generating time-course gene expression data of yeast.</title>
            <link>http://www.medworm.com/index.php?rid=1527143&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402043%26dopt%3DAbstract</link>
            <description>Authors: Yamaguchi R, Higuchi T
    We use linear Gaussian state-space models to analyse time-course gene expression data of yeast. They are modelled to be generated from hidden state variables in a system. To identify the system, we estimate parameters of the model by EM algorithm and determine the dimension of the state variable by BIC.
    PMID: 18402043 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527143</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527143</guid>        </item>
        <item>
            <title>Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes.</title>
            <link>http://www.medworm.com/index.php?rid=1527142&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402044%26dopt%3DAbstract</link>
            <description>Authors: Liu Y, Navathe SB, Pivoshenko A, Dasigi VG, Dingledine R, Cilia BJ
    One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score wei...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527142</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527142</guid>        </item>
    </channel>
</rss>

