<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="FeedCreator 1.7.2" -->
<rss version="2.0">
    <channel>
        <title>International Journal of Data Mining and Bioinformatics via MedWorm.com</title>
        <description>MedWorm.com provides a medical RSS filtering service. Over 6000 RSS medical sources are combined and output via different filters. This feed contains the latest items from the 'International Journal of Data Mining and Bioinformatics' source.</description>
        <link><![CDATA[http://www.medworm.com/rss/search.php?qu=International+Journal+of+Data+Mining+and+Bioinformatics&t=International+Journal+of+Data+Mining+and+Bioinformatics&s=Search&f=source]]></link>
        <lastBuildDate>Sat, 10 Oct 2009 19:32:23 +0100</lastBuildDate>
        <item>
            <title>A semi-supervised approach to projected clustering with applications to microarray data.</title>
            <link>http://www.medworm.com/index.php?rid=2640291&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623769%26dopt%3DAbstract</link>
            <description>Authors: Yip KY, Cheung L, Cheung DW, Jing L, Ng MK
    Recent studies have suggested that extremely low dimensional projected clusters exist in real datasets. Here, we propose a new algorithm for identifying them. It combines object clustering and dimension selection, and allows the input of domain knowledge in guiding the clustering process. Theoretical and experimental results show that even a small amount of input knowledge could already help detect clusters with only 1% of the relevant dimensions. We also show that this semi-supervised algorithm can perform knowledge-guided selective clustering when there are multiple meaningful object groupings. The algorithm is also shown effective in analysing a microarray dataset.
    PMID: 19623769 [PubMed - in process] (Source: International Jou...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640291</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640291</guid>        </item>
        <item>
            <title>Clustering sequences by overlap.</title>
            <link>http://www.medworm.com/index.php?rid=2640290&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623770%26dopt%3DAbstract</link>
            <description>Authors: Dorr DH, Denton AM
    A clustering algorithm is introduced that combines the strengths of clustering and motif finding techniques. Clusters are identified based on unambiguously defined sequence sections as in motif finding algorithms. The definition of similarity within clusters allows transitive matches and, thereby, enables the discovery of remote homologies that cannot be found through motif-finding algorithms. Directed Acyclic Graph (DAG) structures are constructed that link short clusters to the longer ones. We compare the clustering results to the corresponding domains in the InterPro database. A second comparison shows that annotations based on our domains are inherently more consistent than those based on InterPro domains.
    PMID: 19623770 [PubMed - in process] (Source...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640290</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640290</guid>        </item>
        <item>
            <title>Stroma classification for neuroblastoma on graphics processors.</title>
            <link>http://www.medworm.com/index.php?rid=2640289&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623771%26dopt%3DAbstract</link>
            <description>Authors: Ruiz A, Sertel O, Ujald&#xF3;n M, Catalyurek U, Saltz J, Gurcan MN
    Neuroblastoma is one of the most common childhood cancers. We are developing an image analysis system to assist pathologists in their prognosis. Since this system operates on relatively large-scale images and requires sophisticated algorithms, computerised analysis takes a long time to execute. In this paper, we propose a novel approach to benefit from high memory bandwidth and strong floating-point capabilities of graphics processing units. The proposed approach achieves a promising classification accuracy of 99.4% and an execution performance with a gain factor up to 45 times compared to hand-optimised C++ code running on the CPU.
    PMID: 19623771 [PubMed - in process] (Source: International Journal of Data...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640289</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640289</guid>        </item>
        <item>
            <title>Clinical text classification under the Open and Closed Topic Assumptions.</title>
            <link>http://www.medworm.com/index.php?rid=2640288&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623772%26dopt%3DAbstract</link>
            <description>Authors: Sasaki Y, Rea B, Ananiadou S
    This paper investigates multi-topic aspects in automatic classification of clinical free text in comparison with general text. We consider two different views on multi-topics: the Closed Topic Assumption (CTA) and the Open Topic Assumption (OTA). Experimental results show that the characteristics of multi-topic assignments in the Computational Medicine Centre (CMC) Medical NLP Challenge Data are strongly OTA-oriented, whereas the general text Reuters-21578 lies in the middle of the OTA and CTA spectrum.
    PMID: 19623772 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640288</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640288</guid>        </item>
        <item>
            <title>Tracking multiple interacting subcellular structure by sequential Monte Carlo method.</title>
            <link>http://www.medworm.com/index.php?rid=2640287&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623773%26dopt%3DAbstract</link>
            <description>Authors: Wen Q, Luby-Phelps K, Gao J
    With the wide application of Green Fluorescent Proteins (GFP) in the study of live cells, there is a surging need for computer-aided analysis of the huge amount of image sequence data acquired by advanced microscopy devices. In this paper, a framework based on Sequential Monte Carlo (SMC) is proposed for multiple interacting object tracking. The distribution of the dimension-varying joint state is sampled efficiently by a Reversible Jump Markov Chain Monte Carlo (RJMCMC) algorithm with a novel height swap move. Experiments were performed on synthetic and real confocal microscopy image sequences.
    PMID: 19623773 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640287</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640287</guid>        </item>
        <item>
            <title>Semantic similarity based feature extraction from microarray expression data.</title>
            <link>http://www.medworm.com/index.php?rid=2640286&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623774%26dopt%3DAbstract</link>
            <description>Authors: Cho YR, Zhang A, Xu X
    Previous studies have proven that it is feasible to build sample classifiers using gene expression profiles. To build an effective sample classifier, a dimension reduction process is necessary since classic pattern recognition algorithms do not work well in high dimensional space. In this paper, we present a novel feature extraction algorithm that integrates microarray expression data with Gene Ontology (GO). Applying semantic similarity measures, we identify groups of genes, called virtual genes, which potentially interact with each other for a biological function. The correlation in expressions of virtual genes is used to classify samples. For colon cancer data, this approach significantly improved the classification accuracy by more than 10%.
    PMID...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640286</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640286</guid>        </item>
        <item>
            <title>An effective convergence independent loop closure method using Forward-Backward Cyclic Coordinate Descent.</title>
            <link>http://www.medworm.com/index.php?rid=2640285&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19623775%26dopt%3DAbstract</link>
            <description>Authors: Al-Nasr K, He J
    Cyclic Coordinate Descent (CCD) is a popular robotic approach to generate a possible loop that closes the gap between two constrained portions of a protein chain (Canutescu and Dunbrack 2003). In this paper, we describe an effective Forward-Backward CCD (FBCCD) method to connect the two constrained portions of a protein chain without requiring the loop to converge. A test of 30 loops of length 4, 8 and 12 suggests that, compared to the CCD method, our method takes fewer cycles to produce loops of comparable accuracy and yields a more accurate second portion of the chain.
    PMID: 19623775 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2640285</comments>
            <pubDate>Mon, 27 Jul 2009 00:42:02 +0100</pubDate>
            <guid isPermaLink="false">2640285</guid>        </item>
        <item>
            <title>22nd annual ACM symposium on applied computing.</title>
            <link>http://www.medworm.com/index.php?rid=2473318&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432372%26dopt%3DAbstract</link>
            <description>Authors: Palakal M
    
    PMID: 19432372 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473318</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473318</guid>        </item>
        <item>
            <title>A cube framework for incorporating inter-gene information into biological data mining.</title>
            <link>http://www.medworm.com/index.php?rid=2473315&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432373%26dopt%3DAbstract</link>
            <description>Authors: Lin KM, Kang J, Shin H, Lee J
    Large volumes of microarray data are registered daily in public repositories such as SMD (Belkin and Niyogi, 2003) and GEO (Ashburner et al., 2000). Such repositories have quickly become a community resource. However, due to the inherent heterogeneity of the microarray experiments, the data generated from different experiments could not be directly integrated and hence the resources have not been fully utilised. To address this problem, we propose a new microarray integration framework that achieves high-quality integration through exploiting invariant features such as relative information among genes. We also show how the proposed approach generalises the previous frameworks.
    PMID: 19432373 [PubMed - indexed for MEDLINE] (Source: Internationa...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473315</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473315</guid>        </item>
        <item>
            <title>Finding new core promoter elements using backward-looking strategies.</title>
            <link>http://www.medworm.com/index.php?rid=2473312&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432374%26dopt%3DAbstract</link>
            <description>Authors: Huang YF, Jhan YC, Liou SW
    Core Promoter Elements (CPEs) are key players in transcription initiation. Identifying CPEs is crucial for understanding gene expression. In this paper, a framework for finding new CPEs is proposed. An experiment was performed on the sequences of the Eukaryotic Promoter Database (EPD). In the results, the known CPEs were all recovered; in addition, five new motifs were discovered in Drosophila and three in human. By comparing the results with currently known CPEs, it is shown that the proposed system is feasible and reliable, and that these new CPEs are worthy of further exploration.
    PMID: 19432374 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473312</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473312</guid>        </item>
        <item>
            <title>An on demand data integration model for biological databases.</title>
            <link>http://www.medworm.com/index.php?rid=2473309&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432375%26dopt%3DAbstract</link>
            <description>Authors: Palakal M, Naidu P
    This paper presents a user-centric biological query system for information integration and knowledge acquisition from distributed, semantically heterogeneous data sources. The proposed system, BioXBase, extracts user requested query information over the internet from multiple biological sources and organises this information into a homogeneous unified view to the user. This entire process is done in real time on-the-fly. The BioXBase system has improved the results retrieved by 30% compared to a system that has only a local database. The BioXBase system is further enhanced by 20% while combining the results with a local database, making the results more significant in biological domain.
    PMID: 19432375 [PubMed - indexed for MEDLINE] (Source: International...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473309</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473309</guid>        </item>
        <item>
            <title>Predicting protein-protein interfaces as clusters of optimal docking area points.</title>
            <link>http://www.medworm.com/index.php?rid=2473306&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432376%26dopt%3DAbstract</link>
            <description>Authors: Arafat Y, Kamruzzaman J, Karmakar GC, Fernandez-Recio J
    The desolvation property is used here to predict protein-protein binding sites, exploiting the fact that lower-valued 'optimal docking area' (ODA) points (Fernandez-Recio et al., 2005) form clusters at the interface. The proposed method involves two steps: clustering the ODA points and representing ODA points by average ODA values. On 51 nonredundant proteins, results show that the success rate improved considerably. Considering only significant ODA, the previous ODA method obtained a success rate of 65% with an overall success rate of 39%. The proposed method improved the overall success rate to 61%. Further, comparable results were found for X-ray and NMR structures.
    PMID: 19432376 [PubMed - indexed for MEDLINE] (Source: Intern...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473306</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473306</guid>        </item>
        <item>
            <title>A hybrid graph-theoretic method for mining overlapping functional modules in large sparse protein interaction networks.</title>
            <link>http://www.medworm.com/index.php?rid=2473303&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432377%26dopt%3DAbstract</link>
            <description>Authors: Zhang S, Liu HW, Ning XM, Zhang XS
    Modular architecture, which encompasses groups of genes/proteins involved in elementary biological functional units, is a basic form of the organisation of interacting proteins. Here, we propose a method that combines the Line Graph Transformation (LGT) and a clique percolation clustering algorithm to detect network modules, which may overlap each other, in large sparse PPI networks. The modules produced by the present method show high coverage of the yeast, fly, and worm PPI networks. Our analysis of the yeast PPI network suggests that most of these modules have clear biological significance in the context of protein localisation, function annotation, and protein complexes.
    PMID: 19432377 [PubMed - indexed for MEDLINE] (Source: I...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473303</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473303</guid>        </item>
        <item>
            <title>Irrelevant gene elimination for partial least squares based dimension reduction by using feature probes.</title>
            <link>http://www.medworm.com/index.php?rid=2473300&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19432378%26dopt%3DAbstract</link>
            <description>Authors: Zeng XQ, Li GZ, Wu GF, Yang JY, Yang MQ
    It is hard to analyse gene expression data that have only a few observations but thousands of measured genes. Partial Least Squares based Dimension Reduction (PLSDR) is superior for handling such high dimensional problems, but irrelevant features will introduce errors into the dimension reduction process. Here, feature selection is applied to filter the data and an algorithm named PLSDRg is described that integrates PLSDR with gene elimination, which is guided by t-statistic scores on standardised probes. Experimental results on six microarray data sets show that PLSDRg is effective and reliable in improving the generalisation performance of classifiers.
    PMID: 19432378 [PubMed - indexed for MEDLINE] (Source: Intern...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473300</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473300</guid>        </item>
        <item>
            <title>Discovering implicit associations among critical biological entities.</title>
            <link>http://www.medworm.com/index.php?rid=2473297&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517984%26dopt%3DAbstract</link>
            <description>Authors: Seki K, Mostafa J
    We propose an approach to predicting implicit gene-disease associations based on the inference network, whereby genes and diseases are represented as nodes and are connected via two types of intermediate nodes: gene functions and phenotypes. To estimate the probabilities involved in the model, two learning schemes are compared; one baseline using co-annotations of keywords and the other taking advantage of free text. Additionally, we explore the use of domain ontologies to complement data sparseness and examine the impact of full text documents. The validity of the proposed framework is demonstrated on the benchmark data set created from real-world data.
    PMID: 19517984 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473297</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473297</guid>        </item>
        <item>
            <title>Double iterative optimisation for metabolic network-based drug target identification.</title>
            <link>http://www.medworm.com/index.php?rid=2473294&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517985%26dopt%3DAbstract</link>
            <description>We present novel and scalable algorithms for finding a set of enzymes, whose inhibition stops the production of a given set of target compounds, while eliminating minimal number of non-target compounds. Experimental results are presented for the E. coli metabolic network to demonstrate the accuracy and efficiency of our iterative method.
    PMID: 19517985 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473294</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473294</guid>        </item>
        <item>
            <title>Study of microarray time series data based on Forward-Backward Linear Prediction and Singular Value Decomposition.</title>
            <link>http://www.medworm.com/index.php?rid=2473292&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517986%26dopt%3DAbstract</link>
            <description>Authors: Choong MK, Levy D, Yan H
    We propose a method to analyse the periodicities of gene expression profiles based on the spectral domain approach. Our spectral reconstruction method outperforms three other recently proposed methods, which do not require any prior knowledge. It is proven that an alternative method for studying cell-cycle regulation is possible even where very little prior knowledge is available. We also investigate the potential of combining signals with similar frequency components to form an overdetermined system of equations, and use least squares solution to estimate the spectral frequency. Results show that this new method is able to estimate the peak frequency more accurately.
    PMID: 19517986 [PubMed - in process] (Source: International Journal of Data Minin...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473292</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473292</guid>        </item>
        <item>
            <title>Computational identification of protein-coding sequences by comparative analysis.</title>
            <link>http://www.medworm.com/index.php?rid=2473289&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517987%26dopt%3DAbstract</link>
            <description>Authors: Fontaine A, Touzet H
    Gene prediction is an essential step in understanding the genome of a species once it has been sequenced. For that, a promising direction in current research on gene finding is a comparative genomics approach. In this paper, we present a novel approach to identifying evolutionarily conserved protein-coding sequences in genomes. The method takes advantage of the specific substitution pattern of coding sequences together with the consistency of reading frames. It has been implemented in a software called PROTEA. Large-scale experimentation shows good results. PROTEA is intended to be a useful complement to existing tools based on homology search or statistical properties of the sequences.
    PMID: 19517987 [PubMed - in process] (Source: International Journa...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473289</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473289</guid>        </item>
        <item>
            <title>Feature cluster selection for high-throughput data analysis.</title>
            <link>http://www.medworm.com/index.php?rid=2473287&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517988%26dopt%3DAbstract</link>
            <description>Authors: Yu L
    Feature selection is effective in selecting predictive gene sets for microarray classification. However, the large number of predictive gene sets and the disparity among them presents a challenge for identifying potential biomarkers. To facilitate biomarker identification, we present a new data mining task, feature cluster selection, which selects from a full set of features a small number of coherent and predictive feature clusters. We provide both theoretical definition and empirical formulation for the new problem, and propose an efficient 3M algorithm. Experiments on microarray data have shown that the 3M algorithm can select predictive and statistically significant gene clusters.
    PMID: 19517988 [PubMed - in process] (Source: International Journal of Data Mining a...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473287</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473287</guid>        </item>
        <item>
            <title>A space-efficient algorithm for three sequence alignment and ancestor inference.</title>
            <link>http://www.medworm.com/index.php?rid=2473286&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517989%26dopt%3DAbstract</link>
            <description>Authors: Yue F, Tang J
    We propose a novel algorithm to simultaneously align three biological sequences with affine gap model and infer their common ancestral sequence. It applies the divide-and-conquer strategy to reduce the memory usage from O(n^3) to O(n^2). At the same time, it is based on dynamic programming and thus the optimal alignment is guaranteed. We implemented the algorithm and tested it extensively with both BAliBASE dataset and simulation data generated by Random Model of Sequence Evolution (ROSE). Compared with other popular multiple sequence alignment tools such as ClustalW and T-Coffee, our program produces not only better alignment, but also better ancestral sequence.
    PMID: 19517989 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformati...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473286</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473286</guid>        </item>
        <item>
            <title>Spherical-harmonic decomposition for molecular recognition in electron-density maps.</title>
            <link>http://www.medworm.com/index.php?rid=2473280&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19517990%26dopt%3DAbstract</link>
            <description>Authors: DiMaio FP, Soni AB, Phillips GN, Shavlik JW
    Several methods for automatically constructing a protein model from an electron-density map require searching for many small protein-fragment templates in the density. We propose to use the spherical-harmonic decomposition of the template and the map's density to speed this matching. Unlike other template-matching approaches, this allows us to eliminate large portions of the map unlikely to match any templates. We train several first-pass filters for this elimination task. We show our new template-matching method improves accuracy and reduces running time, compared to previous approaches. Finally, we extend our method to produce a structural-homology detection algorithm using electron density.
    PMID: 19517990 [PubMed - in process] ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2473280</comments>
            <pubDate>Sat, 13 Jun 2009 16:15:02 +0100</pubDate>
            <guid isPermaLink="false">2473280</guid>        </item>
        <item>
            <title>Word Sense Disambiguation in biomedical ontologies with term co-occurrence analysis and document clustering.</title>
            <link>http://www.medworm.com/index.php?rid=1991660&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024494%26dopt%3DAbstract</link>
            <description>Authors: Andreopoulos B, Alexopoulou D, Schroeder M
    With more and more genomes being sequenced, a lot of effort is devoted to their annotation with terms from controlled vocabularies such as the GeneOntology. Manual annotation based on relevant literature is tedious, but automation of this process is difficult. One particularly challenging problem is word sense disambiguation. Terms such as 'development' can refer to developmental biology or to the more general sense. Here, we present two approaches to address this problem by using term co-occurrences and document clustering. To evaluate our method we defined a corpus of 331 documents on development and developmental biology. Term co-occurrence analysis achieves an F-measure of 77%. Additionally, applying document clustering improves p...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991660</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991660</guid>        </item>
        <item>
            <title>Scoring and summarising gene product clusters using the Gene Ontology.</title>
            <link>http://www.medworm.com/index.php?rid=1991659&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024495%26dopt%3DAbstract</link>
            <description>Authors: Denaxas SC, Tjortjis C
    We propose an approach for quantifying the biological relatedness between gene products, based on their properties, and measure their similarities using exclusively statistical NLP techniques and Gene Ontology (GO) annotations. We also present a novel similarity figure of merit, based on the vector space model, which assesses gene expression analysis results and scores gene product clusters' biological coherency, making sole use of their annotation terms and textual descriptions. We define query profiles which rapidly detect a gene product cluster's dominant biological properties. Experimental results validate our approach, and illustrate a strong correlation between our coherency score and gene expression patterns.
    PMID: 19024495 [PubMed - in proces...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991659</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991659</guid>        </item>
        <item>
            <title>Sparse p-norm Nonnegative Matrix Factorization for clustering gene expression data.</title>
            <link>http://www.medworm.com/index.php?rid=1991658&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024496%26dopt%3DAbstract</link>
            <description>Authors: Liu W, Yuan K
    Nonnegative Matrix Factorization (NMF) is a powerful tool for gene expression data analysis as it reduces thousands of genes to a few compact metagenes, especially in clustering gene expression samples for cancer class discovery. Enhancing sparseness of the factorisation can find only a few dominantly coexpressed metagenes and improve the clustering effectiveness. Sparse p-norm (p &gt; 1) Nonnegative Matrix Factorization (Sp-NMF) is a more sparse representation method using high order norm to normalise the decomposed components. In this paper, we investigate the benefit of high order normalisation for clustering cancer-related gene expression samples. Experimental results demonstrate that Sp-NMF leads to robust and effective clustering in both automatically deter...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991658</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991658</guid>        </item>
        <item>
            <title>A Bayesian framework for knowledge driven regression model in micro-array data analysis.</title>
            <link>http://www.medworm.com/index.php?rid=1991657&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024497%26dopt%3DAbstract</link>
            <description>We presented a full Bayesian framework to effectively exploit the similarity information of the input variables for linear regression. Empirical studies with gene expression data show that the regression errors can be reduced significantly by incorporating the similarity information derived from gene ontology.
    PMID: 19024497 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991657</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991657</guid>        </item>
        <item>
            <title>Classification techniques with minimal labelling effort and application to medical reports.</title>
            <link>http://www.medworm.com/index.php?rid=1991656&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D19024498%26dopt%3DAbstract</link>
            <description>Authors: Saad FH, Bell GD, de la Iglesia B
    There are a number of approaches to classify text documents. Here, we use Partially Supervised Classification (PSC) and argue that it is an effective and efficient approach for real-world problems. PSC uses a two-step strategy to cut down on the labelling effort. There are a number of methods that have been proposed for each step. An evaluation of various methods is conducted using real-world medical documents. The results show that using EM to build the classifier yields better results than SVM. We also experimentally show that careful selection of a subset of features to represent the documents can improve performance.
    PMID: 19024498 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1991656</comments>
            <pubDate>Thu, 27 Nov 2008 12:32:20 +0100</pubDate>
            <guid isPermaLink="false">1991656</guid>        </item>
        <item>
            <title>Message Passing Clustering (MPC): a knowledge-based framework for clustering under biological constraints.</title>
            <link>http://www.medworm.com/index.php?rid=1769191&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767349%26dopt%3DAbstract</link>
            <description>Authors: Geng H, Deng X, Ali HH
    A new clustering algorithm, Message Passing Clustering (MPC), is proposed. MPC employs the concept of message passing to describe a parallel and spontaneous clustering process by allowing data objects to communicate with each other. MPC also provides an extensible framework to accommodate additional features into clustering, such as adaptive feature weights scaling, stochastic cluster merging, and semi-supervised constraints guiding. Extensive experiments were performed using both simulation and real microarray gene expression and phylogenetic data. The results showed that MPC compared favourably with other popular clustering algorithms and that MPC with the integration of additional features gave an even higher accuracy rate than MPC alone.
    PMID: 18767349 [PubMed - ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769191</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769191</guid>        </item>
        <item>
            <title>Identification of Intrinsically Unstructured Proteins using hierarchical classifier.</title>
            <link>http://www.medworm.com/index.php?rid=1769190&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767350%26dopt%3DAbstract</link>
            <description>Authors: Yang JY, Yang MQ
    It has been suggested that a protein functions only when folded into a particular 3-D structure. Recently, many protein regions and some entire proteins have been identified with no definite tertiary structure, but presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured regions and Proteins (IUP). We constructed a Recursive Maximum Contrast Tree (RMCT) based classifier to identify IUP. The classifier has been benchmarked against the industry-standard PONDR VLXT on out-of-sample data by external evaluators. The IUP predictor is a viable alternative software tool for identifying intrinsically unstructured regions and proteins.
    PMID: 18767350 [PubMed - in process] (So...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769190</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769190</guid>        </item>
        <item>
            <title>Handling gene redundancy in microarray data using Grey Relational Analysis.</title>
            <link>http://www.medworm.com/index.php?rid=1769189&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767351%26dopt%3DAbstract</link>
            <description>Authors: Zhang LJ, Li ZJ, Chen HW
    Gene selection is one of the important and frequently used techniques for microarray data classification. In this paper, we introduce a new metric to measure gene-class relevance and gene-gene redundancy. The new metric, called Grey Relational Grade (GRG), is based on Grey Relational Analysis (GRA) and has not been used in gene selection before. Based on the GRG, we develop a new gene selection method, which uses GRG to group similar genes into clusters and then selects informative genes from each cluster to avoid redundancy. Experiments on public data sets demonstrate the effectiveness of the proposed method.
    PMID: 18767351 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769189</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769189</guid>        </item>
        <item>
            <title>Large-scale Protein-Protein Interaction prediction using novel kernel methods.</title>
            <link>http://www.medworm.com/index.php?rid=1769188&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767352%26dopt%3DAbstract</link>
            <description>Authors: Chen XW, Han B, Fang J, Haasl RJ
    Knowledge of Protein-Protein Interactions (PPIs) can give us new insights into molecular mechanisms and properties of the cell. In this paper, we propose a novel domain-based kernel method to predict PPIs. A new kernel that measures the similarity between protein pairs based on a new feature representation is developed and applied to a large scale PPI database. Experimental results demonstrate its effectiveness. Furthermore, we evaluate the problem of cross-species PPI prediction and the effect of the number of negative samples on the performance of PPI predictions, which are two fundamental problems in most in silico PPI methods.
    PMID: 18767352 [PubMed - in process] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769188</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769188</guid>        </item>
        <item>
            <title>Protein homology detection with biologically inspired features and interpretable statistical models.</title>
            <link>http://www.medworm.com/index.php?rid=1769187&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767353%26dopt%3DAbstract</link>
            <description>Authors: Huang PH, Pavlovic V
    Computational classification of proteins using methods such as string kernels and Fisher-SVM has demonstrated great success. However, the resulting models do not offer an immediate interpretation of the underlying biological mechanisms. In this work, we propose a biologically motivated feature set combined with a sparse classifier, based on a small subset of positions and residues in protein sequences, for protein superfamily detection and show the performance of our models is comparable to that of the state-of-the-art methods on a benchmark dataset. The set of sparse critical features discovered by the models is consistent with the confirmed biological findings.
    PMID: 18767353 [PubMed - in process] (Source: International Journal of Data Mining and Bio...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769187</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769187</guid>        </item>
        <item>
            <title>Discovery of metabolite features for the modelling and analysis of high-resolution NMR spectra.</title>
            <link>http://www.medworm.com/index.php?rid=1769186&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18767354%26dopt%3DAbstract</link>
            <description>This study presents three feature selection methods for identifying the metabolite features in nuclear magnetic resonance spectra that contribute to the distinction of samples among varying nutritional conditions. Principal component analysis, Fisher discriminant analysis, and Partial Least Squares Discriminant Analysis (PLS-DA) were used to calculate the importance of individual metabolite features in the spectra. Moreover, an Orthogonal Signal Correction (OSC) filter was used to eliminate unnecessary variations in the spectra. We evaluated the presented methods by comparing the classification ability of the features selected by each method. The results showed that the best classification was achieved by an OSC-PLS-DA model.
    PMID: 18767354 [PubMed - in process] (Source: International ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1769186</comments>
            <pubDate>Sat, 06 Sep 2008 11:38:16 +0100</pubDate>
            <guid isPermaLink="false">1769186</guid>        </item>
        <item>
            <title>Gene Regulatory Network modelling: a state-space approach.</title>
            <link>http://www.medworm.com/index.php?rid=1527151&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399325%26dopt%3DAbstract</link>
            <description>This study proposes a state-space model with control portion for inferring Gene Regulatory Networks (GRNs). The proposed model views genes as the observation variables, whose expression values depend on the current internal state variables and control variables, and views the means of clusters of gene expression as the control variables of the internal state equation. Bayesian Information Criterion (BIC) and Probabilistic Principal Component Analysis (PPCA) are used to estimate the internal states from observation data. The proposed approach is applied to two gene expression datasets. Computational results show that the inferred GRNs possess the characteristics of real-life GRNs.
    PMID: 18399325 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinform...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527151</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527151</guid>        </item>
        <item>
            <title>Segmentation of short human exons based on spectral features of double curves.</title>
            <link>http://www.medworm.com/index.php?rid=1527150&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399326%26dopt%3DAbstract</link>
            <description>Authors: Jiang R, Yan H
    This paper presents a new segmentation method based on spectral analysis to locate borders between short protein coding regions and non-coding regions. We formulate the innovative double curve representation of a DNA sequence and apply local three-codon measurement on the discrete Fourier spectral features at 1/3 frequency to identify short protein coding regions. The proposed spectral segmentation method based on double curves requires no prior knowledge of the DNA data. Our simulation results show that the proposed spectral method greatly improves the accuracy of identifying short coding regions in DNA sequences compared with the results obtained from the other methods that analyse DNA sequences directly.
    PMID: 18399326 [PubMed - indexed for MEDLINE] (Sour...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527150</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527150</guid>        </item>
        <item>
            <title>Temporal representation for gene networks: towards a qualitative temporal data mining.</title>
            <link>http://www.medworm.com/index.php?rid=1527149&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399327%26dopt%3DAbstract</link>
            <description>Authors: Turenne N, Schwer SR
    Processing literature (i.e., text corpora) to capture gene regulation events is not easy and can be driven by the final data representation. We propose to build, manually, an example of temporal representation (whole gene networks for coat formation in Bacillus subtilis). Our temporal representation is based on a generalised formal language theory (S-languages). We propose an algorithm to link bags of relations with representation, by ordering interactions. In this paper, starting from the network made manually from text data, we show that S-languages are quite relevant to encapsulate gene properties, and infer knowledge across timestamped gene relations found in texts.
    PMID: 18399327 [PubMed - indexed for MEDLINE] (Source: International Journal of Dat...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527149</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527149</guid>        </item>
        <item>
            <title>An integrative approach for biological data mining and visualisation.</title>
            <link>http://www.medworm.com/index.php?rid=1527148&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399328%26dopt%3DAbstract</link>
            <description>We present a system to integrate data across multiple bioinformatics databases and enable mining across various conceptual levels of biological information. The results are represented as complex networks. Context dependent mining of these networks is achieved by use of distances. Our approach is demonstrated with three applications: full metabolic network retrieval with network topology study, exploration of properties and relationships of a set of selected proteins, and combined visualisation and exploration of gene expression data with related pathways and ontologies.
    PMID: 18399328 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527148</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527148</guid>        </item>
        <item>
            <title>A rule-based approach for RNA pseudoknot prediction.</title>
            <link>http://www.medworm.com/index.php?rid=1527147&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399329%26dopt%3DAbstract</link>
            <description>Authors: Fu XZ, Wang H, Harrison RW, Harrison WL
    RNA plays a critical role in mediating every step of cellular information transfer from genes to functional proteins. Pseudoknots are functionally important and widely occurring structural motifs found in all types of RNA. Therefore predicting their structures is an important problem. In this paper, we present a new RNA pseudoknot structure prediction method based on term rewriting. The method is implemented using the Mfold RNA/DNA folding package and the term rewriting language Maude. In our method, RNA structures are treated as terms and rules are discovered for predicting pseudoknots. Our method was tested on 211 pseudoknots in PseudoBase and achieves an average accuracy of 74.085% compared to the experimentally determined structure. ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527147</comments>
            <pubDate>Wed, 18 Jun 2008 22:17:51 +0100</pubDate>
            <guid isPermaLink="false">1527147</guid>        </item>
        <item>
            <title>Simulation study in Probabilistic Boolean Network models for genetic regulatory networks.</title>
            <link>http://www.medworm.com/index.php?rid=1527156&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399072%26dopt%3DAbstract</link>
            <description>Authors: Zhang SQ, Ching WK, Ng MK, Akutsu T
    The Probabilistic Boolean Network (PBN) is widely used to model genetic regulatory networks. The PBN evolves according to its transition probability matrix. Steady-state (long-run behaviour) analysis is a key aspect in studying the dynamics of genetic regulatory networks. In this paper, an efficient method to construct the sparse transition probability matrix is proposed, and the power method based on the sparse matrix-vector multiplication is applied to compute the steady-state probability distribution. Such methods provide a tool for us to study the sensitivity of the steady-state distribution to the influence of input genes, gene connections and Boolean networks. Simulation results based on a real network are given to illustrate the m...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527156</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527156</guid>        </item>
        <item>
            <title>A parallel edge-betweenness clustering tool for Protein-Protein Interaction networks.</title>
            <link>http://www.medworm.com/index.php?rid=1527155&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399073%26dopt%3DAbstract</link>
            <description>Authors: Yang Q, Lonardi S
    The increasing availability of protein-protein interaction (PPI) graphs requires new efficient tools capable of extracting valuable biological knowledge from these networks. Among the wide range of clustering algorithms, Girvan and Newman's edge betweenness algorithm showed remarkable performance in discovering clustering structures in several real-world networks. Unfortunately, their algorithm suffers from high computational cost and is impractical for inputs of the size of large PPI networks. Here we report on a novel parallel implementation of Girvan and Newman's clustering algorithm that achieves almost linear speed-up for up to 32 processors. The tool is available in the public domain from the authors' website.
    PMID: 18399073 [PubMed - indexed fo...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527155</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527155</guid>        </item>
        <item>
            <title>Prediction of Protein Secondary Structure with two-stage multi-class SVMs.</title>
            <link>http://www.medworm.com/index.php?rid=1527154&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399074%26dopt%3DAbstract</link>
            <description>Authors: Nguyen MN, Rajapakse JC
    Bioinformatics techniques for Protein Secondary Structure (PSS) prediction mostly depend on the information available in amino acid sequences. In this paper, we propose a two-stage Multi-class Support Vector Machine (MSVM) approach, where the second MSVM predictor is introduced at the output of the first stage MSVM to capture the contextual relationship among secondary structure elements in order to minimise the generalisation error in the prediction. By using position-specific scoring matrices generated by PSI-BLAST, the two-stage MSVM approach achieves Q3 accuracies of 78.0% and 76.3% on the RS126 dataset of 126 non-homologous globular proteins and the CB396 dataset of 396 non-homologous proteins, respectively, which are better than the scores reported...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527154</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527154</guid>        </item>
        <item>
            <title>Granular kernel trees with parallel genetic algorithms for drug activity comparisons.</title>
            <link>http://www.medworm.com/index.php?rid=1527153&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399075%26dopt%3DAbstract</link>
            <description>Authors: Jin B, Zhang YQ, Wang B
    With the growing interest in biological data prediction and chemical data prediction, more powerful and flexible kernels need to be designed so that the prior knowledge and relationships within data can be expressed effectively in kernel functions. In this paper, Granular Kernel Trees (GKTs) are proposed and parallel Genetic Algorithms (GAs) are used to optimise the parameters of GKTs. In applications, SVMs with new kernel trees are employed for drug activity comparisons. The experimental results show that GKTs and evolutionary GKTs can achieve better performance than traditional RBF kernels in terms of prediction accuracy.
    PMID: 18399075 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527153</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527153</guid>        </item>
        <item>
            <title>Exploring alternative knowledge representations for protein secondary-structure prediction.</title>
            <link>http://www.medworm.com/index.php?rid=1527152&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399076%26dopt%3DAbstract</link>
            <description>Authors: Midic U, Dunker AK, Obradovic Z
    Methods for 3-class secondary-structure prediction are thought to be reaching the highest achievable accuracy. Their accuracy on the beta-sheet residue class is considerably lower than for the other two classes. We analysed the relevance of 315 individual input attributes for a predictor with the usual framework of using sequence-profile based data with an input window of fixed size. We propose two alternative knowledge representations with significantly smaller sets of input attributes. We also investigated the possibility of exploiting the prediction of connected pairs of beta-sheet residues and the prediction of residue contact maps for the improvement of accuracy of secondary-structure prediction.
    PMID: 18399076 [PubMed - indexed for MEDLINE...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527152</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527152</guid>        </item>
        <item>
            <title>Simulating the cellular passive transport of glucose using a time-dependent extension of Gillespie algorithm for stochastic pi-calculus.</title>
            <link>http://www.medworm.com/index.php?rid=1527141&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402045%26dopt%3DAbstract</link>
            <description>Authors: Lecca P
    Realistic simulations of the evolution of biological systems require a mathematical model of the stochasticity of the processes involved and a formalism for specifying the concurrent nature of the biochemical interactions. A time-dependent extension of the Gillespie algorithm implementing the race condition of the stochastic pi-calculus formalism satisfies both these requirements. This paper formulates the modifications to the original Gillespie algorithm that are necessary when the time dependence of the reaction propensity is due to changes in either volume or temperature. This re-formulation has been incorporated in the framework of stochastic pi-calculus and has been applied to simulate the passive cellular transport of glucose.
    PMID: 18402045 [PubMed - indexed for MEDLINE]...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527141</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527141</guid>        </item>
        <item>
            <title>Transductive learning with EM algorithm to classify proteins based on phylogenetic profiles.</title>
            <link>http://www.medworm.com/index.php?rid=1527140&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402046%26dopt%3DAbstract</link>
            <description>Authors: Craig RA, Liao L
    We proposed a novel method for protein classification based on phylogenetic profiles. Each protein's profile was extended with extra bits encoding the phylogenetic tree structure and the likelihood, in the form of weights on profile indices, of the protein's functional family membership in each of the reference genomes. The extended profiles were then integrated as part of a kernel of a support vector machine, which was trained in a transductive learning scheme using the EM algorithm to update the weights. Classification accuracy was greatly increased when tested on the proteome of Saccharomyces cerevisiae using the MIPS classification as a benchmark.
    PMID: 18402046 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinforma...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527140</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527140</guid>        </item>
        <item>
            <title>A constraint logic programming approach to associate 1D and 3D structural components for large protein complexes.</title>
            <link>http://www.medworm.com/index.php?rid=1527139&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402047%26dopt%3DAbstract</link>
            <description>Authors: Dal Palù A, Pontelli E, He J, Lu Y
    The paper describes a novel framework, constructed using Constraint Logic Programming (CLP) and parallelism, to determine the association between parts of the primary sequence of a protein and alpha-helices extracted from 3D low-resolution descriptions of large protein complexes. The association is determined by extracting constraints from the 3D information, regarding length, relative position and connectivity of helices, and solving these constraints with the guidance of a secondary structure prediction algorithm. Parallelism is employed to enhance performance on large proteins. The framework provides a fast, inexpensive alternative to determine the exact tertiary structure of unknown proteins.
    PMID: 18402047 [PubMed - indexed for ...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527139</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527139</guid>        </item>
        <item>
            <title>A Merge-Decoupling Dead End Elimination algorithm for protein side-chain conformation.</title>
            <link>http://www.medworm.com/index.php?rid=1527138&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402048%26dopt%3DAbstract</link>
            <description>We present a Merge-Decoupling DEE (MD-DEE) that further reduces the number of rotamers after SG-DEE. MD-DEE works by forming residue-pairs but is fast and, like SG-DEE, is practical even for large proteins. Our experiments show that MD-DEE achieves further reduction in residue elimination (up to 25%) after SG-DEE.
    PMID: 18402048 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527138</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527138</guid>        </item>
        <item>
            <title>Biomedical text summarisation using concept chains.</title>
            <link>http://www.medworm.com/index.php?rid=1527137&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402049%26dopt%3DAbstract</link>
            <description>Authors: Reeve LH, Han H, Brooks AD
    BioChainSumm is a biomedical text summariser utilising concept chaining (called BioChain) to link semantically-related concepts within biomedical text together. The BioChain process is adapted from existing lexical chaining approaches which chain semantically-related terms rather than concepts. The BioChain concept chains are used to identify salient candidate sentences which are extracted to produce a summary of the biomedical text. The Unified Medical Language System Metathesaurus and Semantic Network semantic resources identify related biomedical concepts. BioChainSumm is evaluated using the ROUGE system along with several existing, publicly-available summarisers. Our results show BioChain provides a promising methodology for biomedical text summa...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527137</comments>
            <pubDate>Mon, 01 Jan 2007 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527137</guid>        </item>
        <item>
            <title>Dynamic algorithm for inferring qualitative models of Gene Regulatory Networks.</title>
            <link>http://www.medworm.com/index.php?rid=1527162&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399066%26dopt%3DAbstract</link>
            <description>Authors: Zheng Y, Kwoh CK
    In this paper, we introduce a novel algorithm, DFL (Discrete Function Learning), for reconstructing qualitative models of Gene Regulatory Networks (GRNs) from gene expression data. We analyse its average-case complexity of O(k × N × n²) and its data requirements. Experiments on synthetic Boolean networks show that the DFL algorithm is more efficient than current algorithms without loss of prediction performance. Results on yeast cell cycle gene expression data show that the DFL algorithm can identify biologically significant models with reasonable accuracy, sensitivity and high precision with respect to the evidence in the literature.
    PMID: 18399066 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527162</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527162</guid>        </item>
        <item>
            <title>Improving domain-based protein interaction prediction using biologically significant negative datasets.</title>
            <link>http://www.medworm.com/index.php?rid=1527161&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399067%26dopt%3DAbstract</link>
            <description>Authors: Li XL, Tan SH, Ng SK
    We propose a domain-based classification method to predict protein-protein interactions using probabilities of putative interacting domain pairs derived from both experimentally-determined interacting protein pairs and carefully-chosen non-interacting protein pairs. Multi-species comparative results for protein interaction prediction show that such careful generation of biologically meaningful negative training data can improve classification performance.
    PMID: 18399067 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527161</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527161</guid>        </item>
        <item>
            <title>Spectral similarity for analysis of DNA microarray time-series data.</title>
            <link>http://www.medworm.com/index.php?rid=1527160&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399068%26dopt%3DAbstract</link>
            <description>Authors: Yan H, Pham T
    This paper proposes a new similarity measurement for comparison and analysis of DNA microarray time-series data. In this method, a gene expression time series is decomposed into frequency components and the correlation between the data from a pair of genes is measured in the frequency domain. The method effectively solves the phase delay problem and provides a more accurate metric for microarray time-series classification.
    PMID: 18399068 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527160</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527160</guid>        </item>
        <item>
            <title>Transitive closure and metric inequality of weighted graphs: detecting protein interaction modules using cliques.</title>
            <link>http://www.medworm.com/index.php?rid=1527159&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399069%26dopt%3DAbstract</link>
            <description>Authors: Ding C, He X, Xiong H, Peng H, Holbrook SR
    We study transitivity properties of edge weights in complex networks. We show that enforcing transitivity leads to a transitivity inequality which is equivalent to ultra-metric inequality. This can be used to define transitive closure on weighted undirected graphs, which can be computed using a modified Floyd-Warshall algorithm. These new concepts are extended to dissimilarity graphs and triangle inequalities. From this, we extend the clique concept from unweighted graph to weighted graph. We outline several applications and present results of detecting protein functional modules in a protein interaction network.
    PMID: 18399069 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527159</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527159</guid>        </item>
        <item>
            <title>BAG: a graph theoretic sequence clustering algorithm.</title>
            <link>http://www.medworm.com/index.php?rid=1527158&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399070%26dopt%3DAbstract</link>
            <description>Authors: Kim S, Lee J
    In this paper, we first discuss issues in clustering biological sequences with graph properties, which inspired the design of our sequence clustering algorithm BAG. BAG recursively utilises several graph properties: biconnectedness, articulation points, p-quasi-completeness, and domain knowledge specific to biological sequence clustering. To reduce the fragmentation issue, we have developed a new metric called cluster utility to guide cluster splitting. Clusters are then merged back with less stringent constraints. Experiments with the entire COG database and other sequence databases show that BAG can cluster a large number of sequences accurately while keeping the number of fragmented clusters significantly low.
    PMID: 18399070 [PubMed - indexed for MEDLINE] (S...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527158</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527158</guid>        </item>
        <item>
            <title>An efficient motif discovery algorithm with unknown motif length and number of binding sites.</title>
            <link>http://www.medworm.com/index.php?rid=1527157&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18399071%26dopt%3DAbstract</link>
            <description>Authors: Leung HC, Chin FY
    Most motif discovery algorithms from DNA sequences require the motif's length as input. Styczynski et al. introduced the Extended (l,d)-Motif Problem (EMP) where the motif's length is not an input parameter. Unfortunately, their algorithm takes an unacceptably long time to run, e.g. over 3 months to discover a length-14 motif. Since the best motif may not be the longest nor have the largest number of binding sites, in this paper we further eliminate another input parameter about the minimum number of binding sites in order to provide more realistic/robust results. We also develop an efficient algorithm to solve EMP and this redefined problem.
    PMID: 18399071 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527157</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527157</guid>        </item>
        <item>
            <title>Adaptive Fuzzy Association Rule mining for effective decision support in biomedical applications.</title>
            <link>http://www.medworm.com/index.php?rid=1527146&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402040%26dopt%3DAbstract</link>
            <description>Authors: He Y, Tang Y, Zhang YQ, Sunderraman R
    Due to the complexity of biomedical classification problems, it is impossible to build a perfect classifier with 100% prediction accuracy. Hence a more realistic target is to build an effective Decision Support System (DSS). Here 'effective' means that a DSS should not only predict unseen samples accurately, but also work in a human-understandable way. In this paper, we propose a novel adaptive Fuzzy Association Rules (FARs) mining algorithm, named FARM-DS, to build such a DSS for binary classification problems in the biomedical domain. In the training phase, four steps are executed to mine FARs, which are thereafter used to predict unseen samples in the testing phase. The new FARM-DS algorithm is evaluated on two publicly available medical da...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527146</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527146</guid>        </item>
        <item>
            <title>Bi-level clustering of mixed categorical and numerical biomedical data.</title>
            <link>http://www.medworm.com/index.php?rid=1527145&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402041%26dopt%3DAbstract</link>
            <description>We present the BILCOM algorithm for 'Bi-Level Clustering of Mixed categorical and numerical data types'. BILCOM performs a pseudo-Bayesian process, where the prior is categorical clustering. BILCOM partitions biomedical data sets of mixed types, such as hepatitis, thyroid disease and yeast gene expression data with Gene Ontology annotations, more accurately than if using one type alone.
    PMID: 18402041 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527145</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527145</guid>        </item>
        <item>
            <title>Kernel design for RNA classification using Support Vector Machines.</title>
            <link>http://www.medworm.com/index.php?rid=1527144&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402042%26dopt%3DAbstract</link>
            <description>Authors: Wang JT, Wu X
    Support Vector Machines (SVMs) are a state-of-the-art machine learning tool widely used in speech recognition, image processing and biological sequence analysis. An essential step in SVMs is to devise a kernel function to compute the similarity between two data points. In this paper we review recent advances of using SVMs for RNA classification. In particular we present a new kernel that takes advantage of both global and local structural information in RNAs and uses the information together to classify RNAs. Experimental results demonstrate the good performance of the new kernel and show that it outperforms existing kernels when applied to classifying non-coding RNA sequences.
    PMID: 18402042 [PubMed - indexed for MEDLINE] (Source: International Journal of Da...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527144</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527144</guid>        </item>
        <item>
            <title>State-space approach with the maximum likelihood principle to identify the system generating time-course gene expression data of yeast.</title>
            <link>http://www.medworm.com/index.php?rid=1527143&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402043%26dopt%3DAbstract</link>
            <description>Authors: Yamaguchi R, Higuchi T
    We use linear Gaussian state-space models to analyse time-course gene expression data of yeast. They are modelled to be generated from hidden state variables in a system. To identify the system, we estimate parameters of the model by EM algorithm and determine the dimension of the state variable by BIC.
    PMID: 18402043 [PubMed - indexed for MEDLINE] (Source: International Journal of Data Mining and Bioinformatics)</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527143</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527143</guid>        </item>
        <item>
            <title>Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes.</title>
            <link>http://www.medworm.com/index.php?rid=1527142&amp;cid=s_37101_79_f&amp;fid=37101&amp;url=http%3A%2F%2Fwww.ncbi.nlm.nih.gov%2Fentrez%2Fquery.fcgi%3Ftmpl%3DNoSidebarfile%26db%3DPubMed%26cmd%3DRetrieve%26list_uids%3D18402044%26dopt%3DAbstract</link>
            <description>Authors: Liu Y, Navathe SB, Pivoshenko A, Dasigi VG, Dingledine R, Cilia BJ
    One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score wei...</description>
            <author>International Journal of Data Mining and Bioinformatics</author>
            <type>journals</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1527142</comments>
            <pubDate>Sun, 01 Jan 2006 05:00:00 +0100</pubDate>
            <guid isPermaLink="false">1527142</guid>        </item>
    </channel>
</rss>
