Springer protocols feed by Bioinformatics

This page shows you the latest items in this publication.

Brain Model of Text Animation as a Data Mining Strategy
Imagination is the critical point in developing realistic artificial intelligence (AI) systems. One way to approach imagination is to simulate its properties and operations. We developed two models, “Brain Network Hierarchy of Languages” and “Semantical Holographic Calculus,” and a simulation system, ScriptWriter, that emulate the process of imagination through automatic animation of English texts. The purpose of this paper is to demonstrate the model and present the “ScriptWriter” system http://nvo.sdsc.edu/NVO/JCSG/get_SRB_mime_file2.cgi//home/tamara.sdsc/test/demo.zip?F=/home/tamar...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Processes Parallel Execution Using Grid Wizard Enterprise
The field of high-performance computing (HPC) has provided a wide array of strategies for supplying additional computing power toward the goal of reducing the total “clock time” required to complete various computational processes. These strategies range from the development of higher-performance hardware to the assembly of large networks of commodity computers, with each strategy designed to address a particular aspect and/or manifestation of a given computational problem. In that regard, GWE (Grid Wizard Enterprise) is an HPC distributed enterprise system aimed at providing a solution to the particular problem o...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Single Sign-On in a Grid Portal
Single Sign-On (SSO) is a practical requirement for software applications that rely on distributed, networked services requiring authentication. SSO is as much a convenient feature for users as it is a security concern for application designers. The security requirement becomes critical in institutions that adhere to HIPAA regulations. In this chapter, we discuss SSO as it applies to a grid portal using remote computational resources and grid storage, which contain Personal Health Information (PHI). We cover the implementation of Public Key Infrastructure (PKI) to meet HIPAA security requirements such as authentication, c...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Management of Information in Distributed Biomedical Collaboratories
Organizing and annotating biomedical data in structured ways has gained much interest and focus in the last 30 years. Driven by decreases in digital storage costs and advances in genetics sequencing, imaging, electronic data collection, and microarray technologies, data is being collected at an alarming rate. The specialization of fields in biology and medicine demonstrates the need for somewhat different structures for storage and retrieval of data. For biologists, the need for structured information and integration across a number of domains drives development. For clinical researchers and hospitals, the need for a struc...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Enabling Public Data Sharing: Encouraging Scientific Discovery and Education
To promote scientific discovery and education, the federated Biomedical Informatics Research Network (BIRN) Data Repository (BDR) supports data storage, sharing, querying, and downloading for the biomedical community, enabling the integration of multiple data resources from a single entry point. The BDR encourages data sharing both for investigators requesting assistance with databasing and informatics infrastructure, and for those wishing to extend the reach of existing data resources to be registered with the BDR. Both approaches rely heavily on data integration and knowledge management techniques, ensuring capabilities ...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Mediator Infrastructure for Information Integration and Semantic Data Integration Environment for Biomedical Research
This paper presents current progress in the development of a semantic data integration environment, which is part of the Biomedical Informatics Research Network (BIRN; http://www.nbirn.net ) project. BIRN is sponsored by the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH). A goal is the development of a cyberinfrastructure for biomedical research that supports advanced data acquisition, data storage, data management, data integration, data mining, data visualization, and other computing and information processing services over the Internet. Each p...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

System Biology of Gene Regulation
A famous joke that illustrates the traditionally awkward alliance between theory and experiment, and the differences between experimental biologists and theoretical modelers, begins when a university sends a biologist, a mathematician, a physicist, and a computer scientist on a walking trip in an attempt to stimulate interdisciplinary research. During a break, they watch a cow in a nearby field and the leader of the group asks, “I wonder how one could decide on the size of a cow?” Since a cow is a biological object, the biologist responds first: “I have seen many cows in this area and know it is a b...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Current Computational Methods for Prioritizing Candidate Regulatory Polymorphisms
Discovery of DNA sequence variants responsible for human phenotypic variation is key to advances in molecular diagnostics and medicines. Historically, variants that alter the protein-coding sequence of genes have been targeted when attempting to identify a trait’s etiology; this is done because the rules governing these regions are generally well-understood and candidate variants can be easily selected. However, the effects of variants on gene regulation are increasingly regarded as being as important as protein-coding variation in uncovering the nature of phenotypic variation. I discuss resources and methodology tha...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Methods of Information Geometry in Computational System Biology (Consistency between Chemical and Biological Evolution)
Interest in simulation of large-scale metabolic networks, species development, and genesis of various diseases requires new simulation techniques to accommodate the high complexity of realistic biological networks. Information geometry and topological formalisms are proposed to analyze information processes. We analyze the complexity of large-scale biological networks as well as transition of the system functionality due to modification in the system architecture, system environment, and system components.
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Protein Structure Prediction Based on Sequence Similarity
The observation that similar protein sequences fold into similar three-dimensional structures provides a basis for the methods which predict structural features of a novel protein based on the similarity between its sequence and sequences of known protein structures. Similarity over entire sequence or large sequence fragment(s) enables prediction and modeling of entire structural domains while statistics derived from distributions of local features of known protein structures make it possible to predict such features in proteins with unknown structures. The accuracy of models of protein structures is sufficient for many pr...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Applications of Bioinformatics to Protein Structures: How Protein Structure and Bioinformatics Overlap
In this chapter, we focus on the role of bioinformatics in analyzing a protein after its structure has been determined. First, we present how to validate protein structures for quality assurance. Then, we discuss how to analyze protein–protein interfaces and how to predict the biomolecule, that is, the biological oligomeric state of the protein. Finally, we discuss how to search for homologs based on the 3-D structure, which is an essential step for understanding protein function.
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Knowledge Discovery via Machine Learning for Neurodegenerative Disease Researchers
The ever-increasing size of the biomedical literature makes more precise information retrieval and tapping into implicit knowledge in the scientific literature a necessity. In this chapter, first, three new variants of the expectation–maximization (EM) method for semisupervised document classification (Machine Learning 39:103–134, 2000) are introduced to refine biomedical literature meta-searches. The retrieval performance of a multi-mixture per class EM variant with Agglomerative Information Bottleneck clustering (Slonim and Tishby (1999) Agglomerative information bottleneck. In Proceedings of NIPS-12) using Davies&nd...
Source: Springer protocols feed by Bioinformatics - August 4, 2009 Category: Bioinformatics Source Type: info

Recombination Detection and Analysis Using RDP3
Recombination between nucleotide sequences is a major process influencing the evolution of most species on Earth. While its evolutionary value is a matter of quite intense debate, so too is the influence of recombination on evolutionary analysis methods that assume nucleotide sequences replicate without recombining. The crux of the problem is that when nucleic acids recombine, the daughter or recombinant molecules no longer have a single evolutionary history. All analysis methods that derive increased power from correctly inferring evolutionary relationships between sequences will therefore be at least mildly sensitive to ...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Detecting Signatures of Selection from DNA Sequences Using Datamonkey
Natural selection is a fundamental process affecting all evolving populations. In the simplest case, positive selection increases the frequency of alleles that confer a fitness advantage relative to the rest of the population, or increases its genetic diversity, and negative selection removes those alleles that are deleterious. Codon-based models of molecular evolution are able to infer signatures of selection from alignments of homologous sequences by estimating the relative rates of synonymous (dS) and non-synonymous substitutions (dN). Datamonkey ( http://www.datamonkey.org ) provides a user-frie...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Trees from Trees: Construction of Phylogenetic Supertrees Using Clann
We describe the most widely used supertree methods implemented in the software program “clann” and provide a step by step tutorial for investigating phylogenetic information and reconstructing the best supertree. Clann is freely available for Windows, Mac and Unix/Linux operating systems under the GNU public licence at http://bioinf.nuim.ie/software/clann .
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Estimating Maximum Likelihood Phylogenies with PhyML
Our understanding of the origins, the functions and/or the structures of biological sequences strongly depends on our ability to decipher the mechanisms of molecular evolution. These complex processes can be described through the comparison of homologous sequences in a phylogenetic framework. Moreover, phylogenetic inference provides sound statistical tools to exhibit the main features of molecular evolution from the analysis of actual sequences. This chapter focuses on phylogenetic tree estimation under the maximum likelihood (ML) principle. Phylogenies inferred under this probabilistic criterion are usually reliable and ...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Selection of Models of DNA Evolution with jModelTest
jModelTest is a bioinformatic tool for choosing among different models of nucleotide substitution. The program implements five different model selection strategies, including hierarchical and dynamical likelihood ratio tests (hLRT and dLRT), Akaike and Bayesian information criteria (AIC and BIC), and a performance-based decision theory method (DT). The output includes estimates of model selection uncertainty, parameter importances, and model-averaged parameter estimates, including model-averaged phylogenies. jModelTest is a Java program that runs under Mac OSX, Windows, and Unix systems with a Java Runtime Environment installe...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info
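
The AIC and BIC criteria named in the jModelTest abstract above can be illustrated with a short calculation. The following is a minimal sketch, not jModelTest's own code: the log-likelihoods, the parameter counts, and the alignment length are hypothetical values chosen for illustration, and the function names are assumptions. It uses the standard definitions AIC = 2k - 2 lnL and BIC = k ln(n) - 2 lnL, plus Akaike weights as one common way to express model selection uncertainty.

```python
import math


def aic(log_likelihood: float, k: int) -> float:
    """Akaike information criterion: 2k - 2 lnL (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood: float, k: int, n: int) -> float:
    """Bayesian information criterion: k ln(n) - 2 lnL (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


def akaike_weights(scores: dict) -> dict:
    """Convert criterion scores into relative weights that sum to 1."""
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}


# Hypothetical fits: model name -> (lnL, number of free parameters).
fits = {"JC": (-2150.0, 0), "HKY": (-2100.0, 4), "GTR": (-2098.0, 8)}
n_sites = 1000  # hypothetical alignment length used as sample size

aic_scores = {m: aic(lnl, k) for m, (lnl, k) in fits.items()}
bic_scores = {m: bic(lnl, k, n_sites) for m, (lnl, k) in fits.items()}
weights = akaike_weights(aic_scores)

best_aic = min(aic_scores, key=aic_scores.get)
best_bic = min(bic_scores, key=bic_scores.get)
print(best_aic, best_bic, weights)
```

In this toy data set the extra parameters of the richest model do not buy enough likelihood to offset the penalty, so both criteria prefer the intermediate model. Real tools may count parameters differently (for example, by including branch lengths), which is why the counts here are labeled hypothetical.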

SeqVis: A Tool for Detecting Compositional Heterogeneity Among Aligned Nucleotide Sequences
Compositional heterogeneity is a poorly appreciated attribute of aligned nucleotide and amino acid sequences. It is a common property of molecular phylogenetic data, and it has been found to occur across sequences and/or across sites. Most molecular phylogenetic methods assume that the sequences have evolved under globally stationary, reversible, and homogeneous conditions, implying that the sequences should be compositionally homogeneous. The presence of the above-mentioned compositional heterogeneity implies that the sequences must have evolved under more general conditions than is commonly assumed. Consequently, there i...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Multiple Alignment of DNA Sequences with MAFFT
Multiple alignment of DNA sequences is an important step in various molecular biological analyses. As a large amount of sequence data is becoming available through genome and other large-scale sequencing projects, scalability, as well as accuracy, is currently required for a multiple sequence alignment (MSA) program. In this chapter, we outline the algorithms of an MSA program MAFFT and provide practical advice, focusing on several typical situations a biologist sometimes faces. For genome alignment, which is beyond the scope of MAFFT, we introduce two tools: TBA and MAUVE.
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Gene Orthology Assessment with OrthologID
OrthologID ( http://nypg.bio.nyu.edu/orthologid/ ) allows for the rapid and accurate identification of gene orthology within a character-based phylogenetic framework. The Web application has two functions – an orthologous group search and a query orthology classification. The former determines orthologous gene sets for complete genomes and identifies diagnostic characters that define each orthologous gene set; and the latter allows for the classification of unknown query sequences to orthology groups. The first module of the Web application, the gene family generator, uses an E-value based app...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Similarity Searching Using BLAST
Similarity searches are an essential component of most bioinformatic applications. They form the bases of structural motif identification, gene identification, and insights into functional associations. With the rapid increase in the available genetic data through a wide variety of databases, similarity searches are an essential tool for accessing these data in an informative and productive way. In this chapter, we provide an overview of similarity searching approaches, related databases, and parameter options to achieve the best results for a variety of applications. We then provide a worked example and some notes for consideration.
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Mining for SNPs and SSRs Using SNPServer, dbSNP and SSR Taxonomy Tree
Molecular genetic markers represent one of the most powerful tools for the analysis of genomes and the association of heritable traits with underlying genetic variation. The development of high-throughput methods for the detection of single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs) has led to a revolution in their use as molecular markers. The availability of large sequence data sets permits mining for these molecular markers, which may then be used for applications such as genetic trait mapping, diversity analysis and marker assisted selection in agriculture. Here we describe web-based automated m...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Analysis of Transposable Element Sequences Using CENSOR and RepeatMasker
We present here a survey of two of the most readily available and widely used bioinformatics applications for the detection, characterization, and analysis of transposable element (TE) sequences in eukaryotic genomes: CENSOR and RepeatMasker. For each program, information on availability, input, output, and the algorithmic methods used is provided. Specific examples of the use of CENSOR and RepeatMasker are also described. CENSOR and RepeatMasker both rely on homology-based methods for the detection of TE sequences. There are several other classes of methods available for the analysis of repetitive DNA sequences including de novo methods that co...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Analysis of Genomic DNA with the UCSC Genome Browser
Genomic DNA is being sequenced and annotated at a rapid rate, with terabases of DNA currently deposited in GenBank and other repositories. Genome browsers provide an essential collection of resources to visualize and analyze chromosomal DNA. The University of California, Santa Cruz (UCSC) Genome Browser provides annotations from the level of single nucleotides to whole chromosomes for four dozen metazoan and other species. The Genome Browser may be used to address a wide range of problems in bioinformatics (e.g., sequence analysis), comparative genomics, and evolution.
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Promoter Analysis: Gene Regulatory Motif Identification with A-GLAM
Reliable detection of cis-regulatory elements in promoter regions is a difficult and unsolved problem in computational biology. The intricacy of transcriptional regulation in higher eukaryotes, primarily in metazoans, could be a major driving force of organismal complexity. Eukaryotic genome annotations have improved greatly due to large-scale characterization of full-length cDNAs, transcriptional start sites (TSSs), and comparative genomics. Regulatory elements are identified in promoter regions using a variety of enumerative or alignment-based methods. Here we present a survey of recent computational methods for eukaryot...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Computational Gene Annotation in New Genome Assemblies Using GeneID
We present in this chapter a simple protocol mainly based on the combination of the program GeneID and other computational tools to annotate the location of a gene, which was previously annotated in D. melanogaster, in the recently assembled genome of D. yakuba.
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info

Genetic Code Prediction for Metazoan Mitochondria with GenDecoder
There is a standard genetic code that is used by most organisms, but exceptions exist in which particular codons are translated with a different meaning, i.e., as a different amino acid. The characterization of the genetic code of an organism is hence a key step for properly analyzing and translating its protein-coding genes. Such characterization is particularly important in the case of metazoan mitochondrial genomes for two reasons: first, many variant codes occur in them and second, mitochondrial data is frequently used for evolutionary studies. Variant codes are usually found by comparative sequence analyses. Given a p...
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info
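
To make concrete what the GenDecoder abstract above means by codons being "translated with a different meaning", the sketch below translates the same sequence under the standard genetic code and under the vertebrate mitochondrial code. It is an illustration only and does not reproduce GenDecoder's prediction method; the function name and the example sequence are assumptions made for this sketch.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Vertebrate mitochondrial code: four codons are reassigned.
VERT_MITO = {**STANDARD, "AGA": "*", "AGG": "*", "ATA": "M", "TGA": "W"}


def translate(cds: str, code: dict) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(code.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


example = "ATGATATGAAGA"  # hypothetical sequence: ATG ATA TGA AGA
print(translate(example, STANDARD))   # MI*R
print(translate(example, VERT_MITO))  # MMW*
```

Three of the four codons in the example change meaning between the two tables, which shows why using the wrong code can truncate or corrupt a predicted protein.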

CodonExplorer: An Interactive Online Database for the Analysis of Codon Usage and Sequence Composition
We present principles and practical procedures for using analyses of GC content and codon usage frequency to identify highly expressed or horizontally transferred genes and to study the relative contribution of different types of mutation to gene and genome composition. CodonExplorer’s combination of a user-friendly web interface and a comprehensive genomic database makes these diverse analyses fast and straightforward to perform. CodonExplorer is thus a powerful tool that facilitates and automates a wide range of compositional analyses.
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info
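
The two quantities that the CodonExplorer abstract above builds on, GC content and codon usage frequency, are simple to compute for a single coding sequence. The following is a minimal sketch under stated assumptions: the function names and the example sequence are hypothetical, only unambiguous A/C/G/T bases are counted, and nothing here reflects CodonExplorer's own implementation or database.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    acgt = sum(seq.count(b) for b in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def codon_counts(cds: str) -> Counter:
    """Count complete in-frame codons, starting at position 0."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def gc3(cds: str) -> float:
    """GC content at third codon positions of complete codons."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return gc_content(cds[2:usable:3])


example = "ATGGCCGCGTAA"  # hypothetical CDS: ATG GCC GCG TAA
print(gc_content(example), gc3(example), codon_counts(example))
```

Third-position GC content is reported separately because third positions are often synonymous, so they tend to reflect mutational bias more directly than the sequence-wide value does.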

DNA Sequence Polymorphism Analysis Using DnaSP
The analysis of DNA sequence polymorphisms and SNPs (single nucleotide polymorphisms) can provide insights into the evolutionary forces acting on populations and species. Available population-genetic methods, and particularly those based on the coalescent theory, have become the primary framework to analyze such DNA polymorphism data. Here, I explain some essential analytical methods for interpreting DNA polymorphism data and also describe the basic functionalities of the DnaSP software. DnaSP is a multi-purpose program that supports exhaustive DNA polymorphism analysis through a user-friendly graphical interface.
Source: Springer protocols feed by Bioinformatics - January 1, 2009 Category: Bioinformatics Source Type: info
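
As an illustration of the kind of summary statistics used to interpret DNA polymorphism data, the sketch below computes the number of segregating sites, the per-site nucleotide diversity (average pairwise differences), and Watterson's estimator for a small alignment. This is a hedged example, not DnaSP code: the function names and the alignment are hypothetical, and the alignment is assumed to contain equal-length sequences without gaps or ambiguity codes.

```python
from itertools import combinations


def segregating_sites(aln: list) -> int:
    """Number of alignment columns with more than one distinct base."""
    return sum(1 for col in zip(*aln) if len(set(col)) > 1)


def nucleotide_diversity(aln: list) -> float:
    """Average pairwise differences per site (pi)."""
    pairs = list(combinations(aln, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * len(aln[0]))


def watterson_theta(aln: list) -> float:
    """Watterson's estimator per site: S / a1 / L, a1 = sum of 1/i for i < n."""
    a1 = sum(1 / i for i in range(1, len(aln)))
    return segregating_sites(aln) / a1 / len(aln[0])


# Hypothetical alignment of four sequences, eight sites each.
aln = ["ACGTACGT", "ACGTACGA", "ACCTACGA", "ACGTACGT"]

print(segregating_sites(aln))     # 2
print(nucleotide_diversity(aln))  # 7/48
print(watterson_theta(aln))
```

Both pi and Watterson's estimator estimate the same population parameter under a neutral model, so comparing them is the starting point for tests such as Tajima's D.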

Computational Representation of Biological Systems
Integration of large and diverse biological data sets is a daunting problem facing systems biology researchers. Exploring the complex issues of data validation, integration, and representation, we present a systematic approach for the management and analysis of large biological data sets based on data warehouses. Our system has been implemented in the Bioverse, a framework combining diverse protein information from a variety of knowledge areas such as molecular interactions, pathway localization, protein structure, and protein function.
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Biological Network Inference and Analysis Using SEBINI and CABIN
Attaining a detailed understanding of the various biological networks in an organism lies at the core of the emerging discipline of systems biology. A precise description of the relationships formed between genes, mRNA molecules, and proteins is a necessary step toward a complete description of the dynamic behavior of an organism at the cellular level, and toward intelligent, efficient, and directed modification of an organism. The importance of understanding such regulatory, signaling, and interaction networks has fueled the development of numerous in silico inference algorithms, as well as new experimental techniques and...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

The Bioverse API and Web Application
The Bioverse is a framework for creating, warehousing and presenting biological information based on hierarchical levels of organisation. The framework is guided by a deeper philosophy of desiring to represent all relationships between all components of biological systems towards the goal of a wholistic picture of organismal biology. Data from various sources are combined into a single repository and a uniform interface is exposed to access it. The power of the approach of the Bioverse is that, due to its inclusive nature, patterns emerge from the acquired data and new predictions are made. The implementation of this repos...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Comparing Algorithms for Clustering of Expression Data: How to Assess Gene Clusters
Clustering is a popular technique commonly used to search for groups of similarly expressed genes using mRNA expression data. There are many different clustering algorithms and the application of each one will usually produce different results. Without additional evaluation, it is difficult to determine which solutions are better.
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Effects of Functional Bias on Supervised Learning of a Gene Network Model
Gene networks have proven to be an effective approach for modeling cellular systems, capable of capturing some of the extreme complexity of cells in a formal theoretical framework. Not surprisingly, this complexity, combined with our still-limited amount of experimental data measuring the genes and their interactions, makes the reconstruction of gene networks difficult. One powerful strategy has been to analyze functional genomics data using supervised learning of network relationships based upon reference examples from our current knowledge. However, this reliance on the set of reference examples for the supervised learni...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Connecting Protein Interaction Data, Mutations, and Disease Using Bioinformatics
Understanding how mutations lead to changes in protein function and/or protein interaction is critical to understanding the molecular causes of clinical phenotypes. In this method, we present a path toward integration of protein interaction data and mutation data and then demonstrate the identification of a subset of proteins and interactions that are important to a particular disease. We then build a statistical model of disease mutations in this disease-associated subset of proteins, and visualize these results. Using Alzheimer’s disease (AD) as a case implementation, we find that we are able to identify a subset of ...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Using Evolutionary Information to Find Specificity-Determining and Co-evolving Residues
Intricate networks of protein interactions rely on the ability of a protein to recognize its targets: other proteins, ligands, and sites on DNA and RNA. To recognize other molecules, it was suggested that a protein uses a small set of specificity-determining residues (SDRs). How can one find these residues in proteins and distinguish them from other functionally important amino acids? A number of bioinformatics methods to predict SDRs have been developed in recent years. These methods use genomic information and multiple sequence alignments to identify positions exhibiting a specific pattern of conservation and variability...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Enzyme Function Prediction with Interpretable Models
Enzymes play central roles in metabolic pathways, and the prediction of metabolic pathways in newly sequenced genomes usually starts with the assignment of genes to enzymatic reactions. However, genes with similar catalytic activity are not necessarily similar in sequence, and therefore the traditional sequence similarity-based approach often fails to identify the relevant enzymes, thus hindering efforts to map the metabolome of an organism.
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Identification of cis-Regulatory Elements in Gene Co-expression Networks Using A-GLAM
Reliable identification and assignment of cis-regulatory elements in promoter regions is a challenging problem in biology. The sophistication of transcriptional regulation in higher eukaryotes, particularly in metazoans, could be an important factor contributing to their organismal complexity. Here we present an integrated approach where networks of co-expressed genes are combined with gene ontology–derived functional networks to discover clusters of genes that share both similar expression patterns and functions. Regulatory elements are identified in the promoter regions of these gene clusters using a Gibbs sampling...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Structure-Based Ab Initio Prediction of Transcription Factor–Binding Sites
We present an all-atom molecular modeling method that can predict the binding specificity of a transcription factor based on its 3D structure, with no further information required. We use molecular dynamics and free energy calculations to compute the relative binding free energies for a transcription factor with multiple possible DNA sequences. These sequences are then used to construct a position weight matrix to represent the transcription factor–binding sites. Free energy differences are calculated by morphing one base pair into another using a multi-copy representation in which multiple base pairs are superimpose...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info
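
The position weight matrix mentioned in the abstract above can be illustrated with a short sketch. Note that this is not the structure-based method of the chapter, which derives the matrix from computed binding free energies; it is the simpler, count-based construction from a set of aligned binding sites. The example sites, the pseudocount, and the uniform background frequency of 0.25 are all assumptions made for illustration.

```python
import math
from collections import Counter


def position_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds PWM (one dict per column) from aligned DNA sites.

    Assumes equal-length sites over the alphabet ACGT and a uniform
    background; both are illustrative simplifications.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        column = {}
        for base in "ACGT":
            freq = (counts.get(base, 0) + pseudocount) / total
            column[base] = math.log2(freq / background)
        pwm.append(column)
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds scores of seq under the PWM."""
    return sum(column[base] for column, base in zip(pwm, seq))


# Hypothetical aligned binding sites, not taken from the chapter.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGACTAA"]
pwm = position_weight_matrix(sites)
```

A sequence that matches the consensus, such as `score(pwm, "TGACTCA")`, scores higher than an unrelated one such as `score(pwm, "AAAAAAA")`.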

Inferring Protein–Protein Interactions from Multiple Protein Domain Combinations
The ever accumulating wealth of knowledge about protein interactions and the domain architecture of involved proteins in different organisms offers ways to understand the intricate interplay between interactome and proteome. Ultimately, the combination of these sources of information will allow the prediction of interactions among proteins where only domain composition is known. Based on the currently available protein–protein interaction and domain data of Saccharomyces cerevisiae and Drosophila melanogaster we introduce a novel method, Maximum Specificity Set Cover (MSSC), to predict potential protein–protein...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Prediction of Protein–Protein Interactions: A Study of the Co-evolution Model
The concept of molecular co-evolution drew attention in recent years as the basis for several algorithms for the prediction of protein–protein interactions. While being successful on specific data, the concept has never been tested on a large set of proteins. In this chapter we analyze the feasibility of the co-evolution principle for protein–protein interaction prediction through one of its derivatives, the correlated divergence model. Given two proteins, the model compares the patterns of divergence of their families and assigns a score based on the correlation between the two. The working hypothesis of the m...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info
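
The scoring idea in the abstract above (compare the divergence patterns of two protein families and score their correlation) can be sketched as follows. This is a minimal illustration, assuming that each family is summarised by a symmetric matrix of pairwise evolutionary distances over the same ordered set of species, and that the score is the Pearson correlation of the corresponding entries. The function names and the toy matrices are hypothetical, and the chapter's exact scoring may differ.

```python
import math


def upper_triangle(matrix):
    """Flatten the strictly upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def coevolution_score(dist_a, dist_b):
    """Correlate the pairwise-distance patterns of two protein families."""
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy distance matrices over the same four species (made-up values).
family_a = [
    [0.0, 0.1, 0.4, 0.5],
    [0.1, 0.0, 0.3, 0.6],
    [0.4, 0.3, 0.0, 0.2],
    [0.5, 0.6, 0.2, 0.0],
]
# Family B diverges in the same pattern, twice as fast.
family_b = [[2 * d for d in row] for row in family_a]
correlated = coevolution_score(family_a, family_b)
```

Because `family_b` is a scaled copy of `family_a`, the score here is close to 1; unrelated divergence patterns would give a score near 0.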

Computational Reconstruction of Protein–Protein Interaction Networks: Algorithms and Issues
Accurate mapping of protein–protein interaction networks in model organisms is a crucial first step toward subsequent quantitative study of the organization and evolution of biological systems. Data quality of experimental interactome maps can be assessed and improved by integrating multiple sources of evidence using machine learning methods. Here we describe the commonly used algorithms for predicting protein–protein interaction by genome data integration, and discuss several important yet often overlooked issues in computational reconstruction of protein–protein interaction networks.
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Prediction and Integration of Regulatory and Protein–Protein Interactions
We describe how to compile and handle various formats and identifiers of data sets from different sources and how to predict TRIs using a homology-based approach, utilizing the compiled data sets. Integrated data sets include experimentally verified TRIs, binding sites of transcription factors, promoter sequences, protein subcellular localization, and protein families. Predicted TRIs expand the networks of gene regulation for a large number of organisms. The integration of experimentally verified and predicted TRIs with other known protein–protein interactions (PPIs) gives insight into specific pathways, network moti...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Detecting Hierarchical Modularity in Biological Networks
Spatially or chemically isolated modules that carry out discrete functions are considered fundamental building blocks of cellular organization. However, detecting them in highly integrated biological networks requires a thorough understanding of the organization of these networks. In this chapter I argue that many biological networks are organized into many small, highly connected topologic modules that combine in a hierarchical manner into larger, less cohesive units. On top of a scale-free degree distribution, these networks show a power law scaling of the clustering coefficient with the node degree, a property that can ...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info
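
The relationship between clustering coefficient and node degree described in the abstract above can be measured with a few lines of code. The sketch below assumes an undirected network given as a list of edges; it computes the local clustering coefficient of every node and averages it per degree. The toy network and function names are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def local_clustering(adj, node):
    """Fraction of pairs of a node's neighbours that are themselves linked."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def clustering_by_degree(adj):
    """Average local clustering coefficient for each node degree."""
    buckets = defaultdict(list)
    for node in adj:
        buckets[len(adj[node])].append(local_clustering(adj, node))
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}


# Toy network: a tight triangle (a, b, c) plus a hub h that also
# links two peripheral nodes d and e.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"), ("h", "e")]
c_of_k = clustering_by_degree(build_graph(edges))
```

In this toy network the degree-3 module members are fully clustered while the degree-5 hub has a much lower coefficient, which is the qualitative trend (clustering decreasing with degree) that the abstract associates with hierarchical organisation.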

Inferring Molecular Interactions Pathways from eQTL Data
Analysis of expression quantitative trait loci (eQTL) helps elucidate the connection between genotype, gene expression levels, and phenotype. However, standard statistical genetics can only attribute the changes in expression levels to loci on the genome, not specific genes. Each locus can contain many genes, making it very difficult to discover which gene is controlling the expression levels of other genes. Furthermore, it is even more difficult to find a pathway of molecular interactions responsible for controlling the expression levels. Here we describe a series of techniques for finding explanatory pathways by explorin...
Source: Springer protocols feed by Bioinformatics - July 1, 2008 Category: Bioinformatics Source Type: info

Managing Sequence Data
Nucleotide and protein sequences are the foundation for all bioinformatics tools and resources. Researchers can analyze these sequences to discover genes or predict the function of their products. The INSD (International Nucleotide Sequence Database—DDBJ/EMBL/GenBank) is an international, centralized primary sequence resource that is freely available on the internet. This database contains all publicly available nucleotide and derived protein sequences. This chapter summarizes the nucleotide sequence database resources, provides information on how to submit sequences to the databases, and explains how to access the sequence data.
Source: Springer protocols feed by Bioinformatics - May 1, 2008 Category: Bioinformatics Source Type: info

Reconstruction of Full-Length Isoforms from Splice Graphs
Most alternative splicing events in human and other eukaryotic genomes are detected using sequence fragments produced by high-throughput genomic technologies, such as EST sequencing and oligonucleotide microarrays. Reconstructing full-length transcript isoforms from such sequence fragments is a major interest and challenge for computational analyses of pre-mRNA alternative splicing. This chapter describes a general graph-based approach for computational inference of full-length isoforms.
Source: Springer protocols feed by Bioinformatics - May 1, 2008 Category: Bioinformatics Source Type: info
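
One simple way to picture the graph-based approach in the abstract above is to treat exons as nodes and observed splice junctions as directed edges, so that each candidate full-length isoform is a path from the first exon to the last. The sketch below enumerates all such paths. It assumes an acyclic graph with a single start and end exon, and it does not attempt the step of deciding which candidate paths are actually supported by the data; the exon names are hypothetical.

```python
def enumerate_isoforms(splice_graph, source, sink):
    """List every path from source to sink in an acyclic splice graph.

    splice_graph maps each exon to the exons it can be spliced to.
    Each returned path is one candidate full-length isoform.
    """
    paths = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == sink:
            paths.append(path)
            continue
        for next_exon in splice_graph.get(node, []):
            stack.append((next_exon, path + [next_exon]))
    return paths


# Hypothetical gene with four exons; E2 is a cassette exon that
# can be either included or skipped.
splice_graph = {
    "E1": ["E2", "E3"],
    "E2": ["E3"],
    "E3": ["E4"],
}
isoforms = sorted(enumerate_isoforms(splice_graph, "E1", "E4"))
```

The number of paths can grow exponentially with the number of alternative splicing events, which is one reason isoform reconstruction is described as a challenge.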

Sequence Segmentation
Whole-genome comparisons among mammalian and other eukaryotic organisms have revealed that they contain large quantities of conserved non-protein-coding sequence. Although some of the functions of this non-coding DNA have been identified, there remains a large quantity of conserved genomic sequence that is of no known function. Moreover, the task of delineating the conserved sequences is non-trivial, particularly when some sequences are conserved in only a small number of lineages. Sequence segmentation is a statistical technique for identifying putative functional elements in genomes based on atypical sequence chara...
Source: Springer protocols feed by Bioinformatics - May 1, 2008 Category: Bioinformatics Source Type: info

Discovering Sequence Motifs
Sequence motif discovery algorithms are an important part of the computational biologist's toolkit. The purpose of motif discovery is to discover patterns in biopolymer (nucleotide or protein) sequences in order to better understand the structure and function of the molecules the sequences represent. This chapter provides an overview of the use of sequence motif discovery in biology and a general guide to the use of motif discovery algorithms. The chapter discusses the types of biological features that DNA and protein motifs can represent and their usefulness. It also defines what sequence motifs are, how they are represen...
Source: Springer protocols feed by Bioinformatics - May 1, 2008 Category: Bioinformatics Source Type: info

Modeling Sequence Evolution
DNA and amino acid sequences contain information about both the phylogenetic relationships among species and the evolutionary processes that caused the sequences to diverge. Mathematical and statistical methods try to detect this information to determine how and why DNA and protein molecules work the way they do. This chapter describes some of the most widely used models of evolution of biological sequences. It first focuses on single nucleotide/amino acid replacement rate models. Then it discusses the modelling of evolution at gene and protein module levels. The chapter concludes with speculations about the future use ...
Source: Springer protocols feed by Bioinformatics - May 1, 2008 Category: Bioinformatics Source Type: info