Bioinformaticians Blogs
This page shows you the most recent publications within this specialty of the MedWorm directory.
Inside the Variation Toolkit: Tools for Gene Ontology
GeneOntologyDbManager is a C++ tool that is part of my experimental Variation Toolkit.
This program is a set of tools for GeneOntology; it is based on the sqlite3 library.
Download
Download the sources from Google-Code using subversion:....
svn checkout http://variationtoolkit.googlecode.com/svn/trunk/ variationtoolkit-read-only
... or update the sources of an existing installation...
cd (Source: YOKOFAKUN)
Source: YOKOFAKUN - January 31, 2012 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Does Illumina Also Have A Homopolymer Problem?
One of the most widely-publicized error modes with Ion Torrent and 454 sequencing has been the challenge of correctly counting the number of bases in homopolymer runs. Because these chemistries use non-terminating nucleotides, polymerase is free to add as many as possible. Unfortunately, the signal linearity breaks down, making it difficult to correctly count. Ion Torrent today released a note on homopolymers, but rather than plowing this well-trod ground it goes for a less publicized problem: Illumina having a more specific challenge in this department. The note is available on the Ion Community, free registration...
Source: Omics! Omics! - January 30, 2012 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
Inside the variation toolkit: VCF2XML
vcf2xml is a C++ tool that is part of my Variation Toolkit.
It transforms a "Variant Call Format document" to XML, so it can be later processed with xslt, xquery, etc...
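As a rough illustration of the idea (not the vcf2xml source, which is C++ and is not shown in this excerpt), the Python sketch below turns the fixed VCF columns into XML elements. The element names `vcf`, `meta` and `variant`, and the function name `vcf_to_xml`, are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The eight fixed VCF columns, used here as (hypothetical) element names.
VCF_COLUMNS = ("chrom", "pos", "id", "ref", "alt", "qual", "filter", "info")


def vcf_to_xml(lines):
    """Convert an iterable of VCF text lines into an XML string."""
    root = ET.Element("vcf")
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines are kept as plain text.
            ET.SubElement(root, "meta").text = line[2:]
        elif line and not line.startswith("#"):
            # Data lines: one child element per fixed column.
            variant = ET.SubElement(root, "variant")
            for name, value in zip(VCF_COLUMNS, line.split("\t")):
                ET.SubElement(variant, name).text = value
    return ET.tostring(root, encoding="unicode")


SAMPLE_VCF = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t10\t.\tA\tT\t50\tPASS\tDP=12",
    "chr1\t11\t.\tC\tG\t40\tPASS\tDP=9",
]

if __name__ == "__main__":
    print(vcf_to_xml(SAMPLE_VCF))
```

Once the data is in XML, any XSLT or XQuery processor can be applied to it; the sketch ignores genotype columns for brevity.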
Dependencies
libxml http://xmlsoft.org/
Download
Download the sources from Google-Code using subversion:....
svn checkout http://variationtoolkit.googlecode.com/svn/trunk/ variationtoolkit-read-only
... or update the (Source: YOKOFAKUN)
Source: YOKOFAKUN - January 28, 2012 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Insert your VCFs in a sqlite database.
vcf2sqlite is a C++ tool that is part of my Variation Toolkit.
It inserts a "Variant Call Format document" (VCF) into a sqlite3 database.
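The same idea can be sketched in a few lines of Python with the standard `sqlite3` module. This is only an illustration under stated assumptions: the table name `variant`, its columns and the function `load_vcf` are invented here and do not reflect the schema vcf2sqlite actually creates.

```python
import sqlite3


def load_vcf(conn, lines):
    """Insert the eight fixed columns of each VCF data line into sqlite3."""
    # Hypothetical schema: one row per VCF record.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS variant ("
        "chrom TEXT, pos INTEGER, id TEXT, ref TEXT, "
        "alt TEXT, qual TEXT, filter TEXT, info TEXT)"
    )
    rows = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip meta-information and header lines
        fields = line.split("\t")
        rows.append((fields[0], int(fields[1]), *fields[2:8]))
    conn.executemany("INSERT INTO variant VALUES (?,?,?,?,?,?,?,?)", rows)
    conn.commit()
    return len(rows)


SAMPLE_VCF = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t10\t.\tA\tT\t50\tPASS\tDP=12",
    "chr2\t110\t.\tC\tG\t40\tPASS\tDP=9",
]

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(load_vcf(connection, SAMPLE_VCF), "records inserted")
```

With the records in a table, questions such as "how many variants per chromosome" become a single SQL `GROUP BY` query.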
Download
Download the sources from Google-Code using subversion:....
svn checkout http://variationtoolkit.googlecode.com/svn/trunk/ variationtoolkit-read-only
... or update the sources of an existing installation...
cd variationtoolkit
svn update
... and (Source: YOKOFAKUN)
Source: YOKOFAKUN - January 28, 2012 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Wiider postmortem
I always intended to write this postmortem earlier … now three years after development ceased, I’m finally getting around to it. Warning – retrospective rambling ahead.
In mid 2007, Nintendo released the Opera-powered browser for their Wii gaming console which they called the Internet Channel. For many people, including myself, this was the first time they had been able to use “Internet on the TV”. Because of the typical viewing distance, low resolution for CRT-based televisions, and the unique navigation interface using the Wiimote, many web sites were functional but not particularly comfort...
Source: Your bones got a little machine. - January 27, 2012 Category: Bioinformaticians Authors: Andrew Perry Tags: software code nintendo python web2.0 wii Source Type: blogs
de-Bruijn assembler
In the last post I talked about the overlap-layout-consensus (OLC) approach to genome assembly. The approach that is (really!) getting popular these days is the other one, de Bruijn graphs (DBG). It is based on the simple idea of converting the hard problem of finding a Hamiltonian path into the relatively simpler problem of finding an Eulerian path when assembling biological sequences. There is a beautiful tutorial-like introduction co-authored by the 'father' of this idea, Prof. Pevzner.
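A toy version of the de Bruijn idea can be sketched as follows: every k-mer becomes an edge between its (k-1)-mer prefix and suffix, and an Eulerian path through those edges spells the sequence. The function names are made up for this example, the walk uses Hierholzer's algorithm, and the sketch assumes error-free reads, a non-empty and connected graph, and no repeated (k-1)-mers, none of which holds for real data.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):  # distinct k-mers only, one edge each
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Walk every edge once (Hierholzer); assumes such a path exists."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    nodes = set(graph) | set(in_degree)
    # Start where out-degree exceeds in-degree by one, if such a node exists.
    start = next(
        (n for n in sorted(nodes) if len(graph.get(n, [])) - in_degree[n] == 1),
        min(nodes),
    )
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    print(assemble(["ATGCG", "GCGTA", "GTAC"], 3))
```

The point of the construction is that an Eulerian path can be found in time linear in the number of edges, whereas no efficient algorithm is known for the Hamiltonian path problem.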
To put it naively, in contrast with the OLC approach, where one represents the reads as vertices in a graph which are connected if the end of the read repres...
Source: Bioinformatics Latest News - January 27, 2012 Category: Bioinformaticians Authors: Animesh Sharma Source Type: blogs
Reproducible research: three links that made me think
I’m constantly amazed, bemused and troubled by how little published scientific research is genuinely reproducible, in that you or I (or even the original authors) could go back and check the results. Three examples from around the Web converged in my mind this week.
Software availability
A BioStar user asks: where is the software for the method described in a Science article, “A Composite of Multiple Signals Distinguishes Causal Variants in Regions of Positive Selection.”
No-one can find it on the Web; the best we can do is a press release from 2010 stating that the software “should soon be availab...
Source: What You're Doing Is Rather Desperate - January 26, 2012 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics publications statistics reproducibility retraction Source Type: blogs
Roche Guns For Illumina
Due to a business dinner & general exhaustion, I turned in early last night & was caught unaware this morning of the big news: Roche is making a hostile takeover bid for Illumina. Ugh! (Source: Omics! Omics!)
Source: Omics! Omics! - January 25, 2012 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
Science Online 2012 on Twitter
Science Online 2012, held at NC State University in Raleigh, NC, last week was a transforming event. The 450 attendees were a colorful mix of science writers, journalists, researchers, educators and artists. Apparently, the ratio of scientists dropped in its 6th installment. But the love of Science in the air would have graced any proper scientific meeting. Much of it has already been praised in readable form and can be accessed in a wiki repository. So why was Science Online 2012 so great? The reliable WLAN, the conference hall, the sessions run by the usual suspects, more or less prepared? I fail to answer tha...
Source: Notes from the biomass - January 23, 2012 Category: Bioinformaticians Authors: Roland Krause Tags: Blogs Conferences #scio12 Source Type: blogs
Sequencing Technology Fireworks
I actually awoke today expecting an exciting press release, but I sure wasn't prepared for the big announcements from Ion and Illumina. Not that they were totally unexpected, but there's a huge difference between speculation and announced products (which, of course, are hugely different from ones you can actually buy!) (Source: Omics! Omics!)
Source: Omics! Omics! - January 10, 2012 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
A CGI-version of samtools tview.
I've created a lightweight CGI-based web application for samtools tview. This C++ program, named ngsproject.cgi, uses the samtools API; it allows any user to visualize all the alignments in a given NGS project. The projects and their BAMs are defined on the server side using a simple XML document, e.g.:
/home/lindenb/samtools-0.1.18/ (Source: YOKOFAKUN)
Source: YOKOFAKUN - January 7, 2012 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Ion Torrent Pairs: To What End?
Ion Torrent quietly released a set of paired end datasets over the holiday break. This is a bit embarrassing for me, as in my last post on Ion I stated the platform "will probably never have paired ends" and in fact Ion had already announced the protocol. Oy! I also missed their mate pair protocol being released, though the document itself is another victim of Ion's incredibly counterproductive security policy. If you don't own a PGM, you can't access the document -- never mind if you are trying to plan for a potential buy or are preparing a library for a friend/collaborator to run.
Source: Omics! Omics! - January 7, 2012 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
The Variation Toolkit
Over the last few weeks, I've worked on an experimental C++ package named The Variation Toolkit (varkit). It was originally designed to provide command-line equivalents of knime4bio, but I've added more tools over time. Some of those tools are very simple-and-stupid (fasta2tsv), reinvent the wheel ("numericsplit"), are part of an answer to biostar, are some old tools (e.g. bam2wig) that have (Source: YOKOFAKUN)
Source: YOKOFAKUN - January 5, 2012 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Hamiltonian Assembler
Just came across Andrew's Hamiltonian Cycle finder (HCF)
get:
wget https://raw.github.com/ahh/ahh-toys/master/ham.sh
simple check:
bash ham.sh a b b c c a [followed by Ctrl-D should produce]
a b c
based purely on shell commands and thought of testing it out as a Genome assembler. The genome assembly problem is closely related to finding the shortest common superstring (S) of a given set of strings (s1, s2… sn). The superstring S corresponds to the genome and the set of strings being the short sequence reads produced by the sequencing machines.
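To make the superstring formulation concrete, here is a brute-force Python sketch. It is not a port of ham.sh (whose contents are not shown here): it simply tries every ordering of the reads, which corresponds to trying every Hamiltonian path through the overlap graph, and merges neighbours on their longest suffix/prefix overlap. The function names are invented for this example, it assumes a non-empty list of reads where no read is contained in another, and it is only usable for a handful of reads because the number of orderings grows factorially.

```python
from itertools import permutations


def overlap(left, right):
    """Length of the longest suffix of `left` that is a prefix of `right`."""
    for size in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_in_order(reads):
    """Glue reads together in the given order, collapsing the overlaps."""
    merged = reads[0]
    for read in reads[1:]:
        merged += read[overlap(merged, read):]
    return merged


def shortest_superstring(reads):
    """Try every ordering and keep the shortest merged string."""
    return min((merge_in_order(order) for order in permutations(reads)), key=len)


if __name__ == "__main__":
    print(shortest_superstring(["ATGC", "GCTA", "TACG"]))
```

The factorial cost of this search is exactly why practical assemblers rely on heuristics or on a different graph formulation.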
The popular way...
Source: Bioinformatics Latest News - January 5, 2012 Category: Bioinformaticians Authors: Animesh Sharma Source Type: blogs
2011 blog stats courtesy of WordPress.com
The kind people at WordPress.com have prepared a 2011 annual report for this blog.
Click here to see the complete report.
Filed under: this blog Tagged: 2011, summary, wordpress.com (Source: What You're Doing Is Rather Desperate)
Source: What You're Doing Is Rather Desperate - December 31, 2011 Category: Bioinformaticians Authors: nsaunders Tags: this blog 2011 summary wordpress.com Source Type: blogs
A Fundamental Breakthrough in Protein Folding
In my humble opinion, the biggest paper in protein folding from the last few years just got published in the wee hours of 2011. It is Protein 3D Structure Computed from Evolutionary Sequence Variation from Debora Marks, Lucy Colwell and colleagues (and when I say colleague I mean Chris Sander, whom you should all know as a co-author of DSSP). This paper proves the tremendous result that the key structural contacts in a protein structure can be derived from a multiple sequence alignment, and that these contacts are sufficient to generate reliable structures of the protein. And big proteins at that.
I heard on the grapev...
Source: Trapped in the USA - December 28, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
Year's End
I hoped this year to push myself to blog more frequently and regularly. Clearly I did better than some years, but not up to the standard I had hoped for. I've also realized that I missed noting some significant personal milestones. (Source: Omics! Omics!)
Source: Omics! Omics! - December 28, 2011 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
Sequencing for relics from the Sanger era part 1: getting the raw data
Sequencing in the good old days
In another life, way back in the mists of time, I did a Ph.D. Part of my project was to sequence a gene from a bacterium, which encoded an enzyme involved in nitrate metabolism. It took the best part of a year to obtain ~ 2 000 bp of DNA sequence: partly because I was rubbish at sequencing, but also because of the technology at the time. It was an elegant biochemical technique called the dideoxy chain termination method, or “Sanger sequencing” after its inventor. Sequence was visualized by exposing radioactively-labelled DNA to X-ray film, resulting in images like the one at left,...
Source: What You're Doing Is Rather Desperate - December 21, 2011 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics research diary how to next-generation ngs sequencing tutorial Source Type: blogs
Books Read 2011
I started this year with the good intention of writing a paragraph or three for each book I read. A short review so to speak. I managed for about 15 books up to April. It’s too much work to write about what I just read. Easier to read something new. So here’s my reading list for this year (managed to read a lot of technical shit):
1. Stendhal, The Red and the Black
2. Stieg Larsson, The Girl Who Kicked the Hornet’s Nest
3. Neil Howe and William Strauss, The Fourth Turning
5. Edith Wharton, “Age of Innocence”.
6. Gary Taubes, “Why we get Fat”
7. Nathan Haren & Mike Cliffe Jones, “Beyond Blogging...
Source: Trapped in the USA - December 19, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
Magic Numbers and unit conversions in Structural Biology
If you end up doing any kind of energy calculation in proteins or organic chemistry – and that includes messing around with Molecular Dynamic trajectories – you may end up dealing with actual numbers.
And that means you’ll have to get your head around physical units and their conversions.
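As a small illustration of the kind of conversion involved, the Python sketch below converts between kcal/mol and kJ/mol and evaluates the thermal energy RT at a given temperature. The excerpt does not list the author's own numbers, so the constants here are the standard ones (the thermochemical calorie of exactly 4.184 J and the CODATA 2018 gas constant), and the function names are made up for this example.

```python
# One thermochemical calorie is defined as exactly 4.184 joules.
KJ_PER_KCAL = 4.184
# Molar gas constant in J/(mol*K), CODATA 2018 value.
GAS_CONSTANT_J_PER_MOL_K = 8.314462618


def kcal_to_kj(energy_kcal_per_mol):
    """Convert an energy from kcal/mol to kJ/mol."""
    return energy_kcal_per_mol * KJ_PER_KCAL


def kj_to_kcal(energy_kj_per_mol):
    """Convert an energy from kJ/mol to kcal/mol."""
    return energy_kj_per_mol / KJ_PER_KCAL


def thermal_energy_kcal(temperature_kelvin):
    """Return RT in kcal/mol at the given temperature."""
    joules_per_mol = GAS_CONSTANT_J_PER_MOL_K * temperature_kelvin
    return joules_per_mol / 1000.0 / KJ_PER_KCAL


if __name__ == "__main__":
    print(f"1 kcal/mol = {kcal_to_kj(1.0)} kJ/mol")
    print(f"RT at 300 K = {thermal_energy_kcal(300.0):.3f} kcal/mol")
```

Naming the constants, rather than pasting a bare 0.596 or 4.184 into an equation, is what keeps such values from becoming magic numbers.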
I’ve spent days trying to figure out magic numbers in equations and source-code. Diving into the guts of someone else’s source-code is not the nicest place to figure such things out. Do it enough, and you’ll start seeing the same numbers pop up everywhere. As I’ve never seen anyone bother to...
Source: Trapped in the USA - December 14, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
#arseniclife: the genome
It’s about one year since the science story dubbed #arseniclife hit the headlines. November 30th saw the release of a draft genome sequence for Halomonas sp. GFAJ-1, the bacterium behind the furore.
As Iddo pointed out on Twitter, sequencing the DNA from GFAJ-1 is itself strong evidence against arsenate in the DNA backbone, since the sequencing chemistry would be highly unlikely to work in that case. However, if like me you think that a new microbial genome provides the most fun to be had in bioinformatics [*], you’ll be excited by the availability of the data.
In this post then: where to get it, some very prel...
Source: What You're Doing Is Rather Desperate - December 13, 2011 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics genomics arseniclife extremophiles halomonas microbiology Source Type: blogs
Reflecting on a Year of Ion Torrent
Ion Torrent released three more datasets this morning, all generated on their 318 chip. One's from E. coli but two are human genomic samples. With approximately 1.2Gbp of raw data coming from these 318 chips (from around 6 million quality filtered reads per chip), they are starting to move up the food chain in human genomics from pure amplicon sequencing to more complex small targeted resequencing efforts. (Source: Omics! Omics!)
Source: Omics! Omics! - December 9, 2011 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
New Look, Mobile Friendly
I’ve been meaning to do this for a while: I’ve updated the look of the blog.
The previous design was inspired by an old design of über minimalist Ev Bogue. Back then I was wandering around the world as a nomadic minimalist.
I’ve since been folded back into academia and it’s time for a change. The new look has two goals:
1. Emphasise the article nature of most of my (longish) posts
2. Responsive-Web-Design
Resize the window and make it real narrow. You will see the design collapse into a linear layout. Perfect for reading on a mobile† device.
†I really mean iPhones, but I...
Source: Trapped in the USA - December 7, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
A Friday round-up
Just a brief selection of items that caught my eye this week. Note that this is a Friday as opposed to Friday, lest you mistake this for a new, regular feature.
1. R/statistics
ggbio
A new Bioconductor package which builds on the excellent ggplot graphics library, for the visualization of biological data.
R development master class
Hadley Wickham recently presented this course on R package development for my organisation. I was on parental leave at the time, otherwise I would have attended for sure.
2. Bioinformatics in the media
DNA Sequencing Caught in Deluge of Data
I described this NYT article as a “surprisingl...
Source: What You're Doing Is Rather Desperate - December 1, 2011 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics research diary statistics web resources Source Type: blogs
Suggest some new terms for the EDAM Ontology for Bioinformatics
EDAM is an ontology of general bioinformatics concepts, including topics and data types, formats, identifiers and operations.
Is your specific subject of research present in this ontology (e.g. "RNA-Seq")? Go and have a look at http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=EDAM. If it is not, feel free to suggest a new term in the form below. Your term might be included in the next (Source: YOKOFAKUN)
Source: YOKOFAKUN - December 1, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Special Topic Issue "Webscience in medicine"
A call for papers has been released for a special topic issue on Webscience in medicine. The special topic will appear in the journal "Methods of Information in Medicine". (Source: Medical Ecosystem for Personalized Event-Based Surveillance)
Source: Medical Ecosystem for Personalized Event-Based Surveillance - November 29, 2011 Category: Bioinformaticians Source Type: blogs
Boring, monotonous day-to-day tasks? That’s synonymous with bioinformatics.
In response to this question, I can only point out that J.C.R. Licklider figured it out over 50 years ago:
Despite the fact that there is a voluminous literature on thinking and problem solving, including intensive case-history studies of the process of invention, I could find nothing comparable to a time-and-motion-study analysis of the mental work of a person engaged in a scientific or technical enterprise. In the spring and summer of 1957, therefore, I tried to keep track of what one moderately technical person actually did during the hours he regarded as devoted to work. Although I was aware of the inadequacy of the sa...
Source: What You're Doing Is Rather Desperate - November 24, 2011 Category: Bioinformaticians Authors: nsaunders Tags: bioinformatics productivity biostar licklider Source Type: blogs
Processing json data with apache velocity.
I've written a tool named "apache velocity" which parses JSON data and processes it with "Apache Velocity" (a template engine). The (javacc) source code is available here:
https://github.com/lindenb/jsandbox/blob/master/src/sandbox/VelocityJson.jj
Example
Say you have defined some classes using JSON:
[
{
"type": "record",
"name": "Exon",
"fields" : [
{"name": "start" (Source: YOKOFAKUN)
Source: YOKOFAKUN - November 20, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
OSGi and Eclipse RCP Talk
I gave a talk at the Seattle Java User Group this week, where I talked about two technologies I use at work: the OSGi module system and the Eclipse Rich Client Platform. Giving this talk was a good experience, and the audience was great. You can watch the talk below. (Source: eric.jain.name)
Source: eric.jain.name - November 17, 2011 Category: Bioinformaticians Authors: Eric Jain Tags: Life Science Programming Source Type: blogs
"VCF annotation" with the NHLBI GO Exome Sequencing Project (JAX-WS)
The NHLBI Exome Sequencing Project (ESP) has released a web service to query their data. "The goal of the NHLBI GO Exome Sequencing Project (ESP) is to discover novel genes and mechanisms contributing to heart, lung and blood disorders by pioneering the application of next-generation sequencing of the protein coding regions of the human genome across diverse, richly-phenotyped populations and to (Source: YOKOFAKUN)
Source: YOKOFAKUN - November 16, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
The infidelity of the theoretical protein backbone
Problems in protein simulations are often reduced to one of two categories:
do we have a sufficiently accurate model of atomic interactions?
have we sufficiently explored the conformational space of our proteins?
If you talk to molecular modellers, they will try to tell you the problem lies wholly in conformational sampling. It is an article of their faith that force-fields are good enough. Our erstwhile modeller can then claim that the problem lies not in software but in hardware. It’s just that their computers are too slow to finish their simulations properly.
Well, I beg to differ. One major flaw in at...
Source: Trapped in the USA - November 15, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
Some English pronunciation tips
Although I learnt English as a second language, I now wear it like a well-worn glove. But in science, there are tons of made-up poly-syllabic words, and I occasionally trip over some of them, such as: equilibrate, equilibrium and equilibration.
However, I’ve been learning a bunch of languages the last few years, and through doing that, it’s thrown English pronunciation into relief. I’m starting to get a conscious handle on English pronunciation, and I’ve discovered some useful rules that were not apparent to me before:
English is an accented language. There is one emphasized syllable in every...
Source: Trapped in the USA - November 14, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
Some English pronunciation tips
Although I learnt English as a second language, I now wear it like a well-worn glove. But in science, there are tons of made-up poly-syllabic words, and I occasionally trip over some of them, such as: equilibrate, equilibrium and equilibration.
However, I’ve been learning a bunch of languages the last few years, and through doing that, it’s thrown English pronunciation into relief. I’m starting to get a conscious handle on English pronunciation, and I’ve discovered some useful rules that were not apparent to me before:
English is an accented language. There is one emphasized syllable in every p...
Source: Trapped in the USA - November 10, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
MedEx 2011
On October 28, 2011, the Second International Workshop on Webscience and Information Exchange (MedEx 2011) took place in Glasgow. The audience had the opportunity to learn more about the facets and research questions related to webscience in medicine and healthcare. Three full papers and five position papers were presented, covering a wide spectrum of research problems in medical natural language processing and webscience. Among others, two papers from the M-Eco context were introduced. In her keynote, Wendy Chapman provided insights into her vision of enabling collaboration and sharing in clinical natural language proce...
Source: Medical Ecosystem for Personalized Event-Based Surveillance - November 8, 2011 Category: Bioinformaticians Tags: Webscience MedEx 2011 Source Type: blogs
The paper about BioStar has been published in "PLoS Computational Biology"
The article describing BioStar has been published in PLoS Computational Biology:
BioStar: An Online Question & Answer Resource for the Bioinformatics Community
Laurence D. Parnell, Pierre Lindenbaum, Khader Shameer, Giovanni Marco Dall'Olio, Daniel C. Swan, Lars Juhl Jensen, Simon J. Cockell, Brent S. Pedersen, Mary E. Mangan, Christopher A. Miller, Istvan Albert. 2011
PLoS Comput Biol 7(10 (Source: YOKOFAKUN)
Source: YOKOFAKUN - November 1, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Fitting Out
One of the attractions of my new shop was the possibility to see a biotech company built from the ground up. Each of my previous companies had been a long-standing concern by the time I got there; even Codon Devices had a year plus under its belt and some equipment already mothballed. The new venture moved into its first lab space last week, and as you can see from the picture all we have at the moment there are bare walls. (Source: Omics! Omics!)
Source: Omics! Omics! - October 27, 2011 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
A reference genome with or without the 'chr' prefix
The names of the chromosomes in the fasta files for the human genome are prefixed with 'chr':
$ grep ">" hg19.fa
>chr1
>chr2
>chr3
>chr4
>chr5
>chr6
(...)
The FAIDX index for this fasta file looks like this:
chr1 249250621 6 50 51
chr2 243199373 254235646 50 51
chr3 198022430 502299013 50 51
chr4 191154276 704281898 50 51
chr5 180915260 899259266 50 51
chr6 171115067 1083792838 50 51
(...)
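The five columns of a FAIDX index are the sequence name, its length in bases, the byte offset of its first base, the number of bases per line and the number of bytes per line. As a minimal sketch of how they are used, the Python below computes the byte position of a 0-based base and strips the 'chr' prefix from a name. The class and function names are invented for this example and are not part of samtools.

```python
from typing import NamedTuple


class FaiRecord(NamedTuple):
    name: str
    length: int      # number of bases in the sequence
    offset: int      # byte offset of the first base in the fasta file
    line_bases: int  # bases per line
    line_width: int  # bytes per line, including the newline


def parse_fai_line(line):
    """Parse one whitespace-separated line of a .fai index."""
    name, length, offset, line_bases, line_width = line.split()
    return FaiRecord(name, int(length), int(offset), int(line_bases), int(line_width))


def byte_offset(record, pos0):
    """Byte position in the fasta file of the 0-based base `pos0`."""
    if not 0 <= pos0 < record.length:
        raise IndexError("position outside of sequence")
    full_lines = pos0 // record.line_bases
    return record.offset + full_lines * record.line_width + pos0 % record.line_bases


def strip_chr(name):
    """Return the sequence name without a leading 'chr' prefix."""
    return name[3:] if name.startswith("chr") else name


if __name__ == "__main__":
    chr1 = parse_fai_line("chr1 249250621 6 50 51")
    print(strip_chr(chr1.name), byte_offset(chr1, 50))
```

As a consistency check, the last base of chr1 plus its newline plus the six bytes of the ">chr2" header line lands exactly on the offset listed for chr2 (254235646).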
Source: YOKOFAKUN - October 21, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Running a local JABAWS server for Jalview on Ubuntu (11.04 Natty)
The excellent Jalview sequence alignment visualization and editing tool has the ability to send a set of sequences to a multiple sequence alignment web service (“JABAWS”) and receive the results in a new alignment window. This is really convenient when you are doing lots of sequence analysis, and Geoff Barton’s group at the University of Dundee provide a JABAWS server that Jalview will use by default.
But maybe the Dundee server is down. Or maybe you think your local machine will do things faster. Or maybe you work on über secret sequences in some Faraday cage bunker with no permanent network connectio...
Source: Your bones got a little machine. - October 13, 2011 Category: Bioinformaticians Authors: Andrew Perry Tags: bioinformatics howto linux Source Type: blogs
MiSeq Made Easy?
The first computer I ever tried to program was built from a kit by my brother and father. The DATAC-1000 was a single-board machine, with that single printed circuit board about the area of a large laptop (image on page 9). Sporting a grand 1K of RAM, it was a grand machine. User input-output was entirely through a set of binary touchpads and LEDs, though a cassette tape interface enabled storing and reading programs. If I helped any with it, I might have sorted the resistors since I had just learned the color code. The machine sported the same processor as some other machines of the time, such as the KIM-1 and the...
Source: Omics! Omics! - October 11, 2011 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
Swimming, Running, Hunting, and Meditating: the Evolutionary Origin of Mystical States
I’ve often wondered how mystical states came about, states of mind that take us out of the everyday, such as the samadhi of Buddhist meditation, or the zone of the long-distance runner. Assume, for argument’s sake, that such mystical states are an intrinsic part of our human heritage; then they must have evolved from a mental substratum dating back to a more primordial existence. The question then arises as to what possible use our distant quasi-monkey ancestors might have had for mystical-like states of mind.
Or to rephrase it another way, is there some kind of biological function that would require the evolut...
Source: Trapped in the USA - October 10, 2011 Category: Bioinformaticians Authors: bosco Source Type: blogs
Knime4Bio: a set of custom nodes for the interpretation of NGS data with KNIME
Our paper has just been published in Bioinformatics :-)
http://bioinformatics.oxfordjournals.org/content/early/2011/10/07/bioinformatics.btr554.abstract
Knime4Bio: a set of custom nodes for the interpretation of Next Generation Sequencing data with KNIME.
Pierre Lindenbaum, Solena Le Scouarnec, Vincent Portero and Richard Redon
Summary: Here, we describe Knime4Bio, a set of (Source: YOKOFAKUN)
Source: YOKOFAKUN - October 7, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Grouping mutations/Gene=f(sample)
GroupByGene is a small C++ tool grouping the data:
CHROM
POS
REF
ALT
GENE
SAMPLE
by gene=f(sample). This tool is available on github: https://github.com/lindenb/ccsandbox/blob/master/src/groupbygene.cpp.
Example:
$ cat input.tsv
#CHROM POS REF ALT GENE SAMPLE
chr1 10 A T gene1 indi1
chr1 10 A T gene1 indi2
chr1 11 C G gene1 indi2
chr2 110 C G gene2 indi3
chr3 210 A T gene3 indi1
chr3 211 C T gene3 (Source: YOKOFAKUN)
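The excerpt stops before the tool's output, so the sketch below is only one possible reading of "gene=f(sample)", not the code from groupbygene.cpp. It assumes whitespace-separated input with the six columns of the example header, and it counts the mutation rows per gene and per sample. The function name groupByGene and the output layout (one row per gene, one count column per sample) are assumptions made for illustration.

```cpp
#include <istream>
#include <map>
#include <ostream>
#include <set>
#include <sstream>
#include <string>

// Hypothetical sketch: count mutation rows per gene and per sample.
// Input rows: CHROM POS REF ALT GENE SAMPLE (whitespace-separated).
// Output layout (an assumption): header "#GENE <samples...>", then one
// tab-delimited row per gene holding one count per sample.
void groupByGene(std::istream& in, std::ostream& out) {
    std::map<std::string, std::map<std::string, int>> counts; // gene -> sample -> rows
    std::set<std::string> samples;                            // all samples, sorted
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') continue;         // skip header/comments
        std::istringstream row(line);
        std::string chrom, pos, ref, alt, gene, sample;
        if (!(row >> chrom >> pos >> ref >> alt >> gene >> sample)) {
            continue;                                         // skip incomplete rows
        }
        ++counts[gene][sample];
        samples.insert(sample);
    }
    out << "#GENE";
    for (const std::string& s : samples) out << '\t' << s;
    out << '\n';
    for (const auto& entry : counts) {
        out << entry.first;
        for (const std::string& s : samples) {
            const auto it = entry.second.find(s);
            out << '\t' << (it == entry.second.end() ? 0 : it->second);
        }
        out << '\n';
    }
}
```

Calling groupByGene(std::cin, std::cout) from a main function would turn the sketch into a command-line filter. Ordered containers (std::map, std::set) keep the genes and samples sorted, so the output is deterministic.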
Source: YOKOFAKUN - October 5, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Verticalize: printing the input stream vertically.
A useful tool: verticalize is a small C++ tool printing the input stream vertically. The source is available on github : https://github.com/lindenb/ccsandbox/blob/master/src/verticalize.cpp.
An Example with 1000genomes.org :
$ curl -s "ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20100804/ALL.2of4intersection.20100804.sites.vcf.gz"|\
gunzip -c | grep -v "##" |\
verticalize | head -n 30
>>> (Source: YOKOFAKUN)
Source: YOKOFAKUN - October 5, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Surveillance Workshop at GMDS
On September 28, 2011, the workshop "From indicator-based to event-based surveillance" took place in conjunction with the GMDS Jahrestagung in Mainz. We had four interesting presentations. Tim Eckmanns gave an overview of indicator- and event-based systems and showed the delays in traditional reporting on the EHEC outbreak in 2011. Jas Mantero from ECDC spoke about event-based surveillance as performed by ECDC. Jens Linge from JRC presented the MediSys System as an example of an event-based system. Kerstin Denecke (L3S) introduced the M-Eco project. The audience was very interested in the topic and asked motivating quest...
Source: Medical Ecosystem for Personalized Event-Based Surveillance - September 30, 2011 Category: Bioinformaticians Source Type: blogs
Thinking Outside the Box or Just Plain Nuts?
Please take the title in the spirit it is intended: as a bit lighthearted. I saw the object pictured and read the accompanying blog post from one of Jonathan Eisen's graduate students. It's an unusual solution to a common problem, and gave me a good chuckle. (Source: Omics! Omics!)
Source: Omics! Omics! - September 28, 2011 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
Boston's Boris Bikes
When I discovered that my new gig would temporarily be in Boston, I realized I had an opportunity to try out Boston's new bikeshare program. Started this summer, Hubway consists of racks of bikes in public places which can be used for short hops around town. I like my folding bike, but on some rush hour trains it is very hard to find space for it, especially with some train conductors who are more interested in giving dirty looks than serving their passengers. Plus, it's now quite dark on the last leg of my commute, and even if I had some really slick lights I don't like riding even short distances in the dark...
Source: Omics! Omics! - September 27, 2011 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
---
Nelson Award Nominee @ Hypertext 2011: We were pleased to be nominated for the Nelson Award at Hypertext 2011 for our paper entitled "A Transfer Approach to Detecting Disease Reporting Events in Blog Social Media", Avaré Stewart, et al. We continue our work of tackling the burden of manually labelling data and address the problems associated with building a supervised learner to classify frequently evolving and variable blog content. We automatically classify outbreak reports to train a supervised learner, and the knowledge acquired from the learning process is then transferred to the task of classifying blogs. For more deta...
Source: Medical Ecosystem for Personalized Event-Based Surveillance - September 26, 2011 Category: Bioinformaticians Source Type: blogs
PostScript as a Programming Language for Bioinformatics: mynotebook
"PostScript (PS) is an interpreted, stack-based programming language. It is best known for its use as a page description language in the electronic and desktop publishing areas." [wikipedia]. In this post, I'll show how I've used it to create a simple and lightweight view of the genome.
Introduction: just a simple postscript program
The following PS program fills a rectangular gray shape; You (Source: YOKOFAKUN)
Source: YOKOFAKUN - September 26, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Joining genomic annotations files with the tabix API.
Tabix is a program that is part of the samtools package.
After indexing a file, tabix is able to quickly retrieve data lines overlapping genomic regions (see also my previous post about tabix). Here, I wrote a tool named jointabix that joins the data of a (chrom/start/end) file with a file indexed with tabix. I've posted the code on github at: https://github.com/lindenb/samtools-utilities/blob/ (Source: YOKOFAKUN)
Source: YOKOFAKUN - September 23, 2011 Category: Bioinformaticians Authors: Pierre Lindenbaum Source Type: blogs
Transitions
I went to an Infinity going-away lunch last week. We head off to some favorite local restaurant and order a modest (but delicious) meal on the company dime. The departee makes an impromptu speech, there are goodbyes and handshakes and usually a number of pleas to stay, both fictitious and heartfelt. Those staying wonder what could lure someone away from the very safe and green pastures of the company. I've been to many such lunches with Millennium and Infinity; with Codon the lunches tended to be group affairs as people were laid off in batches. (Source: Omics! Omics!)
Source: Omics! Omics! - September 18, 2011 Category: Bioinformaticians Authors: Keith Robison Source Type: blogs
