<?xml version="1.0" encoding="iso-8859-1"?>
<!-- generator="FeedCreator 1.7.2" -->
<rss version="2.0">
    <channel>
        <title>Beginning Python for Bioinformatics via MedWorm.com</title>
        <description>MedWorm.com provides a medical RSS filtering service. Over 6000 RSS medical sources are combined and output via different filters. This feed contains the latest items from the 'Beginning Python for Bioinformatics' source.</description>
        <link><![CDATA[http://www.medworm.com/rss/search.php?qu=Beginning+Python+for+Bioinformatics&t=Beginning+Python+for+Bioinformatics&s=Search&f=source]]></link>
        <lastBuildDate>Sat, 26 Feb 2011 01:23:34 +0100</lastBuildDate>
        <item>
            <title>Rebuilding</title>
            <link>http://python.genedrift.org/2010/12/28/rebuilding/</link>
            <description>The blog engine (WP) was hacked last week and I&amp;#8217;m still rebuilding some of its features. Sorry for any inconvenience. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4293535</comments>
            <pubDate>Tue, 28 Dec 2010 16:50:54 +0100</pubDate>
            <guid isPermaLink="false">4293535</guid>        </item>
        <item>
            <title>Why I left Biostar, but I still like Stackoverflow</title>
            <link>http://python.genedrift.org/2010/12/01/why-i-left-biostar-but-i-still-like-stackoverflow/</link>
            <description>About eight months ago I started using Biostar as I saw it as a great opportunity to exchange some ideas, concepts, tips in biology and bioinformatics. I even mentioned the website in this space, trying to bring more people to the mix; at the time the community wasn&amp;#8217;t big enough, and some days went by without any question being posted.
But a couple of months ago my interest started to go down the drain. I don&amp;#8217;t know if it was the constant next-generation sequencing barrage of questions every day, if it was the infantile blog/twitter posts from members competing for points or maybe the lack of votes for some answers that I posted (that&amp;#8217;s selfish on my part, I admit). But at some point it seemed that the website turned into a competition of CVs or knowledge, very different ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287288</comments>
            <pubDate>Wed, 01 Dec 2010 15:44:08 +0100</pubDate>
            <guid isPermaLink="false">4287288</guid>        </item>
        <item>
            <title>Preview of Django 1.1 Testing and Debugging</title>
            <link>http://python.genedrift.org/2010/05/20/preview-of-django-1-1-testing-and-debugging/</link>
            <description>Packt Publishing invited me to review Django 1.1 Testing and Debugging by Karen M. Tracey. They also kindly provided a free chapter that you can download from the link below. A full review will be posted as soon as I finish the book.
preview chapter &amp;#8211; Chapter No.3 &amp;#8220;Testing 1, 2, 3: Basic Unit Testing&amp;#8221; (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287289</comments>
            <pubDate>Thu, 20 May 2010 21:34:41 +0100</pubDate>
            <guid isPermaLink="false">4287289</guid>        </item>
        <item>
            <title>Initial impressions about Bioinformatics Programming using Python</title>
            <link>http://python.genedrift.org/2010/05/01/initial-impressions-about-bioinformatics-programming-using-python/</link>
            <description>Last week I made a 5 book order at Amazon and one of them was Bioinformatics Programming Using Python: Practical Programming for Biological Data (Animal Guide) by Mitchell L Model. 
I started reading the book late Friday night, and I&amp;#8217;m on the third chapter, where there&amp;#8217;s an introduction to sequences. So far, I have found the book very confusing, especially as it claims to be a book for people with no programming background. The examples are OK, but there&amp;#8217;s a very messy mixture of Python interpreter and standalone script usage, as the author jumps back and forth between them. Another thing is that some examples are explained in detail, including line numbers, while for others you depend on the code&amp;#8217;s docstring to understand them.
So far, I&amp;#8217;m not impressed. The initial Python se...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287290</comments>
            <pubDate>Sun, 02 May 2010 01:59:17 +0100</pubDate>
            <guid isPermaLink="false">4287290</guid>        </item>
        <item>
            <title>Python for Bioinformatics by Sebastian Bassi: a (short) review</title>
            <link>http://python.genedrift.org/2010/04/19/python-for-bioinformatics-by-sebastian-bassi-a-review/</link>
            <description>I promised some time ago to post a complete review of Python for Bioinformatics (Chapman &amp;#038; Hall/CRC Mathematical &amp;#038; Computational Biology) by Sebastian Bassi. It&amp;#8217;s long overdue, but the delay allowed me to get better acquainted with the book and its contents.
I can only say that I highly recommend this book, especially for the biologist who is beginning in bioinformatics or Python (or both). I cannot compare it to any other Python and bioinformatics books (I&amp;#8217;m planning to buy another one), but I can say that I learned a thing or two from Sebastian&amp;#8217;s book. Evidently it is not a perfect book, as some of the explanations are a little bit rushed and might be difficult for a beginner. At the same time this is a very carefully thought and planned book and has more th...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287291</comments>
            <pubDate>Mon, 19 Apr 2010 14:01:08 +0100</pubDate>
            <guid isPermaLink="false">4287291</guid>        </item>
        <item>
            <title>Biostar: bioinformatics community</title>
            <link>http://python.genedrift.org/2010/04/17/biostar-bioinformatics-community/</link>
            <description>Biostar is a bioinformatics community on the StackExchange network. It&amp;#8217;s still small and not a lot of questions are asked and answered every day, so we need more people participating. If you are new to bioinformatics, or are just curious about the newest trends in the field, help us grow. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287292</comments>
            <pubDate>Sat, 17 Apr 2010 18:26:08 +0100</pubDate>
            <guid isPermaLink="false">4287292</guid>        </item>
        <item>
            <title>The real value of blogging</title>
            <link>http://python.genedrift.org/2010/04/10/the-real-value-of-blogging/</link>
            <description>A couple of days ago I posted here an entry called &amp;#8216;The &amp;#8220;sickest&amp;#8221; Python code I&amp;#8217;ve ever created&amp;#8217;. It&amp;#8217;s code that does some file management for proteomics data, with a different set of inputs each time you run it.
The &amp;#8220;sickest&amp;#8221; part of the title is there because it was a small challenge for me. I&amp;#8217;ve been away from actual hard-core coding for quite some time, and you lose some of the gist of the thing with time. Mostly, nowadays, I write simple scripts that don&amp;#8217;t require any kind of advanced skills (in any language) and I don&amp;#8217;t worry that much about releasing code or about ultra-fast performance. I knew from the time I posted that a lot of people would jump in to help and teach me, as I was aware it wasn&amp;#8217;t the most elegant code out th...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287293</comments>
            <pubDate>Sat, 10 Apr 2010 18:45:03 +0100</pubDate>
            <guid isPermaLink="false">4287293</guid>        </item>
        <item>
            <title>The “sickest” Python code I’ve ever created 1</title>
            <link>http://python.genedrift.org/2010/04/08/the-sickest-python-code-ive-ever-created-1/</link>
            <description>But, I guess, it can be easily refactored/enhanced/despised by the audience that reads or has access to this blog via Planet Python. Anyway, for someone like me, whose main task now is not to generate tons of code and lines, I think the code (or part of it) that I will present below is quite good. Feel free to comment, criticize and say bad and good things about it. 
We needed a script that would take files coming out of protein search engines and compare the peptides and protein sequences, their abundance and some other characteristics. We had a combination of protein and peptide files, with a list of proteins (one protein per line in a tab-delimited file) that was related to a list of peptides in another file (one peptide per line, with multiple peptides/lines related...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287294</comments>
            <pubDate>Fri, 09 Apr 2010 01:59:54 +0100</pubDate>
            <guid isPermaLink="false">4287294</guid>        </item>
        <item>
            <title>Python Testing Beginner’s Guide, review</title>
            <link>http://python.genedrift.org/2010/03/03/python-testing-beginner%e2%80%99s-guide-review/</link>
            <description>I posted about a week ago that Packt Publishing had invited me to review Python Testing Beginner&amp;#8217;s Guide by Daniel Arbuckle. Having finished reading the book (I must admit that I haven&amp;#8217;t tried all the code in it), I can say that I have an excellent initial impression of the book.
PTBG is not a long book and the topic is divided into 10 chapters and one appendix. One of the first things that I liked about the book is that there&amp;#8217;s no introduction (or something similar) to Python. It just goes straight to the point, assuming that you have a good understanding of the language and everything that surrounds it. In the past I was frustrated with some &amp;#8220;Introduction to X with Python&amp;#8221; books that wasted precious space talking over and over about a topic, learning Python, bette...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287295</comments>
            <pubDate>Thu, 04 Mar 2010 03:46:35 +0100</pubDate>
            <guid isPermaLink="false">4287295</guid>        </item>
        <item>
            <title>Preview of Python Testing Beginner’s Guide</title>
            <link>http://python.genedrift.org/2010/02/22/preview-of-python-testing-beginners-guide/</link>
            <description>I was invited by Packt Publishing to review Python Testing Beginner&amp;#8217;s Guide by Daniel Arbuckle. This is a book on one of the most important aspects of scientific programming (even though the majority of scientific software doesn&amp;#8217;t have any testing routines): code testing, checking if your code actually does what it is intended to do. I can say I&amp;#8217;m not really an expert on testing, so I guess I&amp;#8217;m the right audience for it:
You&amp;#8217;ll learn about several of Python&amp;#8217;s automated testing tools, and you&amp;#8217;ll learn about the philosophies and methodologies that they were designed to support, like unit testing and test-driven development. When you&amp;#8217;re done, you&amp;#8217;ll be able to produce thoroughly tested code faster and more easily than ever before, and you&amp;#8217;...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287296</comments>
            <pubDate>Mon, 22 Feb 2010 16:00:53 +0100</pubDate>
            <guid isPermaLink="false">4287296</guid>        </item>
        <item>
            <title>Preliminary review of Python for Bioinformatics by Sebastian Bassi</title>
            <link>http://python.genedrift.org/2010/01/03/preliminary-review-of-python-for-bioinformatics-by-sebastian-bassi/</link>
            <description>Let me start by saying that Python for Bioinformatics (Chapman &amp; Hall/CRC Mathematical &amp; Computational Biology) is a massive book, massive in the sense that it contains a lot of material. I still haven&amp;#8217;t had enough time to check everything, but I&amp;#8217;m well into the first section of the book, which gives an initial view of Python and how to program in it. 
The initial section of the book is well written (I&amp;#8217;m not going to criticize the book in terms of good/poor English, as I&amp;#8217;m not well qualified to do that), and gives a clear perspective on how to program in Python for scientists, who are the main target demographic of the book. Of course, it always helps to have some basic knowledge of command line shells, but the book also includes some explanations of IDLE and other Python-c...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287297</comments>
            <pubDate>Sun, 03 Jan 2010 19:25:00 +0100</pubDate>
            <guid isPermaLink="false">4287297</guid>        </item>
        <item>
            <title>This is (more or less) the end</title>
            <link>http://python.genedrift.org/2009/05/22/this-is-more-or-less-the-end/</link>
            <description>So, I&amp;#8217;m closing the blog, maybe for good, maybe not. I haven&amp;#8217;t been updating it and some other responsibilities are consuming my spare time. 
I would like to thank everyone that contributed, commented and read it. You have my deepest appreciation. My work is fulfilled if I helped at least one person along the way.
Cheers
Paulo
PS: there&amp;#8217;s the wiki, so register and help me improve it.
PS II: sorry that I couldn&amp;#8217;t finish the last project. Maybe some other time. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287298</comments>
            <pubDate>Sat, 23 May 2009 02:19:01 +0100</pubDate>
            <guid isPermaLink="false">4287298</guid>        </item>
        <item>
            <title>Wiki</title>
            <link>http://python.genedrift.org/2009/05/11/wiki/</link>
            <description>I&amp;#8217;m slowly moving the posts from the blog to a wiki. It makes it easier to display post series and allows people to modify/enhance/discuss.
The wiki address is http://wiki.genedrift.org. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287299</comments>
            <pubDate>Mon, 11 May 2009 22:43:31 +0100</pubDate>
            <guid isPermaLink="false">4287299</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 8</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/8mqTN_35zw0/</link>
            <description>Thanks to the comments and suggestions on the last post, it&amp;#8217;s now possible to make a more Pythonic and clearly generic database update class. Let&amp;#8217;s check how the &amp;#8220;generic&amp;#8221; update/edit entry function currently looks:

def update_data(self, values_list):
  &amp;#039;&amp;#039;&amp;#039;edits and updates fields&amp;#039;&amp;#039;&amp;#039;

  if sys.platform == &amp;#039;darwin&amp;#039;:
    (cursor, database) = link_db(self.db_path)
  else:
    (cursor, database) = link_db()

  cursor.execute(&amp;quot;UPDATE bac SET projects = ?, comments = ?, temperature = ?, cell = ?, box = ?, tubes = ?, chromosome = ?, sdate = ?, clone = ?, source
	= ?, location1 = ?, startpos = ?, endpos = ?,
	gene = ?, genelink = ?, dnaex = ?, validation = ?, pcr = ?, refs = ?, antibiotic = ? WHERE idbac = ?&amp;...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2357443</comments>
            <pubDate>Wed, 22 Apr 2009 22:04:39 +0100</pubDate>
            <guid isPermaLink="false">2357443</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 7 (includes a question)</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/nPwfjuPGLhc/</link>
            <description>And we&amp;#8217;re back. After a couple of weeks of inactivity we will get back to our small soap opera of Python, wxPython and SQLite. Continuing with our database management code, let&amp;#8217;s check two other functions that have changed since the first version of the code. The first one is the insert_data function, which now looks like this:

def insert_data(self, values_list, insert_string):
  &amp;#039;&amp;#039;&amp;#039;inserts data in the database&amp;#039;&amp;#039;&amp;#039;

  if sys.platform == &amp;#039;darwin&amp;#039;:
    (cursor, database) = link_db(self.db_path)
  else:
    (cursor, database) = link_db()

  cursor.execute(insert_string % self.table_name, values_list)

  database.commit()
  database.close()

Basically no changes, apart from the obvious check for the current running operating system, which was explaine...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2346929</comments>
            <pubDate>Mon, 20 Apr 2009 17:22:31 +0100</pubDate>
            <guid isPermaLink="false">2346929</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 6</title>
            <link>http://python.genedrift.org/2009/03/31/managing-a-simple-database-with-python-sqlite-and-wxpython-6/</link>
            <description>Let&amp;#8217;s get back to our SQLite and wxPython project. We haven&amp;#8217;t seen anything on wxPython yet, and we will check the interface only on the next post. For now, let&amp;#8217;s see some extra code added to the SQLite access class. Remember that we have a generic class and one class derived from it that would work on accessing specific tables in our database file.
When we last covered the db access routines, there was no search for an entry (the function returned everything in the table no matter what), there was no update function in case someone would want to modify an entry and there was no delete method if you wanted to delete something. In the meantime, I added all of this functionality (and some other) to the generic class and extended it to the class derive...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287302</comments>
            <pubDate>Tue, 31 Mar 2009 17:06:08 +0100</pubDate>
            <guid isPermaLink="false">4287302</guid>        </item>
        <item>
            <title>RoR commits</title>
            <link>http://python.genedrift.org/2009/03/15/ror-commits/</link>
            <description>Just illustrating my point (or lack of), an animation about the commits of RoR to its repository. Notice the jump after it was migrated to GitHub.
Ruby on Rails from Ilya Grigorik on Vimeo.
Sorry for the non-Python post. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287303</comments>
            <pubDate>Sun, 15 Mar 2009 16:59:18 +0100</pubDate>
            <guid isPermaLink="false">4287303</guid>        </item>
        <item>
            <title>BioPython and CVS</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/YorBykGxURk/</link>
            <description>I start this post with an apology. I usually don&amp;#8217;t rant or vent here; those are feelings that I usually reserve for my personal blog.
I don&amp;#8217;t use BioPython, never used it. I have it installed on my systems, but I never wrote a piece of code importing BioPython routines. But I subscribe to their mailing lists, both user and developer. I may have written once to the list, and I just follow the discussions there. 
Since last year one of the main topics has been the possibility of moving BioPython from CVS to another version control system. Yes, you read that right. It&amp;#8217;s 2009 and BioPython uses CVS as their version control system. Soon, CVS will be like typewriters and LPs to young developers. Last stable release of CVS was sometime in 2005, what in int...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2260001</comments>
            <pubDate>Fri, 13 Mar 2009 20:41:36 +0100</pubDate>
            <guid isPermaLink="false">2260001</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 5</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/Wkmik3NTYGg/</link>
            <description>We have seen how to connect, get and insert data (at least theoretically) in the database. Now, a little note about the SQL engine of choice here: SQLite. The main characteristic of SQLite databases is that they are self-contained files. SQLite also does not require an installation, works without a server and works pretty well on most operating systems. 
Basically, for the type of application we&amp;#8217;re developing here, SQLite seems ideal. It eliminates a lot of infrastructure that would be needed if we were working with MySQL or PostgreSQL. We don&amp;#8217;t need a server, nor do we need to know how to configure users or manage the databases and tables. All we need is contained in a single file that can be transported from system to system and can be accessed from the computers used in the lab, mainly XP and OS X....</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2240510</comments>
            <pubDate>Thu, 05 Mar 2009 19:10:35 +0100</pubDate>
            <guid isPermaLink="false">2240510</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 4</title>
            <link>http://python.genedrift.org/2009/03/02/managing-a-simple-database-with-python-sqlite-and-wxpython-4/</link>
            <description>Let&amp;#8217;s continue building our small db app. As mentioned in the previous post, we now need to derive a specific class from our generic SQLite access class. To do this we just have to declare a new class that inherits from DB_Generic. 
[sourcecode language='python']class Bac(DB_Generic):[/sourcecode]
This new class is called Bac because it&amp;#8217;s linked to the bac table in our database file. A side note, bacs are Bacterial Artificial Chromosomes and are used in different molecular biology techniques. Mainly in our case bacs have incorporated human DNA segments and are used as probes for deletion, duplication, etc studies.
Now, back to our Python code, as soon as we instantiate our generic class, the object (class) we create has access to all methods...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287306</comments>
            <pubDate>Mon, 02 Mar 2009 16:58:09 +0100</pubDate>
            <guid isPermaLink="false">4287306</guid>        </item>
        <item>
            <title>Why do I blog. Or: Science Blogging, is it worth?</title>
            <link>http://python.genedrift.org/2009/02/21/why-do-i-blog-or-science-blogging-is-it-worth/</link>
            <description>Mirroring the post that appeared on Blind.Scientist
Some time ago there was a meme about science blogging and one of the questions was &amp;#8220;why do you blog&amp;#8221;. Well, I do it because of the &amp;#8220;Nada Surf effect&amp;#8221;. You don&amp;#8217;t know the &amp;#8220;Nada Surf effect&amp;#8221;? Pity you weren&amp;#8217;t in Washington, DC in 2001. 
In March or April of 2001, Nada Surf played a concert there. It was a small bar on 14th Street W, close to the more famous Black Cat. It was a spring night, I was with a couple of Dutch friends that had told me about the concert, if I&amp;#8217;m not wrong, a couple of days before. It was also mid-week, so you wouldn&amp;#8217;t expect big crowds in most concerts. We left ISH around 7 pm, with spare time for the 9 pm concert. We didn&amp;#8217;t know t...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=4287307</comments>
            <pubDate>Sat, 21 Feb 2009 23:18:24 +0100</pubDate>
            <guid isPermaLink="false">4287307</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 3</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/9xptBVtfhlg/</link>
            <description>In the last post we saw how to connect to a SQLite database file and generate a cursor that allows us to actually interact with that database. Now we need some functionality that will interact with the data: add, read, delete and search. As was mentioned before, the idea is to have a generic database interaction class and have unique instantiated class objects for each database of the project. In the db_obj.py file we have an initial structure set, so let&amp;#8217;s check the DB_Generic class. 

class DB_Generic():
  &amp;#039;&amp;#039;&amp;#039;generic class to add DB functionality&amp;#039;&amp;#039;&amp;#039;
  def __init__(self, table_name):
    #par= name of the table to be used
    self.table_name = table_name

  def delete_entry(self):
    pass

  def get_data_generic(self):
    &amp;#039;&amp;#039;&amp;#039;gets th...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2240513</comments>
            <pubDate>Wed, 18 Feb 2009 16:06:58 +0100</pubDate>
            <guid isPermaLink="false">2240513</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 2</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/Cd7lPFdF7H8/</link>
            <description>Let&amp;#8217;s continue coding our small Python + SQLite application. The initial idea was to have a file for the interface and another file for the DB access. We will start with the latter. If you have access to the repository you will see two Python files, bac_form.py and db_obj.py. At the moment they are not well commented and have some junk lines at the bottom, legacy from older versions. Take a look at db_obj.py.
It has two class declarations, one called DB_Generic and another one called Bac. Remember in the last post where I mentioned that the idea was to have different simple tables in the same SQLite database and each table would have a simple input/output interface (If I didn&amp;#8217;t mention that, I just did!). So, we can create a generic DB access class and we can subtype from it fo...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2240514</comments>
            <pubDate>Tue, 17 Feb 2009 16:04:18 +0100</pubDate>
            <guid isPermaLink="false">2240514</guid>        </item>
        <item>
            <title>Managing a simple database with Python, SQLite and wxPython, 1</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/pezmeguuCx4/</link>
            <description>A little break from reviewing the book: let&amp;#8217;s check some database topics in Python. I was asked to create a simple database to organize wet-lab stuff. No relationships needed, no relational tables required. Just a simple table with determined columns, and a nice GUI to go with it so people can edit, search and use.
My first idea was to use a SQLite database, and I stuck with it. After the initial phase of &amp;#8220;interviews&amp;#8221; to check database requirements, I ended up with a list of tables and decided to start working on the table that organizes the BACs used in the lab. A BAC is a DNA vector into which large DNA fragments can be inserted and cloned in a bacterial host; BACs are used mainly in cytogenetics around here. In the end the table had this structure

CR...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2240515</comments>
            <pubDate>Mon, 09 Feb 2009 20:11:57 +0100</pubDate>
            <guid isPermaLink="false">2240515</guid>        </item>
        <item>
            <title>Expert Python Programming by Tarek Ziadé – a review of Chapter 3</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/YPd4AeGcGoQ/</link>
            <description>The chapter 3 review that I promised for &amp;#8220;tomorrow&amp;#8221; (last Saturday) was lazily postponed until today. So, let&amp;#8217;s get to it. Tarek in this chapter continues with syntax best practices, but this time at the class level. As expected, the chapter requires that you have a minimal knowledge of Python classes, so I can say it&amp;#8217;s geared to somewhat experienced programmers, and not to newcomers. There is a short explanation on sub-classing that warms things up for the next sections. 
Next is the built-in method (type?) super, which was new to me. Basically super gives you access to a method or attribute of a class by calling its parent directly. This is a segue into understanding the Method Resolution Order in Python, which is understanding which class has precedence over the other...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2521943</comments>
            <pubDate>Mon, 09 Feb 2009 18:29:24 +0100</pubDate>
            <guid isPermaLink="false">2521943</guid>        </item>
        <item>
            <title>Expert Python Programming by Tarek Ziadé – a review of Chapter 2</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/MskQSrJarI8/</link>
            <description>So we&amp;#8217;re up to the second chapter of Tarek&amp;#8217;s book. A short disclaimer before diving into it. I started this blog basically one year after I had started programming with Python. The initial idea was to &amp;#8220;convert&amp;#8221; the Beginning Perl for Bioinformatics book to Python and see what the advantages and disadvantages of both languages were. I was far from being an advanced Python programmer, and the inception of the blog helped me get closer to that, even though I consider myself far from being an expert programmer in Python. I learned a lot working on converting the Perl and learned a lot from the comments and interaction with other programmers and visitors of the blog. As with anything in life, one&amp;#8217;s path is long and tortuous and there&amp;#8217;s nothin...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2521944</comments>
            <pubDate>Fri, 06 Feb 2009 16:08:03 +0100</pubDate>
            <guid isPermaLink="false">2521944</guid>        </item>
        <item>
            <title>Expert Python Programming by Tarek Ziadé - a review of Chapter 1</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/kH34f-snfyU/</link>
            <description>I bought (no, Packt Publishing didn&amp;#8217;t send me a copy for review) Tarek&amp;#8217;s book quite some time ago, but job changes and extra-Python issues kept me away from reading it with the attention it fully deserves. When I saw the announcement, I thought that this was the book I wanted in Python. First, a little bit of perspective.
I&amp;#8217;m a biologist, self-taught programmer/coder/you-name-it. I only had a brief course on programming logic with Pascal in 1993 (I think). I first learned Basic on Apple ][, then on PC, then moved to Visual Basic, Pascal, C and C++, most of them with the help of books. About three and a half years ago, I got tired of compiling things and decided to learn a different language that would be more agile to code with. Not liking Perl made me check Py...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2163490</comments>
            <pubDate>Thu, 05 Feb 2009 23:20:26 +0100</pubDate>
            <guid isPermaLink="false">2163490</guid>        </item>
        <item>
            <title>BPforB is now PEP 8 compliant!</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/QRXfGixp_lY/</link>
            <description>As mentioned in the previous post, Robin Stocker kindly provided a git patch with the changes required to make all scripts stored in the repository compliant with PEP 8.
The changes were mainly to variable/object names, but they were important as they make the code available here more Pythonic, following the rules of the Benevolent Dictator for Life.
I would like to thank Robin for spending his time doing this. Much appreciated.
Now, just a quick git tutorial on how to apply patches:
git apply __patch_file__
git commit -a -m &amp;quot;patch applied&amp;quot;
git push
That&amp;#8217;s it. Apply, commit, push and you&amp;#8217;re done. The repository is already updated. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2149561</comments>
            <pubDate>Sun, 01 Feb 2009 02:31:36 +0100</pubDate>
            <guid isPermaLink="false">2149561</guid>        </item>
        <item>
            <title>Finally it’s 2009 …</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/KEdngzp6Tmo/</link>
            <description>And &amp;#8230; we&amp;#8217;re back. The long and cold winter is still out there and January 2009 is almost in the books. After a long period without updating I&amp;#8217;ll try to &amp;#8220;rush&amp;#8221; some posts this week, trying to get back on track. So, a little bit of what&amp;#8217;s up and coming:
- a patch provided by Robin Stocker to make all scripts published here (at least the ones on GitHub) PEP 8 compliant.
- using SQLite databases in Python
- developing an interface to access the database
- anything that you might suggest, just leave a comment.
Let&amp;#8217;s start 2009 then. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2149562</comments>
            <pubDate>Sat, 31 Jan 2009 17:30:45 +0100</pubDate>
            <guid isPermaLink="false">2149562</guid>        </item>
        <item>
            <title>Twitter</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/4hIduVFCnJ0/</link>
            <description>I&amp;#8217;ve been on Twitter for quite some time. Some Python stuff, some biology, some bioinformatics, and a little bit of everything else.
nuin. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2147492</comments>
            <pubDate>Sat, 31 Jan 2009 00:47:48 +0100</pubDate>
            <guid isPermaLink="false">2147492</guid>        </item>
        <item>
            <title>Python Magazine?</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/SydGtLeBdLY/</link>
            <description>I have been buying Python Magazine for the last few months and I really like it, especially now that I already miss Linux Magazine and have no close source for Linux Journal (I should subscribe, I know). Last week I got an email from Python Magazine saying that I could use a coupon to buy some issues, and I used it right away. I paid with PayPal and I&amp;#8217;m still waiting for my issue to show up. I sent a couple of emails using the contact form and so far, nothing. I&amp;#8217;ll wait until next year and see what happens. It&amp;#8217;s really sad because this issue covers cloud computing with Python.
Edit: problem solved. Thanks everyone! (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2077659</comments>
            <pubDate>Tue, 30 Dec 2008 17:23:03 +0100</pubDate>
            <guid isPermaLink="false">2077659</guid>        </item>
        <item>
            <title>That’s it for 2008</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/gVxpAsrY4tc/</link>
            <description>The date came and is now gone, and I forgot to &amp;#8220;celebrate&amp;#8221; two years of Beginning Python for Bioinformatics on December 13th. I would like to thank everyone that commented, helped with posts and suggested anything that would make this website better. Clearly it is far from being what I wanted it to be, but slowly but surely we will get there.
Thanks again and I wish an excellent holiday season and a great 2009 to everyone!
See you in 2009. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2055743</comments>
            <pubDate>Fri, 19 Dec 2008 17:34:17 +0100</pubDate>
            <guid isPermaLink="false">2055743</guid>        </item>
        <item>
            <title>Scripts and Python 3.0, part 2, using 2to3</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/njbIUMbdSK8/</link>
            <description>And we&amp;#8217;re back to check our initial scripts to run on Python 3.0. Along with this latest release, a nice tool to parse and convert your scripts is also installed. It&amp;#8217;s called 2to3 and it&amp;#8217;s available in the Tools/scripts directory of your Python 3.0 installation. Basic usage is very similar to any Python script:
[sourcecode language='bash']2to3 (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2021295</comments>
            <pubDate>Mon, 08 Dec 2008 19:56:35 +0100</pubDate>
            <guid isPermaLink="false">2021295</guid>        </item>
        <item>
            <title>Scripts and Python 3.0, part 1</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/AMOkekz11Fk/</link>
            <description>Yes, Python 3.0 was released earlier than Perl &amp;#8230; what version was it? 6? 7? Anyway, I decided to go back to most of the scripts that were posted here. In the github repo we have 50 files in the &amp;#8220;original scripts&amp;#8221; directory. Let&amp;#8217;s check how they fare on Python 3.0 and what type of changes we need to make in order to make them work. Starting with code_01.py, which is a couple of lines long

myDNA = &amp;quot;ACGTACGTACGTACGTACGTACGT&amp;quot;
print myDNA

Here we have one of the most evident differences between Python 2.x and 3.0. Now print is a function, not a statement, so whatever we want to print must be passed as a function argument. The above code would be changed to

myDNA = &amp;quot;ACGTACGTACGTACGTACGTACGT&amp;quot;
print(myDNA)

That simple in this case. B...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=2017423</comments>
            <pubDate>Sat, 06 Dec 2008 21:07:50 +0100</pubDate>
            <guid isPermaLink="false">2017423</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, final</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/NVMl1fzdZz4/</link>
            <description>We can say that this would be our final version of the script. There are many nice wxPython programming resources, and one is a very good book called wxPython in Action, which is co-written by Robin Dunn, the wxPython maintainer. Go check it out.
So for the last entry in this series, we just need to add a couple of changes to our interface and motif finding scripts. Basically on the interface script we need to add a line that gets the value entered (or the default one, if not changed) in the motif width input box. And we can do that by including the line below in the run_finder function.

width = self.motif_width.GetValue()

This line tells the script to get the value of the box and assign it to the variable width. This method will get whatever is inside the input box and save it as a string to ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1974856</comments>
            <pubDate>Wed, 19 Nov 2008 22:07:14 +0100</pubDate>
            <guid isPermaLink="false">1974856</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, some corrections</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/d-xPuvKEix8/</link>
            <description>We need to pause a bit and make some corrections to our code. First, the code I posted in the last entry for the pymotif.py module is wrong. Ok, not wrong, but some of the code I use for testing ended up on the blog. The first two lines of the calculate_motifs function contained a link to the files I use for testing and should be replaced by

input_seqs = fasta.read_seqs(open(input_seqs).readlines())
input_seqs2 = fasta.read_seqs(open(input_seqs2).readlines())

Also, both variables that store the filenames and paths in pymoteGUI.py are declared in the wrong scope. They should have been declared at the pymotGUI class level, so they are accessible to all the functions in that class. This also means that every time we access one of these variables it should be preceded by the class name in order for the interpreter ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1968610</comments>
            <pubDate>Tue, 18 Nov 2008 19:45:31 +0100</pubDate>
            <guid isPermaLink="false">1968610</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 8</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/JIHTCoSXrY4/</link>
            <description>Let&amp;#8217;s see now how we connect our GUI to the pymotif file (I changed the name because of some conflicts with the app name [my bad!], the git repo was updated accordingly). And also how to display the results, in a simpler manner.
Ok, first, connecting the script to the function file, pymotif.py. The file is already imported in our script and we have used it before. We need to find the exact point and which parameters to pass. pymotif.py is a slightly modified version of our command line script, and the code is below.

#!/usr/bin/env python

import fasta
import sys
from collections import defaultdict

def choose(n, k):
  if 0 &amp;lt;= k &amp;lt;= n:
    ntok = 1
    ktok = 1
    for t in xrange(1, min(k, n - k) + 1):
      ntok *= n
      ktok *= t
      n -= 1
    return ntok // ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1960482</comments>
            <pubDate>Thu, 13 Nov 2008 22:28:35 +0100</pubDate>
            <guid isPermaLink="false">1960482</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 7</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/7f-s1jH6xgo/</link>
            <description>Let&amp;#8217;s get back to the last post and check one line we entered

self.motif_width = wx.TextCtrl(panel, -1, &amp;#039;10&amp;#039;, (95, 50), (40,18))

There is something in this line that I did not explain. The third parameter in the text box declaration is '10'. How does this affect our box? That&amp;#8217;s the default text that will be displayed inside the box as soon as it is created. In our case, 10 is the motif width, and it&amp;#8217;s the value we consider to be the most common search width.
Another aspect not explained is the run_finder function. We added a line

wx.MessageBox(&amp;#039;It should run, eh?&amp;#039;)

where we declare a wx.MessageBox. What is it? A message box is the usual error/information dialog that you see in most programs. In our case it is very simple, just a warning/reminder that we ne...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1960483</comments>
            <pubDate>Tue, 11 Nov 2008 18:53:09 +0100</pubDate>
            <guid isPermaLink="false">1960483</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 6</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/50Jz8O_Mx5Q/</link>
            <description>Last entry we saw how to allow the user to open a file. Now we need to work on this file and store its path so the script can process it later on. After the file is selected on the file menu, the filename is printed on the label. Let&amp;#8217;s think for a second &amp;#8230; If we get only the filename from the dialog, the program won&amp;#8217;t work, because the file might be located in another directory, partition, you name it. So we need to get the file&amp;#8217;s full path. We need to change the lines

back_file = dialog.GetFilename()
self.fore_label.SetLabel(dialog.GetFilename())

to

back_file = dialog.GetPath()
self.fore_label.SetLabel(back_file)

(do not forget to do the same to the fore_file!).
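To see why the full path matters, here is a minimal sketch that uses only the standard library, not wxPython. The directory and file names are made up for illustration, and os.path only stands in for what dialog.GetFilename() and dialog.GetPath() would hand back.

```python
import os
import tempfile

# Hypothetical setup: a sequence file stored outside the working directory.
data_dir = tempfile.mkdtemp()
full_path = os.path.join(data_dir, "bpb_demo_background.fasta")
with open(full_path, "w") as handle:
    handle.write("ACGTACGT\n")  # the contents do not matter here

# A bare filename, similar to what GetFilename() would give us.
name_only = os.path.basename(full_path)

# A bare name is looked up in the current directory, so it is usually
# not found; the full path works from anywhere.
found_by_name = os.path.exists(name_only)
found_by_path = os.path.exists(full_path)

print(name_only, found_by_name)
print(full_path, found_by_path)
```

Storing the full path once and reusing it for both the label and the later processing keeps the two in sync.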
Let&amp;#8217;s run the script and check what happens. The frame should look like the one below (with a...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1946983</comments>
            <pubDate>Tue, 04 Nov 2008 22:39:01 +0100</pubDate>
            <guid isPermaLink="false">1946983</guid>        </item>
        <item>
            <title>Motif finding interface on github</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/56ZqrfoJKeM/</link>
            <description>The GUI is under active development, and most of the changes are being added simultaneously to the blog and to the code. So, go get a copy of git, install it and play with the BPB repository there.
http://github.com/nuin/beginning-python-for-bioinformatics/tree/master (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1920865</comments>
            <pubDate>Thu, 30 Oct 2008 19:59:03 +0100</pubDate>
            <guid isPermaLink="false">1920865</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 5</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/Zev4tQP7yhk/</link>
            <description>Last time we saw how to bind an interface element to a function. Now we need to make good use of it, and make the script have some actual functionality. The first thing we are going to do is to include a label (or static text) on the interface. Remember that initially we added a panel to the frame, so the label should go on the panel. For a label we use a wx.StaticText, which has these parameters

(self, parent, id=-1, label=EmptyString, pos=DefaultPosition, size=DefaultSize, style=0, name=StaticTextNameStr)

We don&amp;#8217;t need all of them, just a couple would be enough. Basically, parent, id, label and pos will do it, as the size would be default and based on the text length we input. We are going to work on our __do_layout function and add two labels to the panel on the frame, one for each the...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1920866</comments>
            <pubDate>Thu, 30 Oct 2008 19:45:19 +0100</pubDate>
            <guid isPermaLink="false">1920866</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 4</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/tsJLh3EgtjM/</link>
            <description>Last time we checked how to add a menu to our simple frame. Unfortunately, just adding it won&amp;#8217;t make the menu useful. In order to do that we need to bind some events to it. Like any interface framework, wxPython is governed by events generated by the user, such as mouse clicks on buttons and menus, objects getting/losing focus, etc. In our case, so far, we evidently need an event called a menu event, which will tell the code what path to use when a menu is clicked.
My personal preference for binding an event to a menu is to create a separate function to store these procedures, __do_binding. But by using this route we would need to change some code in the menu declaration, and to simplify things we will add the menu binding at the end of the __do_layout function.
And how we crea...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1920867</comments>
            <pubDate>Wed, 29 Oct 2008 18:35:27 +0100</pubDate>
            <guid isPermaLink="false">1920867</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 3</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/Pr49lfy0RZo/</link>
            <description>Today we will add some elements to our interface. Looking at the previous screencap it is easy to conclude that our interface needs a lot of work to be ready. First, it has a dark gray background that does not resemble the usual window background (it looks more like an MDI frame). We need to change that. Also, there are no menu bars or menus, or tool bars. It&amp;#8217;s pretty bare bones, and not exactly good or useful.
There are many ways of customizing the look of a window/frame in wxPython, and two of these methods are adding a panel to the frame or adding the so-called sizers. The latter is a difficult method to master, but powerful and very good for customizing the objects and the look and feel of a window. Adding a panel and subsequently adding objects to it is a more laborious process, but easier to und...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1905759</comments>
            <pubDate>Wed, 22 Oct 2008 20:30:13 +0100</pubDate>
            <guid isPermaLink="false">1905759</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 2</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/2Qcyo3KIoNo/</link>
            <description>Let&amp;#8217;s take a deeper look at the code we started yesterday, piece by piece

class pymot(wx.App):
  def __init__(self, redirect=False, filename=None):
    wx.App.__init__(self, redirect, filename)

This is the class pymot we derived from wx.App, and this will be the main class for our application. Like any other derived class, it needs an OnInit or an __init__ function that will take care of initializing things. As usual, we pass self and a redirect parameter, which tells the application whether to redirect its output (stdout and stderr) away from the command line; filename, which defaults to None here, is where redirected output would be written. We actually don&amp;#8217;t need a redirect, but it can be useful in the future to track errors. It&amp;#8217;s set to False as we don&amp;#8217;t need it now.

class pymotGUI(wx.Frame):
  def __init__(self, parent, id):
    wx.Frame.__init__(self, parent, id, &amp;#039;Python Motif Finder...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1905760</comments>
            <pubDate>Tue, 21 Oct 2008 16:40:33 +0100</pubDate>
            <guid isPermaLink="false">1905760</guid>        </item>
        <item>
            <title>Creating an interface for the motif finding script, part 1</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/3TJ8YDUoVcg/</link>
            <description>And we are back. After much ado about real life, I am able to &amp;#8220;restart&amp;#8221; this blog and probably with a good frequency of posts. Last time we saw the final product of our motif finding series. We ended up creating a very elegant script in Python that efficiently counts words in FASTA sequences and then, using a basic statistical method, calculates the significance of each word and outputs the overrepresented ones.
Our script used a little bit less than 50 lines, and if you include the imported fasta module, it won&amp;#8217;t top 100. But the number of lines is not important. The efficiency, clarity and speed are key here. At the same time, running a script from the command line is not something everyone is used to doing. In order to add more visibility to our simple script, why not inclu...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1905761</comments>
            <pubDate>Mon, 20 Oct 2008 21:31:50 +0100</pubDate>
            <guid isPermaLink="false">1905761</guid>        </item>
        <item>
            <title>Git repository updated</title>
            <link>http://feedproxy.google.com/~r/BeginningPythonForBioinformatics/~3/PHSxmZq3kls/</link>
            <description>I just updated the git repository of BPB. Click here to access it. Most of the code presented in the blog is there, some with extra comments, some being updated.
This closes another phase in the blog and soon we will check some different aspects of Python programming in Bioinformatics. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1790177</comments>
            <pubDate>Sat, 13 Sep 2008 02:48:55 +0100</pubDate>
            <guid isPermaLink="false">1790177</guid>        </item>
        <item>
            <title>Python, overrepresented motifs, the Grand Finale</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/384696755/</link>
            <description>In this final part, let&amp;#8217;s do some very simple refactoring and modify the output section to make the result a little bit better. There are not many options for the function that calculates the binomial expansion. But Andrew posted some opinions on how to slightly change the quorum function.

def get_quorums(seqs, mlen):
  &amp;quot;&amp;quot;&amp;quot;
  count how many times each word of length mlen
  occurs across all sequences
  &amp;quot;&amp;quot;&amp;quot;
  quorum = defaultdict(int)
  for seq in seqs:
    # + 1 so the last window of each sequence is counted too
    for n in range(len(seq) - mlen + 1):
      quorum[seq[n:n + mlen]] += 1
  return quorum
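
As a quick check of what this function returns, here is a small self-contained sketch. The toy sequences and the width of 3 are made up for illustration, and the loop bound len(seq) - mlen + 1 is an assumption that the last window of each sequence should be counted as well.

```python
from collections import defaultdict


def get_quorums(seqs, mlen):
    """Count every word of length mlen over all sequences."""
    quorum = defaultdict(int)
    for seq in seqs:
        # + 1 so the window ending at the last base is included (assumption)
        for n in range(len(seq) - mlen + 1):
            quorum[seq[n:n + mlen]] += 1
    return quorum


# Toy input, not real data
toy_seqs = ["ACGTAC", "CGTACG"]
counts = get_quorums(toy_seqs, 3)

# Print the words in alphabetical order with their counts
for word in sorted(counts):
    print(word, counts[word])
```

Each toy sequence has four windows of width 3, so the counts add up to eight in total.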

His modifications were small but improved the code a bit, as you remove one variable/object from the function. At the same time there is a need to change our output section of the code a bit, as we don&amp;#8217;t use a defaultdict initia...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1768764</comments>
            <pubDate>Sat, 06 Sep 2008 02:32:14 +0100</pubDate>
            <guid isPermaLink="false">1768764</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, final</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/382825234/</link>
            <description>Part 13 of the motifs series is the last one. In a couple of weeks I will post the refactored code, including the suggestions from Andrew in the last post. I will update the blog contents on OWW and commit some of the code to the GitHub repository. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1759764</comments>
            <pubDate>Thu, 04 Sep 2008 01:21:38 +0100</pubDate>
            <guid isPermaLink="false">1759764</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 13</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/370526287/</link>
            <description>Now that we have the best quorum determination function and the ideal function to calculate the binomial expansions, it is easy to write a script to calculate the p value of motifs in DNA sequences. On to the script:

#!/usr/bin/env python

import fasta
import sys
from collections import defaultdict

def choose(n, k):
  if 0 &amp;lt;= k &amp;lt;= n:
    ntok = 1
    ktok = 1
    for t in xrange(1, min(k, n - k) + 1):
      ntok *= n
      ktok *= t
      n -= 1
    #print ntok // ktok
    return ntok // ktok
  else:
    return 0

def get_quorums(seqs, mlen):
  &amp;quot;&amp;quot;&amp;quot;
  add seq id_no to a set
  use explicit counter to create seq_no
  &amp;quot;&amp;quot;&amp;quot;
  quorum = defaultdict(set)
  id_no = 0
  for seq in seqs:
    id_no += 1
    for n in range(len(seq) - mlen + 1):
      quorum[seq[n:n...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1720252</comments>
            <pubDate>Thu, 21 Aug 2008 02:32:57 +0100</pubDate>
            <guid isPermaLink="false">1720252</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 12.5</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/366023393/</link>
            <description>So let&amp;#8217;s modify the factorial function a little and then benchmark both by using timeit. Ideally our factorial function would need to calculate a value similar to the binomial expansion, as we have three factorials to calculate for each binomial in the Hypergeometric Distribution.
So we can add two extra factorial calculations to our function and perform the multiplication and division to return the equivalent to the binomial calculation. So the function would be 

def fac(n, m):
  value1 = 1
  for i in xrange(2, n + 1):
    value1 *= i
  value2 = 1
  for i in xrange(2, m + 1):
    value2 *= i
  value3 = 1
  for i in xrange(2, (n - m) + 1):
    value3 *= i 

  return value1 / (value2 * value3)
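
A minimal sketch of how such a comparison could be set up with timeit, assuming Python 3 (so range and // take the place of the xrange and / used above). The incremental function choose_incremental and the test values 800 and 400 are illustrative assumptions, not the benchmark from the post.

```python
import timeit

def fac(n, m):
    """Binomial coefficient from three full factorials (n!, m!, (n - m)!)."""
    value1 = 1
    for i in range(2, n + 1):
        value1 *= i
    value2 = 1
    for i in range(2, m + 1):
        value2 *= i
    value3 = 1
    for i in range(2, (n - m) + 1):
        value3 *= i
    return value1 // (value2 * value3)

def choose_incremental(n, k):
    """Binomial coefficient built up term by term, without full factorials."""
    k = min(k, n - k)
    result = 1
    for t in range(1, k + 1):
        # result holds C(n, t - 1) here, so this division is always exact
        result = result * (n - t + 1) // t
    return result

# Both approaches must agree before timing them
assert fac(800, 400) == choose_incremental(800, 400)

for func in (fac, choose_incremental):
    seconds = timeit.timeit(lambda: func(800, 400), number=200)
    print(func.__name__, round(seconds, 4))
```

The timings depend on the machine, so no expected numbers are given here.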

m and n are both values of the binomial and n - m is the subtraction of one by the ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1708834</comments>
            <pubDate>Fri, 15 Aug 2008 22:16:15 +0100</pubDate>
            <guid isPermaLink="false">1708834</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 12</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/364390662/</link>
            <description>So let&amp;#8217;s modify the factorial function a little and then benchmark both by using timeit. Ideally our factorial function would need to calculate a value similar to the binomial expansion, as we have three factorials to calculate for each binomial in the Hypergeometric Distribution.
So we can add two extra factorial calculations to our function and perform the multiplication and division to return the equivalent to the binomial calculation. So the function would be 

def fac(n, m):
  value1 = 1
  for i in xrange(2, n + 1):
    value1 *= i
  value2 = 1
  for i in xrange(2, m + 1):
    value2 *= i
  value3 = 1
  for i in xrange(2, (n - m) + 1):
    value3 *= i 

  return value1 / (value2 * value3)

m and n are both values of the binomial and n - m is the subtraction of one by the ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1704626</comments>
            <pubDate>Thu, 14 Aug 2008 01:51:21 +0100</pubDate>
            <guid isPermaLink="false">1704626</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 11</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/363459330/</link>
            <description>After a long hiatus we are (almost) back on track in order to get our scripts to determine overrepresented motifs in DNA sequences. Last time we checked we defined the &amp;#8220;best&amp;#8221; factorial function in Python 

def fac_01(n):
  result = 1
  for i in xrange(2, n+1):
    result *= i
  return result

and Andrew Dalke sent a couple of links pointing to binomial calculation functions; one of them is below

# This file contains the Python code from Program 14.10 of
# &amp;quot;Data Structures and Algorithms
# with Object-Oriented Design Patterns in Python&amp;quot;
# by Bruno R. Preiss.
#
# Copyright (c) 2003 by Bruno R. Preiss, P.Eng. All rights reserved.
#
# http://www.brpreiss.com/books/opus7/programs/pgm14_10.txt
#
def binom(n, m):
  b = [0] * (n + 1)
  b[0] = 1
  for i in xrange(1, n +...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1700566</comments>
            <pubDate>Wed, 13 Aug 2008 01:54:12 +0100</pubDate>
            <guid isPermaLink="false">1700566</guid>        </item>
        <item>
            <title>Test from Zoundry Raven</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/356830620/</link>
            <description>I am testing an offline/desktop blogging tool called Zoundry Raven. New posts are on the way, as promised. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1682876</comments>
            <pubDate>Tue, 05 Aug 2008 23:29:14 +0100</pubDate>
            <guid isPermaLink="false">1682876</guid>        </item>
        <item>
            <title>Python for Bioinformatics: a room on FriendFeed</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/347537199/</link>
            <description>A couple of days ago a Python for Bioinformatics room was created on FriendFeed. Of all the bioinformatics-related programming language groups, it is the one with the highest number of subscribers. Anyone is welcome to join and contribute to the room. The room has limited resources so far, but we are hoping it grows a lot in the next few days.
Come join us. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1658068</comments>
            <pubDate>Sun, 27 Jul 2008 16:12:58 +0100</pubDate>
            <guid isPermaLink="false">1658068</guid>        </item>
        <item>
            <title>BPB on OWW</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/347070181/</link>
            <description>The contents of this site/blog are being (slowly) transferred to a wiki-like page on Open Wet Ware. Thanks to Ricardo Vidal, who started including the articles and was the person responsible for this initiative (and a great artist and illustrator).
What is the OWW?
OpenWetWare is an effort to promote the sharing of information, know-how, and wisdom among researchers and groups who are working in biology &amp; biological engineering. Learn more about us.
If you would like edit access, would be interested in helping out, or want your lab website hosted on OpenWetWare, please join us. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1657116</comments>
            <pubDate>Sun, 27 Jul 2008 02:15:52 +0100</pubDate>
            <guid isPermaLink="false">1657116</guid>        </item>
        <item>
            <title>New posts</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/345076184/</link>
            <description>Posts will resume next week. I was off on vacation and then I changed jobs, so there was no time for Python fun. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1652227</comments>
            <pubDate>Thu, 24 Jul 2008 22:59:12 +0100</pubDate>
            <guid isPermaLink="false">1652227</guid>        </item>
        <item>
            <title>Bioinformatics career survey</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/325028350/</link>
            <description>Via Bioinformatics Zen:
(Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1563783</comments>
            <pubDate>Wed, 02 Jul 2008 16:49:03 +0100</pubDate>
            <guid isPermaLink="false">1563783</guid>        </item>
        <item>
            <title>A quick assessment of factorial functions in Python</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/306223153/</link>
            <description>A short pause on the motifs subject. As mentioned in the comments, there are many different ways of calculating factorials in Python (the same &amp;#8220;problem&amp;#8221; can be found in some other languages too). Cariaso suggested timing the execution of different factorial functions, including the ones found in Python&amp;#8217;s cookbook (which I should have included at the beginning of the last post). All functions from that webpage were included, as well as the one mentioned in a comment and both functions seen here. Using timeit (thanks Mike!), the execution time of each was measured by calculating the factorials of 800 and 4000. First, the functions:

def fac_01(n):
  result = 1
  for i in xrange(2, n+1):
    result *= i
  return result

def fac_02(n):
  value = reduce(lambda i, j : i * j, ran...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1499810</comments>
            <pubDate>Fri, 06 Jun 2008 18:21:44 +0100</pubDate>
            <guid isPermaLink="false">1499810</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 10</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/304659280/</link>
            <description>Let&amp;#8217;s get back to the statistical module, which will calculate a Hypergeometric Distribution (HD) p value so we can define the overrepresented motifs. Last time we saw it, we had just defined the factorial function, which is immensely helpful in this case due to the number of factorial calculations needed in the HD. The factorial function was the one below


def fac(n):
  value = reduce(lambda i, j : i * j, range(1, n + 1))
  return value

but as mentioned in the comments by Dave and by Mike via email, the method used is not the best way to calculate factorials in Python. The best approach in this case is to use operator.mul. All functions in the operator module are implemented in C and mirror the corresponding Python operators. So in this module we can find mul for multiplic...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1494212</comments>
            <pubDate>Wed, 04 Jun 2008 16:33:51 +0100</pubDate>
            <guid isPermaLink="false">1494212</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 9</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/303932549/</link>
            <description>Back to the new functions for motif quorums. We skip function 7 in order to explain the &amp;#8220;simpler&amp;#8221; ones, 8 and 9. Both functions use generators. We&amp;#8217;ve already seen generators here: they are functions that use the yield statement to produce iterators. A generator is very similar to a function, but instead of returning a value it yields one and waits for another call to resume. In function 8, a generator is used to return the motif sequence that is used as a key in the defaultdict. Notice the scope of the generator, which is defined inside a function.

def get_quorums_08(seqs, mlen):
  &amp;quot;&amp;quot;&amp;quot;
  add seq id_no to a set
  use enumerate to create seq_no
  use an explicit generator to create the motifs
  &amp;quot;&amp;quot;&amp;quot;
  def motif_gen(seq):
    for n in range(len(seq)-mle...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1491940</comments>
            <pubDate>Tue, 03 Jun 2008 20:20:52 +0100</pubDate>
            <guid isPermaLink="false">1491940</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 8</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/303895113/</link>
            <description>We keep on introducing Mike&amp;#8217;s functions. This time there are a couple of Python methods that we haven&amp;#8217;t seen here and need some introduction, izip and count. To use these two we also need to import new modules

from itertools import count, izip

count returns consecutive integers starting at a defined point (the function&amp;#8217;s parameter). If no argument is given it starts from zero. Basically, starting a count gives an iterator of increasing integer values, in a fashion similar to a function with yield. Every time our loop accesses the count it will &amp;#8220;remember&amp;#8221; the last returned value and increment it by one.
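To make the behaviour of count concrete, here is a small sketch. It assumes Python 3, where the built-in zip already returns an iterator and plays the role of Python 2's izip; the sequence list is made up for illustration.

```python
from itertools import count

# count(1) yields 1, 2, 3, ... for as long as values are requested
counter = count(1)
first_three = [next(counter), next(counter), next(counter)]
print(first_three)  # [1, 2, 3]

# Pairing an endless counter with a finite list stops at the shorter one
seqs = ["ACGT", "GGCC", "TTAA"]  # hypothetical sequences
numbered = list(zip(count(1), seqs))
print(numbered)  # [(1, 'ACGT'), (2, 'GGCC'), (3, 'TTAA')]
```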
izip also returns an iterator, but from a list of iterables. It is basically used to iterate through a list of many iterables at the same time. In the function bel...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1491941</comments>
            <pubDate>Tue, 03 Jun 2008 17:49:53 +0100</pubDate>
            <guid isPermaLink="false">1491941</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 7</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/303805587/</link>
            <description>Continuing with Mike&amp;#8217;s functions to obtain motif quorums, we look at functions 3, 4 and 5. Function get_quorums_03 uses an old friend of the blog, sets. Recall that sets are very similar to lists, but they are unordered and their items are unique.

def get_quorums_03(seqs, mlen):
  &amp;quot;&amp;quot;&amp;quot;
  add seq id_no to a set
  use explicit counter to create seq_no
  &amp;quot;&amp;quot;&amp;quot;
  quorum = defaultdict(set)
  id_no = 0
  for seq in seqs:
    id_no += 1
    for n in range(len(seq)-mlen+1):
      quorum[seq[n:n+mlen]].add(id_no)
  return quorum
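
A short usage sketch of the idea, assuming Python 3 and made-up sequences: because each value is a set, a motif that occurs several times in the same sequence contributes that sequence number only once, and the quorum is the size of the set. The function follows the one above, with the range bound written as len(seq) - mlen + 1 so that the last window of each sequence is included.

```python
from collections import defaultdict

def get_quorums(seqs, mlen):
    """Map each motif to the set of sequence numbers that contain it."""
    quorum = defaultdict(set)
    id_no = 0
    for seq in seqs:
        id_no += 1
        for n in range(len(seq) - mlen + 1):
            quorum[seq[n:n + mlen]].add(id_no)
    return quorum

seqs = ["AAAA", "AAGT", "CCGT"]  # hypothetical sequences
quorum = get_quorums(seqs, 2)
print(sorted(quorum["AA"]))  # [1, 2]: AA repeats in sequence 1 but is stored once
print(len(quorum["GT"]))     # 2: present in sequences 2 and 3
```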

Basically, the sequence numbers (an incremented counter) are added to a defaultdict whose values are sets. This way you don&amp;#8217;t need to check for the existence of the sequence number in the defaultdict list and count on the ability of set ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1491942</comments>
            <pubDate>Tue, 03 Jun 2008 15:27:49 +0100</pubDate>
            <guid isPermaLink="false">1491942</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs, part 6</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/301473304/</link>
            <description>We will take a break from developing the statistical module to obtain overrepresented motifs (I will introduce mul in the next stats post), and take a deeper look at the possibilities for obtaining the motif quorums. Mike DeHaemer, a regular commenter and contributor to the blog, sent me a Python script with 8 different approaches spread over 13 distinct functions for obtaining the motif quorums. I will take advantage of his contribution and post all of them, with some quick comments on each one (his code comments were kept in each function). Afterwards, a small benchmark will be posted.
Most of the functions need to import a couple of modules

from collections import defaultdict, deque
from itertools import count, izip, tee

and they have two parameters, a sequence list and the length of t...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1480555</comments>
            <pubDate>Sat, 31 May 2008 02:30:44 +0100</pubDate>
            <guid isPermaLink="false">1480555</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 5</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/295317470/</link>
            <description>Now that we have the script to generate the word quorums working (and working fast!), we need to calculate a p value for each motif based on the foreground and background quorums. A p value cut-off will determine the statistically significant, or overrepresented, words. These overrepresented words can then be analysed in more detail (which we won&amp;#8217;t see here), for instance to determine new or already known transcription factor binding sites.
A well established statistical method to determine such overrepresented words is the Hypergeometric Distribution (HD for short). HD measures &amp;#8220;success&amp;#8221; and &amp;#8220;failures&amp;#8221; for values that do not fit in the binomial distribution, and depend on the measurements without replacement.
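As a rough sketch of how such a p value could be computed, the code below evaluates the upper tail of the hypergeometric distribution with math.comb (Python 3.8 or later). The function name, the argument names, and the numbers in the usage lines are assumptions made for illustration; the exact formulation used in this series may differ.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X is at least k) when drawing n items without replacement
    from a population of N items, K of which are successes."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total

# Hypothetical numbers: the word is found in 30 of 100 sequences overall
# (N=100, K=30) and in 8 of the 10 foreground sequences (n=10, k=8).
p = hypergeom_pvalue(8, 10, 30, 100)
print(p)
```

A small p suggests the word appears in the foreground set more often than sampling without replacement from the whole population would explain.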
Basically, HD&amp;#8217;s equation has a series of ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1465957</comments>
            <pubDate>Sat, 24 May 2008 02:13:52 +0100</pubDate>
            <guid isPermaLink="false">1465957</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part IV</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/295317470/</link>
            <description>Now that we have the script to generate the word quorums working (and working fast!), we need to calculate a p value for each motif based on the foreground and background quorums. A p value cut-off will determine the statistically significant, or overrepresented, words. These overrepresented words can then be analysed in more detail (which we won&amp;#8217;t see here), for instance to determine new or already known transcription factor binding sites.
A well established statistical method to determine such overrepresented words is the Hypergeometric Distribution (HD for short). HD measures &amp;#8220;success&amp;#8221; and &amp;#8220;failures&amp;#8221; for values that do not fit in the binomial distribution, and depend on the measurements without replacement.
Basically, HD&amp;#8217;s equation has a series of ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1460851</comments>
            <pubDate>Wed, 21 May 2008 20:48:29 +0100</pubDate>
            <guid isPermaLink="false">1460851</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 4</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/288903822/</link>
            <description>We found a way to make the Python script as good as or better than the C++ executable. But for the analysis we need to do, motif counts are not the value we want. We need the quorum: the number of sequences in which the motif is present at least once. For instance, if the desired motif were AAACCCTTTG, we would check in which sequences this word is present. Let&amp;#8217;s say that in a cluster of 10 sequences we find it in sequences 1, 2, 3, 4 and 5, giving us a quorum of 5 out of 10, or 50%. The quorum will be used later in the statistical calculation to determine the overrepresented motifs.
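The quorum described above can be sketched in a few lines. This is an illustrative Python 3 example with a hypothetical helper quorum_of and invented sequences, not the script from the post: it counts the sequences in which the word appears at least once.

```python
def quorum_of(motif, seqs):
    """Return how many sequences contain the motif at least once."""
    return sum(1 for seq in seqs if motif in seq)

motif = "AAACCCTTTG"
# Hypothetical cluster of 10 sequences; the first 5 contain the motif
seqs = ["GG" + motif + "TT"] * 5 + ["GGGGCCCCTTTTAA"] * 5
hits = quorum_of(motif, seqs)
print(hits, len(seqs), 100 * hits // len(seqs))  # 5 10 50
```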
With only a couple of modifications, we can adapt the script used to get the motif counts to get the quorum.
#!/scratch/python/bin/python
from collections impo...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1436732</comments>
            <pubDate>Mon, 12 May 2008 19:36:22 +0100</pubDate>
            <guid isPermaLink="false">1436732</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 3</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/288856545/</link>
            <description>We take a break from developing code and check performance (non-scientific testing!). In the previous entry a simple file was used as input: 8 DNA sequences of 500 bases each. That&amp;#8217;s not enough to test the performance of the Python script against the C++ compiled executable. So, we use a larger file; two larger files to be more exact. First, we use a file with 555 sequences averaging 19371 nucleotides, and then another with 3854 sequences averaging 20000 nucleotides in length. Those files were the largest foreground and background clusters used in the analysis. Let&amp;#8217;s see how Python and C++ fared (Linux&amp;#8217;s time was used in the comparison, for simplicity).
Foreground cluster
Average of 10 runs
C++: 45.66 seconds
Python: 36.4 seconds
Background cluster
Average of 10 ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1436733</comments>
            <pubDate>Mon, 12 May 2008 18:16:29 +0100</pubDate>
            <guid isPermaLink="false">1436733</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part 2</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/287826901/</link>
            <description>We move on in our search for overrepresented motifs in DNA sequences. I was preparing this entry when the comments to part 1 arrived. Because of that we will modify our previous code to include some suggestions and then change it a little bit to output the actual values we want. As Titus pointed out in his comment, when we merge the sequences we generate errors, because we are artificially including motifs that are not supposed to be there. But in the end we will see that we are not after the actual number of times a motif appears in the sequence; what matters is the motif quorum.
The main focus of part one was to show the decrease in code length from C++ to Python, introduce generator functions and yield, and also show the nice permutation function. Turns out by using the approac...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1434371</comments>
            <pubDate>Sun, 11 May 2008 02:56:51 +0100</pubDate>
            <guid isPermaLink="false">1434371</guid>        </item>
        <item>
            <title>Obtaining overrepresented motifs in DNA sequences, part I</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/287246400/</link>
            <description>Changing gears now, leaving behind Pfam alignments. I decided to start a new series of posts based on the conversion of some small C++ programs I developed in the past. These small programs (I call them modules because they were part of a larger application) were used to count motifs, short nucleotide words up to 10-12 base pairs, and then calculate statistical overrepresentation of these words by comparing a foreground set of DNA sequences against a background set.
We will start comparing the different approaches of the C++ and the Python codes and point out advantages and disadvantages of doing it in one language or the other. First thing we need to do is to count the motifs in all sequences from our foreground and background sets. For the project I was working on, the ideal word length ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1433677</comments>
            <pubDate>Sat, 10 May 2008 02:46:50 +0100</pubDate>
            <guid isPermaLink="false">1433677</guid>        </item>
        <item>
            <title>Repository</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/286299620/</link>
            <description>As mentioned in the last post, I am moving the current repository, an html page, to an actual Git repository on github.com. The link to the repository is
http://github.com/nuin/beginning-python-for-bioinformatics/tree/master
and it can be accessed by anyone. There are only a handful of scripts there, but I am slowly adding more comments to the scripts and moving them to github. The web interface at github is pretty nice and the code can be viewed on the website with nice code highlighting, for example. There is also an RSS feed to receive updates, commits, etc.
How to create a local copy
Git is very easy to use and it is very simple to create a local copy of the repository on your local machine. Git is available on most systems as a command-line utility (there is a gui but I haven&amp;#8217;t...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1428899</comments>
            <pubDate>Thu, 08 May 2008 19:41:28 +0100</pubDate>
            <guid isPermaLink="false">1428899</guid>        </item>
        <item>
            <title>Code repository</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/284987787/</link>
            <description>I am transferring the current repository, which is &amp;#8220;hosted&amp;#8221; in a static page on the blog, to a Git repository on github.com. I will soon post a quick guide on how to access it. I am also turning all relevant entries from the blog into stand-alone html pages for off-line browsing.
For anyone with Git experience, the link to the repository is here (not many commits yet, but they are coming). (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1426185</comments>
            <pubDate>Wed, 07 May 2008 00:01:22 +0100</pubDate>
            <guid isPermaLink="false">1426185</guid>        </item>
        <item>
            <title>Still on merging Pfam alignments …</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/284184352/</link>
            <description>One of the things I like about Python and the Python community is the drive to make code simple and clear. Tal left a comment on the last post about merging Pfam alignment sequences, suggesting another approach to our problem. The code is below

def merge_seqs(data1, data2):
  from itertools import chain, groupby
  format = &amp;quot;%s-%s-&amp;gt;%d\n%s%s&amp;quot;
  flist = []
  keyfunc = lambda it: it.name[it.name.find('|') + 1 : it.name.find('/')]
  for it, g in groupby(sorted(chain(data1, data2), key=keyfunc), keyfunc):
    values = list(g)
    if len(values) == 2:
      jname, jseq = values[0].name, values[0].sequence
      kname, kseq = values[1].name, values[1].sequence
      flist.append(format % (jname, kname, len(jseq), jseq, kseq) )

  return flist
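
For readers new to groupby, a small sketch of the pattern used above, assuming Python 3 and made-up record names: groupby only groups consecutive items that share a key, which is why the chained data is sorted with the same key function first.

```python
from itertools import chain, groupby

# Hypothetical names in the 'prefix|ID/range' style used above
data1 = ["a|P1/1-50", "a|P2/1-50"]
data2 = ["b|P2/60-90", "b|P3/60-90"]

def keyfunc(name):
    """Extract the identifier between '|' and '/'."""
    return name[name.find('|') + 1:name.find('/')]

merged = sorted(chain(data1, data2), key=keyfunc)
groups = dict((key, list(g)) for key, g in groupby(merged, keyfunc))
pairs = [key for key, names in groups.items() if len(names) == 2]
print(pairs)  # ['P2']
```

Only identifiers present in both inputs end up in a group of two, which mirrors the len(values) == 2 check in the function above.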

The code also uses the itertools ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1422996</comments>
            <pubDate>Mon, 05 May 2008 20:55:46 +0100</pubDate>
            <guid isPermaLink="false">1422996</guid>        </item>
        <item>
            <title>Revisiting Pfam alignments, again …</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/282745321/</link>
            <description>Continuing on the Pfam alignment sequence merge, Luke provided two solutions: one for the case where we have duplicates in the file (that happens) and another one where duplicates are not tested. First for the case with no duplicates:

def merge_seqs(data1, data2):
  first, second = dict(), dict()
  for i in data1:
    first[i.name[i.name.find('|') + 1:i.name.find('/')]] = i

  for i in data2:
    second[i.name[i.name.find('|') + 1:i.name.find('/')]] = i

  shared_ids = set(first).intersection(set(second))

  flist = []
  for i in shared_ids:
    j = first[i]
    k = second[i]
    tempname = j.name + '-' + k.name + '-&amp;gt;' + str(len(j.sequence))
    tempseq = j.sequence + k.sequence
    flist.append(tempname + '\n' + tempseq)

  return flist
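
The key step above is the intersection of the two dictionaries' keys. A minimal Python 3 sketch with invented identifiers and sequences:

```python
first = {"P1": "AAAA", "P2": "CCCC"}   # hypothetical id to sequence maps
second = {"P2": "GGGG", "P3": "TTTT"}

# Iterating a dict yields its keys, so set(first) is the set of identifiers
shared_ids = set(first).intersection(second)
merged = dict((i, first[i] + second[i]) for i in shared_ids)
print(merged)  # {'P2': 'CCCCGGGG'}
```

Since sets are unordered, the order of the merged entries is not guaranteed when more than one identifier is shared.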

Basically, his approach is to create two dictio...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1418378</comments>
            <pubDate>Sat, 03 May 2008 13:39:31 +0100</pubDate>
            <guid isPermaLink="false">1418378</guid>        </item>
        <item>
            <title>Revisiting Pfam alignments: using defaultdicts, chains …</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/282161941/</link>
            <description>I haven&amp;#8217;t posted in a while, so let&amp;#8217;s get back to the last topic covered here, merging sequences from Pfam alignments. Two comments on my last post suggested some changes to the original code, and both made a considerable improvement to the script. But following our line of thought here, there were many things in both posts that we haven&amp;#8217;t covered in the series. In order to make it clear to anyone who is following the entries chronologically, we will see what new things were suggested in the comments.
We will start with Mike&amp;#8217;s comment. The code is below.

def merge_seqs(data1, data2):
  from collections import defaultdict
  from itertools import chain

  data = defaultdict(list)

  for item in chain(data1, data2):
    ident = item.name[item.name.find('|') + 1 ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1417825</comments>
            <pubDate>Fri, 02 May 2008 15:12:11 +0100</pubDate>
            <guid isPermaLink="false">1417825</guid>        </item>
        <item>
            <title>Looking for topics</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/280196051/</link>
            <description>I am preparing a couple of posts based on the latest entries and some other subjects, and after that we are going to revisit functional programming in Python geared toward Bioinformatics. I am also planning on covering topics from Mastering Perl for Bioinformatics and converting them to Python.
Apart from that I already have some suggestions from friends and colleagues, but I am still looking for subjects that readers would like to see covered here. Leave me a message or a comment below with a Bioinformatics subject that would be interesting to write about.
On another note, the blog is now syndicated on Planet Python and Unofficial Planet Python. Welcome. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1406827</comments>
            <pubDate>Tue, 29 Apr 2008 16:47:52 +0100</pubDate>
            <guid isPermaLink="false">1406827</guid>        </item>
        <item>
            <title>Python and AppEngine in Vista</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/270021817/</link>
            <description>I had problems installing Google&amp;#8217;s AppEngine in Vista. I had Python 2.5.1 installed on my machine, but every time I tried to install the msi package it failed, claiming that Python was not present, even though C:\Python25 was in the path. The AppEngine issues site did not help much either; the &amp;#8220;solution&amp;#8221; listed there was to make sure Python was in the path.
So, I decided to start over. I removed Python (and ActiveState Python, which I installed before to see if AppEngine would work) and re-installed it, or tried to. Strangely, Python&amp;#8217;s msi package was installing it in the C drive root, not under Python25. For half an hour I tested all possible combinations, versions and tricks to have it installed in the proper directory/folder. Then I remembered msiexec, a command line ...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1370662</comments>
            <pubDate>Mon, 14 Apr 2008 13:57:07 +0100</pubDate>
            <guid isPermaLink="false">1370662</guid>        </item>
        <item>
            <title>Merging sequences from a Pfam alignment: using sets, part II</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/260967355/</link>
            <description>Mike and Luke left comments and suggestions on how to attack the problem from the previous entry. I copied their comments here in order to maintain code formatting a little bit better (comments do not have code highlighting and formatting). 
Mike:
Why not build a dictionary of the fasta data keyed by protein ID. The value of each dictionary entry would be a list of (name,sequence) tuples. Then iterate over the dictionary items and build the output list from dictionary entries of length == 2.
Obviously, I don’t have any data sets to test it on, but something like this might work:

def merge_seqs(data1, data2):
  from collections import defaultdict
  from itertools import chain

  data = defaultdict(list)

  for item in chain(data1, data2):
    ident = item.name[item.name.find('|')+1:item.name.fi...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1337842</comments>
            <pubDate>Mon, 31 Mar 2008 00:30:49 +0100</pubDate>
            <guid isPermaLink="false">1337842</guid>        </item>
        <item>
            <title>Rss</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/259990069/</link>
            <description>I added a more evident link to the RSS feed on the sidebar. If you haven&amp;#8217;t already subscribed, please do so. The feed is now syndicated through the Unofficial Planet Python. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1335163</comments>
            <pubDate>Sat, 29 Mar 2008 02:17:53 +0100</pubDate>
            <guid isPermaLink="false">1335163</guid>        </item>
        <item>
            <title>Merging sequences from a Pfam alignment: using sets</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/259982183/</link>
            <description>A colleague came to me with a &amp;#8220;problem&amp;#8221;: what would be the most efficient way to merge Pfam alignments? He had FASTA files containing sequences, and he wanted to find identical IDs in two files and merge the related sequences from different domains of the same protein. His C++ approach was taking too long to run, so I jumped in to help with some Python tricks. 
Fasta headers of Pfam alignments look like this
&amp;gt;P00526.2|SRC_RSVP/147-229 
where the first section, before the pipe, is the protein family, the section between the pipe and the slash is the protein ID, and the section after the slash gives the start and end positions. Basically we want to match the protein ID, between | and /, which is the only section that should not change from one alignment to the other, if there are similar seque...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1335164</comments>
            <pubDate>Sat, 29 Mar 2008 01:58:24 +0100</pubDate>
            <guid isPermaLink="false">1335164</guid>        </item>
        <item>
            <title>Python sets, intersections, reduce and more</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/253243011/</link>
            <description>Last time we saw how to use a set intersection to check for clusters of DNA/protein sequences and their genome intersections. We are going to use the same example, but this time we will see how to change it and create a function that can calculate an arbitrary number of intersections at a time and also check the intersection of more than two sets.
Our previous code to calculate intersections was

from sets import Set #for Python 2.3 and below
genA, genB, genC = Set([]), Set([]), Set([])

#populate the sets ...

print len(Set.intersection(genA, genB))
print len(Set.intersection(genA, genC))
print len(Set.intersection(genB, genC))

and that does not give us the most important piece of information: the intersection of A, B and C. Let&amp;#8217;s see how we can do it. Python h...</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1308999</comments>
            <pubDate>Mon, 17 Mar 2008 21:33:42 +0100</pubDate>
            <guid isPermaLink="false">1308999</guid>        </item>
        <item>
            <title>Python sets and intersections</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/251579739/</link>
            <description>Some time ago we saw how to use sets and uniquify lists. This time we will see another example of the use of sets. 
Note: in the previous example we saw sets imported as an extra module. This has to be done for Python versions 2.3 and under. There is a difference between both: when sets [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1303132</comments>
            <pubDate>Fri, 14 Mar 2008 19:37:31 +0100</pubDate>
            <guid isPermaLink="false">1303132</guid>        </item>
        <item>
            <title>GenBank parsing: both scripts</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/251551219/</link>
            <description>Both scripts to parse GenBank files are below and included in the repository.
First the one that extracts amino acid sequences, and then the latest to extract the DNA.

#! /usr/bin/env python

'''
input is a GenBank file. The script searches for gene annotations, extract all lines
from the file and then parses these lines in order to extract protein [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1303133</comments>
            <pubDate>Fri, 14 Mar 2008 18:40:26 +0100</pubDate>
            <guid isPermaLink="false">1303133</guid>        </item>
        <item>
            <title>Fasta module: transcribing DNA</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/249739755/</link>
            <description>A long time ago, when the blog was still based on the Perl book, we saw how to transcribe DNA to RNA. This entry serves only to recall the method and add a function to the fasta module in the repository.
It is really simple to transcribe in Python, by employing the replace method on [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1295998</comments>
            <pubDate>Tue, 11 Mar 2008 21:23:47 +0100</pubDate>
            <guid isPermaLink="false">1295998</guid>        </item>
        <item>
            <title>Fasta module: generating reverse complement of DNA sequences</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/249619003/</link>
            <description>As shown in the GenBank DNA parser script, it is really useful to be able to get the reverse complement of some DNA sequences. The reverse complement of a 5&amp;#8242;-3&amp;#8242; DNA sequence is on its complementary strand. Using our fasta module it is easy to implement a function to generate the antiparallel sequence.
Basically we [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1294264</comments>
            <pubDate>Tue, 11 Mar 2008 19:35:17 +0100</pubDate>
            <guid isPermaLink="false">1294264</guid>        </item>
        <item>
            <title>GenBank parsing: both scripts</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/249522766/</link>
            <description>Both scripts to parse GenBank files are below and included in the repository.
First the one that extracts amino acid sequences, and then the latest to extract the DNA.

#! /usr/bin/env python

'''
input is a GenBank file. The script searches for gene annotations, extract all lines
from the file and then parses these lines in order to extract protein [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1294265</comments>
            <pubDate>Tue, 11 Mar 2008 14:08:26 +0100</pubDate>
            <guid isPermaLink="false">1294265</guid>        </item>
        <item>
            <title>GenBank parsing: part II</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/249513103/</link>
            <description>The last thing we saw in the previous entry was the function to parse a GenBank file in order to get the DNA sequence of an annotated gene. This time we will dissect the function.

def parse_entry(gene_data):
  #changes a string to list, splitting at line ends
  gene_data = gene_data.split('\n')
  [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1294266</comments>
            <pubDate>Tue, 11 Mar 2008 13:49:47 +0100</pubDate>
            <guid isPermaLink="false">1294266</guid>        </item>
        <item>
            <title>Parsing GenBank: getting DNA from genome entries, part I</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/247011008/</link>
            <description>Before we see how to improve the speed in our Python code (would it be possible?), let&amp;#8217;s go back to our GenBank parsing algorithm. Last time we saw how to extract the proteins from the CDS entries along the file. This time we will see how to get the DNA from the file. Usually DNA [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1284703</comments>
            <pubDate>Thu, 06 Mar 2008 22:17:03 +0100</pubDate>
            <guid isPermaLink="false">1284703</guid>        </item>
        <item>
            <title>Change</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/241083107/</link>
            <description>I changed the plugin that handles code highlighting, making it possible to also change the theme. I don&amp;#8217;t know if I left some unmodified code lingering, so if anyone sees any problem, just drop me a note. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1254979</comments>
            <pubDate>Mon, 25 Feb 2008 20:37:48 +0100</pubDate>
            <guid isPermaLink="false">1254979</guid>        </item>
        <item>
            <title>Test II</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/240087361/</link>
            <description>import sys
for i in file:
  print i (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1252397</comments>
            <pubDate>Sun, 24 Feb 2008 03:01:11 +0100</pubDate>
            <guid isPermaLink="false">1252397</guid>        </item>
        <item>
            <title>test</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/239941548/</link>
            <description>[sourcecode language=”python”]a = len(list)[/sourcecode] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1251703</comments>
            <pubDate>Sat, 23 Feb 2008 14:04:19 +0100</pubDate>
            <guid isPermaLink="false">1251703</guid>        </item>
        <item>
            <title>99 problems</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/239770397/</link>
            <description>André Roberge, of Crunchy fame, is converting a list of 99 Prolog problems to Python. Check it out. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1251079</comments>
            <pubDate>Sat, 23 Feb 2008 04:13:17 +0100</pubDate>
            <guid isPermaLink="false">1251079</guid>        </item>
        <item>
            <title>A comparison between Python and C(++): parsing a GenBank file</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/238333926/</link>
            <description>Last time we saw how to extract all the protein sequences from a GenBank genome file. A program is available in the WU-Blast package that does exactly the same thing, called gt2fasta. Let&amp;#8217;s compare the performance of our Python script and this compiled C++ (I believe it is C++, the source is not available). I [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1244981</comments>
            <pubDate>Wed, 20 Feb 2008 19:07:11 +0100</pubDate>
            <guid isPermaLink="false">1244981</guid>        </item>
        <item>
            <title>Contact form</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/237581203/</link>
            <description>I added a contact form if anyone is interested in dropping me a note. It can be accessed from the menu above. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1241806</comments>
            <pubDate>Tue, 19 Feb 2008 14:02:34 +0100</pubDate>
            <guid isPermaLink="false">1241806</guid>        </item>
        <item>
            <title>Parsing a GenBank file, another approach</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/235634910/</link>
            <description>In the last couple of entries we learned a little bit about sets and that newer Python versions make their use a lot faster than dictionaries to uniquify lists. In a couple of entries we will see some simple examples of how to use sets in an efficient way.
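As a quick reminder of what that comparison looked like, here is a minimal sketch of both approaches. The gene IDs and the two function names are made up for illustration and are not taken from the earlier entries; the sketch also makes no claim about which version is faster.

```python
# Hypothetical gene IDs, invented for illustration only.
gene_ids = ['geneA', 'geneB', 'geneA', 'geneC', 'geneB']


def uniquify_with_set(items):
    # A set keeps a single copy of each item; order is not preserved.
    return list(set(items))


def uniquify_with_dict(items):
    # Dictionary keys are unique as well, so the keys form the unique list.
    seen = {}
    for item in items:
        seen[item] = True
    return list(seen.keys())


unique_from_set = uniquify_with_set(gene_ids)
unique_from_dict = uniquify_with_dict(gene_ids)
```

Either function returns each ID once. The set version does not promise any particular order, so sort the result if the order matters.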
And now for something completely different &amp;#8230; we go [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1234544</comments>
            <pubDate>Fri, 15 Feb 2008 16:37:35 +0100</pubDate>
            <guid isPermaLink="false">1234544</guid>        </item>
        <item>
            <title>Benchmarking Python: fastest way to generate unique lists</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/226031865/</link>
            <description>Just for fun, let&amp;#8217;s see if there is any advantage (apart from generating smaller code) in using either of the approaches to create a unique list. A list of 741 gene IDs and another one with 1322 (that contained all the 741 IDs from the first) were used. Instead of hard coding the lists [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1188495</comments>
            <pubDate>Wed, 30 Jan 2008 18:13:29 +0100</pubDate>
            <guid isPermaLink="false">1188495</guid>        </item>
        <item>
            <title>Uniquifying lists with sets and dictionaries</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/225655829/</link>
            <description>We are going to use our previous example to compare the use of sets and dictionaries to create unique lists. We&amp;#8217;ve already seen that when sets are used it is very simple to transform a list with repeated items into a unique list. The only hassle is to create the set and then transform it [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1187081</comments>
            <pubDate>Wed, 30 Jan 2008 03:34:21 +0100</pubDate>
            <guid isPermaLink="false">1187081</guid>        </item>
        <item>
            <title>Python sets</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/224817691/</link>
            <description>After a long hiatus we are back with more Python tips, tricks, codes and snippets. This time we will check how set works in Python. set is another object type available in Python (version 2.3 and up) that brings a lot of features to the language.
From the Python Library Reference: &amp;#8220;A set object is an [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1184616</comments>
            <pubDate>Mon, 28 Jan 2008 21:55:35 +0100</pubDate>
            <guid isPermaLink="false">1184616</guid>        </item>
        <item>
            <title>“Cutting” chromosomes with Python: a fast approach, take two, including precision</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/190917460/</link>
            <description>In the last post we saw how to &amp;#8220;cut&amp;#8221;/extract a segment of a chromosome quickly in Python. But our last code had no precision in the cutting/extracting: it didn&amp;#8217;t take into account the line size, so it would extract more information than actually requested.
We need to address precision then. First take into account the [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1051130</comments>
            <pubDate>Mon, 26 Nov 2007 21:41:57 +0100</pubDate>
            <guid isPermaLink="false">1051130</guid>        </item>
        <item>
            <title>“Cutting” chromosomes with Python: a fast approach, take one</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/187925764/</link>
            <description>In the last couple of posts we started with some functional programming aspects of Python. I was away last week and couldn&amp;#8217;t create anything related to FP in the meantime, so I decided to post about a quick way to &amp;#8220;cut&amp;#8221;, or extract segments, from chromosomes stored as FASTA files.
This is a subject that I have been [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=1039994</comments>
            <pubDate>Tue, 20 Nov 2007 21:38:52 +0100</pubDate>
            <guid isPermaLink="false">1039994</guid>        </item>
        <item>
            <title>Functional programming in Python: using filter, take one</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/178418183/</link>
            <description>This time we check another functional programming function from Python: filter. As the name implies, filter returns the items from a sequence (list, string, etc.) that satisfy a condition defined by the function. The syntax is very similar to map

filter(function, sequence)

and, like map, it returns a list (except when the [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=998527</comments>
            <pubDate>Thu, 01 Nov 2007 21:08:24 +0100</pubDate>
            <guid isPermaLink="false">998527</guid>        </item>
        <item>
            <title>Functional programming in Python: using map</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/177836567/</link>
            <description>Functional programming in Python. First we need to define what functional programming is. Quoting Wikipedia, it &amp;#8220;is a programming paradigm that treats computation as the evaluation of mathematical functions and avoids state and mutable data. It emphasizes the application of functions, in contrast with the imperative programming style that emphasizes changes in state.&amp;#8221;
Rather complex, eh? Another [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=994874</comments>
            <pubDate>Wed, 31 Oct 2007 23:17:23 +0100</pubDate>
            <guid isPermaLink="false">994874</guid>        </item>
        <item>
            <title>BioPython 1.44 released</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/177257370/</link>
            <description>As advertised on many boards and sites, BioPython has a new release, 1.44. You can get it here. I am planning some posts covering BioPython. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=991696</comments>
            <pubDate>Tue, 30 Oct 2007 16:41:02 +0100</pubDate>
            <guid isPermaLink="false">991696</guid>        </item>
        <item>
            <title>Merging single (or multiple) sequence FASTA files</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/176832746/</link>
            <description>In the last couple of entries we saw how to split a multiple FASTA file. How can we achieve the opposite and merge single (or sometimes multiple) sequence files into one larger multiple sequence file? In Python it is very simple, but we need to introduce a new module that will allow us to get the desired result.
In [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=988377</comments>
            <pubDate>Mon, 29 Oct 2007 20:17:31 +0100</pubDate>
            <guid isPermaLink="false">988377</guid>        </item>
        <item>
            <title>Splitting a FASTA file using awk (no sed required), or do we care about csplit?</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/176741144/</link>
            <description>We saw that &amp;#8220;top notch bioinformaticians&amp;#8221; use csplit to split FASTA files, so I decided to post as many alternatives as possible to split these files. Like csplit, awk is found more often on Linux machines than on Windows, but it can be installed on Windows (even Vista) and it runs fine. Awk is [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=988378</comments>
            <pubDate>Mon, 29 Oct 2007 16:37:07 +0100</pubDate>
            <guid isPermaLink="false">988378</guid>        </item>
        <item>
            <title>Generating multiple sets of random DNA sequences with one script (and a bash one)</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/175454456/</link>
            <description>A commenter, Dilmurat, gave me an idea about the script that generates random DNA sequence sets. Apparently it wasn&amp;#8217;t clear that the script was intended to generate only one sequence set, and not multiple sets. Dilmurat also offered his solution:

#!/usr/bin/env python

import random
import sys

def simulate_sequence(length):
    dna = ['A', 'C', 'G', 'T']
    sequence = [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=982476</comments>
            <pubDate>Fri, 26 Oct 2007 17:15:47 +0100</pubDate>
            <guid isPermaLink="false">982476</guid>        </item>
        <item>
            <title>Off topic</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/171591449/</link>
            <description>Mr Daniel Swan accused me of not publishing his comment or replying to his comments. I have never seen his reply, either in the comments or as an email to me, so to fulfill his desires it is published here. 
I admit that some of the things you have addressed in the past, whilst ‘reinventing’ [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=961566</comments>
            <pubDate>Thu, 18 Oct 2007 14:47:32 +0100</pubDate>
            <guid isPermaLink="false">961566</guid>        </item>
        <item>
            <title>Alternative methods to split a FASTA file</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/168054484/</link>
            <description>As Daniel didn&amp;#8217;t enlighten us on how to use csplit, I am posting several ways to split a multiple sequence FASTA file. This post gets out of our focus (if you haven&amp;#8217;t noticed, our focus here is Python) and maybe suffers from the NIH (Not Invented Here) effect. We will be back with [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=941741</comments>
            <pubDate>Wed, 10 Oct 2007 19:10:36 +0100</pubDate>
            <guid isPermaLink="false">941741</guid>        </item>
        <item>
            <title>Splitting multiple FASTA file, or why shouldn’t we reinvent the wheel?</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/168016911/</link>
            <description>Daniel Swan posted a comment on the previous entry saying that splitting multiple sequence FASTA files is &amp;#8220;one of those ‘bioinformatics’ tasks where people are seriously guilty of reinventing the wheel&amp;#8221;. 
Biologically, let&amp;#8217;s dissect his comment. As there are not many comments in this blog, we take advantage of the few that are posted. 
I don&amp;#8217;t [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=941742</comments>
            <pubDate>Wed, 10 Oct 2007 17:04:59 +0100</pubDate>
            <guid isPermaLink="false">941742</guid>        </item>
        <item>
            <title>Splitting a multiple sequence FASTA file, making it better</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/167613090/</link>
            <description>In the last entry we saw how to split a multiple sequence FASTA file with our previous FASTA module. The previous script saved one sequence per file, where the filename was identical to the original file except for a number added to the end. Let&amp;#8217;s say we have the file mysequences.fa, our script would [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=938647</comments>
            <pubDate>Tue, 09 Oct 2007 21:14:19 +0100</pubDate>
            <guid isPermaLink="false">938647</guid>        </item>
        <item>
            <title>Splitting a multiple sequence FASTA file</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/167593254/</link>
            <description>Starting the new phase of the website, we are going to see some random thoughts on Python and Bioinformatics. Any ideas would be highly appreciated, and I will use some terms used in searches that ended up finding this blog or this or this. We will also see some of the ideas that are being [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=938648</comments>
            <pubDate>Tue, 09 Oct 2007 20:22:39 +0100</pubDate>
            <guid isPermaLink="false">938648</guid>        </item>
        <item>
            <title>Python Magazine</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/165875993/</link>
            <description>This is a blog devoted to basic Python, but I could not refrain from suggesting the new Python Magazine, which has its first issue free online.
Enjoy. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=931021</comments>
            <pubDate>Fri, 05 Oct 2007 20:55:30 +0100</pubDate>
            <guid isPermaLink="false">931021</guid>        </item>
        <item>
            <title>GenBank: parsing some features (and a change for the future)</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/164896139/</link>
            <description>This is the last entry based on the book. In my opinion, further topics in the book are a little bit redundant and can be accomplished quite easily if you have followed the tutorial here. If a good number of people have interest in checking the remainder of the book, just let me know and [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=925179</comments>
            <pubDate>Wed, 03 Oct 2007 21:25:16 +0100</pubDate>
            <guid isPermaLink="false">925179</guid>        </item>
        <item>
            <title>Ads</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/162628328/</link>
            <description>My site and blog(s) are hosted by an ISP that I pay for with my own money. Because of that, I decided to add a small AdSense ad in the left column of the blog. (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=911759</comments>
            <pubDate>Fri, 28 Sep 2007 21:21:28 +0100</pubDate>
            <guid isPermaLink="false">911759</guid>        </item>
        <item>
            <title>GenBank files: take two</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/162483038/</link>
            <description>Last time we saw how to extract the sequence from our GenBank file, but the final result might not be as nice as we wanted it to be. So, we need to make it prettier. Running last entry&amp;#8217;s script we end up with this sequence
    1 gtttggtcct aaccttgtaa [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=911760</comments>
            <pubDate>Fri, 28 Sep 2007 14:50:27 +0100</pubDate>
            <guid isPermaLink="false">911760</guid>        </item>
        <item>
            <title>GenBank files: take one</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/162160900/</link>
            <description>We are back, moving to a new chapter of the book and starting a new section on the site. This chapter deals with the manipulation of GenBank files. These files are used by NCBI to store information about RNA, DNA and protein sequences. A GenBank file is usually composed of an annotation section, which gives information about [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=908435</comments>
            <pubDate>Thu, 27 Sep 2007 21:24:33 +0100</pubDate>
            <guid isPermaLink="false">908435</guid>        </item>
        <item>
            <title>Restriction enzymes: the grand finale</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/156134915/</link>
            <description>We get to the last piece of the puzzle. In the last four posts we have seen each part of the restriction enzyme site searcher script, and now we put everything together. Anyone also following the book will see that the Python code we are producing here is slightly different, but in essence [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=869421</comments>
            <pubDate>Thu, 13 Sep 2007 21:18:25 +0100</pubDate>
            <guid isPermaLink="false">869421</guid>        </item>
        <item>
            <title>Restriction enzymes, third take</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/156034540/</link>
            <description>We come to the penultimate part of our restriction enzyme site finder. Just a couple of pieces are missing from the puzzle and we are there. First, the most important: the function that searches for the sites, using regex patterns. We called it find_sites

Code (python)

def find_sites&amp;#40;input, set, sequence&amp;#41;:
&amp;#160; &amp;#160; iupacdict = &amp;#123;&amp;#39;A&amp;#39;:&amp;#39;[A]&amp;#39;,
&amp;#160; &amp;#160; &amp;#39;C&amp;#39;:&amp;#39;[C]&amp;#39;,
&amp;#160; &amp;#160; &amp;#39;G&amp;#39;:&amp;#39;[G]&amp;#39;,
&amp;#160; [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=869422</comments>
            <pubDate>Thu, 13 Sep 2007 17:08:42 +0100</pubDate>
            <guid isPermaLink="false">869422</guid>        </item>
        <item>
            <title>Ratings</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/155564455/</link>
            <description>A new feature has been added to the blog: ratings. As I don&amp;#8217;t receive many comments on the posts but there is a steady stream of viewers of the website, I decided to add the possibility of rating a post. So, if you feel the post was useful for you, well or badly written, just [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=867194</comments>
            <pubDate>Wed, 12 Sep 2007 16:45:47 +0100</pubDate>
            <guid isPermaLink="false">867194</guid>        </item>
        <item>
            <title>Restriction enzymes: second take</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/153617575/</link>
            <description>We already have a function that reads the enzymes from a dataset in a flat file (with one change: return)

Code (python)

def read_enzymes&amp;#40;file&amp;#41;:
&amp;#160; &amp;#160; resenz = &amp;#123;&amp;#125;
&amp;#160; &amp;#160; start = False
&amp;#160; &amp;#160; for line in file:
&amp;#160; &amp;#160; &amp;#160; &amp;#160; if line.find&amp;#40;&amp;#39;Rich Roberts&amp;#39;&amp;#41; &amp;#62;= 0:
&amp;#160; &amp;#160; &amp;#160; &amp;#160; &amp;#160; &amp;#160; start = True
&amp;#160; &amp;#160; &amp;#160; &amp;#160; &amp;#160; [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=849784</comments>
            <pubDate>Fri, 07 Sep 2007 21:49:25 +0100</pubDate>
            <guid isPermaLink="false">849784</guid>        </item>
        <item>
            <title>Planning</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/153598402/</link>
            <description>Another aspect covered in the book that we haven&amp;#8217;t seen yet is how to plan and design our script or software. Usual ways to design a program include writing use cases and drawing UML (Unified Modeling Language) diagrams. Here we will scratch the surface of use cases, where we will try to determine how [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=849785</comments>
            <pubDate>Fri, 07 Sep 2007 20:51:43 +0100</pubDate>
            <guid isPermaLink="false">849785</guid>        </item>
        <item>
            <title>Restriction enzymes: first take</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/150693350/</link>
            <description>We now jump with both feet into the main topic of the book&amp;#8217;s chapter, which is generating restriction maps of DNA sequences. The first step is to obtain restriction enzyme information, read it and format it in a way that our main script will understand. We will use the same dataset as the book, the Rebase database [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=835373</comments>
            <pubDate>Fri, 31 Aug 2007 21:08:12 +0100</pubDate>
            <guid isPermaLink="false">835373</guid>        </item>
        <item>
            <title>Finding motifs: IUPAC and RegEx, an approach</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/149392881/</link>
            <description>After a long delay, we are back. Before entering the next topic, Restriction Enzymes, let&amp;#8217;s take a look at how to create a regex pattern from user input and the dictionary of IUPAC codes for nucleotides. We will use the same dictionary from the previous entry

Code (python)

iupacdict = &amp;#123;&amp;#39;M&amp;#39;:&amp;#39;[AC]&amp;#39;,
&amp;#160; &amp;#160; &amp;#160; &amp;#160; &amp;#39;R&amp;#39;:&amp;#39;[AG]&amp;#39;,
&amp;#160; &amp;#160; [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=827869</comments>
            <pubDate>Tue, 28 Aug 2007 22:24:19 +0100</pubDate>
            <guid isPermaLink="false">827869</guid>        </item>
        <item>
            <title>Krugle</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/134269249/</link>
            <description>A suggestion: whenever you are coding or learning a new language it is always useful to have some code at hand to compare, check and learn from. One of the good tools for code search is Krugle, where you can extensively search code based on language and project name, among other variables.
One example: I searched [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=737456</comments>
            <pubDate>Mon, 16 Jul 2007 17:08:26 +0100</pubDate>
            <guid isPermaLink="false">737456</guid>        </item>
        <item>
            <title>Finding motifs: IUPAC and RegEx, an overview</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/134267164/</link>
            <description>End of Section 5, moving to Section 6. For anyone also following the book there will be a jump at the end of chapter 8, so we are not touching the final script that deals with different reading frames here. We are going straight, or almost, to another take on Regular Expressions. We are going [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=737457</comments>
            <pubDate>Mon, 16 Jul 2007 16:59:56 +0100</pubDate>
            <guid isPermaLink="false">737457</guid>        </item>
        <item>
            <title>Translating DNA into proteins: second approach, now using FASTA files</title>
            <link>http://feeds.feedburner.com/~r/BeginningPythonForBioinformatics/~3/132753243/</link>
            <description>We have seen before how to translate DNA sequences into amino acid sequences. We have even created a module that contains the dictionary for the genetic code. Now we are going to combine both (very simple) modules we created into one nice script for day-to-day use.
So, we have the dnatranslate.py and the fasta.py that we [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=728374</comments>
            <pubDate>Wed, 11 Jul 2007 20:18:57 +0100</pubDate>
            <guid isPermaLink="false">728374</guid>        </item>
        <item>
            <title>Formatting output of FASTA files</title>
            <link>http://python.genedrift.org/2007/07/10/formatting-output-of-fasta-files/</link>
            <description>The Beginning Perl for Bioinformatics book shows a script to print formatted sequence data, specifying that no more than 80 characters (either nucleotides or amino acids) should be printed across a page. Here, we will see a similar script in Python and will include it in our fasta.py module in order to use it a [...] (Source: Beginning Python for Bioinformatics)</description>
            <author>Beginning Python for Bioinformatics</author>
            <type>news</type>
        <comments>http://www.medworm.com/rss/comments.php?id=725022</comments>
            <pubDate>Tue, 10 Jul 2007 16:30:15 +0100</pubDate>
            <guid isPermaLink="false">725022</guid>        </item>
    </channel>
</rss>

