ListSummary:May 10-31,2006



May 10-31, 2006

Oh me oh my. I go to a conference and come back to a pretty busy mail-list. Of course I added a few comments myself to everything, but that's just because I can't keep anything to myself. Looks like lots of changes and proposed changes to modules, a few new ideas, numerous bug reports, and some good ol' Perl hackery. There's so much now I'll have to break this list summary into a couple of updates, which should be complete by June 6 (fingers crossed!).

First things first...


The Deobfuscator

At the top of the list is the availability of the Deobfuscator interface on the BioPerl server, thanks to Mauricio and Dave. What's the Deobfuscator? It's an easy way 'to determine the methods that are available from a given BioPerl module' (directly from the page below):

Second call for abstracts - BOSC 2006

Darin London has made the second call for abstracts for BOSC 2006. Details here...

CDAT - Integrative objects for character trees and data

On the main mail list, Arlin Stoltzfus has proposed an integrative object (called CDAT) for character trees and data. The second link below leads to the proposal outline in PDF format...

Modware - BioPerl-based OO API for Chado

Eric Just has posted Modware, a BioPerl-based, object-oriented API, written in Perl, for Chado. More information can be found here...

Eric's mailing list and news posts. Here is a direct link to the Modware site.

Bioperl-based Applications for "Free Software" Session

Andreas Bender announces a Computational Life Sciences Conference to which users can submit BioPerl-based applications or cool uses for BioPerl...

Project OpenLab

Jay Hannah proposes a new project (Project OpenLab) to 'provide a "point and click" toolset allowing researchers to quickly build arbitrary databases of sequences, primer groups, and primers.' Fernan Aguero and Rutger Vos give their opinions...

Bio::Phylo Mail List

Rutger Vos has announced the availability of the Bio::Phylo mail list, hosted by the Open Bioinformatics Foundation... hey, that's us!

And now to the mail lists!


Bio::SeqIO oddness

YT points out issues with the way strains and subspecies designations are handled within Bio::SeqIO (via Bio::Species). Torsten chimes in as well; the resolution (though indirect): pull out the information from the 'source' tag from the feature table and look into Bio::Species yet again....

More fun with primer3

Chen Li and friends (?) continue discussing the idiosyncrasies of using Bio::Tools::Primer3 and Bio::Tools::Run::Primer3, and just what the difference between the two is (hint: it's in the name)...

Problems grabbing the PubMed ID

Yu Zhou responds to an older post about grabbing the PubMed ID from a file; Brian O. reveals that Yu may be using old code...

Fun(?), annoyances, and confusion with Bio::Species/Bio::Taxonomy::Node/Bio::DB::Taxonomy

Sendu Bala writes in to state his confusion with Bio::Taxonomy::Node and how it handles names. Jason and YT chime in, discussing issues which also relate to Bio::DB::Taxonomy and Bio::Species parsing...

BioPerl and converting gene predictions to GFF

Torsten responds to an off-list question asking how one can convert several results containing gene predictions to GFF format, giving a bit of code to demonstrate...

More fun(?) with Bio::DB::Taxonomy

Back for more, Sendu Bala shows that Bio::DB::Taxonomy has problems with certain names at the species level (and possibly below) and posts some potential code as an example. YT agrees and, in a caffeine-induced haze, goes off on all things taxonomic, born of frustrations with species/strain mangling from Bio::Species and Bio::DB::Taxonomy. Nadeem Faruque gives his thoughts as well, including other possible avenues to fix the issue...

How not to use BioPerl - Lesson #1

Saurabh Maheshwari asks for help with a complicated script that uses Bio::Graph modules to look for protein-protein interactions. Of course, this wasn't immediately apparent from the original post, so Marc Logghe and Sean Davis prodded Saurabh for more information. Sean Davis makes further recommendations, while YT gets honest when scrutinizing the submitted code. Lesson learned here? Do not cut-and-paste or copy from various scripts and expect the result to work the way you want...

Parsing output files from other tools

Hubert Prielinger wants to know if Bio::SearchIO can parse mpsearch format. YT points out that it can't, but one could either build a module or use ssearch and Bio::SearchIO::fasta instead...

How to get the Reverse Complement from a sequence

Chen Li has problems getting the reverse complement from a sequence file; YT and Marc lend a hand...
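For anyone who missed the thread, here is a minimal sketch of the usual approach, assuming BioPerl is installed. The sequence and ID are made up for illustration. The object is built by hand here, but a sequence returned by Bio::SeqIO's next_seq should behave the same way: revcom returns a new sequence object holding the reverse complement.

```perl
use strict;
use warnings;
use Bio::Seq;

# Made-up example sequence; in practice this would usually come from Bio::SeqIO.
my $seq = Bio::Seq->new(
    -id       => 'example',
    -seq      => 'ATGCCG',
    -alphabet => 'dna',
);

# revcom returns a new object; the original sequence is left untouched.
my $rc = $seq->revcom;

print $seq->seq, "\n";   # ATGCCG
print $rc->seq,  "\n";   # CGGCAT
```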

UniProt and MySQL

Hilmar responds and requests thoughts on issues with getting the UniProt database into MySQL (originally posted on BioSQL, see below)...

Possible memory leak in Bio::SearchIO?

Wayne Clark is worried about the possibility of a memory leak in Bio::SearchIO::blast. YT and Torsten give their thoughts, including the possibility that MySQL is to blame...

Bio::DB::Query::GenBank and ID's

Bernd Web finds some potential problems with the way Bio::DB::Query::GenBank and Bio::DB::GenBank handle sequence ID's, including several instances where thrown errors would be helpful and problems with an example in the POD. YT explains that this is mainly due to differences in the methods used by the two modules and decides to look into it in the future...

Where's Bio::SeqIO::entrezgene?

Kenny Daily wants to know where to find Bio::SeqIO::entrezgene; YT points out where. Kenny replies back with some problems and Stefan Kirov helps out...

Six Frame Translation

Chen Li wants to know how to translate in all six frames; Scott Markel and Brian O. help out...
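One way to do this, sketched under the assumption that BioPerl is installed and that the translate method accepts the named -frame argument (offsets 0, 1, and 2): translate the three offsets of the sequence and the three offsets of its reverse complement. The sequence and the frame labels ('+1' through '-3') are made up for this illustration and are not taken from the thread.

```perl
use strict;
use warnings;
use Bio::Seq;

# Made-up example sequence.
my $seq = Bio::Seq->new(
    -id       => 'example',
    -seq      => 'ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG',
    -alphabet => 'dna',
);

my $rev = $seq->revcom;   # reverse strand

# Collect the protein string for each of the six frames.
my %frames;
for my $offset (0 .. 2) {
    $frames{ '+' . ($offset + 1) } = $seq->translate(-frame => $offset)->seq;
    $frames{ '-' . ($offset + 1) } = $rev->translate(-frame => $offset)->seq;
}

for my $name (sort keys %frames) {
    print "$name\t$frames{$name}\n";
}
```

Bio::SeqUtils also provides a translate_6frames method that wraps up the same idea, which may be more convenient than the explicit loop.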

Where's Bio::ASN1::EntrezGene?

Ryan Golhar looks for Bio::ASN1::EntrezGene but can't find it in BioPerl; YT points out that it isn't in BioPerl, but Ryan figures this out anyway...

Formatting sequence output

Chen Li chimes in again with a question about a module for sequence formatting. Malcolm Cook gives him some advice...

Bio::Map enhancements

Sendu Bala has posted proposed enhancements to Bio::Map modules in Bugzilla...

Performance problems and proposed enhancements for BioPerl

David Waner uncovered some pretty significant performance issues with BioPerl and Perl 5.8 on Windows and suggested several fixes that reduced the parsing time dramatically. YT responded enthusiastically (he's seen issues himself) and Brian O. asked how the relevant tests responded...

Getting at features and annotation

Nick Staffa wanted to know how to get features and annotation in a GenBank genome file; Brian O., Torsten, and Chandan Singh all give advice (specifically, check the FAQ and the HOWTO's)...

BioPerl-ext, alignments, and Inline C/XS

Adam Kraut posted a question about how to best 'wrap' a C library for aligning sequences. Aaron Mackey points him towards bioperl-ext and gives some pointers on getting started. Stephen Lenk also chips in with some more advice and a little code...

How to parse BLAST XML output

Hubert Prielinger asked how to parse BLAST XML output. Turns out, according to Warren Gish, that WU-BLAST XML and BLAST XML should both be parsed by Bio::SearchIO::blastxml. YT pointed out changes recently to Bio::SearchIO::blastxml that require XML::SAX and XML::SAX::ExpatXS. The next issue was trying to get at taxonomic information, which Jason and YT address...

Guess the sequence format

Wijaya Edward wants to know if there is a way to guess the sequence format. Jason points out Bio::Tools::GuessSeqFormat...
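A minimal sketch of how the module can be used, assuming BioPerl is installed. The FASTA record is made up, and the text is passed in directly with -text so that no file is needed; -file and -fh are the other input options.

```perl
use strict;
use warnings;
use Bio::Tools::GuessSeqFormat;

# Made-up FASTA record, supplied as a string instead of a file.
my $text = ">example made-up FASTA record\nATGCCGTTAGC\n";

# guess returns the format name, or undef when it cannot decide.
my $format = Bio::Tools::GuessSeqFormat->new(-text => $text)->guess;

print defined $format
    ? "Looks like: $format\n"
    : "Could not guess the format\n";
```

The guessed name can then be handed to Bio::SeqIO as the -format argument, though a guess is only a guess and is worth checking on unusual input.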

Bio::Graphics issues

Our old friend Chen Li comes forward with problems concerning Bio::Graphics. Brian O., Torsten, and Lincoln lend a hand...

Bio::Graph questions

Neil Saunders has problems with using Bio::Graph modules, specifically Bio::Graph::IO and Bio::Graph::ProteinGraph. Brian O. suggests a different (though experimental) set of modules he's donating, called bioperl-network. Brian, has there been an announcement yet?

Bio::SeqIO::swiss versioning problems

Michael Rogoff points out that Bio::SeqIO::swiss doesn't parse out the sequence version correctly (it's on the same line as the date) and proposes a patch. Jason directs him to the proper place to submit patches, but then points out that the date line needs to have the extra versioning information stripped out. YT also says that the patch doesn't address recent changes made by UniProt to the SwissProt format.

Bio::SeqFeature::Tools::Unflattener problems

Barry Moore uncovers a nasty bug with Unflattener, and YT confirms. Another one for Bugzilla...

Bio::Graphics problems (it's like I'm hearing an echo...)

Chen Li tries to figure out why he can't get a .png file to work correctly. Lincoln, Torsten, Brian O., and Marc Logghe offer up suggestions, with Lincoln providing the solution...


Derek Fairley asks about using Bio::Restriction::IO::bairoch. Though he got no direct answer from the mail list, the same question was posed again (see below).

Bio::Assembly::IO::ace output

Rowan Mitchell wants to know if write output is to be implemented for Bio::Assembly::IO::ace. Robson Francisco de Souza replies back that there are no current plans to add a write method for this module since it wasn't needed when originally submitted...

Getting files in EMBL format

Chen Li wants to know how to download files directly from EMBL. No one responds. Poor Chen...

Bio::Graphics::Panel negative position numbering

Kevin Lam Koiyau wants to know how to get negative position numbering with Bio::Graphics::Panel. Lincoln indicates that this is possible under bioperl 1.5.1. Jonathan Epstein adds a second question on how to add directional arrows to BLAST hits, but no one answers.

...and Bio::Restriction::IO again

Jelena Obradovic points out problems with converting a REBASE file into a Bio::Restriction::EnzymeCollection object. YT points out the format is not capitalized, but finds that the normal format ('bairoch') dies with an error. Brian points out that the format name should be more forgiving of case, but YT indicates that case is the least of the issues here...


Genevieve DeClerck is trying to work out how to combine several analyses while parsing through a BLAST report. Brian O. and Torsten lend a hand, and Sendu Bala gives some input on a possible solution.

Problems with bioperl-ext

Orphan : Simon Rayner is having issues getting bioperl-ext running on SUSE Linux...

Using an integrated server to grab output from multiple servers

Orphan : Shameer Khadar wants to know how to get a system set up for grabbing output from multiple servers for display on an integrated server. Shameer, never start your question with 'My query may not be directly related to BioPERL', as we might lose interest...

Bio::Graphics::Panel background color

Orphan : Jelena Obradovic wants to know how to turn off the background color of the panel.

CVS and code auditing

Torsten Seemann stirs up a hornet's nest with an audit regarding the use of 'return undef;' in the core code based on a suggestion made by Damian Conway in Perl Best Practices. Pretty much everybody (where's Jason?!?) gives an opinion on the issue; YT kicks the nest around a bit more by pointing out how many 'throw_not_implemented' statements are found in regular (non-interface) code. How fun!
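For context, the usual argument against 'return undef;' is about list context, where it yields a one-element list that tests as true. The sketch below uses plain Perl only; the two subroutine names are hypothetical and exist just to show the difference, they are not taken from the BioPerl code.

```perl
use strict;
use warnings;

# Hypothetical routines, both meant to signal "nothing found".
sub find_undef { return undef; }   # explicit undef
sub find_bare  { return; }         # bare return

my @a = find_undef();   # list context: one element, (undef)
my @b = find_bare();    # list context: empty list, ()

print scalar(@a), "\n";   # 1
print scalar(@b), "\n";   # 0

# The array holding (undef) is non-empty, so this "failure" tests as true.
print "find_undef looks like success in list context\n" if @a;
print "find_bare looks like failure in list context\n"  if !@b;

# In scalar context both forms return undef.
my $x = find_undef();
my $y = find_bare();
print "both undef in scalar context\n" if !defined $x && !defined $y;
```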

Bio::TreeIO "Collapse" function

Lucia Peixoto ponders how to collapse any node in a tree. Aaron Mackey and Jason suggest solutions. Lucia has a few more questions. I have a question, Lucia: what exactly is the "ass_Descendent" method? Couldn't find that one in POD...

Moving the tutorial to the wiki

Jay Hannah tries moving the Bioperl tutorial instructions to the wiki and finds out how tricky it can be. Brian O., YT, and Mauricio all join in on trying to decide whether to keep the entire script or to split the documentation from the script completely...

How to add methods to a module

Hongyu Zhang wants to donate code to add to Bio::SimpleAlign but doesn't know how. YT modifies the FAQ (sneaky!) and points him to it...


Problems loading uniprot release 49.6 into mysql

S. Rayner (if that is your real name) relays problems getting UniProt loaded into MySQL. The issue is traced to the 'RL' line in SwissProt files, where the annotation isn't unique. Hilmar suggests a few workarounds and several possible fixes...

... errors and GenBank-mangled UniProt files

Gerben Menschaert submits a problem with loading a GenBank sequence. Hilmar points out that the file in question is a UniProt file and that the GenBank parser has problems with these files, then suggests loading the UniProt files using the SwissProt format.

BioSQL and GBrowse

Genevieve DeClerck says she's having problems getting BioSQL to work with GBrowse. Hilmar has a few suggestions...

Problem with adding features under BioSQL

Michael Cipriano finds an issue with BioSQL and adding features under GBrowse. Hilmar is confused (it IS BioSQL-l, not the GBrowse list). Lincoln posts a fix for the issue.

Bioperl-guts-l (for the diehards)

Note: Significant module changes and additions to CVS are normally announced on the main bioperl-l list if they are in decent enough condition for production work. If modules listed below have not been announced, then there may be a very good reason for it. If you plan on trying to use these, consider contacting the author(s). Many of the modules discussed in this section are highly experimental and are in various stages of development. They may or may not work at all. Therefore, we are not responsible for any problems faced with using this code.

I'm trying out a new layout this week which will hopefully enable me to keep up with all the changes a bit easier (I'm using a script to consolidate this stuff). Let me know what you think.

CVS Commits

Brian Osborne

Brian has contributed bioperl-network, a set of modules used to analyze protein-protein interactions.

Module changes:

Test file changes and additions:

  • Modified tests :Edge.t, IO_dip_tab.t, IO_psi_xml.t, Interaction.t, Node.t, ProteinNet.t
  • Added/Modified data :00001.xml, bovin_small_intact.xml, psi_xml.dat, sv40_small.xml
  • Modified tests :IO_psi_xml.t
  • Modified tests :ProteinNet.t
  • Added/Modified data :arath_small-02.xml

Modified or added scripts/examples:

Christopher Fields

Module changes:

Test file changes and additions:

  • Modified tests :RestrictionIO.t

Jason Stajich

Module changes:

Lincoln Stein

Lincoln continues on with GFF3 integration into Bioperl

Module changes:

Modified or added scripts/examples:

Scott Cain

Module changes:

Sohel Merchant

Sohel has contributed the OBO format parser for ontology parsing:

Module changes:

olivier at

Some movement on bioperl-pipeline!!

Module changes:

Bug Reports

Jason went on a bug hunt and stomped a few along the way, along with the rest of the BioPerl gang...


  • Issue #843 :_readline() in bugzilla-daemon at
  • Issue #1829 :-ve - gff data must be in order
  • Issue #1915 :Tranditional bootstrap in newick format tree not accesible
  • Issue #1926 :Missing sections in Pdoc HTML
  • Issue #1945 :tblastn reports the wrong frame and hit position
  • Issue #1953 :Inconsistency between Bio::Factory::FTLocationFactory->from_string and Bio::Location::Split->to_FTstring
  • Issue #1954 :SeqIO::game does not write or read <computational_analysis> elements
  • Issue #1955 :Add Bio::AnnotatableI inheretance to Bio::Map::SimpleMap
  • Issue #1957 :Add some tests for to t/MapIO.t
  • Issue #1998 :Bio::Map::Marker incomplete implementation
  • Issue #2002 :Infinite recursive loop in
  • Issue #2003 doesn't parse sequence version out of swissprot files
  • Issue #2011 :New: Bio::Restriction::IO enhancements/code issues


  • WORKSFORME Issue #1988 :SearchIO::blast parses bit score incorrectly when score is reported in scientific notation
  • WORKSFORME Issue #2001 :RemoteBlast does not return HTML reports properly
  • LATER Issue #1539 cannot be told to read sequence ID as accession number
  • LATER Issue #1924 :scansite
  • INVALID Issue #2004 parsing error with certain locations
  • INVALID Issue #2009 :Bio::Restriction::IO object fails to load enzymes from local file; default set always loaded instead.
  • FIXED Issue #1830 :strange dashes on the negative axis when using xyplot
  • FIXED Issue #1985 :PSI-BLAST parsing fails on Windows
  • FIXED Issue #1989 :GI identifier missing when using Bio::Index::GenBank
  • FIXED Issue #2000 :AlignIO clustal/fasta parsing
  • FIXED Issue #2006 :Bio::SeqIO::genbank does not parse CONSRTM field
  • FIXED Issue #2007 :Remote webblast with input file
  • FIXED Issue #2008 :Bio::Location::Split produces erroneous start/end coordinates with multiple Fuzzy sublocations
  • FIXED Issue #2010 :Bio::Restriction::IO::bairoch has problems with enzymes with multiple sites