Core 1.2.3 1.4.0 delta

These are detailed notes on changes made between bioperl-release-1-2-3 and bioperl-release-1-4-0.

Bio::Align::DNAStatistics
Made it possible to silence the warnings
Code alignment formatting
Runnable synopsis
Bio::AlignIO
Added a parser and tests for the UCSC MAF (multiple alignment format) format. Added a method, SimpleAlign::splice_by_seq_pos, to allow splicing of all sequences based on the gap locations of one sequence within the alignment. This could in principle be called repeatedly to remove all gaps from the MSA.
Guessing sequence and align formats by looking into file
Made format guessing follow the same logic as in Bio::SeqIO. Tests pass.
Fail better with no arguments to new()
Changed GuessSeqFormat to return undef
Bio::AlignIO::bl2seq
Johnathan Segal's fixes for Issue #1541 - problem with reverse complement alignments in bl2seq
Use SearchIO instead of BPbl2seq
Bio::AlignIO::maf
Added a parser and tests for the UCSC MAF (multiple alignment format) format. Added a method, SimpleAlign::splice_by_seq_pos, to allow splicing of all sequences based on the gap locations of one sequence within the alignment. This could in principle be called repeatedly to remove all gaps from the MSA.
Cleaned up spurious unit test warnings. Bugfix in the maf parser for detecting the last record in a file. Added functionality to SimpleAlign to trim gaps from an MSA for a given sequence. Trimming allowed implementation of exporting Seq and SeqFeatures from SimpleAlign. The API here is still rough; comments appreciated.
Offset the location of a new seq with features by the location of the original seq it was requested to be built from. Added rudimentary key/value parsing for maf 'a' lines.
Silencing a warning when running tests
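The idea of splicing an alignment by the gap positions of one sequence, and repeating the step to remove all gaps, can be sketched in plain Perl. This is only an illustration on strings: drop_gap_columns is a hypothetical helper, not SimpleAlign::splice_by_seq_pos, and it assumes '-' is the only gap character.

```perl
use strict;
use warnings;

# Hypothetical helper, not SimpleAlign::splice_by_seq_pos itself: drop every
# alignment column in which the chosen reference row has a gap ('-').
sub drop_gap_columns {
    my ($ref_index, @rows) = @_;
    my $ref  = $rows[$ref_index];
    my @keep = grep { substr($ref, $_, 1) ne '-' } 0 .. length($ref) - 1;
    return map {
        my $row = $_;
        join '', map { substr($row, $_, 1) } @keep;
    } @rows;
}

my @msa = ('AC-GT', 'ACTG-', 'A-TGT');
# Applying the step once per row leaves only columns that are gap-free in every row.
@msa = drop_gap_columns($_, @msa) for 0 .. $#msa;
print "$_\n" for @msa;
```

Each pass removes the columns gapped in one row, so after one pass per row no gapped column remains.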
Bio::Annotation::Comment
Overloaded the double-quote (stringification) operator as as_text, but commented the overload out, since it does not work with other modules
Bio::Annotation::DBLink
Overloaded the double-quote (stringification) operator as as_text, but commented the overload out, since it does not work with other modules
Bio::Annotation::OntologyTerm
Replaced the call to the phased-out each_synonym with get_synonyms.
Bio::Annotation::Reference
Overloaded the double-quote (stringification) operator as as_text, but commented the overload out, since it does not work with other modules
Bio::Coordinate::Collection
Add a method mapper_count - to get the number of stored mappers
Bio::Coordinate::GeneMapper
Cleanup code a little. use debug instead of print STDERR
Bio::Coordinate::Pair
Need to treat 0 as +1 for protein based strands
Bio::Coordinate::Result
Cleanup code a little
Bio::Coordinate::Utils
each_mapper returns an array, not an arrayref
Bio::DB::Fasta
Fixed issue of _split_group() being called as class method
Modified Bio::DB::Fasta so as to correctly reindex files created using the NDBM .pag and .dir files
Bio::DB::Fasta now handles CRLF files generated on windows platforms
Bio::DB::Flat
Some -index value has to be passed, it's required and there's no default value. Fixes Daniel Lang's bug.
The following regression tests now pass: GFF, SeqFeature, Registry
Bio::DB::Flat::BinarySearch
More detail on secondary namespaces
Bio::DB::GFF
Added minor workarounds for Bio::DB::GFF roundtripping to artemis
Caveat emptor and all other disclaimers. I just need to get Sheldon's changes into the CVS so that I can merge it with my own changes. Again, if this causes anyone heartache I have made backups of the original and I can roll back. The changes here are primarily to deal with the GFF2.5 format GFF that handles alignment features
Generalized Mark and Sheldon's GFF 2.5 unflattening scheme
Improved API for preferred_groups() method
Fixed issue of _split_group() being called as class method
Made a small change in group assignments so that the order in which preferred groups are listed determines precedence
Retracting edits on the _split_gff2_method until after the 1.4 release. Manipulating GFF group assignments requires further testing
Bio::DB::GFF::Adaptor::biofetch
Changes making genbank2gff.pl use SOFA terms for type names in generated GFF3
Minor, probably temporary, fixes to biofetch adaptor for genbank2gff
This biofetch adaptor and genbank2gff are better than before, but not yet perfect. Fixed: gets the parentage of CDSs (belong to mRNAs) and exons (belong to chromosomes or other regions) right; variation features are now classified as SNPs if the length is 1, else as chromosome_variation; no longer creates CDS lines that span introns. Remaining problems: with multiple-transcript genes, it can get confused as to which CDSs go with which mRNA (solution: reimplement with Unflattener); SNP features can all get assigned the same ID, which is illegal in GFF3, though I don't think it will cause problems with gbrowse (though it probably will with chado).
Bio::DB::GFF::Adaptor::dbi
Fixed Issue #1384; histograms not working
Bio::DB::GFF::Adaptor::dbi::iterator
Added postgres adaptor
Bio::DB::GFF::Adaptor::dbi::oracle
Added postgres adaptor
I need to commit some of Sheldon's changes in order to merge them with my own changes at home. Hopefully this doesn't break anyone else's code - it should not affect anyone who is not using alignment features in Gbrowse GFF format, or GFF3 format. If this hurts anyone let me know and I will roll back the changes. I'm in purgatory anyway, so...
Bio::DB::GFF::Adaptor::dbi::pg
Added postgres adaptor
Bio::DB::GFF::Adaptor::memory
Added postgres adaptor
Removed global $VERSION
Generalized Mark and Sheldon's GFF 2.5 unflattening scheme
Fixed a bug that caused wildcard search to return unexpected results.
Bio::DB::GFF::Adaptor::memory_iterator
Added postgres adaptor
Bio::DB::GFF::Aggregator
Fixed errors in the high-mag sequence alignments shown by the segments glyph
Bio::DB::GFF::Feature
Added recursive gff3 dumping to Bio::DB::GFF
Added minor workarounds for Bio::DB::GFF roundtripping to artemis
Fixed overly-promiscuous regexp for detecting FASTA files
I need to commit some of Sheldon's changes in order to merge them with my own changes at home. Hopefully this doesn't break anyone else's code - it should not affect anyone who is not using alignment features in Gbrowse GFF format, or GFF3 format. If this hurts anyone let me know and I will roll back the changes. I'm in purgatory anyway, so...
Removing debugging messages
Silence the uninitialized value error
Reworked the following methods to more closely resemble the corresponding Bio::SeqFeatureI methods: all_tags (alias get_all_tags), gff_string, get_tag_values; aliased sub_SeqFeature to get_SeqFeatures
Generalized Mark and Sheldon's GFF 2.5 unflattening scheme
Quoting freetext in group name value of GFF2 output
Bio::DB::GFF::RelSegment
Added minor workarounds for Bio::DB::GFF roundtripping to artemis
Bio::DB::Query::WebQuery
Quell undef warnings - not sure why version is not properly set
Honour proxy environment variables, as suggested in #1482
Bio::DB::Registry
ActiveState has no getpwuid() so AS users can use /home/bosborne
The HOWTO says that one should be able to use 1 or more seqdatabase.ini files. This is right, since the administrator could put one in /etc/bioinformatics and I might want my own in /home/bosborne/.bioinformatics. The old code was reading 1 *ini file and skipping the rest in OBDA_SEARCH_PATH, now it reads all the files specified in OBDA_SEARCH_PATH, as well as the standard locations.
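The "read every file, not just the first" behaviour described above can be sketched as follows. This is not the Bio::DB::Registry code: candidate_ini_files is a hypothetical helper, the ';' separator for OBDA_SEARCH_PATH is an assumption, and the search path value is made up for illustration. The two standard directories are the ones named in the note.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: list every candidate seqdatabase.ini, in order,
# instead of stopping at the first directory on the search path.
sub candidate_ini_files {
    my ($search_path, @standard_dirs) = @_;
    # Assumption: entries in the search path are separated by ';'.
    my @dirs = grep { length } split /;/, ($search_path // '');
    push @dirs, @standard_dirs;
    return map { File::Spec->catfile($_, 'seqdatabase.ini') } @dirs;
}

my @files = candidate_ini_files(
    '/opt/obda;/srv/shared/obda',    # made-up OBDA_SEARCH_PATH value
    '/home/bosborne/.bioinformatics',
    '/etc/bioinformatics',
);

# A caller would then read every file that exists rather than only the first.
my @readable = grep { -e $_ } @files;
print "$_\n" for @files;
```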
Bio::DB::WebDBSeqI
Reset MODVERSION to use Packagewide Version system - change agent string to be 'bioperl-Bio_DB_GenBank/1.3' for example
Honour proxy environment variables, as suggested in #1482
Bio::Expression::FeatureGroup
Allowing an underscore in ontology IDs. This is necessary to be able to parse cjm's OBO_REL relationship ontology, which otherwise observes DAG-Edit format
Bio::Factory::BlastHitFactory
Removing PSIBLAST modules
Bio::Factory::BlastResultFactory
Removing PSIBLAST modules
Bio::Factory::SequenceFactoryI
Some teeny bit of background/justification in the docs
Bio::Graphics::Feature
Substantial changes to Bio::Graphics::Feature to remain in synch with Bio::Tools::GFF
Fixed GFF2 dumping functions which broke when I added GFF3 dumping
Methods added so that a Feature really complies with the SeqI interface... Needed primary_seq especially, but added get_SeqFeatures and the rest to bring it up to spec for 1.3.x
Bio::Graphics::FeatureFile
Fixed errors in score handling in FeatureFile
Fixed various problems in the graphics layer when dealing with scored features
Substantial changes to Bio::Graphics::Feature to remain in synch with Bio::Tools::GFF
All functionality moved to Bio::DB::GFF->split_group, however still see major perf problems, not sure if this is localized to FeatureFile or not
Bio::Graphics::FeatureFile: removed uninitialized-variable warning when calling features() without arguments; fixed the frend web-based feature renderer to accommodate recent changes in the FeatureFile API
Adding a symbol to access a feature's primary ID (eg, database PK)
Added a finished() method to Bio::Graphics::Panel in order to break cycles that lead to memory leaks when multiple panels are created and used
Bio::Graphics::Glyph
Added -codontable options to the glyphs that produce protein translations; courtesy Linda Sperling
Fixed implementation of user-defined glyph sort routines
Partial fix for incorrectly-rendered half-open intervals
Preliminary support for SVG output using GD::SVG
Fixed errors in the high-mag sequence alignments shown by the segments glyph
Polygon-based approach in filled_arrow to support SVG
Made the filled_oval and oval methods compatible with SVG (were just drawing closed arcs)
Fixed crashes that occurred when calling boxes() before gd()
Made image_class, image_package, and polygon_package methods public (and fixed glyphs accordingly)
Added the track to the information retrieved by the boxes() method
Fixed issue of _split_group() being called as class method
Added a finished() method to Bio::Graphics::Panel in order to break cycles that lead to memory leaks when multiple panels are created and used
Added docs describing how subclassers should handle generic image packages
Bio::Graphics::Glyph::Factory
Fixed implementation of user-defined glyph sort routines
Preliminary support for SVG output using GD::SVG
Bio::Graphics::Glyph::arrow
Partial fix for incorrectly-rendered half-open intervals
Bio::Graphics::Glyph::box
Fixed behavior of the "box" glyph; there are still anomalies in drawing due to SVG addition
Bio::Graphics::Glyph::cds
Added -codontable options to the glyphs that produce protein translations; courtesy Linda Sperling
Bio::Graphics::Glyph::diamond
Converted line-based outline to polygon calls
Simplified the call to the polygon package
Removed the hardcoded image class conditional for filled triangles
Made image_class, image_package, and polygon_package methods public (and fixed glyphs accordingly)
Bio::Graphics::Glyph::dot
SVG compliancy marches on: changed the closed-arc method to ellipse and filledEllipse
Bio::Graphics::Glyph::ellipse
Inherits from generic instead of Glyph so that it can now draw labels!
Bio::Graphics::Glyph::generic
Added -codontable options to the glyphs that produce protein translations; courtesy Linda Sperling
Generalized some code to support SVG output
Line-based arrows now render properly when drawn in SVG
Made image_class, image_package, and polygon_package methods public (and fixed glyphs accordingly)
Minor code formatting changes.
Fixed behavior of the "box" glyph; there are still anomalies in drawing due to SVG addition
Fixed the arrow glyph to be consistent with pre-SVG appearance
Bio::Graphics::Glyph::graded_segments
General fixups of the xyplot and graded_segment glyphs
Fixed various problems in the graphics layer when dealing with scored features
Fixed Bio::SeqFeature::Generic so that it will accept a score of 0; modified Bio::Graphics::Glyph::graded_segment so that it draws a fg box around each segment by default (can restore default behavior with -vary_fg=>1)
Bio::Graphics::Glyph::minmax
General fixups of the xyplot and graded_segment glyphs
Fixed various problems in the graphics layer when dealing with scored features
Bio::Graphics::Glyph::pinsertion
Generalized the image_class call for SVG support; enabled bordered fills
Made image_class, image_package, and polygon_package methods public (and fixed glyphs accordingly)
Bio::Graphics::Glyph::rndrect
Generalized image_class; now supports SVG
Made image_class, image_package, and polygon_package methods public (and fixed glyphs accordingly)
Bio::Graphics::Glyph::segments
Added a new "canonical_strand" option to the segments glyph
Fixed errors in the high-mag sequence alignments shown by the segments glyph
Added additional documentation for displaying multiple alignments with the segments glyph
Bio::Graphics::Glyph::translation
Added -codontable options to the glyphs that produce protein translations; courtesy Linda Sperling
Bio::Graphics::Glyph::triangle
Try to fix GD buffer overrun in triangle glyph
More range checking on triangle glyph before fillToBorder call
Preliminary support for polygon-based approach - needs full testing
Converted to polygon method for SVG-compliancy
Made image_class, image_package, and polygon_package methods public (and fixed glyphs accordingly)
Bio::Graphics::Glyph::xyplot
General fixups of the xyplot and graded_segment glyphs
Fixed errors in score handling in FeatureFile
Fixed various problems in the graphics layer when dealing with scored features
Removed function-oriented GD calls for compatibility with SVG output
Removed a superfluous comment. hohum.
Bio::Graphics::Panel
Fixed implementation of user-defined glyph sort routines
Preliminary support for SVG output using GD::SVG
Fixed crashes that occurred when calling boxes() before gd()
Shmooshed the hard-coded key-label font color - still hard-coded - needs to be worked into config
Removed redundant variable declaration in same scope
Made image_class, image_package, and polygon_package methods public (and fixed glyphs accordingly)
Minor code formatting changes.
Fixed the call to panel->height() to avoid a font exception when not running in SVG mode
Added the track to the information retrieved by the boxes() method
Added missing documentation for the Bio::Graphics::Panel->insert_track() method
Added a finished() method to Bio::Graphics::Panel in order to break cycles that lead to memory leaks when multiple panels are created and used
Brackets wrong way round. Thanks, Aaron!
Still trying to get this right
Added an svg() method for lazy coders (well, more to mimic png())
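Why an explicit finished() call helps with memory leaks can be shown with a small reference-counting sketch. My::Panel below is a hypothetical class written for illustration, not Bio::Graphics::Panel; it assumes the leak comes from tracks holding a reference back to their panel.

```perl
use strict;
use warnings;

my $destroyed = 0;    # counts DESTROY calls so the effect is observable

package My::Panel;    # hypothetical class, not Bio::Graphics::Panel

sub new { my $class = shift; return bless { tracks => [] }, $class }

# Each track holds a reference back to its panel, which creates a cycle.
sub add_track {
    my $self = shift;
    push @{ $self->{tracks} }, { panel => $self };
    return;
}

# Break the cycle explicitly so reference counting can free the panel.
sub finished {
    my $self = shift;
    delete $_->{panel} for @{ $self->{tracks} };
    $self->{tracks} = [];
    return;
}

sub DESTROY { $destroyed++ }

package main;

{ my $panel = My::Panel->new; $panel->add_track; }    # cycle keeps it alive
my $after_leak = $destroyed;

{ my $panel = My::Panel->new; $panel->add_track; $panel->finished; }
my $after_finished = $destroyed;

print "destroyed without finished(): $after_leak\n";
print "destroyed with finished(): $after_finished\n";
```

Without the call, the first panel is only reclaimed at program exit, which matters when many panels are created in a long-running process.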
Bio::Graphics::Pictogram
This was spewing out huge numbers of warnings: 'Use of uninitialized value in division'
Support lowercase
Now supporting Bio::Matrix::PSM::SiteMatrix objects
Bio::Index::Abstract
Don't warn if we set verbose to -1
Bio::LocatableSeq
Johnathan Segal's fixes for Issue #1541 - problem with reverse complement alignments in bl2seq
Added a parser and tests for the UCSC MAF (multiple alignment format) format. Added a method, SimpleAlign::splice_by_seq_pos, to allow splicing of all sequences based on the gap locations of one sequence within the alignment. This could in principle be called repeatedly to remove all gaps from the MSA.
Fixed trunc() when strand is -1. Also made end() calculate its value based on the length of the sequence and start. No need to set end() explicitly any more.
Silence a spurious warning arising from unset strand
start() and end() now return undef if there is no sequence string
The following regression tests now pass: GFF, SeqFeature, Registry
Restoring previous functionality, clearer code from Lincoln
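Deriving end() from start and the sequence length, as described in the notes above, amounts to simple arithmetic. The sketch below is a hypothetical helper, not the Bio::LocatableSeq implementation, and it assumes that gap characters ('-' and '.') do not count towards the end coordinate.

```perl
use strict;
use warnings;

# Hypothetical helper, not the Bio::LocatableSeq implementation.
# Assumes gap characters ('-' and '.') do not count towards the end coordinate.
sub computed_end {
    my ($start, $seq) = @_;
    return unless defined $seq && length $seq;    # no sequence string: undef
    (my $residues = $seq) =~ s/[-.]//g;
    return $start + length($residues) - 1;
}

my $end      = computed_end(5, 'AC--GT');    # 5 + 4 residues - 1
my $no_value = computed_end(5, '');
print "end = $end\n";
print "end is undefined without a sequence\n" unless defined $no_value;
```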
Bio::Location::Atomic
Corrected destructive getter in strand function
Comments per Issue #1572 - experimental method trunc() is undocumented
Bio::Location::Fuzzy
Don't emit warnings if/when strand is undef
Bio::Location::Simple
Fixes Issue #1535: "complement" was lost in single residue features in to_FTstring()
Comments per Issue #1572 - experimental method trunc() is undocumented
Bio::Matrix::PSM::IO
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Bio::Root::Root missing from @ISA, added
Bio::Matrix::PSM::IO::mast
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Capitalization fixed when rearranging in new
Two bug fixes: the sequence is unknown but its width is known, so we supply it as 'NNN..'; the accession number should be supplied as -accession_number
Doc formatting fixes
Fixed a bug: now warns instead of throwing an exception when the MAST report is empty
Oops, missed some ()
Fixed a hang when given HTML input, which the parser cannot read
Bio::Matrix::PSM::IO::meme
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Change in meme parser: definition of primary id vs. accession number
Start up cleanup - tests run but not all pass just yet
Capitalization fixed when rearranging in new
Rare bug (actually not exactly a bug): if MEME fails during the analysis, it creates a proper file that terminates prematurely. The parser now warns the user of that instead of dying. The user can still get the data for the successfully predicted motifs. If no motifs were predicted, the parser still dies with a message that the wrong format was passed (which is reasonable, I think).
Follow-up to the change above: oops, works now.
Bio::Matrix::PSM::IO::transfac
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Capitalization fixed when rearranging in new
Throw exception if a position is not defined
Bio::Matrix::PSM::InstanceSite
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Bug fix: accession vs accession_number, removed redundant identifiers
Start up cleanup - tests run but not all pass just yet
Bug fix: start method was overriding LocatableSeq method, and it shouldn't, fixed.
Synopsis and doc fixes
Bio::Matrix::PSM::InstanceSiteI
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Bio::Matrix::PSM::Psm
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Capitalization fixed when rearranging in new
Bio::Matrix::PSM::PsmHeader
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Synopsis and doc fixes
Bio::Matrix::PSM::PsmHeaderI
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Bio::Matrix::PSM::PsmI
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Bio::Matrix::PSM::SiteMatrix
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Start up cleanup - tests run but not all pass just yet
Remove spurious warnings
Commented out some spurious(?) lines from the new()
Fixed the bug Heikki pointed out in the constructor when no input data for the vectors (A,G,C,T) is supplied. This is still a temporary solution.
Get/set method added to access accession_number
Bio::Matrix::PSM::SiteMatrixI
Added DNA PSM (Position Scoring Matrix) modules. This includes meme, mast and transfac parsers.
Get/set method added to access accession_number
Bio::Ontology::InterProTerm
Made to_string work
Fixed a tricky bug where the identifier was set to undef if interpro_id was called without arguments
Bio::Ontology::SimpleGOEngine
Allowing an underscore in ontology IDs. This is necessary to be able to parse cjm's OBO_REL relationship ontology, which otherwise observes DAG-Edit format
MPATH ontology doesn't have at least 3 digits in all identifiers
Fixes to the ontology regex to parse a greater subset of DAG-Edit files. I have tracked down where DAG-Edit IDs are validated: GOFlatFileAdapter.java. The regex still only matches a subset of the allowed characters in an identifier. Identifiers can be any non-whitespace, non ;$,:!\? characters > length 1 on either side of a : separator. I've opted to match \w+:\w+; hopefully we don't need to go beyond this.
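A minimal sketch of what the \w+:\w+ choice accepts and rejects. The anchoring with ^ and $ and the sample identifiers are assumptions made for illustration; the parser's actual regex may be embedded in a larger pattern.

```perl
use strict;
use warnings;

# Pattern from the note above; the ^...$ anchors are an assumption.
my $id_re = qr/^\w+:\w+$/;

# Illustrative identifiers only, not taken from any ontology file.
my @candidates = ('GO:0008150', 'OBO_REL:part_of', 'MPATH:1', 'no_separator', 'bad id:1');
my @accepted   = grep { $_ =~ $id_re } @candidates;
print "$_\n" for @accepted;
```

Because \w includes the underscore and places no minimum on the number of digits, prefixes such as OBO_REL and short numeric parts both pass, while identifiers containing whitespace or lacking the : separator do not.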
Bio::Ontology::Term
Tests fail better without network access
Add references as an attribute of term
Make the interpro parser handle publication etc.
Bio::OntologyIO
Adding escape of SGML and newlines/tabs. Is there a generic SGML escape module we want to add as a dependency?
Bio::OntologyIO::Handlers::BaseSAXHandler
Add BaseSAXHandler and modify InterPro_BioSQL_Handler to inherit it
BaseSAXHandler has a new method, _current_hash
Bio::OntologyIO::Handlers::InterProHandler
Added new root terms for Active_site and Binding_site. It can now run through the complete InterPro XML file.
A SAX handler to parse interpro.xml and store it in biosql. An example of using this module is load_interpro.pl in bioperl-db/scripts/biosql/
Bio::OntologyIO::Handlers::InterPro_BioSQL_Handler
A SAX handler to parse interpro.xml and store it in biosql. An example of using this module is load_interpro.pl in bioperl-db/scripts/biosql/
Add BaseSAXHandler and modify InterPro_BioSQL_Handler to inherit it
Should be able to load dbxref
Make the interpro parser handle publication etc.
Runnable synopsis
Bio::OntologyIO::dagflat
Allowing an underscore in ontology IDs. This is necessary to be able to parse cjm's OBO_REL relationship ontology, which otherwise observes DAG-Edit format
MPATH ontology doesn't have at least 3 digits in all identifiers
Reverted previous change as it breaks the parser.
Fixes to the ontology regex to parse a greater subset of DAG-Edit files. I have tracked down where DAG-Edit IDs are validated: GOFlatFileAdapter.java. The regex still only matches a subset of the allowed characters in an identifier. Identifiers can be any non-whitespace, non ;$,:!\? characters > length 1 on either side of a : separator. I've opted to match \w+:\w+; hopefully we don't need to go beyond this.
Adding escape of SGML and newlines/tabs. Is there a generic SGML escape module we want to add as a dependency?
Bio::Phenotype::OMIM::OMIMentry
Parse the symptoms in finer detail
Make the methods accept undef as argument. The methods are edited, created, contributors, and additional_references.
Bio::Phenotype::OMIM::OMIMparser
Parse the symptoms in finer detail
In the original code, the clinical symptoms record returned 'clinical symptoms' if there was no such record in the parsed file. I changed it to return '' if nothing is in the file.
The fields in entry object will return undef, if the info is not available in the parsed file.
Bio::Phenotype::Phenotype
Made the comment and description methods accept undef as an argument.
Bio::PopGen::Simulation::Coalescent
Remove unnecessary event-handler stuff
Lost a use Bio::Tree::Tree somehow
Bio::PopGen::Statistics
Updated LD so that it returns a pair of values, LD and chiSQ. Also fixed composite_LD so that it calculates correctly with missing data
Properly reset the internal variable counts
Better calculations, because some variables need to be reset, per Matt's corrections that HW was not being calculated correctly on subsequent iterations
Bio::PrimarySeqI
translate() can take a custom codon table
Bio::RangeI
Unflattener now works for multicopy genes. RangeI changes (see email to the bioperl list): both intersection() and union() are documented as returning a (start, end, strand) triple. In actual fact, intersection() returns a RangeI-compliant object, and union() returns either a RangeI object or a triple depending on wantarray(). I have fixed things so that both intersection() and union() return either a RangeI object or a triple depending on wantarray(), following the principle of least surprise, and documented this. The test suite passes. This will break code like $h = { 'range' => $sf->intersection($sf2) }, since wantarray will be true here; however, this code violates the previously documented interface anyway. I have also added a new method, disconnected_ranges(), to RangeI. I could easily migrate this method somewhere else, but it seems to belong with other geometrical methods such as intersection and union.
Fixing spurious warnings caused by strand() now capable of returning undef
Made the 'disconnected_ranges' sub not cause warnings
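The context-dependent return described in the RangeI note can be illustrated with a self-contained sketch. It does not use Bio::RangeI: the intersection sub and the hash-based ranges below are hypothetical stand-ins, and taking the strand from the first range is a simplification.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a range: a hash ref with start, end and strand.
sub intersection {
    my ($r1, $r2) = @_;
    my $start = $r1->{start} > $r2->{start} ? $r1->{start} : $r2->{start};
    my $end   = $r1->{end}   < $r2->{end}   ? $r1->{end}   : $r2->{end};
    return if $start > $end;          # no overlap
    my $strand = $r1->{strand};       # simplification: ignore the second strand
    # List context: a (start, end, strand) triple; scalar context: one object.
    return wantarray
        ? ($start, $end, $strand)
        : { start => $start, end => $end, strand => $strand };
}

my $r1 = { start => 10, end => 50, strand => 1 };
my $r2 = { start => 30, end => 80, strand => 1 };

my @triple = intersection($r1, $r2);    # list context
my $range  = intersection($r1, $r2);    # scalar context

# A hash constructor imposes list context, so the triple is flattened into it.
my $broken = { 'range' => intersection($r1, $r2) };
# Forcing scalar context stores a single range object instead.
my $fixed  = { 'range' => scalar(intersection($r1, $r2)) };

print "triple: @triple\n";
print "broken hash has ", scalar(keys %$broken), " keys\n";
print "fixed range starts at $fixed->{range}{start}\n";
```

Wrapping the call in scalar() is one way for code of the kind quoted in the note to keep receiving a single object.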
Bio::Restriction::Analysis
Apply fix for bug #1548
Added correct support for circular sequences in Bio::Restriction::Analysis (Issue #1540). Added multiple digests with any enzyme combinations.
Updated Analysis.pm to incorporate changes from Peter Blaiklock and to move away from using the ^ notation in cut sites. Also tried to implement speed/memory saving devices, mainly now the only data that is stored is the cut positions of each enzyme (rather than fragments). Everything else is calculated when demanded.
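Storing only cut positions and deriving fragments on demand, including the circular case, can be sketched as below. This is not the Bio::Restriction::Analysis code: fragment_lengths is a hypothetical helper, and it assumes distinct cut positions strictly inside the sequence, with a cut at position p falling after base p.

```perl
use strict;
use warnings;

# Hypothetical helper: derive fragment lengths from stored cut positions.
# Assumes distinct cuts with 0 < p < length, each falling after base p.
sub fragment_lengths {
    my ($seq_length, $circular, @cuts) = @_;
    my @sorted = sort { $a <=> $b } @cuts;
    return ($seq_length) unless @sorted;    # uncut molecule
    my @lengths;
    my $prev = 0;
    for my $cut (@sorted) {
        push @lengths, $cut - $prev;
        $prev = $cut;
    }
    my $tail = $seq_length - $prev;
    if ($circular) {
        # On a circle the piece after the last cut joins the piece before the first.
        $lengths[0] += $tail;
    }
    else {
        push @lengths, $tail;
    }
    return @lengths;
}

my @linear   = fragment_lengths(100, 0, 70, 20);
my @circular = fragment_lengths(100, 1, 70, 20);
print "linear: @linear\n";
print "circular: @circular\n";
```

Two cuts give three fragments on a linear sequence but only two on a circular one, which is the distinction the circular-sequence support has to respect.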
Bio::Restriction::Enzyme
Fixes Issue #1518: Enzymes with a cut position of 0 have wrong cut() and site()
Added correct support for circular sequences in Bio::Restriction::Analysis (Issue #1540). Added multiple digests with any enzyme combinations.
Updated Analysis.pm to incorporate changes from Peter Blaiklock and to move away from using the ^ notation in cut sites. Also tried to implement speed/memory saving devices, mainly now the only data that is stored is the cut positions of each enzyme (rather than fragments). Everything else is calculated when demanded.
Bio::Restriction::Enzyme::MultiSite
Added correct support for circular sequences in Bio::Restriction::Analysis (Issue #1540). Added multiple digests with any enzyme combinations.
Bio::Restriction::EnzymeCollection
Added correct support for circular sequences in Bio::Restriction::Analysis (Issue #1540). Added multiple digests with any enzyme combinations.
Updated Analysis.pm to incorporate changes from Peter Blaiklock and to move away from using the ^ notation in cut sites. Also tried to implement speed/memory saving devices, mainly now the only data that is stored is the cut positions of each enzyme (rather than fragments). Everything else is calculated when demanded.
Bio::Restriction::IO
Added correct support for circular sequences in Bio::Restriction::Analysis (Issue #1540). Added multiple digests with any enzyme combinations.
Updated Analysis.pm to incorporate changes from Peter Blaiklock and to move away from using the ^ notation in cut sites. Also tried to implement speed/memory saving devices, mainly now the only data that is stored is the cut positions of each enzyme (rather than fragments). Everything else is calculated when demanded.
Bio::Restriction::IO::bairoch
Updated Analysis.pm to incorporate changes from Peter Blaiklock and to move away from using the ^ notation in cut sites. Also tried to implement speed/memory saving devices, mainly now the only data that is stored is the cut positions of each enzyme (rather than fragments). Everything else is calculated when demanded.
Bio::Restriction::IO::base
Fix suggested in Issue #1538
Added correct support for circular sequences in Bio::Restriction::Analysis (Issue #1540). Added multiple digests with any enzyme combinations.
Updated Analysis.pm to incorporate changes from Peter Blaiklock and to move away from using the ^ notation in cut sites. Also tried to implement speed/memory saving devices, mainly now the only data that is stored is the cut positions of each enzyme (rather than fragments). Everything else is calculated when demanded.
Bio::Root::IO
In order for rmtree() to work in cygwin
Cleanup of debugging a little for uniformity
The following regression tests now pass: GFF, SeqFeature, Registry
Bio::Root::Version
Set version to 1.302
Version 1.4
Bio::Search::HSP::BlastHSP
Parse/store Links from WU-BLAST run with -links param
BlastHSP really doesn't work, migrate links to GenericHSP. Links is for WU-BLAST links parsing
Bio::Search::HSP::FastaHSP
Trim leading spaces for what it is worth
Bio::Search::HSP::GenericHSP
BlastHSP really doesn't work, migrate links to GenericHSP. Links is for WU-BLAST links parsing
Fix so that we don't have undef warnings
Bio::Search::Hit::GenericHit
Some more protection for when there are no HSPs for a Hit
Issue #1558
Bio::Search::Result::PsiBlastResult
Old directory name
Removing PSIBLAST modules
Bio::Search::SearchUtils
Some more protection for when there are no HSPs for a Hit
Bio::SearchIO
Avoid warnings for empty reports
Bio::SearchIO::IteratedSearchResultEventBuilder
Ensure we explicitly set the HSP type in the Factory
Bio::SearchIO::Writer::GbrowseGFF
Gbrowse now allows tstart and tend tags for alignment features to make it more like normal GFF.
Bio::SearchIO::Writer::HTMLResultWriter
Only output HSP's Hit header when there are actual HSPs (when there are more hits than B, and B and V are not equal)
Allow toggling of WU-BLAST link display
Bio::SearchIO::Writer::TextResultWriter
Only output HSP's Hit header when there are actual HSPs (when there are more hits than B, and B and V are not equal)
Allow toggling of WU-BLAST link display
Bio::SearchIO::blast
Can parse B=0 BLAST output and at least produce minimal Hit objects
Parse more BLAST statistics, WU-BLAST frame specific lambda,entropy,kappa stored. Gapped lambda stored separately as well
Parse/store Links from WU-BLAST run with -links param
Grab statistics, posted date from WU-BLAST reports
Why bother with a complicated regexp - just grab the whole thing
Bio::SearchIO::blastxml
blastxml expected <!DOCTYPE> and <BlastOutput> on the same line. My version of blastall puts them on different lines, which caused the parse to fail (from internal refactoring of <?xml> and <!DOCTYPE> tags). This change fixes the bug. Tests added to SearchIO.t and a test blastxml file added.
Bio::SearchIO::chado
These are now redundant
Bio::SearchIO::chadosxpr
These are now redundant
Bio::SearchIO::exonerate
Started vulgar line parsing in exonerate.pm; need some hints about how to handle gaps (inside HSPs or outside?)
Bio::SearchIO::fasta
Better FASTA family program matching - was not parsing SSEARCH properly, cleaner regexp
I think we need to jump to the query line if we are presented with non-numbered opening line
More work to parse dreaded fasta context in alignments - this should work for the 2 examples Robin Emig has provided
Avoid infinite loops when the coordinate line ends with a number and no extra whitespace
Bio::SearchIO::hmmer
Store db name for hmmsearch results in r->database_name
Store both database_name and seqfile_name even though redundant - keep consistency with rest of SearchIO
Bio::SearchIO::psiblast
Removing PSIBLAST modules
Bio::SearchIO::psl
Fix to detect file header for psLayout3 (psl version 3)
Remove some warnings when running tests with warnings on
Bio::SearchIO::sim4
Make it so empty fields don't cause warnings
Bio::SearchIO::wise
Was swapping query/target improperly
Bio::Seq
Hope this change will keep the Examples section at the top of the module's HTML page
Bio::Seq::EncodedSeq
Fixed strandedness issues
Bio::Seq::Meta
Doc fixes from #1553
Bio::Seq::MetaI
Doc fixes from #1553
Bio::Seq::PrimedSeq
Removed global $version
Bio::SeqFeature::Generic
Fixed Bio::SeqFeature::Generic so that it will accept a score of 0; modified Bio::Graphics::Glyph::graded_segment so that it draws a fg box around each segment by default (can restore default behavior with -vary_fg=>1)
Bio::SeqFeature::Primer
Removed global $version
Bio::SeqFeature::Tools::TypeMapper
Updated type sequence_variant to reflect SO
Bio::SeqFeature::Tools::Unflattener
Unflattener now works for multicopy genes. RangeI changes (see email to bioperl list): both intersection() and union() are documented as returning a (start, end, strand) triple. In actual fact, intersection() returns a RangeI-compliant object, and union() returns either a RangeI object or a triple depending on wantarray(). I have fixed things so that both intersection() and union() return either RangeI or triple depending on wantarray(), following the principle of least surprise, and documented this. The test suite passes. This will break code like this: $h = { 'range' => $sf->intersection($sf2) } since wantarray will be true here; however, this code violates the previously documented interface anyway. I have also added a new method disconnected_ranges() to RangeI. I could easily migrate this method somewhere else, but it seems to belong with other geometrical methods such as intersection and union.
Reuses exons (e.g. containment graph not a tree); improved algorithm for matching mRNAs with CDSs
Added convenience method get_tagset_values() to SeqFeatureI; fixed Unflattener to deal with strange NT_078847 record
Made the Unflattener more informative in certain problematic situations; added a new test
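The context-dependent return described in the RangeI note above can be sketched in plain Perl. This is only an illustration of wantarray(): intersection_demo() and the hash it returns in scalar context are hypothetical stand-ins, not the BioPerl RangeI implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a range method; NOT the BioPerl code.
sub intersection_demo {
    my ($s1, $e1, $s2, $e2) = @_;
    my $start = $s1 > $s2 ? $s1 : $s2;
    my $end   = $e1 < $e2 ? $e1 : $e2;
    return if $start > $end;    # no overlap: empty list or undef

    # List context gets a (start, end, strand) triple,
    # scalar context gets a single reference.
    return wantarray
        ? ($start, $end, 0)
        : { start => $start, end => $end, strand => 0 };
}

my @triple = intersection_demo(1, 10, 5, 20);    # list context
my $range  = intersection_demo(1, 10, 5, 20);    # scalar context

# A hash constructor imposes list context, so the triple is flattened
# into the hash as extra key/value pairs.
my $broken = { 'range' => intersection_demo(1, 10, 5, 20) };

# Forcing scalar context keeps a single value under the key.
my $fixed = { 'range' => scalar intersection_demo(1, 10, 5, 20) };

print "triple: @triple\n";
print "scalar start: $range->{start}\n";
print "keys in broken hash: ", scalar(keys %$broken), "\n";
print "fixed holds a reference: ", (ref($fixed->{range}) ? "yes" : "no"), "\n";
```

Callers that build hashes or argument lists from such a method can wrap the call in scalar() to get the object form.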
Bio::SeqFeatureI
Added convenience method get_tagset_values() to SeqFeatureI; fixed Unflattener to deal with strange NT_078847 record
Bio::SeqIO
Alternate ABI extension for newer versions of software (requested by Jan Aerts)
Guessing sequence and align formats by looking into file
Don't need to guess if the format is supplied
Fail better with no arguments to new()
Changed GuessSeqFormat to return undef
Bio::SeqIO::asciitree
Runnable synopsis
Bio::SeqIO::chado
These are now redundant
Bio::SeqIO::chadoitext
These are now redundant
Bio::SeqIO::chadosxpr
These are now redundant
Bio::SeqIO::chadoxml
Version 1.0 of the GenBank to chadoxml converter.
Add more comments, and implement "match" operation for pubs.
Add exception handling in response to Issue #1532. please note that this chadoxml module is written to work with DNA sequence and annotation data from whole genome projects that are deposited in GenBank.
Fixed Issue #1532 with regard to filehandle; modified code to eliminate warnings when run with 'perl -w'.
Comment out debugging output and use of environment variable $CodeBase.
Fix bug of null srcfeature_id for featureloc when srcfeature is not given on write_seq() invocation; fix bug of not calling unflattener when -seq_so_type is explicitly supplied.
Fix so that it will use the Root::IO filehandle rather than rely on filenames to be specified. Clean up documentation formatting some. Issue throws rather than dies. Code formatting changes
Bio::SeqIO::embl
Better parsing of virus names
Resolves Issue #1516. Feature qualifier value quotation is now exactly according to specifications
Accession numbers in AC line are separated by '; ' not by ';'
SV line was mangled (like 'SV M20132; J03180.1') if there were secondary accession numbers
Fixed embl dumping to allow breaks at hyphens
Warn if whitespace in display_id()
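As a small illustration of the AC line separator fix noted above, the sketch below joins accession numbers with '; '. The accessions are the ones quoted in the SV note; the five-column 'AC   ' prefix and the trailing ';' are assumptions about the flat-file layout, and this is not the Bio::SeqIO::embl code.

```perl
use strict;
use warnings;

# Primary and secondary accession numbers (taken from the SV example above).
my @accessions = ('M20132', 'J03180');

# Old behaviour: no space after the separator.
my $old_line = 'AC   ' . join(';', @accessions) . ';';

# Fixed behaviour: accessions separated by '; '.
my $new_line = 'AC   ' . join('; ', @accessions) . ';';

print "$old_line\n";
print "$new_line\n";
```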
Bio::SeqIO::fasta
Warn if whitespace in display_id()
Bio::SeqIO::game
Getting ready to add new SeqIO::game modules
Added new Bio::SeqIO::game modules, added tests to t/SeqIO.t
1) updated game XML parser/writer to deal more effectively with mRNA/CDS pairs and retain UTR splice info and play nicely with Bio::SeqFeature::Tools::Unflattener 2) moved game XML tests from SeqIO.t to game.t; added new tests 3) updated Bio::Tools::GuessSeqFormat for new game format
Bio::SeqIO::game::featHandler
Added new Bio::SeqIO::game modules, added tests to t/SeqIO.t
Avoid warning for when set->{Attributes}->{produces_seq} is undef
Added some changes to the ways mRNAs and UTRs are handled. Silenced some squawks (sorry!)
Bio::SeqIO::game::featureHandler
Getting ready to add new SeqIO::game modules
Bio::SeqIO::game::gameHandler
Added new Bio::SeqIO::game modules, added tests to t/SeqIO.t
Bio::SeqIO::game::gameSubs
Added new Bio::SeqIO::game modules, added tests to t/SeqIO.t
1) updated game XML parser/writer to deal more effectively with mRNA/CDS pairs and retain UTR splice info and play nicely with Bio::SeqFeature::Tools::Unflattener 2) moved game XML tests from SeqIO.t to game.t; added new tests 3) updated Bio::Tools::GuessSeqFormat for new game format
Bio::SeqIO::game::gameWriter
Added new Bio::SeqIO::game modules, added tests to t/SeqIO.t
1) updated game XML parser/writer to deal more effectively with mRNA/CDS pairs and retain UTR splice info and play nicely with Bio::SeqFeature::Tools::Unflattener 2) moved game XML tests from SeqIO.t to game.t; added new tests 3) updated Bio::Tools::GuessSeqFormat for new game format
Bio::SeqIO::game::idHandler
Getting ready to add new SeqIO::game modules
Bio::SeqIO::game::seqHandler
Getting ready to add new SeqIO::game modules
Added new Bio::SeqIO::game modules, added tests to t/SeqIO.t
1) updated game XML parser/writer to deal more effectively with mRNA/CDS pairs and retain UTR splice info and play nicely with Bio::SeqFeature::Tools::Unflattener 2) moved game XML tests from SeqIO.t to game.t; added new tests 3) updated Bio::Tools::GuessSeqFormat for new game format
Bio::SeqIO::gcg
Simple fix to Issue #1549 - would still benefit from changes suggested by Derek
Warn if whitespace in display_id()
Bio::SeqIO::genbank
Better parsing of virus names
Resolves Issue #1516. Feature qualifier value quotation is now exactly according to specifications
Never print empty residue range on REFERENCE line. Fixes this 'REFERENCE 1 (bases 0 to 0)'
Fix Issue #1543. Do some parsing cleanup
Warn if whitespace in display_id()
Bio::SeqIO::kegg
Adding a parser for KEGG sequence records. currently does not support parsing of the CODON_USAGE and POSITION tags. -allen
Need to set the species tag to an object, not a string
Fixes to ontology regex to parse a greater subset of DAG-Edit files. i have tracked down the files where DAG-Edit IDs are validated: GOFlatFileAdapter.java the regex still only matches a subset of the allowed characters in an identifier. identifiers can be any non-whitespace, non ;$,:!\? characters > length 1 on either side of a : separator. i've opted to match \w+:\w+, hopefully we don't need to go beyond this.
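A minimal sketch of the \w+:\w+ identifier match mentioned in the note above. The anchors (^ and $) and the sample identifiers are assumptions added for illustration; the actual regex in the module may be used differently.

```perl
use strict;
use warnings;

# Word characters on both sides of a single ':' separator.
# Anchoring the pattern is an assumption made for this example.
my $id_re = qr/^\w+:\w+$/;

# Sample identifiers are illustrative only.
for my $id ('GO:0008150', 'SO:0000704', 'GO:', 'no_colon', 'GO:00-1') {
    printf "%-12s %s\n", $id, ($id =~ $id_re ? 'matches' : 'rejected');
}
```

As the note says, this accepts only a subset of what DAG-Edit allows: an identifier containing '-' or '.' on either side of the colon is rejected by this pattern.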
Bio::SeqIO::pir
Warn if whitespace in display_id()
Bio::SeqIO::swiss
Better parsing of virus names
Resolving Issue #1519: 1. fixed sprintf bug sometimes leading to extra space after ID tag; 2. OS line output for viruses now contains all the information after the species name. The complex strain/abbreviation/common name list is stored in sub_species(), which was previously not in use for viruses. This is a hack, but the (first) OS line now makes a perfect round trip.
Warn if whitespace in display_id()
Bio::SeqIO::tigr
Adding TIGR XML parser
Bio::SeqUtils
translate_6frames() failed on sequences where bioperl would guess that the sequence string is protein. Streamlined coding of the method to avoid guessing.
Bio::SimpleAlign
Adding a parser and tests for UCSC maf (multiple alignment format) format. added a method SimpleAlign::splice_by_seq_pos to allow splicing of all sequences based on the gap locations of one sequence within the alignment. this could in principle be called repeatedly to remove all gaps from the MSA.
Cleaned up unit test spurious warnings. bugfix in maf parser for detecting last record in file. added functionality to trim gaps from a MSA for a given sequence to SimpleAlign. trimming allowed implementation of exporting Seq and SeqFeatures from SimpleAlign. the api here is still rough, comments appreciated.
Run clean with -w on
Offset location of new seq with features by location of original seq requested to build from. added rudimentary key/value parsing for maf 'a' lines
Bio::Species
Commented out internal calls to methods not doing anything
Bio::Structure::IO
Fixed email
Bio::Structure::IO::pdb
Append to the hash rather than allocate to objects for this
Bio::Taxonomy
Clean up the rank sets
Bio::Tools::Analysis::DNA::ESEfinder
Bulletproof the code some more, report warnings from POST events - now require HTML::HeadParser to be installed
Bio::Tools::Analysis::Protein::Domcut
Improved feature parsing
Tests fail better without network access
Added source tags to features
Bio::Tools::Analysis::Protein::ELM
Initial commit of ELM.pm, a wrapper around the EMBL functional peptide motif prediction server
Made printing "." to indicate waiting depend on verbosity. Unconditional printing messed up the test suite. An alternative would be to print to STDERR instead.
Added source tags to features
Bulletproof code some more
Bio::Tools::Analysis::Protein::GOR4
Tests fail better without network access
Added source tags to features
Bio::Tools::Analysis::Protein::HNN
Tests fail better without network access
Added source tags to features
Bio::Tools::Analysis::Protein::NetPhos
Necessary change to result parser for non-standard sequence names
Added source tags to features
Bio::Tools::Analysis::Protein::Scansite
Wrapper around Scansite phosphorylation site server
White space fixes to make code easier to read
Bio::Tools::Analysis::Protein::Sopma
Tests fail better without network access
Added source tags to features
Bio::Tools::Analysis::SimpleAnalysisBase
New method - clear() allows multiple analyses using the same object
Bio::Tools::BPlite
Issue #1542 - improper detection of end of Query regexp
Bio::Tools::BPlite::Iteration
Have be set to instead of undef - perhaps this is not entirely the best thing - are we screwing up in the parsing instead? use Bio::SearchIO instead I guess
Bio::Tools::CodonTable
New method add_table(): if you know what you are doing, you can add a custom codon table
Bio::Tools::EMBOSS::Palindrome
Parsing and tests for 'palindrome'
Bio::Tools::GFF
GFF3 parsing and writing support
Include a header line describing that this is GFF3 per spec.
Hack-y support for writing out FeaturePairs, but a start. We need to support the notion of Feature themselves generating attribute section
Fix slightly brain dead code to use arrays instead of chop;chop;
Adding support for parsing GFF ##sequence-region header lines. these are transformed into featureless Bio::LocatableSeq objects, available via the next_segment method.
Needed to move header parsing outside of next_feature, as it may be useful to handle sequences before sequence features (think database inserts).
Changed GFF3 parse to split on spaces and added a note about not supporting sequences
Fixed to deal with split locations for GFF3 output. Should probably do the same for GFF2 and GFF1 but that code makes my head hurt.
Bio::Tools::GuessSeqFormat
Guessing sequence and align formats by looking into file
Remove line(), replace with text() which takes in multi line texts
Changed GuessSeqFormat to return undef
Fix game format guess
1) updated game XML parser/writer to deal more effectively with mRNA/CDS pairs and retain UTR splice info and play nicely with Bio::SeqFeature::Tools::Unflattener 2) moved game XML tests from SeqIO.t to game.t; added new tests 3) updated Bio::Tools::GuessSeqFormat for new game format
Bio::Tools::Phylo::PAML
Silenced a warning reported in Issue #1560
George Hartzell fixes
Bio::Tools::Phylo::PAML::Result
Doc fix
Bio::Tools::Primer3
Removed global $version
Bio::Tools::Prints
Prints now returns FeaturePair, rather than previously SeqFeature.
Bio::Tools::RandomDistFunctions
Utility module for generating random variates according to different distributions
Docu update
Bio::Tools::Run::RemoteBlast
Set an agent string per request from NCBI G. Coulouris
Bio::Tools::Run::StandAloneBlast
Allow SearchIO to be used for all output format types now with _READMETHOD set
Bio::Tools::SeqWords
New method: count_overlap_words() feature enhancement from Issue #1554
Bio::Tools::Signalp
Add the SignalP-HMM result:
    $feat->score; # signal peptide probability
    $feat->get_tag_values('peptideProb')->[0]; # signalp peptide probability
    $feat->get_tag_values('anchorProb')->[0]; # signalp anchor probability
Bio::Tree::Node
Sort as advertised
Simpler way to add children with descendants all at once, add documentation for the feature; left/right were never options, just specify an array of descendants
Throw if we pass in a hash reference too
Bulletproofing for length 0 branch lengths
Bio::Tree::NodeI
Default doesn't need to print out leaf status
Merge of NHX parsing changes to the branch - just in case and to provide diffs for people working with 1.2.3
Bio::Tree::RandomFactory
Re-implementation which is more in line with generating phylogenetically proper trees - using Mike Sanderson's suggestions based on his r8s code. Still needs some more work and some tests
Still probably some errors in assigning the tree weights; Mike assigns time slightly differently than I do, so I need to figure out if distributions are really correct and/or if the normalization technique is really working.
Simpler way to add children with descendants all at once, add documentation for the feature; left/right were never options, just specify an array of descendants
Bio::Tree::TreeFunctionsI
Bulletproofing for length 0 branch lengths
Bio::TreeIO
Adding support to output SVG::Graph trees via TreeIO subsystem. svggraph.pm module courtesy of Brian O'Connor <boconnor@ucla.edu>
Some docs
Extension support for lintree
Bio::TreeIO::TreeEventBuilder
Some debugging and better detection of when we need to pop off the stack
Merge of NHX parsing changes to the branch - just in case and to provide diffs for people working with 1.2.3
Some docs
Bio::TreeIO::lintree
Lintree parser and tests
Bio::TreeIO::newick
Bulletproofing for length 0 branch lengths
Issue #1570, support multi-lined newick files again. Strip \n and \r as they mean nothing and should not be converted to _ with Allen's WS handler regexp
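The multi-line handling described for Issue #1570 can be sketched as follows. This is an illustration only, not the Bio::TreeIO::newick code: the sample tree is made up, and the s/\s/_/g substitution is a hypothetical stand-in for the whitespace handler mentioned in the note.

```perl
use strict;
use warnings;

# A newick tree spread over several lines, with mixed line endings.
my $multi_line = "((A:0.1,\nB:0.2):0.05,\r\nC:0.3);\n";

# Strip \n and \r first: they carry no meaning in the tree string.
(my $tree = $multi_line) =~ s/[\r\n]//g;

# Hypothetical stand-in for a whitespace-to-underscore handler applied
# WITHOUT stripping line breaks first: the breaks turn into stray '_'.
(my $mangled = $multi_line) =~ s/\s/_/g;

print "stripped: $tree\n";
print "mangled:  $mangled\n";
```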
Bio::TreeIO::nexus
Some docs
Bio::TreeIO::nhx
Fix problems with parsing, better detection of 'leafstatus' now, rebless Nodes to NodeNHX for NHX writing - this could cause problems later, but works as is right now
A cheating way, but fixing Issue #1471
Merge of NHX parsing changes to the branch - just in case and to provide diffs for people working with 1.2.3
Bio::TreeIO::svggraph
Adding support to output SVG::Graph trees via TreeIO subsystem. svggraph.pm module courtesy of Brian O'Connor <boconnor@ucla.edu>
Bio::TreeIO::tabtree
First attempt at making tabtree more useful
Makefile.PL
Removed dependency on Data::Stag - no longer required by new chadoxml code
An experiment - let's enable Makefile.PL pre-reqs (dependencies)
Added SVG dependency
Update prereqs
Add DG::SVG to (optional) dependencies
It's 'GD', not 'DG'
bioscripts.pod
Add mention of filter_search.PLS
Add mention of rnai_finder.cgi
Add back bioflat
New directory
Move from examples/ to scripts/
Added mention of all_glyphs.pl
bptutorial.pl
Issue #1527, fix on branch for consistency, even if we don't do another release off this branch
Add some URLs
Old URLs
Hard return breaks link
Hard return breaks link
PAML HOWTO
Tests fail better without network access
Edits
Better discussion of sequence objects. Edits.
examples/biblio/biblio_examples.pl
Add another useful bit
A couple more examples
examples/biographics/all_glyphs.pl
New script that displays all glyphs; enabled SVG output from other example scripts
Removed some extraneous internal notes on glyph compatibility
Added a shaded arrow
examples/biographics/dynamic_glyphs.pl
Added support for SVG output
New script that displays all glyphs; enabled SVG output from other example scripts
examples/biographics/lots_of_glyphs.pl
Added support for SVG output
New script that displays all glyphs; enabled SVG output from other example scripts
examples/db/bioflat_index.pl
Add back bioflat
Binarysearch and flat are the same
Move from examples/ to scripts/
examples/rnai_finder.cgi
Initial check-in of cgi script for RNAi reagent design
Moved rnai_finder.cgi from examples to examples/sirna
examples/sirna/TAG
Moved rnai_finder.cgi from examples to examples/sirna
examples/sirna/rnai_finder.cgi
Moved rnai_finder.cgi from examples to examples/sirna
maintenance/authors.pl
Helper script to keep AUTHORS file up to date
models/popgen.dia
Bio::PopGen classes by Albert Vilella
scripts/Bio-DB-GFF/bp_genbank2gff.PLS
Changes making genbank2gff.pl use SOFA terms for type names in generated GFF3
Removed dependency on having a running mysql database when writing converted GFF3 files to stdout
This biofetch adaptor and genbank2gff are better than before, but not yet perfect. Fixed: (1) gets the parentage of CDSs (belong to mRNAs) and exons (belong to chromosomes or other regions) right; (2) variation features are now classified as SNPs if the length is 1, else chromosome_variation; (3) no longer creates CDS lines that span introns. Remaining problems: (1) with multiple-transcript genes, it can get confused as to which CDSs go with which mRNA (solution: reimplement with Unflattener); (2) SNP features can all get assigned the same ID, which is illegal in GFF3, though I don't think it will cause problems with gbrowse (though it probably will with chado).
Slight adjustment to genbank2gff script to handle fly gene records right
scripts/Bio-DB-GFF/bulk_load_gff.PLS
Added minor workarounds for Bio::DB::GFF roundtripping to artemis
Fixed overly-promiscuous regexp for detecting FASTA files
Added option to set MAX_BIN, and updated the postgres loader to deal with gff3 (note that the gff3 stuff is completely untested though)
Added support for bulk loading from a local gff source to a remote db server
Added support for dsn strings in the form of "dbi:mysql:database=xxx;host=xxx"
Fixed a minor gff3 bug
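For readers unfamiliar with the DSN form mentioned above, here is a rough sketch of how a string like "dbi:mysql:database=xxx;host=xxx" breaks down into driver and attributes. parse_dsn() is a hypothetical helper written for this example; it is not what bulk_load_gff.PLS does internally.

```perl
use strict;
use warnings;

# Hypothetical helper: split "dbi:driver:key=value;key=value" into parts.
sub parse_dsn {
    my ($dsn) = @_;
    my ($scheme, $driver, $rest) = split /:/, $dsn, 3;

    # Not a full DSN: return nothing so the caller can fall back.
    return unless defined $rest && lc($scheme) eq 'dbi';

    my %attr;
    for my $pair (split /;/, $rest) {
        my ($key, $value) = split /=/, $pair, 2;
        $attr{$key} = $value if defined $value;
    }
    return ($driver, \%attr);
}

my ($driver, $attr) = parse_dsn('dbi:mysql:database=xxx;host=xxx');
print "driver=$driver database=$attr->{database} host=$attr->{host}\n";
```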
scripts/Bio-DB-GFF/fast_load_gff.PLS
Added minor workarounds for Bio::DB::GFF roundtripping to artemis
Fixed overly-promiscuous regexp for detecting FASTA files
Added an option for setting MAX_BIN
Fixed a minor gff3 bug
scripts/Bio-DB-GFF/load_gff.PLS
Fixed overly-promiscuous regexp for detecting FASTA files
Added an option to set MAX_BIN
scripts/Bio-DB-GFF/pg_bulk_load_gff.PLS
Added option to set MAX_BIN, and updated the postgres loader to deal with gff3 (note that the gff3 stuff is completely untested though)
Fixed a minor gff3 bug
scripts/DB/bioflat_index.PLS
Move from examples/ to scripts/
scripts/graphics/frend.PLS
Bio::Graphics::FeatureFile: remove uninit variable warning when calling features() without arguments; fixed frend web-based feature renderer to accommodate recent changes in FeatureFile API
scripts/graphics/search_overview.PLS
Prevent barfing when Hit doesn't have any HSPs
scripts/popgen/composite_LD.PLS
Fix to deal with newer API
Print with new API
scripts/searchio/filter_search.PLS
Added in simple filtering script using SearchIO - useful for me, and could be useful to others over time
scripts/utilities/search2gff.PLS
Allow overriding of the source tag as parsed by SearchIO
Output 'match' and 'component' lines for GFF dumping