Convert a FASTA file to a hash


Sometimes you might want to convert a file of FASTA sequences to a hash. Here is a one-line method that might come in handy for that.

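require 'bio'
file_path = "example.fasta"

# Takes the path as an argument: a top-level local variable is not visible inside a def.
def fasta_to_hash(file_path)
  # Build one single-entry hash per record: name (as a symbol) => [sequence string]
  Bio::FlatFile.auto(file_path) { |f| f.map { |entry| { entry.definition.to_sym => [entry.seq.to_s] } } }
end

fasta_to_hash(file_path)
 #=>[{:"seq1"=>["gatataggagatatcgttagag"]}]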

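The result is an array of hashes, one per sequence in the file. Each hash key is the sequence name (the FASTA definition line) as a symbol, and the value is a one-element array holding the sequence as a string.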
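
If a single hash is more convenient than an array of single-entry hashes, the array can be collapsed afterwards. The snippet below is only a rough sketch in plain Ruby: the entries array is a hand-written stand-in for the method's return value, and the second sequence name and both sequences are made up for illustration.

```ruby
# Stand-in for the array returned by the method above (hypothetical data).
entries = [
  { :seq1 => ["gatataggagatatcgttagag"] },
  { :seq2 => ["ttgacca"] }
]

# Fold the single-entry hashes into one hash and unwrap the
# one-element arrays so each name maps directly to its sequence string.
merged = entries.reduce({}) do |acc, h|
  h.each { |name, seqs| acc[name] = seqs.first }
  acc
end

puts merged[:seq1]
```

Note that if two records share the same name, the later one overwrites the earlier one in the merged hash, which is one reason to keep the array form when names may repeat.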


One comment

  1. Rob Syme

    In a similar vein:

    def fasta_to_hash(filename)
      Hash[ Bio::FlatFile.auto(filename).map { |entry| [entry.definition, entry.seq] } ]
    end

    This takes a filename and returns a hash keyed by sequence definition, with Bio::Sequence objects as the values. This is Ruby 1.9 only though…
