Processing netMHCII-pan prediction output


Like most high-throughput bioinformatics methods, epitope prediction generates a lot of output, and in a format that is not well suited to downstream analysis. I considered writing a parser for the output in Ruby, but that would have taken a while. Instead, I added a simple Vim function to my .vimrc file that reformats the output with a single key mapping. It worked like magic and saved time.

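" Format output from the netMHCII-pan program as CSV
function! FormatNetmhcOutput()
   " delete lines starting with #, -- or Protein
   g/^#/d
   g/^--/d
   g/^Protein/d
   " left-align every line, then delete the line starting with pos
   %le
   g/^pos/d
   " strip the WB/SB binder flags (the e flag suppresses 'pattern not found' errors)
   %s/<=\sWB//ge
   %s/<=\sSB//ge
   " trim trailing whitespace, turn runs of whitespace into commas, drop blank lines
   %s/\s\+$//e
   %s/\s\+/,/ge
   g/^$/d
endfunction
nnoremap ;h :call FormatNetmhcOutput()<CR>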

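This function can be called by pressing ; and then h in normal mode. It removes the comment lines and turns the remaining lines into comma-separated values. Once the buffer is saved as a .csv file, it can be read with a simple R directive. Note that the function also deletes the line starting with pos (the column header), so the file has no header row and should be read with header = FALSE.

data <- read.csv("file.csv", header = FALSE)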
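
For comparison, the same clean-up can also be done outside the editor. Below is a minimal Ruby sketch that roughly mirrors the steps of the Vim function. The method name format_netmhc_output and the sample lines are made up for illustration and are not real netMHCII-pan output. Unlike the Vim function, it strips leading whitespace before testing every prefix, so treat it as an approximation rather than an exact port.

```ruby
# Hypothetical helper that roughly mirrors the Vim function's steps.
def format_netmhc_output(text)
  rows = text.each_line.map do |line|
    line = line.strip                                   # like :%le plus trailing-space removal
    next if line.empty?                                 # drop blank lines
    next if line.start_with?("#", "--", "Protein", "pos") # drop the same line types
    line = line.gsub(/<=\s(WB|SB)/, "")                 # remove the WB/SB binder flags
    line.strip.split(/\s+/).join(",")                   # runs of whitespace -> commas
  end
  rows.compact                                          # discard the skipped (nil) lines
end

# Made-up sample input, for illustration only.
sample = <<~TEXT
  # hypothetical comment line
  --------------------------------
     pos  Allele     peptide          score
       0  DRB1_0101  AAAKAAAKAAAKAAA  0.51  <= WB
       1  DRB1_0101  AAKAAAKAAAKAAAK  0.93  <= SB

  Protein seq1. Allele DRB1_0101.
TEXT

puts format_netmhc_output(sample)
# prints:
# 0,DRB1_0101,AAAKAAAKAAAKAAA,0.51
# 1,DRB1_0101,AAKAAAKAAAKAAAK,0.93
```

For a one-off file the Vim mapping is still quicker, but a script like this may be easier to reuse when many prediction files need converting in a batch.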

sample output
