How to add an integer to a difference calculation and print it to the end of a line?

Goal: To print the absolute difference between two colon-separated fields ($3 and $2), plus an integer (+1), at the end of each line beginning with ">".

Representative sample of my file:

>lcl|ORF1_      17609   17804   (+):21:131 unnamed protein product
MEKVKNKFDENDIKVPFVPSSLLFNNTGNLNTMDKR
>lcl|ORF2_      17609   17804   (+):70:111 unnamed protein product
MFLLHYYLIIQVI
>lcl|ORF3_      17609   17804   (+):112:147 unnamed protein product
MQWIKDKVLIK
>lcl|ORF4_      17609   17804   (+):129:91 unnamed protein product
MFYPLYLDYLYY
>lcl|ORF5_      17609   17804   (+):90:1 unnamed protein product, partial
MIMKKEQMELLYHSHQIYFLPFPLHQNIHP

Desired Output:

>lcl|ORF1_      17609   17804   (+):21:131 unnamed protein product:111
MEKVKNKFDENDIKVPFVPSSLLFNNTGNLNTMDKR
>lcl|ORF2_      17609   17804   (+):70:111 unnamed protein product:42
MFLLHYYLIIQVI
>lcl|ORF3_      17609   17804   (+):112:147 unnamed protein product:36
MQWIKDKVLIK
>lcl|ORF4_      17609   17804   (+):129:91 unnamed protein product:39
MFYPLYLDYLYY
>lcl|ORF5_      17609   17804   (+):90:1 unnamed protein product, partial:90
MIMKKEQMELLYHSHQIYFLPFPLHQNIHP

My current awk script gets me very close: it prints the difference between $3 and $2 at the end of each header line. However, it does not include the +1 addition step (required), and it is not specific to lines beginning with ">" (the sequence lines get ":::0" appended), despite my attempt to restrict it with /^ *>/ (not required, but nice):

$ awk -F":" 'BEGIN {OFS=FS} /^ *>/ {$4=$3-$2} $4<0 {$4=-$4} 1' file

>lcl|ORF1_      17609   17804   (+):21:131 unnamed protein product:110
MEKVKNKFDENDIKVPFVPSSLLFNNTGNLNTMDKR:::0
>lcl|ORF2_      17609   17804   (+):70:111 unnamed protein product:41
MFLLHYYLIIQVI:::0
>lcl|ORF3_      17609   17804   (+):112:147 unnamed protein product:35
MQWIKDKVLIK:::0
>lcl|ORF4_      17609   17804   (+):129:91 unnamed protein product:38
MFYPLYLDYLYY:::0
>lcl|ORF5_      17609   17804   (+):90:1 unnamed protein product, partial:89
MIMKKEQMELLYHSHQIYFLPFPLHQNIHP:::0

Attempts to add the integer (+1) to the difference calculation:

$ awk -F":" 'BEGIN {OFS=FS} /^ *>/ {$4+1=$3-$2} $4<0 {$4=-$4} 1' file
awk: line 1: syntax error at or near =

$ awk -F":" 'BEGIN {OFS=FS} /^ *>/ {$4+=1=$3-$2} $4<0 {$4=-$4} 1' file
awk: line 1: syntax error at or near =

$ awk -F":" -v n=1 'BEGIN {OFS=FS} /^ *>/ {$4+n=$3-$2} $4<0 {$4=-$4} 1' file
awk: line 1: syntax error at or near =
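
The three attempts above fail for the same reason: the left-hand side of an awk assignment must be something assignable (a variable or a field such as $4), so an expression like $4+1 cannot appear there. The +1 belongs on the right-hand side instead. Below is a minimal sketch under that assumption, using two records copied from the sample (spacing approximated). The temporary file and the variable names `sample`, `out`, and `d` are only illustrative. Doing the sign flip inside the /^>/ block is one way to keep the sequence lines untouched.

```shell
# Sample input: two records from the question (whitespace approximated).
sample=$(mktemp)
printf '%s\n' \
  '>lcl|ORF1_ 17609 17804 (+):21:131 unnamed protein product' \
  'MEKVKNKFDENDIKVPFVPSSLLFNNTGNLNTMDKR' \
  '>lcl|ORF5_ 17609 17804 (+):90:1 unnamed protein product, partial' \
  'MIMKKEQMELLYHSHQIYFLPFPLHQNIHP' > "$sample"

out=$(awk -F: '
/^>/ {
    d = $3 - $2              # awk uses the leading number of $3 ("131 unnamed ..." -> 131)
    if (d < 0) d = -d        # absolute value, only for header lines
    $0 = $0 ":" (d + 1)      # the +1 goes on the right-hand side
}
{ print }                    # every line is printed; sequence lines are unchanged
' "$sample")

printf '%s\n' "$out"
rm -f "$sample"
```

Because the arithmetic and the append both live inside the /^>/ action, no rule ever touches $4 on a sequence line.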

And although I'm not sure how to implement functions using awk, I think there could be some utility in using something similar to this:

$ function add_one (number) {
      return number + 1
  }
$ awk -F":" 'BEGIN {OFS=FS} /^ *>/ {add_one($4)=$3-$2} $4<0 {$4=-$4} 1' file
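
A function defined at the shell prompt is not visible inside the awk program, and a function call cannot be the target of an assignment. awk does support user-defined functions, but they have to be declared inside the awk program text itself. A rough sketch of that idea follows; the function name `abs_plus_one` is hypothetical, and the input is one header and one sequence line taken from the sample (spacing approximated).

```shell
out=$(printf '%s\n' \
  '>lcl|ORF4_ 17609 17804 (+):129:91 unnamed protein product' \
  'MFYPLYLDYLYY' |
awk -F: '
# Hypothetical helper: absolute value of n, plus one.
function abs_plus_one(n) {
    return (n < 0 ? -n : n) + 1
}
/^>/ { $0 = $0 ":" abs_plus_one($3 - $2) }   # the call sits on the right-hand side
{ print }
')

printf '%s\n' "$out"
```

For the ORF4 record, 91 - 129 is -38, so the helper returns 39, which matches the desired output for that line.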

While I have been attempting to use awk to solve this problem, I am interested in any solution (e.g., since I am attempting to perform this calculation line-by-line, perhaps there is a more efficient solution with sed?).

  • Upvote for well asked question which also provided your attempts to solve the problem yourself (which has been a rare occurrence lately on SO...) – David C. Rankin Apr 10 at 4:24
  • Thanks David! That means a lot :) I'm trying my best to figure out how to solve my problems before posting to SO which I think really improves my ability to articulate what I am having trouble solving. – Gawain Apr 10 at 4:28
  • It makes a world of difference in how your questions are received. The old adage that "You never get a second chance to make a good first impression" rings true here as well :) – David C. Rankin Apr 10 at 4:29
