
Script to find and alter numbers


All times are UTC - 6 hours


 PostPosted: Mon Apr 14, 2008 5:25 am   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
Hi,

I want to build a script that changes the origin of a plasmid. The plasmid is circular and is stored in an XML file. My plan is a script that finds every number in the file and subtracts a user-supplied number from it.
Furthermore, if the result of the subtraction is less than zero, another user-supplied number (the total size of the plasmid) should be added to it. For instance, if the original number is 4 and the number to subtract is 6, the result is -2. With a total plasmid size of 10, the output should read -2+10=8.
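
The arithmetic described above fits in a few lines of bash. This is only a minimal sketch: the function name shift_index and its argument order are made up for illustration, and it assumes all three values are whole numbers.

```shell
#!/bin/bash
# shift_index is a hypothetical helper name, not something from the thread.
# Usage: shift_index NUMBER OFFSET TOTAL
shift_index() {
   local n=$1 offset=$2 total=$3
   local result=$(( n - offset ))
   # A negative result has wrapped past the origin, so add the plasmid size.
   if (( result < 0 )); then
      result=$(( result + total ))
   fi
   echo "$result"
}

shift_index 4 6 10   # the example from the post: 4 - 6 = -2, and -2 + 10 = 8
```

Following the rule exactly as stated, a result of exactly 0 is left alone, because only values below zero get the plasmid size added.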

Is this even possible with bash or should I try another scripting method? I've attached a sample file of the type I want to play around with.

Thank you for your help!

Code:
<root>

  <fields>

    <isCircular>true</isCircular>

  </fields>

  <created type="date">2008-04-09 09:06:53</created>

  <name>pHR-cPPT.TRE.eGFP.WPRE.SIN</name>

  <description>A new nucleotide sequence entered manually</description>

  <sequenceAnnotations>

    <annotation>

      <description>BamHI</description>

      <type>restriction site</type>

      <intervals>

        <interval>

          <minimumIndex>2657</minimumIndex>

          <maximumIndex>2662</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>Recognition pattern</name>

          <value>G^GATCC</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>ClaI</description>

      <type>restriction site</type>

      <intervals>

        <interval>

          <minimumIndex>2213</minimumIndex>

          <maximumIndex>2218</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>Recognition pattern</name>

          <value>AT^CGAT</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>cPPT</description>

      <type>misc_feature</type>

      <intervals>

        <interval>

          <minimumIndex>2095</minimumIndex>

          <maximumIndex>2212</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>psi</description>

      <type>misc_feature</type>

      <intervals>

        <interval>

          <minimumIndex>686</minimumIndex>

          <maximumIndex>823</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>RRE</description>

      <type>misc_feature</type>

      <intervals>

        <interval>

          <minimumIndex>1310</minimumIndex>

          <maximumIndex>1514</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

        <qualifier>

          <name>modified by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>3'LTR</description>

      <type>3'UTR</type>

      <intervals>

        <interval>

          <minimumIndex>4135</minimumIndex>

          <maximumIndex>4370</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>SV40</description>

      <type>misc_feature</type>

      <intervals>

        <interval>

          <minimumIndex>8673</minimumIndex>

          <maximumIndex>8928</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>TRE</description>

      <type>helix</type>

      <intervals>

        <interval>

          <minimumIndex>2219</minimumIndex>

          <maximumIndex>2655</maximumIndex>

          <direction>leftToRight</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>XGPT</description>

      <type>ORF</type>

      <intervals>

        <interval>

          <minimumIndex>9153</minimumIndex>

          <maximumIndex>9610</maximumIndex>

          <direction>leftToRight</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>WPRE</description>

      <type>misc_feature</type>

      <intervals>

        <interval>

          <minimumIndex>3460</minimumIndex>

          <maximumIndex>4048</maximumIndex>

          <direction>leftToRight</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>5'LTR</description>

      <type>5'UTR</type>

      <intervals>

        <interval>

          <minimumIndex>1</minimumIndex>

          <maximumIndex>634</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>GFP</description>

      <type>gene</type>

      <intervals>

        <interval>

          <minimumIndex>2703</minimumIndex>

          <maximumIndex>3419</maximumIndex>

          <direction>leftToRight</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>SV40 polyA</description>

      <type>misc_feature</type>

      <intervals>

        <interval>

          <minimumIndex>10012</minimumIndex>

          <maximumIndex>10861</maximumIndex>

          <direction>leftToRight</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>b-lacatamase</description>

      <type>gene</type>

      <intervals>

        <interval>

          <minimumIndex>6759</minimumIndex>

          <maximumIndex>7619</maximumIndex>

          <direction>leftToRight</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>created by</name>

          <value>User</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>KpnI</description>

      <type>restriction site</type>

      <intervals>

        <interval>

          <minimumIndex>2521</minimumIndex>

          <maximumIndex>2526</maximumIndex>

          <direction>none</direction>

        </interval>

        <interval>

          <minimumIndex>4066</minimumIndex>

          <maximumIndex>4071</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>Recognition pattern</name>

          <value>GGTAC^C</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>EcoRI</description>

      <type>restriction site</type>

      <intervals>

        <interval>

          <minimumIndex>3449</minimumIndex>

          <maximumIndex>3454</maximumIndex>

          <direction>none</direction>

        </interval>

        <interval>

          <minimumIndex>4054</minimumIndex>

          <maximumIndex>4059</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>Recognition pattern</name>

          <value>G^AATTC</value>

        </qualifier>

      </qualifiers>

    </annotation>

    <annotation>

      <description>XhoI</description>

      <type>restriction site</type>

      <intervals>

        <interval>

          <minimumIndex>3433</minimumIndex>

          <maximumIndex>3438</maximumIndex>

          <direction>none</direction>

        </interval>

      </intervals>

      <qualifiers>

        <qualifier>

          <name>Recognition pattern</name>

          <value>C^TCGAG</value>

        </qualifier>

      </qualifiers>

    </annotation>

  </sequenceAnnotations>

  <charSequence>

    <sequence>actta....</sequence>

    <gapPrefixLength>0</gapPrefixLength>

    <gapSuffixLength>0</gapSuffixLength>

  </charSequence>

</root>



 PostPosted: Mon Apr 14, 2008 10:05 am   
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
you would have to know more about the numbers you are trying to extract... do they always have certain things around them? do they always follow a certain line? once you know that, you can extract the lines that contain the numbers, then extract the numbers themselves and do the math on them. it's really hard to say "you would write this script <some code>" when we don't know as much about what you want as you do.

but it sounds like bash is what you would want for this... at least i'd probably do this in bash first, then possibly in python, depending on the files and whatnot.


 PostPosted: Mon Apr 14, 2008 12:51 pm   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
Thanks for your reply.

Yes, it seems there is always <minimumIndex>NUMBER</minimumIndex> around the numbers I want. Question is, is there some kind of loop I could make that would extract the numbers, roll em into an array or something, manipulate them and then stuff them back in?
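
A loop like the one asked about is possible in bash. The following is a rough sketch of the "extract into an array" half only; the helper name extract_indices and the temporary sample file are assumptions made for illustration, with values taken from the sample file in the first post.

```shell
#!/bin/bash
# Hypothetical helper: print every minimumIndex/maximumIndex value, one per line.
extract_indices() {
   grep -E '<(minimum|maximum)Index>' "$1" | sed 's/[^0-9]//g'
}

# Small stand-in for the real XML file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
          <minimumIndex>2657</minimumIndex>
          <maximumIndex>2662</maximumIndex>
          <direction>none</direction>
          <minimumIndex>1</minimumIndex>
          <maximumIndex>634</maximumIndex>
EOF

# Collect the numbers into an array, one element per matching line.
indices=()
while read -r n; do
   indices+=("$n")
done < <(extract_indices "$sample")
rm -f "$sample"

echo "found ${#indices[@]} numbers: ${indices[*]}"
```

The sed step simply deletes every non-digit character, which is safe here only because the tag names themselves contain no digits.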


 PostPosted: Mon Apr 14, 2008 1:14 pm   
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
what about the numbers directly under those, in "<maximumIndex>"? are you going to do anything with those?


 PostPosted: Mon Apr 14, 2008 1:46 pm   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
Yes, sorry. I need to change both <minimumIndex>X</minimumIndex> and <maximumIndex>X</maximumIndex>
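
Since both tags need the same treatment, one way to handle them is a single pass over the file that rewrites each matching line as it is read. This is a sketch under assumptions, not a tested solution for the real data: the function name shift_xml is invented, the values are assumed to be whole numbers, and each index element is assumed to sit on its own line as in the sample.

```shell
#!/bin/bash
# shift_xml is a hypothetical name. Usage: shift_xml FILE OFFSET TOTAL
# Prints the rewritten XML to standard output; the input file is not modified.
shift_xml() {
   local file=$1 offset=$2 total=$3 line n
   # Group 1: everything up to and including the opening tag
   # Group 3: the number, Group 4: the closing tag and anything after it
   local re='^(.*<(minimumIndex|maximumIndex)>)([0-9]+)(</.*)$'
   while IFS= read -r line || [[ -n $line ]]; do
      if [[ $line =~ $re ]]; then
         # 10# forces base 10 so a leading zero is not read as octal
         n=$(( 10#${BASH_REMATCH[3]} - offset ))
         (( n < 0 )) && n=$(( n + total ))
         line="${BASH_REMATCH[1]}${n}${BASH_REMATCH[4]}"
      fi
      printf '%s\n' "$line"
   done < "$file"
}
```

It could be called as `shift_xml a3.xml 2657 11029 > a3.shifted.xml`. Because every line is computed from its own original value, each number is shifted exactly once.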


 PostPosted: Mon Apr 14, 2008 3:06 pm   
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
this was just a quick run down... not sure if this is what you are looking for... but i hope it's close...
you would pass this script the xml file you want to replace the numbers in...
so if you named this script "plasmid.sh"
you would call it like "plasmid.sh filetoalter.xml"

Code:
#!/bin/bash
# needs bc in case of decimals (currently set to 2 places)
# if you don't use decimals, it won't invoke bc

xml="$1"
grep "Index>" "$xml" > tmp
cp "$xml" "${xml}.bak" # make a backup of the original file... just in case
minMax="0"
read -p "Please enter the total size of the plasmid: " tplasmid
read -p "Please enter the number to subtract: " subNum



function math
{
   # scale = 2 sets how many decimal places bc keeps when dividing (hundredths)
   newNumber=`echo "scale = 2; ${1}" | bc -l 2> /dev/null`
   if [[ "$newNumber" == "" ]]
   then
      # bc printed nothing, most likely because it is not installed
      newNumber="bc is not installed"
   fi
   echo "$newNumber"
}

function mathNoDec
{
   let newNumber="$1"
   echo $newNumber
}

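function getNewNum
{
   n="$1"
   # grep -F looks for a literal dot; a bare '.' is a regex that matches any character.
   # All three inputs are checked so bc is used whenever any of them has decimals.
   dec=`echo "$n$subNum$tplasmid" | grep -F '.'`
   if [[ -n "$dec" ]]
   then
      # decimal input: do the arithmetic with bc
      tmpNum=`math "$n - $subNum"`
      if [[ ${tmpNum:0:1} == "-" ]]
      then
         # negative result: wrap around by adding the plasmid size
         math "$tplasmid + $tmpNum"
      else
         echo $tmpNum
      fi
   else
      # integer input: shell arithmetic is enough, bc is not invoked
      tmpNum=`mathNoDec "$n - $subNum"`
      if [[ ${tmpNum:0:1} == "-" ]]
      then
         mathNoDec "$tplasmid + $tmpNum"
      else
         echo $tmpNum
      fi
   fi
}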

# tmp holds the matching lines in file order: minimum, maximum, minimum, ...
# NOTE: "sed -i" without a backup suffix, as used below, is GNU sed syntax.
# Caveat: every match is replaced globally, so a value written by an earlier
# substitution can be rewritten again if it equals a later original value.
while read -r cur_line
do
   # The number sits between the first ">" and the following "<"
   num=`echo "$cur_line" | cut -d\> -f2 | cut -d\< -f1`

   # Shift the number and wrap it around the plasmid size if needed
   newNum=`getNewNum "$num"`

   # Lines alternate between the two tags, so toggle minMax on each pass
   if [[ "$minMax" == "0" ]]
   then
      tag="minimumIndex"
      minMax=1
   else
      tag="maximumIndex"
      minMax=0
   fi

   # Put the new number in place (every instance of this number inside
   # the current tag is replaced, just to save on work)
   sed -i "s#<$tag>$num</$tag>#<$tag>$newNum</$tag>#g" "$xml"
done < tmp

rm -f tmp

enjoy :)


 PostPosted: Tue Apr 15, 2008 1:44 am   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
wow - impressive!!! Thanks a lot, I'll give it a test run and tell you how it worked out! :)


 PostPosted: Tue Apr 15, 2008 2:57 am   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
Ran the script; here is the output. It seems it made some noise! :) What do you make of it? Thanks again!

Code:
dhcp-85:~ frede$ sh plasmid.sh a3.xml
Please enter the total size of the plasmid: 11029
Please enter the number to subtract: 2657
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text
sed: 1: "a3.xml": command a expects \ followed by text


 PostPosted: Tue Apr 15, 2008 8:37 am   
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
are you sure you copied and pasted the script correctly? i didn't get any such error when i ran it... perhaps you can send me the file you ran it against... send it to jbsnake at gmail dot com (i spell it out so no bots can log it for spam :wink: )


 PostPosted: Sun Apr 20, 2008 12:55 pm   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
hey did you get my email?


 PostPosted: Mon Apr 21, 2008 8:51 am   
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
i did.... did you get my return email?


 PostPosted: Tue Apr 22, 2008 4:29 pm   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
nope sorry. Can you post your reply here or try to resend please?


 PostPosted: Tue Apr 22, 2008 7:53 pm   
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
hey Frederik,

I have a couple of suggestions. First, don't use sh to run the script; you may be getting the Bourne shell (true sh) instead of the Bourne Again shell (bash). Instead, make the script executable (chmod +x plasmid.sh) and run it directly (./plasmid.sh test.xml). Second, don't use the -v flag when you run it... I'm not sure if you just did that to get the text output for me, but I'm not sure bash likes it too much... usually you'd put set -v in the script itself instead.
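
The sh-versus-bash point can also be checked from inside a script. A minimal sketch, assuming a hypothetical helper named running_under_bash: bash sets the BASH_VERSION variable, while other shells normally leave it unset.

```shell
#!/bin/bash
# Hypothetical guard: succeeds only when the interpreter has set BASH_VERSION.
running_under_bash() {
   [ -n "${BASH_VERSION:-}" ]
}

if running_under_bash; then
   echo "running under bash $BASH_VERSION"
else
   echo "please run this script with bash, not sh" >&2
fi
```

On systems where sh is itself bash running in POSIX mode, the variable is still set, so this check only catches a genuinely different shell.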

Thanks, and I hope that helps!

--
JB
jbsnake at gmail dot com


 PostPosted: Wed Apr 23, 2008 2:04 am   

Joined: Mon Apr 14, 2008 5:07 am
Posts: 9
ran it the way you said, same problem!

Code:
sed: 1: "test.xml": undefined label 'est.xml'


???


 PostPosted: Wed Apr 23, 2008 8:38 am   
Site Admin

Joined: Tue May 17, 2005 7:31 pm
Posts: 251
Location: Georgia
pretty strange... taking your xml file and my script... i ran it just fine... i'm at a loss
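
The thread never settles the cause, so what follows is an assumption rather than a confirmed diagnosis. The prompt in the posted output (dhcp-85:~ frede$) resembles a default macOS terminal prompt, and the sed shipped with macOS is the BSD version, whose -i option requires a backup suffix as its argument. Given sed -i "s#...#" "a3.xml", BSD sed would take the substitution as the suffix and then try to parse the file name as the sed script: a3.xml begins with the a (append) command and test.xml begins with the t (branch to label) command, which matches both error messages. A sketch of an in-place edit that both GNU and BSD sed accept, using a made-up helper name replace_in_place:

```shell
#!/bin/bash
# replace_in_place is a hypothetical helper. Usage: replace_in_place SED_SCRIPT FILE
# A suffix attached directly to -i is accepted by both GNU and BSD sed;
# the backup copy it creates is removed once the edit succeeds.
replace_in_place() {
   sed -i.sedtmp "$1" "$2" && rm -f "$2.sedtmp"
}
```

In the script above, each sed -i "..." "$xml" line would then become replace_in_place "..." "$xml".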

