I want to count how many synonymous and non-synonymous mutations appear in a DNA sequence, given the number of synonymous and non-synonymous mutations for each three-letter codon. For example, if AAA has 7 synonymous and 1 non-synonymous mutations, and CCC has 6 and 3 respectively, then the sequence AAACCC would have 13 synonymous and 4 non-synonymous mutations. However, these sequences can have 10k+ letters, with 64 different three-letter combinations in total. How could I set up an M file, using for / else if statements, to count the mutations? Thanks
1 Answer
Assuming you have filtered out any data errors and the sequence splits cleanly into three-letter codons, here is one approach:
1) Make your data look like this:
AAA
CCC
ACA
CAC
...
2) Count how many times each of the 64 options occurs.
3) Multiply each count by the corresponding synonymous and non-synonymous mutation numbers for that codon, then sum over all codons.
That should be it!
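The three steps above could be sketched in MATLAB roughly as follows. This is only an illustration under stated assumptions: the variable names (`codons`, `syn`, `nonsyn`, `seq`) are made up for the example, and the lookup table holds just the two codons and counts quoted in the question, whereas a real table would list all 64.

```matlab
% Hypothetical lookup table: only two codons are filled in, using the
% counts from the question. A real table would list all 64 codons.
codons = {'AAA', 'CCC'};
syn    = [7, 6];   % synonymous mutations per codon
nonsyn = [1, 3];   % non-synonymous mutations per codon

seq = 'AAACCC';    % example sequence; its length must be a multiple of 3

% Step 1: split the sequence into one codon per row.
triplets = cellstr(reshape(seq, 3, []).');

% Step 2: look up each triplet in the table, then count occurrences.
[found, idx] = ismember(triplets, codons);
assert(all(found), 'Sequence contains a codon missing from the table.');
counts = accumarray(idx(:), 1, [numel(codons), 1]);

% Step 3: weight the occurrence counts by the per-codon numbers and sum.
totalSyn    = syn * counts;
totalNonsyn = nonsyn * counts;
```

For the sequence AAACCC this gives `totalSyn = 13` and `totalNonsyn = 4`, matching the example in the question. The sketch uses a table lookup instead of a for loop with 64 else if branches, so the same few lines should work unchanged for sequences with 10k+ letters.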
Note that steps 2 and 3 can easily be done in Excel as well. If you are not fluent in MATLAB, that will probably even be quicker.
Dennis Jaheruddin