-rw-r--r-- | Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png | bin | 0 -> 54031 bytes
-rw-r--r-- | NOTES.org | 40
-rw-r--r-- | phi6 RefWT_from Lele.txt | 1
-rw-r--r-- | phi6 wt protein start stops.csv | 14
4 files changed, 55 insertions, 0 deletions
diff --git a/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png b/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png
Binary files differ
new file mode 100644
index 0000000..de7a11c
--- /dev/null
+++ b/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png
diff --git a/NOTES.org b/NOTES.org
new file mode 100644
index 0000000..ba7f44b
--- /dev/null
+++ b/NOTES.org
@@ -0,0 +1,40 @@
+* This is the phi6 genome:
+[[file:phi6 RefWT_from Lele.txt]]
+
+* CSV file
+[[file:phi6 wt protein start stops.csv]]
+
+This is a CSV file with three columns: protein name, start nucleotide, ending nucleotide.
+These numbers are inclusive. Everything else in the genome that’s not in at least one of those ranges (there are one-nucleotide overlaps
+between some reading frames) isn’t protein-coding.
+
+* Standard genetic code
+[[file:Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png]]
+
+The standard genetic code that you’ve used for some of my class projects applies; we will be using the single capital letter abbreviations for
+amino acids. Because of this, please use lowercase “a, c, g, t” for nucleotides. This is a chart that uses the DNA bases (no need to switch “u”
+to “t” in your head) and has the single letter amino acids. The three stop codons (taa, tag, tga) should all code for the same thing: it could be
+“STOP”, it could be an asterisk… you can have some creative control here :-)
+
+* Test
+As a test that our coordinates are correct, can you spit out the protein sequence from each of those proteins? Each will start with an M (one with
+a V, it’s an “alternate start codon”) and should stop with a stop. Please send me that as a text file.
+ +If that works I’ll get you sample input and output for what we need the program to actually do + +have a nucleotide number and nucleotide inputted +print out reference sequence nt at that number, the nt number, the inputted nucleotide (Tab) the name of the protein involved OR +“noncoding” (Tab) Amino acid called by wild type sequence, the number in the protein that amino acid is, the amino acid called by the +inputted nucleotide being in the sequence. + +Something like: +input 7500g + +output: +a7500g P7 S34T + +(sometimes the variant nucleotide will be in a protein-coding region but won’t change the called amino acid, this is normal and fine so we’ll +see, for example, “S34S” + +Thanks! +SD diff --git a/phi6 RefWT_from Lele.txt b/phi6 RefWT_from Lele.txt new file mode 100644 index 0000000..e751e6d --- /dev/null +++ b/phi6 RefWT_from Lele.txt @@ -0,0 +1 @@ +GGAAAAAAACTTTATATAACTCTTATATAAGTGCCCTTAGCGGGGCTCCCCGGCTACGGTCGGATCCCTACGGGGAGGATAGGGTGAAAACCCCTAGTGCAAGCTGACACTCATACCTCCCAAGGTCCATGAGTCGACGCAAAGGTCCTCGAAAGCATGTTGTCCTTTCGTACAACCGAGTAGGTTCGTTGCCTTAATTGGTGACGCTTGCAGGATGAGGATGGTCCCGACGCCTAACGGACCTTGCTGCCTTCTTTCCCTGGATTGGCGGTGTTGTTCCCACTAATAATAAAGGAATACGCACATGTTGCTGCCTGTAGTAGCCCGTGCGGCCGTCCCTGCTATTGAGAGTGCCATTGCGGCTACTCCTAGCCTGGTTTCCCGAATCGCAGCCGCGATCGGTTCCAAGGCCAGCCCTTCCGCCATTTTGGCGGCGGTCAAGAGCAACCCGGTCGTCGCAGGTCTGACACTCGCTCAGATCGGAAGCACCGGTTATGACGCCTATCAGCAGCTTCTGGAGAATCATCCAGAGGTCGCCGAGATGCTGAAAGACCTGTCTTTCAAAGCCGACGAAATCCAGCCGGATTTCATCGGTAACCTCGGTCAGTACCGCGAAGAGCTGGAACTGGTCGAAGATGCTGCCCGCTTCGTGGGCGGCATGTCGAACCTGATTCGCCTGCGCCAGGCCCTGGAACTTGATATCAAGTACTACGGCCTGAAAATGCAGCTGAATGACATGGGATACCGCTCGTAATGGTTATCGGTCTCCTGAAGTATCTCACGCCTGCCGTTAAGGTGCAGATGGCTGCTCGCGCGTTGGGCCTGTCCCCCGCCGAAGTCGCTGCAATTGACGGCACGTTGGGTCGTGTCTCTGCGATGCCAGCGGTCGCGGTCGTGCTGGGAGGGAAACCTCTCTCTCTGGCCACGATCGCGTCAGTTGTGTCTGATGCAAACCCCAGTGCCACTGTTGGCGCGCTTATGCCTGCTGTACAGGGCATGGTGAGTTCCGACGAAGGCGCGAGTGCGTTGGCTAAGACCGTGGTAGGCTTCATGGAGTCCGACCCCAACAGCGATGTCCTGGTTCAACTGCTCCACAAGGTGTCAAACTTGCCGATTGTCGGCTTTGG
TGACACGCAGTATGCAGACCCAGCTGACTTCTTGGCCAAGGGAGTTTTCCCTCTGATCAGGAAGCCAGAAGTAGAGGTTCAAGCTGCGCCTTTCACCTGTCGTCAGTGTGATCATGTTGATCACATCACTGATGTACCTCAAACTTCGACCTTTGTTCACAAATGCACTTCGTGCGGCTTTGTGCAGATGGTCCACCGTAAGGATGTTCCGTAATGCCATTTCCTCTGGTAAAGCAAGACCCAACCTCGAAGGCTTTCACTGAAGCCAGTGAACGCTCCACCGGCACCCAGATCCTGGACGTCGTCAAGGCCCCTATCGGCCTGTTCGGCGACGATGCCAAACACGAGTTCGTGACCCGTCAGGAACAAGCCGTCTCCGTCGTCAGCTGGGCAGTTGCTGCCGGTCTGATCGGCGAGCTGATCGGCTACCGTGGTGCGCGTTCGGGTCGCAAAGCGATCCTGGCCAACATCCCTTTCCTGGCCTAACTCCTCGTGTCCAAGGATAGCGCCTTCGCGGTGCAATACTCGCTGCGCGCCCTGGGACAAAAGGTGCGGGCAGACGGGGTAGTGGGCTCTGAGACCCGTGCGGCGCTGGATGCGCTGCCCGAGAATCAGAAGAAAGCGATTGTAGAGTTGCAAGCACTCCTACCGAAAGCACAGTCGGTCGGCAACAGCCGTGTGAGGTTCACAACAGCTGAAGTCGACTCGGCGGTGGCGCGGATCTCGCAAAAGATAGGTGTTCCGGCTTCTTACTACCAGTTCCTGATTCCGATCGAGAACTTCGTGGTGGCCGGTGGTTTCGAAACCACCGTTTCTGGTTCCTTCCGTGGGTTGGGCCAGTTCAACCGGCAGACGTGGGATGGACTCCGTCGTTTAGGCCGTAACCTTCCTGCATTTGAGGAGGGTTCGGCACAACTGAACGCTTCTCTTTATGCAATCGGGTTCTTGTATCTTGAGAACAAGAGAGCGTACGAGGCGTCGTTCAAAGGCCGCGTTTTCACTCACGAAATCGCGTATTTGTATCACAACCAAGGCGCTCCAGCTGCCGAACAGTACCTGACTTCGGGTCGGCTCGTTTACCCGAAGCAAAGCGAGGCCGCTGTCGCGGCGGTTGCGGCTGCGAGAAACCAGCATGTCAAAGAGAGTTGGGCTTAACCCTGAACTGCATCGTGAACTGAAAATGTTCCCAGATGTCACGAAGGGTGGCACGTTCGACATAACCATCCGGTCGACTACCGAGAACGGTGCTTTTTGGGCGAACTACGAAGGTAGAACGTCCTTGGTCACCGTCCCGGACGTGAAGACAGCTATCGAGTTTTTGATTAAACTCTGCCGTCGACACAAGTTGTCCAATCAGGTGAACACGCGAACGCTTCTCCGCGATTTGCAACGAACGTTGCAGGAATGTGAATGCCAGTCTCATCATGTGCCGTTGTCCAGCCCCTTCATGCATCTCAGATTTGCGTAAAGCTGATCGGAAGCTATGAAAGTAAGCTGAGCGACACGGAAGTTATTGAAGCAGCTATTCAAGCTCTCATAGGCTTGGAAAGCCCGGCATCGATGTCGTTCCATGTCGCGCCAGACGCGGCCACCGATATGTATCTTGATCTGATCGAAATCTACTCCCCGTCGTCAGTCGGGATACATCTCGTCCTGCCATAAGCGCTGTCTGTAGCGTGCATAAACAGATAGATCGCCTTTTTAGGTAACCGCGGATTGATCACCGTTCCGAGCTTGCTTGGATAAACAAGTCCTTGTATAACAAGGCGAGACTCACTATGTGAGCGTCCAATAGGACGGCCCCTTCGGGGGCTCTCTCTCTGGAAAAAAACTTTATATATTTTCTACGTTGAGCTCCGTATAAAGCTCCGTGCCCGCACACGCCCGCTACGGCGGTATTGTCTAACCGGCGACAATAAACAGCTGCTGCTTACAAGCTTACAGTTGACCGGAGTCTCGGCGTGCAGCGCCTAAACACGGGAAACCGTGGTGGTGACACC
CTCTGCTGAGGGCTTATAGTGGTGATATTCCTCCCCAGGAGTTCCCTCCCATTTCGGCCACTCGCGCTCTAACCATGAGCGCGGTCTCTTTGAGAGTGTCGCTCTTCTGCCTACGCGCTCATTCGTTCCCTCGAGTTGACGCTTCAAGCAGGTGGACACCTCCTCAACCCATAATAAGAGATCCATTCAATGGACAACATCCTCGATCCCCTTAAGGCTCCGTTTTCTTCGGAAGCCGCCGCGAAAACCACCGCTGCCAAAATCGCTGTGGTATACGCGTTGGTCGGTCTGGTTGGCGGTCTGCTGCTCACCAAGTAAGGTGTAGTATGCATGACACGCGACCGCTCCGAACCGAGAGACCCATGGCCAGCAAGAATACGAATGACCGGGTTTTCGACCGGTTTCATTCTTCTCTCGTTCGGCTGTGGCATTCTGCAAACCAGCGCATGCGCGGTTCTTTCTCCGTCGTGGTTCGGGTCGAGCACTCTTTAGTGCTCCTCATCGGCTACACGGTGGTAGGCGCGACTGTCGCACACTTCGTGAGGTGACTATGTTAGCTTTCGTAGCGCGAGCGGTCGTACTTTACTCTGCTGGTGTAGTCGTGGGCATCGCCTACGATCACGTCACAGGAAGGAAACGTCGCCATGACTAAGTGGAAGATGTACATCGCCGGCGTCGTTCTGGTCATCGTAGGGGCAGTTACTCATGCTCCACAGCTGATGGTCCAGGGCATGACTACGCTCGCGACTCAAGCGGCCGCAGACGCGGCCGATGGTGGAGGTGCTCAGTGAGTATCTTCTCCTCGTTGTTCAAGGTCATCAAGAAGGTAATCTCGAAGGTGGTCGCCACCCTTAAGAAAATCTTCAAGAAGATCTGGCCGTTGCTACTTATTGTGGCAATTATCTACTTCGCTCCCTACCTCGCCGGGTTCTTCACTTCCGCCGGGTTCACTGGGATCGGAGGGATCTTCTCCTCTATCGCAACCACCATCACGCCTACGCTGACGTCGTTCCTGTCGACTGCGTGGTCTGGTGTGGGCTCTCTTGCCTCCACGGCTTGGTCTGGGTTCCAATCTCTCGGGATGGGTACTCAGCTCGCTGTCGTGAGTGGCGCGGCTGCTCTGATTGCACCTGAGGAAACGGCTCAACTGGTTACCGAAATCGGTACCACCGTAGGTGATATCGCCGGTACGATTATCGGCGGTGTCGCCAAGGCACTCCCGGGTTGGATCTGGATCGCCGCAGGCGGTCTTGCCGTCTGGGCCCTCTGGCCGTCATCTGACAGTAAGGAGTAGCAAATGCGCTACCAAGGCATCAACGAGTGGCTGGGTGGAGCCAAGAAACTCACCACCGCAAACGGTGAGATTGGCGCTATCTACCTCTCCGCTGCTCCTCCCACCGACGCCGCACGTGCGGACGCTAAGGCGGTGGATTTTACTGCTGGTTGGCCAAGCGCGATCGTTGATCGCGCTGATGCCACTCGTGCCAAGCAGAACTACCTGTGGGTTGGCGATAACGTTGTGCACATCGGGGCTAAACACGTTCCACTCCTCGATCTGTGGGGCGGGACAGGTGATGCCTGGCAGCAGTTCGTTGGCTATGCCTGCCCAATGCTCGACCTTTGTCGTGCGTGGGGCCTGGGTTATGCCAGCGCTTCTGTAACCACCGGCTCGTTGCAGGGCTATCAGCCATCGGCGTTCTTGGACGCTGAGCAACAGCAGTTCGCGAAGGACAATCTCAACCTGTATGGCGATAACTGCCTTGACCTGGCCACCAGTTCGTCCGCTCAGCGGGCATTTCTGGAGCAGTGCATGGGCTGCGCCTTGCCGGAGGATTGCGTCTTCGGTTGGTATGTGAAAATGGATTGGGAAGGCTCGGCAGTTGCCGACGCCTACGCTGCGATCCGTGTCCAAGGGTTCGCCACTGTAATGGCACCTTGGCAGTCGGTTGGCGGTGCTGGCTACGTTTACGCTCGTGTGCCTCAAAAAGGCGCGTGGATGG
GTGTGAACCTGCTTGCCTATGTCCACGGCACCAGTGGCCAGCCTGCTTATGGCATTCCGATGACCCTCTCGGGGTTCACCGGTAACATGGGTCAGGTGGCTTCGAAGTGGCTCATGCTTCCTCTCCTGATGATCGTCGACCCTCATGTCGTCCAGATTTTGGCCGCACTGGGGGTTAAACGTGGGACCAAATCGGACCCACGGACGACCGACGTGTACGCTGATCCGAAGGTTCCGGCTAGCCGTATTTCCGGGCCGATGATCAATGCAACGGTTGCTCCTCCTGCGACGATCCCCGCTACCATTCCGGTGCCTCTGGCGCCGCTCGGTGGCGCGGGTGGCCCTGGCGCTCAGGGTTTCCAGGTATACCCCGTTTTCACCTGGGGTCTGCCTGAGTTCATGACCGACGTGACCATCGAAGGTACCGTCACTGCGGACTCCAACGGTCTGCATGTCGTGGACGACGTGCGTAACTACGTCTGGAACGGTACTGCTCTTGCTGCAATTGAGCAGGTCAATGCCGCTGACGGTCGAGTTACGCTCACTGACTCTGAGCGTGCTCAACTCGCCTCGTTGACTGTTCGAACCGCATCGTTGCGTCAGCAGCTGTCGGTTGGGGCAGACCCCTTGTCCAAGACGTCGATCTGGCGTCAGGCTCAAAAGGCCGATTATGATCTGCTGTCTCAACAGATCATCGAAGCGGACACGGTGAAAAACCTACCTGCTGTGACGTTCGCTCAGGCGAACAAAGCGGCAGGCGGTCAATCCGAGACGTTGTGGCACCAGATGTATCGGGTCAACGATATCGCTGGCGATCAAGTCACCGCAATCCAAATCACTGGTACGATGGCGACTGGCATTCGCTGGTCGGCAACTGCTGGCGGTCTGGTCGTCGATGCTGACGAGCAAGATGCGGTGATCGCGATTTCGTCCGGTAAGCCGGTCAAGAACAGCTCCGACCTTCCTACGGCCGACGCTGTGAACTACTTGTTCGGTATCACTGCGGACGATATGCCTGGTATCGTTTCCTCGCAAAAGGAAATGAACAGCGAGTTTGAAGAAGGTTTCCTTCAGAAAGCTCGTCTCTGGAACCCACGTAAGCTCGTCGAAAACGTCCAGAATGCCTATTTCCTGATGGTGTACGCTCGCGATCGGAAGCAATTCCACTCGTTGGTGGCATCCTCTCTGGCGATGGCCAAGCTGGCGCGTAAGTACGCGGGCCTGTAAGGAGTCGTATGGCTGCTGAACAATCCTCCGGTATGAGCGCGTTCACCAAAGGCACGATCGTGATCTGCCTGGTGGTGGTCGCCCTCAATCTCATCGGGAAGTGACCATGGTACCGCTAAAAATTAGCACGCTGGAGTCCCAGCTGCAACCGCTTGTTAAGTTGGTTGCAACCGAGACCCCCGGTGCCCTCGTAGCGTATGCTCGAGGGTTATCGAGTGCCGACCGCTCGCGGTTGTACAGACTGCTTCGTTCTTTGGAGCAGGCCATCCCGAAGCTGTCGTCGGCTGTCGTTTCGGCCACGACGTTGGCAGCGCGAGGTCTCTAATGGAGACCAACCCGCTGCTTCAGCTTGAGTCGCTGTCGTTACGCTTGCGAGACATGCCTCGTTCGCGCCTTTCTGCGCTGATGAAGAACATGTCGTATGAGCAGCTGCAGTCGTTGTATAGCACCAGCGTAAAAGTTGGCGCTGTGCTCGATAGCGTTTCAATGCAGTTGCTTGAGGCGTCACAAACCGCTCAATCGGGAACTCGACTGATGACACCGCAGGAGTACGTCGCTGCTGGTGGAGGTCGTGTGTACGTTAAATAAGTCCTTAGATTTCTAAGGCGAGACTCGCTTTGCGAGCATCCAATAGGATGGCCCCTTCGGGGGCTCTCTCTCTGTAAAAAAACTTTATATAGTCTTTTACCTGGATTCTCTGTGCAGAACTGAGAACTGAACGCTACCCTTGCGGGGGATGCGGCCCCGGGCTACGGCCTAGGGATCCAGCGTGGC
TCACGGGCCGCCGGAACTGACGTCCGTAACAAACGTCCTTGGGATAGGAGTACAGTAACCACTCTTAGATACCCGATTCCCCTGTTTCTGCGTGGAAGCCTTTCGACAGCTACCCAGCTTAGATCGTCTGGTGCCCTAAATCCCTGGAGATAACCAATGGCTACATTACAAGATGTGCATCTACGGGTGAATGACCGGGTAACACCGGTGTACTTCACTGCTCGCTCGTTTCTGCTCGTTTCTCCGAAACGTGCGGGGCAAGCAACGTTCCTCGCTCGCGAGGAGGGTACTGACAATCCTGTCGTTACCTGTCATGTATCCGACTTTTATAAGGACGGTGTGTAATGACTTTGTACCTGGTCCCTCCGCTGGATTCGGCGGACAAAGAGTTGCCTGCTCTGGCTTCCAAAGCTGGGGTAACGCTTCTCGAGATCGAGTTTCTTCACGAGCTCTGGCCTCACCTCAGTGGTGGTCAGATCGTGATCGCCGCTCTCAACGCCAACAATCTGGCCATCCTCAACCGTCACATGTCCACTCTGTTGGTCGAGTTGCCGGTTGCTGTGATGGCCGTTCCCGGTGCTAGCTATCGTTCCGATTGGAACATGATCGCTCACGCACTCCCGTCTGAGGATTGGATCACTTTGTCCAACAAGATGCTGAAAAGCGGCTTGCTGGCGAACGATACCGTCCAGGGCGAGAAGCGCTCCGGCGCTGAGCCGCTGTCGCCGAACGTGTACACCGATGCGCTCTCGCGTCTCGGTATCGCGACGGCCCATGCTATCCCCGTTGAACCCGAACAACCGTTCGATGTCGATGAGGTAAGCGCCTGATGCCGAGGAGAGCTCCCGCGTTCCCTCTGAGCGATATCAAGGCTCAGATGCTGTTCGCAAATAACATCAAGGCCCAACAAGCCTCGAAGCGTAGCTTCAAAGAGGGGGCGATTGAAACGTACGAAGGGCTGCTTTCAGTAGACCCTCGGTTTTTGAGTTTCAAGAACGAGCTCTCTCGGTATCTGACCGACCACTTCCCGGCGAACGTCGACGAGTATGGTCGTGTTTATGGAAACGGTGTTCGTACCAACTTCTTTGGTATGCGCCACATGAACGGGTTTCCAATGATCCCCGCGACGTGGCCACTCGCTTCCAACCTTAAGAAACGTGCCGACGCTGACCTAGCCGATGGCCCTGTTTCTGAGCGCGACAATCTACTCTTTCGCGCCGCAGTCCGGCTTATGTTTTCAGATCTAGAGCCTGTTCCGCTGAAGATCCGTAAAGGATCGTCAACCTGCATCCCGTATTTTTCTAACGATATGGGAACGAAGATCGAGATCGCCGAGCGCGCTCTTGAGAAAGCGGAAGAAGCTGGCAATCTGATGCTGCAAGGTAAGTTTGATGACGCCTACCAGCTCCACCAAATGGGTGGTGCCTATTACGTCGTGTATCGTGCACAATCGACCGATGCTATCACACTCGACCCTAAGACCGGAAAATTCGTGTCAAAGGATCGTATGGTCGCTGACTTCGAATACGCAGTCACGGGCGGTGAGCAAGGCTCGCTGTTCGCTGCTTCGAAGGATGCCTCTCGTTTGAAGGAACAGTACGGGATAGATGTCCCGGACGGGTTTTTCTGCGAGCGGCGTCGTACCGCTATGGGTGGTCCGTTCGCGTTGAACGCTCCTATCATGGCCGTTGCGCAACCTGTGCGAAACAAAATTTACTCCAAGTACGCTTACACCTTTCACCATACTACTCGTCTTAATAAGGAGGAAAAGGTGAAAGAGTGGTCGTTGTGCGTCGCTACTGACGTATCCGACCACGACACGTTCTGGCCTGGATGGCTGCGGGATCTCATCTGTGATGAACTGCTCAACATGGGGTACGCTCCGTGGTGGGTTAAGTTGTTCGAGACCTCGCTCAAACTGCCCGTTTACGTGGGCGCTCCTGCTCCTGAGCAGGGCCACACGTTGTTGGGTGATCCGTCCAACCCTGATCTCGAAGTTG
GTCTCTCGTCCGGACAAGGGGCGACCGACCTCATGGGCACGTTGCTCATGAGTATCACCTACCTGGTGATGCAACTTGATCACACCGCTCCTCACCTCAACAGTCGAATCAAGGACATGCCATCAGCATGCCGCTTTCTTGACTCGTATTGGCAAGGACACGAGGAGATCCGTCAGATCTCAAAATCTGATGATGCTATGCTTGGCTGGACCAAAGGTCGTGCTTTGGTTGGTGGTCATCGTTTGTTCGAGATGCTGAAAGAGGGTAAGGTTAACCCCTCACCTTACATGAAGATCTCCTACGAGCACGGTGGCGCCTTCCTTGGTGACATCCTGCTTTACGACTCGCGTCGTGAGCCTGGCTCTGCCATCTTCGTTGGTAACATCAACTCAATGCTGAACAACCAGTTCAGCCCTGAGTACGGTGTCCAATCGGGCGTTCGCGACCGATCTAAGCGCAAACGGCCGTTCCCCGGTCTTGCTTGGGCGTCGATGAAAGATACCTACGGTGCCTGTCCGATCTACTCTGATGTGCTGGAGGCGATCGAGCGTTGCTGGTGGAACGCGTTCGGTGAGTCGTACCGTGCGTATCGTGAAGATATGCTTAAACGCGACACTCTCGAACTATCACGCTACGTTGCGTCGATGGCTCGTCAAGCCGGGCTGGCTGAACTCACTCCCATTGATTTGGAGGTGCTTGCTGACCCGAACAAACTCCAGTATAAGTGGACCGAGGCCGATGTCTCGGCGAATATCCACGAGGTACTGATGCATGGCGTATCGGTCGAAAAGACTGAGCGCTTTCTCCGTTCTGTAATGCCGAGGTAATCATGCCGATTGTCGTAACTCAAGCGCATATTGATCGTGTCGGCATCGCCGCCGATCTGCTCGATGCGTCTCCTGTGTCGCTTCAAGTTCTTGGTCGCCCTACCGCGATCAACACTGTCGTCATCAAGACGTACATCGCTGCTGTTATGGAGCTCGCCTCCAAGCAAGGTGGTTCGTTGGCCGGTGTGGATATTCGTCCTTCGGTTCTGCTGAAAGACACCGCTATCTTCACCAAGCCGAAGGCGAAGTCCGCTGACGTCGAATCTGATGTCGACGTTCTGGACACGGGGATTTACTCCGTTCCTGGACTGGCTCGCAAGCCTGTCACCCACCGTTGGCCATCAGAGGGTATCTACTCTGGTGTCACAGCTCTGATGGGCGCTACCGGTTCCGGTAAGTCGATCACGCTGAACGAAAAGCTCCGTCCAGACGTCCTGATTCGTTGGGGCGAGGTGGCTGAAGCTTACGATGAGCTGGATACCGCCGTCCACATCTCGACTCTGGATGAGATGTTGATTGTGTGTATTGGCCTGGGTGCACTCGGGTTCAACGTCGCTGTTGACTCGGTTCGTCCTCTGCTGTTCCGTCTCAAAGGCGCCGCCTCTGCGGGGGGTATTGTGGCTGTGTTCTACAGCCTGTTGACCGATATCTCGAACTTGTTCACACAATACGATTGCTCTGTCGTCATGGTCGTTAACCCGATGGTTGACGCTGAGAAGATCGAGTACGTGTTCGGTCAGGTCATGGCTTCGACTGTCGGTGCGATCTTGTGTGCTGATGGCAACGTGTCCAGAACGATGTTCCGGACCAACAAAGGTCGTATTTTCAACGGTGCGGCCCCTCTTGCTGCTGACACTCACATGCCTAGCATGGATCGTCCTACCAGCATGAAGGCCCTCGATCATACCTCGATCGCCTCTGTCGCACCGCTGGAGCGTGGCTCCGTGGATACCGACGATCGCAATTCCGCTCCGCGCCGTGGCGCTAACTTCTCTCTGTAAGGGTATAAGATGTTCAACCTCAAAGTTAAAGATCTGAACGGTTCCGCTCGCGGTCTGACTCAAGCTTTCGCCATCGGCGAATTGAAGAACCAGCTGTCCGTCGGCGCGTTGCAGTTGCCGTTGCAGTTCACGCGCACGTTCTCCGCTTCCATGACCAGCGAGTTGCTTTGGG
AAGTGGGCAAGGGCAACATCGACCCAGTGATGTACGCTCGTCTGTTTTTCCAGTACGCGCAAGCTGGCGGCGCTCTGTCCGTTGATGAGCTCGTGAACCAGTTCACTGAGTATCACCAATCCACGGCCTGTAACCCTGAAATCTGGCGCAAGCTGACTGCTTACATCACCGGTTCCTCGAACCGCGCGATCAAAGCTGACGCTGTAGGCAAGGTGCCTCCAACCGCGATCCTGGAGCAGTTGCGCACTCTCGCTCCCTCGGAGCACGAGTTGTTTCACCACATCACGACCGACTTCGTCTGCCATGTGCTGTCTCCCCTCGGTTTCATCCTGCCTGACGCTGCCTACGTGTACCGCGTTGGTCGCACCGCTACGTACCCCAATTTCTACGCTCTTGTAGATTGCGTACGTGCGAGCGACCTGCGTCGTATGCTGACAGCGCTGTCGTCTGTCGATTCGAAGATGCTTCAAGCCACGTTCAAAGCCAAAGGCGCTCTTGCCCCTGCTTTGATCTCCCAGCATCTGGCTAACGCCGCCACTACTGCTTTCGAGCGGTCGCGCGGTAACTTCGATGCCAATGCTGTGGTGTCGTCCGTTCTGACCATTCTTGGTCGTCTCTGGTCGCCTTCCACCCCGAAGGAGCTCGACCCGAGTGCGCGTTTGCGCAACACCAACGGTATCGATCAGCTGCGCAGTAACCTGGCGCTGTTCATCGCGTACCAGGATATGGTCAAGCAACGCGGTCGCGCCGAAGTCATCTTCTCTGACGAGGAGCTGTCGTCGACGATCATCCCTTGGTTCATCGAGGCGATGAGCGAAGTGTCCCCGTTCAAACTGCGTCCGATCAACGAGACTACCAGCTATATCGGTCAGACCTCCGCGGTCGACCACATGGGCCAGCCGAGCCATGTTGTGGTCTACGAAGACTGGCAGTTTGCCAAGGAGATCACCGCTTTCACTCCTGTCAAGCTGGCCAACAACTCGAATCAGCGTTTCCTGGACGTTGAGCCTGGTATCTCTGATCGTATGTCGGCTACGCTGGCACCAATCGGCAACACGTTCGCGGTTTCGGCGTTCGTCAAGAACCGCACCGCCGTTTACGAGGCTGTTTCGCAGCGTGGTACAGTCAACAGCAACGGCGCGGAGATGACCCTCGGGTTCCCTTCCGTTGTTGAACGCGACTACGCTCTCGACCGTGATCCTATGGTCGCGATCGCTGCTCTGCGCACTGGTATCGTCGATGAAAGTCTCGAGGCTCGCGCTTCGAACGATCTGAAACGGTCGATGTTCAACTACTACGCGGCTGTGATGCATTACGCTGTTGCTCACAATCCTGAAGTTGTTGTTTCGGAGCACCAAGGTGTTGCCGCCGAACAAGGTTCGCTCTACCTGGTGTGGAACGTCCGCACTGAGCTGCGAATCCCTGTTGGTTACAACGCCATCGAGGGCGGTTCGATCCGTACCCCTGAGCCGTTGGAGGCGATCGCCTACAACAAGCCGATCCAACCGTCCGAGGTGCTGCAAGCCAAGGTACTGGATTTGGCTAACCACACAACCTCGATTCACATCTGGCCGTGGCATGAGGCTTCGACCGAGTTCGCGTACGAAGACGCCTACTCTGTCACCATCCGCAACAAACGCTACACCGCCGAAGTCAAGGAGTTCGAACTCCTCGGTCTCGGTCAACGTCGCGAACGTGTACGGATCCTCAAGCCTACGGTAGCCCACGCTATCATCCAGATGTGGTATTCCTGGTTCGTCGAGGACGACCGCACTTTGGCAGCTGCCCGTCGCACGTCTCGCGATGACGCCGAGAAGCTTGCTATCGACGGTCGTCGTATGCAAAACGCTGTGACCTTGCTTCGCAAGATCGAGATGATTGGGACAACCGGTATCGGTGCGTCTGCCGTCCACCTCGCGCAGTCGCGCATCGTGGATCAGATGGCCGGTCGAGGTCTCATCGACGACAGCTCCGATCTCCATGTCGGTATCAACCGTCAC
CGTATCCGCATCTGGGCCGGCCTCGCCGTTCTCCAGATGATGGGTCTCTTGAGCCGCTCCGAAGCGGAAGCTCTCACCAAGGTCCTTGGTGATAGCAACGCTCTGGGCATGGTTGTCGCCACAACCGACATTGATCCATCCCTGTAACTCTCGTAAGCTCTCATAGACCTTTCGTTATAATTCCATAAGTCCTTAGATTTCTAAGGCGAGACTCGCTTTGCGAGCGTCCAATAGGACGGCCCCCTCGGGGGCTCTCTCTCT
\ No newline at end of file
diff --git a/phi6 wt protein start stops.csv b/phi6 wt protein start stops.csv
new file mode 100644
index 0000000..e6d3a98
--- /dev/null
+++ b/phi6 wt protein start stops.csv
@@ -0,0 +1,14 @@
+protein-name,start,stop
+P8,305,754
+P12,754,1341
+P9,1341,1613
+P5a,1620,2282
+P10,3317,3445
+P6,3915,4421
+P3,4425,6372
+P13,6460,6678
+P14,7284,7472
+P7,7472,7957
+P2,7957,9954
+P4,9957,10955
+P1,10965,13274
\ No newline at end of file
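
The task described in NOTES.org (translate each protein from its inclusive, 1-based start/stop coordinates, then report what a single-nucleotide change does to the called amino acid) could be sketched in Python roughly as follows. This is only a minimal sketch under stated assumptions, not the program the notes ask for: the function names `load_ranges`, `translate`, and `annotate` are hypothetical, stop codons are rendered as an asterisk, a position that falls in two overlapping reading frames is reported once per protein, and the demo uses a made-up 14-nucleotide toy genome rather than the phi6 sequence.

```python
import csv
import io

# Standard genetic code with lowercase DNA bases; "*" marks a stop codon.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def load_ranges(csv_text):
    """Parse CSV text with the header protein-name,start,stop."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(r["protein-name"], int(r["start"]), int(r["stop"])) for r in reader]


def translate(genome, start, stop):
    """Translate genome[start..stop] (1-based, inclusive) codon by codon."""
    coding = genome[start - 1:stop].lower()
    return "".join(
        CODON_TABLE[coding[i:i + 3]] for i in range(0, len(coding) - 2, 3)
    )


def annotate(genome, ranges, pos, new_nt):
    """Return one tab-separated line per protein covering pos, else 'noncoding'."""
    genome = genome.lower()
    new_nt = new_nt.lower()
    label = f"{genome[pos - 1]}{pos}{new_nt}"  # e.g. reference nt, number, variant nt
    lines = []
    for name, start, stop in ranges:
        if start <= pos <= stop:
            offset = pos - start                      # 0-based offset into the frame
            codon_start = start - 1 + 3 * (offset // 3)
            codon = genome[codon_start:codon_start + 3]
            in_codon = offset % 3                     # which base of the codon changes
            mutated = codon[:in_codon] + new_nt + codon[in_codon + 1:]
            aa_number = offset // 3 + 1               # 1-based residue number
            lines.append(
                f"{label}\t{name}\t"
                f"{CODON_TABLE[codon]}{aa_number}{CODON_TABLE[mutated]}"
            )
    return lines or [f"{label}\tnoncoding"]


if __name__ == "__main__":
    # Toy data only: a hypothetical protein "X" spanning nucleotides 3..11.
    toy_genome = "ccatggcttaaccc"
    toy_ranges = load_ranges("protein-name,start,stop\nX,3,11\n")
    print(translate(toy_genome, 3, 11))
    print("\n".join(annotate(toy_genome, toy_ranges, 7, "a")))
    print("\n".join(annotate(toy_genome, toy_ranges, 1, "t")))
```

To run it against the real files, the genome text and the CSV would be read from disk and passed in unchanged. A synonymous change comes out in the same form as any other (for example "S34S"), since the wild-type and variant codons are simply looked up separately.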