From e2c2d6ab0d48041595c15ec01cfdb4f26ee32a57 Mon Sep 17 00:00:00 2001
From: Brian Cully
Date: Fri, 1 Jul 2022 13:57:54 -0400
Subject: Add worknotes and files from Siobain's email.

---
 ...c-Code-Amino-Acid-Codon-Chart-sidebyside-03.png | Bin 0 -> 54031 bytes
 NOTES.org                                          | 40 +++++++++++++++++++++
 phi6 RefWT_from Lele.txt                           |  1 +
 phi6 wt protein start stops.csv                    | 14 ++++++++
 4 files changed, 55 insertions(+)
 create mode 100644 Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png
 create mode 100644 NOTES.org
 create mode 100644 phi6 RefWT_from Lele.txt
 create mode 100644 phi6 wt protein start stops.csv

diff --git a/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png b/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png
new file mode 100644
index 0000000..de7a11c
Binary files /dev/null and b/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png differ
diff --git a/NOTES.org b/NOTES.org
new file mode 100644
index 0000000..ba7f44b
--- /dev/null
+++ b/NOTES.org
@@ -0,0 +1,40 @@
+* This is the phi6 genome:
+[[file:phi6 RefWT_from Lele.txt]]
+
+* CSV file
+[[file:phi6 wt protein start stops.csv]]
+
+This is a CSV file with three columns: protein name, start nucleotide, ending nucleotide
+These numbers are inclusive. Everything else in the genome that’s not in at least one of those ranges (there’s one nucleotide overlaps
+between some reading frames) isn’t protein-coding.
+
+* Standard genetic code
+[[file:Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png]]
+
+The standard genetic code that you’ve used for some of my class projects applies, we will be using the single capital letter abbreviations for
+amino acids. Because of this please use lowercase “a, c, g, t” for nucleotides. This is a chart that uses the DNA bases (no need to switch “u”
+to “t” in your head) and has the single letter amino acids. The three stop codons (taa, tag, tga) should all code for the same thing — could be
+“STOP” could be an asterisk… you can have some creative control here :-)
+
+* Test
+As a test that our coordinates are correct, can you spit out the protein sequence from each of those proteins? Each will start with a M (one with
+a V, it’s an “alternate start codon) and should stop with a stop. Please send me that as a text file.
+
+If that works I’ll get you sample input and output for what we need the program to actually do
+
+have a nucleotide number and nucleotide inputted
+print out reference sequence nt at that number, the nt number, the inputted nucleotide (Tab) the name of the protein involved OR
+“noncoding” (Tab) Amino acid called by wild type sequence, the number in the protein that amino acid is, the amino acid called by the
+inputted nucleotide being in the sequence.
+
+Something like:
+input 7500g
+
+output:
+a7500g P7 S34T
+
+(sometimes the variant nucleotide will be in a protein-coding region but won’t change the called amino acid, this is normal and fine so we’ll
+see, for example, “S34S”
+
+Thanks!
+SD diff --git a/phi6 RefWT_from Lele.txt b/phi6 RefWT_from Lele.txt new file mode 100644 index 0000000..e751e6d --- /dev/null +++ b/phi6 RefWT_from Lele.txt @@ -0,0 +1 @@ +GGAAAAAAACTTTATATAACTCTTATATAAGTGCCCTTAGCGGGGCTCCCCGGCTACGGTCGGATCCCTACGGGGAGGATAGGGTGAAAACCCCTAGTGCAAGCTGACACTCATACCTCCCAAGGTCCATGAGTCGACGCAAAGGTCCTCGAAAGCATGTTGTCCTTTCGTACAACCGAGTAGGTTCGTTGCCTTAATTGGTGACGCTTGCAGGATGAGGATGGTCCCGACGCCTAACGGACCTTGCTGCCTTCTTTCCCTGGATTGGCGGTGTTGTTCCCACTAATAATAAAGGAATACGCACATGTTGCTGCCTGTAGTAGCCCGTGCGGCCGTCCCTGCTATTGAGAGTGCCATTGCGGCTACTCCTAGCCTGGTTTCCCGAATCGCAGCCGCGATCGGTTCCAAGGCCAGCCCTTCCGCCATTTTGGCGGCGGTCAAGAGCAACCCGGTCGTCGCAGGTCTGACACTCGCTCAGATCGGAAGCACCGGTTATGACGCCTATCAGCAGCTTCTGGAGAATCATCCAGAGGTCGCCGAGATGCTGAAAGACCTGTCTTTCAAAGCCGACGAAATCCAGCCGGATTTCATCGGTAACCTCGGTCAGTACCGCGAAGAGCTGGAACTGGTCGAAGATGCTGCCCGCTTCGTGGGCGGCATGTCGAACCTGATTCGCCTGCGCCAGGCCCTGGAACTTGATATCAAGTACTACGGCCTGAAAATGCAGCTGAATGACATGGGATACCGCTCGTAATGGTTATCGGTCTCCTGAAGTATCTCACGCCTGCCGTTAAGGTGCAGATGGCTGCTCGCGCGTTGGGCCTGTCCCCCGCCGAAGTCGCTGCAATTGACGGCACGTTGGGTCGTGTCTCTGCGATGCCAGCGGTCGCGGTCGTGCTGGGAGGGAAACCTCTCTCTCTGGCCACGATCGCGTCAGTTGTGTCTGATGCAAACCCCAGTGCCACTGTTGGCGCGCTTATGCCTGCTGTACAGGGCATGGTGAGTTCCGACGAAGGCGCGAGTGCGTTGGCTAAGACCGTGGTAGGCTTCATGGAGTCCGACCCCAACAGCGATGTCCTGGTTCAACTGCTCCACAAGGTGTCAAACTTGCCGATTGTCGGCTTTGGTGACACGCAGTATGCAGACCCAGCTGACTTCTTGGCCAAGGGAGTTTTCCCTCTGATCAGGAAGCCAGAAGTAGAGGTTCAAGCTGCGCCTTTCACCTGTCGTCAGTGTGATCATGTTGATCACATCACTGATGTACCTCAAACTTCGACCTTTGTTCACAAATGCACTTCGTGCGGCTTTGTGCAGATGGTCCACCGTAAGGATGTTCCGTAATGCCATTTCCTCTGGTAAAGCAAGACCCAACCTCGAAGGCTTTCACTGAAGCCAGTGAACGCTCCACCGGCACCCAGATCCTGGACGTCGTCAAGGCCCCTATCGGCCTGTTCGGCGACGATGCCAAACACGAGTTCGTGACCCGTCAGGAACAAGCCGTCTCCGTCGTCAGCTGGGCAGTTGCTGCCGGTCTGATCGGCGAGCTGATCGGCTACCGTGGTGCGCGTTCGGGTCGCAAAGCGATCCTGGCCAACATCCCTTTCCTGGCCTAACTCCTCGTGTCCAAGGATAGCGCCTTCGCGGTGCAATACTCGCTGCGCGCCCTGGGACAAAAGGTGCGGGCAGACGGGGTAGTGGGCTCTGAGACCCGTGCGGCGCTGGATGCGCTGCCCGAGAATCAGAAGAAAGCGATTGTAGAGTTGCAAGCACTCCTACCGAAAGCACAGTCGGTCGGCAACAGCCGTGTGAGGTTCACAACAGCTGAAG
TCGACTCGGCGGTGGCGCGGATCTCGCAAAAGATAGGTGTTCCGGCTTCTTACTACCAGTTCCTGATTCCGATCGAGAACTTCGTGGTGGCCGGTGGTTTCGAAACCACCGTTTCTGGTTCCTTCCGTGGGTTGGGCCAGTTCAACCGGCAGACGTGGGATGGACTCCGTCGTTTAGGCCGTAACCTTCCTGCATTTGAGGAGGGTTCGGCACAACTGAACGCTTCTCTTTATGCAATCGGGTTCTTGTATCTTGAGAACAAGAGAGCGTACGAGGCGTCGTTCAAAGGCCGCGTTTTCACTCACGAAATCGCGTATTTGTATCACAACCAAGGCGCTCCAGCTGCCGAACAGTACCTGACTTCGGGTCGGCTCGTTTACCCGAAGCAAAGCGAGGCCGCTGTCGCGGCGGTTGCGGCTGCGAGAAACCAGCATGTCAAAGAGAGTTGGGCTTAACCCTGAACTGCATCGTGAACTGAAAATGTTCCCAGATGTCACGAAGGGTGGCACGTTCGACATAACCATCCGGTCGACTACCGAGAACGGTGCTTTTTGGGCGAACTACGAAGGTAGAACGTCCTTGGTCACCGTCCCGGACGTGAAGACAGCTATCGAGTTTTTGATTAAACTCTGCCGTCGACACAAGTTGTCCAATCAGGTGAACACGCGAACGCTTCTCCGCGATTTGCAACGAACGTTGCAGGAATGTGAATGCCAGTCTCATCATGTGCCGTTGTCCAGCCCCTTCATGCATCTCAGATTTGCGTAAAGCTGATCGGAAGCTATGAAAGTAAGCTGAGCGACACGGAAGTTATTGAAGCAGCTATTCAAGCTCTCATAGGCTTGGAAAGCCCGGCATCGATGTCGTTCCATGTCGCGCCAGACGCGGCCACCGATATGTATCTTGATCTGATCGAAATCTACTCCCCGTCGTCAGTCGGGATACATCTCGTCCTGCCATAAGCGCTGTCTGTAGCGTGCATAAACAGATAGATCGCCTTTTTAGGTAACCGCGGATTGATCACCGTTCCGAGCTTGCTTGGATAAACAAGTCCTTGTATAACAAGGCGAGACTCACTATGTGAGCGTCCAATAGGACGGCCCCTTCGGGGGCTCTCTCTCTGGAAAAAAACTTTATATATTTTCTACGTTGAGCTCCGTATAAAGCTCCGTGCCCGCACACGCCCGCTACGGCGGTATTGTCTAACCGGCGACAATAAACAGCTGCTGCTTACAAGCTTACAGTTGACCGGAGTCTCGGCGTGCAGCGCCTAAACACGGGAAACCGTGGTGGTGACACCCTCTGCTGAGGGCTTATAGTGGTGATATTCCTCCCCAGGAGTTCCCTCCCATTTCGGCCACTCGCGCTCTAACCATGAGCGCGGTCTCTTTGAGAGTGTCGCTCTTCTGCCTACGCGCTCATTCGTTCCCTCGAGTTGACGCTTCAAGCAGGTGGACACCTCCTCAACCCATAATAAGAGATCCATTCAATGGACAACATCCTCGATCCCCTTAAGGCTCCGTTTTCTTCGGAAGCCGCCGCGAAAACCACCGCTGCCAAAATCGCTGTGGTATACGCGTTGGTCGGTCTGGTTGGCGGTCTGCTGCTCACCAAGTAAGGTGTAGTATGCATGACACGCGACCGCTCCGAACCGAGAGACCCATGGCCAGCAAGAATACGAATGACCGGGTTTTCGACCGGTTTCATTCTTCTCTCGTTCGGCTGTGGCATTCTGCAAACCAGCGCATGCGCGGTTCTTTCTCCGTCGTGGTTCGGGTCGAGCACTCTTTAGTGCTCCTCATCGGCTACACGGTGGTAGGCGCGACTGTCGCACACTTCGTGAGGTGACTATGTTAGCTTTCGTAGCGCGAGCGGTCGTACTTTACTCTGCTGGTGTAGTCGTGGGCATCGCCTACGATCACGTCACAGGAAGGAAACGTCGCCATGACTAAGTGGAAGATGTACATCGCCGGCGTCGTTCTGGTCATCGTAGGGGCAGT
TACTCATGCTCCACAGCTGATGGTCCAGGGCATGACTACGCTCGCGACTCAAGCGGCCGCAGACGCGGCCGATGGTGGAGGTGCTCAGTGAGTATCTTCTCCTCGTTGTTCAAGGTCATCAAGAAGGTAATCTCGAAGGTGGTCGCCACCCTTAAGAAAATCTTCAAGAAGATCTGGCCGTTGCTACTTATTGTGGCAATTATCTACTTCGCTCCCTACCTCGCCGGGTTCTTCACTTCCGCCGGGTTCACTGGGATCGGAGGGATCTTCTCCTCTATCGCAACCACCATCACGCCTACGCTGACGTCGTTCCTGTCGACTGCGTGGTCTGGTGTGGGCTCTCTTGCCTCCACGGCTTGGTCTGGGTTCCAATCTCTCGGGATGGGTACTCAGCTCGCTGTCGTGAGTGGCGCGGCTGCTCTGATTGCACCTGAGGAAACGGCTCAACTGGTTACCGAAATCGGTACCACCGTAGGTGATATCGCCGGTACGATTATCGGCGGTGTCGCCAAGGCACTCCCGGGTTGGATCTGGATCGCCGCAGGCGGTCTTGCCGTCTGGGCCCTCTGGCCGTCATCTGACAGTAAGGAGTAGCAAATGCGCTACCAAGGCATCAACGAGTGGCTGGGTGGAGCCAAGAAACTCACCACCGCAAACGGTGAGATTGGCGCTATCTACCTCTCCGCTGCTCCTCCCACCGACGCCGCACGTGCGGACGCTAAGGCGGTGGATTTTACTGCTGGTTGGCCAAGCGCGATCGTTGATCGCGCTGATGCCACTCGTGCCAAGCAGAACTACCTGTGGGTTGGCGATAACGTTGTGCACATCGGGGCTAAACACGTTCCACTCCTCGATCTGTGGGGCGGGACAGGTGATGCCTGGCAGCAGTTCGTTGGCTATGCCTGCCCAATGCTCGACCTTTGTCGTGCGTGGGGCCTGGGTTATGCCAGCGCTTCTGTAACCACCGGCTCGTTGCAGGGCTATCAGCCATCGGCGTTCTTGGACGCTGAGCAACAGCAGTTCGCGAAGGACAATCTCAACCTGTATGGCGATAACTGCCTTGACCTGGCCACCAGTTCGTCCGCTCAGCGGGCATTTCTGGAGCAGTGCATGGGCTGCGCCTTGCCGGAGGATTGCGTCTTCGGTTGGTATGTGAAAATGGATTGGGAAGGCTCGGCAGTTGCCGACGCCTACGCTGCGATCCGTGTCCAAGGGTTCGCCACTGTAATGGCACCTTGGCAGTCGGTTGGCGGTGCTGGCTACGTTTACGCTCGTGTGCCTCAAAAAGGCGCGTGGATGGGTGTGAACCTGCTTGCCTATGTCCACGGCACCAGTGGCCAGCCTGCTTATGGCATTCCGATGACCCTCTCGGGGTTCACCGGTAACATGGGTCAGGTGGCTTCGAAGTGGCTCATGCTTCCTCTCCTGATGATCGTCGACCCTCATGTCGTCCAGATTTTGGCCGCACTGGGGGTTAAACGTGGGACCAAATCGGACCCACGGACGACCGACGTGTACGCTGATCCGAAGGTTCCGGCTAGCCGTATTTCCGGGCCGATGATCAATGCAACGGTTGCTCCTCCTGCGACGATCCCCGCTACCATTCCGGTGCCTCTGGCGCCGCTCGGTGGCGCGGGTGGCCCTGGCGCTCAGGGTTTCCAGGTATACCCCGTTTTCACCTGGGGTCTGCCTGAGTTCATGACCGACGTGACCATCGAAGGTACCGTCACTGCGGACTCCAACGGTCTGCATGTCGTGGACGACGTGCGTAACTACGTCTGGAACGGTACTGCTCTTGCTGCAATTGAGCAGGTCAATGCCGCTGACGGTCGAGTTACGCTCACTGACTCTGAGCGTGCTCAACTCGCCTCGTTGACTGTTCGAACCGCATCGTTGCGTCAGCAGCTGTCGGTTGGGGCAGACCCCTTGTCCAAGACGTCGATCTGGCGTCAGGCTCAAAAGGCCGATTATGATCTGCTGTCTCAACAGATCATCGAAGC
GGACACGGTGAAAAACCTACCTGCTGTGACGTTCGCTCAGGCGAACAAAGCGGCAGGCGGTCAATCCGAGACGTTGTGGCACCAGATGTATCGGGTCAACGATATCGCTGGCGATCAAGTCACCGCAATCCAAATCACTGGTACGATGGCGACTGGCATTCGCTGGTCGGCAACTGCTGGCGGTCTGGTCGTCGATGCTGACGAGCAAGATGCGGTGATCGCGATTTCGTCCGGTAAGCCGGTCAAGAACAGCTCCGACCTTCCTACGGCCGACGCTGTGAACTACTTGTTCGGTATCACTGCGGACGATATGCCTGGTATCGTTTCCTCGCAAAAGGAAATGAACAGCGAGTTTGAAGAAGGTTTCCTTCAGAAAGCTCGTCTCTGGAACCCACGTAAGCTCGTCGAAAACGTCCAGAATGCCTATTTCCTGATGGTGTACGCTCGCGATCGGAAGCAATTCCACTCGTTGGTGGCATCCTCTCTGGCGATGGCCAAGCTGGCGCGTAAGTACGCGGGCCTGTAAGGAGTCGTATGGCTGCTGAACAATCCTCCGGTATGAGCGCGTTCACCAAAGGCACGATCGTGATCTGCCTGGTGGTGGTCGCCCTCAATCTCATCGGGAAGTGACCATGGTACCGCTAAAAATTAGCACGCTGGAGTCCCAGCTGCAACCGCTTGTTAAGTTGGTTGCAACCGAGACCCCCGGTGCCCTCGTAGCGTATGCTCGAGGGTTATCGAGTGCCGACCGCTCGCGGTTGTACAGACTGCTTCGTTCTTTGGAGCAGGCCATCCCGAAGCTGTCGTCGGCTGTCGTTTCGGCCACGACGTTGGCAGCGCGAGGTCTCTAATGGAGACCAACCCGCTGCTTCAGCTTGAGTCGCTGTCGTTACGCTTGCGAGACATGCCTCGTTCGCGCCTTTCTGCGCTGATGAAGAACATGTCGTATGAGCAGCTGCAGTCGTTGTATAGCACCAGCGTAAAAGTTGGCGCTGTGCTCGATAGCGTTTCAATGCAGTTGCTTGAGGCGTCACAAACCGCTCAATCGGGAACTCGACTGATGACACCGCAGGAGTACGTCGCTGCTGGTGGAGGTCGTGTGTACGTTAAATAAGTCCTTAGATTTCTAAGGCGAGACTCGCTTTGCGAGCATCCAATAGGATGGCCCCTTCGGGGGCTCTCTCTCTGTAAAAAAACTTTATATAGTCTTTTACCTGGATTCTCTGTGCAGAACTGAGAACTGAACGCTACCCTTGCGGGGGATGCGGCCCCGGGCTACGGCCTAGGGATCCAGCGTGGCTCACGGGCCGCCGGAACTGACGTCCGTAACAAACGTCCTTGGGATAGGAGTACAGTAACCACTCTTAGATACCCGATTCCCCTGTTTCTGCGTGGAAGCCTTTCGACAGCTACCCAGCTTAGATCGTCTGGTGCCCTAAATCCCTGGAGATAACCAATGGCTACATTACAAGATGTGCATCTACGGGTGAATGACCGGGTAACACCGGTGTACTTCACTGCTCGCTCGTTTCTGCTCGTTTCTCCGAAACGTGCGGGGCAAGCAACGTTCCTCGCTCGCGAGGAGGGTACTGACAATCCTGTCGTTACCTGTCATGTATCCGACTTTTATAAGGACGGTGTGTAATGACTTTGTACCTGGTCCCTCCGCTGGATTCGGCGGACAAAGAGTTGCCTGCTCTGGCTTCCAAAGCTGGGGTAACGCTTCTCGAGATCGAGTTTCTTCACGAGCTCTGGCCTCACCTCAGTGGTGGTCAGATCGTGATCGCCGCTCTCAACGCCAACAATCTGGCCATCCTCAACCGTCACATGTCCACTCTGTTGGTCGAGTTGCCGGTTGCTGTGATGGCCGTTCCCGGTGCTAGCTATCGTTCCGATTGGAACATGATCGCTCACGCACTCCCGTCTGAGGATTGGATCACTTTGTCCAACAAGATGCTGAAAAGCGGCTTGCTGGCGAACGATACCGTCCAGGGCGAGAA
GCGCTCCGGCGCTGAGCCGCTGTCGCCGAACGTGTACACCGATGCGCTCTCGCGTCTCGGTATCGCGACGGCCCATGCTATCCCCGTTGAACCCGAACAACCGTTCGATGTCGATGAGGTAAGCGCCTGATGCCGAGGAGAGCTCCCGCGTTCCCTCTGAGCGATATCAAGGCTCAGATGCTGTTCGCAAATAACATCAAGGCCCAACAAGCCTCGAAGCGTAGCTTCAAAGAGGGGGCGATTGAAACGTACGAAGGGCTGCTTTCAGTAGACCCTCGGTTTTTGAGTTTCAAGAACGAGCTCTCTCGGTATCTGACCGACCACTTCCCGGCGAACGTCGACGAGTATGGTCGTGTTTATGGAAACGGTGTTCGTACCAACTTCTTTGGTATGCGCCACATGAACGGGTTTCCAATGATCCCCGCGACGTGGCCACTCGCTTCCAACCTTAAGAAACGTGCCGACGCTGACCTAGCCGATGGCCCTGTTTCTGAGCGCGACAATCTACTCTTTCGCGCCGCAGTCCGGCTTATGTTTTCAGATCTAGAGCCTGTTCCGCTGAAGATCCGTAAAGGATCGTCAACCTGCATCCCGTATTTTTCTAACGATATGGGAACGAAGATCGAGATCGCCGAGCGCGCTCTTGAGAAAGCGGAAGAAGCTGGCAATCTGATGCTGCAAGGTAAGTTTGATGACGCCTACCAGCTCCACCAAATGGGTGGTGCCTATTACGTCGTGTATCGTGCACAATCGACCGATGCTATCACACTCGACCCTAAGACCGGAAAATTCGTGTCAAAGGATCGTATGGTCGCTGACTTCGAATACGCAGTCACGGGCGGTGAGCAAGGCTCGCTGTTCGCTGCTTCGAAGGATGCCTCTCGTTTGAAGGAACAGTACGGGATAGATGTCCCGGACGGGTTTTTCTGCGAGCGGCGTCGTACCGCTATGGGTGGTCCGTTCGCGTTGAACGCTCCTATCATGGCCGTTGCGCAACCTGTGCGAAACAAAATTTACTCCAAGTACGCTTACACCTTTCACCATACTACTCGTCTTAATAAGGAGGAAAAGGTGAAAGAGTGGTCGTTGTGCGTCGCTACTGACGTATCCGACCACGACACGTTCTGGCCTGGATGGCTGCGGGATCTCATCTGTGATGAACTGCTCAACATGGGGTACGCTCCGTGGTGGGTTAAGTTGTTCGAGACCTCGCTCAAACTGCCCGTTTACGTGGGCGCTCCTGCTCCTGAGCAGGGCCACACGTTGTTGGGTGATCCGTCCAACCCTGATCTCGAAGTTGGTCTCTCGTCCGGACAAGGGGCGACCGACCTCATGGGCACGTTGCTCATGAGTATCACCTACCTGGTGATGCAACTTGATCACACCGCTCCTCACCTCAACAGTCGAATCAAGGACATGCCATCAGCATGCCGCTTTCTTGACTCGTATTGGCAAGGACACGAGGAGATCCGTCAGATCTCAAAATCTGATGATGCTATGCTTGGCTGGACCAAAGGTCGTGCTTTGGTTGGTGGTCATCGTTTGTTCGAGATGCTGAAAGAGGGTAAGGTTAACCCCTCACCTTACATGAAGATCTCCTACGAGCACGGTGGCGCCTTCCTTGGTGACATCCTGCTTTACGACTCGCGTCGTGAGCCTGGCTCTGCCATCTTCGTTGGTAACATCAACTCAATGCTGAACAACCAGTTCAGCCCTGAGTACGGTGTCCAATCGGGCGTTCGCGACCGATCTAAGCGCAAACGGCCGTTCCCCGGTCTTGCTTGGGCGTCGATGAAAGATACCTACGGTGCCTGTCCGATCTACTCTGATGTGCTGGAGGCGATCGAGCGTTGCTGGTGGAACGCGTTCGGTGAGTCGTACCGTGCGTATCGTGAAGATATGCTTAAACGCGACACTCTCGAACTATCACGCTACGTTGCGTCGATGGCTCGTCAAGCCGGGCTGGCTGAACTCACTCCCATTGATTTGGAGGTGCTTGC
TGACCCGAACAAACTCCAGTATAAGTGGACCGAGGCCGATGTCTCGGCGAATATCCACGAGGTACTGATGCATGGCGTATCGGTCGAAAAGACTGAGCGCTTTCTCCGTTCTGTAATGCCGAGGTAATCATGCCGATTGTCGTAACTCAAGCGCATATTGATCGTGTCGGCATCGCCGCCGATCTGCTCGATGCGTCTCCTGTGTCGCTTCAAGTTCTTGGTCGCCCTACCGCGATCAACACTGTCGTCATCAAGACGTACATCGCTGCTGTTATGGAGCTCGCCTCCAAGCAAGGTGGTTCGTTGGCCGGTGTGGATATTCGTCCTTCGGTTCTGCTGAAAGACACCGCTATCTTCACCAAGCCGAAGGCGAAGTCCGCTGACGTCGAATCTGATGTCGACGTTCTGGACACGGGGATTTACTCCGTTCCTGGACTGGCTCGCAAGCCTGTCACCCACCGTTGGCCATCAGAGGGTATCTACTCTGGTGTCACAGCTCTGATGGGCGCTACCGGTTCCGGTAAGTCGATCACGCTGAACGAAAAGCTCCGTCCAGACGTCCTGATTCGTTGGGGCGAGGTGGCTGAAGCTTACGATGAGCTGGATACCGCCGTCCACATCTCGACTCTGGATGAGATGTTGATTGTGTGTATTGGCCTGGGTGCACTCGGGTTCAACGTCGCTGTTGACTCGGTTCGTCCTCTGCTGTTCCGTCTCAAAGGCGCCGCCTCTGCGGGGGGTATTGTGGCTGTGTTCTACAGCCTGTTGACCGATATCTCGAACTTGTTCACACAATACGATTGCTCTGTCGTCATGGTCGTTAACCCGATGGTTGACGCTGAGAAGATCGAGTACGTGTTCGGTCAGGTCATGGCTTCGACTGTCGGTGCGATCTTGTGTGCTGATGGCAACGTGTCCAGAACGATGTTCCGGACCAACAAAGGTCGTATTTTCAACGGTGCGGCCCCTCTTGCTGCTGACACTCACATGCCTAGCATGGATCGTCCTACCAGCATGAAGGCCCTCGATCATACCTCGATCGCCTCTGTCGCACCGCTGGAGCGTGGCTCCGTGGATACCGACGATCGCAATTCCGCTCCGCGCCGTGGCGCTAACTTCTCTCTGTAAGGGTATAAGATGTTCAACCTCAAAGTTAAAGATCTGAACGGTTCCGCTCGCGGTCTGACTCAAGCTTTCGCCATCGGCGAATTGAAGAACCAGCTGTCCGTCGGCGCGTTGCAGTTGCCGTTGCAGTTCACGCGCACGTTCTCCGCTTCCATGACCAGCGAGTTGCTTTGGGAAGTGGGCAAGGGCAACATCGACCCAGTGATGTACGCTCGTCTGTTTTTCCAGTACGCGCAAGCTGGCGGCGCTCTGTCCGTTGATGAGCTCGTGAACCAGTTCACTGAGTATCACCAATCCACGGCCTGTAACCCTGAAATCTGGCGCAAGCTGACTGCTTACATCACCGGTTCCTCGAACCGCGCGATCAAAGCTGACGCTGTAGGCAAGGTGCCTCCAACCGCGATCCTGGAGCAGTTGCGCACTCTCGCTCCCTCGGAGCACGAGTTGTTTCACCACATCACGACCGACTTCGTCTGCCATGTGCTGTCTCCCCTCGGTTTCATCCTGCCTGACGCTGCCTACGTGTACCGCGTTGGTCGCACCGCTACGTACCCCAATTTCTACGCTCTTGTAGATTGCGTACGTGCGAGCGACCTGCGTCGTATGCTGACAGCGCTGTCGTCTGTCGATTCGAAGATGCTTCAAGCCACGTTCAAAGCCAAAGGCGCTCTTGCCCCTGCTTTGATCTCCCAGCATCTGGCTAACGCCGCCACTACTGCTTTCGAGCGGTCGCGCGGTAACTTCGATGCCAATGCTGTGGTGTCGTCCGTTCTGACCATTCTTGGTCGTCTCTGGTCGCCTTCCACCCCGAAGGAGCTCGACCCGAGTGCGCGTTTGCGCAACACCAACGGTATCGATCAGCTGCGCAGTAACCT
GGCGCTGTTCATCGCGTACCAGGATATGGTCAAGCAACGCGGTCGCGCCGAAGTCATCTTCTCTGACGAGGAGCTGTCGTCGACGATCATCCCTTGGTTCATCGAGGCGATGAGCGAAGTGTCCCCGTTCAAACTGCGTCCGATCAACGAGACTACCAGCTATATCGGTCAGACCTCCGCGGTCGACCACATGGGCCAGCCGAGCCATGTTGTGGTCTACGAAGACTGGCAGTTTGCCAAGGAGATCACCGCTTTCACTCCTGTCAAGCTGGCCAACAACTCGAATCAGCGTTTCCTGGACGTTGAGCCTGGTATCTCTGATCGTATGTCGGCTACGCTGGCACCAATCGGCAACACGTTCGCGGTTTCGGCGTTCGTCAAGAACCGCACCGCCGTTTACGAGGCTGTTTCGCAGCGTGGTACAGTCAACAGCAACGGCGCGGAGATGACCCTCGGGTTCCCTTCCGTTGTTGAACGCGACTACGCTCTCGACCGTGATCCTATGGTCGCGATCGCTGCTCTGCGCACTGGTATCGTCGATGAAAGTCTCGAGGCTCGCGCTTCGAACGATCTGAAACGGTCGATGTTCAACTACTACGCGGCTGTGATGCATTACGCTGTTGCTCACAATCCTGAAGTTGTTGTTTCGGAGCACCAAGGTGTTGCCGCCGAACAAGGTTCGCTCTACCTGGTGTGGAACGTCCGCACTGAGCTGCGAATCCCTGTTGGTTACAACGCCATCGAGGGCGGTTCGATCCGTACCCCTGAGCCGTTGGAGGCGATCGCCTACAACAAGCCGATCCAACCGTCCGAGGTGCTGCAAGCCAAGGTACTGGATTTGGCTAACCACACAACCTCGATTCACATCTGGCCGTGGCATGAGGCTTCGACCGAGTTCGCGTACGAAGACGCCTACTCTGTCACCATCCGCAACAAACGCTACACCGCCGAAGTCAAGGAGTTCGAACTCCTCGGTCTCGGTCAACGTCGCGAACGTGTACGGATCCTCAAGCCTACGGTAGCCCACGCTATCATCCAGATGTGGTATTCCTGGTTCGTCGAGGACGACCGCACTTTGGCAGCTGCCCGTCGCACGTCTCGCGATGACGCCGAGAAGCTTGCTATCGACGGTCGTCGTATGCAAAACGCTGTGACCTTGCTTCGCAAGATCGAGATGATTGGGACAACCGGTATCGGTGCGTCTGCCGTCCACCTCGCGCAGTCGCGCATCGTGGATCAGATGGCCGGTCGAGGTCTCATCGACGACAGCTCCGATCTCCATGTCGGTATCAACCGTCACCGTATCCGCATCTGGGCCGGCCTCGCCGTTCTCCAGATGATGGGTCTCTTGAGCCGCTCCGAAGCGGAAGCTCTCACCAAGGTCCTTGGTGATAGCAACGCTCTGGGCATGGTTGTCGCCACAACCGACATTGATCCATCCCTGTAACTCTCGTAAGCTCTCATAGACCTTTCGTTATAATTCCATAAGTCCTTAGATTTCTAAGGCGAGACTCGCTTTGCGAGCGTCCAATAGGACGGCCCCCTCGGGGGCTCTCTCTCT \ No newline at end of file diff --git a/phi6 wt protein start stops.csv b/phi6 wt protein start stops.csv new file mode 100644 index 0000000..e6d3a98 --- /dev/null +++ b/phi6 wt protein start stops.csv @@ -0,0 +1,14 @@ +protein-name,start,stop +P8,305,754 +P12,754,1341 +P9,1341,1613 +P5a,1620,2282 +P10,3317,3445 +P6,3915,4421 +P3,4425,6372 +P13,6460,6678 +P14,7284,7472 +P7,7472,7957 +P2,7957,9954 +P4,9957,10955 +P1,10965,13274 \ No 
newline at end of file -- cgit v1.3
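
The two tasks described in NOTES.org (translating each coordinate range, then annotating a single-nucleotide variant) could be sketched roughly as below. This is a minimal illustration and not part of the commit. The function names (`translate`, `annotate`, `load_proteins`) are hypothetical, `*` is one arbitrary choice for the stop symbol, and the sketch assumes the CSV coordinates are 1-based and inclusive as the notes state. A position that falls in two overlapping reading frames yields one output line per protein.

```python
import csv
from itertools import product

# Standard genetic code, DNA bases in lowercase as the notes request.
# Codons are enumerated in t, c, a, g order; "*" marks a stop (an assumption).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def load_proteins(path):
    """Read (name, start, stop) rows from a CSV with the header
    protein-name,start,stop (hypothetical helper)."""
    with open(path, newline="") as handle:
        return [(row["protein-name"], int(row["start"]), int(row["stop"]))
                for row in csv.DictReader(handle)]


def translate(genome, start, stop):
    """Translate the inclusive, 1-based range start..stop codon by codon.
    Unknown codons become "X"; a trailing partial codon is ignored."""
    cds = genome[start - 1:stop].lower()
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X")
                   for i in range(0, len(cds) - 2, 3))


def annotate(genome, proteins, pos, new_nt):
    """Return tab-separated lines such as 'a7500g<TAB>P7<TAB>S34T',
    or a single '...<TAB>noncoding' line if pos is in no listed range."""
    genome = genome.lower()
    new_nt = new_nt.lower()
    label = f"{genome[pos - 1]}{pos}{new_nt}"
    mutant = genome[:pos - 1] + new_nt + genome[pos:]
    lines = []
    for name, start, stop in proteins:
        if start <= pos <= stop:
            codon_index = (pos - start) // 3          # 0-based codon within the protein
            c0 = start - 1 + 3 * codon_index          # 0-based offset of that codon
            wild = CODON_TABLE.get(genome[c0:c0 + 3], "X")
            changed = CODON_TABLE.get(mutant[c0:c0 + 3], "X")
            lines.append(f"{label}\t{name}\t{wild}{codon_index + 1}{changed}")
    return lines or [f"{label}\tnoncoding"]


if __name__ == "__main__":
    # Toy data only: a 13-nt "genome" with one made-up protein at 3..11.
    toy_genome = "CCATGGCTTAAGG"
    toy_proteins = [("toy", 3, 11)]
    print(translate(toy_genome, 3, 11))
    for line in annotate(toy_genome, toy_proteins, 6, "a"):
        print(line)
    # With the real files, something like:
    #   genome = open("phi6 RefWT_from Lele.txt").read().strip()
    #   proteins = load_proteins("phi6 wt protein start stops.csv")
```

Building the variant sequence once and re-reading the affected codon from it keeps the wild-type and variant lookups symmetric, so a synonymous change simply prints the same letter on both sides of the residue number.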