-rw-r--r-- Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png  bin 0 -> 54031 bytes
-rw-r--r-- NOTES.org  40
-rw-r--r-- phi6 RefWT_from Lele.txt  1
-rw-r--r-- phi6 wt protein start stops.csv  14
4 files changed, 55 insertions, 0 deletions
diff --git a/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png b/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png
new file mode 100644
index 0000000..de7a11c
--- /dev/null
+++ b/Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png
Binary files differ
diff --git a/NOTES.org b/NOTES.org
new file mode 100644
index 0000000..ba7f44b
--- /dev/null
+++ b/NOTES.org
@@ -0,0 +1,40 @@
+* This is the phi6 genome:
+[[file:phi6 RefWT_from Lele.txt]]
+
+* CSV file
+[[file:phi6 wt protein start stops.csv]]
+
+This is a CSV file with three columns: protein name, start nucleotide, and end nucleotide.
+These positions are inclusive. Some adjacent reading frames overlap by one nucleotide. Any part of the genome that is not in at least one
+of those ranges isn’t protein-coding.
+
+* Standard genetic code
+[[file:Genetic-Code-Amino-Acid-Codon-Chart-sidebyside-03.png]]
+
+The standard genetic code that you’ve used for some of my class projects applies. We will be using the single capital letter abbreviations for
+amino acids, so please use lowercase “a, c, g, t” for nucleotides. This chart uses the DNA bases (no need to switch “u”
+to “t” in your head) and has the single-letter amino acids. The three stop codons (taa, tag, tga) should all code for the same thing: it could be
+“STOP”, it could be an asterisk… you can have some creative control here :-)
+
+* Test
+As a test that our coordinates are correct, can you spit out the protein sequence for each of those proteins? Each will start with an M (one with
+a V, it’s an “alternate start codon”) and should end with a stop. Please send me that as a text file.
+
+If that works, I’ll get you sample input and output for what we need the program to actually do:
+
+The input is a nucleotide number and a nucleotide.
+Print out the reference sequence nt at that number, the nt number, the inputted nucleotide (Tab) the name of the protein involved OR
+“noncoding” (Tab) the amino acid called by the wild type sequence, the number of that amino acid in the protein, the amino acid called when the
+inputted nucleotide is in the sequence.
+
+Something like:
+input 7500g
+
+output:
+a7500g P7 S34T
+
+(Sometimes the variant nucleotide will be in a protein-coding region but won’t change the called amino acid. This is normal and fine, so we’ll
+see, for example, “S34S”.)
+
+Thanks!
+SD
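
The request in NOTES.org (translate each protein, then annotate a single-nucleotide change) could be sketched roughly as follows. This is only an illustration under stated assumptions, not code from this commit: the function names `translate`, `parse_variant`, and `annotate` are hypothetical, positions are assumed to be 1-based as well as inclusive, stop codons are shown as an asterisk, and a position covered by two overlapping reading frames is assumed to produce one output line per protein. The toy genome and the protein "X" are invented so the example is self-contained; they are not phi6 data.

```python
import re
from itertools import product

# Standard genetic code, built from the usual t/c/a/g ordering of the chart.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(genome, start, stop):
    """Translate the inclusive range start..stop (assumed 1-based)."""
    cds = genome[start - 1:stop].lower()
    # Any incomplete trailing codon is ignored; unknown codons become "?".
    return "".join(CODON_TABLE.get(cds[i:i + 3], "?")
                   for i in range(0, len(cds) - 2, 3))


def parse_variant(text):
    """Split input such as '7500g' into (7500, 'g')."""
    match = re.fullmatch(r"(\d+)([acgt])", text.strip().lower())
    if match is None:
        raise ValueError(f"expected something like 7500g, got {text!r}")
    return int(match.group(1)), match.group(2)


def annotate(genome, proteins, position, new_nt):
    """Return tab-separated lines like 'a7500g<Tab>P7<Tab>S34T'."""
    ref_nt = genome[position - 1].lower()
    label = f"{ref_nt}{position}{new_nt}"
    lines = []
    for name, start, stop in proteins:
        if not start <= position <= stop:
            continue
        offset = position - start      # 0-based offset into the reading frame
        frame = offset % 3             # 0, 1 or 2: place within the codon
        codon_start = position - frame # 1-based position of the codon's first nt
        ref_codon = genome[codon_start - 1:codon_start + 2].lower()
        var_codon = ref_codon[:frame] + new_nt + ref_codon[frame + 1:]
        ref_aa = CODON_TABLE.get(ref_codon, "?")
        var_aa = CODON_TABLE.get(var_codon, "?")
        lines.append(f"{label}\t{name}\t{ref_aa}{offset // 3 + 1}{var_aa}")
    return lines or [f"{label}\tnoncoding"]


# Invented toy data: protein "X" covers positions 3..11 (atg gct taa).
toy_genome = "CCATGGCTTAAGG"
toy_proteins = [("X", 3, 11)]
print(translate(toy_genome, 3, 11))
for query in ("6a", "8c", "1t"):
    print(*annotate(toy_genome, toy_proteins, *parse_variant(query)), sep="\n")
```

Lowercasing the genome on the way in keeps nucleotides visually distinct from the capital-letter amino acids, as the notes ask, even though the reference file stores the sequence in uppercase.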
diff --git a/phi6 RefWT_from Lele.txt b/phi6 RefWT_from Lele.txt
new file mode 100644
index 0000000..e751e6d
--- /dev/null
+++ b/phi6 RefWT_from Lele.txt
@@ -0,0 +1 @@
+GGAAAAAAACTTTATATAACTCTTATATAAGTGCCCTTAGCGGGGCTCCCCGGCTACGGTCGGATCCCTACGGGGAGGATAGGGTGAAAACCCCTAGTGCAAGCTGACACTCATACCTCCCAAGGTCCATGAGTCGACGCAAAGGTCCTCGAAAGCATGTTGTCCTTTCGTACAACCGAGTAGGTTCGTTGCCTTAATTGGTGACGCTTGCAGGATGAGGATGGTCCCGACGCCTAACGGACCTTGCTGCCTTCTTTCCCTGGATTGGCGGTGTTGTTCCCACTAATAATAAAGGAATACGCACATGTTGCTGCCTGTAGTAGCCCGTGCGGCCGTCCCTGCTATTGAGAGTGCCATTGCGGCTACTCCTAGCCTGGTTTCCCGAATCGCAGCCGCGATCGGTTCCAAGGCCAGCCCTTCCGCCATTTTGGCGGCGGTCAAGAGCAACCCGGTCGTCGCAGGTCTGACACTCGCTCAGATCGGAAGCACCGGTTATGACGCCTATCAGCAGCTTCTGGAGAATCATCCAGAGGTCGCCGAGATGCTGAAAGACCTGTCTTTCAAAGCCGACGAAATCCAGCCGGATTTCATCGGTAACCTCGGTCAGTACCGCGAAGAGCTGGAACTGGTCGAAGATGCTGCCCGCTTCGTGGGCGGCATGTCGAACCTGATTCGCCTGCGCCAGGCCCTGGAACTTGATATCAAGTACTACGGCCTGAAAATGCAGCTGAATGACATGGGATACCGCTCGTAATGGTTATCGGTCTCCTGAAGTATCTCACGCCTGCCGTTAAGGTGCAGATGGCTGCTCGCGCGTTGGGCCTGTCCCCCGCCGAAGTCGCTGCAATTGACGGCACGTTGGGTCGTGTCTCTGCGATGCCAGCGGTCGCGGTCGTGCTGGGAGGGAAACCTCTCTCTCTGGCCACGATCGCGTCAGTTGTGTCTGATGCAAACCCCAGTGCCACTGTTGGCGCGCTTATGCCTGCTGTACAGGGCATGGTGAGTTCCGACGAAGGCGCGAGTGCGTTGGCTAAGACCGTGGTAGGCTTCATGGAGTCCGACCCCAACAGCGATGTCCTGGTTCAACTGCTCCACAAGGTGTCAAACTTGCCGATTGTCGGCTTTGGTGACACGCAGTATGCAGACCCAGCTGACTTCTTGGCCAAGGGAGTTTTCCCTCTGATCAGGAAGCCAGAAGTAGAGGTTCAAGCTGCGCCTTTCACCTGTCGTCAGTGTGATCATGTTGATCACATCACTGATGTACCTCAAACTTCGACCTTTGTTCACAAATGCACTTCGTGCGGCTTTGTGCAGATGGTCCACCGTAAGGATGTTCCGTAATGCCATTTCCTCTGGTAAAGCAAGACCCAACCTCGAAGGCTTTCACTGAAGCCAGTGAACGCTCCACCGGCACCCAGATCCTGGACGTCGTCAAGGCCCCTATCGGCCTGTTCGGCGACGATGCCAAACACGAGTTCGTGACCCGTCAGGAACAAGCCGTCTCCGTCGTCAGCTGGGCAGTTGCTGCCGGTCTGATCGGCGAGCTGATCGGCTACCGTGGTGCGCGTTCGGGTCGCAAAGCGATCCTGGCCAACATCCCTTTCCTGGCCTAACTCCTCGTGTCCAAGGATAGCGCCTTCGCGGTGCAATACTCGCTGCGCGCCCTGGGACAAAAGGTGCGGGCAGACGGGGTAGTGGGCTCTGAGACCCGTGCGGCGCTGGATGCGCTGCCCGAGAATCAGAAGAAAGCGATTGTAGAGTTGCAAGCACTCCTACCGAAAGCACAGTCGGTCGGCAACAGCCGTGTGAGGTTCACAACAGCTGAAGTCGACTCGGCGGTGGCGCGGATCTCGCAAAAGATAGGTGTTCCGGCTTCTTACTACCAGTTCCTGATTCCGATCGAGAACTTCGTGGTGGCCGGTGGTTTCGAAACCACCGTTTCTGGTTCCTTCCGTGGGTTGGGCCAGTTCAACCGGCAGACGTGGGATGGACTCCGTCG
TTTAGGCCGTAACCTTCCTGCATTTGAGGAGGGTTCGGCACAACTGAACGCTTCTCTTTATGCAATCGGGTTCTTGTATCTTGAGAACAAGAGAGCGTACGAGGCGTCGTTCAAAGGCCGCGTTTTCACTCACGAAATCGCGTATTTGTATCACAACCAAGGCGCTCCAGCTGCCGAACAGTACCTGACTTCGGGTCGGCTCGTTTACCCGAAGCAAAGCGAGGCCGCTGTCGCGGCGGTTGCGGCTGCGAGAAACCAGCATGTCAAAGAGAGTTGGGCTTAACCCTGAACTGCATCGTGAACTGAAAATGTTCCCAGATGTCACGAAGGGTGGCACGTTCGACATAACCATCCGGTCGACTACCGAGAACGGTGCTTTTTGGGCGAACTACGAAGGTAGAACGTCCTTGGTCACCGTCCCGGACGTGAAGACAGCTATCGAGTTTTTGATTAAACTCTGCCGTCGACACAAGTTGTCCAATCAGGTGAACACGCGAACGCTTCTCCGCGATTTGCAACGAACGTTGCAGGAATGTGAATGCCAGTCTCATCATGTGCCGTTGTCCAGCCCCTTCATGCATCTCAGATTTGCGTAAAGCTGATCGGAAGCTATGAAAGTAAGCTGAGCGACACGGAAGTTATTGAAGCAGCTATTCAAGCTCTCATAGGCTTGGAAAGCCCGGCATCGATGTCGTTCCATGTCGCGCCAGACGCGGCCACCGATATGTATCTTGATCTGATCGAAATCTACTCCCCGTCGTCAGTCGGGATACATCTCGTCCTGCCATAAGCGCTGTCTGTAGCGTGCATAAACAGATAGATCGCCTTTTTAGGTAACCGCGGATTGATCACCGTTCCGAGCTTGCTTGGATAAACAAGTCCTTGTATAACAAGGCGAGACTCACTATGTGAGCGTCCAATAGGACGGCCCCTTCGGGGGCTCTCTCTCTGGAAAAAAACTTTATATATTTTCTACGTTGAGCTCCGTATAAAGCTCCGTGCCCGCACACGCCCGCTACGGCGGTATTGTCTAACCGGCGACAATAAACAGCTGCTGCTTACAAGCTTACAGTTGACCGGAGTCTCGGCGTGCAGCGCCTAAACACGGGAAACCGTGGTGGTGACACCCTCTGCTGAGGGCTTATAGTGGTGATATTCCTCCCCAGGAGTTCCCTCCCATTTCGGCCACTCGCGCTCTAACCATGAGCGCGGTCTCTTTGAGAGTGTCGCTCTTCTGCCTACGCGCTCATTCGTTCCCTCGAGTTGACGCTTCAAGCAGGTGGACACCTCCTCAACCCATAATAAGAGATCCATTCAATGGACAACATCCTCGATCCCCTTAAGGCTCCGTTTTCTTCGGAAGCCGCCGCGAAAACCACCGCTGCCAAAATCGCTGTGGTATACGCGTTGGTCGGTCTGGTTGGCGGTCTGCTGCTCACCAAGTAAGGTGTAGTATGCATGACACGCGACCGCTCCGAACCGAGAGACCCATGGCCAGCAAGAATACGAATGACCGGGTTTTCGACCGGTTTCATTCTTCTCTCGTTCGGCTGTGGCATTCTGCAAACCAGCGCATGCGCGGTTCTTTCTCCGTCGTGGTTCGGGTCGAGCACTCTTTAGTGCTCCTCATCGGCTACACGGTGGTAGGCGCGACTGTCGCACACTTCGTGAGGTGACTATGTTAGCTTTCGTAGCGCGAGCGGTCGTACTTTACTCTGCTGGTGTAGTCGTGGGCATCGCCTACGATCACGTCACAGGAAGGAAACGTCGCCATGACTAAGTGGAAGATGTACATCGCCGGCGTCGTTCTGGTCATCGTAGGGGCAGTTACTCATGCTCCACAGCTGATGGTCCAGGGCATGACTACGCTCGCGACTCAAGCGGCCGCAGACGCGGCCGATGGTGGAGGTGCTCAGTGAGTATCTTCTCCTCGTTGTTCAAGGTCATCAAGAAGGTAATCTCGAAGGTGGTCGCCACCCTTAAGAAAATCTTCAAGAAGA
TCTGGCCGTTGCTACTTATTGTGGCAATTATCTACTTCGCTCCCTACCTCGCCGGGTTCTTCACTTCCGCCGGGTTCACTGGGATCGGAGGGATCTTCTCCTCTATCGCAACCACCATCACGCCTACGCTGACGTCGTTCCTGTCGACTGCGTGGTCTGGTGTGGGCTCTCTTGCCTCCACGGCTTGGTCTGGGTTCCAATCTCTCGGGATGGGTACTCAGCTCGCTGTCGTGAGTGGCGCGGCTGCTCTGATTGCACCTGAGGAAACGGCTCAACTGGTTACCGAAATCGGTACCACCGTAGGTGATATCGCCGGTACGATTATCGGCGGTGTCGCCAAGGCACTCCCGGGTTGGATCTGGATCGCCGCAGGCGGTCTTGCCGTCTGGGCCCTCTGGCCGTCATCTGACAGTAAGGAGTAGCAAATGCGCTACCAAGGCATCAACGAGTGGCTGGGTGGAGCCAAGAAACTCACCACCGCAAACGGTGAGATTGGCGCTATCTACCTCTCCGCTGCTCCTCCCACCGACGCCGCACGTGCGGACGCTAAGGCGGTGGATTTTACTGCTGGTTGGCCAAGCGCGATCGTTGATCGCGCTGATGCCACTCGTGCCAAGCAGAACTACCTGTGGGTTGGCGATAACGTTGTGCACATCGGGGCTAAACACGTTCCACTCCTCGATCTGTGGGGCGGGACAGGTGATGCCTGGCAGCAGTTCGTTGGCTATGCCTGCCCAATGCTCGACCTTTGTCGTGCGTGGGGCCTGGGTTATGCCAGCGCTTCTGTAACCACCGGCTCGTTGCAGGGCTATCAGCCATCGGCGTTCTTGGACGCTGAGCAACAGCAGTTCGCGAAGGACAATCTCAACCTGTATGGCGATAACTGCCTTGACCTGGCCACCAGTTCGTCCGCTCAGCGGGCATTTCTGGAGCAGTGCATGGGCTGCGCCTTGCCGGAGGATTGCGTCTTCGGTTGGTATGTGAAAATGGATTGGGAAGGCTCGGCAGTTGCCGACGCCTACGCTGCGATCCGTGTCCAAGGGTTCGCCACTGTAATGGCACCTTGGCAGTCGGTTGGCGGTGCTGGCTACGTTTACGCTCGTGTGCCTCAAAAAGGCGCGTGGATGGGTGTGAACCTGCTTGCCTATGTCCACGGCACCAGTGGCCAGCCTGCTTATGGCATTCCGATGACCCTCTCGGGGTTCACCGGTAACATGGGTCAGGTGGCTTCGAAGTGGCTCATGCTTCCTCTCCTGATGATCGTCGACCCTCATGTCGTCCAGATTTTGGCCGCACTGGGGGTTAAACGTGGGACCAAATCGGACCCACGGACGACCGACGTGTACGCTGATCCGAAGGTTCCGGCTAGCCGTATTTCCGGGCCGATGATCAATGCAACGGTTGCTCCTCCTGCGACGATCCCCGCTACCATTCCGGTGCCTCTGGCGCCGCTCGGTGGCGCGGGTGGCCCTGGCGCTCAGGGTTTCCAGGTATACCCCGTTTTCACCTGGGGTCTGCCTGAGTTCATGACCGACGTGACCATCGAAGGTACCGTCACTGCGGACTCCAACGGTCTGCATGTCGTGGACGACGTGCGTAACTACGTCTGGAACGGTACTGCTCTTGCTGCAATTGAGCAGGTCAATGCCGCTGACGGTCGAGTTACGCTCACTGACTCTGAGCGTGCTCAACTCGCCTCGTTGACTGTTCGAACCGCATCGTTGCGTCAGCAGCTGTCGGTTGGGGCAGACCCCTTGTCCAAGACGTCGATCTGGCGTCAGGCTCAAAAGGCCGATTATGATCTGCTGTCTCAACAGATCATCGAAGCGGACACGGTGAAAAACCTACCTGCTGTGACGTTCGCTCAGGCGAACAAAGCGGCAGGCGGTCAATCCGAGACGTTGTGGCACCAGATGTATCGGGTCAACGATATCGCTGGCGATCAAGTCACCGCAATCCAAATCACTGGTACGATGGCGACTGGCATTCGCTGGTCGGCA
ACTGCTGGCGGTCTGGTCGTCGATGCTGACGAGCAAGATGCGGTGATCGCGATTTCGTCCGGTAAGCCGGTCAAGAACAGCTCCGACCTTCCTACGGCCGACGCTGTGAACTACTTGTTCGGTATCACTGCGGACGATATGCCTGGTATCGTTTCCTCGCAAAAGGAAATGAACAGCGAGTTTGAAGAAGGTTTCCTTCAGAAAGCTCGTCTCTGGAACCCACGTAAGCTCGTCGAAAACGTCCAGAATGCCTATTTCCTGATGGTGTACGCTCGCGATCGGAAGCAATTCCACTCGTTGGTGGCATCCTCTCTGGCGATGGCCAAGCTGGCGCGTAAGTACGCGGGCCTGTAAGGAGTCGTATGGCTGCTGAACAATCCTCCGGTATGAGCGCGTTCACCAAAGGCACGATCGTGATCTGCCTGGTGGTGGTCGCCCTCAATCTCATCGGGAAGTGACCATGGTACCGCTAAAAATTAGCACGCTGGAGTCCCAGCTGCAACCGCTTGTTAAGTTGGTTGCAACCGAGACCCCCGGTGCCCTCGTAGCGTATGCTCGAGGGTTATCGAGTGCCGACCGCTCGCGGTTGTACAGACTGCTTCGTTCTTTGGAGCAGGCCATCCCGAAGCTGTCGTCGGCTGTCGTTTCGGCCACGACGTTGGCAGCGCGAGGTCTCTAATGGAGACCAACCCGCTGCTTCAGCTTGAGTCGCTGTCGTTACGCTTGCGAGACATGCCTCGTTCGCGCCTTTCTGCGCTGATGAAGAACATGTCGTATGAGCAGCTGCAGTCGTTGTATAGCACCAGCGTAAAAGTTGGCGCTGTGCTCGATAGCGTTTCAATGCAGTTGCTTGAGGCGTCACAAACCGCTCAATCGGGAACTCGACTGATGACACCGCAGGAGTACGTCGCTGCTGGTGGAGGTCGTGTGTACGTTAAATAAGTCCTTAGATTTCTAAGGCGAGACTCGCTTTGCGAGCATCCAATAGGATGGCCCCTTCGGGGGCTCTCTCTCTGTAAAAAAACTTTATATAGTCTTTTACCTGGATTCTCTGTGCAGAACTGAGAACTGAACGCTACCCTTGCGGGGGATGCGGCCCCGGGCTACGGCCTAGGGATCCAGCGTGGCTCACGGGCCGCCGGAACTGACGTCCGTAACAAACGTCCTTGGGATAGGAGTACAGTAACCACTCTTAGATACCCGATTCCCCTGTTTCTGCGTGGAAGCCTTTCGACAGCTACCCAGCTTAGATCGTCTGGTGCCCTAAATCCCTGGAGATAACCAATGGCTACATTACAAGATGTGCATCTACGGGTGAATGACCGGGTAACACCGGTGTACTTCACTGCTCGCTCGTTTCTGCTCGTTTCTCCGAAACGTGCGGGGCAAGCAACGTTCCTCGCTCGCGAGGAGGGTACTGACAATCCTGTCGTTACCTGTCATGTATCCGACTTTTATAAGGACGGTGTGTAATGACTTTGTACCTGGTCCCTCCGCTGGATTCGGCGGACAAAGAGTTGCCTGCTCTGGCTTCCAAAGCTGGGGTAACGCTTCTCGAGATCGAGTTTCTTCACGAGCTCTGGCCTCACCTCAGTGGTGGTCAGATCGTGATCGCCGCTCTCAACGCCAACAATCTGGCCATCCTCAACCGTCACATGTCCACTCTGTTGGTCGAGTTGCCGGTTGCTGTGATGGCCGTTCCCGGTGCTAGCTATCGTTCCGATTGGAACATGATCGCTCACGCACTCCCGTCTGAGGATTGGATCACTTTGTCCAACAAGATGCTGAAAAGCGGCTTGCTGGCGAACGATACCGTCCAGGGCGAGAAGCGCTCCGGCGCTGAGCCGCTGTCGCCGAACGTGTACACCGATGCGCTCTCGCGTCTCGGTATCGCGACGGCCCATGCTATCCCCGTTGAACCCGAACAACCGTTCGATGTCGATGAGGTAAGCGCCTGATGCCGAGGAGAGCTCCCGCGTTCCCTCTGAGCGATATCAAGG
CTCAGATGCTGTTCGCAAATAACATCAAGGCCCAACAAGCCTCGAAGCGTAGCTTCAAAGAGGGGGCGATTGAAACGTACGAAGGGCTGCTTTCAGTAGACCCTCGGTTTTTGAGTTTCAAGAACGAGCTCTCTCGGTATCTGACCGACCACTTCCCGGCGAACGTCGACGAGTATGGTCGTGTTTATGGAAACGGTGTTCGTACCAACTTCTTTGGTATGCGCCACATGAACGGGTTTCCAATGATCCCCGCGACGTGGCCACTCGCTTCCAACCTTAAGAAACGTGCCGACGCTGACCTAGCCGATGGCCCTGTTTCTGAGCGCGACAATCTACTCTTTCGCGCCGCAGTCCGGCTTATGTTTTCAGATCTAGAGCCTGTTCCGCTGAAGATCCGTAAAGGATCGTCAACCTGCATCCCGTATTTTTCTAACGATATGGGAACGAAGATCGAGATCGCCGAGCGCGCTCTTGAGAAAGCGGAAGAAGCTGGCAATCTGATGCTGCAAGGTAAGTTTGATGACGCCTACCAGCTCCACCAAATGGGTGGTGCCTATTACGTCGTGTATCGTGCACAATCGACCGATGCTATCACACTCGACCCTAAGACCGGAAAATTCGTGTCAAAGGATCGTATGGTCGCTGACTTCGAATACGCAGTCACGGGCGGTGAGCAAGGCTCGCTGTTCGCTGCTTCGAAGGATGCCTCTCGTTTGAAGGAACAGTACGGGATAGATGTCCCGGACGGGTTTTTCTGCGAGCGGCGTCGTACCGCTATGGGTGGTCCGTTCGCGTTGAACGCTCCTATCATGGCCGTTGCGCAACCTGTGCGAAACAAAATTTACTCCAAGTACGCTTACACCTTTCACCATACTACTCGTCTTAATAAGGAGGAAAAGGTGAAAGAGTGGTCGTTGTGCGTCGCTACTGACGTATCCGACCACGACACGTTCTGGCCTGGATGGCTGCGGGATCTCATCTGTGATGAACTGCTCAACATGGGGTACGCTCCGTGGTGGGTTAAGTTGTTCGAGACCTCGCTCAAACTGCCCGTTTACGTGGGCGCTCCTGCTCCTGAGCAGGGCCACACGTTGTTGGGTGATCCGTCCAACCCTGATCTCGAAGTTGGTCTCTCGTCCGGACAAGGGGCGACCGACCTCATGGGCACGTTGCTCATGAGTATCACCTACCTGGTGATGCAACTTGATCACACCGCTCCTCACCTCAACAGTCGAATCAAGGACATGCCATCAGCATGCCGCTTTCTTGACTCGTATTGGCAAGGACACGAGGAGATCCGTCAGATCTCAAAATCTGATGATGCTATGCTTGGCTGGACCAAAGGTCGTGCTTTGGTTGGTGGTCATCGTTTGTTCGAGATGCTGAAAGAGGGTAAGGTTAACCCCTCACCTTACATGAAGATCTCCTACGAGCACGGTGGCGCCTTCCTTGGTGACATCCTGCTTTACGACTCGCGTCGTGAGCCTGGCTCTGCCATCTTCGTTGGTAACATCAACTCAATGCTGAACAACCAGTTCAGCCCTGAGTACGGTGTCCAATCGGGCGTTCGCGACCGATCTAAGCGCAAACGGCCGTTCCCCGGTCTTGCTTGGGCGTCGATGAAAGATACCTACGGTGCCTGTCCGATCTACTCTGATGTGCTGGAGGCGATCGAGCGTTGCTGGTGGAACGCGTTCGGTGAGTCGTACCGTGCGTATCGTGAAGATATGCTTAAACGCGACACTCTCGAACTATCACGCTACGTTGCGTCGATGGCTCGTCAAGCCGGGCTGGCTGAACTCACTCCCATTGATTTGGAGGTGCTTGCTGACCCGAACAAACTCCAGTATAAGTGGACCGAGGCCGATGTCTCGGCGAATATCCACGAGGTACTGATGCATGGCGTATCGGTCGAAAAGACTGAGCGCTTTCTCCGTTCTGTAATGCCGAGGTAATCATGCCGATTGTCGTAACTCAAGCGCATATTGATCGTGTCGGCA
TCGCCGCCGATCTGCTCGATGCGTCTCCTGTGTCGCTTCAAGTTCTTGGTCGCCCTACCGCGATCAACACTGTCGTCATCAAGACGTACATCGCTGCTGTTATGGAGCTCGCCTCCAAGCAAGGTGGTTCGTTGGCCGGTGTGGATATTCGTCCTTCGGTTCTGCTGAAAGACACCGCTATCTTCACCAAGCCGAAGGCGAAGTCCGCTGACGTCGAATCTGATGTCGACGTTCTGGACACGGGGATTTACTCCGTTCCTGGACTGGCTCGCAAGCCTGTCACCCACCGTTGGCCATCAGAGGGTATCTACTCTGGTGTCACAGCTCTGATGGGCGCTACCGGTTCCGGTAAGTCGATCACGCTGAACGAAAAGCTCCGTCCAGACGTCCTGATTCGTTGGGGCGAGGTGGCTGAAGCTTACGATGAGCTGGATACCGCCGTCCACATCTCGACTCTGGATGAGATGTTGATTGTGTGTATTGGCCTGGGTGCACTCGGGTTCAACGTCGCTGTTGACTCGGTTCGTCCTCTGCTGTTCCGTCTCAAAGGCGCCGCCTCTGCGGGGGGTATTGTGGCTGTGTTCTACAGCCTGTTGACCGATATCTCGAACTTGTTCACACAATACGATTGCTCTGTCGTCATGGTCGTTAACCCGATGGTTGACGCTGAGAAGATCGAGTACGTGTTCGGTCAGGTCATGGCTTCGACTGTCGGTGCGATCTTGTGTGCTGATGGCAACGTGTCCAGAACGATGTTCCGGACCAACAAAGGTCGTATTTTCAACGGTGCGGCCCCTCTTGCTGCTGACACTCACATGCCTAGCATGGATCGTCCTACCAGCATGAAGGCCCTCGATCATACCTCGATCGCCTCTGTCGCACCGCTGGAGCGTGGCTCCGTGGATACCGACGATCGCAATTCCGCTCCGCGCCGTGGCGCTAACTTCTCTCTGTAAGGGTATAAGATGTTCAACCTCAAAGTTAAAGATCTGAACGGTTCCGCTCGCGGTCTGACTCAAGCTTTCGCCATCGGCGAATTGAAGAACCAGCTGTCCGTCGGCGCGTTGCAGTTGCCGTTGCAGTTCACGCGCACGTTCTCCGCTTCCATGACCAGCGAGTTGCTTTGGGAAGTGGGCAAGGGCAACATCGACCCAGTGATGTACGCTCGTCTGTTTTTCCAGTACGCGCAAGCTGGCGGCGCTCTGTCCGTTGATGAGCTCGTGAACCAGTTCACTGAGTATCACCAATCCACGGCCTGTAACCCTGAAATCTGGCGCAAGCTGACTGCTTACATCACCGGTTCCTCGAACCGCGCGATCAAAGCTGACGCTGTAGGCAAGGTGCCTCCAACCGCGATCCTGGAGCAGTTGCGCACTCTCGCTCCCTCGGAGCACGAGTTGTTTCACCACATCACGACCGACTTCGTCTGCCATGTGCTGTCTCCCCTCGGTTTCATCCTGCCTGACGCTGCCTACGTGTACCGCGTTGGTCGCACCGCTACGTACCCCAATTTCTACGCTCTTGTAGATTGCGTACGTGCGAGCGACCTGCGTCGTATGCTGACAGCGCTGTCGTCTGTCGATTCGAAGATGCTTCAAGCCACGTTCAAAGCCAAAGGCGCTCTTGCCCCTGCTTTGATCTCCCAGCATCTGGCTAACGCCGCCACTACTGCTTTCGAGCGGTCGCGCGGTAACTTCGATGCCAATGCTGTGGTGTCGTCCGTTCTGACCATTCTTGGTCGTCTCTGGTCGCCTTCCACCCCGAAGGAGCTCGACCCGAGTGCGCGTTTGCGCAACACCAACGGTATCGATCAGCTGCGCAGTAACCTGGCGCTGTTCATCGCGTACCAGGATATGGTCAAGCAACGCGGTCGCGCCGAAGTCATCTTCTCTGACGAGGAGCTGTCGTCGACGATCATCCCTTGGTTCATCGAGGCGATGAGCGAAGTGTCCCCGTTCAAACTGCGTCCGATCAACGAGACTACCAGCTATATCGGTCAG
ACCTCCGCGGTCGACCACATGGGCCAGCCGAGCCATGTTGTGGTCTACGAAGACTGGCAGTTTGCCAAGGAGATCACCGCTTTCACTCCTGTCAAGCTGGCCAACAACTCGAATCAGCGTTTCCTGGACGTTGAGCCTGGTATCTCTGATCGTATGTCGGCTACGCTGGCACCAATCGGCAACACGTTCGCGGTTTCGGCGTTCGTCAAGAACCGCACCGCCGTTTACGAGGCTGTTTCGCAGCGTGGTACAGTCAACAGCAACGGCGCGGAGATGACCCTCGGGTTCCCTTCCGTTGTTGAACGCGACTACGCTCTCGACCGTGATCCTATGGTCGCGATCGCTGCTCTGCGCACTGGTATCGTCGATGAAAGTCTCGAGGCTCGCGCTTCGAACGATCTGAAACGGTCGATGTTCAACTACTACGCGGCTGTGATGCATTACGCTGTTGCTCACAATCCTGAAGTTGTTGTTTCGGAGCACCAAGGTGTTGCCGCCGAACAAGGTTCGCTCTACCTGGTGTGGAACGTCCGCACTGAGCTGCGAATCCCTGTTGGTTACAACGCCATCGAGGGCGGTTCGATCCGTACCCCTGAGCCGTTGGAGGCGATCGCCTACAACAAGCCGATCCAACCGTCCGAGGTGCTGCAAGCCAAGGTACTGGATTTGGCTAACCACACAACCTCGATTCACATCTGGCCGTGGCATGAGGCTTCGACCGAGTTCGCGTACGAAGACGCCTACTCTGTCACCATCCGCAACAAACGCTACACCGCCGAAGTCAAGGAGTTCGAACTCCTCGGTCTCGGTCAACGTCGCGAACGTGTACGGATCCTCAAGCCTACGGTAGCCCACGCTATCATCCAGATGTGGTATTCCTGGTTCGTCGAGGACGACCGCACTTTGGCAGCTGCCCGTCGCACGTCTCGCGATGACGCCGAGAAGCTTGCTATCGACGGTCGTCGTATGCAAAACGCTGTGACCTTGCTTCGCAAGATCGAGATGATTGGGACAACCGGTATCGGTGCGTCTGCCGTCCACCTCGCGCAGTCGCGCATCGTGGATCAGATGGCCGGTCGAGGTCTCATCGACGACAGCTCCGATCTCCATGTCGGTATCAACCGTCACCGTATCCGCATCTGGGCCGGCCTCGCCGTTCTCCAGATGATGGGTCTCTTGAGCCGCTCCGAAGCGGAAGCTCTCACCAAGGTCCTTGGTGATAGCAACGCTCTGGGCATGGTTGTCGCCACAACCGACATTGATCCATCCCTGTAACTCTCGTAAGCTCTCATAGACCTTTCGTTATAATTCCATAAGTCCTTAGATTTCTAAGGCGAGACTCGCTTTGCGAGCGTCCAATAGGACGGCCCCCTCGGGGGCTCTCTCTCT \ No newline at end of file
diff --git a/phi6 wt protein start stops.csv b/phi6 wt protein start stops.csv
new file mode 100644
index 0000000..e6d3a98
--- /dev/null
+++ b/phi6 wt protein start stops.csv
@@ -0,0 +1,14 @@
+protein-name,start,stop
+P8,305,754
+P12,754,1341
+P9,1341,1613
+P5a,1620,2282
+P10,3317,3445
+P6,3915,4421
+P3,4425,6372
+P13,6460,6678
+P14,7284,7472
+P7,7472,7957
+P2,7957,9954
+P4,9957,10955
+P1,10965,13274
\ No newline at end of file
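
As a rough illustration of how the coordinate table above might be used, the sketch below loads the rows with Python's `csv` module, looks up which proteins cover a given position, and runs one simple sanity check on the range lengths. The helper names `load_ranges`, `proteins_at`, and `lengths_not_multiple_of_three` are hypothetical, and the CSV text is pasted inline only to keep the example self-contained; in practice it would presumably be read from `phi6 wt protein start stops.csv`.

```python
import csv
import io

# The rows added by this commit, inlined so the sketch runs on its own.
CSV_TEXT = """protein-name,start,stop
P8,305,754
P12,754,1341
P9,1341,1613
P5a,1620,2282
P10,3317,3445
P6,3915,4421
P3,4425,6372
P13,6460,6678
P14,7284,7472
P7,7472,7957
P2,7957,9954
P4,9957,10955
P1,10965,13274
"""


def load_ranges(handle):
    """Read (name, start, stop) tuples from an open CSV file or file-like object."""
    reader = csv.DictReader(handle)
    return [(row["protein-name"], int(row["start"]), int(row["stop"]))
            for row in reader]


def proteins_at(ranges, position):
    """Names of every protein whose inclusive range contains the position."""
    return [name for name, start, stop in ranges if start <= position <= stop]


def lengths_not_multiple_of_three(ranges):
    """Names of ranges whose inclusive length cannot be split into whole codons."""
    return [name for name, start, stop in ranges if (stop - start + 1) % 3 != 0]


ranges = load_ranges(io.StringIO(CSV_TEXT))
print(proteins_at(ranges, 754))    # shared by two reading frames
print(proteins_at(ranges, 300))    # before the first range, so noncoding
print(lengths_not_multiple_of_three(ranges))
```

Position 754 falls in both P8 and P12, which matches the one-nucleotide overlaps mentioned in NOTES.org. The length check flags only P3 (4425 to 6372 spans 1948 nucleotides); this has not been checked against the sequence itself, so it may simply mean that row is worth a second look when running the translation test.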