#+title: percent nucleotide identity threshold (pnit?)

* input

csv file. the first row and first column hold sequence names; every other cell holds the % identity between the sequences named by that cell's row and column.

#+name: input-table-example
|      | seq1 | seq2 | seq3 |
| seq1 |      |      |      |
| seq2 |  0.9 |      |      |
| seq3 | 0.32 | 0.11 |      |

this shows ~seq2~ is 90% identical to ~seq1~, and ~seq3~ is 32% and 11% identical to ~seq1~ and ~seq2~, respectively.

the csv file would look like this:

#+name: input-csv-example
#+begin_src text
,seq1,seq2,seq3
seq1,,,
seq2,0.9,,
seq3,0.32,0.11,
#+end_src

* output

csv file, two columns. each row is a pair of names whose % identity is at least as large as a given threshold.

given [[input-table-example][the example input table]], at a threshold of 32% (0.32 on the scale used in that table), we should get:

#+name: output-table-example-32
| seq2 | seq1 |
| seq3 | seq1 |

or, in csv:

#+name: output-csv-example
#+begin_src text
seq2,seq1
seq3,seq1
#+end_src

* runners

#+name: process
#+begin_src shell :results file :file n-401-94.csv :var threshold=94.0 filename="inputs/n-401.csv"
guix shell perl -- ./pairwise.pl "$threshold" "$filename"
#+end_src

#+RESULTS: process
[[file:n-401-94.csv]]

#+call: process[:file n-402-90.5.csv](threshold=90.5, filename="n-402.csv")

#+RESULTS:
[[file:n-402-90.5.csv]]

#+call: process[:file n-402-93.5.csv](threshold=93.5, filename="n-402.csv")

#+RESULTS:
[[file:n-402-93.5.csv]]
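
* sketch of the filtering step

a minimal sketch, in python, of the input-to-output mapping described in the input and output sections. this is not ~pairwise.pl~, whose source is not shown here. the function name ~pairs_at_or_above~ is hypothetical, and the threshold is assumed to be on the same scale as the table values (so 32% is passed as 0.32 for the example table). empty cells are assumed to mean "no value recorded" and are skipped.

#+begin_src python
import csv
import io


def pairs_at_or_above(csv_text, threshold):
    """Return (row_name, column_name) pairs whose identity is >= threshold."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    column_names = rows[0][1:]  # first row: empty corner cell, then names
    pairs = []
    for row in rows[1:]:
        if not row:
            continue  # tolerate blank lines
        row_name, cells = row[0], row[1:]
        for column_name, cell in zip(column_names, cells):
            if cell.strip() == "":
                continue  # assumption: empty cell means no value for this pair
            if float(cell) >= threshold:
                pairs.append((row_name, column_name))
    return pairs


# the example input from this document
example = """,seq1,seq2,seq3
seq1,,,
seq2,0.9,,
seq3,0.32,0.11,
"""

# print the matching pairs as two-column csv
for row_name, column_name in pairs_at_or_above(example, 0.32):
    print(f"{row_name},{column_name}")
#+end_src

with the example input and a threshold of 0.32 this prints ~seq2,seq1~ and ~seq3,seq1~, matching the example output. the comparison is ~>=~ because the output is defined as pairs "at least as large as" the threshold, so a value equal to the threshold is kept.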