Scenario: prank.sh.completed is a subset of prank.sh, and I want the lines of prank.sh that are not in prank.sh.completed:

cat prank.sh prank.sh.completed | sort | uniq -d >temp.txt

cat prank.sh temp.txt | sort | uniq -u > different.txt
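The two pipelines above first collect the lines shared by both files (temp.txt) and then keep the lines of prank.sh that are not shared (different.txt). The same idea can be sketched in Python. This is only an illustration, assuming that neither file contains repeated lines; the function name and the sample command lines are made up.

```python
def lines_only_in_first(all_lines, done_lines):
    """Return the sorted, de-duplicated lines of all_lines that are absent from done_lines."""
    done = set(done_lines)
    return sorted({line for line in all_lines if line not in done})


# Made-up stand-ins for the contents of prank.sh and prank.sh.completed.
all_cmds = ["prank -d=a.fa", "prank -d=b.fa", "prank -d=c.fa"]
done_cmds = ["prank -d=b.fa"]

remaining = lines_only_in_first(all_cmds, done_cmds)
print(remaining)
```

With real files, the two lists would come from reading each file and splitting it into lines.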

(The rest of this post hitches a ride on the same page.)

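A couple of days ago I ran an analysis with CAFE 4.2, following a local tutorial as well as tutorials found online. I downloaded all sorts of ready-made scripts from the internet. Very convenient.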

cafe 01cafe.sh

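The command above failed (I forgot to record the error). Cause: the λ category assignments for the ancestral positions of the tree were missing. Adding them fixed it.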

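When extracting the IDs of expanded and contracted gene families, the cafetutorial_report_analysis.py script failed (again, I forgot to record the error). Cause: the cafecore module was missing. Fix: search GitHub for cafecore.py, download it, and put it in the directory that holds the Python scripts.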

Running it again produced the following error:

File "./cafecore.py", line 8
<!DOCTYPE html>
^
SyntaxError: invalid syntax

Thanks to Stack Overflow user user2357112, who pointed out that I had not actually downloaded the cafecore.py script:

“You're not downloading the script. You're downloading a GitHub web page with the script and a whole bunch of other stuff on it, like GitHub navigation and a search bar and clickable line numbers.”
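A quick check along these lines can catch the problem before Python does. The sketch below is a heuristic under my own assumptions, not part of CAFE or its tutorial scripts: the function name and the 20-line limit are made up, and it only looks for an HTML doctype or `<html` tag near the top of the text.

```python
def looks_like_html(text, max_lines=20):
    """Heuristic: True if one of the first non-blank lines opens an HTML document."""
    checked = 0
    for line in text.splitlines():
        stripped = line.strip().lower()
        if not stripped:
            continue  # skip blank lines, such as those before a doctype
        if stripped.startswith(("<!doctype html", "<html")):
            return True
        checked += 1
        if checked >= max_lines:
            break
    return False


# Made-up sample contents: a saved web page versus real Python source.
html_page = "\n\n\n<!DOCTYPE html>\n<html lang=\"en\">\n"
python_source = "import sys\n\ndef main():\n    pass\n"

print(looks_like_html(html_page))      # True
print(looks_like_html(python_source))  # False
```

For a file on disk, the text could be read with `open(path, encoding="utf-8", errors="replace").read()`. If the check returns True, the usual remedy is to fetch the plain file, for example through the Raw view that GitHub offers on a file page, instead of saving the rendered page.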

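So I downloaded cafecore.py as actual Python source rather than as an HTML page. After that, running the script on its own worked, in the sense that it printed its usage information (below):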

python cafetutorial_report_analysis.py

|**Error 1: -i must be defined |

usage: cafetutorial_report_analysis.py [-h] [-i INPUT_FILE]
[-e USER_ERR_START] [-d USER_TMP_DIR]
[-f FIRST_RUN] [-c CURVE_OPTION]
[-t ERROR_TRIES] [-l USER_LOG_FILE]
[-o OUTPUT_FILE] [-s IND_MIN]
[-v VERBOSE]

optional arguments:
-h, --help show this help message and exit
-i INPUT_FILE A CAFE shell script with the full CAFE path in the
shebang line, the load, tree, and lambda commands. These
lines will be read and incorporated into the caferror
shell script.
-e USER_ERR_START The starting point for the grid search. Should be between
0 and 1. Default: 0.4
-d USER_TMP_DIR A directory in which all caferror files will be stored.
If none is specified, it will default to caferror_X, with
X being some integer one higher than the last directory.
-f FIRST_RUN Boolean option to perform a pre-error model run (1) or
not (0). Default: 0
-c CURVE_OPTION Boolean option. caferror can either perform the grid
search (0) or search a pre-specified space (1). Default:
0
-t ERROR_TRIES A list of error values to search over. Note: -c MUST be
set to 1 to use these values. Enter as a comma delimited
string, ie -t 0.1,0.2,0.3
-l USER_LOG_FILE Specify the name for caferror's log file here. Default:
caferrorLog.txt
-o OUTPUT_FILE Output file which stores only the error model and score
for each run. Default: caferror_default_output.txt
-s IND_MIN Boolean option to specify whether to perform only the
global error search (0) or continue with individual
species minimizations (1). Default: 0
-v VERBOSE Boolean option to have detailed information for each CAFE
run printed to the screen (1) or not (0). Default: 1

Once I added the input file and the parameters, a new error appeared:

python python_scripts/cafetutorial_report_analysis.py -i reports/report_run1.cafe -o reports/summary_run1 -r 0

The error:

 File "cafetutorial_report_analysis.py", line 8, in <module>
import sys, os, argparse, cafecore as cafecore
File "./cafecore.py", line 420, in <module>
treestring = Tree[Tree.index("("):];
NameError: name 'Tree' is not defined
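The NameError above means that the name Tree had never been assigned by the time line 420 of cafecore.py used it. The relevant part of cafecore.py is not shown here, so the following is only a guess at one common way this happens: a variable that is assigned only when an input line matches an expected label, with the match being case-sensitive. The sketch shows a case-insensitive variant that fails with a clear message instead. The function name and the `Label:value` line format are assumptions for illustration.

```python
def extract_tree_string(report_lines):
    """Return the part starting at the first "(" of the first line labelled 'tree' (any case)."""
    for line in report_lines:
        label, sep, value = line.partition(":")
        if sep and label.strip().lower() == "tree" and "(" in value:
            return value[value.index("("):].strip()
    # Fail explicitly instead of leaving a variable unassigned.
    raise ValueError("no 'Tree:' line with a '(' found in the report")


# Hypothetical report lines; the real file layout may differ.
report_lines = ["Lambda:\t0.002", "Tree:((A:1,B:1):1,C:2)"]
tree_string = extract_tree_string(report_lines)
print(tree_string)
```

Matching on the lowercased label means that both `tree:` and `Tree:` are accepted, so the script does not depend on how a particular file capitalizes the label.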

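After studying the input file format and the script, and racking my brain, I changed the script to match Tree instead of tree, because the input file uses a capital T. I reran the command above and, as expected, got yet another error (below):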

File "cafetutorial_report_analysis.py", line 8, in <module>
import sys, os, argparse, cafecore as cafecore
File "./cafecore.py", line 491, in <module>
printWrite(caferrorLog, 1, "# CAFE path set as:", CafePath, pad);

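I asked an expert, who suggested the problem might be a missing full path. I added the full path and it still failed. At that point I felt completely drained, with no hope of getting through.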

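So I dug out the files from a case I had worked through before and checked whether the relevant files differed. I uploaded all of my local Python scripts and ran them, and they worked straight away.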
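Checking two script directories against each other can be automated. Below is a minimal sketch using the standard library's `filecmp` module; the function name, the directory names, and the file contents in the demo are all made up for illustration.

```python
import filecmp
import tempfile
from pathlib import Path


def compare_script_dirs(dir_a, dir_b):
    """Classify the *.py files of two directories by name.

    Returns (differing, only_in_a, only_in_b), each a sorted list of file names.
    """
    names_a = {p.name for p in Path(dir_a).glob("*.py")}
    names_b = {p.name for p in Path(dir_b).glob("*.py")}
    differing = sorted(
        name
        for name in names_a & names_b
        # shallow=False compares file contents, not just the os.stat() signature.
        if not filecmp.cmp(Path(dir_a) / name, Path(dir_b) / name, shallow=False)
    )
    return differing, sorted(names_a - names_b), sorted(names_b - names_a)


# Demo on throwaway directories with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    downloaded = Path(tmp) / "downloaded"
    known_good = Path(tmp) / "known_good"
    downloaded.mkdir()
    known_good.mkdir()
    (downloaded / "cafecore.py").write_text("<!DOCTYPE html>\n")
    (known_good / "cafecore.py").write_text("import sys\n")
    (downloaded / "same.py").write_text("x = 1\n")
    (known_good / "same.py").write_text("x = 1\n")
    (known_good / "extra.py").write_text("y = 2\n")
    result = compare_script_dirs(downloaded, known_good)

print(result)
```

Any name in the first list is a script whose downloaded copy differs from the known-good one and is worth inspecting first.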

The conclusion: scripts found online can sometimes really trip you up.

I skipped my lunch break, and thinking back over these errors makes my head hurt all over again!