Extracting the lines that differ between two files

Scenario: prank.sh.completed contains a subset of the lines in prank.sh, and the goal is to list the lines that appear only in prank.sh:


cat prank.sh prank.sh.completed | sort | uniq -d >temp.txt

cat prank.sh temp.txt | sort | uniq -u > different.txt
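
The two pipelines above rely on each line appearing at most once within a file: a line that is duplicated inside prank.sh is treated by uniq -d as if it were shared, and then dropped by uniq -u. As a minimal sketch of the same idea in Python (the function name lines_only_in is made up for illustration, and both files are assumed to fit in memory), the lines missing from the subset file can be collected in their original order:

```python
from pathlib import Path


def lines_only_in(full_path, subset_path):
    """Return the lines of full_path that do not occur in subset_path, in file order."""
    # A set gives fast membership tests for the lines already completed.
    done = set(Path(subset_path).read_text().splitlines())
    return [line for line in Path(full_path).read_text().splitlines()
            if line not in done]
```

Calling lines_only_in("prank.sh", "prank.sh.completed") would return the remaining lines as a list, which can then be written to different.txt.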

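(The rest of this post is a side note, tagging along with the tip above.)

A couple of days ago I ran an analysis with CAFE 4.2, following my local notes and online tutorials,

and downloaded various ready-made scripts from the web. Very convenient.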

cafe 01cafe.sh

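The command above failed (I forgot to record the error message). Cause: the λ categories for the ancestral positions in the tree were missing. Adding them fixed it.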

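When extracting the IDs of the expanded and contracted gene families, the cafetutorial_report_analysis.py script also failed (again, I forgot to record the message). Cause: the cafecore module was missing. Fix: search GitHub for cafecore.py, download it, and put it in the same directory as the Python scripts.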

Running it again produced the following error:


  File "./cafecore.py", line 8
    <!DOCTYPE html>
    ^
SyntaxError: invalid syntax

Thanks to Stack Overflow user user2357112, who pointed out that the cause was that I had not actually downloaded the cafecore.py script:

“You're not downloading the script. You're downloading a GitHub web page with the script and a whole bunch of other stuff on it, like GitHub navigation and a search bar and clickable line numbers.”
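
The diagnosis quoted above can be checked mechanically. The following is a rough sketch, not part of CAFE or its tutorial scripts: github_raw_url rewrites a github.com "blob" page address into the raw.githubusercontent.com form that serves the plain file, and looks_like_html is a simple heuristic for spotting a saved web page. Both function names are invented for this example.

```python
from urllib.parse import urlsplit


def github_raw_url(blob_url):
    """Convert a github.com '/blob/' page URL into its raw.githubusercontent.com form."""
    parts = urlsplit(blob_url)
    segments = parts.path.strip("/").split("/")
    # Expected layout: <owner>/<repo>/blob/<branch>/<path...>
    if parts.netloc != "github.com" or len(segments) < 5 or segments[2] != "blob":
        raise ValueError("not a GitHub blob URL: " + blob_url)
    owner, repo, _, *rest = segments
    return "https://raw.githubusercontent.com/" + "/".join([owner, repo] + rest)


def looks_like_html(path):
    """Heuristic: True if an HTML doctype or <html> tag appears near the top of the file."""
    with open(path, "rb") as handle:
        head = handle.read(2048).lower()
    return b"<!doctype html" in head or b"<html" in head
```

For a downloaded file, looks_like_html("cafecore.py") returning True would suggest that the GitHub page was saved instead of the script itself.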

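So I downloaded cafecore.py as Python source rather than as an HTML page. At this point, running the script on its own worked, in the sense that it printed its usage information (shown below):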

python cafetutorial_report_analysis.py

|**Error 1: -i must be defined |

usage: cafetutorial_report_analysis.py [-h] [-i INPUT_FILE]
                                       [-e USER_ERR_START] [-d USER_TMP_DIR]
                                       [-f FIRST_RUN] [-c CURVE_OPTION]
                                       [-t ERROR_TRIES] [-l USER_LOG_FILE]
                                       [-o OUTPUT_FILE] [-s IND_MIN]
                                       [-v VERBOSE]

optional arguments:
  -h, --help         show this help message and exit
  -i INPUT_FILE      A CAFE shell script with the full CAFE path in the
                     shebang line, the load, tree, and lambda commands. These
                     lines will be read and incorporated into the caferror
                     shell script.
  -e USER_ERR_START  The starting point for the grid search. Should be between
                     0 and 1. Default: 0.4
  -d USER_TMP_DIR    A directory in which all caferror files will be stored.
                     If none is specified, it will default to caferror_X, with
                     X being some integer one higher than the last directory.
  -f FIRST_RUN       Boolean option to perform a pre-error model run (1) or
                     not (0). Default: 0
  -c CURVE_OPTION    Boolean option. caferror can either perform the grid
                     search (0) or search a pre-specified space (1). Default:
                     0
  -t ERROR_TRIES     A list of error values to search over. Note: -c MUST be
                     set to 1 to use these values. Enter as a comma delimited
                     string, ie -t 0.1,0.2,0.3
  -l USER_LOG_FILE   Specify the name for caferror's log file here. Default:
                     caferrorLog.txt
  -o OUTPUT_FILE     Output file which stores only the error model and score
                     for each run. Default: caferror_default_output.txt
  -s IND_MIN         Boolean option to specify whether to perform only the
                     global error search (0) or continue with individual
                     species minimizations (1). Default: 0
  -v VERBOSE         Boolean option to have detailed information for each CAFE
                     run printed to the screen (1) or not (0). Default: 1

Once I supplied the input file and parameters, a new error appeared:

python python_scripts/cafetutorial_report_analysis.py -i reports/report_run1.cafe -o reports/summary_run1 -r 0

Error output:

 File "cafetutorial_report_analysis.py", line 8, in <module>
    import sys, os, argparse, cafecore as cafecore
  File "./cafecore.py", line 420, in <module>
    treestring = Tree[Tree.index("("):];
NameError: name 'Tree' is not defined
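
A NameError like this one usually means the variable is only assigned inside a branch that never ran, for example because no input line matched the expected prefix. The downloaded cafecore.py is not reproduced in this post, so the following is only a hypothetical sketch (extract_tree_string is an invented name) of a lookup that ignores the case of the prefix and fails with an explicit message when nothing matches:

```python
def extract_tree_string(lines):
    """Return the parenthesized tree from the first line that starts with 'tree' (any case)."""
    for line in lines:
        if line.lstrip().lower().startswith("tree") and "(" in line:
            # Keep everything from the first "(" onward, as the traceback line does.
            return line[line.index("("):].strip()
    # Failing loudly here is easier to debug than a NameError further down.
    raise ValueError("no line starting with 'tree' and containing '(' was found")
```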

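After checking the input file format against the script, and a good deal of head-scratching, I changed the script to match Tree instead of tree, because the input file capitalizes the T. I reran the command above and, as expected, got yet another error (below):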

File "cafetutorial_report_analysis.py", line 8, in <module>
    import sys, os, argparse, cafecore as cafecore
  File "./cafecore.py", line 491, in <module>
    printWrite(caferrorLog, 1, "# CAFE path set as:", CafePath, pad);

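I asked a more experienced colleague, who suggested the problem might be a missing full path. After adding the full path it still failed. I suddenly felt drained, with no hope of getting through this.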

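So I dug out the files from a case I had worked on before and checked whether the relevant files differed. I uploaded all of my local Python scripts and ran them, and they worked right away.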
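
A comparison like this can also be scripted. As a minimal sketch using Python's standard difflib module (diff_scripts is a name chosen for this example, and both files are assumed to be readable as text), a known-good copy of a script can be compared with a freshly downloaded one:

```python
import difflib
from pathlib import Path


def diff_scripts(known_good, downloaded):
    """Return unified-diff lines between two text files (an empty list if they match)."""
    good = Path(known_good).read_text().splitlines(keepends=True)
    new = Path(downloaded).read_text().splitlines(keepends=True)
    return list(difflib.unified_diff(good, new,
                                     fromfile=str(known_good),
                                     tofile=str(downloaded)))
```

An empty result means the two copies are identical; otherwise the lines starting with - and + show where the downloaded copy departs from the working one.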

The takeaway: scripts found online can sometimes really lead you astray.

I skipped my midday nap, and thinking back over all these errors is giving me a headache again!