The GLIBC story

1. Installing glibc-2.29 (system: Linux version 2.6.32)

GNU libc requires kernel header files from Linux 3.2.0 or later to be installed before configuring. checking installed Linux kernel header files… missing or too old!

2. libc.so.6 GLIBC_

3. Changing the system GLIBC can cause:
1. conda stops working
2. R fails to compile
3. system tools stop working

4.

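Acrobat Pro DC manual

1. Acrobat does not support customizing keyboard shortcuts; only the default shortcuts can be used.

1. Zoom: Ctrl+2, fit width; Ctrl+1, actual size
2. Hide: F8 hides the toolbar, F9 hides the menu bar
3. Add a text comment:
4. Preferences: Ctrl+K
5. Hand tool: H
6. Select Text tool: V
7. Highlight Text tool: U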

 

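2. Adobe has a Fast Web View feature. When it is enabled for a PDF, every save is very slow. I save often while studying a document, and each save blocks the PDF for a long time, so I turn this feature off.
Edit -> Preferences -> Documents: under Save Settings, uncheck "Save As optimizes for Fast Web View", then click OK.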

notes of ANNOVAR

1. Coordinate system: by default, a 1-based coordinate system is used.

2. Core program: annotate_variation.pl

3. Annotation types: gene-based (-geneanno), region-based (-regionanno) and filter-based (-filter) annotations.

4. Output:
a. The first file contains annotation for all variants, by adding two columns to the beginning of each input line.
b. The second output file contains the amino acid changes as a result of the exonic variant.

5. Key point:
What about a GFF3 file for a new species? (http://annovar.openbioinformatics.org/en/latest/user-guide/gene/)

gff3ToGenePred (http://hgdownload.soe.ucsc.edu/admin/exe/linux.x86_64/)

6. Caution:
After annotation, ANNOVAR rewrites the coordinates of InDels. For example:
before: 133, ACTG->A
after:  134, CTG->-
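
The shift in the example is consistent with dropping the leading base shared by the reference and alternate alleles. A minimal Python sketch, assuming that is the rule being applied (the function name trim_shared_prefix is hypothetical and this is not ANNOVAR's own code), reproduces the example:

```python
def trim_shared_prefix(pos, ref, alt):
    """Drop leading bases shared by ref and alt, shifting the 1-based position.

    Assumption: this mirrors the indel representation seen in the example
    above; it is an illustration, not ANNOVAR's implementation.
    """
    shared = 0
    while shared < len(ref) and shared < len(alt) and ref[shared] == alt[shared]:
        shared += 1
    ref, alt = ref[shared:], alt[shared:]
    # An allele that becomes empty is written as "-"
    return pos + shared, ref or "-", alt or "-"


print(trim_shared_prefix(133, "ACTG", "A"))  # (134, 'CTG', '-')
```

Keeping such a helper next to the original VCF makes it easier to map annotated InDels back to their input records.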

Windows10 debug & optimize

windows cmd grammar

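  1. Command aliases: doskey
    doskey ls=dir
    doskey vi=vim $*: without $*, arguments typed after the alias are not passed on to it.
    https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/doskey
  2. @: prefixing a command with @ stops the command itself from being printed when it runs.
    @echo foo: prints foo directly, instead of first printing echo foo and then foo.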
  3. foobar

linux-problem set

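  1. ^M (carriage return) is typed as Ctrl+V followed by Ctrl+M
  2. wc -l counts lines by counting newline characters, so if the last line has no trailing newline, the count is one short.
    How do I add the missing newline to the last line?
    sed -i -s '$s/$/\n/;P;d;' file
    # How the sed command works
    # -i: edit in place; implemented by writing the output to a new file and renaming it over the original once the output is complete
    # -s: treat multiple files given as arguments separately instead of as one continuous stream
    # '$s/$/\n/;P;d': the first $ is an address, so only the last line is substituted; P prints the pattern space, but only up to the first newline; d deletes the pattern space and starts the next cycle immediately.
    # Note: by default sed strips the trailing newline when it reads a line, prints the pattern space at the end of the cycle, and restores the newline if one was stripped.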
  3. foobar
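
The wc -l behaviour in item 2 can also be checked and repaired outside sed. The following is a minimal Python sketch, not a replacement for the sed one-liner; the function names count_newlines and ensure_trailing_newline are hypothetical:

```python
from pathlib import Path


def count_newlines(path):
    """Count newline bytes, which is the number `wc -l` reports."""
    return Path(path).read_bytes().count(b"\n")


def ensure_trailing_newline(path):
    """Append a newline if the file is non-empty and does not end with one.

    Returns True when the file was modified.
    """
    data = Path(path).read_bytes()
    if data and not data.endswith(b"\n"):
        with open(path, "ab") as handle:
            handle.write(b"\n")
        return True
    return False
```

Calling ensure_trailing_newline twice on the same file is safe: the second call finds the newline and leaves the file alone.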

framework of ML/DL

1. framework: keras, xgboost…

2. hyperparameters: tune

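3. rationale: multilayer perceptrons, convolutional neural networks, autoencoders…

Implement the underlying principles with a framework, and find the best hyperparameters you can.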

problems of saving webpage

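Saving a web page always goes wrong in one way or another, and I have not found a satisfying tool. I am recording the problems here so that one day I can write my own small, handy web-page-to-PDF tool.

1. The saved page is automatically cut into several pieces, and the cuts land where they disrupt later reading, such as through the middle of an image or an important paragraph.

2. The fonts in the saved page look ugly.

3. Scrolling screenshots are slow.

4. Screenshots taken with Chrome DevTools have low resolution.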

git usage

1. Commit

git add .

git commit -m "your comments about this submission"

git push origin master

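2. Concurrent development and resolving conflicts

Background:
In real work, several people inevitably develop the same software together while owning different features. With git, each person can create a branch, finish the work there, and then merge it back.

Problem:
Suppose colleague A changes the format of a file F while colleague B depends on the old format. A's change lands first; when B's lands, B's code no longer runs because the format of F has changed. Before merging, everyone should test that the merged result still runs, and only then merge. Alternatively, the team should agree that certain files must not be changed.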

3. git clone

4. On Windows, git limits the length of path names, even when Windows 10 itself has lifted the limit.
git config --global core.longpaths true
This did not help; it still fails with: fatal: '$GIT_DIR' too big

5. foobar

plink

1. Common plink formats:
https://www.cog-genomics.org/plink/1.9/formats#ped
ped, map
bed, bim, fam

2. plink gwas

3. plink homepage:
1.07: http://zzz.bwh.harvard.edu/plink/index.shtml
1.9 & 2.0: https://www.cog-genomics.org/plink/

#snp2bedbimfam
plink --23file JPT-NA19001.snp JPT ID002 --out JPT-NA19001

# remove problematic SNPs
plink --bfile JPT-NA19001 --exclude merge.missnp --make-bed --out new

# merge a single fileset
plink --bfile source1 --bmerge source2_trial --make-bed --out merged_trial

# merge multiple filesets
plink --merge-list merge_list --make-bed --out merge

conda usage & debug

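  1. If the network drops partway through a download during conda install, the next install attempt core dumps.
    Fix: delete the package that was being downloaded when the interruption happened from $(dirname "$(which conda)")/../pkgs.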
  2. foobar
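
To find the half-downloaded package from item 1, it helps to list the package cache with the most recently modified entries first. The sketch below is a hedged illustration that assumes the default layout, where conda lives at <prefix>/bin/conda and the cache at <prefix>/pkgs; the function names are hypothetical and nothing is deleted:

```python
from pathlib import Path


def conda_pkgs_dir(conda_executable):
    """Return <prefix>/pkgs for a conda executable located at <prefix>/bin/conda.

    Assumption: the default install layout, with the cache next to bin/.
    """
    return Path(conda_executable).parent.parent / "pkgs"


def list_cached_packages(pkgs_dir):
    """List cache entries, most recently modified first."""
    entries = sorted(
        Path(pkgs_dir).iterdir(),
        key=lambda entry: entry.stat().st_mtime,
        reverse=True,
    )
    return [entry.name for entry in entries]
```

The interrupted download is normally among the first names returned; remove it by hand after checking that it really is the package that was being installed.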

advanced shell programming

  1. Use $(command) to make the output of command part of another command
    grep $(head -n1 a.txt) a.txt
  2. daily & sophisticated shell built-in commands
  3. sed [stream editor]
  4. $'string'

     
  5. set -x
  6. I/O redirection
  7. Inside double quotes, a single quote needs no backslash; a double quote must be escaped with a backslash.
  8. eval
  9. xargs
  10. awk [Aho Weinberger Kernighan]
  11. truncate last line
  12. 12

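centos 8 administration

1. Remove unused kernels and their modules and related files

2. Change grub parameters

3. Add repositories

1. yum config-manager --set-enabled PowerTools
2. yum install epel-release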
2. yum install epel-release

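4. Install X11

Approaches tried:
1. Install xclock so that the system pulls in the X11 components it depends on
yum install xclock
2. yum dep: install R's dependencies
3. Compile R manually:
3.1 Specify the paths:

./configure --prefix=/home/nott/00.software/00.common/R-3.6.1/ --x-libraries=/usr/lib64 --x-includes=/usr/include/
3.2 Remove conda
4. Temporary workaround: use CairoX11()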

5. foobar

Excel programming

1. Count the total number of columns across all sheets

 

2. foobar

daily R_LANG Commands

R is a free software environment for statistical computing and graphics. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.

1. related to environment variables
Sys.getenv()
Sys.setenv(BINPREF = "")

## try http:// if https:// URLs are not supported

2. Change directory: setwd('E:/') or setwd("E:/")
caveat: use '/' not '\' in windows.
*: list items in the current directory: dir()

3. bioconductor:
options(BioC_mirror="https://mirrors.ustc.edu.cn/bioc/") # switch to a mirror in China for faster downloads
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ChIPseeker")

4. update, remove packages
update.packages()
remove.packages()

5. upgrade R
library(installr)
updateR()

6. Switch the RStudio mirror for faster downloads

7. Switch the BiocManager mirror
Mirror list: https://www.bioconductor.org/about/mirrors/
options(BioC_mirror="http://mirrors.ustc.edu.cn/bioc/")

 

The support of gamma function in Perl

I wanted a gamma function and tried to install the Math::GammaFunction package in Perl, but the installation failed.
The package was last updated in January 2007.
I suspect its code is far from current standards.
In the end I decided to use Python.
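
For reference, Python's standard library ships a gamma function in the math module (math.gamma, with math.lgamma for its logarithm), so no extra package is needed. A short sketch:

```python
import math

# gamma(n) equals (n - 1)! for positive integers
print(math.gamma(5))  # 24.0

# gamma(0.5) equals sqrt(pi)
print(math.isclose(math.gamma(0.5), math.sqrt(math.pi)))  # True

# lgamma returns log(|gamma(x)|) and avoids overflow for large arguments
print(math.lgamma(100))
```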

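Spearman's rank correlation coefficient

Spearman's rank correlation coefficient is a technique for summarising the strength and direction (negative or positive) of the relationship between two variables. The result always lies between -1 and 1.

1. How to calculate the coefficient: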

  • Create a table from your data.

Convenience Store  Distance from CAM (m)  Rank distance  Price of 50cl bottle (€)  Rank price  Difference between ranks (d)  d²
1                  50                     10             1.8                       2           8                             64
2                  175                    9              1.2                       3.5         5.5                           30.25
3                  270                    8              2                         1           7                             49
4                  375                    7              1                         6           1                             1
5                  425                    6              1                         6           0                             0
6                  580                    5              1.2                       3.5         1.5                           2.25
7                  710                    4              0.8                       9           -5                            25
8                  790                    3              0.6                       10          -7                            49
9                  890                    2              1                         6           -4                            16
10                 980                    1              0.85                      8           -7                            49
∑d² = 285.5

  • Rank the two data sets. Ranking is achieved by giving the rank '1' to the biggest number in a column, '2' to the second biggest value and so on. The smallest value in the column gets the lowest ranking. This should be done for both sets of measurements.
  • Tied scores are given the mean (average) rank. For example, the three tied scores of 1 euro in the table above are ranked fifth in order of price, but occupy three positions (fifth, sixth and seventh) in a ranking hierarchy of ten. The mean rank in this case is calculated as (5+6+7)/3 = 6.
  • Find the difference in the ranks (d): this is the difference between the ranks of the two values on each row of the table. The rank of the second value (price) is subtracted from the rank of the first (distance from the museum).
  • Square the differences (d²) to remove negative values, and then sum them (∑d²).
  • Calculate the coefficient (Rs) using the formula below. The answer will always be between 1.0 (a perfect positive correlation) and -1.0 (a perfect negative correlation).

Rs = 1 - 6∑d² / (n³ - n)

Now to put all these values into the formula.

  • Find ∑d² by adding up all the values in the d² column. In our example this is 285.5. Multiplying this by 6 gives 1713.
  • Now for the bottom line of the equation. The value n is the number of sites at which you took measurements. In our example this is 10. Substituting this value into n³ - n we get 1000 - 10 = 990.
  • We now have the formula: Rs = 1 - (1713/990), which gives a value for Rs:

1 - 1.73 = -0.73
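
The worked example can be replayed in a few lines of Python. This is a minimal sketch of the simplified formula used in the text (with tied ranks that formula is only approximate); the function names rank_descending and spearman_rs are hypothetical:

```python
def rank_descending(values):
    """Rank values so the largest gets rank 1; ties share the mean rank."""
    ordered = sorted(values, reverse=True)
    ranks = []
    for value in values:
        first = ordered.index(value) + 1   # first position held by this value
        count = ordered.count(value)       # number of tied entries
        ranks.append(first + (count - 1) / 2)
    return ranks


def spearman_rs(x, y):
    """Return (sum of d^2, Rs) with Rs = 1 - 6 * sum(d^2) / (n^3 - n)."""
    rank_x, rank_y = rank_descending(x), rank_descending(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    n = len(x)
    return sum_d2, 1 - 6 * sum_d2 / (n ** 3 - n)


# Data from the table: distance from CAM (m) and price of a 50cl bottle (€)
distance = [50, 175, 270, 375, 425, 580, 710, 790, 890, 980]
price = [1.8, 1.2, 2, 1, 1, 1.2, 0.8, 0.6, 1, 0.85]

sum_d2, rs = spearman_rs(distance, price)
print(sum_d2, round(rs, 2))  # 285.5 -0.73
```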

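2. Testing the significance of the correlation

The web page only provides a table of precomputed critical r values for each significance level, to be compared against the r value we calculated ourselves. It does not explain how they are calculated.

Wikipedia (https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient#Determining_significance) describes three approaches.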
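
One commonly used approach is to convert r into a t statistic and compare it against Student's t distribution with n - 2 degrees of freedom. The sketch below assumes that approximation; it only computes the statistic and leaves the lookup of the critical value to a table or a statistics package. The function name spearman_t_statistic is hypothetical.

```python
import math


def spearman_t_statistic(r, n):
    """t = r * sqrt((n - 2) / (1 - r^2)), to be compared with Student's t, n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# r and n taken from the worked example above
t = spearman_t_statistic(-0.73, 10)
print(round(t, 2), "with", 10 - 2, "degrees of freedom")
```

A large absolute value of t relative to the critical value for the chosen significance level indicates a significant correlation.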

More information: https://geographyfieldwork.com/SpearmansRank.htm (the top result on Google)

 

 

 

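Disabling Hyper-V on win10

Change 1:

1. Open Control Panel
win+r, control
2. Open "Uninstall a program"
3. Turn Windows features on or off
4. Uncheck Hyper-V and Containers (conversely, check them again to use docker)

Change 2: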

1. win+r, bcdedit
2. bcdedit /set hypervisorlaunchtype off

further reading:

https://superuser.com/questions/1391838/virtual-box-is-not-working-on-windows-10
https://blog.csdn.net/imilano/article/details/83038682