TED speaker

Benoit Mandelbrot: (20 November 1924 – 14 October 2010)

TED title: Fractals and the art of roughness
Introduction: Benoit Mandelbrot was a Polish-born French and American mathematician. He is recognized for his contributions to fractal geometry, which included coining the word “fractal”, and for developing a “theory of roughness” and “self-similarity” in nature.

Carter Emmart

TED title: A 3D atlas of the universe
Introduction: It is a standalone 4-dimensional space visualization application built on the programmable Partiview data visualization engine designed by Stuart Levy of the National Center for Supercomputing Applications (NCSA) as an adjunct of the NCSA’s Virtual Director virtual choreography project. The Virtual Universe Atlas project was launched by the American Museum of Natural History’s Hayden Planetarium with significant programming support from the National Aeronautics and Space Administration as well as Stuart Levy. The database draws on the National Virtual Observatory.

Daniel Kahneman

TED title: The riddle of experience vs. memory.
Introduction: Daniel Kahneman (Hebrew: דניאל כהנמן‎, born March 5, 1934) is an Israeli-American psychologist notable for his work on the psychology of judgment and decision-making, as well as behavioral economics, for which he was awarded the 2002 Nobel Memorial Prize in Economic Sciences (shared with Vernon L. Smith). His empirical findings challenge the assumption of human rationality prevailing in modern economic theory.

Python tips

  1. Install pip:
    1. Download: https://bootstrap.pypa.io/get-pip.py
    2. Install: python get-pip.py
  2. Use pip on Windows:
    python -m pip
  3. error: Microsoft Visual C++ 9.0 required (Unable to find vcvarsall.bat).
    Fix: download VCForPython27.msi.
    URL: http://www.microsoft.com/en-us/download/confirmation.aspx?id=44266
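
The "python -m pip" form in item 2 can also be used from inside a script. The following is a minimal sketch, assuming pip is installed for the interpreter that runs the script; `pip_command` and `run_pip` are hypothetical helper names introduced only for illustration.

```python
import subprocess
import sys


def pip_command(*args):
    """Build a command that runs pip as a module of the current interpreter."""
    # sys.executable is the interpreter running this script, so the command
    # does not depend on a separate "pip" executable being on the PATH.
    return [sys.executable, "-m", "pip", *args]


def run_pip(*args):
    """Run pip with the given arguments and return its exit code."""
    # check=False: a non-zero exit code is returned instead of raising.
    return subprocess.run(pip_command(*args), check=False).returncode
```

For example, `run_pip("--version")` runs the equivalent of `python -m pip --version` with whichever interpreter is executing the script.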








Basic information on S. suis

  1. One of the most prevalent pathogens in swine, causing a range of disease syndromes including arthritis, meningitis, pneumonia, septicemia and endocarditis. [1]
  2. A zoonotic agent able to induce meningitis, endocarditis, and streptococcal toxic shock-like syndrome in humans. [1]
  3. Thirty-three S. suis serotypes have been identified on the basis of antigenic differences in their CPS (capsular polysaccharides). [1]
  4. S. suis 2 mainly infects people who have direct contact with carrier pigs, sick pigs, or raw pork, via wounds on the skin or the mucosa of the mouth or nasal cavity. [1]
  5. As of Dec. 31, 2013, 1,642 cases of human S. suis infection had been reported worldwide. [1]

[1]. Zhang, Y., Ding, D., Liu, M., Yang, X., Zong, B., Wang, X., Chen, H., Bei, W., and Tan, C. (2016). Effect of the glycosyltransferases on the capsular polysaccharide synthesis of Streptococcus suis serotype 2. Microbiological research.

The elements for building a website

  1. Domain name
  2. Web hosting
  3. File manager
  4. Raster graphics editor
  5. Browser
    Recommended: Google Chrome, Firefox
  6. CMS (Content Management System)

Key points of microarray analysis

  1. Biological replicates
    Five or more replicates are usually robust for microarray studies.
  2. qPCR validation
    Microarrays may give many false positives, so it is usually necessary to validate the differential expression observed for some of the key genes.

Effect of the glycosyltransferases on the CPS (capsular polysaccharide) synthesis of S. suis 2

  1. Deletion of the cps genes in S. suis 2 SC19 results in incomplete CPS;
  2. Deletion of these genes (cps2E, cps2G, cps2J and cps2L) changes the interplay between
    S. suis 2 SC19 and different cell lines in vitro
  3. More complement C3 from porcine serum is deposited on the mutant strains
    than on the wild type (WT)
  4. The cps genes play an essential role in the viability of SC19 in a murine model

Zhang, Y., Ding, D., Liu, M., Yang, X., Zong, B., Wang, X., Chen, H., Bei, W., and Tan, C. (2016). Effect of the glycosyltransferases on the capsular polysaccharide synthesis of Streptococcus suis serotype 2. Microbiological research.

LaTeX notes

  1. On Windows, LaTeX expects "/" as the path separator, whereas paths obtained from Perl's File::Spec module use "\";
  2. Generate DVI: latex filename.tex;
  3. Generate PDF: dvipdfm filename.dvi;

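Notes on using KEGG

  1. The number of pathways under bta keeps growing, so mixing previously fetched data with current data causes errors;
  2. When batch-downloading images generated by KEGG Mapper, network conditions may leave some downloads incomplete; always verify carefully that the number of files matches and that every image is complete;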
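
The check described in note 2 can be automated. The following is a minimal sketch, assuming the images are saved as PNG files named `<pathway id>.png` in one directory (the naming scheme, `is_complete_png` and `check_downloads` are assumptions made for illustration, not part of KEGG). It relies on the fact that a complete PNG file starts with a fixed 8-byte signature and ends with a fixed 12-byte IEND chunk, so a truncated download fails the second test.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Last 12 bytes of every complete PNG: zero-length IEND chunk plus its CRC.
PNG_TRAILER = b"\x00\x00\x00\x00IEND\xaeB`\x82"


def is_complete_png(path):
    """Return True if the file starts with the PNG signature and ends with IEND."""
    data = Path(path).read_bytes()
    return data.startswith(PNG_SIGNATURE) and data.endswith(PNG_TRAILER)


def check_downloads(directory, expected_ids):
    """Compare downloaded <id>.png files against the expected pathway IDs.

    Returns (missing, broken): IDs with no file, and IDs whose file is
    not a complete PNG (for example, a truncated download).
    """
    directory = Path(directory)
    missing, broken = [], []
    for pathway_id in expected_ids:
        image = directory / f"{pathway_id}.png"
        if not image.exists():
            missing.append(pathway_id)
        elif not is_complete_png(image):
            broken.append(pathway_id)
    return missing, broken
```

Calling `check_downloads("images", ids)` with the list of pathway IDs that were requested returns the IDs to download again; both lists are empty when the count matches and every image is complete.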


In KEGG, molecular-level functions are stored in the KO (KEGG Orthology) database and associated with ortholog groups so that experimental evidence from a specific organism can be extended to other organisms. Genome annotation in KEGG is ortholog annotation: KO identifiers (K numbers) are assigned to individual genes in the GENES database. No updates are made to original data, such as gene names and descriptions given by RefSeq or GenBank, even if they are inconsistent with the KO assignment.

Major efforts have been initiated to associate each KO entry with experimental evidence of functionally characterized sequence data, now shown in the SEQUENCE subfield of the REFERENCE field. Furthermore, the genome-based collection of KEGG GENES (http://www.genome.jp/kegg/genes.html) has been expanded to allow individual protein data to be included in the addendum category. Eventually the KO database will cover all knowledge on functionally characterized protein sequences (see also KEGG Enzyme, http://www.genome.jp/kegg/annotation/enzyme.html).

In general, KO grouping of functional orthologs is defined in the context of KEGG molecular networks (KEGG pathway maps, BRITE hierarchies and KEGG modules), which are in fact represented as networks of nodes identified by K numbers. The relationships between KOs and the corresponding molecular networks are represented in the following KO system.

KEGG Orthology (KO)

The fact that functional information is associated with ortholog groups is a unique aspect of the KEGG resource. Sequence-similarity-based inference, as a generalization of a limited amount of experimental evidence, is predefined in KEGG. As implemented in BlastKOALA and other tools, a sequence similarity search against KEGG GENES is a search for the most appropriate K numbers. Once K numbers are assigned to genes in the genome, the KEGG pathway maps, BRITE hierarchies and KEGG modules are automatically reconstructed, enabling biological interpretation of high-level functions.
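
The reconstruction step, in which K numbers assigned to genes determine which pathway maps a genome covers, can be illustrated with a toy sketch. This is not KEGG's implementation: the gene names, K numbers and map identifiers below are placeholders chosen for illustration and were not looked up in KEGG, and `reconstruct_pathways` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical toy data: identifiers are placeholders, not real KEGG records.
GENE_TO_KO = {"geneA": "K00001", "geneB": "K00002", "geneC": None, "geneD": "K00001"}
KO_TO_PATHWAYS = {"K00001": ["map00010", "map00071"], "K00002": ["map00010"]}


def reconstruct_pathways(gene_to_ko, ko_to_pathways):
    """Group annotated genes by the pathway maps their K numbers appear in."""
    pathways = defaultdict(set)
    for gene, ko in gene_to_ko.items():
        if ko is None:
            continue  # a gene without a K number contributes to no pathway
        for pathway in ko_to_pathways.get(ko, []):
            pathways[pathway].add(gene)
    # Sort gene names so the result is deterministic.
    return {pathway: sorted(genes) for pathway, genes in pathways.items()}
```

With the toy data, `reconstruct_pathways(GENE_TO_KO, KO_TO_PATHWAYS)` groups geneA, geneB and geneD under map00010 and geneA and geneD under map00071, while the unannotated geneC appears nowhere. This mirrors the point of the text: the pathway view follows automatically once K numbers are assigned.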



DAVID-WS (web service) has been developed to automate user tasks by providing stateful web services to access DAVID programmatically without the need for human interactions. [1]


DAVID-WS is made stateful by keeping the state-related input of a user operation in a session context that can be accessed by subsequent user operations within the same session. Users can add lists, change background populations, select species and categories and reset functional parameters for data analysis, as well as query all tools within the same session and format output as desired. [1]

[1] Jiao, X., Sherman, B.T., Huang da, W., Stephens, R., Baseler, M.W., Lane, H.C., and Lempicki, R.A. (2012). DAVID-WS: a stateful web service to facilitate gene/protein list analysis. Bioinformatics 28, 1805-1806.


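  1. Script location information
    1. Location of the (child) script itself: /home/wangyu/
    use File::Spec;
    my $path_curf = File::Spec->rel2abs(__FILE__);
    my ($vol, $dirs, $file) = File::Spec->splitpath($path_curf);
    2. Where the (main) script was invoked from: /home/wangyu/code
    3. Where the program has currently changed directory (chdir) to: /lustre/Work
    Note: the return value of `pwd` must be chomp-ed to strip the trailing newline. Explanation: 1. a.pl calls b.pl, so a.pl is the main script and b.pl is the child script; 2. a.pl is in /home/wangyu/code/perl and b.pl is in /home/wangyu; 3. after chdir to /lustre/Work, b.pl is called, and inside b.pl the path is determined in these three ways.
     1. In the Git environment on Windows, pwd returns a path of the form '/a/b/c'; passing this path to chdir fails with a "not found" error. Use the getcwd() function from the Cwd module instead.
  2. perl -d: enables the debugger
  3. On Windows, specify a path in HTML as: ”file:\/\/\/path_to_the_file”;
  4. Before calling split on input data, be sure to chomp it first;
    A real-world case: 1. if a variable $var contains a newline and is used in system “gzip -d -c $var > filename”, everything after $var in the command has no effect, because the newline inside $var has already ended the command.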
  5. Installation:
    perl -MCPAN -e shell
    install SOAP::Lite
  6. Error “Your Perl is configured to link against libgdbm, but libgdbm.so was not found.”: fix with aptitude install libgdbm-dev
  7. Please tell me where I can find your apache src:
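  8. Rounding function: int($number+0.5) (valid only for non-negative $number, because int truncates toward zero)
  9. Unquoted string “..” may clash with future reserved word
    I met this warning because my filehandle name was lowercase and warnings were enabled. It is better to use uppercase filehandle names, as the Perl developers recommend.
  10. $$: the process ID of the running script;
  11. Perl one-liner: modify file contents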
     perl -p -i -e 's/from/to/' *.file

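    -p: print each line after processing (-n: do not print automatically)
    -i: sets the backup file suffix (e.g. -i.bak); if -i is given without a suffix, the original file is overwritten
    *.file: the files to modify

  12. Back up and reinstall installed modules
    perl -MCPAN -e autobundle
    perl -MCPAN -e 'install Bundle::Snapshot_2017_03_10_00'
  13. Regex alternation: /(\.snp\.gz|\.snp\.tar\.gz|\.snp)/; the alternative that matched is saved in $1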
  14. strict refs
    Can't use string as a symbol ref while "strict refs" in use
    Can't use string as a HASH ref while "strict refs" in use











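The demand for cloned genes has grown steadily over the past 32 years, yet some of those who need them find it hard to make them on their own. Traditional ligation-based cloning does have pitfalls, and falling into them unawares can have fairly serious consequences, but most of them can be avoided with enough care. Here we review the chemistry of the enzymes and reagents used when cloning genes into plasmids, and focus on the most relevant details. Notably, we explore the advantages of monitoring by agarose gel electrophoresis, the principles of the interaction between DNA and silica, and the problems that arise when using thermostable DNA polymerases, restriction endonucleases and T4 DNA ligase. We also describe common pitfalls in E. coli transformation and in the use of DNA-modifying enzymes. A thorough understanding of established methods is necessary for troubleshooting problems that arise when techniques are adapted, for implementing alternatives as needed, and for creating new protocols.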






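“Awesome” agarose

Traditionally, PCR products are cloned by using primers that carry unique restriction sites. Restriction endonucleases are used to generate cohesive ends on the PCR product and on the recipient plasmid. The digestion products are separated by agarose gel electrophoresis, and the desired restriction fragments are then recovered from the gel. The two DNA fragments are joined by T4 DNA ligase. The ligation products are then transformed into E. coli and, finally, transformants are selected through the antibiotic resistance marker on the plasmid. This traditional approach is labor-intensive, but the pitfalls at almost every step are well recognized and, except for DNA purity and structural heterogeneity at the ends of PCR products, can be monitored well by agarose gel electrophoresis (Figure 1). We therefore believe that plasmids and PCR products should be checked before and after purification, after digestion, after gel purification and after ligation, so that every step is confirmed before moving on to the next. Microvolume UV spectrophotometers have become very popular in recent years, but these instruments are not designed to distinguish substrates from the products of DNA modification. Agarose gel electrophoresis lets us spot problems as soon as they arise; experienced genetic engineers save a great deal of time by solving these problems rather than repeatedly patching things up and starting over.

Alternatives to agarose gel electrophoresis have been devised, but they are not yet widely used. In my laboratory, “mini-gels” are our method for resolving small fragments (<5 kb), because they are convenient, fast and, compared with large gels, more sensitive and cheaper in reagents (agarose, stain and running buffer; see the supplementary material). Mini-gels cast on 75×52 glass plates can be made in bulk and stored (Figure 2). Shorter and thinner gels can be run faster than thicker ones without overheating (only 5 V/cm is needed in a small gel box such as the Owl C2-S). Resolution (the physical separation of bands) is not as good as with large gels (Figure 3), but it is sufficient for the small fragments handled in routine cloning. We use a low-molarity running medium (10 mM lithium borate, pH 8.2) to resolve small bands (<2 kb) quickly; we use the more conductive TAE buffer (40 mM Tris, 20 mM acetic acid, 1 mM EDTA) for larger bands, because mini-gels run in the low-molarity medium (5 mM lithium borate, pH 7.2) are not up to the task. Any innovation that allows cloners to run more gels is a lasting boost to quality-control standards.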



[My addition: Owl™ C2-S: http://www.thermoscientific.com/content/tfs/en/product/owl-c2-s-micro-electrophoresis-system.html

Owl™ C2-S Micro Electrophoresis System

Screen up to seven samples on one gel in less than five minutes with this product!]




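“Stick-less” silica



Not all researchers use the same gel purification technique, which suggests that none of them is entirely reliable. In our laboratory, we use silica-membrane spin columns to purify digested DNA from agarose gels, but we depart from the commercial protocols (described below) in order to maximize yield and purity. Other commonly used gel purification methods include electroelution (from a gel slice, or by running the band into a second well), binding to and elution from anion-exchange paper (Whatman DE81 or Schleicher and Schuell NA-45), phenol/chloroform extraction from low-melting-point agarose, freeze-and-squeeze, and agarase digestion. Most researchers use silica-membrane spin columns for gel extraction not because the method gives the best recovery or purity, but because it is easier. Agarose residues, or the various solvents and salts introduced during gel recovery, may inhibit T4 DNA ligase activity and in turn reduce cloning efficiency. These substances are hard to detect, so they were generally overlooked; the problem drew attention only later, after agarose gel electrophoresis had led to improved ligation efficiency.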



Matsumura I 2015. Why Johnny can’t clone: Common pitfalls and not so common solutions. Biotechniques 59, IV-XIII.