BMC genomics – 2017 – Oreochromis niloticus (Nile Tilapia) – sex determination regions

Sex determination regions

The new O_niloticus_UMD1 assembly was used to study sequence differentiation across two sex-determining regions in tilapias. The first region is an XX/XY sex-determination region on LG1 found in many strains of tilapia [9, 34, 44–47]. We previously characterized this region by whole genome Illumina re-sequencing of pooled DNA from males and females [48]. We realigned these sequences to the new O_niloticus_UMD1 assembly and searched for variants that were fixed in the XX female pool and polymorphic in the XY male pool. Figure 4 shows the FST and the sex-patterned variant allele frequencies for the XX/XY O. niloticus comparison across the complete Orenil1.1 and O_niloticus_UMD1 assemblies, while Fig. 5 focuses on the highly differentiated ~9 Mbp region on LG1 with a substantial number of sex-patterned variants, indicative of a reduction in recombination in a sex determination region that has existed for some time [48].

The second sex comparison is for a ZZ/WZ sex-determination region on LG3 in a strain of O. aureus [11, 49]. This region has not previously been characterized using whole genome sequencing. For this comparison we identified variant alleles fixed in the ZZ male pool and polymorphic in the WZ female pool. Figure 6 shows the FST and the sex-patterned variant allele frequencies for this comparison across the whole O_niloticus_UMD1 assembly, while Fig. 7 focuses on the differentiated region on LG3. O. aureus LG3 contains a large ~50 Mbp region of differentiated sex-patterned variants, also indicative of a reduction in recombination in the sex determination region. Figure 6 also shows this differentiation pattern on several other LGs (LG7, LG9, LG14, LG16, LG18, LG22 and LG23). It is possible that these smaller regions of sex-patterned differentiation are actually translocations in O. aureus relative to the O. niloticus genome assembly.
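
Note to self: the "fixed in one pool, polymorphic in the other" criterion can be sketched in a few lines. This is only an illustration, not the authors' pipeline. The function names, the frequency cut-offs (0.05, 0.3, 0.7) and the simple two-pool FST formula (H_T - H_S) / H_T are my assumptions.

```python
def fst(p1, p2):
    """Per-site FST = (H_T - H_S) / H_T for two equally weighted pools.

    p1 and p2 are the frequencies of the same allele in each pool.
    """
    p_bar = (p1 + p2) / 2
    h_t = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pools combined
    if h_t == 0:
        return 0.0                         # site is monomorphic in both pools
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2   # mean within-pool value
    return (h_t - h_s) / h_t


def is_sex_patterned(homogametic_freq, heterogametic_freq,
                     fixed_tol=0.05, low=0.3, high=0.7):
    """True if the allele is (nearly) fixed in the homogametic pool (XX or ZZ)
    and at intermediate frequency in the heterogametic pool (XY or WZ).

    All thresholds are illustrative assumptions.
    """
    fixed = homogametic_freq <= fixed_tol or homogametic_freq >= 1 - fixed_tol
    polymorphic = low <= heterogametic_freq <= high
    return fixed and polymorphic


if __name__ == "__main__":
    # Hypothetical sites: (frequency in XX pool, frequency in XY pool)
    for xx, xy in [(0.0, 0.5), (0.5, 0.5), (1.0, 0.48), (0.0, 0.02)]:
        print(xx, xy, round(fst(xx, xy), 3), is_sex_patterned(xx, xy))
```

For the ZZ/WZ comparison the same function applies with the ZZ male pool passed as the homogametic pool.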

summary of phylogenetic tree


2014-RAxML version 8  ->  2006-RAxML-VI-HPC  ->  2005-RAxML-III


=>  1981-Evolutionary Trees from DNA Sequences: A Maximum Likelihood Approach


=>  Maximum Likelihood Approach ->  statistics




concatenation: join the genes of interest end to end and analyze them together

Beavis effect

In a simulation study, William D. Beavis showed that the average estimates of phenotypic variances associated with correctly identified QTL were greatly overestimated if only 100 progeny were evaluated, slightly overestimated if 500 progeny were evaluated, and fairly close to the actual magnitude when 1000 progeny were evaluated.
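
A rough simulation sketch of the effect (not Beavis's original design). Everything here is an assumption for illustration: one marker in a backcross, a true effect of 0.29 residual SD (about 2% of phenotypic variance), a |t| > 3 significance cut-off, and 300 replicates per sample size. Only the replicates that pass the cut-off are averaged, which is where the upward bias comes from.

```python
import math
import random


def estimated_r2(n, effect, rng):
    """Simulate n backcross progeny at one marker; return (r_squared, |t|)."""
    genotypes = [rng.randint(0, 1) for _ in range(n)]
    phenotypes = [effect * g + rng.gauss(0.0, 1.0) for g in genotypes]
    mean_g = sum(genotypes) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0, 0.0                    # degenerate sample, no information
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    r2 = sxy * sxy / (sxx * syy)           # estimated fraction of variance explained
    t = math.sqrt(r2 * (n - 2) / (1.0 - r2))
    return r2, t


def mean_significant_r2(n, effect=0.29, replicates=300, t_threshold=3.0, seed=1):
    """Average estimated r^2 over the replicates declared significant."""
    rng = random.Random(seed)
    significant = []
    for _ in range(replicates):
        r2, t = estimated_r2(n, effect, rng)
        if t > t_threshold:
            significant.append(r2)
    return sum(significant) / len(significant) if significant else float("nan")


# True fraction of variance explained: genotype variance is 0.25 in a backcross.
TRUE_R2 = 0.25 * 0.29 ** 2 / (0.25 * 0.29 ** 2 + 1.0)

if __name__ == "__main__":
    print("true r^2:", round(TRUE_R2, 4))
    for n in (100, 500, 1000):
        print(n, "progeny -> mean r^2 among significant:",
              round(mean_significant_r2(n), 4))
```

With these settings, any replicate that reaches significance at 100 progeny must have an estimated r^2 above roughly 0.08, several times the true value, while at 1000 progeny the cut-off is low enough that most replicates pass and the average stays close to the truth.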


QTL Analysis

a) Quantitative trait locus (QTL) mapping requires parental strains (red and blue plots) that differ genetically for the trait, such as lines created by divergent artificial selection.

b) The parental lines are crossed to create F1 individuals (not shown), which are then crossed among themselves to create an F2, or crossed to one of the parent lines to create backcross progeny. Both of these crosses produce individuals or strains that contain different fractions of the genome of each parental line. The phenotype for each of these recombinant individuals or lines is assessed, as is the genotype of markers that vary between the parental strains.

c) Statistical techniques such as composite interval mapping evaluate the probability that a marker or an interval between two markers is associated with a QTL affecting the trait, while simultaneously controlling for the effects of other markers on the trait. The results of such an analysis are presented as a plot of the test statistic against the chromosomal map position, in recombination units (cM). Positions of the markers are shown as triangles. The horizontal line marks the significance threshold. Likelihood ratios above this line are formally significant, with the best estimate of QTL positions given by the chromosomal position corresponding to the highest significant likelihood ratio. Thus, the figure shows five possible QTL, with the best-supported QTL around 10 and 60 cM.
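
A minimal sketch of the "test statistic along the map" idea in (c). It uses plain single-marker regression, which is a much simpler stand-in for composite interval mapping (no interval positions, no cofactor markers). The marker names, map positions, genotypes and phenotypes are made-up toy data. The LOD is computed as (n/2) * log10(RSS0 / RSS1), comparing the no-QTL model with the one-marker model.

```python
import math


def lod_score(genotypes, phenotypes):
    """LOD for a single-marker regression of phenotype on genotype (0/1 coding)."""
    n = len(phenotypes)
    mean_y = sum(phenotypes) / n
    mean_g = sum(genotypes) / n
    rss0 = sum((y - mean_y) ** 2 for y in phenotypes)     # null model: mean only
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    if sxx == 0 or rss0 == 0:
        return 0.0                                        # marker or trait does not vary
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotypes, phenotypes))
    rss1 = rss0 - sxy * sxy / sxx                         # residual SS with the marker
    if rss1 <= 0:
        return float("inf")                               # perfect fit
    return (n / 2) * math.log10(rss0 / rss1)


# Toy data (hypothetical): map position in cM and genotypes of 8 backcross progeny.
MARKERS = {
    "m1": (0.0,  [0, 0, 1, 1, 0, 1, 0, 1]),
    "m2": (10.0, [0, 0, 0, 1, 1, 1, 0, 1]),
    "m3": (25.0, [0, 1, 0, 1, 1, 0, 1, 0]),
}
PHENOTYPES = [1.0, 1.2, 0.9, 2.1, 2.0, 1.9, 1.1, 2.2]


def scan(markers, phenotypes):
    """Return {marker: (position_cM, LOD)} for every marker."""
    return {name: (pos, lod_score(geno, phenotypes))
            for name, (pos, geno) in markers.items()}


if __name__ == "__main__":
    results = scan(MARKERS, PHENOTYPES)
    for name, (pos, lod) in results.items():
        print(f"{name} at {pos} cM: LOD = {lod:.2f}")
    best = max(results, key=lambda name: results[name][1])
    print("best-supported marker:", best)
```

The best QTL position estimate is the map position with the highest statistic above the significance threshold. The threshold itself (usually set by permutation) is not computed here.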



MIKAWA Satoshi (Dr. 美川智)

2. phylogeny



plink --23file JPT-NA19001.snp JPT ID002 --out JPT-NA19001

plink --bfile JPT-NA19001 --exclude merge.missnp --make-bed --out new

plink --bfile source1 --bmerge source2_trial --make-bed --out merged_trial

plink --merge-list merge_list --make-bed --out merge


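1. The essence of object-oriented programming is that each kind of data carries its own operations, so users do not need to understand how to manipulate a complex data structure; they only need to learn that data type's interface.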


C++ templates and generic programming

"Generic programming aims to write code that is independent of data types." C++ Primer Plus (6th ed.)





software: 1). samtools; 2). annovar;

file format: 1). vcf; 2). sam/bam; 3). gff (end coordinate included); 4). the Description of Sequence Variants (nomenclature)


software: 1). bed; (coordinates are 0-based and the end position is not included; e.g. "3 5" covers the 4th and 5th bases, 2 bases in total);
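
Quick check of the BED convention (0-based start, end excluded) against a 1-based, end-inclusive system such as GFF. The helper names and the toy sequence are made up for illustration. Python slicing happens to use the same half-open convention as BED.

```python
def bed_to_one_based(start, end):
    """Convert a BED interval (0-based, end excluded) to 1-based, end-inclusive."""
    return start + 1, end


def bed_length(start, end):
    """Number of bases covered by a BED interval."""
    return end - start


def bed_slice(sequence, start, end):
    """Extract the bases covered by a BED interval from a sequence string."""
    return sequence[start:end]


if __name__ == "__main__":
    seq = "ACGTACGT"                      # toy sequence (hypothetical)
    print(bed_to_one_based(3, 5))         # 1-based, inclusive coordinates
    print(bed_length(3, 5))               # number of bases covered
    print(bed_slice(seq, 3, 5))           # the covered bases
```

So a BED line with "3 5" corresponds to 1-based positions 4 to 5, and the interval length is simply end minus start.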


HOX gene

ref: 2013-the regulation of hox gene expression during animal development


homeosis: the replacement of part of one segment of an insect or other segmented animal by a structure characteristic of a different segment, especially through mutation.
homeobox: any of a class of closely similar sequences which occur in various genes and are involved in regulating embryonic development in a wide range of species.



GATK caveat

1. Selection / filtering

VariantFiltration: Filter variant calls based on INFO and/or FORMAT annotations
output: A filtered VCF in which passing variants are annotated as PASS and failing variants are annotated with the name(s) of the filter(s) they failed.
SelectVariants:    Select a subset of variants from a VCF file.
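
To keep the two steps straight, here is a toy model of the logic in Python. It is not GATK code and does not call GATK. The record layout, the filter names (LowQD, HighFS) and the thresholds are illustrative assumptions, and real GATK handling of missing annotations may differ. Filtering only labels records. Selecting is what actually removes them.

```python
def apply_filters(info, filters):
    """Return 'PASS' or the ';'-joined names of the filters the record failed.

    `filters` maps a filter name to a predicate that is True when the
    record FAILS that filter.
    """
    failed = [name for name, fails in filters.items() if fails(info)]
    return ";".join(failed) if failed else "PASS"


# Illustrative hard filters on INFO annotations (assumed thresholds).
FILTERS = {
    "LowQD": lambda info: info.get("QD", float("inf")) < 2.0,
    "HighFS": lambda info: info.get("FS", 0.0) > 60.0,
}

# Hypothetical variant records with only the fields needed here.
records = [
    {"pos": 101, "info": {"QD": 12.5, "FS": 1.2}},
    {"pos": 202, "info": {"QD": 1.1, "FS": 75.0}},
    {"pos": 303, "info": {"QD": 1.5, "FS": 3.0}},
]

# Step 1 (VariantFiltration-like): annotate every record, drop nothing.
for rec in records:
    rec["filter"] = apply_filters(rec["info"], FILTERS)

# Step 2 (SelectVariants-like): keep only the subset that passed.
passing = [rec for rec in records if rec["filter"] == "PASS"]

if __name__ == "__main__":
    for rec in records:
        print(rec["pos"], rec["filter"])
    print("kept:", [rec["pos"] for rec in passing])
```

The caveat is that a filtered VCF still contains the failing sites, so any downstream count taken before the selection step includes them.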