R/qtl-related package demo
========================================================

This is a quick demo of R/qtl and related packages R/qtlhot, R/qtlnet and R/qtlyeast for workshops on causal networks. For information and tutorials on R and R/qtl, visit http://www.rqtl.org . Each package his its own vignettes.

R/qtl
---------------------------------

We first show some features of R/qtl. For more info, see http://rqtl.org/tutorials/rqtltour2.pdf .

```{r}
library(qtl)
library(qtlyeast)
summary(yeast.orf)
plot.map(yeast.orf)
```

This shows a summary of the Brem and Kruglyak (2001) yeast dataset, with `r sum(nmar(yeast.orf))` markers and `r nphe(yeast.orf)` phenotypes. 

```{r}
tf.orf <- "YBR158W"
tf.gene <- yeast.annot[yeast.annot$orf == tf.orf, "gene"]

## Invoke QTL hotspot and causal pair library.
library(qtlhot)
cand <- cis.cand.reg$cis.reg[cis.cand.reg$cis.reg$gene == tf.orf,]
cand.peak <- cand$peak.pos
cand.pos <- cis.cand.reg$cis.reg$phys.pos[cis.cand.reg$cis.reg$gene == tf.orf]
```

We examine one particular open reading frame, `r tf.orf`, corresponding to transcription factor, `r tf.gene`. Here is an image of the genotypes for chromosome 2 reordered by this ORF.

```{r}
cand.mar <- which.min(abs(cand.pos - pull.map(yeast.orf)[[2]]))
geno.image(yeast.orf, 2, reorder = find.pheno(yeast.orf, tf.orf))
abline(v = cand.mar, lwd = 3, col = "green")
```

This ORF is located at `r cand.pos`cM; the nearest marker (number `r cand.mar`) is identified with a green line on the genotype image. Now we do a simple interval map scan, which finds the strongest evidence (LOD) for a QTL right at the gene location. The red line is the 1% permutation threshold.

```{r}
yeast.orf <- calc.genoprob(yeast.orf, step = 2)
scan.orf <- scanone(yeast.orf, pheno.col = find.pheno(yeast.orf, tf.orf), method="hk")
plot(scan.orf)
plot(scan.orf, chr = 2)
abline(v = cand.pos, lwd = 3, col = "green")
lod.thr <- c(summary(perm.orf, 0.01))
abline(h = lod.thr, lwd = 3, col = "red", lty = 2)
```

R/qtlhot: hotspots
---------------------------------------------

Now let's look at hotspots. The `vignette("qtlhot", "qtlhot")` walks through a simulated set of hotspots. For the yeast data, see `vignette("yeast_cmst", "qtlyeast")`. Basically, we use the saved high LOD scores (`highlod.orf`) and LOD permutation threshold (`lod.thr`) to find hotspot size across the genome (`hotsize.orf`). The summary shows several significant hotspots, with one of the largest on chr 2, right at `r tf.orf`. The red 5% threshold of 96 is for the number of traits with significant LOD at a position (Chaibub Neto et al. 2012).

```{r}
hotsize.orf <- hotsize(highlod.orf, lod.thr = lod.thr)
hot.thr <- 96
(hot.sum <- summary(hotsize.orf, threshold = hot.thr))
plot(hotsize.orf, chr = hot.sum$chr, bandcol = "gray70")
abline(h = hot.thr, lwd = 2, col = "red", lty = 2)
plot(hotsize.orf, chr = 2)
abline(v = cand.pos, lwd = 3, col = "green")
abline(h = hot.thr, lwd = 2, col = "red", lty = 2)
```

R/qtlhot: causal pairs
-----------------------------------

Find co-mapped traits and get joint-CMST p-values with Benjaminie-Hochberg adjustment (Chaibub Neto et al. 2013).

```{r}
cand.tf <- cand.reg[cand.reg$gene == tf.orf,, drop = FALSE]
comap.tf <- GetCoMappingTraits(highlod.orf, cand.tf)[[1]]
```

The [TransFac](http://www.gene-regulation.com/pub/databases.html) database provides transcription factor targets, which are stored in `ko.list` in the `qtlyeast` package. Let's find which of the co-mapping traits are targets of the transcription factor `r tf.gene`, and do causal mapping tests for them.

```{r}
match.orf <- !is.na(match(comap.tf, ko.list[[tf.orf]]))
out.peak <- CMSTtests(yeast.orf, tf.orf, comap.tf[match.orf], Q.chr = 2, Q.pos = cand.peak,
                 method="joint", penalty="bic")
causal.call <- function(x) {ordered(c("causal","react","indep","corr")[apply(x, 1, which.min)],
                                    c("causal","react","indep","corr"))}
target.peak <- data.frame(pval = apply(out.peak$pvals.j.BIC, 1, min),
                          call = causal.call(out.peak$pvals.j.BIC))
row.names(target.peak) <- factor(comap.tf[match.orf])
table(target.peak$pval <= 0.1, target.peak$call)
target.peak$gene <- yeast.annot[match(row.names(target.peak), yeast.annot$orf), "gene"]
target.peak[order(target.peak$pval),]
```

Only `r sum(target.peak$pval <= 0.1)` targets had a significant causal call, and all of those were for the independent model (pleiotropy). Let's look only at the alleged targets of `r tf.gene`. We use color to identify the calls for these `r sum(match.orf)` ORFs.

```{r}
plot(scan.orf, chr = 2)
for(target in row.names(target.peak)) {
  tmp <- scanone(yeast.orf, pheno.col = find.pheno(yeast.orf, target), method="hk")
  if(max(tmp)$lod > lod.thr) {
    col <- "gray"
    if(target.peak[target, "pval"] < 0.1)
      col <- c("red", "blue", "green","gray")[unclass(target.peak[target, "call"])]
    plot(tmp, chr = 2, add = TRUE, col = col)    
  }
}
plot(scan.orf, chr = 2, add = TRUE)
abline(v = cand.pos, lwd = 3, col = "green")
abline(h = lod.thr, lwd = 3, col = "red", lty = 2)
```

So, it appears that the `r tf.gene` is pleiotropic to targets with strong QTL signals.

R/qtlnet: causal network
---------------------------------

Let's take the significant targets of `r tf.gene` and see if we can develop a causal network.

```{r}
net.set <- c(tf.orf, row.names(target.peak)[target.peak$pval <= 0.1])
library(qtlnet)
```
The following will create MCMC samples for the QTLnet, but it takes awhile. Instead we read already created data.
```
amn1.qtlnet <- mcmc.qtlnet(yeast.orf, pheno.col = find.pheno(yeast.orf, net.set),
                           threshold = lod.thr, nSamples = 1000, thinning = 20,
                           random.seed = 92387475)
save(amn1.qtlnet, file = "amn1.qtlnet.RData", compress = TRUE)
```
```{r}
load("amn1.qtlnet.RData")
amn1.graph <- igraph.qtlnet(amn1.qtlnet)
plot(amn1.graph)
```
The plot randomly arranges the traits and QTL. Alternatively, use the interactive
```
tkplot(amn1.graph)
```
This apparently did not work well, as the `r tf.orf` is shown as directly causal to one target and reactive to two others. More work would be needed to study this more carefully.