Child pages
  • Preferential Spatial Gene Expression in Neuroanatomy

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Code Block
library(plotrix)
library(RMINC)
anatfile="template.mnc" 
statsfile="stats.mnc"

## if you are too lazy to do the registration for this dataset, I included them in the package :)
# anatfile = system.file('extdata/Dorr_resampled_200um.mnc',package="ABIgeneRMINC")
# statsfile = system.file('extdata/enrichment_stats.mnc',package="ABIgeneRMINC")

mincPlotSliceSeries(
     anatomy=mincArray(mincGetVolume(anatfile)),
     statistics=mincArray(mincGetVolume(statsfile)),
     anatLow=1500,anatHigh=7592,symmetric=TRUE,
     legend="t-statistics",low=2,high=6.5)

...

Code Block
library(devtools)
install_github(repo="DJFernandes/ABIgeneRMINC")

 


Step 3: Download Gene Expression Data

...

Code Block
library(ABIgeneRMINC)
## List all the genes in the Allen Gene Expression dataset
## Save the output in "genes.txt"
## Perform API queries in(needs parallelxml2 ##package) Parallelin requiresparallel foreach(parallel &package)
doMC
packages

list.all.allen.genes(outfile="genes.txt",parallel=4) #should take about 10min

Then, you can use the "find.gene.experiment" function, to find all the experiments regarding the genes and the URL to download them. For example, if I want all gene expression data associated with Brain-Derived Neurotrophic factor (BDNF), I do the following.

...

The following commands reads them into R.


 

Code Block
### Reading Gene Expression from downloaded and unzipped data files
library(ABIgeneRMINC)

genefile1="/projects/egerek/matthijs/2015-07-Allen-Brain/Allen_Gene_Expression/raw_data/coronal/Bdnf_sid79587720/energy.raw"

genefile2="/projects/egerek/matthijs/2015-07-Allen-Brain/Allen_Gene_Expression/raw_data/sagittal/Bdnf_sid75695642/energy.raw"

## Read gene expression data as a 1D-vector
genedata1=read.raw.gene(genefile1)

genedata2=read.raw.gene(genefile2)

 


If you do not want to manually download and unzip the data, "read.raw.genes" function will read data from URLs as well. However, it is recommended you download the data to make it easier to debug.

Code Block
### Reading Gene Expression from URL
library(ABIgeneRMINC)

geneURL1="http://api.brain-map.org/grid_data/download/79587720?include=energy"

geneURL2="http://api.brain-map.org/grid_data/download/75695642?include=energy"

## Read gene expression data as a 1D-vector
genedata1=read.raw.gene(geneURL1,url=TRUE)

genedata2=read.raw.gene(geneURL2,url=TRUE)

 


Finally, to analyse gene expression we need to convert from the orientation used in the Allen Brain Institute, to the orientation used in MINC files. This can be done quickly using the "allenVectorTOmincVector" function. For those who want to know exactly how the orientations differ, please read the info following the code.

Code Block
## Convert Allen orientation to MINC orientation
genedata1=allenVectorTOmincVector(genedata1)
genedata2=allenVectorTOmincVector(genedata2)


Info

Allen Gene Expression Data is stored as a 1-D array going from X=Anterior-to-Posterior, Y=Superior-to-Inferior, and Z=Left-to-Right (dimensions written from fastest changing index to slowest).

MINC Data is stored as a 1-D array going from X=Left-to-Right, Y=Posterior-to-Anterior, Z=Inferior-to-Superior (dimensions written from fastest changing index to slowest)

Converting between the two orientations simply requires reordering of indices which is done by the following function and its inverse:

allenVectorTOmincVector : To go from Allen orientation to MINC orientation

mincVectorTOallenVector : To go from MINC orientation to Allen orientation

Step 5: Basic Gene Expression Data manipulations (not necessary)

Let's use RMINC to visualise the gene expression data, on our MRI template (from step 1). 

Code Block
mincPlotSliceSeries(
     anatomy=mincArray(mincGetVolume(anatfile)),
     statistics=mincArray(genedata1),
     col=colorRampPalette(c("darkgreen","yellowgreen"))(255),
     anatLow=1500,anatHigh=7592,symmetric=FALSE,
     legend="Bdnf Expression",low=2,high=6.5)

 


Figure 2: Bdnf Expression (Allen Experiment 79587720)

 


Code Block
mincPlotSliceSeries(
     anatomy=mincArray(mincGetVolume(anatfile)),
     statistics=mincArray(genedata2),
     col=colorRampPalette(c("darkgreen","yellowgreen"))(255),
     anatLow=1500,anatHigh=7592,symmetric=FALSE,
     legend="Bdnf Expression",low=2,high=6.5)

Figure 3: Bdnf Expression (Allen Experiment 75695642)

 

 



Notice, how the data from Experiment 75695642, encompasses only half the brain. This is true of most of the gene expression data from ABI, which were generated from sagittal slices spanning half the brain. A subset of genes have expression data generated from coronal slices spanning the whole brain (like Experiment 79587720). While gene expression in the left and right hemispheres of the mouse brain is likely not completely identical, it might be useful to reflect the data to fill in the missing hemisphere in the sagittal experiments for exploratory analysis. This can be done using the 'midplane.reflect' function. Below is code that reads Experiment 75695642, and reflects it across the sagittal midplane.

Code Block
## Read Gene Data
genedata=read.raw.gene("http://api.brain-map.org/grid_data/download/75695642?include=energy",url=TRUE)
 
## Reflect it across sagittal midplane
genedata.reflected=midplane.reflect(genedata,reflect.dim=1)
 
## Make image
mincPlotSliceSeries(
     anatomy=mincArray(mincGetVolume(anatfile)),
     statistics=mincArray(allenVectorTOmincVector(genedata.reflected)),
     col=colorRampPalette(c("darkgreen","yellowgreen"))(255),
     anatLow=1500,anatHigh=7592,symmetric=FALSE,
     legend="Bdnf Expression",low=2,high=6.5)

 


Figure 3: Bdnf Expression (Allen Experiment 75695642) after sagittal reflection

...

It is worth noting there is also a "write.raw.gene" function which does the opposite as "read.raw.gene". To write the reflected file:


 

Code Block
## Write Reflected Data
write.raw.gene(genedata.reflected,filename="Bdnf_sid75695642_reflected.raw")

 


If you prefer to do analysis by structure instead of voxels, you can try the "unionize" function. This function computes the sum, mean, and standard deviations of gene expression for annotated structures.

Code Block
## Find which structures have highest gene expression
 
## Read annotations (can be downloaded from ABI)
labelfile="/projects/egerek/matthijs/2015-07-Allen-Brain/Allen_Gene_Expression/labels/allen_grid_labels.raw"  

labels=read.raw.gene(labelfile,labels=TRUE)
 
 
## unionize overall the labels. Sort by highest mean expression
labels.to.sum=sort(unique(labels))
labels.to.sum=labels.to.sum[labels.to.sum!=0]
df=unionize(grid.data=genedata.reflected,     #vector to unionize
           labels.to.sum=labels.to.sum,      #sum all labels
           labels.grid=labels               #the vector of labels
           )
df=df[order(df$mean,decreasing=TRUE),]
 
## Read csv which contains definition of labels (can be downloaded from ABI)
labeldefs=read.csv("/projects/egerek/matthijs/2015-07-Allen-Brain/Allen_Gene_Expression/labels/allen_gridlabels_structures.csv")
 
## Sort sort structures by highest gene expression
df$structures=labeldefs[match(labeldefs$id,df$labels),'name']
 
## View
head(df)
##Return
#    labels       sum      mean    stdev                                     structures
#555   1059 190.89058 10.605032 5.197725                     Field CA2, pyramidal layer
#154    267  45.97817  9.195634 4.589667          Orbital area, lateral part, layer 2/3
#258    479 312.93908  7.632661 4.986437                 Piriform area, polymorph layer
#625  10704 684.76912  7.284778 4.978902 Posterior parietal association areas, layer 6b
#237    440 113.66420  7.104013 2.756221             posteromedial visual area, layer 1
#553   1054 340.99053  7.103969 4.993027                         Visceral area, layer 5

 


Step 6: Preferential Expression

...

Code Block
## Find Brain Mask
labelfile="/projects/egerek/matthijs/2015-07-Allen-Brain/Allen_Gene_Expression/labels/allen_grid_labels.raw"

brainmask=read.raw.gene(labelfile,labels=TRUE)>0
 
## Read BDNF Gene Data
genedata=read.raw.gene("http://api.brain-map.org/grid_data/download/79587720?include=energy",url=TRUE)
 
## Read Neuroanatomical statistics data (from step 1)
statsdata=mincGetVolume("stats.mnc")
 
## Convert it from MINC orientation to Allen Orientation
statsdata=mincVectorTOallenVector(statsdata)
 
## Find preferential expression 
##   Target regions |t-statistic|>2
##   Contrast regions: whole brain
geneFoldChange(abs(statsdata),genedata,maskvector=brainmask,tgt.thresh=2)
 

 


The fold-change value returned is 1.13, indicating that there is 1.13 times greater expression in regions with neuroanatomical differences that the mean expression in the whole brain.  

...

Code Block
## Find Brain Mask and convert it into MINC orientation
labelfile="/projects/egerek/matthijs/2015-07-Allen-Brain/Allen_Gene_Expression/labels/allen_grid_labels.raw"

brainmask=read.raw.gene(labelfile,labels=TRUE)>0
brainmask=allenVectorTOmincVector(brainmask)

## Read CSV containing filenames
df=read.csv("/projects/egerek/matthijs/2015-07-Allen-Brain/Allen_Gene_Expression/gene_ids_stats2.csv")
 
## Subset to extract only the coronal and reflected sagittal experiments
df=subset(df,cutting_plane %in% c('sagittal_reflected','coronal'))
 
## Extract filenames, and convert to characters
genefiles=as.character(df$raw_data)
 
## Read Neuroanatomical statistics data (from step 1)
statsdata=mincGetVolume("stats.mnc")
 
## Find preferential expression 
##   Target regions |t-statistic|>2
##   Contrast regions: whole brain
## NOTE: when "geneFoldChange" receives filenames instead of vectors,
## It reads the filenames and does computations in MINC space
 
## Apply "geneFoldChange" on all genes. 
## I like using the "pbapply" library because it gives a progress bar
## but you don't need to. Just use the "lapply" instead of "pbapply"
## if you don't feel like loading another library
## This will take about 40 minutes.

library(pbapply)
fcs=pblapply(genefiles,function(file)
            geneFoldChange(abs(statsdata),file,maskvector=brainmask,tgt.thresh=2))
 
## Create Dataframe with the gene acronym, experiment ID, and
##  and foldchange value for easy viewing
## Order by fold change
retdf=data.frame(gene=df$acronym,experiment=df$Section_Data_ID,foldchange=unlist(fcs))
retdf=retdf[order(retdf$foldchange,decreasing=TRUE),]

 


Voilà! A candidate list of genes with preferential expression in neuroanatomy.

...