UCLA Multimodal Connectivity Database (UMCD)

http://jessebrown.webfactional.com

The UMCD is an online tool, developed by my colleague Jesse Brown at UCLA, for analyzing the network characteristics of connectivity matrices. These matrices can be generated from many different data sources, and their nodes and edges can represent many different properties of real-world systems. For instance, DTI scans can be used to generate whole-brain streamline tractography, where the nodes are a set of anatomical regions of interest (ROIs) across the brain and the edge weights are the number of fibers connecting each pair of ROIs.

R interface to UMCD

Download: http://github.com/johncolby/R-UMCD

From within R, this tool allows you to automatically submit a batch of analysis requests to the UMCD and collect the results.

Install

  1. Download the R-UMCD package from GitHub (see above).
  2. Make sure a recent version of R is available on your computer. http://www.r-project.org
  3. Launch R.
  4. Install the XML and RCurl packages if they aren't already available. For example: > install.packages(c('XML', 'RCurl')). The ADHD example below additionally uses plyr (ddply, join) and ggplot2 (qplot).
  5. Source the UMCD.R file: > source('/path/to/UMCD.R')
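
Steps 4 and 5 can also be scripted. This is only a minimal sketch and not part of R-UMCD itself: missing_packages() is a hypothetical helper name, and the path to UMCD.R is the same placeholder as above.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed     <- c('XML', 'RCurl')
to_install <- missing_packages(needed)

# Only install in an interactive session, so that running this file in a
# batch job never triggers an unattended download.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}

# Source the interface if the file is present (placeholder path).
umcd_file <- '/path/to/UMCD.R'
if (file.exists(umcd_file)) source(umcd_file)
```

Checking with requireNamespace() rather than library() avoids attaching the packages just to find out whether they are installed.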

Login

To access data that are stored as “private” in the database, you can optionally log in first with umcdLogin(). The curl handle that it returns can then be passed to the other functions. For example:

> curl = umcdLogin(email='myemail', password='mypassword')
> results = umcdAnalyze(requests, curl=curl)
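
If you would rather not type your password into a script, one option is to read the credentials from environment variables. This is a sketch under assumptions: UMCD_EMAIL and UMCD_PASSWORD are hypothetical variable names chosen for this example, and umcdLogin() is only available once UMCD.R has been sourced.

```r
# UMCD_EMAIL and UMCD_PASSWORD are hypothetical names; set them outside of R
# (e.g. in ~/.Renviron) so that no password is stored in the script.
email    <- Sys.getenv('UMCD_EMAIL')
password <- Sys.getenv('UMCD_PASSWORD')

have_credentials <- nzchar(email) && nzchar(password)

# umcdLogin() only exists after UMCD.R has been sourced.
if (have_credentials && exists('umcdLogin')) {
  curl <- umcdLogin(email = email, password = password)
} else {
  message('No login performed; only public data will be accessible.')
}
```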

ADHD example

To demonstrate the interface, let's do an example analysis on a subset of the ADHD200 fcMRI data that is already made public on the UMCD.

Query the ADHD200_CC200 study to obtain all the individual network names, and then limit these to only the subjects from the “NeuroIMAGE” site.

library(plyr)     # join() and ddply(), used further below
library(ggplot2)  # qplot()

network_names = umcdListNetworks('ADHD200_CC200')$networks
network_names = network_names[grep('NeuroIMAGE', network_names)]
> network_names
 [1] "NeuroIMAGE_1017176" "NeuroIMAGE_1125505" "NeuroIMAGE_1208586" "NeuroIMAGE_1312097" "NeuroIMAGE_1411495"
 [6] "NeuroIMAGE_1538046" "NeuroIMAGE_1588809" "NeuroIMAGE_2074737" "NeuroIMAGE_2352986" "NeuroIMAGE_2419464"
[11] "NeuroIMAGE_2574674" "NeuroIMAGE_2671604" "NeuroIMAGE_2756846" "NeuroIMAGE_2876903" "NeuroIMAGE_2961243"
[16] "NeuroIMAGE_3007585" "NeuroIMAGE_3108222" "NeuroIMAGE_3190461" "NeuroIMAGE_3304956" "NeuroIMAGE_3322144"
[21] "NeuroIMAGE_3449233" "NeuroIMAGE_3566449" "NeuroIMAGE_3808273" "NeuroIMAGE_3858891" "NeuroIMAGE_3888614"
[26] "NeuroIMAGE_3941358" "NeuroIMAGE_3959823" "NeuroIMAGE_3980079" "NeuroIMAGE_4020830" "NeuroIMAGE_4134561"
[31] "NeuroIMAGE_4239636" "NeuroIMAGE_6115230" "NeuroIMAGE_7339173" "NeuroIMAGE_7446626" "NeuroIMAGE_7504392"
[36] "NeuroIMAGE_8387093" "NeuroIMAGE_8409791" "NeuroIMAGE_8991934" "NeuroIMAGE_9956994"
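
The filtering step above relies on grep() returning the positions of the matching names. Here is a small self-contained illustration on a toy vector (the two non-NeuroIMAGE names are made up):

```r
# Toy vector; 'SiteA_...' and 'SiteB_...' are invented for illustration.
all_names <- c('NeuroIMAGE_1017176', 'SiteA_0000001',
               'NeuroIMAGE_1125505', 'SiteB_0000002')

# grep() returns the matching positions, which are then used as an index ...
by_index <- all_names[grep('NeuroIMAGE', all_names)]

# ... while value = TRUE returns the matching strings directly.
by_value <- grep('NeuroIMAGE', all_names, value = TRUE)

# Anchoring the pattern with ^ only matches names that start with the prefix.
anchored <- grep('^NeuroIMAGE_', all_names, value = TRUE)

stopifnot(identical(by_index, by_value), identical(by_value, anchored))
```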

Set up a data frame with all of our requests. The single values (study name, density, etc.) are recycled, so we get one row per network.

requests = data.frame(study_name   = 'ADHD200_CC200', 
                      network_name = network_names,
                      density      = '20',
                      orientation  = 'Axial',
                      weight       = 'Binary',
                      stringsAsFactors=FALSE)
> requests
      study_name       network_name density orientation weight
1  ADHD200_CC200 NeuroIMAGE_1017176      20       Axial Binary
2  ADHD200_CC200 NeuroIMAGE_1125505      20       Axial Binary
3  ADHD200_CC200 NeuroIMAGE_1208586      20       Axial Binary
4  ADHD200_CC200 NeuroIMAGE_1312097      20       Axial Binary
5  ADHD200_CC200 NeuroIMAGE_1411495      20       Axial Binary
6  ADHD200_CC200 NeuroIMAGE_1538046      20       Axial Binary
...
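
If you want to repeat the analysis over several settings, the same kind of request table can be built with expand.grid(). This is just a sketch: whether the UMCD accepts densities other than '20' is an assumption here, and the values '10' and '30' are purely illustrative.

```r
# First three network names from the listing above.
example_names <- c('NeuroIMAGE_1017176', 'NeuroIMAGE_1125505', 'NeuroIMAGE_1208586')

# One row per (network, density) combination. Densities other than '20'
# are an assumption made for illustration.
requests_multi <- expand.grid(network_name     = example_names,
                              density          = c('10', '20', '30'),
                              stringsAsFactors = FALSE)

# Add the columns that are constant across requests, then restore the
# column order used in the requests data frame above.
requests_multi$study_name  <- 'ADHD200_CC200'
requests_multi$orientation <- 'Axial'
requests_multi$weight      <- 'Binary'
requests_multi <- requests_multi[, c('study_name', 'network_name', 'density',
                                     'orientation', 'weight')]
```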

Now submit all of these requests to the UMCD and collect the results. You'll see a progress bar telling you how far along you are.

> results = umcdAnalyze(requests)
  |===============================================================                                          |  60%

Once it finishes, results is a list with three main items.

  • info: A data frame containing all of the metadata for each request.
  • global.measures: A data frame containing all of the global measures for each request.
  • nodal.measures: A data frame containing all of the nodal measures for each request.

Now we'll reformat the results a little bit to make things easier for our analysis.

  • Throw out the preprocessing metadata since this has a lot of text.
  • Convert the “Age Range” metadata field into a new numeric variable, age (the lower bound of each range).
  • Convert the 3-group “Subject Pool” metadata field into a new two-level factor variable, group (ADHD vs. TD).
  • Omit Inf values (some subjects aren't fully connected at this sparsity, and so return unusable Inf values for path length).
  • Merge the group and age variables in with global.measures.
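
# Drop column 7 (the preprocessing metadata).
results$info = results$info[,-7]
# Text before the hyphen in "Age Range" as a number; ADHD vs. TD from "Subject Pool".
results$info$age = as.numeric(gsub('(.+)-.+', '\\1', results$info$`Age Range`))
results$info$group = factor(ifelse(grepl('ADHD', results$info$`Subject Pool`), 'ADHD', 'TD'))
# Drop rows with infinite values.
results$global.measures = results$global.measures[!is.infinite(results$global.measures$value), ]
# Merge in columns 1, 7 (age), and 8 (group) of info; join() is from plyr.
results$global.measures = with(results, join(global.measures, info[,c(1,7,8)]))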
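
To see what the individual conversions do, here they are applied to a few toy values. All of the field contents below are made up to mimic the metadata described above; the real “Subject Pool” labels may differ.

```r
# Toy metadata; every value here is invented for illustration.
info <- data.frame(`Age Range`    = c('10-15', '15-20', '20-25'),
                   `Subject Pool` = c('ADHD-Combined', 'Typically Developing',
                                      'ADHD-Inattentive'),
                   check.names = FALSE, stringsAsFactors = FALSE)

# Keep the text before the hyphen, i.e. the lower bound of each range.
info$age <- as.numeric(gsub('(.+)-.+', '\\1', info$`Age Range`))

# Collapse every pool label containing 'ADHD' into a single ADHD group.
info$group <- factor(ifelse(grepl('ADHD', info$`Subject Pool`), 'ADHD', 'TD'))

# Drop rows whose value is Inf (e.g. path length in a disconnected network).
measures <- data.frame(value = c(2.1, Inf, 1.9))
measures <- measures[!is.infinite(measures$value), , drop = FALSE]
```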

The really nice thing about automating the analysis requests in this way is that, since the data are already back in R, it's now trivial to start fitting models, plotting, etc.

First let's look at simple between-group t-tests for each global measure of connectivity.

> ddply(results$global.measures, 'measure', function(x) summary(lm(value ~ group, data=x))$coef[2,])
                     measure      Estimate   Std. Error     t value  Pr(>|t|)
1 Characteristic Path Length -8.285965e-04 1.830179e-02 -0.04527407 0.9642221
2     Clustering Coefficient -7.851302e-03 1.534930e-02 -0.51150891 0.6120341
3                      Gamma  6.860108e-02 1.494925e-01  0.45889326 0.6489953
4          Global Efficiency  5.579489e-03 8.492363e-03  0.65700075 0.5152466
5                     Lambda  1.199507e-01 2.123109e-01  0.56497661 0.5754995
6             Modularity (Q)  7.744470e-03 8.524405e-03  0.90850566 0.3694910
7       Number of Components -4.358289e-01 8.025541e-01 -0.54305234 0.5903505
8            Raw Density (%)  1.210183e-14 1.380917e-14  0.87636167 0.3864879
9                      Sigma  1.162052e-01 2.057780e-01  0.56471182 0.5756778

Unfortunately, none of the group differences are significant (all p > 0.36). :-(
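
In case it isn't obvious why a linear model gives a t-test: with a two-level group factor, the second row of the coefficient table is equivalent to a two-sample t-test with pooled variance. Because the factor levels are sorted alphabetically (ADHD, TD), the estimate is TD minus ADHD. The sketch below checks this on simulated data, not on the UMCD results.

```r
set.seed(1)

# Simulated data: 20 subjects per group, no true group difference.
toy <- data.frame(group = factor(rep(c('ADHD', 'TD'), each = 20)),
                  value = rnorm(40))

# Row 2 of the coefficient table is the group effect (TD minus ADHD).
lm_row <- summary(lm(value ~ group, data = toy))$coef[2, ]

# Student's two-sample t-test with pooled variance.
tt <- t.test(value ~ group, data = toy, var.equal = TRUE)

# Same p-value; the t statistics differ only in sign because t.test()
# reports ADHD minus TD.
stopifnot(isTRUE(all.equal(unname(lm_row['Pr(>|t|)']), tt$p.value)))
stopifnot(isTRUE(all.equal(unname(lm_row['t value']), -unname(tt$statistic))))
```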

Finally, we can plot each one of these measures vs. age, color the dots by group, and add some trendlines.

qplot(age, value, color=group, data=results$global.measures) +
  facet_wrap(facets=~measure, scales='free_y') +
  geom_smooth(method='lm')
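
qplot() is deprecated in recent ggplot2 releases, so here is a sketch of the same plot written with ggplot(). It uses simulated data so that it runs on its own; the measure names 'Measure A' and 'Measure B' are placeholders. With the real results you would pass results$global.measures instead of toy.

```r
set.seed(1)

# Simulated stand-in for results$global.measures (placeholder measure names).
toy <- expand.grid(age     = 8:17,
                   group   = c('ADHD', 'TD'),
                   measure = c('Measure A', 'Measure B'),
                   stringsAsFactors = FALSE)
toy$value <- rnorm(nrow(toy))

if (requireNamespace('ggplot2', quietly = TRUE)) {
  library(ggplot2)
  # Points colored by group, one panel per measure, per-group linear trends.
  p <- ggplot(toy, aes(x = age, y = value, color = group)) +
    geom_point() +
    facet_wrap(~ measure, scales = 'free_y') +
    geom_smooth(method = 'lm', formula = y ~ x)
}
```

Typing p (or print(p)) at the prompt draws the figure.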

This code is also available all together in the example.R file in the download directory.

neuroimaging/umcd/main.txt · Last modified: 2011/12/01 10:38 pm PST by John Colby
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 4.0 International