Instruction
Your task in this coursework is to write a C++ program to analyse an fMRI time course to determine whether it is significantly activated using permutation testing. This program should perform the following main steps:
Question 1. Load the measured time course data from file;
Question 2. Determine the size of the blocks used in the task (i.e. how many samples correspond to a 'rest' or 'task' period);
Question 3. Compute the expected time course signal;
Question 4. Compute the correlation coefficient between the expected time course (generated in step 3) and measured time course (as loaded in step 1);
Question 5. Generate 5,000 new time course signals by random permutations of the measured signal (as loaded in step 1). For each of these signals, compute the correlation coefficient between it and the expected time course (as generated in step 3);
Question 6. Display a histogram of the 5,000 correlation coefficients computed in step 5;
Question 7. Estimate the critical value for statistical significance at the p = 0.05 level;
Question 8. Determine whether the observed correlation coefficient (as computed in step 4) exceeds the critical value (computed in step 7), and report whether the correlation is statistically significant.
Loading the data and determining the block size
You are provided with 4 data files, each consisting of a list of values stored as plain text, one per line. These data files were produced from 2 different experiments with different parameters, leading to differences in the number of sample points (64 vs. 128) and differences in the block size (8 vs. 16). The block size information is encoded in the filename, after the final underscore ('_'); e.g. for the file exp1_roi1_8.txt a block size of 8 samples was used. Your program should load the full time course in each case and determine the block size from the filename, and should be able to cope with any number of sample points or block size.
For each of the two experiments, two time courses are provided, estimated from different regions of interest (ROI). You should observe a significant correlation for ROI 1, but not for ROI 2.
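The loading and filename-parsing steps above could be sketched roughly as follows. This is only one possible approach: the function names load_time_course and block_size_from_filename are hypothetical (they are not part of the brief), and the sketch assumes the block size is the run of digits between the final underscore and the file extension.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: read one numeric value per line from a text file.
std::vector<double> load_time_course(const std::string& filename)
{
    std::ifstream in(filename);
    if (!in)
        throw std::runtime_error("cannot open file: " + filename);

    std::vector<double> values;
    double v;
    while (in >> v)
        values.push_back(v);

    // Extraction stopped before end of file: something was not a number.
    if (!in.eof())
        throw std::runtime_error("non-numeric data in file: " + filename);
    if (values.empty())
        throw std::runtime_error("no data in file: " + filename);
    return values;
}

// Hypothetical helper: extract the block size from a name such as
// "exp1_roi1_8.txt" (assumed: digits between the final '_' and the final '.').
std::size_t block_size_from_filename(const std::string& filename)
{
    const std::size_t underscore = filename.rfind('_');
    if (underscore == std::string::npos)
        throw std::runtime_error("no underscore in filename: " + filename);

    const std::size_t dot = filename.rfind('.');
    const std::size_t end =
        (dot == std::string::npos || dot < underscore) ? filename.size() : dot;

    const std::string digits =
        filename.substr(underscore + 1, end - underscore - 1);
    if (digits.empty() ||
        digits.find_first_not_of("0123456789") != std::string::npos)
        throw std::runtime_error("invalid block size in filename: " + filename);

    const unsigned long n = std::stoul(digits);
    if (n == 0)
        throw std::runtime_error("block size must be positive: " + filename);
    return static_cast<std::size_t>(n);
}
```

Throwing exceptions on bad input is a design choice made here for resilience to runtime errors; a caller would typically catch them in main() and print a message.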
Computing the expected time course signal
The expected time course signal should look like the red curves in Figure 2. For a block size of n sample points, the expected signal consists of a 'block' of n zeros, followed by a 'block' of n ones, and another 'block' of n zeros, and so on until the end of the time course. The expected signal always starts with a 'rest' (zero) block. The total number of sample points in the expected signal should match the number in the measured time course.
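A minimal sketch of the block pattern described above is given here. The function name expected_time_course is hypothetical; the sketch relies on the observation that sample i belongs to block number i / n (integer division), and that even-numbered blocks are 'rest' and odd-numbered blocks are 'task'.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: build the 0/1 boxcar signal, starting with a rest block.
std::vector<double> expected_time_course(std::size_t num_samples,
                                         std::size_t block_size)
{
    if (block_size == 0)
        throw std::invalid_argument("block size must be positive");

    std::vector<double> expected(num_samples);
    for (std::size_t i = 0; i < num_samples; ++i) {
        // Even block index -> rest (0), odd block index -> task (1).
        expected[i] = ((i / block_size) % 2 == 0) ? 0.0 : 1.0;
    }
    return expected;
}
```

Because the loop runs over the number of samples rather than the number of blocks, a time course whose length is not a multiple of the block size simply ends with a partial block.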
Computing the correlation coefficient
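You should compute the absolute value of the Pearson correlation coefficient |r|, which can be computed efficiently for two equal-sized arrays of values x = {xi} and y = {yi} using the following formula, where n is the number of values in each array:
\[ r = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{n \sum x_i^2 - \left(\sum x_i\right)^2}\;\sqrt{n \sum y_i^2 - \left(\sum y_i\right)^2}} \]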
Remember to take the absolute value of the correlation coefficient: no effect implies a correlation coefficient of zero; we are interested in any non-zero correlation, whether positive or negative.
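The single-pass computation of |r| could look something like the sketch below. The function name abs_correlation is hypothetical, and returning 0 when either array is constant (zero denominator) is an assumption made here, since the brief does not say how that case should be handled.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: absolute Pearson correlation of two equal-sized arrays.
double abs_correlation(const std::vector<double>& x,
                       const std::vector<double>& y)
{
    if (x.size() != y.size() || x.empty())
        throw std::invalid_argument("arrays must be non-empty and equal-sized");

    const double n = static_cast<double>(x.size());
    double sx = 0.0, sy = 0.0, sxx = 0.0, syy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx  += x[i];
        sy  += y[i];
        sxx += x[i] * x[i];
        syy += y[i] * y[i];
        sxy += x[i] * y[i];
    }

    const double denom =
        std::sqrt(n * sxx - sx * sx) * std::sqrt(n * syy - sy * sy);

    // Assumption: a constant signal (zero or NaN denominator) counts as
    // "no correlation" rather than an error.
    if (!(denom > 0.0))
        return 0.0;

    return std::fabs((n * sxy - sx * sy) / denom);
}
```

All five sums are accumulated in one loop, which is what makes this form of the formula efficient when it has to be evaluated thousands of times.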
Permutations
Random permutations of the array of measured data values can be performed using the standard C++ function std::random_shuffle(). This function can be used with an STL std::vector as follows:
#include <algorithm>
#include <vector>
...
std::vector<double> x;
...
std::random_shuffle(x.begin(), x.end());
Note that this function performs the reordering in-place: the values in the original vector will be reordered, and the original ordering will be lost. If you need to keep the original vector intact, make a copy of it first and apply std::random_shuffle to that copy.
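Note that std::random_shuffle was deprecated in C++14 and removed in C++17, so it may not compile with a recent compiler in its default mode. The sketch below therefore swaps in std::shuffle with a std::mt19937 generator, which is an alternative to the function named in the brief rather than the prescribed method. The names permuted_copy and permutation_distribution are hypothetical, and the statistic is passed in as a callable so that the sketch does not depend on any particular correlation function.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: return a shuffled copy, leaving the input untouched.
std::vector<double> permuted_copy(const std::vector<double>& measured,
                                  std::mt19937& rng)
{
    std::vector<double> copy = measured;
    std::shuffle(copy.begin(), copy.end(), rng);
    return copy;
}

// Hypothetical helper: evaluate `statistic` on num_permutations random
// permutations of `measured` and collect the results.
template <class Statistic>
std::vector<double> permutation_distribution(const std::vector<double>& measured,
                                             std::size_t num_permutations,
                                             std::mt19937& rng,
                                             Statistic statistic)
{
    std::vector<double> results;
    results.reserve(num_permutations);

    // Work on one copy and reshuffle it each time; the original is preserved.
    std::vector<double> shuffled = measured;
    for (std::size_t k = 0; k < num_permutations; ++k) {
        std::shuffle(shuffled.begin(), shuffled.end(), rng);
        results.push_back(statistic(shuffled));
    }
    return results;
}
```

In this coursework the statistic would be a lambda that computes |r| between the shuffled signal and the expected time course, called with num_permutations set to 5000. Seeding the generator from std::random_device gives a different set of permutations on each run, whereas a fixed seed makes results reproducible.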
Displaying the histogram
The 5,000 correlation coefficients estimated from permuted data should be displayed as a horizontal histogram, as shown in the example listing below. Your code should divide up the desired range of values (defined by the min & max value) into the desired number of bins, and then count how many values fall within each bin. Your program should then display the histogram by printing each bin of the histogram on its own line, displayed as a sequence of some character (you can use the character = or # for this) of length given by the count.
Ideally, the count should be scaled to ensure the histogram can fit on the terminal within a display width of 80 characters (based on the maximum count). It should also display the value of the centre of each bin on the left of the histogram, on the same line as the corresponding bin. Your code should be capable of handling any number of bins, defined over any interval. For the purposes of this assignment, you should use 25 bins arranged over the interval [0, 1] (as shown in the example below).
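One way the binning and display described above might be structured is sketched here. The function names histogram_counts and print_histogram are hypothetical, as are the choices to ignore values outside [min, max], to place a value equal to max in the last bin, and to limit bars to 70 characters so that the label plus bar stays within 80 columns.

```cpp
#include <algorithm>
#include <cstddef>
#include <iomanip>
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: count how many values fall within each bin.
std::vector<std::size_t> histogram_counts(const std::vector<double>& values,
                                          double min, double max,
                                          std::size_t num_bins)
{
    if (num_bins == 0 || !(max > min))
        throw std::invalid_argument("need num_bins > 0 and max > min");

    std::vector<std::size_t> counts(num_bins, 0);
    const double width = (max - min) / num_bins;
    for (double v : values) {
        if (!(v >= min && v <= max))
            continue;  // assumption: out-of-range (or NaN) values are ignored
        std::size_t bin = static_cast<std::size_t>((v - min) / width);
        if (bin >= num_bins)
            bin = num_bins - 1;  // v == max belongs to the last bin
        ++counts[bin];
    }
    return counts;
}

// Hypothetical helper: print one line per bin as "centre | ####".
void print_histogram(const std::vector<std::size_t>& counts,
                     double min, double max,
                     std::ostream& out = std::cout,
                     std::size_t max_bar = 70)
{
    if (counts.empty())
        return;

    const std::size_t peak = *std::max_element(counts.begin(), counts.end());
    const double width = (max - min) / counts.size();

    const std::ios::fmtflags old_flags = out.flags();
    const std::streamsize old_precision = out.precision();
    out << std::fixed << std::setprecision(2);

    for (std::size_t i = 0; i < counts.size(); ++i) {
        const double centre = min + (i + 0.5) * width;
        // Scale only when the tallest bar would not fit.
        const std::size_t len =
            (peak > max_bar) ? counts[i] * max_bar / peak : counts[i];
        out << std::setw(6) << centre << " | " << std::string(len, '#') << '\n';
    }

    out.flags(old_flags);
    out.precision(old_precision);
}
```

For this assignment the calls would be histogram_counts(values, 0.0, 1.0, 25) followed by print_histogram(counts, 0.0, 1.0). Separating counting from printing keeps the counting logic easy to check on its own.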
Estimating the critical value and establishing significance
The critical value corresponds to the 95th percentile of the values in the null distribution. You can obtain an estimate of this value by first sorting your array of correlation coefficients, and then looking up the value of the element whose index is closest to 95% of the size of the array (e.g. for a sorted array x of size 1,000, the 95th percentile is x[950]).
You may find the std::sort() function useful here, which can be used with an STL std::vector as follows:
#include <algorithm>
#include <vector>
...
std::vector<double> x;
...
std::sort(x.begin(), x.end());
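The percentile lookup and the final significance decision could be sketched as below. The names critical_value and is_significant are hypothetical. The index is rounded to the nearest integer to follow the "closest to 95% of the size" wording, and it is clamped to the last element as a safeguard; both details are assumptions about how to read the brief.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: value at the given fraction of the sorted null
// distribution. The vector is taken by value so the caller's copy is
// left unsorted.
double critical_value(std::vector<double> null_values, double fraction = 0.95)
{
    if (null_values.empty())
        throw std::invalid_argument("null distribution is empty");
    if (!(fraction >= 0.0 && fraction <= 1.0))
        throw std::invalid_argument("fraction must lie in [0, 1]");

    std::sort(null_values.begin(), null_values.end());

    // Round to the nearest index, e.g. 950 for 1,000 values.
    std::size_t index =
        static_cast<std::size_t>(fraction * null_values.size() + 0.5);
    if (index >= null_values.size())
        index = null_values.size() - 1;
    return null_values[index];
}

// Hypothetical helper: the observed |r| is significant if it exceeds the
// critical value.
bool is_significant(double observed, double critical)
{
    return observed > critical;
}
```

With 5,000 permutations this picks element 4750 of the sorted array.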
Reporting Requirements
You should submit a C++ project that meets as many of the requirements as possible. You do not need to submit any written report but do try to use variable/function naming, comments and indentation to make your program as easy to understand as possible. Also try to make your program as resilient to runtime errors as possible.