MRI
BOLD5000

OpenNeuro Accession Number: ds001499
Files: 5994
Size: 280.3 GB

BIDS Validation

3 Errors (Invalid)
BOLD5000
  •   .bidsignore
  •   CHANGES
  •   dataset_description.json
  •   participants.tsv
  •   README
  •   task-5000scenes_bold.json
  •   task-5000scenes_events.json
  •   task-localizer_bold.json
  •   task-localizer_events.json
  • derivatives
  • sub-CSI1
  • sub-CSI2
  • sub-CSI3
  • sub-CSI4

README

BOLD5000: Brains, Objects, Landscapes Dataset

For details, please refer to BOLD5000.org and our paper on arXiv (http://arxiv.org/abs/1809.01281).

Participant Directories Content

1) Four participants: CSI1, CSI2, CSI3, & CSI4
2) Functional task data acquisition sessions: sessions #1-15. Each functional session includes:
- 3 sets of fieldmaps (EPI opposite phase encoding; spin-echo opposite phase encoding pairs with partial & non-partial Fourier)
- 9 or 10 functional scans of slow event-related 5000-scene data (5000scenes)
- 1 or 0 functional localizer scans used to define scene-selective regions (localizer)
- each events.tsv file lists each stimulus, the onset time, and the participant's response (participants performed a simple valence task)
3) Anatomical data acquisition session: session #16. Anatomical data: a T1-weighted MPRAGE scan, a T2-weighted SPACE scan, and diffusion spectrum imaging

Notes:
- All MRI and fMRI data are provided with the Siemens pre-scan normalization filter applied.
- CSI4 participated in only 10 MRI sessions: sessions 1-9 were functional acquisition sessions, and session 10 was the anatomical data acquisition session.

Derivatives Directory Content

1) fMRIprep:
- Preprocessed data for all functional data of CSI1 through CSI4 (listed in folders for each participant: derivatives/fmriprep/sub-CSIX). Data were preprocessed both in T1w image space and in surface space. Functional data were motion corrected, susceptibility distortion corrected, and aligned to the anatomical data using bbregister. Please refer to the paper for details on preprocessing.
- Reports resulting from fMRIprep, which describe the success of anatomical alignment and distortion correction, among other measures of preprocessing quality, are listed in the sub-CSIX.html files.
2) Freesurfer: Freesurfer reconstructions produced by the fMRIprep preprocessing stream.
3) MRIQC: Image quality metrics (IQMs) of the dataset computed with MRIQC.
- CSIX-func.csv files are text files listing all IQMs for each session, for each run.
- CSIX-anat.csv files are text files listing all IQMs for the scans acquired in the anatomical session (e.g., MPRAGE).
- CSIX_IQM.xls is an Excel workbook; each sheet lists the IQMs for a single run. This is the same data as CSIX-func.csv, formatted differently.
- sub-CSIX/derivatives: contains .json files with the MRIQC/IQM results for each run.
- sub-CSIX/reports: contains .html files with MRIQC/IQM results for each run, along with mean signal and standard deviation maps.
4) spm: A directory that contains the masks used to define each region of interest (ROI) in each participant. There were 10 ROIs: early visual (EarlyVis), lateral occipital cortex (LOC), occipital place area (OPA), parahippocampal place area (PPA), and retrosplenial complex (RSC), each for the left hemisphere (LH) and right hemisphere (RH).

Comments

By anqiwu.angela@gmail.com - almost 4 years ago
Hi, I have a question about downloading the data. Is there any way to download only one subject or one session, without downloading the entire dataset or clicking into each folder? Thanks.
By krzysztof.gorgolewski@gmail.com - almost 4 years ago
You can download any custom subset of the files included in this and any other OpenNeuro dataset via the AWS S3 protocol. The data is available at s3://openneuro.org. To access it you will have to use no-sign-request mode (via the '--no-sign-request' flag for the AWS CLI, https://docs.aws.amazon.com/cli/latest/reference/ - different S3 clients might expose this option differently).
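For example, a minimal sketch of fetching a single subject this way in Python with boto3 (the bucket name and ds001499 prefix come from the S3 address above; the local mirroring logic is illustrative):

import os
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous (unsigned) S3 client: the OpenNeuro bucket allows public reads.
s3 = boto3.client("s3", config=Config(signature_version=UNSIGNED))

bucket, prefix = "openneuro.org", "ds001499/sub-CSI1/"  # one subject only
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
    for obj in page.get("Contents", []):
        if obj["Key"].endswith("/"):  # skip folder marker objects
            continue
        local_path = obj["Key"]  # mirror the bucket layout locally
        os.makedirs(os.path.dirname(local_path), exist_ok=True)
        s3.download_file(bucket, obj["Key"], local_path)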
By bold5000.team@gmail.com - almost 4 years ago
Unfortunately, I don't think it's possible to download without clicking each folder. If you want to download the raw data (DICOM format), you can download it separately from KiltHub. The link is available on our website, bold5000.org.
By montecarlorun@gmail.com - over 3 years ago
Hi, I'm a new researcher with some initial experience with NeuroVault-hosted datasets.
Mr. Gorgolewski's NeuroVault site provides unthresholded statistical maps with the full voxel-level spatial structure. The preprocessed data hosted here provides region-based values instead of all-voxel values.
Is there a "NeuroVault-style statistical map" preprocessing script available to apply to the BOLD5000 raw data, or, even better, a preprocessed version as such?
I have almost no prior experience with preprocessing pipelines. What would you suggest so I can replicate the z-score map pipeline of NeuroVault?
By davidhofmann0@gmail.com - about 3 years ago
Thank you for this dataset!

I have several questions:

1. Which files can I use to calculate the GLM? It seems those are the files (e.g., "sub-CSI1_ses-01_task-5000scenes_run-01_bold_space-T1w_preproc.nii.gz"), but they are not normalized to MNI space.

2. Was slice-timing correction performed? If so, what was the reference slice?

3. Which columns in the confounds.tsv file are the motion parameters? X, Y, Z, RotX, RotY, RotZ?

It would be nice to also supply the results of the GLM analysis.



By bold5000.team@gmail.com - about 3 years ago
1. You are correct. We did not normalize to MNI space, we kept the data in native space.
2. Slice-timing was not performed.
3. If you look in the confounds file, the titles of the columns are in the first row. Additional information about the confounds output can be found at https://fmriprep.readthedocs.io/en/stable/outputs.html#confounds
4. We have provided you with all the data to be able to run the GLM in your fMRI analysis pipeline of your choice.
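As a concrete illustration of point 3 above, a short sketch (assuming pandas; the file name is one illustrative run) that pulls the six motion parameters by the column names this fMRIprep version writes in the header row:

import pandas as pd

# Confounds file written by fMRIprep for one example run (illustrative path).
confounds = pd.read_csv(
    "sub-CSI1_ses-01_task-5000scenes_run-01_bold_confounds.tsv", sep="\t"
)

# The six rigid-body motion parameters in this fMRIprep version:
# translations X, Y, Z (mm) and rotations RotX, RotY, RotZ (radians).
motion = confounds[["X", "Y", "Z", "RotX", "RotY", "RotZ"]]
print(motion.head())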
By ruthwick.meduri@gmail.com - almost 3 years ago
Hi there,
What is the .json extension for, and how do I view the fMRI files? I'm not understanding. Please help.
By lvqihong1992@gmail.com - almost 4 years ago
In the event .tsv file, some trials are labeled as "rep", such as "rep_coco" and "rep_imagenet". What is "rep"? For some "rep_coco" images, I couldn't find the category label in the COCO dataset.
And is there any pre-written code that provides the category labels of the images, such as the "animate, inanimate, objects, and food" labels used in the paper?
Thanks!
By bold5000.team@gmail.com - almost 4 years ago
Apologies for the confusion. 'rep_' is a prefix denoting that the image is one of the 113 images that are repeated. There are 2 or 3 repeated images per run. You can find the corresponding image from the name coming after the 'rep_' prefix.

As for category labels, there's no pre-written code that has been released yet. We explored the WordNet hierarchy to determine our final classes. I'll be happy to provide the mappings that you've requested on our website by the end of the week.
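To make the naming concrete, a small sketch (assuming pandas; the events path and the ImgName column name are illustrative placeholders - check the actual header of your events.tsv) that finds repeated-image trials and strips the 'rep_' prefix:

import pandas as pd

# One example run's events file (illustrative path).
events = pd.read_csv(
    "sub-CSI1_ses-01_task-5000scenes_run-01_events.tsv", sep="\t"
)

# Repeated images carry a 'rep_' prefix (e.g. 'rep_coco...', 'rep_imagenet...');
# stripping it recovers the original image name. 'ImgName' is assumed here.
reps = events[events["ImgName"].str.startswith("rep_")].copy()
reps["original_name"] = reps["ImgName"].str.replace("rep_", "", n=1)
print(reps[["ImgName", "original_name"]].head())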
By lvqihong1992@gmail.com - almost 4 years ago
I see! Thank you very much again! This is very helpful!
By anqiwu.angela@gmail.com - over 3 years ago
Hi, I also need this label information. Can I also get access to it? Thanks a lot!
By bold5000.team@gmail.com - over 3 years ago
Hi, the images + labels are available on our website bold5000.org under downloads.
By anqiwu.angela@gmail.com - over 3 years ago
Sorry, I was not clear in the previous message. I mean the object, food, animate, and inanimate labels, as in Figure 11 of the paper. I see the low-level labels in the folder, but it would be more interesting to me to have those high-level labels as well. Best.
By ggaziv@gmail.com - over 3 years ago
Hi, are the ROIs defined only for the volume representation, or can I find corresponding per-vertex labels in case I have chosen to work with the surface representation (bold_space-fsnative)? In the former case, what are the ROI labels for bold_space-T1w_brainmask? I see there are two versions, aseg_roi and aparcaseg_roi.

Note: all of the above refers to data under derivatives/fmriprep/.../func
Thanks!
By bold5000.team@gmail.com - about 3 years ago
The ROIs are only for the volume representation. The ROI labels such as aseg_roi are outputs from the fMRIprep pipeline.
By cosyne1992@gmail.com - almost 4 years ago
Hi, I'm trying to use curl to download this dataset to a server, but I couldn't find the URL for the download button. It doesn't show up when I right-click the download button. What's the URL for it? Thanks!
By krzysztof.gorgolewski@gmail.com - almost 4 years ago
You can download any custom subset of the files included in this and any other OpenNeuro dataset via the AWS S3 protocol. The data is available at s3://openneuro.org. To access it you will have to use no-sign-request mode (via the '--no-sign-request' flag for the AWS CLI, https://docs.aws.amazon.com/cli/latest/reference/ - different S3 clients might expose this option differently).
By l.r.caglar@gmail.com - about 3 years ago
Hi! First of all, thank you so much for making this amazing dataset public! I was wondering whether it would be possible to obtain the representational dissimilarity matrices (RDMs) for both the fMRI and the modeling data as well? Thank you!
By bold5000.team@gmail.com - about 3 years ago
You can create the RDMs from the extracted ROI data that can be found under Derivatives/spm.
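A minimal sketch of building an RDM from one ROI's trial-by-voxel matrix, assuming scipy/numpy; the .mat file name follows the CSIX_ROIs_TR34.mat pattern mentioned further below, but the variable name is an illustrative guess - inspect the file's keys first:

import numpy as np
from scipy.io import loadmat
from scipy.spatial.distance import pdist, squareform

# Extracted ROI data; check loadmat(...).keys() for the real variable names.
mat = loadmat("CSI1_ROIs_TR34.mat")
roi_data = np.asarray(mat["LHPPA"])  # assumed layout: trials x voxels

# Representational dissimilarity matrix: 1 - Pearson correlation between
# every pair of trial patterns.
rdm = squareform(pdist(roi_data, metric="correlation"))
print(rdm.shape)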
By torabparhiz.sepehr@gmail.com - almost 4 years ago
Hi, are there other ways to download only some parts of the dataset in addition to using AWS? Amazon blocks access to AWS services from countries such as Iran.

Also, downloading from the OpenNeuro site is difficult with slow connections, as the OpenNeuro download links do not seem to work with download managers or curl.
Thanks.
By krzysztof.gorgolewski@gmail.com - almost 4 years ago
All OpenNeuro data and services (including this website) are served from AWS.

However, you do not need an Amazon account to download the data via the S3 protocol. All you need to do is install the AWS CLI (https://docs.aws.amazon.com/cli/) and run the following command:

aws s3 sync --no-sign-request s3://openneuro.org/ds001499 <target_folder>

where <target_folder> is a local path.
By torabparhiz.sepehr@gmail.com - almost 4 years ago
Thanks a lot. Downloading speed is much better now.
Also, here's an example of how the target folder should be written for anyone else who might want to use this method:
aws s3 sync --no-sign-request s3://openneuro.org/ds001499 snapshots/1.1.1/files/derivatives/fmriprep/
By ggaziv@gmail.com - almost 4 years ago
Hi, what is the size of the whole bundle here? I couldn't find it. Thanks.
By bold5000.team@gmail.com - almost 4 years ago
It's around 300 GB.
By ggaziv@gmail.com - almost 4 years ago
Are the results under derivatives/fmriprep/.../func the fully preprocessed version of the fMRI data used in the analyses in the paper? Are they taken before or after the GLM -> HPF -> mean subtraction? Additionally, it appears that the data preprocessed in T1w space is of type int16. Is this the actual signal (I was expecting float)?
Thanks.
By bold5000.team@gmail.com - almost 4 years ago
The results under fmriprep/.../func are the data after preprocessing is complete. It is this data which was then used as input for the GLM. The results of the GLM were used to extract the data from each ROI. This extracted data is what was used in the paper's analyses. Yes, the data is of type int16.
By ggaziv@gmail.com - over 3 years ago
Thanks. From the paper I understand that the data under fmriprep is already nuisance free: "Data [...] were entered into general linear model (GLM), where nuisance variables were regressed out of the data. [...] All of the nuisance variables were confounds extracted in the fMRIPREP analysis stream."

Can we be specific about the delta between the data under fmriprep/.../func and the data used for the analyses in the paper? If such a delta exists, would you be so kind as to additionally make available for download the version of the data used for the analyses, or, alternatively, the code used to produce it?
Thanks.
By kendrick.kay@gmail.com - about 3 years ago
I also would be interested if it were possible to access the actual prepared/extracted data that are used for further analysis in the paper. That is, the data after the GLM methods / mean-subtraction / etc., in order to ensure numerical consistency with what is done in the paper. Is this at all possible?
By bold5000.team@gmail.com - about 3 years ago
All the data used in the paper are in the extracted ROI files found in Derivatives/spm.
By arashj@stanford.edu - over 2 years ago
Files under derivatives/spm seem to be ROI masks themselves. Would it be possible to access extracted data after applying these masks? Or reference code on how to apply these masks? Thank you
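Until reference code is posted, a minimal sketch of applying one of those masks, assuming nibabel; the file names are illustrative, and the mask must be on the same (T1w) grid as the preprocessed run:

import nibabel as nib

# Illustrative file names; use an actual mask from derivatives/spm and a
# preprocessed run in the same T1w space.
mask_img = nib.load("sub-CSI1_roi-LHPPA_mask.nii.gz")
bold_img = nib.load("sub-CSI1_ses-01_task-5000scenes_run-01_bold_space-T1w_preproc.nii.gz")

mask = mask_img.get_fdata().astype(bool)  # True inside the ROI
bold = bold_img.get_fdata()               # shape: x, y, z, time
roi_timeseries = bold[mask]               # shape: voxels x timepoints
print(roi_timeseries.shape)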
By kwj2001@med.cornell.edu - over 2 years ago
Great dataset!

Can you provide some more detail on the GLM used prior to analysis? I'm trying to recreate the exact voxel values distributed in your ROIs/CSI3/mat/CSI3_ROIs_TR34.mat file, using the *T1w_preproc.nii.gz and *bold_confounds.tsv files and constructing a GLM. So far, the closest I can get to matching your data is with the following choices:

1. Separate GLM for each 194-volume run
2. Input data to GLM is Vpsc=V./mean(V across 194 timepoints), where V is the 194x<voxels> raw voxel time series loaded from T1w_preproc.nii.gz
3. confounds matrix is 194x15, consisting of these columns from confounds.tsv: [CSF, WhiteMatter, GlobalSignal, X, Y, Z, RotX, RotY, RotZ, Cosine00, Cosine01, Cosine02, Cosine03, Cosine04, <column of all ones>]
4. Vresid=(eye(size(confounds,1))-confounds*pinv(confounds))*Vpsc
5. after computing GLM residual, extract mean of TR3 and TR4 timepoints by V34=(Vresid(6:5:188,:)+Vresid(7:5:188,:))/2

For each session, I end up with 370x<voxels> or 333x<voxels> depending on if it's a 10-run or 9-run session. If I correlate these timecourses with the corresponding rows in ROIS_TR34.mat, I end up with a correlation coefficient for each voxel that ranges between 0.97 and 1, depending on the session and ROI.

Is there something you guys did differently from what I describe? If you have the script that generated CSI3_ROIs_TR34.mat, that would be pretty useful for reproducibility in general.

Thanks!
-Keith

By bold5000.team@gmail.com - about 2 years ago
Hi Keith,

Apologies for the delay in response. We did not include any of the five cosine confounds in our GLM - perhaps this accounts for the slight differences between the timecourses?

Let us know if this resolves your issue!

-BOLD5000 team
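For readers following this exchange, a minimal numpy sketch of the residualization Keith describes, with the cosine regressors dropped per the reply above (file names and the confound column list mirror this thread; treat this as an illustrative reconstruction, not the authors' exact script):

import numpy as np
import pandas as pd
import nibabel as nib

# One 194-volume run preprocessed in T1w space (illustrative file names).
bold = nib.load("sub-CSI3_ses-01_task-5000scenes_run-01_bold_space-T1w_preproc.nii.gz")
V = bold.get_fdata().reshape(-1, bold.shape[-1]).T  # time x voxels (194 x N)

# Scale each voxel by its run mean (background voxels with zero mean would
# produce NaNs; mask the brain first in practice).
Vpsc = V / V.mean(axis=0)

# fMRIprep confounds *without* the Cosine00..Cosine04 columns, per the
# BOLD5000 team's reply, plus an intercept column.
conf = pd.read_csv("sub-CSI3_ses-01_task-5000scenes_run-01_bold_confounds.tsv", sep="\t")
cols = ["CSF", "WhiteMatter", "GlobalSignal", "X", "Y", "Z", "RotX", "RotY", "RotZ"]
C = np.column_stack([conf[cols].to_numpy(), np.ones(len(conf))])  # 194 x 10

# Regress the confounds out: residual = (I - C @ pinv(C)) @ Vpsc.
Vresid = Vpsc - C @ (np.linalg.pinv(C) @ Vpsc)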