Getting Files

Once you are approved for access to ABCD shared data, you can learn how to use NDA query tools to create and download packages containing tab-delimited text files in this training. The Frequently Asked Questions below provide information on the ABCD Study and its associated releases, as well as how to get imaging data files from the ABCD repository.




What is the ABCD Data Repository?

Data collected as part of the Adolescent Brain Cognitive Development Study will be harmonized to standard definitions and shared with qualified researchers through the NIMH Data Archive (NDA) for research purposes. All imaging and non-imaging assessment data will be shared in the ABCD Data Repository. This website is the central location for the repository, which must be accessed via a free account registered with the NDA.

What is the NIMH Data Archive?

The NIMH Data Archive (NDA) is a research data sharing infrastructure and database supported by NIH. The archive includes the ABCD repository and several others, including the National Database for Autism Research (NDAR), the National Database for Clinical Trials related to Mental Illness (NDCT), the Research Domain Criteria Database (RDoCdb), and more. Each of these repositories houses different kinds of data, but they share an infrastructure and are accessed via one central website, formerly the NDAR website.

How often is data released and what is the difference between the two schedules?

ABCD Study data is released on two schedules: Fast Track data is released on an ongoing basis, and Annual Release data is released once per year, starting in February 2018. Fast Track releases include only raw imaging data and may be subject to updates or corrections. Because Fast Track data is released on an ongoing basis, researchers interested in it should check back weekly or monthly for new data. Each year, a Curated Annual Release will be shared, including pre-processed imaging and all non-imaging assessments. Each annual release will re-release previous curated data along with new curated data. All annual releases will be documented on this page.

Clinical Data Sets

What assessments are included in these data sets?

Please review the Curated Annual Release 4.0 Summary for information on everything included in the release. Release 4.0 replaced Release 3.0. Only Data Release 4.0 data should be used going forward. Details of updates are provided in the Changes and Known Issues section of the Data Release 4.0 Release Notes.

Visit the ABCD Study Collection page for detailed information on all data from the ABCD Data Analysis Informatics & Resources Center. ABCD Study Data Release 4.0 Release Notes contain detailed information on the data release.

Getting Image Volumes

Where are the ABCD imaging volumes?

The NIMH Data Archive structures 'Image03' and 'fmriresults01' contain information on the subjects and timepoints, along with metadata about the associated images, and can be downloaded directly from the website using the Download Manager tool. The Image03 structure contains all of the raw imaging data shared as part of ABCD Fast Track releases. Those interested in, or more comfortable working with, the minimally processed data, which is in BIDS format, should use fmriresults01, which contains those data and is included in the Curated Annual Release.

The images themselves are stored in Amazon Web Services' S3 cloud storage. To access them, you will need to go through a few steps: getting the Image03/fmriresults01 data structure, identifying which imaging files you need, and using web service calls to AWS to get the volumes directly from their cloud storage location with a tool, script, or other method that supports this. One such tool supported by AWS is the AWS Command Line Interface, which allows you to interact with S3 through the command line.

How do I get these data files?

The Image03 file containing information on all the available raw images in the ABCD repository, as well as the fmriresults01 structure containing all minimally processed images, can be obtained by logging in, going to the ABCD Collection and clicking Download. Once you press Create Data Package/Add to Data Study in the Filter Cart, you'll be taken to the Data Packaging Page where you can select or deselect specific measures to include in your data package.

Once you've selected the measures of interest, including Image03 or fmriresults01, you can create the package (leaving "Include Associated Files" unchecked) and download it using the Download Manager tool, or deploy it to your own RDS database in the cloud, called a miNDAR.

Files included in a package are accessible from Amazon Web Services (AWS) S3 Object Storage. Each miNDAR package will have an “S3_LINKS” table containing URIs for all objects in that package. Using direct web service calls to the Amazon Web Services S3 API, a third-party tool, or client libraries, data from these objects can be streamed or downloaded.
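Most S3 tools and client libraries ask for a bucket name and an object key rather than a full URI. The following is a minimal sketch of splitting an S3 URI into those two parts. It assumes the URIs take the usual s3://bucket/key form; the helper name split_s3_uri and the example URI are hypothetical and not taken from any NDA package.

```python
def split_s3_uri(uri):
    """Split an 's3://bucket/key' URI into a (bucket, key) tuple."""
    uri = uri.strip()
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the object key.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI is missing a bucket or key: {uri!r}")
    return bucket, key


# Hypothetical URI, for illustration only.
example_uri = "s3://example-bucket/example_submission/example_image.tgz"
bucket, key = split_s3_uri(example_uri)
print(bucket)
print(key)
```

The string is split by hand rather than with a general URL parser so that characters such as "?" or "#" inside an object key are kept as part of the key.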

How do I generate temporary AWS credentials?

A detailed description of this process is available on our cloud page.

Authorization to access S3 Objects requires authentication with AWS using temporary AWS credentials or a presigned URL. Both forms of authorization are time-limited and require individual users to authenticate with the package service, requesting either a presigned URL or temporary credentials for one or more files within a specific package.

Users may access the web service using the Swagger user interface, the NDA Tools command-line download manager written in Python, or a tool of their own.

Examples are provided on the NDA GitHub Page.

How do I use the imaging metadata files?

If you've downloaded the metadata file for the type of data you're interested in, you can use whatever tool you wish to filter, query, or manipulate the data to determine which subjects/files are of interest to you. If you have deployed your package to a miNDAR hosted by NDA in the cloud, you can connect to it using Oracle and query or manipulate it as you would any other Oracle database. In addition to the image-related information you might need to identify files of interest, the Image03 and fmriresults01 structures will contain the image_file element, which will specify the location of the files in S3.
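As one way to do this filtering with a downloaded file, here is a minimal sketch using only the Python standard library. It assumes the file is tab-delimited with element names in the first row and a second row of element descriptions; set has_description_row=False if your file has no such row. The column names subjectkey and scan_type, the helper name select_image_files, and the sample rows are illustrative assumptions; only image_file is named in the text above.

```python
import csv
import io


def select_image_files(lines, predicate, has_description_row=True):
    """Return the image_file value of every row for which predicate(row) is true.

    `lines` is any iterable of text lines, such as an open file handle.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    if has_description_row:
        # Assumption: the row after the header holds descriptions, not data.
        next(reader, None)
    return [row["image_file"] for row in reader if predicate(row)]


# Made-up sample standing in for a downloaded metadata file.
sample = (
    "subjectkey\tscan_type\timage_file\n"
    "Subject key\tType of scan\tLocation of the image file\n"
    "SUBJ_A\tMR structural (T1)\ts3://example-bucket/a_T1.tgz\n"
    "SUBJ_B\tfMRI\ts3://example-bucket/b_fmri.tgz\n"
)

t1_files = select_image_files(io.StringIO(sample), lambda row: "T1" in row["scan_type"])
print(t1_files)
```

For a real file, pass an open handle instead of the sample, for example open(path, newline="", encoding="utf-8").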

What if I want to filter the metadata before I download it?

If you prefer to download, or deploy to a miNDAR, a pre-filtered dataset, you can use the NIMH Data Archive's query tools to restrict the data included. Start by adding data to your cart as above, either from the entire ABCD Collection, for all of the Fast Track and Annual Release data, or from the NDA Study for a particular Annual Release. Then, you can apply additional filters in the Data Dictionary to restrict by element value (e.g. female or an age range). Please note that multiple restrictions in a single filter will match using OR logic, but applying multiple filters will use AND logic, returning fewer results the more filters you add.
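The OR-within-a-filter, AND-across-filters behavior can be illustrated with a short sketch. This is not the NDA query tool itself, only a model of the matching logic described above; the field names sex and age_months, the helper name matches, and the sample records are hypothetical.

```python
def matches(record, filters):
    """True if the record passes every filter (AND).

    Each filter is a list of restrictions, and a filter passes when at
    least one of its restrictions is satisfied (OR).
    """
    return all(any(test(record) for test in restrictions) for restrictions in filters)


# Made-up records, for illustration only.
records = [
    {"id": 1, "sex": "F", "age_months": 110},
    {"id": 2, "sex": "F", "age_months": 140},
    {"id": 3, "sex": "M", "age_months": 125},
]

sex_filter = [lambda r: r["sex"] == "F"]
age_filter = [
    lambda r: 108 <= r["age_months"] < 120,  # first restriction
    lambda r: 120 <= r["age_months"] < 132,  # second restriction, ORed with the first
]

one_filter = [r["id"] for r in records if matches(r, [age_filter])]
two_filters = [r["id"] for r in records if matches(r, [sex_filter, age_filter])]
print(one_filter)
print(two_filters)
```

Adding the second filter can only shrink the result set, because every record must now pass both filters.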

Once I have the S3 links, how do I get the images?

This entails using your NDA account to generate temporary AWS credentials, then using those credentials and the S3 links with a third-party tool or method of your choice to access the images.

The NDA Tools download command line client will manage this process for you.
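For those who prefer to script the steps themselves, the sketch below shows one possible way to hand temporary credentials and an S3 link to the AWS Command Line Interface. It only builds the command and environment; it does not contact NDA or AWS. The environment variable names and the aws s3 cp command are standard AWS CLI conventions, while the helper name build_download, the credential dictionary keys, and all values shown are hypothetical. How the credentials are obtained from NDA is not covered here.

```python
import os
import posixpath


def build_download(s3_uri, dest_dir, credentials):
    """Return (cmd, env) for copying one S3 object with the AWS CLI.

    `credentials` is assumed to be a dict with the hypothetical keys
    'access_key', 'secret_key', and 'session_token'.
    """
    env = dict(os.environ)
    env.update({
        "AWS_ACCESS_KEY_ID": credentials["access_key"],
        "AWS_SECRET_ACCESS_KEY": credentials["secret_key"],
        # Temporary credentials also need the session token.
        "AWS_SESSION_TOKEN": credentials["session_token"],
    })
    # Name the local file after the last component of the object key.
    local_path = os.path.join(dest_dir, posixpath.basename(s3_uri))
    cmd = ["aws", "s3", "cp", s3_uri, local_path]
    return cmd, env


# Placeholder values, for illustration only.
fake_credentials = {
    "access_key": "EXAMPLE_KEY_ID",
    "secret_key": "EXAMPLE_SECRET",
    "session_token": "EXAMPLE_TOKEN",
}
cmd, env = build_download(
    "s3://example-bucket/example_submission/example_image.tgz",
    "downloads",
    fake_credentials,
)
print(cmd)
```

With the AWS CLI installed and valid, unexpired credentials, the result could be run with subprocess.run(cmd, env=env, check=True).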

Are supporting resources available for cloud computation?

Yes. NDA offers a pilot program for obtaining computational credits to support analyzing these data in place in S3. Please review the policy for more information.

Do I need an AWS account?

No. Researchers accessing the data directly in S3 can use credentials generated by NDA, associated with NDA's AWS account. Even users working on the data in place in S3, approved for computational credits, will not need their own account. Only researchers wishing to compute on data in their own AWS account without credits will need one.

Image Processing

What parcellation schemes were used for the different imaging modalities?

The Desikan parcels are used for all modalities, including sMRI, dMRI, task-fMRI, and rs-fMRI. In addition, the Gordon parcels are used for both task- and rs-fMRI. For rs-fMRI, the within- and between-network measures are calculated from the Gordon parcels only (because the Desikan parcels span multiple functional areas and the same networks are not defined for the Desikan parcels), but the Desikan parcels are included in the variance measures. Additionally, sMRI values are provided as weighted fuzzy clusters, which are based on the genetically derived atlas of Chen et al. Be advised that there are limits to any parcellation scheme, whereby voxels may fall out of bounds and be excluded from an ROI, and that users should be cautious when examining functional connectivity patterns, especially in task-based fMRI.