NIMH Data Archive Tools
In this central location you can access and launch all tools maintained by the NIMH Data Archive. In all cases, a recent version of Java is required in order to use NDA tools. OpenJDK is not supported.
Researchers throughout and beyond the field of mental health research use the NDA GUID Tool, originally developed for the autism research community and the National Database for Autism Research (NDAR), to generate participant identifiers. The GUID Tool takes personally identifiable information provided by research participants and uses it to securely create a unique identifier. Using this tool, participant data can be linked across studies and laboratories while always maintaining the participant's privacy. In order to use this tool, you will need a user account with the appropriate permission assigned.
Contributors harmonizing and uploading their data to the NIMH Data Archive must use the Validation and Upload Tool to send their data and complete this process. This tool connects to the Data Dictionary, and then allows you to load data templates and validate them against their definitions. This helps ensure that data in NDA is harmonized to a standard and serves as a "pre-upload" QA check on your data. After data is successfully harmonized, the same tool is used to package and upload the data to your NDA Collection. In addition to working with CSV data templates, the tool also supports direct uploads from a hosted AWS-RDS database. There are currently three different versions of the tool. Please review them below to determine the most appropriate one for your situation. If in doubt, please use the first option: the HTML version.
HTML Validation and Upload Tool
This version of the tool runs as a webpage, allowing you to validate the quality of your data and upload it directly through your web browser. Chrome, Firefox, Safari, and recent versions of Internet Explorer are supported.
To support the BIDS format for imaging data, NDA has added a new Data Element of type Manifest to the existing set of String, Integer, Float, Thumbnail, File, and GUID. The Manifest data element is like the File data element in that the data submission template specifies the location of an XML or JSON file containing a collection of files. This supports the capability to create NDA data structures that describe a collection of related file resources. This new element treats the files included in the Manifest as associated files for purposes of submission and these are ingested and stored as individual objects in AWS S3 Object storage, which also enables users to directly access specific files from the collection of files.
How to submit BIDS data using a manifest:
Manifest files can be submitted with two NDA data structures, image03 and fmriresults01. Instead of indicating a file in the image_file or derived_files columns, list the JSON file in the Manifest data element column. Currently, submissions for structures that use this new type must use the vtcmd tool, which is distributed with the nda-tools Python package.
For more information about Manifest data elements, examples, and helper scripts, please refer to our GitHub repository.
Users approved for access to NDA shared data can use the Download Manager to view and download data packages they have added to their account or created using the query tools. Access to download data to non-AWS internet addresses is limited to 20 Terabytes over 30 days. For more detail, including examples, please read about our user download threshold.
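As a rough illustration of the threshold arithmetic, the sketch below totals downloads inside a trailing 30-day window and reports how much of the 20 Terabyte limit would remain. It assumes decimal terabytes and a simple rolling window; how NDA actually measures usage is described in the user download threshold documentation, and the function name and record format here are hypothetical.

```python
from datetime import date, timedelta

# Assumption: 1 Terabyte = 10**12 bytes; NDA may count differently.
THRESHOLD_BYTES = 20 * 10**12
WINDOW = timedelta(days=30)


def remaining_allowance(downloads, today):
    """Return the bytes still available under the threshold.

    downloads: iterable of (date, size_in_bytes) pairs (hypothetical format).
    Only downloads within the 30 days ending on `today` are counted.
    """
    window_start = today - WINDOW
    used = sum(size for day, size in downloads if window_start < day <= today)
    return max(THRESHOLD_BYTES - used, 0)


if __name__ == "__main__":
    history = [
        (date(2024, 1, 1), 5 * 10**12),   # falls outside the window below
        (date(2024, 1, 20), 3 * 10**12),  # falls inside the window below
    ]
    print(remaining_allowance(history, date(2024, 2, 5)))
```

Downloads made from AWS internet addresses are not subject to this limit, so they would simply be left out of the history passed to the function.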
Note: Data packages containing omics data must be accessed through the cloud and cannot be downloaded directly using this tool. In addition to the Download Manager tool described below, a Python command line downloader is also available.
The Download Manager tool is provided as a Java Web Start application, which is launched using a Java Network Launch Protocol (JNLP) file downloaded from NDA. Opening the JNLP file requires the Java Runtime Environment (JRE) for Java 8, which will download the resources from NDA that are necessary to launch the application. Note that only Java 8 is compatible with the NDA Download Manager.
To launch (i.e., run) the tool, the following prerequisites must be satisfied:
- Installation of Java Runtime Environment (JRE) for Java 8
- Copy of JNLP file, or URL for the JNLP file
Verify that version 1.8 of the JRE is installed by opening a command prompt (Windows) or terminal window (Mac and Linux), entering the command java -version, and pressing the ‘Enter’ key. The output will show either the installed version of the Java Runtime Environment (JRE) or an error message indicating that the command is not recognized. The JRE version should begin with 1.8.0_ followed by the specific update version (e.g., java version "1.8.0_261").
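The same check can be scripted. The following is a minimal sketch, not an NDA-provided utility: it runs java -version, looks for the quoted version string, and tests whether it begins with 1.8.0_. The function names are illustrative only.

```python
import re
import subprocess


def is_java8(version_output: str) -> bool:
    """Return True if `java -version` output reports a 1.8.0_ release."""
    match = re.search(r'version "([^"]+)"', version_output)
    return bool(match) and match.group(1).startswith("1.8.0_")


def java_version_output() -> str:
    """Run `java -version`; return its text, or "" if java is not found."""
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        return ""
    # The version banner is normally written to stderr, so check both streams.
    return result.stderr + result.stdout


if __name__ == "__main__":
    print(is_java8(java_version_output()))
```

A result of False means either that Java was not found or that the installed version is not a 1.8 release; in both cases, follow the installation instructions for Java 8.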
To obtain a copy of the JNLP file, click Download NDA Download Manager JNLP. The browser should begin downloading a file named DownloadManager.jnlp.
Installing Java SE Runtime Environment (JRE) for Java 8
If you were not able to successfully verify installation of JRE version 1.8, please download the installation file for your operating system and follow the instructions provided for installation. Note that other JRE versions are not compatible with this tool.
Further instructions for installing Linux JRE for Java 8 and switching between Java versions are provided here.
Launching the Download Manager
After successfully installing JRE 1.8, you should be able to double-click on the DownloadManager.jnlp file that was downloaded (see prerequisites). The installation should register the Java Web Launcher executable (javaws) as the application used to open or run a JNLP file. You may need to add an exception for the https://nda.nih.gov URL in the JRE configuration to allow the application to run.
If the system does not recognize which application to use for the jnlp file:
- Open a command prompt (Windows) or terminal (Mac and Linux),
- Change to the directory where you downloaded the DownloadManager.jnlp file, typically the Downloads folder for your user, and
- Enter the command: javaws DownloadManager.jnlp.
When the Java Web Launcher opens the DownloadManager.jnlp file, several files will be downloaded and a prompt will be displayed with the text: Do you want to run this application?
The prompt will show the following information:
Name: Download Manager
This application will run with unrestricted access which may put your computer and personal information at risk. Run this application only if you trust the locations and publisher above.
Click on the Run button to open the Download Manager, and you will be prompted to enter your NDA username and password. After entering a valid username and password, the tool will open.
Note: There may be some delay after entering your credentials, especially if your account has a large number of packages or if the packages include a large number of files.
Using the Download Manager
Select the location where packages should be saved by clicking the "Browse" button and then selecting a directory with enough space for your package. The size of each package is shown in the table that lists your packages. The default location is your user’s home directory.
Select the check box next to the package you wish to download and click the Start Downloads button. The package status must be 'Ready to Download' before you can begin downloading files. If the package status is Creating Package, press the Refresh Queue button to update the status. Depending on the size (number of subject records and files) of the package this can take up to 30 minutes after initiating the data package request in the NDA Web Application.
Download Manager will create a directory with the package name, in the location you selected. If any errors are encountered during the download, an errors.txt file will be created in this directory with detailed error messages, and the progress bar will not reach 100% completion.
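After a download finishes, a short script can confirm whether an errors.txt file was written. This sketch assumes only what is stated above: that the file, if present, sits in the directory named after the package. The package name Package_1234 and the function name are hypothetical, and no assumption is made about the format of the error messages.

```python
import tempfile
from pathlib import Path
from typing import Optional


def read_download_errors(download_root: str, package_name: str) -> Optional[str]:
    """Return the contents of errors.txt for a package, or None if absent."""
    errors_file = Path(download_root) / package_name / "errors.txt"
    if errors_file.is_file():
        return errors_file.read_text(errors="replace")
    return None


if __name__ == "__main__":
    # Demonstration against a temporary directory standing in for the
    # location selected in the Download Manager.
    with tempfile.TemporaryDirectory() as root:
        package_dir = Path(root) / "Package_1234"  # hypothetical package name
        package_dir.mkdir()
        print(read_download_errors(root, "Package_1234"))
        (package_dir / "errors.txt").write_text("example error line\n")
        print(read_download_errors(root, "Package_1234"))
```

If the function returns text rather than None, that text is what the Help Desk will want to see alongside the package id.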
Depending on your internet connection and performance of the storage device (internal hard drive, network attached storage, etc.) downloading an entire package may require several days to complete. Users can typically expect packages in excess of 1 Terabyte to take several days to download.
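To see why large packages take this long, a back-of-the-envelope estimate helps. The sketch below assumes a decimal terabyte (10**12 bytes) and a sustained transfer rate with no interruptions or storage bottlenecks, so real downloads will generally take longer. The 50 megabits per second figure is only an example.

```python
def estimated_download_days(size_bytes: float, megabits_per_second: float) -> float:
    """Estimate download time in days at a constant transfer rate."""
    bits = size_bytes * 8
    seconds = bits / (megabits_per_second * 1_000_000)
    return seconds / 86_400  # seconds per day


if __name__ == "__main__":
    one_terabyte = 10**12  # assumption: decimal terabyte
    # 1 Terabyte at a sustained 50 Mbit/s
    print(round(estimated_download_days(one_terabyte, 50), 2))  # prints 1.85
```

At that rate a 1 Terabyte package needs nearly two full days of uninterrupted transfer, which is consistent with the several days users typically experience.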
If you encounter any errors, or have additional questions, please open a Help Desk ticket by sending an email to NDAHelp@nih.gov. Please include the package id in your request, along with a brief description of the issue you are experiencing, and a copy of any errors.txt file if present.
The mission of the National Institute of Mental Health Data Archive (NDA) is to make research data available for reuse. Data collected across projects can be aggregated and made available using the GUID, including clinical data, and the results of imaging, genomic, and other experimental data collected from the same participants. In this way, separate experiments on genotypes and brain volumes can inform the research community on the over one hundred thousand subjects now contained in the NDA. The NDA’s cloud computation capability provides a framework in support of this infrastructure.
How does it work?
The NDA holds and protects rich datasets (fastq, brain imaging) in object-based storage (Amazon S3). To facilitate access, the NDA supports the deployment of data packages (created through the NDA Query tools) to an Amazon Web Services Oracle database. These databases, called miNDARs, contain a table for each data structure in a package. Associated raw or evaluated data files are available via read-only access to NDA’s S3 objects. Addresses for those objects in the associated package are provided in the miNDAR table titled S3_LINKS. By providing this interface, the NDA envisions real-time computation against rich datasets that can be initiated without the need to download full data packages. Furthermore, a new category of data structure has been created called "evaluated data." Tables for these structures will be created for each miNDAR, allowing researchers using NDA cloud capabilities and computational pipelines to write any analyzed data directly back to the miNDAR database. This will enable the NDA to make this data available to the general research community when appropriate.
miNDARs can also be populated with your own raw or evaluated data and uploaded directly back into the NDA for a streamlined data submission directly from a hosted database.
How do I get started?
The option to launch data packages to a cloud hosted database will be available during package creation. You can deploy previously generated data packages as well as new ones.
To move data to Oracle, first create a package in the NDA. Then, following registration, enter the package id and credentials requested on the miNDAR tab. This will start the miNDAR creation process, which takes approximately 10 minutes. Once the miNDAR is created, its connection details will be emailed to you and can be used to establish a connection with your credentials.
Access to download data to non-AWS internet addresses is limited; please read about our user download threshold. Access from AWS internet addresses is unlimited.
Files included in a package are accessible from Amazon Web Services (AWS) S3 Object Storage. Each miNDAR package will have a table named “S3_LINKS” containing URIs for all objects in that package. Using direct web service calls to the Amazon Web Services S3 API, a third party tool, or client libraries, data from these objects can be streamed or downloaded.
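Most S3 client libraries ask for a bucket name and an object key rather than a full URI. The sketch below splits a URI into those two parts using only the Python standard library. It assumes the S3_LINKS entries use the s3://bucket/key form; the example URI is made up for illustration and does not point to a real NDA object.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    # Hypothetical URI of the kind assumed to appear in S3_LINKS.
    bucket, key = split_s3_uri("s3://example-bucket/submission_1/sub-01/scan.nii.gz")
    print(bucket)
    print(key)
```

The resulting pair can then be handed to a client library of your choice, together with the temporary credentials described in the following paragraphs.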
For security purposes temporary AWS credentials are needed to access the S3 Objects. Temporary credentials are issued by authenticating with a web service using your NDA username and password. AWS credentials can be obtained directly from the web service (see examples on our GitHub page) or from the download manager, which is available in both a GUI and command line version.
For the GUI version, go to the 'Tools' menu and select 'Generate AWS Credentials'.
For the command line download manager, use the following syntax:
java -jar downloadmanager.jar --username user --password pass --g
For help with the command line download manager, use the following switches: -h, --help
The web service provides temporary credentials in three parts:
- an access key,
- a secret key,
- and a session token
All three parts are needed in order to authenticate properly with S3 and retrieve data.
Additionally, the web service returns an expiration timestamp for the token in YYYY-MM-DDTHH:MM:SS-TZ format (TZ=HH:MM). New keys can be retrieved at any time. A service-oriented approach allows for implementation of pipeline procedures which can request new keys at the appropriate stage of data processing.
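A pipeline can use the expiration timestamp to decide when to request new keys. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, the credential values are placeholders, and the timestamp is assumed to look like 2030-01-01T12:00:00-05:00. Fetching the credentials from the web service is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class TemporaryCredentials:
    """The three credential parts plus the expiration time."""
    access_key: str
    secret_key: str
    session_token: str
    expiration: datetime


def parse_expiration(stamp: str) -> datetime:
    """Parse a YYYY-MM-DDTHH:MM:SS-HH:MM timestamp into an aware datetime."""
    return datetime.fromisoformat(stamp)


def needs_refresh(creds: TemporaryCredentials, now: datetime,
                  margin: timedelta = timedelta(minutes=5)) -> bool:
    """Return True if the credentials expire within `margin` of `now`."""
    return now >= creds.expiration - margin


if __name__ == "__main__":
    creds = TemporaryCredentials(
        access_key="PLACEHOLDER_ACCESS_KEY",
        secret_key="PLACEHOLDER_SECRET_KEY",
        session_token="PLACEHOLDER_SESSION_TOKEN",
        expiration=parse_expiration("2030-01-01T12:00:00-05:00"),
    )
    print(needs_refresh(creds, datetime.now(timezone.utc)))
```

Calling needs_refresh before each processing stage, and requesting new keys when it returns True, keeps long-running jobs from failing partway through with expired credentials. The five minute margin is an arbitrary choice.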