Data Harmonization

NDA supports an unlimited amount of clinical, demographic, genomic, and phenotypic data associated with human subjects research (see Data Dictionary). To harmonize data across projects, all data submitted to NDA must be consistent with a data structure defined in the Data Dictionary. While creating the Data Expected List, researchers should locate existing structures that can be used to harmonize all of their assessments and measures. Where no definition exists yet, researchers are encouraged to extend the Data Dictionary by providing NDA with the information needed to define the new assessment. All captured data should be made consistent either by adopting an existing NDA definition or by extending the NDA data definitions.

This page describes several types of data as they are distinguished in the NDA submission process, and how each will look once harmonized and ready to upload. Researchers can use this information to plan any infrastructure or organization on their end that would make uploading certain data types easier.

Clinical Assessments

The term Clinical Assessments is used here in a very general sense to describe all clinical, behavioral, demographic, or other tabular data structures that have one thing in common: they are harmonized and submitted to NDA simply as tables of data. In the basic submission process these are CSV spreadsheets, each matching an NDA Data Dictionary definition that is either pre-existing or defined specifically for your project. Some researchers have organized their internal data infrastructure to match these definitions and simply export and upload the contents when the time comes. Others enter data directly into the spreadsheets.
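As a rough illustration of what "matching a definition" can mean in practice, the sketch below compares the columns and required values of a CSV table against a structure definition. The element names, the EXAMPLE_STRUCTURE dictionary, and the check_table function are all hypothetical and are not taken from the NDA Data Dictionary or from any NDA tool; real definitions carry more detail (types, value ranges, and so on).

```python
import csv
import io

# Hypothetical structure definition: element name -> whether it is required.
# Real definitions come from the NDA Data Dictionary; these names are illustrative only.
EXAMPLE_STRUCTURE = {
    "subject_id": True,
    "visit_date": True,
    "total_score": False,
}


def check_table(csv_text, structure):
    """Return a list of problems found when comparing a CSV table to a structure."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    problems = []

    # Columns that the structure does not define.
    for name in columns:
        if name not in structure:
            problems.append(f"unknown column: {name}")

    # Required elements that the table does not provide at all.
    for name, required in structure.items():
        if required and name not in columns:
            problems.append(f"missing required column: {name}")

    # Required elements that are present as columns but empty in a given row.
    for row_number, row in enumerate(reader, start=1):
        for name, required in structure.items():
            if required and name in columns and not (row[name] or "").strip():
                problems.append(f"row {row_number}: empty required value for {name}")

    return problems
```

Running a check like this locally before export can catch mismatches early, though it is no substitute for the validation NDA itself performs.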

Neurosignal Recordings

NDA accepts evoked response/event-based data from EEG, fMRI, eye tracking, MEG, and EGG experiments. Each of these types of data has one standard structure in the Data Dictionary that can be used to upload the associated files. These are:

These structures allow you to provide required information about the captured files, and each contains a data file element in which you specify the path (location on the local drive or file system) of the associated data files for upload. An Experiment Definition, created in your NDA Collection, provides the metadata associated with an Experiment. Once defined, the Experiment is reviewed for a basic level of quality (e.g., that it does not list N/A in all required fields) and then marked "Approved" by NDA Staff.

Additionally, NDA provides imaging quality assurance for submitted imaging files using the FAST and FIRST computational pipelines of the FMRIB Software Library (FSL), developed at the Oxford Centre for Functional MRI of the Brain (FMRIB). These quality assurance results are available to authorized users for query and download under 'Evaluated Data' in the Data Dictionary.

When submitting these data, the Validation and Upload Tool will:

  1. Verify the Experiment Definition referenced exists and has been "Approved" by NDA Staff.
  2. Locate the files using the path provided and verify that they exist.

As it uploads, the tool retrieves those files and includes them in the upload package.
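The two checks above can be sketched as follows. This is only an illustration of the logic, not the Validation and Upload Tool's actual implementation: the function name check_submission_rows, the row keys "experiment_id" and "data_file", and the idea of passing in a set of approved IDs are all assumptions made for the example.

```python
from pathlib import Path


def check_submission_rows(rows, approved_experiment_ids, base_dir):
    """Apply the two checks described above to each row of a submission table.

    rows: list of dicts with the hypothetical keys "experiment_id" and "data_file".
    approved_experiment_ids: set of Experiment IDs assumed to be approved already.
    base_dir: directory against which relative file paths are resolved.
    """
    errors = []
    files_to_upload = []
    for index, row in enumerate(rows, start=1):
        # Check 1: the referenced Experiment Definition must be approved.
        if row["experiment_id"] not in approved_experiment_ids:
            errors.append(
                f"row {index}: experiment {row['experiment_id']} is not approved"
            )
        # Check 2: the file named in the data file element must exist locally.
        path = Path(base_dir) / row["data_file"]
        if path.is_file():
            files_to_upload.append(path)
        else:
            errors.append(f"row {index}: file not found: {row['data_file']}")
    return errors, files_to_upload
```

Keeping data files under one base directory and storing relative paths in the table makes a check like this easy to repeat on any machine before submission.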

Anatomical MRI

Structural images are uploaded using the same Image structure as functional and other image types. However, there is no associated Experiment Definition for these images: an upload of anatomical MRI data consists only of the associated files and the Image03 structure.

Omics

To submit omics data (genomics, proteomics, metabolomics, etc.) to the NDA, you will first need to use the Experiments Tab of your project's NDA Collection to create an Experiment that defines parameters like molecule, platform, software, etc. This is the same in principle as the Experiment Definition requirement for Neurosignal Recordings data, but with different types of options in the tool. Once the experiment is created, it will be assigned an ID number. You can then use the standard omics definition (genomics_sample structure) to provide required sample information, and enter the Experiment ID in the field of the same name. Associated files are referenced in a data file element and included in the upload package.
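To make the relationship between an Experiment ID and the sample table concrete, here is a minimal sketch that stamps one Experiment ID onto a set of sample rows and serializes them as CSV. The column names sample_id, experiment_id, and data_file and the function build_sample_table are hypothetical placeholders; the actual element names are those defined for the genomics_sample structure in the Data Dictionary.

```python
import csv
import io


def build_sample_table(experiment_id, samples):
    """Return CSV text with one row per sample, each referencing the same Experiment ID.

    samples: list of dicts with the hypothetical keys "sample_id" and "data_file".
    """
    fieldnames = ["sample_id", "experiment_id", "data_file"]
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for sample in samples:
        writer.writerow(
            {
                "sample_id": sample["sample_id"],
                # The ID assigned when the Experiment was created in the Collection.
                "experiment_id": experiment_id,
                # Path to the associated file that will go into the upload package.
                "data_file": sample["data_file"],
            }
        )
    return buffer.getvalue()
```

For example, build_sample_table(42, [{"sample_id": "S1", "data_file": "reads/S1.fastq"}]) yields a header line followed by the row S1,42,reads/S1.fastq.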