National Institute of Mental Health Data Archive (NDA) Sign In
NDA

Success! An email is on its way!

Please check your email to complete the linking process. The link you receive is only valid for 30 minutes.

Check your spam or junk folder if you do not receive the email in the next few minutes.

Warning Notice This is a U.S. Government computer system, which may be accessed and used only for authorized Government business by authorized personnel. Unauthorized access or use of this computer system may subject violators to criminal, civil, and/or administrative action. All information on this computer system may be intercepted, recorded, read, copied, and disclosed by and to authorized personnel for official purposes, including criminal investigations. Such information includes sensitive data encrypted to comply with confidentiality and privacy requirements. Access or use of this computer system by any person, whether authorized or unauthorized, constitutes consent to these terms. There is no right of privacy in this system.
Create or Link an Existing NDA Account
NIMH Data Archive (NDA) Sign In or Create An Account
Update Password

You have logged in with a temporary password. Please update your password. Passwords must contain 8 or more characters and must contain at least 3 of the following types of characters:

  • Uppercase
  • Lowercase
  • Numbers
  • Special Characters limited to: %,_,!,@,#,$,-,&,+,=,),(,*,^,:,;
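
The password rule above can be sketched as a small check. This is only an illustration, not NDA's actual validation: the function name is hypothetical, the special-character set is copied from the list above (treating the commas as separators), and the sketch does not reject characters outside the listed types.

```python
# Hypothetical helper illustrating the stated rule; not NDA's real validator.
# Special characters copied from the list above (commas treated as separators).
SPECIAL_CHARACTERS = set("%_!@#$-&+=)(*^:;")


def is_valid_password(password: str) -> bool:
    """Return True if the password has 8+ characters and at least 3 character types."""
    if len(password) < 8:
        return False
    types_present = [
        any(c.isupper() for c in password),             # Uppercase
        any(c.islower() for c in password),             # Lowercase
        any(c.isdigit() for c in password),             # Numbers
        any(c in SPECIAL_CHARACTERS for c in password), # Special Characters
    ]
    return sum(types_present) >= 3
```

For example, a password mixing uppercase, lowercase, and numbers passes as long as it has at least 8 characters, while an all-lowercase password of any length does not.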


Packaging, downloading data, and MINDAR databases are currently unavailable.

1 Numbers reported are subjects by age
New Trial
New Project

Enter the grant number as Activity Code, Institute Abbreviation, and Serial Number. Exclude the Grant Type, Support Year, and Suffix. For example, grant 1R01MH123456-01A1 should be entered as R01MH123456.
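
As a rough sketch of the conversion described above, the following strips the Grant Type, Support Year, and Suffix from a full grant number. The function name and the regular expression are assumptions for illustration: the pattern assumes a one-digit grant type, a three-character activity code, a two-letter institute abbreviation, and a six-digit serial number, which fits the example given but may not cover every real grant number.

```python
import re

# Assumed layout (illustrative only): optional one-digit grant type,
# three-character activity code, two-letter institute abbreviation,
# six-digit serial number, optional "-<support year><suffix>" tail.
_GRANT_PATTERN = re.compile(
    r"(?P<grant_type>\d)?"
    r"(?P<activity>[A-Z][A-Z0-9]{2})"
    r"(?P<institute>[A-Z]{2})"
    r"(?P<serial>\d{6})"
    r"(?:-(?P<year>\d{2})(?P<suffix>[A-Z0-9]*))?"
)


def to_entry_format(grant_number: str) -> str:
    """Keep Activity Code + Institute Abbreviation + Serial Number; drop the rest."""
    match = _GRANT_PATTERN.fullmatch(grant_number.strip().upper())
    if match is None:
        raise ValueError(f"Unrecognized grant number: {grant_number!r}")
    return match["activity"] + match["institute"] + match["serial"]


print(to_entry_format("1R01MH123456-01A1"))  # R01MH123456
```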

Please select an experiment type below

Collection - Use Existing Experiment
To associate an experiment with the current collection, select an experiment from the table below, then click the associate experiment button to persist your changes (saving the collection is not required). Note that once an experiment has been associated with two or more collections, the experiment will no longer be editable.

The table search feature is case insensitive and targets the experiment id, experiment name and experiment type columns. The experiment id is searched only when the search term entered is a number, and filtered using a startsWith comparison. When the search term is not numeric the experiment name is used to filter the results.
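
The search behavior described above can be sketched as follows. This is an illustration only, not the site's implementation: the `Experiment` type and `search_experiments` function are hypothetical, and the sketch assumes that a non-numeric term matches anywhere in the experiment name (the text does not say whether the name filter is a prefix or substring match) and that an empty term returns every row.

```python
from typing import Iterable, List, NamedTuple


class Experiment(NamedTuple):
    experiment_id: int
    name: str
    experiment_type: str


def search_experiments(experiments: Iterable[Experiment], term: str) -> List[Experiment]:
    """Filter experiments by a search term, case-insensitively."""
    term = term.strip()
    if not term:
        return list(experiments)  # assumption: an empty search shows all rows
    if term.isdigit():
        # Numeric term: startsWith comparison against the experiment id.
        return [e for e in experiments if str(e.experiment_id).startswith(term)]
    # Non-numeric term: filter on the experiment name (substring match assumed).
    needle = term.lower()
    return [e for e in experiments if needle in e.name.lower()]


rows = [
    Experiment(24, "HI-NGS_R1", "Omics"),
    Experiment(475, "MB1-10 (CHOP)", "Omics"),
    Experiment(47, "ARRA Autism Sequencing Collaboration at Baylor. SOLiD 4 System", "Omics"),
    Experiment(519, "Resting", "fMRI"),
]
print([e.experiment_id for e in search_experiments(rows, "47")])       # [475, 47]
print([e.experiment_id for e in search_experiments(rows, "resting")])  # [519]
```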
Experiment Id | Experiment Name | Experiment Type | Created On
24 | HI-NGS_R1 | Omics | 02/16/2011
475 | MB1-10 (CHOP) | Omics | 06/07/2016
490 | Illumina Infinium PsychArray BeadChip Assay | Omics | 07/07/2016
501 | PharmacoBOLD Resting State | fMRI | 07/27/2016
506 | PVPREF | Omics | 08/05/2016
509 | ABC-CT Resting v2 | EEG | 08/18/2016
13 | Comparison of FI expression in Autistic and Neurotypical Homo Sapiens | Omics | 12/28/2010
18 | AGRE/Broad Affymetrix 5.0 Genotype Experiment | Omics | 01/06/2011
22 | Stitching PCR Sequencing | Omics | 02/14/2011
26 | ASD_Methylation | Omics | 03/01/2011
29 | Microarray family 03 (father, mother, sibling) | Omics | 03/24/2011
37 | Standard paired-end sequencing of BCRs | Omics | 04/19/2011
38 | Illumina Mate-Pair BCR sequencing | Omics | 04/19/2011
39 | Custom Jumping Libraries | Omics | 04/19/2011
40 | Custom CapBP | Omics | 04/19/2011
41 | Immunofluorescence | Omics | 05/11/2011
43 | Autism brain sample genotyping, Illumina | Omics | 05/16/2011
47 | ARRA Autism Sequencing Collaboration at Baylor. SOLiD 4 System | Omics | 08/01/2011
53 | AGRE Omni1-quad | Omics | 10/11/2011
59 | AGP genotyping | Omics | 04/03/2012
60 | Ultradeep 454 sequencing of synaptic genes from postmortem cerebella of individuals with ASD and neurotypical controls | Omics | 06/23/2012
63 | Microemulsion PCR and Targeted Resequencing for Variant Detection in ASD | Omics | 07/20/2012
76 | Whole Genome Sequencing in Autism Families | Omics | 01/03/2013
519 | Resting | fMRI | 11/08/2016
90 | Genotyped IAN Samples | Omics | 07/09/2013
91 | NJLAGS Axiom Genotyping Array | Omics | 07/16/2013
93 | AGP genotyping (CNV) | Omics | 09/06/2013
106 | Longitudinal Sleep Study. H20 200. Channel set 2 | EEG | 11/07/2013
107 | Longitudinal Sleep Study. H20 200. Channel set 3 | EEG | 11/07/2013
108 | Longitudinal Sleep Study. AURA 200 | EEG | 11/07/2013
105 | Longitudinal Sleep Study. H20 200. Channel set 1 | EEG | 11/07/2013
109 | Longitudinal Sleep Study. AURA 400 | EEG | 11/07/2013
116 | Gene Expression Analysis WG-6 | Omics | 01/07/2014
131 | Jeste Lab UCLA ACEii: Charlie Brown and Sesame Street - Project 1 | Eye Tracking | 02/27/2014
132 | Jeste Lab UCLA ACEii: Animacy - Project 1 | Eye Tracking | 02/27/2014
133 | Jeste Lab UCLA ACEii: Mom Stranger - Project 2 | Eye Tracking | 02/27/2014
134 | Jeste Lab UCLA ACEii: Face Emotion - Project 3 | Eye Tracking | 02/27/2014
145 | AGRE/FMR1_Illumina.JHU | Omics | 04/14/2014
146 | AGRE/MECP2_Sanger.JHU | Omics | 04/14/2014
147 | AGRE/MECP2_Junior.JHU | Omics | 04/14/2014
151 | Candidate Gene Identification in familial Autism | Omics | 06/09/2014
152 | NJLAGS Whole Genome Sequencing | Omics | 07/01/2014
154 | Math Autism Study - Vinod Menon | fMRI | 07/15/2014
155 | Resting | fMRI | 07/25/2014
156 | Speech | fMRI | 07/25/2014
159 | Emotion | fMRI | 07/25/2014
160 | syllable contrast | EEG | 07/29/2014
167 | School-age naturalistic stimuli | Eye Tracking | 09/19/2014
44 | AGRE/Broad Affymetrix 5.0 Genotype Experiment | Omics | 06/27/2011
45 | Exome Sequencing of 20 Sporadic Cases of Autism Spectrum Disorder | Omics | 07/15/2011
Collection - Add Experiment
Add Supporting Documentation
Select File

To add an existing Data Structure, enter its title in the search bar. If you need to request changes, select the indicator "No, it requires changes to meet research needs" after selecting the Structure, and upload the file with the requested changes specific to the selected Data Structure. Your file should follow the Request Changes Procedure. If the Data Structure does not exist, select "Request New Data Structure" and upload the appropriate zip file.

Request Submission Exemption
Characters Remaining:
Not Eligible

The Data Expected list for this Collection shows some raw data as missing. Contact the NDA Help Desk with any questions.

Please confirm that you will not be enrolling any more subjects and that all raw data has been collected and submitted.

Collection Updated

Your Collection is now in Data Analysis phase and exempt from biannual submissions. Analyzed data is still expected prior to publication or no later than the project end date.

Unable to change collection phase where targeted enrollment is less than 90%

Delete Submission Exemption
Are you sure you want to delete this submission exemption?
You have requested to move the sharing dates for the following assessments:
Data Expected Item Original Sharing Date New Sharing Date

Please provide a reason for this change, which will be sent to the Program Officers listed within this collection:

Explanation must be between 20 and 200 characters in length.

Please press Save or Cancel
Add New Email Address - Dialog
New Email Address
Collection Summary Collection Charts
Collection Title: SSC total recall project
Collection Investigators: Eichler, Evan
Collection Description: This collection consists of sequencing and variation data resulting from the reanalysis of Whole Exome Sequences from 9047 individual subjects belonging to the Simons Simplex Collection (SSC). Original data were contributed by a collaboration between NDAR Collections 1878 (Eichler Lab, University of Washington), 1936 (Wigler Lab, Cold Spring Harbor Laboratories), and 1985 (State Lab, UCSF). Members of the Eichler Lab reanalyzed these data: sequences were realigned to a common reference genome (human_g1k_v37) and analyzed for possible genomic variants (SNVs, InDels, and CNVs). Details on the analysis and methods can be found in the following individual NDAR Studies: 1) realigned BAM files - NDAR Study 334 (http://ndar.nih.gov/study.html?id=334); 2) unfiltered SNV/InDel variant calls made using GATK with and without annotations - NDAR Study 348 (http://ndar.nih.gov/study.html?id=348); 3) unfiltered SNV/InDel variant calls made using FreeBayes with and without annotations - NDAR Study 349 (http://ndar.nih.gov/study.html?id=349); 4) CNV variant calls made using XHMM and CoNIFER - NDAR Study 361 (http://ndar.nih.gov/study.html?id=361).
NIMH Data Archive
08/22/2013
Funding Completed
Close Out
No
9,047
NIH - Contract None





NDA Help Center

Collection - General Tab

Fields available for edit on the top portion of the page include:

  • Collection Title
  • Investigators
  • Collection Description
  • Collection Phase
  • Funding Source
  • Clinical Trials

Collection Phase: The current status of a research project submitting data to an NDA Collection, based on the timing of the award and/or the data that have been submitted.

  • Pre-Enrollment: The default entry made when the NDA Collection is created.
  • Enrolling: Data have been submitted to the NDA Collection or the NDA Data Expected initial submission date has been reached for at least one data structure category in the NDA Collection.
  • Data Analysis: Subject level data collection for the research project is completed and has been submitted to the NDA Collection. The NDA Collection owner or the NDA Help Desk may set this phase when they’ve confirmed data submission is complete and submitted subject counts match at least 90% of the target enrollment numbers in the NDA Data Expected. Data submission reminders will be turned off for the NDA Collection.
  • Funding Completed: The NIH grant award (or awards) associated with the NDA Collection has reached its end date. NDA Collections in Funding Completed phase are assigned a subphase to indicate the status of data submission.
    • The Data Expected Subphase indicates that NDA expects more data will be submitted
    • The Closeout Subphase indicates the data submission is complete.
    • The Sharing Not Met Subphase indicates that data submission was not completed as expected.
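
The 90% condition mentioned for the Data Analysis phase can be sketched as a simple check. This is only an illustration of the arithmetic: the function name is hypothetical, it assumes the comparison is made per Data Expected item, and it does not model the confirmation step by the Collection owner or the NDA Help Desk.

```python
def meets_enrollment_threshold(subjects_submitted: int, targeted_enrollment: int) -> bool:
    """Return True if submitted subjects are at least 90% of the targeted enrollment."""
    if targeted_enrollment <= 0:
        # Assumption: a non-positive target cannot be evaluated against the rule.
        raise ValueError("targeted_enrollment must be positive")
    # Integer arithmetic avoids floating-point rounding at the 90% boundary.
    return subjects_submitted * 10 >= targeted_enrollment * 9


print(meets_enrollment_threshold(90, 100))  # True
print(meets_enrollment_threshold(89, 100))  # False
```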

Blinded Clinical Trial Status:

  • This status is set by a Collection Owner and indicates the research project is a double blinded clinical trial. When selected, the public view of Data Expected will show the Data Expected items and the Submission Dates, but the targeted enrollment and subjects submitted counts will not be displayed.
  • Targeted enrollment and subjects submitted counts are visible only to NDA Administrators and to the NDA Collection Owner.
  • When an NDA Collection that is flagged Blinded Clinical Trial reaches the maximum data sharing date for that Data Repository (see https://nda.nih.gov/nda/sharing-regimen.html), the embargo on Data Expected information is released.

Funding Source

The organization(s) responsible for providing the funding is listed here.

Supporting Documentation

Users with Submission privileges, as well as Collection Owners, Program Officers, and those with Administrator privileges, may upload and attach supporting documentation. By default, supporting documentation is shared with the general public; however, the option is also available to limit this information to qualified researchers only.

Grant Information

Identifying details are displayed about the Project from which the Collection was derived. You may click the Project Number to view a full report of the Project captured by the NIH.

Clinical Trials

Any data that is collected to support or further the research of clinical studies will be available here. Collection Owners and those with Administrator privileges may add new clinical trials.

Frequently Asked Questions

  • How does the NIMH Data Archive (NDA) determine which Permission Group data are submitted into?
    During Collection creation, NDA staff determine the appropriate Permission Group based on the type of data to be submitted, the type of access that will be available to data access users, and the information provided by the Program Officer during grant award.
  • How do I know when an NDA Collection has been created?
    When a Collection is created by NDA staff, an email notification will automatically be sent to the PI(s) of the grant(s) associated with the Collection to notify them.
  • Is a single grant number ever associated with more than one Collection?
    The NDA system does not allow for a single grant to be associated with more than one Collection; therefore, a single grant will not be listed in the Grant Information section of a Collection for more than one Collection.
  • Why is there sometimes more than one grant included in a Collection?
    In general, each Collection is associated with only one grant; however, multiple grants may be associated if the grant has multiple competing segments for the same grant number or if multiple different grants are all working on the same project and it makes sense to hold the data in one Collection (e.g., Cooperative Agreements).

Glossary

  • Administrator Privilege
    A privilege provided to a user associated with an NDA Collection or NDA Study whereby that user can perform a full range of actions including providing privileges to other users.
  • Collection Owner
    Generally, the Collection Owner is the contact PI listed on a grant. Only one NDA user is listed as the Collection owner. Most automated emails are primarily sent to the Collection Owner.
  • Collection Phase
    The Collection Phase provides information on data submission as opposed to grant/project completion so while the Collection phase and grant/project phase may be closely related they are often different. Collection users with Administrative Privileges are encouraged to edit the Collection Phase. The Program Officer as listed in eRA (for NIH funded grants) may also edit this field. Changes must be saved by clicking the Save button at the bottom of the page. This field is sortable alphabetically in ascending or descending order. Collection Phase options include:
    • Pre-Enrollment: A grant/project has started, but has not yet enrolled subjects.
    • Enrolling: A grant/project has begun enrolling subjects. Data submission is likely ongoing at this point.
    • Data Analysis: A grant/project has completed enrolling subjects and has completed all data submissions.
    • Funding Completed: A grant/project has reached the project end date.
  • Collection Title
    An editable field with the title of the Collection, which is often the title of the grant associated with the Collection.
  • Grant
    Provides the grant number(s) for the grant(s) associated with the Collection. The field is a hyperlink so clicking on the Grant number will direct the user to the grant information in the NIH Research Portfolio Online Reporting Tools (RePORT) page.
  • Supporting Documentation
    Various documents and materials that enable efficient use of the data by investigators unfamiliar with the project. These may include the research protocol, questionnaires, and study manuals.
  • NIH Research Initiative
    NDA Collections may be organized by scientific similarity into NIH Research Initiatives, to facilitate query tool user experience. NIH Research Initiatives map to one or multiple Funding Opportunity Announcements.
  • Permission Group
    Access to shared record-level data in NDA is provisioned at the level of a Permission Group. NDA Permission Groups consist of one or multiple NDA Collections that contain data with the same subject consents.
  • Planned Enrollment
    Number of human subject participants to be enrolled in an NIH-funded clinical research study. The data is provided in competing applications and annual progress reports.
  • Actual Enrollment
    Number of human subjects enrolled in an NIH-funded clinical research study. The data is provided in annual progress reports.
  • NDA Collection
    A virtual container and organization structure for data and associated documentation from one grant or one large project/consortium. It contains tools for tracking data submission and allows investigators to define a wide array of other elements that provide context for the data, including all general information regarding the data and source project, experimental parameters used to collect any event-based data contained in the Collection, methods, and other supporting documentation. They also allow investigators to link underlying data to an NDA Study, defining populations and subpopulations specific to research aims.
  • Data Use Limitations
    Data Use Limitations (DULs) describe the appropriate secondary use of a dataset and are based on the original informed consent of a research participant. NDA only accepts consent-based data use limitations defined by the NIH Office of Science Policy.
  • Total Subjects Shared
    The total number of unique subjects for whom data have been shared and are available for users with permission to access data.
ID | Name | Created Date | Status | Type
No records found.

Collection - Experiments

The number of Experiments included is displayed in parentheses next to the tab name. You may download all experiments associated with the Collection via the Download button. You may view individual experiments by clicking the Experiment Name and add them to the Filter Cart via the Add to Cart button.

Collection Owners, Program Officers, and users with Submission or Administrative Privileges for the Collection may create or edit an Experiment.

Please note: The creation of an NDA Experiment does not necessarily mean that data collected, according to the defined Experiment, has been submitted or shared.

Frequently Asked Questions

  • Can an Experiment be associated with more than one Collection?

    Yes. See the “Copy” button in the bottom left when viewing an experiment. There are two actions that can be performed via this button:

    1. Copy the experiment with intent for modifications.
    2. Associate the experiment to the collection. No modifications can be made to the experiment.

Glossary

  • Experiment Status
    An Experiment must be Approved before data using the associated Experiment_ID may be uploaded.
  • Experiment ID
    The ID number automatically generated by NDA which must be included in the appropriate file when uploading data to link the Experiment Definition to the subject record.
Genomics Sample | Genomics | 9047
Genomics Subject | Genomics | 9047

Collection - Shared Data

This tab provides a quick overview of the Data Structure title, Data Type, and Number of Subjects that are currently Shared for the Collection. The information presented in this tab is automatically generated by NDA and cannot be edited. If no information is visible on this tab, this would indicate the Collection does not have shared data or the data is private.

The shared data is available to other researchers who have permission to access data in the Collection's designated Permission Group(s). Use the Download button to get all shared data from the Collection to the Filter Cart.

Frequently Asked Questions

  • How will I know if another researcher uses data that I shared through the NIMH Data Archive (NDA)?
    To see which data submitted by your project are being used by a study, go to the Associated Studies tab of your Collection. Alternatively, you may review an NDA Study Attribution Report available on the General tab.
  • Can I get a supplement to share data from a completed research project?
    Often it becomes more difficult to organize and format data electronically after the project has been completed and the information needed to create a GUID may not be available; however, you may still contact a program staff member at the appropriate funding institution for more information.
  • Can I get a supplement to share data from a research project that is still ongoing?
    Unlike completed projects where researchers may not have the information needed to create a GUID and/or where the effort needed to organize and format data becomes prohibitive, ongoing projects have more of an opportunity to overcome these challenges. Please contact a program staff member at the appropriate funding institution for more information.

Glossary

  • Data Structure
    A defined organization and group of Data Elements to represent an electronic definition of a measure, assessment, questionnaire, or collection of data points. Data structures that have been defined in the NDA Data Dictionary are available at https://nda.nih.gov/general-query.html?q=query=data-structure
  • Data Type
    A grouping of data by similar characteristics such as Clinical Assessments, Omics, or Neurosignal data.
  • Shared
    The term 'Shared' generally means available to others; however, there are some slightly different meanings based on what is Shared. A Shared NDA Study is viewable and searchable publicly regardless of the user's role or whether the user has an NDA account. A Shared NDA Study does not necessarily mean that data used in the NDA Study have been shared as this is independently determined. Data are shared according to the schedule defined in a Collection's Data Expected Tab and/or in accordance with data sharing expectations in the NDA Data Sharing Terms and Conditions. Additionally, Supporting Documentation uploaded to a Collection may be shared independent of whether data are shared.

Collection Owners and those with Collection Administrator permission may edit a Collection. The following is currently available for edit on this page:

Publications

Publications relevant to NDA data are listed below. Most displayed publications have been associated with the grant within PubMed. Use the "+ New Publication" button to add new publications. Publications are categorized as relevant or not relevant to the Data Expected. Relevant publications are then linked to the underlying data by selecting the Create Study link. A Study provides the ability to define cohorts, assign subjects, and define outcome measures, and it lists the study type, data analysis, and results. Analyzed data and results are expected in this way.

PubMed ID | Study | Title | Journal | Authors | Date | Status
No records found.

Collection - Publications

The number of Publications is displayed in parentheses next to the tab name. Clicking on any of the Publication Titles will open the Publication in a new internet browsing tab.

Collection Owners, Program Officers, and users with Submission or Administrative Privileges for the Collection may mark a publication as either Relevant or Not Relevant in the Status column.

Frequently Asked Questions

  • How can I determine if a publication is relevant?
    Publications are considered relevant to a collection when the data shared is directly related to the project or collection.
  • Where does the NDA get the publications?
    From PubMed, an online library containing journals, articles, and medical research, sponsored by NIH and the National Library of Medicine (NLM).

Glossary

  • Create Study
    A link to the Create an NDA Study page that can be clicked to start creating an NDA Study with information such as the title, journal and authors automatically populated.
  • Not Determined Publication
    Indicates that the publication has not yet been reviewed and/or marked as Relevant or Not Relevant so it has not been determined whether an NDA Study is expected.
  • Not Relevant Publication
    A publication that is not based on data related to the aims of the grant/project associated with the Collection or not based on any data such as a review article and, therefore, an NDA Study is not expected to be created.
  • PubMed
    PubMed provides citation information for biomedical and life sciences publications and is managed by the U.S. National Institutes of Health's National Library of Medicine.
  • PubMed ID
    The PubMed ID is the unique ID number for the publication as recorded in the PubMed database.
  • Relevant Publication
    A publication that is based on data related to the aims of the grant/project associated with the Collection and, therefore, an NDA Study is expected to be created.
Data Expected List: Mandatory Data Structures

These data structures are mandatory for your NDA Collection. Please update the Targeted Enrollment number to accurately represent the number of subjects you expect to submit for the entire study.

For NIMH HIV-related research that involves human research participants: Select the dictionary or dictionaries most appropriate for your research. If your research does not require all three data dictionaries, just ignore the ones you do not need. There is no need to delete extra data dictionaries from your NDA Collection. You can adjust the Targeted Enrollment column in the Data Expected tab to “0” for those unnecessary data dictionaries. At least one of the three data dictionaries must have a non-zero value.

Data Expected | Targeted Enrollment | Initial Submission | Subjects Shared | Status
No Mandatory Data Expected
To create your project's Data Expected list, use the "+New Data Expected" to add or request existing structures and to request new Data Structures that are not in the NDA Data Dictionary.

If the Structure you need already exists, locate it and specify your dates and enrollment when adding it to your Data Expected list. If you require changes to the Structure you need, select the indicator stating "No, it requires changes to meet research needs," and upload a file containing your requested changes.

If the structure you need is not yet defined in the Data Dictionary, you can select "Upload Definition" and attach the necessary materials to request its creation.

When selecting the expected dates for your data, make sure to follow the standard Data Sharing Regimen and choose dates within the date ranges that correspond to your project start and end dates.

Please visit the Completing Your Data Expected Tutorial for more information.
Data Expected List: Data Structures per Research Aims

These data structures are specific to your research aims and should list all data structures in which data will be collected and submitted for this NDA Collection. Please update the Targeted Enrollment number to accurately represent the number of subjects you expect to submit for the entire study.

Data Expected | Targeted Enrollment | Initial Submission | Subjects Shared | Status
Genomics/omics | 10,060 | 05/13/2015 | 9,047 | Approved
Structure not yet defined
No Status history for this Data Expected has been recorded yet

Collection - Data Expected

The Data Expected tab displays the list of all data that NDA expects to receive in association with the Collection as defined by the contributing researcher, as well as the dates for the expected initial upload of the data and when it is first expected to be shared with the research community. Above the primary table of Data Expected, any publications determined to be relevant to the data within the Collection are also displayed. Members of the contributing research group can use these to define NDA Studies, connecting those papers to underlying data in NDA.

The tab is used both as a reference for those accessing shared data, providing information on what is expected and when it will be shared, and as the primary tracking mechanism for contributing projects. It is used by contributing primary researchers, secondary researchers, and NIH Program and Grants Management staff.

Researchers who are starting their project need to update their Data Expected list to include all the Data Structures they are collecting under their grant and set their initial submission and sharing schedule according to the NDA Data Sharing Regimen.

To add existing Data Structures from the Data Dictionary, to request new Data Structures that are not in the Dictionary, or to request changes to existing Data Structures, click "+New Data Expected".

For step-by-step instructions on how to add existing Data Structures, request changes to an existing Structure, or request a new Data Structure, please visit the Completing Your Data Expected Tutorial.

If you are a contributing researcher creating this list for the first time, or making changes to the list as your project progresses, please note the following:

  • Although items you add to the list and changes you make are displayed, they are not committed to the system until you Save the entire page using the "Save" button at the bottom of your screen. Please Save after every change to ensure none of your work is lost.
  • If you attempt to add a new structure, the title you provide must be unique - if another structure exists with the same name your change will fail.
  • Adding a new structure to this list is the only way to request the creation of a new Data Dictionary definition.

Frequently Asked Questions

  • What is an NDA Data Structure?
    An NDA Data Structure comprises multiple Data Elements that make up an electronic definition of an assessment, measure, questionnaire, etc. Each such instrument will have a corresponding Data Structure.
  • What is the NDA Data Dictionary?
    The NDA Data Dictionary is comprised of electronic definitions known as Data Structures.

Glossary

  • Analyzed Data
    Data specific to the primary aims of the research being conducted (e.g. outcome measures, other dependent variables, observations, laboratory results, analyzed images, volumetric data, etc.) including processed images.
  • Data Item
    Items listed on the Data Expected list in the Collection which may be an individual and discrete Data Structure, Data Structure Category, or Data Structure Group.
  • Data Structure
    A defined organization and group of Data Elements to represent an electronic definition of a measure, assessment, questionnaire, or collection of data points. Data structures that have been defined in the NDA Data Dictionary are available at https://nda.nih.gov/general-query.html?q=query=data-structure
  • Data Structure Category
    An NDA term describing the affiliation of a Data Structure to a Category, which may be disease/disorder or diagnosis related (Depression, ADHD, Psychosis), specific to data type (MRI, eye tracking, omics), or type of data (physical exam, IQ).
  • Data Structure Group
    A Data Item listed on the Data Expected tab of a Collection that indicates a group of Data Structures (e.g., ADOS or SCID) for which data may be submitted instead of a specific Data Structure identified by version, module, edition, etc. For example, the ADOS Data Structure Group includes every ADOS Data Structure such as ADOS Module 1, ADOS Module 2, ADOS Module 1 - 2nd Edition, etc. The SCID Data Structure Group includes every SCID Data Structure such as SCID Mania, SCID V Mania, SCID PTSD, SCID-V Diagnosis, and more.
  • Evaluated Data
    A new Data Structure category, Evaluated Data is analyzed data resulting from the use of computational pipelines in the Cloud and can be uploaded directly back to a miNDAR database. Evaluated Data is expected to be listed as a Data Item in the Collection's Data Expected Tab.
  • Imaging Data
    Imaging+ is an NDA term which encompasses all imaging related data including, but not limited to, images (DTI, MRI, PET, Structural, Spectroscopy, etc.) as well as neurosignal data (EEG, fMRI, MEG, EGG, eye tracking, etc.) and Evaluated Data.
  • Initial Share Date
    Initial Submission and Initial Share dates should be populated according to the NDA Data Sharing Terms and Conditions. Any modifications to these will go through the approval processes outlined above. Data will be shared with authorized users upon publication (via an NDA Study) or 1-2 years after the grant end date specified on the first Notice of Award, as defined in the applicable Data Sharing Terms and Conditions.
  • Initial Submission Date
    Initial Submission and Initial Share dates should be populated according to the NDA Data Sharing Terms and Conditions. Any modifications to these will go through the approval processes outlined above. Data for all subjects is not expected on the Initial Submission Date and modifications may be made as necessary based on the project's conduct.
  • Research Subject and Pedigree
    An NDA created Data Structure used to convey basic information about the subject such as demographics, pedigree (links family GUIDs), diagnosis/phenotype, and sample location that are critical to allow for easier querying of shared data.
  • Submission Cycle
    The NDA has two Submission Cycles per year - January 15 and July 15.
  • Submission Exemption
    An interface to notify NDA that data may not be submitted during the upcoming/current submission cycle.
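
To illustrate the Submission Cycle entry above, the following minimal sketch computes the next submission date for a given day. It assumes only the two dates stated in the glossary (January 15 and July 15). The function name `next_submission_deadline` is hypothetical and is not part of any NDA tool, and treating a deadline that falls on the given day as still upcoming is an assumption made here for illustration.

```python
from datetime import date

# The two Submission Cycle dates stated in the glossary, as (month, day).
SUBMISSION_CYCLES = [(1, 15), (7, 15)]


def next_submission_deadline(today: date) -> date:
    """Return the first Submission Cycle date on or after `today`.

    Hypothetical helper for illustration only; not an NDA API.
    """
    for month, day in SUBMISSION_CYCLES:
        candidate = date(today.year, month, day)
        if candidate >= today:
            return candidate
    # After July 15, the next cycle is January 15 of the following year.
    month, day = SUBMISSION_CYCLES[0]
    return date(today.year + 1, month, day)
```

For example, calling `next_submission_deadline(date(2024, 12, 1))` returns January 15 of the following year, because both 2024 cycle dates have already passed.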

Collection Owners and those with Collection Administrator permission may edit a Collection. The following is currently available to edit on this page:

Associated Studies

Studies that have been defined using data from a Collection are an important criterion for determining the value of the data shared. The Collection/Study Subjects column displays the number of subjects from this Collection that are included in a Study, out of the total number of subjects in that Study. The Data Usage column indicates whether the Study is a primary analysis or a secondary analysis of the data. State indicates whether the Study is private or shared with the research community.

Study Name | Abstract | Collection/Study Subjects | Data Usage | State
Hotspots of missense mutation identify neurodevelopmental disorder genes and functional domains | Although de novo missense mutations have been predicted to account for more cases of autism than gene-truncating mutations, most research has focused on the latter. We identified the properties of de novo missense mutations in patients with neurodevelopmental disorders (NDDs) and highlight 35 genes with excess missense mutations. Additionally, 40 amino acid sites were recurrently mutated in 36 genes, and targeted sequencing of 20 sites in 17,600 NDD patients identified 21 new patients with identical missense mutations. One recurrent site (p.Ala636Thr) occurs in a glutamate receptor subunit, GRIA1. This same amino acid substitution in the homologous but distinct mouse glutamate receptor subunit Grid2 is associated with Lurcher ataxia. Phenotypic follow-up in five individuals with GRIA1 mutations shows evidence of specific learning disabilities and autism. Overall, we find significant clustering of de novo mutations in 200 genes, highlighting specific functional domains and synaptic candidate genes important in NDD pathology. | 13/18812 | Primary Analysis | Shared
Insights into Autism Spectrum Disorder Genomic Architecture and Biology from 71 Risk Loci | Analysis of de novo CNVs (dnCNVs) from the full Simons Simplex Collection (SSC) (N = 2,591 families) replicates prior findings of strong association with autism spectrum disorders (ASDs) and confirms six risk loci (1q21.1, 3q29, 7q11.23, 16p11.2, 15q11.2-13, and 22q11.2). The addition of published CNV data from the Autism Genome Project (AGP) and exome sequencing data from the SSC and the Autism Sequencing Consortium (ASC) shows that genes within small de novo deletions, but not within large dnCNVs, significantly overlap the high-effect risk genes identified by sequencing. Alternatively, large dnCNVs are found likely to contain multiple modest-effect risk genes. Overall, we find strong evidence that de novo mutations are associated with ASD apart from the risk for intellectual disability. Extending the transmission and de novo association test (TADA) to include small de novo deletions reveals 71 ASD risk loci, including 6 CNV regions (noted above) and 65 risk genes (FDR ≤ 0.1). | 8190/9975 | Secondary Analysis | Shared
Complete Realignment of Whole Exome Sequencing data from 2415 families in SSC Collection | Whole Exome Sequencing has been completed for ~ 2500 families from the Simons Simplex Collection. Sequencing was performed at three individual sequencing centers with original data submitted to NDAR Collections 1878, 1895, and 1936; subsets of these data have been analyzed by various methods and published. This study represents an effort to realign sequencing data from all three collections in a uniform manner using the latest toolchains and algorithms available, which can be used as a resource for the entire ASD Community. Original sequence data has been realigned to a single reference genome (1000 Genomes / GRCh37) using BWA, Picardtools, Samtools, and some custom python scripts. QC summary data were generated as part of the realignment process using the aforementioned tools in addition to QPLOT and some custom scripts. Complete methods, including source code for pipeline and custom scripts can be found at: https://github.com/nkrumm/asd-jre-public. The data package for this study represents the genomics_subject02, genomics_sample03, and omics_qa01 data structures which include realigned BAM files and QC files (i.e., QPLOT output and BAM header files). Variant calling and annotation for these data are provided in NDAR Studies 348 (https://ndar.nih.gov/study.html?id=348) and 349 (https://ndar.nih.gov/study.html?id=349). | 9047/9047 | Secondary Analysis | Shared
The contribution of mosaic variants to autism spectrum disorder | De novo mutation is highly implicated in autism spectrum disorder (ASD). However, the contribution of post-zygotic mutation to ASD is poorly characterized. We performed both exome sequencing of paired samples and analysis of de novo variants from whole-exome sequencing of 2,388 families. While we find little evidence for tissue-specific mosaic mutation, multi-tissue post-zygotic mutation (i.e. mosaicism) is frequent, with detectable mosaic variation comprising 5.4% of all de novo mutations. We identify three mosaic missense and likely-gene disrupting mutations in genes previously implicated in ASD (KMT2C, NCKAP1, and MYH10) in probands but none in siblings. We find a strong ascertainment bias for mosaic mutations in probands relative to their unaffected siblings (p = 0.003). We build a model of de novo variation incorporating mosaic variants and errors in classification of mosaic status and from this model we estimate that 33% of mosaic mutations in probands contribute to 5.1% of simplex ASD diagnoses (95% credible interval 1.3% to 8.9%). Our results indicate a contributory role for multi-tissue mosaic mutation in some individuals with an ASD diagnosis. | 9047/9047 | Secondary Analysis | Shared
Copy Number Variants from SSC Collection ~ 2500 families by two Methods (XHMM and Conifer) | XHMM was run on a set of realigned BAM files from the SSC collection (see NDAR Study 334 for BAM files) using the attached scripts. These scripts calculate depth of coverage using GATK, pull the GATK output from an instance on NDAR's cloud, merge the output of GATK into a single matrix, process the read depth matrix (filter, center), normalize the matrix using principal component analysis (PCA), process the normalized read depth matrix (filter, z-score), run a hidden Markov model (HMM) on this matrix to identify CNVs in the normalized data, and generate family level vcfs from the xhmm data. XHMM produces as output coverage summary tables produced by GATK (sample_interval_statistics, sample_interval_summary, sample_summary, sample_statistics), principal component data files, a genotyped CNV output VCF file, and some example plots and graphics. For this study, the GATK output is available. Additional information about XHMM is available here: http://atgu.mgh.harvard.edu/xhmm/tutorial.shtml | 9041/9041 | Secondary Analysis | Shared
Variant Recalling (FreeBayes) from Whole Exome Sequencing data for 2415 families in SSC Collection | Whole Exome Sequencing has been completed for ~ 2500 families from the Simons Simplex Collection. Sequencing was performed at three individual sequencing centers with original data submitted to NDAR Collections 1878, 1895, and 1936; subsets of these data have been analyzed by various methods and published. This study represents an effort to call and annotate SNPs and Indels on data from all three collections in a uniform manner using the latest toolchains and algorithms available. Variant calls from this study were generated using FreeBayes, Famseq, and some custom scripts; annotation was provided by SnpEff, dbNSFP, and vcftools. Note that variants were called in batches with ~ 20 families per batch. Complete methods, including source code for pipeline and custom scripts can be found at: https://github.com/nkrumm/asd-jre-public The data package for this study includes the genomics_sample02, genomics_sample03 structures with annotated and un-annotated VCF files for each family. Another NDAR Study (348) is available with VCF files generated using GATK (https://ndar.nih.gov/study.html?id=348), and the complete set of BAM files used for variant calling are available in NDAR Study 334 (https://ndar.nih.gov/study.html?id=334) | 8976/8976 | Secondary Analysis | Shared
Variant Recalling (GATK) from Whole Exome Sequencing data for 2415 families in SSC Collection | Whole Exome Sequencing has been completed for ~ 2500 families from the Simons Simplex Collection. Sequencing was performed at three individual sequencing centers with original data submitted to NDAR Collections 1878, 1895, and 1936; subsets of these data have been analyzed by various methods and published. This study represents an effort to call and annotate SNPs and Indels on data from all three collections in a uniform manner using the latest toolchains and algorithms available. Variant calls from this study were generated using GATK, Famseq, and some custom scripts; annotation was provided by SnpEff, dbNSFP, and vcftools. Note that variants were called in batches with ~ 20 families per batch. Complete methods, including source code for pipeline and custom scripts can be found at: https://github.com/nkrumm/asd-jre-public The data package for this study represents the genomics_subject02, genomics_sample03 structures which include annotated and un-annotated VCF files for each family. Another NDAR Study (349) is available with VCF files generated using FreeBayes (https://ndar.nih.gov/study.html?id=349), and the complete set of BAM files used for variant calling are available in NDAR Study 334 (https://ndar.nih.gov/study.html?id=334) | 8976/8976 | Secondary Analysis | Shared
Excess of rare inherited truncating mutations in autism | In order to quantify the effect of private, inherited mutations on autism risk, we generated a callset of both inherited and de novo single nucleotide variants (SNVs) and copy number variants (CNVs) across 2,377 Simons Simplex Collection families. The publicly deposited dataset includes 1,786 parents-child-unaffected sibling "quads" allowing us to compare burden of inherited and de novo mutations between affected and unaffected siblings in simplex autism families. We find that private, inherited truncating SNV mutations in conserved genes are significantly enriched in probands (odds ratio = 1.14, p = 0.0002) and more likely to be transmitted to children with autism when compared to their unaffected siblings (p < 0.0001). We find that this effect becomes more pronounced with increasing gene conservation (Residual Variation Intolerance Score, RVIS). Likewise, we observe a similar bias for inherited CNVs specifically for small (<100 kbp), maternally inherited events (p = 9.6x10^-3) that are enriched in CHD8 target genes (OR = 3.6, p = 2.0x10^-3). We quantified autism spectrum disorder (ASD) risk for de novo and inherited CNVs and SNVs by using a conditional logistic regression model. Independent from de novo mutations, private truncating SNVs and rare, inherited CNVs contribute an increase in risk with an odds ratio 1.11 (p = 0.0002) and 1.23 (p = 0.01), respectively. Our results indicate a statistically independent role for inherited mutations in ASD risk and identify additional high-impact risk candidate genes (e.g., RIMS1, CUL7, LZTR1 and CC2D2A) where transmitted mutations may create a sensitized background for autism but are unlikely to be necessary and sufficient for the disorder. | 8911/8911 | Secondary Analysis | Shared
Evolutionary and Genetic Analysis of Synonymous Nucleotide Substitutions in Subjects with Autism Spectrum Disorders | The director of the project, Dr. Igor Rogozin, analyzed a modest collection of synonymous nucleotide substitutions from two small databases of mutations observed in autistic subjects [1]. Dr. Rogozin and his colleagues found that there was a statistically significant tendency for these synonymous nucleotide substitutions to replace a reference codon supportive of faster protein translation with a non-reference codon that is known to be associated with slower translation [1]. In the proposed study, we wish to test the codon replacement properties of synonymous substitutions reported in the much larger NDAR database, including whether the property of propensity to slower translation holds in a much larger data set of mutations. We also wish to compare the characteristics of the synonymous and nonsynonymous substitutions, using established techniques in genetics. [1] Poliakov E, Koonin EV, Rogozin IB. Impairment of translation in neurons as a putative causative factor for autism. Biology Direct. 2014; 9:16. | 7200/7200 | Secondary Analysis | Shared
The evolution and population diversity of human-specific segmental duplications | Segmental duplications contribute to human evolution, adaptation and genomic instability but are often poorly characterized. We investigate the evolution, genetic variation and coding potential of human-specific segmental duplications (HSDs). We identify 218 HSDs based on analysis of 322 deeply sequenced archaic and contemporary hominid genomes. We sequence 550 human and nonhuman primate genomic clones to reconstruct the evolution of the largest, most complex regions with protein-coding potential (n=80 genes/33 gene families). We show that HSDs are non-randomly organized, associate preferentially with ancestral ape duplications termed “core duplicons”, and evolved primarily in an interspersed inverted orientation. In addition to Homo sapiens-specific gene expansions (e.g., TCAF1/2), we highlight ten gene families (e.g., ARHGAP11B and SRGAP2C) where copy number never returns to the ancestral state, there is evidence of mRNA splicing, and no common gene-disruptive mutations are observed in the general population. Such duplicates are candidates for the evolution of human-specific adaptive traits. | 1536/6360 | Primary Analysis | Shared
Mitochondrial DNA mutations in Autism Spectrum Disorder | Mitochondrial dysfunction is frequently observed in Autism Spectrum Disorders (ASD). Thus, variations in the mitochondrial DNA (mtDNA) sequences may contribute to increased ASD risks. In the current study, we evaluated mtDNA variations, including homoplasmy and heteroplasmy, in 903 ASD individuals along with their mothers and non-ASD siblings by using off-target reads from whole-exome sequencing data sets of Simons Foundation Autism Research Initiative (SFARI) Simons Collection available on NDAR. We found that heteroplasmic mutations in ASD individuals were enriched at non-polymorphic mtDNA sites (P = 0.0015) compared to their non-ASD siblings, which were more likely to confer deleterious effects than heteroplasmies at polymorphic mtDNA sites. Accordingly, we observed a ~1.5-fold enrichment of nonsynonymous mutations as well as a ~2.2-fold enrichment of predicted pathogenic mutations (P < 0.003) in ASD individuals compared to their non-ASD siblings. Our genetic findings substantiate pathogenic mtDNA mutations as a potential cause for ASD and synergize with recent work calling attention to their unique metabolic phenotypes for diagnosis and treatment of ASD. | 2479/2709 | Secondary Analysis | Shared
Identification of differentially methylated regions (DMRs) and cytosine sites (DMCs) in DNA methylation data of autism cases and unaffected siblings | We compared blood-based DNA methylation profiles between children with autism spectrum disorder (ASD) and carefully matched, unrelated neurotypical control children. Using a sequencing-based method, we identified ASD-specific differentially methylated regions (DMRs) and cytosine sites (DMCs). We carried out comparative analyses with datasets from the NDA Collection 1650 (SFARI - DNA Methylation Analysis Cohort) that measured blood DNA methylation in ASD using microarray technology. We also identified DMRs and DMCs using metilene and minfi pipelines in the DNAm datasets from the NDA Collection 1650. | 601/728 | Secondary Analysis | Shared
Phenotypic subtyping and re-analysis of existing methylation data from autistic probands in simplex families reveal ASD subtype-associated differentially methylated genes and biological functions | Autism spectrum disorder (ASD) describes a group of neurodevelopmental disorders with core deficits in social communication and manifestation of restricted, repetitive, and stereotyped behaviors. Despite the core symptomatology, ASD is extremely heterogeneous with respect to the severity of symptoms and behaviors. This heterogeneity presents an inherent challenge to all large-scale genome-wide 'omics analyses. In the present study, we address this heterogeneity by stratifying ASD probands from simplex families according to severity of behavioral scores on the Autism Diagnostic Interview-Revised diagnostic instrument, followed by re-analysis of existing DNA methylation data from individuals in three ASD subphenotypes in comparison to that of their respective unaffected siblings. We demonstrate that subphenotyping of cases enables the identification of over 1.6 times the number of statistically significant differentially methylated genes (DMGs) between cases and controls, compared to that identified when all cases are combined. Our analyses also reveal ASD-related neurological functions and comorbidities that are enriched among DMGs in each phenotypic subgroup but not in the combined case group. These findings may aid in the development of subtype-directed diagnostics and therapeutics. | 129/584 | Secondary Analysis | Shared
Embryonic lethal genetic variants and chromosomally normal pregnancy loss | Objective: To examine whether rare potentially damaging genetic variants are associated with chromosomally normal pregnancy loss and estimate the magnitude of the association. Design: Case-control. Setting: Cases comprise 19 chromosomally normal loss conceptus-parent trios. They derive from a consecutive series of karyotyped losses at one hospital. Controls comprise 547 unaffected siblings of autism cases-parent trios from the National Database for Autism Research. Main outcome measures: The rate of predicted damaging variants in the exome (loss of function and missense–damaging) and the proportions of probands with at least one such variant among cases versus controls. Results: The proportions of probands with at least one rare predicted damaging variant were 36.8% among cases and 22.9% among controls (odds ratio (OR)=2.0, 99% CI 0.5-7.3). No case has a variant in a fetal anomaly gene. The proportion with variants in possibly embryonic lethal genes was increased in case probands (OR=14.5, 99% CI 1.5-89.7); variants occurred in BAZ1A, FBN2 and TIMP2. Conclusion: Rare genetic variants in the conceptus may be a cause of chromosomally normal loss. A larger sample is needed to estimate the magnitude of the association with precision and to identify relevant biological pathways. | 547/547 | Secondary Analysis | Shared
* Data not on individual level
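
The Collection/Study Subjects values listed above (for example, 13/18812) give the number of subjects from this Collection out of the total number of subjects in the Study. As a rough sketch of how such a value might be read programmatically, the snippet below splits it into its two counts and computes the share. The function names `parse_subject_counts` and `collection_share` are hypothetical, and the sketch assumes the value is always two whole numbers separated by a slash.

```python
from typing import Tuple


def parse_subject_counts(cell: str) -> Tuple[int, int]:
    """Split a value such as '13/18812' into (from_collection, study_total)."""
    collection_part, study_part = cell.strip().split("/")
    return int(collection_part), int(study_part)


def collection_share(cell: str) -> float:
    """Fraction of a Study's subjects that come from this Collection."""
    from_collection, study_total = parse_subject_counts(cell)
    if study_total <= 0:
        raise ValueError("study total must be positive")
    return from_collection / study_total
```

A value of 9047/9047 yields a share of 1.0, meaning every subject in that Study comes from this Collection, while 13/18812 indicates that the Study draws mostly on other sources.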

NDA Help Center

Collection - Associated Studies

Clicking on the Study Title will open the study details in a new internet browser tab. The Abstract is available for viewing, providing the background explanation of the study, as provided by the Collection Owner.

Primary v. Secondary Analysis: The Data Usage column will have one of these two choices. An associated study listed as Primary Analysis indicates that at least some, and potentially all, of the data used was originally collected by the creator of the NDA Study. Secondary Analysis indicates that the Study owner was not involved in collecting the data, which may be used as supporting data.

Private v. Shared State: A private study is available only to users who are able to access the Collection. A shared study is accessible to the general public.

Frequently Asked Questions

  • How do I associate a study to my collection?
    Studies are associated to the Collection automatically when the data is defined in the Study.

Glossary

  • Associated Studies Tab
    A tab in a Collection that lists the NDA Studies that have been created using data from that Collection including both Primary and Secondary Analysis NDA Studies.