GUID (Global Unique Identifier)
The GUID is a subject ID that allows researchers to share data specific to a study participant without exposing personally identifiable information (PII), and to match participants across labs and research data repositories.
Originally implemented to support the autism research community, the GUID is now available to other research communities that need a common subject identifier across research laboratories and repositories. Contact us at firstname.lastname@example.org with any questions related to this capability, or to request to use the GUID in your research. New users of the NIMH Data Archive may request an account and select the GUID privilege at the bottom; existing users may log in and request GUID access within their profile.
Researchers with existing accounts can use the links below to begin working with the GUID Tool:
Users resetting their passwords should complete the process and set a new permanent password before attempting to log in to the Tool.
How does it work?
The GUID Tool is a piece of software that accepts the personal information of subjects and uses it to create a series of one-way hash codes on your computer. Only these codes are sent to our system, where they are checked against the GUID database. If the codes have been seen before, the information matches an existing GUID, and that GUID is sent back. If no match is found, a new GUID is created and sent back. If someone else enters the same information later, the tool will detect the match and send back the same GUID. The GUID itself is a series of alphanumeric characters. This system has the following advantages:
- No PII ever leaves your computer.
- There is nothing about a GUID that would allow someone to infer the identity of the individual to whom it belongs.
- The same individual's information will result in the same GUID across time, location, and research study. This allows researchers to match shared data from that participant regardless of source, without ever sharing or viewing PII.
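The flow described above can be illustrated with a minimal Python sketch. It is not the GUID Tool's actual implementation: the choice of SHA-256, the single combined hash (the tool uses a series of hash codes), the field names, the `GuidRegistry` class, and the `GUID_` identifier format are all assumptions made for illustration.

```python
import hashlib
import secrets


def normalize(value: str) -> str:
    """Lowercase the value and keep only letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def hash_code(fields: dict) -> str:
    """Compute a one-way hash locally; only this digest would be transmitted."""
    joined = "|".join(normalize(str(fields[key])) for key in sorted(fields))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


class GuidRegistry:
    """Hypothetical stand-in for the server-side GUID database."""

    def __init__(self):
        self._guid_by_hash = {}

    def lookup_or_create(self, code: str) -> str:
        # Seen before: return the existing GUID. Otherwise issue a new one.
        if code not in self._guid_by_hash:
            self._guid_by_hash[code] = "GUID_" + secrets.token_hex(8).upper()
        return self._guid_by_hash[code]


if __name__ == "__main__":
    registry = GuidRegistry()
    subject = {
        "sex": "F",
        "first_name": "Jane",
        "middle_name": "Ann",
        "last_name": "Doe",
        "date_of_birth": "2001-02-03",
        "city_of_birth": "Springfield",
    }
    # Two labs entering the same information receive the same identifier.
    print(registry.lookup_or_create(hash_code(subject)))
    print(registry.lookup_or_create(hash_code(subject)))
```

Because the registry only ever sees the digest, the lookup can recognize a returning participant without the PII itself being transmitted.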
What will I need to create a GUID?
All of the following information is required to create a GUID: sex, first name, last name, middle name, date of birth, and city/municipality of birth. In all cases, this information should be obtained and entered as it appears on the birth certificate. Using the birth certificate ensures that the information cannot change over an individual's lifespan. Below is a table of commonly encountered questions about the specifics of GUID creation.
|Question|Answer|
|---|---|
|What if I only have the middle initial or nickname?|Only full legal names as they appear on the birth certificate should be used. Initials, nicknames, or unknowns/blanks should not be used.|
|What if I don't know the middle name or they left "middle name" blank?|If you know that the participant has no legal middle name on their birth certificate, the tool allows you to specify that the person has no middle name. If you do not have the middle name, or if you do not know whether they have one, you cannot create a GUID.|
|What should I do with suffixes like Jr., III, etc.?|Suffixes such as these should be omitted from the names when creating a GUID.|
|Should I include state/country when entering city/municipality of birth?|No. Only the city/municipality name as it appears on the birth certificate should be entered.|
|What is the "Use Existing GUID" column in the batch template?|The GUID Batch Template has a value not present in the interface, titled "Use Existing GUID." When generating GUIDs for twins, triplets, etc., the batch template and this column must be used. This column should be entered as NO in these cases. This tells the tool that, although the PII given may be close enough to match, the subjects are different individuals and any existing GUID they match should be ignored.|
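A site preparing PII for GUID creation might pre-check records against the requirements above before entering them. The sketch below is a hypothetical helper, not part of the GUID Tool: the `SubjectPII` class, its field names, the suffix list, and the heuristics for spotting initials and state/country entries are all assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Assumed list of suffixes to drop; extend as needed.
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}


@dataclass
class SubjectPII:
    sex: str
    first_name: str
    last_name: str
    middle_name: Optional[str]   # None when no middle name was collected
    has_no_middle_name: bool     # True only if the birth certificate shows none
    date_of_birth: date
    city_of_birth: str


def strip_suffix(name: str) -> str:
    """Drop trailing suffixes such as 'Jr.' or 'III' from a name."""
    parts = name.split()
    while len(parts) > 1 and parts[-1].rstrip(".").lower() in SUFFIXES:
        parts.pop()
    return " ".join(parts).rstrip(",")


def validation_errors(s: SubjectPII) -> List[str]:
    """Return the reasons, if any, why a GUID should not be requested yet."""
    errors = []
    required = {
        "sex": s.sex,
        "first name": s.first_name,
        "last name": s.last_name,
        "city/municipality of birth": s.city_of_birth,
    }
    for label, value in required.items():
        if not value.strip():
            errors.append(f"{label} is required")
    middle = (s.middle_name or "").strip()
    if s.has_no_middle_name:
        if middle:
            errors.append("middle name entered but subject is flagged as having none")
    elif not middle:
        errors.append("middle name is required unless the birth certificate shows none")
    elif len(middle.rstrip(".")) == 1:
        # Heuristic only: a single letter is usually an initial.
        errors.append("middle name looks like an initial; use the full legal name")
    if "," in s.city_of_birth:
        # Heuristic only: a comma usually means a state or country was appended.
        errors.append("enter only the city/municipality, not state or country")
    return errors
```

A record with an unknown middle name fails this check, which is the case the pseudoGUID function is intended for.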
How do I use it?
- Get access - Request an account by emailing email@example.com and stating your purpose for needing an account. This may be to create GUIDs for a project submitting data to an NIMH Data Archive (NDA) repository, or to use GUIDs as subject identifiers in your own research community.
- Launch the GUID software using the link at the top of this page. Note that a recent version of Java is needed to run the tool.
- To create a single GUID, use the data entry interface of the tool to enter all required PII, and click "Generate GUID." To create multiple GUIDs at once, get the GUID Batch Template, replace the sample data with the required PII, and use the "Get Multiple GUIDs" option from the "Function" menu of the tool.
- When you create a GUID or upload the batch template, the software creates one-way hash codes that are sent to NDAR. No PII ever leaves your computer. Based upon these hash codes, the software will create a new GUID (if the hash codes were never seen before) or return an existing GUID (if the hash codes were seen before).
Additional GUID Features and Functions
- Get pseudoGUID - The GUID is generated based upon a subject's personally identifiable information (PII). However, for some projects the consent given is not sufficient or the PII collected did not include all of the fields needed to generate a standard GUID. To account for these occurrences, the Get pseudoGUID function is provided. A pseudoGUID is a random identifier that can be promoted to a GUID once the appropriate consent is provided and the necessary PII fields are completed. If you need identifiers but are unable to create full GUIDs (e.g. missing middle name), this option should be used.
- Get GUIDs for Multiple Subjects - The GUID interface requires double entry, which makes it best suited to entering a few subjects at a time. For research sites that have already collected participant PII, the software can generate GUIDs for multiple subjects at once using the GUID Batch Template linked in the tool's interface, as described above.
- Promote pseudoGUID(s) - This function lets you specify a pseudoGUID along with the PII. Doing so links the pseudoGUID to a standard GUID, so that the two identifiers are recognized as the same individual. It is the method used to change a pseudoGUID into a full GUID once consent or the necessary PII has been obtained after the initial creation. Multiple pseudoGUIDs can be linked using the Promote pseudoGUID template.
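The relationship between pseudoGUIDs and GUIDs can be pictured with a small sketch. This is an illustration only, under assumed names: the `IdentifierStore` class, its methods, and the `PSEUDO_` and `GUID_` prefixes are hypothetical and do not reflect the tool's internal design or identifier format.

```python
import secrets


class IdentifierStore:
    """Hypothetical in-memory model of pseudoGUID issuance and promotion."""

    def __init__(self):
        self._guid_by_hash = {}   # PII hash code -> GUID
        self._pseudo = set()      # pseudoGUIDs that have been issued
        self._promoted = {}       # pseudoGUID -> GUID it was promoted to

    def get_pseudo_guid(self) -> str:
        # A pseudoGUID is random: it is not derived from any PII.
        pseudo = "PSEUDO_" + secrets.token_hex(6).upper()
        self._pseudo.add(pseudo)
        return pseudo

    def get_guid(self, pii_hash: str) -> str:
        # Same hash code, same GUID; an unseen hash code gets a new GUID.
        if pii_hash not in self._guid_by_hash:
            self._guid_by_hash[pii_hash] = "GUID_" + secrets.token_hex(6).upper()
        return self._guid_by_hash[pii_hash]

    def promote(self, pseudo: str, pii_hash: str) -> str:
        # Link an issued pseudoGUID to the GUID for the now-complete PII.
        if pseudo not in self._pseudo:
            raise KeyError(f"unknown pseudoGUID: {pseudo}")
        guid = self.get_guid(pii_hash)
        self._promoted[pseudo] = guid
        return guid

    def resolve(self, identifier: str) -> str:
        # Promoted pseudoGUIDs resolve to their GUID; anything else is unchanged.
        return self._promoted.get(identifier, identifier)
```

In this model, data submitted earlier under a pseudoGUID can be associated with the participant's GUID after promotion, and several pseudoGUIDs may resolve to the same GUID.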
Review the publication Using Global Unique Identifiers to Link Autism Collections (Johnson et al. 2010) for more information on the GUID. Note that the GUID matching sensitivity has been reduced from the original design described in this paper as follows:
The following must match (ignoring case and special characters) to produce the same GUID:
- Legal Name, Date of Birth, and City/Municipality
The following combinations will also produce the same GUID, but the user will be asked to confirm issuance of the same GUID. Choose “no” when asked whether to use the existing GUID only if the subjects are known to be different individuals (e.g., twins with the same first letter of their first name):
- Sex, Legal Name, Date of Birth, but a different City/Municipality.
- Sex, Legal First and Last Name, Date of Birth, and City/Municipality, but different Middle Name.
- Sex, First Initial of Legal First Name, Middle Name, Last Name, Date of Birth, and City/Municipality, but different First Name.
- Sex, Legal Name, Date of Birth excluding Year, and City/Municipality.
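The matching rules listed above can be expressed as a small classifier. The sketch below compares plain-text fields for readability, whereas the actual tool works on hash codes. The `Subject` class, the `classify` function, its return labels, and the exact normalization (lowercasing and dropping every non-alphanumeric character, including spaces) are assumptions.

```python
from dataclasses import dataclass
from datetime import date


def norm(text: str) -> str:
    """Ignore case and special characters when comparing."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


@dataclass(frozen=True)
class Subject:
    sex: str
    first: str
    middle: str
    last: str
    dob: date
    city: str


def classify(a: Subject, b: Subject) -> str:
    """Return 'same', 'confirm', or 'new' for a pair of subject records."""
    sex = norm(a.sex) == norm(b.sex)
    first = norm(a.first) == norm(b.first)
    middle = norm(a.middle) == norm(b.middle)
    last = norm(a.last) == norm(b.last)
    city = norm(a.city) == norm(b.city)
    dob = a.dob == b.dob
    month_day = (a.dob.month, a.dob.day) == (b.dob.month, b.dob.day)
    initial = norm(a.first)[:1] == norm(b.first)[:1]

    # Legal name, date of birth, and city all match: same GUID.
    if first and middle and last and dob and city:
        return "same"

    # Near matches: same GUID is offered, but the user must confirm.
    near_match = sex and (
        (first and middle and last and dob)                   # different city
        or (first and last and dob and city)                  # different middle name
        or (initial and middle and last and dob and city)     # different first name
        or (first and middle and last and month_day and city) # different birth year
    )
    return "confirm" if near_match else "new"
```

A "confirm" result corresponds to the prompt asking whether to use the existing GUID; answering "no" for twins leads to a separate GUID being issued.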