12 Specific naming conventions
As mentioned in the general naming conventions section, naming should be both human- and machine-readable. Most of this can be solved by using BIDS, as explained in the previous section; however, there are some GUTS-specific naming conventions that must be followed.
To comply with the machine-readable requirement, processed measure file names should always include "_task-[short name]_".* The short name of a measure, as specified in the codebook, should be used to ensure that each collection site uses the same measure name. For example, the Interpersonal Reactivity Index (IRI) has been given the short name “iri”. Its file name should therefore always include "_task-iri_". It is crucial that the short names match the ones in the codebook, as our scripts specifically search for these names.
*An exception to the rule is non-tabular structural MRI data.
Note that only letters and numbers are allowed in short names.
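The two rules above (short names contain only letters and numbers, and processed file names carry a task entity) can be checked automatically. The following is a minimal sketch, not the project's actual script; the function names are hypothetical, and it assumes that the task entity may be followed either by an underscore or by the file extension, as in the example file names later in this chapter.

```python
import re

# Rule from the text: only letters and numbers are allowed in short names.
SHORT_NAME = re.compile(r"[A-Za-z0-9]+")


def is_valid_short_name(name: str) -> bool:
    """Return True if the short name consists of letters and digits only."""
    return SHORT_NAME.fullmatch(name) is not None


def has_task_entity(file_name: str, short_name: str) -> bool:
    """Return True if file_name contains '_task-<short_name>'.

    Assumption: the entity must be followed by '_' or '.', so that
    'task-iri' does not accidentally match 'task-iris'.
    """
    pattern = rf"_task-{re.escape(short_name)}(?=[_.])"
    return re.search(pattern, file_name) is not None


print(is_valid_short_name("iri"))
print(has_task_entity("gutslei_ses-02_task-iri.tsv", "iri"))
```

A check like this could be run before files are shared, so that a misspelled short name is caught before the scripts fail to find the file.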
Sessions will be named according to the BIDS standard, meaning that they start with 'ses-'.
Cohorts A, B, and D are scheduled to collect data in years 2, 5, and 8 of the ten-year GUTS project. Their respective sessions will be named 'ses-01', 'ses-02', and 'ses-03'.
Cohort D, in addition to the regular sessions, will collect data between sessions 1 and 2 and between sessions 2 and 3: 'ses-01a' and 'ses-02a'.
Cohort C plans to collect data twice a year in years 2 and 3. The first sessions of each year will be named 'ses-01' and 'ses-02', while the second measurements, taking place six months after the first measurement, will be labeled 'ses-01a' and 'ses-02a'.
Pilot
The pilot sessions will be named 'ses-pilot'.
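As an illustration, the session labels listed above can be captured in a single pattern. This is only a sketch: the function name is hypothetical, and the pattern accepts every label mentioned in this section without checking which cohort is allowed to use which label.

```python
import re

# Labels named in the text: ses-01, ses-02, ses-03 (regular sessions),
# ses-01a, ses-02a (in-between sessions), and ses-pilot.
SESSION_LABEL = re.compile(r"ses-(0[1-3]|0[12]a|pilot)")


def is_valid_session(label: str) -> bool:
    """Return True if the label is one of the session names listed above."""
    return SESSION_LABEL.fullmatch(label) is not None


print(is_valid_session("ses-02a"))
print(is_valid_session("ses-2"))
```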
The subject ID naming convention varies slightly for each data storage location. To prevent accidental overlap between collected data from different cohorts, an abbreviation of the location will be added to each subject ID, as illustrated below.
Given that approximately 400-1000 subjects will participate at each location, we advise using a number between 0 and 1000. For parents of subjects, the subject ID number should start with ‘5’. The subject ID number should always consist of four digits; e.g., subject 1 will be assigned the number 0001 and subject 15 the number 0015. A parent would get the subject ID number 5001, 5002, 5003, etc.
Note that all data from questionnaires that are answered by the parents about the child should fall under the subject id of the child. All data from questionnaires answered by parents about themselves should remain under their own subject id.
Family ID
To be able to identify any sibling and parent-child relationships within our data, subjects get a family ID. For example, if subjects gutslei0001, gutslei0002, and gutslei5001 are part of one and the same family, all of them will be assigned family ID 0001.
Data storage location | Subject ID naming convention | participant_id | family_id |
---|---|---|---|
Erasmus University Rotterdam (EUR) | sub-gutseur# | sub-gutseur0001 | 0001 |
Leiden University (LEI) | sub-gutslei# | sub-gutslei0001 | 0001 |
Vrije Universiteit Amsterdam (VU) | sub-gutsvu# | sub-gutsvu0001 | 0001 |
Amsterdam UMC (AUMC) | sub-gutsaumc# | sub-gutsaumc0001 | 0001 |
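The subject ID and family ID rules can be sketched as two small helper functions. The function names and the `LOCATIONS` set are hypothetical and only serve as an illustration. The sketch assumes that participant counters run from 1 to 999, so that parent IDs stay within 5001-5999, and it also covers the pilot prefix described in the next subsection.

```python
# Location abbreviations taken from the table above.
LOCATIONS = {"eur", "lei", "vu", "aumc"}


def participant_id(location: str, number: int,
                   parent: bool = False, pilot: bool = False) -> str:
    """Build a subject ID such as 'sub-gutslei0001'.

    Assumption: 'number' is a counter from 1 to 999; parents get the
    same four-digit format, starting with 5 (5001, 5002, ...).
    """
    if location not in LOCATIONS:
        raise ValueError(f"unknown location: {location!r}")
    if not 1 <= number <= 999:
        raise ValueError("number must be between 1 and 999")
    digits = f"{(5000 if parent else 0) + number:04d}"
    prefix = "gutspilot" if pilot else "guts"
    return f"sub-{prefix}{location}{digits}"


def family_id(number: int) -> str:
    """Format a family ID as four digits, e.g. 1 becomes '0001'."""
    return f"{number:04d}"


print(participant_id("lei", 15))
print(participant_id("lei", 1, parent=True))
print(family_id(1))
```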
Subject IDs pilot
During the pilot, 'pilot' will be added to the subject ID: 'sub-gutspilot[location]#'.
To prevent duplicate file names, it is essential that each data storage location adds its abbreviation to the file name. For participant-level files, such as brain imaging data, the location is integrated into the subject ID in the file name. For group-level files, the location must be added separately. Additionally, file names should consistently include the session and task, as outlined earlier.
Participant-level (individual) files: sub-[subjectid]_ses-[session]_task-[shortname].
Group-level files (all participants merged): guts[location]_ses-[session]_task-[shortname].
The demographics file will be called:
guts[location]_demographics.tsv
The participation file will be called:
guts[location]_participation.tsv
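The file name templates above translate directly into string formatting. The helpers below are a hedged sketch with hypothetical names; they assume that the session is passed without the 'ses-' prefix (e.g. "02") and that group-level files are stored as .tsv, as in the examples in this chapter.

```python
def participant_file(subject_id: str, session: str,
                     short_name: str, extension: str) -> str:
    """sub-[subjectid]_ses-[session]_task-[shortname] plus an extension."""
    return f"sub-{subject_id}_ses-{session}_task-{short_name}{extension}"


def group_file(location: str, session: str, short_name: str,
               extension: str = ".tsv") -> str:
    """guts[location]_ses-[session]_task-[shortname] plus an extension."""
    return f"guts{location}_ses-{session}_task-{short_name}{extension}"


def demographics_file(location: str) -> str:
    """guts[location]_demographics.tsv"""
    return f"guts{location}_demographics.tsv"


def participation_file(location: str) -> str:
    """guts[location]_participation.tsv"""
    return f"guts{location}_participation.tsv"


print(participant_file("gutseur0027", "01", "eegflanker", ".bdf"))
print(group_file("lei", "02", "iri"))
print(demographics_file("vu"))
```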
Raw vs processed files
Ultimately, it is crucial that the processed files strictly follow the correct naming conventions to ensure our scripts can be executed properly. However, future you will be grateful if the raw output also adheres to a clear and consistent naming convention, as this will facilitate data processing.
Therefore, when setting up a task (e.g., (f)MRI, EEG, E-Prime, Dynamometer, etc.) and if possible, please ensure adherence to the naming conventions by filling in subject names/session/task names. Note that for some measures (e.g., MRI), the option to fully decide on output file naming might not be available beforehand.
Below you can find some examples.
Type | Raw | Raw naming conventions | Processed | Processed naming conventions |
---|---|---|---|---|
Qualtrics - GUTS wide | .sav | 2024-02-26_gutslei_ses-02_quests-guts-wide_raw.sav | .tsv | gutslei_ses-02_task-iri.tsv, gutslei_ses-02_task-ypi.tsv |
Qualtrics - Cohort specific | .sav | 2024-02-26_gutslei_ses-02_quests-guts-specific_raw.sav | .tsv | gutslei_ses-02_task-iri.tsv, gutslei_ses-02_task-ypi.tsv |
ESM | .csv | 2024-02-26_gutseur_ses-01_esm_raw.csv / sub-gutseur0001_ses-01_task-esm_raw.csv | .tsv | gutseur_ses-01_task-esm-pressure.tsv |
(f)MRI | .nii | sub-gutseur0014_ses-03_task-fmrisddt_run-01_raw.nii | .nii | sub-gutseur0014_ses-03_task-fmrisddt_run-01.nii |
(f)MRI | .par | sub-gutseur0014_ses-03_task-fmrisddt_run-01_raw.par | - | - |
(f)MRI | .rec | sub-gutseur0014_ses-03_task-fmrisddt_run-01_raw.rec | - | - |
EEG | .bdf | sub-gutseur0027_ses-01_task-eegflanker_raw.bdf | .bdf | sub-gutseur0027_ses-01_task-eegflanker.bdf |
Dynamometer | .csv? | sub-gutseur0028_ses-03_task-dynosocialeffort_raw.csv | .tsv | gutseur_ses-03_task-dynosocialeffort.tsv |
Behavioral | .edat3 | sub-gutsvu0002_ses-02_task-behsddt_raw.edat3 | .tsv | gutsvu_ses-02_task-behsddt.tsv |
Behavioral | .txt | sub-gutsvu0002_ses-02_task-behsddt_raw.txt | - | - |
Physiological | labels | sub-gutsaumc0005_ses-01_task-salivatesto_t0, sub-gutsaumc0007_ses-01_task-salivatesto_t30 | .tsv | gutsaumc_ses-01_task-salivatesto.tsv |
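For participant-level files that keep their format after processing (the (f)MRI and EEG rows above), the processed name is simply the raw name without the `_raw` suffix. A minimal sketch of that step follows; the function name is hypothetical, and it does not cover cases where individual raw files are merged into a group-level .tsv file.

```python
from pathlib import PurePath


def processed_name(raw_name: str) -> str:
    """Drop the '_raw' suffix that precedes the file extension.

    Assumption: the raw file name ends in '_raw' followed by a single
    extension, as in the participant-level examples above.
    """
    path = PurePath(raw_name)
    if not path.stem.endswith("_raw"):
        raise ValueError(f"not a raw file name: {raw_name!r}")
    return path.stem[: -len("_raw")] + path.suffix


print(processed_name("sub-gutseur0027_ses-01_task-eegflanker_raw.bdf"))
```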
For measures involving tabular data that are collected across multiple cohorts, the variable names must be harmonized. This harmonization allows us to easily merge the files belonging to a specific measure from various data storage locations into one file. To facilitate this, we propose the naming convention below. Please collaborate with representatives from overlapping cohorts to ensure uniformity in variable names. For measures with no overlap, the use of specific naming conventions is less crucial. However, we still recommend using a consistent naming pattern, such as the examples provided below.
s[session]_[shortname]_[subpart-task]_q/t0[question/trial #]
Names: lowercase letters, with all distinct pieces of information separated by an underscore
Labels: Full sentences starting with a capital letter.
Value labels: lowercase letters
The short name can be omitted in the demographics file. Questions that are asked only once do not need a session prefix (e.g., sex); questions that are asked in multiple sessions do need a session prefix (e.g., s01_age_years, s01_gender_q01, s01_health_etc).
The variable and value labels can be added to an SPSS file and later converted to a .tsv file + .json file, or manually incorporated into a .json file directly. For more information on creating .json files, see Chapter 15. Below are examples of variable names:
*Note that hyphens/dashes are not allowed in SPSS and should therefore not be used in variable names.
Variable name | Variable label | Value labels |
---|---|---|
s02_iri_pt_q03 | Interpersonal Reactivity Index - Perspective taking scale Q3: I sometimes find it difficult to see things from the other guy’s point of view. | 0 = does not describe me very well, 4 = describes me very well |
s02_dailyhassles_freq_q04 | Parenting Daily Hassles scale - Frequency Q4: The kids won’t listen or do what they are asked without being nagged. | 1 = never, 2 = rarely, 3 = sometimes, 4 = often, 5 = constantly |
s03_pcg_exb1_perc_to2 | Prosocial Cyberball Task - Exclusion Block 1: Percentage of throws to player 2. | |
s03_ddmoney_ind_day180 | Delay Discounting Money: Indifference point day 180: Prefer to receive this amount of money now than 10 euros in 180 days. | |
s03_salivacort_d1_m1 | Saliva Samples - Cortisol: Mean cortisol in nmol/l day 1, measurement 1. | |
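The variable naming pattern can be sketched in code as well. The helpers below are illustrative only and their names are hypothetical: one builds a questionnaire-style name from its parts, the other checks the general rules (lowercase letters and digits, parts separated by underscores, no hyphens). The check assumes that digits are allowed in every part, as in the examples in the table.

```python
import re

# Lowercase letters and digits, with parts separated by single underscores.
VARIABLE_NAME = re.compile(r"[a-z0-9]+(_[a-z0-9]+)*")


def variable_name(session: int, short_name: str, subpart: str,
                  kind: str, number: int) -> str:
    """Build a name such as 's02_iri_pt_q03'.

    'kind' is 'q' for a question or 't' for a trial; the item number
    is zero-padded to two digits.
    """
    return f"s{session:02d}_{short_name}_{subpart}_{kind}{number:02d}"


def is_valid_variable_name(name: str) -> bool:
    """Return True if the name follows the general naming rules."""
    return VARIABLE_NAME.fullmatch(name) is not None


print(variable_name(2, "iri", "pt", "q", 3))
print(is_valid_variable_name("s02_iri-pt_q03"))
```

Because hyphens are rejected, names that pass this check should also be usable as SPSS variable names in that respect.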