Overview
The Curation home page provides a high-level introduction to key aspects of the curation process, the definition of an immune signature, and the kinds of information to be captured. It also covers how to choose the tissue and response component for cell-type signatures (see "Example tissue and response component combinations").
This page describes every field in the example annotation template.
As a general rule, curation should be faithful to the manner in which findings are reported in the source manuscript. If need be, resolution of discrepancies (e.g. inconsistent use of controlled vocabulary terms by manuscript authors to specify gene symbols, cell types or protein markers) will be performed by the project team, in a post-curation review.
Results that are qualitative only, e.g. a visual comparison in a heatmap without a statistical test, should in general not be captured (for an example see figure 4C under "Example annotation").
Capitalization
Values should be entered in lower case except where capitalization carries semantic information, e.g. human gene symbols, which are written in upper case (in contrast to mouse gene symbols, which by convention capitalize only the first letter):
Column name | Example |
response_component (for case of gene symbols) | STAT1 |
The annotation template
The table below is a transposed version of the annotation sheet from the example annotation template. Each row of the table represents a column of the sheet. Permissible values are described under "Vocabulary"; "choose from list" means that values should come from the term lists specified in the "terms" sheet of the annotation template.
In the annotation sheet, curators can add as many signature rows as needed to capture all pertinent immune signatures found, one row per signature.
We suggest reviewing the table with the sample annotation sheet open in another browser window, for easy access to examples of immune signatures that can lend clarity to the points made below.
Curation sheet column header | Descriptive text on Dashboard | Vocabulary | Additional information for curators |
curation_date | curation date YYYY-MM-DD | YYYY-MM-DD | Date curation was completed |
cohort | cohort - any characteristics of the population(s) studied, plus whether the result was taken from a subgroup of the broader cohort tested. | free text | Example: COVID-19 infected patients and age/sex-matched healthy controls (Atlanta, Georgia). Also report if the result was limited to a subgroup of the tested cohort. |
age_min | age_min - age of youngest subject including both cases and controls | number | Include both case/affected and control subjects |
age_max | age_max - age of oldest subject including both cases and controls | number | Include both case/affected and control subjects |
age_units | age units | choose from list | hours, days, months, years |
number_subjects | number of subjects - count of case plus control subjects used in the measurement | number | Often differs by signature within a publication. If the number of subjects for a particular signature is not clear from the text, use the total for the cohort |
tissue_type | tissue type | Free text. Multiple entries must be separated with a semicolon | As reported. The parent tissue of the response components. For cell-type results, the tissue is often PBMCs, but can also be a specific base cell type. For gene expression, the tissue might be a specific base cell type |
tissue_type_term_id | Cell Ontology ID of tissue | Choose from list or look up Cell Ontology code. Multiple entries must be separated by a semicolon, e.g. CL:0000576 (monocyte); CL:0000235 (macrophage). Just the code is also acceptable, e.g. CL:0000576. | Cell Ontology IDs for tissues in column “tissue_type”. If there is no matching cell type in the pulldown list, the curator can try to look up a matching term in the Cell Ontology. If there is no appropriate Cell Ontology term, UBERON codes can also be used. |
method | method - primary experimental method used to measure the response | Choose from list. Only one entry expected. You can add new methods. | The primary experimental method used to measure the response, e.g. RNA-seq, CyTOF, CITE-seq. |
response_component | response component | Gene or protein symbols can be separated with commas or semicolons. Cell types or other names must be separated using semicolons | The entities whose response is being measured. Please copy symbols and names exactly as reported in the publication, except spell out Greek letters or other special characters. For cell types, this includes all markers. Examples for a signature with three cell types: T cells CD3+/CD4+/Ki67+; T cells CD3+/CD8+/Ki67+ CD86+ myeloid dendritic cell (DC); CD86+ monocyte |
is_model | signature was derived from a computational model | Y/N | Were the response components chosen using a classification or other model-building strategy? |
response_behavior_type | response behavior type | Choose from list. Only one entry allowed. | The type of change being measured, e.g. gene expression, cell-type frequency. |
response_behavior | response behavior (direction, correlation type etc.) | Choose from list. Only one entry allowed. Add new behaviors if required. | Common values are given in the pulldown list. |
comparison | comparison (affected vs control, correlated variable, time vs baseline event etc.) | Free text. Use "vs" rather than a dash to separate comparison terms (A vs B). Separate multiple comparison entries with semicolons (A vs B; C vs D). | Comparisons are typically between two groups, or may reflect a correlation of the response component with some other measured variable. Only report significant results. Include group sizes (e.g., N=24). Include time comparison if relevant, e.g. 7d vs 0d, where the times are relative to the baseline reference event time (0d). Times before the baseline event can be entered as negative numbers, e.g. days before vaccination (7d vs -1d). **Please be concise** |
baseline_time_event | baseline time event | Free text | The reference event from which the time of the experimental response is measured, e.g. hospital admission, onset of symptoms |
time_point | time point relative to baseline event at which response was measured | Number or free text | Time point when response was measured, e.g. “7”, “various”, “0 to 8” |
time_point_units | time point units | Choose from list. Lowercase only | days, months etc. |
exposure_material | infection: exposure material (pathogen name); vaccine: exposure material (vaccine name) | free text | Enter the pathogen or vaccine underlying the immune exposure as reported in the publication, or use the NCBI Taxonomy term name |
exposure_material_id (vaccine and infection templates use different ontologies) | infection: exposure material (NCBI taxid); vaccine: exposure material (vaccine ontology). | infection: Choose from list or use format ncbi_taxid:2697049. vaccine: use format VO:0000045. | infection: NCBI Taxonomy ID of pathogen causing disease. vaccine: Vaccine Ontology ID of vaccine administered |
exposure_process (infection template) | exposure process - method by which immune exposure occurred | Choose from list. Enter new process if needed. | Method by which exposure to pathogen occurred |
disease_name (infection template) | disease name | free text | E.g. COVID-19 |
disease_stage (infection template) | disease stage - reported disease stage(s) of affected subjects | free text | Reported disease stage(s) of affected subjects in all comparisons entered in the row (not including control subjects). We will not attempt to match these directly with the comparisons. Examples include moderate, severe, ICU etc. Pooled can also be added. |
additional_exposure_material (vaccine templates) | exposure material - additional | free text | Any additional exposure material, e.g. “Live attenuated vaccine TC-83 challenge”, “ex-vivo restimulation with live VZV” |
target_pathogen (vaccine templates) | target pathogen | text | NCBI Taxonomy name (non-influenza pathogens). For influenza, can just note e.g. “influenza A virus; influenza B virus”. Values will be filled in by script based on vaccine year. |
target_pathogen_taxonid (vaccine templates) | target pathogen (NCBI TaxID) | “influ:xxxx”, “ncbi_taxid:nnnnn”, “ncbi_taxid:10335 (Human alphaherpesvirus 3)”. Separate multiple entries with a semicolon. | For influenza, enter the tag “influ:xxxx” for the vaccine year/type from the vaccine_years.xlsx spreadsheet, e.g. “influ:2008”, “influ:2009mv”. For all other pathogens, enter a matching provided value, e.g. “ncbi_taxid:10335 (Human alphaherpesvirus 3)” (see file “ncbi_txids.tsv”), or enter the NCBI Taxonomy ID of the pathogen if available, or the nearest higher taxonomy entry if not. Multiple entries are allowed (expected for influenza only). |
vaccine_year (vaccine templates) | vaccine year | YYYY, comma separated if multiple | For influenza vaccines only, enter the official year of the vaccine, e.g. 2008, or the tag from the vaccine_years.xlsx spreadsheet, e.g. “2009mv”. Multiple years allowed, e.g. “2008, 2009, 2010” |
adjuvant (vaccine templates) | adjuvant | free text | Name of adjuvant, and Vaccine Ontology ID if available, e.g. “AS03 (VO:0001320)” |
route (vaccine templates) | route | free text | i.m., i.n., i.d., po, subcutaneous etc. |
scheduling (vaccine templates) | scheduling | free text | Has been used to record number of doses, e.g. 1 dose, 2 doses. |
publication_reference_id | publication reference (PMID) | pmid:nnnnnnnn or nnnnnnnn | Enter the PMID for the article curated, using the format pmid:id, such as pmid:32788292, or just the ID number, e.g. 32788292 |
publication_date | print publication date or posted date YYYY-MM-DD | YYYY-MM-DD | Partial dates are OK, e.g. YYYY-MM |
publication_reference_url | publication URL | The URL only, not any associated display text | URL of the article curated. Please use PubMed if available |
signature_source | signature source - figure, table or text section | Free text. Multiple entries allowed if needed | Figure, table number etc. where the signature was found, as given in the source listed in the publication_reference_url. Multiple entries are allowed, e.g. if a result is drawn from both a primary and a supplemental source |
comments | comments and additional details | Free text, limit of 250 characters | Details clarifying any aspect of the signature not captured in other fields. Will appear in Dashboard |
curator_comments | curator comments | Free text, no limit | Questions or notes for further examination from the curator. Will not appear in Dashboard |
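Several of the formats listed in the table (identifier prefixes, semicolon-separated entries, YYYY-MM-DD dates) lend themselves to a quick self-check before a sheet is submitted. The Python sketch below is one possible way to do this. The regular expressions, the dictionary ID_PATTERNS and the helper names are assumptions inferred from the examples in the table; they are not the project's official validation rules, and the optional "(label)" suffix is allowed for every ID column only for simplicity.

```python
import re
from datetime import datetime

# Assumed patterns, inferred from the examples in the "Vocabulary" column.
# An optional " (label)" suffix is accepted, as in "CL:0000576 (monocyte)".
LABEL = r"( \(.+\))?"
ID_PATTERNS = {
    "publication_reference_id": re.compile(r"(pmid:)?\d+"),
    "tissue_type_term_id": re.compile(r"(CL|UBERON):\d+" + LABEL),
    "exposure_material_id": re.compile(r"(ncbi_taxid:\d+|VO:\d+)" + LABEL),
    "target_pathogen_taxonid": re.compile(r"(influ:\w+|ncbi_taxid:\d+)" + LABEL),
}


def split_entries(value, sep=";"):
    """Split a multi-entry cell on the separator and trim whitespace."""
    return [part.strip() for part in value.split(sep) if part.strip()]


def invalid_ids(column, value):
    """Return the entries of a cell that do not match the assumed pattern."""
    pattern = ID_PATTERNS[column]
    return [entry for entry in split_entries(value)
            if not pattern.fullmatch(entry)]


def is_valid_date(value, allow_partial=False):
    """Check YYYY-MM-DD; optionally accept the partial form YYYY-MM."""
    formats = ["%Y-%m-%d"]
    if allow_partial:
        formats.append("%Y-%m")
    for fmt in formats:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False
```

For example, invalid_ids("tissue_type_term_id", "CL:0000576 (monocyte); CL:0000235 (macrophage)") returns an empty list, and is_valid_date("2020-08", allow_partial=True) accepts the partial date form permitted for publication_date.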
Notes
*Be concise - A subset of the columns above (as specified by a signature-specific template) will be combined to generate a human-readable summary of the signature, to be displayed on the HIPC Dashboard. Please be as concise as possible when annotating free-text columns.
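To illustrate why concise entries matter, the Python sketch below shows how a summary might be assembled from a few columns and how the 250-character limit on the comments column could be checked. The template string, the function names and the example row values are all hypothetical; the actual signature-specific templates used by the Dashboard are not described on this page.

```python
COMMENTS_LIMIT = 250  # character limit stated for the "comments" column

# Hypothetical template; the real signature-specific templates may differ.
EXAMPLE_TEMPLATE = (
    "{response_component} {response_behavior} in {tissue_type}, {comparison}"
)


def build_summary(row, template=EXAMPLE_TEMPLATE):
    """Fill the template with values from one annotation row (a dict)."""
    return template.format(**row)


def comments_too_long(row):
    """Return True if the comments entry exceeds the stated limit."""
    return len(row.get("comments", "")) > COMMENTS_LIMIT


# Made-up row, for illustration only.
example_row = {
    "response_component": "STAT1",
    "response_behavior": "up",
    "tissue_type": "PBMC",
    "comparison": "7d vs 0d",
    "comments": "",
}

print(build_summary(example_row))  # STAT1 up in PBMC, 7d vs 0d
```

Because every free-text value is copied into the summary verbatim, a long comparison or tissue_type entry makes the displayed summary correspondingly long.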