Overview

The Curation home page provides a high-level introduction to key aspects of the curation process, the definition of an immune signature, and the kinds of information to be captured. It also explains how to choose the tissue and response component for cell-type signatures (see "Example tissue and response component combinations").

This page describes every field in the example annotation template.

As a general rule, curation should be faithful to the manner in which findings are reported in the source manuscript. Where necessary, the project team will resolve discrepancies (e.g. inconsistent use of controlled vocabulary terms by manuscript authors to specify gene symbols, cell types or protein markers) in a post-curation review.

Results that are qualitative only, e.g. a visual comparison in a heatmap without a statistical test, should in general not be captured (for an example see figure 4C under "Example annotation").

Capitalization

Values should be entered in lower case except where capitalization carries semantic information, e.g. human gene symbols, which are reported in upper case (in contrast to the lower-case convention for mouse genes):

Column name	Example
response_component (for case of gene symbols)	STAT1

The annotation template

The table below is a transposed version of the annotation sheet from the example annotation template. Each row of the table represents a column of the sheet. Permissible values are described under "Vocabulary"; "choose from list" means that values should come from the term lists specified in the "terms" sheet of the annotation template.

In the annotation sheet, curators can add as many signature rows as needed to capture all pertinent immune signatures found, one row per signature.

We suggest reviewing the table with the sample annotation sheet open in another browser window, for easy access to examples of immune signatures that can lend clarity to the points made below.

Curation sheet column header	Descriptive text on Dashboard	Vocabulary	Additional information for curators
curation_date	curation date YYYY-MM-DD	YYYY-MM-DD	Date curation completed
cohort	cohort - any characteristics of the population(s) studied, plus whether the result was taken from a subgroup of the broader cohort tested.	free text	Example: COVID-19 infected patients and age/sex-matched healthy controls (Atlanta, Georgia). Also report if the result was limited to a subgroup of the tested cohort, e.g. subjects suffering adverse events, subjects at particular threshold levels of antibody titer, or subjects receiving a particular treatment
age_min	age_min - age of youngest subject including both cases and controls	number	Include both case/affected and control subjects
age_max	age_max - age of oldest subject including both cases and controls	number	Include both case/affected and control subjects
age_units	age units	choose from list	hours, days, months, years
number_subjects	number of subjects - count of case plus control subjects used in the measurement	number	Often differs by signature within a publication. If number of subjects for a particular signature is not clear in text, use total for cohort
tissue_type	tissue type	Free text. Multiple entries must be separated with a semicolon	As reported. The parent tissue of the response components. For cell-type results, the tissue is often PBMCs, but can also be a specific base cell type. For gene expression, the tissue might be a specific base cell type
tissue_type_term_id	Cell Ontology ID of tissue	Choose from list or lookup Cell Ontology code. Multiple entries must be separated by a semicolon, e.g. CL:0000576 (monocyte); CL:0000235 (macrophage). Just the code is also acceptable, e.g. CL:0000576.	Cell Ontology IDs for tissues in column “tissue_type”. If there is no matching cell type in the pulldown list, the curator can try to look up a matching term in the Cell Ontology. If there is no appropriate Cell Ontology term, UBERON codes can also be used.
method	method - primary experimental method used to measure the response	Choose from list. Only one entry expected. You can add new methods.	The primary experimental method used to measure the response, e.g. RNA-seq, CyTOF, CITE-seq.
response_component	response component	Gene or protein symbols can be separated with commas or semicolons. Cell types or other names must be separated using semicolons	The entities whose response is being measured. Please copy symbols and names exactly as reported in the publication, except that Greek letters and other special characters should be spelled out. For cell types, this includes all markers. Examples for a signature with three cell types: T cells CD3+/CD4+/Ki67+; T cells CD3+/CD8+/Ki67+ CD86+ myeloid dendritic cell (DC); CD86+ monocyte
is_model	signature was derived from a computational model	Y/N	Were the response components chosen using a classification or other model-building strategy?
response_behavior_type	response behavior type	Choose from list. Only one entry allowed.	The type of change being measured, e.g. gene expression, cell-type frequency.
response_behavior	response behavior (direction, correlation type etc.)	Choose from list. Only one entry allowed. Add new behaviors if required.	Common values in the pulldown list: up, down; positively correlated, negatively correlated, correlated; positively predictive, negatively predictive, predictive
comparison	comparison (affected vs control, correlated variable, time vs baseline event etc.)	Free text. Use "vs" rather than a dash to separate comparison terms (A vs B). Separate multiple comparison entries with semicolons (A vs B; C vs D).	Comparisons are typically between two groups, or may reflect a correlation of the response component with some other measured variable. Only report significant results. Examples: (1) severe COVID-19 cases (N=24) vs healthy (N=28) (2) moderate COVID-19 cases vs healthy, severe COVID-19 cases vs healthy (3) interferon-stimulated genes in COVID-19 vs healthy (4) bacterial DNA levels across COVID-19 and healthy subjects. Include group sizes (e.g., N=24). Include time comparison if relevant, e.g. 7d vs 0d, where the times are relative to the baseline reference event time (0d). Times before the baseline event can be entered as negative numbers, e.g. days before vaccination (7d vs -1d). Please be concise.
baseline_time_event	baseline time event	Free text	The reference event from which the time of the experimental response is measured, e.g. hospital admission, onset of symptoms
time_point	time point relative to baseline event at which response was measured	Number or free text	Time point when response was measured, e.g. “7”, “various”, “0 to 8”
time_point_units	time point units	Choose from list. Lowercase only	days, months etc.
exposure_material	infection: exposure material (pathogen name); vaccine: exposure material (vaccine name)	free text	Enter the pathogen or vaccine underlying the immune exposure, as reported in the publication, or use the NCBI Taxonomy term name
exposure_material_id (vaccine and infection templates use different ontologies)	infection: exposure material (NCBI taxid); vaccine: exposure material (vaccine ontology).	infection: Choose from list or use format ncbi_taxid:2697049. vaccine: use format VO:0000045.	infection: NCBI Taxonomy ID of pathogen causing disease. vaccine: vaccine ontology ID of vaccine administered
exposure_process (infection template)	exposure process - method by which immune exposure occurred	Choose from list. Enter new process if needed.	Method by which exposure to pathogen occurred
disease_name (infection template)	disease name	free text	E.g. COVID-19
disease_stage (infection template)	disease stage - reported disease stage(s) of affected subjects	free text	Reported disease stage(s) of affected subjects in all comparisons entered in row (not including control subjects). We will not attempt to match these directly with the comparisons. Examples include moderate, severe, ICU etc. Pooled can also be added.
additional_exposure_material (vaccine templates)	exposure material - additional	free text	Any additional exposure material, e.g. “Live attenuated vaccine TC-83 challenge”, “ex-vivo restimulation with live VZV”
target_pathogen (vaccine templates)	target pathogen	text	NCBI taxonomy name (non-influenza pathogens). For influenza, can just note e.g. “influenza A virus; influenza B virus”. Values will be filled in by script based on vaccine year.
target_pathogen_taxonid (vaccine templates)	target pathogen (NCBI TaxID)	“influ:xxxx”, “ncbi_taxid:nnnnn”, “ncbi_taxid:10335 (Human alphaherpesvirus 3)”. Separate multiple entries with a semicolon.	For influenza, enter the tag “influ:xxxx” for the vaccine year/type from the vaccine_years.xlsx spreadsheet, e.g. “influ:2008”, “influ:2009mv”. For all other pathogens, enter a matching provided value e.g. “ncbi_taxid:10335 (Human alphaherpesvirus 3)” (see file “ncbi_txids.tsv”) or enter the NCBI Taxonomy ID of pathogen if available, or nearest higher taxonomy entry if not. Multiple entries are allowed (expected for influenza only).
vaccine_year (vaccine templates)	vaccine year	YYYY, comma separated if multiple	For influenza vaccines only, enter the official year of the vaccine, e.g. 2008, or the tag from the vaccine_years.xlsx spreadsheet, e.g. “2009mv”. Multiple years allowed, e.g. “2008, 2009, 2010”
adjuvant (vaccine templates)	adjuvant	free text	Name of adjuvant, and vaccine ontology ID if available, e.g. “AS03 (VO:0001320)”
route (vaccine templates)	route	free text	i.m., i.n., i.d., p.o., subcutaneous etc.
scheduling (vaccine templates)	scheduling	free text	Has been used to record number of doses, e.g. 1 dose, 2 doses.
publication_reference_id	publication reference (PMID)	pmid:nnnnnnnn or nnnnnnnn	Enter the PMID for the article curated, using the format pmid:id, such as pmid:32788292, or just the ID number, e.g. 32788292
publication_date	print publication date or posted date YYYY-MM-DD	YYYY-MM-DD	Partial dates are OK, e.g. YYYY-MM
publication_reference_url	publication URL	The URL only, not any associated display text	URL of the article curated. Please use PubMed if available
signature_source	signature source - figure, table or text section	Free text. Multiple entries allowed if needed	Figure, table number etc. where the signature was found, as given in the source listed in the publication_reference_url. Multiple entries allowed, e.g. if a result is drawn from both a primary and a supplemental source
comments	comments and additional details	Free text, limit of 250 characters	Details clarifying any aspect of the signature not captured in other fields. Will appear in Dashboard
curator_comments	curator comments	Free text, no limit	Questions or notes for further examination from the curator. Will not appear in Dashboard
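
Several of the formats in the "Vocabulary" column are regular enough to check automatically before a sheet is submitted. The following is a minimal sketch in Python, not part of any project tooling: the names SINGLE_VALUE_PATTERNS, TISSUE_TERM, split_entries and check_row are hypothetical, each row is assumed to have been loaded as a dictionary keyed by column header, and accepting a bare year as a partial publication date is an assumption. Empty cells are skipped rather than flagged.

```python
import re

# Formats transcribed from the "Vocabulary" column of the table above.
# The names used here are illustrative only.
SINGLE_VALUE_PATTERNS = {
    "curation_date": re.compile(r"\d{4}-\d{2}-\d{2}"),
    # Partial dates such as YYYY-MM are accepted; a bare year is an assumption.
    "publication_date": re.compile(r"\d{4}(-\d{2}(-\d{2})?)?"),
    "publication_reference_id": re.compile(r"(pmid:)?\d+"),
    "is_model": re.compile(r"[YN]"),
}

# One tissue entry: a Cell Ontology (or UBERON) code, optionally followed
# by a label in parentheses, e.g. "CL:0000576 (monocyte)".
TISSUE_TERM = re.compile(r"(CL|UBERON):\d+( \([^()]+\))?")


def split_entries(value):
    """Split a multi-entry cell on semicolons, dropping empty pieces."""
    return [part.strip() for part in value.split(";") if part.strip()]


def check_row(row):
    """Return a list of format problems found in one signature row (a dict)."""
    problems = []
    for column, pattern in SINGLE_VALUE_PATTERNS.items():
        value = row.get(column, "").strip()
        if value and not pattern.fullmatch(value):
            problems.append(f"{column}: unexpected format {value!r}")
    for entry in split_entries(row.get("tissue_type_term_id", "")):
        if not TISSUE_TERM.fullmatch(entry):
            problems.append(f"tissue_type_term_id: unexpected entry {entry!r}")
    return problems


# Usage with a made-up row (the curation date is invented for illustration).
row = {
    "curation_date": "2021-03-15",
    "publication_reference_id": "pmid:32788292",
    "tissue_type_term_id": "CL:0000576 (monocyte); CL:0000235 (macrophage)",
}
print(check_row(row))  # prints []
```

A check of this kind only catches formatting slips, such as a comma used in place of a semicolon between Cell Ontology codes. It does not verify that a code exists in the ontology or that it matches the tissue named in "tissue_type".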

Notes

*Be concise - A subset of the columns above (as specified by a signature-specific template) will be combined to generate a human-readable summary of the signature, to be displayed on the HIPC Dashboard. Please be as concise as possible when annotating free-text columns.
