Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
Title Required
Name title
Description The title for your dataset. This will be displayed when search results including your data are shown. Often this will be the same as an associated publication.
Example SARS-COV-2 drug repurposing - Caco2 cell line
Reference #
Regex ^.{25,}$
Namespace rembi:title
Description Required
Name description
Description Use this field to describe your dataset. This can be the abstract to an accompanying publication.
Example High-throughput screening of repurposed drugs against SARS-CoV-2 in Caco-2 cells
Reference http://purl.org/dc/terms/1.1/title
Regex ^.{25,}$
Namespace rembi:description
Release Date Required
Name private_until_date
Description The date until which the data remains private and embargoed.
Example 2027-06-01
Reference http://purl.obolibrary.org/obo/SLSO_0001056
Regex ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Namespace rembi:private_until_date
Keywords Required
Name keywords
Description Keywords describing your data that can be used to aid search and classification.
Example CRISPR
Reference http://schema.org/keywords
Namespace rembi:keywords
Licence
Name licence
Description The license under which the data are available.
Example MIT License
Reference http://purl.org/dc/elements/1.1/license
Namespace rembi:licence
Allowed Values Apache License 2.0 Creative Commons Attribution 4.0 International Creative Commons Attribution Share Alike 4.0 International Creative Commons Zero v1.0 Universal GNU General Public License v3.0 or later MIT License
Funding Statement
Name funding_statement
Description A description of how the data generation was funded.
Example Data generation for this study was supported by a grant from the BBSRC, which funded annotation and analysis activities.
Reference http://purl.obolibrary.org/obo/IAO_0000623
Namespace rembi:funding_statement
Acknowledgements
Name acknowledgements
Description Any people or groups that should be acknowledged as part of the dataset.
Example We acknowledge the contributions of the field research team at the University of Edinburgh, the sequencing support from the Earlham Institute, and funding provided by the BBSRC. Special thanks to local conservation volunteers for assistance in sample collection.
Reference http://purl.obolibrary.org/obo/IAO_0000324
Namespace rembi:acknowledgements
Rembi Version Required
Name rembi_version
Description The version of REMBI. The current version to be used is 1.5.
Example 1.5
Reference #
Regex ^1\.5$
Namespace rembi:rembi_version
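The Regex rows of the study fields above can be checked mechanically before submission. The following is a minimal Python sketch, not an official validator: the names STUDY_PATTERNS and invalid_study_fields are hypothetical, and only the patterns themselves are copied from the table.

```python
import re

# Patterns copied from the Regex rows of the study fields above.
# The dictionary and function names are illustrative assumptions.
STUDY_PATTERNS = {
    "title": r"^.{25,}$",
    "description": r"^.{25,}$",
    "private_until_date": r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$",
    "rembi_version": r"^1\.5$",
}


def invalid_study_fields(record):
    """Return the names of fields that are present but fail their pattern."""
    return [
        name
        for name, pattern in STUDY_PATTERNS.items()
        if name in record and re.fullmatch(pattern, record[name]) is None
    ]
```

Fields without a Regex row (for example keywords) are not checked by this sketch, and a record that passes may still be missing required fields.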
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study
Example STUDY001
Reference #
Namespace ei:study_id
Identifier Required
Name identifier
Description The identifier for the grant.
Example 12345
Reference http://purl.org/dc/terms/identifier
Namespace rembi:identifier
Funder Required
Name funder
Description The funding body providing support.
Example Biotechnology and Biological Sciences Research Council (BBSRC)
Reference https://schema.org/funder
Namespace rembi:funder
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study
Example STUDY001
Reference #
Namespace ei:study_id
Title Required
Name title
Description Title of associated publication.
Example High-throughput drug screening identifies potential SARS-CoV-2 inhibitors in Caco2 cells
Reference http://purl.org/dc/terms/1.1/title
Namespace rembi:title
Authors
Name authors
Description Authors of the associated publication. Multiple authors should be listed in order of contribution. Each name should be formatted as last name followed by first initial (e.g. Doe J.). Separate multiple authors with commas.
Example Doe J., Lee A., Gupta R., Zhao L., Thompson M.
Reference http://purl.obolibrary.org/obo/GENEPIO_0001517
Namespace rembi:authors
DOI
Name doi
Description A Digital Object Identifier (DOI) is a unique alphanumeric string assigned to a digital object, such as a journal article, dataset, or publication, to provide a permanent link to its location on the internet. It ensures reliable citation and access. The DOI should follow the standard format (e.g., 10.1234/example.doi) and link to the original source of the publication or data referenced.
Example 10.1038/s41586-020-2577-1
Reference http://purl.obolibrary.org/obo/ONTOAVIDA_00000015
Regex ^10\.\d{4,9}/[-._;()/:A-Za-z0-9]+$
Namespace rembi:doi
Year
Name year
Description Year of publication.
Example 2025
Reference http://rs.tdwg.org/dwc/terms/year
Regex ^(19|20)\d{2}$
Namespace rembi:year
Pubmed ID
Name pubmed_id
Description PubMed identifier for the publication.
Example 32726801
Reference http://purl.obolibrary.org/obo/MS_1000879
Regex ^\d{1,8}$
Namespace rembi:pubmed_id
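The publication fields with a Regex row (doi, year, pubmed_id) can be checked in the same way. This is a sketch under the assumption that all three values arrive as strings; the function name check_publication is hypothetical and the patterns are copied from the table above.

```python
import re

# Patterns copied from the Regex rows for doi, year and pubmed_id.
DOI_RE = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Za-z0-9]+$")
YEAR_RE = re.compile(r"^(19|20)\d{2}$")
PUBMED_RE = re.compile(r"^\d{1,8}$")


def check_publication(doi, year, pubmed_id):
    """Report, per field, whether the string value matches its pattern."""
    return {
        "doi": DOI_RE.fullmatch(doi) is not None,
        "year": YEAR_RE.fullmatch(year) is not None,
        "pubmed_id": PUBMED_RE.fullmatch(pubmed_id) is not None,
    }
```

Note that the DOI pattern expects the bare identifier, so a full resolver URL such as https://doi.org/10.1038/s41586-020-2577-1 would be rejected.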
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study
Example STUDY001
Reference #
Namespace ei:study_id
Study Component ID Required
Name study_component_id
Description A unique alphanumeric identifier for the study component
Example STUDYCOMP001
Reference #
Namespace ei:study_component_id
Name Required
Name name
Description The name of your study component.
Example Confocal images
Reference #
Namespace rembi:name
Description Required
Name description
Description An explanation of your study component.
Example Stitched max-projected fluorescent confocal images
Reference #
Namespace rembi:description
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study
Example STUDY001
Reference #
Namespace ei:study_id
Annotation ID Required
Name annotation_id
Description A unique alphanumeric identifier for the image annotation record.
Example ANNOT001
Reference #
Namespace rembi:annotation_id
Annotation Overview Required
Name annotation_overview
Description Short descriptive summary indicating the type of annotation and how it was generated
Example Cell nuclei marked using DAPI staining.
Reference #
Namespace rembi:annotation_overview
File Type
Name file_type
Description The format of the annotation file.
Example gff
Reference http://purl.obolibrary.org/obo/SLSO_0001157
Namespace rembi:file_type
Annotation Type
Name annotation_type
Description Defines the type of annotation (e.g., class_labels, bounding_boxes, counts, derived_annotations).
Example geometrical_annotations
Reference http://purl.obolibrary.org/obo/NCIT_C89919
Namespace rembi:annotation_type
Allowed Values bounding_boxes class_labels counts derived_annotations geometrical_annotations graphs other point_annotations segmentation_mask tracks weak_annotations
Annotation Method Required
Name annotation_method
Description Description of how the annotations were created, including protocols used for consensus and quality assurance, if applicable.
Example crowdsourced
Reference #
Namespace rembi:annotation_method
Annotation Criteria
Name annotation_criteria
Description Rules used to generate annotations
Example only nuclei in focus were segmented
Reference #
Namespace rembi:annotation_criteria
Annotation Coverage
Name annotation_coverage
Description The proportion of images from the dataset that were annotated.
Example All data that satisfied the Annotation Criteria were annotated.
Reference #
Namespace rembi:annotation_coverage
Annotation Confidence Level
Name annotation_confidence_level
Description Confidence on annotation accuracy
Example more than 95% pixel consensus where multiple annotators independently segmented the same object
Reference #
Namespace rembi:annotation_confidence_level
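For the annotation fields above, two checks follow directly from the table: the fields marked Required must be present, and annotation_type, when given, must be one of the Allowed Values. A minimal Python sketch, assuming each record is a plain dictionary keyed by the Name column (the function name annotation_problems is hypothetical):

```python
# Fields marked "Required" in the annotation table above.
REQUIRED_ANNOTATION_FIELDS = (
    "study_id",
    "annotation_id",
    "annotation_overview",
    "annotation_method",
)

# Allowed Values listed for annotation_type.
ALLOWED_ANNOTATION_TYPES = frozenset({
    "bounding_boxes", "class_labels", "counts", "derived_annotations",
    "geometrical_annotations", "graphs", "other", "point_annotations",
    "segmentation_mask", "tracks", "weak_annotations",
})


def annotation_problems(record):
    """Return human-readable problems found in one annotation record."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_ANNOTATION_FIELDS
        if not record.get(name)  # absent or empty counts as missing
    ]
    annotation_type = record.get("annotation_type")
    if annotation_type and annotation_type not in ALLOWED_ANNOTATION_TYPES:
        problems.append(f"annotation_type not in allowed values: {annotation_type}")
    return problems
```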
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study
Example STUDY001
Reference #
Namespace ei:study_id
Person ID Required
Name person_id
Description A unique alphanumeric identifier for the author.
Example PERSON001
Reference #
Namespace ei:person_id
Annotation ID Required
Name annotation_id
Description A unique alphanumeric identifier for the image annotation record.
Example ANNOT001
Reference #
Namespace ei:annotation_id
Author First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
Author Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace rembi:email
Orcid ID
Name orcid_id
Description A 16-digit number that uniquely identifies a researcher.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{4}$
Namespace rembi:orcid_id
Affiliation or Institution Required
Name affiliation
Description A URL to a public registry containing organisation information or the name of the organisation. A Research Organisation Registry (ROR) URL is recommended if a URL is provided.
Example https://ror.org/018cxtf62
Reference https://schema.org/affiliation
Namespace rembi:affiliation
Role
Name role
Description Author role in the study. If there are multiple roles, separate them with the pipe symbol (|).
Example Senior Bioinformatician
Reference http://www.w3.org/2006/vcard/ns#role
Namespace rembi:role
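The email and orcid_id patterns above, and the pipe-separated role field, can be handled as in the following Python sketch. The function names are hypothetical, and the second role in the test value ("Data Curator") is an invented illustration rather than a value from the schema.

```python
import re

# Patterns copied from the Regex rows for email and orcid_id.
EMAIL_RE = re.compile(r"^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$")
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{4}$")


def is_valid_email(value):
    """True if the value matches the email pattern from the table."""
    return EMAIL_RE.fullmatch(value) is not None


def is_valid_orcid(value):
    """True if the value matches the orcid_id pattern from the table."""
    return ORCID_RE.fullmatch(value) is not None


def split_roles(value):
    """Split a pipe-separated role string into a list of trimmed roles."""
    return [role.strip() for role in value.split("|") if role.strip()]
```

One caveat worth checking against your own data: ORCID iDs may end in the check character X, and the pattern as written accepts digits only, so such an iD would be rejected by this check.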
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique alphanumeric identifier for this sample
Example SAMP001
Reference #
Namespace ei:sample_id
Scientific Name or Organism Required
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ei:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must be accurately matched to the provided scientificName to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ei:taxon_id
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ei:biosampleAccession
Biological Entity Required
Name biological_entity
Description What is being imaged
Example Drosophila endoderm
Reference #
Namespace rembi:biological_entity
Common Name
Name common_name
Description Common name
Example rock worm
Reference #
Namespace rembi:common_name
Description
Name description
Description High level description of sample.
Example Bronchial epithelial cell culture
Reference #
Namespace rembi:description
Intrinsic Variables
Name intrinsic_variables
Description Intrinsic (e.g. genetic) alteration if applicable
Example stable overexpression of HIST1H2BJ-mCherry and LMNA
Reference #
Namespace rembi:intrinsic_variables
Extrinsic Variables
Name extrinsic_variables
Description External sample treatment (e.g. reagent) if applicable
Example 2-(9-oxoacridin-10-yl)acetic acid
Reference #
Namespace rembi:extrinsic_variables
Experimental Variables
Name experimental_variables
Description What is intentionally varied (e.g. time) between multiple entries in this study component
Example Time
Reference #
Namespace rembi:experimental_variables
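The scientific_name and taxon_id patterns in the sample table above only constrain the format of each value. A small Python sketch of that format check follows; the function name sample_format_ok is hypothetical, and the check does not confirm that the name and the Taxon ID refer to the same organism, which would need a lookup in a taxonomy database.

```python
import re

# Patterns copied from the Regex rows for scientific_name and taxon_id.
SCIENTIFIC_NAME_RE = re.compile(r"^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$")
TAXON_ID_RE = re.compile(r"^[0-9]+$")


def sample_format_ok(scientific_name, taxon_id):
    """True if both values match their patterns (format only, no lookup)."""
    return (
        SCIENTIFIC_NAME_RE.fullmatch(scientific_name) is not None
        and TAXON_ID_RE.fullmatch(taxon_id) is not None
    )
```

Because the name pattern allows only letters and single spaces, names containing digits, hyphens or periods would fail this check.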
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
Specimen ID Required
Name specimen_id
Description A unique alphanumeric identifier for this specimen
Example SPEC001
Reference #
Namespace ei:specimen_id
Sample ID Required
Name sample_id
Description A unique alphanumeric identifier for this sample
Example SAMP001
Reference #
Namespace ei:sample_id
Study Component ID Required
Name study_component_id
Description A unique alphanumeric identifier for the study component
Example STUDYCOMP001
Reference #
Namespace ei:study_component_id
Sample Preparation Required
Name sample_preparation
Description How the sample was prepared for imaging.
Example Cells were cultured on poly-L-lysine treated coverslips. Culture media was aspirated, and coverslips were washed once with PBS. Cells were fixed by incubating for 10 min with 4 % formaldehyde/PBS, washed twice with PBS, and permeabilized by incubating (>3 h, -20°C) in 70 % ethanol. Cells were rehydrated by incubating (5 min, RT) with FISH wash buffer (10 % formamide, 2x SSC). For hybridization, coverslips were placed cell-coated side down on a 48μl drop containing 100 nM Quasar570-labelled probes complementary to one of REV-ERBα, CRY2, or TP53 transcripts (Biosearch Technologies) (see Table S6 for probe sequences), 0.1 g/ml dextran sulfate, 1 mg/ml E. coli tRNA, 2 mM VRC, 20 μg/ml BSA, 2x SSC, 10 % formamide and incubated (37°C, 20 h) in a sealed parafilm chamber. Coverslips were twice incubated (37°C, 30 min) in pre-warmed FISH wash buffer, then in PBS containing 0.5 μg/ml 4’,6-diamidino-2-phenylindole (DAPI) (5 min, RT), washed twice with PBS, dipped in water, air-dried, placed cell-coated side down on a drop of ProLong Diamond Antifade Mountant (Life Technologies), allowed to polymerize for 24 h in the dark and then sealed with nail varnish.
Reference #
Namespace rembi:sample_preparation
Growth Protocol
Name growth_protocol
Description How the specimen was grown, e.g. cell line cultures, crosses or plant growth.
Example Cells grown on coverslips were fixed in ice-cold methanol at -20 °C for 10 min. After blocking in 0.2% gelatine from cold-water fish (Sigma) in PBS (PBS/FSG) for 15 min, coverslips were incubated with primary antibodies in blocking solution for 1 h. Following washes with 0.2% PBS/FSG, the cells were incubated with a 1:500 dilution of secondary antibodies for 1 h (donkey anti-mouse/rabbit/goat/sheep conjugated to Alexa 488 or Alexa 594; Molecular Probes or donkey anti-mouse conjugated to DyLight 405, Jackson ImmunoResearch). The cells were counterstained with 1 μg ml⁻¹ Hoechst 33342 (Sigma) to visualize chromatin. After washing with 0.2% PBS/FSG, the coverslips were mounted on glass slides by inverting them into mounting solution (ProLong Gold antifade, Molecular Probes). The samples were allowed to cure for 24-48 h.
Reference #
Namespace rembi:growth_protocol
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
Image Acquisition ID Required
Name image_acquisition_id
Description A unique alphanumeric identifier for the image acquisition
Example IMGACQ001
Reference #
Namespace ei:image_acquisition_id
Specimen ID Required
Name specimen_id
Description A unique alphanumeric identifier for this specimen
Example SPEC001
Reference #
Namespace ei:specimen_id
Image Method Required
Name image_method
Description What method was used to capture images.
Example secondary_electron imaging
Reference FBbi:00000222
Namespace ei:image_method
Imaging Instrument Required
Name imaging_instrument
Description Description of the instrument used to capture the images.
Example DeltaVision OMX V3 Blaze system (GE Healthcare) equipped with a 60x/1.42 NA PlanApo oil immersion objective (Olympus), pco.edge 5.5 sCMOS cameras (PCO) and 405, 488, 593 and 640 nm lasers
Reference #
Namespace rembi:imaging_instrument
Image Acquisition Parameters Required
Name image_acquisition_parameters
Description How the images were acquired, including instrument settings/parameters.
Example Embryos were imaged on a Luxendo MuVi SPIM light-sheet microscope, using 30x magnification setting on the Nikon 10x/0.3 water objective. The 488 nm laser was used to image nuclei (His-GFP), and the 561 nm laser was used to image transcriptional dots (MCP-mCherry), both at 5% laser power. Exposure time for the green channel was 55 ms and exposure for the red channel was 70 ms. The line illumination tool was used to improve background levels and was set to 40 pixels.
Reference #
Namespace rembi:image_acquisition_parameters
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
Image Analysis ID Required
Name image_analysis_id
Description A unique alphanumeric identifier for the image analysis
Example IMGANAL001
Reference #
Namespace ei:image_analysis_id
Study Component ID Required
Name study_component_id
Description A unique alphanumeric identifier for the study component
Example STUDYCOMP001
Reference #
Namespace ei:study_component_id
Analysis Overview Required
Name analysis_overview
Description How image analysis was carried out.
Example Each 3D-SIM image contained one nucleus (in a small number of cases multiple nuclei were present, which did not affect the analysis). The image analysis pipeline contained six main steps: bivalent skeleton tracing, trace fluorescence intensity quantification, HEI10 peak detection, HEI10 foci identification, HEI10 foci intensity quantification, and total bivalent intensity quantification. Note that the normalization steps used for foci identification differ from those used for foci intensity quantification; the former was intended to robustly identify foci from noisy traces, whilst the latter was used to carefully quantify foci HEI10 levels.
Reference #
Namespace rembi:analysis_overview
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
Image Correlation ID Required
Name image_correlation_id
Description A unique alphanumeric identifier for the image correlation
Example IMGCORR001
Reference #
Namespace ei:image_correlation_id
Image Analysis ID Required
Name image_analysis_id
Description A unique alphanumeric identifier for the image analysis
Example IMGANAL001
Reference #
Namespace ei:image_analysis_id
Spatial and Temporal Alignment Required
Name spatial_and_temporal_alignment
Description Method used to correlate images from different modalities (e.g. manual overlay, alignment algorithm etc)
Example Alignment algorithm
Reference #
Namespace rembi:spatial_and_temporal_alignment
Fiducials Used Required
Name fiducials_used
Description Features from correlated datasets used for colocalisation
Example Fluorescent bead markers
Reference #
Namespace rembi:fiducials_used
Transformation Matrix or Other Information Required
Name transformation_matrix
Description Correlation transformations
Example Translation and rotation matrix applied using ImageJ plugin
Reference #
Namespace rembi:transformation_matrix
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example FILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:file_id
Study Component ID Required
Name study_component_id
Description A unique alphanumeric identifier for the study component
Example STUDYCOMP001
Reference #
Namespace ei:study_component_id
Annotation ID Required
Name annotation_id
Description A unique alphanumeric identifier for the image annotation record.
Example ANNOT001
Reference #
Namespace rembi:annotation_id
Image File name Required
Name source_image_id
Description The file name of the image, including the extension. Common extensions include tiff, jpeg, png, gif, bmp and ome-tiff.
Example file001.png
Reference #
Namespace rembi:source_image_id
Transformations
Name transformations
Description Any preprocessing or transformations applied to the image.
Example z-stack flattening
Reference #
Namespace rembi:transformations
Spatial Information
Name spatial_information
Description Spatial resolution, scale, or coordinate info related to the image.
Example pixel_size=0.5µm
Reference #
Namespace rembi:spatial_information
Annotation Creation Time
Name annotation_creation_time
Description Timestamp of when the annotation was created.
Example 2025-05-15T14:32:00Z
Reference #
Regex ^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$
Namespace rembi:annotation_creation_time
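The annotation_creation_time pattern above checks only the shape of the timestamp, so a value such as month 13 would still match it. The Python sketch below combines the pattern with datetime parsing to catch such cases; it assumes the trailing Z denotes UTC, and the function name parse_annotation_creation_time is hypothetical.

```python
import re
from datetime import datetime, timezone

# Pattern copied from the Regex row for annotation_creation_time.
TIMESTAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z$")


def parse_annotation_creation_time(value):
    """Return a timezone-aware UTC datetime, or raise ValueError."""
    if TIMESTAMP_RE.fullmatch(value) is None:
        raise ValueError(f"timestamp does not match the expected pattern: {value!r}")
    # strptime rejects impossible dates and times that the pattern lets through.
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)
```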
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Title Required
Name title
Description A name given to the study or project. Project title should be fewer than 30 words, such as a title of a grant proposal or a publication.
Example Spatial Transcriptomics FISH of Human Lung Tissue
Reference http://purl.org/dc/terms/title
Namespace dcterms:title
Workflow
Name workflow
Description The workflow or protocol followed during the study.
Example Spatial Transcriptomics
Reference #
Namespace ei:workflow
Allowed Values Laser microdissection Laser microdissection, Culturing Laser microdissection, Culturing, Sequencing Laser microdissection, Sequencing Microfluidics, Facs, Culturing Microfluidics, Facs, Culturing, Sequencing Microfluidics, Facs, Sequencing Spatial Transcriptomics
Licence
Name licence
Description Specifies the terms under which the data associated with the study can be used, shared, or reused. It informs users how they may legally reference, distribute, or build upon the study. Common licenses include Creative Commons (e.g., CC BY 4.0), which require attribution to the original authors when the data is cited or reused.
Example MIT
Reference #
Namespace ei:licence
Allowed Values Apache-2.0 CC-BY-4.0 CC-BY-SA-4.0 CC0-1.0 GPL-3.0-or-later MIT
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Orcid ID
Name orcid_id
Description A 16-digit number that uniquely identifies a researcher.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{4}$
Namespace ei:orcid_id
First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace schema.org:email
Affiliation or Institution Required
Name affiliation
Description An organisation or institution that this person is associated with.
Example University of Liverpool
Reference https://schema.org/affiliation
Namespace schema.org:affiliation
Funder
Name funder
Description A person or organization that supports (sponsors) something through some kind of financial contribution.
Example BBSRC
Reference https://schema.org/funder
Namespace schema.org:funder
Grant Award
Name funding
Description A grant that directly or indirectly provides funding or sponsorship for the person to conduct the study.
Example GRAK3489
Reference https://schema.org/funding
Namespace schema.org:funding
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique alphanumeric reference or identifier for the sample. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMP001
Reference #
Namespace ei:sample_id
Scientific Name or Organism
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ontology:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must be accurately matched to the provided scientificName to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ontology:taxon_id
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ontology:biosampleAccession
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Imaging Protocol ID Required
Name imaging_protocol_id
Description A unique alphanumeric identifier for the imaging protocol.
Example IMGPRO001
Reference #
Namespace ei:imaging_protocol_id
Platform Required
Name platform
Description The platform used to isolate the cells.
Example Illumina NovaSeq
Reference #
Namespace ei:platform
Instrument Required
Name instrument
Description The instrument used to isolate the cells.
Example Illumina NovaSeq 6000
Reference #
Namespace ei:instrument
Target Probe Code Required
Name target_probe_code
Description The type of probes used to detect and quantify specific RNA molecules in their native spatial context within a tissue or cell.
Example Oligo-dT
Reference #
Namespace ei:target_probe_code
Section Thickness (µm)
Name section_thickness_µm
Description The thickness of the tissue section in micrometres.
Example 10
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:section_thickness_µm
Section Thickness Measurement Method
Name section_thickness_measurement_method
Description The method used to measure tissue section thickness.
Example Microtome
Reference #
Namespace ei:section_thickness_measurement_method
Section Thickness Temperature
Name section_thickness_temperature
Description The temperature, in degrees Celsius, at which the section was made.
Example 22
Reference #
Regex ^-?\d+(\.\d+)?$
Namespace ei:section_thickness_temperature
Is Pathological
Name is_pathological
Description A quality inhering in a bearer by virtue of the bearer's being abnormal and having a destructive effect on living tissue.
Example No
Reference #
Namespace ei:is_pathological
Allowed Values No Yes
Photobleaching Duration In Hours
Name photobleaching_duration_in_hours
Description The duration of photobleaching in hours
Example 2
Reference #
Regex ^\d+$
Namespace ei:photobleaching_duration_in_hours
Clearing with ProteinaseK Required
Name clearing_with_proteinasek
Description The duration of clearing at 47°C with Proteinase K.
Example 24 hrs
Reference #
Regex ^\d+(\.\d+)?\s*(hrs?|days?|mins?|seconds?)$
Namespace ei:clearing_with_proteinasek
Clearing without ProteinaseK Required
Name clearing_without_proteinasek
Description The duration of tissue clearing at 37°C without Proteinase K.
Example 4.5 days
Reference #
Regex ^\d+(\.\d+)?\s*(hrs?|days?|mins?|seconds?)$
Namespace ei:clearing_without_proteinasek
Instrument User Guide Required
Name instrument_user_guide
Description The user guide for the instrument used.
Example User Guide
Reference #
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ei:instrument_user_guide
Instrument User Guide Revision Required
Name instrument_user_guide_revision
Description The revision of the instrument user guide.
Example 1.2
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:instrument_user_guide_revision
Sample Preparation Guide Required
Name sample_preparation_guide
Description The guide used for sample preparation.
Example example_guide_v1.0.pdf
Reference #
Regex ^[A-Za-z0-9._-]*[a-z]+$
Namespace ei:sample_preparation_guide
Sample Preparation Guide Revision Required
Name sample_preparation_guide_revision
Description The revision of the sample preparation guide.
Example 1.0
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:sample_preparation_guide_revision
Deviations From Official Protocol Required
Name deviations_from_official_protocol
Description Any deviations from the official protocol. Separate individual deviations with '|'.
Example Temperature exceeded 25°C during storage | Sample handling delayed by 2 hours
Reference #
Namespace ei:deviations_from_official_protocol
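Two fields in the imaging protocol table above carry structure inside a single string: the clearing durations (a number followed by a unit) and the '|'-separated deviations. The Python sketch below shows one way to read them. The duration pattern is the one from the table with capture groups added, which is an adaptation rather than the schema's own regex, and the function names are hypothetical.

```python
import re

# The table's duration pattern, with capture groups added for number and unit.
DURATION_RE = re.compile(r"^(\d+(?:\.\d+)?)\s*(hrs?|days?|mins?|seconds?)$")

# Seconds per unit, keyed by the unit with any trailing "s" removed.
SECONDS_PER_UNIT = {"second": 1, "min": 60, "hr": 3600, "day": 86400}


def duration_to_seconds(value):
    """Convert a value such as '24 hrs' or '4.5 days' to seconds."""
    match = DURATION_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"not a recognised duration: {value!r}")
    amount, unit = match.groups()
    return float(amount) * SECONDS_PER_UNIT[unit.rstrip("s")]


def split_deviations(value):
    """Split a '|'-separated deviations string into trimmed entries."""
    return [part.strip() for part in value.split("|") if part.strip()]
```

Converting to a single unit makes the two clearing durations comparable even when one is recorded in hours and the other in days.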
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example FILE001
Reference #
Namespace ei:file_id
Imaging Protocol ID Required
Name imaging_protocol_id
Description A unique alphanumeric identifier for the imaging protocol.
Example IMGPRO001
Reference #
Namespace ei:imaging_protocol_id
File Name Required
Name file_name
Description A file name is used to uniquely identify a data file related to the study. Common extensions include tiff, jpeg, png, gif, bmp and ome-tiff.
Example file001.tiff
Reference #
Namespace ei:file_name
File Type Required
Name file_type
Description A file type is a name given to a specific kind of file. Common file types are tiff, jpeg, png, gif, bmp and ome-tiff etc.
Example tiff
Reference #
Namespace ei:file_type
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Project Name Required
Name project_name
Description Official name of the study or project. Project title should be fewer than 30 words, such as a title of a grant proposal or a publication.
Example Spatial Transcriptomics FISH of Human Lung Tissue
Reference https://w3id.org/mixs/0000092
Namespace mixs:project_name
Workflow
Name workflow
Description The workflow or protocol followed during the study.
Example Spatial Transcriptomics
Reference #
Namespace ei:workflow
Allowed Values Laser microdissection Laser microdissection, Culturing Laser microdissection, Culturing, Sequencing Laser microdissection, Sequencing Microfluidics, Facs, Culturing Microfluidics, Facs, Culturing, Sequencing Microfluidics, Facs, Sequencing Spatial Transcriptomics
Licence
Name licence
Description Specifies the terms under which the data associated with the study can be used, shared, or reused. It informs users how they may legally reference, distribute, or build upon the study. Common licenses include Creative Commons (e.g., CC BY 4.0), which require attribution to the original authors when the data is cited or reused.
Example MIT
Reference #
Namespace ei:licence
Allowed Values Apache-2.0 CC-BY-4.0 CC-BY-SA-4.0 CC0-1.0 GPL-3.0-or-later MIT
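As an illustration, a submission tool could check a licence value against the allowed values listed above. This is a minimal sketch: the function name `is_allowed_licence` is hypothetical, and exact, case-sensitive matching is an assumption rather than documented behaviour.

```python
# Allowed values copied from the Licence field above.
ALLOWED_LICENCES = frozenset({
    "Apache-2.0",
    "CC-BY-4.0",
    "CC-BY-SA-4.0",
    "CC0-1.0",
    "GPL-3.0-or-later",
    "MIT",
})


def is_allowed_licence(value: str) -> bool:
    """Return True if value exactly matches one of the allowed identifiers."""
    # Assumption: matching is exact and case-sensitive.
    return value in ALLOWED_LICENCES
```

Under that assumption, `MIT` is accepted while the longer form `MIT License` is not.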
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Orcid ID
Name orcid_id
Description A 16-digit number that uniquely identifies a researcher.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{4}$
Namespace ei:orcid_id
First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
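The First Name pattern has consequences that are easy to miss: it accepts only ASCII letters, needs at least two characters, and requires the value to end in a lowercase letter. The sketch below applies the pattern as written; the helper name `matches_given_name` is hypothetical.

```python
import re

# Pattern copied from the First Name field above.
GIVEN_NAME = re.compile(r"^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$")


def matches_given_name(value: str) -> bool:
    """Return True if the whole value matches the First Name pattern."""
    return GIVEN_NAME.fullmatch(value) is not None


# Values the pattern accepts and rejects, for orientation.
accepted = ["Jane", "Mary-Ann", "Anna Maria"]
rejected = ["J", "JANE", "José", "O'Neil"]
```

Names with accented letters or apostrophes are rejected by this pattern, which is worth knowing before preparing a submission.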
Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace schema.org:email
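The Email Address pattern combines two negative lookaheads (no consecutive dots, no consecutive hyphens) with a basic local-part and domain match. A minimal sketch of applying it follows; `matches_email` is a hypothetical helper name, and the check is purely syntactic.

```python
import re

# Pattern copied from the Email Address field above.
EMAIL = re.compile(
    r"^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$"
)


def matches_email(value: str) -> bool:
    """Return True if the whole value matches the Email Address pattern."""
    return EMAIL.fullmatch(value) is not None
```

For example, `jane.doe@example.com` matches, while `jane..doe@example.com` fails the first lookahead and `jane.doe@example` fails because it has no top-level domain.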
Affiliation or Institution Required
Name affiliation
Description An organisation or institution that this person is associated with.
Example University of Liverpool
Reference https://schema.org/affiliation
Namespace schema.org:affiliation
Funder
Name funder
Description A person or organization that supports (sponsors) something through some kind of financial contribution.
Example BBSRC
Reference https://schema.org/funder
Namespace schema.org:funder
Grant Award
Name funding
Description A grant that directly or indirectly provides funding or sponsorship for the person to conduct the study.
Example GRAK3489
Reference https://schema.org/funding
Namespace schema.org:funding
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study this record refers to
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique alphanumeric reference or identifier for the sample. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMP001
Reference #
Namespace ei:sample_id
Scientific Name or Organism
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ontology:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must be accurately matched to the provided scientificName to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ontology:taxon_id
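Because the Taxon ID must correspond to the scientific name, a consistency check between the two fields can catch mismatches early. The sketch below uses a small in-memory lookup table purely for illustration; `KNOWN_TAXA` and `taxon_matches_name` are hypothetical, and a real check would consult a taxonomy database such as NCBI Taxonomy.

```python
# Hypothetical local lookup table (NCBI Taxonomy IDs mapped to names).
# 8036 / Salvelinus alpinus is the example pair used in the fields above.
KNOWN_TAXA = {
    "8036": "Salvelinus alpinus",
    "9606": "Homo sapiens",
    "3702": "Arabidopsis thaliana",
}


def taxon_matches_name(taxon_id: str, scientific_name: str) -> bool:
    """Return True if taxon_id is known and maps to scientific_name."""
    expected = KNOWN_TAXA.get(taxon_id)
    return expected is not None and expected == scientific_name
```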
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ontology:biosampleAccession
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Imaging Protocol ID Required
Name imaging_protocol_id
Description A unique alphanumeric identifier for the imaging protocol.
Example IMGPRO001
Reference #
Namespace ei:imaging_protocol_id
Platform Required
Name platform
Description The platform used to isolate the cells.
Example Illumina NovaSeq
Reference #
Namespace ei:platform
Instrument Required
Name instrument
Description The instrument used to isolate the cells.
Example Illumina NovaSeq 6000
Reference #
Namespace ei:instrument
Target Probe Code Required
Name target_probe_code
Description The type of probes used to detect and quantify specific RNA molecules in their native spatial context within a tissue or cell.
Example Oligo-dT
Reference #
Namespace ei:target_probe_code
Section Thickness (µm)
Name section_thickness_µm
Description The thickness of the tissue section in micrometres.
Example 10
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:section_thickness_µm
Section Thickness Measurement Method
Name section_thickness_measurement_method
Description The method used to measure tissue section thickness.
Example Microtome
Reference #
Namespace ei:section_thickness_measurement_method
Section Thickness Temperature
Name section_thickness_temperature
Description The temperature, in degrees Celsius, at which the section was made.
Example 22
Reference #
Regex ^-?\d+(\.\d+)?$
Namespace ei:section_thickness_temperature
Is Pathological
Name is_pathological
Description Whether the tissue is pathological, that is, abnormal and having a destructive effect on living tissue.
Example No
Reference #
Namespace ei:is_pathological
Allowed Values No Yes
Photobleaching Duration In Hours
Name photobleaching_duration_in_hours
Description The duration of photobleaching in hours
Example 2
Reference #
Regex ^\d+$
Namespace ei:photobleaching_duration_in_hours
Clearing with ProteinaseK Required
Name clearing_with_proteinasek
Description The duration of clearing at 47°C with Proteinase K.
Example 24 hrs
Reference #
Regex ^\d+(\.\d+)?\s*(hrs?|days?|mins?|seconds?)$
Namespace ei:clearing_with_proteinasek
Clearing without ProteinaseK Required
Name clearing_without_proteinasek
Description The duration of tissue clearing at 37°C without Proteinase K.
Example 4.5 days
Reference #
Regex ^\d+(\.\d+)?\s*(hrs?|days?|mins?|seconds?)$
Namespace ei:clearing_without_proteinasek
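The two clearing fields accept a number followed by a unit (`hr`, `hrs`, `day`, `days`, `min`, `mins`, `second`, `seconds`). The sketch below shows one way to turn such a value into a `timedelta`. It uses the same pattern with the capture groups adjusted; `parse_duration` is a hypothetical helper, not part of the schema.

```python
import re
from datetime import timedelta

# Same pattern as the Regex above, with groups arranged to capture
# the numeric amount and the unit separately.
DURATION = re.compile(r"^(\d+(?:\.\d+)?)\s*(hrs?|days?|mins?|seconds?)$")

# Maps the singular unit spelling to the matching timedelta keyword.
_UNIT_TO_KWARG = {
    "hr": "hours",
    "day": "days",
    "min": "minutes",
    "second": "seconds",
}


def parse_duration(value: str) -> timedelta:
    """Parse a value such as '24 hrs' or '4.5 days' into a timedelta."""
    match = DURATION.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid duration: {value!r}")
    amount = float(match.group(1))
    unit = match.group(2).rstrip("s")  # 'hrs' -> 'hr', 'days' -> 'day'
    return timedelta(**{_UNIT_TO_KWARG[unit]: amount})
```

Note that spelled-out units such as `2 hours` do not match the pattern and raise `ValueError` here.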
Instrument User Guide Required
Name instrument_user_guide
Description The user guide for the instrument used.
Example User Guide
Reference #
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ei:instrument_user_guide
Instrument User Guide Revision Required
Name instrument_user_guide_revision
Description The revision of the instrument user guide.
Example 1.2
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:instrument_user_guide_revision
Sample Preparation Guide Required
Name sample_preparation_guide
Description The guide used for sample preparation.
Example example_guide_v1.0.pdf
Reference #
Regex ^[A-Za-z0-9._-]*[a-z]+$
Namespace ei:sample_preparation_guide
Sample Preparation Guide Revision Required
Name sample_preparation_guide_revision
Description The revision of the sample preparation guide.
Example 1.0
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:sample_preparation_guide_revision
Deviations From Official Protocol Required
Name deviations_from_official_protocol
Description Any deviations from the official protocol. Separate individual deviations with '|'.
Example Temperature exceeded 25°C during storage | Sample handling delayed by 2 hours
Reference #
Namespace ei:deviations_from_official_protocol
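Since individual deviations are separated with '|', a consumer of this field can recover them with a simple split. A minimal sketch, assuming surrounding whitespace is not significant; `split_deviations` is a hypothetical helper name.

```python
def split_deviations(value: str) -> list[str]:
    """Split a deviations value on '|' and trim whitespace around each item."""
    # Assumption: empty items (e.g. from a trailing '|') are ignored.
    return [part.strip() for part in value.split("|") if part.strip()]


example = (
    "Temperature exceeded 25°C during storage | "
    "Sample handling delayed by 2 hours"
)
deviations = split_deviations(example)
```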
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example FILE001
Reference #
Namespace ei:file_id
Imaging Protocol ID Required
Name imaging_protocol_id
Description A unique alphanumeric identifier for the imaging protocol.
Example IMGPRO001
Reference #
Namespace ei:imaging_protocol_id
File Name Required
Name file_name
Description The name that uniquely identifies a data file related to the study. File names commonly end in extensions such as tiff, jpeg, png, gif, bmp and ome-tiff.
Example file001.tiff
Reference #
Namespace ei:file_name
File Type Required
Name file_type
Description The format of the data file. Common file types include tiff, jpeg, png, gif, bmp and ome-tiff.
Example tiff
Reference #
Namespace ei:file_type
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Title
Name title
Description A name given to the study or project. The title should be fewer than 30 words, for example the title of a grant proposal or a publication.
Example Spatial Transcriptomics FISH of Human Lung Tissue
Reference http://purl.org/dc/terms/title
Namespace dcterms:title
Workflow
Name workflow
Description The workflow or protocol followed during the study.
Example Spatial Transcriptomics
Reference #
Namespace ei:workflow
Allowed Values Laser microdissection Laser microdissection, Culturing Laser microdissection, Culturing, Sequencing Laser microdissection, Sequencing Microfluidics, Facs, Culturing Microfluidics, Facs, Culturing, Sequencing Microfluidics, Facs, Sequencing Spatial Transcriptomics
Licence
Name licence
Description Specifies the terms under which the data associated with the study can be used, shared, or reused. It informs users how they may legally reference, distribute, or build upon the study. Common choices include Creative Commons licences such as CC BY 4.0, which requires attribution to the original authors when the data are cited or reused.
Example MIT
Reference #
Namespace ei:licence
Allowed Values Apache-2.0 CC-BY-4.0 CC-BY-SA-4.0 CC0-1.0 GPL-3.0-or-later MIT
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Orcid ID
Name orcid_id
Description A 16-digit number that uniquely identifies a researcher.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{4}$
Namespace ei:orcid_id
First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace schema.org:email
Affiliation or Institution Required
Name affiliation
Description An organisation or institution that this person is associated with.
Example University of Liverpool
Reference https://schema.org/affiliation
Namespace schema.org:affiliation
Funder
Name funder
Description A person or organization that supports (sponsors) something through some kind of financial contribution.
Example BBSRC
Reference https://schema.org/funder
Namespace schema.org:funder
Grant Award
Name funding
Description A grant that directly or indirectly provides funding or sponsorship for the person to conduct the study.
Example GRAK3489
Reference https://schema.org/funding
Namespace schema.org:funding
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study this record refers to
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique alphanumeric reference or identifier for the sample. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMP001
Reference #
Namespace ei:sample_id
Scientific Name or Organism
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ontology:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must be accurately matched to the provided scientificName to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ontology:taxon_id
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ontology:biosampleAccession
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Imaging Protocol ID Required
Name imaging_protocol_id
Description A unique alphanumeric identifier for the imaging protocol.
Example IMGPRO001
Reference #
Namespace ei:imaging_protocol_id
Platform Required
Name platform
Description The platform used to isolate the cells.
Example Illumina NovaSeq
Reference #
Namespace ei:platform
Instrument Required
Name instrument
Description The instrument used to isolate the cells.
Example Illumina NovaSeq 6000
Reference #
Namespace ei:instrument
Target Probe Code Required
Name target_probe_code
Description The type of probes used to detect and quantify specific RNA molecules in their native spatial context within a tissue or cell.
Example Oligo-dT
Reference #
Namespace ei:target_probe_code
Section Thickness (µm)
Name section_thickness_µm
Description The thickness of the tissue section in micrometres.
Example 10
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:section_thickness_µm
Section Thickness Measurement Method
Name section_thickness_measurement_method
Description The method used to measure tissue section thickness.
Example Microtome
Reference #
Namespace ei:section_thickness_measurement_method
Section Thickness Temperature
Name section_thickness_temperature
Description The temperature, in degrees Celsius, at which the section was made.
Example 22
Reference #
Regex ^-?\d+(\.\d+)?$
Namespace ei:section_thickness_temperature
Is Pathological
Name is_pathological
Description Whether the tissue is pathological, that is, abnormal and having a destructive effect on living tissue.
Example No
Reference #
Namespace ei:is_pathological
Allowed Values No Yes
Photobleaching Duration In Hours
Name photobleaching_duration_in_hours
Description The duration of photobleaching in hours
Example 2
Reference #
Regex ^\d+$
Namespace ei:photobleaching_duration_in_hours
Clearing with ProteinaseK Required
Name clearing_with_proteinasek
Description The duration of clearing at 47°C with Proteinase K.
Example 24 hrs
Reference #
Regex ^\d+(\.\d+)?\s*(hrs?|days?|mins?|seconds?)$
Namespace ei:clearing_with_proteinasek
Clearing without ProteinaseK Required
Name clearing_without_proteinasek
Description The duration of tissue clearing at 37°C without Proteinase K.
Example 4.5 days
Reference #
Regex ^\d+(\.\d+)?\s*(hrs?|days?|mins?|seconds?)$
Namespace ei:clearing_without_proteinasek
Instrument User Guide Required
Name instrument_user_guide
Description The user guide for the instrument used.
Example User Guide
Reference #
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ei:instrument_user_guide
Instrument User Guide Revision Required
Name instrument_user_guide_revision
Description The revision of the instrument user guide.
Example 1.2
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:instrument_user_guide_revision
Sample Preparation Guide Required
Name sample_preparation_guide
Description The guide used for sample preparation.
Example example_guide_v1.0.pdf
Reference #
Regex ^[A-Za-z0-9._-]*[a-z]+$
Namespace ei:sample_preparation_guide
Sample Preparation Guide Revision Required
Name sample_preparation_guide_revision
Description The revision of the sample preparation guide.
Example 1.0
Reference #
Regex ^\d+(\.\d+)?$
Namespace ei:sample_preparation_guide_revision
Deviations From Official Protocol Required
Name deviations_from_official_protocol
Description Any deviations from the official protocol. Separate individual deviations with '|'.
Example Temperature exceeded 25°C during storage | Sample handling delayed by 2 hours
Reference #
Namespace ei:deviations_from_official_protocol
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example FILE001
Reference #
Namespace ei:file_id
Imaging Protocol ID Required
Name imaging_protocol_id
Description A unique alphanumeric identifier for the imaging protocol.
Example IMGPRO001
Reference #
Namespace ei:imaging_protocol_id
File Name Required
Name file_name
Description The name that uniquely identifies a data file related to the study. File names commonly end in extensions such as tiff, jpeg, png, gif, bmp and ome-tiff.
Example file001.tiff
Reference #
Namespace ei:file_name
File Type Required
Name file_type
Description The format of the data file. Common file types include tiff, jpeg, png, gif, bmp and ome-tiff.
Example tiff
Reference #
Namespace ei:file_type
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Title Required
Name title
Description A name given to the study or project. The title should be fewer than 30 words, for example the title of a grant proposal or a publication.
Example Study of single cells in the human body
Reference http://purl.org/dc/terms/title
Namespace dcterms:title
Description Required
Name description
Description A detailed description of the project, including its research goals and experimental approach. The description should be fewer than 300 words, for example an abstract from a grant application or publication.
Example This project explores the intricate details of single cells in the human body, focusing on their structure, function, and behaviour. By studying individual cells, it aims to uncover how they contribute to overall health, disease progression, and human biology. This research can provide deeper insights into cellular processes, paving the way for advancements in medical treatments and personalised medicine.
Reference http://purl.org/dc/terms/description
Namespace dcterms:description
Bibliographic Citation Required
Name bibliographicCitation
Description A citation for the study resource, following a standard format.
Example Doe J., et al. (2024). Single Cell Transcriptomic Analysis of Human Liver Cells. Journal of Cellular Biology.
Reference http://purl.org/dc/terms/bibliographicCitation
Namespace dcterms:bibliographicCitation
Created Required
Name created
Description The date when the study was created or registered.
Example 2024-10-14
Reference http://purl.org/dc/terms/created
Regex ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Namespace dcterms:created
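The Created pattern checks the shape of the date but not whether the day exists in the given month, so a value like `2024-02-31` passes the regex. The sketch below pairs the pattern with a calendar check from the standard library; `is_valid_created` is a hypothetical helper, and the extra calendar check is an addition, not something the schema states.

```python
import re
from datetime import date

# Pattern copied from the Created field above.
CREATED = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")


def is_valid_created(value: str) -> bool:
    """Return True if value matches the pattern and is a real calendar date."""
    if CREATED.fullmatch(value) is None:
        return False
    try:
        date.fromisoformat(value)  # rejects e.g. 31 February
    except ValueError:
        return False
    return True
```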
Workflow
Name workflow
Description The workflow or protocol followed during the study.
Example Laser microdissection
Reference #
Namespace ei:workflow
Allowed Values Laser microdissection Laser microdissection, Culturing Laser microdissection, Culturing, Sequencing Laser microdissection, Sequencing Microfluidics, Facs, Culturing Microfluidics, Facs, Culturing, Sequencing Microfluidics, Facs, Sequencing Spatial Transcriptomics
Technology Required
Name technology
Description The sorting or visualisation technology used.
Example Vizgen
Reference #
Namespace ei:technology
Licence
Name licence
Description Specifies the terms under which the data associated with the study can be used, shared, or reused. It informs users how they may legally reference, distribute, or build upon the study. Common choices include Creative Commons licences such as CC BY 4.0, which requires attribution to the original authors when the data are cited or reused.
Example MIT
Reference #
Namespace ei:licence
Allowed Values Apache-2.0 CC-BY-4.0 CC-BY-SA-4.0 CC0-1.0 GPL-3.0-or-later MIT
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Orcid ID
Name orcid_id
Description A 16-character identifier (digits, with an optional final X) that uniquely identifies a researcher.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$
Namespace ei:orcid_id
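The pattern above checks only the shape of an ORCID iD. ORCID iDs also carry a check character computed with ISO 7064 MOD 11-2, which is why the last position may be `X`. The sketch below combines both checks; the checksum step goes beyond what this schema specifies, and the function names are hypothetical.

```python
import re

# Pattern copied from the Orcid ID field above.
ORCID = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Return True if value matches the pattern and its check character is correct."""
    if ORCID.fullmatch(value) is None:
        return False
    compact = value.replace("-", "")
    return orcid_check_char(compact[:15]) == compact[15]
```

The example value `0000-1234-5678-9012` passes both checks.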
First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace schema.org:email
Affiliation or Institution Required
Name affiliation
Description An organisation or institution that this person is associated with.
Example University of Liverpool
Reference https://schema.org/affiliation
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace schema.org:affiliation
Funder
Name funder
Description A person or organization that supports (sponsors) something through some kind of financial contribution.
Example BBSRC
Reference https://schema.org/funder
Namespace schema.org:funder
Grant Award
Name funding
Description A grant that directly or indirectly provides funding or sponsorship for the person to conduct the study.
Example GRAK3489
Reference https://schema.org/funding
Regex ^[A-Za-z0-9]+(?: [A-Za-z0-9]+)*$
Namespace schema.org:funding
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study this record refers to
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Scientific Name or Organism
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ontology:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must be accurately matched to the provided scientificName to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ontology:taxon_id
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ontology:biosampleAccession
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
Protocol Name Required
Name protocol_name
Description A descriptive name of the protocol used for single-cell sequencing.
Example 10X Genomics Single Cell 3' Library Prep
Reference #
Namespace ei:protocol_name
Dissociation Description Required
Name dissociation_description
Description A free-text description of the process used to separate cells from tissues or cell aggregates.
Example Tissue was enzymatically dissociated using collagenase for 30 minutes.
Reference #
Namespace ei:dissociation_description
Enrichment Markers
Name enrichment_markers
Description Description of the specificity markers used to isolate cell populations, e.g. 'CD45+'. Please contact FAANG DCC to add more terms.
Example CD45
Reference #
Namespace faang:enrichment_markers
Isolation Kit
Name isolation_kit
Description The kit used to isolate the cells.
Example 10x Nuclei Isolation Kit
Reference #
Namespace ei:isolation_kit
Allowed Values 10x Nuclei Isolation Kit 3' standard throughput kit Custom
Literature Source Reference
Name literature_source_reference
Description Reference to literature sources that describe the protocol or methods used.
Example Doe et al. (2024), 'Single-cell RNA-seq: A comprehensive overview'
Reference #
Namespace ei:literature_source_reference
Protocols IO Reference
Name protocols_io_reference
Description Reference link to protocols.io for additional details on the protocol.
Example https://www.protocols.io/view/sample-protocol-b2ubqesn
Reference #
Regex ^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)+(?: \| https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)*)*$
Namespace ei:protocols_io_reference
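The pattern above allows one or more URLs separated by ` | `. As a rough illustration, the sketch below splits such a value and runs a lightweight check on each part using `urllib.parse` from the standard library instead of the regex itself; both helper names are hypothetical, and the check is deliberately looser than the pattern.

```python
from urllib.parse import urlparse


def split_reference_urls(value: str) -> list[str]:
    """Split a value containing one or more URLs separated by ' | '."""
    return [part.strip() for part in value.split(" | ")]


def looks_like_http_url(url: str) -> bool:
    """Loose check: http(s) scheme and a host name containing a dot."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and "." in parsed.netloc


value = (
    "https://www.protocols.io/view/sample-protocol-b2ubqesn"
    " | https://www.protocols.io/view/another-protocol"  # second URL is made up
)
urls = split_reference_urls(value)
```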
WorkflowHub SOP Reference
Name workflow_hub_sop_reference
Description Reference to the Standard Operating Procedure (SOP) in WorkflowHub.
Example https://workflowhub.eu/works/12345
Reference #
Namespace ei:workflow_hub_sop_reference
Dissociation Protocol Method
Name dissociation_protocol_method
Description The method used to dissociate tissues into single cells.
Example Mechanical and enzymatic dissociation
Reference #
Namespace ei:dissociation_protocol_method
Single Cell Quality Metric
Name single_cell_quality_metric
Description Metrics used to assess the quality of single cells before sequencing.
Example Cell viability percentage
Reference #
Namespace ei:single_cell_quality_metric
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the sample
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample associated with the cell suspension. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
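Each cell suspension carries a Sample ID and a Dissociation Protocol ID, so a submission can be checked for references that point at nothing. The sketch below assumes each sheet has been loaded as a list of dictionaries keyed by field name, and that referenced IDs must be defined in the corresponding sheet; both are assumptions, and `find_dangling_references` is a hypothetical helper.

```python
def find_dangling_references(suspensions, samples, dissociation_protocols):
    """Return (cell_suspension_id, field, value) for each unresolved reference."""
    sample_ids = {row["sample_id"] for row in samples}
    protocol_ids = {
        row["dissociation_protocol_id"] for row in dissociation_protocols
    }
    problems = []
    for row in suspensions:
        if row["sample_id"] not in sample_ids:
            problems.append(
                (row["cell_suspension_id"], "sample_id", row["sample_id"])
            )
        if row["dissociation_protocol_id"] not in protocol_ids:
            problems.append(
                (
                    row["cell_suspension_id"],
                    "dissociation_protocol_id",
                    row["dissociation_protocol_id"],
                )
            )
    return problems


# Illustrative rows using the example values from the fields above.
samples = [{"sample_id": "SAMPLE001"}]
protocols = [{"dissociation_protocol_id": "DISSOC001"}]
suspensions = [
    {
        "cell_suspension_id": "CELLSUSP001",
        "sample_id": "SAMPLE001",
        "dissociation_protocol_id": "DISSOC001",
    },
    {
        "cell_suspension_id": "CELLSUSP002",
        "sample_id": "SAMPLE999",  # not defined in the samples sheet
        "dissociation_protocol_id": "DISSOC001",
    },
]
problems = find_dangling_references(suspensions, samples, protocols)
```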
Suspension Type Required
Name suspension_type
Description The type of suspension used to keep cells in solution during processing.
Example Cell
Reference #
Namespace ei:suspension_type
Allowed Values Cell Nuclei Protoplast
Cell Count
Name cell_count
Description The number of cells in the sequencing library, given as an integer.
Example 10000
Reference #
Regex ^\d+$
Namespace ei:cell_count
Cell Viability
Name cell_viability
Description The percentage of living cells in a sample, indicating the health and quality of cells for RNA-sequencing analysis.
Example 95
Reference #
Namespace ei:cell_viability
Cell Viability Assessment Method
Name cell_viability_assessment_method
Description The method used to evaluate the viability of cells in the sample, often involving staining or flow cytometry techniques.
Example Trypan Blue Exclusion
Reference #
Namespace ei:cell_viability_assessment_method
Cell Size
Name cell_size
Description The size of the cell, typically measured in micrometres.
Example 10
Reference #
Namespace ei:cell_size
Suspension Volume (µL)
Name suspension_volume_µl
Description The volume of the cell suspension in microlitres (µL).
Example 100
Reference #
Namespace ei:suspension_volume_µl
Suspension Concentration Cells Per µL
Name suspension_concentration_cells_per_µl
Description The concentration of cells in the suspension, in cells per microlitre (cells/µL).
Example 1000
Reference #
Namespace ei:suspension_concentration_cells_per_µl
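Assuming the concentration is expressed in cells per µL, as the field name indicates, the number of cells contained in a given volume is the product of the two values. A small sketch using the example values from these fields; `cells_in_volume` is a hypothetical helper.

```python
def cells_in_volume(concentration_cells_per_ul: float, volume_ul: float) -> float:
    """Return the number of cells in volume_ul at the given concentration."""
    return concentration_cells_per_ul * volume_ul


# Example values from the Suspension Concentration and Suspension Volume fields.
total_cells = cells_in_volume(1000, 100)
```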
Suspension Dilution
Name suspension_dilution
Description The dilution factor of the cell suspension.
Example 1:10
Reference #
Namespace ei:suspension_dilution
Loading Volume (µL)
Name loading_volume_µl
Description The volume of the cell suspension, in microlitres (µL), loaded into the single-cell RNA-sequencing system for analysis.
Example 10
Reference #
Regex ^\d+$
Namespace ei:loading_volume_µl
Suspension Dilution Buffer
Name suspension_dilution_buffer
Description A solution used to dilute cell suspensions to a desired concentration, typically prior to loading cells into a device for single-cell RNA sequencing. It helps maintain cell viability and integrity during processing.
Example PBS (Phosphate-buffered saline) with 0.04% BSA (Bovine serum albumin)
Reference #
Namespace ei:suspension_dilution_buffer
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the library preparation.
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Library Preparation Kit Required
Name library_prep_kit
Description Packaged kits (containing adapters, indexes, enzymes, buffers etc.), tailored for specific sequencing workflows, which allow the simplified preparation of sequencing-ready libraries for small genomes, amplicons, and plasmids
Example 10X Genomics Single Cell 3' v3
Reference https://w3id.org/mixs/0001145
Namespace mixs:library_prep_kit
Library Preparation Kit Version Required
Name library_prep_kit_version
Description The version number of the library preparation kit used for sequencing.
Example 2
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Regex ^\d+(\.\d+)?$
Namespace ontology:library_prep_kit_version
Amplification Method
Name amplification_method
Description The method used to amplify the Complementary DNA (cDNA).
Example PCR
Reference #
Namespace ei:amplification_method
cDNA Amplification Cycles
Name cdna_amplification_cycles
Description The number of cycles used during the Complementary DNA (cDNA) amplification process.
Example 12
Reference #
Regex ^\d+$
Namespace ei:cdna_amplification_cycles
Average Size Distribution
Name average_size_distribution
Description The average length of RNA fragments in base pairs (bp) after library preparation, indicating the quality and suitability of the RNA for sequencing.
Example 350
Reference #
Regex ^\d+$
Namespace ei:average_size_distribution
Library Construction Method
Name lib_construction_method
Description The library construction method (including version) that was used.
Example Smart-Seq2
Reference #
Namespace ei:lib_construction_method
Input Molecule
Name input_molecule
Description The specific fraction of biological macromolecule from which the sequencing library is derived.
Example RNA
Reference #
Namespace ei:input_molecule
Primer
Name primer
Description The type of primer used for reverse transcription. This allows users to identify content of the cDNA library input for mRNA.
Example Random
Reference #
Namespace ei:primer
Allowed Values Oligo-dT Random
Primeness Required
Name primeness
Description The end from which the molecule was sequenced.
Example 5'
Reference #
Namespace ei:primeness
Allowed Values 3' 5' Both
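Fields with an Allowed Values line accept only the listed terms. The sketch below shows one possible way to check a value against those lists, using the Primer and Primeness fields above; the dictionary layout and the function name `check_allowed` are assumptions for illustration.

```python
from typing import Optional

# Allowed values copied from the Primer and Primeness fields above.
ALLOWED_VALUES = {
    "primer": {"Oligo-dT", "Random"},
    "primeness": {"3'", "5'", "Both"},
}


def check_allowed(field: str, value: str) -> Optional[str]:
    """Return an error message, or None if the value is allowed."""
    allowed = ALLOWED_VALUES[field]
    if value in allowed:
        return None
    return f"{field}: {value!r} is not one of {sorted(allowed)}"


print(check_allowed("primeness", "5'"))   # None
print(check_allowed("primer", "random"))  # error message: matching is case-sensitive
```

The comparison here is case-sensitive, which is an assumption; a real validator may normalise case before comparing.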
End Bias
Name end_bias
Description The end bias of the library.
Example 3
Reference #
Namespace ei:end_bias
Allowed Values 3 5
Library Strand
Name library_strand
Description The Complementary DNA (cDNA) strand of the library from which the reads were derived: sense (first), antisense (second), both or none.
Example Antisense
Reference #
Namespace ei:library_strand
Allowed Values Antisense Both Sense Unstranded
Spike In Required
Name spike_in
Description External RNA added to the sample as a control to assess technical variability and normalization in RNA-sequencing. State whether spike-in was used.
Example Yes
Reference #
Namespace ei:spike_in
Allowed Values No Yes
Spike Type
Name spike_type
Description The specific type of external RNA used for spiking in, often indicating the source or nature of the control RNA.
Example Synthetic RNA
Reference #
Namespace ei:spike_type
Spike In Dilution Or Concentration
Name spike_in_dilution_or_concentration
Description The final concentration or dilution (for commercial sets) of the spike in mix.
Example 1:1000
Reference #
Namespace ei:spike_in_dilution_or_concentration
i5 Index Required
Name i5_index
Description Barcode sequence used on the i5 adapter during library preparation for identifying samples in multiplexed single-cell RNA-sequencing.
Example ATCACG
Reference #
Namespace ei:i5_index
i7 Index Required
Name i7_index
Description Barcode sequence used on the i7 adapter to distinguish samples in multiplexed sequencing runs.
Example CGATGT
Reference #
Namespace ei:i7_index
Dual or Single Index Required
Name dual_single_index
Description Specifies if both i5 and i7 indices (dual) or only one index (single) was used for sample identification during sequencing.
Example Dual
Reference #
Namespace ei:dual_single_index
Allowed Values Dual Single
i5 Sequence Required
Name i5_sequence
Description The nucleotide sequence of the i5 index used in multiplexing during sequencing.
Example ATCGTAGC
Reference #
Namespace ei:i5_sequence
i7 Sequence Required
Name i7_sequence
Description The specific nucleotide sequence of the i7 index used for a sample.
Example TGCATGCA
Reference #
Namespace ei:i7_sequence
Plate ID
Name plate_id
Description Identifier for the 96-well plate used in sample preparation.
Example PLT001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:plate_id
Well Row
Name well_row
Description The row identifier in a 96-well plate indicating the sample's position.
Example A
Reference #
Namespace ei:well_row
Well Column
Name well_col
Description The column identifier in a 96-well plate indicating the sample's position.
Example 5
Reference #
Regex ^\d+$
Namespace ei:well_col
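The Well Row and Well Column fields together locate a sample on the plate. The sketch below converts the pair into a single position, assuming the standard 96-well layout of 8 rows (A to H) and 12 columns (1 to 12); the row-major numbering and the function name `well_index` are hypothetical choices, not part of the schema.

```python
import re
import string

ROWS = string.ascii_uppercase[:8]  # "ABCDEFGH", assumed 96-well layout
N_COLS = 12                        # assumed 96-well layout


def well_index(well_row: str, well_col: str) -> int:
    """Return the 0-based, row-major index of a well on a 96-well plate."""
    if len(well_row) != 1 or well_row not in ROWS:
        raise ValueError(f"well_row must be one of {ROWS}, got {well_row!r}")
    # The Well Column field is validated with ^\d+$ in the schema.
    if re.fullmatch(r"\d+", well_col) is None:
        raise ValueError(f"well_col must be digits only, got {well_col!r}")
    col = int(well_col)
    if not 1 <= col <= N_COLS:
        raise ValueError(f"well_col must be between 1 and {N_COLS}")
    return ROWS.index(well_row) * N_COLS + (col - 1)


print(well_index("A", "5"))   # 4
print(well_index("H", "12"))  # 95
```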
Cell Phenotype
Name cell_phenotype
Description The cell marker for the Fluorescence-Activated Cell Sorting (FACS) of cells.
Example CD41-
Reference #
Namespace ei:cell_phenotype
Allowed Values CD41+ CD41-
Design description
Name design_description
Description The design of the library including details of how it was constructed.
Reference #
Namespace ei:design_description
Library selection Required
Name library_selection
Description The method used to select for or against, enrich, or screen the material being sequenced.
Example RANDOM PCR
Reference #
Namespace ei:library_selection
Allowed Values 5-methylcytidine antibody CAGE ChIP ChIP-Seq Dnase HMPR Hybrid Selection Inverse rRNA Inverse rRNA selection MBD2 protein methyl-CpG binding domain MDA MF MSLL Mnase Oligo-dT PCR PolyA RACE RANDOM RANDOM PCR RT-PCR Reduced Representation Restriction Digest cDNA cDNA_oligo_dT cDNA_randomPriming other padlock probes capture method repeat fractionation size fractionation unspecified
Library source Required
Name library_source
Description The type of source material that is being sequenced.
Example GENOMIC
Reference #
Namespace ei:library_source
Allowed Values GENOMIC GENOMIC SINGLE CELL METAGENOMIC METATRANSCRIPTOMIC OTHER SYNTHETIC TRANSCRIPTOMIC TRANSCRIPTOMIC SINGLE CELL VIRAL RNA
Library strategy Required
Name library_strategy
Description The sequencing technique intended for this library.
Example RNA-Seq
Reference #
Namespace ei:library_strategy
Allowed Values AMPLICON ATAC-seq Bisulfite-Seq CLONE CLONEEND CTS ChIA-PET ChIP-Seq ChM-Seq DNase-Hypersensitivity EST FAIRE-seq FINISHING FL-cDNA GBS Hi-C MBD-Seq MNase-Seq MRE-Seq MeDIP-Seq NOMe-Seq OTHER POOLCLONE RAD-Seq RIP-Seq RNA-Seq Ribo-Seq SELEX Synthetic-Long-Read Targeted-Capture Tethered Chromatin Conformation Capture Tn-Seq VALIDATION WCS WGA WGS WXS miRNA-Seq ncRNA-Seq snRNA-seq ssRNA-seq
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference https://w3id.org/mixs/0000016
Regex ^[a-zA-Z0-9]+$
Namespace ontology:sequencing_id
Sequencing Platform Name Required
Name sequencing_platform_name
Description The name of the sequencing platform used for the experiment.
Example PacBio
Reference http://purl.obolibrary.org/obo/NCIT_C172274
Namespace ontology:sequencing_platform_name
Sequencing Instrument Model Required
Name sequencing_instrument_model
Description This refers to the machine or platform used for sequencing, with variations in throughput, read lengths, error rates, and application suitability.
Example Illumina NovaSeq 6000
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Namespace ontology:sequencing_instrument_model
Allowed Values 454 GS 454 GS 20 454 GS FLX 454 GS FLX Titanium 454 GS FLX+ 454 GS Junior AB 310 Genetic Analyzer AB 3130 Genetic Analyzer AB 3130xL Genetic Analyzer AB 3500 Genetic Analyzer AB 3500xL Genetic Analyzer AB 3730 Genetic Analyzer AB 3730xL Genetic Analyzer AB 5500 Genetic Analyzer AB 5500xl Genetic Analyzer AB 5500xl-W Genetic Analysis System AB SOLiD 3 Plus System AB SOLiD 4 System AB SOLiD 4hq System AB SOLiD PI System AB SOLiD System AB SOLiD System 2.0 AB SOLiD System 3.0 BGISEQ-50 BGISEQ-500 Complete Genomics DNBSEQ-G400 DNBSEQ-G400 FAST DNBSEQ-G50 DNBSEQ-T10x4RS DNBSEQ-T7 Element AVITI FASTASeq 300 GENIUS GS111 Genapsys Sequencer GenoCare 1600 GenoLab M GridION Illumina Genome Analyzer Illumina Genome Analyzer II Illumina Genome Analyzer IIx Illumina HiScanSQ Illumina HiSeq 1000 Illumina HiSeq 1500 Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 3000 Illumina HiSeq 4000 Illumina HiSeq X Illumina HiSeq X Five Illumina HiSeq X Ten Illumina MiSeq Illumina MiniSeq Illumina NextSeq 500 Illumina NextSeq 550 Illumina NovaSeq 6000 Illumina NovaSeq X Illumina NovaSeq X Plus Illumina iSeq 100 Ion GeneStudio S5 Ion GeneStudio S5 Plus Ion GeneStudio S5 Prime Ion Torrent Genexus Ion Torrent PGM Ion Torrent Proton Ion Torrent S5 Ion Torrent S5 XL MGISEQ-2000RS MinION NextSeq 1000 NextSeq 2000 Onso PacBio RS PacBio RS II PromethION Revio Sentosa SQ301 Sequel Sequel II Sequel IIe Tapestri UG 100
Library Layout Required
Name lib_layout
Description Specify whether to expect single, paired, or other configuration of reads for sequencing
Example Paired
Reference https://w3id.org/mixs/0000111
Namespace mixs:lib_layout
Allowed Values Other Paired Single Vector
UMI Barcode Read
Name umi_barcode_read
Description The type of read that contains the Unique Molecular Identifier (UMI) barcode.
Example index2
Reference #
Namespace ei:umi_barcode_read
Allowed Values index1 index2 read1 read2
UMI Barcode Offset
Name umi_barcode_offset
Description The offset in sequence of the Unique Molecular Identifier (UMI) identifying barcode.
Example 0
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_offset
UMI Barcode Size
Name umi_barcode_size
Description The size of the Unique Molecular Identifier (UMI) identifying barcode.
Example 10
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_size
Cell Barcode Read
Name cell_barcode_read
Description The type of read that contains the cell barcode.
Example index1
Reference http://www.ebi.ac.uk/efo/EFO_0010203
Namespace ontology:cell_barcode_read
Allowed Values index1 index2 read1 read2
Cell Barcode Offset
Name cell_barcode_offset
Description The offset in sequence of the cell identifying barcode.
Example 10
Reference http://www.ebi.ac.uk/efo/EFO_0010204
Regex ^\d+$
Namespace ontology:cell_barcode_offset
Cell Barcode Size
Name cell_barcode_size
Description The size of the cell identifying barcode.
Example 0
Reference http://www.ebi.ac.uk/efo/EFO_0010205
Regex ^\d+$
Namespace ontology:cell_barcode_size
cDNA Read Required
Name cdna_read
Description The type of read that contains the Complementary DNA (cDNA) sequence.
Example read1
Reference http://www.ebi.ac.uk/efo/EFO_0010195
Namespace ontology:cdna_read
Allowed Values index1 index2 read1 read2
cDNA Read Offset
Name cdna_read_offset
Description The starting position of the Complementary DNA (cDNA) read within the entire sequence, indicating where the read begins after any barcodes or technical sequences.
Example 6
Reference http://www.ebi.ac.uk/efo/EFO_0010201
Regex ^\d+$
Namespace ontology:cdna_read_offset
cDNA Read Size
Name cdna_read_size
Description The size of the Complementary DNA (cDNA) read.
Example 75
Reference http://www.ebi.ac.uk/efo/EFO_0010202
Regex ^\d+$
Namespace ontology:cdna_read_size
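The read, offset and size fields above describe where the UMI, the cell barcode and the cDNA sit within the sequenced reads. The sketch below shows how the three values could be combined to slice a segment out of a read. It assumes offsets are 0-based, which the schema does not state; the sequences, the layout and the names `ReadSegment` and `extract_segment` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ReadSegment:
    read: str    # one of: index1, index2, read1, read2
    offset: int  # start position within that read (assumed 0-based)
    size: int    # number of bases in the segment


def extract_segment(segment: ReadSegment, reads: Dict[str, str]) -> str:
    """Slice the described segment out of the named read."""
    sequence = reads[segment.read]
    end = segment.offset + segment.size
    if end > len(sequence):
        raise ValueError(f"segment exceeds the length of {segment.read}")
    return sequence[segment.offset:end]


# Made-up sequences for one read pair: barcode and UMI on read1, cDNA on read2.
CELL_BARCODE = "AAACCCAAGAAACACT"
UMI = "GGTTCCAAGG"
CDNA = "TTTGGGCCCAAATTTGGGCCC"
reads = {"read1": CELL_BARCODE + UMI, "read2": CDNA}

cell_segment = ReadSegment("read1", 0, len(CELL_BARCODE))
umi_segment = ReadSegment("read1", len(CELL_BARCODE), len(UMI))
cdna_segment = ReadSegment("read2", 0, len(CDNA))

print(extract_segment(cell_segment, reads))  # AAACCCAAGAAACACT
print(extract_segment(umi_segment, reads))   # GGTTCCAAGG
```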
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File Derived From
Name file_derived_from
Description The name of the file that was used to generate the analysis derived data.
Example file1_sequencing.json
Reference #
Namespace ei:file_derived_from
Inferred Cell Type
Name inferred_cell_type
Description Post analysis cell type or identity declaration based on expression profile or known gene function identified by the performer.
Example type II bipolar neuron
Reference #
Namespace ei:inferred_cell_type
Post Analysis Cell Well Quality
Name post_analysis_cell_well_quality
Description Performer defined measure of whether the read output from the cell was included in the sequencing analysis. For example, cells might be excluded if a threshold percentage of reads did not map to the genome or if pre-sequencing quality measures were not passed.
Example Pass
Reference #
Namespace ei:post_analysis_cell_well_quality
Allowed Values Fail Pass
Other Derived Cell Attributes
Name other_derived_cell_attributes
Description Any other cell level measurement or annotation as result of the analysis.
Example Cluster
Reference #
Namespace ei:other_derived_cell_attributes
Allowed Values Cluster Count Gene UMI tSNE coordinates
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Reference Genome
Name reference_genome
Description Indicate version and include stable link to genome data (or attach genome fasta file).
Example GRCh38, https://example.org/grch38.fa
Reference #
Namespace ei:reference_genome
Genome Annotation
Name genome_annotation
Description Indicate version and include stable link. Also indicate if any modification to the original annotation has been applied (e.g. 3' UTR extension) and include modified annotation file employed in the analysis.
Example Ensembl v101, https://example.org/ensembl_v101.gtf
Reference #
Namespace ei:genome_annotation
Annotation Filtering
Name annotation_filtering
Description Indicate which features were filtered (i.e. protein coding, pseudo-genes, TCRs, etc.)
Example Filtered to include only protein-coding genes
Reference #
Namespace ei:annotation_filtering
Genes vs Exons
Name genes_vs_exons
Description Quantification using whole gene intervals or exons.
Example Exon quantification
Reference #
Namespace ei:genes_vs_exons
Library Structure
Name library_structure
Description seqspec format
Example Single-cell 3' library
Reference #
Namespace ei:library_structure
Mapping and Demultiplexing Software
Name mapping_and_demultiplexing_software
Description The software, including version, used to map and demultiplex the reads/UMIs.
Example Cell Ranger 6.0.0
Reference #
Namespace ei:mapping_and_demultiplexing_software
Read Mapping Statistics
Name read_mapping_statistics
Description Statistics of the Reads or Unique Molecular Identifier (UMI).
Example 80% reads mapped to reference
Reference #
Namespace ei:read_mapping_statistics
Sequencing Saturation
Name sequencing_saturation
Description The sequencing saturation achieved, which depends on the number of cells recovered (not targeted) and the technology.
Example 95% sequencing saturation
Reference #
Namespace ei:sequencing_saturation
UMIs or Barcode Distribution QC
Name umis_barcode_distribution_qc
Description Show Unique Molecular Identifiers (UMIs) per barcode distribution and threshold applied
Example Threshold: 10 UMIs per barcode
Reference #
Namespace ei:umis_barcode_distribution_qc
Cell or Non-Cell Filtering Strategy
Name cell_non_cell_filtering_strategy
Description Unique Molecular Identifier (UMI) threshold used to discriminate cells from non-cells. Description of algorithm (if any) and parameters used to determine cells or non-cells.
Example Threshold: 5 UMIs for cell detection
Reference #
Namespace ei:cell_non_cell_filtering_strategy
Other Quality Filters Applied
Name other_quality_filters_applied
Description Cells/nuclei discarded based on % mitochondrial reads, % rRNA reads, etc.
Example Cells with >20% mitochondrial reads discarded
Reference #
Namespace ei:other_quality_filters_applied
Ambient RNA QC
Name ambient_rna_qc
Description Report % UMIs in background cell barcodes, and algorithm (if any) used to remove ambient RNA
Example Ambient RNA removed if >5% UMIs in background barcodes
Reference #
Namespace ei:ambient_rna_qc
Predicted Doublet Rate QC
Name predicted_doublet_rate_qc
Description The predicted doublet rate, which depends on the number of cells recovered (not targeted) and the technology.
Example Predicted doublet rate: 1.5%
Reference #
Namespace ei:predicted_doublet_rate_qc
Individual Organism SNP Demultiplexing
Name individual_organism_snp_demultiplexing
Description If carried out, show SNP partitioning quality (e.g. SNP UMAP embedding or covariance matrix), algorithm used
Example SNP UMAP embedding using CellSNP
Reference #
Namespace ei:individual_organism_snp_demultiplexing
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Clustering Algorithm and Version
Name clustering_algorithm_and_version
Description The clustering algorithm used, including its version.
Example Louvain 0.8.0
Reference #
Namespace ei:clustering_algorithm_and_version
Clustering Parameters
Name clustering_parameters
Description The parameters used for clustering.
Example Resolution: 0.6, K-nearest neighbors: 10
Reference #
Namespace ei:clustering_parameters
Integration/Batch Correction
Name integration_batch_correction
Description If compared/integrated with existing datasets
Example Harmony v1.0
Reference #
Namespace ei:integration_batch_correction
Source Code
Name source_code
Description If any newly developed code/software has been used in the processing and downstream analysis of the dataset.
Example Source code is hosted on GitHub and includes custom algorithms for UMI count normalization. The repository can be found at: https://github.com/user/umi-normalization.
Reference #
Namespace ei:source_code
UMI Count Matrix
Name umi_count_matrix
Description Gene x cell matrix with UMI counts for each gene in each cell.
Example The UMI count matrix is stored in a CSV file with gene IDs as rows (e.g., ENSG00000139618) and cell barcodes as columns (e.g., Cell_001, Cell_002). The matrix file is available at: https://example.com/umi_count_matrix.csv.
Reference #
Namespace ei:umi_count_matrix
Ensembl IDs
Name ensembl_ids
Description Gene or transcript names should be listed as Ensembl (or other standardized ID), with gene short names in metadata.
Example ENSG00000139618
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:ensembl_ids
Functional Gene Annotations
Name functional_gene_annotations
Description Any functional annotation generated/used (gene names, GOs, structural domains, etc.).
Example Functional gene annotations, including Gene Ontology (GO) terms, are provided in the metadata. For example, the gene 'ENSG00000139618' (BRCA2) is annotated with the GO term 'GO:0003677' (DNA binding).
Reference #
Namespace ei:functional_gene_annotations
Protein Models
Name protein_models
Description FASTA file with (or stable link to) the predicted proteins associated to genes in the UMI count matrix and matching IDs.
Example The protein sequences for genes are provided in a FASTA file available at: https://example.com/protein_models.fasta, where each protein sequence is linked to the corresponding gene ID.
Reference #
Namespace ei:protein_models
Cell Metadata
Name cell_metadata
Description Table mapping cell IDs to cluster/cell type/broad cell type annotations.
Example Cell metadata includes information such as cell type annotations ('Tumor', 'Normal') and experimental conditions ('Control', 'Treatment'). This data is available in a table at: https://example.com/cell_metadata.csv.
Reference #
Namespace ei:cell_metadata
Cluster-Level Normalised Expression Tables
Name cluster_level_normalised_expression_tables
Description Expression tables that show normalised gene expression at the cluster or cell-type level.
Example Normalised gene expression data at the cluster level is provided in a tab-delimited text file. For example, gene 'ENSG00000139618' (BRCA2) has expression values for clusters: Cluster_1: 1200, Cluster_2: 900. The full expression table is available at: https://example.com/cluster_level_expression.csv.
Reference #
Namespace ei:cluster_level_normalised_expression_tables
Other Resource Files
Name other_resource_files
Description Necessary to re-use and interpret the data. E.g. barcode information in complex, serial multiplexing protocols (clicktags).
Example Barcode information used in multiplexing protocols is provided in a separate file, which can be accessed at: https://example.com/barcode_data.csv.
Reference #
Namespace ei:other_resource_files
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example FILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:sequencing_id
Read 1 File Required
Name read_1_file
Description The name or accession of the file that contains read 1.
Example file1_r1.fastq.gz
Reference #
Namespace ei:read_1_file
Read 2 File
Name read_2_file
Description The name or accession of the file that contains read 2.
Example file2_r2.fastq.gz
Reference #
Namespace ei:read_2_file
Index 1 File
Name index_1_file
Description The name of the file that contains index 1.
Example file1_i1.fastq.gz
Reference #
Namespace ei:index_1_file
Index 2 File
Name index_2_file
Description The name of the file that contains index 2.
Example file2_i2.fastq.gz
Reference #
Namespace ei:index_2_file
Read 1 Checksum Required
Name read_1_file_checksum
Description Result of a hash function calculated on the content of the read 1 file to verify file integrity. Commonly used algorithms include MD5 and SHA-1. The checksums should be separated by a comma (,).
Example f8d29e41a73b5c02de9a6fb314e7c8ad
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_1_file_checksum
Read 2 Checksum
Name read_2_file_checksum
Description Result of a hash function calculated on the content of the read 2 file to verify file integrity. Commonly used algorithms include MD5 and SHA-1. The checksums should be separated by a comma (,).
Example a3f4c1b29d8e57fa41b02de6c7f9ab83
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_2_file_checksum
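The Regex lines of the two checksum fields accept exactly 32 lowercase hexadecimal characters, which is the length of an MD5 digest (a SHA-1 digest has 40 and would not match). A minimal sketch for computing such a checksum with the Python standard library follows; the function name `md5_of_file` and the chunk size are illustrative choices.

```python
import hashlib
import re

# Pattern copied from the Regex lines of the checksum fields above.
MD5_PATTERN = re.compile(r"^[0-9a-f]{32}$")


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def looks_like_md5(checksum: str) -> bool:
    """Return True if the string has the form required by the schema."""
    return MD5_PATTERN.fullmatch(checksum) is not None
```

For example, `md5_of_file("file1_r1.fastq.gz")` would return the value to enter in the Read 1 Checksum field, assuming that file exists locally.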
White List Barcode File
Name white_list_barcode_file
Description A file containing the known cell barcodes in the dataset.
Example barcodes.tsv
Reference #
Namespace ei:white_list_barcode_file
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Expression Data Process Setting ID Required
Name expression_data_process_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_process_setting_id
Matrix Type
Name matrix_type
Description The type of expression matrix, indicating the processing applied to its values
Example raw_counts
Reference #
Namespace ei:matrix_type
Allowed Values imputed log1p nomalised pseudobulk raw_counts scaled
Reference Genome Required
Name reference_genome
Description The associated reference genome
Example https://reference-genome-example.com
Reference #
Regex ^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$
Namespace ei:reference_genome
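The Regex line for this field accepts one or more http, https or ftp URLs joined by a pipe character with no surrounding spaces. The sketch below illustrates that behaviour; the URLs other than the schema's own example are made up.

```python
import re

# Pattern copied from the Regex line of the Reference Genome field above.
REFERENCE_GENOME_PATTERN = re.compile(
    r"^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$"
)


def is_valid_reference_genome(value: str) -> bool:
    """Return True for a single URL or several URLs joined by '|'."""
    return REFERENCE_GENOME_PATTERN.fullmatch(value) is not None


print(is_valid_reference_genome("https://reference-genome-example.com"))        # True
print(is_valid_reference_genome("https://a.example/g.fa|ftp://b.example/g.fa")) # True
print(is_valid_reference_genome("https://a.example/g.fa | ftp://b.example/g.fa"))  # False
print(is_valid_reference_genome("GRCh38"))                                      # False
```

The third call fails because the pattern does not allow whitespace around the separator.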
Annotation Version
Name annotation_version
Description The annotation version of the associated reference genome
Example GENCODE v44
Reference #
Namespace ei:annotation_version
Normalisation Method
Name normalisation_method
Description Any normalisation processing performed
Example Log Normalisation
Reference #
Namespace ei:normalisation_method
Allowed Values Library Size Normalisation Log Normalisation SCNorm SCTransform scran
Highly Variable Gene Selection (HVG)
Name highly_variable_gene_selection
Description The method used to select highly variable genes and the number of genes selected
Example seurat_v3, n=2000
Reference #
Namespace ei:highly_variable_gene_selection
Dimensionality Reduction
Name dimensionality_reduction
Description Method used to reduce dimensionality in the expression data
Example PCA
Reference #
Namespace ei:dimensionality_reduction
Allowed Values Diffusion Map ICA NMF PCA UMAP t-SNE
Number of Nearest Neighbours
Name n_neighbours
Description Number of nearest neighbours used to calculate cluster membership
Example pca:50
Reference #
Namespace ei:n_neighbours
Clustering Algorithm
Name clustering_algorithm
Description Algorithm used to create clusters
Reference #
Namespace ei:clustering_algorithm
Clustering Resolution
Name clustering_resolution
Description Resolution parameter
Example 2.5
Reference #
Regex ^([0-9]*[.])?[0-9]+
Namespace ei:clustering_resolution
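Unlike most patterns in this schema, the Regex line for Clustering Resolution has no end anchor. Whether trailing text is rejected therefore depends on how the validator applies the pattern. The sketch below contrasts the two common options in Python; which one the actual validator uses is not stated in the schema.

```python
import re

# Pattern copied from the Regex line above; note there is no trailing "$".
RESOLUTION_PATTERN = re.compile(r"^([0-9]*[.])?[0-9]+")

value = "2.5abc"

# match() only requires the pattern to match at the start of the string.
print(RESOLUTION_PATTERN.match(value) is not None)      # True

# fullmatch() requires the whole string to match.
print(RESOLUTION_PATTERN.fullmatch(value) is not None)  # False
print(RESOLUTION_PATTERN.fullmatch("2.5") is not None)  # True
```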
Clustering Distance Metric
Name clustering_distance_metric
Description Metric used to calculate a point's distance to other points
Example cosine
Reference #
Namespace ei:clustering_distance_metric
Allowed Values cosine euclidean hamming jaccard manhatten mehalanobis
Software Versions
Name software_versions
Description Primary software packages used for analysis
Reference #
Namespace ei:software_versions
Cell Type Annotation
Name cell-type annotation
Description Tools and databases used for cell type annotation
Reference #
Namespace ei:cell-type annotation
Generated by Pipeline
Name generated_by_pipeline
Description URL of the deposited pipeline used to create this data
Reference #
Regex ^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$
Namespace ei:generated_by_pipeline
Notes
Name notes
Description Any other information
Reference #
Namespace ei:notes
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name expression_data_file_id
Description A unique alphanumeric identifier for the expression data file
Example EXPFILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric identifier for library preparation
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Expression Data Process Setting ID Required
Name expression_data_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_setting_id
File Name Required
Name expression_data_file
Description Expression data file name
Example exp_file.csv
Reference #
Namespace ei:expression_data_file
File MD5 Checksum Required
Name expression_data_file_checksum
Description Calculated MD5 checksum for this file
Example 9e4b7a23f6c1d0ab85f29c47e3d8a610
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:expression_data_file_checksum
File Format Required
Name expression_data_file_format
Description The format of the expression file, such as h5ad or rds
Example csv
Reference #
Namespace ei:expression_data_file_format
Allowed Values csv h5ad loom mtx rds
Number of Cells
Name n_cells
Description The number of cells represented in the expression data
Example 4
Reference #
Regex ^\d+$
Namespace ei:n_cells
Number of Genes
Name n_genes
Description The number of genes represented in the expression data
Example 50
Reference #
Regex ^\d+$
Namespace ei:n_genes
File Size in Bytes
Name file_size_bytes
Description Size of the file recorded in bytes
Example 90
Reference #
Regex ^\d+$
Namespace ei:file_size_bytes
Date Generated
Name date_generated
Description Approximate date this expression data was generated
Example 2024-10-14
Reference #
Regex ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Namespace ei:date_generated
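The date pattern above checks the shape of a YYYY-MM-DD value but not whether the day exists in the given month. The sketch below pairs the schema pattern with a calendar check from the Python standard library; the stricter check and the function name `is_real_date` are additions for illustration, not schema requirements.

```python
import re
from datetime import date

# Pattern copied from the Regex line of the Date Generated field above.
DATE_PATTERN = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")


def is_real_date(value: str) -> bool:
    """Return True if the value matches the schema pattern and is a real calendar date."""
    if DATE_PATTERN.fullmatch(value) is None:
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


print(is_real_date("2024-10-14"))  # True
print(is_real_date("2024-02-31"))  # False: matches the pattern, but no such day
print(is_real_date("2024-13-01"))  # False: rejected by the pattern
```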
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Project Name Required
Name project_name
Description Official name of the study or project. Project title should be fewer than 30 words, such as a title of a grant proposal or a publication.
Example Study of single cells in the human body
Reference https://w3id.org/mixs/0000092
Namespace mixs:project_name
Description
Name description
Description A detailed description of the project which includes research goals and experimental approach. Project description should be fewer than 300 words, such as an abstract from a grant application or publication.
Example This project explores the intricate details of single cells in the human body, focusing on their structure, function, and behaviour. By studying individual cells, it aims to uncover how they contribute to overall health, disease progression, and human biology. This research can provide deeper insights into cellular processes, paving the way for advancements in medical treatments and personalised medicine.
Reference http://purl.org/dc/terms/description
Namespace dcterms:description
Workflow
Name workflow
Description The workflow or protocol followed during the study.
Example Laser microdissection
Reference #
Namespace ei:workflow
Allowed Values Laser microdissection Laser microdissection, Culturing Laser microdissection, Culturing, Sequencing Laser microdissection, Sequencing Microfluidics, Facs, Culturing Microfluidics, Facs, Culturing, Sequencing Microfluidics, Facs, Sequencing Spatial Transcriptomics
Technology Required
Name technology
Description The sorting or visualisation technology used.
Example Vizgen
Reference #
Namespace ei:technology
Negative Control Type
Name neg_cont_type
Description The substance or equipment used as a negative control in an investigation
Example Phosphate buffer
Reference https://w3id.org/mixs/0001321
Namespace mixs:neg_cont_type
Allowed Values DNA-free PCR mix Distilled water Empty collection device Empty collection tube Phosphate buffer Sterile swab Sterile syringe
Positive Control Type
Name pos_cont_type
Description The substance, mixture, product, or apparatus used to verify that a process which is part of an investigation delivers a true positive
Example substance1
Reference https://w3id.org/mixs/0001322
Regex ^[a-zA-Z0-9]+$
Namespace mixs:pos_cont_type
Experimental Factor
Name experimental_factor
Description Variable aspects of an experiment design that can be used to describe an experiment, or set of experiments, in an increasingly detailed manner. This field accepts ontology terms from Experimental Factor Ontology (EFO) and/or Ontology for Biomedical Investigations (OBI)
Example EFO:0001779
Reference https://w3id.org/mixs/0000008
Regex ^[A-Z]{2,}:\d+$
Namespace mixs:experimental_factor
Relevant Electronic Resources
Name associated_resource
Description A related resource that is referenced, cited, or otherwise associated to the sequence.
Example https://arctos.database.museum/media/10520962 | https://arctos.database.museum/media/10520964
Reference https://w3id.org/mixs/0000091
Regex ^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)+(?: \| https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)*)*$
Namespace mixs:associated_resource
Licence
Name licence
Description Specifies the terms under which the data associated with the study can be used, shared, or reused. It informs users how they may legally reference, distribute, or build upon the study. Common licenses include Creative Commons (e.g., CC BY 4.0), which require attribution to the original authors when the data is cited or reused.
Example MIT
Reference #
Namespace ei:licence
Allowed Values Apache-2.0 CC-BY-4.0 CC-BY-SA-4.0 CC0-1.0 GPL-3.0-or-later MIT
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Orcid ID
Name orcid_id
Description A 16-digit identifier (the final character may be X) that uniquely identifies a researcher.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$
Namespace ei:orcid_id
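The Regex line checks only the shape of an ORCID iD. ORCID additionally defines the final character as a check character computed with ISO 7064 MOD 11-2. The sketch below applies both checks; the checksum step follows ORCID's published algorithm and is not required by this schema, and the function names are hypothetical.

```python
import re

# Pattern copied from the Regex line of the Orcid ID field above.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Return True if the value matches the pattern and its check character is correct."""
    if ORCID_PATTERN.fullmatch(value) is None:
        return False
    digits = value.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-1234-5678-9012"))  # True
print(is_valid_orcid("0000-1234-5678-9013"))  # False: wrong check character
```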
First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace schema.org:email
Affiliation or Institution Required
Name affiliation
Description An organisation or institution that this person is associated with.
Example University of Liverpool
Reference https://schema.org/affiliation
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace schema.org:affiliation
Funder
Name funder
Description A person or organization that supports (sponsors) something through some kind of financial contribution.
Example BBSRC
Reference https://schema.org/funder
Namespace schema.org:funder
Grant Award
Name funding
Description A grant that directly or indirectly provides funding or sponsorship for the person to conduct the study.
Example GRAK3489
Reference https://schema.org/funding
Regex ^[A-Za-z0-9]+(?: [A-Za-z0-9]+)*$
Namespace schema.org:funding
Study ID Required
Name study_id
Description A unique alphanumeric identifier for the study if referring to
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Scientific Name or Organism
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ontology:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must accurately match the provided scientific_name to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ontology:taxon_id
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ontology:biosampleAccession
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
Protocol Name Required
Name protocol_name
Description A descriptive name of the protocol used for single-cell sequencing.
Example 10X Genomics Single Cell 3' Library Prep
Reference #
Namespace ei:protocol_name
Dissociation Description Required
Name dissociation_description
Description A free-text description of the process used to separate cells from tissues or cell aggregates.
Example Tissue was enzymatically dissociated using collagenase for 30 minutes.
Reference #
Namespace ei:dissociation_description
Enrichment Markers
Name enrichment_markers
Description Description of the specificity markers used to isolate cell populations, e.g. 'CD45+'. Please contact FAANG DCC to add more terms.
Example CD45
Reference #
Namespace faang:enrichment_markers
Isolation Kit
Name isolation_kit
Description The kit used to isolate the cells.
Example 10x Nuclei Isolation Kit
Reference #
Namespace ei:isolation_kit
Allowed Values 10x Nuclei Isolation Kit 3' standard throughput kit Custom
Literature Source Reference
Name literature_source_reference
Description Reference to literature sources that describe the protocol or methods used.
Example Doe et al. (2024), 'Single-cell RNA-seq: A comprehensive overview'
Reference #
Namespace ei:literature_source_reference
Protocols IO Reference
Name protocols_io_reference
Description Reference link to protocols.io for additional details on the protocol.
Example https://www.protocols.io/view/sample-protocol-b2ubqesn
Reference #
Regex ^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)+(?: \| https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)*)*$
Namespace ei:protocols_io_reference
WorkflowHub SOP Reference
Name workflow_hub_sop_reference
Description Reference to the Standard Operating Procedure (SOP) in WorkflowHub.
Example https://workflowhub.eu/works/12345
Reference #
Namespace ei:workflow_hub_sop_reference
Dissociation Protocol Method
Name dissociation_protocol_method
Description The method used to dissociate tissues into single cells.
Example Mechanical and enzymatic dissociation
Reference #
Namespace ei:dissociation_protocol_method
Single Cell Quality Metric
Name single_cell_quality_metric
Description Metrics used to assess the quality of single cells before sequencing.
Example Cell viability percentage
Reference #
Namespace ei:single_cell_quality_metric
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the sample
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample associated with the cell suspension. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
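Identifier fields such as sample_id and dissociation_protocol_id appear both where they are defined and where they are referenced. A minimal sketch of a cross-sheet consistency check follows, assuming each sheet has been loaded as a list of dictionaries keyed by field name (a hypothetical representation; the schema does not prescribe one).

```python
def find_dangling_references(
    child_rows: list[dict], parent_rows: list[dict], key: str
) -> list[str]:
    """Return values of `key` used in child_rows but never defined in parent_rows."""
    defined = {row[key] for row in parent_rows}
    return [row[key] for row in child_rows if row[key] not in defined]


if __name__ == "__main__":
    protocols = [{"dissociation_protocol_id": "DISSOC001"}]
    suspensions = [
        {"cell_suspension_id": "CELLSUSP001", "dissociation_protocol_id": "DISSOC001"},
        {"cell_suspension_id": "CELLSUSP002", "dissociation_protocol_id": "DISSOC002"},
    ]
    print(find_dangling_references(suspensions, protocols, "dissociation_protocol_id"))
```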
Suspension Type Required
Name suspension_type
Description The type of suspension used to keep cells in solution during processing.
Example Cell
Reference #
Namespace ei:suspension_type
Allowed Values Cell Nuclei Protoplast
Cell Count
Name cell_count
Description An integer giving the number of cells in the sequencing library.
Example 10000
Reference #
Regex ^\d+$
Namespace ei:cell_count
Cell Viability
Name cell_viability
Description The percentage of living cells in a sample, indicating the health and quality of cells for RNA-sequencing analysis.
Example 95
Reference #
Namespace ei:cell_viability
Cell Viability Assessment Method
Name cell_viability_assessment_method
Description The method used to evaluate the viability of cells in the sample, often involving staining or flow cytometry techniques.
Example Trypan Blue Exclusion
Reference #
Namespace ei:cell_viability_assessment_method
Cell Size
Name cell_size
Description The size of the cell, typically measured in micrometres.
Example 10
Reference #
Namespace ei:cell_size
Suspension Volume (µL)
Name suspension_volume_µl
Description The volume of the cell suspension in microlitres (µL).
Example 100
Reference #
Namespace ei:suspension_volume_µl
Suspension Concentration Cells Per µL
Name suspension_concentration_cells_per_µl
Description The concentration of cells in the suspension, in cells per microlitre (cells/µL).
Example 1000
Reference #
Namespace ei:suspension_concentration_cells_per_µl
Suspension Dilution
Name suspension_dilution
Description The dilution factor of the cell suspension.
Example 1:10
Reference #
Namespace ei:suspension_dilution
Loading Volume (µL)
Name loading_volume_µl
Description The volume of the cell suspension, in microlitres (µL), loaded into the single-cell RNA-sequencing system for analysis.
Example 10
Reference #
Regex ^\d+$
Namespace ei:loading_volume_µl
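The suspension fields are related by simple arithmetic: the number of cells loaded is roughly the concentration (cells per µL) multiplied by the loading volume (µL). The sketch below shows that calculation as a sanity check a submitter might run. Treating the result as comparable to cell_count is an assumption, since the actual number of cells captured depends on the platform; the function name and the ASCII parameter names are hypothetical.

```python
def expected_cells_loaded(
    concentration_cells_per_ul: float, loading_volume_ul: float
) -> float:
    """Cells loaded = concentration (cells/uL) x loading volume (uL)."""
    if concentration_cells_per_ul < 0 or loading_volume_ul < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_cells_per_ul * loading_volume_ul


if __name__ == "__main__":
    # Example values from the fields above: 1000 cells/uL and 10 uL.
    print(expected_cells_loaded(1000, 10))
```

With the example values in this section (1000 cells/µL and 10 µL), the product is 10000, which matches the cell_count example.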
Suspension Dilution Buffer
Name suspension_dilution_buffer
Description A solution used to dilute cell suspensions to a desired concentration, typically prior to loading cells into a device for single-cell RNA sequencing. It helps maintain cell viability and integrity during processing.
Example PBS (Phosphate-buffered saline) with 0.04% BSA (Bovine serum albumin)
Reference #
Namespace ei:suspension_dilution_buffer
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the library preparation.
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Library Preparation Kit Required
Name library_prep_kit
Description Packaged kits (containing adapters, indexes, enzymes, buffers etc.), tailored for specific sequencing workflows, which allow the simplified preparation of sequencing-ready libraries for small genomes, amplicons, and plasmids
Example 10X Genomics Single Cell 3' v3
Reference https://w3id.org/mixs/0001145
Namespace mixs:library_prep_kit
Library Preparation Kit Version Required
Name library_prep_kit_version
Description The version number of the library preparation kit used for sequencing.
Example 2
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Regex ^\d+(\.\d+)?$
Namespace ontology:library_prep_kit_version
Amplification Method
Name amplification_method
Description The method used to amplify the Complementary DNA (cDNA).
Example PCR
Reference #
Namespace ei:amplification_method
cDNA Amplification Cycles
Name cdna_amplification_cycles
Description The number of cycles used during the Complementary DNA (cDNA) amplification process.
Example 12
Reference #
Regex ^\d+$
Namespace ei:cdna_amplification_cycles
Average Size Distribution
Name average_size_distribution
Description The average length of RNA fragments in base pairs (bp) after library preparation, indicating the quality and suitability of the RNA for sequencing.
Example 350
Reference #
Regex ^\d+$
Namespace ei:average_size_distribution
Library Construction Method
Name lib_construction_method
Description The library construction method (including version) that was used.
Example Smart-Seq2
Reference #
Namespace ei:lib_construction_method
Input Molecule
Name input_molecule
Description The specific fraction of biological macromolecule from which the sequencing library is derived.
Example RNA
Reference #
Namespace ei:input_molecule
Primer
Name primer
Description The type of primer used for reverse transcription. This allows users to identify the content of the cDNA library input (for example, whether it is enriched for mRNA).
Example Random
Reference #
Namespace ei:primer
Allowed Values Oligo-dT Random
Primeness Required
Name primeness
Description The end from which the molecule was sequenced.
Example 5'
Reference #
Namespace ei:primeness
Allowed Values 3' 5' Both
End Bias
Name end_bias
Description The end bias of the library.
Example 3
Reference #
Namespace ei:end_bias
Allowed Values 3 5
Library Strand
Name library_strand
Description The complementary DNA (cDNA) strand of the library from which the reads are derived: sense (first), antisense (second), both, or none.
Example Antisense
Reference #
Namespace ei:library_strand
Allowed Values Antisense Both Sense Unstranded
Spike In Required
Name spike_in
Description External RNA added to the sample as a control to assess technical variability and normalization in RNA-sequencing. State whether spike-in was used.
Example Yes
Reference #
Namespace ei:spike_in
Allowed Values No Yes
Spike Type
Name spike_type
Description The specific type of external RNA used for spiking in, often indicating the source or nature of the control RNA.
Example Synthetic RNA
Reference #
Namespace ei:spike_type
Spike In Dilution Or Concentration
Name spike_in_dilution_or_concentration
Description The final concentration or dilution (for commercial sets) of the spike in mix.
Example 1:1000
Reference #
Namespace ei:spike_in_dilution_or_concentration
i5 Index Required
Name i5_index
Description Barcode sequence used on the i5 adapter during library preparation for identifying samples in multiplexed single-cell RNA-sequencing.
Example ATCACG
Reference #
Namespace ei:i5_index
i7 Index Required
Name i7_index
Description Barcode sequence used on the i7 adapter to distinguish samples in multiplexed sequencing runs.
Example CGATGT
Reference #
Namespace ei:i7_index
Dual or Single Index Required
Name dual_single_index
Description Specifies if both i5 and i7 indices (dual) or only one index (single) was used for sample identification during sequencing.
Example Dual
Reference #
Namespace ei:dual_single_index
Allowed Values Dual Single
i5 Sequence Required
Name i5_sequence
Description The nucleotide sequence of the i5 index used in multiplexing during sequencing.
Example ATCGTAGC
Reference #
Namespace ei:i5_sequence
i7 Sequence Required
Name i7_sequence
Description The specific nucleotide sequence of the i7 index used for a sample.
Example TGCATGCA
Reference #
Namespace ei:i7_sequence
Plate ID
Name plate_id
Description Identifier for the 96-well plate used in sample preparation.
Example PLT001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:plate_id
Well Row
Name well_row
Description The row identifier in a 96-well plate indicating the sample's position.
Example A
Reference #
Namespace ei:well_row
Well Column
Name well_col
Description The column identifier in a 96-well plate indicating the sample's position.
Example 5
Reference #
Regex ^\d+$
Namespace ei:well_col
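A short sketch of how well_row and well_col might be combined into a single position. It assumes the conventional 96-well layout of rows A to H and columns 1 to 12, which the schema does not state explicitly; the function name is hypothetical.

```python
ROWS = "ABCDEFGH"  # assumed row labels for a 96-well plate
COLUMNS = 12       # assumed number of columns


def well_index(well_row: str, well_col: int) -> int:
    """Return a zero-based, row-major index (A1 is 0, H12 is 95)."""
    if len(well_row) != 1 or well_row not in ROWS:
        raise ValueError(f"row must be one of {ROWS}: {well_row!r}")
    if not 1 <= well_col <= COLUMNS:
        raise ValueError(f"column must be between 1 and {COLUMNS}: {well_col!r}")
    return ROWS.index(well_row) * COLUMNS + (well_col - 1)


if __name__ == "__main__":
    print(well_index("A", 5))
```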
Cell Phenotype
Name cell_phenotype
Description The cell marker for the Fluorescence-Activated Cell Sorting (FACS) of cells.
Example CD41-
Reference #
Namespace ei:cell_phenotype
Allowed Values CD41+ CD41-
Nucleic Acid Amplification
Name nucl_acid_amp
Description A link to a literature reference, electronic resource or a standard operating procedure (SOP), that describes the enzymatic amplification (PCR, TMA, NASBA) of specific nucleic acids. The link can be a PMID, DOI or URL.
Example https://phylogenomics.me/protocols/16s-pcr-protocol/
Reference https://w3id.org/mixs/0000050
Regex ^PMID:\d+$|^doi:10.\d{2,9}/.*$|^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$|([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
Namespace mixs:nucl_acid_amp
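The regex above accepts a PMID, a DOI, a URL, or free text. The sketch below classifies a value into those four kinds using simplified patterns. They are approximations written for illustration (for instance, the dot after `10` is escaped here), not a copy of the schema regex, and the function name is hypothetical.

```python
import re


def classify_reference(value: str) -> str:
    """Return 'pmid', 'doi', 'url', or 'text' for a protocol reference."""
    if re.fullmatch(r"PMID:\d+", value):
        return "pmid"
    if re.fullmatch(r"doi:10\.\d{2,9}/.+", value):
        return "doi"
    if re.fullmatch(r"https?://\S+", value):
        return "url"
    return "text"


if __name__ == "__main__":
    for value in (
        "PMID:12345",
        "doi:10.1000/example",
        "https://phylogenomics.me/protocols/16s-pcr-protocol/",
    ):
        print(value, classify_reference(value))
```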
Nucleic Acid Extraction
Name nucl_acid_ext
Description A link to a literature reference, electronic resource or a standard operating procedure (SOP), that describes the material separation to recover the nucleic acid fraction from a sample
Example https://mobio.com/media/wysiwyg/pdfs/protocols/12888.pdf
Reference https://w3id.org/mixs/0000038
Regex ^PMID:\d+$|^doi:10.\d{2,9}/.*$|^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)$|([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
Namespace mixs:nucl_acid_ext
Amount or Size of Sample Collected
Name samp_size
Description The total amount or size (volume (ml), mass (g) or area (m2)) of sample collected
Example 5 litre
Reference https://w3id.org/mixs/0000037
Regex ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?( *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)? *([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
Namespace mixs:samp_size
Estimated Size
Name estimated_size
Description The estimated size of the genome prior to sequencing in base pairs (bp). Of particular importance in the sequencing of (eukaryotic) genome which could remain in draft form for a long or unspecified period
Example 300000
Reference https://w3id.org/mixs/0000001
Namespace mixs:estimated_size
Sample Volume or Weight for DNA Extraction
Name samp_vol_we_dna_ext
Description Volume (ml) or mass (g) of total collected sample processed for DNA extraction.
Example 1500 milliliter
Reference https://w3id.org/mixs/0000024
Regex ^[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?(?: *- *[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)? *(milliliter|gram|milligram|square centimeter)$
Namespace mixs:samp_vol_we_dna_ext
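Values for this field combine a number, or a range of two numbers, with one of four units. The sketch below reuses the structure of the regex above with capture groups so the parts can be read back out. The function name and the returned tuple shape are hypothetical.

```python
import re
from typing import Optional

NUMBER = r"[-+]?[0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?"
QUANTITY_PATTERN = re.compile(
    rf"({NUMBER})(?: *- *({NUMBER}))? *"
    r"(milliliter|gram|milligram|square centimeter)"
)


def parse_quantity(value: str) -> tuple[float, Optional[float], str]:
    """Return (low, high_or_None, unit) for values like '1500 milliliter'."""
    match = QUANTITY_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid quantity: {value!r}")
    low, high, unit = match.groups()
    return float(low), (float(high) if high is not None else None), unit


if __name__ == "__main__":
    print(parse_quantity("1500 milliliter"))
```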
Library Vector
Name lib_vector
Description Cloning vector type(s) used in construction of libraries
Example Bacteriophage P1
Reference https://w3id.org/mixs/0000041
Namespace mixs:lib_vector
Adapters
Name adapters
Description Adapters provide priming sequences for both amplification and sequencing of the sample-library fragments. Both adapters should be reported; in uppercase letters
Example AATGATACGGCGACCACCGAGATCTACACGCT;CAAGCAGAAGACGGCATACGAGAT
Reference https://w3id.org/mixs/0000042
Namespace mixs:adapters
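The description asks for both adapters in uppercase, and the example separates them with a semicolon. A minimal sketch of a check along those lines follows. Allowing `N` alongside A, C, G and T is an assumption, and the function name is hypothetical.

```python
import re

# Uppercase nucleotide letters; accepting N is an assumption.
ADAPTER_PATTERN = re.compile(r"[ACGTN]+")


def parse_adapters(value: str) -> tuple[str, str]:
    """Split 'ADAPTER1;ADAPTER2' and check that both are uppercase sequences."""
    parts = value.split(";")
    if len(parts) != 2:
        raise ValueError("expected exactly two adapters separated by ';'")
    for part in parts:
        if not ADAPTER_PATTERN.fullmatch(part):
            raise ValueError(f"not an uppercase nucleotide sequence: {part!r}")
    return parts[0], parts[1]


if __name__ == "__main__":
    print(parse_adapters("AATGATACGGCGACCACCGAGATCTACACGCT;CAAGCAGAAGACGGCATACGAGAT"))
```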
Sample Material Processing
Name samp_mat_process
Description A brief description of any processing applied to the sample during or after retrieving the sample from environment, or a link to the relevant protocol(s) performed
Example filtering of seawater, storing samples in ethanol
Reference https://w3id.org/mixs/0000048
Namespace mixs:samp_mat_process
Design description
Name design_description
Description The design of the library including details of how it was constructed.
Reference #
Namespace ei:design_description
Library selection Required
Name library_selection
Description The method used to select for or against, enrich, or screen the material being sequenced.
Example RANDOM PCR
Reference #
Namespace ei:library_selection
Allowed Values 5-methylcytidine antibody CAGE ChIP ChIP-Seq Dnase HMPR Hybrid Selection Inverse rRNA Inverse rRNA selection MBD2 protein methyl-CpG binding domain MDA MF MSLL Mnase Oligo-dT PCR PolyA RACE RANDOM RANDOM PCR RT-PCR Reduced Representation Restriction Digest cDNA cDNA_oligo_dT cDNA_randomPriming other padlock probes capture method repeat fractionation size fractionation unspecified
Library source Required
Name library_source
Description The type of source material that is being sequenced.
Example GENOMIC
Reference #
Namespace ei:library_source
Allowed Values GENOMIC GENOMIC SINGLE CELL METAGENOMIC METATRANSCRIPTOMIC OTHER SYNTHETIC TRANSCRIPTOMIC TRANSCRIPTOMIC SINGLE CELL VIRAL RNA
Library strategy Required
Name library_strategy
Description The sequencing technique intended for this library.
Example RNA-Seq
Reference #
Namespace ei:library_strategy
Allowed Values AMPLICON ATAC-seq Bisulfite-Seq CLONE CLONEEND CTS ChIA-PET ChIP-Seq ChM-Seq DNase-Hypersensitivity EST FAIRE-seq FINISHING FL-cDNA GBS Hi-C MBD-Seq MNase-Seq MRE-Seq MeDIP-Seq NOMe-Seq OTHER POOLCLONE RAD-Seq RIP-Seq RNA-Seq Ribo-Seq SELEX Synthetic-Long-Read Targeted-Capture Tethered Chromatin Conformation Capture Tn-Seq VALIDATION WCS WGA WGS WXS miRNA-Seq ncRNA-Seq snRNA-seq ssRNA-seq
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference https://w3id.org/mixs/0000016
Regex ^[a-zA-Z0-9]+$
Namespace ontology:sequencing_id
Sequencing Platform Name Required
Name sequencing_platform_name
Description The name of the sequencing platform used for the experiment.
Example PacBio
Reference http://purl.obolibrary.org/obo/NCIT_C172274
Namespace ontology:sequencing_platform_name
Sequencing Instrument Model Required
Name sequencing_instrument_model
Description This refers to the machine or platform used for sequencing, with variations in throughput, read lengths, error rates, and application suitability.
Example Illumina NovaSeq 6000
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Namespace ontology:sequencing_instrument_model
Allowed Values 454 GS 454 GS 20 454 GS FLX 454 GS FLX Titanium 454 GS FLX+ 454 GS Junior AB 310 Genetic Analyzer AB 3130 Genetic Analyzer AB 3130xL Genetic Analyzer AB 3500 Genetic Analyzer AB 3500xL Genetic Analyzer AB 3730 Genetic Analyzer AB 3730xL Genetic Analyzer AB 5500 Genetic Analyzer AB 5500xl Genetic Analyzer AB 5500xl-W Genetic Analysis System AB SOLiD 3 Plus System AB SOLiD 4 System AB SOLiD 4hq System AB SOLiD PI System AB SOLiD System AB SOLiD System 2.0 AB SOLiD System 3.0 BGISEQ-50 BGISEQ-500 Complete Genomics DNBSEQ-G400 DNBSEQ-G400 FAST DNBSEQ-G50 DNBSEQ-T10x4RS DNBSEQ-T7 Element AVITI FASTASeq 300 GENIUS GS111 Genapsys Sequencer GenoCare 1600 GenoLab M GridION Illumina Genome Analyzer Illumina Genome Analyzer II Illumina Genome Analyzer IIx Illumina HiScanSQ Illumina HiSeq 1000 Illumina HiSeq 1500 Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 3000 Illumina HiSeq 4000 Illumina HiSeq X Illumina HiSeq X Five Illumina HiSeq X Ten Illumina MiSeq Illumina MiniSeq Illumina NextSeq 500 Illumina NextSeq 550 Illumina NovaSeq 6000 Illumina NovaSeq X Illumina NovaSeq X Plus Illumina iSeq 100 Ion GeneStudio S5 Ion GeneStudio S5 Plus Ion GeneStudio S5 Prime Ion Torrent Genexus Ion Torrent PGM Ion Torrent Proton Ion Torrent S5 Ion Torrent S5 XL MGISEQ-2000RS MinION NextSeq 1000 NextSeq 2000 Onso PacBio RS PacBio RS II PromethION Revio Sentosa SQ301 Sequel Sequel II Sequel IIe Tapestri UG 100
Library Layout Required
Name lib_layout
Description Specify whether to expect single, paired, or other configuration of reads for sequencing
Example Paired
Reference https://w3id.org/mixs/0000111
Namespace mixs:lib_layout
Allowed Values Other Paired Single Vector
UMI Barcode Read
Name umi_barcode_read
Description The type of read that contains the Unique Molecular Identifier (UMI) barcode.
Example index2
Reference #
Namespace ei:umi_barcode_read
Allowed Values index1 index2 read1 read2
UMI Barcode Offset
Name umi_barcode_offset
Description The offset in sequence of the Unique Molecular Identifier (UMI) identifying barcode.
Example 0
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_offset
UMI Barcode Size
Name umi_barcode_size
Description The size of the Unique Molecular Identifier (UMI) identifying barcode.
Example 10
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_size
Cell Barcode Read
Name cell_barcode_read
Description The type of read that contains the cell barcode.
Example index1
Reference http://www.ebi.ac.uk/efo/EFO_0010203
Namespace ontology:cell_barcode_read
Allowed Values index1 index2 read1 read2
Cell Barcode Offset
Name cell_barcode_offset
Description The offset in sequence of the cell identifying barcode.
Example 10
Reference http://www.ebi.ac.uk/efo/EFO_0010204
Regex ^\d+$
Namespace ontology:cell_barcode_offset
Cell Barcode Size
Name cell_barcode_size
Description The size of the cell identifying barcode.
Example 0
Reference http://www.ebi.ac.uk/efo/EFO_0010205
Regex ^\d+$
Namespace ontology:cell_barcode_size
cDNA Read Required
Name cdna_read
Description The type of read that contains the Complementary DNA (cDNA) sequence.
Example read1
Reference http://www.ebi.ac.uk/efo/EFO_0010195
Namespace ontology:cdna_read
Allowed Values index1 index2 read1 read2
cDNA Read Offset
Name cdna_read_offset
Description The starting position of the Complementary DNA (cDNA) read within the entire sequence, indicating where the read begins after any barcodes or technical sequences.
Example 6
Reference http://www.ebi.ac.uk/efo/EFO_0010201
Regex ^\d+$
Namespace ontology:cdna_read_offset
cDNA Read Size
Name cdna_read_size
Description The size of the Complementary DNA (cDNA) read.
Example 75
Reference http://www.ebi.ac.uk/efo/EFO_0010202
Regex ^\d+$
Namespace ontology:cdna_read_size
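The read, offset and size fields for the UMI, cell barcode and cDNA together describe where each segment sits within a read. The sketch below shows how an offset and size could be used to slice a segment out of a read. It assumes offsets are zero-based and counted from the start of the read, which the schema does not state; the function name and the example read layout are hypothetical.

```python
def extract_segment(read: str, offset: int, size: int) -> str:
    """Return the `size` bases starting at zero-based `offset` within `read`."""
    if offset < 0 or size < 0 or offset + size > len(read):
        raise ValueError("segment falls outside the read")
    return read[offset:offset + size]


if __name__ == "__main__":
    # Hypothetical read: a 16-base cell barcode followed by a 10-base UMI.
    read1 = "ACGTACGTACGTACGT" + "TTTTGGGGCC"
    print(extract_segment(read1, 0, 16))   # cell barcode
    print(extract_segment(read1, 16, 10))  # UMI
```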
Library Size
Name lib_size
Description Total number of clones in the library prepared for the project
Example 50
Reference https://w3id.org/mixs/0000039
Regex ^\d+$
Namespace mixs:lib_size
Completeness Score
Name compl_score
Description Completeness score is typically based on either the fraction of markers found as compared to a database or the percent of a genome found as compared to a closely related reference genome. High Quality Draft: >90%, Medium Quality Draft: >50%, and Low Quality Draft: < 50% should have the indicated completeness scores
Example med;60%
Reference https://w3id.org/mixs/0000069
Namespace mixs:compl_score
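The example value pairs a quality label with a percentage, and the description gives thresholds for each draft quality. A small sketch of parsing the value and applying those thresholds follows. Treating exactly 50% or below as low quality is an assumption, because the description leaves the boundary undefined; the function names are hypothetical.

```python
def parse_compl_score(value: str) -> tuple[str, float]:
    """Split 'med;60%' into ('med', 60.0)."""
    label, separator, percent_text = value.partition(";")
    if not separator:
        raise ValueError(f"expected 'label;percent%': {value!r}")
    return label.strip(), float(percent_text.strip().rstrip("%"))


def quality_tier(percent: float) -> str:
    """Map a completeness percentage to the tiers named in the description."""
    if percent > 90:
        return "High Quality Draft"
    if percent > 50:
        return "Medium Quality Draft"
    return "Low Quality Draft"  # assumption: 50% and below count as low


if __name__ == "__main__":
    label, percent = parse_compl_score("med;60%")
    print(label, percent, quality_tier(percent))
```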
Library Reads Sequenced
Name lib_reads_seq
Description Total number of clones sequenced from the library
Example 20
Reference https://w3id.org/mixs/0000040
Regex ^\d+$
Namespace mixs:lib_reads_seq
Number of Contigs
Name number_contig
Description Total number of contigs in the cleaned/submitted assembly that makes up a given genome, SAG, MAG, or UViG
Example 40
Reference https://w3id.org/mixs/0000060
Namespace mixs:number_contig
Number of Replicons
Name num_replicons
Description Reports the number of replicons in a nuclear genome of eukaryotes, in the genome of a bacterium or archaea or the number of segments in a segmented virus. Always applied to the haploid chromosome count of a eukaryote
Example 2
Reference https://w3id.org/mixs/0000022
Regex ^\d+$
Namespace mixs:num_replicons
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File Derived From
Name file_derived_from
Description The name of the file that was used to generate the analysis derived data.
Example file1_sequencing.json
Reference #
Namespace ei:file_derived_from
Inferred Cell Type
Name inferred_cell_type
Description Post analysis cell type or identity declaration based on expression profile or known gene function identified by the performer.
Example type II bipolar neuron
Reference #
Namespace ei:inferred_cell_type
Post Analysis Cell Well Quality
Name post_analysis_cell_well_quality
Description Performer defined measure of whether the read output from the cell was included in the sequencing analysis. For example, cells might be excluded if a threshold percentage of reads did not map to the genome or if pre-sequencing quality measures were not passed.
Example Pass
Reference #
Namespace ei:post_analysis_cell_well_quality
Allowed Values Fail Pass
Other Derived Cell Attributes
Name other_derived_cell_attributes
Description Any other cell-level measurement or annotation produced as a result of the analysis.
Example Cluster
Reference #
Namespace ei:other_derived_cell_attributes
Allowed Values Cluster Count Gene UMI tSNE coordinates
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Reference Genome
Name reference_genome
Description Indicate version and include stable link to genome data (or attach genome fasta file).
Example GRCh38, https://example.org/grch38.fa
Reference #
Namespace ei:reference_genome
Genome Annotation
Name genome_annotation
Description Indicate version and include stable link. Also indicate if any modification to the original annotation has been applied (e.g. 3' UTR extension) and include modified annotation file employed in the analysis.
Example Ensembl v101, https://example.org/ensembl_v101.gtf
Reference #
Namespace ei:genome_annotation
Annotation Filtering
Name annotation_filtering
Description Indicate which features were filtered (e.g. protein coding, pseudogenes, TCRs)
Example Filtered to include only protein-coding genes
Reference #
Namespace ei:annotation_filtering
Genes vs Exons
Name genes_vs_exons
Description Quantification using whole gene intervals or exons.
Example Exon quantification
Reference #
Namespace ei:genes_vs_exons
Library Structure
Name library_structure
Description The structure of the library, described in seqspec format.
Example Single-cell 3' library
Reference #
Namespace ei:library_structure
Mapping and Demultiplexing Software
Name mapping_and_demultiplexing_software
Description Name and version of the software used for mapping and demultiplexing (reads/UMI).
Example Cell Ranger 6.0.0
Reference #
Namespace ei:mapping_and_demultiplexing_software
Read Mapping Statistics
Name read_mapping_statistics
Description Statistics of the Reads or Unique Molecular Identifier (UMI).
Example 80% reads mapped to reference
Reference #
Namespace ei:read_mapping_statistics
Sequencing Saturation
Name sequencing_saturation
Description Depending on number of cells recovered (not targeted) and technology
Example 95% sequencing saturation
Reference #
Namespace ei:sequencing_saturation
UMIs or Barcode Distribution QC
Name umis_barcode_distribution_qc
Description Show Unique Molecular Identifiers (UMIs) per barcode distribution and threshold applied
Example Threshold: 10 UMIs per barcode
Reference #
Namespace ei:umis_barcode_distribution_qc
Cell or Non-Cell Filtering Strategy
Name cell_non_cell_filtering_strategy
Description Unique Molecular Identifier (UMI) threshold used to discriminate cells from non-cells. Description of algorithm (if any) and parameters used to determine cells or non-cells.
Example Threshold: 5 UMIs for cell detection
Reference #
Namespace ei:cell_non_cell_filtering_strategy
Other Quality Filters Applied
Name other_quality_filters_applied
Description Cells/nuclei discarded based on % mitochondrial reads, % rRNA reads, etc.
Example Cells with >20% mitochondrial reads discarded
Reference #
Namespace ei:other_quality_filters_applied
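The two filtering fields above describe thresholds applied per barcode. The sketch below illustrates such a filter using the example thresholds from this section (at least 5 UMIs, at most 20% mitochondrial reads). The `BarcodeStats` record and `keep_cells` function are hypothetical and do not correspond to any particular pipeline.

```python
from dataclasses import dataclass


@dataclass
class BarcodeStats:
    barcode: str
    umi_count: int
    mito_fraction: float  # fraction of reads from mitochondrial genes, 0 to 1


def keep_cells(
    stats: list[BarcodeStats], min_umis: int, max_mito_fraction: float
) -> list[str]:
    """Return barcodes that pass both the UMI and mitochondrial thresholds."""
    return [
        s.barcode
        for s in stats
        if s.umi_count >= min_umis and s.mito_fraction <= max_mito_fraction
    ]


if __name__ == "__main__":
    stats = [
        BarcodeStats("AAAC", umi_count=120, mito_fraction=0.05),
        BarcodeStats("AAAG", umi_count=3, mito_fraction=0.02),
        BarcodeStats("AAAT", umi_count=80, mito_fraction=0.35),
    ]
    print(keep_cells(stats, min_umis=5, max_mito_fraction=0.20))
```

Recording the thresholds passed to such a filter is what these two fields are intended to capture.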
Ambient RNA QC
Name ambient_rna_qc
Description Report % UMIs in background cell barcodes, and algorithm (if any) used to remove ambient RNA
Example Ambient RNA removed if >5% UMIs in background barcodes
Reference #
Namespace ei:ambient_rna_qc
Predicted Doublet Rate QC
Name predicted_doublet_rate_qc
Description Depending on number of cells recovered (not targeted) and technology
Example Predicted doublet rate: 1.5%
Reference #
Namespace ei:predicted_doublet_rate_qc
Individual Organism SNP Demultiplexing
Name individual_organism_snp_demultiplexing
Description If carried out, show SNP partitioning quality (e.g. SNP UMAP embedding or covariance matrix), algorithm used
Example SNP UMAP embedding using CellSNP
Reference #
Namespace ei:individual_organism_snp_demultiplexing
Assembly Name
Name assembly_name
Description Name/version of the assembly provided by the submitter that is used in the genome browsers and in the community
Example JCVI_ISG_i3_1.0
Reference https://w3id.org/mixs/0000057
Namespace mixs:assembly_name
Extrachromosomal Elements
Name extrachrom_elements
Description Do plasmids exist of significant phenotypic consequence (e.g. ones that determine virulence or antibiotic resistance). Megaplasmids? Other plasmids (borrelia has 15+ plasmids)
Example 5
Reference https://w3id.org/mixs/0000023
Regex ^\d+$
Namespace mixs:extrachrom_elements
Assembly Quality
Name assembly_qual
Description The assembly quality category is based on sets of criteria outlined for each assembly quality category.
Example High-quality draft genome
Reference https://w3id.org/mixs/0000056
Namespace mixs:assembly_qual
Allowed Values Finished genome Genome fragment(s) High-quality draft genome Low-quality draft genome Medium-quality draft genome
Assembly Software
Name assembly_software
Description Tool(s) used for assembly, including version number and parameters
Example metaSPAdes;3.11.0;kmer set 21,33,55,77,99,121, default parameters otherwise
Reference https://w3id.org/mixs/0000058
Namespace mixs:assembly_software
Annotation
Name annot
Description Tool used for annotation, or for cases where annotation was provided by a community jamboree or model organism database rather than by a specific submitter
Example prokka
Reference https://w3id.org/mixs/0000059
Namespace mixs:annot
Feature Prediction
Name feat_pred
Description Method used to predict UViGs features such as ORFs, integration site, etc
Example Prodigal;2.6.3;default parameters
Reference https://w3id.org/mixs/0000061
Regex ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
Namespace mixs:feat_pred
Completeness Software
Name compl_software
Description Tools used for completeness estimate, e.g. checkm, anvi'o, busco
Example checkm
Reference https://w3id.org/mixs/0000070
Namespace mixs:compl_software
Similarity Search Method
Name sim_search_meth
Description Tool used to compare ORFs with database, along with version and cutoffs used
Example HMMER3;3.1b2;hmmsearch, cutoff of 50 on score
Reference https://w3id.org/mixs/0000063
Regex ^([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+);([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$
Namespace mixs:sim_search_meth
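The Feature Prediction and Similarity Search Method fields share the same three-part pattern (tool;version;parameters). The following is a minimal Python sketch of checking a value against it, assuming the listed Regex is applied to the whole value. The helper name is hypothetical and not part of the schema.

```python
import re

# Pattern copied from the Regex lines of feat_pred and sim_search_meth.
TOOL_VERSION_PARAMS = re.compile(
    r"^([^\s-]{1,2}|[^\s-]+.+[^\s-]+);"
    r"([^\s-]{1,2}|[^\s-]+.+[^\s-]+);"
    r"([^\s-]{1,2}|[^\s-]+.+[^\s-]+)$"
)


def is_tool_version_params(value: str) -> bool:
    """Return True if value has the form 'tool;version;parameters'."""
    return TOOL_VERSION_PARAMS.match(value) is not None


# The two documented examples pass; a value without semicolons does not.
print(is_tool_version_params("Prodigal;2.6.3;default parameters"))
print(is_tool_version_params("HMMER3;3.1b2;hmmsearch, cutoff of 50 on score"))
print(is_tool_version_params("Prodigal 2.6.3"))
```

The pattern needs exactly two separating semicolons, so all three parts have to be present even when default parameters were used.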
Relevant Standard Operating Procedures
Name sop
Description Standard operating procedures used in assembly and/or annotation of genomes, metagenomes or environmental sequences
Example http://press.igsb.anl.gov/earthmicrobiome/protocols-and-standards/its/
Reference https://w3id.org/mixs/0000090
Namespace mixs:sop
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Clustering Algorithm and Version
Name clustering_algorithm_and_version
Description If compared/integrated with existing datasets
Example Louvain 0.8.0
Reference #
Namespace ei:clustering_algorithm_and_version
Clustering Parameters
Name clustering_parameters
Description If compared/integrated with existing datasets
Example Resolution: 0.6, K-nearest neighbors: 10
Reference #
Namespace ei:clustering_parameters
Integration/Batch Correction
Name integration_batch_correction
Description If compared/integrated with existing datasets
Example Harmony v1.0
Reference #
Namespace ei:integration_batch_correction
Source Code
Name source_code
Description If any newly developed code/software has been used in the processing and downstream analysis of the dataset.
Example Source code is hosted on GitHub and includes custom algorithms for UMI count normalization. The repository can be found at: https://github.com/user/umi-normalization.
Reference #
Namespace ei:source_code
UMI Count Matrix
Name umi_count_matrix
Description Gene x cell matrix with UMI counts for each gene in each cell.
Example The UMI count matrix is stored in a CSV file with gene IDs as rows (e.g., ENSG00000139618) and cell barcodes as columns (e.g., Cell_001, Cell_002). The matrix file is available at: https://example.com/umi_count_matrix.csv.
Reference #
Namespace ei:umi_count_matrix
Ensembl IDs
Name ensembl_ids
Description Gene or transcript names should be listed as Ensembl (or other standardized ID), with gene short names in metadata.
Example ENSG00000139618
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:ensembl_ids
Functional Gene Annotations
Name functional_gene_annotations
Description Any functional annotation generated/used (gene names, GOs, structural domains, etc.).
Example Functional gene annotations, including Gene Ontology (GO) terms, are provided in the metadata. For example, the gene 'ENSG00000139618' (BRCA2) is annotated with the GO term 'GO:0003677' (DNA binding).
Reference #
Namespace ei:functional_gene_annotations
Protein Models
Name protein_models
Description FASTA file with (or stable link to) the predicted proteins associated to genes in the UMI count matrix and matching IDs.
Example The protein sequences for genes are provided in a FASTA file available at: https://example.com/protein_models.fasta, where each protein sequence is linked to the corresponding gene ID.
Reference #
Namespace ei:protein_models
Cell Metadata
Name cell_metadata
Description Table mapping cell IDs to cluster/cell type/broad cell type annotations.
Example Cell metadata includes information such as cell type annotations ('Tumor', 'Normal') and experimental conditions ('Control', 'Treatment'). This data is available in a table at: https://example.com/cell_metadata.csv.
Reference #
Namespace ei:cell_metadata
Cluster-Level Normalised Expression Tables
Name cluster_level_normalised_expression_tables
Description Expression tables that show normalised gene expression at the cluster or cell-type level.
Example Normalised gene expression data at the cluster level is provided in a CSV file. For example, gene 'ENSG00000139618' (BRCA2) has expression values for clusters: Cluster_1: 1200, Cluster_2: 900. The full expression table is available at: https://example.com/cluster_level_expression.csv.
Reference #
Namespace ei:cluster_level_normalised_expression_tables
Other Resource Files
Name other_resource_files
Description Necessary to re-use and interpret the data. E.g. barcode information in complex, serial multiplexing protocols (clicktags).
Example Barcode information used in multiplexing protocols is provided in a separate file, which can be accessed at: https://example.com/barcode_data.csv.
Reference #
Namespace ei:other_resource_files
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example FILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:sequencing_id
Read 1 File Required
Name read_1_file
Description The name or accession of the file that contains read 1.
Example file1_r1.fastq.gz
Reference #
Namespace ei:read_1_file
Read 2 File
Name read_2_file
Description The name or accession of the file that contains read 2.
Example file2_r2.fastq.gz
Reference #
Namespace ei:read_2_file
Index 1 File
Name index_1_file
Description The name of the file that contains index 1.
Example file1_i1.fastq.gz
Reference #
Namespace ei:index_1_file
Index 2 File
Name index_2_file
Description The name of the file that contains index 2.
Example file2_i2.fastq.gz
Reference #
Namespace ei:index_2_file
Read 1 Checksum Required
Name read_1_file_checksum
Description Result of a hash function calculated on the content of the read 1 file to verify file integrity. The value must be a single MD5 digest written as 32 lowercase hexadecimal characters.
Example f8d29e41a73b5c02de9a6fb314e7c8ad
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_1_file_checksum
Read 2 Checksum
Name read_2_file_checksum
Description Result of a hash function calculated on the content of the read 2 file to verify file integrity. The value must be a single MD5 digest written as 32 lowercase hexadecimal characters.
Example a3f4c1b29d8e57fa41b02de6c7f9ab83
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_2_file_checksum
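The Regex for the read checksum fields accepts a 32-character lowercase hexadecimal string, which is the form of an MD5 digest. The following is a minimal sketch of producing such a value with Python's standard library. The function name and the temporary demonstration file are assumptions made for illustration only.

```python
import hashlib
import re
import tempfile

MD5_HEX = re.compile(r"^[0-9a-f]{32}$")  # Regex line of the checksum fields


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in chunks so large read files need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a small temporary file standing in for a read file.
with tempfile.NamedTemporaryFile(suffix=".fastq", delete=False) as tmp:
    tmp.write(b"@read1\nACGT\n+\nFFFF\n")
    demo_path = tmp.name

checksum = md5_of_file(demo_path)
print(checksum, bool(MD5_HEX.match(checksum)))
```

hexdigest() returns lowercase hexadecimal, so the result matches the pattern without further normalisation.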
White List Barcode File
Name white_list_barcode_file
Description A file containing the known cell barcodes in the dataset.
Example barcodes.tsv
Reference #
Namespace ei:white_list_barcode_file
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Expression Data Process Setting ID Required
Name expression_data_process_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_process_setting_id
Matrix Type
Name matrix_type
Description The type of values held in the expression matrix, e.g. raw counts or normalised values
Example raw_counts
Reference #
Namespace ei:matrix_type
Allowed Values imputed log1p normalised pseudobulk raw_counts scaled
Reference Genome Required
Name reference_genome
Description The associated reference genome
Example https://reference-genome-example.com
Reference #
Regex ^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$
Namespace ei:reference_genome
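The reference_genome Regex accepts one URL or several URLs joined by a pipe character with no surrounding spaces. The following is a small sketch of validating and splitting such a value. The function name and the example.org URLs are hypothetical.

```python
import re

# Pattern copied from the Regex line of reference_genome.
REFERENCE_GENOME = re.compile(
    r"^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$"
)


def split_reference_genomes(value: str) -> list:
    """Validate a reference_genome value and return its individual URLs."""
    if not REFERENCE_GENOME.match(value):
        raise ValueError(f"not a valid reference_genome value: {value!r}")
    return value.split("|")


print(split_reference_genomes("https://reference-genome-example.com"))
print(split_reference_genomes(
    "https://a.example.org/genome.fa|ftp://b.example.org/genome.fa"
))
```

A value written with spaces around the pipe is rejected, because whitespace is excluded from each URL.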
Annotation Version
Name annotation_version
Description The annotation version of the associated reference genome
Example GENCODE v44
Reference #
Namespace ei:annotation_version
Normalisation Method
Name normalisation_method
Description Any normalisation processing performed
Example Log Normalisation
Reference #
Namespace ei:normalisation_method
Allowed Values Library Size Normalisation Log Normalisation SCNorm SCTransform scran
Highly Variable Gene Selection (HVG)
Name highly_variable_gene_selection
Description Method used to select highly variable genes and the number of genes selected
Example seurat_v3, n=2000
Reference #
Namespace ei:highly_variable_gene_selection
Dimensionality Reduction
Name dimensionality_reduction
Description Method used to reduce dimensionality in the expression data
Example PCA
Reference #
Namespace ei:dimensionality_reduction
Allowed Values Diffusion Map ICA NMF PCA UMAP t-SNE
Number of Nearest Neighbours
Name n_neighbours
Description Number of nearest neighbours used to calculate cluster membership
Example pca:50
Reference #
Namespace ei:n_neighbours
Clustering Algorithm
Name clustering_algorithm
Description Algorithm used to create clusters
Reference #
Namespace ei:clustering_algorithm
Clustering Resolution
Name clustering_resolution
Description Resolution parameter
Example 2.5
Reference #
Regex ^([0-9]*[.])?[0-9]+$
Namespace ei:clustering_resolution
Clustering Distance Metric
Name clustering_distance_metric
Description Metric used to calculate a point's distance to others
Example cosine
Reference #
Namespace ei:clustering_distance_metric
Allowed Values cosine euclidean hamming jaccard manhattan mahalanobis
Software Versions
Name software_versions
Description Primary software packages used for analysis
Reference #
Namespace ei:software_versions
Cell Type Annotation
Name cell-type annotation
Description Tools and Databases used for cell annotation
Reference #
Namespace ei:cell-type annotation
Generated by Pipeline
Name generated_by_pipeline
Description URL of the deposited pipeline used to create this data
Reference #
Regex ^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$
Namespace ei:generated_by_pipeline
Notes
Name notes
Description Any other information
Reference #
Namespace ei:notes
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name expression_data_file_id
Description A unique alphanumeric identifier for the expression data file
Example EXPFILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric identifier for library preparation
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Expression Data Process Setting ID Required
Name expression_data_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_setting_id
File Name Required
Name expression_data_file
Description Expression data file name
Example exp_file.csv
Reference #
Namespace ei:expression_data_file
File MD5 Checksum Required
Name expression_data_file_checksum
Description Calculated MD5 checksum for this file
Example 9e4b7a23f6c1d0ab85f29c47e3d8a610
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:expression_data_file_checksum
File Format Required
Name expression_data_file_format
Description The format of the expression file, such as h5ad or rds
Example csv
Reference #
Namespace ei:expression_data_file_format
Allowed Values csv h5ad loom mtx rds
Number of Cells
Name n_cells
Description The number of cells represented in the expression data
Example 4
Reference #
Regex ^\d+$
Namespace ei:n_cells
Number of Genes
Name n_genes
Description The number of genes represented in the expression data
Example 50
Reference #
Regex ^\d+$
Namespace ei:n_genes
File Size in Bytes
Name file_size_bytes
Description Size of the file recorded in bytes
Example 90
Reference #
Regex ^\d+$
Namespace ei:file_size_bytes
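The file name, file format and file size fields can be derived from the file itself. The following is a minimal sketch, assuming the format is taken from the file extension and that values are recorded as strings. The function name is hypothetical, and compressed names such as .mtx.gz would need extra handling.

```python
import os
import tempfile

# Allowed Values of expression_data_file_format.
ALLOWED_FORMATS = {"csv", "h5ad", "loom", "mtx", "rds"}


def describe_expression_file(path: str) -> dict:
    """Build the file name, format and size fields for one expression file."""
    suffix = os.path.splitext(path)[1].lstrip(".").lower()
    if suffix not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported expression file format: {suffix!r}")
    return {
        "expression_data_file": os.path.basename(path),
        "expression_data_file_format": suffix,
        "file_size_bytes": str(os.path.getsize(path)),
    }


# Demonstration on a 90-byte temporary file named like the documented example.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "exp_file.csv")
    with open(demo_path, "wb") as handle:
        handle.write(b"x" * 90)
    info = describe_expression_file(demo_path)

print(info)
```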
Date Generated
Name date_generated
Description Approximate date this expression data was generated
Example 2024-10-14
Reference #
Regex ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Namespace ei:date_generated
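The date Regex checks the shape of the value (YYYY-MM-DD) but cannot tell whether the day exists in the given month. The following is a short sketch that combines the pattern with a calendar check from Python's standard library. The function name is an assumption.

```python
import re
from datetime import date

# Pattern copied from the Regex line of date_generated.
DATE_PATTERN = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$")


def parse_date_generated(value: str) -> date:
    """Check the documented pattern, then confirm the date really exists."""
    if not DATE_PATTERN.match(value):
        raise ValueError(f"expected YYYY-MM-DD, got {value!r}")
    # fromisoformat raises ValueError for impossible dates such as 2024-02-31.
    return date.fromisoformat(value)


print(parse_date_generated("2024-10-14"))
```

A value such as 2024-02-31 passes the pattern but is rejected by the calendar check.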
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Title
Name title
Description A name given to the study or project. Project title should be fewer than 30 words, such as a title of a grant proposal or a publication.
Example Study of single cells in the human body
Reference http://purl.org/dc/terms/title
Namespace dcterms:title
Description
Name description
Description A detailed description of the project which includes research goals and experimental approach. Project description should be fewer than 300 words, such as an abstract from a grant application or publication.
Example This project explores the intricate details of single cells in the human body, focusing on their structure, function, and behaviour. By studying individual cells, it aims to uncover how they contribute to overall health, disease progression, and human biology. This research can provide deeper insights into cellular processes, paving the way for advancements in medical treatments and personalised medicine.
Reference http://purl.org/dc/terms/description
Namespace dcterms:description
Workflow
Name workflow
Description The workflow or protocol followed during the study.
Example Laser microdissection
Reference #
Namespace ei:workflow
Allowed Values Laser microdissection Laser microdissection, Culturing Laser microdissection, Culturing, Sequencing Laser microdissection, Sequencing Microfluidics, Facs, Culturing Microfluidics, Facs, Culturing, Sequencing Microfluidics, Facs, Sequencing Spatial Transcriptomics
Technology Required
Name technology
Description The sorting or visualisation technology used.
Example Vizgen
Reference #
Namespace ei:technology
Licence
Name licence
Description Specifies the terms under which the data associated with the study can be used, shared, or reused. It informs users how they may legally reference, distribute, or build upon the study. Common licenses include Creative Commons (e.g., CC BY 4.0), which require attribution to the original authors when the data is cited or reused.
Example MIT
Reference #
Namespace ei:licence
Allowed Values Apache-2.0 CC-BY-4.0 CC-BY-SA-4.0 CC0-1.0 GPL-3.0-or-later MIT
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Orcid ID
Name orcid_id
Description A 16-digit identifier that uniquely identifies a researcher. The final character is a check character and may be X.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$
Namespace ei:orcid_id
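The Regex above checks only the layout of an ORCID iD. ORCID iDs also end in an ISO 7064 MOD 11-2 check character, which the pattern does not verify. The following is a sketch of checking both. The function names are hypothetical, and the checksum step is an addition that this schema does not require.

```python
import re

# Pattern copied from the Regex line of orcid_id.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Return True if value matches the layout and its check character."""
    if not ORCID_PATTERN.match(value):
        return False
    digits = value.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-1234-5678-9012"))
print(is_valid_orcid("0000-0002-1825-0098"))
```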
First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace schema.org:email
Affiliation or Institution Required
Name affiliation
Description An organisation or institution that this person is associated with.
Example University of Liverpool
Reference https://schema.org/affiliation
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace schema.org:affiliation
Funder
Name funder
Description A person or organization that supports (sponsors) something through some kind of financial contribution.
Example BBSRC
Reference https://schema.org/funder
Namespace schema.org:funder
Grant Award
Name funding
Description A grant that directly or indirectly provides funding or sponsorship for the person to conduct the study.
Example GRAK3489
Reference https://schema.org/funding
Regex ^[A-Za-z0-9]+(?: [A-Za-z0-9]+)*$
Namespace schema.org:funding
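The person fields above combine Required markers with per-field Regex patterns. The following is a minimal sketch of checking one row against them, assuming rows arrive as dictionaries of strings keyed by the Name values. The table layout and function name are hypothetical.

```python
import re

# (required, pattern) per field, copied from the records above.
PERSON_FIELDS = {
    "study_id": (True, r"^[a-zA-Z0-9]+$"),
    "orcid_id": (False, r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$"),
    "givenName": (True, r"^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$"),
    "familyName": (True, r"^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$"),
    "email": (False, r"^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$"),
    "affiliation": (True, r"^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$"),
    "funder": (False, None),
    "funding": (False, r"^[A-Za-z0-9]+(?: [A-Za-z0-9]+)*$"),
}


def validate_person(row: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    for name, (required, pattern) in PERSON_FIELDS.items():
        value = row.get(name, "")
        if not value:
            if required:
                errors.append(f"{name}: required value is missing")
            continue
        if pattern and not re.match(pattern, value):
            errors.append(f"{name}: {value!r} does not match {pattern}")
    return errors


example_row = {
    "study_id": "STUDY001",
    "orcid_id": "0000-1234-5678-9012",
    "givenName": "Jane",
    "familyName": "Doe",
    "email": "jane.doe@example.com",
    "affiliation": "University of Liverpool",
    "funder": "BBSRC",
    "funding": "GRAK3489",
}
print(validate_person(example_row))
```

Optional fields are checked only when a value is present, so an empty email is accepted while a malformed one is reported.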
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Scientific Name or Organism
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ontology:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must be accurately matched to the provided scientificName to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ontology:taxon_id
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ontology:biosampleAccession
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
Protocol Name Required
Name protocol_name
Description A descriptive name of the protocol used for single-cell sequencing.
Example 10X Genomics Single Cell 3' Library Prep
Reference #
Namespace ei:protocol_name
Dissociation Description Required
Name dissociation_description
Description A free-text description of the process used to separate cells from tissues or cell aggregates.
Example Tissue was enzymatically dissociated using collagenase for 30 minutes.
Reference #
Namespace ei:dissociation_description
Enrichment Markers
Name enrichment_markers
Description Description of the specificity markers used to isolate cell populations, e.g. 'CD45+'. Please contact FAANG DCC to add more terms.
Example CD45
Reference #
Namespace faang:enrichment_markers
Isolation Kit
Name isolation_kit
Description The kit used to isolate the cells.
Example 10x Nuclei Isolation Kit
Reference #
Namespace ei:isolation_kit
Allowed Values 10x Nuclei Isolation Kit 3' standard throughput kit Custom
Literature Source Reference
Name literature_source_reference
Description Reference to literature sources that describe the protocol or methods used.
Example Doe et al. (2024), 'Single-cell RNA-seq: A comprehensive overview'
Reference #
Namespace ei:literature_source_reference
Protocols IO Reference
Name protocols_io_reference
Description Reference link to protocols.io for additional details on the protocol.
Example https://www.protocols.io/view/sample-protocol-b2ubqesn
Reference #
Regex ^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)+(?: \| https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)*)*$
Namespace ei:protocols_io_reference
WorkflowHub SOP Reference
Name workflow_hub_sop_reference
Description Reference to the Standard Operating Procedure (SOP) in WorkflowHub.
Example https://workflowhub.eu/works/12345
Reference #
Namespace ei:workflow_hub_sop_reference
Dissociation Protocol Method
Name dissociation_protocol_method
Description The method used to dissociate tissues into single cells.
Example Mechanical and enzymatic dissociation
Reference #
Namespace ei:dissociation_protocol_method
Single Cell Quality Metric
Name single_cell_quality_metric
Description Metrics used to assess the quality of single cells before sequencing.
Example Cell viability percentage
Reference #
Namespace ei:single_cell_quality_metric
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the sample
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample associated with the cell suspension. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
Suspension Type Required
Name suspension_type
Description The type of suspension used to keep cells in solution during processing.
Example Cell
Reference #
Namespace ei:suspension_type
Allowed Values Cell Nuclei Protoplast
Cell Count
Name cell_count
Description A number representing the number of cells in the sequencing library.
Example 10000
Reference #
Regex ^\d+$
Namespace ei:cell_count
Cell Number
Name cell_number
Description A range representing the number of cells in the sequencing library.
Example 101-10000
Reference #
Namespace tol:cell_number
Allowed Values 1 1000000+ 100001-500000 10001-50000 101-10000 11-50 2-10 500001-1000000 50001-100000 51-100
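Cell Count holds an exact number, while Cell Number holds one of the listed ranges. The following is a small sketch of deriving the range from the count. The function name is hypothetical. The listed ranges overlap at exactly 1,000,000 cells, and placing that value in 500001-1000000 is an assumption.

```python
# Bounded ranges copied from the Allowed Values of cell_number.
CELL_NUMBER_BINS = [
    (1, 1, "1"),
    (2, 10, "2-10"),
    (11, 50, "11-50"),
    (51, 100, "51-100"),
    (101, 10000, "101-10000"),
    (10001, 50000, "10001-50000"),
    (50001, 100000, "50001-100000"),
    (100001, 500000, "100001-500000"),
    (500001, 1000000, "500001-1000000"),
]


def cell_number_bin(cell_count: int) -> str:
    """Map an exact cell count to the matching cell_number range."""
    if cell_count < 1:
        raise ValueError("cell_count must be a positive integer")
    for low, high, label in CELL_NUMBER_BINS:
        if low <= cell_count <= high:
            return label
    # Anything above the last bounded range falls in the open-ended bin.
    return "1000000+"


print(cell_number_bin(10000))
```

With the documented examples, a cell_count of 10000 maps to the cell_number value 101-10000.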
Cell Viability
Name cell_viability
Description The percentage of living cells in a sample, indicating the health and quality of cells for RNA-sequencing analysis.
Example 95
Reference #
Namespace ei:cell_viability
Cell Viability Assessment Method
Name cell_viability_assessment_method
Description The method used to evaluate the viability of cells in the sample, often involving staining or flow cytometry techniques.
Example Trypan Blue Exclusion
Reference #
Namespace ei:cell_viability_assessment_method
Cell Size
Name cell_size
Description The size of the cell, typically measured in micrometres.
Example 10
Reference #
Namespace ei:cell_size
Suspension Volume (µL)
Name suspension_volume_µl
Description The volume of the cell suspension in microlitres (µL).
Example 100
Reference #
Namespace ei:suspension_volume_µl
Suspension Concentration Cells Per µL
Name suspension_concentration_cells_per_µl
Description The concentration of cells in the suspension, in cells per microlitre (cells/µL).
Example 1000
Reference #
Namespace ei:suspension_concentration_cells_per_µl
Suspension Dilution
Name suspension_dilution
Description The dilution factor of the cell suspension.
Example 1:10
Reference #
Namespace ei:suspension_dilution
Loading Volume (µL)
Name loading_volume_µl
Description The volume of the cell suspension, in microlitres (µL), loaded into the single-cell RNA-sequencing system for analysis.
Example 10
Reference #
Regex ^\d+$
Namespace ei:loading_volume_µl
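The suspension volume, concentration and loading volume are related by simple arithmetic. The following is a sketch using the documented example values, assuming the concentration is expressed in cells per µL and describes the suspension that is actually loaded. Neither assumption is stated by the schema, and the function name is hypothetical.

```python
def cells_in_volume(concentration_cells_per_ul: float, volume_ul: float) -> float:
    """Number of cells contained in a given volume of suspension."""
    return concentration_cells_per_ul * volume_ul


# Example values taken from the fields above.
suspension_volume_ul = 100
concentration_cells_per_ul = 1000
loading_volume_ul = 10

total_cells = cells_in_volume(concentration_cells_per_ul, suspension_volume_ul)
loaded_cells = cells_in_volume(concentration_cells_per_ul, loading_volume_ul)
print(total_cells, loaded_cells)
```

Under these assumptions the 10 µL load carries 10,000 cells, one tenth of the 100,000 cells in the whole suspension.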
Suspension Dilution Buffer
Name suspension_dilution_buffer
Description A solution used to dilute cell suspensions to a desired concentration, typically prior to loading cells into a device for single-cell RNA sequencing. It helps maintain cell viability and integrity during processing.
Example PBS (Phosphate-buffered saline) with 0.04% BSA (Bovine serum albumin)
Reference #
Namespace ei:suspension_dilution_buffer
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the library preparation.
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Library Preparation Kit Required
Name library_prep_kit
Description Packaged kits (containing adapters, indexes, enzymes, buffers etc.), tailored for specific sequencing workflows, which allow the simplified preparation of sequencing-ready libraries for small genomes, amplicons, and plasmids
Example 10X Genomics Single Cell 3' v3
Reference https://w3id.org/mixs/0001145
Namespace mixs:library_prep_kit
Library Preparation Kit Version Required
Name library_prep_kit_version
Description The version number of the library preparation kit used for sequencing.
Example 2
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Regex ^\d+(\.\d+)?$
Namespace ontology:library_prep_kit_version
Amplification Method
Name amplification_method
Description The method used to amplify the Complementary DNA (cDNA).
Example PCR
Reference #
Namespace ei:amplification_method
cDNA Amplification Cycles
Name cdna_amplification_cycles
Description The number of cycles used during the Complementary DNA (cDNA) amplification process.
Example 12
Reference #
Regex ^\d+$
Namespace ei:cdna_amplification_cycles
Average Size Distribution
Name average_size_distribution
Description The average length of RNA fragments in base pairs (bp) after library preparation, indicating the quality and suitability of the RNA for sequencing.
Example 350
Reference #
Regex ^\d+$
Namespace ei:average_size_distribution
Library Construction Method
Name lib_construction_method
Description The library construction method (including version) that was used.
Example Smart-Seq2
Reference #
Namespace ei:lib_construction_method
Input Molecule
Name input_molecule
Description The specific fraction of biological macromolecule from which the sequencing library is derived.
Example RNA
Reference #
Namespace ei:input_molecule
Primer
Name primer
Description The type of primer used for reverse transcription. This allows users to identify content of the cDNA library input for mRNA.
Example Random
Reference #
Namespace ei:primer
Allowed Values Oligo-dT Random
Primeness Required
Name primeness
Description The end from which the molecule was sequenced.
Example 5'
Reference #
Namespace ei:primeness
Allowed Values 3' 5' Both
End Bias
Name end_bias
Description The end bias of the library.
Example 3
Reference #
Namespace ei:end_bias
Allowed Values 3 5
Library Strand
Name library_strand
Description The complementary DNA (cDNA) strand of the library from which the reads are derived: sense (first), antisense (second), both or none.
Example Antisense
Reference #
Namespace ei:library_strand
Allowed Values Antisense Both Sense Unstranded
Spike In Required
Name spike_in
Description External RNA added to the sample as a control to assess technical variability and normalization in RNA-sequencing. State whether spike-in was used.
Example Yes
Reference #
Namespace ei:spike_in
Allowed Values No Yes
Spike Type
Name spike_type
Description The specific type of external RNA used for spiking in, often indicating the source or nature of the control RNA.
Example Synthetic RNA
Reference #
Namespace ei:spike_type
Spike In Dilution Or Concentration
Name spike_in_dilution_or_concentration
Description The final concentration or dilution (for commercial sets) of the spike in mix.
Example 1:1000
Reference #
Namespace ei:spike_in_dilution_or_concentration
i5 Index Required
Name i5_index
Description Barcode sequence used on the i5 adapter during library preparation for identifying samples in multiplexed single-cell RNA-sequencing.
Example ATCACG
Reference #
Namespace ei:i5_index
i7 Index Required
Name i7_index
Description Barcode sequence used on the i7 adapter to distinguish samples in multiplexed sequencing runs.
Example CGATGT
Reference #
Namespace ei:i7_index
Dual or Single Index Required
Name dual_single_index
Description Specifies if both i5 and i7 indices (dual) or only one index (single) was used for sample identification during sequencing.
Example Dual
Reference #
Namespace ei:dual_single_index
Allowed Values Dual Single
i5 Sequence Required
Name i5_sequence
Description The nucleotide sequence of the i5 index used in multiplexing during sequencing.
Example ATCGTAGC
Reference #
Namespace ei:i5_sequence
i7 Sequence Required
Name i7_sequence
Description The specific nucleotide sequence of the i7 index used for a sample.
Example TGCATGCA
Reference #
Namespace ei:i7_sequence
Plate ID
Name plate_id
Description Identifier for the 96-well plate used in sample preparation.
Example PLT001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:plate_id
Well Row
Name well_row
Description The row identifier in a 96-well plate indicating the sample's position.
Example A
Reference #
Namespace ei:well_row
Well Column
Name well_col
Description The column identifier in a 96-well plate indicating the sample's position.
Example 5
Reference #
Regex ^\d+$
Namespace ei:well_col
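The Well Row and Well Column fields together locate a sample on a 96-well plate, which has 8 rows (A to H) and 12 columns (1 to 12). The following is a minimal sketch of checking the pair and building a combined label. The function name and the combined label format are assumptions.

```python
import re

WELL_ROWS = "ABCDEFGH"        # a standard 96-well plate has 8 rows
WELL_COLUMNS = range(1, 13)   # and 12 columns


def well_label(well_row: str, well_col: str) -> str:
    """Validate a row/column pair and return a label such as 'A5'."""
    if len(well_row) != 1 or well_row.upper() not in WELL_ROWS:
        raise ValueError(f"well_row must be one of A-H, got {well_row!r}")
    if not re.fullmatch(r"\d+", well_col) or int(well_col) not in WELL_COLUMNS:
        raise ValueError(f"well_col must be 1-12, got {well_col!r}")
    return f"{well_row.upper()}{int(well_col)}"


print(well_label("A", "5"))
```

The well_col Regex alone accepts any non-negative integer, so the range check is what ties the value to the plate layout.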
Cell Phenotype
Name cell_phenotype
Description The cell marker for the Fluorescence-Activated Cell Sorting (FACS) of cells.
Example CD41-
Reference #
Namespace ei:cell_phenotype
Allowed Values CD41+ CD41-
Design Description
Name design_description
Description The design of the library including details of how it was constructed.
Reference #
Namespace ei:design_description
Library Selection Required
Name library_selection
Description The method used to select for or against, enrich, or screen the material being sequenced.
Example RANDOM PCR
Reference #
Namespace ei:library_selection
Allowed Values 5-methylcytidine antibody CAGE ChIP ChIP-Seq Dnase HMPR Hybrid Selection Inverse rRNA Inverse rRNA selection MBD2 protein methyl-CpG binding domain MDA MF MSLL Mnase Oligo-dT PCR PolyA RACE RANDOM RANDOM PCR RT-PCR Reduced Representation Restriction Digest cDNA cDNA_oligo_dT cDNA_randomPriming other padlock probes capture method repeat fractionation size fractionation unspecified
Library Source Required
Name library_source
Description The type of source material that is being sequenced.
Example GENOMIC
Reference #
Namespace ei:library_source
Allowed Values GENOMIC GENOMIC SINGLE CELL METAGENOMIC METATRANSCRIPTOMIC OTHER SYNTHETIC TRANSCRIPTOMIC TRANSCRIPTOMIC SINGLE CELL VIRAL RNA
Library Strategy Required
Name library_strategy
Description The sequencing technique intended for this library.
Example RNA-Seq
Reference #
Namespace ei:library_strategy
Allowed Values AMPLICON ATAC-seq Bisulfite-Seq CLONE CLONEEND CTS ChIA-PET ChIP-Seq ChM-Seq DNase-Hypersensitivity EST FAIRE-seq FINISHING FL-cDNA GBS Hi-C MBD-Seq MNase-Seq MRE-Seq MeDIP-Seq NOMe-Seq OTHER POOLCLONE RAD-Seq RIP-Seq RNA-Seq Ribo-Seq SELEX Synthetic-Long-Read Targeted-Capture Tethered Chromatin Conformation Capture Tn-Seq VALIDATION WCS WGA WGS WXS miRNA-Seq ncRNA-Seq snRNA-seq ssRNA-seq
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference https://w3id.org/mixs/0000016
Regex ^[a-zA-Z0-9]+$
Namespace ontology:sequencing_id
Sequencing Platform Name Required
Name sequencing_platform_name
Description The name of the sequencing platform used for the experiment.
Example Pacbio
Reference http://purl.obolibrary.org/obo/NCIT_C172274
Namespace ontology:sequencing_platform_name
Sequencing Instrument Model Required
Name sequencing_instrument_model
Description This refers to the machine or platform used for sequencing, with variations in throughput, read lengths, error rates, and application suitability.
Example Illumina NovaSeq 6000
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Namespace ontology:sequencing_instrument_model
Allowed Values 454 GS 454 GS 20 454 GS FLX 454 GS FLX Titanium 454 GS FLX+ 454 GS Junior AB 310 Genetic Analyzer AB 3130 Genetic Analyzer AB 3130xL Genetic Analyzer AB 3500 Genetic Analyzer AB 3500xL Genetic Analyzer AB 3730 Genetic Analyzer AB 3730xL Genetic Analyzer AB 5500 Genetic Analyzer AB 5500xl Genetic Analyzer AB 5500xl-W Genetic Analysis System AB SOLiD 3 Plus System AB SOLiD 4 System AB SOLiD 4hq System AB SOLiD PI System AB SOLiD System AB SOLiD System 2.0 AB SOLiD System 3.0 BGISEQ-50 BGISEQ-500 Complete Genomics DNBSEQ-G400 DNBSEQ-G400 FAST DNBSEQ-G50 DNBSEQ-T10x4RS DNBSEQ-T7 Element AVITI FASTASeq 300 GENIUS GS111 Genapsys Sequencer GenoCare 1600 GenoLab M GridION Illumina Genome Analyzer Illumina Genome Analyzer II Illumina Genome Analyzer IIx Illumina HiScanSQ Illumina HiSeq 1000 Illumina HiSeq 1500 Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 3000 Illumina HiSeq 4000 Illumina HiSeq X Illumina HiSeq X Five Illumina HiSeq X Ten Illumina MiSeq Illumina MiniSeq Illumina NextSeq 500 Illumina NextSeq 550 Illumina NovaSeq 6000 Illumina NovaSeq X Illumina NovaSeq X Plus Illumina iSeq 100 Ion GeneStudio S5 Ion GeneStudio S5 Plus Ion GeneStudio S5 Prime Ion Torrent Genexus Ion Torrent PGM Ion Torrent Proton Ion Torrent S5 Ion Torrent S5 XL MGISEQ-2000RS MinION NextSeq 1000 NextSeq 2000 Onso PacBio RS PacBio RS II PromethION Revio Sentosa SQ301 Sequel Sequel II Sequel IIe Tapestri UG 100
Library Layout Required
Name lib_layout
Description Specify whether to expect single, paired, or other configuration of reads for sequencing
Example Paired
Reference https://w3id.org/mixs/0000111
Namespace mixs:lib_layout
Allowed Values Other Paired Single Vector
UMI Barcode Read
Name umi_barcode_read
Description The type of read that contains the Unique Molecular Identifier (UMI) barcode.
Example index2
Reference #
Namespace ei:umi_barcode_read
Allowed Values index1 index2 read1 read2
UMI Barcode Offset
Name umi_barcode_offset
Description The offset in sequence of the Unique Molecular Identifier (UMI) identifying barcode.
Example 0
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_offset
UMI Barcode Size
Name umi_barcode_size
Description The size of the Unique Molecular Identifier (UMI) identifying barcode.
Example 10
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_size
Cell Barcode Read
Name cell_barcode_read
Description The type of read that contains the cell barcode.
Example index1
Reference http://www.ebi.ac.uk/efo/EFO_0010203
Namespace ontology:cell_barcode_read
Allowed Values index1 index2 read1 read2
Cell Barcode Offset
Name cell_barcode_offset
Description The offset in sequence of the cell identifying barcode.
Example 10
Reference http://www.ebi.ac.uk/efo/EFO_0010204
Regex ^\d+$
Namespace ontology:cell_barcode_offset
Cell Barcode Size
Name cell_barcode_size
Description The size of the cell identifying barcode.
Example 0
Reference http://www.ebi.ac.uk/efo/EFO_0010205
Regex ^\d+$
Namespace ontology:cell_barcode_size
cDNA Read Required
Name cdna_read
Description The actual nucleotide sequence obtained from Complementary DNA (cDNA) during sequencing.
Example read1
Reference http://www.ebi.ac.uk/efo/EFO_0010195
Namespace ontology:cdna_read
Allowed Values index1 index2 read1 read2
cDNA Read Offset
Name cdna_read_offset
Description The starting position of the Complementary DNA (cDNA) read within the entire sequence, indicating where the read begins after any barcodes or technical sequences.
Example 6
Reference http://www.ebi.ac.uk/efo/EFO_0010201
Regex ^\d+$
Namespace ontology:cdna_read_offset
cDNA Read Size
Name cdna_read_size
Description The size of the Complementary DNA (cDNA) read.
Example 75
Reference http://www.ebi.ac.uk/efo/EFO_0010202
Regex ^\d+$
Namespace ontology:cdna_read_size
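The read, offset, and size fields above together describe where each element sits within a sequencing record. The following is a minimal sketch of how a consumer might apply them, assuming offsets are 0-based and sizes are counted in bases (the schema does not state either explicitly). The function `extract_segment`, the layout, and the read sequences are hypothetical illustrations.

```python
def extract_segment(reads, read_name, offset, size):
    """Return `size` bases starting at the 0-based `offset` of the named read."""
    sequence = reads[read_name]
    if offset + size > len(sequence):
        raise ValueError(
            f"{read_name} is too short for offset={offset}, size={size}"
        )
    return sequence[offset:offset + size]


# Hypothetical layout: cell barcode then UMI on read1, cDNA on read2.
layout = {
    "cell_barcode": {"read": "read1", "offset": 0, "size": 16},
    "umi_barcode": {"read": "read1", "offset": 16, "size": 10},
    "cdna": {"read": "read2", "offset": 0, "size": 20},
}

# Made-up sequences, for illustration only.
reads = {
    "read1": "ACGTACGTACGTACGT" + "TTTTGGGGCC",
    "read2": "GATTACAGATTACAGATTACAGATTACA",
}

segments = {
    name: extract_segment(reads, spec["read"], spec["offset"], spec["size"])
    for name, spec in layout.items()
}
print(segments)
```

Keeping the layout as data rather than hard-coded slices means the same function can serve any combination of read, offset, and size values recorded in the metadata.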
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File Derived From
Name file_derived_from
Description The name of the file that was used to generate the analysis derived data.
Example file1_sequencing.json
Reference #
Namespace ei:file_derived_from
Inferred Cell Type
Name inferred_cell_type
Description Post analysis cell type or identity declaration based on expression profile or known gene function identified by the performer.
Example type II bipolar neuron
Reference #
Namespace ei:inferred_cell_type
Post Analysis Cell Well Quality
Name post_analysis_cell_well_quality
Description Performer defined measure of whether the read output from the cell was included in the sequencing analysis. For example, cells might be excluded if a threshold percentage of reads did not map to the genome or if pre-sequencing quality measures were not passed.
Example Pass
Reference #
Namespace ei:post_analysis_cell_well_quality
Allowed Values Fail Pass
Other Derived Cell Attributes
Name other_derived_cell_attributes
Description Any other cell level measurement or annotation as result of the analysis.
Example Cluster
Reference #
Namespace ei:other_derived_cell_attributes
Allowed Values Cluster Count Gene UMI tSNE coordinates
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Reference Genome
Name reference_genome
Description Indicate version and include stable link to genome data (or attach genome fasta file).
Example GRCh38, https://example.org/grch38.fa
Reference #
Namespace ei:reference_genome
Genome Annotation
Name genome_annotation
Description Indicate version and include stable link. Also indicate if any modification to the original annotation has been applied (e.g. 3' UTR extension) and include modified annotation file employed in the analysis.
Example Ensembl v101, https://example.org/ensembl_v101.gtf
Reference #
Namespace ei:genome_annotation
Annotation Filtering
Name annotation_filtering
Description Indicate which features were filtered (i.e. protein coding, pseudo-genes, TCRs, etc.)
Example Filtered to include only protein-coding genes
Reference #
Namespace ei:annotation_filtering
Genes vs Exons
Name genes_vs_exons
Description Quantification using whole gene intervals or exons.
Example Exon quantification
Reference #
Namespace ei:genes_vs_exons
Library Structure
Name library_structure
Description The structure of the library, described in seqspec format.
Example Single-cell 3' library
Reference #
Namespace ei:library_structure
Mapping and Demultiplexing Software
Name mapping_and_demultiplexing_software
Description The software, including version, used to map and demultiplex the reads/UMIs.
Example Cell Ranger 6.0.0
Reference #
Namespace ei:mapping_and_demultiplexing_software
Read Mapping Statistics
Name read_mapping_statistics
Description Statistics of the Reads or Unique Molecular Identifier (UMI).
Example 80% reads mapped to reference
Reference #
Namespace ei:read_mapping_statistics
Sequencing Saturation
Name sequencing_saturation
Description The sequencing saturation reached. The expected value depends on the number of cells recovered (not targeted) and the technology.
Example 95% sequencing saturation
Reference #
Namespace ei:sequencing_saturation
UMIs or Barcode Distribution QC
Name umis_barcode_distribution_qc
Description Show Unique Molecular Identifiers (UMIs) per barcode distribution and threshold applied
Example Threshold: 10 UMIs per barcode
Reference #
Namespace ei:umis_barcode_distribution_qc
Cell or Non-Cell Filtering Strategy
Name cell_non_cell_filtering_strategy
Description Unique Molecular Identifier (UMI) threshold used to discriminate cells from non-cells. Description of algorithm (if any) and parameters used to determine cells or non-cells.
Example Threshold: 5 UMIs for cell detection
Reference #
Namespace ei:cell_non_cell_filtering_strategy
Other Quality Filters Applied
Name other_quality_filters_applied
Description Cells/nuclei discarded based on % mitochondrial reads, % rRNA reads, etc.
Example Cells with >20% mitochondrial reads discarded
Reference #
Namespace ei:other_quality_filters_applied
Ambient RNA QC
Name ambient_rna_qc
Description Report % UMIs in background cell barcodes, and algorithm (if any) used to remove ambient RNA
Example Ambient RNA removed if >5% UMIs in background barcodes
Reference #
Namespace ei:ambient_rna_qc
Predicted Doublet Rate QC
Name predicted_doublet_rate_qc
Description The predicted doublet rate, which depends on the number of cells recovered (not targeted) and the technology.
Example Predicted doublet rate: 1.5%
Reference #
Namespace ei:predicted_doublet_rate_qc
Individual Organism SNP Demultiplexing
Name individual_organism_snp_demultiplexing
Description If carried out, show SNP partitioning quality (e.g. SNP UMAP embedding or covariance matrix), algorithm used
Example SNP UMAP embedding using CellSNP
Reference #
Namespace ei:individual_organism_snp_demultiplexing
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Clustering Algorithm and Version
Name clustering_algorithm_and_version
Description The clustering algorithm used, including its version.
Example Louvain 0.8.0
Reference #
Namespace ei:clustering_algorithm_and_version
Clustering Parameters
Name clustering_parameters
Description The parameters passed to the clustering algorithm.
Example Resolution: 0.6, K-nearest neighbors: 10
Reference #
Namespace ei:clustering_parameters
Integration/Batch Correction
Name integration_batch_correction
Description The integration or batch correction method used, if the data were compared or integrated with existing datasets.
Example Harmony v1.0
Reference #
Namespace ei:integration_batch_correction
Source Code
Name source_code
Description If any newly developed code/software has been used in the processing and downstream analysis of the dataset.
Example Source code is hosted on GitHub and includes custom algorithms for UMI count normalization. The repository can be found at: https://github.com/user/umi-normalization.
Reference #
Namespace ei:source_code
UMI Count Matrix
Name umi_count_matrix
Description Gene x cell matrix with UMI counts for each gene in each cell.
Example The UMI count matrix is stored in a CSV file with gene IDs as rows (e.g., ENSG00000139618) and cell barcodes as columns (e.g., Cell_001, Cell_002). The matrix file is available at: https://example.com/umi_count_matrix.csv.
Reference #
Namespace ei:umi_count_matrix
Ensembl IDs
Name ensembl_ids
Description Gene or transcript names should be listed as Ensembl (or other standardized ID), with gene short names in metadata.
Example ENSG00000139618
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:ensembl_ids
Functional Gene Annotations
Name functional_gene_annotations
Description Any functional annotation generated/used (gene names, GOs, structural domains, etc.).
Example Functional gene annotations, including Gene Ontology (GO) terms, are provided in the metadata. For example, the gene 'ENSG00000139618' (BRCA2) is annotated with the GO term 'GO:0003677' (DNA binding).
Reference #
Namespace ei:functional_gene_annotations
Protein Models
Name protein_models
Description FASTA file with (or stable link to) the predicted proteins associated to genes in the UMI count matrix and matching IDs.
Example The protein sequences for genes are provided in a FASTA file available at: https://example.com/protein_models.fasta, where each protein sequence is linked to the corresponding gene ID.
Reference #
Namespace ei:protein_models
Cell Metadata
Name cell_metadata
Description Table mapping cell IDs to cluster/cell type/broad cell type annotations.
Example Cell metadata includes information such as cell type annotations ('Tumor', 'Normal') and experimental conditions ('Control', 'Treatment'). This data is available in a table at: https://example.com/cell_metadata.csv.
Reference #
Namespace ei:cell_metadata
Cluster-Level Normalised Expression Tables
Name cluster_level_normalised_expression_tables
Description Expression tables that show normalised gene expression at the cluster or cell-type level.
Example Normalised gene expression data at the cluster level is provided in a CSV file. For example, gene 'ENSG00000139618' (BRCA2) has expression values for clusters: Cluster_1: 1200, Cluster_2: 900. The full expression table is available at: https://example.com/cluster_level_expression.csv.
Reference #
Namespace ei:cluster_level_normalised_expression_tables
Other Resource Files
Name other_resource_files
Description Necessary to re-use and interpret the data. E.g. barcode information in complex, serial multiplexing protocols (clicktags).
Example Barcode information used in multiplexing protocols is provided in a separate file, which can be accessed at: https://example.com/barcode_data.csv.
Reference #
Namespace ei:other_resource_files
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:sequencing_id
Read 1 File Required
Name read_1_file
Description The name or accession of the file that contains read 1.
Example file1_r1.fastq.gz
Reference #
Namespace ei:read_1_file
Read 2 File
Name read_2_file
Description The name or accession of the file that contains read 2.
Example file2_r2.fastq.gz
Reference #
Namespace ei:read_2_file
Index 1 File
Name index_1_file
Description The name of the file that contains index 1.
Example file1_i1.fastq.gz
Reference #
Namespace ei:index_1_file
Index 2 File
Name index_2_file
Description The name of the file that contains index 2.
Example file2_i2.fastq.gz
Reference #
Namespace ei:index_2_file
Read 1 Checksum Required
Name read_1_file_checksum
Description Result of a hash function calculated on the content of the read 1 file to verify file integrity. Commonly used algorithms include MD5 and SHA-1. The checksums should be separated by a comma (,).
Example f8d29e41a73b5c02de9a6fb314e7c8ad
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_1_file_checksum
Read 2 Checksum
Name read_2_file_checksum
Description Result of a hash function calculated on the content of the read 2 file to verify file integrity. Commonly used algorithms include MD5 and SHA-1. The checksums should be separated by a comma (,).
Example a3f4c1b29d8e57fa41b02de6c7f9ab83
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_2_file_checksum
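The checksum fields above can be filled in and checked with a few lines of code. This is a minimal sketch using Python's standard `hashlib`, assuming MD5 is the intended algorithm because the listed regex accepts exactly 32 lowercase hexadecimal characters. The helper names `md5_of_file` and `is_valid_checksum_value` are hypothetical.

```python
import hashlib
import re

# Regex listed for the read 1 and read 2 checksum fields.
MD5_PATTERN = re.compile(r"^[0-9a-f]{32}$")


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid_checksum_value(value):
    """Check that a value has the shape the schema regex requires."""
    return MD5_PATTERN.fullmatch(value) is not None
```

As written, the regex accepts a single digest only: a SHA-1 digest (40 hexadecimal characters) or a comma-separated list of digests would not match it.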
White List Barcode File
Name white_list_barcode_file
Description A file containing the known cell barcodes in the dataset.
Example barcodes.tsv
Reference #
Namespace ei:white_list_barcode_file
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Expression Data Process Setting ID Required
Name expression_data_process_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_process_setting_id
Matrix Type
Name matrix_type
Description The type of expression matrix, for example raw counts or normalised values.
Example raw_counts
Reference #
Namespace ei:matrix_type
Allowed Values imputed log1p nomalised pseudobulk raw_counts scaled
Reference Genome Required
Name reference_genome
Description The associated reference genome
Example https://reference-genome-example.com
Reference #
Regex ^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$
Namespace ei:reference_genome
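The regex for this field accepts one or more http, https, or ftp URLs joined by a pipe character (|) with no surrounding spaces. A minimal sketch of checking and splitting such a value follows; the function name `split_reference_genome` and the second URL in the usage line are hypothetical.

```python
import re

# Regex listed for the reference_genome field.
REFERENCE_GENOME_PATTERN = re.compile(
    r"^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$"
)


def split_reference_genome(value):
    """Validate a pipe-separated URL list and return the individual URLs."""
    if REFERENCE_GENOME_PATTERN.fullmatch(value) is None:
        raise ValueError(f"not a pipe-separated list of URLs: {value!r}")
    return value.split("|")


print(split_reference_genome(
    "https://reference-genome-example.com|ftp://example.org/genome.fa"
))
```

A bare assembly name such as GRCh38 does not match this regex, so the value must be a link.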
Annotation Version
Name annotation_version
Description The annotation version of the associated reference genome
Example GENCODE v44
Reference #
Namespace ei:annotation_version
Normalisation Method
Name normalisation_method
Description Any normalisation processing performed
Example Log Normalisation
Reference #
Namespace ei:normalisation_method
Allowed Values Library Size Normalisation Log Normalisation SCNorm SCTransform scran
Highly Variable Gene Selection (HVG)
Name highly_variable_gene_selection
Description The method used to select highly variable genes and the number of genes selected
Example seurat_v3, n=2000
Reference #
Namespace ei:highly_variable_gene_selection
Dimensionality Reduction
Name dimensionality_reduction
Description Method used to reduce dimensionality in the expression data
Example PCA
Reference #
Namespace ei:dimensionality_reduction
Allowed Values Diffusion Map ICA NMF PCA UMAP t-SNE
Number of Nearest Neighbours
Name n_neighbours
Description Number of nearest neighbours used to calculate cluster membership
Example pca:50
Reference #
Namespace ei:n_neighbours
Clustering Algorithm
Name clustering_algorithm
Description Algorithm used to create clusters
Reference #
Namespace ei:clustering_algorithm
Clustering Resolution
Name clustering_resolution
Description Resolution parameter
Example 2.5
Reference #
Regex ^([0-9]*[.])?[0-9]+
Namespace ei:clustering_resolution
Clustering Distance Metric
Name clustering_distance_metric
Description Metric used to calculate the distance from one point to another
Example cosine
Reference #
Namespace ei:clustering_distance_metric
Allowed Values cosine euclidean hamming jaccard manhatten mehalanobis
Software Versions
Name software_versions
Description Primary software packages used for analysis
Reference #
Namespace ei:software_versions
Cell Type Annotation
Name cell-type annotation
Description Tools and Databases used for cell annotation
Reference #
Namespace ei:cell-type annotation
Generated by Pipeline
Name generated_by_pipeline
Description URL of the deposited pipeline used to create this data
Reference #
Regex ^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$
Namespace ei:generated_by_pipeline
Notes
Name notes
Description Any other information
Reference #
Namespace ei:notes
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name expression_data_file_id
Description A unique alphanumeric identifier for the expression data file
Example EXPFILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric identifier for library preparation
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Expression Data Process Setting ID Required
Name expression_data_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_setting_id
File Name Required
Name expression_data_file
Description Expression data file name
Example exp_file.csv
Reference #
Namespace ei:expression_data_file
File MD5 Checksum Required
Name expression_data_file_checksum
Description The calculated MD5 checksum for this file
Example 9e4b7a23f6c1d0ab85f29c47e3d8a610
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:expression_data_file_checksum
File Format Required
Name expression_data_file_format
Description The format of the expression file, such as h5ad or rds
Example csv
Reference #
Namespace ei:expression_data_file_format
Allowed Values csv h5ad loom mtx rds
Number of Cells
Name n_cells
Description The number of cells represented in the expression data
Example 4
Reference #
Regex ^\d+$
Namespace ei:n_cells
Number of Genes
Name n_genes
Description The number of genes represented in the expression data
Example 50
Reference #
Regex ^\d+$
Namespace ei:n_genes
File Size in Bytes
Name file_size_bytes
Description Size of the file recorded in bytes
Example 90
Reference #
Regex ^\d+$
Namespace ei:file_size_bytes
Date Generated
Name date_generated
Description Approximate date this expression data was generated
Example 2024-10-14
Reference #
Regex ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Namespace ei:date_generated
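The regexes and allowed values listed for the expression data file fields can be combined into a simple record check. This is a minimal sketch, assuming records arrive as dictionaries of strings keyed by the field names above; `validate_record` is a hypothetical helper, not part of any published validator. It also parses the date, because the date regex alone accepts impossible dates such as 2024-02-31.

```python
import re
from datetime import date

# Regexes copied from the field definitions above.
FIELD_PATTERNS = {
    "expression_data_file_checksum": r"^[0-9a-f]{32}$",
    "n_cells": r"^\d+$",
    "n_genes": r"^\d+$",
    "file_size_bytes": r"^\d+$",
    "date_generated": r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$",
}
# Allowed values listed for expression_data_file_format.
ALLOWED_FORMATS = {"csv", "h5ad", "loom", "mtx", "rds"}


def validate_record(record):
    """Return a list of human-readable problems; an empty list means no issues."""
    errors = []
    for field, pattern in FIELD_PATTERNS.items():
        value = record.get(field)
        if value is not None and re.fullmatch(pattern, value) is None:
            errors.append(f"{field}: {value!r} does not match {pattern}")

    file_format = record.get("expression_data_file_format")
    if file_format is not None and file_format not in ALLOWED_FORMATS:
        errors.append(f"expression_data_file_format: {file_format!r} is not allowed")

    # Extra check beyond the schema: the regex cannot reject 2024-02-31.
    generated = record.get("date_generated")
    if generated is not None and re.fullmatch(FIELD_PATTERNS["date_generated"], generated):
        try:
            date.fromisoformat(generated)
        except ValueError:
            errors.append(f"date_generated: {generated!r} is not a real calendar date")
    return errors


example = {
    "expression_data_file_checksum": "9e4b7a23f6c1d0ab85f29c47e3d8a610",
    "expression_data_file_format": "csv",
    "n_cells": "4",
    "n_genes": "50",
    "file_size_bytes": "90",
    "date_generated": "2024-10-14",
}
print(validate_record(example))
```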
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Description
Name description
Description A detailed description of the project which includes research goals and experimental approach. Project description should be fewer than 300 words, such as an abstract from a grant application or publication.
Example This project explores the intricate details of single cells in the human body, focusing on their structure, function, and behaviour. By studying individual cells, it aims to uncover how they contribute to overall health, disease progression, and human biology. This research can provide deeper insights into cellular processes, paving the way for advancements in medical treatments and personalised medicine.
Reference http://purl.org/dc/terms/description
Namespace dcterms:description
Material Required
Name material
Description The type of material being described.
Example Organism
Reference #
Namespace faang:material
Allowed Values Cell culture Cell line Cell specimen Organism Organoid Pool of Specimens Single cell specimen Specimen from Organism
Project Required
Name project
Description State that the project is 'FAANG'.
Example FAANG
Reference #
Regex ^FAANG$
Namespace faang:project
Cell Enrichment Required
Name cell_enrichment
Description The method by which specific cell populations are sorted or enriched, e.g. 'fluorescence-activated cell sorting (FACS)'. Please contact FAANG DCC to add more terms.
Example Fluorescence-activated Cell Sorting (FACS)
Reference #
Namespace faang:cell_enrichment
Allowed Values Bead-based sorting Cell culture Centrifugation Fluorescence-activated Cell Sorting (FACS) Magnetic levitation Raman-spectometry sorting, cell culture
Licence
Name licence
Description Specifies the terms under which the data associated with the study can be used, shared, or reused. It informs users how they may legally reference, distribute, or build upon the study. Common licenses include Creative Commons (e.g., CC BY 4.0), which require attribution to the original authors when the data is cited or reused.
Example MIT
Reference #
Namespace ei:licence
Allowed Values Apache-2.0 CC-BY-4.0 CC-BY-SA-4.0 CC0-1.0 GPL-3.0-or-later MIT
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Orcid ID
Name orcid_id
Description A 16-character identifier that uniquely identifies a researcher. The final character is a check character and may be a digit or X.
Example 0000-1234-5678-9012
Reference #
Regex ^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$
Namespace ei:orcid_id
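The regex above checks only the shape of an ORCID iD. The final character is a check character computed with the ISO 7064 MOD 11-2 scheme, which can be verified as well. The sketch below is an optional extra check, not something the schema itself requires, and the function names are hypothetical.

```python
import re

# Regex listed for the orcid_id field.
ORCID_PATTERN = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_character(base_digits):
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value):
    """Check both the schema regex and the trailing check character."""
    if ORCID_PATTERN.fullmatch(value) is None:
        return False
    digits = value.replace("-", "")
    return orcid_check_character(digits[:15]) == digits[15]


print(is_valid_orcid("0000-1234-5678-9012"))
```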
First Name Required
Name givenName
Description A first name (or given name) is the personal name given to an individual conducting the study.
Example Jane
Reference https://schema.org/givenName
Regex ^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$
Namespace schema.org:givenName
Last Name Required
Name familyName
Description A last name (or surname) is the family name passed down from one generation to the next for the individual conducting the study.
Example Doe
Reference https://schema.org/familyName
Regex ^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$
Namespace schema.org:familyName
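The name regexes above are stricter than they may first appear: they accept only ASCII letters, require at least two characters, and require the final character to be lowercase. A short sketch showing which names pass follows; the sample names are illustrative only.

```python
import re

# Regexes listed for givenName and familyName.
GIVEN_NAME_PATTERN = re.compile(r"^[A-Za-z]+(?:[-\s][A-Za-z]+)*[a-z]+$")
FAMILY_NAME_PATTERN = re.compile(r"^[A-Za-z]+(-[A-Za-z]+)*[a-z]+$")


def matches(pattern, value):
    """Return True when the whole value matches the pattern."""
    return pattern.fullmatch(value) is not None


for name in ["Doe", "Smith-Jones", "O'Brien", "García", "DOE"]:
    print(name, matches(FAMILY_NAME_PATTERN, name))
```

Names containing apostrophes or accented letters, and names written entirely in capitals, are rejected, so submitters may need to transliterate such names before validation.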
Email Address
Name email
Description A unique identifier used to send and receive electronic messages (emails) over the internet.
Example jane.doe@example.com
Reference https://schema.org/email
Regex ^(?!.*\.{2,})(?!.*-{2,})[\w.-]+@[a-zA-Z\d.-]+\.[a-zA-Z]{2,}$
Namespace schema.org:email
Affiliation or Institution Required
Name affiliation
Description An organisation or institution that this person is associated with.
Example University of Liverpool
Reference https://schema.org/affiliation
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace schema.org:affiliation
Funder
Name funder
Description A person or organization that supports (sponsors) something through some kind of financial contribution.
Example BBSRC
Reference https://schema.org/funder
Namespace schema.org:funder
Grant Award
Name funding
Description A grant that directly or indirectly provides funding or sponsorship for the person to conduct the study.
Example GRAK3489
Reference https://schema.org/funding
Regex ^[A-Za-z0-9]+(?: [A-Za-z0-9]+)*$
Namespace schema.org:funding
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Scientific Name or Organism
Name scientific_name
Description The formal Latin name used to identify the organism from which the sample was derived (e.g. Homo sapiens or Arabidopsis thaliana). This name must accurately correspond to the Taxon ID provided to ensure correct taxonomic classification.
Example Salvelinus alpinus
Reference http://rs.tdwg.org/dwc/terms/scientificName
Regex ^[A-Za-z]+(?: [A-Za-z]+)*[a-z]+$
Namespace ontology:scientific_name
Taxon ID Required
Name taxon_id
Description A unique identifier (usually from a recognized taxonomy database like NCBI Taxonomy) that corresponds to the organism’s scientific name. It must be accurately matched to the provided scientificName to maintain consistency and traceability in biological records.
Example 8036
Reference http://rs.tdwg.org/dwc/terms/taxonID
Regex ^[0-9]+$
Namespace ontology:taxon_id
Biosample Accession Required
Name biosampleAccession
Description A unique identifier assigned to a biological sample after it has been submitted to a public database, such as the NCBI BioSample or ENA. It serves as a permanent reference to that specific sample, allowing researchers to retrieve metadata and link it across studies or datasets.
Example SAMEA12907823
Reference http://purl.obolibrary.org/obo/T4FS_0000316
Namespace ontology:biosampleAccession
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
Protocol Name Required
Name protocol_name
Description A descriptive name of the protocol used for single-cell sequencing.
Example 10X Genomics Single Cell 3' Library Prep
Reference #
Namespace ei:protocol_name
Enrichment Markers
Name enrichment_markers
Description Description of the specificity markers used to isolate cell populations, e.g. 'CD45+'. Please contact FAANG DCC to add more terms.
Example CD45
Reference #
Namespace faang:enrichment_markers
Isolation Kit
Name isolation_kit
Description The kit used to isolate the cells.
Example 10x Nuclei Isolation Kit
Reference #
Namespace ei:isolation_kit
Allowed Values 10x Nuclei Isolation Kit 3' standard throughput kit Custom
Literature Source Reference
Name literature_source_reference
Description Reference to literature sources that describe the protocol or methods used.
Example Doe et al. (2024), 'Single-cell RNA-seq: A comprehensive overview'
Reference #
Namespace ei:literature_source_reference
Protocols IO Reference
Name protocols_io_reference
Description Reference link to protocols.io for additional details on the protocol.
Example https://www.protocols.io/view/sample-protocol-b2ubqesn
Reference #
Regex ^https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)+(?: \| https?:\/\/(?:www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}(?:[-a-zA-Z0-9()@:%_\+.~#?&\/=]*)*)*$
Namespace ei:protocols_io_reference
Single cell isolation protocol Required
Name single_cell_isolation_protocol
Description Link to protocol describing how the single cells were separated into a single-cell suspension.
Example https://api.faang.org/files/protocols/samples/INRAE_SOP_PLUS4PIGS_EMBRYOS_DISSOCIATION_PROTO4_20240710.pdf
Reference #
Regex ^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$
Namespace faang:single_cell_isolation_protocol
Workflowhub Sop Reference
Name workflow_hub_sop_reference
Description Reference to the Standard Operating Procedure (SOP) in workflow hub.
Example https://workflowhub.eu/works/12345
Reference #
Namespace ei:workflow_hub_sop_reference
Dissociation Protocol Method
Name dissociation_protocol_method
Description The method used to dissociate tissues into single cells.
Example Mechanical and enzymatic dissociation
Reference #
Namespace ei:dissociation_protocol_method
Single Cell Quality Metric
Name single_cell_quality_metric
Description Metrics used to assess the quality of single cells before sequencing.
Example Cell viability percentage
Reference #
Namespace ei:single_cell_quality_metric
Cell Type Required
Name cell_type
Description Provide a cell type from the CL ontology.
Example malignant cell
Reference CL:0000000
Regex ^[A-Za-z\s]*[a-z]+$
Namespace faang:cell_type
Tissue Dissociation Required
Name tissue_dissociation
Description The method by which tissues are dissociated into purified or single cells in suspension. Examples are 'proteolysis', 'mesh passage', 'fine needle trituration'. For blood, milk and other fluids, where there is no tissue dissociation use 'fluids'. Please contact FAANG DCC to add more terms.
Example Proteolysis
Reference #
Namespace faang:tissue_dissociation
Allowed Values Fine needle trituration Fluids Mechanical dissociation Mesh passage Proteolysis
Derived from Required
Name derived_from
Description Sample name or BioSample ID for a specimen record.
Example SSC_INRAE_GUT_ORGANOID_100I
Reference #
Regex ^[A-Za-z0-9_]+$
Namespace faang:derived_from
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the sample
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Sample ID Required
Name sample_id
Description A unique reference or identifier for the sample associated with the cell suspension. This field must provide a consistent, unambiguous way to identify the sample within and across datasets. It can be a name, code, or accession-like format, as long as it remains unique.
Example SAMPLE001
Reference #
Namespace ei:sample_id
Dissociation Protocol ID Required
Name dissociation_protocol_id
Description A unique alphanumeric code for the dissociation protocol in the study
Example DISSOC001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:dissociation_protocol_id
Suspension Type Required
Name suspension_type
Description The type of suspension used to keep cells in solution during processing.
Example Cell
Reference #
Namespace ei:suspension_type
Allowed Values Cell Nuclei Protoplast
Purification Protocol Required
Name purification_protocol
Description Link to protocol describing how the cells were purified.
Reference #
Regex ^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$
Namespace faang:purification_protocol
Cell Count
Name cell_count
Description A number representing the number of cells in the sequencing library.
Example 10000
Reference #
Regex ^\d+$
Namespace ei:cell_count
Cell Number
Name cell_number
Description The range bracket containing the number of cells in the sequencing library.
Example 101-10000
Reference #
Namespace tol:cell_number
Allowed Values 1 1000000+ 100001-500000 10001-50000 101-10000 11-50 2-10 500001-1000000 50001-100000 51-100
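The Allowed Values of cell_number are range labels, while cell_count holds an exact integer. The sketch below shows one way to derive the label from a count. It assumes that a count of exactly 1000000 belongs to "500001-1000000" and that "1000000+" covers anything larger, which the schema does not state; `cell_number_bin` is a hypothetical helper name.

```python
# Upper bounds paired with labels copied from the Allowed Values of cell_number.
CELL_NUMBER_BINS = [
    (1, "1"),
    (10, "2-10"),
    (50, "11-50"),
    (100, "51-100"),
    (10_000, "101-10000"),
    (50_000, "10001-50000"),
    (100_000, "50001-100000"),
    (500_000, "100001-500000"),
    (1_000_000, "500001-1000000"),
]


def cell_number_bin(cell_count: int) -> str:
    """Map an exact cell count onto one of the cell_number range labels."""
    if cell_count < 1:
        raise ValueError("cell_count must be at least 1")
    for upper_bound, label in CELL_NUMBER_BINS:
        if cell_count <= upper_bound:
            return label
    # Assumption: anything above 1,000,000 falls in the open-ended bracket.
    return "1000000+"


print(cell_number_bin(10000))  # the cell_count example value
```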
Cell Viability
Name cell_viability
Description The percentage of living cells in a sample, indicating the health and quality of cells for RNA-sequencing analysis.
Example 95
Reference #
Namespace ei:cell_viability
Cell Viability Assessment Method
Name cell_viability_assessment_method
Description The method used to evaluate the viability of cells in the sample, often involving staining or flow cytometry techniques.
Example Trypan Blue Exclusion
Reference #
Namespace ei:cell_viability_assessment_method
Cell Size
Name cell_size
Description The size of the cell, typically measured in micrometres.
Example 10
Reference #
Namespace ei:cell_size
Suspension Volume (µL)
Name suspension_volume_µl
Description The volume of the cell suspension in microlitres (µL).
Example 100
Reference #
Namespace ei:suspension_volume_µl
Suspension Concentration Cells Per µL
Name suspension_concentration_cells_per_µl
Description The concentration of cells in the suspension, in cells per microlitre (cells/µL).
Example 1000
Reference #
Namespace ei:suspension_concentration_cells_per_µl
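Suspension volume and suspension concentration together imply an approximate total number of cells in the suspension. A small sketch of that arithmetic follows; it assumes both values describe the same undiluted suspension, and the function name is hypothetical.

```python
def estimated_total_cells(suspension_volume_ul: float, cells_per_ul: float) -> int:
    """Estimate total cells as volume (µL) multiplied by concentration (cells/µL)."""
    if suspension_volume_ul < 0 or cells_per_ul < 0:
        raise ValueError("volume and concentration must be non-negative")
    return round(suspension_volume_ul * cells_per_ul)


# Example values from the two fields: 100 µL at 1000 cells/µL.
print(estimated_total_cells(100, 1000))
```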
Suspension Dilution
Name suspension_dilution
Description The dilution factor of the cell suspension.
Example 1:10
Reference #
Namespace ei:suspension_dilution
Loading Volume (µL)
Name loading_volume_µl
Description The volume of the cell suspension, in microlitres (µL), loaded into the single-cell RNA-sequencing system for analysis.
Example 10
Reference #
Regex ^\d+$
Namespace ei:loading_volume_µl
Suspension Dilution Buffer
Name suspension_dilution_buffer
Description A solution used to dilute cell suspensions to a desired concentration, typically prior to loading cells into a device for single-cell RNA sequencing. It helps maintain cell viability and integrity during processing.
Example PBS (Phosphate-buffered saline) with 0.04% BSA (Bovine serum albumin)
Reference #
Namespace ei:suspension_dilution_buffer
Derived from Required
Name derived_from
Description Sample name or BioSample ID for a specimen record.
Example SAMEA112465628
Reference #
Regex ^[A-Za-z0-9_]+$
Namespace faang:derived_from
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Cell Suspension ID Required
Name cell_suspension_id
Description A unique alphanumeric code for the cell suspension for the library preparation.
Example CELLSUSP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:cell_suspension_id
Library Preparation Kit Required
Name library_prep_kit
Description Packaged kits (containing adapters, indexes, enzymes, buffers etc.), tailored for specific sequencing workflows, which allow the simplified preparation of sequencing-ready libraries for small genomes, amplicons, and plasmids
Example 10X Genomics Single Cell 3' v3
Reference https://w3id.org/mixs/0001145
Namespace mixs:library_prep_kit
Library Preparation Kit Version Required
Name library_prep_kit_version
Description The version number of the library preparation kit used for sequencing.
Example 2
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Regex ^\d+(\.\d+)?$
Namespace ontology:library_prep_kit_version
Amplification Method
Name amplification_method
Description The method used to amplify the Complementary DNA (cDNA).
Example PCR
Reference #
Namespace ei:amplification_method
cDNA Amplification Cycles
Name cdna_amplification_cycles
Description The number of cycles used during the Complementary DNA (cDNA) amplification process.
Example 12
Reference #
Regex ^\d+$
Namespace ei:cdna_amplification_cycles
Average Size Distribution
Name average_size_distribution
Description The average length of RNA fragments in base pairs (BP) after library preparation, indicating the quality and suitability of the RNA for sequencing.
Example 350
Reference #
Regex ^\d+$
Namespace ei:average_size_distribution
Library Construction Method
Name lib_construction_method
Description The library construction method (including version) that was used.
Example Smart-Seq2
Reference #
Namespace ei:lib_construction_method
Input Molecule
Name input_molecule
Description The specific fraction of biological macromolecule from which the sequencing library is derived.
Example RNA
Reference #
Namespace ei:input_molecule
Primer
Name primer
Description The type of primer used for reverse transcription. This allows users to identify content of the cDNA library input for mRNA.
Example Random
Reference #
Namespace ei:primer
Allowed Values Oligo-dT Random
Primeness
Name primeness
Description The end from which the molecule was sequenced.
Example 5'
Reference #
Namespace ei:primeness
Allowed Values 3' 5' Both
End Bias
Name end_bias
Description The end bias of the library.
Example 3
Reference #
Namespace ei:end_bias
Allowed Values 3 5
Library Strand
Name library_strand
Description The Complementary DNA (cDNA) strand of the library from which the reads are derived: sense (first), antisense (second), both or none.
Example Antisense
Reference #
Namespace ei:library_strand
Allowed Values Antisense Both Sense Unstranded
Spike In
Name spike_in
Description External RNA added to the sample as a control to assess technical variability and normalization in RNA-sequencing. State whether spike-in was used.
Example Yes
Reference #
Namespace ei:spike_in
Allowed Values No Yes
Spike Type
Name spike_type
Description The specific type of external RNA used for spiking in, often indicating the source or nature of the control RNA.
Example Synthetic RNA
Reference #
Namespace ei:spike_type
Spike In Dilution Or Concentration
Name spike_in_dilution_or_concentration
Description The final concentration or dilution (for commercial sets) of the spike in mix.
Example 1:1000
Reference #
Namespace ei:spike_in_dilution_or_concentration
i5 Index Required
Name i5_index
Description Barcode sequence used on the i5 adapter during library preparation for identifying samples in multiplexed single-cell RNA-sequencing.
Example ATCACG
Reference #
Namespace ei:i5_index
i7 Index Required
Name i7_index
Description Barcode sequence used on the i7 adapter to distinguish samples in multiplexed sequencing runs.
Example CGATGT
Reference #
Namespace ei:i7_index
Dual or Single Index Required
Name dual_single_index
Description Specifies if both i5 and i7 indices (dual) or only one index (single) was used for sample identification during sequencing.
Example Dual
Reference #
Namespace ei:dual_single_index
Allowed Values Dual Single
i5 Sequence Required
Name i5_sequence
Description The nucleotide sequence of the i5 index used in multiplexing during sequencing.
Example ATCGTAGC
Reference #
Namespace ei:i5_sequence
i7 Sequence Required
Name i7_sequence
Description The specific nucleotide sequence of the i7 index used for a sample.
Example TGCATGCA
Reference #
Namespace ei:i7_sequence
Plate ID
Name plate_id
Description Identifier for the 96-well plate used in sample preparation.
Example PLT001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:plate_id
Well Row
Name well_row
Description The row identifier in a 96-well plate indicating the sample's position.
Example A
Reference #
Namespace ei:well_row
Well Column
Name well_col
Description The column identifier in a 96-well plate indicating the sample's position.
Example 5
Reference #
Regex ^\d+$
Namespace ei:well_col
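The well_row and well_col fields together locate a sample on the plate. The sketch below converts a position into a zero-based, row-major index. It assumes the standard 96-well layout of rows A to H and columns 1 to 12, which the schema itself does not enforce; `well_index` is a hypothetical helper.

```python
import re

ROWS = "ABCDEFGH"   # assumption: standard 96-well plate, 8 rows
N_COLUMNS = 12      # assumption: standard 96-well plate, 12 columns


def well_index(well_row: str, well_col: str) -> int:
    """Return the zero-based, row-major index of a well on a 96-well plate."""
    if len(well_row) != 1 or well_row.upper() not in ROWS:
        raise ValueError(f"well_row must be one letter in {ROWS}")
    # The well_col Regex in the schema is ^\d+$; ASCII digits only here.
    if re.fullmatch(r"[0-9]+", well_col) is None:
        raise ValueError("well_col must contain digits only")
    column = int(well_col)
    if not 1 <= column <= N_COLUMNS:
        raise ValueError(f"well_col must be between 1 and {N_COLUMNS}")
    return ROWS.index(well_row.upper()) * N_COLUMNS + (column - 1)


print(well_index("A", "5"))  # row and column from the Example lines
```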
Cell Phenotype
Name cell_phenotype
Description The cell marker for the Fluorescence-Activated Cell Sorting (FACS) of cells.
Example CD41-
Reference #
Namespace ei:cell_phenotype
Allowed Values CD41+ CD41-
Pool Creation Date Required
Name pool_creation_date
Description Date at which the pool was created.
Example 2025-10-24
Reference #
Regex ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Namespace faang:pool_creation_date
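The date Regex checks the YYYY-MM-DD shape but cannot tell whether the day exists in the given month, so a value such as 2025-02-31 passes the pattern. A hedged sketch of a stricter check, combining the schema pattern with Python's calendar-aware parser, is shown below; `is_valid_date` is a hypothetical name.

```python
import re
from datetime import date

# Pattern copied from the Regex line of pool_creation_date.
DATE_PATTERN = re.compile(
    r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$", re.ASCII
)


def is_valid_date(value: str) -> bool:
    """Return True when the value matches the pattern and is a real calendar date."""
    if DATE_PATTERN.fullmatch(value) is None:
        return False
    try:
        date.fromisoformat(value)  # rejects impossible dates such as 31 February
    except ValueError:
        return False
    return True


print(is_valid_date("2025-10-24"))  # value from the Example line
print(is_valid_date("2025-02-31"))  # matches the pattern, not the calendar
```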
Pool Creation Protocol Required
Name pool_creation_protocol
Description A link to the protocol for creating the pool of specimens.
Reference #
Regex ^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$
Namespace faang:pool_creation_protocol
Design description
Name design_description
Description The design of the library including details of how it was constructed.
Reference #
Namespace ei:design_description
Library selection Required
Name library_selection
Description The method used to select for or against, enrich, or screen the material being sequenced.
Example RANDOM PCR
Reference #
Namespace ei:library_selection
Allowed Values 5-methylcytidine antibody CAGE ChIP ChIP-Seq Dnase HMPR Hybrid Selection Inverse rRNA Inverse rRNA selection MBD2 protein methyl-CpG binding domain MDA MF MSLL Mnase Oligo-dT PCR PolyA RACE RANDOM RANDOM PCR RT-PCR Reduced Representation Restriction Digest cDNA cDNA_oligo_dT cDNA_randomPriming other padlock probes capture method repeat fractionation size fractionation unspecified
Library source Required
Name library_source
Description The type of source material that is being sequenced.
Example GENOMIC
Reference #
Namespace ei:library_source
Allowed Values GENOMIC GENOMIC SINGLE CELL METAGENOMIC METATRANSCRIPTOMIC OTHER SYNTHETIC TRANSCRIPTOMIC TRANSCRIPTOMIC SINGLE CELL VIRAL RNA
Library strategy Required
Name library_strategy
Description The sequencing technique intended for this library.
Example RNA-Seq
Reference #
Namespace ei:library_strategy
Allowed Values AMPLICON ATAC-seq Bisulfite-Seq CLONE CLONEEND CTS ChIA-PET ChIP-Seq ChM-Seq DNase-Hypersensitivity EST FAIRE-seq FINISHING FL-cDNA GBS Hi-C MBD-Seq MNase-Seq MRE-Seq MeDIP-Seq NOMe-Seq OTHER POOLCLONE RAD-Seq RIP-Seq RNA-Seq Ribo-Seq SELEX Synthetic-Long-Read Targeted-Capture Tethered Chromatin Conformation Capture Tn-Seq VALIDATION WCS WGA WGS WXS miRNA-Seq ncRNA-Seq snRNA-seq ssRNA-seq
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference https://w3id.org/mixs/0000016
Regex ^[a-zA-Z0-9]+$
Namespace ontology:sequencing_id
Sequencing Platform Name Required
Name sequencing_platform_name
Description The name of the sequencing platform used for the experiment.
Example PacBio
Reference http://purl.obolibrary.org/obo/NCIT_C172274
Namespace ontology:sequencing_platform_name
Sequencing Instrument Model Required
Name sequencing_instrument_model
Description This refers to the machine or platform used for sequencing, with variations in throughput, read lengths, error rates, and application suitability.
Example Illumina NovaSeq 6000
Reference http://purl.obolibrary.org/obo/GENEPIO_0000149
Namespace ontology:sequencing_instrument_model
Allowed Values 454 GS 454 GS 20 454 GS FLX 454 GS FLX Titanium 454 GS FLX+ 454 GS Junior AB 310 Genetic Analyzer AB 3130 Genetic Analyzer AB 3130xL Genetic Analyzer AB 3500 Genetic Analyzer AB 3500xL Genetic Analyzer AB 3730 Genetic Analyzer AB 3730xL Genetic Analyzer AB 5500 Genetic Analyzer AB 5500xl Genetic Analyzer AB 5500xl-W Genetic Analysis System AB SOLiD 3 Plus System AB SOLiD 4 System AB SOLiD 4hq System AB SOLiD PI System AB SOLiD System AB SOLiD System 2.0 AB SOLiD System 3.0 BGISEQ-50 BGISEQ-500 Complete Genomics DNBSEQ-G400 DNBSEQ-G400 FAST DNBSEQ-G50 DNBSEQ-T10x4RS DNBSEQ-T7 Element AVITI FASTASeq 300 GENIUS GS111 Genapsys Sequencer GenoCare 1600 GenoLab M GridION Illumina Genome Analyzer Illumina Genome Analyzer II Illumina Genome Analyzer IIx Illumina HiScanSQ Illumina HiSeq 1000 Illumina HiSeq 1500 Illumina HiSeq 2000 Illumina HiSeq 2500 Illumina HiSeq 3000 Illumina HiSeq 4000 Illumina HiSeq X Illumina HiSeq X Five Illumina HiSeq X Ten Illumina MiSeq Illumina MiniSeq Illumina NextSeq 500 Illumina NextSeq 550 Illumina NovaSeq 6000 Illumina NovaSeq X Illumina NovaSeq X Plus Illumina iSeq 100 Ion GeneStudio S5 Ion GeneStudio S5 Plus Ion GeneStudio S5 Prime Ion Torrent Genexus Ion Torrent PGM Ion Torrent Proton Ion Torrent S5 Ion Torrent S5 XL MGISEQ-2000RS MinION NextSeq 1000 NextSeq 2000 Onso PacBio RS PacBio RS II PromethION Revio Sentosa SQ301 Sequel Sequel II Sequel IIe Tapestri UG 100
Library Layout Required
Name lib_layout
Description Specify whether to expect single, paired, or other configuration of reads for sequencing
Example Paired
Reference https://w3id.org/mixs/0000111
Namespace mixs:lib_layout
Allowed Values Other Paired Single Vector
UMI Barcode Read
Name umi_barcode_read
Description The type of read that contains the Unique Molecular Identifier (UMI) barcode.
Example index2
Reference #
Namespace ei:umi_barcode_read
Allowed Values index1 index2 read1 read2
UMI Barcode Offset
Name umi_barcode_offset
Description The offset in sequence of the Unique Molecular Identifier (UMI) identifying barcode.
Example 0
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_offset
UMI Barcode Size
Name umi_barcode_size
Description The size of the Unique Molecular Identifier (UMI) identifying barcode.
Example 10
Reference #
Regex ^\d+$
Namespace ei:umi_barcode_size
Cell Barcode Read
Name cell_barcode_read
Description The type of read that contains the cell barcode.
Example index1
Reference http://www.ebi.ac.uk/efo/EFO_0010203
Namespace ontology:cell_barcode_read
Allowed Values index1 index2 read1 read2
Cell Barcode Offset
Name cell_barcode_offset
Description The offset in sequence of the cell identifying barcode.
Example 10
Reference http://www.ebi.ac.uk/efo/EFO_0010204
Regex ^\d+$
Namespace ontology:cell_barcode_offset
Cell Barcode Size
Name cell_barcode_size
Description The size of the cell identifying barcode.
Example 0
Reference http://www.ebi.ac.uk/efo/EFO_0010205
Regex ^\d+$
Namespace ontology:cell_barcode_size
cDNA Read Required
Name cdna_read
Description The type of read that contains the Complementary DNA (cDNA) sequence obtained during sequencing.
Example read1
Reference http://www.ebi.ac.uk/efo/EFO_0010195
Namespace ontology:cdna_read
Allowed Values index1 index2 read1 read2
cDNA Read Offset
Name cdna_read_offset
Description The starting position of the Complementary DNA (cDNA) read within the entire sequence, indicating where the read begins after any barcodes or technical sequences.
Example 6
Reference http://www.ebi.ac.uk/efo/EFO_0010201
Regex ^\d+$
Namespace ontology:cdna_read_offset
cDNA Read Size
Name cdna_read_size
Description The size of the Complementary DNA (cDNA) read.
Example 75
Reference http://www.ebi.ac.uk/efo/EFO_0010202
Regex ^\d+$
Namespace ontology:cdna_read_size
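The read, offset and size fields for the UMI, cell barcode and cDNA describe where each segment sits within a read. As an illustrative sketch only, the slice below assumes offsets are zero-based and counted from the start of the read, which the schema does not specify. The 16-base barcode and 12-base UMI used in the example are hypothetical values, not taken from any kit.

```python
def extract_segment(read_sequence: str, offset: int, size: int) -> str:
    """Return the `size` bases that start `offset` bases into the read."""
    if offset < 0 or size < 0:
        raise ValueError("offset and size must be non-negative")
    segment = read_sequence[offset:offset + size]
    if len(segment) != size:
        raise ValueError("read is too short for the requested offset and size")
    return segment


# Hypothetical read: a 16-base cell barcode followed by a 12-base UMI.
read1 = "ACGTACGTACGTACGTTTGGCCAAGGTT"
cell_barcode = extract_segment(read1, offset=0, size=16)
umi = extract_segment(read1, offset=16, size=12)
print(cell_barcode, umi)
```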
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File Derived From
Name file_derived_from
Description The name of the file that was used to generate the analysis derived data.
Example file1_sequencing.json
Reference #
Namespace ei:file_derived_from
Inferred Cell Type
Name inferred_cell_type
Description Post analysis cell type or identity declaration based on expression profile or known gene function identified by the performer.
Example type II bipolar neuron
Reference #
Namespace ei:inferred_cell_type
Post Analysis Cell Well Quality
Name post_analysis_cell_well_quality
Description Performer defined measure of whether the read output from the cell was included in the sequencing analysis. For example, cells might be excluded if a threshold percentage of reads did not map to the genome or if pre-sequencing quality measures were not passed.
Example Pass
Reference #
Namespace ei:post_analysis_cell_well_quality
Allowed Values Fail Pass
Other Derived Cell Attributes
Name other_derived_cell_attributes
Description Any other cell level measurement or annotation as result of the analysis.
Example Cluster
Reference #
Namespace ei:other_derived_cell_attributes
Allowed Values Cluster Count Gene UMI tSNE coordinates
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Reference Genome
Name reference_genome
Description Indicate version and include stable link to genome data (or attach genome fasta file).
Example GRCh38, https://example.org/grch38.fa
Reference #
Namespace ei:reference_genome
Genome Annotation
Name genome_annotation
Description Indicate version and include stable link. Also indicate if any modification to the original annotation has been applied (e.g. 3' UTR extension) and include modified annotation file employed in the analysis.
Example Ensembl v101, https://example.org/ensembl_v101.gtf
Reference #
Namespace ei:genome_annotation
Annotation Filtering
Name annotation_filtering
Description Indicate which features were filtered (e.g. protein coding, pseudogenes, TCRs, etc.)
Example Filtered to include only protein-coding genes
Reference #
Namespace ei:annotation_filtering
Genes vs Exons
Name genes_vs_exons
Description Quantification using whole gene intervals or exons.
Example Exon quantification
Reference #
Namespace ei:genes_vs_exons
Library Structure
Name library_structure
Description seqspec format
Example Single-cell 3' library
Reference #
Namespace ei:library_structure
Mapping and Demultiplexing Software
Name mapping_and_demultiplexing_software
Description The software, including version, used for read/UMI mapping and demultiplexing.
Example Cell Ranger 6.0.0
Reference #
Namespace ei:mapping_and_demultiplexing_software
Read Mapping Statistics
Name read_mapping_statistics
Description Statistics of the Reads or Unique Molecular Identifier (UMI).
Example 80% reads mapped to reference
Reference #
Namespace ei:read_mapping_statistics
Sequencing Saturation
Name sequencing_saturation
Description Depending on number of cells recovered (not targeted) and technology
Example 95% sequencing saturation
Reference #
Namespace ei:sequencing_saturation
UMIs or Barcode Distribution QC
Name umis_barcode_distribution_qc
Description Show Unique Molecular Identifiers (UMIs) per barcode distribution and threshold applied
Example Threshold: 10 UMIs per barcode
Reference #
Namespace ei:umis_barcode_distribution_qc
Cell or Non-Cell Filtering Strategy
Name cell_non_cell_filtering_strategy
Description Unique Molecular Identifier (UMI) threshold used to discriminate cells from non-cells. Description of algorithm (if any) and parameters used to determine cells or non-cells.
Example Threshold: 5 UMIs for cell detection
Reference #
Namespace ei:cell_non_cell_filtering_strategy
Other Quality Filters Applied
Name other_quality_filters_applied
Description Cells/nuclei discarded based on % mitochondrial reads, % rRNA reads, etc.
Example Cells with >20% mitochondrial reads discarded
Reference #
Namespace ei:other_quality_filters_applied
Ambient RNA QC
Name ambient_rna_qc
Description Report % UMIs in background cell barcodes, and algorithm (if any) used to remove ambient RNA
Example Ambient RNA removed if >5% UMIs in background barcodes
Reference #
Namespace ei:ambient_rna_qc
Predicted Doublet Rate QC
Name predicted_doublet_rate_qc
Description Depending on number of cells recovered (not targeted) and technology
Example Predicted doublet rate: 1.5%
Reference #
Namespace ei:predicted_doublet_rate_qc
Individual Organism SNP Demultiplexing
Name individual_organism_snp_demultiplexing
Description If carried out, show SNP partitioning quality (e.g. SNP UMAP embedding or covariance matrix), algorithm used
Example SNP UMAP embedding using CellSNP
Reference #
Namespace ei:individual_organism_snp_demultiplexing
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Clustering Algorithm and Version
Name clustering_algorithm_and_version
Description The clustering algorithm used, including its version.
Example Louvain 0.8.0
Reference #
Namespace ei:clustering_algorithm_and_version
Clustering Parameters
Name clustering_parameters
Description The parameters passed to the clustering algorithm.
Example Resolution: 0.6, K-nearest neighbors: 10
Reference #
Namespace ei:clustering_parameters
Integration/Batch Correction
Name integration_batch_correction
Description If compared/integrated with existing datasets
Example Harmony v1.0
Reference #
Namespace ei:integration_batch_correction
Source Code
Name source_code
Description If any newly developed code/software has been used in the processing and downstream analysis of the dataset.
Example Source code is hosted on GitHub and includes custom algorithms for UMI count normalization. The repository can be found at: https://github.com/user/umi-normalization.
Reference #
Namespace ei:source_code
UMI Count Matrix
Name umi_count_matrix
Description Gene x cell matrix with UMI counts for each gene in each cell.
Example The UMI count matrix is stored in a CSV file with gene IDs as rows (e.g., ENSG00000139618) and cell barcodes as columns (e.g., Cell_001, Cell_002). The matrix file is available at: https://example.com/umi_count_matrix.csv.
Reference #
Namespace ei:umi_count_matrix
Ensembl IDs
Name ensembl_ids
Description Gene or transcript names should be listed as Ensembl (or other standardized ID), with gene short names in metadata.
Example ENSG00000139618
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:ensembl_ids
Functional Gene Annotations
Name functional_gene_annotations
Description Any functional annotation generated/used (gene names, GOs, structural domains, etc.).
Example Functional gene annotations, including Gene Ontology (GO) terms, are provided in the metadata. For example, the gene 'ENSG00000139618' (BRCA2) is annotated with the GO term 'GO:0003677' (DNA binding).
Reference #
Namespace ei:functional_gene_annotations
Protein Models
Name protein_models
Description FASTA file with (or stable link to) the predicted proteins associated to genes in the UMI count matrix and matching IDs.
Example The protein sequences for genes are provided in a FASTA file available at: https://example.com/protein_models.fasta, where each protein sequence is linked to the corresponding gene ID.
Reference #
Namespace ei:protein_models
Cell Metadata
Name cell_metadata
Description Table mapping cell IDs to cluster/cell type/broad cell type annotations.
Example Cell metadata includes information such as cell type annotations ('Tumor', 'Normal') and experimental conditions ('Control', 'Treatment'). This data is available in a table at: https://example.com/cell_metadata.csv.
Reference #
Namespace ei:cell_metadata
Cluster-Level Normalised Expression Tables
Name cluster_level_normalised_expression_tables
Description Expression tables that show normalised gene expression at the cluster or cell-type level.
Example Normalised gene expression data at the cluster level is provided in a comma-separated (CSV) file. For example, gene 'ENSG00000139618' (BRCA2) has expression values for clusters: Cluster_1: 1200, Cluster_2: 900. The full expression table is available at: https://example.com/cluster_level_expression.csv.
Reference #
Namespace ei:cluster_level_normalised_expression_tables
Other Resource Files
Name other_resource_files
Description Necessary to re-use and interpret the data. E.g. barcode information in complex, serial multiplexing protocols (clicktags).
Example Barcode information used in multiplexing protocols is provided in a separate file, which can be accessed at: https://example.com/barcode_data.csv.
Reference #
Namespace ei:other_resource_files
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name file_id
Description A unique alphanumeric identifier for this file
Example FILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric reference or identifier for the library preparation protocol used during the sequencing.
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Sequencing ID Required
Name sequencing_id
Description A unique alphanumeric reference or identifier for the sequencing protocol.
Example SEQ001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:sequencing_id
Read 1 File Required
Name read_1_file
Description The name or accession of the file that contains read 1.
Example file1_r1.fastq.gz
Reference #
Namespace ei:read_1_file
Read 2 File
Name read_2_file
Description The name or accession of the file that contains read 2.
Example file2_r2.fastq.gz
Reference #
Namespace ei:read_2_file
Index 1 File
Name index_1_file
Description The name of the file that contains index 1.
Example file1_i1.fastq.gz
Reference #
Namespace ei:index_1_file
Index 2 File
Name index_2_file
Description The name of the file that contains index 2.
Example file2_i2.fastq.gz
Reference #
Namespace ei:index_2_file
Read 1 Checksum Required
Name read_1_file_checksum
Description Result of a hash function calculated on the content of the read 1 file to verify file integrity. Commonly used algorithms include MD5 and SHA-1. The checksums should be separated by a comma (,).
Example f8d29e41a73b5c02de9a6fb314e7c8ad
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_1_file_checksum
Read 2 Checksum
Name read_2_file_checksum
Description Result of a hash function calculated on the content of the read 2 file to verify file integrity. Commonly used algorithms include MD5 and SHA-1. The checksums should be separated by a comma (,).
Example a3f4c1b29d8e57fa41b02de6c7f9ab83
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:read_2_file_checksum
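The checksum Regex for read 1 and read 2 accepts exactly 32 lowercase hexadecimal characters, which is the shape of an MD5 digest. A minimal sketch of producing such a value with Python's standard hashlib follows; the function name and chunk size are illustrative choices, not part of the schema.

```python
import hashlib
import re

# Pattern copied from the Regex line of the checksum fields.
MD5_PATTERN = re.compile(r"^[0-9a-f]{32}$")


def md5_checksum(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks matters for FASTQ files, which are often too large to load into memory at once. `hexdigest()` returns lowercase characters, so the result fits the pattern without further conversion.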
White List Barcode File
Name white_list_barcode_file
Description A file containing the known cell barcodes in the dataset.
Example barcodes.tsv
Reference #
Namespace ei:white_list_barcode_file
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
Expression Data Process Setting ID Required
Name expression_data_process_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_process_setting_id
Matrix Type
Name matrix_type
Description The type of expression matrix, indicating the processing applied to its values.
Example raw_counts
Reference #
Namespace ei:matrix_type
Allowed Values imputed log1p normalised pseudobulk raw_counts scaled
Reference Genome Required
Name reference_genome
Description The associated reference genome
Example https://reference-genome-example.com
Reference #
Regex ^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$
Namespace ei:reference_genome
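The reference_genome Regex in this section accepts one URL, or several URLs joined by a pipe character (|). The sketch below validates a value and splits it into individual links. It assumes the pipe is purely a separator, and the helper name and example hosts are hypothetical.

```python
import re

# Pattern copied from the Regex line of reference_genome.
REFERENCE_GENOME_PATTERN = re.compile(
    r"^((https?|ftp):\/\/[^\s|]+)(\|((https?|ftp):\/\/[^\s|]+))*$"
)


def split_reference_genomes(value: str) -> list[str]:
    """Validate a pipe-separated list of URLs and return the individual URLs."""
    if REFERENCE_GENOME_PATTERN.fullmatch(value) is None:
        raise ValueError("expected one or more http(s)/ftp URLs separated by '|'")
    return value.split("|")


print(split_reference_genomes("https://reference-genome-example.com"))
print(split_reference_genomes("https://a.example/genome.fa|ftp://b.example/genome.fa"))
```

Because the pattern requires a URL scheme, a bare assembly name such as GRCh38 is rejected here.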
Annotation Version
Name annotation_version
Description The annotation version of the associated reference genome
Example GENCODE v44
Reference #
Namespace ei:annotation_version
Normalisation Method
Name normalisation_method
Description Any normalisation processing performed
Example Log Normalisation
Reference #
Namespace ei:normalisation_method
Allowed Values Library Size Normalisation Log Normalisation SCNorm SCTransform scran
Highly Variable Gene Selection (HVG)
Name highly_variable_gene_selection
Description The method used to select highly variable genes and the number of genes selected
Example seurat_v3, n=2000
Reference #
Namespace ei:highly_variable_gene_selection
Dimensionality Reduction
Name dimensionality_reduction
Description Method used to reduce dimensionality in the expression data
Example PCA
Reference #
Namespace ei:dimensionality_reduction
Allowed Values Diffusion Map ICA NMF PCA UMAP t-SNE
Number of Nearest Neighbours
Name n_neighbours
Description Number of nearest neighbours used to calculate cluster membership
Example pca:50
Reference #
Namespace ei:n_neighbours
Clustering Algorithm
Name clustering_algorithm
Description Algorithm used to create clusters
Reference #
Namespace ei:clustering_algorithm
Clustering Resolution
Name clustering_resolution
Description Resolution parameter
Example 2.5
Reference #
Regex ^([0-9]*[.])?[0-9]+
Namespace ei:clustering_resolution
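The clustering_resolution Regex has a start anchor but no trailing `$`, so whether trailing text is rejected depends on how a validator applies it. The sketch below illustrates the difference using Python's `re` module. How the actual validator applies the pattern is not stated in this document, so treat this as an illustration only.

```python
import re

# Pattern copied from the Regex line of clustering_resolution (no trailing "$").
RESOLUTION_PATTERN = re.compile(r"^([0-9]*[.])?[0-9]+")


def is_valid_resolution(value: str) -> bool:
    """Require the whole value to be a non-negative decimal number."""
    return RESOLUTION_PATTERN.fullmatch(value) is not None


# A prefix match accepts trailing text; a full match does not.
print(RESOLUTION_PATTERN.match("2.5abc") is not None)
print(is_valid_resolution("2.5abc"))
print(is_valid_resolution("2.5"))
```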
Clustering Distance Metric
Name clustering_distance_metric
Description Metric used to calculate a point's distance to others
Example cosine
Reference #
Namespace ei:clustering_distance_metric
Allowed Values cosine euclidean hamming jaccard manhattan mahalanobis
Software Versions
Name software_versions
Description Primary software packages used for analysis
Reference #
Namespace ei:software_versions
Cell Type Annotation
Name cell-type annotation
Description Tools and Databases used for cell annotation
Reference #
Namespace ei:cell-type annotation
Generated by Pipeline
Name generated_by_pipeline
Description URL of the deposited pipeline used to create this data
Reference #
Regex ^(https?|ftp):\/\/[^\s/$.?#].[^\s]*$
Namespace ei:generated_by_pipeline
Notes
Name notes
Description Any other information
Reference #
Namespace ei:notes
Study ID Required
Name study_id
Description A unique alphanumeric identifier for this study
Example STUDY001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:study_id
File ID Required
Name expression_data_file_id
Description A unique alphanumeric identifier for the expression data file
Example EXPFILE001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_file_id
Library Preparation ID Required
Name library_prep_id
Description A unique alphanumeric identifier for library preparation
Example LIBPREP001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:library_prep_id
Expression Data Process Setting ID Required
Name expression_data_setting_id
Description A unique alphanumeric identifier for the expression data process setting
Example EXPSET001
Reference #
Regex ^[a-zA-Z0-9]+$
Namespace ei:expression_data_setting_id
File Name Required
Name expression_data_file
Description Expression data file name
Example exp_file.csv
Reference #
Namespace ei:expression_data_file
File MD5 Checksum Required
Name expression_data_file_checksum
Description Calculated MD5 checksum for this file
Example 9e4b7a23f6c1d0ab85f29c47e3d8a610
Reference #
Regex ^[0-9a-f]{32}$
Namespace ei:expression_data_file_checksum
File Format Required
Name expression_data_file_format
Description The format of the expression file, such as h5ad or rds
Example csv
Reference #
Namespace ei:expression_data_file_format
Allowed Values csv h5ad loom mtx rds
Number of Cells
Name n_cells
Description The number of cells represented in the expression data
Example 4
Reference #
Regex ^\d+$
Namespace ei:n_cells
Number of Genes
Name n_genes
Description The number of genes represented in the expression data
Example 50
Reference #
Regex ^\d+$
Namespace ei:n_genes
File Size in Bytes
Name file_size_bytes
Description Size of the file recorded in bytes
Example 90
Reference #
Regex ^\d+$
Namespace ei:file_size_bytes
Date Generated
Name date_generated
Description Approximate date this expression data was generated
Example 2024-10-14
Reference #
Regex ^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$
Namespace ei:date_generated
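Taken together, the Required markers, Regex lines and Allowed Values of the expression data file fields amount to a set of per-field rules. The sketch below is one possible way to check a single record against them. It assumes a record is a plain mapping of field name to string value; the `RULES` table and `validate_record` are hypothetical and not an official validator.

```python
import re

ID = r"^[a-zA-Z0-9]+$"
DIGITS = r"^\d+$"

# Rules transcribed from the expression data file fields in this section.
RULES = {
    "study_id": {"required": True, "regex": ID},
    "expression_data_file_id": {"required": True, "regex": ID},
    "library_prep_id": {"required": True, "regex": ID},
    "expression_data_setting_id": {"required": True, "regex": ID},
    "expression_data_file": {"required": True},
    "expression_data_file_checksum": {"required": True, "regex": r"^[0-9a-f]{32}$"},
    "expression_data_file_format": {
        "required": True,
        "allowed": {"csv", "h5ad", "loom", "mtx", "rds"},
    },
    "n_cells": {"regex": DIGITS},
    "n_genes": {"regex": DIGITS},
    "file_size_bytes": {"regex": DIGITS},
    "date_generated": {"regex": r"^\d{4}-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$"},
}


def validate_record(record: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors = []
    for name, rule in RULES.items():
        value = record.get(name, "")
        if value == "":
            if rule.get("required"):
                errors.append(f"{name}: missing required value")
            continue
        pattern = rule.get("regex")
        if pattern and re.fullmatch(pattern, value) is None:
            errors.append(f"{name}: value does not match {pattern}")
        allowed = rule.get("allowed")
        if allowed and value not in allowed:
            errors.append(f"{name}: value is not one of {sorted(allowed)}")
    return errors


record = {
    "study_id": "STUDY001",
    "expression_data_file_id": "EXPFILE001",
    "library_prep_id": "LIBPREP001",
    "expression_data_setting_id": "EXPSET001",
    "expression_data_file": "exp_file.csv",
    "expression_data_file_checksum": "9e4b7a23f6c1d0ab85f29c47e3d8a610",
    "expression_data_file_format": "csv",
    "n_cells": "4",
}
print(validate_record(record))
```

Optional fields are only checked when a value is present, which mirrors how the fields without a Required marker are described above.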