Help for filling out templates for High Content Screen submissions to the Image Data Resource (IDR)

When submitting data to the IDR, you should provide 3 basic files:

  1. study file describing the overall study and the screens that were performed
  2. library file(s) describing the plate layout of each screen
  3. processed data file(s) containing summary results and/or a ‘hit’ list for each screen

All files should be in tab-delimited text format. Templates are provided but can be modified to suit your experiment. Add or remove columns from the templates as necessary. The templates can be opened in Excel, Open Office etc.
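
As an illustration of the tab-delimited format, here is a minimal Python sketch that reads such a file with the standard csv module. The function name read_tab_delimited and the column names in the sample are hypothetical and are not taken from the IDR templates.

```python
import csv
import io


def read_tab_delimited(handle):
    """Return (header, rows) from a tab-delimited text stream.

    Each row is a dict keyed by the column names found in the first line.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    rows = list(reader)
    return reader.fieldnames, rows


# Hypothetical sample content; real files follow the provided templates.
sample = "Plate\tWell\tGene Symbol\nplate1\tA1\tTP53\nplate1\tA2\t\n"
header, rows = read_tab_delimited(io.StringIO(sample))
```

To read a file from disk, pass an open file object instead of the in-memory sample, for example open(path, newline="", encoding="utf-8"), where path is whatever you named your file.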

1. Study File

This file should contain a general description of the study and then list the screen(s) performed, along with the protocols describing how each screen was set up and imaged and how the data were analyzed, including the criteria used to identify genes of interest.

1a. Study description:

  • Study title (mandatory)
  • Study description (mandatory)
  • Experiment publication (PubMed ID, if applicable)
  • Primary contact information (mandatory)

1b. Study screens:

For each screen the following information should be included:

  • Screen type (primary/validation/secondary/other) (mandatory)
  • Screen description – a brief description of the aims or contents of the screen (mandatory)
  • Name of the library file describing the plate layout (mandatory)
  • Protocols –
    • cell growth
    • treatment (if applicable)
    • library selection and construction (mandatory)
    • image acquisition and feature extraction (mandatory)
    • data analysis – the analysis pipeline used to go from the image data to any final analysis results, such as a list of hits and the associated phenotypes. Please specify the scores and thresholds used to assign phenotypes. (mandatory)
  • Phenotypes identified – list any cellular phenotypes identified e.g. large nucleus, elongated cell, delay of mitosis, localization of protein
  • Raw Image data – type of files and organization (mandatory)
  • Name of the processed data file (mandatory)
  • Description of each column in the processed data file (mandatory)
  • If there is more than one screen in the study, copy and paste the previous ‘screen block’ of text so that it can be filled in again for the next screen.

2. Library File

Each screen should have a library file describing the plates and the cells in each well. Columns can be added or removed as appropriate but this file should include:

  • plate layouts (plate + well location on the plate). The plate names should correspond to the names of the directories in which the images are stored. (mandatory)
  • description of the genotype of the cells in each well or the reagents (siRNA, compound etc) applied to each well including:
    • RNAi screens
      • siRNA identifiers (mandatory) in a column called ‘siRNA Identifier’ or ‘siRNA Pool Identifier’ as appropriate.
      • siRNA sequences (if available)
      • target gene identifier and symbol (if available), and the gene annotation build the siRNAs were mapped to in this analysis (if available)
    • gene knockouts
      • Identifier and symbol of the gene knocked out (mandatory), in columns called ‘Gene Identifier’ and ‘Gene Symbol’
    • tagged proteins and overexpression of genes
      • The identifier of the tagged/overexpressed ORF. The relevant gene build information should be given in the screens section of the study file. (mandatory)
      • The associated gene identifier and symbol (if available) in columns called ‘Gene Identifier’ and ‘Gene Symbol’
    • other systems
      • An identifier for any reagent used (mandatory) e.g. compound name
      • Any associated gene identifiers and gene symbols (if available)
  • list of control wells (empty well, positive control, negative control) (mandatory)
  • list of any quality control carried out at the well level, e.g. wells rejected due to too few cells for analysis or out-of-focus images.
  • The Reagent Gene Annotation Build refers to the gene annotation build used to design the reagent, e.g. siRNA. The Analysis Gene Annotation Build refers to the gene annotation build used in the current analysis. Sometimes these are the same; in other cases the reagents have been remapped to a more recent build. The gene annotation build should contain both the genome build and the gene build if possible, e.g. “GRCh37, Ensembl release 61, Feb 2011” or “NCBI36.3, RefSeq release 27, Jan 2008”.
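
The column requirements above could be checked before submission with something like the following Python sketch. It is only an illustration: the function check_library_header is hypothetical, the ‘Plate’ and ‘Well’ column names are assumptions (check your template for the actual names), and the identifier check is a heuristic that will not suit every screen type, for example a compound screen that uses a compound name column.

```python
# Assumed column names for plate and well location; the templates may differ.
LOCATION_COLUMNS = ("Plate", "Well")
# Identifier columns named in the guidance above; at least one is expected
# for RNAi, knockout and tagged-protein screens.
IDENTIFIER_COLUMNS = ("siRNA Identifier", "siRNA Pool Identifier", "Gene Identifier")


def check_library_header(header, identifier_columns=IDENTIFIER_COLUMNS):
    """Return a list of human-readable problems found in a library file header."""
    problems = []
    for name in LOCATION_COLUMNS:
        if name not in header:
            problems.append(f"missing column: {name}")
    if not any(name in header for name in identifier_columns):
        problems.append("no reagent or gene identifier column found")
    return problems


ok = check_library_header(["Plate", "Well", "siRNA Identifier", "Gene Symbol"])
bad = check_library_header(["Plate", "Gene Symbol"])
```

For other systems, pass your own identifier column names through the identifier_columns parameter.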

3. Processed Data Files

Each screen should have a processed data file that contains summary information about the results found. This may be a table from the associated publication. This file can contain information relating to each well in a screen e.g. the total number of cells and number of tagged protein foci, or be a list of genes found to show significant phenotypes.

The information in the processed data should be linkable to the library file in some way, e.g. by a combination of plate and well, or by identifiers such as siRNA or gene identifiers.
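
One way to confirm that processed data rows can be linked back to the library file is sketched below in Python. It assumes both files have already been read into lists of dicts and that both use ‘Plate’ and ‘Well’ columns; those column names, the ‘Cell Count’ column and the function unlinked_rows are all hypothetical.

```python
def unlinked_rows(processed_rows, library_rows, keys=("Plate", "Well")):
    """Return processed rows whose key values have no match in the library rows."""
    known = {tuple(row[k] for k in keys) for row in library_rows}
    return [row for row in processed_rows if tuple(row[k] for k in keys) not in known]


# Hypothetical sample rows standing in for the two files.
library = [
    {"Plate": "plate1", "Well": "A1"},
    {"Plate": "plate1", "Well": "A2"},
]
processed = [
    {"Plate": "plate1", "Well": "A1", "Cell Count": "250"},
    {"Plate": "plate2", "Well": "B7", "Cell Count": "90"},
]
missing = unlinked_rows(processed, library)
```

If the files are linked by an identifier instead, the same function can be called with a different keys value, for example keys=("Gene Identifier",).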

The contents of each column in the processed data file should be described in the study file so that it is clear what the values are.

Lists of terms that can be used in the study and library files can be found in this Google Doc.

Example files can be found in a GitHub repository. If you are not familiar with GitHub, it is a way of storing versions of files. You can browse the files through the GitHub interface. To download a file, click the ‘Raw’ button at the top right above the file preview and then choose File -> Save Page As in your browser; if you are familiar with GitHub, you can clone the repository instead. The repository contains files in addition to the study, library and processed data files, but you do not need to provide these.

Examples of HCS studies include:

Email with any questions.

© 2016-2017 University of Dundee & Open Microscopy Environment

OMERO is distributed under the terms of the GNU GPL. For more information, visit