1.4 Preparing a Mapping File

Mapping File (tab-delimited .txt): The mapping file is created by the user (e.g. My_Project.txt) and contains all of the sample information needed to perform the data analysis.

At a minimum, the mapping file must meet the following requirements:

  1. The first column header must be “#SampleID”.
  2. The second column header must be “BarcodeSequence”. Cells in this column can be left empty if barcodes are not available.
  3. The third column header must be “LinkerPrimerSequence”.
  4. All subsequent column headers (except the last one) are metadata headers. For example, a “Smoker” column would contain either “Yes” or “No”. Data in each column are assumed to be categorical unless specified otherwise, and each categorical column must include at least 2 unique values. For missing data, write “NA”; do not leave cells blank.
  5. The last column of the mapping file must be named “Description”. This column holds information that is unique to each sample, such as the medications taken by the patient, or any other descriptive information.
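The requirements in the list above can be checked with a few lines of Python before running the real validator. The following is only a minimal sketch: the function names are hypothetical and not part of QIIME, every metadata column is treated as categorical, and it does not replace validate_mapping_file.py.

```python
import csv
import io

REQUIRED_LEADING = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence"]
REQUIRED_LAST = "Description"


def check_mapping_header(header):
    """Return a list of problems found in the header row (empty if none)."""
    problems = []
    if header[:3] != REQUIRED_LEADING:
        problems.append("first three headers must be " + ", ".join(REQUIRED_LEADING))
    if not header or header[-1] != REQUIRED_LAST:
        problems.append("last header must be " + REQUIRED_LAST)
    return problems


def check_metadata_columns(header, rows):
    """Flag blank metadata cells and metadata columns with fewer than 2 unique values."""
    problems = []
    # Metadata columns sit between the three required leading columns and Description.
    for idx in range(3, len(header) - 1):
        values = [row[idx] if idx < len(row) else "" for row in rows]
        if any(value.strip() == "" for value in values):
            problems.append(header[idx] + ": blank cell, write NA instead")
        if len(set(values)) < 2:
            problems.append(header[idx] + ": fewer than 2 unique values")
    return problems


def check_mapping_text(text):
    """Parse tab-delimited text and run both checks (assumption: first row is the header)."""
    table = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not table:
        return ["file is empty"]
    header, rows = table[0], table[1:]
    return check_mapping_header(header) + check_metadata_columns(header, rows)


# Illustrative data only; the sample IDs and barcodes are made up.
good = (
    "#SampleID\tBarcodeSequence\tLinkerPrimerSequence\tSmoker\tDescription\n"
    "S1\tACGT\tagagtttgatcmtggctcag\tYes\tpatient_1\n"
    "S2\tTGCA\tagagtttgatcmtggctcag\tNo\tpatient_2\n"
)
print(check_mapping_text(good))  # prints []
```

An empty list means none of the checked requirements were violated; any other result lists the columns that need attention.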

Example 1:

#SampleID	BarcodeSequence	LinkerPrimerSequence	ReversePrimer	region	Visit	Patient	Description
101V2	TGATACGTCT	agagtttgatcmtggctcag	gcwgcctcccgtaggagt	V1V2	V2	101V2	No_treatment

Example 2 (BarcodeSequence and LinkerPrimerSequence cells left empty):

#SampleID	BarcodeSequence	LinkerPrimerSequence	InputFileName	Description
EB10			EB10.fasta	Horse10
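Because the file must be tab-delimited and empty cells still need their tab separators, it can be safer to generate the file programmatically than to type it by hand. The sketch below uses Python's standard csv module to reproduce the row from Example 2; the helper name format_mapping is hypothetical and not part of QIIME.

```python
import csv
import io


def format_mapping(header, rows):
    """Return the header and sample rows as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        # Every row needs one cell per header, even when the cell is empty.
        if len(row) != len(header):
            raise ValueError("row has %d cells, header has %d" % (len(row), len(header)))
        writer.writerow(row)
    return buffer.getvalue()


header = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence",
          "InputFileName", "Description"]
# Barcode and primer cells are empty strings, as in Example 2.
rows = [["EB10", "", "", "EB10.fasta", "Horse10"]]
mapping_text = format_mapping(header, rows)
print(repr(mapping_text))
```

The returned text can then be written to a file such as My_Project.txt with an ordinary open(..., "w") call.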

Check the mapping file for errors with validate_mapping_file.py. The command writes an interactive HTML file that displays any errors found.

Validating mapping files without barcodes and/or primers: the mapping file will still show a warning. Because it lacks barcodes, it has no way to differentiate sequences and thus cannot be used for demultiplexing. However, such warnings can be ignored if the mapping file is being used only for steps downstream of demultiplexing.

validate_mapping_file.py -m map.txt -o validate_map -p -b

-m, --mapping_fp: Metadata mapping filepath.

-o, --output_dir: Required output directory for log file, corrected mapping file, and HTML file.

-b, --not_barcoded: Use -b if barcodes are not present. The BarcodeSequence header is still required. [default: False]

-p, --disable_primer_check: Use -p to disable checks for primers. The LinkerPrimerSequence header is still required. [default: False]
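When validation is part of a scripted workflow, the command line can be assembled from the options described above. This is a sketch under assumptions: build_validate_command is a hypothetical helper, it uses only the -m, -o, -b, and -p flags listed here, and actually running the command requires a QIIME installation that provides validate_mapping_file.py.

```python
import shlex


def build_validate_command(mapping_fp, output_dir,
                           not_barcoded=False, disable_primer_check=False):
    """Return the validate_mapping_file.py call as an argument list."""
    args = ["validate_mapping_file.py", "-m", mapping_fp, "-o", output_dir]
    if disable_primer_check:
        args.append("-p")  # skip primer checks
    if not_barcoded:
        args.append("-b")  # mapping file has no barcodes
    return args


command = build_validate_command("map.txt", "validate_map",
                                 not_barcoded=True, disable_primer_check=True)
command_line = " ".join(shlex.quote(arg) for arg in command)
print(command_line)

# To execute it (only where QIIME is installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list, rather than a single string, avoids shell quoting problems when file paths contain spaces.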
