Extract training data from BAM files (indel)

get_training_data_indels(
  bam_paths,
  reference_path,
  bed_include_path = NULL,
  factor = 1,
  common_positions_to_exclude_paths = NULL,
  positions_to_exclude_paths = NULL,
  mm_rate_max = 1,
  verbose = FALSE
)

Arguments

bam_paths: Vector of strings. Paths to .bam files to extract training data from.
reference_path: String. Path to reference genome fasta file.
bed_include_path: String. Path to bed-file with regions to include. Default is NULL.
factor: Number between 0 and 1. Ratio between negative and positive data. Default is 1.
common_positions_to_exclude_paths: Vector of strings. List of files with positions to exclude from all samples. Default is NULL.
positions_to_exclude_paths: Vector of strings. Files with positions to exclude from training, one file per sample, so its length must equal the number of samples. Default is NULL.
mm_rate_max: Number between 0 and 1. Maximum mismatch rate in position. Default is 1.
verbose: Logical. Whether to print progress messages while running. Default is FALSE.
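
The constraints listed above (value ranges for factor and mm_rate_max, and one exclusion file per sample) can be checked before starting an extraction. The following is only a minimal sketch: check_indel_training_args is a hypothetical helper that is not part of the package, and the file names are placeholders.

```r
# Hypothetical helper (not part of the package): checks the documented
# constraints on the arguments before an extraction run.
check_indel_training_args <- function(bam_paths,
                                      positions_to_exclude_paths = NULL,
                                      factor = 1,
                                      mm_rate_max = 1) {
  stopifnot(
    is.character(bam_paths), length(bam_paths) > 0,
    is.numeric(factor), length(factor) == 1, factor >= 0, factor <= 1,
    is.numeric(mm_rate_max), length(mm_rate_max) == 1,
    mm_rate_max >= 0, mm_rate_max <= 1
  )
  # One exclusion file per sample, as documented for positions_to_exclude_paths.
  if (!is.null(positions_to_exclude_paths)) {
    stopifnot(length(positions_to_exclude_paths) == length(bam_paths))
  }
  invisible(TRUE)
}

# Placeholder paths, for illustration only; no files are read here.
check_indel_training_args(
  bam_paths = c("sample1.bam", "sample2.bam"),
  positions_to_exclude_paths = c("sample1_exclude.txt", "sample2_exclude.txt"),
  factor = 0.5
)
```

The helper only inspects the argument values, so it can be run without any BAM files present.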

Value

A list containing two elements:

data: A tbl_df
info: A data.frame
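
Examples

A possible call, sketched from the usage and value descriptions above. The file paths are placeholders rather than files shipped with the package, and the guard simply skips the call when the function or the input files are not available in the current session.

```r
# Placeholder paths: replace with real files before running.
bam_paths <- c("sample1.bam", "sample2.bam")
reference_path <- "reference.fa"

# Only run when the function is available and the inputs exist.
can_run <- exists("get_training_data_indels") &&
  all(file.exists(c(bam_paths, reference_path)))

if (can_run) {
  training <- get_training_data_indels(
    bam_paths = bam_paths,
    reference_path = reference_path,
    factor = 1,
    mm_rate_max = 1
  )

  # The return value is a list with two elements, `data` and `info`.
  str(training$data)
  str(training$info)
}
```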
