R/get_training_data_indels.R
get_training_data_indels.Rd
Extract training data from BAM files (indel)
get_training_data_indels(
  bam_paths,
  reference_path,
  bed_include_path = NULL,
  factor = 1,
  common_positions_to_exclude_paths = NULL,
  positions_to_exclude_paths = NULL,
  mm_rate_max = 1,
  verbose = F
)
bam_paths: Vector of strings. Paths to the .bam files to extract training data from.
reference_path: String. Path to the reference genome FASTA file.
bed_include_path: String. Path to a BED file with regions to include. Default is NULL.
factor: Number between 0 and 1. Ratio between negative and positive data. Default is 1.
common_positions_to_exclude_paths: Vector of strings. Paths to files with positions to exclude from all samples. Default is NULL.
positions_to_exclude_paths: Vector of strings, with length equal to the number of samples. Paths to files with positions to exclude from training. Default is NULL.
mm_rate_max: Number between 0 and 1. Maximum mismatch rate in a position. Default is 1.
verbose: TODO: Write this
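The length requirement on the per-sample exclusion vector can be checked before calling the function. The following is a minimal sketch: the file names are placeholders, check_exclusion_paths() is a hypothetical helper that is not part of the package, and it assumes the vector is meant to hold one entry per BAM file.

```r
# Placeholder file names, used only for illustration.
bam_paths <- c("sample1.bam", "sample2.bam", "sample3.bam")
positions_to_exclude_paths <- c(
  "sample1_exclude.txt",
  "sample2_exclude.txt",
  "sample3_exclude.txt"
)

# Hypothetical helper (not part of the package): returns TRUE when the
# per-sample exclusion vector is NULL or has one entry per BAM file.
check_exclusion_paths <- function(bam_paths, positions_to_exclude_paths = NULL) {
  if (is.null(positions_to_exclude_paths)) {
    return(TRUE)
  }
  length(positions_to_exclude_paths) == length(bam_paths)
}

# Stops with an error if the lengths do not match.
stopifnot(check_exclusion_paths(bam_paths, positions_to_exclude_paths))
```

A check like this fails early, before any BAM file is read, if a sample was added without a matching exclusion file.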
A list containing two elements:
data: A tbl_df
info: A data.frame
train_dreams_model(): Function for training the model.
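A call might look like the sketch below. The input paths are placeholders, and the package name dreams is an assumption that this page does not state. The call is guarded so that the snippet runs even when the package or the input files are missing.

```r
# Placeholder input paths; replace with real files.
bam_paths <- c("sample1.bam", "sample2.bam")
reference_path <- "reference.fa"

# Arguments spelled out with the defaults from the usage section.
call_args <- list(
  bam_paths = bam_paths,
  reference_path = reference_path,
  bed_include_path = NULL,
  factor = 1,
  common_positions_to_exclude_paths = NULL,
  positions_to_exclude_paths = NULL,
  mm_rate_max = 1,
  verbose = FALSE
)

# Only call the function if the package (assumed to be named "dreams")
# is installed and all input files exist.
inputs_exist <- all(file.exists(c(bam_paths, reference_path)))
if (requireNamespace("dreams", quietly = TRUE) && inputs_exist) {
  training <- do.call(dreams::get_training_data_indels, call_args)
  str(training$data)  # the tbl_df element
  str(training$info)  # the data.frame element
} else {
  message("Package or input files not available; skipping the call.")
}
```

The returned list is kept whole in training, and its data and info elements are accessed by name as described in the value section.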