Extract training data from BAM files
get_training_data(
bam_paths,
reference_path,
bed_include_path = NULL,
factor = 1,
common_positions_to_exclude_paths = NULL,
positions_to_exclude_paths = NULL,
mm_rate_max = 1,
verbose = FALSE
)
bam_paths: Vector of strings. Paths to the .bam files to extract training data from.
reference_path: String. Path to the reference genome fasta file.
bed_include_path: String. Path to a bed file with regions to include. Default is NULL.
factor: Number between 0 and 1. Ratio between negative and positive data. Default is 1.
common_positions_to_exclude_paths: Vector of strings. List of files with positions to exclude from all samples. Default is NULL.
positions_to_exclude_paths: Vector of strings. List of files with positions to exclude from training, with length equal to the number of samples. Default is NULL.
mm_rate_max: Number between 0 and 1. Maximum mismatch rate at a position. Default is 1.
verbose: TODO: Write this
A list containing two elements:
data: A tbl_df with dimensions 2 x 22.
info: A data.frame with dimensions 1 x 4.
train_dreams_model(): Function for training model.
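The following is a minimal sketch of how a call might look, not an example taken from the package itself. It assumes the function is exported from a package named dreams. All file names are hypothetical placeholders, and check_training_inputs() is an illustrative helper that is not part of the package. It only checks the constraints stated in the argument descriptions above. The extraction step is skipped unless the package and every input file are present.

```r
# Hypothetical input paths; replace with real files before running.
bam_paths <- c("sample1.bam", "sample2.bam")
reference_path <- "reference.fasta"
# One exclusion file per sample, in the same order as bam_paths.
positions_to_exclude_paths <- c("sample1_exclude.txt", "sample2_exclude.txt")

# Illustrative helper (NOT part of the package): check the documented
# constraints on the arguments before starting the extraction.
check_training_inputs <- function(bam_paths,
                                  positions_to_exclude_paths = NULL,
                                  factor = 1,
                                  mm_rate_max = 1) {
  stopifnot(
    is.character(bam_paths), length(bam_paths) > 0,
    is.numeric(factor), factor >= 0, factor <= 1,
    is.numeric(mm_rate_max), mm_rate_max >= 0, mm_rate_max <= 1
  )
  if (!is.null(positions_to_exclude_paths)) {
    # Documented requirement: length equal to the number of samples.
    stopifnot(length(positions_to_exclude_paths) == length(bam_paths))
  }
  invisible(TRUE)
}

check_training_inputs(bam_paths, positions_to_exclude_paths)

# Assumption: the package is named "dreams". Run the extraction only when
# the package is installed and all input files exist.
inputs <- c(bam_paths, reference_path, positions_to_exclude_paths)
if (requireNamespace("dreams", quietly = TRUE) && all(file.exists(inputs))) {
  training <- dreams::get_training_data(
    bam_paths = bam_paths,
    reference_path = reference_path,
    positions_to_exclude_paths = positions_to_exclude_paths,
    factor = 1,
    mm_rate_max = 1
  )
  str(training$data)  # the tbl_df element of the returned list
  training$info       # the data.frame element of the returned list
}
```

Naming every argument in the call keeps the per-sample exclusion files from being matched positionally to bed_include_path or factor.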