| Title: | Contextualizing untargeted Annotation with Semi-quantitative Charged Aerosol Detection for pertinent characterization of natural Extracts |
|---|---|
| Description: | This package provides the infrastructure to perform Automated Composition Assessment of Natural Extracts. |
| Authors: | Adriano Rutz [aut, cre] (ORCID: <https://orcid.org/0000-0003-0443-9902>) |
| Maintainer: | Adriano Rutz <[email protected]> |
| License: | AGPL (>= 3) |
| Version: | 0.0.0.9002 |
| Built: | 2026-05-28 14:50:30 UTC |
| Source: | https://github.com/adafede/cascade |
Add chromato line
add_chromato_line(plot, chromato, shift = 0, normalize_time, name, color, polarity = "pos")
plot |
Plot |
chromato |
Chromato |
shift |
Shift |
normalize_time |
Normalize time |
name |
Name |
color |
Color |
polarity |
Polarity |
A plot with added chromato line
Baseline chromatogram
baseline_chromatogram(df, method = "peakDetection", ...)
df |
Dataframe |
method |
Baseline correction method. Default is "peakDetection". See
|
... |
Additional arguments passed to |
A dataframe with baselined chromatogram
Change intensity name
change_intensity_name(df, name_rt = "rtime", name_intensity = "intensity")
df |
Dataframe |
name_rt |
Name RT |
name_intensity |
Name intensity |
A dataframe with changed intensity name
Check chromatograms
check_chromatograms(chromatograms = c("bpi_pos", "cad_pos", "pda_pos"), chromatograms_list, normalize_time = FALSE, shift_cad = 0, shift_pda = 0, type = "improved")
chromatograms |
Chromatograms |
chromatograms_list |
Chromatograms list |
normalize_time |
Normalized time |
shift_cad |
Shift CAD |
shift_pda |
Shift PDA |
type |
Type |
A plot
Check chromatograms alignment
check_chromatograms_alignment(file_negative = NULL, file_positive = NULL, time_min = 0.5, time_max = 32.5, cad_shift = 0.05, pda_shift = 0.1, fourier_components = 0.01, frequency = 1, resample = 1, chromatograms = c("bpi_pos", "cad_pos", "pda_pos"), headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"), type = "baselined", normalize_intensity = TRUE, normalize_time = FALSE, show_example = FALSE, intensity_floor = 0.001, k2 = 250, k4 = 1250000, sigma = 0.05, smoothing_width = 8, baseline_method = "peakDetection", improve_signal = TRUE)
file_negative |
Negative file path |
file_positive |
Positive file path |
time_min |
Minimum time in minutes. Default is 0.5. |
time_max |
Maximum time in minutes. Default is 32.5. |
cad_shift |
CAD time shift in minutes. Default is 0.05. |
pda_shift |
PDA time shift in minutes. Default is 0.1. |
fourier_components |
Fraction of Fourier components to keep. Default is 0.01. |
frequency |
Acquisition frequency in Hz. Default is 1. |
resample |
Resampling factor. Default is 1. |
chromatograms |
Chromatograms to plot. Default is c("bpi_pos", "cad_pos", "pda_pos"). |
headers |
Named vector mapping detector types to header names in the mzML file. |
type |
Type of chromatogram to display. Either "baselined" or "improved". Default is "baselined". |
normalize_intensity |
Normalize intensity? Default is TRUE. |
normalize_time |
Normalize time? Default is FALSE. |
show_example |
Show example data? Default is FALSE. |
intensity_floor |
Small positive value for intensity floor. Default is 0.001. |
k2 |
K2 parameter for signal sharpening. Default is 250. |
k4 |
K4 parameter for signal sharpening. Default is 1250000. |
sigma |
Sigma parameter for signal sharpening. Default is 0.05. |
smoothing_width |
Smoothing width for signal sharpening. Default is 8. |
baseline_method |
Method for baseline correction. Default is "peakDetection". |
improve_signal |
Logical. Whether to apply signal improvement (Fourier filtering and sharpening). Default is TRUE. |
A plot with (non-)aligned chromatograms
## Not run:
check_chromatograms_alignment(show_example = TRUE)
## End(Not run)
Check export dir
check_export_dir(dir)
dir |
Dir |
A log of checked dir
Check peaks integration
check_peaks_integration(file = NULL, features = NULL, detector = "cad", chromatogram = "baselined", headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"), min_area = 0.005, min_intensity = 10000, shift = 0.05, show_example = FALSE, fourier_components = 0.01, time_min = 0.5, time_max = 32.5, frequency = 1, resample = 1, intensity_floor = 0.001, k2 = 250, k4 = 1250000, sigma = 0.05, smoothing_width = 8, baseline_method = "peakDetection", sd_max = 50, max_iter = 1000, noise_threshold = 0.001, fit = "egh", intensity_threshold = 0.1, improve_signal = TRUE)
file |
File path |
features |
Features path |
detector |
Detector type (e.g., "cad", "bpi", "pda") |
chromatogram |
Chromatogram type. One of "original", "improved", or "baselined". Default is "baselined". |
headers |
Named vector mapping detector types to header names. |
min_area |
Minimum area fraction for peak filtering. Default is 0.005. |
min_intensity |
Minimum intensity for feature filtering. Default is 1E4. |
shift |
Time shift in minutes. Default is 0.05. |
show_example |
Show example data? Default is FALSE. |
fourier_components |
Fraction of Fourier components to keep. Default is 0.01. |
time_min |
Time min in minutes. Default is 0.5. |
time_max |
Time max in minutes. Default is 32.5. |
frequency |
Acquisition frequency in Hz. Default is 1. |
resample |
Resampling factor. Default is 1. |
intensity_floor |
Small positive value for intensity floor. Default is 0.001. |
k2 |
K2 parameter for signal sharpening. Default is 250. |
k4 |
K4 parameter for signal sharpening. Default is 1250000. |
sigma |
Sigma parameter for signal sharpening. Default is 0.05. |
smoothing_width |
Smoothing width for signal sharpening. Default is 8. |
baseline_method |
Method for baseline correction. Default is "peakDetection". |
sd_max |
Maximum standard deviation for peak filtering. Default is 50. |
max_iter |
Maximum iterations for peak fitting. Default is 1000. |
noise_threshold |
Noise threshold for peak detection. Default is 0.001. |
fit |
Peak fitting method. One of "egh", "gaussian", or "raw". Default is "egh". |
intensity_threshold |
Minimum normalized intensity threshold for filtering. Default is 0.1. |
improve_signal |
Logical. Whether to apply signal improvement. Default is TRUE. |
A plot with (non-)aligned chromatograms
## Not run:
check_peaks_integration(show_example = TRUE)
## End(Not run)
Compare peaks
compare_peaks(x, list_ms_peaks, peaks_prelist)
x |
X |
list_ms_peaks |
list_ms_peaks |
peaks_prelist |
peaks_prelist |
A comparison score
Deriv
deriv(x, y)
x |
X |
y |
Y |
The derivative
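For orientation, the derivative of a sampled signal is often approximated by finite differences. The sketch below shows that general technique in base R; it is not the package's implementation, and the helper names `finite_difference_sketch` and `midpoints_sketch` are hypothetical.

```r
# Hypothetical helpers illustrating a finite-difference derivative.
# The slope between consecutive samples is best attributed to the
# midpoint of each interval, so both quantities are returned.
finite_difference_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  diff(y) / diff(x)
}

midpoints_sketch <- function(x) {
  (x[-1] + x[-length(x)]) / 2
}

x <- c(0, 1, 2, 3)
y <- x^2
slopes <- finite_difference_sketch(x, y)
mids <- midpoints_sketch(x)
```

The result has one element fewer than the input, which is why a midpoint helper is convenient when the derivative has to be plotted or differentiated again.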
Extract chromatogram
extract_chromatogram(list, type, headers)
list |
List |
type |
Type |
headers |
Headers |
An extracted chromatogram
Extract MS peak
extract_ms_peak(x)
x |
X |
A peak
Extract MS progress
extract_ms_progress(xs, ms_data, rts, mzs, nrows)
xs |
XS |
ms_data |
MS Data |
rts |
RTs |
mzs |
MZs |
nrows |
N rows |
A list of extracted MS peaks
Filter FFT
filter_fft(x, components)
x |
X |
components |
Components |
The Fourier-filtered x
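As a rough illustration of what keeping only a fraction of Fourier components can mean, the sketch below implements a simple low-pass filter with base R's `stats::fft()`. It is written under assumptions: the helper name `fft_lowpass_sketch` is hypothetical, and how the fraction is mapped to retained coefficients here may differ from what `filter_fft()` actually does.

```r
# Hypothetical low-pass filter, not the package's implementation.
fft_lowpass_sketch <- function(x, components = 0.01) {
  n <- length(x)
  spectrum <- stats::fft(x)
  # Number of coefficients kept on the low-frequency side (at least the mean)
  keep <- max(1L, as.integer(floor(n * components / 2)))
  mask <- rep(0, n)
  mask[seq_len(keep)] <- 1                       # mean + lowest frequencies
  if (keep > 1L) mask[(n - keep + 2L):n] <- 1    # their complex conjugates
  # Inverse transform; R's inverse FFT is unnormalized, hence the division
  Re(stats::fft(spectrum * mask, inverse = TRUE)) / n
}

t <- 0:99
slow <- sin(2 * pi * t / 100)             # one cycle over the window
fast <- 0.2 * sin(2 * pi * 30 * t / 100)  # thirty cycles over the window
smoothed <- fft_lowpass_sketch(slow + fast, components = 0.1)
```

In this toy case the fast oscillation is removed and the slow one is returned unchanged. Lower `components` values keep fewer coefficients and therefore smooth more.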
Temp GT function
format_gt(table, title = "", subtitle = "")
table |
Table |
title |
Title |
subtitle |
Subtitle |
A formatted GT table
Generate IDs
generate_ids(taxa = c("Swertia", "Kopsia", "Ginkgo"), comparison = NULL, no_stereo = TRUE, filter_ms_conditions = TRUE, start = "0", end = "9999", limit = "1000000")
taxa |
Taxa |
comparison |
Comparison |
no_stereo |
No stereo |
filter_ms_conditions |
Filter MS conditions |
start |
Start |
end |
End |
limit |
Limit |
IDs
## Not run:
generate_ids()
## End(Not run)
Generate pseudochromatograms
generate_pseudochromatograms(annotations = NULL, features_informed = NULL, features_not_informed = NULL, file = NULL, headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"), detector = "cad", show_example = FALSE, min_confidence = 0.4, min_similarity_prefilter = 0.6, min_similarity_filter = 0.8, mode = "pos", organism = "Swertia chirayita", fourier_components = 0.01, frequency = 1, resample = 1, shift = 0.05, time_min = 0.5, time_max = 32.5, intensity_floor = 0.001, k2 = 250, k4 = 1250000, sigma = 0.05, smoothing_width = 8, baseline_method = "peakDetection", improve_signal = TRUE)
annotations |
Annotations file path |
features_informed |
Features informed file path |
features_not_informed |
Features not informed file path |
file |
mzML file path |
headers |
Named vector mapping detector types to header names. |
detector |
Detector type (e.g., "cad", "bpi", "pda") |
show_example |
Show example data? Default is FALSE. |
min_confidence |
Minimum confidence score. Default is 0.4. |
min_similarity_prefilter |
Minimum similarity for pre-filtering. Default is 0.6. |
min_similarity_filter |
Minimum similarity for final filtering. Default is 0.8. |
mode |
Ionization mode. Either "pos" or "neg". Default is "pos". |
organism |
Organism name for taxonomic filtering. |
fourier_components |
Fraction of Fourier components to keep. Default is 0.01. |
frequency |
Acquisition frequency in Hz. Default is 1. |
resample |
Resampling factor. Default is 1. |
shift |
Time shift in minutes. Default is 0.05. |
time_min |
Time min in minutes. Default is 0.5. |
time_max |
Time max in minutes. Default is 32.5. |
intensity_floor |
Small positive value for intensity floor. Default is 0.001. |
k2 |
K2 parameter for signal sharpening. Default is 250. |
k4 |
K4 parameter for signal sharpening. Default is 1250000. |
sigma |
Sigma parameter for signal sharpening. Default is 0.05. |
smoothing_width |
Smoothing width for signal sharpening. Default is 8. |
baseline_method |
Method for baseline correction. Default is "peakDetection". |
improve_signal |
Logical. Whether to apply signal improvement. Default is TRUE. |
A list of plots
## Not run:
generate_pseudochromatograms(show_example = TRUE)
## End(Not run)
Generate tables
generate_tables(annotations = NULL, file_negative = NULL, file_positive = NULL, min_confidence = 0.4, show_example = FALSE, export_csv = TRUE, export_html = TRUE, export_dir = "data/processed", export_name = "cascade_table")
annotations |
Annotations |
file_negative |
File negative |
file_positive |
File positive |
min_confidence |
Min confidence |
show_example |
Show example? Default to FALSE |
export_csv |
Export CSV |
export_html |
Export HTML |
export_dir |
Export Dir |
export_name |
Export name |
Tables
## Not run:
generate_tables()
## End(Not run)
Get peaks
get_peaks(chrom_list, lambdas, fit = c("egh", "gaussian", "raw"), sd.max = 50, max.iter = 100, time.units = c("min", "s", "ms"), estimate_purity = FALSE, noise_threshold = 0.001, collapse = FALSE, ...)
chrom_list |
Chrom list |
lambdas |
Lambdas |
fit |
Fit |
sd.max |
Sd max |
max.iter |
Max iter |
time.units |
Time units |
estimate_purity |
Estimate purity |
noise_threshold |
Noise Threshold |
collapse |
Collapse |
... |
... |
Peaks
This function was imported from the {chromatographR} package; parallelization was removed because it caused issues on Windows.
Ethan Bass
https://github.com/ethanbass/chromatographR
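The `"egh"` fit option presumably refers to the exponential-Gaussian hybrid peak model, which describes asymmetric (tailing) chromatographic peaks. The sketch below evaluates that model as it is commonly written; it only illustrates the peak shape, not the fitting routine, and the helper name `egh_sketch` and its argument names are hypothetical.

```r
# Hypothetical helper evaluating the exponential-Gaussian hybrid peak shape.
# center: retention time of the apex, height: apex intensity,
# sigma: Gaussian width, tau: exponential tailing parameter.
egh_sketch <- function(t, center, height, sigma, tau) {
  denom <- 2 * sigma^2 + tau * (t - center)
  # The model is defined as zero wherever the denominator is not positive
  ifelse(denom > 0, height * exp(-(t - center)^2 / denom), 0)
}

times <- seq(0, 10, by = 0.5)
tailing_peak <- egh_sketch(times, center = 5, height = 10, sigma = 1, tau = 0.5)
symmetric_peak <- egh_sketch(times, center = 5, height = 10, sigma = 1, tau = 0)
```

With `tau = 0` the expression reduces to a plain Gaussian, which is one way to see how the `"egh"` and `"gaussian"` options relate.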
Hierarchies grouped progress
hierarchies_grouped_progress(xs)
xs |
XS |
A list of grouped hierarchies
Hierarchies Progress
hierarchies_progress(xs, comparison)
xs |
XS |
comparison |
Comparison |
A list of hierarchies
Histograms progress
histograms_progress(xs)
xs |
XS |
A list of histograms
Improve signal
improve_signal(df, fourier_components = 0.01, frequency = 2, resample = 1, time_min = 0, time_max = Inf, intensity_floor = 0.001, k2 = 250, k4 = 1250000, sigma = 0.05, smoothing_width = 8)
df |
Dataframe with columns 'rtime' and 'intensity' |
fourier_components |
Fraction of Fourier components to keep for filtering. Default is 0.01 (1%). Lower values provide more smoothing. |
frequency |
Acquisition frequency in Hz. Default is 2. |
resample |
Resampling factor. Default is 1. |
time_min |
Time min in minutes. Default is 0. |
time_max |
Time max in minutes. Default is Inf. |
intensity_floor |
Small positive value to ensure all intensities are strictly positive after shifting. Default is 0.001. |
k2 |
K2 parameter for signal sharpening. Default is 250. |
k4 |
K4 parameter for signal sharpening. Default is 1250000. |
sigma |
Sigma parameter for signal sharpening. Default is 0.05. |
smoothing_width |
Smoothing width for signal sharpening. Default is 8. |
A dataframe with improved signal
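To make the roles of `time_min`, `time_max` and `intensity_floor` concrete, the sketch below restricts a chromatogram to a time window and shifts the intensities so that the smallest one equals the floor. It is an assumption-laden illustration: the helper name `window_and_floor_sketch` is hypothetical, and the order and details of these steps inside `improve_signal()` are not stated in this manual.

```r
# Hypothetical helper, not the package's implementation.
# Expects a data frame with columns 'rtime' (minutes) and 'intensity'.
window_and_floor_sketch <- function(df, time_min = 0, time_max = Inf,
                                    intensity_floor = 0.001) {
  kept <- df[df$rtime >= time_min & df$rtime <= time_max, , drop = FALSE]
  # Shift so that the minimum intensity becomes exactly the floor value
  kept$intensity <- kept$intensity - min(kept$intensity) + intensity_floor
  kept
}

chromatogram <- data.frame(
  rtime = c(0, 1, 2, 3, 4),
  intensity = c(-2, 1, 0, 5, 3)
)
windowed <- window_and_floor_sketch(chromatogram, time_min = 1, time_max = 3)
```

Keeping every intensity strictly positive matters for later steps that divide by, or take logarithms of, the signal.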
Improve signals progress
improve_signals_progress(xs, fourier_components = 0.01, frequency = 2, resample = 1, time_min = 0, time_max = Inf, intensity_floor = 0.001, k2 = 250, k4 = 1250000, sigma = 0.05, smoothing_width = 8)
xs |
List of dataframes with 'rtime' and 'intensity' columns |
fourier_components |
Fraction of Fourier components to keep. Default is 0.01. |
frequency |
Acquisition frequency in Hz. Default is 2. |
resample |
Resampling factor. Default is 1. |
time_min |
Time min in minutes. Default is 0. |
time_max |
Time max in minutes. Default is Inf. |
intensity_floor |
Small positive value for intensity floor. Default is 0.001. |
k2 |
K2 parameter for signal sharpening. Default is 250. |
k4 |
K4 parameter for signal sharpening. Default is 1250000. |
sigma |
Sigma parameter for signal sharpening. Default is 0.05. |
smoothing_width |
Smoothing width for signal sharpening. Default is 8. |
A list of data frames with improved signals
Join peaks
join_peaks(chromatograms, peaks, min_area)
chromatograms |
Chromatograms |
peaks |
Peaks |
min_area |
Min area |
A dataframe with joined peaks
Keep best candidates
keep_best_candidates(df)
df |
Dataframe |
A dataframe containing the best candidates only
Load annotations
load_annotations(file = NULL, show_example = FALSE, mode = "pos")
file |
File |
show_example |
Show example? Default to FALSE |
mode |
Mode |
A table of annotations
Load chromatograms
load_chromatograms(file = NULL, headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"), show_example = FALSE, example_polarity = "pos")
file |
File |
headers |
Headers |
show_example |
Show example? Default to FALSE |
example_polarity |
Example polarity |
A list of chromatograms
Load features
load_features(file = NULL, show_example = FALSE)
file |
File |
show_example |
Show example? Default to FALSE |
A table of features
Load features informed
load_features_informed(file = NULL, show_example = FALSE)
file |
File |
show_example |
Show example? Default to FALSE |
A table of informed features
Load features not informed
load_features_not_informed(file = NULL, show_example = FALSE)
file |
File |
show_example |
Show example? Default to FALSE |
A table of non-informed features
Load MS data
load_ms_data(file = NULL, show_example = FALSE)
file |
File |
show_example |
Show example? Default to FALSE |
MS data
Load name
load_name(file = NULL, default = "210619_AR_06_V_03_2_01.mzML", show_example = FALSE)
file |
File |
default |
Default |
show_example |
Show example? Default to FALSE |
A name
Make chromatographiable
make_chromatographiable(df, mass_min = 50, mass_max = 1500, logp_min = -1, logp_max = 6)
df |
Dataframe |
mass_min |
Mass min |
mass_max |
Mass max |
logp_min |
Log P min |
logp_max |
Log P max |
A dataframe containing chromatographiable compounds
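The arguments suggest a range filter on molecular mass and log P. A minimal base R sketch of such a filter follows; the helper name `filter_by_mass_and_logp_sketch`, the column names `exact_mass` and `xlogp`, and the inclusive bounds are all assumptions made for illustration, not details confirmed by this manual.

```r
# Hypothetical helper, not the package's implementation.
# Assumes columns named 'exact_mass' and 'xlogp'; bounds are inclusive.
filter_by_mass_and_logp_sketch <- function(df, mass_min = 50, mass_max = 1500,
                                           logp_min = -1, logp_max = 6) {
  in_mass <- df$exact_mass >= mass_min & df$exact_mass <= mass_max
  in_logp <- df$xlogp >= logp_min & df$xlogp <= logp_max
  # which() drops rows whose mass or log P is missing
  df[which(in_mass & in_logp), , drop = FALSE]
}

compounds <- data.frame(
  name = c("a", "b", "c", "d"),
  exact_mass = c(30, 300, 800, 2000),
  xlogp = c(1, 2, 9, 3)
)
kept <- filter_by_mass_and_logp_sketch(compounds)
```

Here only compound "b" falls inside both default ranges; "a" and "d" fail the mass bounds and "c" fails the log P bound.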
Make confident
make_confident(df, score)
df |
Dataframe |
score |
Score |
A dataframe containing annotations with scores above the confidence threshold set
Make no stereo
make_no_stereo(df)
df |
Dataframe |
A dataframe with no stereo structures
Make other
make_other(dataframe, value = "peak_area")
dataframe |
Dataframe |
value |
Value |
A dataframe with harmonized "other" subcategories
Middle pts
middle_pts(x)
x |
X |
Middle pts
Molinfo
molinfo(x)
x |
X |
A mol image
No other
no_other(dataframe)
dataframe |
Dataframe |
A dataframe with no other
Normalize chromato
normalize_chromato(x, df_xy, intensity_threshold = 0.1)
x |
X |
df_xy |
Df X Y |
intensity_threshold |
Minimum normalized intensity threshold for filtering. Default is 0.1. Set to 0 to keep all points. |
A normalized chromato
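One plausible reading of `intensity_threshold` is sketched below: intensities are rescaled to the range 0 to 1 and points below the threshold are dropped. This is a guess at the general idea rather than the package's code. The helper name `normalize_and_filter_sketch` is hypothetical, and min-max scaling is an assumption (dividing by the maximum would be another option).

```r
# Hypothetical helper, not the package's implementation.
# Expects a data frame with an 'intensity' column that is not constant
# (a flat signal would lead to a division by zero).
normalize_and_filter_sketch <- function(df, intensity_threshold = 0.1) {
  bounds <- range(df$intensity)
  df$intensity <- (df$intensity - bounds[1]) / (bounds[2] - bounds[1])
  df[df$intensity >= intensity_threshold, , drop = FALSE]
}

trace <- data.frame(
  rtime = c(1, 2, 3, 4),
  intensity = c(0, 5, 10, 0.5)
)
filtered <- normalize_and_filter_sketch(trace)
unfiltered <- normalize_and_filter_sketch(trace, intensity_threshold = 0)
```

With the default threshold, the two lowest points (normalized 0 and 0.05) are removed; with a threshold of 0 every point is kept, matching the documented behavior of that setting.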
Normalize chromatograms list
normalize_chromatograms_list(list, shift = 0, normalize_intensity = TRUE, normalize_time = FALSE)
list |
List |
shift |
Shift |
normalize_intensity |
Normalize intensity |
normalize_time |
Normalize time |
A dataframe with normalized chromatograms
P ACN I
p_acn_i(acn_eluent, q1, q2, q3)
acn_eluent |
ACN eluent |
q1 |
Q1 |
q2 |
Q2 |
q3 |
Q3 |
P ACN I
Peaks progress
peaks_progress(df_xy, sd_max = 50, max_iter = 1000, noise_threshold = 0.001, fit = "egh")
df_xy |
Df X Y |
sd_max |
Maximum standard deviation for peak filtering. Default is 50. |
max_iter |
Maximum iterations for peak fitting. Default is 1000. |
noise_threshold |
Noise threshold for peak detection. Default is 0.001. |
fit |
Peak fitting method. One of "egh", "gaussian", or "raw". Default is "egh". |
A list of peaks
Plot chromatogram
plot_chromatogram(df, text)
df |
Dataframe |
text |
Text |
A plot of a chromatogram
Plot histograms
plot_histograms(dataframe, chromatogram, label, y = "values", xlab = TRUE)
dataframe |
Dataframe |
chromatogram |
Chromatogram |
label |
Label |
y |
Y |
xlab |
Xlab |
A plot of histograms
Plot histograms confident
plot_histograms_confident(dataframe, chromatogram, level = "max", time_min, time_max)
dataframe |
Dataframe |
chromatogram |
Chromatogram |
level |
Level |
time_min |
Time min |
time_max |
Time max |
A plot of confident histograms
Plot histograms litt
plot_histograms_litt(dataframe, label, y = "values", xlab = TRUE)
dataframe |
Dataframe |
label |
Label |
y |
Y |
xlab |
Xlab |
A plot of literature histograms
Plot histograms taxo
plot_histograms_taxo(dataframe, chromatogram, level = "max", mode = "pos", time_min, time_max)
dataframe |
Dataframe |
chromatogram |
Chromatogram |
level |
Level |
mode |
Mode |
time_min |
Time min |
time_max |
Time max |
A plot of taxo histograms
Plot peak detection
plot_peak_detection(df1, df2, fun)
df1 |
DF 1 containing chromatogram |
df2 |
DF 2 containing peaks |
fun |
Fun |
A plot with (non-)detected peaks
Plot results 1
plot_results_1(list, chromatogram, mode = "pos", time_min, time_max)
list |
List |
chromatogram |
Chromatogram |
mode |
Mode |
time_min |
Time min |
time_max |
Time max |
A list of plots
Plot results 2
plot_results_2(list)
list |
List |
A list of plots
Plot TIMA
plot_tima(tables)
tables |
Tables |
Pretty plots
Predict response
predict_response(acn = 100, peak_area, p1q1 = 1e-05, p1q2 = -6e-04, p1q3 = -0.0778, p2q1 = 2e-05, p2q2 = -0.00022, p2q3 = 0.05499, p3q1 = -0.00017, p3q2 = 0.0209, p3q3 = 1.4041)
acn |
ACN |
peak_area |
Peak area |
p1q1 |
P1Q1 |
p1q2 |
P1Q2 |
p1q3 |
P1Q3 |
p2q1 |
P2Q1 |
p2q2 |
P2Q2 |
p2q3 |
P2Q3 |
p3q1 |
P3Q1 |
p3q2 |
P3Q2 |
p3q3 |
P3Q3 |
The concentration
Prehistograms progress
prehistograms_progress(xs)
xs |
XS |
A list of prehistograms
Prepare comparison
prepare_comparison(features_informed = NULL, features_not_informed = NULL, candidates_confident, min_similarity_prefilter = 0.6, min_similarity_filter = 0.8, mode = "pos", show_example = FALSE, default_peak_area = 0.001)
features_informed |
Features informed |
features_not_informed |
Features not informed |
candidates_confident |
Candidates confident |
min_similarity_prefilter |
Min similarity pre filter |
min_similarity_filter |
Min similarity filter |
mode |
Mode |
show_example |
Show example? Default to FALSE |
default_peak_area |
Default peak area for features without peak information. Default is 0.001. |
A list of peaks
Prepare features
prepare_features(df, min_intensity, name)
df |
Df |
min_intensity |
Min intensity |
name |
Name |
A dataframe of prepared features
Prepare hierarchy
prepare_hierarchy(dataframe, type = "analysis", detector = "ms", rescale = FALSE)
dataframe |
Dataframe |
type |
Type |
detector |
Detector |
rescale |
Rescale |
A dataframe with prepared hierarchy
Prepare mz
prepare_mz(x)
x |
X |
A list of prepared mz's
Prepare peaks
prepare_peaks(x)
x |
X |
Prepared peaks
Prepare plot
prepare_plot(dataframe, organism = "species")
dataframe |
Dataframe |
organism |
Organism |
A dataframe prepared for plots
Prepare plot 2
prepare_plot_2(dataframe)
dataframe |
Dataframe |
A dataframe prepared for plots
Prepare rt
prepare_rt(x, shift = 0)
x |
X |
shift |
Shift |
Prepared RTs
Prepare TIMA annotations
prepare_tima_annotations(annotations = NULL, predicted_classes = FALSE, min_score_initial = 0, min_score_biological = 0, min_score_chemical = 0, min_score_final = 0, min_matched_peaks_absolute = 0L, min_matched_peaks_percentage = 0, min_peaks = 3L, libraries = c("gnps", "massbank", "merlin", "ISDB", "ISDB - Wikidata", "TIMA MS1"), show_example = FALSE)
annotations |
annotations |
predicted_classes |
Show predicted classes? Default to FALSE |
min_score_initial |
Minimal initial score |
min_score_biological |
Minimal biological score |
min_score_chemical |
Minimal chemical score |
min_score_final |
Minimal final score |
min_matched_peaks_absolute |
Minimal number of matched peaks |
min_matched_peaks_percentage |
Minimal percentage of matched peaks |
min_peaks |
Minimal number of peaks in spectrum |
libraries |
Libraries to consider |
show_example |
Show example? Default to FALSE |
Prepared tables
Preprocess chromatograms
preprocess_chromatograms(detector = "cad", fourier_components = 0.01, frequency = 2, list, name, resample = 1, shift = 0, time_min = 0, time_max = Inf, intensity_floor = 0.001, k2 = 250, k4 = 1250000, sigma = 0.05, smoothing_width = 8, baseline_method = "peakDetection", improve_signal = TRUE)
detector |
Detector type (e.g., "cad", "bpi", "pda") |
fourier_components |
Fraction of Fourier components to keep. Default is 0.01. |
frequency |
Acquisition frequency in Hz. Default is 2. |
list |
List of chromatograms |
name |
Sample name(s) |
resample |
Resampling factor. Default is 1. |
shift |
Time shift in minutes. Default is 0. |
time_min |
Time min in minutes. Default is 0. |
time_max |
Time max in minutes. Default is Inf. |
intensity_floor |
Small positive value for intensity floor. Default is 0.001. |
k2 |
K2 parameter for signal sharpening. Default is 250. |
k4 |
K4 parameter for signal sharpening. Default is 1250000. |
sigma |
Sigma parameter for signal sharpening. Default is 0.05. |
smoothing_width |
Smoothing width for signal sharpening. Default is 8. |
baseline_method |
Method for baseline correction. Default is
"peakDetection". See |
improve_signal |
Logical. Whether to apply signal improvement (Fourier filtering and sharpening). Default is TRUE. Set to FALSE to skip signal improvement and use original chromatograms. |
A list of preprocessed chromatograms
Preprocess peaks
preprocess_peaks(detector = "cad", df_features, df_long, df_xy, name, shift = 0, min_area = 0, sd_max = 50, max_iter = 1000, noise_threshold = 0.001, fit = "egh", intensity_threshold = 0.1)
detector |
Detector |
df_features |
DF features |
df_long |
DF long |
df_xy |
DF X Y |
name |
Name |
shift |
shift |
min_area |
Minimum area |
sd_max |
Maximum standard deviation for peak filtering. Default is 50. |
max_iter |
Maximum iterations for peak fitting. Default is 1000. |
noise_threshold |
Noise threshold for peak detection. Default is 0.001. |
fit |
Peak fitting method. One of "egh", "gaussian", or "raw". Default is "egh". |
intensity_threshold |
Minimum normalized intensity threshold for filtering in normalize_chromato. Default is 0.1. |
A list of lists and dataframe with preprocessed peaks
Process compare peaks
process_compare_peaks(file = NULL, features = NULL, type = "baselined", detector = "cad", headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"), export_dir = "data/interim/peaks", show_example = FALSE, fourier_components = 0.01, frequency = 1, min_area = 0.005, min_intensity = 10000, resample = 1, shift = 0.05, time_min = 0.5, time_max = 32.5)
file |
File path |
features |
Features path |
type |
Type. "original", "baselined" or "improved" |
detector |
Detector |
headers |
Headers |
export_dir |
Export directory |
show_example |
Show example? Default to FALSE |
fourier_components |
Fourier components |
frequency |
Frequency |
min_area |
Min area |
min_intensity |
Min intensity |
resample |
Resample |
shift |
Shift |
time_min |
Time min |
time_max |
Time max |
A plot with (non-)aligned chromatograms
Generates SPARQL queries for multiple taxa by fetching a query template from a remote repository and parameterizing it with taxon QIDs and filters.
queries_progress( xs, start = "0", end = "9999", limit = "1000000", query_url = NULL, query_part_1 = NULL, query_part_2 = NULL, query_part_3 = NULL, query_part_4 = NULL )
xs |
Named list of taxon QIDs |
start |
Start year for publication date filter (character) |
end |
End year for publication date filter (character) |
limit |
Maximum number of results per query (character) |
query_url |
URL to the remote SPARQL query template. If NULL, uses default. |
query_part_1 |
Deprecated. Kept for backward compatibility. |
query_part_2 |
Deprecated. Kept for backward compatibility. |
query_part_3 |
Deprecated. Kept for backward compatibility. |
query_part_4 |
Deprecated. Kept for backward compatibility. |
A named list of parameterized SPARQL queries
## Not run:
qids <- list(Swertia = "Q1234", Kopsia = "Q5678")
queries <- queries_progress(xs = qids, start = "2000", end = "2024")
## End(Not run)
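To illustrate what parameterizing a query template can look like, here is a minimal sketch in base R. The template text, the `{{...}}` placeholder syntax, and the helpers `fill_template` and `build_queries` are all hypothetical; the real template is fetched from a remote repository and may use a different substitution scheme.

```r
# Hypothetical template with {{placeholders}}; not the package's real query.
template <- paste(
  "SELECT * WHERE {",
  "  ?item ?property wd:{{qid}} .",
  "  FILTER(?year >= {{start}} && ?year <= {{end}})",
  "}",
  "LIMIT {{limit}}",
  sep = "\n"
)

# Replace every {{key}} in the template with the matching value.
fill_template <- function(template, values) {
  for (key in names(values)) {
    template <- gsub(paste0("{{", key, "}}"), values[[key]], template, fixed = TRUE)
  }
  template
}

# One query per taxon; lapply keeps the names of the input list.
build_queries <- function(xs, start = "0", end = "9999", limit = "1000000") {
  lapply(xs, function(qid) {
    fill_template(template, list(qid = qid, start = start, end = end, limit = limit))
  })
}

qids <- list(Swertia = "Q1234", Kopsia = "Q5678")
queries <- build_queries(qids, start = "2000", end = "2024")
```

The result is a named list with one query string per taxon, mirroring the return value described above.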
Performs a SPARQL query and returns a data.table. Optimised for very large result sets (millions of rows):
Requests CSV (no IRI angle-bracket decoration, smallest text format)
Requests gzip transfer encoding (3-5x less data over the wire)
Streams the response directly to disk via curl (zero R memory use during download)
Parses with data.table::fread (C-level, multi-threaded)
Falls back to JSON for endpoints that do not support CSV.
query_wikidata( sparql_query, remove_url = TRUE, endpoint = "https://query.wikidata.org/sparql", agent = "https://github.com/bearloga/WikidataQueryServiceR", timeout = 3600L, fallback = TRUE, headers = NULL, post = FALSE )
sparql_query |
Character. SPARQL query string. |
remove_url |
Logical. Strip |
endpoint |
Character. SPARQL endpoint URL. |
agent |
Character. User-Agent header string. |
timeout |
Integer. Total request timeout in seconds (default 3600). |
fallback |
Logical. Retry with the QLever Wikidata endpoint on failure. |
headers |
Character or NULL. Optional |
post |
Logical. If TRUE, send the SPARQL query as an HTTP POST request body (application/x-www-form-urlencoded) rather than as a GET query parameter. Required for endpoints such as QLever that do not accept GET requests. |
A data.table.
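The difference between the GET and POST modes, together with the CSV and gzip request headers, can be sketched without any network access. The helper `build_request` below is hypothetical and only assembles the pieces of a request; it does not reproduce how the package streams the response to disk or parses it.

```r
# Hypothetical helper: assemble a SPARQL request description (no network call).
build_request <- function(sparql_query,
                          endpoint = "https://query.wikidata.org/sparql",
                          post = FALSE) {
  encoded <- utils::URLencode(sparql_query, reserved = TRUE)
  # Ask for CSV and a gzip-compressed transfer.
  headers <- c(Accept = "text/csv", `Accept-Encoding` = "gzip")
  if (post) {
    # POST: the query travels in a form-encoded request body.
    list(
      method = "POST",
      url = endpoint,
      headers = c(headers, `Content-Type` = "application/x-www-form-urlencoded"),
      body = paste0("query=", encoded)
    )
  } else {
    # GET: the query travels as a URL parameter.
    list(
      method = "GET",
      url = paste0(endpoint, "?query=", encoded),
      headers = headers,
      body = NULL
    )
  }
}

q <- "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"
get_req <- build_request(q)
post_req <- build_request(q, post = TRUE)
```

Long queries are one practical reason to prefer POST, since the query no longer has to fit in the URL.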
Save histograms progress
save_histograms_progress(xs)
xs |
XS |
Saved histograms
Save treemaps progress
save_treemaps_progress(xs, type = "treemap")
xs |
XS |
type |
Type |
Saved treemaps
Second derivative
second_der(x, y)
x |
X |
y |
Y |
The second derivative
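A numerical second derivative of sampled data can be approximated with finite differences. The sketch below is one common scheme, written under the assumption that `x` and `y` are numeric vectors of equal length; the helper name `second_derivative` is hypothetical and the package's `second_der` may use a different method.

```r
# Finite-difference second derivative; works on non-uniform grids as well.
second_derivative <- function(x, y) {
  n <- length(x)
  stopifnot(n == length(y), n >= 3)
  first <- diff(y) / diff(x)        # slopes at the interval midpoints
  mid <- (x[-1] + x[-n]) / 2        # positions of those midpoints
  second <- diff(first) / diff(mid) # second derivative at interior points
  c(NA_real_, second, NA_real_)     # endpoints are undefined in this scheme
}

x <- seq(-1, 1, by = 0.1)
d2 <- second_derivative(x, x^2)     # interior values are close to 2
```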
Signal sharpening
signal_sharpening( time, intensity, k2 = 250, k4 = 1250000, sigma = 0.05, Smoothing_width = 8, Baseline_adjust = 0 )
time |
Time |
intensity |
Intensity |
k2 |
K2 parameter controlling the weight of the second derivative in signal sharpening. Default is 250. Lower values increase the sharpening effect from the second derivative. |
k4 |
K4 parameter controlling the weight of the fourth derivative in signal sharpening. Default is 1250000. Lower values increase the sharpening effect from the fourth derivative. |
sigma |
Sigma parameter for derivative weighting. Default is 0.05. Higher values increase the overall sharpening effect. |
Smoothing_width |
Smoothing width for the running mean filter. Default is 8. Higher values provide more smoothing but reduce resolution. |
Baseline_adjust |
Baseline adjustment value. Default is 0. |
A sharpened signal
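Derivative-based sharpening narrows peaks by subtracting a scaled second derivative and adding a scaled fourth derivative. The sketch below assumes that `k2` and `k4` act as divisors, which matches the note that lower values sharpen more; it leaves out `sigma`, `Smoothing_width`, and `Baseline_adjust`, and the helpers `d2` and `sharpen` are hypothetical, so it should not be read as the package's implementation.

```r
# Central-difference second derivative on a uniform grid, endpoints set to 0.
d2 <- function(y, dx) {
  n <- length(y)
  c(0, (y[3:n] - 2 * y[2:(n - 1)] + y[1:(n - 2)]) / dx^2, 0)
}

# Assumed form: y - y'' / k2 + y'''' / k4.
sharpen <- function(time, intensity, k2 = 250, k4 = 1250000) {
  dx <- mean(diff(time))
  y2 <- d2(intensity, dx)   # second derivative
  y4 <- d2(y2, dx)          # fourth derivative
  intensity - y2 / k2 + y4 / k4
}

time <- seq(0, 2, by = 0.005)
peak <- exp(-(time - 1)^2 / (2 * 0.05^2))  # synthetic Gaussian peak
sharp <- sharpen(time, peak)
```

On this synthetic peak the sharpened signal is taller and narrower at half height, at the cost of small negative lobes on either side.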
Tables progress
tables_progress(xs, structures_classified)
xs |
XS |
structures_classified |
Structures classified |
A list of tables
Taxon name to QID
taxon_name_to_qid(taxon_name)
taxon_name |
Taxon name |
A QID
## Not run:
taxon_name_to_qid(taxon_name = "Gentiana lutea")
## End(Not run)
Transform MS
transform_ms(x, min_intensity = 0.1)
x |
X |
min_intensity |
Minimum normalized intensity threshold for filtering. Default is 0.1. Set to 0 to keep all points. |
A list with transformed MS
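The effect of `min_intensity` can be illustrated on a small data frame. This sketch assumes min-max normalization to the range 0 to 1 and columns named `rtime` and `intensity`; the helper `normalize_and_filter` is hypothetical and the actual transformation applied by `transform_ms` may differ.

```r
# Scale intensities to [0, 1], then keep points at or above the threshold.
normalize_and_filter <- function(df, min_intensity = 0.1) {
  rng <- range(df$intensity)
  df$intensity <- (df$intensity - rng[1]) / (rng[2] - rng[1])
  df[df$intensity >= min_intensity, , drop = FALSE]
}

chromato <- data.frame(
  rtime = c(1, 2, 3, 4, 5),
  intensity = c(0, 5, 20, 100, 8)
)

filtered <- normalize_and_filter(chromato, min_intensity = 0.1)
unfiltered <- normalize_and_filter(chromato, min_intensity = 0)
```

Setting the threshold to 0 keeps every point, as stated for `min_intensity` above.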
Treemaps progress
treemaps_progress(xs, type = "treemap", hierarchies)
xs |
XS |
type |
Type |
hierarchies |
Hierarchies |
A list of treemaps
Treemaps progress no title
treemaps_progress_no_title(xs, type = "treemap", hierarchies)
xs |
XS |
type |
Type |
hierarchies |
Hierarchies |
A list of treemaps with no title
Wiki progress
wiki_progress(xs)
xs |
XS |
A list of results of Wikidata queries
Y as NA
y_as_na(x, y)
x |
X |
y |
Y |
X with occurrences of Y replaced by NA
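A minimal sketch of this kind of replacement in base R follows. It assumes `x` is an atomic vector and `y` a single value to blank out; the helper name `replace_with_na` is hypothetical and is not the package's code.

```r
# Replace every element of x equal to y with NA; existing NAs are left alone.
replace_with_na <- function(x, y) {
  x[!is.na(x) & x == y] <- NA
  x
}

cleaned <- replace_with_na(c("a", "", "b", ""), "")
```

A typical use is turning empty strings or sentinel values into proper missing values before further processing.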