Package 'cascade'

Title: Contextualizing untargeted Annotation with Semi-quantitative Charged Aerosol Detection for pertinent characterization of natural Extracts
Description: This package provides the infrastructure to perform Automated Composition Assessment of Natural Extracts.
Authors: Adriano Rutz [aut, cre] (ORCID: <https://orcid.org/0000-0003-0443-9902>)
Maintainer: Adriano Rutz <[email protected]>
License: AGPL (>= 3)
Version: 0.0.0.9002
Built: 2026-05-28 14:50:30 UTC
Source: https://github.com/adafede/cascade

Help Index


Add chromato line

Description

Add chromato line

Usage

add_chromato_line(
  plot,
  chromato,
  shift = 0,
  normalize_time,
  name,
  color,
  polarity = "pos"
)

Arguments

plot

Plot

chromato

Chromato

shift

Shift

normalize_time

Normalize time

name

Name

color

Color

polarity

Polarity

Value

A plot with added chromato line

Examples

NULL

Baseline chromatogram

Description

Baseline chromatogram

Usage

baseline_chromatogram(df, method = "peakDetection", ...)

Arguments

df

Dataframe

method

Baseline correction method. Default is "peakDetection". See the baseline package (baseline::baseline) for available methods, including: "als", "fillPeaks", "irls", "lowpass", "medianWindow", "modpolyfit", "peakDetection", "rfbaseline", "rollingBall", "shirley", "TAP".

...

Additional arguments passed to baseline.

Value

A dataframe with baselined chromatogram

Examples

NULL

Change intensity name

Description

Change intensity name

Usage

change_intensity_name(df, name_rt = "rtime", name_intensity = "intensity")

Arguments

df

Dataframe

name_rt

Name RT

name_intensity

Name intensity

Value

A dataframe with changed intensity name

Examples

NULL

Check chromatograms

Description

Check chromatograms

Usage

check_chromatograms(
  chromatograms = c("bpi_pos", "cad_pos", "pda_pos"),
  chromatograms_list,
  normalize_time = FALSE,
  shift_cad = 0,
  shift_pda = 0,
  type = "improved"
)

Arguments

chromatograms

Chromatograms

chromatograms_list

Chromatograms list

normalize_time

Normalized time

shift_cad

Shift CAD

shift_pda

Shift PDA

type

Type

Value

A plot

Examples

NULL

Check chromatograms alignment

Description

Check chromatograms alignment

Usage

check_chromatograms_alignment(
  file_negative = NULL,
  file_positive = NULL,
  time_min = 0.5,
  time_max = 32.5,
  cad_shift = 0.05,
  pda_shift = 0.1,
  fourier_components = 0.01,
  frequency = 1,
  resample = 1,
  chromatograms = c("bpi_pos", "cad_pos", "pda_pos"),
  headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"),
  type = "baselined",
  normalize_intensity = TRUE,
  normalize_time = FALSE,
  show_example = FALSE,
  intensity_floor = 0.001,
  k2 = 250,
  k4 = 1250000,
  sigma = 0.05,
  smoothing_width = 8,
  baseline_method = "peakDetection",
  improve_signal = TRUE
)

Arguments

file_negative

Negative file path

file_positive

Positive file path

time_min

Minimum time in minutes. Default is 0.5.

time_max

Maximum time in minutes. Default is 32.5.

cad_shift

CAD time shift in minutes. Default is 0.05.

pda_shift

PDA time shift in minutes. Default is 0.1.

fourier_components

Fraction of Fourier components to keep. Default is 0.01.

frequency

Acquisition frequency in Hz. Default is 1.

resample

Resampling factor. Default is 1.

chromatograms

Chromatograms to plot. Default is c("bpi_pos", "cad_pos", "pda_pos").

headers

Named vector mapping detector types to header names in the mzML file.

type

Type of chromatogram to display. Either "baselined" or "improved". Default is "baselined".

normalize_intensity

Normalize intensity? Default is TRUE.

normalize_time

Normalize time? Default is FALSE.

show_example

Show example data? Default is FALSE.

intensity_floor

Small positive value for intensity floor. Default is 0.001.

k2

K2 parameter for signal sharpening. Default is 250.

k4

K4 parameter for signal sharpening. Default is 1250000.

sigma

Sigma parameter for signal sharpening. Default is 0.05.

smoothing_width

Smoothing width for signal sharpening. Default is 8.

baseline_method

Method for baseline correction. Default is "peakDetection".

improve_signal

Logical. Whether to apply signal improvement (Fourier filtering and sharpening). Default is TRUE.

Value

A plot with (non-)aligned chromatograms

Examples

## Not run: 
check_chromatograms_alignment(show_example = TRUE)

## End(Not run)

Check export dir

Description

Check export dir

Usage

check_export_dir(dir)

Arguments

dir

Dir

Value

A log of checked dir

Examples

NULL

Check peaks integration

Description

Check peaks integration

Usage

check_peaks_integration(
  file = NULL,
  features = NULL,
  detector = "cad",
  chromatogram = "baselined",
  headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"),
  min_area = 0.005,
  min_intensity = 10000,
  shift = 0.05,
  show_example = FALSE,
  fourier_components = 0.01,
  time_min = 0.5,
  time_max = 32.5,
  frequency = 1,
  resample = 1,
  intensity_floor = 0.001,
  k2 = 250,
  k4 = 1250000,
  sigma = 0.05,
  smoothing_width = 8,
  baseline_method = "peakDetection",
  sd_max = 50,
  max_iter = 1000,
  noise_threshold = 0.001,
  fit = "egh",
  intensity_threshold = 0.1,
  improve_signal = TRUE
)

Arguments

file

File path

features

Features path

detector

Detector type (e.g., "cad", "bpi", "pda")

chromatogram

Chromatogram type. One of "original", "improved", or "baselined". Default is "baselined".

headers

Named vector mapping detector types to header names.

min_area

Minimum area fraction for peak filtering. Default is 0.005.

min_intensity

Minimum intensity for feature filtering. Default is 1E4.

shift

Time shift in minutes. Default is 0.05.

show_example

Show example data? Default is FALSE.

fourier_components

Fraction of Fourier components to keep. Default is 0.01.

time_min

Time min in minutes. Default is 0.5.

time_max

Time max in minutes. Default is 32.5.

frequency

Acquisition frequency in Hz. Default is 1.

resample

Resampling factor. Default is 1.

intensity_floor

Small positive value for intensity floor. Default is 0.001.

k2

K2 parameter for signal sharpening. Default is 250.

k4

K4 parameter for signal sharpening. Default is 1250000.

sigma

Sigma parameter for signal sharpening. Default is 0.05.

smoothing_width

Smoothing width for signal sharpening. Default is 8.

baseline_method

Method for baseline correction. Default is "peakDetection".

sd_max

Maximum standard deviation for peak filtering. Default is 50.

max_iter

Maximum iterations for peak fitting. Default is 1000.

noise_threshold

Noise threshold for peak detection. Default is 0.001.

fit

Peak fitting method. One of "egh", "gaussian", or "raw". Default is "egh".

intensity_threshold

Minimum normalized intensity threshold for filtering. Default is 0.1.

improve_signal

Logical. Whether to apply signal improvement. Default is TRUE.

Value

A plot with (non-)aligned chromatograms

Examples

## Not run: 
check_peaks_integration(show_example = TRUE)

## End(Not run)

Compare peaks

Description

Compare peaks

Usage

compare_peaks(x, list_ms_peaks, peaks_prelist)

Arguments

x

X

list_ms_peaks

list_ms_peaks

peaks_prelist

peaks_prelist

Value

A comparison score

Examples

NULL

Deriv

Description

Deriv

Usage

deriv(x, y)

Arguments

x

X

y

Y

Value

The derivative

Examples

NULL
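
The help page does not say how the derivative is computed. A common numerical approach is a finite difference between consecutive points, sketched below in base R. The helper name finite_diff_sketch is hypothetical and this is not necessarily what deriv() does internally.

```r
# Hypothetical stand-in, not cascade's code: slope between consecutive points.
finite_diff_sketch <- function(x, y) {
  stopifnot(length(x) == length(y), length(x) >= 2)
  diff(y) / diff(x)
}

x <- seq(0, 1, by = 0.1)
y <- x^2
# One slope per interval, so the result has length(x) - 1 values
slopes <- finite_diff_sketch(x, y)
```

Each slope belongs to the midpoint of its interval rather than to either endpoint, which matters when derivatives are chained.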

Extract chromatogram

Description

Extract chromatogram

Usage

extract_chromatogram(list, type, headers)

Arguments

list

List

type

Type

headers

Headers

Value

An extracted chromatogram

Examples

NULL

Extract MS peak

Description

Extract MS peak

Usage

extract_ms_peak(x)

Arguments

x

X

Value

A peak

Examples

NULL

Extract MS progress

Description

Extract MS progress

Usage

extract_ms_progress(xs, ms_data, rts, mzs, nrows)

Arguments

xs

XS

ms_data

MS Data

rts

RTs

mzs

MZs

nrows

N rows

Value

A list of extracted MS peaks

Examples

NULL

Filter FFT

Description

Filter FFT

Usage

filter_fft(x, components)

Arguments

x

X

components

Components

Value

The Fourier-filtered x

Examples

NULL
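
To illustrate what keeping a fraction of Fourier components can mean, here is a minimal base R sketch of a low-pass filter. It is an assumption-laden illustration, not the package's implementation: the name lowpass_fft_sketch and the rule for selecting which components survive are hypothetical.

```r
# Hypothetical helper, not part of cascade: keep only the lowest-frequency
# fraction of Fourier components, zero the rest, and transform back.
lowpass_fft_sketch <- function(x, components = 0.01) {
  n <- length(x)
  spec <- stats::fft(x)
  # Number of low-frequency bins kept on each side of the spectrum
  keep <- max(1L, ceiling(n * components / 2))
  mask <- rep(FALSE, n)
  mask[seq_len(keep)] <- TRUE                       # DC and low positive frequencies
  if (keep > 1L) mask[(n - keep + 2L):n] <- TRUE    # matching negative frequencies
  spec[!mask] <- 0
  Re(stats::fft(spec, inverse = TRUE)) / n
}

set.seed(42)
t <- (0:999) / 1000
clean <- sin(2 * pi * 3 * t)
noisy <- clean + rnorm(length(t), sd = 0.3)
smooth <- lowpass_fft_sketch(noisy, components = 0.02)
```

Smaller values of components discard more high-frequency content and therefore smooth more strongly.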

Temp GT function

Description

Temp GT function

Usage

format_gt(table, title = "", subtitle = "")

Arguments

table

Table

title

Title

subtitle

Subtitle

Value

A formatted GT table

Examples

NULL

Generate IDs

Description

Generate IDs

Usage

generate_ids(
  taxa = c("Swertia", "Kopsia", "Ginkgo"),
  comparison = NULL,
  no_stereo = TRUE,
  filter_ms_conditions = TRUE,
  start = "0",
  end = "9999",
  limit = "1000000"
)

Arguments

taxa

Taxa

comparison

Comparison

no_stereo

No stereo

filter_ms_conditions

Filter MS conditions

start

Start

end

End

limit

Limit

Value

IDs

Examples

## Not run: 
generate_ids()

## End(Not run)

Generate pseudochromatograms

Description

Generate pseudochromatograms

Usage

generate_pseudochromatograms(
  annotations = NULL,
  features_informed = NULL,
  features_not_informed = NULL,
  file = NULL,
  headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"),
  detector = "cad",
  show_example = FALSE,
  min_confidence = 0.4,
  min_similarity_prefilter = 0.6,
  min_similarity_filter = 0.8,
  mode = "pos",
  organism = "Swertia chirayita",
  fourier_components = 0.01,
  frequency = 1,
  resample = 1,
  shift = 0.05,
  time_min = 0.5,
  time_max = 32.5,
  intensity_floor = 0.001,
  k2 = 250,
  k4 = 1250000,
  sigma = 0.05,
  smoothing_width = 8,
  baseline_method = "peakDetection",
  improve_signal = TRUE
)

Arguments

annotations

Annotations file path

features_informed

Features informed file path

features_not_informed

Features not informed file path

file

mzML file path

headers

Named vector mapping detector types to header names.

detector

Detector type (e.g., "cad", "bpi", "pda")

show_example

Show example data? Default is FALSE.

min_confidence

Minimum confidence score. Default is 0.4.

min_similarity_prefilter

Minimum similarity for pre-filtering. Default is 0.6.

min_similarity_filter

Minimum similarity for final filtering. Default is 0.8.

mode

Ionization mode. Either "pos" or "neg". Default is "pos".

organism

Organism name for taxonomic filtering.

fourier_components

Fraction of Fourier components to keep. Default is 0.01.

frequency

Acquisition frequency in Hz. Default is 1.

resample

Resampling factor. Default is 1.

shift

Time shift in minutes. Default is 0.05.

time_min

Time min in minutes. Default is 0.5.

time_max

Time max in minutes. Default is 32.5.

intensity_floor

Small positive value for intensity floor. Default is 0.001.

k2

K2 parameter for signal sharpening. Default is 250.

k4

K4 parameter for signal sharpening. Default is 1250000.

sigma

Sigma parameter for signal sharpening. Default is 0.05.

smoothing_width

Smoothing width for signal sharpening. Default is 8.

baseline_method

Method for baseline correction. Default is "peakDetection".

improve_signal

Logical. Whether to apply signal improvement. Default is TRUE.

Value

A list of plots

Examples

## Not run: 
generate_pseudochromatograms(show_example = TRUE)

## End(Not run)

Generate tables

Description

Generate tables

Usage

generate_tables(
  annotations = NULL,
  file_negative = NULL,
  file_positive = NULL,
  min_confidence = 0.4,
  show_example = FALSE,
  export_csv = TRUE,
  export_html = TRUE,
  export_dir = "data/processed",
  export_name = "cascade_table"
)

Arguments

annotations

Annotations

file_negative

File negative

file_positive

File positive

min_confidence

Min confidence

show_example

Show example data? Default is FALSE.

export_csv

Export CSV

export_html

Export HTML

export_dir

Export Dir

export_name

Export name

Value

Tables

Examples

## Not run: 
generate_tables()

## End(Not run)

Get peaks

Description

Get peaks

Usage

get_peaks(
  chrom_list,
  lambdas,
  fit = c("egh", "gaussian", "raw"),
  sd.max = 50,
  max.iter = 100,
  time.units = c("min", "s", "ms"),
  estimate_purity = FALSE,
  noise_threshold = 0.001,
  collapse = FALSE,
  ...
)

Arguments

chrom_list

Chrom list

lambdas

Lambdas

fit

Fit

sd.max

Sd max

max.iter

Max iter

time.units

Time units

estimate_purity

Estimate purity

noise_threshold

Noise Threshold

collapse

Collapse

...

...

Value

Peaks

Note

Imported from the {chromatographR} package. Parallelization was removed because it caused issues on Windows.

Author(s)

Ethan Bass

Source

https://github.com/ethanbass/chromatographR

Examples

NULL
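
The "egh" option of fit refers to the exponential-Gaussian hybrid peak shape, which reduces to a Gaussian when the tailing term is zero. The sketch below only illustrates that shape. The function names and the parameterization (center, sigma, tau, height) are assumptions and may differ from what get_peaks() or {chromatographR} use internally.

```r
# Hypothetical illustration of the exponential-Gaussian hybrid (EGH) shape.
egh_sketch <- function(t, center, sigma, tau, height) {
  denom <- 2 * sigma^2 + tau * (t - center)
  out <- numeric(length(t))
  ok <- denom > 0                      # the function is zero elsewhere
  out[ok] <- height * exp(-(t[ok] - center)^2 / denom[ok])
  out
}

# Plain Gaussian for comparison
gaussian_sketch <- function(t, center, sigma, height) {
  height * exp(-(t - center)^2 / (2 * sigma^2))
}

t <- seq(0, 10, by = 0.01)
symmetric <- egh_sketch(t, center = 5, sigma = 0.3, tau = 0, height = 1)
tailing <- egh_sketch(t, center = 5, sigma = 0.3, tau = 0.2, height = 1)
```

A positive tau stretches the trailing edge of the peak, which is why a hybrid shape is often preferred for tailing chromatographic peaks.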

Hierarchies grouped progress

Description

Hierarchies grouped progress

Usage

hierarchies_grouped_progress(xs)

Arguments

xs

XS

Value

A list of grouped hierarchies

Examples

NULL

Hierarchies Progress

Description

Hierarchies Progress

Usage

hierarchies_progress(xs, comparison)

Arguments

xs

XS

comparison

Comparison

Value

A list of hierarchies

Examples

NULL

Histograms progress

Description

Histograms progress

Usage

histograms_progress(xs)

Arguments

xs

XS

Value

A list of histograms

Examples

NULL

Improve signal

Description

Improve signal

Usage

improve_signal(
  df,
  fourier_components = 0.01,
  frequency = 2,
  resample = 1,
  time_min = 0,
  time_max = Inf,
  intensity_floor = 0.001,
  k2 = 250,
  k4 = 1250000,
  sigma = 0.05,
  smoothing_width = 8
)

Arguments

df

Dataframe with columns 'rtime' and 'intensity'

fourier_components

Fraction of Fourier components to keep for filtering. Default is 0.01 (1%). Lower values provide more smoothing.

frequency

Acquisition frequency in Hz. Default is 2.

resample

Resampling factor. Default is 1.

time_min

Time min in minutes. Default is 0.

time_max

Time max in minutes. Default is Inf.

intensity_floor

Small positive value to ensure all intensities are strictly positive after shifting. Default is 0.001.

k2

K2 parameter for signal sharpening. Default is 250.

k4

K4 parameter for signal sharpening. Default is 1250000.

sigma

Sigma parameter for signal sharpening. Default is 0.05.

smoothing_width

Smoothing width for signal sharpening. Default is 8.

Value

A dataframe with improved signal

Examples

NULL
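
The time window and intensity_floor arguments can be pictured with the base R sketch below. It assumes that "shifting" means subtracting the minimum intensity before adding the floor; that reading, and the helper name floor_intensity_sketch, are assumptions rather than the documented behavior of improve_signal().

```r
# Hypothetical helper, not part of cascade.
floor_intensity_sketch <- function(df,
                                   time_min = 0,
                                   time_max = Inf,
                                   intensity_floor = 0.001) {
  # Keep only points inside the requested time window
  df <- df[df$rtime >= time_min & df$rtime <= time_max, , drop = FALSE]
  # Shift so that the smallest intensity equals the floor (strictly positive)
  df$intensity <- df$intensity - min(df$intensity) + intensity_floor
  df
}

chromato <- data.frame(
  rtime = c(0.2, 0.6, 1.0, 1.4),
  intensity = c(-5, 0, 20, 3)
)
floored <- floor_intensity_sketch(chromato, time_min = 0.5)
```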

Improve signals progress

Description

Improve signals progress

Usage

improve_signals_progress(
  xs,
  fourier_components = 0.01,
  frequency = 2,
  resample = 1,
  time_min = 0,
  time_max = Inf,
  intensity_floor = 0.001,
  k2 = 250,
  k4 = 1250000,
  sigma = 0.05,
  smoothing_width = 8
)

Arguments

xs

List of dataframes with 'rtime' and 'intensity' columns

fourier_components

Fraction of Fourier components to keep. Default is 0.01.

frequency

Acquisition frequency in Hz. Default is 2.

resample

Resampling factor. Default is 1.

time_min

Time min in minutes. Default is 0.

time_max

Time max in minutes. Default is Inf.

intensity_floor

Small positive value for intensity floor. Default is 0.001.

k2

K2 parameter for signal sharpening. Default is 250.

k4

K4 parameter for signal sharpening. Default is 1250000.

sigma

Sigma parameter for signal sharpening. Default is 0.05.

smoothing_width

Smoothing width for signal sharpening. Default is 8.

Value

A list of data frames with improved signals

Examples

NULL

Join peaks

Description

Join peaks

Usage

join_peaks(chromatograms, peaks, min_area)

Arguments

chromatograms

Chromatograms

peaks

Peaks

min_area

Min area

Value

A dataframe with joined peaks

Examples

NULL

Keep best candidates

Description

Keep best candidates

Usage

keep_best_candidates(df)

Arguments

df

Dataframe

Value

A dataframe containing the best candidates only

Examples

NULL

Load annotations

Description

Load annotations

Usage

load_annotations(file = NULL, show_example = FALSE, mode = "pos")

Arguments

file

File

show_example

Show example data? Default is FALSE.

mode

Mode

Value

A table of annotations

Examples

NULL

Load chromatograms

Description

Load chromatograms

Usage

load_chromatograms(
  file = NULL,
  headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"),
  show_example = FALSE,
  example_polarity = "pos"
)

Arguments

file

File

headers

Headers

show_example

Show example data? Default is FALSE.

example_polarity

Example polarity

Value

A list of chromatograms

Examples

NULL

Load features

Description

Load features

Usage

load_features(file = NULL, show_example = FALSE)

Arguments

file

File

show_example

Show example data? Default is FALSE.

Value

A table of features

Examples

NULL

Load features informed

Description

Load features informed

Usage

load_features_informed(file = NULL, show_example = FALSE)

Arguments

file

File

show_example

Show example data? Default is FALSE.

Value

A table of informed features

Examples

NULL

Load features not informed

Description

Load features not informed

Usage

load_features_not_informed(file = NULL, show_example = FALSE)

Arguments

file

File

show_example

Show example data? Default is FALSE.

Value

A table of non-informed features

Examples

NULL

Load MS data

Description

Load MS data

Usage

load_ms_data(file = NULL, show_example = FALSE)

Arguments

file

File

show_example

Show example data? Default is FALSE.

Value

MS data

Examples

NULL

Load name

Description

Load name

Usage

load_name(
  file = NULL,
  default = "210619_AR_06_V_03_2_01.mzML",
  show_example = FALSE
)

Arguments

file

File

default

Default

show_example

Show example data? Default is FALSE.

Value

A name

Examples

NULL

Make chromatographiable

Description

Make chromatographiable

Usage

make_chromatographiable(
  df,
  mass_min = 50,
  mass_max = 1500,
  logp_min = -1,
  logp_max = 6
)

Arguments

df

Dataframe

mass_min

Mass min

mass_max

Mass max

logp_min

Log P min

logp_max

Log P max

Value

A dataframe containing chromatographiable compounds

Examples

NULL
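
The argument names suggest a range filter on molecular mass and logP. The base R sketch below shows such a filter under stated assumptions: the column names exact_mass and xlogp, the inclusive bounds, and the helper name filter_lc_range_sketch are all hypothetical and not taken from the package.

```r
# Hypothetical helper, not part of cascade.
filter_lc_range_sketch <- function(df,
                                   mass_min = 50,
                                   mass_max = 1500,
                                   logp_min = -1,
                                   logp_max = 6) {
  keep <- df$exact_mass >= mass_min & df$exact_mass <= mass_max &
    df$xlogp >= logp_min & df$xlogp <= logp_max
  # Rows with missing mass or logP are dropped rather than kept
  df[keep & !is.na(keep), , drop = FALSE]
}

compounds <- data.frame(
  name = c("a", "b", "c", "d"),
  exact_mass = c(30, 300, 2000, 400),
  xlogp = c(1, 2, 3, 8),
  stringsAsFactors = FALSE
)
kept <- filter_lc_range_sketch(compounds)
```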

Make confident

Description

Make confident

Usage

make_confident(df, score)

Arguments

df

Dataframe

score

Score

Value

A dataframe containing only the annotations whose score is above the given confidence threshold

Examples

NULL

Make no stereo

Description

Make no stereo

Usage

make_no_stereo(df)

Arguments

df

Dataframe

Value

A dataframe with no stereo structures

Examples

NULL

Make other

Description

Make other

Usage

make_other(dataframe, value = "peak_area")

Arguments

dataframe

Dataframe

value

Value

Value

A dataframe with harmonized "other" subcategories

Examples

NULL

Middle pts

Description

Middle pts

Usage

middle_pts(x)

Arguments

x

X

Value

Middle pts

Examples

NULL

Molinfo

Description

Molinfo

Usage

molinfo(x)

Arguments

x

X

Value

A mol image

Examples

NULL

No other

Description

No other

Usage

no_other(dataframe)

Arguments

dataframe

Dataframe

Value

A dataframe with no other

Examples

NULL

Normalize chromato

Description

Normalize chromato

Usage

normalize_chromato(x, df_xy, intensity_threshold = 0.1)

Arguments

x

X

df_xy

Df X Y

intensity_threshold

Minimum normalized intensity threshold for filtering. Default is 0.1. Set to 0 to keep all points.

Value

A normalized chromato

Examples

NULL
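
The effect of intensity_threshold can be sketched in base R as follows. The sketch assumes that intensities are normalized by dividing by the maximum and that points at or above the threshold are kept; both choices, and the helper name normalize_and_filter_sketch, are assumptions rather than the package's confirmed behavior.

```r
# Hypothetical helper, not part of cascade.
normalize_and_filter_sketch <- function(df, intensity_threshold = 0.1) {
  # Scale intensities so that the largest value becomes 1
  df$intensity <- df$intensity / max(df$intensity)
  # Drop points whose normalized intensity falls below the threshold
  df[df$intensity >= intensity_threshold, , drop = FALSE]
}

trace <- data.frame(
  rtime = c(1, 2, 3, 4),
  intensity = c(1, 50, 100, 5)
)
filtered <- normalize_and_filter_sketch(trace)
unfiltered <- normalize_and_filter_sketch(trace, intensity_threshold = 0)
```

With the default threshold only the two strongest points survive, while a threshold of 0 keeps every point, matching the documented advice for keeping all points.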

Normalize chromatograms list

Description

Normalize chromatograms list

Usage

normalize_chromatograms_list(
  list,
  shift = 0,
  normalize_intensity = TRUE,
  normalize_time = FALSE
)

Arguments

list

List

shift

Shift

normalize_intensity

Normalize intensity

normalize_time

Normalize time

Value

A dataframe with normalized chromatograms

Examples

NULL

P ACN I

Description

P ACN I

Usage

p_acn_i(acn_eluent, q1, q2, q3)

Arguments

acn_eluent

ACN eluent

q1

Q1

q2

Q2

q3

Q3

Value

P ACN I

Examples

NULL

Peaks progress

Description

Peaks progress

Usage

peaks_progress(
  df_xy,
  sd_max = 50,
  max_iter = 1000,
  noise_threshold = 0.001,
  fit = "egh"
)

Arguments

df_xy

Df X Y

sd_max

Maximum standard deviation for peak filtering. Default is 50.

max_iter

Maximum iterations for peak fitting. Default is 1000.

noise_threshold

Noise threshold for peak detection. Default is 0.001.

fit

Peak fitting method. One of "egh", "gaussian", or "raw". Default is "egh".

Value

A list of peaks

Examples

NULL

Plot chromatogram

Description

Plot chromatogram

Usage

plot_chromatogram(df, text)

Arguments

df

Dataframe

text

Text

Value

A plot of a chromatogram

Examples

NULL

Plot histograms

Description

Plot histograms

Usage

plot_histograms(dataframe, chromatogram, label, y = "values", xlab = TRUE)

Arguments

dataframe

Dataframe

chromatogram

Chromatogram

label

Label

y

Y

xlab

Xlab

Value

A plot of histograms

Examples

NULL

Plot histograms confident

Description

Plot histograms confident

Usage

plot_histograms_confident(
  dataframe,
  chromatogram,
  level = "max",
  time_min,
  time_max
)

Arguments

dataframe

Dataframe

chromatogram

Chromatogram

level

Level

time_min

Time min

time_max

Time max

Value

A plot of confident histograms

Examples

NULL

Plot histograms litt

Description

Plot histograms litt

Usage

plot_histograms_litt(dataframe, label, y = "values", xlab = TRUE)

Arguments

dataframe

Dataframe

label

Label

y

Y

xlab

Xlab

Value

A plot of literature histograms

Examples

NULL

Plot histograms taxo

Description

Plot histograms taxo

Usage

plot_histograms_taxo(
  dataframe,
  chromatogram,
  level = "max",
  mode = "pos",
  time_min,
  time_max
)

Arguments

dataframe

Dataframe

chromatogram

Chromatogram

level

Level

mode

Mode

time_min

Time min

time_max

Time max

Value

A plot of taxo histograms

Examples

NULL

Plot peak detection

Description

Plot peak detection

Usage

plot_peak_detection(df1, df2, fun)

Arguments

df1

DF 1 containing chromatogram

df2

DF 2 containing peaks

fun

Fun

Value

A plot with (non-)detected peaks

Examples

NULL

Plot results 1

Description

Plot results 1

Usage

plot_results_1(list, chromatogram, mode = "pos", time_min, time_max)

Arguments

list

List

chromatogram

Chromatogram

mode

Mode

time_min

Time min

time_max

Time max

Value

A list of plots

Examples

NULL

Plot results 2

Description

Plot results 2

Usage

plot_results_2(list)

Arguments

list

List

Value

A list of plots

Examples

NULL

Plot TIMA

Description

Plot TIMA

Usage

plot_tima(tables)

Arguments

tables

Tables

Value

Pretty plots

Examples

NULL

Predict response

Description

Predict response

Usage

predict_response(
  acn = 100,
  peak_area,
  p1q1 = 1e-05,
  p1q2 = -6e-04,
  p1q3 = -0.0778,
  p2q1 = 2e-05,
  p2q2 = -0.00022,
  p2q3 = 0.05499,
  p3q1 = -0.00017,
  p3q2 = 0.0209,
  p3q3 = 1.4041
)

Arguments

acn

ACN

peak_area

Peak area

p1q1

P1Q1

p1q2

P1Q2

p1q3

P1Q3

p2q1

P2Q1

p2q2

P2Q2

p2q3

P2Q3

p3q1

P3Q1

p3q2

P3Q2

p3q3

P3Q3

Value

The concentration

Examples

NULL

Prehistograms progress

Description

Prehistograms progress

Usage

prehistograms_progress(xs)

Arguments

xs

XS

Value

A list of prehistograms

Examples

NULL

Prepare comparison

Description

Prepare comparison

Usage

prepare_comparison(
  features_informed = NULL,
  features_not_informed = NULL,
  candidates_confident,
  min_similarity_prefilter = 0.6,
  min_similarity_filter = 0.8,
  mode = "pos",
  show_example = FALSE,
  default_peak_area = 0.001
)

Arguments

features_informed

Features informed

features_not_informed

Features not informed

candidates_confident

Candidates confident

min_similarity_prefilter

Min similarity pre filter

min_similarity_filter

Min similarity filter

mode

Mode

show_example

Show example data? Default is FALSE.

default_peak_area

Default peak area for features without peak information. Default is 0.001.

Value

A list of peaks

Examples

NULL

Prepare features

Description

Prepare features

Usage

prepare_features(df, min_intensity, name)

Arguments

df

Df

min_intensity

Min intensity

name

Name

Value

A dataframe of prepared features

Examples

NULL

Prepare hierarchy

Description

Prepare hierarchy

Usage

prepare_hierarchy(
  dataframe,
  type = "analysis",
  detector = "ms",
  rescale = FALSE
)

Arguments

dataframe

Dataframe

type

Type

detector

Detector

rescale

Rescale

Value

A dataframe with prepared hierarchy

Examples

NULL

Prepare mz

Description

Prepare mz

Usage

prepare_mz(x)

Arguments

x

X

Value

A list of prepared m/z values

Examples

NULL

Prepare peaks

Description

Prepare peaks

Usage

prepare_peaks(x)

Arguments

x

X

Value

Prepared peaks

Examples

NULL

Prepare plot

Description

Prepare plot

Usage

prepare_plot(dataframe, organism = "species")

Arguments

dataframe

Dataframe

organism

Organism

Value

A dataframe prepared for plots

Examples

NULL

Prepare plot 2

Description

Prepare plot 2

Usage

prepare_plot_2(dataframe)

Arguments

dataframe

Dataframe

Value

A dataframe prepared for plots

Examples

NULL

Prepare rt

Description

Prepare rt

Usage

prepare_rt(x, shift = 0)

Arguments

x

X

shift

Shift

Value

Prepared RTs

Examples

NULL

Prepare TIMA annotations

Description

Prepare TIMA annotations

Usage

prepare_tima_annotations(
  annotations = NULL,
  predicted_classes = FALSE,
  min_score_initial = 0,
  min_score_biological = 0,
  min_score_chemical = 0,
  min_score_final = 0,
  min_matched_peaks_absolute = 0L,
  min_matched_peaks_percentage = 0,
  min_peaks = 3L,
  libraries = c("gnps", "massbank", "merlin", "ISDB", "ISDB - Wikidata", "TIMA MS1"),
  show_example = FALSE
)

Arguments

annotations

annotations

predicted_classes

Show predicted classes? Default is FALSE.

min_score_initial

Minimal initial score

min_score_biological

Minimal biological score

min_score_chemical

Minimal chemical score

min_score_final

Minimal final score

min_matched_peaks_absolute

Minimal number of matched peaks

min_matched_peaks_percentage

Minimal percentage of matched peaks

min_peaks

Minimal number of peaks in spectrum

libraries

Libraries to consider

show_example

Show example data? Default is FALSE.

Value

Prepared tables

Examples

NULL

Preprocess chromatograms

Description

Preprocess chromatograms

Usage

preprocess_chromatograms(
  detector = "cad",
  fourier_components = 0.01,
  frequency = 2,
  list,
  name,
  resample = 1,
  shift = 0,
  time_min = 0,
  time_max = Inf,
  intensity_floor = 0.001,
  k2 = 250,
  k4 = 1250000,
  sigma = 0.05,
  smoothing_width = 8,
  baseline_method = "peakDetection",
  improve_signal = TRUE
)

Arguments

detector

Detector type (e.g., "cad", "bpi", "pda")

fourier_components

Fraction of Fourier components to keep. Default is 0.01.

frequency

Acquisition frequency in Hz. Default is 2.

list

List of chromatograms

name

Sample name(s)

resample

Resampling factor. Default is 1.

shift

Time shift in minutes. Default is 0.

time_min

Time min in minutes. Default is 0.

time_max

Time max in minutes. Default is Inf.

intensity_floor

Small positive value for intensity floor. Default is 0.001.

k2

K2 parameter for signal sharpening. Default is 250.

k4

K4 parameter for signal sharpening. Default is 1250000.

sigma

Sigma parameter for signal sharpening. Default is 0.05.

smoothing_width

Smoothing width for signal sharpening. Default is 8.

baseline_method

Method for baseline correction. Default is "peakDetection". See baseline for available methods.

improve_signal

Logical. Whether to apply signal improvement (Fourier filtering and sharpening). Default is TRUE. Set to FALSE to skip signal improvement and use original chromatograms.

Value

A list of preprocessed chromatograms

Examples

NULL

Preprocess peaks

Description

Preprocess peaks

Usage

preprocess_peaks(
  detector = "cad",
  df_features,
  df_long,
  df_xy,
  name,
  shift = 0,
  min_area = 0,
  sd_max = 50,
  max_iter = 1000,
  noise_threshold = 0.001,
  fit = "egh",
  intensity_threshold = 0.1
)

Arguments

detector

Detector

df_features

DF features

df_long

DF long

df_xy

DF X Y

name

Name

shift

shift

min_area

Minimum area

sd_max

Maximum standard deviation for peak filtering. Default is 50.

max_iter

Maximum iterations for peak fitting. Default is 1000.

noise_threshold

Noise threshold for peak detection. Default is 0.001.

fit

Peak fitting method. One of "egh", "gaussian", or "raw". Default is "egh".

intensity_threshold

Minimum normalized intensity threshold for filtering in normalize_chromato. Default is 0.1.

Value

A list containing lists and a dataframe of preprocessed peaks

Examples

NULL

Process compare peaks

Description

Process compare peaks

Usage

process_compare_peaks(
  file = NULL,
  features = NULL,
  type = "baselined",
  detector = "cad",
  headers = c(bpi = "BasePeak_0", pda = "PDA#1_TotalAbsorbance_0", cad = "UV#1_CAD_1_0"),
  export_dir = "data/interim/peaks",
  show_example = FALSE,
  fourier_components = 0.01,
  frequency = 1,
  min_area = 0.005,
  min_intensity = 10000,
  resample = 1,
  shift = 0.05,
  time_min = 0.5,
  time_max = 32.5
)

Arguments

file

File path

features

Features path

type

Type. "original", "baselined" or "improved"

detector

Detector

headers

Headers

export_dir

Export directory

show_example

Show example data? Default is FALSE.

fourier_components

Fourier components

frequency

Frequency

min_area

Min area

min_intensity

Min intensity

resample

Resample

shift

Shift

time_min

Time min

time_max

Time max

Value

A plot with (non-)aligned chromatograms

Examples

NULL

Queries progress

Description

Generates SPARQL queries for multiple taxa by fetching a query template from a remote repository and parameterizing it with taxon QIDs and filters.

Usage

queries_progress(
  xs,
  start = "0",
  end = "9999",
  limit = "1000000",
  query_url = NULL,
  query_part_1 = NULL,
  query_part_2 = NULL,
  query_part_3 = NULL,
  query_part_4 = NULL
)

Arguments

xs

Named list of taxon QIDs

start

Start year for publication date filter (character)

end

End year for publication date filter (character)

limit

Maximum number of results per query (character)

query_url

URL to the remote SPARQL query template. If NULL, uses default.

query_part_1

Deprecated. Kept for backward compatibility.

query_part_2

Deprecated. Kept for backward compatibility.

query_part_3

Deprecated. Kept for backward compatibility.

query_part_4

Deprecated. Kept for backward compatibility.

Value

A named list of parameterized SPARQL queries

Examples

## Not run: 
qids <- list(Swertia = "Q1234", Kopsia = "Q5678")
queries <- queries_progress(xs = qids, start = "2000", end = "2024")

## End(Not run)
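
The parameterization step described above can be pictured as filling placeholders in a template, once per taxon. The sketch below uses a local template string instead of a remote one. The placeholder syntax, the template text, and the helper name fill_template_sketch are hypothetical, and the QIDs are the placeholder values from the example above.

```r
# Hypothetical helper, not part of cascade: one filled query per taxon.
fill_template_sketch <- function(xs, template, limit = "1000000") {
  lapply(xs, function(qid) {
    query <- gsub("{{taxon}}", qid, template, fixed = TRUE)
    gsub("{{limit}}", limit, query, fixed = TRUE)
  })
}

# Illustrative template only, not the one the package fetches
template <- "SELECT ?s WHERE { ?s ?p wd:{{taxon}} . } LIMIT {{limit}}"
qids <- list(Swertia = "Q1234", Kopsia = "Q5678")
queries <- fill_template_sketch(qids, template, limit = "10")
```

Because lapply() preserves names, the result is a named list with one query per taxon, which is the shape the Value section describes.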

Query a SPARQL endpoint efficiently

Description

Performs a SPARQL query and returns a data.table. Optimised for very large result sets (millions of rows):

  • Requests CSV (no IRI angle-bracket decoration, smallest text format)

  • Requests gzip transfer encoding (3-5x less data over the wire)

  • Streams the response directly to disk via curl (zero R memory use during download)

  • Parses with data.table::fread (C-level, multi-threaded)

Falls back to JSON for endpoints that do not support CSV.

Usage

query_wikidata(
  sparql_query,
  remove_url = TRUE,
  endpoint = "https://query.wikidata.org/sparql",
  agent = "https://github.com/bearloga/WikidataQueryServiceR",
  timeout = 3600L,
  fallback = TRUE,
  headers = NULL,
  post = FALSE
)

Arguments

sparql_query

Character. SPARQL query string.

remove_url

Logical. Strip the http://www.wikidata.org/entity/ prefix from character columns (default TRUE).

endpoint

Character. SPARQL endpoint URL.

agent

Character. User-Agent header string.

timeout

Integer. Total request timeout in seconds (default 3600).

fallback

Logical. Retry with the QLever Wikidata endpoint on failure.

headers

Character or NULL. Optional Accept header used for backward compatibility with older versions. If set, this value is used as preferred response format.

post

Logical. If TRUE, send the SPARQL query as an HTTP POST request body (application/x-www-form-urlencoded) rather than as a GET query parameter. Required for endpoints such as QLever that do not accept GET requests.

Value

A data.table.

Examples

NULL
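
The effect of remove_url can be illustrated without any network access. The base R sketch below strips the entity prefix from character columns of a small, made-up result table. The helper name strip_entity_prefix_sketch and the sample values are hypothetical, and query_wikidata() may implement this step differently.

```r
# Hypothetical helper, not part of cascade.
strip_entity_prefix_sketch <- function(df,
                                       prefix = "http://www.wikidata.org/entity/") {
  is_chr <- vapply(df, is.character, logical(1))
  if (any(is_chr)) {
    df[is_chr] <- lapply(df[is_chr], function(col) {
      # Remove the prefix only where the value actually starts with it
      ifelse(startsWith(col, prefix), substring(col, nchar(prefix) + 1), col)
    })
  }
  df
}

# Made-up result table with placeholder QIDs
res <- data.frame(
  structure = c(
    "http://www.wikidata.org/entity/Q1234",
    "http://www.wikidata.org/entity/Q5678"
  ),
  n = c(3L, 7L),
  stringsAsFactors = FALSE
)
clean <- strip_entity_prefix_sketch(res)
```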

Save histograms progress

Description

Save histograms progress

Usage

save_histograms_progress(xs)

Arguments

xs

XS

Value

Saved histograms

Examples

NULL

Save treemaps progress

Description

Save treemaps progress

Usage

save_treemaps_progress(xs, type = "treemap")

Arguments

xs

XS

type

Type

Value

Saved treemaps

Examples

NULL

Second der

Description

Second der

Usage

second_der(x, y)

Arguments

x

X

y

Y

Value

The second derivative

Examples

NULL
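
A second derivative can be obtained numerically by differencing twice and tracking the interval midpoints in between. The base R sketch below shows that idea. The helper names are hypothetical and the approach is an assumption about, not a copy of, how second_der() works.

```r
# Hypothetical helpers, not part of cascade.
first_diff <- function(x, y) diff(y) / diff(x)
mid_points <- function(x) x[-1] - diff(x) / 2

second_diff_sketch <- function(x, y) {
  # The first differences live at the interval midpoints, so the second
  # pass differentiates with respect to those midpoints
  first_diff(mid_points(x), first_diff(x, y))
}

x <- seq(-1, 1, by = 0.05)
curvature <- second_diff_sketch(x, x^2)
```

For a parabola the result is constant (2 here), and the output is two points shorter than the input.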

Signal sharpening

Description

Signal sharpening

Usage

signal_sharpening(
  time,
  intensity,
  k2 = 250,
  k4 = 1250000,
  sigma = 0.05,
  Smoothing_width = 8,
  Baseline_adjust = 0
)

Arguments

time

time

intensity

intensity

k2

K2 parameter controlling the weight of the second derivative in signal sharpening. Default is 250. Lower values increase the sharpening effect from the second derivative.

k4

K4 parameter controlling the weight of the fourth derivative in signal sharpening. Default is 1250000. Lower values increase the sharpening effect from the fourth derivative.

sigma

Sigma parameter for derivative weighting. Default is 0.05. Higher values increase the overall sharpening effect.

Smoothing_width

Smoothing width for the running mean filter. Default is 8. Higher values provide more smoothing but reduce resolution.

Baseline_adjust

Baseline adjustment value. Default is 0.

Value

A sharpened signal

Examples

NULL
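
The roles of k2 and k4 can be illustrated with derivative-based peak sharpening, in which a scaled second derivative is subtracted from the signal and a scaled fourth derivative is added. The sketch below assumes the form intensity - d2 / k2 + d4 / k4, which is consistent with lower values sharpening more, but the exact formula used by signal_sharpening() is not given here. The sketch ignores sigma, smoothing and baseline adjustment, and all helper names are hypothetical.

```r
# Hypothetical helpers, not part of cascade.
central_diff <- function(x, y) {
  n <- length(y)
  d <- numeric(n)
  d[1] <- (y[2] - y[1]) / (x[2] - x[1])               # forward difference
  d[n] <- (y[n] - y[n - 1]) / (x[n] - x[n - 1])       # backward difference
  i <- 2:(n - 1)
  d[i] <- (y[i + 1] - y[i - 1]) / (x[i + 1] - x[i - 1])
  d
}

sharpen_sketch <- function(time, intensity, k2 = 250, k4 = 1250000) {
  d2 <- central_diff(time, central_diff(time, intensity))
  d4 <- central_diff(time, central_diff(time, d2))
  # Assumed form: dividing by k2 and k4 makes lower values sharpen more
  intensity - d2 / k2 + d4 / k4
}

time <- seq(-1, 1, by = 0.01)
peak <- exp(-time^2 / (2 * 0.1^2))   # Gaussian peak, standard deviation 0.1
sharp <- sharpen_sketch(time, peak)
```

On this synthetic peak the sharpened trace is taller and narrower at half height while the apex stays at the same time point.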

Tables progress

Description

Tables progress

Usage

tables_progress(xs, structures_classified)

Arguments

xs

XS

structures_classified

structures classified

Value

A list of tables

Examples

NULL

Taxon name to QID

Description

Taxon name to QID

Usage

taxon_name_to_qid(taxon_name)

Arguments

taxon_name

Taxon name

Value

A QID

Examples

## Not run: 
taxon_name_to_qid(taxon_name = "Gentiana lutea")

## End(Not run)

Transform MS

Description

Transform MS

Usage

transform_ms(x, min_intensity = 0.1)

Arguments

x

X

min_intensity

Minimum normalized intensity threshold for filtering. Default is 0.1. Set to 0 to keep all points.

Value

A list with transformed MS

Examples

NULL

Treemaps progress

Description

Treemaps progress

Usage

treemaps_progress(xs, type = "treemap", hierarchies)

Arguments

xs

XS

type

Type

hierarchies

Hierarchies

Value

A list of treemaps

Examples

NULL

Treemaps progress no title

Description

Treemaps progress no title

Usage

treemaps_progress_no_title(xs, type = "treemap", hierarchies)

Arguments

xs

XS

type

Type

hierarchies

Hierarchies

Value

A list of treemaps with no title

Examples

NULL

Wiki progress

Description

Wiki progress

Usage

wiki_progress(xs)

Arguments

xs

XS

Value

A list of results of Wikidata queries

Examples

NULL

Y as NA

Description

Y as NA

Usage

y_as_na(x, y)

Arguments

x

x

y

y

Value

X with values equal to Y replaced by NA

Examples

NULL