2.1. khipu ecompounds constructor
Constructs empirical compounds de novo, based on the khipu package. The ion patterns here are constructed differently from formula-based calculations, because formulae are not known at this stage. Search functions are based on mass2chem.
empCpds (or epds) can be initiated by signatures of isotopic relationships or adduct relationships. For low-intensity peaks, the isotopic counterparts may not be detectable. For assigned peaks/features, selectivity/score will also be calculated later.
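To make the idea of an isotopic "signature" concrete, here is a minimal sketch of a pairwise search: two peaks are linked when their m/z difference matches a target mass delta within a ppm tolerance and their retention times agree. This is not the khipu or mass2chem implementation. The function name `find_signature_pairs`, the choice to compute the ppm window on the heavier peak, and all peak values other than the first m/z are assumptions made for illustration.

```python
from itertools import combinations

C13_C12 = 1.003355  # 13C/12C mass difference

def find_signature_pairs(peaks, delta, mz_tolerance_ppm=5, rt_tolerance=2):
    """Hypothetical helper: return (light_id, heavy_id) pairs separated by `delta` in m/z."""
    pairs = []
    ordered = sorted(peaks, key=lambda p: p["mz"])
    for light, heavy in combinations(ordered, 2):
        # Assumption: the ppm window is computed on the heavier peak's m/z.
        mz_window = heavy["mz"] * mz_tolerance_ppm * 1e-6
        mz_match = abs((heavy["mz"] - light["mz"]) - delta) <= mz_window
        coeluting = abs(heavy["rtime"] - light["rtime"]) <= rt_tolerance
        if mz_match and coeluting:
            pairs.append((light["id"], heavy["id"]))
    return pairs

# Made-up peaks: 556 is a plausible 13C partner of 555; 557 elutes too late.
peaks = [
    {"id": 555, "mz": 133.09702, "rtime": 654, "height": 14388.0},
    {"id": 556, "mz": 134.10038, "rtime": 654.5, "height": 900.0},
    {"id": 557, "mz": 134.10038, "rtime": 700, "height": 800.0},
]
pairs = find_signature_pairs(peaks, C13_C12)
print(pairs)
```

The same function could be called with an adduct mass difference instead of `C13_C12`, which is why either kind of relationship can seed an empirical compound.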
- class khipu.epdsConstructor.epdsConstructor(peak_list, mode='pos')[source]
Wrapper class to organize a list of peaks/features into a list of empirical compounds.
- To-dos:
Add support for user input formats where rtime is imprecise or unavailable. Add options for coelution_function (see mass2chem.epdsConstructor). Future consideration: explicitly model resolved and unresolved mass values (e.g. 34S vs 37Cl).
- __init__(peak_list, mode='pos')[source]
- Parameters:
peak_list (list of dict) – List of peaks/features, e.g. [{'parent_masstrace_id': 1670, 'mz': 133.09702315984987, 'rtime': 654, 'height': 14388.0, 'id': 555}, …]
mz_tolerance_ppm – ppm tolerance in examining m/z patterns.
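As a rough illustration of the expected input, the sketch below builds a one-item peak_list in the documented shape and checks it before use. The helper `missing_fields` and the assumption that all five keys from the example above are expected are illustrative only and not part of khipu. The constructor call is left as a comment so that the block runs without the package installed.

```python
# Assumption: the keys shown in the documented example are the expected ones.
EXPECTED_KEYS = ("parent_masstrace_id", "mz", "rtime", "height", "id")

def missing_fields(peak_list, expected=EXPECTED_KEYS):
    """Hypothetical helper: map each peak's position to the keys it lacks."""
    problems = {}
    for index, peak in enumerate(peak_list):
        missing = [key for key in expected if key not in peak]
        if missing:
            problems[index] = missing
    return problems

peak_list = [
    {"parent_masstrace_id": 1670, "mz": 133.09702315984987,
     "rtime": 654, "height": 14388.0, "id": 555},
]
problems = missing_fields(peak_list)
print(problems or "peak_list looks complete")

# With khipu installed, construction would follow the signature above:
# from khipu.epdsConstructor import epdsConstructor
# ECCON = epdsConstructor(peak_list, mode='pos')
```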
- peaks_to_epdDict(isotope_search_patterns, adduct_search_patterns, extended_adducts, mz_tolerance_ppm, rt_tolerance=2, charges=[1, 2, 3], has_parent_masstrack=True)[source]
- Parameters:
isotope_search_patterns (list) – Exact list used to retrieve the subnetworks, e.g. [(1.003355, '13C/12C', (0, 0.8)), (2.00671, '13C/12C*2', (0, 0.8)), (3.010065, '13C/12C*3', (0, 0.8)), (4.01342, '13C/12C*4', (0, 0.8)), (5.016775, '13C/12C*5', (0, 0.8)), (6.02013, '13C/12C*6', (0, 0.8))]
adduct_search_patterns (list) – Exact list used to retrieve the subnetworks. A long list is not recommended here, as it is better to search for additional in-source modifications after empCpds are seeded. Example: [(1.0078, 'H'), (21.9820, 'Na/H'), (41.026549, 'Acetonitrile')]. adduct_search_patterns depends on ionization, but the option is left open for other functions.
mz_tolerance_ppm – ppm tolerance in examining m/z patterns.
rt_tolerance – Tolerance threshold for deviation in retention time, in arbitrary units depending on input data. The default is intended as 2 seconds.
- Returns:
epdDict – A dictionary of empCpds (empirical compounds) indexed by IDs ('interim_id'), not including singletons.
- Return type:
dict
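The isotope pattern list documented for this method is a regular series of multiples of the 13C/12C mass difference, so it can be generated instead of typed out. The sketch below does that and pairs it with the adduct list from the parameter description. The helper name `make_isotope_patterns` is hypothetical, the meaning of the third tuple element `(0, 0.8)` is not stated in this section and is copied as-is, and the `mz_tolerance_ppm=5` in the commented call is an assumed value. The `extended_adducts` argument is not described here, so the call is shown only as a comment with that argument left as a placeholder.

```python
C13_C12 = 1.003355  # 13C/12C mass difference from the documented example

def make_isotope_patterns(max_13c=6, third_field=(0, 0.8)):
    """Hypothetical helper: build (mass_delta, label, third_field) tuples."""
    patterns = []
    for n in range(1, max_13c + 1):
        label = "13C/12C" if n == 1 else f"13C/12C*{n}"
        # Round to 6 decimals to match the precision used in the documented list.
        patterns.append((round(C13_C12 * n, 6), label, third_field))
    return patterns

isotope_search_patterns = make_isotope_patterns()
adduct_search_patterns = [
    (1.0078, "H"),
    (21.9820, "Na/H"),
    (41.026549, "Acetonitrile"),
]

for pattern in isotope_search_patterns:
    print(pattern)

# With khipu installed and an epdsConstructor instance `ECCON`, the call
# would follow the signature above (extended_adducts is a placeholder):
# epdDict = ECCON.peaks_to_epdDict(
#     isotope_search_patterns, adduct_search_patterns, extended_adducts,
#     mz_tolerance_ppm=5, rt_tolerance=2)
```

Keeping the adduct list short, as recommended above, limits the number of mass differences tested while the empirical compounds are being seeded.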