Data
Display options in Results can be customized to match the output needs for each study. The data table will automatically be updated as changes are made to reflect the output when the data is exported.
Two views are available: Data and Plots.
Context settings
After selecting data, user-selected settings made in Results can be saved as an individual "context setting", e.g. "Data normalization" settings. These settings can be used at any later time.
- Make user-selected settings, e.g. for "Data normalization" and "Metabolism indicators".
- To save these settings, click .
- Define a name for this "context setting", in this example "Quant 500 XL export". Click Save.
- If there are unsaved changes compared to the loaded context settings, the save icon turns red.
- To save as new context settings, define a new name.
- All user-selected settings are saved, e.g. as "Quant 500 XL export".
Available options
- Select a context setting, e.g. "Quant 500 XL export".
- Reset context settings to default values.
- Delete a selected context setting.
Application example
Value type
The second dropdown menu in the subheader indicates the type of value displayed in the data table. By default, this is set to Concentration and displays the values in μM (the concentration unit can be specified in the Display Options ▾ section below). In most cases, concentration is the most appropriate value to display. However, additional values can be chosen from the dropdown menu, as described below.
Display value | Description |
---|---|
Concentration | Metabolite concentration in the specified unit (default μM) |
Analyte intensity [cps] | Analyte signal intensity in counts per second (cps) |
Internal std. intensity [cps] | Internal standard (ISTD) signal intensity in counts per second (cps) |
Intensity ratio | Analyte signal intensity to ISTD signal intensity ratio |
Accuracy [%] | Accuracy of measurement given as percentage: Measured concentration / expected concentration × 100 |
CV [%] | CV given as percentage of quality control samples, run in replicates of three or more per plate run |
Analyte peak area [area] | Integrated peak area of analyte |
Internal std. peak area [area] | Integrated peak area of ISTD |
Area ratio | Analyte peak area to ISTD peak area ratio |
Analyte retention time [min] | Retention time of analyte based on chromatogram apex |
Internal standard retention time [min] | Retention time of ISTD based on chromatogram apex |
Relative retention time [min] | Ratio between analyte and ISTD retention time |
Analyte peak width [min] | Peak width (full peak width) of analyte, given in minutes |
Internal std. peak width [min] | Peak width (full peak width) of ISTD given in minutes |
Intensity-to-peak area ratio | Ratio between metabolite peak intensity (peak height) and peak area |
Internal std. intensity-to-internal std. peak area ratio | Ratio between ISTD peak intensity (peak height) and peak area |
Intensity-to-peak width ratio | Ratio between metabolite peak intensity (peak height) and peak width (full peak width) |
Internal std. intensity-to-internal std. peak width ratio | Ratio between ISTD peak intensity (peak height) and peak width (full peak width) |
Area-to-peak width ratio | Ratio between metabolite peak area and peak width (full peak width) |
Internal std. area-to-internal std. peak width ratio | Ratio between ISTD peak area and peak width (full peak width) |
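Several of the value types above are simple derived quantities. The following is a minimal sketch of three of them, not the WebIDQ implementation. The function names are hypothetical, and the use of the sample standard deviation (n - 1) for the CV is an assumption, since the table does not state which estimator is used.

```python
from statistics import mean, stdev


def accuracy_percent(measured, expected):
    """Accuracy [%] = measured concentration / expected concentration x 100."""
    return measured / expected * 100


def cv_percent(replicates):
    """CV [%] of QC replicates (assumption: sample standard deviation, n - 1)."""
    if len(replicates) < 3:
        raise ValueError("CV requires three or more replicates per plate run")
    return stdev(replicates) / mean(replicates) * 100


def ratio(analyte, istd):
    """Analyte-to-ISTD ratio, usable for intensities or peak areas."""
    return analyte / istd


print(accuracy_percent(7.5, 10.0))              # 75.0
print(round(cv_percent([9.0, 10.0, 11.0]), 1))  # 10.0
print(ratio(5000.0, 2500.0))                    # 2.0
```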
Display options
In addition to the type of value displayed, display options can be further tailored using the Display Options ▾ dropdown menu. Each option is described below.
Display value | Description |
---|---|
Show unknowns only | Display only the study samples (i.e. no Calibrators, QCs, etc.) |
Show class | Only display the metabolite class specified in the dropdown menu |
Show bio ID | Display metabolite IDs and direct links to publicly available databases like Human Metabolome Database (HMDB) or LIPID MAPS (LMID) (note: not all databases will contain every metabolite and some metabolites may have multiple entries) |
Show analytical details | Displays upper and lower limits of quantitation (ULOQ and LLOQ, respectively). Further details can also be selected with sliders. |
Sort columns by | Defines how columns are sorted in the table, according to the selection in the dropdown menu |
Concentration unit | Specify the concentration units in the dropdown menu; pmol/mg Tissue and pmol/10E6 Cells are only available if the correct matrix and appropriate factors were previously defined in LIMS |
Log transform data | Log transform the results; The dropdown to the right can further specify Log2 or Log10 |
Split merged rows | Results from different kit parts, e.g. LC and FIA, are displayed in one row. To show each part in a separate row, activate this option |
Data preprocessing
Data preprocessing is a toolset that performs data cleaning and missing value imputation on results before exporting. This process helps to create a more robust dataset for downstream statistical analysis.
The processing is optional. Imputation is exclusively applied to unknown samples (study samples).
Preprocessing options
Group configuration
Data cleaning is performed based on the dataset's defined group information.
- Group information is specific for each loaded dataset and is defined for "unknowns" during sample registration, see section groups and variables.
- If metadata is not selected, all loaded samples are treated as one group.
Example | Combined (concatenated) metadata
- Metadata from different categories can be combined.
- Samples with the linked metadata categories "Gender" and "Treatment" are loaded, each containing two groups. Gender: "m" and "f". Treatment: "treated" and "control".
- If only one of the two categories is loaded, e.g. "Gender", two groups are used for data cleaning: "m" and "f".
- If both categories "Gender" and "Treatment" are loaded, a combination of categories and groups is used for data cleaning: "m-treated", "f-treated", "m-control", and "f-control".
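The combination of metadata categories into group labels can be sketched as follows. This is an illustration only: the function name, the data layout, and the label "all" for the case without selected metadata are assumptions, not part of WebIDQ.

```python
def group_label(sample_metadata, loaded_categories):
    """Join a sample's values for the loaded categories into one group label."""
    if not loaded_categories:
        # No metadata selected: all samples fall into one group.
        # The label "all" is a hypothetical placeholder.
        return "all"
    return "-".join(sample_metadata[c] for c in loaded_categories)


samples = [
    {"Gender": "m", "Treatment": "treated"},
    {"Gender": "f", "Treatment": "treated"},
    {"Gender": "m", "Treatment": "control"},
    {"Gender": "f", "Treatment": "control"},
]

print(sorted({group_label(s, ["Gender"]) for s in samples}))
# ['f', 'm']
print([group_label(s, ["Gender", "Treatment"]) for s in samples])
# ['m-treated', 'f-treated', 'm-control', 'f-control']
```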
Data cleaning
Data cleaning removes metabolites that are not well detected across all groups in the study. This improves the reliability of statistical findings during later evaluation.
- A cleaning threshold is defined to remove metabolites from the loaded dataset that are not suitable for further processing.
- A cleaning threshold of 80% removes a metabolite from the dataset if fewer than 80% of the concentration values of that metabolite within all groups are valid measurements.
- Concentrations are evaluated metabolite by metabolite, based on the group configuration.
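A minimal sketch of threshold-based cleaning is shown below. It rests on one possible reading of the rule, stated here as an assumption: a metabolite is removed only if every group falls below the threshold, so it is kept as soon as one group reaches it. The function name and data layout are hypothetical, and `None` stands for a value that is not a valid measurement.

```python
from collections import defaultdict


def clean_metabolites(values, groups, threshold=0.8):
    """Return the metabolites that pass the cleaning threshold.

    values: {metabolite: [concentration or None, one per sample]}
    groups: [group label, one per sample]
    Assumption: a metabolite is kept if at least one group reaches the threshold.
    """
    kept = []
    for metabolite, concentrations in values.items():
        valid_flags = defaultdict(list)
        for conc, group in zip(concentrations, groups):
            valid_flags[group].append(conc is not None)
        if any(sum(f) / len(f) >= threshold for f in valid_flags.values()):
            kept.append(metabolite)
    return kept


groups = ["treated"] * 5 + ["control"] * 5
values = {
    # treated: 4 of 5 valid (80%), control: 1 of 5 valid
    "C0": [1.0, 1.2, 0.9, 1.1, None, None, None, None, 0.8, None],
    # treated: 1 of 5 valid, control: 1 of 5 valid
    "C2": [None, None, 0.4, None, None, 0.5, None, None, None, None],
}
print(clean_metabolites(values, groups))  # ['C0']
```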
Imputation
Data imputation is a process that replaces unusable or removed values (defined below) before any further data processing is performed, such as statistical analysis.
Imputation is only performed for metabolites not excluded.
Definitions | unusable values
Imputation can be performed for the following categories of unusable values.
Values < LOD/2 are imputed with values between LOD/2 and LOD using a logspline probability function, which preserves the variance within the dataset and takes the distribution of values between LOD/2 and LOD into account. If too few valid values are available for a metabolite for the logspline algorithm to work, concentrations < LOD/2 are replaced with random concentrations between LOD/2 and LOD. If the LOD is not known, the concentration value is removed and replaced with "NA".
Missing values > ULOQ, e.g. "∞", are replaced by a random number between the largest usable concentration value of that metabolite in the dataset and twice that value. The evaluation is performed metabolite by metabolite. If the imputation fails, the missing value is replaced with "NA".
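The two random replacements described above can be sketched as follows. This covers only the random fallback for values < LOD/2, not the logspline method, and assumes a uniform distribution, which the text does not specify. Function names are hypothetical, and `None` stands in for "NA".

```python
import random


def impute_below_lod(lod, rng):
    """Fallback for values < LOD/2: random concentration between LOD/2 and LOD."""
    if lod is None:
        return None  # LOD not known: the value becomes "NA"
    return rng.uniform(lod / 2, lod)


def impute_above_uloq(usable_values, rng):
    """Random value between the largest usable value and twice that value."""
    if not usable_values:
        return None  # imputation fails: the value becomes "NA"
    largest = max(usable_values)
    return rng.uniform(largest, 2 * largest)


rng = random.Random(42)  # fixed seed only to make the sketch repeatable
print(impute_below_lod(0.4, rng))                  # a value between 0.2 and 0.4
print(impute_above_uloq([10.0, 50.0, 20.0], rng))  # a value between 50 and 100
```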
Other missing values are missing at random for technical reasons. They are imputed with the k-nearest neighbors (knn) algorithm. The three samples that have the most similar metabolome are identified and the mean concentration in these samples for the missing metabolite concentration replaces the missing value.
Example for "sample C" with missing value for metabolite "C0".
Metabolite --> | Ala | C0 | C2 | Difference from C |
---|---|---|---|---|
Sample A | 3 | 3 | 2 | (3 + 1) / 2 = 2 |
Sample B | 3 | 2 | 1 | (3 + 2) / 2 = 2.5 |
Sample C | 1 | - | 2 | - |
Sample D | 1 | 1 | 1 | (1 + 2) / 2 = 1.5 |
Sample E | 0 | 0 | 2 | (1) / 1 = 1 |
Sample F | 3 | 1 | 0 | (3) / 1 = 3 |
The closest three neighbors of sample C are D, A and B (E does not have a usable value for C0). Their average for C0 is (1 + 3 + 2) / 3 = 2. C0 in sample C will be replaced with value 2.
If no neighbors are available, the value is replaced by the overall mean metabolite concentration of the loaded dataset.
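The nearest-neighbor replacement can be sketched as below. The distance measure is an assumption (mean absolute difference over the metabolites both samples share), so the distances it computes are not the ones listed in the example table, although it selects the same neighbors D, A and B for this data. The function name is hypothetical, and `None` marks a missing or unusable value.

```python
def knn_impute(samples, target, metabolite, k=3):
    """Mean of `metabolite` over the k samples most similar to `target`."""
    target_row = samples[target]
    others = [m for m in target_row if m != metabolite]
    candidates = []
    for name, row in samples.items():
        if name == target or row.get(metabolite) is None:
            continue  # a neighbor needs a usable value for the metabolite
        shared = [m for m in others
                  if row.get(m) is not None and target_row.get(m) is not None]
        if not shared:
            continue
        distance = sum(abs(row[m] - target_row[m]) for m in shared) / len(shared)
        candidates.append((distance, name))
    nearest = sorted(candidates)[:k]
    if nearest:
        return sum(samples[name][metabolite] for _, name in nearest) / len(nearest)
    # No neighbors: fall back to the overall mean of the loaded dataset.
    usable = [r[metabolite] for r in samples.values() if r.get(metabolite) is not None]
    return sum(usable) / len(usable) if usable else None


samples = {
    "Sample A": {"Ala": 3, "C0": 3, "C2": 2},
    "Sample B": {"Ala": 3, "C0": 2, "C2": 1},
    "Sample C": {"Ala": 1, "C0": None, "C2": 2},
    "Sample D": {"Ala": 1, "C0": 1, "C2": 1},
    "Sample E": {"Ala": 0, "C0": None, "C2": 2},  # C0 not usable, per the example
    "Sample F": {"Ala": 3, "C0": 1, "C2": 0},
}
print(knn_impute(samples, "Sample C", "C0"))  # 2.0
```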
Data normalization
Metabolite data can be normalized in a variety of ways. By default, all data will automatically be target value normalized against the level 2 quality control sample provided the necessary criteria are met. Normalization has been shown to increase data accuracy, reduce cross-batch variability, and improve inter-laboratory reproducibility.
It is strongly recommended to normalize all sample data!
Number of replicates required for normalization
- Normalization requires a minimum number of sample source replicates per plate run. The default value is three.
- Define the minimum number of replicates for normalization under Settings > Results > Normalization.
Technical details regarding WebIDQ data normalization can be found in the Normalization appendix, and its effectiveness has been demonstrated in inter-laboratory ring trial studies.
biocrates kit ring trial publications:
- AbsoluteIDQ® p180
- AbsoluteIDQ® Bile Acids
- AbsoluteIDQ® p400 HR
- MxP® Quant 500 coming soon...
- MxP® Quant 500 XL
Use the options in the Data Normalization ▾ dropdown menu to define and customize the normalization applied to the dataset.
Display value | Description |
---|---|
Batch normalization | Enable or disable sample-based normalization calculations |
Normalize values < LOD | Normalization is performed for concentrations with the status "< LOD". |
Creatinine normalization | Divides the metabolite concentrations of each sample by that sample's creatinine concentration (recommended for urine samples). |
Subtract median concentration of zero samples | The median concentrations of the zero samples can be subtracted from the concentrations of samples (not applied to QCs or calibration standards). It may be used for samples with very low metabolite concentrations, like cerebrospinal fluid (CSF) or supernatant from cell culture. To subtract metabolite concentrations of cell culture medium from the supernatant (sample), use “unprocessed” medium as zero sample. |
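Two of the options in the table are simple per-sample calculations and can be sketched as follows. The function names, the key "Creatinine", and the data layout are hypothetical. The sketch does not clip negative results after subtraction, because the table does not say how these are handled.

```python
from statistics import median


def creatinine_normalize(sample, creatinine_key="Creatinine"):
    """Divide each metabolite concentration by the sample's creatinine value."""
    creatinine = sample[creatinine_key]
    return {m: v / creatinine for m, v in sample.items() if m != creatinine_key}


def subtract_zero_median(sample, zero_samples):
    """Subtract the per-metabolite median of the zero samples from a sample."""
    return {m: v - median(z[m] for z in zero_samples) for m, v in sample.items()}


urine = {"Ala": 10.0, "C0": 4.0, "Creatinine": 5.0}
print(creatinine_normalize(urine))  # {'Ala': 2.0, 'C0': 0.8}

zeros = [{"Ala": 1.0}, {"Ala": 2.0}, {"Ala": 4.0}]
print(subtract_zero_median({"Ala": 10.0}, zeros))  # {'Ala': 8.0}
```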
MetaboINDICATOR
MetaboINDICATOR is a tool that calculates sums and ratios of metabolites with relevance to biological and clinical applications (metabolism indicators), to support a more comprehensive understanding of metabolomics studies. In addition, these indicators can significantly reduce biological and analytical variability and can improve the specificity of many findings.
The MetaboINDICATOR tool provides a set of pre-configured sums and ratios that proved to be particularly informative on certain clinical conditions or pathophysiological events. In addition, user-defined sums and ratios can also be created. All sums and ratios are automatically calculated and displayed at the end of the Results table.
If sample data was loaded, but metabolism indicators are not available
Example for Quant 500 XL kit
Select metabolism indicators
Categories for specific diseases, lifestyle factors, and physiological functions
Two display options for metabolism indicators are available, as table or grouped.
Table | Grouped |
---|---|
- To display or hide metabolism indicators, use the toggle .
- To search for specific categories, the filter may be used.
Change view option "table" or "grouped"
Display or hide metabolism indicators
In table view, multiple indicators or a category can be activated or deactivated simultaneously.
- To select specific metabolism indicators, use "shift-click" to select a range or "control-click" for individual selections.
- To activate or deactivate all selected, use the or buttons.
All metabolism indicators can be displayed collapsed or expanded.
Metabolism indicator status
- Metabolism indicators receive a status based on the corresponding single concentration statuses, like "valid" or "< LOD", see concentration validation status, list item 4.
- The status of highest priority is used for a metabolism indicator (MI). Example: MI = A + B + C. Status of A and B are "valid" and of C is "< LOD". The status of MI is "< LOD".
- Sums are calculated if at least one of all summands or subtrahends is different from a zero value.
- If one summand or subtrahend is missing, the metabolism indicator is still calculated and its status is "Incomplete metabolism indicator".
- The metabolism indicator value is "NA" if a calculation is not possible, e.g. if no valid concentration is available or the divisor is zero.
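The status rules above can be sketched for a simple sum and a simple ratio. Only the ordering of "valid" below "< LOD" is taken from the example in the list; the rest of the priority table, the function names, and the use of `None` for "NA" are assumptions.

```python
# Assumed priority values: a higher number wins. Only "valid" < "< LOD"
# comes from the example above; other statuses would need to be added.
PRIORITY = {"valid": 0, "< LOD": 1}


def sum_indicator(terms):
    """terms: list of (value or None, status or None), one per summand.

    Returns (value, status); (None, None) means the value is shown as "NA".
    """
    present = [(v, s) for v, s in terms if v is not None]
    if not present or all(v == 0 for v, _ in present):
        return None, None  # no calculation possible
    value = sum(v for v, _ in present)
    if len(present) < len(terms):
        return value, "Incomplete metabolism indicator"
    status = max((s for _, s in present), key=PRIORITY.__getitem__)
    return value, status


def ratio_indicator(numerator, divisor):
    """Return the ratio, or None ("NA") if a value is missing or the divisor is zero."""
    if numerator is None or divisor is None or divisor == 0:
        return None
    return numerator / divisor


print(sum_indicator([(1.0, "valid"), (2.0, "valid"), (0.5, "< LOD")]))
# (3.5, '< LOD')
print(sum_indicator([(1.0, "valid"), (None, None)]))
# (1.0, 'Incomplete metabolism indicator')
print(ratio_indicator(1.0, 0))  # None
```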
Metabolism indicator details
- To see additional information, such as formula, description, and literature references, select a metabolism indicator and click the info button .
- Metabolism indicators pre-defined by biocrates are highlighted with the biocrates logo .
- Metabolism indicator "categories" or "analyte classes" can be shown in the results table, which is defined in the Settings.
Custom metabolism indicators
MetaboINDICATOR also supports user-created custom metabolism indicators.
To create a new indicator, click the icon in the MetaboINDICATORs dropdown menu.
In the MetaboINDICATOR window, give the custom indicator a name (e.g. "Custom indicator"). Enter the formula for the indicator in the Formula field (e.g. (C0+C10)/C12). The list of possible metabolite names can be seen in the table on the right side. The formula can be constructed using the standard arithmetic operators +, -, *, and /, with parentheses to define the order of operations.
Once the formula has been entered, click Validated to check the validity of the formula. If the formula is OK, click Add to save the indicator.
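To illustrate what validating and evaluating such a formula involves, here is a minimal sketch that accepts only +, -, *, /, parentheses, numbers, and known metabolite names. It is not how WebIDQ checks formulas. The function name is hypothetical, and because it relies on Python's own parser, it only handles metabolite names that are valid Python identifiers (such as C0 or C12).

```python
import ast
import operator

# The four arithmetic operators allowed in a formula.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate_formula(formula, concentrations):
    """Evaluate a formula; raise ValueError for anything not allowed."""
    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            return OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name) and node.id in concentrations:
            return concentrations[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported element in formula: {ast.dump(node)}")

    return walk(ast.parse(formula, mode="eval").body)


values = {"C0": 3.0, "C10": 1.0, "C12": 2.0}
print(evaluate_formula("(C0+C10)/C12", values))  # 2.0
```

An unknown metabolite name or an operator outside the four listed ones raises a `ValueError`, which corresponds to a formula that fails validation.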
To edit a custom indicator, select it and click the info button. To remove it, click the trash icon.
Applying the "80/20 rule"
Examples