Data

Display options in Results can be customized to match the output needs for each study. The data table will automatically be updated as changes are made to reflect the output when the data is exported.

Two views are available: Data and Plots.

Context settings

After selecting data, user-selected settings made in Results can be saved as an individual "context setting", e.g. "Data normalization" settings. These settings can be reused at any later time.

Make user-selected settings, e.g. for "Data normalization" and "Metabolism indicators".
To save these settings, click .
Define a name for this "context setting", in this example "Quant 500 XL export".
Click Save.
- If there are unsaved changes compared to the loaded context settings, the save icon turns red.
- To save as new context settings, define a new name.
All user-selected settings are saved, e.g. as "Quant 500 XL export".

Available options

Select a context setting, e.g. "Quant 500 XL export".
Reset context settings to default values.
Delete a selected context setting.

Application example

Context setting application example

Value type

Results > Concentration

The second dropdown menu in the subheader indicates the type of value displayed in the data table. By default, this is set to Concentration and displays the values in μM (the concentration unit can be specified in the Display Options ▾ section below). In most cases, concentration is the most appropriate value to display. However, additional values can be chosen from the dropdown menu, as described below.

Display value	Description
Concentration	Metabolite concentration in the specified unit (default μM)
Analyte intensity [cps]	Analyte signal intensity in counts per second (cps)
Internal std. intensity [cps]	Internal standard (ISTD) signal intensity in counts per second (cps)
Intensity ratio	Analyte signal intensity to ISTD signal intensity ratio
Accuracy [%]	Accuracy of measurement given as percentage: Measured concentration / expected concentration × 100
CV [%]	CV given as percentage of quality control samples, run in replicates of three or more per plate run
Analyte peak area [area]	Integrated peak area of analyte
Internal std. peak area [area]	Integrated peak area of ISTD
Area ratio	Analyte peak area to ISTD peak area ratio
Analyte retention time [min]	Retention time of analyte based on chromatogram apex
Internal standard retention time [min]	Retention time of ISTD based on chromatogram apex
Relative retention time [min]	Ratio between analyte and ISTD retention time
Analyte peak width [min]	Peak width (full peak width) of analyte, given in minutes
Internal std. peak width [min]	Peak width (full peak width) of ISTD given in minutes
Intensity-to-peak area ratio	Ratio between metabolite peak intensity (peak height) and peak area
Internal std. intensity-to-internal std. peak area ratio	Ratio between ISTD peak intensity (peak height) and peak area
Intensity-to-peak width ratio	Ratio between metabolite peak intensity (peak height) and peak width (full peak width)
Internal std. intensity-to-internal std. peak width ratio	Ratio between ISTD peak intensity (peak height) and peak width (full peak width)
Area-to-peak width ratio	Ratio between metabolite peak area and peak width (full peak width)
Internal std. area-to-internal std. peak width ratio	Ratio between ISTD peak area and peak width (full peak width)
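The two percentage values in the table above can be illustrated with a short sketch. The accuracy formula is taken from the table. The CV formula (standard deviation divided by mean, times 100) is the usual definition and is assumed here, as is the use of the sample standard deviation. The function names and QC values are hypothetical.

```python
from statistics import mean, stdev


def accuracy_percent(measured, expected):
    """Accuracy [%] = measured concentration / expected concentration x 100."""
    return measured / expected * 100


def cv_percent(replicates):
    """CV [%] of replicate measurements.

    Assumption: CV = sample standard deviation / mean x 100.
    The table states that QCs are run in replicates of three or more.
    """
    if len(replicates) < 3:
        raise ValueError("CV requires three or more replicates")
    return stdev(replicates) / mean(replicates) * 100


# Hypothetical QC concentrations in µM
qc_replicates = [9.0, 10.0, 11.0]
print(accuracy_percent(9.5, 10.0))
print(cv_percent(qc_replicates))
```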

Display options

Results > Display options

In addition to the type of value displayed, display options can be further tailored using the Display Options ▾ dropdown menu. Each option is described below.

Display value	Description
Show unknowns only	Display only the study samples (i.e. no Calibrators, QCs, etc.)
Show class	Only display the metabolite class specified in the dropdown menu
Show bio ID	Display metabolite IDs and direct links to publicly available databases such as the Human Metabolome Database (HMDB) or LIPID MAPS (LMID) (note: not all databases contain every metabolite, and some metabolites may have multiple entries)
Show analytical details	Displays upper and lower limits of quantitation (ULOQ and LLOQ, respectively). Further details can be selected with the sliders below: Show calibrator usage; Show calibration equation
Sort columns by	Defines how columns are sorted in the table according to the dropdown menu: Class and name (sort by metabolite class, then name); Name (sort alphabetically by metabolite name)
Concentration unit	Specify the concentration units in the dropdown menu; pmol/mg Tissue and pmol/10E6 Cells are only available if the correct matrix and appropriate factors were previously defined in LIMS
Log transform data	Log transform the results; The dropdown to the right can further specify Log₂ or Log₁₀
Split merged rows	Results from different kit parts, e.g. LC and FIA, are displayed in one row. To show each part in a separate row, activate this option

Data preprocessing

Data preprocessing is a toolset that performs data cleaning and missing value imputation on results before exporting. This process helps to create a more robust dataset for downstream statistical analysis.

The processing is optional. Imputation is exclusively applied to unknown samples (study samples).

Data preprocessing is performed before data normalization and calculation of metabolism indicators.

Preprocessing options

Data preprocessing options

Group configuration

Add metadata

Data cleaning is performed based on the dataset's defined group information.

Group information is specific for each loaded dataset and is defined for "unknowns" during sample registration, see section groups and variables.
If metadata is not selected, all loaded samples are treated as one group.

Example | Combined (concatenated) metadata

Metadata from different categories can be combined.
Samples with the linked metadata categories "Gender" and "Treatment" are loaded, each containing two groups:
Gender: "m" and "f"
Treatment: "treated" and "control"
If only one of the two categories is loaded, e.g. "Gender", two groups are used for data cleaning: "m" and "f".
If both categories "Gender" and "Treatment" are loaded, a combination of categories and groups is used for data cleaning: "m-treated", "f-treated", "m-control", and "f-control".
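The combination of metadata categories described in this example can be sketched as follows. This is only an illustration of the concatenation idea. The function name, the dictionary layout, and the "all" label for the no-metadata case are assumptions and not part of the application.

```python
def group_label(sample_metadata, categories):
    """Build the cleaning group of one sample from the selected categories."""
    if not categories:
        # No metadata selected: all samples are treated as one group.
        # The label "all" is a hypothetical placeholder.
        return "all"
    return "-".join(sample_metadata[c] for c in categories)


# Hypothetical samples with linked metadata
samples = [
    {"Gender": "m", "Treatment": "treated"},
    {"Gender": "f", "Treatment": "treated"},
    {"Gender": "m", "Treatment": "control"},
    {"Gender": "f", "Treatment": "control"},
]

one_category = sorted({group_label(s, ["Gender"]) for s in samples})
both_categories = sorted({group_label(s, ["Gender", "Treatment"]) for s in samples})
print(one_category)
print(both_categories)
```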

Data cleaning

Data cleaning threshold

Data cleaning removes metabolites that are not well detected in any group of the study. This improves the reliability of statistical findings in the later evaluation.

A cleaning threshold is defined to remove metabolites from the loaded dataset that are not suitable for further processing.
A cleaning threshold of 80% removes a metabolite from the dataset if, in every group, less than 80% of its concentration values are valid measurements.
Concentrations are evaluated metabolite by metabolite, based on the group configuration.

Applying the "80/20 rule"

The "80/20 rule" uses a cleaning threshold of 80%.
Concentrations are evaluated based on the group configuration for each defined metadata set individually, e.g. for the combination of the metadata information "Material" and "Species".
The evaluation is performed metabolite by metabolite for each metadata set. The "80/20 rule" is applied: if more than 20% of the concentrations are non-valid, e.g. < LOD, in every group, the metabolite is excluded from the loaded dataset.
Removed metabolites are not displayed in Results.
Imputation is only applied to metabolites that were not removed.
No specific status is used for "removed metabolites", as they are absent (no value).

Examples

A) Valid concentrations exist in 75% of the samples in group A, 79% in group B → the metabolite is excluded

B) Valid concentrations exist in 80% of the samples in group A, 0% in group B → the metabolite is not excluded
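The rule and the two examples above can be expressed as a small check: a metabolite is kept if at least one group reaches the cleaning threshold. The following is a minimal sketch under that reading. The function names and the boolean "valid" flags are illustrative assumptions.

```python
def keep_metabolite(valid_flags_by_group, threshold=0.8):
    """Return True if at least one group reaches the cleaning threshold.

    valid_flags_by_group maps a group name to a list of booleans,
    one per sample, where True marks a valid concentration.
    """
    return any(
        sum(flags) / len(flags) >= threshold
        for flags in valid_flags_by_group.values()
        if flags
    )


def flags(n_valid, n_total):
    """Helper: n_valid valid values out of n_total samples."""
    return [True] * n_valid + [False] * (n_total - n_valid)


# Example A: 75% valid in group A, 79% in group B -> excluded
example_a = {"A": flags(75, 100), "B": flags(79, 100)}
# Example B: 80% valid in group A, 0% in group B -> not excluded
example_b = {"A": flags(4, 5), "B": flags(0, 5)}

print(keep_metabolite(example_a))
print(keep_metabolite(example_b))
```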

Imputation

Data imputation is a process that replaces unusable or removed values (defined below) before any further data processing is performed, such as statistical analysis.

Imputation is only performed for metabolites not excluded.

Definitions | unusable values

Imputation can be performed for the following categories of unusable values.

Values < LOD/2 are imputed with values between LOD and LOD/2 using a logspline probability function, preserving the variance within the dataset and taking the distribution of values between LOD and LOD/2 into account. If too few valid values for a metabolite are available for the logspline algorithm to work, the concentrations < LOD/2 are replaced with random concentrations between LOD and LOD/2.
If the LOD is not known, the concentration value is removed and replaced with "NA".
< LOD/2 = concentration below half of LOD.
Missing values > ULOQ, e.g. "∞", are replaced by a random number in the interval between the largest usable concentration value of that metabolite in the dataset and twice that value. The evaluation is performed metabolite by metabolite. If the imputation fails, the missing value is replaced with "NA".
Other missing values are missing at random for technical reasons. They are imputed with the k-nearest neighbors (knn) algorithm. The three samples that have the most similar metabolome are identified and the mean concentration in these samples for the missing metabolite concentration replaces the missing value.
Example for "sample C" with missing value for metabolite "C0".
Metabolite -->	Ala	C0	C2	difference from C
Sample A	3	3	2	(3 + 1) / 2 = 2
Sample B	3	2	1	(3 + 2) / 2 = 2.5
Sample C	1	-	2	-
Sample D	1	1	1	(1 + 2) / 2 = 1.5
Sample E	0	0	2	(1) / 1 = 1
Sample F	3	1	0	(3) / 1 = 3
The closest three neighbors of sample C are D, A and B (E does not have a usable value for C0). Their average for C0 is (1 + 3 + 2) / 3 = 2. C0 in sample C will be replaced with value 2.
If no neighbors are available, the value is replaced by the overall mean metabolite concentration of the loaded dataset.
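The neighbor-based replacement for sample C can be sketched in a few lines. The distance measure used here (mean absolute difference over the metabolites that are usable in both samples) is an assumption for illustration. It does not reproduce the numbers in the "difference from C" column, but it selects the same three neighbors. Treating a value of 0 as "not usable" is also an assumption, based on the remark about sample E. All function names are hypothetical.

```python
from statistics import mean


def usable(value):
    # Assumption: missing (None) and zero values count as not usable.
    return value is not None and value > 0


def distance(a, b, skip):
    """Mean absolute difference over metabolites usable in both samples."""
    shared = [m for m in a if m != skip and usable(a[m]) and usable(b[m])]
    if not shared:
        return None
    return mean(abs(a[m] - b[m]) for m in shared)


def impute_knn(data, sample, metabolite, k=3):
    """Replace a missing value with the mean of the k nearest usable samples."""
    target = data[sample]
    candidates = []
    for name, row in data.items():
        if name == sample or not usable(row[metabolite]):
            continue
        d = distance(target, row, skip=metabolite)
        if d is not None:
            candidates.append((d, name))
    nearest = sorted(candidates)[:k]
    if not nearest:
        # Fallback: overall mean of the metabolite in the loaded dataset.
        values = [r[metabolite] for r in data.values() if usable(r[metabolite])]
        return mean(values) if values else None
    return mean(data[name][metabolite] for _, name in nearest)


data = {
    "A": {"Ala": 3, "C0": 3, "C2": 2},
    "B": {"Ala": 3, "C0": 2, "C2": 1},
    "C": {"Ala": 1, "C0": None, "C2": 2},
    "D": {"Ala": 1, "C0": 1, "C2": 1},
    "E": {"Ala": 0, "C0": 0, "C2": 2},
    "F": {"Ala": 3, "C0": 1, "C2": 0},
}
imputed = impute_knn(data, "C", "C0")
print(imputed)
```

With this data, the three nearest samples with a usable C0 value are D, A and B, so the sketch arrives at the same replacement value as the example.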

Data normalization

Metabolite data can be normalized in a variety of ways. By default, all data will automatically be target value normalized against the level 2 quality control sample provided the necessary criteria are met. Normalization has been shown to increase data accuracy, reduce cross-batch variability, and improve inter-laboratory reproducibility.

It is strongly recommended to normalize all sample data!

Number of replicates required for normalization

Normalization requires a minimum number of sample source replicates per plate run. The default value is three.
Define the minimum number of replicates for normalization in Settings > Results > Normalization.

Technical details regarding WebIDQ data normalization can be found in the Normalization appendix. Its effectiveness has been demonstrated in inter-laboratory ring trial studies.

biocrates kit ring trial publications:

Results > Data normalization

Normalization options

Use the options in the Data Normalization ▾ dropdown menu to define and customize the normalization applied to the dataset.

Display value	Description
Batch normalization	Enable or disable sample-based normalization calculations.
	Target value: Perform intra-plate normalization based on quality control sample accuracy (only available with biocrates quality control samples).
	Reference sample: Perform intra-plate normalization based on the reference sample's accuracy values. The reference sample can be any sample, e.g. "unknown" or Custom QC, run in the number of replicates required for normalization. Available if at least two plate runs were loaded.
	Sample source: Select the sample type to be used for normalization. The sample must have been run in the number of replicates required for normalization on all plates being normalized. Target value normalization is only possible when a biocrates quality control is selected.
	Configuration: Define specific sample sources for normalization for each loaded plate run, e.g. QC1 for plate A and QC2 for plate B.
	Method: Choose whether the normalization factors are calculated from the mean or the median metabolite values of the sample source.
Normalize values < LOD	Normalization is also performed for concentrations with the status "< LOD".
Creatinine normalization	Divides the concentrations of each sample by the creatinine concentration of that sample (recommended for urine samples).
Subtract median concentration of zero samples	The median concentrations of the zero samples can be subtracted from the concentrations of samples (not applied to QCs or calibration standards). This may be used for samples with very low metabolite concentrations, such as cerebrospinal fluid (CSF) or supernatant from cell culture. To subtract metabolite concentrations of cell culture medium from the supernatant (sample), use "unprocessed" medium as zero sample.
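The last two options in the table are simple per-sample operations and can be sketched as shown below. This is an illustration of the arithmetic only, not the WebIDQ implementation. The function names, the dictionary layout, and the "Creatinine" key are assumptions, and the handling of negative results after subtraction is not specified in this section.

```python
from statistics import median


def subtract_zero_median(sample, zero_samples):
    """Subtract the per-metabolite median of the zero samples from a sample."""
    return {
        metabolite: conc - median(z[metabolite] for z in zero_samples)
        for metabolite, conc in sample.items()
    }


def creatinine_normalize(sample, creatinine="Creatinine"):
    """Divide each concentration of a sample by its creatinine concentration."""
    reference = sample[creatinine]
    return {
        metabolite: conc / reference
        for metabolite, conc in sample.items()
        if metabolite != creatinine
    }


# Hypothetical concentrations in µM
zero_samples = [
    {"Ala": 1.0, "C0": 0.5},
    {"Ala": 2.0, "C0": 0.5},
    {"Ala": 3.0, "C0": 1.5},
]
corrected = subtract_zero_median({"Ala": 10.0, "C0": 4.0}, zero_samples)
normalized = creatinine_normalize({"Creatinine": 2.0, "Ala": 10.0})
print(corrected)
print(normalized)
```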

MetaboINDICATOR

MetaboINDICATOR is a tool that calculates sums and ratios of metabolites with relevance to biological and clinical applications (metabolism indicators), to support a more comprehensive understanding of metabolomics studies. In addition, these indicators can significantly reduce biological and analytical variability and can improve the specificity of many findings.

The MetaboINDICATOR tool provides a set of pre-configured sums and ratios that proved to be particularly informative on certain clinical conditions or pathophysiological events. In addition, user-defined sums and ratios can also be created. All sums and ratios are automatically calculated and displayed at the end of the Results table.

If sample data was loaded but metabolism indicators are not available, load a patch of the corresponding kit.

Example for Quant 500 XL kit

MS: SCIEX 5500+

Select metabolism indicators

Categories for specific diseases, lifestyle factors, and physiological functions

MetaboINDICATOR

Two display options for metabolism indicators are available, as table or grouped.

Table	Grouped

To display or hide metabolism indicators, use the toggle .
To search for specific categories, the filter may be used.

Change view option "table" or "grouped"

Metabolism indicators change view

Display or hide metabolism indicators

Metabolism indicators display or hide

In table view, multiple indicators or a category can be activated or deactivated simultaneously.

To select specific metabolism indicators,
use "shift-click" to select a range or
"control-click" for individual selections.
To activate or deactivate all selected, use the or buttons.

All metabolism indicators can be displayed collapsed or expanded.

Metabolism indicators displayed collapsed or expanded

Metabolism indicator status

Metabolism indicators receive a status based on the corresponding single concentration statuses, like "valid" or "< LOD", see concentration validation status, list item 4.
The status of highest priority is used for a metabolism indicator (MI).
Example: MI = A + B + C. The status of A and B is "valid" and the status of C is "< LOD". The status of MI is "< LOD".
Sums are calculated if at least one of the summands or subtrahends is different from zero.
Otherwise, the sum is removed.
If one summand or subtrahend is missing, the metabolism indicator is still calculated and receives the status "Incomplete metabolism indicator".
If the numerator or denominator of a ratio is zero (no concentration available), the ratio is removed (not displayed).
If imputation was performed, imputed concentrations are included in metabolism indicator calculations.
The metabolism indicator status is independent from imputed or non-imputed concentrations.
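The rules above can be summarized in a short sketch. Only the two statuses named in this section are included, and their priority order is an assumption derived from the example. The complete status list and its priorities are defined by the application. All function names are hypothetical.

```python
# Assumed priority order, from lowest to highest priority.
STATUS_PRIORITY = ["valid", "< LOD"]


def indicator_status(statuses):
    """The status of highest priority among the single concentrations wins."""
    return max(statuses, key=STATUS_PRIORITY.index)


def indicator_sum(terms):
    """Sum signed terms, where None marks a missing concentration.

    Returns None if the sum is removed, otherwise (value, incomplete).
    """
    present = [t for t in terms if t is not None]
    if not any(t != 0 for t in present):
        return None  # no term differs from zero: the sum is removed
    incomplete = len(present) < len(terms)
    return sum(present), incomplete


def indicator_ratio(numerator, denominator):
    """Return None (ratio removed) if numerator or denominator is zero or missing."""
    if not numerator or not denominator:
        return None
    return numerator / denominator


print(indicator_status(["valid", "valid", "< LOD"]))
print(indicator_sum([1.0, None, 2.0]))
print(indicator_ratio(3.0, 2.0))
```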

Metabolism indicator details

To see additional information, such as formula, description, and literature references, select a metabolism indicator and click the info button .
Metabolism indicators pre-defined by biocrates are highlighted with the biocrates logo .
Metabolism indicator "categories" or "analyte classes" can be shown in the results table, which is defined in the Settings.

Custom metabolism indicators

MetaboINDICATOR also supports user-created custom metabolism indicators.

To create a new indicator, click the icon in the MetaboINDICATORs dropdown menu.

Custom

In the MetaboINDICATOR window, give the custom indicator a name (e.g. "Custom indicator"). Enter the formula for the indicator in the Formula field (e.g. (C0+C10)/C12). The list of possible metabolite names is shown in the table on the right side. The formula can be constructed using the standard arithmetic operators +, -, *, and /, with parentheses to define the order of operations.

Once the formula has been entered, click Validated to check the validity of the formula. If the formula is OK, click Add to save the indicator.
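To illustrate what checking and evaluating such a formula involves, the following sketch evaluates an expression built from metabolite names, the four operators, and parentheses. It is not the application's validator. It relies on Python's `ast` module, so it only works for metabolite names that are valid Python identifiers (such as C0, C10, C12), which is a limitation of this sketch. The function name and concentrations are hypothetical.

```python
import ast
import operator

# Only the four operators named in the text are allowed.
OPERATORS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def evaluate(formula, concentrations):
    """Evaluate a formula; raise ValueError for unknown names or operators."""

    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            return OPERATORS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Name) and node.id in concentrations:
            return concentrations[node.id]
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError(f"unsupported element in formula: {ast.dump(node)}")

    return walk(ast.parse(formula, mode="eval").body)


# Hypothetical concentrations in µM
values = {"C0": 3.0, "C10": 1.0, "C12": 2.0}
print(evaluate("(C0+C10)/C12", values))
```

A formula that references a name missing from the metabolite list, or uses any other construct, is rejected with a `ValueError` instead of being evaluated.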

To edit a custom indicator, select it and click the info button . To remove it, click the trash icon .

Metabolism indicators pre-defined by biocrates are highlighted with the biocrates logo .
