Normalisation

Dilution effects on global sample intensity can be normalised by attaching an instance of one of the classes in the normalisation sub-module to the Normalisation attribute of a Dataset.

By default, new Dataset objects have a NullNormaliser attached, which carries out no normalisation. Assigning an instance of a Normaliser class causes all calls to intensityData to return values transformed by that normaliser. For example, to return total area normalised values:

import nPYc.utilities.normalisation

totalAreaNormaliser = nPYc.utilities.normalisation.TotalAreaNormaliser()
dataset.Normalisation = totalAreaNormaliser

There are three built-in normalisation objects:

  • Null normaliser (NullNormaliser): no normalisation performed
  • Probabilistic quotient normaliser (ProbabilisticQuotientNormaliser): performs probabilistic quotient normalisation (Dieterle et al. [1])
  • Total area normaliser (TotalAreaNormaliser): performs normalisation where each row (sample) is divided by the total sum of its variables (columns)
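To make the total area description concrete, the following is a minimal NumPy sketch of dividing each row by the sum of its columns. It is an illustration only, not the toolbox's implementation; the function name total_area_normalise and the example matrix are assumptions introduced here.

```python
import numpy as np

def total_area_normalise(X):
    """Divide each row (sample) by the total sum of its columns (variables).

    Hypothetical helper for illustration; not part of nPYc.
    """
    X = np.asarray(X, dtype=float)
    row_sums = X.sum(axis=1, keepdims=True)  # shape [n_samples, 1]
    return X / row_sums

# The second sample is the first one at twice the concentration
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
normalised = total_area_normalise(X)
```

After normalisation every row sums to one, so the two samples, which differ only by dilution, become identical.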

Normalisation Syntax and Parameters

The main function parameters (which may be of interest to advanced users) are as follows:

The utilities module implements several Normaliser objects that perform intensity normalisation on the provided numpy matrix.

All normaliser objects must implement the Normaliser abstract base class.

Normalisers may be configured as required upon initialisation; a normalised view of a matrix is then obtained by passing the data to be normalised to the normalise() method.

Once normalise() has been called, the normalisation coefficients last used can be obtained from normalisation_coefficients.
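The configure, normalise, then query pattern described above can be sketched with a small stand-in interface. The classes SketchNormaliser and SketchTotalArea below are hypothetical and exist only to show the call sequence; they are not the Normaliser base class shipped with the toolbox, and they use plain lists rather than numpy arrays to stay self-contained.

```python
from abc import ABC, abstractmethod

class SketchNormaliser(ABC):
    """Hypothetical stand-in for an abstract normaliser interface."""

    def __init__(self):
        self._coefficients = None  # nothing calculated yet

    @property
    def normalisation_coefficients(self):
        # Read-only: no setter is defined, so assignment raises AttributeError
        return self._coefficients

    @abstractmethod
    def normalise(self, X):
        """Return normalised data and record the coefficients used."""

class SketchTotalArea(SketchNormaliser):
    """Hypothetical concrete normaliser: divides each row by its sum."""

    def normalise(self, X):
        self._coefficients = [sum(row) for row in X]
        return [[value / total for value in row]
                for row, total in zip(X, self._coefficients)]

normaliser = SketchTotalArea()
before = normaliser.normalisation_coefficients   # None until normalise() is called
result = normaliser.normalise([[1.0, 3.0], [2.0, 2.0]])
after = normaliser.normalisation_coefficients    # coefficients from the last call
```

Exposing the coefficients through a property without a setter is one simple way to get the documented behaviour of raising AttributeError on direct assignment.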

class nPYc.utilities.normalisation.NullNormaliser

Null normalisation object which performs no normalisation, returning the provided matrix unchanged when normalise() is called.

normalisation_coefficients

Returns the normalisation coefficients.

Returns:1

normalise(X)

Returns X unchanged.

Parameters:X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
Returns:The original X matrix without any modification
Return type:numpy.ndarray, shape [n_samples, n_features]
class nPYc.utilities.normalisation.ProbabilisticQuotientNormaliser(reference=None, referenceDescription=None)

Normalisation object which performs Probabilistic Quotient normalisation (Dieterle et al., Analytical Chemistry, 78(13):4281–90, 2006).

Parameters:
  • reference (str, int, or numpy.ndarray) – Source of the reference profile. If None, the median of X is used; if an int, it is treated as the index of a spectrum in X to use as the reference; if an array with the same width as X, it is treated as the reference profile itself.
  • referenceDescription (None, or str) – A textual description of the reference provided
  • keepMagnitude (bool) – If True scales X such that the mean area of X remains constant for the dataset as a whole.
normalisation_coefficients

Returns the last set of normalisation coefficients calculated.

Returns:Normalisation coefficients or None if they have not been generated yet
Raises:AttributeError – Setting the normalisation coefficients directly is not allowed and raises an error
reference

Allows the reference profile used to calculate fold-changes to be queried or set.

Returns:The reference profile used to calculate normalisation coefficients
normalise(X)

Apply Probabilistic Quotient normalisation to a dataset.

Parameters:
  • X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
  • reference (numpy.ndarray, shape [n_features]) – Spectrum to use as the normalisation reference
Returns:A read-only, normalised view of X
Return type:numpy.ndarray, shape [n_samples, n_features]
Raises:ValueError – If X is not a numpy 2-d array representing a data matrix
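As a rough illustration of how the reference options and the quotient step fit together, here is a minimal NumPy sketch of probabilistic quotient normalisation. It follows the general scheme of Dieterle et al. (quotients against a reference profile, then the per-sample median of those quotients as the dilution estimate) and is not the toolbox's own code. The function names resolve_reference and pqn_normalise are assumptions, skipping zero-valued reference features is a choice made for this sketch, and the str option listed for reference is not covered because its meaning is not described here.

```python
import numpy as np

def resolve_reference(X, reference=None):
    """Turn the reference argument into a 1-d profile of length n_features.

    Hypothetical helper for illustration; not part of nPYc.
    """
    if reference is None:
        return np.median(X, axis=0)            # median spectrum across samples
    if isinstance(reference, (int, np.integer)):
        return X[reference, :]                 # one row of X acts as the reference
    reference = np.asarray(reference, dtype=float)
    if reference.shape != (X.shape[1],):
        raise ValueError("reference must have one value per feature in X")
    return reference

def pqn_normalise(X, reference=None):
    """Return (normalised matrix, per-sample coefficients)."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be a 2-d data matrix")
    ref = resolve_reference(X, reference)
    usable = ref != 0                          # skip features that would divide by zero
    quotients = X[:, usable] / ref[usable]     # fold change of each feature against the reference
    coefficients = np.median(quotients, axis=1)  # estimated dilution factor per sample
    return X / coefficients[:, None], coefficients

# Three samples that differ only by dilution
X = np.array([[1.0, 2.0, 4.0],
              [2.0, 4.0, 8.0],
              [3.0, 6.0, 12.0]])
normalised, coefficients = pqn_normalise(X)            # median spectrum as reference
_, coefficients_row0 = pqn_normalise(X, reference=0)   # first spectrum as reference
```

Using the median of the quotients, rather than their mean or the total area, keeps the dilution estimate stable when a few features change strongly between samples.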

class nPYc.utilities.normalisation.TotalAreaNormaliser(keepMagnitude=True)

Normalisation object which performs Total Area normalisation. Each row in the matrix provided will be scaled to sum to the same value.

Parameters:keepMagnitude (bool) – If True scales X such that the mean area of X remains constant for the dataset as a whole.
normalisation_coefficients

Returns the last set of normalisation coefficients calculated.

Returns:Normalisation coefficients or None if they have not been generated yet
Raises:AttributeError – Setting the normalisation coefficients directly is not allowed and raises an error
normalise(X)

Apply Total Area normalisation to the dataset.

Parameters:X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
Returns:A read-only, normalised view of X
Return type:numpy.ndarray, shape [n_samples, n_features]
Raises:ValueError – If X is not a numpy 2-d array representing a data matrix
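The effect of keepMagnitude can be illustrated with a short NumPy sketch. It assumes that keeping the magnitude means rescaling the per-sample totals by their mean, so that every row ends up summing to the original mean area; this is an interpretation of the parameter description, not a statement of how the toolbox computes it. The function name total_area_with_magnitude is hypothetical, and the sketch returns a new array marked read-only rather than a true view of X.

```python
import numpy as np

def total_area_with_magnitude(X, keep_magnitude=True):
    """Return (normalised matrix, per-sample coefficients).

    Hypothetical helper for illustration; not part of nPYc.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be a 2-d data matrix")
    coefficients = X.sum(axis=1)                 # total area of each sample
    if keep_magnitude:
        # Rescale so the mean coefficient is 1, preserving the mean area
        coefficients = coefficients / coefficients.mean()
    normalised = X / coefficients[:, None]
    normalised.setflags(write=False)             # hand back a read-only array
    return normalised, coefficients

X = np.array([[1.0, 1.0],
              [3.0, 5.0]])
scaled, coeffs = total_area_with_magnitude(X, keep_magnitude=True)
unit, _ = total_area_with_magnitude(X, keep_magnitude=False)
```

With keep_magnitude=True both rows sum to 5, the mean of the original row totals (2 and 8); with keep_magnitude=False both rows sum to 1.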

[1] Frank Dieterle, Alfred Ross, Götz Schlotterbeck and Hans Senn. Probabilistic quotient normalization as robust method to account for dilution of complex biological mixtures. Application in 1H NMR metabonomics. Analytical Chemistry, 78(13):4281–90, 2006. URL: https://pubs.acs.org/doi/10.1021/ac051632c