Normalisation
Dilution effects on global sample intensity can be normalised by attaching one of the classes in the normalisation submodule to the Normalisation attribute of a Dataset.
By default, new Dataset objects have a NullNormaliser attached, which carries out no normalisation. Once an instance of a Normaliser class is assigned, all calls to intensityData return values transformed by that normaliser. For example, to return total area normalised values:
totalAreaNormaliser = nPYc.utilities.normalisation.TotalAreaNormaliser()
dataset.Normalisation = totalAreaNormaliser
There are three built-in normalisation objects:
- Null normaliser (NullNormaliser): no normalisation performed
- Probabilistic quotient normaliser (ProbabilisticQuotientNormaliser): performs probabilistic quotient normalisation (Dieterle et al. [1])
- Total area normaliser (TotalAreaNormaliser): performs normalisation where each row (sample) is divided by the total sum of its variables (columns)
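As a rough illustration of what probabilistic quotient normalisation does, the following sketch computes, for each sample, the median of the quotients between that sample and a reference profile, then divides the sample by that median. It is not the nPYc implementation: the function name pqn_normalise is hypothetical, plain Python lists stand in for the numpy matrix so that the sketch has no dependencies, and the preliminary integral normalisation step that is often applied before this procedure is left out.

```python
from statistics import median


def pqn_normalise(X, reference=None):
    """Hypothetical sketch of probabilistic quotient normalisation.

    X is a list of rows (samples), each a list of feature intensities.
    Returns the normalised rows and one coefficient per sample.
    """
    n_features = len(X[0])
    if reference is None:
        # Assumed default: the median profile across all samples
        reference = [median(row[j] for row in X) for j in range(n_features)]

    normalised, coefficients = [], []
    for row in X:
        # Quotient of each variable against the reference;
        # variables with a zero reference value are skipped
        quotients = [x / r for x, r in zip(row, reference) if r != 0]
        # The most probable quotient (the median) is taken as the dilution factor
        coefficient = median(quotients)
        coefficients.append(coefficient)
        normalised.append([x / coefficient for x in row])
    return normalised, coefficients


# Three samples with the same profile at different dilutions
samples = [[1, 2, 3], [2, 4, 6], [4, 8, 12]]
normalised, coefficients = pqn_normalise(samples)
```

Because the three example samples differ only by a dilution factor, they collapse onto the same profile after normalisation, and the coefficients record the relative dilution of each sample.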
Normalisation Syntax and Parameters
The main function parameters (which may be of interest to advanced users) are as follows:
The utilities module implements several Normaliser objects that perform intensity normalisation on the provided numpy matrix.
All normaliser objects must implement the Normaliser abstract base class.
Normalisers may be configured as required upon initialisation; a normalised view of a matrix is then obtained by passing the data to be normalised to the normalise() method.
Once normalise() has been called, the normalisation coefficients last used can be obtained from normalisation_coefficients.
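The pattern described above (configure on initialisation, call normalise(), then read the coefficients back) can be sketched with a small stand-in class hierarchy. Everything named here is hypothetical and exists only for illustration: SketchNormaliser mimics the role of the Normaliser abstract base class, and SketchConstantNormaliser with its factor argument is an invented example, not one of the nPYc normalisers. Plain lists are used in place of numpy arrays.

```python
from abc import ABC, abstractmethod


class SketchNormaliser(ABC):
    """Hypothetical stand-in for a normaliser abstract base class."""

    def __init__(self):
        self._coefficients = None  # nothing calculated yet

    @property
    def normalisation_coefficients(self):
        # Read-only property: no setter is defined, so direct
        # assignment raises AttributeError
        return self._coefficients

    @abstractmethod
    def normalise(self, X):
        """Return a normalised copy of X (a list of rows)."""


class SketchConstantNormaliser(SketchNormaliser):
    """Invented example: divides every sample by a fixed factor."""

    def __init__(self, factor=1.0):
        super().__init__()
        self.factor = factor  # configured once, on initialisation

    def normalise(self, X):
        # Store the coefficients used so they can be queried afterwards
        self._coefficients = [self.factor] * len(X)
        return [[x / self.factor for x in row] for row in X]


normaliser = SketchConstantNormaliser(factor=2.0)
normalised = normaliser.normalise([[2.0, 4.0], [6.0, 8.0]])
coefficients = normaliser.normalisation_coefficients
```

Keeping the coefficients behind a read-only property mirrors the documented behaviour that setting them directly is not allowed.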

class nPYc.utilities.normalisation.NullNormaliser
Null normalisation object which performs no normalisation, returning the provided matrix unchanged when normalise() is called.

normalisation_coefficients
Returns normalisation coefficients.
Returns: 1

normalise(X)
Returns X unchanged.
Parameters: X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
Returns: The original X matrix without any modification
Return type: numpy.ndarray, shape [n_samples, n_features]


class nPYc.utilities.normalisation.ProbabilisticQuotientNormaliser(reference=None, referenceDescription=None)
Normalisation object which performs Probabilistic Quotient normalisation (Dieterle et al., Analytical Chemistry, 78(13):4281–90, 2006).
Parameters:
- reference (str, int, or numpy.ndarray) – Source of the reference profile. If None, use the median of X; if an int, treat it as the index of a spectrum in X to use as the reference; if an array with the same width as X, treat it as the reference profile.
- referenceDescription (None, or str) – A textual description of the reference provided
- keepMagnitude (bool) – If True, scales X such that the mean area of X remains constant for the dataset as a whole.

normalisation_coefficients
Returns the last set of normalisation coefficients calculated.
Returns: Normalisation coefficients, or None if they have not been generated yet
Raises: AttributeError – Setting the normalisation coefficients directly is not allowed and raises an error

reference
Allows the reference profile used to calculate fold changes to be queried or set.
Returns: The reference profile used to calculate normalisation coefficients

normalise(X)
Apply Probabilistic Quotient normalisation to a dataset.
Parameters:
- X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
- reference (numpy.ndarray, shape [n_features]) – Spectrum to use as the normalisation reference
Returns: A read-only, normalised view of X
Return type: numpy.ndarray, shape [n_samples, n_features]
Raises: ValueError – If X is not a numpy 2d array representing a data matrix

class nPYc.utilities.normalisation.TotalAreaNormaliser(keepMagnitude=True)
Normalisation object which performs Total Area normalisation. Each row in the matrix provided will be scaled to sum to the same value.
Parameters: keepMagnitude (bool) – If True, scales X such that the mean area of X remains constant for the dataset as a whole.
normalisation_coefficients
Returns the last set of normalisation coefficients calculated.
Returns: Normalisation coefficients, or None if they have not been generated yet
Raises: AttributeError – Setting the normalisation coefficients directly is not allowed and raises an error

normalise(X)
Apply Total Area normalisation to the dataset.
Parameters: X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
Returns: A read-only, normalised view of X
Return type: numpy.ndarray, shape [n_samples, n_features]
Raises: ValueError – If X is not a numpy 2d array representing a data matrix
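To make the effect of keepMagnitude concrete, here is a minimal sketch of total area normalisation on plain Python lists. It is an assumption about how the behaviour described above could be realised, not the nPYc code: the function name total_area_normalise and the keep_magnitude argument are hypothetical, and rescaling the row totals by their mean is one plausible reading of "the mean area of X remains constant".

```python
def total_area_normalise(X, keep_magnitude=True):
    """Hypothetical sketch of total area normalisation.

    X is a list of rows (samples), each a list of feature intensities.
    Returns the normalised rows and one coefficient per sample.
    """
    # Total area of each sample: the sum over its variables
    totals = [sum(row) for row in X]

    if keep_magnitude:
        # Assumed behaviour: express each total relative to the mean total,
        # so every row ends up summing to the original mean area
        mean_total = sum(totals) / len(totals)
        coefficients = [t / mean_total for t in totals]
    else:
        # Plain total area normalisation: every row sums to 1
        coefficients = list(totals)

    normalised = [[x / c for x in row] for row, c in zip(X, coefficients)]
    return normalised, coefficients


samples = [[1.0, 3.0], [2.0, 6.0]]
unit_area, unit_coefficients = total_area_normalise(samples, keep_magnitude=False)
same_scale, scale_coefficients = total_area_normalise(samples, keep_magnitude=True)
```

In both modes every row is scaled to the same total; the option only decides whether that common total is 1 or the mean area of the original data.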

[1] Frank Dieterle, Alfred Ross, Götz Schlotterbeck and Hans Senn. Probabilistic quotient normalization as robust method to account for dilution of complex biological mixtures. Application in 1H NMR metabonomics. Analytical Chemistry, 78(13):4281–90, 2006. URL: https://pubs.acs.org/doi/10.1021/ac051632c