Normalisation
Dilution effects on global sample intensity can be normalised by attaching one of the classes in the normalisation sub-module to the Normalisation attribute of a Dataset.
By default, new Dataset objects have a NullNormaliser attached, which carries out no normalisation. Once an instance of a Normaliser class is assigned, all calls to intensityData return values transformed by that normaliser. For example, to return total area normalised values:
totalAreaNormaliser = nPYc.utilities.normalisation.TotalAreaNormaliser()
dataset.Normalisation = totalAreaNormaliser
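The effect of swapping the attached normaliser can be illustrated with a small stand-alone sketch. The classes below (ToyDataset, ToyNullNormaliser, ToyTotalAreaNormaliser) are hypothetical stand-ins written for this example and are not nPYc's implementation; they only assume that intensityData passes the stored values through whatever object is held in Normalisation.

```python
import numpy as np


class ToyNullNormaliser:
    """Hypothetical stand-in: returns the data unchanged."""

    def normalise(self, X):
        return X


class ToyTotalAreaNormaliser:
    """Hypothetical stand-in: divides each row by its sum."""

    def normalise(self, X):
        return X / X.sum(axis=1, keepdims=True)


class ToyDataset:
    """Hypothetical stand-in for a Dataset; not nPYc's implementation."""

    def __init__(self, raw):
        self._raw = np.asarray(raw, dtype=float)
        self.Normalisation = ToyNullNormaliser()  # default: no normalisation

    @property
    def intensityData(self):
        # Every read passes the stored data through the attached normaliser
        return self.Normalisation.normalise(self._raw)


dataset = ToyDataset([[1.0, 3.0], [2.0, 2.0]])
before = dataset.intensityData                    # raw values, unchanged
dataset.Normalisation = ToyTotalAreaNormaliser()  # swap the normaliser
after = dataset.intensityData                     # each row now sums to 1
```

In this sketch the stored data are never modified; only the values returned by intensityData change when the normaliser is replaced.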
There are three built-in normalisation objects:
- Null normaliser (NullNormaliser): no normalisation performed
- Probabilistic quotient normaliser (ProbabilisticQuotientNormaliser): performs probabilistic quotient normalisation (Dieterle et al. [1])
- Total area normaliser (TotalAreaNormaliser): performs normalisation where each row (sample) is divided by the total sum of its variables (columns)
Normalisation Syntax and Parameters
The main function parameters (which may be of interest to advanced users) are as follows:
The utilities module implements several Normaliser objects that perform intensity normalisation on the provided numpy matrix.
All normaliser objects must implement the Normaliser abstract base class.
Normalisers may be configured as required upon initialisation; a normalised view of a matrix is then obtained by passing the data to be normalised to the normalise() method.
Once normalise() has been called, the normalisation coefficients last used can be obtained from normalisation_coefficients.
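The configure, normalise, inspect workflow described above can be sketched with a minimal abstract base class. SketchNormaliser and SketchMaxNormaliser are hypothetical names invented for this illustration, and scaling by the row maximum is an arbitrary placeholder rule, not one of the built-in normalisers; the real Normaliser base class may define its interface differently.

```python
from abc import ABC, abstractmethod

import numpy as np


class SketchNormaliser(ABC):
    """Hypothetical interface mirroring the description above."""

    @property
    @abstractmethod
    def normalisation_coefficients(self):
        """Coefficients from the most recent normalise() call."""

    @abstractmethod
    def normalise(self, X):
        """Return a normalised version of X."""


class SketchMaxNormaliser(SketchNormaliser):
    """Illustrative only: scales each row by its maximum value."""

    def __init__(self):
        self._coefficients = None  # nothing calculated yet

    @property
    def normalisation_coefficients(self):
        # Read-only: no setter is defined, so assignment raises AttributeError
        return self._coefficients

    def normalise(self, X):
        X = np.asarray(X, dtype=float)
        self._coefficients = X.max(axis=1)
        return X / self._coefficients[:, np.newaxis]


normaliser = SketchMaxNormaliser()                           # 1. initialise
coefficients_before = normaliser.normalisation_coefficients  # None so far
normalised = normaliser.normalise([[1.0, 2.0], [4.0, 8.0]])  # 2. normalise
coefficients_after = normaliser.normalisation_coefficients   # 3. inspect
```

Exposing the coefficients through a property without a setter is one simple way to let callers read the last coefficients without overwriting them.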
- class nPYc.utilities.normalisation.NullNormaliser
  Null normalisation object which performs no normalisation, returning the provided matrix unchanged when normalise() is called.
  - normalisation_coefficients
    Returns the normalisation coefficients.
    Returns: 1
  - normalise(X)
    Returns X unchanged.
    Parameters: X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
    Returns: The original X matrix without any modification
    Return type: numpy.ndarray, shape [n_samples, n_features]
- class nPYc.utilities.normalisation.ProbabilisticQuotientNormaliser(reference=None, referenceDescription=None)
  Normalisation object which performs Probabilistic Quotient normalisation (Dieterle et al., Analytical Chemistry, 78(13):4281–90, 2006).
  Parameters:
  - reference (str, int, or numpy.ndarray) – Source of the reference profile. If None, use the median of X; if an int, treat it as the index of a spectrum in X to use as the reference; if an array with the same width as X, treat it as the reference profile.
  - referenceDescription (None, or str) – A textual description of the reference provided
  - keepMagnitude (bool) – If True, scales X such that the mean area of X remains constant for the dataset as a whole.
  - normalisation_coefficients
    Returns the last set of normalisation coefficients calculated.
    Returns: Normalisation coefficients, or None if they have not been generated yet
    Raises: AttributeError – Setting the normalisation coefficients directly is not allowed and raises an error
  - reference
    Allows the reference profile used to calculate fold-changes to be queried or set.
    Returns: The reference profile used to calculate normalisation coefficients
  - normalise(X)
    Apply Probabilistic Quotient normalisation to a dataset.
    Parameters:
    - X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
    - reference (numpy.ndarray, shape [n_features]) – Spectrum to use as the normalisation reference
    Returns: A read-only, normalised view of X
    Return type: numpy.ndarray, shape [n_samples, n_features]
    Raises: ValueError – If X is not a numpy 2-d array representing a data matrix
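As a rough illustration of the technique, the following numpy sketch follows the usual steps of probabilistic quotient normalisation: pick a reference profile, compute per-feature quotients against it, and divide each sample by the median of its quotients. The function pqn_sketch is a hypothetical helper written for this example, not the nPYc implementation. It assumes a strictly positive reference, does not accept a str reference, and omits the keepMagnitude scaling and the read-only view.

```python
import numpy as np


def pqn_sketch(X, reference=None):
    """Probabilistic quotient normalisation, simplified for illustration."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError("X must be a 2-d array [n_samples, n_features]")

    if reference is None:
        reference = np.median(X, axis=0)   # median profile of X
    elif isinstance(reference, (int, np.integer)):
        reference = X[reference, :]        # a spectrum of X, chosen by index
    else:
        reference = np.asarray(reference, dtype=float)

    # Fold-change of every feature relative to the reference profile
    quotients = X / reference
    # The most probable quotient for each sample is taken to be the median
    coefficients = np.median(quotients, axis=1)
    return X / coefficients[:, np.newaxis], coefficients


# Three samples sharing one profile at dilutions of 1x, 2x and 4x
profile = np.array([1.0, 2.0, 4.0, 8.0])
X = np.outer([1.0, 2.0, 4.0], profile)

normalised, coefficients = pqn_sketch(X)                # median reference
normalised_ref0, coefficients_ref0 = pqn_sketch(X, 0)   # first row reference
```

Because the three rows differ only by dilution, the normalised rows come out identical; with the first row as the reference, the coefficients recover the dilution factors directly.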
- class nPYc.utilities.normalisation.TotalAreaNormaliser(keepMagnitude=True)
  Normalisation object which performs Total Area normalisation. Each row in the matrix provided will be scaled to sum to the same value.
  Parameters: keepMagnitude (bool) – If True, scales X such that the mean area of X remains constant for the dataset as a whole.
  - normalisation_coefficients
    Returns the last set of normalisation coefficients calculated.
    Returns: Normalisation coefficients, or None if they have not been generated yet
    Raises: AttributeError – Setting the normalisation coefficients directly is not allowed and raises an error
  - normalise(X)
    Apply Total Area normalisation to the dataset.
    Parameters: X (numpy.ndarray, shape [n_samples, n_features]) – Data intensity matrix
    Returns: A read-only, normalised view of X
    Return type: numpy.ndarray, shape [n_samples, n_features]
    Raises: ValueError – If X is not a numpy 2-d array representing a data matrix
[1] Frank Dieterle, Alfred Ross, Götz Schlotterbeck and Hans Senn. Probabilistic quotient normalization as robust method to account for dilution of complex biological mixtures. Application in 1H NMR metabonomics. Analytical Chemistry, 78(13):4281–90, 2006. URL: https://pubs.acs.org/doi/10.1021/ac051632c