
Published: September 1, 2005
Calibration of an immunochemical assay for a new biological marker

While a sound calibration strategy may require considerable resources, this key investment helps ensure that assays deliver reliable measurements.

By: David C. Sogin

Protein reagents are loaded into the Architect automated immunoassay system by Abbott Laboratories (Abbott Park, IL).

New biological markers discovered in clinical research can aid in diagnosis and treatment of disease. Many of these markers are large proteins, although recently even small molecules present at relatively low concentrations (such as homocysteine) have been of interest for diagnosis.1

Academic laboratories and IVD manufacturers often work together to identify clinically relevant markers. Once a viable candidate is identified, the prospective assay manufacturer needs to devise a stable calibration scheme.

The calibration strategy is key to developing an assay robust enough to differentiate patients with a disease from patients free of the disease in a given population. The strategy entails establishing calibrator commutability, which is a consistent relationship of the signals generated by the calibrators to those generated by the specimens, and demonstrating traceability to a well-characterized standard, which facilitates regulatory approval of new assays.

Many of the new biological markers are present in specimens at concentrations as low as picograms per milliliter. When a new protein marker is identified, the candidate molecule is typically purified to a level at which it can be used as an antigen for preparation of either monoclonal or polyclonal antibodies that will recognize it and bind to it.

Often the biological role of a new protein marker is not known, and no specific assay is available to assess how much of the native structure remains in the isolated protein. This lack of knowledge of native structure may not affect the standardization, but detailed biophysical information increases the likelihood that a quantitative assay can be kept under stable control over the long term.

While speed to market is often a key imperative, development of a robust calibration method that can provide consistent results over time is a competing goal. Since the first assay on the market is likely to define the unit of measurement along with the significant medical decision points, the assay developer should carefully consider all the issues of calibration and associated traceability to a reference material, especially if no internationally recognized material exists.

A prominent example of a lack of agreement between assays is the wide range of reported values for troponin I.2 The recognition of the lack of agreement led to the cooperative effort to identify a reference material that could be used to achieve standardization across several commercial assays.

Figure 1. Calibrating a new protein assay.

The candidate reference material identified is now available from the National Institute of Standards and Technology (NIST) as SRM 2921.3 The material was isolated from human hearts and underwent extensive characterization at NIST before a concentration value was assigned. Further studies are being conducted, under the guidance of the Committee on Standardization of Markers of Cardiac Damage, International Federation of Clinical Chemistry and Laboratory Medicine, to confirm that the use of this standard will improve the comparability of troponin I results across different assays.

Establishing a Baseline

Since the enactment of the European IVD Directive in 1998, the concept of traceability has received significant attention.4 Traceability allows assays to be standardized by calibrating measurements to a common reference material or reference method.

As defined in the International Vocabulary of Metrology, traceability is a “property of the result of a measurement or the value of a standard whereby it can be related to stated references, usually national or international standards, through an unbroken chain of comparisons all having stated uncertainties.”5

Ideally, it should be possible to trace the patient result up to a starting reference material of high metrological order whose concentration can be expressed in Système International (SI) units. For most quantitative clinical assays, the unit is expressed as mass concentration when a biological activity is not being measured. Ensuring that the measured values are metrologically true improves the likelihood of obtaining comparable assay results across time points, locations, and methodologies. This approach is less effective for larger proteins, due to greater differences in methodology between assays. Furthermore, the lack of recognized reference methods hinders the unambiguous determination of the true value in a specimen.

ISO document 17511 covers many critically important concepts of calibration, traceability, and validation of quantitative diagnostic assays.6 The process starts by defining the measurand, which is the amount of analyte, and the matrix in which it is analyzed, such as the mass of protein X per unit of volume or the level of glucose in serum, plasma, urine, or cerebrospinal fluid.

Most IVD manufacturers would like to claim that the assay is measuring the mass of a particular protein per unit of specimen volume, but the situation is far more complicated. For a typical immunoassay, a monoclonal antibody bound to a solid phase is used to capture a molecule of interest. Then a second monoclonal antibody conjugated to a signal-generating moiety is added after appropriate wash steps.

For such a two-epitope assay, a hypothetical measurand could be described as follows: in one milliliter, the amount of protein captured by epitope A, as detected by the amount of antibody conjugate against epitope B that generates a signal, under the sequence control of analyzer type XYZ. It is important that both epitopes of interest be present and that the chosen reference material be stable with respect to these epitopes. ISO document 15194 guidelines on how to certify a reference material can aid in developing a robust, sustainable calibration system.7

Choosing a Standard That Will Last

The material chosen as a standard should be available in sufficient quantities to support the calibration of the assay for an extended period of time and over different lots of reagent (see Figure 1). The designated physical-reference material at the top of the traceability chain exists as a single entity and is independent of other batches or lots of the material isolated in a similar manner. Its replacement requires full characterization, with the added requirement that the performance of the replacement material in the field be correlated to that of the predecessor. Consequently it is key that the initial qualification of the reference material be done carefully. Given the significant effort needed to properly qualify a reference material, it is critical to forecast future requirements and procure sufficient supplies. The physical characterization of the reference material can provide valuable information for preparing the long-term plan.

An important requirement for qualifying a reference material is to have independent analytical methods based on first principles, i.e., basic physicochemical methods that can identify physical changes in the reference material. These methods are independent of the field assay, typically a commercial immunoassay being used to test patient specimens. Physical changes can then be correlated with the performance of the assay.

Several physicochemical techniques, including amino acid analysis, electrophoretic techniques, mass spectrometry, and epitope identification, can be used to assess the purity and concentration of a protein. Procedures more appropriate for complex carbohydrates or nucleic acids can also be applied.

In forced-degradation studies, the reference material is intentionally degraded to cause loss of immunoactivity, which is correlated with an analytical method independent of the customer assay. The knowledge from these studies can be used to develop the needed tools for monitoring the integrity of the candidate material over time.

A common approach is to store the reference material in the form in which it is most stable (often as a lyophilized powder) and then incubate it at various temperatures for biophysical characterization at various times. Although liquid solutions can be incubated at elevated temperatures, an irrelevant inactivation product may be identified. Approaches taken in the pharmaceutical industry can provide some guidance on dealing with this problem.8,9 While these studies could be used to estimate expected dating, their primary purpose is to provide monitoring tools, as the degraded reference material is analyzed and the results are compared to those obtained with the field method. Such studies can predict likely degradation products and their effect on assay calibration.

The use of an independent reference method to assess the integrity of the reference material provides added assurance that the material is continuing to serve as a stable anchor for the assay. Note that the physicochemical analysis of the reference material does not need to be made at the same concentration levels as those found in patient specimens. The physical characterization is completed with the purified reference material and is correlated with the performance of the proposed field method within the expected concentration range of the assay. Monitoring the reference material with the independent analytical method can provide an early warning as to when the reference material requires replacement. The specification for allowable loss of concentration or allowable degradation of the reference material is based on the allowable bias of the assay as it relates to its clinical utility.
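As a rough illustration of how an allowable-bias specification might translate into an allowable loss of reference material, the sketch below makes two simplifying assumptions that are not part of the discussion above: the assay signal is directly proportional to concentration, and the degraded material is still used at its originally assigned value. Under those assumptions, a fractional loss x inflates reported specimen values by 1/(1 - x) - 1. The function names are hypothetical.

```python
def bias_from_loss(fractional_loss):
    """Positive fractional bias in reported results when the reference
    material has lost `fractional_loss` of its concentration but keeps
    its assigned value (assumes a proportional signal response)."""
    if not 0.0 <= fractional_loss < 1.0:
        raise ValueError("fractional_loss must be in [0, 1)")
    return 1.0 / (1.0 - fractional_loss) - 1.0


def allowable_loss(allowable_bias):
    """Largest fractional loss that keeps the bias within `allowable_bias`
    (inverse of bias_from_loss under the same assumption)."""
    if allowable_bias < 0.0:
        raise ValueError("allowable_bias must be non-negative")
    return allowable_bias / (1.0 + allowable_bias)


# Example: how much loss a 5% bias allowance would tolerate
print(f"{allowable_loss(0.05):.4f}")
```

In practice only part of the total bias budget would normally be allotted to reference-material degradation, so the usable limit would be tighter than this calculation suggests.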

Once the reference material is identified, its use should be restricted to the manufacturer's working calibrator, which is used only internally and for value transfer to the reagent kit calibrators. In the absence of a recognized reference method for value assignment, the method of value transfer is left to the manufacturer.

While it is often convenient to use the field assay for value assignment, this may not always be the best choice. The method for value transfer should address many of the requirements outlined in ISO document 15193.10 Not all the requirements are relevant, since the assay is for the manufacturer's internal use only. However, this document does provide a useful list of potential concerns. In establishing a calibrator value assignment algorithm, the statistical approach is critical, since the accuracy and bias of the assay will be affected by the testing scheme.

The kit calibrators can use a material similar to the reference or, alternatively, a different material whose properties are more appropriate for a particular marketed assay. For example, using a native protein may be cost prohibitive for a kit calibrator, or the native material may be unstable over the desired shelf life of the customer calibration sets. Thus, in a liquid kit calibrator, a recombinant protein might demonstrate better stability than the native protein. Purity may not be critical so long as the material provides a stable concentration of the required epitopes and performs in the assay in a consistent manner over different lots of kit calibrators and reagents.

Ensuring Consistency of Measurements

Figure 2. Assay validation.

A calibrator should demonstrate commutability, which is a constant ratio between the analytical signals the calibrator generates in the proposed customer assay and those generated by human specimens at similar concentrations. Commutability is a property of calibration materials (either the reference material used or the actual kit calibrators).

The choice of the calibration analyte and matrix composition can affect the degree of commutability. While often it is assumed that the use of a serum or plasma devoid of the analyte of interest makes the best choice for a matrix, this is not always the case. The matrix must also support a stable calibration, and at times this requirement eliminates the use of natural biological fluids such as serum.

When there are differences in matrices, characterization of the impact on the signal generated is key. In many immunoassays there is a positive or negative shift in the calibration curve due to the matrix. Careful study of the performance of the assay is required to determine the degree to which these effects come into play. Standard addition of the analyte is one approach used to assess these effects.
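A standard-addition experiment of this kind can be analyzed with a simple straight-line fit. The following is a minimal sketch, assuming the signal is linear over the spiked range (which will not hold across the full range of many immunoassays). The function names and the example numbers are hypothetical.

```python
def standard_addition(added, signals):
    """Fit signal = slope * added + intercept by ordinary least squares.

    Returns (slope, intercept, endogenous), where endogenous is the
    estimated concentration already present in the unspiked specimen,
    i.e. intercept / slope.
    """
    n = len(added)
    if n != len(signals) or n < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(added) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in added)
    if sxx == 0:
        raise ValueError("added amounts must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(added, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, intercept / slope


def matrix_effect_percent(slope_in_specimen, slope_in_calibrator_matrix):
    """Percent difference in response slope between the two matrices."""
    return 100.0 * (slope_in_specimen / slope_in_calibrator_matrix - 1.0)


# Hypothetical spikes (pg/mL added) and signals (arbitrary units)
slope, intercept, endogenous = standard_addition([0, 10, 20, 30],
                                                 [50, 150, 250, 350])
print(slope, intercept, endogenous)
```

Comparing the slope obtained in specimens with the slope obtained in the calibrator matrix gives one estimate of the size and direction of the matrix shift.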

The characterization of the assay performance with calibrators compared to that with actual specimens is then used to determine the appropriate measurement equation. The equation addresses any differences between the signal generated by the analyte in calibrators compared to specimen results. The data are used to determine offsets and weighting factors, and to define the most appropriate equation for a clinically consistent result.
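To make the idea of offsets and weighting factors concrete, here is a minimal sketch that assumes a straight-line calibration fitted by weighted least squares. Many immunoassays need a nonlinear curve instead, so this should be read as an illustration of the role of the weights and the offset, not as a recommended model. The function names and the `matrix_offset` parameter are hypothetical.

```python
def weighted_linear_fit(concentrations, signals, weights):
    """Weighted least-squares fit of signal = slope * concentration + intercept."""
    if not (len(concentrations) == len(signals) == len(weights)) or len(weights) < 2:
        raise ValueError("need at least two points with matching weights")
    total_w = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, concentrations)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, signals)) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, concentrations))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, concentrations, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_signal(signal, slope, intercept, matrix_offset=0.0):
    """Invert the fitted line, then subtract an assumed matrix offset
    (in concentration units) found when calibrators and specimens were compared."""
    return (signal - intercept) / slope - matrix_offset


# Hypothetical calibrator levels; 1/x^2 weights emphasize the low end
levels = [1.0, 2.0, 5.0, 10.0]
signals = [7.0, 9.0, 15.0, 25.0]
weights = [1.0 / c ** 2 for c in levels]
slope, intercept = weighted_linear_fit(levels, signals, weights)
print(concentration_from_signal(25.0, slope, intercept))
```

Weighting by 1/x^2 is one common choice when the measurement variance grows with concentration; the appropriate weights for a given assay would come from its own precision data.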

Once the calibration equation is established, the effects of the lot-to-lot variability of the reagents, manufacturer's working calibrators, and kit calibrators need to be addressed for commutability. A lack of commutability indicates that additional research and development is needed to identify the cause and correct the deficiency.

Throughout the traceability chain, the calibrators must demonstrate commutability at each value-transfer step through to the reported result from the patient specimen. The consistent determination of analyte concentration in the human specimen is all that is of concern to the clinician.

Commutability is demonstrated by comparing the result of the specimen to each reference preparation or calibrator up the traceability chain. The ratios of these signals to the specimen result should be the same.
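The ratio comparison can be sketched in a few lines. The example assumes that each material in the chain has been measured at a concentration similar to that of the specimen, and it uses a 5% tolerance purely as a placeholder. The function names, signal values, and tolerance are hypothetical.

```python
def signal_ratios(specimen_signal, calibrator_signals):
    """Ratio of each calibrator's signal to the specimen signal."""
    if specimen_signal <= 0:
        raise ValueError("specimen_signal must be positive")
    return {name: signal / specimen_signal
            for name, signal in calibrator_signals.items()}


def ratios_consistent(ratios, tolerance=0.05):
    """True if every ratio lies within `tolerance` (fractional) of the mean ratio."""
    values = list(ratios.values())
    mean = sum(values) / len(values)
    return all(abs(v - mean) / mean <= tolerance for v in values)


# Hypothetical signals for the three levels of the traceability chain
ratios = signal_ratios(1000.0, {"reference material": 1100.0,
                                "working calibrator": 1090.0,
                                "kit calibrator": 1110.0})
print(ratios_consistent(ratios))
```

A material whose ratio drifts away from the others at any step points to the value-transfer stage where commutability has broken down.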

Further, for specimens containing the same concentration of the analyte, the numerical values of the results should remain constant. For a single lot of reagents, different lots of calibrators should yield identical results, regardless of the path to the reference material. All of this assumes that different lots of assay reagents do not vary in the signal generated as similar samples are run. No calibration and associated traceability to a higher-order standard can correct sensitivity to sample reagent variability.

Testing the Assay

Assuming that the calibration scheme has been sufficiently characterized and the assay performance is acceptable, validation of the assay can proceed. A protocol should address all the components of the calibration of the assay and demonstrate with a sufficient number of lots that there will be consistency in the performance of the assay over time.

One validation scheme (illustrated in Figure 2) can be described as follows. Three lots of the manufacturer's working calibrators are prepared, and one of the lots is used to prepare three lots of customer calibrators. The usual approach of varying the critical components of the calibrators is appropriate (except for the reference material for the working calibrators, which should be constant). All three lots of the manufacturer's working calibrators are used to calibrate a single lot of assay reagents. The same reagent lot is then used to determine the apparent concentration of a series of patient specimens that cover the intended concentration range of the assay. The procedure is then repeated with the three lots of the kit calibrators whose values were transferred from a single lot of working calibrators. Ideally, the reported specimen values for calibrations determined from the three lots of working calibrators should be the same (within specifications for lot-to-lot variability). Acceptance criteria are based on the expected accuracy of the assay.

Similar agreement of reported results for the panel of human specimens is expected with the customer calibrator lots. Note that, depending on the measurement equation, specimen values obtained with the working calibrators and those obtained with the customer calibrators may not agree. Assuming that commutability is established for the various calibrator lots with a single lot of assay reagents, the impact of reagent lot variability is then addressed. In validation runs, a single lot of calibrators is used to set the calibration curve for three or more reagent lots. The same panel of human specimens is then run against these multiple lots of reagents. The acceptance criterion is again based on the projected performance of the assay.
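The lot comparisons described above reduce to the same calculation whether the lots being compared are working calibrators, kit calibrators, or reagents. The sketch below reports the worst-case deviation from the across-lot mean for a specimen panel. The function name, the lot labels, and the panel values are hypothetical, and the acceptance limit would come from the assay's own accuracy requirements.

```python
def max_lot_deviation_percent(results_by_lot):
    """Largest percent deviation of any lot's result from the across-lot
    mean, taken over all specimens in the panel.

    results_by_lot maps a lot label to a list of specimen results, with
    every list in the same specimen order.
    """
    lots = list(results_by_lot.values())
    if len(lots) < 2:
        raise ValueError("need results from at least two lots")
    panel_size = len(lots[0])
    if any(len(lot) != panel_size for lot in lots):
        raise ValueError("every lot must report the same specimen panel")
    worst = 0.0
    for values in zip(*lots):  # one tuple of results per specimen
        mean = sum(values) / len(values)
        deviation = max(abs(v - mean) for v in values) / mean * 100.0
        worst = max(worst, deviation)
    return worst


# Hypothetical panel of two specimens run against three calibrator lots
panel = {"lot A": [9.0, 100.0], "lot B": [10.0, 100.0], "lot C": [11.0, 100.0]}
print(max_lot_deviation_percent(panel))
```

Running the same function on results grouped by reagent lot, with a single calibrator lot held fixed, covers the second stage of the validation.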

Estimating the uncertainty of the calibrator levels, calibration, multiple instruments, and reported results is essential in establishing an acceptance criterion for the performance of the assay. These calculations require information about the value assignment of the reference material, the manufacturing process contribution (including processing and testing), instrument variations, and actual assay performance.

Various components and lots of components need to be compared for their ability to define the same patient population. The uncertainty estimates can predict the expected error dispersion, as various components are combined to produce results. A useful guide for calculating uncertainty is found in the Eurachem/CITAC guide, “Traceability in Chemical Measurements.”11
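When the contributing sources of uncertainty can be treated as independent, a common way to combine them is the root sum of squares of the relative standard uncertainties, followed by multiplication by a coverage factor. The sketch below assumes independence and uses a coverage factor of 2 as a conventional default. The component names and values are hypothetical.

```python
import math


def combined_relative_uncertainty(components):
    """Root-sum-of-squares of independent relative standard uncertainties."""
    return math.sqrt(sum(u ** 2 for u in components.values()))


def expanded_uncertainty(value, relative_uncertainty, coverage_factor=2.0):
    """Expanded uncertainty in the same units as `value`."""
    return coverage_factor * relative_uncertainty * value


# Hypothetical relative standard uncertainties (as fractions)
budget = {
    "reference material value assignment": 0.02,
    "manufacturing process": 0.03,
    "instrument-to-instrument variation": 0.02,
    "assay imprecision": 0.04,
}
u_rel = combined_relative_uncertainty(budget)
print(f"{u_rel:.4f}", f"{expanded_uncertainty(50.0, u_rel):.2f}")
```

Correlated components would need covariance terms that this sketch leaves out.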

Additional Safeguards

While the long-term anchoring of the assay can be accomplished by use of a well-characterized stable reference material, this alone will not guarantee that the assay will adequately distinguish between specimens from healthy patients and specimens from patients with a pathological condition.

To ensure constant discrimination of samples, a panel of authentic samples stored frozen can be used to confirm that the same population is identified. Of course, this is not always practical, since in some cases only fresh samples are appropriate. In these cases, an alternative surrogate approach may be used.

Ideally, the availability of a certified reference assay for each protein analyte would ensure a consistent distribution of patient values. Unfortunately, the development of such assays has remained elusive for most highly sensitive immunoassays for specific proteins, and for now, assay manufacturers are dependent on anchoring assays to stable reference materials.


1. Homocysteine in Health and Disease, ed. R Carmel and DW Jacobsen (New York: Cambridge University Press, 2001).

2. AHB Wu et al., “Characterization of Cardiac Troponin Subunit Release into Serum after Acute Myocardial Infarction and Comparison of Assays for Troponin T and I,” Clinical Chemistry 44, no. 6 (1998): 1198–1208.

3. RH Christenson et al., “Standardization of Cardiac Troponin I Assays: Round Robin of Ten Candidate Reference Materials,” Clinical Chemistry 47, no. 3 (2001): 431–437.

4. “Directive 98/79/EC of the European Parliament and of the Council of 27 October 1998 on In Vitro Diagnostic Medical Devices,” Official Journal of the European Communities L 331 (1998): 1–37.

5. International Vocabulary of Basic and General Terms in Metrology (VIM), International Organization for Standardization Web site (Geneva: ISO, 1993 [cited 2 August 2005]); available from Internet: www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=40721&scopelist=PROGRAMME.

6. In Vitro Diagnostic Medical Devices—Measurement of Quantities in Biological Samples—Metrological Traceability of Values Assigned to Calibrators and Control Materials, ISO 17511, International Organization for Standardization Web site (Geneva: ISO, 2003 [cited 2 August 2005]); available from Internet: www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=30716&ICS1=11&ICS2=100&ICS3=10.

7. In Vitro Diagnostic Systems—Measurement of Quantities in Samples of Biological Origin—Description of Reference Materials, ISO 15194, International Organization for Standardization Web site (Geneva: ISO, 2002 [cited 2 August 2005]); available from Internet: www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=26306&ICS1=11&ICS2=100&ICS3=10.

8. DW Reynolds, KL Facchine, and JF Mullaney, “Available Guidance and Best Practices for Conducting Forced Degradation Studies,” Pharmaceutical Technology (February 2002): 48–56.

9. R Stevenson, “Stability Indicating and Forced Degradation Assays for Proteins '04: It's About Time,” American Biotechnology (November 2004): 5–6.

10. In Vitro Diagnostic Systems—Measurement of Quantities in Samples of Biological Origin—Presentation of Reference Measurement Procedures, ISO 15193, International Organization for Standardization Web site (Geneva: ISO, 2002 [cited 2 August 2005]); available from Internet: www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=26305&ICS1=11&ICS2=100&ICS3=10.

11. Eurachem/CITAC Guide: Traceability in Chemical Measurements (2003 [cited 2 August 2005]); available from Internet: www.measurementuncertainty.org/mu/search/index.html.

David C. Sogin, PhD, is a research scientist at Abbott Laboratories (Abbott Park, IL). He can be reached at david.sogin@abbott.com.

Copyright ©2005 IVD Technology
