16th International HLA and Immunogenetics Workshop : Liverpool : 28th May - 3rd June 2012

Projects

16th IHIWS Registry Diversity Project

Martin Maiers, Carlheinz Müller, Steven Marsh

Goal

Comparison, validation and improvement of tools for HLA haplotype frequency analysis specifically designed to address issues of registry datasets:

  • Global ethnic diversity
  • Population sub-structure determination (clustering)
  • Geographical stratification (GeoCoding)
  • Missing data for one or more loci
  • very large samples
  • heterogeneous typing resolution and methods
  • Hardy-Weinberg deviation
  • Research on data collected for clinical purposes
  • Incorporation of available family data
  • Direct utilization of primary typing data (sequences)

The term “Haplotype” was first introduced by Ceppellini in 1967 in the context of Immunogenetics. Since then the concept has become central to full-genome analysis projects such as HapMap and the Human Genome Diversity Project.

Much of the mainstream computational work in haplotype estimation focuses on classical genetic data, which are characterized by:

  1. a large number of markers (>20),
  2. low polymorphism of the markers (<5 alleles),
  3. a low number of genotyped individuals (<10 000),
  4. a sampling/recruitment process that is generally research-oriented and yields homogeneous samples,
  5. low haplotype diversity (<500).

Immunogenetic data challenge haplotype estimation in special ways. The diversity and complexity of Immunogenetic data make the computational problem far more difficult. For MHC genes or markers, including copy number variants in the HLA class II region or in the KIR cluster, we have:

  1. very large sample sizes (frequently >100 000);
  2. heterogeneity of typing resolution, typing techniques and allele nomenclatures, together with the ongoing discovery of new alleles;
  3. a low number of loci (roughly <20);
  4. a large number of alleles per locus (roughly >50);
  5. high haplotype diversity (roughly >1000).

In addition, ethnic background diversity in Immunogenetic data is high enough to potentially sub-structure the population sample. As a consequence, Immunogenetic data frequently include several sub-groups of subjects of varying ancestry. Furthermore, familial data are often accessible because the road to registry-facilitated transplantation typically starts with the typing of families.

Many tools for haplotype frequency estimation do not meet the needs of the Immunogenetics community, and specifically of the HSC registry community. As a result, individual registries have developed in-house tools, very few of which have been published or made publicly available. This situation has created a need for collaboration between registries on this systematic tool assessment and complex simulation project.

The central concern of the working group is the tools for analysing the data rather than the data itself. Management of large registry datasets has specific requirements which do not fit the scope of the more traditional tools that come from the anthropology and population genetics components of the workshops.

The clinical community needs global haplotype frequency data in ways that go far beyond the scope of research interests in population and evolutionary genetics. Haplotype frequencies are used in managing donor registries: prioritizing lists of potentially matched donors for searching, targeted donor recruitment, optimal registry size computation, optimal recruitment strategies, and cost vs. benefit analyses of typing quality. Haplotype frequencies estimated at the population level are used at the level of the individual for computing phenotype frequencies, assigning haplotype phase, and imputing high resolution typing. The accuracy of these individual-level applications depends on the global accuracy of the population haplotype frequencies. The purpose of these tools is to provide better access to HSC transplantation. The need for extensive evaluation of the population haplotype frequencies therefore has direct implications for the patient's search experience.
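As an illustration of the individual-level use described above, the following is a minimal sketch of how population haplotype frequencies could yield a phenotype frequency and phase probabilities for one unphased typing. It assumes Hardy-Weinberg proportions and fully resolved alleles. The function name, allele labels and frequencies are hypothetical and are not taken from any registry tool.

```python
from itertools import product


def phase_posteriors(genotype, hap_freqs):
    """Return (phenotype_frequency, {(hap1, hap2): posterior}).

    genotype: tuple of per-locus allele pairs, e.g. (("A1", "A2"), ("B7", "B8")).
    hap_freqs: dict mapping haplotype tuples to population frequencies.
    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    # Enumerate every way of assigning one allele per locus to the first
    # haplotype; the second haplotype receives the remaining alleles.
    pairs = set()
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(locus[c] for locus, c in zip(genotype, choice))
        h2 = tuple(locus[1 - c] for locus, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))

    weights = {}
    for h1, h2 in pairs:
        f1 = hap_freqs.get(h1, 0.0)
        f2 = hap_freqs.get(h2, 0.0)
        # f^2 for two identical haplotypes, 2*f1*f2 for two distinct ones.
        weights[(h1, h2)] = f1 * f2 if h1 == h2 else 2 * f1 * f2

    total = sum(weights.values())  # frequency of the unphased phenotype
    posteriors = {k: (w / total if total > 0 else 0.0) for k, w in weights.items()}
    return total, posteriors


# Hypothetical two-locus haplotype frequencies, for illustration only.
example_freqs = {
    ("A1", "B7"): 0.4,
    ("A2", "B8"): 0.3,
    ("A1", "B8"): 0.2,
    ("A2", "B7"): 0.1,
}
freq, phases = phase_posteriors((("A1", "A2"), ("B7", "B8")), example_freqs)
```

Because every posterior is a ratio of products of haplotype frequencies, errors in the population estimates carry over directly into the phase assignment for each individual.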

The IHIWS registry diversity working group is an opportunity for collaboration on the specific challenges faced by this community and provides a systematic way to use simulations to validate and evaluate registry analysis tool implementations.
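To make the idea of simulation-based validation concrete, here is a small sketch under stated assumptions: unphased genotypes are drawn from known haplotype frequencies under Hardy-Weinberg proportions, the frequencies are re-estimated with a textbook expectation-maximization (EM) procedure, and the estimates are compared with the known values. EM is used here only as a familiar stand-in. It is not presented as the method of any registry tool, and all names and frequencies are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def simulate_genotypes(hap_freqs, n, rng):
    """Draw n unphased genotypes by pairing two haplotypes at random."""
    haps = list(hap_freqs)
    weights = [hap_freqs[h] for h in haps]
    genotypes = []
    for _ in range(n):
        h1, h2 = rng.choices(haps, weights=weights, k=2)
        # Sorting the two alleles at each locus discards the phase.
        genotypes.append(tuple(tuple(sorted(pair)) for pair in zip(h1, h2)))
    return genotypes


def compatible_pairs(genotype):
    """All unordered haplotype pairs consistent with an unphased genotype."""
    pairs = set()
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(locus[c] for locus, c in zip(genotype, choice))
        h2 = tuple(locus[1 - c] for locus, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, iterations=100):
    """Textbook EM estimate of haplotype frequencies from unphased genotypes."""
    counts = Counter(genotypes)
    expansions = {g: compatible_pairs(g) for g in counts}
    haps = {h for pairs in expansions.values() for pair in pairs for h in pair}
    freqs = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    total_haps = 2 * len(genotypes)
    for _ in range(iterations):
        expected = dict.fromkeys(haps, 0.0)
        for g, n in counts.items():
            # E-step: weight each compatible pair by its current probability.
            w = [freqs[a] * freqs[b] * (1 if a == b else 2) for a, b in expansions[g]]
            norm = sum(w)
            if norm == 0:
                continue
            for (a, b), wi in zip(expansions[g], w):
                share = n * wi / norm
                expected[a] += share
                expected[b] += share
        # M-step: expected haplotype counts become the new frequencies.
        freqs = {h: expected[h] / total_haps for h in haps}
    return freqs


# Hypothetical "true" frequencies used to generate the simulated sample.
true_freqs = {
    ("A1", "B7"): 0.4,
    ("A2", "B8"): 0.3,
    ("A1", "B8"): 0.2,
    ("A2", "B7"): 0.1,
}
sample = simulate_genotypes(true_freqs, 5000, random.Random(0))
estimated = em_haplotype_frequencies(sample)
max_error = max(abs(estimated.get(h, 0.0) - f) for h, f in true_freqs.items())
```

The same comparison could be repeated with missing loci, mixed typing resolution or population sub-structure added to the simulated data, in order to see how an estimation tool copes with the registry-specific issues listed under the project goal.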

Contacts:
Please let us know if you would like to participate in the workshop.  We will provide you with further information as the project proceeds.

Email: mmaiers [at] NMDP [dot] ORG

Download a PDF of the EFI 2011 Presentation.

Download a PDF of the ASHI 2011 Presentation.

May 09, 2011 Posted by Admin