Workshop

Genetics of Complex Disease
February 09, 2004 - February 13, 2004
Registration Deadline: February 13, 2004
To apply for Funding you must register by: November 09, 2003
Parent Program: --
Organizers Jun Liu, Mary Sara McPeek, Richard Olshen (chair), David O. Siegmund, and Wing Wong
Description
This workshop is sponsored by MSRI and in part by Affymetrix, Aventis, Bristol-Myers Squibb and Pfizer. It will be held February 9-13, 2004 at the Mathematical Sciences Research Institute in Berkeley. Its topic is the genetics of complex human disease. The goals can be classified by subject matter according to a variety of criteria, discussed below; but in human terms the goal is simple. We will bring together individuals who work at the forefront of laboratory research (perhaps also with patients) and others whose related activities are also cutting edge, but for whom the emphases are algorithmic, probabilistic, or statistical. While there are many conferences on specific topics from among those we cite, there are very few that span "bench to computer to bedside" topics and at the same time spare nothing in mathematical sophistication when suitable applications of mathematics can shed light on important biomedical problems.

With respect to human diseases, the appropriate technologies, algorithms and clinical thinking are differentiated to a degree by the disease of interest. Important examples are cancer, autoimmune diseases, cardiovascular diseases, and lipid abnormalities. To some extent study of the human immune system holds these topics together, but there is much that is special to each topic. For example, much recent interest in cancer -- including but not limited to retinoblastoma, cancers of the head and neck, breast, and lung -- has focused on loss of heterozygosity and comparative genomic hybridization, topics we will explore in lectures and discussion.

By now there are numerous chips that are brought to bear on these diseases and others. "Historically," meaning, perhaps, one to five years ago, chips were primarily of two types: cDNA chips, with 500-5000 base pairs studied at once, and oligonucleotide DNA chips, where attention is restricted to 20-80mers of DNA. Analysis of these and other chips and wafers will be topics of concern to us.
One could think of the study of complex human disease as being analogous to a triangle, where scientists in the corners have their own emphases, but meet in the middle and interact. One corner concerns finding DNA markers, i.e., polymorphic sites, preferably in coding or regulatory regions of genes, that bear upon disease. While some might see tension between studying allele sharing and identity by descent from linkage data on the one hand and association analysis of candidate genes on the other, we take it as a challenge to combine information from these different approaches. The use of animal models, where linkage analysis is easier than in humans, combined with identification of candidate genes in humans through homology searches, provides another important tool. Linkage analysis has stimulated interest in first passage problems for Ornstein-Uhlenbeck processes to deal with problems of multiple testing. Candidate gene studies have stimulated interest in supervised learning, which in statistics is often called "classification." An important example is the study of angiotensinogen and protein tyrosine phosphatase as they bear upon hypertension. Cytochrome P450 genes are part of the study of all cited processes of disease.

A second corner of the triangle concerns understanding genetic control and gene expression. There are now related data from beads and other technologies, too. Analyses of these data might be in terms of supervised learning (when the outcome/phenotype is given) and unsupervised learning, or clustering (when the outcome/phenotype is not). Clusters may be distinct or overlapping. Nearly always the clustering is of data that can be modeled as though they are points in Euclidean spaces, but where the cardinality of the sample pales by comparison with the dimension of the relevant space.
With supervised learning there is, typically, a finite set of outcomes, the "covering diagnoses," and the goal is to classify, that is, to assign, each vector of expression values to a diagnosis. Some examples of interest in this area have been different flavors of hematopoietic malignancy. Most but not all classifiers of recent interest derive from "voting methods," such as the celebrated AdaBoost method.

The third corner of the triangle is concerned with the direct studies of proteins and their interactions, for example by time-of-flight mass spectrometry. Typical output here is a curve, or family of curves, with geometry (or geometries) that may apply to a particular genetic profile or disease. One approach of interest could be that of extracting a parsimonious set of basis functions for families of curves and representing a curve of interest to within specified discrepancy in a suitable norm. Perhaps low fractional Besov norms are relevant. Once a suitable basis and corresponding expansions are computed, we are back in the problem of supervised or unsupervised learning, as the case may be. Since what we get are the weights of proteins, it is imperative to be able to do the "inverse problem" of inferring the protein from the molecular weight. This could bring us to concerns of "fast table lookup," which has been important to streaming video over the Web and other problems.

None of the above is meant to preclude interest in problems of evolution, which bear upon our subject matter through the identification of regions in proteins that are conserved across organisms and in evolutionary analysis of various pathogens, nor in problems in more traditional genetic epidemiology and statistical genetics. The latter can bring us to models where unconditional distributions are mixtures of Gaussians or other smooth distributions, and where sometimes distinctions between inference conditional on some data and unconditionally are blurred.
The resulting inferential and computational issues can be very subtle.

We on the committee that is organizing our workshop have contacted many individuals. Most are very interested in participating. Below please find a list of some of the individuals who will participate in our workshop. Although these are well known senior scientists, we are also committed to encouraging many young and creative, but less well known, individuals to join us.

Warren Ewens, Professor of Biology, University of Pennsylvania (Winner, Weldon Memorial Prize, Oxford University, 2002)
Joe W. Gray, Professor of Laboratory Medicine and Radiation Oncology; Principal Investigator, UCSF Comprehensive Cancer Center, University of California, San Francisco
Jun Liu, Professor of Statistics and of Biostatistics, Harvard University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 2002)
Mary Sara McPeek, Associate Professor, Departments of Statistics and Human Genetics; Member, Committee on Genetics, University of Chicago
Richard Olshen, Professor of Health Research and Policy (Biostatistics) and (by Courtesy) of Electrical Engineering and Statistics, Stanford University
Thomas Quertermous, William G. Irwin Professor in Cardiovascular Medicine Research; Chief, Division of Cardiovascular Medicine, Stanford University
Koustubh Ranade, Pharmaceutical Research Institute, Bristol-Myers Squibb, Princeton, New Jersey
Neil Risch, Professor of Genetics and (by Courtesy) of Health Research and Policy and of Statistics, Stanford University; Adjunct Investigator, Division of Research, Kaiser Permanente, Northern California
David Siegmund, John D. and Sigrid Banks Professor and Professor of Statistics, Stanford University
Mark Skolnick, Chief Scientific Officer, Myriad Genetics, Inc.
Terry Speed, Professor of Statistics, University of California, Berkeley; Head, Division of Bioinformatics, Walter & Eliza Hall Institute of Medical Research, Melbourne, Australia
Robert Tibshirani, Professor of Health Research and Policy (Biostatistics) and (by Courtesy) of Statistics, Stanford University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 1996)
Wing Hung Wong, Professor of Computational Biology, Department of Biostatistics, and Professor of Statistics, Harvard University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 1993)
Funding & Logistics

Funding

To apply for funding, you must register by the funding application deadline displayed above.

Students, recent PhDs, women, and members of underrepresented minorities are particularly encouraged to apply. Funding awards are typically made 6 weeks before the workshop begins. Requests received after the funding deadline are considered only if additional funds become available.

Lodging

For information about recommended hotels for visits of under 30 days, visit Short-Term Housing. Questions? Contact coord@slmath.org.

Schedule, Notes/Handouts & Videos
Feb 09, 2004
Monday
08:00 AM - 05:00 PM
  Detecting Genes from Data on Related Individuals
Elizabeth Thompson
08:00 AM - 05:00 PM
  Sequence-based Prediction of HIV-1 Replication Capacity
Mark Segal
08:00 AM - 05:00 PM
  Mapping Tumor Suppressor Genes Using Loss of Heterozygosity
Fred Wright
08:00 AM - 05:00 PM
  Resampling-based multiple testing procedures: Applications to microarray data analysis
Sandrine Dudoit (University of California, Berkeley)
08:00 AM - 05:00 PM
  Alleles, entropy, and locations: Which SNPs do you put on a chip?
Earl Hubbell
08:00 AM - 05:00 PM
  Sample classification from protein mass spectroscopy by peak probability contrasts
Robert Tibshirani
08:00 AM - 05:00 PM
  Normalization Methods of cDNA Microarray Experiments under Different Designs
Chao Agnes Hsiung
08:00 AM - 05:00 PM
  Analysis of oligonucleotide SNP array data
Cheng Li
08:00 AM - 05:00 PM
  Gene mapping in model organisms
Karl Broman
08:00 AM - 05:00 PM
  Haplotype Block Partition and its applications to association studies
Fengzhu Sun
08:00 AM - 05:00 PM
  A Statistical Sampling Algorithm for RNA Secondary Structure Prediction
Chip Lawrence
08:00 AM - 05:00 PM
  Efficient simulation of p-values for linkage analysis
Eleanor Feingold
08:00 AM - 05:00 PM
  Genomic Reconstruction of Yeast Transcription Networks
Hao Li
08:00 AM - 05:00 PM
  Obtaining evolutionary clues to protein structural mechanisms
Andrew Neuwald
08:00 AM - 05:00 PM
  Dictionary models for regulatory regions in DNA and gene expression arrays
Chiara Sabatti
08:00 AM - 05:00 PM
  Association Testing with Mendel
Ken Lange
08:00 AM - 05:00 PM
  Combinatorial Approaches to the Haplotype Phasing Problem
Richard Karp (University of California, Berkeley)
08:00 AM - 05:00 PM
  Identifying recombination hotspots from LD in the human genome
Matthew Stephens
08:00 AM - 05:00 PM
  Finding genes associated with multiple sclerosis
Terry Speed
08:00 AM - 05:00 PM
  Genetic complexity in cancer
Joe Gray (Lawrence Berkeley Laboratory)
08:00 AM - 05:00 PM
  Change-point methods for the analysis of array-based DNA copy number data
Adam Olshen
08:00 AM - 05:00 PM
  Thoughts on the TDT
Warren Ewens
08:00 AM - 05:00 PM
  Microarray profiling to identify vascular wall genes for association based candidate gene studies of atherosclerosis: genomics meets genetics
Thomas Quertermous
08:00 AM - 05:00 PM
  Comparing Genomes to Study Disease
Eddy Rubin
08:00 AM - 05:00 PM
  Novel Multivariate Analysis Methods for Genomic Analysis
Nik Schork
08:00 AM - 05:00 PM
  A Gene Recommender for C. elegans
Art Owen
08:00 AM - 05:00 PM
  Mapping QTL in the presence of gene-covariate
David Siegmund (Stanford University)
08:00 AM - 05:00 PM
  An approach to obtain tight clusters
Wing Wong (Stanford University)
08:00 AM - 05:00 PM
  A probabilistic framework for the statistics of selective sampling
Benjamin Yakir
08:00 AM - 05:00 PM
  Statistical Methods for Analysis of Microarray Time Course Gene Expression Data
Hongzhe Li
08:00 AM - 05:00 PM
  Inference of Ancestry for Admixed Groups
Hua Tang
08:00 AM - 05:00 PM
  Mapping of Transcription Factor Sites along Human Chromosomes 21 and 22 Using Genome Tiling Arrays
Stefan Bekiranov
08:00 AM - 05:00 PM
  Algebraic Statistical Genetics: Linkage Analysis
Ingileif Hallgrimsdottir
08:00 AM - 05:00 PM
  Identification of polymorphisms that explain a linkage peak
Josee Dupuis
08:00 AM - 05:00 PM
  Tree-structured Supervised Learning and the Genetics of Hypertension
Richard Olshen
08:00 AM - 05:00 PM
  Linkage Analysis of Longitudinal Data and Study Design Considerations
Heping Zhang