Workshop

Genetics of Complex Disease

February 09, 2004 - February 13, 2004

Registration Deadline:	February 13, 2004
To apply for Funding you must register by:	November 09, 2003

Organizers

Jun Liu, Mary Sara McPeek, Richard Olshen (chair), David O. Siegmund, and Wing Wong

Speaker(s)

Mark Abney
Donald Anderson
Stefan Bekiranov
John Blangero
Karl Broman
Aravinda Chakravarti
Sandrine Dudoit (University of California, Berkeley)
Josee Dupuis
Warren Ewens
Eleanor Feingold
Nanxiang Ge
Joe Gray (Lawrence Berkeley Laboratory)
Sacha Gutin
Ingileif Hallgrimsdottir
Chao Hsiung
Earl Hubbell
Richard Karp (University of California, Berkeley)
Augustine Kong
Leonid Kruglyak
Jerry Lanchbury
Kenneth Lange
Charles Lawrence
Laura Lazzeroni
Cheng Li
Hao Li
Hongzhe Li
Rob Lipshutz
Jun Liu (Harvard University)
Mary Sara McPeek (University of Chicago)
Richard Mott
Andrew Neuwald
Michael Newton
Dan Nicolae
Magnus Nordborg
Adam Olshen
Art Owen
Jonathan Pritchard
Thomas Quertermous
Daniel Rabinowitz
Koustubh Ranade
Neil Risch
Kathryn Roeder
Eddy Rubin
Chiara Sabatti
Nicholas Schork
Mark Segal
David Siegmund (Stanford University)
Mark Skolnick
Terence Speed (University of California, Berkeley)
Matthew Stephens
Fengzhu Sun
Hua Tang
Simon Tavare
Elizabeth Thompson
Robert Tibshirani
Michael Waterman (University of Southern California)
Wing Wong (Stanford University)
Fred Wright
Benjamin Yakir
Richard Zare
Heping Zhang
Hongyu Zhao

Description

This workshop is sponsored by MSRI and in part by Affymetrix, Aventis, Bristol-Myers Squibb and Pfizer. It will be held February 9-13, 2004 at the Mathematical Sciences Research Institute in Berkeley. Its topic is the genetics of complex human disease. The goals can be classified by subject matter according to a variety of criteria, discussed below; but in human terms the goal is simple. We will bring together individuals who work at the forefront of laboratory research (perhaps also with patients) and others whose related activities are also cutting edge, but for whom the emphases are algorithmic, probabilistic, or statistical. While there are many conferences on specific topics from among those we cite, there are very few that span "bench to computer to bedside" topics and at the same time spare nothing in mathematical sophistication when suitable applications of mathematics can shed light on important biomedical problems.
With respect to human diseases, the appropriate technologies, algorithms and clinical thinking are differentiated to a degree by the disease of interest. Important examples are cancer, autoimmune diseases, cardiovascular diseases, and lipid abnormalities. To some extent study of the human immune system holds these topics together, but there is much that is special to each topic. For example, much recent interest in cancer (including but not limited to retinoblastoma and cancers of the head and neck, breast, and lung) has focused on loss of heterozygosity and comparative genomic hybridization, topics we will explore in lectures and discussion. By now there are numerous chips that are brought to bear on these diseases and others. "Historically," meaning, perhaps, one to five years ago, chips were primarily of two kinds: the cDNA type, with 500-5000 base pairs studied at once, and oligonucleotide DNA chips, where attention is restricted to 20-80mers of DNA. Analysis of these and other chips and wafers will be topics of concern to us.
One could think of the study of complex human disease as being analogous to a triangle, where scientists in the corners have their own emphases, but meet in the middle and interact. One corner concerns finding DNA markers, i.e., polymorphic sites, preferably in coding or regulatory regions of genes, that bear upon disease. While there may appear to be tension between studying allele sharing and identity by descent from linkage data on the one hand and association analysis of candidate genes on the other, we take it as a challenge to combine information from these different approaches. The use of animal models, where linkage analysis is easier than in humans, combined with identification of candidate genes in humans through homology searches, provides another important tool. Linkage analysis has stimulated interest in first passage problems for Ornstein-Uhlenbeck processes to deal with problems of multiple testing. Candidate gene studies have stimulated interest in supervised learning, which in statistics is often called "classification." An important example is the study of angiotensinogen and protein tyrosine phosphatase as they bear upon hypertension. Cytochrome P450 genes are part of the study of all cited processes of disease.
A second corner of the triangle concerns understanding genetic control and gene expression. There are now related data from beads and from other technologies, too. Analyses of these data might be in terms of supervised learning (when the outcome/phenotype is given) or of unsupervised learning, that is, clustering (when the outcome/phenotype is not). Clusters may be distinct or overlapping. Nearly always the clustering is of data that can be modeled as though they are points in Euclidean spaces, but where the cardinality of the sample pales by comparison with the dimension of the relevant space.
With supervised learning there is, typically, a finite set of outcomes, the "covering diagnoses," and the goal is to classify, that is, to assign each vector of expression values to a diagnosis. Some examples of interest in this area have been different flavors of hematopoietic malignancy. Most but not all classifiers of recent interest derive from "voting methods," such as the celebrated AdaBoost method.
The third corner of the triangle is concerned with the direct study of proteins and their interactions, for example by time-of-flight mass spectrometry. Typical output here is a curve, or family of curves, with geometry (or geometries) that may apply to a particular genetic profile or disease. One approach of interest could be that of extracting a parsimonious set of basis functions for families of curves and representing a curve of interest to within a specified discrepancy in a suitable norm. Perhaps low fractional Besov norms are relevant. Once a suitable basis and corresponding expansions are computed, we are back in the problem of supervised or unsupervised learning, as the case may be. Since what we get are the weights of proteins, it is imperative to be able to solve the "inverse problem" of inferring the protein from the molecular weight. This could bring us to concerns of "fast table lookup" that have been important to streaming video over the Web and other problems.
None of the above is meant to preclude interest in problems of evolution, which bear upon our subject matter through the identification of regions in proteins that are conserved across organisms and in evolutionary analysis of various pathogens, nor in problems in more traditional genetic epidemiology and statistical genetics. The latter can bring us to models where unconditional distributions are mixtures of Gaussians or other smooth distributions, and where distinctions between inference conditional on some data and unconditional inference are sometimes blurred.
The resulting inferential and computational issues can be very subtle.
We on the committee that is organizing our workshop have contacted many individuals. Most are very interested in participating. Below please find a list of some of the individuals who will participate in our workshop. Although these are well known senior scientists, we are also committed to encouraging many young and creative, but less well known, individuals to join us.
Warren Ewens, Professor of Biology, University of Pennsylvania (Winner, Weldon Memorial Prize, Oxford University, 2002)
Joe W. Gray, Professor of Laboratory Medicine and Radiation Oncology; Principal Investigator, UCSF Comprehensive Cancer Center, University of California, San Francisco
Jun Liu, Professor of Statistics and of Biostatistics, Harvard University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 2002)
Mary Sara McPeek, Associate Professor, Departments of Statistics and Human Genetics; Member, Committee on Genetics, University of Chicago
Richard Olshen, Professor of Health Research and Policy (Biostatistics) and (by Courtesy) of Electrical Engineering and Statistics, Stanford University
Thomas Quertermous, William G. Irwin Professor in Cardiovascular Medicine Research Chief, Division of Cardiovascular Medicine, Stanford University
Koustubh Ranade, Pharmaceutical Research Institute, Bristol-Myers Squibb, Princeton, New Jersey
Neil Risch, Professor of Genetics and (by Courtesy) of Health Research and Policy and of Statistics, Stanford University; Adjunct Investigator, Division of Research, Kaiser Permanente, Northern California
David Siegmund, John D. and Sigrid Banks Professor and Professor of Statistics, Stanford University
Mark Skolnick, Chief Scientific Officer, Myriad Genetics, Inc.
Terry Speed, Professor of Statistics, University of California, Berkeley; Head, Division of Bioinformatics, Walter & Eliza Hall Institute of Medical Research, Melbourne, Australia
Robert Tibshirani, Professor of Health Research and Policy (Biostatistics) and (by Courtesy) of Statistics, Stanford University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 1996)
Wing Hung Wong, Professor of Computational Biology, Department of Biostatistics, and Professor of Statistics, Harvard University (Winner of Presidents' Award, Committee of Presidents of Statistical Societies, 1993)

Funding & Logistics

Funding

To apply for funding, you must register by the funding application deadline displayed above.

All are welcome to apply for funding, including students and recent PhDs. Funding awards are typically made 6 weeks before the workshop begins. Requests received after the funding deadline are considered only if additional funds become available.

Lodging

For information about recommended hotels for visits of under 30 days, visit Short-Term Housing. Questions? Contact coord@slmath.org.

Directions to Venue

Map & Directions to SLMath

Visa/Immigration

Visa Information

Schedule, Notes/Handouts & Videos

Feb 09, 2004
Monday

08:00 AM - 05:00 PM

Detecting Genes from Data on Related Individuals
Elizabeth Thompson

08:00 AM - 05:00 PM

Sequence-based Prediction of HIV-1 Replication Capacity
Mark Segal

08:00 AM - 05:00 PM

Mapping Tumor Suppressor Genes Using Loss of Heterozygosity
Fred Wright

08:00 AM - 05:00 PM

Resampling-based multiple testing procedures: Applications to microarray data analysis
Sandrine Dudoit (University of California, Berkeley)

08:00 AM - 05:00 PM

Alleles, entropy, and locations: Which SNPs do you put on a chip?
Earl Hubbell

08:00 AM - 05:00 PM

Sample classification from protein mass spectroscopy by peak probability contrasts
Robert Tibshirani

08:00 AM - 05:00 PM

Normalization Methods of cDNA Microarray Experiments under Different Designs
Chao Agnes Hsiung

08:00 AM - 05:00 PM

Analysis of oligonucleotide SNP array data
Cheng Li

08:00 AM - 05:00 PM

Gene mapping in model organisms
Karl Broman

08:00 AM - 05:00 PM

Haplotype Block Partition and its applications to association studies
Fengzhu Sun

08:00 AM - 05:00 PM

A Statistical Sampling Algorithm for RNA Secondary Structure Prediction
Chip Lawrence

08:00 AM - 05:00 PM

Efficient simulation of p-values for linkage analysis
Eleanor Feingold

08:00 AM - 05:00 PM

Genomic Reconstruction of Yeast Transcription Networks
Hao Li

08:00 AM - 05:00 PM

Obtaining evolutionary clues to protein structural mechanisms
Andrew Neuwald

08:00 AM - 05:00 PM

Dictionary models for regulatory regions in DNA and gene expression arrays
Chiara Sabatti

08:00 AM - 05:00 PM

Association Testing with Mendel
Ken Lange

08:00 AM - 05:00 PM

Combinatorial Approaches to the Haplotype Phasing Problem
Richard Karp (University of California, Berkeley)

08:00 AM - 05:00 PM

Identifying recombination hotspots from LD in the human genome
Matthew Stephens

08:00 AM - 05:00 PM

Finding genes associated with multiple sclerosis
Terry Speed

08:00 AM - 05:00 PM

Genetic complexity in cancer
Joe Gray (Lawrence Berkeley Laboratory)

08:00 AM - 05:00 PM

Change-point methods for the analysis of array-based DNA copy number data
Adam Olshen

08:00 AM - 05:00 PM

Thoughts on the TDT
Warren Ewens

08:00 AM - 05:00 PM

Microarray profiling to identify vascular wall genes for association based candidate gene studies of atherosclerosis: genomics meets genetics
Thomas Quertermous

08:00 AM - 05:00 PM

Comparing Genomes to Study Disease
Eddy Rubin

08:00 AM - 05:00 PM

Novel Multivariate Analysis Methods for Genomic Analysis
Nik Schork

08:00 AM - 05:00 PM

A Gene Recommender for C. elegans
Art Owen

08:00 AM - 05:00 PM

Mapping QTL in the presence of gene-covariate
David Siegmund (Stanford University)

08:00 AM - 05:00 PM

An approach to obtain tight clusters
Wing Wong (Stanford University)

08:00 AM - 05:00 PM

A probabilistic framework for the statistics of selective sampling
Benjamin Yakir

08:00 AM - 05:00 PM

Statistical Methods for Analysis of Microarray Time Course Gene Expression Data
Hongzhe Li

08:00 AM - 05:00 PM

Inference of Ancestry for Admixed Groups
Hua Tang

08:00 AM - 05:00 PM

Mapping of Transcription Factor Sites along Human Chromosomes 21 and 22 Using Genome Tiling Arrays
Stefan Bekiranov

08:00 AM - 05:00 PM

Algebraic Statistical Genetics: Linkage Analysis
Ingileif Hallgrimsdottir

08:00 AM - 05:00 PM

Identification of polymorphisms that explain a linkage peak
Josee Dupuis

08:00 AM - 05:00 PM

Tree-structured Supervised Learning and the Genetics of Hypertension
Richard Olshen

08:00 AM - 05:00 PM

Linkage Analysis of Longitudinal Data and Study Design Considerations
Heping Zhang