conStruct
One of the first steps in the analysis of genetic data, and a principal mission of biology, is to describe and categorize natural variation. A continuous pattern of differentiation (isolation by distance), where individuals found closer together in space are, on average, more genetically similar than individuals sampled farther apart, can confound attempts to categorize natural variation into groups. This is because current statistical methods for assigning individuals to discrete clusters cannot accommodate spatial patterns, and so are forced to use clusters to describe what is in fact continuous variation. As isolation by distance is common in nature, this is a substantial shortcoming of existing methods.
To address this, I developed a new method, conStruct, with my collaborators Graham Coop and Peter Ralph. conStruct is a statistical method for categorizing natural genetic variation, one that describes variation as a combination of continuous and discrete patterns. It is a model-based clustering method that uses genetic information to infer discrete population structure and to assign the ancestry of a set of genotyped samples among a user-specified number of discrete groups.
Instructions for installing conStruct can be found on its GitHub page here. The manuscript describing the method can be found here.