Title: On the use of resampling tests for evaluating statistical significance of binding-site co-occurrence
Authors: Huen, David S; Russell, Steven R
Issue Date: 30-Jun-2010
Citation: BMC Bioinformatics 2010, 11:359
Abstract: Background: In eukaryotes, most DNA-binding proteins exert their action as members of large effector complexes. The presence of these complexes is revealed in high-throughput genome-wide assays by the co-occurrence of the binding sites of different complex components. Resampling tests are one route by which the statistical significance of apparent co-occurrence can be assessed. Results: We have investigated two resampling approaches for evaluating the statistical significance of binding-site co-occurrence. The permutation test approach was found to yield overly favourable p-values, while the independent resampling approach had the opposite effect and is of little use in practical terms. We have developed a new, pragmatically devised hybrid approach that, when applied to the experimental results of a Polycomb/Trithorax study, yielded p-values consistent with the findings of that study. We extended our investigations to the FL method developed by Haiminen et al., which derives its null distribution from all binding sites within a dataset, and show that the p-value computed for a pair of factors by this method can depend on which other factors are included in that dataset. Both our hybrid method and the FL method appeared to yield plausible estimates of the statistical significance of co-occurrences, although our hybrid method was more conservative when applied to the Polycomb/Trithorax dataset. A high-performance parallelized implementation of the hybrid method is available. Conclusions: We propose a new resampling-based co-occurrence significance test and demonstrate that it performs as well as or better than existing methods on a large experimentally derived dataset. We believe it can be usefully applied to data from high-throughput genome-wide techniques such as ChIP-chip or DamID. The Cooccur package, which implements our approach, accompanies this paper.
Description: RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license, which is similar to the 'Creative Commons Attribution Licence'. In brief, you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.
URI: http://www.dspace.cam.ac.uk/handle/1810/237606
http://dx.doi.org/10.1186/1471-2105-11-359
Appears in Collections: Scholarly works - Genetics
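
Illustrative note: the abstract refers to resampling tests for binding-site co-occurrence without describing their mechanics. The sketch below shows a generic textbook permutation test under stated assumptions. It is not the hybrid method, the FL method, or the Cooccur package from the paper. The bin representation, the function names (cooccurrence_count, permutation_p_value) and the toy data are all hypothetical.

```python
import random


def cooccurrence_count(a, b):
    """Count bins in which both factors are marked as bound."""
    return sum(1 for x, y in zip(a, b) if x and y)


def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """One-sided permutation p-value for the co-occurrence of a and b.

    Assumption: the genome is split into equal bins and each factor is a
    list of booleans (bound / not bound). The null distribution is built
    by shuffling the bin labels of b, which treats all bins as
    exchangeable. This is a simplification and not the paper's procedure.
    """
    rng = random.Random(seed)
    observed = cooccurrence_count(a, b)
    shuffled = list(b)
    at_least_as_large = 0
    for _ in range(n_resamples):
        rng.shuffle(shuffled)
        if cooccurrence_count(a, shuffled) >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (at_least_as_large + 1) / (n_resamples + 1)


# Hypothetical toy data: 200 bins, two factors with heavily overlapping sites.
n_bins = 200
factor_a = [i < 40 for i in range(n_bins)]        # bound in bins 0..39
factor_b = [10 <= i < 50 for i in range(n_bins)]  # bound in bins 10..49

observed, p_value = permutation_p_value(factor_a, factor_b)
print(f"observed co-occurrences: {observed}, permutation p-value: {p_value:.4f}")
```

A shuffle of this kind ignores features such as the clustering of binding sites along the genome, so the choice of null model strongly affects the resulting p-values. The abstract reports that the resampling schemes the authors compared gave p-values that were either overly favourable or too conservative.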

Files in This Item:

File                      Size        Format
1471-2105-11-359.xml      86.54 kB    XML
1471-2105-11-359.pdf      748.33 kB   Adobe PDF
1471-2105-11-359-S1.GZ    138.04 kB   Unknown


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.