readJuicer( file, chromosomes = NULL, pairs = NULL, unit = c("BP", "FRAG"), resolution, verbose = FALSE )
file | Path to the hic file, either a local path or a remote path. A full list of remote files is available at http://aidenlab.org/data.html. |
---|---|
chromosomes | A vector of chromosome names, for example c('chr1', 'chr2'). The resulting contact matrices will include every pair of interactions among them (chr1_chr1, chr1_chr2, chr2_chr2). |
pairs | A vector of chromosome pairs. Each pair is written as '1_1' or 'chr1_chr1'; both denote the contacts between chromosome 1 and chromosome 1. If |
unit | The unit of the resolution. Only 'BP' (base pair) and 'FRAG' (fragment) are supported. |
resolution | The desired resolution of the contact matrix. The resolution must be a value from the following list. |
verbose | TRUE or FALSE. Whether to print information while reading. |
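
To make the relationship between `chromosomes` and `pairs` concrete, here is a minimal base R sketch. The helpers `expand_pairs` and `normalize_pair` are hypothetical names introduced only for illustration and are not part of FreeHiCLite; they mimic the behaviour described in the table, namely that a chromosome vector expands to every unordered pair (self-pairs included) and that 'chr1_chr1' and '1_1' refer to the same pair.

```r
# Hypothetical helper (not part of FreeHiCLite): list every unordered pair,
# including self-pairs, for a vector of chromosome names.
expand_pairs <- function(chromosomes) {
  n <- length(chromosomes)
  keys <- character(0)
  for (i in seq_len(n)) {
    for (j in i:n) {
      keys <- c(keys, paste(chromosomes[i], chromosomes[j], sep = "_"))
    }
  }
  keys
}

# Hypothetical helper: drop an optional 'chr' prefix from each side of a pair
# so that 'chr1_chr1' and '1_1' compare equal.
normalize_pair <- function(pair) {
  gsub("(^|_)chr", "\\1", pair)
}

expand_pairs(c("chr1", "chr2"))
normalize_pair(c("chr1_chr2", "1_2"))
```

Passing `pairs` directly is the better choice when only a few of the combinations are needed, since the expansion grows quadratically with the number of chromosomes.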
A list object that includes the following items.
contact
A list containing all the contact matrices. The keys of the list are coded as chrA_chrB.
information
A list containing basic information about the contact matrix.
genomeID
The genome id of the current hic file.
resolution
The list of resolutions available in the current hic file.
chromosomeSizes
A data frame of chromosome sizes (chromosome, size). It can be used with the Juicer pre function for a different genome.
settings
A list containing the file name, unit, resolution, and chromosomes.
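
As a sketch of how the `information` item might be used, the base R snippet below builds a small mock of it from values shown in the example output on this page (only three chromosomes are included). It checks whether a resolution is available and writes the chromosome sizes to a tab-separated file. The helper `has_resolution` is hypothetical, and the assumption that the Juicer pre command accepts such a headerless size file should be checked against the Juicer tools documentation.

```r
# Mock of the `information` item, using values from the example output on
# this page; a real object would list all chromosomes.
information <- list(
  genomeID = "hg19",
  resolution = list(BP = c(2500000L, 500000L, 5000L), FRAG = integer(0)),
  chromosomeSizes = data.frame(
    chromosome = c("1", "10", "11"),
    size = c(249250621L, 135534747L, 135006516L),
    stringsAsFactors = FALSE
  )
)

# Hypothetical helper: is the requested resolution available for this unit?
has_resolution <- function(information, unit, resolution) {
  resolution %in% information$resolution[[unit]]
}

has_resolution(information, "BP", 500000L)    # available in the mock
has_resolution(information, "FRAG", 500000L)  # no FRAG resolutions in the mock

# Write a headerless, tab-separated chromosome size file (assumed format).
sizesFile <- tempfile(fileext = ".chrom.sizes")
write.table(information$chromosomeSizes, file = sizesFile, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)
readLines(sizesFile)
```

Checking the resolution up front avoids requesting a value that the file does not contain.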
This function is heavily adapted from both the Java version of the Juicebox Dump function and the C++ version of straw.
    library(FreeHiCLite)

    ## Remote file location. Reading a remote file includes downloading it, which may take a while
    remoteFilePath = 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined.hic'

    ## Local file location
    localFilePath = system.file('extdata', 'example.hic', package = 'FreeHiCLite')

    ## Chromosomes to be extracted
    chromosomes = c('chr1', 'chr2')

    ## Pairs to be extracted
    pairs = c('1_1', '1_2')
    unit = 'BP'
    resolution = 500000L

    ## Pass chromosomes into the function; the result will contain all the interaction pairs
    dat <- readJuicer(file=localFilePath, chromosomes=chromosomes, pairs = NULL, unit=unit, resolution=resolution)
    print(names(dat[['contact']]))
    #> [1] "1_1" "1_2" "2_2"

    ## Pass pairs into the function; the result will contain only the given pairs
    start = Sys.time()
    dat <- readJuicer(file=localFilePath, chromosomes=NULL, pairs = pairs, unit=unit, resolution=resolution)
    end = Sys.time()
    print(end - start)
    #> Time difference of 0.05015302 secs
    str(dat)
    #> List of 3
    #>  $ contact    :List of 2
    #>   ..$ 1_1: int [1:87158, 1:3] 500000 500000 1000000 500000 1000000 1500000 500000 1000000 1500000 2000000 ...
    #>   .. ..- attr(*, "dimnames")=List of 2
    #>   .. .. ..$ : NULL
    #>   .. .. ..$ : chr [1:3] "x" "y" "counts"
    #>   ..$ 1_2: int [1:69341, 1:3] 2500000 4000000 4500000 5000000 5500000 6000000 7000000 8500000 9500000 10000000 ...
    #>   .. ..- attr(*, "dimnames")=List of 2
    #>   .. .. ..$ : NULL
    #>   .. .. ..$ : chr [1:3] "x" "y" "counts"
    #>  $ information:List of 4
    #>   ..$ genomeID       : chr "hg19"
    #>   ..$ resolution     :List of 2
    #>   .. ..$ BP  : int [1:3] 2500000 500000 5000
    #>   .. ..$ FRAG: int(0)
    #>   ..$ pairs          : chr [1:6] "1_1" "1_2" "1_3" "2_2" ...
    #>   ..$ chromosomeSizes:'data.frame': 26 obs. of 2 variables:
    #>   .. ..$ chromosome: chr [1:26] "1" "10" "11" "12" ...
    #>   .. ..$ size      : int [1:26] 249250621 135534747 135006516 133851895 115169878 107349540 102531392 90354753 81195210 78077248 ...
    #>  $ settings   :List of 4
    #>   ..$ unit       : chr "BP"
    #>   ..$ resolution : int 500000
    #>   ..$ chromosomes: NULL
    #>   ..$ file       : chr "/Library/Frameworks/R.framework/Versions/4.0/Resources/library/FreeHiCLite/extdata/example.hic"
    #>  - attr(*, "class")= chr [1:2] "juicer" "freehic"

    ## Access each contact matrix
    contacts = dat[['contact']]

    ## chromosome 1 vs chromosome 1
    key = '1_1'
    contactMatrix <- contacts[[key]]
    head(contactMatrix)
    #>            x       y counts
    #> [1,]  500000  500000    384
    #> [2,]  500000 1000000    231
    #> [3,] 1000000 1000000   1272
    #> [4,]  500000 1500000     47
    #> [5,] 1000000 1500000    373
    #> [6,] 1500000 1500000   1665

    #>            x y counts
    #> [1,] 2500000 0      1
    #> [2,] 4000000 0      1
    #> [3,] 4500000 0      1
    #> [4,] 5000000 0      2
    #> [5,] 5500000 0      1
    #> [6,] 6000000 0      1
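
Each contact matrix is returned as three columns (x, y, counts) rather than as a square matrix. The sketch below shows one way to expand such a table into a dense symmetric matrix in base R. It uses the first six rows of the '1_1' output above as mock data. The helper `to_dense` is hypothetical, and it assumes that x and y are bin start positions in base pairs on the same chromosome.

```r
# First six rows of the '1_1' contact matrix from the example output above.
contactMatrix <- matrix(
  c(500000L,  500000L,  384L,
    500000L,  1000000L, 231L,
    1000000L, 1000000L, 1272L,
    500000L,  1500000L, 47L,
    1000000L, 1500000L, 373L,
    1500000L, 1500000L, 1665L),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("x", "y", "counts"))
)
resolution <- 500000L

# Hypothetical helper: map bin start positions to 1-based bin indices and fill
# both triangles of a dense matrix. Assumes an intra-chromosomal matrix.
to_dense <- function(triplets, resolution) {
  i <- triplets[, "x"] %/% resolution + 1L
  j <- triplets[, "y"] %/% resolution + 1L
  n <- max(i, j)
  dense <- matrix(0L, nrow = n, ncol = n)
  dense[cbind(i, j)] <- triplets[, "counts"]
  dense[cbind(j, i)] <- triplets[, "counts"]
  dense
}

dense <- to_dense(contactMatrix, resolution)
dense[2:4, 2:4]
```

A dense matrix is convenient for plotting small regions, but at fine resolutions the three-column form is far more memory efficient, so this conversion is best limited to sub-regions.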