Read a .hic file generated by Juicebox. Both local and remote files are supported.

readJuicer(
  file,
  chromosomes = NULL,
  pairs = NULL,
  unit = c("BP", "FRAG"),
  resolution,
  verbose = FALSE
)

Arguments

file

The file name, given as either a local path or a remote path. A full list of remote paths is available at http://aidenlab.org/data.html.

chromosomes

A vector of chromosome names. For example, with c('chr1', 'chr2') the resulting contact matrices include every pairwise interaction (chr1_chr1, chr1_chr2, chr2_chr2).

pairs

A vector of chromosome pairs. Each pair is written as '1_1' or 'chr1_chr1'; both mean the contacts between chromosome 1 and chromosome 1. If pairs is given, the chromosomes argument is ignored.
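
The relationship between chromosomes and pairs can be sketched in plain R. The helper below, chromosomesToPairs, is hypothetical and not part of FreeHiCLite; it only illustrates which pair keys a chromosomes vector implies (every unordered pair, including each chromosome with itself). The package may normalise names differently: the keys returned in the Examples section appear without the 'chr' prefix.

```r
## Hypothetical helper (not part of FreeHiCLite): list the pair keys
## implied by a chromosomes vector, i.e. every unordered pair,
## including each chromosome paired with itself.
chromosomesToPairs <- function(chromosomes) {
  n <- length(chromosomes)
  keys <- character(0)
  for (i in seq_len(n)) {
    for (j in i:n) {
      keys <- c(keys, paste(chromosomes[i], chromosomes[j], sep = "_"))
    }
  }
  keys
}

pairKeys <- chromosomesToPairs(c("chr1", "chr2"))
print(pairKeys)
```

Passing only the pairs of interest through the pairs argument avoids reading combinations that will not be used.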

unit

The unit of the resolution. Only 'BP' (base pair) and 'FRAG' (fragment) are supported.

resolution

The desired resolution of the contact matrix. It must be one of the values in the following list.

  • unit: BP

    • 2500000

    • 1000000

    • 500000

    • 250000

    • 100000

    • 50000

    • 25000

    • 10000

    • 5000

  • unit: FRAG

    • 500

    • 250

    • 100

    • 50

    • 20

    • 5

    • 2

    • 1
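
As a rough illustration, a resolution can be checked against the list above before calling readJuicer. The helper isValidResolution is hypothetical and not part of FreeHiCLite. A given .hic file may contain only a subset of these values; the resolution entry of the returned information list reports what the file actually provides.

```r
## Resolutions copied from the list above.
validResolutions <- list(
  BP = c(2500000, 1000000, 500000, 250000, 100000, 50000, 25000, 10000, 5000),
  FRAG = c(500, 250, 100, 50, 20, 5, 2, 1)
)

## Hypothetical helper (not part of FreeHiCLite): TRUE when the
## resolution is listed for the chosen unit.
isValidResolution <- function(resolution, unit = c("BP", "FRAG")) {
  unit <- match.arg(unit)
  resolution %in% validResolutions[[unit]]
}

okForBP <- isValidResolution(500000L, "BP")
okForFRAG <- isValidResolution(500000L, "FRAG")
print(c(BP = okForBP, FRAG = okForFRAG))
```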

verbose

TRUE or FALSE. Whether to print information while reading.

Value

A list object with the following items.

contact

A list containing all the contact matrices. The keys of the list are coded as chrA_chrB.

information

A list containing basic information about the contact matrix.

  • genomeID: The genome ID of the current .hic file.

  • resolution: The list of resolutions available in the current .hic file.

  • chromosomeSizes: A data frame of chromosome sizes (chromosome, size). It can be used with the juicer pre function for a different genome.

settings

A list containing the file name, unit, resolution, and chromosomes.
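
One way the chromosomeSizes data frame might be reused is to write it out as a two-column, tab-separated file with no header. This is a minimal sketch: the file layout is an assumption about what the downstream tool expects, and the data frame below is a small stand-in built from the first four rows shown in the Examples output rather than a real call to readJuicer.

```r
## Stand-in for dat[['information']][['chromosomeSizes']], using the
## first four rows shown in the Examples output.
chromosomeSizes <- data.frame(
  chromosome = c("1", "10", "11", "12"),
  size = c(249250621L, 135534747L, 135006516L, 133851895L),
  stringsAsFactors = FALSE
)

## Assumed layout: tab-separated, no header, no row names.
sizesFile <- tempfile(fileext = ".chrom.sizes")
write.table(chromosomeSizes, file = sizesFile, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

## Read the file back to confirm what was written.
writtenLines <- readLines(sizesFile)
print(writtenLines)
```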

Details

This function is heavily adapted from the Dump function of the Java version of Juicebox and from the C++ version of straw.

Examples

library(FreeHiCLite)

## Remote file location. Reading a remote file involves downloading, so it may take a while
remoteFilePath = 'https://hicfiles.s3.amazonaws.com/hiseq/gm12878/in-situ/combined.hic'

## Local file location
localFilePath = system.file('extdata', 'example.hic', package = 'FreeHiCLite')

## Chromosomes to be extracted
chromosomes = c('chr1', 'chr2')

## Pairs to be extracted
pairs = c('1_1', '1_2')

unit = 'BP'
resolution = 500000L

## Pass chromosomes into the function; the result contains all the interaction pairs
dat <- readJuicer(file=localFilePath, chromosomes=chromosomes, pairs = NULL, unit=unit, resolution=resolution)
print(names(dat[['contact']]))
#> [1] "1_1" "1_2" "2_2"

## Pass pairs into the function; the result contains only the given pairs
start = Sys.time()
dat <- readJuicer(file=localFilePath, chromosomes=NULL, pairs = pairs, unit=unit, resolution=resolution)
end = Sys.time()
print(end - start)
#> Time difference of 0.05015302 secs
str(dat)
#> List of 3
#>  $ contact    :List of 2
#>   ..$ 1_1: int [1:87158, 1:3] 500000 500000 1000000 500000 1000000 1500000 500000 1000000 1500000 2000000 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:3] "x" "y" "counts"
#>   ..$ 1_2: int [1:69341, 1:3] 2500000 4000000 4500000 5000000 5500000 6000000 7000000 8500000 9500000 10000000 ...
#>   .. ..- attr(*, "dimnames")=List of 2
#>   .. .. ..$ : NULL
#>   .. .. ..$ : chr [1:3] "x" "y" "counts"
#>  $ information:List of 4
#>   ..$ genomeID       : chr "hg19"
#>   ..$ resolution     :List of 2
#>   .. ..$ BP  : int [1:3] 2500000 500000 5000
#>   .. ..$ FRAG: int(0)
#>   ..$ pairs          : chr [1:6] "1_1" "1_2" "1_3" "2_2" ...
#>   ..$ chromosomeSizes:'data.frame': 26 obs. of 2 variables:
#>   .. ..$ chromosome: chr [1:26] "1" "10" "11" "12" ...
#>   .. ..$ size      : int [1:26] 249250621 135534747 135006516 133851895 115169878 107349540 102531392 90354753 81195210 78077248 ...
#>  $ settings   :List of 4
#>   ..$ unit       : chr "BP"
#>   ..$ resolution : int 500000
#>   ..$ chromosomes: NULL
#>   ..$ file       : chr "/Library/Frameworks/R.framework/Versions/4.0/Resources/library/FreeHiCLite/extdata/example.hic"
#>  - attr(*, "class")= chr [1:2] "juicer" "freehic"

## Access each contact matrix
contacts = dat[['contact']]

## chromosome 1 vs chromosome 1
key = '1_1'
contactMatrix <- contacts[[key]]
head(contactMatrix)
#>            x       y counts
#> [1,]  500000  500000    384
#> [2,]  500000 1000000    231
#> [3,] 1000000 1000000   1272
#> [4,]  500000 1500000     47
#> [5,] 1000000 1500000    373
#> [6,] 1500000 1500000   1665

## chromosome 1 vs chromosome 2
key = '1_2'
contactMatrix <- contacts[[key]]
head(contactMatrix)
#>            x y counts
#> [1,] 2500000 0      1
#> [2,] 4000000 0      1
#> [3,] 4500000 0      1
#> [4,] 5000000 0      2
#> [5,] 5500000 0      1
#> [6,] 6000000 0      1
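
Each contact matrix is returned in a sparse three-column form (x, y, counts). The sketch below shows one possible way to expand such a matrix into a dense one. It assumes that x and y are bin start positions in the chosen unit, so that position %/% resolution gives a zero-based bin index. The helper toDense is hypothetical and not part of FreeHiCLite, and the input reuses the six rows printed above for '1_1' instead of calling readJuicer.

```r
## The six rows printed by head() for the '1_1' matrix.
sparseContacts <- matrix(
  c( 500000L,  500000L,  384L,
     500000L, 1000000L,  231L,
    1000000L, 1000000L, 1272L,
     500000L, 1500000L,   47L,
    1000000L, 1500000L,  373L,
    1500000L, 1500000L, 1665L),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("x", "y", "counts"))
)

## Hypothetical helper (not part of FreeHiCLite). Assumes x and y are
## bin start positions; set symmetric = FALSE for pairs of different
## chromosomes, where mirroring the counts would be wrong.
toDense <- function(sparse, resolution, symmetric = TRUE) {
  rowBin <- sparse[, "x"] %/% resolution + 1L
  colBin <- sparse[, "y"] %/% resolution + 1L
  if (symmetric) {
    nRows <- nCols <- max(rowBin, colBin)
  } else {
    nRows <- max(rowBin)
    nCols <- max(colBin)
  }
  dense <- matrix(0L, nrow = nRows, ncol = nCols)
  dense[cbind(rowBin, colBin)] <- sparse[, "counts"]
  if (symmetric) {
    ## Mirror the stored triangle across the diagonal.
    dense[cbind(colBin, rowBin)] <- sparse[, "counts"]
  }
  dense
}

dense <- toDense(sparseContacts, resolution = 500000L)
print(dense)
```

A dense matrix grows with the square of the number of bins, so at fine resolutions it is usually better to keep the sparse form.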