
Collectively, the bedtools utilities are a Swiss-army knife of tools for a wide range of genomics analysis tasks. The most widely used tools enable genome arithmetic: that is, set theory on the genome. For example, bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely used genomic file formats such as BAM, BED, GFF/GTF, and VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
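As a rough illustration of combining operations on the command line, the sketch below merges overlapping intervals in one file and then intersects the result with a second file. The file names and coordinates are made up for the example, and the block skips the bedtools step if bedtools is not on the PATH.

```shell
# Create two tiny BED files (hypothetical coordinates, for illustration only).
workdir=$(mktemp -d)
printf 'chr1\t100\t200\nchr1\t150\t300\nchr1\t500\t600\n' > "$workdir/a.bed"
printf 'chr1\t180\t250\n' > "$workdir/b.bed"

if command -v bedtools >/dev/null 2>&1; then
    # Merge overlapping intervals in a.bed (the input must be sorted by
    # chromosome and start), then report the parts of the merged
    # intervals that overlap b.bed.
    bedtools merge -i "$workdir/a.bed" \
        | bedtools intersect -a stdin -b "$workdir/b.bed"
else
    echo "bedtools not found on PATH; skipping the example" >&2
fi
```

Each tool reads and writes plain interval records, so the output of one step can be piped straight into the next without intermediate files.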

Summary of available tools.

bedtools supports a wide range of operations for interrogating and manipulating genomic features. The table below summarizes the tools available in the suite.

Utility       Description
annotate      Annotate coverage of features from multiple files.
bamtobed      Convert BAM alignments to BED (and other) formats.
bamtofastq    Convert BAM records to FASTQ records.
bed12tobed6   Break BED12 intervals into discrete BED6 intervals.
bedpetobam    Convert BEDPE intervals to BAM records.
bedtobam      Convert intervals to BAM records.
closest       Find the closest, potentially non-overlapping interval.
cluster       Cluster (but don’t merge) overlapping/nearby intervals.
complement    Extract intervals _not_ represented by an interval file.
coverage      Compute the coverage over defined intervals.
expand        Replicate lines based on lists of values in columns.
flank         Create new intervals from the flanks of existing intervals.
genomecov     Compute the coverage over an entire genome.
getfasta      Use intervals to extract sequences from a FASTA file.
groupby       Group by common columns and summarize other columns (similar to SQL “GROUP BY”).
igv           Create an IGV snapshot batch script.
intersect     Find overlapping intervals in various ways.
jaccard       Calculate the Jaccard statistic between two sets of intervals.
links         Create an HTML page of links to UCSC locations.
makewindows   Make interval “windows” across a genome.
map           Apply a function to a column for each overlapping interval.
maskfasta     Use intervals to mask sequences from a FASTA file.
merge         Combine overlapping/nearby intervals into a single interval.
multicov      Count coverage from multiple BAMs at specific intervals.
multiinter    Identify common intervals among multiple interval files.
nuc           Profile the nucleotide content of intervals in a FASTA file.
overlap       Compute the amount of overlap between two intervals.
pairtobed     Find pairs that overlap intervals in various ways.
pairtopair    Find pairs that overlap other pairs in various ways.
random        Generate random intervals in a genome.
reldist       Calculate the distribution of relative distances between two files.
shuffle       Randomly redistribute intervals in a genome.
slop          Adjust the size of intervals.
sort          Order the intervals in a file.
subtract      Remove intervals based on overlaps between two files.
tag           Tag BAM alignments based on overlaps with interval files.
unionbedg     Combine coverage intervals from multiple BEDGRAPH files.
window        Find overlapping intervals within a window around an interval.

Installation: yum install BEDTools

1. Convert BAM files (TopHat output) to FASTQ


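# Merge the mapped and unmapped reads produced by TopHat into one BAM file.
samtools merge RC6-1_ATTCCT_L005.bam accepted_hits.bam unmapped.bam

# Sort by read name (-n) so that the two mates of each pair are adjacent,
# which bamtofastq needs for paired-end output. This is the samtools 0.1.x
# syntax, where the last argument is the output prefix.
samtools_0.1.18 sort -n RC6-1_ATTCCT_L005.bam RC6-1_ATTCCT_L005.sorted

# Write the first mates to -fq and the second mates to -fq2.
bedtools bamtofastq -i RC6-1_ATTCCT_L005.sorted.bam -fq RC6-1_ATTCCT_L005_R1.fastq -fq2 RC6-1_ATTCCT_L005_R2.fastq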
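
One quick sanity check after the conversion is to confirm that both FASTQ files hold the same number of records. The sketch below is not part of bedtools. `count_fastq_reads` is a hypothetical helper that assumes the usual four-lines-per-record FASTQ layout, and the file names are the ones used in the commands above.

```shell
# Hypothetical helper: count the records in a FASTQ file,
# assuming exactly 4 lines per record.
count_fastq_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Output files written by bedtools bamtofastq in the step above.
r1=RC6-1_ATTCCT_L005_R1.fastq
r2=RC6-1_ATTCCT_L005_R2.fastq

if [ -f "$r1" ] && [ -f "$r2" ]; then
    n1=$(count_fastq_reads "$r1")
    n2=$(count_fastq_reads "$r2")
    if [ "$n1" -eq "$n2" ]; then
        echo "OK: $n1 read pairs"
    else
        echo "Mismatch: R1 has $n1 reads, R2 has $n2 reads" >&2
    fi
else
    echo "FASTQ files not found; run bamtofastq first" >&2
fi
```

A mismatch usually suggests that the BAM file was not sorted by read name or that some reads lost their mates before the conversion.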


