[1] bedtools (https://github.com/arq5x/bedtools2)

There is also bedtools (https://github.com/arq5x/bedtools2) getfasta. It uses Erik's code under the hood.

$ cat test.fa
$ cat test.bed
chr1 5 10
$ bedtools getfasta -fi test.fa -bed test.bed -fo test.fa.out
$ cat test.fa.out

Docs: http://bedtools.readthedocs.org/en/latest/content/tools/getfasta.html

And it is wrapped in pybedtools as well: http://pythonhosted.org/pybedtools/autodocs/pybedtools.BedTool.sequence.html?highlight=fasta
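
To make the coordinate convention in the getfasta example concrete, here is a minimal pure-Python sketch of the same kind of lookup. It is not bedtools' implementation, just an illustration under some assumptions: the helper names read_fasta and get_fasta are hypothetical, the toy sequence is made up, and the whole FASTA is held in memory. BED starts are 0-based and ends are exclusive, so "chr1 5 10" maps straight onto the Python slice [5:10] and yields 5 bases.

```python
def read_fasta(text):
    """Parse FASTA text into a {name: sequence} dict (whole file in memory)."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def get_fasta(records, bed_text):
    """Yield (header, subsequence) for each BED line.

    BED starts are 0-based and ends are exclusive, so the interval
    maps directly onto a Python slice.
    """
    for line in bed_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        yield f"{chrom}:{start}-{end}", records[chrom][start:end]


# Hypothetical toy data, not the contents of the files in the post.
fasta_text = ">chr1\nACGTACGTAC\nGTACGTACGT\n"
bed_text = "chr1 5 10\n"

records = read_fasta(fasta_text)
results = list(get_fasta(records, bed_text))
for header, seq in results:
    print(f">{header}")
    print(seq)
```

For real genomes you would want the indexed tools listed here rather than this sketch, since it reads every sequence into memory.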


[2] Samtools faidx feature

faidx    samtools faidx <ref.fasta> [region1 [...]]

Index reference sequence in the FASTA format or extract subsequence from indexed reference sequence. If no region is specified, faidx will index the file and create <ref.fasta>.fai on the disk. If regions are specified, the subsequences will be retrieved and printed to stdout in the FASTA format.

Index the reference genome FASTA file first (samtools faidx <ref.fasta> with no region), then run the command again with one or more regions.
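
Why the index step matters: the .fai file stores, per sequence, its length, the byte offset of its first base, the bases per line and the bytes per line, which is enough to seek straight to a region without reading the whole file. Below is a rough pure-Python sketch of that idea, not samtools' actual code. The helper names build_index and fetch are hypothetical, the toy file is made up, and it assumes every sequence line in a record (except possibly the last) has the same length. Regions are 1-based and inclusive, as in samtools region strings like chr1:6-10.

```python
import os
import tempfile


def build_index(path):
    """Return {name: (length, offset, linebases, linewidth)}, mirroring .fai columns."""
    index = {}
    name = None
    with open(path, "rb") as handle:
        while True:
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                # Offset is the byte position of the first base of this record.
                index[name] = [0, handle.tell(), 0, 0]
            elif name is not None:
                bases = len(line.rstrip(b"\r\n"))
                entry = index[name]
                if entry[2] == 0:
                    entry[2] = bases      # bases per line
                    entry[3] = len(line)  # bytes per line, newline included
                entry[0] += bases
    return {n: tuple(e) for n, e in index.items()}


def fetch(path, index, name, start, end):
    """Return bases start..end (1-based, inclusive) using seek, not a full read."""
    length, offset, linebases, linewidth = index[name]
    begin = start - 1          # convert to 0-based
    end = min(end, length)     # clip to the sequence length
    if begin < 0 or begin >= end:
        raise ValueError("empty or invalid region")

    def byte_pos(i):
        # Whole lines before base i, plus the position within its line.
        return offset + (i // linebases) * linewidth + i % linebases

    with open(path, "rb") as handle:
        handle.seek(byte_pos(begin))
        raw = handle.read(byte_pos(end - 1) - byte_pos(begin) + 1)
    # Drop any line breaks that fall inside the requested span.
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


with tempfile.TemporaryDirectory() as tmp:
    fasta_path = os.path.join(tmp, "ref.fasta")
    with open(fasta_path, "wb") as out:
        # Hypothetical toy reference.
        out.write(b">chr1\nACGTACGTAC\nGTACGTACGT\n>chr2\nTTTTGGGGCC\nAA\n")
    index = build_index(fasta_path)
    region = fetch(fasta_path, index, "chr1", 6, 10)
    spanning = fetch(fasta_path, index, "chr1", 8, 13)

print(index["chr1"])
print(region)
```

Note that the 1-based region 6-10 here covers the same bases as the 0-based BED interval 5 10.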

[3] Python implementation of faidx on GitHub.


[4] UCSC twoBitToFa


[5] Python script to fetch sequences from the UCSC DAS server:

[6] Ensembl BioMart





