Use a regular expression for filtering sequences by id from a FASTA file, e.g. just certain chromosomes from a genome. There are other tools as part of bigger packages to install (and no regex support), mostly awk-based awkward (sorry for the pun) bash solutions, and scripts using packages that one needs to install and with still no support for regular expressions. This however is a simple, straightforward little python script for a simple task. It doesn’t do anything else and doesn’t need anything but a stock python installation. Based on the FASTA reader snippet.

Download here.

Usage:

python FASTAfilter.py [-h] regex infile outfile

From a FASTA-file with multiple >entries, filter by sequence ids using a
regex.

positional arguments:
regex Regex to filter entry ids, e.g. ‘chr[1-4]’. Note that the id does not contain the initial > character.
infile A FASTA input file, usually with multiple entries.
outfile The new file with only the matching entries.

optional arguments:
-h, –help show this help message and exit

INSTALL:

cd /data/software
wget http://dm516.user.srcf.net/fastafilter/FASTAfilter.zip
unzip FASTAfilter.zip
easy_install argparse

USAGE:

python FASTAfilter.py   [1-9,10,11,12,13,14,15,16,17,18,X]  \
/dat2/INPUT.fa \
/dat2/OUTPUT.fa

Error:

Traceback (most recent call last):
  File "FASTAfilter.py", line 3, in <module>
    import argparse
ImportError: No module named argparse

Solution:

run "easy_install argparse" as root user.

http://dm516.user.srcf.net/?p=314

Filter FASTA files的更多相关文章

  1. Extract Fasta Sequences Sub Sets by position

    cut -d " " -f 1 sequences.fa | tr -s "\n" "\t"| sed -s 's/>/\n/g' & ...

  2. elfinder中通过DirectoryStream.Filter实现筛选隐藏目录(二)

    今天还是没事看了看elfinder源码,发现之前说的两个版本实现都是基于不同的jdkelfinder源码浏览-Volume文件系统操作类(1), 带前端页面的是基于1.6中File实现,另一个是基于1 ...

  3. OpenFileDialog.Filter 属性

    如果 Filter 属性为 Empty,将显示所有文件. 始终显示文件夹. Filter 由以下部分组成:筛选器说明,后跟竖线 (|) 和筛选模式. 筛选器可以指定一个或多个文件类型. 说明描述了对话 ...

  4. python 高阶函数之filter

    前文说到python高阶函数之map,相信大家对python中的高阶函数有所了解,此次继续分享python中的另一个高阶函数filter. 先看一下filter() 函数签名 >>> ...

  5. Falcon Genome Assembly Tool Kit Manual

    Falcon Falcon: a set of tools for fast aligning long reads for consensus and assembly The Falcon too ...

  6. Linux command line exercises for NGS data processing

    by Umer Zeeshan Ijaz The purpose of this tutorial is to introduce students to the frequently used to ...

  7. 构建NCBI本地BLAST数据库 (NR NT等) | blastx/diamond使用方法 | blast构建索引 | makeblastdb

    参考链接: FTP README 如何下载 NCBI NR NT数据库? 下载blast:ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+ 先了解 ...

  8. STAR manual

    来源:STARmanual.pdf 来源:Calling variants in RNAseq PART0 准备工作 #STAR 安装前的依赖的工具 #Red Hat, CentOS, Fedora. ...

  9. &lt;二代測序&gt; 下载 NCBI sra 文件

    本文近期更新地址: http://blog.csdn.net/tanzuozhev/article/details/51077222 随着測序技术的不断提高.二代測序数据成指数增长. NCBI提供了S ...

随机推荐

  1. Codevs (3657括号序列 )

    题目链接:传送门 题目大意:中文题,略 题目思路:区间DP 这个题是问需要添加多少个括号使之成为合法括号序列,那么我们可以先求有多少合法的括号匹配,然后用字符串长度减去匹配的括号数就行 状态转移方程主 ...

  2. vue父子组件传值

    1.父组件向子组件传值 例如app.vue是父组件,v-header.vue是子组件,实现app向v-header传值父组件需要自定义自己的title值, 子组件v-header内容 <temp ...

  3. ZOJ 1648 Circuit Board(计算几何)

    Circuit Board Time Limit: 2 Seconds Memory Limit: 65536 KB On the circuit board, there are lots of c ...

  4. delphi------项目类型

    Console Application:控制台应用程序 writeln('HelloWorld'); //接收用户输入字符 readln: //直到用户输入回车结束 VCL Forms Applica ...

  5. lombok插件使用

    1.1 lombok介绍 lombok 是一个可以帮助我们简化java代码编写的工具类,尤其是简化javabean的编写,可以通过采用注解的方式,消除代码中的构造方法,getter/setter等代码 ...

  6. 一篇搞定vue请求和跨域

    vue本身不支持发送AJAX请求,需要使用vue-resource.axios等插件实现 axios是一个基本Promise的HTTP请求客户端,用来发送请求,也是vue2.0官方推荐的,同时不再对v ...

  7. paper reading:gaze tracking

    https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Krafka_Eye_Tracking_for_CVPR_2016_ ...

  8. poco库 RSA加解密

    #include "poco/Crypto/Cipher.h"#include "poco/Crypto/CipherFactory.h"#include &q ...

  9. Python WSGI v1.0 中文版(转)

    add by zhj: WSGI全称Web Server Gateway Interface,即Web网关接口.其实它并不是OSI七层协议中的协议,它就是一个接口而已,即函数,而WSGI规定了该接口的 ...

  10. 初识python(二)

    初识python(二) 1.变量 变量:把程序运行的中间结果临时的存在内存里,以便后续的代码调用. 1.1 声明变量: #!/usr/bin/env python # -*- coding: utf- ...