Filter FASTA files
Use a regular expression to filter sequences by id from a FASTA file, e.g. to keep only certain chromosomes from a genome. Other options exist: tools bundled with larger packages that must be installed (and that lack regex support), awk-based bash solutions that are mostly awkward (sorry for the pun), and scripts that depend on third-party packages yet still offer no regular expressions. This, by contrast, is a small, straightforward Python script for a simple task. It does nothing else and needs only a stock Python installation. It is based on the FASTA reader snippet.
Usage:
python FASTAfilter.py [-h] regex infile outfile
From a FASTA-file with multiple >entries, filter by sequence ids using a
regex.
positional arguments:
regex Regex to filter entry ids, e.g. 'chr[1-4]'. Note that the id does not contain the initial > character.
infile A FASTA input file, usually with multiple entries.
outfile The new file with only the matching entries.
optional arguments:
-h, --help show this help message and exit
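The kind of filtering described above can be sketched in a few lines of standard-library Python. This is not the FASTAfilter.py source, only a minimal illustration; the names read_fasta and filter_fasta are hypothetical, and the sketch assumes the regex is applied with re.search to the whole header line minus the leading > character, which may differ from what the actual script does.

```python
import re


def read_fasta(handle):
    """Yield (header, sequence_lines) pairs; header excludes the leading '>'."""
    header, lines = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, lines
            header, lines = line[1:], []
        elif header is not None and line:
            # Sequence line belonging to the current entry (blank lines skipped).
            lines.append(line)
    if header is not None:
        yield header, lines


def filter_fasta(pattern, infile, outfile):
    """Copy entries whose header matches `pattern`; return how many were kept."""
    regex = re.compile(pattern)
    kept = 0
    with open(infile) as src, open(outfile, "w") as dst:
        for header, lines in read_fasta(src):
            # Assumption: search anywhere in the header, as re.search does.
            if regex.search(header):
                dst.write(">" + header + "\n")
                for seq_line in lines:
                    dst.write(seq_line + "\n")
                kept += 1
    return kept
```

Calling filter_fasta("chr[1-4]", "in.fa", "out.fa") with placeholder file names would write only the entries whose header contains chr1 to chr4.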
INSTALL:
cd /data/software
wget http://dm516.user.srcf.net/fastafilter/FASTAfilter.zip
unzip FASTAfilter.zip
easy_install argparse
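USAGE (quote the regex so the shell does not expand the brackets; this pattern is meant to keep chromosomes 1-18 and X, assuming the ids start with the bare chromosome name):
python FASTAfilter.py '^([1-9]|1[0-8]|X)\b' \
/dat2/INPUT.fa \
/dat2/OUTPUT.fa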
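When choosing a chromosome pattern, keep in mind that a bracket expression matches one single character, so a list such as [1-9,10,11,...,18,X] does not mean "1 to 18 or X"; multi-digit names need alternation. The quick check below uses Python's re module on a made-up list of ids. It assumes the ids are bare names such as 1, 10 or X and that matching behaves like re.search, which may not be exactly how FASTAfilter.py applies the regex; matching_ids is a hypothetical helper.

```python
import re


def matching_ids(pattern, ids):
    """Return the ids in which `pattern` is found (re.search semantics)."""
    regex = re.compile(pattern)
    return [i for i in ids if regex.search(i)]


ids = ["1", "9", "10", "18", "19", "X", "MT"]

# Character class: any id containing one of 1-9, ',', '0' or 'X' matches.
char_class = r"[1-9,10,11,12,13,14,15,16,17,18,X]"

# Alternation anchored at the start: 1-9, 10-18 or X as a whole word.
alternation = r"^([1-9]|1[0-8]|X)\b"

print(matching_ids(char_class, ids))   # also keeps "19"
print(matching_ids(alternation, ids))  # drops "19" and "MT"
```

Running a candidate pattern against a handful of ids like this before filtering a whole genome is a cheap way to catch such surprises.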
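Error:
Traceback (most recent call last):
  File "FASTAfilter.py", line 3, in <module>
    import argparse
ImportError: No module named argparse
Solution:
Run "easy_install argparse" as the root user. argparse is part of the standard library from Python 2.7 and 3.2 onwards, so this error only appears on older interpreters.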
http://dm516.user.srcf.net/?p=314