https://rtyley.github.io/bfg-repo-cleaner/

an alternative to git-filter-branch

The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:

  • Removing Crazy Big Files
  • Removing PasswordsCredentials & other Private data

The git-filter-branch command is enormously powerful and can do things that the BFG can't - but the BFG is much better for the tasks above, because:

  • Faster : 10 - 720x faster
  • Simpler : The BFG isn't particularily clever, but is focused on making the above tasks easy
  • Beautiful : If you need to, you can use the beautiful Scala language to customise the BFG. Which has got to be better than Bash scripting at least some of the time.

Usage

First clone a fresh copy of your repo, using the --mirror flag:

$ git clone --mirror git://example.com/some-big-repo.git

This is a bare repo, which means your normal files won't be visible, but it is a full copy of the Git database of your repository, and at this point you should make a backup of it to ensure you don't lose anything.

Now you can run the BFG to clean your repository up:

$ java -jar bfg.jar --strip-blobs-bigger-than 100M some-big-repo.git

The BFG will update your commits and all branches and tags so they are clean, but it doesn't physically delete the unwanted stuff. Examine the repo to make sure your history has been updated, and then use the standard git gc command to strip out the unwanted dirty data, which Git will now recognise as surplus to requirements:

$ cd some-big-repo.git
$ git reflog expire --expire=now --all && git gc --prune=now --aggressive

Finally, once you're happy with the updated state of your repo, push it back up (note that because your clone command used the --mirror flag, this push will update all refs on your remote server):

$ git push

At this point, you're ready for everyone to ditch their old copies of the repo and do fresh clones of the nice, new pristine data. It's best to delete all old clones, as they'll have dirty history that you don't want to risk pushing back into your newly cleaned repo.

Examples

In all these examples bfg is an alias for java -jar bfg.jar.

Delete all files named 'id_rsa' or 'id_dsa' :

$ bfg --delete-files id_{dsa,rsa}  my-repo.git

Remove all blobs bigger than 50 megabytes :

$ bfg --strip-blobs-bigger-than 50M  my-repo.git

Replace all passwords listed in a file (prefix lines 'regex:' or 'glob:' if required) with ***REMOVED***wherever they occur in your repository :

$ bfg --replace-text passwords.txt  my-repo.git

Remove all folders or files named '.git' - a reserved filename in Git. These often become a problemwhen migrating to Git from other source-control systems like Mercurial :

$ bfg --delete-folders .git --delete-files .git  --no-blob-protection  my-repo.git

For further command-line options, you can run the BFG without any arguments, which will output text like this.

【转】BFG Repo-Cleaner: Removes large or troublesome blobs like git-filter-branch does, but faster.的更多相关文章

  1. Mysql_大字段问题Row size too large.....not counting BLOBs, is 8126.

    [问题描述] 1.从myslq(5.7.19-0ubuntu0.16.04.1)中导出sql脚本,导入到mysql(5.5.27)中,报如下错误:Row size too large. The max ...

  2. How to get started with GIT and work with GIT Remote Repo

    https://www.ntu.edu.sg/home/ehchua/programming/howto/Git_HowTo.html#zz-7. 1.  Introduction GIT is a ...

  3. (AOSP)repo checkout指定版本

    aosp 怎么切换分支? To properly switch Android version, all you need to change is branch for your manifest ...

  4. Git与Repo入门

    版本控制 版本控制是什么已不用在说了,就是记录我们对文件.目录或工程等的修改历史,方便查看更改历史,备份以便恢复以前的版本,多人协作... 一.原始版本控制 最原始的版本控制是纯手工的版本控制:修改文 ...

  5. repo的用法

    转自:http://blog.csdn.net/junglyfine/article/details/6299636 注:repo只是google用Python脚本写的调用Git的一个脚本,主要是用来 ...

  6. Git与Repo入门(转载)

    aaarticlea/png;base64,iVBORw0KGgoAAAANSUhEUgAAAykAAADuCAIAAACyDd+sAAAAA3NCSVQICAjb4U/gAAAgAElEQVR4Xu ...

  7. [git]Git与Repo入门

    转自:http://www.cnblogs.com/angeldevil/archive/2013/11/26/3238470.html 注:非常推荐的一篇关于git的博文 目录: 版本控制 一.原始 ...

  8. git和repo入门

    版本控制 版本控制是什么已不用在说了,就是记录我们对文件.目录或工程等的修改历史,方便查看更改历史,备份以便恢复以前的版本,多人协作... 一.原始版本控制 最原始的版本控制是纯手工的版本控制:修改文 ...

  9. repo的小结

    repo仅仅是google用Python脚本写的调用git的一个脚本,主要是用来下载.管理Android项目的软件仓库. 1. 下载 repo 的地址: http://android.git.kern ...

随机推荐

  1. my sql无法删除数据库

    mysql有时候会无法删除数据库,可以通过 1.select @@datadir 查询到文件目录 'C:\\ProgramData\\MySQL\\MySQL Server 5.7\\Data\\' ...

  2. 专访笨叔叔:2019年可能是Linux年?(转)

    链接:https://zhuanlan.zhihu.com/p/57815479 2017年9月<奔跑吧 Linux内核>一书出版后得到了广大Linux从业人员和爱好者(特别是从事Linu ...

  3. 时钟信号的占空比调整——Verilog

    时钟信号的占空比调整——Verilog `timescale 1ns / 1ps /////////////////////////////////////////////////////////// ...

  4. 群晖NAS同步文件,防止Mac OS X自动休眠的办法

    背景: NAS drive同步文件到移动硬盘,需要消耗很长时间.但长时间不动电脑,mac又会自动关闭所有application,进入休眠模式,导致同步任务被终止. 使用系统的节能设置配置也没能成功关闭 ...

  5. 简易Asset工作流

    前言: 当前比较主流的制作流程都可以按顺序细分为三个部分:资产环节(asset section),镜头环节(shot section),合成环节(composite section). 考虑到单一资产 ...

  6. MVC5 方法扩展

    public static MvcHtmlString DataDictionaryDropDownList(this HtmlHelper htmlHelper, string name, obje ...

  7. Android memory dump

    1.读取指定pid和内存地址的字符: #include <stdlib.h> #include <stdio.h> #include <string.h> #inc ...

  8. RN 各种小问题

    问题:vs code 无法启动 双击无反应 方法:控制台 输入 netsh winsock reset 重启 vs 问题:RN Debugger Network 不显示 源网页:https://git ...

  9. C# 6.0:在catch和finally中使用await

    Asyn方法是一个现在很常用的方法,当使用async和await时,你或许曾有这样的经历,就是你想要在catch块或finally块中使用它们,比如当出现一个exception而你希望将日志记在文件或 ...

  10. 为什么我说IPFS社区从卖矿机开始,就是错的

    要回答这个问题,首先要了解去中心化存储项目和传统的区块链项目有什么区别.其中去中心化存储项目包括IPFS,基于IPFS的FileCoin.PPIO.Storj等. 传统区块链项目没有供需问题 首先以比 ...