

The Java docs describe the split function as follows:

When there is a positive-width match at the beginning of this string then an empty leading substring is included at the beginning of the resulting array. A zero-width match at the beginning however never produces such empty leading substring.
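
The difference between a positive-width and a zero-width match at the start of the string is easy to miss, so here is a minimal sketch of it. The class name LeadingEmptyDemo and the helper split are made up for illustration, and the zero-width behaviour shown assumes Java 8 or later.

```java
import java.util.Arrays;

public class LeadingEmptyDemo {
    // Hypothetical thin wrapper around String.split, only here so the
    // two cases can be called the same way.
    static String[] split(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // "-" matches one character at index 0 (positive width),
        // so the result starts with an empty string.
        System.out.println(Arrays.toString(split("-a-b", "-"))); // [, a, b]

        // The empty regex matches with zero width at index 0,
        // so no empty leading string is produced.
        System.out.println(Arrays.toString(split("abc", "")));   // [a, b, c]
    }
}
```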

The limit parameter controls the number of times the pattern is applied and therefore affects the length of the resulting array. If the limit n is greater than zero then the pattern will be applied at most n - 1 times, the array's length will be no greater than n, and the array's last entry will contain all input beyond the last matched delimiter. If n is non-positive then the pattern will be applied as many times as possible and the array can have any length. If n is zero then the pattern will be applied as many times as possible, the array can have any length, and trailing empty strings will be discarded.




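1. limit < 0, e.g. limit = -1
2. limit = 0, which is also the default when no limit is passed
3. limit > 0, e.g. limit = 3
4. limit > size, e.g. limit = 10 (any value above the number of pieces behaves the same)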



import java.text.MessageFormat;
import java.util.Arrays;

public void test() {
    String string = "linux---abc-linux-";
    splitStringWithLimit(string, -1);
    splitStringWithLimit(string, 0);
    splitStringWithLimit(string, 3);
    splitStringWithLimit(string, 10); // any limit above 6, the number of pieces
}

public void splitStringWithLimit(String string, int limit) {
    String[] arrays = string.split("-", limit);
    String result = MessageFormat.format("arrays={0}, length={1}", Arrays.toString(arrays), arrays.length);
    System.out.println(result);
}

// arrays=[linux, , , abc, linux, ], length=6
// arrays=[linux, , , abc, linux], length=5
// arrays=[linux, , -abc-linux-], length=3
// arrays=[linux, , , abc, linux, ], length=6


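1. Fast path: if regex is a single character that is not one of the regex metacharacters ".$|()[{^?*+\\", or a two-character string that starts with \ and whose second character is not in 0-9, a-z, or A-Z, the string is split with indexOf and no Pattern is compiled. Any other regex falls through to Pattern.compile(regex).split(this, limit).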
public String[] split(String regex, int limit) {
    char ch = 0;
    if (((regex.value.length == 1 &&
         ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
         (regex.length() == 2 &&
          regex.charAt(0) == '\\' &&
          (((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
          ((ch-'a')|('z'-ch)) < 0 &&
          ((ch-'A')|('Z'-ch)) < 0)) &&
        (ch < Character.MIN_HIGH_SURROGATE ||
         ch > Character.MAX_LOW_SURROGATE))
    {
        int off = 0;
        int next = 0;
        boolean limited = limit > 0;
        ArrayList<String> list = new ArrayList<>();
        while ((next = indexOf(ch, off)) != -1) {
            if (!limited || list.size() < limit - 1) {
                list.add(substring(off, next));
                off = next + 1;
            } else {    // last one
                //assert (list.size() == limit - 1);
                list.add(substring(off, value.length));
                off = value.length;
                break;
            }
        }
        // If no match was found, return this
        if (off == 0)
            return new String[]{this};

        // Add remaining segment
        if (!limited || list.size() < limit)
            list.add(substring(off, value.length));

        // Construct result
        int resultSize = list.size();
        if (limit == 0) {
            while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
                resultSize--;
            }
        }
        String[] result = new String[resultSize];
        return list.subList(0, resultSize).toArray(result);
    }
    return Pattern.compile(regex).split(this, limit);
}
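
The fast-path condition is dense, so here is a more readable sketch of the same test. FastPathCheck, usesFastPath and isAsciiAlnum are hypothetical names, not JDK APIs. In the JDK code, an expression such as ((ch-'0')|('9'-ch)) < 0 is true exactly when ch lies outside '0'..'9': one of the two differences is negative, so the bitwise OR has its sign bit set. The sketch replaces that trick with plain range comparisons.

```java
public class FastPathCheck {
    // Hypothetical helper: true for ASCII digits and ASCII letters.
    static boolean isAsciiAlnum(char ch) {
        return (ch >= '0' && ch <= '9')
            || (ch >= 'a' && ch <= 'z')
            || (ch >= 'A' && ch <= 'Z');
    }

    // Hypothetical helper: mirrors the condition under which String.split
    // avoids compiling a Pattern, as read from the source quoted above.
    static boolean usesFastPath(String regex) {
        char ch = 0;
        // Case 1: one character that is not a regex metacharacter.
        boolean singleLiteral = regex.length() == 1
                && ".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1;
        // Case 2: a backslash followed by a non-alphanumeric character.
        boolean escapedNonAlnum = regex.length() == 2
                && regex.charAt(0) == '\\'
                && !isAsciiAlnum(ch = regex.charAt(1));
        // The delimiter must not be half of a surrogate pair.
        boolean notSurrogate = ch < Character.MIN_HIGH_SURROGATE
                || ch > Character.MAX_LOW_SURROGATE;
        return (singleLiteral || escapedNonAlnum) && notSurrogate;
    }

    public static void main(String[] args) {
        System.out.println(usesFastPath("-"));    // true
        System.out.println(usesFastPath("."));    // false: metacharacter
        System.out.println(usesFastPath("\\."));  // true: escaped literal dot
        System.out.println(usesFastPath("\\d"));  // false: \d is a character class
    }
}
```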


public String[] split(CharSequence input, int limit) {
    int index = 0;
    boolean matchLimited = limit > 0;
    ArrayList<String> matchList = new ArrayList<>();
    Matcher m = matcher(input);

    // Add segments before each match found
    while(m.find()) {
        if (!matchLimited || matchList.size() < limit - 1) {
            if (index == 0 && index == m.start() && m.start() == m.end()) {
                // no empty leading substring included for zero-width match
                // at the beginning of the input char sequence.
                continue;
            }
            String match = input.subSequence(index, m.start()).toString();
            matchList.add(match);
            index = m.end();
        } else if (matchList.size() == limit - 1) { // last one
            String match = input.subSequence(index,
                                             input.length()).toString();
            matchList.add(match);
            index = m.end();
        }
    }

    // If no match was found, return this
    if (index == 0)
        return new String[] {input.toString()};

    // Add remaining segment
    if (!matchLimited || matchList.size() < limit)
        matchList.add(input.subSequence(index, input.length()).toString());

    // Construct result
    int resultSize = matchList.size();
    if (limit == 0)
        while (resultSize > 0 && matchList.get(resultSize-1).equals(""))
            resultSize--;
    String[] result = new String[resultSize];
    return matchList.subList(0, resultSize).toArray(result);
}
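
A common surprise with the Pattern path is splitting on an unescaped metacharacter. The sketch below shows it. The class name MetaCharSplitDemo and the helper pieces are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class MetaCharSplitDemo {
    // Hypothetical helper: split with the default limit of 0.
    static String[] pieces(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        // An unescaped "." matches every character, so every piece is empty
        // and the limit == 0 step discards all of them.
        System.out.println(pieces("a.b.c", ".").length);              // 0

        // Escaping the dot makes it a literal delimiter.
        System.out.println(Arrays.toString(pieces("a.b.c", "\\."))); // [a, b, c]

        // A multi-character regex always goes through Pattern.split.
        String[] viaPattern = Pattern.compile("\\s*,\\s*").split("a , b,c", 0);
        System.out.println(Arrays.toString(viaPattern));              // [a, b, c]
    }
}
```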




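If limit <= 0, or the list has not yet reached the limit we set, the remaining content (from just after the last matched regex to the end of the string) is added to the list.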
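
A small sketch of when the remaining segment is and is not appended, using a test string chosen for illustration. RemainingSegmentDemo and show are hypothetical names.

```java
import java.util.Arrays;

public class RemainingSegmentDemo {
    // Hypothetical helper: split on "-" and render the array as text.
    static String show(String input, int limit) {
        return Arrays.toString(input.split("-", limit));
    }

    public static void main(String[] args) {
        // limit = 2: the "last one" branch of the loop already stored "b-c",
        // so the list is full and no remaining segment is added.
        System.out.println(show("a-b-c", 2));  // [a, b-c]

        // limit = 5: the loop stored "a" and "b"; the list is still shorter
        // than the limit, so the tail "c" is appended.
        System.out.println(show("a-b-c", 5));  // [a, b, c]

        // limit = -1: not limited, so the tail is always appended.
        System.out.println(show("a-b-c", -1)); // [a, b, c]
    }
}
```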


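This step only applies when limit is 0. In that case the list is walked from back to front and trailing empty strings are dropped (empty strings in the middle are not removed).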
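
The trimming step has two edge cases worth seeing next to the normal one. This is a minimal sketch with inputs chosen for illustration. TrailingEmptyDemo and show are hypothetical names.

```java
import java.util.Arrays;

public class TrailingEmptyDemo {
    // Hypothetical helper: split on "," and render the array as text.
    static String show(String input, int limit) {
        return Arrays.toString(input.split(",", limit));
    }

    public static void main(String[] args) {
        // limit = 0 drops the trailing empty strings but keeps the middle one.
        System.out.println(show("a,,b,,", 0));      // [a, , b]

        // A negative limit keeps every piece.
        System.out.println(show("a,,b,,", -1));     // [a, , b, , ]

        // Only delimiters: every piece is empty, so nothing is left.
        System.out.println(",,".split(",").length); // 0

        // No match at all: the original string is returned as the single
        // element, even though it is empty.
        System.out.println("".split(",").length);   // 1
    }
}
```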





