This post presents a recently implemented enhanced wildcard string matcher. Its features include:

  • Supporting wildcard character '*' for matching zero or more characters
  • Supporting wildcard character '?' for matching exactly one character
  • Supporting parentheses '(' and ')' for referencing the matches
  • Supporting escape character (back-slash)
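
To make these rules concrete, here is a minimal standalone sketch of the '*', '?' and back-slash rules. It is not the qcpplib code: wildcardMatch is a hypothetical helper written only for this illustration, it works on std::string only, it leaves out the parentheses feature, and it assumes that the whole source string must be consumed for a match to succeed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only; not part of qcpplib.
// Returns true if `src` (from index s) matches `pat` (from index p) in full.
bool wildcardMatch(const std::string& pat, const std::string& src,
                   std::size_t p = 0, std::size_t s = 0)
{
    if (p == pat.size())
    {
        // Pattern exhausted: succeed only if the source is exhausted too
        // (an assumption of this sketch).
        return s == src.size();
    }
    if (pat[p] == '\\' && p + 1 < pat.size())
    {
        // Escape: the next pattern character is matched literally below.
        ++p;
    }
    else if (pat[p] == '*')
    {
        // Try every possible run length for '*', longest first (greedy).
        for (std::size_t len = src.size() - s + 1; len-- > 0; )
        {
            if (wildcardMatch(pat, src, p + 1, s + len))
            {
                return true;
            }
        }
        return false;
    }
    else if (pat[p] == '?')
    {
        // '?' needs exactly one character to be available.
        return s < src.size() && wildcardMatch(pat, src, p + 1, s + 1);
    }
    // Ordinary (or escaped) character: exact match required.
    return s < src.size() && src[s] == pat[p]
        && wildcardMatch(pat, src, p + 1, s + 1);
}
```

With this sketch, wildcardMatch("a*b?c", "aXXbYc") returns true, while wildcardMatch("a\\*b", "aXb") returns false because the escaped asterisk only matches a literal '*'.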

C++ features demonstrated by this implementation:

  • Functors, with consideration for function pointers and user-instantiated functors that carry user data
  • Specialized templates
  • Template rebinding
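
The selector-plus-rebind idea in the list above can be sketched in isolation as follows. Every name in this block (LengthOfCString, LengthOfStdString, DefaultLengthSelector, LengthReporter) is hypothetical and exists only for the illustration. The specializations are written at namespace scope here, which is a choice of this sketch, since some compilers reject explicit specializations placed inside the class body.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Two functors with the same role but different string types (hypothetical).
struct LengthOfCString
{
    std::size_t operator()(const char* str) const { return std::strlen(str); }
};

struct LengthOfStdString
{
    std::size_t operator()(const std::string& str) const { return str.size(); }
};

// The selector: the primary member template is only declared, so an
// unsupported string type fails at compile time.
struct DefaultLengthSelector
{
    template <class TStringRef>
    struct rebind;
};

// Specializations pick the functor that fits each string type.
template <>
struct DefaultLengthSelector::rebind<const char*>
{
    typedef LengthOfCString LengthFunctor;
};

template <>
struct DefaultLengthSelector::rebind<const std::string&>
{
    typedef LengthOfStdString LengthFunctor;
};

// A client that rebinds the selector to its own string type.
template <class TStringRef, class TSelector = DefaultLengthSelector>
class LengthReporter
{
public:
    typedef typename TSelector::template rebind<TStringRef>::LengthFunctor LengthFunctor;

    // Default-constructed functor, for stateless functors.
    LengthReporter() : _length() {}

    // User-supplied functor instance, which may carry user data.
    explicit LengthReporter(const LengthFunctor& length) : _length(length) {}

    std::size_t Report(TStringRef str) const { return _length(str); }

private:
    LengthFunctor _length;
};
```

A LengthReporter<const char*> then uses std::strlen, a LengthReporter<const std::string&> uses std::string::size, and a user can pass a different selector type as the second template argument to swap in custom functors.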

The implementation is maintained as part of qcpplib, quanben's ongoing C++ template library project, which is publicly available on GitHub at https://github.com/lincolnyu/qcpplib/

The current snapshot of the code is as follows:

//
// qcpplib v1.00
// quanben's C++ template library
//
// Author Lincoln Yu
//
// lincoln.yu@gmail.com
// https://github.com/lincolnyu/qcpplib
//
// The MIT License (MIT)
//
// Copyright (c) <year> <copyright holders>
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
//

#if !defined (_WILDCARD_H_)
#define _WILDCARD_H_

#include <map>
#include <vector>
#include <cstring>
#include <string>

/// @brief Contains class definitions that deal with wildcard matching
namespace Qtl { namespace String { namespace Wildcard {

/// @brief An implementation of the functor that returns the length of string whose iterators are applicable
/// to subtract operator
template <class TStringRef, class TSubtractableIter>
struct CharDistFunctorIndexed
{
size_t operator()(TStringRef str, TSubtractableIter iterBegin, TSubtractableIter iterEnd)
{
return (iterEnd-iterBegin);
}
};

/// @brief An implementation of the functor that returns the position of the first character for character-based
/// zero-terminated string
struct StringBeginFunctorPsz
{
char * operator()(char *str)
{
return (str);
}
};

/// @brief An implementation of the functor that returns the position of the first character for a std::string
struct StringBeginFunctorStdStr
{
std::string::const_iterator operator()(const std::string& str)
{
return str.begin();
}
};

/// @brief An implementation of the functor that determines if the position is at the end of a character-based
/// zero-terminated string
struct StringEndFunctorPsz
{
bool operator()(char *iter, char *str)
{
return (*iter == '\0');
}
};

/// @brief An implementation of the functor that determines if the position is at the end of a std::string
struct StringEndFunctorStdStr
{
bool operator()(std::string::const_iterator iter, const std::string& str)
{
return (iter==str.end());
}
};

/// @brief An implementation of the functor that appends a character to a character-based zero-terminated string
struct AppendCharFunctorPsz
{
void operator()(char *str, char *&iter, char ch)
{
*iter++ = ch;
}
};

/// @brief An implementation of the functor that appends a character to a std::string
struct AppendCharFunctorStdStr
{
void operator()(std::string &str, std::string::iterator &iter, char ch)
{
str.push_back(ch);
}
};

/// @brief The default class that provides string functors
struct DefaultStringFunctorSelector
{
/// @brief The generic rebinder
template <class TStringRef, class TCharIter>
struct rebind
{
// unimplemented, compiler error occurs if getting here
};

/// @brief The rebinder to the character array based string functors
template <>
struct rebind<char*,char*>
{
typedef StringBeginFunctorPsz StringBeginFunctor;
typedef StringEndFunctorPsz StringEndFunctor;
typedef CharDistFunctorIndexed<char*,char*> CharDistFunctor;
};

/// @brief The rebinder to the std::string based string functors
template <>
struct rebind<const std::string&, std::string::const_iterator>
{
typedef StringBeginFunctorStdStr StringBeginFunctor;
typedef StringEndFunctorStdStr StringEndFunctor;
typedef CharDistFunctorIndexed<const std::string&, std::string::const_iterator> CharDistFunctor;
};
};

/// @brief The default class that provides the functor that appends a character to a string
struct DefaultAppendCharFunctorSelector
{
/// @brief The generic binder
template <class TStringRef, class TCharIter, class TChar>
struct rebind
{
// unimplemented, compiler error occurs if getting here
};

/// @brief The rebinder to the functor that appends a character to a character-based zero-terminated string
template <>
struct rebind<char*, char*, char>
{
typedef AppendCharFunctorPsz AppendCharFunctor;
};

/// @brief The rebinder to the functor that appends a character to a std::string
template <>
struct rebind<std::string&, std::string::iterator, char>
{
typedef AppendCharFunctorStdStr AppendCharFunctor;
};
};

/// @brief A class that encapsulates a wildcard pattern
/// @param TString The type of the pattern string
/// @param TStringRef The type of the reference to the pattern string (for efficient parameter passing)
/// @param TCharIter The type of the iterator through the characters
/// @param TStringBeginFunctor The type of the functor that returns the iterator at the beginning of a string
/// @param TStringEndFunctor The type of the functor that determines if the iterator is at the end of a string
template <class TString=char*, class TStringRef=char*, class TCharIter=char*,
class TStringFunctorSelector=DefaultStringFunctorSelector>
class Pattern
{
public:
typedef TStringRef StringRef;

/// @brief The type of iterator through the characters in the pattern string
typedef TCharIter CharIter;

/// @brief The type of the functor that returns the iterator at the beginning of a string
typedef typename TStringFunctorSelector::template rebind<TStringRef, TCharIter>::StringBeginFunctor StringBeginFunctor;

/// @brief The type of the functor that determines if the iterator is at the end of a string
typedef typename TStringFunctorSelector::template rebind<TStringRef, TCharIter>::StringEndFunctor StringEndFunctor;

/// @brief The type of the functor that returns the distance between two characters
typedef typename TStringFunctorSelector::template rebind<TStringRef, TCharIter>::CharDistFunctor CharDistFunctor;

private:
/// @brief The pattern string
TString _pattern;

/// @brief The functor that returns the beginning of the string
StringBeginFunctor _getStringBegin;

/// @brief The functor that returns if the iterator is at the end of the string
StringEndFunctor _isStringEnd;

/// @brief The functor that returns the distance between two characters
CharDistFunctor _getCharDist;

/// @brief The look-up table that maps iterator of pattern to the index of match result entry
std::map<CharIter, int> _mapIterToIndex;

public:
// a typical wildcard pattern:
// a*b?C(*)
//
/// @brief Instantiates a pattern with the pattern string and the functors
/// @param pattern The pattern string
/// @param stringBegin The functor that provides the beginning of the string
/// @param stringEnd The functor that determines the end of the string
/// @remarks A typical wildcard pattern is like: a*b?C(*)D\)
/// where normal characters (alphanumerics, punctuation etc) expect an exact match, asterisks match any
/// string of any length, question marks match any single character and an escape character
/// (back-slash) turns a succeeding special character into a normal matching character.
Pattern(TStringRef pattern, StringBeginFunctor stringBegin, StringEndFunctor stringEnd)
: _pattern(pattern), _getStringBegin(stringBegin), _isStringEnd(stringEnd)
{
PreProcessParentheses();
}

/// @brief Instantiates a pattern with the pattern string
/// @param pattern The pattern string
Pattern(TStringRef pattern) : _pattern(pattern)
{
PreProcessParentheses();
}

private:
/// @brief Creates the mapping from parenthesis pointer to index from the pattern string
void PreProcessParentheses()
{
_mapIterToIndex.clear();
int openingIndex = 0;
int closingIndex = 0;
for (CharIter iter = GetBegin(); !IsEnd(iter); ++iter)
{
if (*iter=='\\')
{
++iter; // skip the character that follows
}
else if (*iter == '(')
{
_mapIterToIndex[iter] = closingIndex = openingIndex++;
}
else if (*iter == ')')
{
// NOTE We don't need to differentiate opening and closing parentheses as
// the matcher has the knowledge of the pattern characters
_mapIterToIndex[iter] = closingIndex--;
}
}
}

public:
/// @brief Returns the beginning of the pattern string
/// @return The iterator point to the beginning of the pattern string
CharIter GetBegin()
{
return _getStringBegin(_pattern);
}

/// @brief Determines if the iterator is at the end of the pattern string
/// @param iter The iterator in question
/// @return true if the iterator is at the end of the pattern string
bool IsEnd(CharIter iter)
{
return _isStringEnd(iter, _pattern);
}

/// @brief Returns the match entry index for the specified parenthesis pointer
/// @return The match entry index
int PatternIterToIndex(CharIter patternIter)
{
return _mapIterToIndex[patternIter];
}

/// @brief Returns the distance between two characters (the number of characters in between plus one)
/// @param iterBegin The iterator that points to the character on the left hand
/// @param iterEnd The iterator that points to the character on the right hand
/// @return The distance
size_t GetQuotedLength(CharIter iterBegin, CharIter iterEnd)
{
return _getCharDist(_pattern, iterBegin, iterEnd);
}
};

/// @brief A class that converts a wildcard pattern to its equivalent regular expression
/// @param TPattern The type of the pattern class
/// @param TRegexStringRef The type of the reference to the string for regular expression
/// @param TRegexCharIter The iterator through characters in the string for regular expression
/// @param TPatternFunctorSelector The functor selector for pattern
/// @param TRegexAppendCharFunctorSelector The append-character functor selector for regular expression
/// @remarks NOTE TRegexChar has to be compatible with the character type TPattern::CharIter iterates through
template <class TPattern=Pattern<>, class TRegexStringRef=char*, class TRegexCharIter=char*,
class TRegexChar=char, class TRegexAppendCharFunctorSelector=DefaultAppendCharFunctorSelector>
class WildCardToRegex
{
public:
/// @brief The type of the reference to regular expression string
typedef TRegexStringRef RegexStringRef;
/// @brief The type of the iterator through the characters in the regular expression string
typedef TRegexCharIter RegexCharIter;
/// @brief The type of the character that can be appended to the regular expression string
typedef TRegexChar RegexChar;

/// @brief The type of the reference to the wildcard string
typedef typename TPattern::StringRef PatternStringRef;
/// @brief The type of the iterator through the characters in the wildcard string
typedef typename TPattern::CharIter PatternStringIter;

/// @brief The type of the functor that appends a character to the regular expression string
typedef typename TRegexAppendCharFunctorSelector::template rebind<RegexStringRef, RegexCharIter, RegexChar>::AppendCharFunctor
RegexAppendCharFunctor;

private:
/// @brief The functor that appends character to the regular expression string
RegexAppendCharFunctor _regexAppendChar;

public:
/// @brief Initialises a WildCardToRegex with the specified functor instances
/// @param regexAppendChar The functor that appends character to the regular expression string
WildCardToRegex(RegexAppendCharFunctor &regexAppendChar)
: _regexAppendChar(regexAppendChar)
{
}

/// @brief Initialises a WildCardToRegex with the default settings
WildCardToRegex()
{
}

public:
/// @brief Converts a wildcard string to its equivalent regular expression
/// @remarks This is supposed to comply with the rules set by the regex implementation in QSharp
/// See https://qsharp.codeplex.com/SourceControl/latest#QSharp/QSharp.String.Rex/Creator.cs
/// for more detail. It has yet to be tested though.
void Convert(TPattern &pattern, TRegexStringRef regex, TRegexCharIter iterRegex)
{
for (PatternStringIter iter = pattern.GetBegin(); !pattern.IsEnd(iter); ++iter)
{
switch (*iter)
{
case '\\':
_regexAppendChar(regex, iterRegex, *iter);
++iter;
if (!pattern.IsEnd(iter))
{
_regexAppendChar(regex, iterRegex, *iter);
}
else
{
_regexAppendChar(regex, iterRegex, '\\');
}
break;
case '*':
_regexAppendChar(regex, iterRegex, '.');
_regexAppendChar(regex, iterRegex, *iter);
break;
case '?':
_regexAppendChar(regex, iterRegex, '.');
break;
case '(': case ')':
_regexAppendChar(regex, iterRegex, *iter);
break;
case '[': case ']': case '{': case '}': case '^': case '.': case '-': case '+':
_regexAppendChar(regex, iterRegex, '\\');
_regexAppendChar(regex, iterRegex, *iter);
break;
default:
_regexAppendChar(regex, iterRegex, *iter);
break;
}
}
}
};

/// @brief A class that represents a match of quotation enclosed by a pair of parentheses in the pattern
/// @param TCharIter The type of iterator through the source string
/// @param TDiff The type of an integer number that indicates the length of string or the distance between characters
template <class TCharIter=char*, class TDiff=size_t>
class MatchQuote
{
public:
/// @brief The type of iterator through the source string
typedef TCharIter CharIter;

/// @brief The type of an integer number that indicates the length of string or the distance between characters
typedef TDiff Diff;

public:
/// @brief The beginning of the substring that matches
CharIter Begin;

/// @brief The end of the substring that matches
CharIter End;
};

/// @brief A class that contains all the matched quotations
/// @param TCharIter The iterator through the source string
/// @param TDiff The type of the integer that indicates a string length or a character distance
template <class TCharIter=char*, class TDiff=size_t>
class MatchResult
{
public:
/// @brief The iterator through the source string
typedef TCharIter CharIter;
/// @brief The type of the integer that indicates a string length or a character distance
typedef TDiff Diff;
/// @brief The type of match entries listed in this object
typedef MatchQuote<CharIter, Diff> MatchType;

public:
/// @brief A list of matched quotation entries
std::vector<MatchType> Matches;

public:
/// @brief Records the beginning of a quotation encountered
/// @param index The index of the match entry
/// @param iterChar The pointer to the source string where the quotation starts
void Open(int index, CharIter iterChar)
{
while (index >= Matches.size())
{
Matches.push_back(MatchType());
}
Matches[index].Begin = iterChar;
}

/// @brief Records the end of a quotation encountered
/// @param index The index of the match entry
/// @param iterChar The pointer to the source string where the quotation ends
void Close(int index, CharIter iterChar)
{
// cell index must have already been allocated in the array of Matches
Matches[index].End = iterChar;
}
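};

/// @brief A default trait class that provides types needed by Matcher
/// @param TStringRef The type of the reference to the source string
/// @param TCharIter The type of iterator through the characters in the source string
/// @param TDiff The type of the integer that indicates a string length or a character distance
/// @param TStringFunctorSelector The selector that provides the string functors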
template <class TStringRef=char*, class TCharIter=char*, class TDiff=size_t,
class TStringFunctorSelector=DefaultStringFunctorSelector>
struct MatcherTraits
{
/// @brief The type of the reference to the source string
typedef TStringRef StringRef;
/// @brief The type of iterator through the characters in the source string
typedef TCharIter CharIter;

/// @brief The type of the match result (matched quotation entry container)
typedef MatchResult<TCharIter, TDiff> MatchResultType;
/// @brief The type of the reference to the match result
typedef MatchResultType & MatchResultRef;

/// @brief The type of the functor that returns the beginning of a string
typedef typename TStringFunctorSelector::template rebind<StringRef, CharIter>::StringBeginFunctor StringBeginFunctor;
/// @brief The type of the functor that determines if an iterator is at the end of a string
typedef typename TStringFunctorSelector::template rebind<StringRef, CharIter>::StringEndFunctor StringEndFunctor;
};

/// @brief A wildcard string matcher
template <class Traits = MatcherTraits<>>
class Matcher
{
public:
/// @brief The type of the reference to the source string
typedef typename Traits::StringRef StringRef;
/// @brief The type of iterator through the characters in the source string
typedef typename Traits::CharIter CharIter;

/// @brief The type of the reference to the match result
typedef typename Traits::MatchResultRef MatchResultRef;

/// @brief The type of the functor that returns the beginning of a string
typedef typename Traits::StringBeginFunctor StringBeginFunctor;
/// @brief The type of the functor that determines if an iterator is at the end of a string
typedef typename Traits::StringEndFunctor StringEndFunctor;

private:
/// @brief The functor that returns the beginning of a string
StringBeginFunctor _stringBegin;

/// @brief The functor that determines if an iterator is at the end of a string
StringEndFunctor _stringEnd;

public:
/// @brief Instantiates a Matcher with the specified string functor instances
/// @param stringBegin The functor that returns the beginning of a string
/// @param stringEnd The functor that determines if an iterator is at the end of a string
Matcher(StringBeginFunctor &stringBegin, StringEndFunctor &stringEnd)
: _stringBegin(stringBegin), _stringEnd(stringEnd)
{
}

/// @brief Instantiates a Matcher with default settings
Matcher()
{
}

public:
/// @brief Matches the source to the pattern
/// @param source The source string to match
/// @param pattern The pattern to match against
/// @param matchResult The container of matched quotation entries
/// @return true if the matching is successful (the pattern is completely consumed)
template <class TPattern>
bool Match(StringRef source, TPattern &pattern, MatchResultRef matchResult)
{
CharIter iterSource = _stringBegin(source);
typename TPattern::CharIter iterPattern = pattern.GetBegin();
return Match(source, iterSource, pattern, iterPattern, matchResult);
}

/// @brief Matches the source to the pattern (recursive)
/// @param source The source string to match
/// @param iterSource The iterator through the source string at its current position
/// @param pattern The pattern to match against
/// @param iterPattern The iterator through the pattern string at its current position
/// @param matchResult The container of matched quotation entries
/// @return true if the matching is successful (the pattern is completely consumed)
template <class TPattern>
bool Match(StringRef source, CharIter &iterSource, TPattern &pattern, typename TPattern::CharIter &iterPattern,
MatchResultRef matchResult)
{
while (! pattern.IsEnd(iterPattern))
{
if (*iterPattern == '\\')
{
++iterPattern;
}
else if (*iterPattern == '*')
{
CharIter savedIterSource = iterSource;
typename TPattern::CharIter savedIterPattern = iterPattern;
// greedy strategy
if (!_stringEnd(savedIterSource, source))
{
++iterSource;
if (Match(source, iterSource, pattern, savedIterPattern, matchResult))
{
return true;
}
}
++iterPattern;
if (Match(source, savedIterSource, pattern, iterPattern, matchResult))
{
return true;
}
return false;
}
else if (*iterPattern == '?')
{
if (_stringEnd(iterSource, source))
{
return false;
}
++iterPattern;
++iterSource;
continue;
}
else if (*iterPattern == '(')
{
int index = pattern.PatternIterToIndex(iterPattern);
matchResult.Open(index, iterSource);
++iterPattern;
continue;
}
else if (*iterPattern == ')')
{
int index = pattern.PatternIterToIndex(iterPattern);
matchResult.Close(index, iterSource);
++iterPattern;
continue;
}

if (!_stringEnd(iterSource, source) && *iterPattern == *iterSource)
{
++iterPattern;
++iterSource;
}
else
{
return false;
}
}
return true;
}
};
}}}

#endif
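
The conversion performed by WildCardToRegex::Convert can also be sketched in a standalone form. The version below is not the library code: wildcardToRegex and appendLiteral are hypothetical helpers, and the output targets the ECMAScript syntax used by std::regex, which is an assumption made here so the result can be checked. The original targets the regex implementation in QSharp, whose escaping rules may differ.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: appends `ch` so that the regex matches it literally.
void appendLiteral(std::string& out, char ch)
{
    static const std::string meta = "\\^$.|?*+()[]{}";
    if (meta.find(ch) != std::string::npos)
    {
        out += '\\';   // escape regex metacharacters
    }
    out += ch;
}

// Hypothetical helper: converts a wildcard pattern to an ECMAScript regex.
std::string wildcardToRegex(const std::string& pattern)
{
    std::string regex;
    for (std::size_t i = 0; i < pattern.size(); ++i)
    {
        const char ch = pattern[i];
        switch (ch)
        {
        case '\\':
            // Escaped character: match the next character literally;
            // a trailing back-slash matches itself.
            appendLiteral(regex, (i + 1 < pattern.size()) ? pattern[++i] : '\\');
            break;
        case '*':
            regex += ".*";     // zero or more characters
            break;
        case '?':
            regex += '.';      // exactly one character
            break;
        case '(': case ')':
            regex += ch;       // parentheses become capture groups
            break;
        default:
            appendLiteral(regex, ch);
            break;
        }
    }
    return regex;
}
```

For the sample pattern a*b?C(*)D\) this sketch produces a.*b.C(.*)D\), and passing that to std::regex_match with the source aXXbYCfooD) captures foo in the first group.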
