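While using Umbraco Examine search on a French-language site in one of our projects, the client asked that

searching for "expérience" and "experience" return the same results. In other words, the search needs to be accent-insensitive.

Solution:

We need to subclass Analyzer (Lucene.Net.Analysis.Analyzer) and override its TokenStream method: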

    using System.IO;
    using Lucene.Net.Analysis;
    using Lucene.Net.Analysis.Standard;

    namespace MyNamespace
    {
        // Analyzer that lowercases tokens and folds accented characters to ASCII.
        public class CustomAnalyserService : Analyzer
        {
            public override TokenStream TokenStream(string fieldName, TextReader reader)
            {
                StandardTokenizer tokenizer = new StandardTokenizer(Lucene.Net.Util.Version.LUCENE_29, reader);

                // The argument was missing in the original listing; 255 is assumed here
                // (the default maximum token length used by StandardAnalyzer).
                tokenizer.SetMaxTokenLength(255);
                TokenStream stream = new StandardFilter(tokenizer);
                stream = new LowerCaseFilter(stream);
                // Fold accented characters to ASCII, e.g. "é" becomes "e".
                return new ASCIIFoldingFilter(stream);
            }
        }
    }

Next, update the Examine configuration in Umbraco by editing ExamineSettings.config.

Before

    <Examine RebuildOnAppStart="true">
      <ExamineIndexProviders>
        <providers>

          <!-- default external indexer, which excludes protected and unpublished pages -->
          <add name="ExternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
               analyzer="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net" />

        </providers>
      </ExamineIndexProviders>

      <ExamineSearchProviders defaultProvider="ExternalSearcher">
        <providers>

          <add name="ExternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
               analyzer="Lucene.Net.Analysis.WhitespaceAnalyzer, Lucene.Net" />
        </providers>
      </ExamineSearchProviders>
    </Examine>

After

    <Examine RebuildOnAppStart="true">
      <ExamineIndexProviders>
        <providers>

          <!-- default external indexer, which excludes protected and unpublished pages -->
          <add name="ExternalIndexer" type="UmbracoExamine.UmbracoContentIndexer, UmbracoExamine"
               analyzer="MyNamespace.CustomAnalyserService, MyNamespace" />

        </providers>
      </ExamineIndexProviders>

      <ExamineSearchProviders defaultProvider="ExternalSearcher">
        <providers>

          <add name="ExternalSearcher" type="UmbracoExamine.UmbracoExamineSearcher, UmbracoExamine"
               analyzer="MyNamespace.CustomAnalyserService, MyNamespace" />
        </providers>
      </ExamineSearchProviders>
    </Examine>

The only change is the analyzer attribute, which now points to your own class and assembly:

    analyzer="[Namespace].[Class], [AssemblyWithoutDotDll]"

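With this change, ExternalSearcher still handles English searches normally and also supports accent-insensitive searches on the French site (e.g. "expérience" and "experience" return the same results). Because the indexer's analyzer has changed too, the existing index has to be rebuilt before the new behaviour applies to content that was already indexed.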
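To see why both spellings end up matching, it helps to look at what lowercasing plus accent folding does to a single term. The sketch below is only an illustration and is not Lucene's ASCIIFoldingFilter: it uses .NET's built-in Unicode normalization instead, and the names `FoldingSketch` and `Fold` are hypothetical. Characters that have no canonical decomposition (for example "ø" or "ł") are left unchanged by this approach, whereas the Lucene filter maps those as well.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class FoldingSketch
{
    // Hypothetical helper: lowercases the term, decomposes accented letters
    // into base letter + combining mark (NFD), then drops the combining marks.
    public static string Fold(string term)
    {
        string decomposed = term.ToLowerInvariant().Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }

    public static void Main()
    {
        // Both spellings fold to the same token, so they match the same documents.
        Console.WriteLine(Fold("expérience") == Fold("experience")); // True
    }
}
```

The indexed text and the query go through the same folding, which is why the same analyzer is configured for both the indexer and the searcher.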

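What if accent-insensitive French search still does not work in some cases?

Then handle it on the search-term side: when the user searches for "expérience", convert the term to "experience" before running the query, using the following method: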

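    using Lucene.Net.Util; // provides ArrayUtil

    public static class SearchHelper
    {
        // Folds the non-ASCII characters in input[0..length) to their closest ASCII equivalents.
        public static string FoldToASCII(char[] input, int length)
        {
            char[] output;
            int outputPos;
            // The numeric literals were missing in the original listing. The values used here
            // are assumptions: 4 (the longest replacement below, such as "(10)", is 4 chars),
            // 0 for the start positions, and 128 (the first non-ASCII code point).
            int targetSize = 4 * length;
            output = new char[ArrayUtil.GetNextSize(targetSize)];
            outputPos = 0;
            for (int index = 0; index < length; ++index)
            {
                char ch = input[index];
                if ((int)ch < 128)
                {
                    // Plain ASCII is copied through unchanged.
                    output[outputPos++] = ch;
                }
                else
                {
                    // Note: plain ASCII labels in this switch (e.g. '!' or 'A') can never match,
                    // because ch >= 128 here; they were presumably fullwidth forms
                    // (U+FF01..U+FF5E) that were flattened when the listing was copied.
                    switch (ch)
                    {
                        case '\uFB00': // ligature "ff"
                            output[outputPos++] = 'f';
                            output[outputPos++] = 'f';
                            continue;
                        case '\uFB01': // ligature "fi"
                            output[outputPos++] = 'f';
                            output[outputPos++] = 'i';
                            continue;
                        case '\uFB02': // ligature "fl"
                            output[outputPos++] = 'f';
                            output[outputPos++] = 'l';
                            continue;
                        case '\uFB03': // ligature "ffi"
                            output[outputPos++] = 'f';
                            output[outputPos++] = 'f';
                            output[outputPos++] = 'i';
                            continue;
                        case '\uFB04': // ligature "ffl"
                            output[outputPos++] = 'f';
                            output[outputPos++] = 'f';
                            output[outputPos++] = 'l';
                            continue;
                        case '\uFB06': // ligature "st" (code point assumed to be U+FB06)
                            output[outputPos++] = 's';
                            output[outputPos++] = 't';
                            continue;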
                        case '!':
                            output[outputPos++] = '!';
                            continue;
                        case '"':
                        case '❝':
                        case '❞':
                        case '❮':
                        case '❯':
                        case '″':
                        case '‶':
                        case '«':
                        case '»':
                        case '“':
                        case '”':
                        case '„':
                            output[outputPos++] = '"';
                            continue;
                        case '#':
                            output[outputPos++] = '#';
                            continue;
                        case '$':
                            output[outputPos++] = '$';
                            continue;
                        case '%':
                        case '⁒':
                            output[outputPos++] = '%';
                            continue;
                        case '&':
                            output[outputPos++] = '&';
                            continue;
                        case '\'': // the apostrophe must be escaped inside a char literal
                        case '❛':
                        case '❜':
                        case '′':
                        case '‵':
                        case '‹':
                        case '›':
                        case '‘':
                        case '’':
                        case '‚':
                        case '‛':
                            output[outputPos++] = '\'';
                            continue;
                        case '(':
                        case '❨':
                        case '❪':
                        case '⁽':
                        case '₍':
                            output[outputPos++] = '(';
                            continue;
                        case ')':
                        case '❩':
                        case '❫':
                        case '⁾':
                        case '₎':
                            output[outputPos++] = ')';
                            continue;
                        case '*':
                        case '⁎':
                            output[outputPos++] = '*';
                            continue;
                        case '+':
                        case '⁺':
                        case '₊':
                            output[outputPos++] = '+';
                            continue;
                        case ',':
                            output[outputPos++] = ',';
                            continue;
                        case '-':
                        case '⁻':
                        case '₋':
                        case '‐':
                        case '‑':
                        case '‒':
                        case '–':
                        case '—':
                            output[outputPos++] = '-';
                            continue;
                        case '.':
                            output[outputPos++] = '.';
                            continue;
                        case '/':
                        case '⁄':
                            output[outputPos++] = '/';
                            continue;
                        // The digit outputs below were empty in the original listing; they are
                        // restored from the meaning of the code points in each group.
                        case '0':
                        case '\x2070':
                        case '\x2080':
                        case '\x24EA':
                        case '\x24FF':
                            output[outputPos++] = '0';
                            continue;
                        case '1':
                        case '\x2776':
                        case '\x2780':
                        case '\x278A':
                        case '\x2081':
                        case '\x2460':
                        case '\x24F5':
                        case '\x00B9':
                            output[outputPos++] = '1';
                            continue;
                        case '2':
                        case '\x2777':
                        case '\x2781':
                        case '\x278B':
                        case '\x2082':
                        case '\x2461':
                        case '\x24F6':
                        case '\x00B2':
                            output[outputPos++] = '2';
                            continue;
                        case '3':
                        case '\x2778':
                        case '\x2782':
                        case '\x278C':
                        case '\x2083':
                        case '\x2462':
                        case '\x24F7':
                        case '\x00B3':
                            output[outputPos++] = '3';
                            continue;
                        case '4':
                        case '\x2779':
                        case '\x2783':
                        case '\x278D':
                        case '\x2074':
                        case '\x2084':
                        case '\x2463':
                        case '\x24F8':
                            output[outputPos++] = '4';
                            continue;
                        case '5':
                        case '\x277A':
                        case '\x2784':
                        case '\x278E':
                        case '\x2075':
                        case '\x2085':
                        case '\x2464':
                        case '\x24F9':
                            output[outputPos++] = '5';
                            continue;
                        case '6':
                        case '\x277B':
                        case '\x2785':
                        case '\x278F':
                        case '\x2076':
                        case '\x2086':
                        case '\x2465':
                        case '\x24FA':
                            output[outputPos++] = '6';
                            continue;
                        case '7':
                        case '\x277C':
                        case '\x2786':
                        case '\x2790':
                        case '\x2077':
                        case '\x2087':
                        case '\x2466':
                        case '\x24FB':
                            output[outputPos++] = '7';
                            continue;
                        case '8':
                        case '\x277D':
                        case '\x2787':
                        case '\x2791':
                        case '\x2078':
                        case '\x2088':
                        case '\x2467':
                        case '\x24FC':
                            output[outputPos++] = '8';
                            continue;
                        case '9':
                        case '\x277E':
                        case '\x2788':
                        case '\x2792':
                        case '\x2079':
                        case '\x2089':
                        case '\x2468':
                        case '\x24FD':
                            output[outputPos++] = '9';
                            continue;
                        case ':':
                            output[outputPos++] = ':';
                            continue;
                        case ';':
                        case '⁏':
                            output[outputPos++] = ';';
                            continue;
                        case '<':
                        case '❬':
                        case '❰':
                            output[outputPos++] = '<';
                            continue;
                        case '=':
                        case '⁼':
                        case '₌':
                            output[outputPos++] = '=';
                            continue;
                        case '>':
                        case '❭':
                        case '❱':
                            output[outputPos++] = '>';
                            continue;
                        case '?':
                            output[outputPos++] = '?';
                            continue;
                        case '@':
                            output[outputPos++] = '@';
                            continue;
                        case 'A':
                        case 'Ⓐ':
                        case 'À':
                        case 'Á':
                        case 'Â':
                        case 'Ã':
                        case 'Ä':
                        case 'Å':
                        case 'Ā':
                        case 'Ă':
                        case 'Ą':
                        case 'Ə':
                        case 'Ǎ':
                        case 'Ǟ':
                        case 'Ǡ':
                        case 'Ǻ':
                        case 'Ȁ':
                        case 'Ȃ':
                        case 'Ȧ':
                        case 'Ⱥ':
                        case 'ᴀ':
                        case 'Ḁ':
                        case 'Ạ':
                        case 'Ả':
                        case 'Ấ':
                        case 'Ầ':
                        case 'Ẩ':
                        case 'Ẫ':
                        case 'Ậ':
                        case 'Ắ':
                        case 'Ằ':
                        case 'Ẳ':
                        case 'Ẵ':
                        case 'Ặ':
                            output[outputPos++] = 'A';
                            continue;
                        case 'B':
                        case 'Ⓑ':
                        case 'Ɓ':
                        case 'Ƃ':
                        case 'Ƀ':
                        case 'ʙ':
                        case 'ᴃ':
                        case 'Ḃ':
                        case 'Ḅ':
                        case 'Ḇ':
                            output[outputPos++] = 'B';
                            continue;
                        case 'C':
                        case 'Ⓒ':
                        case 'Ç':
                        case 'Ć':
                        case 'Ĉ':
                        case 'Ċ':
                        case 'Č':
                        case 'Ƈ':
                        case 'Ȼ':
                        case 'ʗ':
                        case 'ᴄ':
                        case 'Ḉ':
                            output[outputPos++] = 'C';
                            continue;
                        case 'D':
                        case 'Ꝺ':
                        case 'Ⓓ':
                        case 'Ð':
                        case 'Ď':
                        case 'Đ':
                        case 'Ɖ':
                        case 'Ɗ':
                        case 'Ƌ':
                        case 'ᴅ':
                        case 'ᴆ':
                        case 'Ḋ':
                        case 'Ḍ':
                        case 'Ḏ':
                        case 'Ḑ':
                        case 'Ḓ':
                            output[outputPos++] = 'D';
                            continue;
                        case 'E':
                        case 'ⱻ':
                        case 'Ⓔ':
                        case 'È':
                        case 'É':
                        case 'Ê':
                        case 'Ë':
                        case 'Ē':
                        case 'Ĕ':
                        case 'Ė':
                        case 'Ę':
                        case 'Ě':
                        case 'Ǝ':
                        case 'Ɛ':
                        case 'Ȅ':
                        case 'Ȇ':
                        case 'Ȩ':
                        case 'Ɇ':
                        case 'ᴇ':
                        case 'Ḕ':
                        case 'Ḗ':
                        case 'Ḙ':
                        case 'Ḛ':
                        case 'Ḝ':
                        case 'Ẹ':
                        case 'Ẻ':
                        case 'Ẽ':
                        case 'Ế':
                        case 'Ề':
                        case 'Ể':
                        case 'Ễ':
                        case 'Ệ':
                            output[outputPos++] = 'E';
                            continue;
                        case 'F':
                        case 'ꜰ':
                        case 'Ꝼ':
                        case 'ꟻ':
                        case 'Ⓕ':
                        case 'Ƒ':
                        case 'Ḟ':
                            output[outputPos++] = 'F';
                            continue;
                        case 'G':
                        case 'Ᵹ':
                        case 'Ꝿ':
                        case 'Ⓖ':
                        case 'Ĝ':
                        case 'Ğ':
                        case 'Ġ':
                        case 'Ģ':
                        case 'Ɠ':
                        case 'Ǥ':
                        case 'ǥ':
                        case 'Ǧ':
                        case 'ǧ':
                        case 'Ǵ':
                        case 'ɢ':
                        case 'ʛ':
                        case 'Ḡ':
                            output[outputPos++] = 'G';
                            continue;
                        case 'H':
                        case 'Ⱨ':
                        case 'Ⱶ':
                        case 'Ⓗ':
                        case 'Ĥ':
                        case 'Ħ':
                        case 'Ȟ':
                        case 'ʜ':
                        case 'Ḣ':
                        case 'Ḥ':
                        case 'Ḧ':
                        case 'Ḩ':
                        case 'Ḫ':
                            output[outputPos++] = 'H';
                            continue;
                        case 'I':
                        case 'ꟾ':
                        case 'Ⓘ':
                        case 'Ì':
                        case 'Í':
                        case 'Î':
                        case 'Ï':
                        case 'Ĩ':
                        case 'Ī':
                        case 'Ĭ':
                        case 'Į':
                        case 'İ':
                        case 'Ɩ':
                        case 'Ɨ':
                        case 'Ǐ':
                        case 'Ȉ':
                        case 'Ȋ':
                        case 'ɪ':
                        case 'ᵻ':
                        case 'Ḭ':
                        case 'Ḯ':
                        case 'Ỉ':
                        case 'Ị':
                            output[outputPos++] = 'I';
                            continue;
                        case 'J':
                        case 'Ⓙ':
                        case 'Ĵ':
                        case 'Ɉ':
                        case 'ᴊ':
                            output[outputPos++] = 'J';
                            continue;
                        case 'K':
                        case 'Ꝁ':
                        case 'Ꝃ':
                        case 'Ꝅ':
                        case 'Ⱪ':
                        case 'Ⓚ':
                        case 'Ķ':
                        case 'Ƙ':
                        case 'Ǩ':
                        case 'ᴋ':
                        case 'Ḱ':
                        case 'Ḳ':
                        case 'Ḵ':
                            output[outputPos++] = 'K';
                            continue;
                        case 'L':
                        case 'Ꝇ':
                        case 'Ꝉ':
                        case 'Ꞁ':
                        case 'Ⱡ':
                        case 'Ɫ':
                        case 'Ⓛ':
                        case 'Ĺ':
                        case 'Ļ':
                        case 'Ľ':
                        case 'Ŀ':
                        case 'Ł':
                        case 'Ƚ':
                        case 'ʟ':
                        case 'ᴌ':
                        case 'Ḷ':
                        case 'Ḹ':
                        case 'Ḻ':
                        case 'Ḽ':
                            output[outputPos++] = 'L';
                            continue;
                        case 'M':
                        case 'ꟽ':
                        case 'ꟿ':
                        case 'Ɱ':
                        case 'Ⓜ':
                        case 'Ɯ':
                        case 'ᴍ':
                        case 'Ḿ':
                        case 'Ṁ':
                        case 'Ṃ':
                            output[outputPos++] = 'M';
                            continue;
                        case 'N':
                        case 'Ⓝ':
                        case 'Ñ':
                        case 'Ń':
                        case 'Ņ':
                        case 'Ň':
                        case 'Ŋ':
                        case 'Ɲ':
                        case 'Ǹ':
                        case 'Ƞ':
                        case 'ɴ':
                        case 'ᴎ':
                        case 'Ṅ':
                        case 'Ṇ':
                        case 'Ṉ':
                        case 'Ṋ':
                            output[outputPos++] = 'N';
                            continue;
                        case 'O':
                        case 'Ꝋ':
                        case 'Ꝍ':
                        case 'Ⓞ':
                        case 'Ò':
                        case 'Ó':
                        case 'Ô':
                        case 'Õ':
                        case 'Ö':
                        case 'Ø':
                        case 'Ō':
                        case 'Ŏ':
                        case 'Ő':
                        case 'Ɔ':
                        case 'Ɵ':
                        case 'Ơ':
                        case 'Ǒ':
                        case 'Ǫ':
                        case 'Ǭ':
                        case 'Ǿ':
                        case 'Ȍ':
                        case 'Ȏ':
                        case 'Ȫ':
                        case 'Ȭ':
                        case 'Ȯ':
                        case 'Ȱ':
                        case 'ᴏ':
                        case 'ᴐ':
                        case 'Ṍ':
                        case 'Ṏ':
                        case 'Ṑ':
                        case 'Ṓ':
                        case 'Ọ':
                        case 'Ỏ':
                        case 'Ố':
                        case 'Ồ':
                        case 'Ổ':
                        case 'Ỗ':
                        case 'Ộ':
                        case 'Ớ':
                        case 'Ờ':
                        case 'Ở':
                        case 'Ỡ':
                        case 'Ợ':
                            output[outputPos++] = 'O';
                            continue;
                        case 'P':
                        case 'Ꝑ':
                        case 'Ꝓ':
                        case 'Ꝕ':
                        case 'Ᵽ':
                        case 'Ⓟ':
                        case 'Ƥ':
                        case 'ᴘ':
                        case 'Ṕ':
                        case 'Ṗ':
                            output[outputPos++] = 'P';
                            continue;
                        case 'Q':
                        case 'Ꝗ':
                        case 'Ꝙ':
                        case 'Ⓠ':
                        case 'Ɋ':
                            output[outputPos++] = 'Q';
                            continue;
                        case 'R':
                        case 'Ꝛ':
                        case 'Ꞃ':
                        case 'Ɽ':
                        case 'Ⓡ':
                        case 'Ŕ':
                        case 'Ŗ':
                        case 'Ř':
                        case 'Ȑ':
                        case 'Ȓ':
                        case 'Ɍ':
                        case 'ʀ':
                        case 'ʁ':
                        case 'ᴙ':
                        case 'ᴚ':
                        case 'Ṙ':
                        case 'Ṛ':
                        case 'Ṝ':
                        case 'Ṟ':
                            output[outputPos++] = 'R';
                            continue;
                        case 'S':
                        case 'ꜱ':
                        case 'ꞅ':
                        case 'Ⓢ':
                        case 'Ś':
                        case 'Ŝ':
                        case 'Ş':
                        case 'Š':
                        case 'Ș':
                        case 'Ṡ':
                        case 'Ṣ':
                        case 'Ṥ':
                        case 'Ṧ':
                        case 'Ṩ':
                            output[outputPos++] = 'S';
                            continue;
                        case 'T':
                        case 'Ꞇ':
                        case 'Ⓣ':
                        case 'Ţ':
                        case 'Ť':
                        case 'Ŧ':
                        case 'Ƭ':
                        case 'Ʈ':
                        case 'Ț':
                        case 'Ⱦ':
                        case 'ᴛ':
                        case 'Ṫ':
                        case 'Ṭ':
                        case 'Ṯ':
                        case 'Ṱ':
                            output[outputPos++] = 'T';
                            continue;
                        case 'U':
                        case 'Ⓤ':
                        case 'Ù':
                        case 'Ú':
                        case 'Û':
                        case 'Ü':
                        case 'Ũ':
                        case 'Ū':
                        case 'Ŭ':
                        case 'Ů':
                        case 'Ű':
                        case 'Ų':
                        case 'Ư':
                        case 'Ǔ':
                        case 'Ǖ':
                        case 'Ǘ':
                        case 'Ǚ':
                        case 'Ǜ':
                        case 'Ȕ':
                        case 'Ȗ':
                        case 'Ʉ':
                        case 'ᴜ':
                        case 'ᵾ':
                        case 'Ṳ':
                        case 'Ṵ':
                        case 'Ṷ':
                        case 'Ṹ':
                        case 'Ṻ':
                        case 'Ụ':
                        case 'Ủ':
                        case 'Ứ':
                        case 'Ừ':
                        case 'Ử':
                        case 'Ữ':
                        case 'Ự':
                            output[outputPos++] = 'U';
                            continue;
                        case 'V':
                        case 'Ꝟ':
                        case 'Ꝩ':
                        case 'Ⓥ':
                        case 'Ʋ':
                        case 'Ʌ':
                        case 'ᴠ':
                        case 'Ṽ':
                        case 'Ṿ':
                        case 'Ỽ':
                            output[outputPos++] = 'V';
                            continue;
                        case 'W':
                        case 'Ⱳ':
                        case 'Ⓦ':
                        case 'Ŵ':
                        case 'Ƿ':
                        case 'ᴡ':
                        case 'Ẁ':
                        case 'Ẃ':
                        case 'Ẅ':
                        case 'Ẇ':
                        case 'Ẉ':
                            output[outputPos++] = 'W';
                            continue;
                        case 'X':
                        case 'Ⓧ':
                        case 'Ẋ':
                        case 'Ẍ':
                            output[outputPos++] = 'X';
                            continue;
                        case 'Y':
                        case 'Ⓨ':
                        case 'Ý':
                        case 'Ŷ':
                        case 'Ÿ':
                        case 'Ƴ':
                        case 'Ȳ':
                        case 'Ɏ':
                        case 'ʏ':
                        case 'Ẏ':
                        case 'Ỳ':
                        case 'Ỵ':
                        case 'Ỷ':
                        case 'Ỹ':
                        case 'Ỿ':
                            output[outputPos++] = 'Y';
                            continue;
                        case 'Z':
                        case 'Ꝣ':
                        case 'Ⱬ':
                        case 'Ⓩ':
                        case 'Ź':
                        case 'Ż':
                        case 'Ž':
                        case 'Ƶ':
                        case 'Ȝ':
                        case 'Ȥ':
                        case 'ᴢ':
                        case 'Ẑ':
                        case 'Ẓ':
                        case 'Ẕ':
                            output[outputPos++] = 'Z';
                            continue;
                        case '[':
                        case '❲':
                        case '⁅':
                            output[outputPos++] = '[';
                            continue;
                        case '\\': // the backslash must be escaped inside a char literal
                            output[outputPos++] = '\\';
                            continue;
                        case ']':
                        case '❳':
                        case '⁆':
                            output[outputPos++] = ']';
                            continue;
                        case '^':
                        case '‸':
                            output[outputPos++] = '^';
                            continue;
                        case '_':
                            output[outputPos++] = '_';
                            continue;
                        case 'a':
                        case 'ⱥ':
                        case 'Ɐ':
                        case 'ₐ':
                        case 'ₔ':
                        case 'ⓐ':
                        case 'à':
                        case 'á':
                        case 'â':
                        case 'ã':
                        case 'ä':
                        case 'å':
                        case 'ā':
                        case 'ă':
                        case 'ą':
                        case 'ǎ':
                        case 'ǟ':
                        case 'ǡ':
                        case 'ǻ':
                        case 'ȁ':
                        case 'ȃ':
                        case 'ȧ':
                        case 'ɐ':
                        case 'ə':
                        case 'ɚ':
                        case 'ᶏ':
                        case 'ᶕ':
                        case 'ḁ':
                        case 'ẚ':
                        case 'ạ':
                        case 'ả':
                        case 'ấ':
                        case 'ầ':
                        case 'ẩ':
                        case 'ẫ':
                        case 'ậ':
                        case 'ắ':
                        case 'ằ':
                        case 'ẳ':
                        case 'ẵ':
                        case 'ặ':
                            output[outputPos++] = 'a';
                            continue;
                        case 'b':
                        case 'ⓑ':
                        case 'ƀ':
                        case 'ƃ':
                        case 'ɓ':
                        case 'ᵬ':
                        case 'ᶀ':
                        case 'ḃ':
                        case 'ḅ':
                        case 'ḇ':
                            output[outputPos++] = 'b';
                            continue;
                        case 'c':
                        case 'Ꜿ':
                        case 'ꜿ':
                        case 'ↄ':
                        case 'ⓒ':
                        case 'ç':
                        case 'ć':
                        case 'ĉ':
                        case 'ċ':
                        case 'č':
                        case 'ƈ':
                        case 'ȼ':
                        case 'ɕ':
                        case 'ḉ':
                            output[outputPos++] = 'c';
                            continue;
                        case 'd':
                        case 'ꝺ':
                        case 'ⓓ':
                        case 'ð':
                        case 'ď':
                        case 'đ':
                        case 'ƌ':
                        case 'ȡ':
                        case 'ɖ':
                        case 'ɗ':
                        case 'ᵭ':
                        case 'ᶁ':
                        case 'ᶑ':
                        case 'ḋ':
                        case 'ḍ':
                        case 'ḏ':
                        case 'ḑ':
                        case 'ḓ':
                            output[outputPos++] = 'd';
                            continue;
                        case 'e':
                        case 'ⱸ':
                        case 'ₑ':
                        case 'ⓔ':
                        case 'è':
                        case 'é':
                        case 'ê':
                        case 'ë':
                        case 'ē':
                        case 'ĕ':
                        case 'ė':
                        case 'ę':
                        case 'ě':
                        case 'ǝ':
                        case 'ȅ':
                        case 'ȇ':
                        case 'ȩ':
                        case 'ɇ':
                        case 'ɘ':
                        case 'ɛ':
                        case 'ɜ':
                        case 'ɝ':
                        case 'ɞ':
                        case 'ʚ':
                        case 'ᴈ':
                        case 'ᶒ':
                        case 'ᶓ':
                        case 'ᶔ':
                        case 'ḕ':
                        case 'ḗ':
                        case 'ḙ':
                        case 'ḛ':
                        case 'ḝ':
                        case 'ẹ':
                        case 'ẻ':
                        case 'ẽ':
                        case 'ế':
                        case 'ề':
                        case 'ể':
                        case 'ễ':
                        case 'ệ':
                            output[outputPos++] = 'e';
                            continue;
                        case 'f':
                        case 'ꝼ':
                        case 'ⓕ':
                        case 'ƒ':
                        case 'ᵮ':
                        case 'ᶂ':
                        case 'ḟ':
                        case 'ẛ':
                            output[outputPos++] = 'f';
                            continue;
                        case 'g':
                        case 'ꝿ':
                        case 'ⓖ':
                        case 'ĝ':
                        case 'ğ':
                        case 'ġ':
                        case 'ģ':
                        case 'ǵ':
                        case 'ɠ':
                        case 'ɡ':
                        case 'ᵷ':
                        case 'ᵹ':
                        case 'ᶃ':
                        case 'ḡ':
                            output[outputPos++] = 'g';
                            continue;
                        case 'h':
                        case 'ⱨ':
                        case 'ⱶ':
                        case 'ⓗ':
                        case 'ĥ':
                        case 'ħ':
                        case 'ȟ':
                        case 'ɥ':
                        case 'ɦ':
                        case 'ʮ':
                        case 'ʯ':
                        case 'ḣ':
                        case 'ḥ':
                        case 'ḧ':
                        case 'ḩ':
                        case 'ḫ':
                        case 'ẖ':
                            output[outputPos++] = 'h';
                            continue;
                        case 'i':
                        case 'ⁱ':
                        case 'ⓘ':
                        case 'ì':
                        case 'í':
                        case 'î':
                        case 'ï':
                        case 'ĩ':
                        case 'ī':
                        case 'ĭ':
                        case 'į':
                        case 'ı':
                        case 'ǐ':
                        case 'ȉ':
                        case 'ȋ':
                        case 'ɨ':
                        case 'ᴉ':
                        case 'ᵢ':
                        case 'ᵼ':
                        case 'ᶖ':
                        case 'ḭ':
                        case 'ḯ':
                        case 'ỉ':
                        case 'ị':
                            output[outputPos++] = 'i';
                            continue;
                        case 'j':
                        case 'ⱼ':
                        case 'ⓙ':
                        case 'ĵ':
                        case 'ǰ':
                        case 'ȷ':
                        case 'ɉ':
                        case 'ɟ':
                        case 'ʄ':
                        case 'ʝ':
                            output[outputPos++] = 'j';
                            continue;
                        case 'k':
                        case 'ꝁ':
                        case 'ꝃ':
                        case 'ꝅ':
                        case 'ⱪ':
                        case 'ⓚ':
                        case 'ķ':
                        case 'ƙ':
                        case 'ǩ':
                        case 'ʞ':
                        case 'ᶄ':
                        case 'ḱ':
                        case 'ḳ':
                        case 'ḵ':
                            output[outputPos++] = 'k';
                            continue;
                        case 'l':
                        case 'ꝇ':
                        case 'ꝉ':
                        case 'ꞁ':
                        case 'ⱡ':
                        case 'ⓛ':
                        case 'ĺ':
                        case 'ļ':
                        case 'ľ':
                        case 'ŀ':
                        case 'ł':
                        case 'ƚ':
                        case 'ȴ':
                        case 'ɫ':
                        case 'ɬ':
                        case 'ɭ':
                        case 'ᶅ':
                        case 'ḷ':
                        case 'ḹ':
                        case 'ḻ':
                        case 'ḽ':
                            output[outputPos++] = 'l';
                            continue;
                        case 'm':
                        case 'ⓜ':
                        case 'ɯ':
                        case 'ɰ':
                        case 'ɱ':
                        case 'ᵯ':
                        case 'ᶆ':
                        case 'ḿ':
                        case 'ṁ':
                        case 'ṃ':
                            output[outputPos++] = 'm';
                            continue;
                        case 'n':
                        case 'ⁿ':
                        case 'ⓝ':
                        case 'ñ':
                        case 'ń':
                        case 'ņ':
                        case 'ň':
                        case '\u0149': // "n preceded by apostrophe"; it appeared as two characters in the listing
                        case 'ŋ':
                        case 'ƞ':
                        case 'ǹ':
                        case 'ȵ':
                        case 'ɲ':
                        case 'ɳ':
                        case 'ᵰ':
                        case 'ᶇ':
                        case 'ṅ':
                        case 'ṇ':
                        case 'ṉ':
                        case 'ṋ':
                            output[outputPos++] = 'n';
                            continue;
                        case 'o':
                        case 'ꝋ':
                        case 'ꝍ':
                        case 'ⱺ':
                        case 'ₒ':
                        case 'ⓞ':
                        case 'ò':
                        case 'ó':
                        case 'ô':
                        case 'õ':
                        case 'ö':
                        case 'ø':
                        case 'ō':
                        case 'ŏ':
                        case 'ő':
                        case 'ơ':
                        case 'ǒ':
                        case 'ǫ':
                        case 'ǭ':
                        case 'ǿ':
                        case 'ȍ':
                        case 'ȏ':
                        case 'ȫ':
                        case 'ȭ':
                        case 'ȯ':
                        case 'ȱ':
                        case 'ɔ':
                        case 'ɵ':
                        case 'ᴖ':
                        case 'ᴗ':
                        case 'ᶗ':
                        case 'ṍ':
                        case 'ṏ':
                        case 'ṑ':
                        case 'ṓ':
                        case 'ọ':
                        case 'ỏ':
                        case 'ố':
                        case 'ồ':
                        case 'ổ':
                        case 'ỗ':
                        case 'ộ':
                        case 'ớ':
                        case 'ờ':
                        case 'ở':
                        case 'ỡ':
                        case 'ợ':
                            output[outputPos++] = 'o';
                            continue;
                        case 'p':
                        case 'ꝑ':
                        case 'ꝓ':
                        case 'ꝕ':
                        case 'ꟼ':
                        case 'ⓟ':
                        case 'ƥ':
                        case 'ᵱ':
                        case 'ᵽ':
                        case 'ᶈ':
                        case 'ṕ':
                        case 'ṗ':
                            output[outputPos++] = 'p';
                            continue;
                        case 'q':
                        case 'ꝗ':
                        case 'ꝙ':
                        case 'ⓠ':
                        case 'ĸ':
                        case 'ɋ':
                        case 'ʠ':
                            output[outputPos++] = 'q';
                            continue;
                        case 'r':
                        case 'ꝛ':
                        case 'ꞃ':
                        case 'ⓡ':
                        case 'ŕ':
                        case 'ŗ':
                        case 'ř':
                        case 'ȑ':
                        case 'ȓ':
                        case 'ɍ':
                        case 'ɼ':
                        case 'ɽ':
                        case 'ɾ':
                        case 'ɿ':
                        case 'ᵣ':
                        case 'ᵲ':
                        case 'ᵳ':
                        case 'ᶉ':
                        case 'ṙ':
                        case 'ṛ':
                        case 'ṝ':
                        case 'ṟ':
                            output[outputPos++] = 'r';
                            continue;
                        case 's':
                        case 'Ꞅ':
                        case 'ⓢ':
                        case 'ś':
                        case 'ŝ':
                        case 'ş':
                        case 'š':
                        case 'ſ':
                        case 'ș':
                        case 'ȿ':
                        case 'ʂ':
                        case 'ᵴ':
                        case 'ᶊ':
                        case 'ṡ':
                        case 'ṣ':
                        case 'ṥ':
                        case 'ṧ':
                        case 'ṩ':
                        case 'ẜ':
                        case 'ẝ':
                            output[outputPos++] = 's';
                            continue;
                        case 't':
                        case 'ⱦ':
                        case 'ⓣ':
                        case 'ţ':
                        case 'ť':
                        case 'ŧ':
                        case 'ƫ':
                        case 'ƭ':
                        case 'ț':
                        case 'ȶ':
                        case 'ʇ':
                        case 'ʈ':
                        case 'ᵵ':
                        case 'ṫ':
                        case 'ṭ':
                        case 'ṯ':
                        case 'ṱ':
                        case 'ẗ':
                            output[outputPos++] = 't';
                            continue;
                        case 'u':
                        case 'ⓤ':
                        case 'ù':
                        case 'ú':
                        case 'û':
                        case 'ü':
                        case 'ũ':
                        case 'ū':
                        case 'ŭ':
                        case 'ů':
                        case 'ű':
                        case 'ų':
                        case 'ư':
                        case 'ǔ':
                        case 'ǖ':
                        case 'ǘ':
                        case 'ǚ':
                        case 'ǜ':
                        case 'ȕ':
                        case 'ȗ':
                        case 'ʉ':
                        case 'ᵤ':
                        case 'ᶙ':
                        case 'ṳ':
                        case 'ṵ':
                        case 'ṷ':
                        case 'ṹ':
                        case 'ṻ':
                        case 'ụ':
                        case 'ủ':
                        case 'ứ':
                        case 'ừ':
                        case 'ử':
                        case 'ữ':
                        case 'ự':
                            output[outputPos++] = 'u';
                            continue;
                        case 'v':
                        case 'ꝟ':
                        case 'ⱱ':
                        case 'ⱴ':
                        case 'ⓥ':
                        case 'ʋ':
                        case 'ʌ':
                        case 'ᵥ':
                        case 'ᶌ':
                        case 'ṽ':
                        case 'ṿ':
                            output[outputPos++] = 'v';
                            continue;
                        case 'w':
                        case 'ⱳ':
                        case 'ⓦ':
                        case 'ŵ':
                        case 'ƿ':
                        case 'ʍ':
                        case 'ẁ':
                        case 'ẃ':
                        case 'ẅ':
                        case 'ẇ':
                        case 'ẉ':
                        case 'ẘ':
                            output[outputPos++] = 'w';
                            continue;
                        case 'x':
                        case 'ₓ':
                        case 'ⓧ':
                        case 'ᶍ':
                        case 'ẋ':
                        case 'ẍ':
                            output[outputPos++] = 'x';
                            continue;
                        case 'y':
                        case 'ⓨ':
                        case 'ý':
                        case 'ÿ':
                        case 'ŷ':
                        case 'ƴ':
                        case 'ȳ':
                        case 'ɏ':
                        case 'ʎ':
                        case 'ẏ':
                        case 'ẙ':
                        case 'ỳ':
                        case 'ỵ':
                        case 'ỷ':
                        case 'ỹ':
                        case 'ỿ':
                            output[outputPos++] = 'y';
                            continue;
                        case 'z':
                        case 'ꝣ':
                        case 'ⱬ':
                        case 'ⓩ':
                        case 'ź':
                        case 'ż':
                        case 'ž':
                        case 'ƶ':
                        case 'ȝ':
                        case 'ȥ':
                        case 'ɀ':
                        case 'ʐ':
                        case 'ʑ':
                        case 'ᵶ':
                        case 'ᶎ':
                        case 'ẑ':
                        case 'ẓ':
                        case 'ẕ':
                            output[outputPos++] = 'z';
                            continue;
                        case '{':
                        case '❴':
                            output[outputPos++] = '{';
                            continue;
                        case '}':
                        case '❵':
                            output[outputPos++] = '}';
                            continue;
                        case '~':
                        case '⁓':
                            output[outputPos++] = '~';
                            continue;
                        case 'Ꜩ':
                            output[outputPos++] = 'T';
                            output[outputPos++] = 'Z';
                            continue;
                        case 'ꜩ':
                            output[outputPos++] = 't';
                            output[outputPos++] = 'z';
                            continue;
                        case 'Ꜳ':
                            output[outputPos++] = 'A';
                            output[outputPos++] = 'A';
                            continue;
                        case 'ꜳ':
                            output[outputPos++] = 'a';
                            output[outputPos++] = 'a';
                            continue;
                        case 'Ꜵ':
                            output[outputPos++] = 'A';
                            output[outputPos++] = 'O';
                            continue;
                        case 'ꜵ':
                            output[outputPos++] = 'a';
                            output[outputPos++] = 'o';
                            continue;
                        case 'Ꜷ':
                            output[outputPos++] = 'A';
                            output[outputPos++] = 'U';
                            continue;
                        case 'ꜷ':
                            output[outputPos++] = 'a';
                            output[outputPos++] = 'u';
                            continue;
                        case 'Ꜹ':
                        case 'Ꜻ':
                            output[outputPos++] = 'A';
                            output[outputPos++] = 'V';
                            continue;
                        case 'ꜹ':
                        case 'ꜻ':
                            output[outputPos++] = 'a';
                            output[outputPos++] = 'v';
                            continue;
                        case 'Ꜽ':
                            output[outputPos++] = 'A';
                            output[outputPos++] = 'Y';
                            continue;
                        case 'ꜽ':
                            output[outputPos++] = 'a';
                            output[outputPos++] = 'y';
                            continue;
                        case 'Ꝏ':
                            output[outputPos++] = 'O';
                            output[outputPos++] = 'O';
                            continue;
                        case 'ꝏ':
                            output[outputPos++] = 'o';
                            output[outputPos++] = 'o';
                            continue;
                        case 'Ꝡ':
                            output[outputPos++] = 'V';
                            output[outputPos++] = 'Y';
                            continue;
                        case 'ꝡ':
                            output[outputPos++] = 'v';
                            output[outputPos++] = 'y';
                            continue;
                        case 'Ꝧ':
                        case 'Þ':
                            output[outputPos++] = 'T';
                            output[outputPos++] = 'H';
                            continue;
                        case 'ꝧ':
                        case 'þ':
                        case 'ᵺ':
                            output[outputPos++] = 't';
                            output[outputPos++] = 'h';
                            continue;
                        // Circled and dingbat forms of the number 10. The digit outputs were
                        // empty in the original listing and are restored from the code points.
                        case '\x277F':
                        case '\x2789':
                        case '\x2793':
                        case '\x2469':
                        case '\x24FE':
                            output[outputPos++] = '1';
                            output[outputPos++] = '0';
                            continue;
                        case '⸨':
                            output[outputPos++] = '(';
                            output[outputPos++] = '(';
                            continue;
                        case '⸩':
                            output[outputPos++] = ')';
                            output[outputPos++] = ')';
                            continue;
                        case '‼':
                            output[outputPos++] = '!';
                            output[outputPos++] = '!';
                            continue;
                        case '⁇':
                            output[outputPos++] = '?';
                            output[outputPos++] = '?';
                            continue;
                        case '⁈':
                            output[outputPos++] = '?';
                            output[outputPos++] = '!';
                            continue;
                        case '⁉':
                            output[outputPos++] = '!';
                            output[outputPos++] = '?';
                            continue;
                    // Circled and negative circled numbers 11-20
                    case '\x246A':
                    case '\x24EB':
                        output[outputPos++] = '1';
                        output[outputPos++] = '1';
                        continue;
                    case '\x246B':
                    case '\x24EC':
                        output[outputPos++] = '1';
                        output[outputPos++] = '2';
                        continue;
                    case '\x246C':
                    case '\x24ED':
                        output[outputPos++] = '1';
                        output[outputPos++] = '3';
                        continue;
                    case '\x246D':
                    case '\x24EE':
                        output[outputPos++] = '1';
                        output[outputPos++] = '4';
                        continue;
                    case '\x246E':
                    case '\x24EF':
                        output[outputPos++] = '1';
                        output[outputPos++] = '5';
                        continue;
                    case '\x246F':
                    case '\x24F0':
                        output[outputPos++] = '1';
                        output[outputPos++] = '6';
                        continue;
                    case '\x2470':
                    case '\x24F1':
                        output[outputPos++] = '1';
                        output[outputPos++] = '7';
                        continue;
                    case '\x2471':
                    case '\x24F2':
                        output[outputPos++] = '1';
                        output[outputPos++] = '8';
                        continue;
                    case '\x2472':
                    case '\x24F3':
                        output[outputPos++] = '1';
                        output[outputPos++] = '9';
                        continue;
                    case '\x2473':
                    case '\x24F4':
                        output[outputPos++] = '2';
                        output[outputPos++] = '0';
                        continue;
                    // Parenthesized numbers (1)-(20)
                    case '\x2474':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2475':
                        output[outputPos++] = '(';
                        output[outputPos++] = '2';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2476':
                        output[outputPos++] = '(';
                        output[outputPos++] = '3';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2477':
                        output[outputPos++] = '(';
                        output[outputPos++] = '4';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2478':
                        output[outputPos++] = '(';
                        output[outputPos++] = '5';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2479':
                        output[outputPos++] = '(';
                        output[outputPos++] = '6';
                        output[outputPos++] = ')';
                        continue;
                    case '\x247A':
                        output[outputPos++] = '(';
                        output[outputPos++] = '7';
                        output[outputPos++] = ')';
                        continue;
                    case '\x247B':
                        output[outputPos++] = '(';
                        output[outputPos++] = '8';
                        output[outputPos++] = ')';
                        continue;
                    case '\x247C':
                        output[outputPos++] = '(';
                        output[outputPos++] = '9';
                        output[outputPos++] = ')';
                        continue;
                    case '\x247D':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '0';
                        output[outputPos++] = ')';
                        continue;
                    case '\x247E':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '1';
                        output[outputPos++] = ')';
                        continue;
                    case '\x247F':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '2';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2480':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '3';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2481':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '4';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2482':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '5';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2483':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '6';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2484':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '7';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2485':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '8';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2486':
                        output[outputPos++] = '(';
                        output[outputPos++] = '1';
                        output[outputPos++] = '9';
                        output[outputPos++] = ')';
                        continue;
                    case '\x2487':
                        output[outputPos++] = '(';
                        output[outputPos++] = '2';
                        output[outputPos++] = '0';
                        output[outputPos++] = ')';
                        continue;
                    // Numbers followed by a full stop: 1.-20.
                    case '\x2488':
                        output[outputPos++] = '1';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2489':
                        output[outputPos++] = '2';
                        output[outputPos++] = '.';
                        continue;
                    case '\x248A':
                        output[outputPos++] = '3';
                        output[outputPos++] = '.';
                        continue;
                    case '\x248B':
                        output[outputPos++] = '4';
                        output[outputPos++] = '.';
                        continue;
                    case '\x248C':
                        output[outputPos++] = '5';
                        output[outputPos++] = '.';
                        continue;
                    case '\x248D':
                        output[outputPos++] = '6';
                        output[outputPos++] = '.';
                        continue;
                    case '\x248E':
                        output[outputPos++] = '7';
                        output[outputPos++] = '.';
                        continue;
                    case '\x248F':
                        output[outputPos++] = '8';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2490':
                        output[outputPos++] = '9';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2491':
                        output[outputPos++] = '1';
                        output[outputPos++] = '0';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2492':
                        output[outputPos++] = '1';
                        output[outputPos++] = '1';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2493':
                        output[outputPos++] = '1';
                        output[outputPos++] = '2';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2494':
                        output[outputPos++] = '1';
                        output[outputPos++] = '3';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2495':
                        output[outputPos++] = '1';
                        output[outputPos++] = '4';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2496':
                        output[outputPos++] = '1';
                        output[outputPos++] = '5';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2497':
                        output[outputPos++] = '1';
                        output[outputPos++] = '6';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2498':
                        output[outputPos++] = '1';
                        output[outputPos++] = '7';
                        output[outputPos++] = '.';
                        continue;
                    case '\x2499':
                        output[outputPos++] = '1';
                        output[outputPos++] = '8';
                        output[outputPos++] = '.';
                        continue;
                    case '\x249A':
                        output[outputPos++] = '1';
                        output[outputPos++] = '9';
                        output[outputPos++] = '.';
                        continue;
                    case '\x249B':
                        output[outputPos++] = '2';
                        output[outputPos++] = '0';
                        output[outputPos++] = '.';
                        continue;
  1665. case '⒜':
  1666. output[outputPos++] = '(';
  1667. output[outputPos++] = 'a';
  1668. output[outputPos++] = ')';
  1669. continue;
  1670. case '⒝':
  1671. output[outputPos++] = '(';
  1672. output[outputPos++] = 'b';
  1673. output[outputPos++] = ')';
  1674. continue;
  1675. case '⒞':
  1676. output[outputPos++] = '(';
  1677. output[outputPos++] = 'c';
  1678. output[outputPos++] = ')';
  1679. continue;
  1680. case '⒟':
  1681. output[outputPos++] = '(';
  1682. output[outputPos++] = 'd';
  1683. output[outputPos++] = ')';
  1684. continue;
  1685. case '⒠':
  1686. output[outputPos++] = '(';
  1687. output[outputPos++] = 'e';
  1688. output[outputPos++] = ')';
  1689. continue;
  1690. case '⒡':
  1691. output[outputPos++] = '(';
  1692. output[outputPos++] = 'f';
  1693. output[outputPos++] = ')';
  1694. continue;
  1695. case '⒢':
  1696. output[outputPos++] = '(';
  1697. output[outputPos++] = 'g';
  1698. output[outputPos++] = ')';
  1699. continue;
  1700. case '⒣':
  1701. output[outputPos++] = '(';
  1702. output[outputPos++] = 'h';
  1703. output[outputPos++] = ')';
  1704. continue;
  1705. case '⒤':
  1706. output[outputPos++] = '(';
  1707. output[outputPos++] = 'i';
  1708. output[outputPos++] = ')';
  1709. continue;
  1710. case '⒥':
  1711. output[outputPos++] = '(';
  1712. output[outputPos++] = 'j';
  1713. output[outputPos++] = ')';
  1714. continue;
  1715. case '⒦':
  1716. output[outputPos++] = '(';
  1717. output[outputPos++] = 'k';
  1718. output[outputPos++] = ')';
  1719. continue;
  1720. case '⒧':
  1721. output[outputPos++] = '(';
  1722. output[outputPos++] = 'l';
  1723. output[outputPos++] = ')';
  1724. continue;
  1725. case '⒨':
  1726. output[outputPos++] = '(';
  1727. output[outputPos++] = 'm';
  1728. output[outputPos++] = ')';
  1729. continue;
  1730. case '⒩':
  1731. output[outputPos++] = '(';
  1732. output[outputPos++] = 'n';
  1733. output[outputPos++] = ')';
  1734. continue;
  1735. case '⒪':
  1736. output[outputPos++] = '(';
  1737. output[outputPos++] = 'o';
  1738. output[outputPos++] = ')';
  1739. continue;
  1740. case '⒫':
  1741. output[outputPos++] = '(';
  1742. output[outputPos++] = 'p';
  1743. output[outputPos++] = ')';
  1744. continue;
  1745. case '⒬':
  1746. output[outputPos++] = '(';
  1747. output[outputPos++] = 'q';
  1748. output[outputPos++] = ')';
  1749. continue;
  1750. case '⒭':
  1751. output[outputPos++] = '(';
  1752. output[outputPos++] = 'r';
  1753. output[outputPos++] = ')';
  1754. continue;
  1755. case '⒮':
  1756. output[outputPos++] = '(';
  1757. output[outputPos++] = 's';
  1758. output[outputPos++] = ')';
  1759. continue;
  1760. case '⒯':
  1761. output[outputPos++] = '(';
  1762. output[outputPos++] = 't';
  1763. output[outputPos++] = ')';
  1764. continue;
  1765. case '⒰':
  1766. output[outputPos++] = '(';
  1767. output[outputPos++] = 'u';
  1768. output[outputPos++] = ')';
  1769. continue;
  1770. case '⒱':
  1771. output[outputPos++] = '(';
  1772. output[outputPos++] = 'v';
  1773. output[outputPos++] = ')';
  1774. continue;
  1775. case '⒲':
  1776. output[outputPos++] = '(';
  1777. output[outputPos++] = 'w';
  1778. output[outputPos++] = ')';
  1779. continue;
  1780. case '⒳':
  1781. output[outputPos++] = '(';
  1782. output[outputPos++] = 'x';
  1783. output[outputPos++] = ')';
  1784. continue;
  1785. case '⒴':
  1786. output[outputPos++] = '(';
  1787. output[outputPos++] = 'y';
  1788. output[outputPos++] = ')';
  1789. continue;
  1790. case '⒵':
  1791. output[outputPos++] = '(';
  1792. output[outputPos++] = 'z';
  1793. output[outputPos++] = ')';
  1794. continue;
  1795. case 'Æ':
  1796. case 'Ǣ':
  1797. case 'Ǽ':
  1798. case 'ᴁ':
  1799. output[outputPos++] = 'A';
  1800. output[outputPos++] = 'E';
  1801. continue;
  1802. case 'ß':
  1803. output[outputPos++] = 's';
  1804. output[outputPos++] = 's';
  1805. continue;
  1806. case 'æ':
  1807. case 'ǣ':
  1808. case 'ǽ':
  1809. case 'ᴂ':
  1810. output[outputPos++] = 'a';
  1811. output[outputPos++] = 'e';
  1812. continue;
                    case '\x0132': // U+0132 LATIN CAPITAL LIGATURE IJ
                        output[outputPos++] = 'I';
                        output[outputPos++] = 'J';
                        continue;
                    case '\x0133': // U+0133 LATIN SMALL LIGATURE IJ
                        output[outputPos++] = 'i';
                        output[outputPos++] = 'j';
                        continue;
                    case 'Œ':
                    case 'ɶ':
                        output[outputPos++] = 'O';
                        output[outputPos++] = 'E';
                        continue;
                    case 'œ':
                    case 'ᴔ':
                        output[outputPos++] = 'o';
                        output[outputPos++] = 'e';
                        continue;
                    case 'ƕ':
                        output[outputPos++] = 'h';
                        output[outputPos++] = 'v';
                        continue;
                    case '\x01C4': // U+01C4 LATIN CAPITAL LETTER DZ WITH CARON
                    case '\x01F1': // U+01F1 LATIN CAPITAL LETTER DZ
                        output[outputPos++] = 'D';
                        output[outputPos++] = 'Z';
                        continue;
                    case '\x01C5': // U+01C5 LATIN CAPITAL LETTER D WITH SMALL LETTER Z WITH CARON
                    case '\x01F2': // U+01F2 LATIN CAPITAL LETTER D WITH SMALL LETTER Z
                        output[outputPos++] = 'D';
                        output[outputPos++] = 'z';
                        continue;
                    case '\x01C6': // U+01C6 LATIN SMALL LETTER DZ WITH CARON
                    case '\x01F3': // U+01F3 LATIN SMALL LETTER DZ
                    case 'ʣ':
                    case 'ʥ':
                        output[outputPos++] = 'd';
                        output[outputPos++] = 'z';
                        continue;
                    case '\x01C7': // U+01C7 LATIN CAPITAL LETTER LJ
                        output[outputPos++] = 'L';
                        output[outputPos++] = 'J';
                        continue;
                    case '\x01C8': // U+01C8 LATIN CAPITAL LETTER L WITH SMALL LETTER J
                        output[outputPos++] = 'L';
                        output[outputPos++] = 'j';
                        continue;
                    case '\x01C9': // U+01C9 LATIN SMALL LETTER LJ
                        output[outputPos++] = 'l';
                        output[outputPos++] = 'j';
                        continue;
                    case '\x01CA': // U+01CA LATIN CAPITAL LETTER NJ
                        output[outputPos++] = 'N';
                        output[outputPos++] = 'J';
                        continue;
                    case '\x01CB': // U+01CB LATIN CAPITAL LETTER N WITH SMALL LETTER J
                        output[outputPos++] = 'N';
                        output[outputPos++] = 'j';
                        continue;
                    case '\x01CC': // U+01CC LATIN SMALL LETTER NJ
                        output[outputPos++] = 'n';
                        output[outputPos++] = 'j';
                        continue;
                    case 'Ƕ':
                        output[outputPos++] = 'H';
                        output[outputPos++] = 'V';
                        continue;
                    case 'Ȣ':
                    case 'ᴕ':
                        output[outputPos++] = 'O';
                        output[outputPos++] = 'U';
                        continue;
                    case 'ȣ':
                        output[outputPos++] = 'o';
                        output[outputPos++] = 'u';
                        continue;
                    case 'ȸ':
                        output[outputPos++] = 'd';
                        output[outputPos++] = 'b';
                        continue;
                    case 'ȹ':
                        output[outputPos++] = 'q';
                        output[outputPos++] = 'p';
                        continue;
                    case 'ʦ':
                        output[outputPos++] = 't';
                        output[outputPos++] = 's';
                        continue;
                    case 'ʨ':
                        output[outputPos++] = 't';
                        output[outputPos++] = 'c';
                        continue;
                    case 'ʪ':
                        output[outputPos++] = 'l';
                        output[outputPos++] = 's';
                        continue;
                    case 'ʫ':
                        output[outputPos++] = 'l';
                        output[outputPos++] = 'z';
                        continue;
                    case 'ᵫ':
                        output[outputPos++] = 'u';
                        output[outputPos++] = 'e';
                        continue;
                    case 'ẞ':
                        output[outputPos++] = 'S';
                        output[outputPos++] = 'S';
                        continue;
                    case 'Ỻ':
                        output[outputPos++] = 'L';
                        output[outputPos++] = 'L';
                        continue;
                    case 'ỻ':
                        output[outputPos++] = 'l';
                        output[outputPos++] = 'l';
                        continue;
                    default:
                        output[outputPos++] = ch;
                        continue;
                }
            }
        }
        return new string(output).Trim('\0');
    }
}

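Also, after deploying the modified ExamineSettings.config to the web server, the index has to be rebuilt before the new analyzer takes effect.

However, our web server has no Umbraco back office (we removed it), so how do we rebuild the index?

The way to do it is to delete the External folder under App_Data\TEMP\ExamineIndexes\machinename\ and then restart the site's app pool in IIS. With RebuildOnAppStart="true" in ExamineSettings.config, as in the configuration above, Examine should recreate the missing index when the application starts.

Then test again; the accent-insensitive search should now work.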
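The folder deletion can also be scripted. The following is only a minimal sketch: the class and method names (`ExamineIndexCleaner`, `GetExternalIndexPath`, `DeleteExternalIndex`) are hypothetical, and it assumes the `machinename` folder is passed in by the caller (it often matches `Environment.MachineName`, but check the actual folder name on your server).

```csharp
using System;
using System.IO;

public static class ExamineIndexCleaner
{
    // Builds App_Data\TEMP\ExamineIndexes\<machineName>\External under the site root.
    // machineName is supplied by the caller; that it equals Environment.MachineName is an assumption.
    public static string GetExternalIndexPath(string siteRoot, string machineName)
    {
        return Path.Combine(siteRoot, "App_Data", "TEMP", "ExamineIndexes", machineName, "External");
    }

    // Deletes the External index folder if it exists.
    // Returns true when a folder was removed, false when there was nothing to delete.
    public static bool DeleteExternalIndex(string siteRoot, string machineName)
    {
        string indexPath = GetExternalIndexPath(siteRoot, machineName);
        if (!Directory.Exists(indexPath))
        {
            return false;
        }

        // true = also delete the index files and sub-folders inside External
        Directory.Delete(indexPath, true);
        return true;
    }
}
```

A call such as `ExamineIndexCleaner.DeleteExternalIndex(@"C:\inetpub\wwwroot\MySite", Environment.MachineName)` (the site path here is made up) would remove the folder. It is probably safest to stop the app pool first, since the running site may hold locks on the index files, and to start it again afterwards.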
