Darkest page of my coding life

Code I wrote a while ago recently caused a disaster, and as I reviewed it I found it to be the silliest code I've ever written:

 static int BadMaximalMatch(TListRef list1, int start1, int count1, TListRef list2, int start2, int count2,
     const IEqualityComparer<T> &comparer, List<int> &indices1, List<int> &indices2)
 {
     if (count1 <= 0 || count2 <= 0)
     {
         return 0;
     }
     bool eq = comparer.Equals(list1[start1], list2[start2]);
     if (eq)
     {
         indices1.Add(start1);
         indices2.Add(start2);
         return BadMaximalMatch(list1, start1+1, count1-1, list2, start2+1, count2-1, comparer, indices1, indices2) + 1;
     }
     else
     {
         // look one item ahead on either side
         bool eq01 = count2 >= 2 && comparer.Equals(list1[start1], list2[start2+1]);
         bool eq10 = count1 >= 2 && comparer.Equals(list1[start1+1], list2[start2]);
         int crossDiff;
         if (eq01 && eq10)
         {
             crossDiff = 1;
         }
         else if (eq01 && !eq10)
         {
             return BadMaximalMatch(list1, start1, count1, list2, start2+1, count2-1, comparer, indices1, indices2);
         }
         else if (!eq01 && eq10)
         {
             return BadMaximalMatch(list1, start1+1, count1-1, list2, start2, count2, comparer, indices1, indices2);
         }
         else
         {
             bool eq11 = count1 >= 2 && count2 >= 2 && comparer.Equals(list1[start1+1], list2[start2+1]);
             if (eq11)
             {
                 indices1.Add(start1+1);
                 indices2.Add(start2+1);
                 return BadMaximalMatch(list1, start1+2, count1-2, list2, start2+2, count2-2, comparer, indices1, indices2)+1;
             }
             crossDiff = 2;
         }
         List<int> temp11, temp12, temp21, temp22;
         int m1 = 0, m2 = 0;
         if (count1 < count2)
         {
             // calculate m1 first, as maximum of m1 is greater than that of m2
             // maximum: min(count1, count2-crossDiff)
             m1 = BadMaximalMatch(list1, start1, count1, list2, start2+crossDiff, count2-crossDiff, comparer, temp11, temp12);
             if (m1 < count1 && m1 < count2-crossDiff)
             {
                 // m1 hasn't reached its maximum possible value
                 // maximum: min(count1-crossDiff, count2)
                 m2 = BadMaximalMatch(list1, start1+crossDiff, count1-crossDiff, list2, start2, count2, comparer, temp21, temp22);
             }
         }
         else
         {
             // calculate m2 first, as maximum of m2 is greater than that of m1
             m2 = BadMaximalMatch(list1, start1+crossDiff, count1-crossDiff, list2, start2, count2, comparer, temp21, temp22);
             if (m2 < count2 && m2 < count1-crossDiff)
             {
                 // m2 hasn't reached its maximum possible value
                 // maximum: min(count1, count2-crossDiff)
                 m1 = BadMaximalMatch(list1, start1, count1, list2, start2+crossDiff, count2-crossDiff, comparer, temp11, temp12);
             }
         }
         // keep whichever branch matched more items
         if (m2 > m1)
         {
             for (int i = 0; i < m2; i++)
             {
                 indices1.Add(temp21[i]);
                 indices2.Add(temp22[i]);
             }
             return m2;
         }
         else
         {
             for (int i = 0; i < m1; i++)
             {
                 indices1.Add(temp11[i]);
                 indices2.Add(temp12[i]);
             }
             return m1;
         }
     }
 }

It simply finds the maximal common sublist of two lists (matched items keep their order but need not be contiguous; this is the classic longest common subsequence problem). I was dumb enough not to realize it was a very simple problem, and spent quite a while on the complex recursive algorithm above to solve it. At first there was no evidence it was buggy, though it was as bad as buggy once it had to deal with just over 20 data points. A random test today showed that it is in fact wrong as well: it doesn't take into account some of the possible options that cut across the one it believes is optimal. Basically, the algorithm should only step forward in both lists at once when neither current item has a match in the other list; otherwise both options have to be tried. So the correct one should be

 static int MaximalSublistMatch_Slow(TListRef list1, int start1, int count1, TListRef list2, int start2, int count2,
     const IEqualityComparer<T> &comparer, List<int> &indices1, List<int> &indices2)
 {
     if (count1 <= 0 || count2 <= 0)
     {
         return 0;
     }
     bool eq = comparer.Equals(list1[start1], list2[start2]);
     if (eq)
     {
         indices1.Add(start1);
         indices2.Add(start2);
         return MaximalSublistMatch_Slow(list1, start1 + 1, count1 - 1, list2, start2 + 1, count2 - 1, comparer, indices1, indices2) + 1;
     }
     else
     {
         const T &v1 = list1[start1];
         const T &v2 = list2[start2];
         int l1Match=-1, l2Match=-1;
         // does the current item of list1 appear later in list2?
         for (int i = start2 + 1; i < start2+count2; i++)
         {
             if (comparer.Equals(v1, list2[i]))
             {
                 l1Match = i;
                 break;
             }
         }
         // does the current item of list2 appear later in list1?
         for (int i = start1 + 1; i < start1+count1; i++)
         {
             if (comparer.Equals(list1[i], v2))
             {
                 l2Match = i;
                 break;
             }
         }
         if (l1Match < 0 && l2Match < 0)
         {
             // neither item can ever be matched, so skip both
             return MaximalSublistMatch_Slow(list1, start1 + 1, count1 - 1, list2, start2 + 1, count2 - 1, comparer, indices1, indices2);
         }
         else
         {
             // try both
             List<int> temp11, temp12, temp21, temp22;
             int r2 = 0;
             int r1 = MaximalSublistMatch_Slow(list1, start1, count1, list2, start2 + 1, count2 - 1, comparer, temp11, temp12);
             // r2 can be at most min(count1 - 1, count2); skip it if r1 is already that good
             if (r1 < std::min(count1 - 1, count2))
             {
                 r2 = MaximalSublistMatch_Slow(list1, start1 + 1, count1 - 1, list2, start2, count2, comparer, temp21, temp22);
             }
             if (r1 < r2)
             {
                 for (int i = 0; i < r2; i++)
                 {
                     indices1.Add(temp21[i]);
                     indices2.Add(temp22[i]);
                 }
                 return r2;
             }
             else
             {
                 for (int i = 0; i < r1; i++)
                 {
                     indices1.Add(temp11[i]);
                     indices2.Add(temp12[i]);
                 }
                 return r1;
             }
         }
     }
 }
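To get a feeling for why this kind of recursion falls over at 20 or so items, here is a rough sketch that counts calls. It is not the code above: it is the textbook recursion on plain std::vector<int>, and the names NaiveLcs and MemoLcs are made up for this illustration. The only difference between the two functions is a cache of already solved (i, j) positions.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical illustration: textbook recursion for the length of the maximal
// common sublist of a[i..] and b[j..]; 'calls' counts every invocation.
int NaiveLcs(const std::vector<int> &a, std::size_t i,
             const std::vector<int> &b, std::size_t j, long long &calls)
{
    ++calls;
    if (i == a.size() || j == b.size())
    {
        return 0;
    }
    if (a[i] == b[j])
    {
        return 1 + NaiveLcs(a, i + 1, b, j + 1, calls);
    }
    // no match: both options are explored again and again
    return std::max(NaiveLcs(a, i, b, j + 1, calls),
                    NaiveLcs(a, i + 1, b, j, calls));
}

// Same recursion, but each (i, j) is solved only once.
// memo must be a.size() x b.size(), filled with -1 ("not computed yet").
int MemoLcs(const std::vector<int> &a, std::size_t i,
            const std::vector<int> &b, std::size_t j,
            std::vector<std::vector<int>> &memo, long long &calls)
{
    ++calls;
    if (i == a.size() || j == b.size())
    {
        return 0;
    }
    if (memo[i][j] >= 0)
    {
        return memo[i][j];
    }
    int r;
    if (a[i] == b[j])
    {
        r = 1 + MemoLcs(a, i + 1, b, j + 1, memo, calls);
    }
    else
    {
        r = std::max(MemoLcs(a, i, b, j + 1, memo, calls),
                     MemoLcs(a, i + 1, b, j, memo, calls));
    }
    memo[i][j] = r;
    return r;
}
```

Feeding both with two lists of 12 items that share nothing, the naive one already makes millions of calls while the cached one stays within a few hundred. The cache is the whole trick behind the dynamic programming idea.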

A simpler, naive alternative is the following. It is not equivalent, but OK for most uses, and a minor change to it can improve its accuracy. (I'm not so sure how significant this method is, though, now that a fast optimal approach is available.)

 // this is a greedy version of maximal match: each item of list1 takes the first
 // remaining equal item of list2; O(N) when most items match, O(N1*N2) comparisons at worst
 static int MaximalMatch(TListRef list1, int start1, int count1, TListRef list2, int start2, int count2,
     const IEqualityComparer<T> &comparer, List<int> &indices1, List<int> &indices2)
 {
     int matchStart2 = start2;
     for (int i1 = start1; i1 < start1 + count1; i1++)
     {
         const T &v1 = list1[i1];
         for (int i2 = matchStart2; i2 < start2 + count2; i2++)
         {
             const T &v2 = list2[i2];
             if (comparer.Equals(v1,v2))
             {
                 indices1.Add(i1);
                 indices2.Add(i2);
                 // later matches must come after this one in list2
                 matchStart2 = i2+1;
                 break;
             }
         }
     }
     return indices1.GetCount();
 }

Of course this is epically faster, simpler and less error-prone than the previous one, but it doesn't provide the optimal result.
And you can imagine how an application would suffer from the exp(N)-complexity shit of the recursive one.
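A tiny case where the greedy version misses the optimum, as a sketch. GreedyMatch below is a simplified stand-in I made up for this example (std::vector<int> and operator== instead of TListRef and IEqualityComparer), following the same loop as MaximalMatch above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the greedy MaximalMatch: returns the matched
// (index in list1, index in list2) pairs.
std::vector<std::pair<std::size_t, std::size_t>> GreedyMatch(
    const std::vector<int> &list1, const std::vector<int> &list2)
{
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    std::size_t matchStart2 = 0;
    for (std::size_t i1 = 0; i1 < list1.size(); ++i1)
    {
        for (std::size_t i2 = matchStart2; i2 < list2.size(); ++i2)
        {
            if (list1[i1] == list2[i2])
            {
                matches.emplace_back(i1, i2);
                matchStart2 = i2 + 1;   // everything before i2 is now off limits
                break;
            }
        }
    }
    return matches;
}
```

With list1 = {1, 2, 3} and list2 = {2, 3, 1}, the first item 1 grabs the last item of list2, which leaves nothing for 2 and 3, so the greedy result is a single match, while the optimal answer is the two matches 2 and 3.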

The fast equivalent uses dynamic programming and goes as follows.

(The standard C# version has been added to the QSharp library at https://qsharp.codeplex.com/SourceControl/latest#QSharp/QSharp.Scheme.Classical.Sequential/MaxSublistMatch.cs)

 // cached result for one (count1, count2) subproblem
 struct MaxMatchDPResult
 {
     bool Done;
     List<int> Indices1;
     List<int> Indices2;
 };

 // FB: 6462
 // NOTE this is a version of maximal match using dynamic programming (memoized recursion)
 //      it solves around O(N*N) subproblems; as every cached result keeps its own index lists,
 //      time and space both come to about O(N^3)
 //      This is a recommended version as it provides optimal result and is fast
 static int MaximalSublistMatch_DP(TListRef list1, int start1, int count1, TListRef list2, int start2, int count2,
     const IEqualityComparer<T> &comparer, List<int> &indices1, List<int> &indices2)
 {
     // map[c1][c2] caches the result for the last c1 items of list1 vs the last c2 items of list2
     std::vector<std::vector<MaxMatchDPResult>> map;
     for (int i = 0; i < count1+1; i++)
     {
         map.push_back(std::vector<MaxMatchDPResult>());
         for (int j = 0; j < count2+1; j++)
         {
             map[i].push_back(MaxMatchDPResult());
             map[i][j].Done = false;
         }
     }
     // enter through the lookup, which also handles empty ranges
     return MaximalSublistMatch_DP_Lookup(list1, start1, count1, list2, start2, count2, comparer, indices1, indices2, map);
 }

 static int MaximalSublistMatch_DP_Lookup(TListRef list1, int start1, int count1, TListRef list2, int start2, int count2,
     const IEqualityComparer<T> &comparer, List<int> &indices1, List<int> &indices2, std::vector <std::vector<MaxMatchDPResult>> &map)
 {
     if (count1 <= 0 || count2 <= 0)
     {
         return 0;
     }
     const MaxMatchDPResult &result = map[count1][count2];
     int r;
     if (result.Done)
     {
         // already solved: just copy the cached indices
         r = result.Indices1.GetCount();
         for (int i = 0; i < r; i++)
         {
             indices1.Add(result.Indices1[i]);
             indices2.Add(result.Indices2[i]);
         }
     }
     else
     {
         // solve it once and cache the result
         List<int> tempIndices1, tempIndices2;
         r = MaximalSublistMatch_DP(list1, start1, count1, list2, start2, count2, comparer, tempIndices1, tempIndices2, map);
         map[count1][count2].Done = true;
         map[count1][count2].Indices1 = tempIndices1;
         map[count1][count2].Indices2 = tempIndices2;
         for (int i = 0; i < r; i++)
         {
             indices1.Add(tempIndices1[i]);
             indices2.Add(tempIndices2[i]);
         }
     }
     return r;
 }

 // the caller must make sure count1 > 0 and count2 > 0
 static int MaximalSublistMatch_DP(TListRef list1, int start1, int count1, TListRef list2, int start2, int count2,
     const IEqualityComparer<T> &comparer, List<int> &indices1, List<int> &indices2, std::vector<std::vector<MaxMatchDPResult>> &map)
 {
     bool eq = comparer.Equals(list1[start1], list2[start2]);
     if (eq)
     {
         indices1.Add(start1);
         indices2.Add(start2);
         int r = MaximalSublistMatch_DP_Lookup(list1, start1 + 1, count1 - 1, list2, start2 + 1, count2 - 1, comparer, indices1, indices2, map) + 1;
         return r;
     }
     List<int> temp11, temp12, temp21, temp22;
     int r2 = 0;
     int r1 = MaximalSublistMatch_DP_Lookup(list1, start1, count1, list2, start2 + 1, count2 - 1, comparer, temp11, temp12, map);
     // r2 can be at most min(count1 - 1, count2); skip it if r1 is already that good
     if (r1 <
 #if defined(min)
         min(count1 - 1, count2)
 #else
         std::min(count1 - 1, count2)
 #endif
         )
     {
         r2 = MaximalSublistMatch_DP_Lookup(list1, start1 + 1, count1 - 1, list2, start2, count2, comparer, temp21, temp22, map);
     }
     if (r2 > r1)
     {
         for (int i = 0; i < r2; i++)
         {
             indices1.Add(temp21[i]);
             indices2.Add(temp22[i]);
         }
         return r2;
     }
     else
     {
         for (int i = 0; i < r1; i++)
         {
             indices1.Add(temp11[i]);
             indices2.Add(temp12[i]);
         }
         return r1;
     }
 }
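For comparison, here is a sketch of the usual bottom-up way to write the same idea. This is not the code above and not the QSharp version: it is the standard table-based formulation on std::vector<int>, and the name LcsMatch is made up for this example. It stores only lengths in the table and recovers the index pairs by walking the table afterwards, so it needs O(N1*N2) time and space rather than a list per cell.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical illustration: table-based maximal sublist match.
// Returns the matched (index in a, index in b) pairs in increasing order.
std::vector<std::pair<std::size_t, std::size_t>> LcsMatch(
    const std::vector<int> &a, const std::vector<int> &b)
{
    const std::size_t n = a.size(), m = b.size();

    // len[i][j] = size of the best match between a[i..n) and b[j..m)
    std::vector<std::vector<int>> len(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = n; i-- > 0;)
    {
        for (std::size_t j = m; j-- > 0;)
        {
            len[i][j] = (a[i] == b[j])
                ? len[i + 1][j + 1] + 1
                : std::max(len[i + 1][j], len[i][j + 1]);
        }
    }

    // walk the table from the front to collect the matched indices
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    std::size_t i = 0, j = 0;
    while (i < n && j < m)
    {
        if (a[i] == b[j])
        {
            matches.emplace_back(i, j);
            ++i;
            ++j;
        }
        else if (len[i][j + 1] >= len[i + 1][j])
        {
            ++j;    // skipping b[j] loses nothing
        }
        else
        {
            ++i;    // skipping a[i] is strictly better
        }
    }
    return matches;
}
```

For a = {1, 2, 3} and b = {2, 3, 1} this finds the two matches that the greedy version misses. Ties are broken towards skipping the item of the second list first, which mirrors the r2 > r1 test in the recursive code, though I haven't checked that the two always return the very same index pairs.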

This was stupid, but classic!
