<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ashutosh Mehra's Blog &#187; Programming</title>
	<atom:link href="http://ashutoshmehra.net/blog/category/programming/feed/" rel="self" type="application/rss+xml" />
	<link>http://ashutoshmehra.net/blog</link>
	<description>Notes on Math, Computer Science &#38; Programming</description>
	<lastBuildDate>Sat, 27 Feb 2010 14:22:03 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>RtlWriteDecodedUcsDataIntoSmartLBlobUcsWritingContext and Other Long Function Names</title>
		<link>http://ashutoshmehra.net/blog/2010/02/long-function-names/</link>
		<comments>http://ashutoshmehra.net/blog/2010/02/long-function-names/#comments</comments>
		<pubDate>Sat, 27 Feb 2010 14:21:27 +0000</pubDate>
		<dc:creator>Ashutosh</dc:creator>
				<category><![CDATA[Humor]]></category>
		<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://ashutoshmehra.net/blog/?p=633</guid>
		<description><![CDATA[Every reader should ask himself periodically &#8220;Toward what end, toward what end?&#8221; &#8212; but do not ask it too often lest you pass up the fun of programming for the constipation of bittersweet philosophy.
&#8211; Alan J. Perlis [foreword to SICP]


I find long function names charming &#8212; I think they impart a certain personality to the [...]]]></description>
			<content:encoded><![CDATA[<p><i>Every reader should ask himself periodically &#8220;Toward what end, toward what end?&#8221; &#8212; but do not ask it too often lest you pass up the fun of programming for the constipation of bittersweet philosophy.</i></p>
<p style="text-align:right"><i>&#8211; Alan J. Perlis [<a href="http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-5.html">foreword to SICP</a>]</i></p>
<div class="alignright" style="margin-left: 5px"><img border="0" src="http://ashutoshmehra.net/images/posts/funcnames/RtlWriteDecodedUcsDataIntoSmartLBlobUcsWritingContext.png" width="250px" alt="RtlWriteDecodedUcsDataIntoSmartLBlobUcsWritingContext (53 characters)"></div>
<p>I find long function names charming &#8212; I think they impart a certain personality to the API and add to the <i>fun</i> of programming. And while I try to avoid excessively long names in my own code at work (unless they add to clarity), I never fail to get a good kick out of seeing such names used by other programmers! <span id="more-633"></span></p>
<p>I&#8217;ve often wondered what the longest names would look like. So I decided to explore a large system &#8212; the system DLLs in the <tt>C:\Windows</tt> directory &#8212; and dig up some of the biggies. My plan was to write a script that walked through the installed DLLs and looked at their exported function lists for interesting candidates.</p>
<h4>The Winners</h4>
<p>Here are the longest ones I could find:</p>
<ul>
<li><tt>RtlWriteDecodedUcsDataIntoSmartLBlobUcsWritingContext</tt> [wcp.dll], with 53 characters, was the longest one (but this function is not documented)</li>
<li>(52) <a href="http://msdn.microsoft.com/en-us/library/aa376397%28VS.85%29.aspx"><tt>ConvertSecurityDescriptorToStringSecurityDescriptor{A,W}</tt> [advapi32.dll]</a>, the function that <a href="http://en.wikipedia.org/wiki/Raymond_Chen">Raymond Chen</a> admitted was the reason for <a href="http://blogs.msdn.com/oldnewthing/archive/2004/03/12/88572.aspx">his post on security descriptors</a>.</li>
<li>(52) <a href="http://msdn.microsoft.com/en-us/library/aa376397%28VS.85%29.aspx"><tt>ConvertStringSecurityDescriptorToSecurityDescriptor{A,W}</tt> [advapi32.dll]</a>, the complement of the above.</li>
<li>(50) <a href="http://msdn.microsoft.com/en-us/library/aa446582%28VS.85%29.aspx"><tt>CreatePrivateObjectSecurityWithMultipleInheritance</tt> [advapi32.dll]</a>, another security function.</li>
<li>(50) <a href="http://msdn.microsoft.com/en-us/library/aa376038%28VS.85%29.aspx"><tt>CertCreateCTLEntryFromCertificateContextProperties</tt> [crypt32.dll]</a>.</li>
<li>(50) <a href="http://msdn.microsoft.com/en-us/library/bb204685%28VS.85%29.aspx"><tt>EapHostPeerQueryUIBlobFromInteractiveUIInputFields</tt> [eappcfg.dll]</a>.</li>
<li>(49) <a href="http://msdn.microsoft.com/en-us/library/aa374843%28VS.85%29.aspx"><tt>AccessCheckByTypeResultListAndAuditAlarmByHandle{A,W}</tt> [advapi32.dll]</a>, yet another security API &#8212; I guess the security API programmers get a good kick from long names, just like me!</li>
<li>(47) <a href="http://msdn.microsoft.com/en-us/library/dd692949%28VS.85%29.aspx"><tt>GetNumberOfPhysicalMonitorsFromIDirect3DDevice9</tt> [Dxva2.lib]</a>.</li>
<li>(43) <a href="http://msdn.microsoft.com/en-us/library/aa377432%28VS.85%29.aspx"><tt>SetupRemoveInstallSectionFromDiskSpaceList{A,W}</tt> [Setupapi.dll]</a>.</li>
</ul>
<p>And just so that no one thinks I&#8217;m making these up, I&#8217;ve linked to their documentation!</p>
<p>Curious after seeing these names, I dug through the other &#8220;operating system&#8221; I had access to &#8212; GNU Emacs. The longest interactive command there is <tt>slime-compiler-notes-default-action-or-show-details/mouse</tt>: counting punctuation, its 57 characters beat the longest Windows export by 4.</p>
<h4>The Method</h4>
<p>I used the <a href="http://code.google.com/p/pefile/">pefile Python library</a> to parse all DLLs in my <tt>windows</tt> directory. I only considered &#8220;system&#8221; DLLs (ignoring any third-party drivers etc.) by checking for &#8220;Microsoft&#8221; in the DLL copyright-string. In addition, I discarded any C++-ish exports, because the <a href="http://en.wikipedia.org/wiki/Name_mangling">C++ name-mangling</a> skewed the results too much and I was too lazy to hook in an &#8220;undecoratify&#8221; procedure.</p>
<p>Finally, on my 64-bit Windows installation, there are both 32-bit and 64-bit versions of many core DLLs (in the <tt>SysWOW64</tt> and <tt>System32</tt> directories respectively), so I encountered duplicates. A similar thing happened with <a href="http://msdn.microsoft.com/en-us/library/aa376307%28VS.85%29.aspx">SxS</a> DLLs.</p>
<p>Using <a href="https://gist.github.com/230c3f53261a20340118">this Python script I coded</a>, I generated a delimited text file with around 1500 entries that I manually scanned (I had some time to waste!) for &#8220;interesting&#8221; names. Below is a histogram of the function-name lengths. The rough bell shape gives me confidence that the script wasn&#8217;t totally off the mark.</p>
<p><a href="http://ashutoshmehra.net/images/posts/funcnames/histogram_large.png"><img border="0" src="http://ashutoshmehra.net/images/posts/funcnames/histogram_large.png" alt="A histogram of the length of functions names exported by system DLLs in the Windows directory"></a></p>
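<p>For readers who want the gist without opening the script, the filter-and-rank step can be sketched roughly as follows. This is an illustrative sketch, not the linked script: the function name <tt>rank_export_names</tt> and the shape of its input (a dict from DLL file name to a list of export names) are assumptions made for the example, and the export lists are assumed to have been extracted already (for instance with pefile).</p>

```python
def rank_export_names(exports_by_dll, top=10):
    """Return the `top` longest distinct export names as (length, name, dll) tuples.

    `exports_by_dll` is a hypothetical input: a dict mapping a DLL file name
    to the list of names that DLL exports.
    """
    first_seen = {}
    for dll, names in exports_by_dll.items():
        for name in names:
            # MSVC-mangled C++ exports start with '?'; skip them.
            if name.startswith("?"):
                continue
            # Remember only the first DLL for each name, so 32-bit/64-bit
            # and SxS copies of the same export collapse into one entry.
            first_seen.setdefault(name, dll)
    # Longest first; ties broken alphabetically for a stable order.
    ranked = sorted(
        ((len(name), name, dll) for name, dll in first_seen.items()),
        key=lambda entry: (-entry[0], entry[1]),
    )
    return ranked[:top]


sample = {
    "wcp.dll": ["RtlWriteDecodedUcsDataIntoSmartLBlobUcsWritingContext"],
    "advapi32.dll": ["CreatePrivateObjectSecurityWithMultipleInheritance",
                     "?Mangled@@YAXXZ"],
    "copy-of-advapi32.dll": ["CreatePrivateObjectSecurityWithMultipleInheritance"],
}
for length, name, dll in rank_export_names(sample):
    print(length, name, dll)
```

<p>Feeding the resulting lengths into any plotting tool would give a histogram like the one above.</p>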
<h4>Functions Taking Lots of Parameters</h4>
<p>A second axis along which to dig up &#8220;interesting&#8221; functions would be to count the <i>number of function params</i>.</p>
<p>This one is a bit fuzzy due to questions like &#8220;Do you count <i>deep</i>?&#8221; That is, when a function accepts a pointer to a struct, do you count the struct members as inputs too? For instance, <a href="http://msdn.microsoft.com/en-us/library/dd183501%28VS.85%29.aspx"><tt>CreateFontIndirectEx</tt> [gdi32.dll]</a> takes a pointer to an <a href="http://msdn.microsoft.com/en-us/library/dd162628%28VS.85%29.aspx"><tt>ENUMLOGFONTEXDV</tt></a> structure that holds about two dozen items (all things considered).</p>
<p>Counting params is also more work, at least for DLL exports (where you need to cross-reference with documentation/headers if the export is, at all, public). Otherwise, the problem can be approached differently by forgetting DLL exports entirely and instead using a crawler that walks the locally installed MSDN and parses the &#8220;Syntax&#8221; section to count the number of params &#8212; just assuming <tt>num_params = num_commas + 1</tt> might be good enough.</p>
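<p>The comma-counting heuristic could look something like the sketch below. The helper name <tt>count_params</tt> and the sample prototype are made up for illustration; the sketch assumes the &#8220;Syntax&#8221; text has already been extracted as a plain C prototype string, and it will over-count when a parameter is itself a function pointer with commas in its signature.</p>

```python
def count_params(syntax):
    """Estimate the parameter count of a C prototype by counting commas."""
    start = syntax.index("(")
    end = syntax.rindex(")")
    params = syntax[start + 1:end].strip()
    # An empty list or a lone `void` means the function takes no parameters.
    if params in ("", "void"):
        return 0
    return params.count(",") + 1


# Hypothetical prototype, used only to exercise the helper.
prototype = "BOOL WINAPI SomeFunction(HANDLE hObject, DWORD dwFlags, LPVOID lpData);"
print(count_params(prototype))
```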
<p>Anyway, manually browsing through some of MSDN, I chanced upon a few gems:</p>
<ul>
<li><a href="http://msdn.microsoft.com/en-us/library/aa374843%28VS.85%29.aspx"><tt>AccessCheckByTypeResultListAndAuditAlarmByHandle</tt></a>, our friend from above, true to its character, takes 17 parameters!</li>
<li><a href="http://msdn.microsoft.com/en-us/library/ms690529%28VS.85%29.aspx"><tt>OleCreateFromFileEx</tt></a> takes in a filename, interface ID, flags, sink, connection, site&#8230; 13 in all.</li>
</ul>
<h4>On a Serious Note</h4>
<p>Well-named functions (just like well-named variables) are good instant documentation. <i>Descriptive</i> function names are important when:</p>
<ul>
<li>Such functions all belong to a flat namespace (DLL exports or C code)</li>
<li>Several of them have very similar purposes (like the six or so <tt>AccessCheck*</tt> APIs)</li>
</ul>
<p>Such names become even more relevant when they define the public API of your library/system.</p>
<p>Some people (&#8220;constipated by bittersweet philosophy&#8221;?) dislike long function names, and I wonder why. The argument that longer function names take longer to type seems mostly a dud: Visual Studio with the awesome <a href="http://www.wholetomato.com/">Visual Assist X</a> addon does Intellisense beautifully; Emacs with <tt><a href="http://www.emacswiki.org/emacs/HippieExpand">hippie-expand</a></tt> does some good magic (the brave also have <a href="http://cedet.sourceforge.net/intellisense.shtml">CEDET</a>); Eclipse, BBEdit, and Vim surely have their own thing. So typing isn&#8217;t a problem. And with machines so fast and powerful, the (JIT) compilation/linking time for longer function names should be irrelevant in most cases.</p>
<h4>Links</h4>
<p>While researching this entry, I came across the post <a href="http://blogs.msdn.com/brada/archive/2005/01/09/349678.aspx">&#8220;Best&#8221; method names ever</a> by <a href="http://blogs.msdn.com/brada/default.aspx">Brad Abrams</a> that has some interesting content/comments.</p>
<p><i>[Note: Lest someone should give a wicked twist to the whole point of this post, I should add that I have most sincere respect for Microsoft engineers, enjoy using their programs every day and working with their APIs.]</i></p>
]]></content:encoded>
			<wfw:commentRss>http://ashutoshmehra.net/blog/2010/02/long-function-names/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Kernighan &amp; Plauger on Programming Style</title>
		<link>http://ashutoshmehra.net/blog/2009/06/kernighan-plauger-programming-style/</link>
		<comments>http://ashutoshmehra.net/blog/2009/06/kernighan-plauger-programming-style/#comments</comments>
		<pubDate>Mon, 08 Jun 2009 17:20:43 +0000</pubDate>
		<dc:creator>Ashutosh</dc:creator>
				<category><![CDATA[Programming]]></category>

		<guid isPermaLink="false">http://ashutoshmehra.net/blog/?p=556</guid>
		<description><![CDATA[

Last week I finished reading Kernighan &#038; Plauger&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<div class="alignright" style="margin-left: 5px"><a href="http://www.amazon.com/gp/product/0070342075?ie=UTF8&#038;tag=ashmehblo-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0070342075"><img border="0" src="http://ashutoshmehra.net/images/posts/teops/the_elements_of_programming_style_small_front.jpg" width="100px"></a><img src="http://www.assoc-amazon.com/e/ir?t=ashmehblo-20&#038;l=as2&#038;o=1&#038;a=0070342075" width="1" height="1" border="0" alt="" style="border:none !important; margin:0px !important;" /></div>
<style type="text/css">
  h4 {
  margin-top: 15px;
  }
  span.tipitem span {
  font-style: italic;
  font-weight: normal;
  color: #000000;
  white-space: nowrap;
  margin: 0 5px 0 5px;
  }
  ol.importantlist li {
  font-weight: bold;
  color: #60743A;
  margin-bottom: 10px;
  }
</style>
<p>Last week I finished reading <a href="http://www.cs.bell-labs.com/who/bwk/index.html">Kernighan</a> &#038; <a href="http://www.plauger.com/index.html">Plauger</a>&#8217;s beautiful book <a href="http://en.wikipedia.org/wiki/The_Elements_of_Programming_Style">The Elements of Programming Style</a>, the classic that <a href="http://www.plauger.com/books.html">pioneered</a> the term <i><a href="http://en.wikipedia.org/wiki/Programming_style">programming style</a></i>. I&#8217;ve excerpted below some rules of style from that book. I hope these get you excited about reading the book too!</p>
<p><span id="more-556"></span></p>
<h4>How the Book Works</h4>
<p>Here&#8217;s how the book works: The authors pick a simple &#8220;real-world&#8221; program (mostly from other programming texts) and comment critically on its style. For analyzing the style, they look at the expressions and statements (at the lowest level), program and control structures, I/O-handling, program efficiency and documentation. In addition to suggesting what&#8217;s wrong with the code, the authors fix it, sometimes rewriting the whole thing, to produce a simpler, cleaner, sometimes more efficient, and always more obviously correct program.</p>
<p>The experience is that of watching two master programmers doing a code-review!</p>
<p>You may read the text in one run. Or, you may challenge yourself and use it as a &#8220;problem-book&#8221; by studying the programs, analyzing them for style, and fixing them before reading what the masters have to say.</p>
<p>In all, it is a very pragmatic book &#8212; full of useful, practical advice.</p>
<p>A word of caution is due: The program fragments are in <a href="http://en.wikipedia.org/wiki/Fortran">Fortran</a> and <a href="http://en.wikipedia.org/wiki/PL/I">PL/I</a>. And while they use only the basic features of the languages, it is still somewhat of a quest to figure out the meaning of the longer Fortran programs infested with GOTOs and arithmetic-IFs. PL/I is much easier to read.</p>
<h4>Reaffirm Your Own Beliefs of Good Style</h4>
<p>There&#8217;s probably <i>not much new stuff</i> you&#8217;ll find in this book. Most things the authors say are perhaps ones you already knew, believed, and agreed with (if only subconsciously). And yet it is a delight to read the whole thing if only because it reaffirms those beliefs. This is close to how I felt when reading <i><a href="http://www.pragprog.com/the-pragmatic-programmer">The Pragmatic Programmer</a></i> &#8212; that book hardly had anything <i>new</i>, and yet it was a sheer pleasure to read.</p>
<div class="alignleft" style="margin-right: 5px"><a href="http://en.wikipedia.org/wiki/Brian_Kernighan"><img border="0" src="http://ashutoshmehra.net/images/posts/teops/kernighan_small.jpg" width="75px"></a></div>
<div class="alignleft" style="margin-right: 10px"><a href="http://en.wikipedia.org/wiki/P._J._Plauger"><img border="0" src="http://ashutoshmehra.net/images/posts/teops/plauger_small.jpg" width="75px"></a></div>
<p>Programming is far from being an exact science &#8212; perhaps it is an <a href="http://awards.acm.org/images/awards/140/articles/7143252.pdf">art</a> (or engineering?). As we, programmers, work day after day doing what we do, we build many notions (or guiding principles) about our craft. From reading books to blogging, from studying other people&#8217;s code to writing our own to fixing bugs, from discussions with colleagues &#8212; the entire software development experience helps us build various concepts of what constitutes good practices of programming. And once in a while, it is reassuring to have all these notions validated by the masters &#8212; expert programmers, like Kernighan and Plauger, whom we can safely trust to know about the art.</p>
<p>Reading this book will give such a reaffirmation to your own principles of programming. And if (alas!) you had been harboring some truly deviant ideas about programming, reading it would hopefully help set them right.</p>
<h4>What Has Changed in These Three Decades?</h4>
<p>It&#8217;s been more than three decades since the second edition of the book came out in 1978. So it is natural to wonder what, if anything, has changed about how we program and what we consider good programs. To what extent have our programming languages and programming principles advanced through the years? Is this book still relevant?</p>
<p>To be honest, I wasn&#8217;t even born until many years after the book came out, so I&#8217;m naturally not qualified to comment. But studying how the programs in the text, perhaps typical of their age, were written, I couldn&#8217;t help but notice the following.</p>
<p>There has been definite improvement in the following respects:</p>
<ul>
<li><i>Structured programming</i> and <i>structured control statements</i> (like <tt>if/then/else</tt>, <tt>for</tt>, <tt>while</tt>, <tt>break</tt>, <tt>continue</tt>, <tt>yield</tt>) have made some of the points (mainly about <tt>goto</tt>, but also some others) in this book less relevant.</li>
<li>Support for <i>recursion</i> appears to be universally available in all &#8220;modern&#8221; languages I can think of. (To contrast, the original Fortran didn&#8217;t allow recursion.)</li>
<li>Python/Haskell-style <i>layout</i> has solved many problems related to block-indentation.</li>
<li><i>Functional languages</i> like Haskell have made a certain class of problems, like forgetting initialization, close to impossible. (And while functional languages definitely existed many decades ago, I think they are much more widely known now &#8212; I may be wrong about this.)</li>
<li><i>Profilers and other instrumentation tools</i> for measuring timing performance and &#8220;hotspots&#8221; are available in plenty, though perhaps not used enough.</li>
<li><i>Module/Package/Namespace systems</i> are available in many mainstream languages.</li>
</ul>
<p>But we still have to routinely solve the same old problems and we still make the same old mistakes when solving them: What is the best way to design/structure correct and reliable programs? How to best validate input? How to correctly program with floating point numbers? &#8230; The book would perhaps provide some guidance with respect to these questions.</p>
<h4>The Rest of the Post&#8230;</h4>
<p>I&#8217;ve tried to collect some of the timeless wisdom of Kernighan and Plauger&#8217;s words as quotations in the remainder of the post. </p>
<p>Each chapter in their book is sprinkled with several terse lines summarizing the essence of the discussion. I&#8217;ve also collected some of those towards the end. If you enjoy and appreciate these, <i>you would definitely want to read the whole book</i>. As far as programming books go, this one is quite thin (the size of K&amp;R), so you have a good chance of actually finishing it!</p>
<h4>On Obscure Code</h4>
<p>The introductory chapter gives an example of how a Fortran program used the expression <tt>(I/J)*(J/I)</tt> to initialize <tt>V(I,J)</tt> to an identity matrix! Notice that with integer arithmetic, assuming <tt>I</tt> and <tt>J</tt> are positive (as array subscripts are), <tt>(I/J)*(J/I)</tt> is the same as <tt>I == J ? 1 : 0</tt>: whenever the two differ, one of the quotients truncates to zero.</p>
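<p>To see the trick side by side with the plain version, here is a small Python rendering. It is an illustration only, not code from the book; the function names are invented for the example, and Python&#8217;s <tt>//</tt> stands in for Fortran integer division, which it matches for positive operands.</p>

```python
def identity_obscure(n):
    """Build an n-by-n identity matrix with the (I/J)*(J/I) trick."""
    # Indices run from 1 to n, as Fortran subscripts do.
    return [[(i // j) * (j // i) for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def identity_clear(n):
    """Build the same matrix by saying what is meant."""
    return [[1 if i == j else 0 for j in range(1, n + 1)]
            for i in range(1, n + 1)]


for row in identity_obscure(3):
    print(row)
```

<p>Both functions produce the same matrix; only the second one tells the reader so at a glance.</p>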
<p>The authors point out why such code is wrong:</p>
<blockquote><p>The problem with obscure code is that debugging and modification become much more difficult, and these are already the hardest aspects of computer programming. Besides, there is the added danger that a too-clever program may not say what you thought it said. <i>(Page 2)</i></p></blockquote>
<h4>On Healthy Skepticism</h4>
<p>The first chapter ends with advice on healthy skepticism:</p>
<blockquote><p>Nevertheless, mistakes can occur. We encourage you to view with suspicion anything we say that looks peculiar. Test it, try it out. Don&#8217;t treat computer output as gospel. If you learn to be wary of everyone else&#8217;s programs, you will be better able to check your own. <i>(Page 7)</i></p></blockquote>
<p>This is reminiscent of Feynman&#8217;s <a href="http://www.feynmanlectures.info/flp_errata.html">words</a> <i>&#8220;You should, in science, believe logic and arguments, carefully drawn, and not authorities. &#8230; I am not sure how I did it, but I goofed. And you goofed, too, for believing me.&#8221;</i> (Page x, <i><a href="http://www.amazon.com/gp/product/0805390456?ie=UTF8&#038;tag=ashmehblo-20&#038;linkCode=as2&#038;camp=1789&#038;creative=390957&#038;creativeASIN=0805390456">Feynman Lectures on Physics</a>; Vol I</i>).</p>
<h4>On Clever Programming</h4>
<p>The chapter on expressions has the famous words on <i>clever programming</i> that are often quoted:</p>
<blockquote><p>Everyone knows that debugging is twice as hard as writing a program in the first place. So if you&#8217;re as clever as you can be when you write it, how will you ever debug it? <i>(Page 10)</i></p></blockquote>
<p>Simplicity and clarity trump stray microseconds:</p>
<blockquote><p>Simplicity and clarity are often of more value than the microseconds possibly saved by clever coding&#8230; Trivia rarely affect efficiency. Are all the machinations worth it, when their primary effect is to make the code less readable? <i>(Page 127)</i></p></blockquote>
<h4>On Temporary Variables</h4>
<p>Here&#8217;s some advice on the demerits of arbitrary temporary variables:</p>
<blockquote><p>The fewer temporary variables in a program, the less chance there is that one will not be properly initialized, or that one will be altered unexpectedly before it is used. &#8220;Temporary&#8221; is a dirty word in programming &#8212; it suggests that a variable can be used with less thought than a &#8220;normal&#8221; (permanent?) one, and it encourages the use of one variable for several unrelated calculations. Both are dangerous practices. <i>(Page 11)</i></p></blockquote>
<h4>The Telephone Test</h4>
<p>The authors present a peculiar test, to which the <i><a href="http://www.codinghorror.com/blog/archives/000962.html">Elevator Test</a></i> bears some resemblance, to assess code readability:</p>
<blockquote><p>A useful way to decide if some piece of code is clear or not is the &#8220;telephone test.&#8221; If someone could understand your code when read aloud over the telephone, it&#8217;s clear enough. If not, then it needs rewriting. Use the &#8220;telephone test&#8221; for readability. <i>(Page 21)</i></p></blockquote>
<h4>On the Shape of Programs</h4>
<p>The text of the program should be close to the process it evokes:</p>
<blockquote><p>It is a good rule of thumb that a program should read from top to bottom in the order that it will be executed; if this is not true, watch out for the bugs that often accompany poor structure. <i>(Page 37)</i></p></blockquote>
<h4>On Program Structuring</h4>
<p>Write short functions/classes with a well-defined purpose:</p>
<blockquote><p>When a program is not broken up into small enough pieces, the larger modules often fail to deliver on these promises. They try to do too much, or too many different things, and hence are difficult to maintain and are too specialized for general use. <i>(Page 59)</i> &#8230; Combining too many functions in one module is a sure way to limit its usefulness, while at the same time making it more complex and harder to maintain. <i>(Page 64)</i></p></blockquote>
<h4>On Premature Optimization</h4>
<blockquote><p>&#8220;Optimizing&#8221; too early in the life of a program can kill its chances for growth. <i>(Page 61)</i></p></blockquote>
<h4>On Loosely Coupled Modules</h4>
<blockquote><p>It must be possible to describe the function performed by a module in the briefest of terms and it is necessary to minimize whatever relationships exist with other modules, and display those that remain as explicitly as possible. This is how we obtain the minimum &#8220;coupling&#8221;, and hence maximum independence, between modules. <i>(Page 62)</i></p></blockquote>
<blockquote><p>As we have said several times, the hard part of programming is controlling complexity &#8212; keeping the pieces decoupled so they can be dealt with separately instead of all at once. And the need to separate into pieces is not some academically interesting point, but a practical necessity, to keep things from interacting with each other in unexpected ways. <i>(Page 95)</i></p></blockquote>
<h4>On Encapsulation</h4>
<blockquote><p>One good test of the worth of a module, in fact, is how good a job it does of hiding some aspect of the problem from the rest of the code. <i>(Page 65)</i></p></blockquote>
<h4>On Functions as Black-boxes</h4>
<blockquote><p>&#8230; break the job into five small functions, each one of which can be assimilated separately, then treated as a black box that does some part of the job. Once it works, we need no longer concern ourselves with how it does something, only with the fact that it does. We thus have some assurance that we can deal with the program a small section at a time without much concern for the rest of the code. There is no other way to retain control of a large program. <i>(Page 77)</i></p></blockquote>
<h4>On Top-down design</h4>
<blockquote><p>One of the better ways of [planning program structure] is what is often called &#8220;top-down design.&#8221; In a top-down design, we start with a very general pseudo-code statement of the program &#8230; and then elaborate this statement in stages, filling in details until we ultimately reach executable code. Not only does this help to keep the structure fairly well organized, and avoid getting bogged down in coding too early, but it also means that we can back up and alter bad decisions without losing too much investment. <i>(Page 71)</i></p></blockquote>
<h4>On Recursion</h4>
<blockquote><p>Learning to think recursively takes some effort, but it is repaid with smaller and simpler programs. Not every problem benefits from a recursive approach, but those that deal with data that is recursively defined often lead to very complicated programs unless the code is also recursive. <i>(Page 77)</i></p></blockquote>
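<p>As a small illustration of that point (not an example from the book), consider summing the numbers in an arbitrarily nested list. The data is defined recursively, so the recursive code is only a few lines; the function name <tt>nested_total</tt> is invented for this sketch.</p>

```python
def nested_total(tree):
    """Sum every number in a list that may contain further nested lists."""
    if isinstance(tree, list):
        # A list's total is the sum of its children's totals.
        return sum(nested_total(child) for child in tree)
    # Anything that is not a list is treated as a number (a leaf).
    return tree


print(nested_total([1, [2, [3, 4]], 5]))
```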
<h4>I/O Programming &#8212; Never Trust Any Data &amp; Remember the User</h4>
<blockquote><p>Input/output is the interface between a program and its environment. Two rules govern all I/O programming: <b>NEVER TRUST ANY DATA</b>, and <b>REMEMBER THE USER</b>. This requires that a program be as foolproof as is reasonably possible, so that it behaves intelligently even when used incorrectly, and that it be easy to use correctly. Ask yourself: Will it defend itself against the stupidity and ignorance of its users (including myself)? Would I want to have to use it myself? <i>(Page 97)</i></p></blockquote>
<h4>On Bounds-checking</h4>
<blockquote><p>Some compilers allow a check during execution that subscripts do not exceed array dimensions. This is a help &#8230; many programmers do not use such compilers because &#8220;They&#8217;re not efficient.&#8221; (Presumably this means that it is vital to get the wrong answers quickly.) <i>(Page 85)</i></p></blockquote>
<h4>On Bug Infestation</h4>
<blockquote><p>Where there are two bugs, there is likely to be a third. <i>(Page 102)</i></p></blockquote>
<h4>Floating Point Numbers Are Like Sandpiles</h4>
<blockquote><p>Floating point arithmetic adds a new spectrum of errors, all based on the fact that the machine can represent numbers only to a finite precision. <i>(Page 115)</i></p></blockquote>
<blockquote><p>As a wise programmer once said, &#8220;Floating point numbers are like sandpiles: every time you move one, you lose a little sand and you pick up a little dirt.&#8221; And after a few computations, things can get pretty dirty. <i>(Page 117)</i></p></blockquote>
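<p>A quick way to watch the sand leak out, sketched in Python (an illustration, not from the book): add <tt>0.1</tt> ten times with a plain loop and compare the result against <tt>math.fsum</tt> from the standard library, which keeps track of the lost low-order bits. The helper name <tt>naive_sum</tt> is made up for the example.</p>

```python
import math


def naive_sum(values):
    """Add floats one at a time, rounding after every addition."""
    total = 0.0
    for value in values:
        total += value
    return total


tenths = [0.1] * 10
print(naive_sum(tenths) == 1.0)   # the plain loop drifts away from 1.0
print(math.fsum(tenths) == 1.0)   # fsum rounds only once, at the end
```

<p>The first comparison fails and the second succeeds, because <tt>0.1</tt> has no exact binary representation and every intermediate addition rounds.</p>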
<h4>On Efficiency</h4>
<p>Concerns of efficiency must strike a balance with those of overall cost.</p>
<blockquote><p>Machines have become increasingly cheap compared to people; any discussion of computer efficiency that fails to take this into account is shortsighted. &#8220;Efficiency&#8221; involves the reduction of overall cost &#8212; not just machine time over the life of the program, but also time spent by the programmer and by the users of the program.</p>
<p>A clean design is more easily modified as requirements change or as more is learned about what parts of the code consume significant amounts of execution time. A &#8220;clever&#8221; design that fails to work or to run fast enough can often be salvaged only at great cost. Efficiency does not have to be sacrificed in the interest of writing readable code &#8212; rather, writing readable code is often the only way to ensure efficient programs that are also easy to maintain and modify. </p>
<p>To begin, let us state the obvious. If a program doesn&#8217;t work, it doesn&#8217;t matter how fast it runs. <i>(Page 123)</i></p></blockquote>
<h4>Algorithmic Improvements versus Tuning</h4>
<blockquote><p>How can we really speed it up? Fundamental improvements in performance are most often made by algorithm changes, not by tuning &#8230; There are two lessons. First, time spent selecting a good algorithm is certain to pay larger dividends than time spent polishing an implementation of a poor method. Second, for any given algorithm, polishing is not likely to significantly improve a fundamentally sound, clean implementation. It may even make things worse. <i>(Page 133&#8211;134)</i></p></blockquote>
<h4>On Profiling</h4>
<p>Profile and measure your code before making performance improvements.</p>
<blockquote><p>Beware of preconceptions about where a program spends its time. This avoids the error of looking in the wrong place for improvements. Of course, you have to have some working idea of which part of a program has the most effect on overall speed, but changes designed to improve efficiency should be based on solid measurement, not intuition. </p>
<p>A useful and cheap way to measure how a program spends its time is to count how many times each statement is executed. The resulting set of counts is called the program&#8217;s &#8220;profile&#8221; (a term first used by <a href="http://www-cs-faculty.stanford.edu/~knuth/">D. E. Knuth</a> in an <a href="http://oai.dtic.mil/oai/oai?verb=getRecord&#038;metadataPrefix=html&#038;identifier=AD0715513">article in Software Practice and Experience, April, 1971</a>). Some enlightened computer centers make available a &#8220;profiler&#8221; &#8230; <i>(Page 136)</i></p></blockquote>
<h4>On Documentation</h4>
<p>The sole truth about a program is its text.</p>
<blockquote><p>The only reliable documentation of a computer program is the code itself. The reason is simple &#8212; whenever there are multiple representations of a program, the chance for discrepancy exists. If the code is in error, artistic flowcharts and detailed comments are to no avail. Only by reading the code can the programmer know for sure what the program does. <i>(Page 141)</i></p></blockquote>
<h4>On What Documentation Should Comprise</h4>
<blockquote><p>In a project of any size it is vital to maintain readable descriptions of what each program is supposed to do, how it is used, how it interacts with other parts of the system, and on what principles it is based. These form useful guides to the code. What is not useful is a narrative description of what a given routine actually does on a line-by-line basis. Anything that contributes no new information, but merely echoes the code, is superfluous. <i>(Page 141)</i></p></blockquote>
<h4>On Following the Rules</h4>
<p>The book ends with the following paragraph on following the rules of programming style:</p>
<blockquote><p>To paraphrase an observation in <i><a href="http://en.wikipedia.org/wiki/The_Elements_of_Style">The Elements of Style</a></i>, rules of programming style, like those of English, are sometimes broken, even by the best writers. When a rule is broken, however, you will usually find in the program some compensating merit, attained at the cost of the violation. Unless you are certain of doing as well, you will probably do best to follow the rules. <i>(Page 159)</i></p></blockquote>
<h4>A Treasure Trove of Pithy Rules</h4>
<p>You have surely heard programming maxims like &#8220;make it right before you make it faster&#8221; or &#8220;don&#8217;t comment bad code &#8212; rewrite it&#8221;. Well, this book is generously sprinkled with such short, witty one-liners, each capturing the essence of its section. Below are some of those words of wisdom.<br/></p>
<ol class="importantlist">
<li>From the Introduction: <span class="tipitem"><span>Write clearly &#8211; don&#8217;t be too clever. </span></span>
  </li>
<li>On Expressions: <span class="tipitem"><span>Say what you mean, simply and directly. </span><span>Use library functions. </span><span>Avoid temporary variables. </span><span>Trying to outsmart a compiler defeats much of the purpose of using one. </span><span>Write clearly &#8211; don&#8217;t sacrifice clarity for &#8220;efficiency&#8221;. </span><span>Let the machine do the dirty work. </span><span>Replace repetitive expressions by calls to a common function. </span><span>Parenthesize to avoid ambiguity. </span><span>Choose variable names that won&#8217;t be confused. </span><span>Use the good features of a language; avoid the bad ones. </span></span>
  </li>
<li>On Control Structures: <span class="tipitem"><span>Use <tt>DO-END</tt> and indenting to delimit groups of statements. </span><span>Use <tt>IF-ELSE</tt> to emphasize that only one of two actions is to be performed. </span><span>Use <tt>DO</tt> and <tt>DO-WHILE</tt> to emphasize the presence of loops. </span><span>Make your programs read from top to bottom. </span><span>Use <tt>IF ... ELSE IF ... ELSE IF ... ELSE ...</tt> to implement multi-way branches. </span><span>Use the fundamental control flow constructs. </span><span>Write first in an easy-to-understand pseudo-language; then translate into whatever language you have to use. </span><span>Avoid <tt>THEN-IF</tt> and null-<tt>ELSE</tt>. </span><span>Avoid <tt>ELSE GOTO</tt> and <tt>ELSE RETURN</tt>. </span><span>Follow each decision as closely as possible with its associated action. </span><span>Use data arrays to avoid repetitive control sequences. </span><span>Choose a data representation that makes the program simple. </span><span>Don&#8217;t stop with your first draft. </span></span>
  </li>
<li>On Program Structures: <span class="tipitem"><span>Modularize. Use subroutines. </span><span>Make the coupling between modules visible. </span><span>Each module should do one thing well. </span><span>Make sure every module hides something. </span><span>Let the data structure the program. </span><span>Don&#8217;t patch bad code &#8211; rewrite it. </span><span>Write and test a big program in small pieces. </span><span>Use recursive procedures for recursively-defined data structures. </span></span>
  </li>
<li>On Input/Output: <span class="tipitem"><span>Test input for validity and plausibility. </span><span>Make sure input cannot violate the limits of the program. </span><span>Terminate input by end-of-file or marker, not by count. </span><span>Identify bad input; recover if possible. </span><span>Treat end-of-file conditions in a uniform manner. </span><span>Make input easy to prepare and output self-explanatory. </span><span>Use uniform input formats. </span><span>Make input easy to proofread. </span><span>Use free-form input when possible. </span><span>Use self-identifying input. Allow defaults. Echo both on output. </span><span>Localize input and output in subroutines. </span></span>
  </li>
<li>On Common Blunders: <span class="tipitem"><span>Make sure all variables are initialized before use. </span><span>Don&#8217;t stop at one bug. </span><span>Use debugging compilers. </span><span>Watch out for off-by-one errors. </span><span>Take care to branch the right way on equality. </span><span>Avoid multiple exits from loops. </span><span>Make sure your code &#8220;does nothing&#8221; gracefully. </span><span>Test programs at their boundary values. </span><span>Program defensively. </span><span>10.0 times 0.1 is hardly ever 1.0. </span><span>Don&#8217;t compare floating point numbers just for equality. </span></span>
  </li>
<li>On Efficiency and Instrumentation: <span class="tipitem"><span>Make it right before you make it faster. </span><span>Keep it right when you make it faster. </span><span>Make it clear before you make it faster. </span><span>Don&#8217;t sacrifice clarity for small gains in &#8220;efficiency.&#8221; </span><span>Let your compiler do the simple optimizations. </span><span>Don&#8217;t strain to re-use code; reorganize instead. </span><span>Make sure special cases are truly special. </span><span>Keep it simple to make it faster. </span><span>Don&#8217;t diddle code to make it faster &#8212; find a better algorithm. </span><span>Instrument your programs. Measure before making &#8220;efficiency&#8221; changes. </span></span>
  </li>
<li>On Documentation: <span class="tipitem"><span>Make sure comments and code agree. </span><span>Don&#8217;t just echo the code with comments &#8212; make every comment count. </span><span>Don&#8217;t comment bad code &#8212; rewrite it. </span><span>Use variable names that mean something. </span><span>Format a program to help the reader understand it. </span><span>Indent to show the logical structure of a program. </span><span>Document your data layouts. </span><span>Don&#8217;t over-comment. </span></span>
  </li>
</ol>
<p><br/></p>
]]></content:encoded>
			<wfw:commentRss>http://ashutoshmehra.net/blog/2009/06/kernighan-plauger-programming-style/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Solving Project Euler Problems 161 (Trominoes Tiling) and 185 (Number Mind) with ZDDs</title>
		<link>http://ashutoshmehra.net/blog/2009/03/solving-project-euler-problems-with-zdds/</link>
		<comments>http://ashutoshmehra.net/blog/2009/03/solving-project-euler-problems-with-zdds/#comments</comments>
		<pubDate>Sun, 08 Mar 2009 14:16:47 +0000</pubDate>
		<dc:creator>Ashutosh</dc:creator>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[Project Euler]]></category>
		<category><![CDATA[TAOCP]]></category>

		<guid isPermaLink="false">http://ashutoshmehra.net/blog/?p=157</guid>
		<description><![CDATA[Some months ago, I was re-introduced to Project Euler (PE) through the blogosphere. I had visited that site before, but had been left unmotivated by the first dozen or so problem statements that I read &#8212; neither the programming nor the math involved anything new or challenging. Only in the past few months, when I [...]]]></description>
			<content:encoded><![CDATA[<p>Some months ago, I was re-introduced to <a href="http://projecteuler.net/">Project Euler</a> (PE) through the blogosphere. I had visited that site before, but had been left unmotivated by the first dozen or so problem statements that I read &#8212; neither the programming nor the math involved anything new or challenging. Only in the past few months, when I attempted to crack some of the harder nuts, did I realize how interesting many PE problems could be &#8212; requiring a neat algorithm (or invoking a crucial theorem) for an efficient solution. <span id="more-157"></span></p>
<h4>My experience with Project Euler Problems</h4>
<p>As I attempted to solve more and more problems, I found a disproportionately large number of them centered around the theory of numbers. These involved important ideas of number sieves, congruences, continued fractions, the Farey series, solving <a href="http://en.wikipedia.org/wiki/Pell%27s_equation">Pell&#8217;s equation</a>, implementing the <a href="http://en.wikipedia.org/wiki/Shanks-Tonelli_algorithm">Shanks-Tonelli algorithm</a>, etc., and had me sifting through Hardy and Wright&#8217;s book every so often. But after a while they weren&#8217;t so much fun. I did, however, find enough problems that interested me to get me hooked &#8212; enumeration and dynamic programming (<a href="http://projecteuler.net/index.php?section=problems&#038;id=208">208</a>, <a href="http://projecteuler.net/index.php?section=problems&#038;id=161">161</a>, <a href="http://projecteuler.net/index.php?section=problems&#038;id=209">209</a>, <a href="http://projecteuler.net/index.php?section=problems&#038;id=172">172</a>, <a href="http://projecteuler.net/index.php?section=problems&#038;id=169">169</a>, <a href="http://projecteuler.net/index.php?section=problems&#038;id=215">215</a>, <a href="http://projecteuler.net/index.php?section=problems&#038;id=219">219</a>), probability (<a href="http://projecteuler.net/index.php?section=problems&#038;id=227">227</a>, <a href="http://projecteuler.net/index.php?section=problems&#038;id=213">213</a>) and those occasionally unique ones like <a href="http://projecteuler.net/index.php?section=problems&#038;id=197">197</a> (concerning the &#8220;steady-state&#8221; behavior of a certain sequence). Working through these problems made me realize what a treasure PE was. Kudos to Colin &#8220;Euler&#8221; Hughes and the PE-team for their effort in running this great site!</p>
<p>In addition to having the fun of solving the problems myself, I could study the solutions worked out by other members in the forum. Seeing the elegance, efficiency and analyses of some of their solutions was a rewarding (even if a bit humbling) experience. A case in point is the solution to <a href="http://projecteuler.net/index.php?section=problems&#038;id=208">Robot Walks (208)</a> by <tt>sajninredoc</tt> and <tt>stijn263</tt> (among others), where they reduce the enumeration problem to a single summation. And then there were those APL/J programmers with their cute one-liners!</p>
<p>In this entry, I shall outline my solutions (and their performance characteristics) to the <a href="http://projecteuler.net/index.php?section=problems&#038;id=161">Trominoes Tiling (161)</a> and <a href="http://projecteuler.net/index.php?section=problems&#038;id=185">Number Mind (185)</a> problems. To solve these problems, I used the <a href="http://en.wikipedia.org/wiki/Zero_suppressed_decision_diagram">ZDD</a> techniques I had just studied in Knuth&#8217;s <a href="http://www-cs-faculty.stanford.edu/~uno/fasc1b.ps.gz">pre-fascicle 1B</a> (now in print as <a href="http://www.amazon.com/Art-Computer-Programming-Fascicle-Techniques/dp/0321580508">Vol 4 Fascicle 1</a>). I had <a href="http://ashutoshmehra.net/blog/2008/12/notes-on-zdds/trackback/">blogged earlier</a> on <a href="http://www-cs-faculty.stanford.edu/~knuth/musings.html">Knuth&#8217;s Fun with ZDDs musing</a>.</p>
<h4>Trominoes Tiling (161): Enumerating Exact Covers using ZDDs</h4>
<p>Trominoes Tiling (<a href="http://projecteuler.net/index.php?section=problems&#038;id=161">161</a>) is almost the tiling problem of TAOCP 7.1.4 &#8212; (130) with the difference that only trominoes are allowed (no monominoes or dominoes) and the grid-size is slightly larger (<tt>9x12</tt> instead of <tt>8x8</tt>). I reused, with the obvious changes, the ZDD routines that I had coded while working out that section. Since Knuth has already explained the ideas involved so beautifully (see the text around 7.1.4 &#8212; (129)), I shall only briefly sketch the ZDD construction before giving some performance statistics.</p>
<p>To begin, we first model the tiling problem as an <a href="http://en.wikipedia.org/wiki/Exact_cover">Exact Cover</a>. This involves creating a boolean matrix <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_dc7ce529b32ce1c71aa499962a871b74.png" title="(a_{ij})" style="vertical-align:-20%;" class="tex" alt="(a_{ij})" /> of <tt>9x12</tt> = <tt>108</tt> columns (corresponding to cells of the board) and <tt>526</tt> rows (corresponding to the ways of placing the <tt>L</tt> and <tt>I</tt> trominoes in all possible orientations on a <tt>9x12</tt> grid). We have <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_1c9980b7224f0d30f9fca908a5d28d32.png" title="a_{ij} = 1" style="vertical-align:-20%;" class="tex" alt="a_{ij} = 1" /> iff tromino placement <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" /> occupies cell <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_17205bd7356afc72c6e1cad22749dbb0.png" title="j" style="vertical-align:-20%;" class="tex" alt="j" />. The strategy for placement numbering I used is shown below for the simpler <tt>4x4</tt> case (the cells are numbered in the usual row-major order). Since the placement strategy decides the variable-ordering of the ZDD and hence its size (unless, of course, we choose to sift/reorder variables later on), it is important to pick a placement strategy that is not too inefficient.<br/></p>
<p><center><div id="attachment_210" class="wp-caption aligncenter" style="width: 498px"><a href="http://ashutoshmehra.net/blog/wp-content/uploads/2009/03/tiling.png"><img src="http://ashutoshmehra.net/blog/wp-content/uploads/2009/03/tiling.png" alt="Tromino Placement Numbering" title="Tromino Placement Numbering" width="488" height="394" class="size-full wp-image-210" /></a><p class="wp-caption-text">Tromino Placement Numbering for a 4x4 grid (also used for the rows of the exact cover matrix and ZDD variable ordering)</p></div></center><br/></p>
<p><center><div id="attachment_235" class="wp-caption aligncenter" style="width: 371px"><a href="http://ashutoshmehra.net/blog/wp-content/uploads/2009/03/exactcovermatrix.png"><img src="http://ashutoshmehra.net/blog/wp-content/uploads/2009/03/exactcovermatrix.png" alt="Exact Cover Matrix (for the simplified 4x4 case)" title="Exact Cover Matrix" width="361" height="180" class="size-full wp-image-235" /></a><p class="wp-caption-text">Exact Cover Matrix (for the simplified 4x4 case)</p></div></center><br/></p>
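<p>As a rough cross-check of the row and column counts above, the placements can be generated mechanically. The sketch below is only an illustration: the function names <tt>tromino_placements</tt> and <tt>exact_cover_matrix</tt> are made up here, and the rows come out in a plain row-major scan, which need not match the placement numbering shown in the figure.</p>
<pre>
```python
def tromino_placements(rows, cols):
    """Return each I or L tromino placement as a frozenset of cell numbers.

    Cells are numbered in row-major order. The order of the returned list is
    an arbitrary scan order (an assumption of this sketch).
    """
    shapes = [
        [(0, 0), (0, 1), (0, 2)],  # I, horizontal
        [(0, 0), (1, 0), (2, 0)],  # I, vertical
        [(0, 0), (0, 1), (1, 0)],  # L: 2x2 block without its bottom-right cell
        [(0, 0), (0, 1), (1, 1)],  # L: 2x2 block without its bottom-left cell
        [(0, 0), (1, 0), (1, 1)],  # L: 2x2 block without its top-right cell
        [(0, 1), (1, 0), (1, 1)],  # L: 2x2 block without its top-left cell
    ]
    placements = []
    for r in range(rows):
        for c in range(cols):
            for shape in shapes:
                cells = [(r + dr, c + dc) for dr, dc in shape]
                # Keep the placement only if every cell lies on the board.
                if all(y in range(rows) and x in range(cols) for y, x in cells):
                    placements.append(frozenset(y * cols + x for y, x in cells))
    return placements


def exact_cover_matrix(rows, cols):
    """Build (a_ij): a[i][j] == 1 iff placement i occupies cell j."""
    return [[1 if j in p else 0 for j in range(rows * cols)]
            for p in tromino_placements(rows, cols)]


matrix = exact_cover_matrix(9, 12)
print(len(matrix), len(matrix[0]))
```
</pre>
<p>For the <tt>9x12</tt> board this gives <tt>526</tt> rows and <tt>108</tt> columns: <tt>90 + 84</tt> placements of the <tt>I</tt> tromino plus <tt>4 x 88</tt> placements of the <tt>L</tt> tromino.</p>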
<p>Having constructed the boolean matrix, to enumerate the tilings, we find the number of ways to select some rows of the matrix such that if we inspect any column, precisely one of the selected rows contains a <tt>1</tt> in that column. Hence the name &#8220;exact&#8221; cover &#8212; we neither want to leave any cell uncovered, nor do we want parts of two or more trominoes to overlap.</p>
<p>Exact covers can be enumerated using Knuth&#8217;s <a href="http://en.wikipedia.org/wiki/Knuth%27s_Algorithm_X">Algorithm X</a> &#8212; an efficient backtracking technique implemented using an idea Knuth calls <a href="http://en.wikipedia.org/wiki/Dancing_Links">Dancing Links</a> (<a href="http://www-cs-faculty.stanford.edu/~knuth/programs/dance.w">DANCE program</a>, <a href="http://lanl.arxiv.org/pdf/cs/0011047">paper at arXiv</a>, <a href="http://stanford-online.stanford.edu/seminars/knuth/000222-knuth-100.asx">Computer Musing video</a>, ASL implementations for <a href="http://stlab.adobe.com:8080/@md=d&#038;cd=//adobe_source_libraries/adobe/&#038;cdf=//adobe_source_libraries/adobe/dancing_links.hpp&#038;c=unH@//adobe_source_libraries/adobe/dancing_links.hpp">dancing_links</a> and <a href="http://stlab.adobe.com:8080/@md=d&#038;cd=//adobe_source_libraries/adobe/&#038;cdf=//adobe_source_libraries/adobe/dancing_links.hpp&#038;c=unH@//adobe_source_libraries/adobe/implementation/toroid.hpp">toroid_node_t</a>). Algorithm X not only <i>enumerates</i> the solutions, it in fact <i>generates</i> them all!</p>
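<p>The backtracking idea can be sketched in a few lines. This is a plain recursive search in the spirit of Algorithm X (always branch on the column with the fewest candidate rows); it does not use the dancing-links data structure, and the name <tt>count_exact_covers</tt> and the small 6-row, 7-column instance are assumptions made for illustration.</p>
<pre>
```python
def count_exact_covers(rows, columns):
    """Count the selections of rows that cover every column exactly once.

    Each row is a set of column labels. Simplified backtracking sketch,
    without the dancing-links representation.
    """
    all_rows = [frozenset(r) for r in rows]

    def search(uncovered, candidates):
        if not uncovered:
            return 1  # every column is covered exactly once
        # Branch on the column that the fewest remaining rows can cover.
        col = min(uncovered, key=lambda c: sum(1 for r in candidates if c in r))
        total = 0
        for row in candidates:
            if col in row:
                # Discard every row that clashes with the chosen one.
                remaining = [r for r in candidates if r.isdisjoint(row)]
                total += search(uncovered - row, remaining)
        return total

    return search(frozenset(columns), all_rows)


rows = [{3, 5, 6}, {1, 4, 7}, {2, 3, 6}, {1, 4}, {2, 7}, {4, 5, 7}]
print(count_exact_covers(rows, range(1, 8)))
```
</pre>
<p>On this instance the count is <tt>1</tt>: the only exact cover picks the first, fourth and fifth rows.</p>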
<p>To enumerate exact covers using ZDDs, using the <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_fe9bb6d8a0edbab4f541d976501129da.png" title="m\times n" style="vertical-align:-20%;" class="tex" alt="m\times n" /> matrix <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_dc7ce529b32ce1c71aa499962a871b74.png" title="(a_{ij})" style="vertical-align:-20%;" class="tex" alt="(a_{ij})" /> produced in the step above, we construct the boolean function (Eq. 7.1.4 &#8212; (129)):<br />
<center><img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_25f8b1b2356d5a5530d3bf454b1bc8ef.png" title="f(x_1,\ldots,x_m) = \bigwedge_{j=1}^n S_1(X_j)" style="vertical-align:-20%;" class="tex" alt="f(x_1,\ldots,x_m) = \bigwedge_{j=1}^n S_1(X_j)" /></center><br />
where boolean variable <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_6934f2728bd2faf498c0e5f163f0a6b5.png" title="x_i" style="vertical-align:-20%;" class="tex" alt="x_i" /> indicates selection of row <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" /> of the matrix (that is, placement of a tromino in the orientation <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" />), <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_861ea1f62f4123c9fd82e1e3352ac180.png" title="X_j" style="vertical-align:-20%;" class="tex" alt="X_j" /> = <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_272031c949db9f3a7a6bc656df4314a2.png" title="\{x_i | a_{ij} = 1\}" style="vertical-align:-20%;" class="tex" alt="\{x_i | a_{ij} = 1\}" /> is the set of rows <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" /> that have a <tt>1</tt> in column <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_17205bd7356afc72c6e1cad22749dbb0.png" title="j" style="vertical-align:-20%;" class="tex" alt="j" />, and <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_569d1d4fd0a693fbd6651ca3238a06ce.png" title="S_1" style="vertical-align:-20%;" class="tex" alt="S_1" /> is the <i>Symmetric Boolean Function</i> that is true if exactly one of its inputs is true.
The function <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_8cb65257e3377e4035d06ddf990b6657.png" title="f" style="vertical-align:-20%;" class="tex" alt="f" /> will be true iff for each column <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_17205bd7356afc72c6e1cad22749dbb0.png" title="j" style="vertical-align:-20%;" class="tex" alt="j" />, exactly one of the selected rows <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" /> has <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_02ad58c6e982b858ab5d2594c9363554.png" title="a_{ij}" style="vertical-align:-20%;" class="tex" alt="a_{ij}" /> = 1 &#8212; this is just the condition for exact cover!<br/></p>
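<p>To make the definition concrete, here is a brute-force evaluation of <tt>f</tt> on a tiny matrix. It is only meant to illustrate the formula: a real ZDD construction never enumerates all 2<sup>m</sup> vectors, and the 6-row, 7-column matrix and the helper names below are assumptions chosen for the example.</p>
<pre>
```python
from itertools import product


def s1(bits):
    """Symmetric function S_1: true iff exactly one input is true."""
    return sum(bits) == 1


def f(x, a):
    """f(x_1..x_m): AND over columns j of S_1(X_j), X_j = {x_i : a[i][j] == 1}."""
    m, n = len(a), len(a[0])
    return all(s1([x[i] for i in range(m) if a[i][j] == 1]) for j in range(n))


def satisfying_vectors(a):
    """Every 0/1 vector x that makes f true (exponential; toy sizes only)."""
    return [x for x in product((0, 1), repeat=len(a)) if f(x, a)]


# Illustrative matrix (an assumption of this sketch, not data from the post).
A = [
    [0, 0, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 0, 1],
]
print(satisfying_vectors(A))
```
</pre>
<p>The only vector printed is <tt>(1, 0, 0, 1, 1, 0)</tt>, that is, the first, fourth and fifth rows together cover each column exactly once.</p>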
<p>For various ways to efficiently construct the above ZDD, see Exercise 7.1.4 &#8212; 212. The function <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_569d1d4fd0a693fbd6651ca3238a06ce.png" title="S_1" style="vertical-align:-20%;" class="tex" alt="S_1" /> can itself be implemented using Exercise 207&#8217;s &#8220;Symmetrizing&#8221; operation.<br/></p>
<p>Once we have the ZDD for the boolean function <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_8cb65257e3377e4035d06ddf990b6657.png" title="f" style="vertical-align:-20%;" class="tex" alt="f" /> above, the <i>number of solutions</i>, i.e., the number of vectors <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_48928d13ced42d8a72b7bc5b9c4d194a.png" title="(x_1,\ldots,x_m)" style="vertical-align:-20%;" class="tex" alt="(x_1,\ldots,x_m)" /> that make <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_8cb65257e3377e4035d06ddf990b6657.png" title="f" style="vertical-align:-20%;" class="tex" alt="f" /> true (which are precisely the vectors representing exact covers) can be readily found using the ZDD-analog of Algorithm 7.1.4C.<br/></p>
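<p>For ZDDs the counting step is especially simple: a variable that is skipped on a path is just absent from the set, so no power-of-two correction is needed, and the count at a node is the sum of the counts of its two branches. The sketch below builds a toy ZDD from an explicit family of sets and counts its members with one memoized pass. The nested-tuple representation and the names <tt>build_zdd</tt> and <tt>count_solutions</tt> are assumptions of this example, not the implementation described in this post.</p>
<pre>
```python
ZERO = "ZERO"  # terminal node: the empty family
ONE = "ONE"    # terminal node: the family holding only the empty set


def build_zdd(family, variables):
    """Build a ZDD (nested tuples) for a family of sets over ordered variables.

    A node is (variable, lo, hi). Only practical for tiny families.
    """
    family = [frozenset(s) for s in family]
    if not family:
        return ZERO
    if not variables:
        return ONE
    v, rest = variables[0], variables[1:]
    lo = build_zdd([s for s in family if v not in s], rest)
    hi = build_zdd([s - {v} for s in family if v in s], rest)
    # Zero-suppression rule: drop a node whose hi branch is the empty family.
    return lo if hi == ZERO else (v, lo, hi)


def count_solutions(node, memo=None):
    """Number of sets in the family represented by `node`."""
    if memo is None:
        memo = {}
    if node == ZERO:
        return 0
    if node == ONE:
        return 1
    if node not in memo:
        _, lo, hi = node
        memo[node] = count_solutions(lo, memo) + count_solutions(hi, memo)
    return memo[node]


zdd = build_zdd([{1}, {2}, {1, 2}, set()], (1, 2))
print(count_solutions(zdd))
```
</pre>
<p>The toy family has four members, so the count printed is <tt>4</tt>. Note also that <tt>build_zdd([{1}], (1, 2))</tt> returns the single node <tt>(1, ZERO, ONE)</tt>: the variable <tt>2</tt> never appears, which is the zero-suppression at work.</p>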
<p><i>Runtime performance of the solution</i>: The resulting ZDD involved 526 variables and had a size of <tt>7893743</tt> zeads. Without invoking any garbage collection, the peak memory usage was <tt>~600MB</tt> (where each zead in my implementation was a <tt>20</tt>-byte node); the time taken was <tt>34s</tt> (user) and <tt>85s</tt> (elapsed). Given that my memos weren&#8217;t very optimized (I had used GCC&#8217;s <tt>std::map</tt>) and that I had not tried any variable reordering, the performance seemed reasonable.<br/></p>
<p>A dynamic programming approach (like the one I later used for <a href="http://projecteuler.net/index.php?section=problems&#038;id=215">Crack-free Walls (215)</a>) should have been able to give the result in about a second (this was confirmed by posts on the forum). Nevertheless, the ZDD solution was fast enough to keep my conscience clear of any violations of <a href="http://projecteuler.net/index.php?section=about">Project Euler&#8217;s one-minute-rule</a>.<br/></p>
<p><i>Aside</i>: The <a href="http://projecteuler.net/index.php?section=problems&#038;id=215">Crack-free Walls (215)</a> problem is rather like TAOCP Exercise 7.1.4 &#8212; 214 (Knuth calls it &#8220;faultfree&#8221;) and should be amenable to the ZDD attack. There does, however, appear to be a danger of running out of memory, since the <tt>10x32</tt> grid is somewhat large. I have not tried this approach.<br/></p>
<h4>Number Mind (185): Using ZDDs to satisfy an ad-hoc set of constraints</h4>
<p><a href="http://projecteuler.net/index.php?section=problems&#038;id=185">Number Mind (185)</a> was the other PE problem that I solved with ZDDs. In this problem we&#8217;re to uncover a 16-digit number given a set of &#8220;guesses&#8221; of the form:<br />
<tt><br />
5616185650518293 ;2 correct<br />
3847439647293047 ;1 correct<br />
5855462940810587 ;3 correct<br />
. . .<br />
</tt></p>
<p>The guesses along with the &#8220;hit-rates&#8221; provide partial information about the secret number. Our aim is to find the &#8220;secret&#8221; number corresponding to the set of guesses.</p>
<p>To solve this problem, we create variables <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_6880c20b2ca196156a42d330b95f3113.png" title="x_{i,j}" style="vertical-align:-20%;" class="tex" alt="x_{i,j}" /> representing the condition that the <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" /><sup>th</sup> digit (<img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_2afff0691791fbd2606dc4117a512d56.png" title="1\leq i\leq 16" style="vertical-align:-20%;" class="tex" alt="1\leq i\leq 16" />) is <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_17205bd7356afc72c6e1cad22749dbb0.png" title="j" style="vertical-align:-20%;" class="tex" alt="j" /> (<img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_731d15f2e46cf9802cf5a7d84f821fe4.png" title="0\leq j\leq 9" style="vertical-align:-20%;" class="tex" alt="0\leq j\leq 9" />). We then have the following constraints: </p>
<ul>
<li>Since each position <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" /> can hold just one digit, for each <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_b2d5fbf99790e9a8f89f51d876bf7d45.png" title="i" style="vertical-align:-20%;" class="tex" alt="i" /> exactly one of the <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_6880c20b2ca196156a42d330b95f3113.png" title="x_{i,j}" style="vertical-align:-20%;" class="tex" alt="x_{i,j}" /> must be true. Constraints of this kind correspond to terms like <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_33048833db55c02df3be7c605eb7c54f.png" title="S_1(x_{i0},\ldots,x_{i9})" style="vertical-align:-20%;" class="tex" alt="S_1(x_{i0},\ldots,x_{i9})" />, where <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_569d1d4fd0a693fbd6651ca3238a06ce.png" title="S_1" style="vertical-align:-20%;" class="tex" alt="S_1" /> is again our friend, the symmetric boolean function.</li>
<li>Each of the given guess &#8220;hit-rates&#8221; must be satisfied. As an example, the third constraint &#8220;<tt>5855462940810587 (3 correct)</tt>&#8221; is represented as:<br />
<center><img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_a60bfd11aef715abf28cbc57cace0534.png" title="S_3(x_{15},x_{28},x_{35},x_{45},x_{54},x_{66},x_{72},x_{89}, x_{94},x_{A0},x_{B8},x_{C1},x_{D0},x_{E5},x_{F8},x_{G7})" style="vertical-align:-20%;" class="tex" alt="S_3(x_{15},x_{28},x_{35},x_{45},x_{54},x_{66},x_{72},x_{89}, x_{94},x_{A0},x_{B8},x_{C1},x_{D0},x_{E5},x_{F8},x_{G7})" /></center>
</li>
</ul>
<p>Using the symmetrizing operator from Exercise 7.1.4 &#8212; 207, both of the above kinds of constraints are easily represented. Finally, we compute the <tt>AND</tt> (or <tt>INTERSECT</tt>, if one prefers the family-of-subsets point of view) of the individual constraints &#8212; and we&#8217;re left with the final ZDD representing the family of feasible solutions (in our case, the solution in fact turns out to be unique).</p>
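<p>The two kinds of constraints can be checked directly on a small instance. The sketch below brute-forces a 5-digit puzzle instead of building a ZDD, so it illustrates the constraints only, not the ZDD method; the six guesses and the names <tt>symmetric</tt> and <tt>solve</tt> are assumptions made for the example. Enumerating digit strings enforces the first kind of constraint automatically, since each position then holds exactly one digit.</p>
<pre>
```python
from itertools import product


def symmetric(k, bits):
    """S_k: true iff exactly k of the boolean inputs are true."""
    return sum(bits) == k


def solve(guesses):
    """Return every digit string consistent with all (guess, hits) pairs."""
    length = len(guesses[0][0])
    solutions = []
    for digits in product("0123456789", repeat=length):
        # For a guess g, the inputs to S_k are the variables x_{i,g[i]},
        # which are true exactly where the candidate agrees with the guess.
        if all(symmetric(hits, [d == g for d, g in zip(digits, guess)])
               for guess, hits in guesses):
            solutions.append("".join(digits))
    return solutions


# Illustrative 5-digit instance (hypothetical input for this sketch).
guesses = [
    ("90342", 2),
    ("70794", 0),
    ("39458", 2),
    ("34109", 1),
    ("51545", 2),
    ("12531", 1),
]
print(solve(guesses))
```
</pre>
<p>For this instance the list printed has a single entry, <tt>39542</tt>. The same brute force is hopeless for 16 digits, which is where a symbolic representation of the constraints earns its keep.</p>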
<p><i>Runtime performance of the solution</i>: The ZDD had <tt>160</tt> variables <img src="http://ashutoshmehra.net/blog/wp-content/plugins/easy-latex/cache/tex_6880c20b2ca196156a42d330b95f3113.png" title="x_{i,j}" style="vertical-align:-20%;" class="tex" alt="x_{i,j}" />. Program execution had a peak memory usage of <tt>~1GB</tt> without any garbage collection or reordering (zead size: <tt>20</tt> bytes), and the largest partial function had <tt>4665450</tt> zeads. The running time was <tt>~16s</tt> (both user and elapsed).</p>
<h4>Conclusion</h4>
<p>Comparing the ZDD solutions to the two PE problems, I think it was on the &#8220;unstructured&#8221; kind of problem, like Number Mind, that the ZDD technique really shone.</p>
]]></content:encoded>
			<wfw:commentRss>http://ashutoshmehra.net/blog/2009/03/solving-project-euler-problems-with-zdds/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
<enclosure url="http://stanford-online.stanford.edu/seminars/knuth/000222-knuth-100.asx" length="75" type="video/x-ms-asf" />
		</item>
	</channel>
</rss>
