{"id":314,"date":"2005-06-01T20:31:35","date_gmt":"2005-06-01T20:31:35","guid":{"rendered":"http:\/\/fiveforks.com\/jeb\/2005\/06\/neat_story_pipe\/"},"modified":"2005-06-01T20:31:35","modified_gmt":"2005-06-01T20:31:35","slug":"neat_story_pipe","status":"publish","type":"post","link":"https:\/\/www.fiveforks.com\/jeb\/2005\/06\/neat_story_pipe\/","title":{"rendered":"Neat Story (Piper)"},"content":{"rendered":"<p>From: TedCashel@atlmug.org (Ted Cashel)<\/p>\n<p>Date: Wed Mar 31, 1999  11:10:17 PM US\/Eastern<\/p>\n<p>To: jeb@5forks.com<\/p>\n<p>Cc: ndcashel@yahoo.com<\/p>\n<p>Subject: Fwd: Neat story<\/p>\n<p>Reply-To: Ted_Cashin@atlmug.org (Ted Cashin)<\/p>\n<p>Attachments: There is 1 attachment<\/p>\n<p>(Embedded image moved to file: pic09930.jpg)<\/p>\n<p>jcashel@harland.net didn&#8217;t work so I am trying you here.<\/p>\n<p>I got this from a recent TidBITS and it reminded me of your prime number program. This guy has a more complex program (albeit nearly as pointless) but has the background to have tried it on a proud line of mainframes and modern PCs.<\/p>\n<p>Power Macintosh G3: The Cannonball Express<\/p>\n<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;<\/p>\n<p>by Rick Holzgrafe<\/p>\n<p>The Cannonball Express was the fabled train that was so fast it took three men to say &#8220;Here she comes,&#8221; &#8220;Here she is,&#8221; and &#8220;There she goes.&#8221; Computers are fast too, although unlike trains, most aren&#8217;t self-propelled. What makes a computer fast, and how much effect does software design have? How much faster are today&#8217;s computers than yesterday&#8217;s? Recently I revisited some of these questions, beginning with a trip down memory lane.<\/p>\n<p><!--more--><br \/>\n**Back in the Stone Age** &#8212; Twenty years ago, I was teaching myself programming and had access to a DEC PDP-11\/60 minicomputer on evenings and weekends. This beast was bigger than a washing machine, and during workdays I shared it with two dozen other technicians and engineers. I found a word puzzle in a magazine and thought it would be fun to program the PDP to solve it. The puzzle was as follows.<\/p>\n<p>Given a phrase and a sheet of graph paper, write the phrase on the graph paper according to these rules:<\/p>\n<p>1. Write one letter per square on the graph paper, like filling in a crossword puzzle. Ignore case and anything that&#8217;s not one of the 26 letters of the alphabet. The phrase &#8220;N. 1st Street&#8221; is thus identical to &#8220;NSTSTREET&#8221;.<\/p>\n<p>2. Put the first letter of the phrase in any square you like.<\/p>\n<p>3. After writing any letter, put the next letter of the phrase in any adjacent square. Here, &#8220;adjacent&#8221; means any of the eight neighboring squares, up, down, left, right, or diagonally. You may reuse a square if it is adjacent and already holds the letter you need; otherwise you must use a blank square. You can&#8217;t use the same square twice in a row &#8211; no &#8220;hopping in place&#8221; for double letters.<\/p>\n<p>The goal is to write the phrase inside a rectangle of the smallest possible area. (A subtle point: you are not trying to write in a minimal number of squares.) To score your solution, draw the smallest enclosing rectangle you can and take its area. The rectangle may enclose some blank squares; count them, too.<\/p>\n<p>Got it? 
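<\/p>\n<p>The rules above are concrete enough to sketch in code. The following is my own minimal Python illustration, not the author&#8217;s Piper source: a candidate solution is represented as a list of (row, column) squares, one per letter of the cleaned phrase, and we check it against the rules and score its enclosing rectangle, blank squares included.<\/p>

```python
def clean(phrase):
    # Rule 1: keep only the 26 letters, ignoring case.
    return ''.join(c for c in phrase.upper() if c.isalpha())

def is_valid(path, letters):
    # path: list of (row, col) squares, one per letter in letters.
    grid = {}
    prev = None
    for sq, ch in zip(path, letters):
        if sq == prev:
            return False      # no hopping in place for double letters
        if prev is not None:
            if max(abs(sq[0] - prev[0]), abs(sq[1] - prev[1])) != 1:
                return False  # next letter must go in one of the 8 neighbors
        if sq in grid and grid[sq] != ch:
            return False      # a square may be reused only for the same letter
        grid[sq] = ch
        prev = sq
    return True

def area(path):
    # Score: area of the smallest enclosing rectangle, blanks counted too.
    rows = [r for r, c in path]
    cols = [c for r, c in path]
    return (max(rows) - min(rows) + 1) * (max(cols) - min(cols) + 1)
```

<p>For example, clean(&#8216;N. 1st Street&#8217;) returns &#8216;NSTSTREET&#8217;, and the three-square placement of &#8220;POP&#8221; that reuses its P square is valid and scores an area of 2.<\/p>\n<p>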
Tongue-twisters are the most fun because they have lots of opportunities to reuse whole snaky strings of squares. The 37 letters in &#8220;Peter Piper picked a peck of pickled peppers&#8221; can be packed into a 3 by 5 rectangle, like this (view this in a monospaced font):<\/p>\n<p><code>OFIPT<br \/>\nKCPER<br \/>\nLEDAS<\/code><\/p>\n<p>In those days I knew computers were &#8220;fast&#8221; but had no idea how fast. The answer turned out to be &#8220;not very.&#8221; I wrote a program to solve these puzzles and called it Piper after the tongue-twister. I set Piper running on a medium-length phrase on a Friday evening, and came back on Monday to find it still running. It had found several less-than-best solutions but hadn&#8217;t finished. Way too slow &#8211; I found a better solution myself on paper in about half an hour.<\/p>\n<p>Why did it take so long? Piper was a &#8220;brute force&#8221; program. It tried every possible solution to the problem, one after another. The trouble is that there are too many possible solutions. Exactly how many depends on the phrase, but for any non-trivial phrase the number is astronomical. I realized for the first time that &#8220;fast&#8221; sometimes isn&#8217;t &#8220;fast enough.&#8221; This point may be obvious today, when we all use computers and are weary of waiting for them. But in 1979, that PDP-11 was only the second computer I had ever seen!<\/p>\n<p>**What Part of Fast Don&#8217;t You Understand?** I saw that I would have to make Piper faster. There are two basic ways to speed up a program. Plan A is to find a better way of solving the problem, but after twenty years I still haven&#8217;t thought of a better solution. That leaves plan B, the classic efficiency expert&#8217;s solution: eliminate unnecessary steps. 
For example, Piper created every possible solution, then calculated the area of each. It built each solution one letter at a time, so instead of taking the area only for completed solutions, I changed Piper to check the area after placing each letter. If placing a letter made the solution-in-progress take up more space than the smallest complete solution found so far, Piper could skip the rest of that solution (and all other solutions that started the same way) and move right on to the next one. This eliminated a huge amount of work and greatly improved Piper&#8217;s speed. Finding clever ways to track the area of a growing solution helped too, because it was faster than calculating the area from scratch after each letter. I also found a way to calculate a minimum size for the final solution quickly: I couldn&#8217;t guarantee that the best solution would be that small, but I could guarantee that it wouldn&#8217;t be smaller. If Piper got lucky and found a solution as small as that calculated minimum, it could stop immediately. Otherwise it would continue on after finding the best solution, vainly seeking a still better one.<\/p>\n<p>Eventually Piper became clever enough to finish that original phrase in a reasonably short period of time. But the holy grail continued to elude me: I wanted a solution for &#8220;How much wood would a woodchuck chuck if a woodchuck could chuck wood?&#8221; That PDP (and, perhaps, my cleverness) were not up to the task. I had run out of ideas for speeding up Piper, and runs still took longer than a weekend. 
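<\/p>\n<p>The pruning described above (abandon any partial solution whose enclosing rectangle is already at least as large as the best complete solution found so far) can be sketched as a small recursive search. This is my own toy reconstruction in Python, not the original Piper source, and it leaves out the incremental area bookkeeping and the precomputed lower bound the author mentions.<\/p>

```python
def solve(phrase):
    # Toy reconstruction of the pruned search; returns the smallest
    # enclosing-rectangle area found for the phrase.
    letters = ''.join(c for c in phrase.upper() if c.isalpha())
    best = [float('inf')]   # area of the best complete solution so far

    def extend(i, last, grid, rmin, rmax, cmin, cmax):
        area = (rmax - rmin + 1) * (cmax - cmin + 1)
        if area >= best[0]:
            return          # prune: this partial solution can never win
        if i == len(letters):
            best[0] = area  # a new best complete solution
            return
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue          # no hopping in place
                sq = (last[0] + dr, last[1] + dc)
                old = grid.get(sq)
                if old is not None and old != letters[i]:
                    continue          # square already holds another letter
                grid[sq] = letters[i]
                extend(i + 1, sq, grid,
                       min(rmin, sq[0]), max(rmax, sq[0]),
                       min(cmin, sq[1]), max(cmax, sq[1]))
                if old is None:
                    del grid[sq]      # undo the trial placement
                else:
                    grid[sq] = old

    extend(1, (0, 0), {(0, 0): letters[0]}, 0, 0, 0, 0)
    return best[0]
```

<p>Without the test at the top of extend, the search visits every one of the astronomical number of placements; with it, whole families of doomed solutions are skipped at once, which is the speedup described above.<\/p>\n<p>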
But if I couldn&#8217;t improve Piper, I could at least hope to run it on a faster computer.<\/p>\n<p>**Big Iron** &#8212; People tend to think of processor speed as the speed of a computer, but many factors affect overall performance. Virtual memory lets you work on bigger data sets or on more problems at a time, but it&#8217;s slow, so adding more physical RAM helps by reducing your reliance on virtual memory. Faster disks and I\/O buses load and save data more quickly. RAM disks and disk caching replace slow disk operations with lightning-quick RAM access. Instruction and data caches in special super-fast RAM offer big improvements for some programs. Well-written operating systems and toolboxes can outrun poorly written ones.<\/p>\n<p>But in the end, little of this matters to Piper. Piper has always used only a small amount of data, doesn&#8217;t read or write the disk after it gets going, and does little I\/O of any kind. With its small code and data set Piper can take good advantage of data and instruction caching, but what it mostly needs is &#8220;faster hamsters&#8221; &#8211; a faster processor to make the wheels turn more quickly.<\/p>\n<p>As the years rolled on, I ran versions of Piper on my first Macs, but in the middle 1980&#8217;s I worked for Apple Computer, and had access to a programmer&#8217;s dream: Apple&#8217;s $15 million Cray Y-MP supercomputer, one of only two dozen in the world and arguably the fastest computer in existence at the time. I figured the Cray would make short work of Piper. But the Cray was not well suited to the problem. It could barrel through parallel-processing floating-point matrix calculations like the Cannonball Express, but Piper was a highly linear, non-mathematical problem. 
Piper used only one of the Cray&#8217;s four processors and didn&#8217;t do the kind of operations at which the Cray excelled. Piper wasn&#8217;t a fair test of the Cray&#8217;s power, but the Cray was still the fastest machine I&#8217;d ever used. The Cray succeeded where all previous machines (that PDP, my Mac Plus, my Mac II) had failed. It solved &#8220;woodchuck&#8221; in less than a day, taking only about 20 hours to finish its run. I was awestruck &#8211; 20 hours?! I&#8217;d no idea that &#8220;woodchuck&#8221; was _that_ big a problem!<\/p>\n<p>**Young Whippersnappers** &#8212; I set Piper aside for many years, but recently I began to wonder how a modern desktop box compares to those old minicomputers and mainframes. I rewrote Piper from memory and ran it on my new 400 MHz ice-blue Power Macintosh G3 with &#8220;woodchuck.&#8221; The output is below. Piper first reprints the phrase, then prints solutions and elapsed times as it finds them. Each solution is the best found so far, culminating in the best of all. The times are in seconds from the beginning of the run; the final time is the total run time. (Unfortunately, the best solution for &#8220;woodchuck&#8221; is larger than Piper&#8217;s calculated minimum, so Piper continued to run for a bit after finding the best solution.)<\/p>\n<p>Here are the results. Some intermediate solutions have been left out for brevity, but you can see Piper finding ever smaller solutions. In the end, the 57 letters in &#8220;woodchuck&#8221; are packed into a 4 by 4 rectangle. 
Have a look at Piper&#8217;s total run time, and the time needed to find its best solution:<\/p>\n<p>How much wood would a woodchuck chuck if a woodchuck could chuck wood?<\/p>\n<p>0 seconds:<\/p>\n<p><code>      ULD  ADLU<br \/>\nHDOAIUCOHWUHOD<br \/>\nUCOWFKHDOMCWOW<br \/>\nK   WO<\/code><\/p>\n<p>1 seconds:<\/p>\n<p><code>   DLIFADLU<br \/>\nUCKOHWUHOD<br \/>\nOHDWOMCWOW<\/code><\/p>\n<p>2 seconds:<\/p>\n<p><code>      ULCHC<br \/>\nHWUHODKUK<br \/>\nOMCWOWAFI<\/code><\/p>\n<p>7 seconds:<\/p>\n<p><code>   HWUHOW<br \/>\nOMCWOD<br \/>\nIKAUCW<br \/>\nFLDHK<\/code><\/p>\n<p>9 seconds:<\/p>\n<p><code>   HWUH<br \/>\nOMCW<br \/>\nLUOK<br \/>\nHDOI<br \/>\nCWAF<\/code><\/p>\n<p>65 seconds:<\/p>\n<p><code>   HWM<br \/>\nOUC<br \/>\nIKH<br \/>\nFWC<br \/>\nADO<br \/>\nLUO<\/code><\/p>\n<p>67 seconds:<\/p>\n<p><code>   HWMU<br \/>\nOOCH<br \/>\nUDWK<br \/>\nLAFI<\/code><\/p>\n<p>Total run time: 107 seconds<\/p>\n<p>There you have it: a shade over a minute to find the best solution, under two minutes to finish its run. Two minutes! So much for the big iron of the 1980&#8217;s. My new G3 Mac finished &#8220;woodchuck&#8221; over 600 times faster (and 5,000 times cheaper) than that 15 megabuck Cray. (For a more realistic comparison, see this description of UCLA&#8217;s Project Appleseed.)<\/p>\n<p><a href=\"http:\/\/www.apple.com\/education\/hed\/aua0101\/appleseed\/\">www.apple.com\/education\/hed\/aua0101\/appleseed\/<\/a><\/p>\n<p>If you want to check Piper&#8217;s speed on your Mac, I&#8217;ve placed the code in the public domain; it&#8217;s a 40K package.<\/p>\n<p>ftp:\/\/ftp.tidbits.com\/pub\/tidbits\/misc\/piper.hqx<\/p>\n<p>**The Future** &#8212; What&#8217;s yet to come? 400 MHz already looks a little pokey. It&#8217;s the best Apple offers today, but I&#8217;ve seen claims of 550 MHz or so from third party accelerators and overclocking tricks. 
People are predicting 1 GHz (1,000 MHz) chips for the near future. Buses are getting faster, and caches hold more data in less space and are moving onto the processor chip for still more speed. (Small is fast. Did you know that the speed of light is a serious limiting factor in modern computer design? The closer together the components are, the faster they can signal each other.)<\/p>\n<p>And like the old Cray, multi-processor desktop systems are starting to appear. They gang up on a problem by having separate processors work on different parts of the problem simultaneously. Although I didn&#8217;t try to use the Cray&#8217;s extra processors, I&#8217;ve done a little thinking lately. Piper doesn&#8217;t have to be completely linear. On an eight-processor system, I bet I could come close to making Piper run in one-eighth of the time of a single processor.<\/p>\n<p>Are you ready?<\/p>\n<p>Here she comes &#8211;<\/p>\n<p>Here she is &#8211;<\/p>\n<p>There she GOES!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>From: TedCashel@atlmug.org (Ted Cashel) Date: Wed Mar 31, 1999 11:10:17 PM US\/Eastern To: jeb@5forks.com Cc: ndcashel@yahoo.com Subject: Fwd: Neat story Reply-To: Ted_Cashin@atlmug.org (Ted Cashin) Attachments: There is 1 attachment (Embedded image moved to file: pic09930.jpg) jcashel@harland.net didn&#8217;t work so I &hellip; <a href=\"https:\/\/www.fiveforks.com\/jeb\/2005\/06\/neat_story_pipe\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[16],"tags":[],"class_list":["post-314","post","type-post","status-publish","format-standard","hentry","category-apple"],"_links":{"self":[{"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/posts\/314","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/comments?post=314"}],"version-history":[{"count":0,"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/posts\/314\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/media?parent=314"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/categories?post=314"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.fiveforks.com\/jeb\/wp-json\/wp\/v2\/tags?post=314"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}