<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>M@Blog &#187; Coding</title>
	<atom:link href="http://mattwork.potsdam.edu/blog/category/general-coding/feed/" rel="self" type="application/rss+xml" />
	<link>http://mattwork.potsdam.edu/blog</link>
	<description></description>
	<lastBuildDate>Wed, 18 Nov 2009 23:09:58 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>An Anniversary, of sorts</title>
		<link>http://mattwork.potsdam.edu/blog/2009/11/18/an-anniversary-of-sorts/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/11/18/an-anniversary-of-sorts/#comments</comments>
		<pubDate>Wed, 18 Nov 2009 23:09:58 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Life]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=599</guid>
		<description><![CDATA[Twenty years ago this week, I wrote my first real computer program, kinda. I know this only because I found the source code in an unlikely place recently, and surprisingly I dated it in the comments. It was on a VIC-20, written in BASIC. I say &#8220;kinda&#8221;, because I didn&#8217;t write it, originally, I improved [...]]]></description>
			<content:encoded><![CDATA[<p>Twenty years ago this week, I wrote my first real computer program, <em>kinda</em>. I know this only because I found the source code in an unlikely place recently, and surprisingly I dated it in the comments. It was on a <a href="http://en.wikipedia.org/wiki/VIC_20">VIC-20</a>, written in <a href="http://en.wikipedia.org/wiki/BASIC">BASIC</a>. I say &#8220;kinda&#8221;, because I didn&#8217;t write it originally; I improved it. I took a one-keyboard multiplayer baseball game (QWEASDZXC (left) for player 1 and UIOJKLM&lt;&gt; (right) for player 2) and wrote what became my first network protocol (I didn&#8217;t know that until a few years later) so that two copies could run on two VIC-20s connected by a serial cable. To this day I&#8217;ve never met or heard of anyone who networked two VIC-20s together.</p>
<p>Finding my copy of BASICball, re-reading my cute little grade-school comments (&#8220;Screen math is stupid&#8221;), remembering anecdotes about what was going on at the time (&#8220;Our class gerbil is coming home with me for Thanksgiving break!!!!&#8221;), caught me up in an atypical wave of nostalgia. And in that wave, in looking back &#8211; really thinking about everything I have written over the years: millions upon millions of lines &#8211; two things really stand out as being key to my success: <a href="http://en.wikipedia.org/wiki/Worse_is_better">Worse is Better</a>, and <a href="http://en.wikipedia.org/wiki/Modularity_%28programming%29">Modularity</a> is Ultimate.</p>
<p>I&#8217;ll write more about those things, I&#8217;m sure, but for now: Happy 20th Anniversary BASICball v6.0.</p>
<p>I just aged a bit, happily.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/11/18/an-anniversary-of-sorts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More About Endocrys</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/30/more-about-endocrys/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/30/more-about-endocrys/#comments</comments>
		<pubDate>Sat, 31 Oct 2009 02:12:22 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=584</guid>
		<description><![CDATA[I previously mentioned that I&#8217;ve re-acquired rights to Endocrys, and that I was excited about it. My copious free time has been spent, of late, ripping it apart and making it cleaner and applying the lessons learned over 7 years of maintaining a sizable (458 system (peak)) Endocrys network.
Endocrys has two primary modular components: Autocrys [...]]]></description>
			<content:encoded><![CDATA[<p>I <a href="http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/">previously mentioned</a> that I&#8217;ve re-acquired rights to Endocrys, and that I was excited about it. My copious free time has been spent, of late, ripping it apart and making it cleaner and applying the lessons learned over 7 years of maintaining a sizable (458 system (peak)) Endocrys network.</p>
<p>Endocrys has two primary modular components: Autocrys and Paracrys.</p>
<p><strong>Autocrys</strong> is an extensible communication protocol atop XMPP. It governs the syntax of commands or queries sent to systems or groups, the responses of systems to those queries, how to manage their presence, and how to react to presence changes in others.</p>
<p><strong>Paracrys</strong> is a database-driven deployment and configuration system. Paracrys allows module code and configuration data to be stored centrally and deployed to Endocrys nodes on-demand. Paracrys fully supports versioning, thus allowing changes to be rolled-back in the case of a major oopsie. How small can a Paracrys module be? Here&#8217;s an example that implements a command called &#8217;shell&#8217; that allows you to do, essentially, whatever you want on an Endocrys client:</p>
<pre>BEGIN { $Endo::MODS{SHELL}++; $Endo::CMDS{SHELL} = \&amp;shell; }
END { delete $Endo::MODS{SHELL}; delete $Endo::CMDS{SHELL}; }

sub shell {
 return `@_`;
}</pre>
<p>Drop that puppy into the Paracrys MODULES table with some other data, issue a mass &#8220;fetch module SHELL; refresh;&#8221; command, and bingo, all of your systems now let you do very bad things. It&#8217;s that easy to create a command to do something&#8230; Hopefully something useful.</p>
<p>Of course you should note that there is no access control in the above code&#8230; How do we prevent Bad People from using our horrendously very bad shell command? That used to be managed by the Communication Masters using another database called EndoACL, but has been folded into Paracrys&#8217; duties and drastically simplified. Each Endocrys client, when receiving the shell command, will now ask Paracrys if the user who sent it is authorized to issue that command. Previously, the clients never even received commands from users not authorized to send them, at great expense.</p>
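<p>As a rough sketch of that client-side check (this is not the actual Endocrys code): the dispatcher below looks a command up in a table shaped like %Endo::CMDS and asks for permission before running it. The is_authorized() function, the %ACL hash and the example addresses are made-up stand-ins for the real question to Paracrys, which would be a database lookup.</p>
<pre>use strict;
use warnings;

# Hypothetical stand-in for the Paracrys ACL: user =&gt; { COMMAND =&gt; 1 }.
my %ACL = ( 'admin@example.org' =&gt; { SHELL =&gt; 1 } );

# Command table in the style of %Endo::CMDS. This handler only reports
# what it would run, so the sketch is safe to execute.
my %CMDS = ( SHELL =&gt; sub { return "would run: @_" } );

# Assumed helper: the real client would ask Paracrys here.
sub is_authorized {
  my ($user, $cmd) = @_;
  return 0 unless exists $ACL{$user};
  return $ACL{$user}{$cmd} ? 1 : 0;
}

# Look the command up, check the sender, and only then call the handler.
sub dispatch {
  my ($user, $cmd, @args) = @_;
  $cmd = uc $cmd;
  my $handler = $CMDS{$cmd};
  return "unknown command" unless $handler;
  return "denied" unless is_authorized($user, $cmd);
  return $handler-&gt;(@args);
}

print dispatch('admin@example.org', 'shell', 'uptime'), "\n";    # would run: uptime
print dispatch('mallory@example.org', 'shell', 'uptime'), "\n";  # denied</pre>
<p>The check sits in the dispatcher rather than in each module, so a three-line module like the one above stays three lines.</p>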
<p>One of the major goals of the project originally was to have absolutely minimal dependencies on third-party code, so I reinvented the wheel in numerous places. Now that it&#8217;s mine again, those requirements are vapor and I&#8217;m ripping out large swaths of my code, and exchanging it for API calls into other code that is the de facto standard to do whatever. For example, I wrote a function that copies a file from one location to another. Ew. The <a href="http://search.cpan.org/perldoc?File::Copy">File::Copy</a> module is the Perl Way to do that, so that&#8217;s how we do it now. Less code I have to maintain, and less code you have to read to understand Endocrys.</p>
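<p>For reference, the File::Copy version of &#8220;copy a file from one location to another&#8221; is about this much code. The two paths are placeholders, so the script will die with the operating system&#8217;s error message unless the source file exists.</p>
<pre>use strict;
use warnings;
use File::Copy qw(copy);

# Placeholder paths for illustration only.
my ($src, $dst) = ('/tmp/endo-example.src', '/tmp/endo-example.dst');

# copy() returns true on success and leaves the reason in $! on failure.
copy($src, $dst) or die "Copy of $src to $dst failed: $!";</pre>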
<p>Another major goal of the original project was absolute redundancy on all levels. With a requirement like that, I over-engineered what were called the Communication Masters (CMs) so that they heart-beated each other, transferred each other&#8217;s sessions, held elections to decide who was authoritative for which IP ranges, dealt with segmentation and partitioning, etc. All of this at the cost of highly-customized hybrid XMPP/SQL servers that weren&#8217;t readily upgradeable. Wednesday night I spent a lot of time diagramming, and tonight solidified the spec to separate the XMPP server from the SQL database, and rely on established high-availability tools like <a href="http://siag.nu/pen/">pen</a> or an SLB appliance to ensure connectivity to a farm of XMPP servers if needed. Additionally, this separation has allowed me to use MySQL clusters for the Paracrys bits, which adds scary levels of redundancy to those very critical bits.</p>
<p>Lastly for this post, the entire ithread Endocrys implementation has been ripped out and replaced with <a href="http://search.cpan.org/perldoc?EV">EV</a> and <a href="http://search.cpan.org/perldoc?AnyEvent">AnyEvent</a>, and the <a href="http://search.cpan.org/perldoc?Net::XMPP">Net::XMPP</a> code has been replaced with <a href="http://search.cpan.org/perldoc?AnyEvent::XMPP">AnyEvent::XMPP</a> for one cohesive event loop that runs very very fast. Originally I envisioned an Endocrys client maintaining dozens of XMPP sessions while handling dozens of system events and receiving dozens of commands, so I stuck everything in threads, and allowed it to scream along on SMP boxes. While this works just fine, there is a LOT of extra complexity involved with sharing variables across threads, dealing with races, etc. and the benefits are dubious when compared against a good, strong, <a href="http://search.cpan.org/perldoc?EV">event-loop system</a>. I&#8217;m not quite done yet, but the net loss should be about 30% of the main code modules, with reduced complexity for all sub-modules as well.</p>
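<p>To give a feel for the single-event-loop style, here is a minimal AnyEvent sketch, assuming the AnyEvent module is installed. It is not Endocrys code: the two timers merely stand in for &#8220;a system event fires repeatedly&#8221; and &#8220;a command arrives&#8221;, and both are serviced by one loop with no threads and no shared-variable locking.</p>
<pre>use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent-&gt;condvar;
my $ticks = 0;

# Watcher 1: a repeating "heartbeat" timer.
my $heartbeat = AnyEvent-&gt;timer(
  after    =&gt; 0.1,
  interval =&gt; 0.1,
  cb       =&gt; sub { $ticks++; print "heartbeat $ticks\n"; },
);

# Watcher 2: a one-shot timer standing in for an incoming command.
my $command = AnyEvent-&gt;timer(
  after =&gt; 0.35,
  cb    =&gt; sub { print "command received, shutting down\n"; $done-&gt;send; },
);

$done-&gt;recv;  # run the event loop until send() is called</pre>
<p>Each callback runs to completion before the next one starts, which is where the race conditions go away. The trade-off is that a slow callback stalls everything else.</p>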
<p>I don&#8217;t have an ETA as to when the code will be generally available, but I&#8217;ve had some pings from some bright people interested in hammering the retooled version in non-critical environments, so hopefully it will be this year.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/30/more-about-endocrys/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing Endocrys</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 22:52:43 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=581</guid>
		<description><![CDATA[Endocrys [en doe kriss] (was Endocryn until a TradeMark popped up) is a distributed, encrypted, modular, real-time, hot-upgradable, self-healing system geared at autonomous communication between distributed systems. It was developed for a client in 2002 and 2003, and they&#8217;ve decided to let the 10-year exclusivity lapse early.
This is one of my favorite products, and I&#8217;m [...]]]></description>
			<content:encoded><![CDATA[<p>Endocrys [en doe kriss] (was Endocryn until a TradeMark popped up) is a distributed, encrypted, modular, real-time, hot-upgradable, self-healing system geared at autonomous communication between distributed systems. It was developed for a client in 2002 and 2003, and they&#8217;ve decided to let the 10-year exclusivity lapse early.</p>
<p>This is one of my favorite products, and I&#8217;m more than a little excited to get it back and get it out. It&#8217;s been battle-tested for many years, and I&#8217;m very proud of it. I&#8217;m working on getting the code cleaned up and abstracted before releasing it under the GPLv2. Below are edited points from slides describing Endocrys and why you might be interested in it.</p>
<h2>The Problems</h2>
<p>Dozens&#8230; hundreds&#8230; of systems, physical and virtual, all going about their business. Then something happens: maybe a disk drive failed, maybe a process died, maybe someone ordered a &#8216;reboot&#8217; or &#8216;halt&#8217;. Those systems don&#8217;t have a way of communicating that externally. There is no &#8220;Hey, I&#8217;m rebooting, BRB&#8221; in the server world.</p>
<p>Dozens&#8230; hundreds&#8230; of systems, physical and virtual, all going about their business. Then you have a question: How many of them have Western Digital harddrives listed in a recent recall? How many of them are running &lt;2GB of RAM? How many of them are running a certain version of some software listed in a security advisory? There&#8217;s no way to ask that question to the farm. There is no &#8220;Dear Lazyweb, answer this question for me&#8221; in the server world.</p>
<h2>The Purpose</h2>
<p>At its most basic level, Endocrys is a conduit between all of the systems and you. Think of it like a gigantic Instant Messaging buddy list, where all of your buddies are systems. When they&#8217;re online, they are in the list and can set their status messages, send you messages, send each other messages, receive messages, etc. Endocrys leverages the eXtensible Messaging and Presence Protocol (XMPP) to tie this framework into existing clients, transports and APIs, enabling a near-infinite number of possible applications or functions you can deploy.</p>
<h2>The Technology</h2>
<p>Endocrys is built as a framework &#8211; an abstract set of rules that can be extended at any time by writing little modules. These modules can be applied across the Endocrys network instantly, without any downtime.</p>
<p>By leveraging XMPP, the Endocrys network is highly-redundant with no single fail points. Any number of &#8220;Communication Masters&#8221; (XMPP servers) are online, but only one is needed to keep communication flowing. All network communication is encrypted and signed. Partitioning and segmentation are handled rationally.</p>
<p>Communication is very similar to Instant Messaging: there is relatively little latency, and XMPP assures delivery even to systems offline when the message was sent.</p>
<p>Monitoring and control systems can participate on Endocrys, automating the remediation of problems remotely and automatically.</p>
<h2>The Protocol</h2>
<p>XMPP sits atop TCP, and atop that sits the Endocrys Communication Protocol aka Autocrys. ECP is a fully-authenticated, fully-controlled, skeptical protocol that serves both for sending structured announcements as well as sending and processing commands. The entire ECP specification is listed in AUTOCRYS.TXT. Endocrys agents can be written in any programming language, attached to any other framework, at any OSI level, as long as they can speak XMPP and implement ECP appropriately.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Can You Have Too Many Roombas?</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/08/can-you-have-too-many-roombas/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/08/can-you-have-too-many-roombas/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 22:45:00 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>
		<category><![CDATA[Roomba]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=553</guid>
		<description><![CDATA[I have four Roombas of three different models (I blame Steve for telling me about &#8220;deals&#8221;). I think I may have too many. Regardless, the one thing they all have in common is a hacked together BlueTooth connection so I can run various software on them remotely. While I haven&#8217;t really talked a lot about [...]]]></description>
			<content:encoded><![CDATA[<p>I have four Roombas of three different models (I blame Steve for telling me about &#8220;deals&#8221;). I think I may have too many. Regardless, the one thing they all have in common is a hacked together BlueTooth connection so I can run various software on them remotely. While I haven&#8217;t really talked a lot about those &#8220;various softwares&#8221;, I&#8217;m really excited about a project I&#8217;m working on now, working title of RooCluster.</p>
<p>RooCluster is a command-and-control application designed for the special needs of multiple robots operating in the same space, or over large multi-room spaces. Each Roomba is being fitted with an RFID tag, which, in coordination with some more wireless access points, allows me to triangulate where a Roomba is and its travel vector (sometimes, math is cool). This information can help RooCluster avoid nasty Roomba-on-Roomba collisions, and also presents the possibility of meta-virtual walls.</p>
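<p>The math in question can be sketched like this. It assumes each of three fixed points (the access points) can report a distance to the Roomba, which is a simplification; RooCluster&#8217;s actual method isn&#8217;t described here, and the function names and coordinates are invented for the example. Subtracting the circle equations pairwise leaves two linear equations in x and y, and the travel vector is just the change between two fixes divided by the time between them.</p>
<pre>use strict;
use warnings;

# Solve for (x, y) given three anchors, each [x, y, distance].
# Returns an empty list if the anchors are (nearly) collinear.
sub trilaterate {
  my ($p1, $p2, $p3) = @_;
  my ($x1, $y1, $r1) = @$p1;
  my ($x2, $y2, $r2) = @$p2;
  my ($x3, $y3, $r3) = @$p3;

  # A*x + B*y = C and D*x + E*y = F, from subtracting circle equations.
  my $A = 2 * ($x2 - $x1);
  my $B = 2 * ($y2 - $y1);
  my $C = $r1**2 - $r2**2 - $x1**2 + $x2**2 - $y1**2 + $y2**2;
  my $D = 2 * ($x3 - $x2);
  my $E = 2 * ($y3 - $y2);
  my $F = $r2**2 - $r3**2 - $x2**2 + $x3**2 - $y2**2 + $y3**2;

  my $det = $A * $E - $B * $D;
  return if abs($det) &lt; 1e-9;
  return ( ($C * $E - $B * $F) / $det, ($A * $F - $C * $D) / $det );
}

# Travel vector: change in position divided by elapsed seconds.
sub velocity {
  my ($x0, $y0, $x1, $y1, $dt) = @_;
  return ( ($x1 - $x0) / $dt, ($y1 - $y0) / $dt );
}

my ($x, $y) = trilaterate([0, 0, 5], [10, 0, sqrt(65)], [0, 10, sqrt(45)]);
printf "position: (%.1f, %.1f)\n", $x, $y;   # position: (3.0, 4.0)

my ($vx, $vy) = velocity(3, 4, 4, 6, 2);
printf "vector: (%.1f, %.1f)\n", $vx, $vy;   # vector: (0.5, 1.0)</pre>
<p>Real distance estimates are noisy, so a working system would want more than three anchors and some smoothing. This only shows the geometry.</p>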
<p>If you have a Roomba, you probably have a virtual wall &#8211; the little pylon that sends out an infrared beam that the Roombas treat just like a wall. With some work, RooCluster should be able to honor coordinate-based lines (which could, in turn, form other shapes) and effectively &#8220;wall-off&#8221; areas without needing a physical barrier, or a battery-sucking virtual wall. You can also overlay the position and vector data onto floorplans, and see exactly where the Roombas are, and where they&#8217;re going.</p>
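<p>A coordinate-based wall boils down to one question: does the Roomba&#8217;s next move cross this line segment? Below is a minimal sketch of that test using the usual cross-product orientation check. The function names and coordinates are invented, and the grazing cases (touching an endpoint, running exactly along the wall) are deliberately treated as &#8220;no crossing&#8221;.</p>
<pre>use strict;
use warnings;

# Sign of the cross product (b - a) x (c - a): which side of the line
# from a to b the point c lies on.
sub side {
  my ($ax, $ay, $bx, $by, $cx, $cy) = @_;
  return ($bx - $ax) * ($cy - $ay) - ($by - $ay) * ($cx - $ax);
}

# Returns 1 if the move from @$from to @$to strictly crosses the wall
# segment between @$w1 and @$w2, otherwise 0.
sub crosses_wall {
  my ($from, $to, $w1, $w2) = @_;
  my $d1 = side(@$w1, @$w2, @$from);
  my $d2 = side(@$w1, @$w2, @$to);
  my $d3 = side(@$from, @$to, @$w1);
  my $d4 = side(@$from, @$to, @$w2);
  return ($d1 * $d2 &lt; 0 and $d3 * $d4 &lt; 0) ? 1 : 0;
}

my @wall = ([5, 0], [5, 10]);                        # a wall along x = 5
print crosses_wall([3, 4], [7, 4], @wall), "\n";     # 1
print crosses_wall([3, 4], [4, 4], @wall), "\n";     # 0</pre>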
<p>Of course, you can also use it to make your Roombas dance with each other.</p>
<p>Or joust.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/08/can-you-have-too-many-roombas/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>FOSS&#8217;s Microsoft Hatred &#8220;a Disease&#8221;</title>
		<link>http://mattwork.potsdam.edu/blog/2009/07/27/fosss-microsoft-hatred-a-disease/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/07/27/fosss-microsoft-hatred-a-disease/#comments</comments>
		<pubDate>Mon, 27 Jul 2009 13:11:32 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=489</guid>
		<description><![CDATA[Purloined wholesale from SlashDot, and reposted. The hypocrisy of a vocal minority of FOSS-purists has been overwhelming. As Linus accurately points out, we&#8217;re all in it for selfish reasons. No one likes working on the boring bits, and we all contribute our best code to the things we care about at the moment.
&#8220;In the aftermath [...]]]></description>
			<content:encoded><![CDATA[<p>Purloined wholesale from <a href="http://linux.slashdot.org/story/09/07/25/1757253/Linus-Calls-Microsoft-Hatred-a-Disease">SlashDot</a>, and reposted. The hypocrisy of a vocal minority of FOSS-purists has been overwhelming. As Linus accurately points out, we&#8217;re all in it for selfish reasons. No one likes working on the boring bits, and we all contribute our best code to the things we care about at the moment.</p>
<blockquote><p><em>&#8220;In the aftermath of <a href="http://news.slashdot.org/story/09/07/20/1643251">Microsoft&#8217;s recent decision to contribute 20,000 lines of device driver code to the Linux community</a>, Christopher Smart of Linux Magazine talked to Linus Torvalds and asked if the code was something he would be happy to include, even though it&#8217;s from Microsoft. &#8216;Oh, I&#8217;m a big believer in &#8220;technology over politics.&#8221; I don&#8217;t care who it comes from, as long as there are solid reasons for the code, and as long as we don&#8217;t have to worry about licensing etc. issues,&#8217; says Torvalds. &#8216;I may make jokes about Microsoft at times, but at the same time, <a href="http://www.linux-mag.com/cache/7439/1.html">I think the Microsoft hatred is a disease</a>. I believe in open development, and that very much involves not just making the source open, but also not shutting other people and companies out.&#8217; Smart asked Torvalds if Microsoft was contributing the code to benefit the Linux community or Microsoft. &#8216;I agree that it&#8217;s driven by selfish reasons, but that&#8217;s how all open source code gets written! We all &#8220;scratch our own itches.&#8221; It&#8217;s why I started Linux, it&#8217;s why I started git, and it&#8217;s why I am still involved. It&#8217;s the reason for everybody to end up in open source, to some degree,&#8217; says Torvalds. &#8216;So complaining about the fact that Microsoft picked a selfish area to work on is just silly. Of course they picked an area that helps them. That&#8217;s the point of open source — the ability to make the code better for your particular needs, whoever the &#8220;your&#8221; in question happens to be.&#8217;&#8221;</em></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/07/27/fosss-microsoft-hatred-a-disease/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>&#8220;Secure Programming&#8221;</title>
		<link>http://mattwork.potsdam.edu/blog/2009/05/19/secure-programming/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/05/19/secure-programming/#comments</comments>
		<pubDate>Tue, 19 May 2009 04:57:45 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=454</guid>
		<description><![CDATA[Earlier this year I developed a two-day seminar for a client on &#8220;secure programming&#8221; techniques. The intended audience was automation programmers, but at the last minute, after seeing the &#8220;syllabus&#8221;, they decided to include their application programmers. The result was a flurry of re-working to accommodate the Java crowd in addition to the Perl/PHP/Python crowd. [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this year I developed a two-day seminar for a client on &#8220;secure programming&#8221; techniques. The intended audience was automation programmers, but at the last minute, after seeing the &#8220;syllabus&#8221;, they decided to include their application programmers. The result was a flurry of re-working to accommodate the Java crowd in addition to the Perl/PHP/Python crowd. In the end, the bulk of the message was the same. While I can&#8217;t republish my stack for trade-secret reasons, I can dump the agnostic bits (mostly my slide notes) here. The code examples will be in human-readable Perl or pseudo-code, but will work in any language.</p>
<h2>Users Are Un[trust]worthy</h2>
<p>The first, and most important point, is that you can never trust user-supplied data. Ever. Just because your whiz-bang drop-down box only shows the user four options, doesn&#8217;t mean they can&#8217;t end up sending `rm -Rf /`. This is the <strong>single largest threat</strong> against an in-house application. User-supplied data must always always be vetted, sanitized and treated as if it could take over the world. A lot of programmers use rapid-application-development tools to create interfaces. These tools are a boon for creating slick interfaces, but generally a complete bust at data integrity enforcement. You tell it &#8220;make this a date field&#8221;, and it does&#8230; But anyone with a quick script can insert 40MB of garbage data, which will at best crash your application, and at worst, overflow a buffer that had improper bounds (see next point) and allow &#8220;remote execution of arbitrary code&#8221; (READ: You&#8217;re fucked).</p>
<p>So how do you fix this? Regardless of the languages you program in, they probably have a <a href="http://store.xkcd.com/xkcd/#RegularExpressionsShirt">regular expressions</a> (regexp) library. Many popular languages even have &#8220;Perl-Compatible Regular Expressions&#8221; (PCRE), as there is no better language for data processing than Perl, and its regexp syntax is well-defined (albeit with a moderate learning-curve). Regexps are the easiest way to enforce your intentions with data.</p>
<pre>unless($userdate =~ m#^\d\d\d\d/\d\d/\d\d$#) { Freak_Out(); }</pre>
<p>That&#8217;s a one-liner that says, exactly, &#8220;unless the data in the variable $userdate begins with four digits, followed by a forward slash, followed by two digits, followed by a forward slash, followed by two digits, and that&#8217;s it &#8211; Freak Out&#8221;. (m &#8211; we&#8217;re matching, # &#8211; start the regexp, ^ &#8211; match at the beginning of the data, \d &#8211; a digit (0-9), $ &#8211; match the end of the data, # &#8211; end the regexp).</p>
<p>You could easily write a function to do this for you:</p>
<pre>sub Is_Date {
  my ($userdate) = @_;  # the value to check is the first argument
  if($userdate =~ m#^\d\d\d\d/\d\d/\d\d$#) { return 1; }
  else { return 0; }
}</pre>
<p>Then just call</p>
<pre>unless(Is_Date($usercrap)) { Freak_Out(); }</pre>
<p>every time you need to check that you&#8217;ve REALLY got a date. Not hard. The user could still be entering the data wrong, but that&#8217;s not the secure programmer&#8217;s job. <img src='http://mattwork.potsdam.edu/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
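<p>One Perl-specific wrinkle worth a sketch: in Perl, $ also matches just before a trailing newline, so a value like "2009/05/19\n" gets past a pattern that ends in $. Anchoring with \A and \z closes that gap. The Is_Date_Strict name is made up for this example; it also spells the digits as [0-9] so that only ASCII digits count.</p>
<pre>use strict;
use warnings;

# Same shape as the date check above, but anchored with \A and \z so
# that nothing, not even a trailing newline, may follow the last digit.
sub Is_Date_Strict {
  my ($userdate) = @_;
  return 0 unless defined $userdate;
  return $userdate =~ m#\A[0-9]{4}/[0-9]{2}/[0-9]{2}\z# ? 1 : 0;
}

print Is_Date_Strict("2009/05/19"), "\n";             # 1
print Is_Date_Strict("2009/05/19\n"), "\n";           # 0
print Is_Date_Strict("2009/05/19; rm -Rf /"), "\n";   # 0</pre>
<p>Like the original, this checks the shape only. 9999/99/99 still passes, which is the &#8220;entering the data wrong&#8221; problem, not the security one.</p>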
<p>It is imperative that your application go out of its way to recognize crap as soon as it can. Don&#8217;t pass a variable containing unchecked user-supplied data to a dozen  or so functions before checking it. Accept it, and check it immediately. SQL injection attacks are another whole can of worms that can take advantage of improper sanitization, and frequently get executed in strange places. Sanitize early.</p>
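<p>For the SQL side of that, the standard defense alongside early sanitizing is to never paste user data into the SQL text at all. Here is a minimal sketch with DBI placeholders, assuming DBD::SQLite is installed so the example can run against an in-memory database. The table, columns and hostile string are made up.</p>
<pre>use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. An in-memory database keeps the
# sketch self-contained; any DBI driver works the same way.
my $dbh = DBI-&gt;connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError =&gt; 1 });
$dbh-&gt;do('CREATE TABLE users (name TEXT, fruit TEXT)');

my $name  = "Robert'); DROP TABLE users;--";   # hostile input
my $fruit = 'apple';

# The ? placeholders keep user data out of the SQL text entirely; the
# driver passes the values separately from the statement.
my $sth = $dbh-&gt;prepare('INSERT INTO users (name, fruit) VALUES (?, ?)');
$sth-&gt;execute($name, $fruit);

my ($count) = $dbh-&gt;selectrow_array('SELECT COUNT(*) FROM users');
print "rows: $count\n";   # rows: 1</pre>
<p>The hostile string ends up stored as an ordinary (ugly) name, and the table is still there afterwards.</p>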
<p>I know you guys aren&#8217;t writing AJAX stuff, but I&#8217;d be derelict if I didn&#8217;t hammer this out anyhow: If you trust JavaScript or any other client- or browser-executed code to do your security bits for you, you are not only a moron, but should be made to wear a Scarlet Zero for the next few years. Zero for &#8220;0wn3d&#8221;. Just because you write JavaScript, doesn&#8217;t mean that&#8217;s how the browser will execute it. ANYONE can rewrite your JavaScript to do whatever they want. The above Is_Date function, if implemented in client-side JavaScript, can easily be rewritten to simply &#8220;return 1&#8221; regardless of what&#8217;s passed. <strong>You must never trust client-supplied data</strong> regardless of whether you believe it was vetted client-side. It <em><strong>must</strong></em> be vetted server-side. Must. Must. Must. If, for user convenience, you want to implement something to trap errors quickly client-side, that&#8217;s groovy, but in no way does it excuse you from vetting the content once submitted to your application.</p>
<p>I can&#8217;t say this enough. It seems like common-sense, and a lot of you are nodding at me right now, but a month on you&#8217;re going to write a quick webform with radio-button selection, and someone in Beijing is going to own your database server because they submitted `echo root:fXAWEOo7DsNSM:0:0:0:0::: &gt; /etc/shadow` to the &#8220;What&#8217;s your favorite fruit?&#8221; question. [PAUSE] It&#8217;s funny right now, it&#8217;s not funny when you&#8217;re facing a DoD inquiry.</p>
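<p>For a field with a fixed set of answers, the server-side check is even simpler than a regexp: compare against the list of values the form was supposed to offer and reject everything else. A minimal sketch follows; the fruit list and the Is_Allowed_Fruit name are invented for the example.</p>
<pre>use strict;
use warnings;

# The only values the radio buttons are supposed to be able to send.
my %ALLOWED_FRUIT = map { $_ =&gt; 1 } qw(apple banana cherry grape);

# Returns 1 only for an exact match against the allowed list.
sub Is_Allowed_Fruit {
  my ($answer) = @_;
  return 0 unless defined $answer;
  return exists $ALLOWED_FRUIT{$answer} ? 1 : 0;
}

print Is_Allowed_Fruit('banana'), "\n";                          # 1
print Is_Allowed_Fruit('`echo root:... &gt; /etc/shadow`'), "\n";   # 0</pre>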
<h2>Data Sizes Are Critical (AKA Know Your Data)</h2>
<p>The second most important thing I can convey to you is the concept of sanitizing types. Those of us in the room that are automation programmers and write in the P-languages don&#8217;t have to give a shit about this, but those of you using Java, C or anything that comes out of Redmond, WA need to listen up: If you&#8217;re blindly stuffing data into a restricted type, you better be damn certain it&#8217;s the right size. Last year one of your competitors had a nice lad write an internal automation system that took data from a database and did things with it. Anyone know how large a MySQL blob-type is? 64k bytes. Anyone know how large a Microsoft Visual BASIC date-type is? 8 bytes. Thankfully, he wasn&#8217;t writing a user-facing application. [PAUSE] Last I heard, he was writing TARP-laundering algorithms for AIG.</p>
<p>Every time you hear about a &#8220;buffer-overflow&#8221; attack, this is because some moron didn&#8217;t check data before stuffing it into memory. The lower-level you code, the more important this is. The example I gave about the VB app only cost the contractor a few dozen-thousand dollars and a loss-of-face from their client because VB just panics and dies when you do that. If you&#8217;re coding in C and not running on a very secure Linux box with a memory-jockeying system, you may well have just overwritten your security code with:</p>
<pre>blahblaWastingSpaceUntilYourDataTypeIsOverblahblahblahCall Function DoSomethingBad</pre>
<p>Yeah, that&#8217;s what a buffer-overflow looks like. Know your data, and when in doubt <em>cast</em>. A lot of instructors shy away from casting. It slows things down, it causes a lot of compile-time checks that frequently create errors or warnings in otherwise clean code. Casting is the absolute best way to make sure you&#8217;re not blowing out a primitive type. Of course, if you&#8217;re writing into a char-array, that&#8217;s not going to help you, but that&#8217;s your problem. <img src='http://mattwork.potsdam.edu/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
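<p>The P-languages won&#8217;t overflow a buffer on you, but they will happily truncate or wrap data that doesn&#8217;t fit a fixed-size field, so the &#8220;check the size first&#8221; habit can still be sketched in Perl. The function names are made up, the 8-byte limit mirrors the date-type example, and $data is assumed to be a byte string.</p>
<pre>use strict;
use warnings;

# Refuse data that will not fit, rather than letting pack() silently
# truncate it to the field width.
sub pack_fixed {
  my ($data, $size) = @_;
  die "data is " . length($data) . " bytes, field holds $size\n"
    if length($data) &gt; $size;
  return pack("a$size", $data);   # null-pad up to exactly $size bytes
}

# Same idea for numbers: check the range before packing into 16 bits.
sub pack_uint16 {
  my ($n) = @_;
  die "not an unsigned 16-bit integer: $n\n"
    unless $n =~ /\A[0-9]+\z/ and $n &lt;= 65535;
  return pack('n', $n);
}

print length(pack_fixed('20090519', 8)), "\n";   # 8

# A blob-sized value aimed at an 8-byte field is rejected up front.
my $ok  = eval { pack_fixed('x' x 65536, 8); 1 };
my $msg = $ok ? "stored\n" : "rejected: $@";
print $msg;   # rejected: data is 65536 bytes, field holds 8</pre>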
<h2>Lazy Handles</h2>
<p>You guys know all about access permissions, so I&#8217;m going to skip over a bunch of stuff. One thing I&#8217;ve seen here and elsewhere, is opening handles &#8211; network sockets, database handles, file handles, directory handles, etc. &#8211; as users with way more permission than necessary. Sure, it&#8217;s easier for you, but what happens when your code gets hosed and someone from Bulgaria managed to inject themselves in before you&#8217;ve let go of that handle? Explaining to your boss that you opened the files as root &#8220;because it was easier&#8221; isn&#8217;t going to fly. Your application needs to run restricted, which you already know, and you absolutely cannot wantonly elevate handles without damn good reason. I can&#8217;t count the number of applications I&#8217;ve seen &#8211; Hell, applications I&#8217;ve <em>written</em> &#8211; that immediately connect to a resource as God himself to do some stuff that&#8217;s pretty primitive &#8211; and maybe, MAYBE one operation in fifty that actually needed that access. Which leads to the next problem here&#8230;</p>
<p>Keeping handles open longer than necessary, or in anticipation of future operations. If your application does six operations on a resource and only one needs the assistance of Angels, then that&#8217;s the only operation that gets the Silver Trumpets &#8211; the rest can stew along with the rest of us. I don&#8217;t care if you&#8217;re doing all six of those operations at once: you run the primitives as a primitive user, and then either change-user or fork or whatever you need to do as the elevated user for that one operation. Yeah, I groan about it too. But that&#8217;s it. Every moment your application has an elevated resource connection is a moment that someone is going to get their</p>
<pre>INSERT INTO USERS SET user='yakov', password='smirnov'; GRANT ALL ON *.* TO 'yakov' IDENTIFIED BY 'smirnov';</pre>
<p>into your data. It&#8217;ll take your auditors a fiscal quarter before they find that account.</p>
<p>With the exception of operating-system handles, you can almost always switch out, rebind, setuid, or the like, without making a new connection.</p>
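<p>One way to sketch the &#8220;elevate for one operation only&#8221; pattern in Perl is a small wrapper that switches the effective UID, runs a single piece of code, and always switches back. The with_euid name is invented, and the sketch assumes a process that was started as root but normally runs with an unprivileged effective UID. The handle that needs the elevated access should be opened and closed inside the wrapped code.</p>
<pre>use strict;
use warnings;
use English qw(-no_match_vars);

# Run $code with the effective UID temporarily set to $uid, then
# switch back, even if $code dies.
sub with_euid {
  my ($uid, $code) = @_;
  my $previous = $EUID;
  $EUID = $uid;
  die "could not switch to uid $uid: $!" unless $EUID == $uid;
  my @result = eval { $code-&gt;() };
  my $error  = $@;
  $EUID = $previous;                     # always drop back
  die "could not restore uid $previous" unless $EUID == $previous;
  die $error if $error;
  return @result;
}

# Harmless demonstration: "switching" to the UID we already have.
my @out = with_euid($EUID, sub { return "ran as uid $EUID" });
print "@out\n";</pre>
<p>In real use the call would look like with_euid(0, sub { ... the one operation ... }), with the other forty-nine operations left outside it.</p>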
<h2>1176da21241f79203fbd93e367f35142</h2>
<p>Yeah, encrypt <em>everything</em>. I know you guys are using SSL for nearly all resource connections, which is ridiculously important. Even the server-to-server stuff that we used to shrug off and say &#8220;yeah, well it&#8217;s on a switched network, and in the same room&#8221; has <em>got</em> to be encrypted on the wire. There are too many techniques and tools out there to get in the middle of those streams, and we all know how reliable network administrators are at picking up that stuff. [PAUSE]</p>
<p>Even within your application, if you&#8217;re writing out temp data, encrypt it from possibly prying eyes and processes. If you&#8217;ve got in-memory data that&#8217;ll be sitting around for some time, and might be a candidate for swapping, encrypt it in memory. How many of you have ever even considered encrypting live data? If that data gets stale, and the operating system decides to swap out some pages, that stuff is possibly going to be visible unless your infrastructure takes into account encrypted swap and the like. In lots of languages, encryption and decryption of relatively small amounts of data (&lt; available RAM) is pretty simple. I&#8217;m not advocating it for everything, but there&#8217;s not much harm in:</p>
<pre>$data=encrypt($data);
{ #Long
  #Running
} #Block.
$data=decrypt($data);</pre>
<p>If something weird happens, and some or all of $data is swapped out because it&#8217;s not being used, it&#8217;s useless to an attacker.</p>
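<p>Here is a rough Python sketch of that encrypt, long block, decrypt pattern. It is not a real implementation: the helper names are invented, and a random XOR pad from the standard library stands in for a proper cipher. The pad itself also lives in memory, so a real design would keep the key in locked, non-swappable memory, which this sketch does not attempt.</p>

```python
import secrets
from contextlib import contextmanager


def _xor_in_place(buf, pad):
    """XOR every byte of buf with the matching byte of pad, in place."""
    for i in range(len(buf)):
        buf[i] ^= pad[i]


@contextmanager
def sealed(buf):
    """Scramble the bytearray `buf` in place for the duration of the block."""
    pad = bytearray(secrets.token_bytes(len(buf)))
    _xor_in_place(buf, pad)      # "encrypt": buf no longer holds the plaintext
    try:
        yield
    finally:
        _xor_in_place(buf, pad)  # "decrypt": the original bytes are restored
        for i in range(len(pad)):
            pad[i] = 0           # wipe the pad once it is no longer needed
```

<p>You would wrap the long-running block in a &#8216;with sealed(data):&#8217; statement, with the data held in a mutable bytearray so that no second plaintext copy is created along the way.</p>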
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/05/19/secure-programming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EXT4, AKA &#8220;Watching M@ Wrestle His Demons&#8221;</title>
		<link>http://mattwork.potsdam.edu/blog/2008/03/10/ext4-aka-watching-m-wrestle-his-demons/</link>
		<comments>http://mattwork.potsdam.edu/blog/2008/03/10/ext4-aka-watching-m-wrestle-his-demons/#comments</comments>
		<pubDate>Tue, 11 Mar 2008 00:08:33 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/2008/03/10/ext4-aka-watching-m-wrestle-his-demons/</guid>
		<description><![CDATA[As someone with a history in filesystems, I have kept abreast of the EXT4 developments, most recently, an interview with Eric Sandeen on the topic, from an RH/Fedora angle. EXT4 will have lots of features new to EXT3 (as well as possibly pick some of the remnant EXT2 stalwarts up, shake them up a bit, [...]]]></description>
			<content:encoded><![CDATA[<p>As someone with a history in filesystems, I have kept abreast of the <a href="http://en.wikipedia.org/wiki/Ext4">EXT4</a> developments, most recently through an <a href="http://fedoraproject.org/wiki/Interviews/EricSandeen">interview with Eric Sandeen</a> on the topic, from an RH/Fedora angle. EXT4 will have lots of features new to EXT3 (as well as possibly pick some of the remnant EXT2 stalwarts up, shake them up a bit, and get them to convert). EXT4, unfortunately, has no features new to filesystems in general, nor even open source ones. After reading through the latest &#8220;new feature&#8221; list in Eric&#8217;s interview, it read like a list of <a href="http://oss.sgi.com/projects/xfs/">XFS</a> features from a decade ago&#8230; Then I got to the bottom, and Eric worked @ <a href="http://www.sgi.com/">SGI</a> on <a href="http://oss.sgi.com/projects/xfs/">XFS</a> for 5 years, almost a decade ago. Go figure.</p>
<p>This is divisive for me, because I&#8217;m a Darwinist: I believe that it&#8217;s perfectly fine if there are multiple organisms that do exactly the same thing, as eventually one will win and one will die. At the same time, I see all the energy being put into EXT4, and the practical side of me is screaming &#8220;HEY, YOU&#8217;RE JUST GETTING EXT TO WHERE XFS WAS 10 YEARS AGO- HOW ABOUT PUTTING YOUR ENERGY INTO XFS AND GETTING IT TO WHERE IT WILL BE 10 YEARS FROM NOW, FASTER?! KTHXBAI&#8221;. The main source for this schism is that the environment is rigged. XFS has been open source and stably available on Linux since springish of 2000 (and BSDs, shortly thereafter), yet it still has to fight tooth and nail for any recognition or prominent placement within Linux distros.</p>
<p>XFS has been a workhorse for those in-the-know for a long, long time. I have clients that bought SGI products <em>because</em> they wanted XFS, yet it has consistently been relegated to the backseat, in favor of the old-guard <strike>UFS</strike>.. er I mean EXT family. XFS continues to amaze various Linux administrators I work with, who, after having XFS inflicted on them, can&#8217;t imagine life without it. The catastrophes you can survive- albeit painfully- with XFS are totally unrecoverable with EXT. The extreme operating conditions you can endure with XFS are outside the realm where EXT can safely exist.</p>
<p>So there it is: my internal Filing System Fight Club. It is the beauty of open source, while at the same time the ugliness of it. Reinventing the wheel. Again. Thankfully <a href="http://lkml.org/lkml/2007/6/12/242">some</a> <a href="http://fuse.sourceforge.net/wiki/index.php/FileSystems">people</a> are doing new things in this space.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2008/03/10/ext4-aka-watching-m-wrestle-his-demons/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing: RAID-E</title>
		<link>http://mattwork.potsdam.edu/blog/2008/03/08/introducing-raid-e/</link>
		<comments>http://mattwork.potsdam.edu/blog/2008/03/08/introducing-raid-e/#comments</comments>
		<pubDate>Sat, 08 Mar 2008 15:48:22 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/2008/03/08/introducing-raid-e/</guid>
		<description><![CDATA[Introduction
Do you have some data you&#8217;d love to have backed-up, in real-time, somewhere else,  but don&#8217;t trust the destination? RAID-E is a great solution for you. The concept is simple, every time you modify a file, RAID-E makes a copy, encrypts it using one of numerous methods described  below, encrypts the name of the [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Introduction</strong></p>
<p>Do you have some data you&#8217;d love to have backed-up, in real-time, somewhere else, but don&#8217;t trust the destination? RAID-E is a great solution for you. The concept is simple: every time you modify a file, RAID-E makes a copy, encrypts it using one of numerous methods described below, encrypts the <em>name</em> of the file as well (by default), and then copies it off to where it should be.</p>
<p>RAID-E is a <a href="http://fuse.sf.net/">FUSE</a> filesystem that can use any underlying, mountable filesystem (or folders within the filesystem) as its sources and targets. You can, for example, RAID-E your &#8220;Documents&#8221; folder on your laptop to some Windows-shared space on a file server or NetApp. RAID-E may be ported to other FUSEless operating systems someday.</p>
<p>RAID-E supports a rainbow of encryption algorithms and works in one of three modes:</p>
<p><strong>PGP/GPG Encryption</strong></p>
<p>If you have a PGP/GPG key, and would like everything to be encrypted using that, RAID-E is happy to oblige, and will use your public key to encrypt your files before copy. If you want them signed as well, then you will need to provide your keyring passphrase in order to access your private and/or signing key.</p>
<p><strong>Standard Encryption</strong></p>
<p>RAID-E can generate an encryption key (or you can provide one) using one of a number of user-selectable algorithms. It then encrypts that key, again with a user-selectable algorithm, using a phrase of your choice. The encrypted key can be stored on a USB stick or other flash-based removable media. When RAID-E is mounted, the phrase is required to decrypt the key, allowing files to be encrypted.</p>
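<p>To make the key-plus-phrase idea concrete, here is a small Python sketch. It is not RAID-E code: the function names are hypothetical, PBKDF2 from the standard library stretches the phrase, and a plain XOR stands in for a real key-wrapping algorithm. It is also unauthenticated, so a wrong phrase silently yields a wrong key rather than an error.</p>

```python
import hashlib
import secrets


def _derive(passphrase, salt, length, iterations):
    """Stretch the passphrase into `length` bytes of key material."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=length
    )


def wrap_key(data_key, passphrase, iterations=200_000):
    """Return (salt, wrapped_key); both can be stored on removable media."""
    salt = secrets.token_bytes(16)
    kek = _derive(passphrase, salt, len(data_key), iterations)
    wrapped = bytes(a ^ b for a, b in zip(data_key, kek))
    return salt, wrapped


def unwrap_key(salt, wrapped, passphrase, iterations=200_000):
    """Recover the data key at mount time, given the same passphrase."""
    kek = _derive(passphrase, salt, len(wrapped), iterations)
    return bytes(a ^ b for a, b in zip(wrapped, kek))
```

<p>The data key never needs to be stored in the clear: only the salt and the wrapped key go on the USB stick, and the phrase stays in your head.</p>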
<p><strong>Cornucopia Encryption</strong></p>
<p>Using the same concept as &#8216;Standard Encryption&#8217;, Cornucopia uses a number of different encryption algorithms. Individual files are encrypted using a pseudo-randomly picked algorithm. For example, one file might use AES, while the next uses Twofish. While the security advantage of this is admittedly dubious (don&#8217;t Doghouse me, Bruce &#8211; I admit it!), it won&#8217;t decrease your security and may protect fractions of your dataset against attacks directed at particular algorithms.</p>
<p><strong>Bootstrapping/Offline Synchronization</strong></p>
<p>To aid in the initial bootstrapping, and to make synchronizing after making off-line changes a snap, a tool called &#8216;mirror-e&#8217; is also included. &#8216;mirror-e&#8217; will use the same configuration and methodology as RAID-E to encrypt and copy any changed files or new files between the source and target.</p>
<p><strong>Various Configurable Defaults</strong></p>
<p>By default, RAID-E will never delete files from a target.</p>
<p>By default, RAID-E is only concerned with file names and contents, not metadata, attributes, etc.</p>
<p>By default, RAID-E does not verify a copy operation.</p>
<p>By default, RAID-E will always overwrite a target file.</p>
<p><strong>Status and Errata</strong></p>
<p>RAID-E and its toolset are being developed independently, and will be released under the GPLv2 license.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2008/03/08/introducing-raid-e/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Back to Billboarding</title>
		<link>http://mattwork.potsdam.edu/blog/2008/03/04/back-to-billboarding/</link>
		<comments>http://mattwork.potsdam.edu/blog/2008/03/04/back-to-billboarding/#comments</comments>
		<pubDate>Wed, 05 Mar 2008 02:33:49 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/2008/03/04/back-to-billboarding/</guid>
		<description><![CDATA[Billboarding used to be the way to share groups of read-heavy bit or small-byte data across numerous processes or systems. The concept was simple: Have a file with a bunch of zeros in it, and on occasion change one of the zeros to a one. Processes/systems reading that file would say &#8220;Hmmm, the 4th &#8216;slot&#8217; [...]]]></description>
			<content:encoded><![CDATA[<p>Billboarding used to be <em>the</em> way to share groups of read-heavy bit or small-byte data across numerous processes or systems. The concept was simple: Have a file with a bunch of zeros in it, and on occasion change one of the zeros to a one. Processes/systems reading that file would say &#8220;Hmmm, the 4th &#8216;slot&#8217; is now a &#8216;1&#8217;, and that means [thing]&#8221;. This was used for everything from node status, to process states, to primitive anticipatory scheduling. Then objects became popular. Why read out of a billboard, when you can just share flag data across objects?</p>
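<p>For anyone who never met one, a billboard can be sketched in a few lines of Python. This is a minimal sketch under my own assumptions: the class name, the slot count, and the one-ASCII-byte-per-slot layout are illustrative choices, slots are numbered from zero, and it relies on the Unix-only positional calls os.pread and os.pwrite. It ignores locking entirely.</p>

```python
import os


class Billboard:
    """A fixed-size file of one-byte slots, all initially ASCII '0'."""

    def __init__(self, path, slots=16):
        self.path = path
        self.slots = slots
        if not os.path.exists(path):
            with open(path, "wb") as f:
                f.write(b"0" * slots)

    def _check(self, slot):
        if slot not in range(self.slots):
            raise IndexError("no such slot: %r" % (slot,))

    def post(self, slot, value):
        """Overwrite exactly one slot with a single byte, e.g. b'1'."""
        self._check(slot)
        if len(value) != 1:
            raise ValueError("a slot holds exactly one byte")
        fd = os.open(self.path, os.O_WRONLY)
        try:
            os.pwrite(fd, value, slot)  # write at the slot's offset only
        finally:
            os.close(fd)

    def read(self, slot):
        """Return the single byte currently posted in a slot."""
        self._check(slot)
        fd = os.open(self.path, os.O_RDONLY)
        try:
            return os.pread(fd, 1, slot)
        finally:
            os.close(fd)
```

<p>Posting b"1" to slot 3 flips the 4th slot, and every reader that looks at that offset afterwards sees the change without any connection to the writer.</p>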
<p>Well they&#8217;re back, it seems. Billboards have made a small-yet-noticeable resurgence in a number of systemic regions, and more than a couple system architects have noticed clients requesting solutions that boil down to billboarding (although generally given a more sexy name like &#8220;shared state file&#8221; or &#8220;offline node graph&#8221;).</p>
<p>I&#8217;ve had two projects in the last 6 months that have required, in general, a billboard, and am very happy to see them come back into vogue. They&#8217;ve been &#8220;gone&#8221; long enough so that the cool kids think they&#8217;re brilliant and &#8220;out-of-the-box&#8221; when they propose the &#8220;radical concept&#8221; of the &#8220;shared state file&#8221;, and those of us who&#8217;ve been doing advanced system architecture for .. gasp .. almost 15 years now, just smile and jot &#8220;billboard&#8221; in our notepad.</p>
<p><strong>Systemic Billboards </strong></p>
<p>In environments where there is shared (and preferably clustered) storage, billboards make a ton of sense as a means of easily communicating with other nodes. More than just 0&#8217;s and 1&#8217;s, a node can communicate an array of information about its health &#8211; and nodes can also ask other nodes to do something. Maybe setting a node state to &#8216;2&#8217; means &#8220;please restart your user services&#8221; or &#8216;3&#8217; means &#8220;please reboot&#8221;. Whenever a node reads its own entry it can go &#8220;Hey cool, I&#8217;m supposed to do something.&#8221; As a means of pre-failure fencing, this is exceptionally handy, as one does not necessarily have a connection to a greater network but may still have a connection to storage, allowing a control system to tell others &#8220;Hey, something bad is happening, please shut down&#8221;, for example if one node detects a UPS failure or pending drainage.</p>
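<p>A node acting on requests posted to its own entry might look something like the following Python sketch. The state codes follow the examples above, but everything else is an assumption made for illustration: the function names, one byte per node at an offset equal to the node number, and the convention that a node resets its slot to idle once it has handled a request.</p>

```python
# Hypothetical state codes, following the examples in the text.
IDLE = b"0"
RESTART_SERVICES = b"2"
REBOOT = b"3"


def request(path, node, code):
    """Ask `node` to do something by writing `code` into its slot."""
    with open(path, "r+b") as f:
        f.seek(node)
        f.write(code)


def poll(path, node, handlers):
    """Read this node's own slot. If a known command is posted, run its
    handler and reset the slot to IDLE. Returns the code that was seen."""
    with open(path, "r+b") as f:
        f.seek(node)
        code = f.read(1)
        handler = handlers.get(code)
        if handler is not None:
            handler()
            f.seek(node)   # reposition before switching from read to write
            f.write(IDLE)  # acknowledge by clearing the request
    return code
```

<p>A control system calls request() against the shared file, and each node calls poll() on whatever schedule suits it, with a handlers dictionary mapping codes to whatever &#8220;restart&#8221; or &#8220;reboot&#8221; means locally.</p>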
<p>I currently have a project that requires the assumption that if bad things happen, the only communication available between nodes will be a shared IEEE1394 drive array &#8211; A better place to use a billboard does not exist.</p>
<p><strong>Non-Systemic, or Quasi-Systemic Billboards </strong></p>
<p>Given an application that may have any number of processes (for example, any web-based application), using a billboard as a light IPC for each process to communicate its state or intentions can save a lot of otherwise tricksy IPC coding. Sure, you can have a pipe dangling out there- But what happens if a process needs to skip a pipe read for a given cycle, and in the meantime the status of the pipe changes? Yes, you could have one pipe per process, but then you&#8217;re looking at a mess that could be more elegantly solved with &#8230; a billboard. With 256 possible values in a single byte (I&#8217;m not even going to get into the possibilities with Unicode), you can communicate up to 255 different things per process, plus &#8220;nothing to report&#8221; &#8211; all sitting happily waiting to be read whenever it is convenient, or necessary, with very very little side-effect.</p>
<p>Yes, contention issues need to be addressed by your application.</p>
<p>Yes, if your application is poorly designed and process spawn rates are out-of-control, a billboard will destroy your performance.</p>
<p>Yes, if your storage disappears, there is a problem: A big problem that a web application cannot solve, so its inability to get to the billboard could definitely be a sign that maybe it should go into a maintenance state instead of processing transactions it can&#8217;t actually handle.</p>
<p>Yes, if the file becomes corrupted, Bad Things could happen: The application can/should detect such problems and Do The Right Thing.</p>
<p>There are certainly situations where billboards are not the answer. There are certainly situations where using IPC or nodal communication is a better solution. I&#8217;ve never advocated billboards as the end-all of inter-process or inter-nodal communication: Only that they shouldn&#8217;t be discounted in cases where they DO make sense.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2008/03/04/back-to-billboarding/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Back in the Saddle Again</title>
		<link>http://mattwork.potsdam.edu/blog/2007/09/02/back-in-the-saddle-again/</link>
		<comments>http://mattwork.potsdam.edu/blog/2007/09/02/back-in-the-saddle-again/#comments</comments>
		<pubDate>Mon, 03 Sep 2007 03:20:27 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Game Coding]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=123</guid>
		<description><![CDATA[The Game&#8230; Ah, yes. Basically I got to the point that I needed more skills- More knowledge- before I could advance. I needed to understand 3D modeling. I needed to understand rigging, and extrusion, and polygon counts. I needed to build my toolkit before I could take what I&#8217;ve done with the &#8220;easy&#8221; parts of [...]]]></description>
			<content:encoded><![CDATA[<p>The Game&#8230; Ah, yes. Basically I got to the point that I needed more skills- More knowledge- before I could advance. I needed to understand 3D modeling. I needed to understand rigging, and extrusion, and polygon counts. I needed to build my toolkit before I could take what I&#8217;ve done with the &#8220;easy&#8221; parts of game design- the concepts, the 2D art, the pen-and-paper mechanics, the plots, and the engine- and actually make them come alive. So, they&#8217;re coming alive now. I&#8217;ve built my first terrain. I&#8217;ve built my first NPC model. Things are moving along. Not by leaps and bounds, by baby steps: But they are moving along, and it feels great to be back in the saddle again.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2007/09/02/back-in-the-saddle-again/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
