<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>M@Blog &#187; Architecture</title>
	<atom:link href="http://mattwork.potsdam.edu/blog/category/architecture/feed/" rel="self" type="application/rss+xml" />
	<link>http://mattwork.potsdam.edu/blog</link>
	<description></description>
	<lastBuildDate>Wed, 18 Nov 2009 23:09:58 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>More About Endocrys</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/30/more-about-endocrys/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/30/more-about-endocrys/#comments</comments>
		<pubDate>Sat, 31 Oct 2009 02:12:22 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=584</guid>
		<description><![CDATA[I previously mentioned that I&#8217;ve re-acquired rights to Endocrys, and that I was excited about it. My copious free time has been spent, of late, ripping it apart, making it cleaner, and applying the lessons learned over 7 years of maintaining a sizable (458 systems at peak) Endocrys network.
Endocrys has two primary modular components: Autocrys [...]]]></description>
			<content:encoded><![CDATA[<p>I <a href="http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/">previously mentioned</a> that I&#8217;ve re-acquired rights to Endocrys, and that I was excited about it. My copious free time has been spent, of late, ripping it apart, making it cleaner, and applying the lessons learned over 7 years of maintaining a sizable (458 systems at peak) Endocrys network.</p>
<p>Endocrys has two primary modular components: Autocrys and Paracrys.</p>
<p><strong>Autocrys</strong> is an extensible communication protocol atop XMPP. It governs the syntax of commands or queries sent to systems or groups, the responses of systems to those queries, how to manage their presence, and how to react to presence changes in others.</p>
<p><strong>Paracrys</strong> is a database-driven deployment and configuration system. Paracrys allows module code and configuration data to be stored centrally and deployed to Endocrys nodes on-demand. Paracrys fully supports versioning, thus allowing changes to be rolled back in the case of a major oopsie. How small can a Paracrys module be? Here&#8217;s an example that implements a command called &#8216;shell&#8217; that allows you to do, essentially, whatever you want on an Endocrys client:</p>
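<pre># Register the module and its command as soon as this code is compiled.
BEGIN { $Endo::MODS{SHELL}++; $Endo::CMDS{SHELL} = \&amp;shell; }
# Unregister both when the interpreter shuts down.
END { delete $Endo::MODS{SHELL}; delete $Endo::CMDS{SHELL}; }

# Run the arguments as a shell command and return its output.
sub shell {
 return `@_`;
}</pre>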
<p>Drop that puppy into the Paracrys MODULES table with some other data, issue a mass &#8220;fetch module SHELL; refresh;&#8221; command, and bingo, all of your systems now let you do very bad things. It&#8217;s that easy to create a command to do something&#8230; Hopefully something useful.</p>
<p>Of course you should note that there is no access control in the above code&#8230; How do we prevent Bad People from using our horrendously very bad shell command? That used to be managed by the Communication Masters using another database called EndoACL, but has been folded into Paracrys&#8217; duties and drastically simplified. Each Endocrys client, when receiving the shell command, will now ask Paracrys if the user who sent it is authorized to issue that command. Previously, the clients never even received commands from users not authorized to send them, at great expense.</p>
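<p>A minimal sketch of how a client-side dispatcher might perform that check before running a command. Apart from %Endo::CMDS, every name here is hypothetical: dispatch(), is_authorized(), the ECHO command, and the in-memory %ACL hash standing in for a real Paracrys query are assumptions for illustration, not the actual Endocrys code.</p>
<pre>use strict;
use warnings;

package Endo;

# Command table, registered the same way the SHELL module above does it.
our %CMDS = ( ECHO =&gt; sub { return "@_" } );

# Hypothetical stand-in for asking Paracrys; a real client would query the database.
my %ACL = ( 'admin@example.org' =&gt; { ECHO =&gt; 1 } );

sub is_authorized {
    my ($user, $cmd) = @_;
    my $allowed = $ACL{$user} || {};
    return $allowed-&gt;{$cmd} ? 1 : 0;
}

# Look up the command, check the sender, and only then run it.
sub dispatch {
    my ($user, $cmd, @args) = @_;
    my $code = $CMDS{ uc $cmd } or return "unknown command: $cmd";
    return "denied: $user may not run $cmd" unless is_authorized($user, uc $cmd);
    return $code-&gt;(@args);
}

print dispatch('admin@example.org', 'echo', 'hello'), "\n";
print dispatch('mallory@example.org', 'echo', 'hello'), "\n";</pre>
<p>The point of the design is that the command table stays dumb: modules only register code, and the decision about who may run it lives in one place.</p>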
<p>One of the major goals of the project originally was to have absolutely minimal dependencies on third-party code, so I reinvented the wheel in numerous places. Now that it&#8217;s mine again, those requirements are vapor and I&#8217;m ripping out large swaths of my code, and exchanging it for API calls into other code that is the de facto standard to do whatever. For example, I wrote a function that copies a file from one location to another. Ew. The <a href="http://search.cpan.org/perldoc?File::Copy">File::Copy</a> module is the Perl Way to do that, so that&#8217;s how we do it now. Less code I have to maintain, and less code you have to read to understand Endocrys.</p>
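<p>For reference, a minimal sketch of the File::Copy approach. The scratch directory and file names are made up for illustration; only copy() itself comes from the module.</p>
<pre>use strict;
use warnings;
use File::Copy qw(copy);
use File::Temp qw(tempdir);

# Scratch directory and file names are hypothetical, for illustration only.
my $dir = tempdir( CLEANUP =&gt; 1 );
my ($src, $dst) = ("$dir/module.pl", "$dir/module.pl.bak");

# Create a small source file so the example is self-contained.
open my $fh, '&gt;', $src or die "Cannot write $src: $!";
print {$fh} "1;\n";
close $fh;

# copy() returns true on success and sets $! on failure.
copy($src, $dst) or die "Could not copy $src to $dst: $!";
print "copied ", -s $dst, " bytes\n";</pre>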
<p>Another major goal of the original project was absolute redundancy on all levels. With a requirement like that, I over-engineered what were called the Communication Masters (CMs) so that they heart-beated each other, transferred each other&#8217;s sessions, held elections to decide who was authoritative for which IP ranges, dealt with segmentation and partitioning, etc. All of this at the cost of highly-customized hybrid XMPP/SQL servers that weren&#8217;t readily upgradeable. Wednesday night I spent a lot of time diagramming, and tonight solidified the spec to separate the XMPP server from the SQL database, and rely on established high-availability tools like <a href="http://siag.nu/pen/">pen</a> or an SLB appliance to ensure connectivity to a farm of XMPP servers if needed. Additionally, this separation has allowed me to use MySQL clusters for the Paracrys bits, which adds scary levels of redundancy to those very critical bits.</p>
<p>Lastly for this post, the entire ithread Endocrys implementation has been ripped out and replaced with <a href="http://search.cpan.org/perldoc?EV">EV</a> and <a href="http://search.cpan.org/perldoc?AnyEvent">AnyEvent</a>, and the <a href="http://search.cpan.org/perldoc?Net::XMPP">Net::XMPP</a> code has been replaced with <a href="http://search.cpan.org/perldoc?AnyEvent::XMPP">AnyEvent::XMPP</a> for one cohesive event loop that runs very very fast. Originally I envisioned an Endocrys client maintaining dozens of XMPP sessions while handling dozens of system events and receiving dozens of commands, so I stuck everything in threads, and allowed it to scream along on SMP boxes. While this works just fine, there is a LOT of extra complexity involved with sharing variables across threads, dealing with races, etc. and the benefits are dubious when compared against a good, strong, <a href="http://search.cpan.org/perldoc?EV">event-loop system</a>. I&#8217;m not quite done yet, but the net loss should be about 30% of the main code modules, with reduced complexity for all sub-modules as well.</p>
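<p>To give a feel for the single-event-loop style, here is a small sketch using plain AnyEvent timers. It is not Endocrys code: the heartbeat and the &#8220;command&#8221; are stand-ins I made up, and a real client would be watching XMPP sockets rather than timers.</p>
<pre>use strict;
use warnings;
use AnyEvent;

my $done  = AnyEvent-&gt;condvar;   # lets the script wait until the work is finished
my $ticks = 0;

# A repeating timer standing in for a periodic system check.
my $heartbeat = AnyEvent-&gt;timer(
    after    =&gt; 0,
    interval =&gt; 1,
    cb       =&gt; sub {
        $ticks++;
        print "heartbeat $ticks\n";
        $done-&gt;send if $ticks == 3;
    },
);

# A one-shot timer standing in for an incoming command; it shares the same loop.
my $command = AnyEvent-&gt;timer(
    after =&gt; 1.5,
    cb    =&gt; sub { print "command received\n" },
);

$done-&gt;recv;   # run the event loop until send() is called</pre>
<p>Both callbacks run in one thread, so they can touch $ticks or any other shared state without locks, which is exactly the complexity the threaded version had to manage.</p>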
<p>I don&#8217;t have an ETA as to when the code will be generally available, but I&#8217;ve had some pings from some bright people interested in hammering the retooled version in non-critical environments, so hopefully it will be this year.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/30/more-about-endocrys/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introducing Endocrys</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/#comments</comments>
		<pubDate>Tue, 27 Oct 2009 22:52:43 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=581</guid>
		<description><![CDATA[Endocrys [en doe kriss] (was Endocryn until a trademark popped up) is a distributed, encrypted, modular, real-time, hot-upgradable, self-healing system geared toward autonomous communication between distributed systems. It was developed for a client in 2002 and 2003, and they&#8217;ve decided to let the 10-year exclusivity lapse early.
This is one of my favorite products, and I&#8217;m [...]]]></description>
			<content:encoded><![CDATA[<p>Endocrys [en doe kriss] (was Endocryn until a trademark popped up) is a distributed, encrypted, modular, real-time, hot-upgradable, self-healing system geared toward autonomous communication between distributed systems. It was developed for a client in 2002 and 2003, and they&#8217;ve decided to let the 10-year exclusivity lapse early.</p>
<p>This is one of my favorite products, and I&#8217;m more than a little excited to get it back and get it out. It&#8217;s been battle-tested for many years, and I&#8217;m very proud of it. I&#8217;m working on getting the code cleaned up and abstracted before releasing it under the GPLv2. Below are edited points from slides describing Endocrys and why you might be interested in it.</p>
<h2>The Problems</h2>
<p>Dozens&#8230; hundreds&#8230; of systems, physical and virtual, all going about their business. Then something happens: maybe a disk drive failed, maybe a process died, maybe someone ordered a &#8216;reboot&#8217; or &#8216;halt&#8217;. Those systems don&#8217;t have a way of communicating that externally. There is no &#8220;Hey, I&#8217;m rebooting, BRB&#8221; in the server world.</p>
<p>Dozens&#8230; hundreds&#8230; of systems, physical and virtual, all going about their business. Then you have a question: How many of them have Western Digital harddrives listed in a recent recall? How many of them are running &lt;2GB of RAM? How many of them are running a certain version of some software listed in a security advisory? There&#8217;s no way to ask that question to the farm. There is no &#8220;Dear Lazyweb, answer this question for me&#8221; in the server world.</p>
<h2>The Purpose</h2>
<p>At its most basic level, Endocrys is a conduit between all of the systems and you. Think of it like a gigantic Instant Messaging buddy list, where all of your buddies are systems. When they&#8217;re online, they are in the list and can set their status messages, send you messages, send each other messages, receive messages, etc. Endocrys leverages the eXtensible Messaging and Presence Protocol (XMPP) to tie this framework into existing clients, transports and APIs, enabling a near-infinite number of possible applications or functions you can deploy.</p>
<h2>The Technology</h2>
<p>Endocrys is built as a framework &#8211; an abstract set of rules that can be extended at any time by writing little modules. These modules can be applied across the Endocrys network instantly, without any downtime.</p>
<p>By leveraging XMPP, the Endocrys network is highly redundant, with no single point of failure. Any number of &#8220;Communication Masters&#8221; (XMPP servers) are online, but only one is needed to keep communication flowing. All network communication is encrypted and signed. Partitioning and segmentation are handled rationally.</p>
<p>Communication is very similar to Instant Messaging: there is almost no latency, and XMPP assures delivery even to systems that were offline when the message was sent.</p>
<p>Monitoring and control systems can participate on Endocrys, automating the remote remediation of problems.</p>
<h2>The Protocol</h2>
<p>XMPP sits atop TCP, and atop that sits the Endocrys Communication Protocol aka Autocrys. ECP is a fully-authenticated, fully-controlled, skeptical protocol that serves both for sending structured announcements as well as sending and processing commands. The entire ECP specification is listed in AUTOCRYS.TXT. Endocrys agents can be written in any programming language, attached to any other framework, at any OSI level, as long as they can speak XMPP and implement ECP appropriately.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/27/introducing-endocrys/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Death To Passwords</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/23/death-to-passwords/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/23/death-to-passwords/#comments</comments>
		<pubDate>Fri, 23 Oct 2009 13:22:10 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=565</guid>
		<description><![CDATA[A close friend forwarded me a note from a relative who was trying to solve a password-management problem. What was going to be a short statement of opinion turned into a moderately-humorous manifesto, and I thought I&#8217;d share (lightly edited).
I certainly empathize with your password management situation. Passwords  are, actually, horrible security mechanisms and [...]]]></description>
			<content:encoded><![CDATA[<p>A close friend forwarded me a note from a relative who was trying to solve a password-management problem. What was going to be a short statement of opinion turned into a moderately-humorous manifesto, and I thought I&#8217;d share (lightly edited).</p>
<p>I certainly empathize with your password management situation. Passwords  are, actually, horrible security mechanisms and it is my opinion that  they should be done away with altogether. Problem solved: No passwords  means no password management headaches.</p>
<p>So, how do you prove you are who you say you are? How do your systems <em>trust</em> who you say you are? A token. A &#8220;key&#8221;. A physical and logical item possessed by the user. Something they can lose, have stolen, or drop in their coffee mug, but it doesn&#8217;t matter because it&#8217;s useless without them leashed to it, and it can be reproduced by authorized personnel in a jiffy.</p>
<p>The security industry likes calling it &#8220;two-factor authentication&#8221;: The  two factors being something you <em>have</em> (the token) and something you <em>know</em> (the sentence uttered by your first girlfriend when she dumped  you, song lyrics, the title of a book &#8230; whatever). Behind the scenes  we shift from password management (gross and abhorrent) to key  management (fun and exciting!)</p>
<p>Encrypted-key security is the only managed authentication scheme I have  rolled out in client environments for the last 7&#8230;8 years. It can be  &#8220;difficult&#8221; to wrench into an existing infrastructure, changing the  culture, disrupting the status quo- but technologically is a vastly  superior solution to identity management.</p>
<p>The de facto standard is PGP [1], although there are a lot of players in this market with varying quality of products, some aiming at various vertical markets. The link below gives a nice picture of how various systemic pieces tie together.</p>
<p>I know I didn&#8217;t answer your question- people tell me that a lot- but I  can&#8217;t in good faith recommend password management. I haven&#8217;t been able  to since 1999 or so, and certainly can&#8217;t as 2009 winds down. Sure, there  are things you can do &#8211; the DoD uses the Mandylion [2], which you can buy on  ThinkGeek [3] for $50 &#8211; but it doesn&#8217;t solve the actual problem of  secure identity management: Please pardon the crudeness, but it&#8217;s like  putting whipped-cream on dogshit.</p>
<p>[1] <a href="http://www.pgp.com/products/index.html">http://www.pgp.com/products/index.html</a><br />
[2] <a href="http://www.mandylionlabs.com/">http://www.mandylionlabs.com/</a><br />
[3] <a href="http://www.thinkgeek.com/gadgets/security/91a2/">http://www.thinkgeek.com/gadgets/security/91a2/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/23/death-to-passwords/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Can You Have Too Many Roombas?</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/08/can-you-have-too-many-roombas/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/08/can-you-have-too-many-roombas/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 22:45:00 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>
		<category><![CDATA[Roomba]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=553</guid>
		<description><![CDATA[I have four Roombas of three different models (I blame Steve for telling me about &#8220;deals&#8221;). I think I may have too many. Regardless, the one thing they all have in common is a hacked together BlueTooth connection so I can run various software on them remotely. While I haven&#8217;t really talked a lot about [...]]]></description>
			<content:encoded><![CDATA[<p>I have four Roombas of three different models (I blame Steve for telling me about &#8220;deals&#8221;). I think I may have too many. Regardless, the one thing they all have in common is a hacked together BlueTooth connection so I can run various software on them remotely. While I haven&#8217;t really talked a lot about those &#8220;various softwares&#8221;, I&#8217;m really excited about a project I&#8217;m working on now, working title of RooCluster.</p>
<p>RooCluster is a command-and-control application designed for the special needs of multiple robots operating in the same space, or over large multi-room spaces. Each Roomba is being fitted with an RFID tag, which, in coordination with some more wireless access points, allows me to triangulate where a Roomba is and its travel vector (sometimes, math is cool). This information can help RooCluster avoid nasty Roomba-on-Roomba collisions, and also presents the possibility of meta-virtual walls.</p>
<p>If you have a Roomba, you probably have a virtual wall &#8211; the little pylon that sends out an infrared beam that the Roombas treat just like a wall. With some work, RooCluster should be able to honor coordinate-based lines (which could, in turn, form other shapes) and effectively &#8220;wall-off&#8221; areas without needing a physical barrier, or a battery-sucking virtual wall. You can also overlay the position and vector data onto floorplans, and see exactly where the Roombas are, and where they&#8217;re going.</p>
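<p>As a rough idea of what honoring a coordinate-based line could involve, here is a sketch of a standard segment-crossing test. This is not RooCluster code: the function names, the coordinates, and the assumption that positions arrive as [x, y] pairs are all mine, for illustration only.</p>
<pre>use strict;
use warnings;

# Which side of the line p-q the point r lies on: -1, 0 or 1.
sub side {
    my ($p, $q, $r) = @_;
    my $cross = ($q-&gt;[0] - $p-&gt;[0]) * ($r-&gt;[1] - $p-&gt;[1])
              - ($q-&gt;[1] - $p-&gt;[1]) * ($r-&gt;[0] - $p-&gt;[0]);
    return $cross &lt;=&gt; 0;
}

# True when the move from $from to $to strictly crosses the wall segment.
sub crosses_wall {
    my ($from, $to, $wall_a, $wall_b) = @_;
    return side($wall_a, $wall_b, $from) * side($wall_a, $wall_b, $to) &lt; 0
        &amp;&amp; side($from, $to, $wall_a) * side($from, $to, $wall_b) &lt; 0;
}

my @wall = ([0, 5], [10, 5]);   # hypothetical wall across a room
print crosses_wall([5, 4], [5, 6], @wall) ? "blocked\n" : "clear\n";
print crosses_wall([5, 1], [5, 2], @wall) ? "blocked\n" : "clear\n";</pre>
<p>The first move straddles the wall and the second stays on one side of it. Feeding the test the current position plus the projected travel vector would let a controller turn a Roomba before it reaches the line.</p>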
<p>Of course, you can also use it to make your Roombas dance with each other.</p>
<p>Or joust.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/08/can-you-have-too-many-roombas/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The Next Five Years of Storage</title>
		<link>http://mattwork.potsdam.edu/blog/2009/10/05/the-next-five-years-of-storage/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/10/05/the-next-five-years-of-storage/#comments</comments>
		<pubDate>Mon, 05 Oct 2009 22:51:40 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=547</guid>
		<description><![CDATA[[NOTE: This essay was commissioned by a client in December 2006. It's the second in a series of old-yet-relevant position-papers whose exclusivity has expired, that I'm editing and posting. Things for the next five look "similar". There is no formal "conclusion", as this is one section of a larger piece.]
Over the next five years, gross [...]]]></description>
			<content:encoded><![CDATA[<p>[<strong>NOTE:</strong> This essay was commissioned by a client in December 2006. It's the second in a series of old-yet-relevant position-papers whose exclusivity has expired, that I'm editing and posting. Things for the next five look "similar". There is no formal "conclusion", as this is one section of a larger piece.]</p>
<p>Over the next five years, gross storage needs will double every other year, sparked by industry trends that avoid deleting anything, ever; continued bloat in software programs; increased user demand for larger-file storage; increased user demand for indefinite storage; increased user, corporate, and industry expectation of system-side backups and frequent snapshots; and the enabling factor of meteoric-disk-size -to- paltry-disk-cost ratios.</p>
<p>Since the late 1990s, we have seen rapid acceleration of infinite data life. While storage vendors will use terms such as &#8220;information life-cycle management&#8221;, &#8220;information archiving&#8221; or &#8220;data warehousing&#8221; &#8211; they all converge onto the premise that corporate data life is no longer finite. The value of this is dubious, but irrelevant to argue: financial workers expect to be able to look at historical data for modelling purposes; draft and product workers expect to be able to look at long-dead projects that might now be of value with new knowledge; in the throes of bankruptcy, competent managers (and lawyers) will want to mine the archives for something&#8230; anything that may provide some value. Everything your organization has ever known is expected to be retained, indefinitely.</p>
<p>The average 10-page MS Word document in 1995 was 13K in size. The average 10-page MS Word document in 2006 is 1.4MB. While that size may still seem small, it&#8217;s indicative of a growing trend of software generating vastly wasteful content because it can. Software vendors don&#8217;t need to worry about their data fitting onto floppies anymore, so they don&#8217;t. Multiply this across dozens of applications, add in media, and you have truly huge data files with only a few pages of actual content.</p>
<p>Similarly, the users want ever-larger files. Gone are the days of compressing graphics, video and audio to the Nth degree: users want full-quality content. They don&#8217;t want a 120&#215;120 &#8220;thumbnail&#8221; video, they want something that takes some real-estate on their oversized monitor. As bandwidth increases, so will the user-desire for better content faster. They then want to save that same content to their network volume. They want it backed up in case of catastrophe (or their own error). What was a 3MB MP3 file is now a 45MB FLAC or WAV file sitting in your database.</p>
<p>The increase in user-end space (desktop harddisks) has led users to demand not only more and more space from their storage providers, but also indefinite storage. Users no longer have to selectively delete their e-mails to stay in a predefined space, so they keep them all, forever. They expect the same from the rest of their digital attics: they expect every bad poem, doodle, patent-idea-on-a-napkin, picture of their grandkids, etc. to be immediately available, forever.</p>
<p>Forever. Even if your disks die. Even if they accidentally delete them. Even if a meteor pummels your datacenter. The old standard of weekly backups has long passed the borders of Being Prudent, travelled through the Fields of Marginally Acceptable, and entered the Mountains of Irreparable Harm to Your Reputation. Users, customers, regulators, etc. are barely tolerant of losing a day of data, and this will get worse. In the next half-decade a truly monumental shift into multi-media backups, near-real-time data snapshots, and 100% protection of data assets will be fully realized, requiring several multiples more mixed-media backup storage than live data storage.</p>
<p>On the up-side, disk sizes are sky-rocketing, costs are plummeting and the reliability of the new serial ATA (SATA) drives has come up to a level that allows anyone to build in or expand networked disk with a trivial investment. A new generation of storage vendors is coming up, challenging the old way of thinking about networked storage and adopting technologies with more agility than their behemoth competitors. We&#8217;re quickly on our way to 1TB disk drives, flash-based storage continues to be refined and is nearing enterprise-grade, holographic storage is being commercially realized for some applications, and all of these technologies are driving the cost per megabyte down.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/10/05/the-next-five-years-of-storage/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Next Five Years of Bandwidth</title>
		<link>http://mattwork.potsdam.edu/blog/2009/09/25/the-next-five-years-of-bandwidth/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/09/25/the-next-five-years-of-bandwidth/#comments</comments>
		<pubDate>Fri, 25 Sep 2009 22:27:37 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=529</guid>
		<description><![CDATA[[NOTE: This essay was commissioned by a client in December 2006. It's the second in a series of old-yet-relevant position-papers whose exclusivity has expired, that I'm editing and posting. Things for the next five look "similar", yet scaled up in some areas. There is no formal "conclusion", as this is one section of a larger [...]]]></description>
			<content:encoded><![CDATA[<p>[<strong>NOTE:</strong> This essay was commissioned by a client in December 2006. It's the second in a series of old-yet-relevant position-papers whose exclusivity has expired, that I'm editing and posting. Things for the next five look "similar", yet scaled up in some areas. There is no formal "conclusion", as this is one section of a larger piece.]</p>
<p>Over the next five years, datacenter bandwidth will level off for a bit. With the 10GigE standard behind us we can finally pull our backbones up to a level where they&#8217;ll be able to breathe easier for a while. Storage speeds are still being gated by the storage devices themselves, and until either solid-state media becomes cost effective or disks rotate twice as fast as they are now, that isn&#8217;t going to change much. Aggregating virtual systems is actually causing an interesting bandwidth phenomenon that I&#8217;ll address later. Regardless, a 10Gig, or Nx1Gig backbone should be able to breathe well for the next half-decade. Planned year-over-year demand increases of 5-7% should be expected.</p>
<p>Desktop network speeds have been about the same for the last five years, and will largely remain unchanged. A 32-bit computer system running a commercial desktop operating system still has too many architectural limitations to make use of more than 60-85Mb/s of bandwidth. While some vendors are running 64bit processors, they generally are using bus architectures that aren&#8217;t that wide, thus gating peripheral speeds back to 32bit. In the next five years that will clean up a bit, and 64bit &#8220;extensions&#8221; to the 32bit processors will become more commonplace, but still without impacting the network noticeably, due largely to OS and bus architectural issues.</p>
<p>Environments consolidating onto virtualized systems are seeing an interesting gross decrease in datacenter network bandwidth use. Not surprisingly, they&#8217;re also seeing peak utilization well above what they had prior to consolidation. The latter is easily explained by virtualized systems generally &#8220;netbooting&#8221; their OS from the storage network or a bootserver, and now more than ever embracing networked storage <em>completely</em>. The gross decrease has been unexpected because of the higher demands on the network, but is explained by architectural constraints. We&#8217;re now seeing 10-15 virtual servers sharing one or two network connections, where previously each had one or two of their own. This has somewhat of a levelling effect on network use, but isn&#8217;t dramatically impacting service performance as one would expect. The network is more important in these environments, but as a whole not as taxed.</p>
<p>It was largely believed that mobile &#8220;broadband&#8221; availability and use would be much higher by now, but we have yet to see a real platform for use. The Palm Treo series is getting an overhaul &#8220;soon&#8221; and rumored platforms by Google and Apple may change that landscape. In general, even if fully realized, the network demands by these users will largely have no impact on the greater network, or on datacenter network needs. The next-generation, &#8220;4G&#8221;, will be changing that, but I don&#8217;t expect to see that kind of horsepower in a phone until late-2010-to-2012: the processors are still just too slow.</p>
<p>What will change dramatically will be the bandwidth access for remote users. While not directly impacting the datacenter we&#8217;re going to see dramatic growth in the cable/DSL/satellite &#8220;broadband&#8221; space. Internet-facing applications may see a 20-30% rise in client demands as users become less tolerant of waiting for application loads due to their expectations of &#8220;faster&#8221; service, on the order of 200-250% more bandwidth. It is expected that OSP asymmetrical provisioning will continue.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/09/25/the-next-five-years-of-bandwidth/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Disposable Appliance Computing</title>
		<link>http://mattwork.potsdam.edu/blog/2009/09/24/disposable-appliance-computing/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/09/24/disposable-appliance-computing/#comments</comments>
		<pubDate>Thu, 24 Sep 2009 21:54:45 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Linuxy]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=530</guid>
		<description><![CDATA[[NOTE: This essay was commissioned by a client in February 2007. It's the first in a series of old-yet-relevant position-papers whose exclusivity has expired, that I'm editing and posting]
The hosted systems industry has turned another critical point. Several years ago we eschewed large mainframe systems in exchange for commodity servers that could divide load and [...]]]></description>
			<content:encoded><![CDATA[<p>[<strong>NOTE:</strong> This essay was commissioned by a client in February 2007. It's the first in a series of old-yet-relevant position-papers whose exclusivity has expired, that I'm editing and posting]</p>
<p>The hosted systems industry has turned another critical point. Several years ago we eschewed large mainframe systems in exchange for commodity servers that could divide load and work together to provide services without single-vendor lock-in and without a single piece of &#8220;iron&#8221; waiting to fail. The computing power of a $2,000,000 mainframe was dwarfed by the implementation of $80,000 in commodity hardware. With virtualization coming-of-age- with Intel and AMD putting hooks into their processors and chipsets to allow virtualization to be fully realized and not just a software-only hack- we&#8217;ve seen those same commodity systems hosting dozens of virtual systems reliably and at near-metal efficiency. The cost per virtual system is a number <em>rapidly approaching zero</em>.</p>
<p>New offerings from Sun, IBM and HP/Compaq are emphasizing something that &#8220;the server guys&#8221; haven&#8217;t needed to care much about: infrastructure. Historically, your network engineers and analysts worried about interconnection, route redundancy, and ensuring the bits could flow where they needed, reliably and sufficiently; and your system engineers worried about everything up to the point the bits hit &#8220;the network&#8221;. Moving forward, that is almost a debilitating dichotomy. Traditionally, in the post-mainframe era, a physical system did one or two things and its exclusion from the network or its under-performance on the network was a minor issue. With a physical system possibly hosting dozens of virtual systems- all with unique networking requirements, cross-talking requirements, and, of course, networked storage requirements- your system engineers must be well-versed in network engineering. &#8220;<a href="http://en.wikipedia.org/wiki/John_Gage">The Network Is The Computer</a>&#8221; is not just a Sun tag-line, or a lame clich&#233;. We&#8217;re now fully realizing the potency of that statement. Every system offering from the Big Three contains significant &#8220;infrastructure&#8221; features: Network features.</p>
<p>By pushing more and more network features into server systems &#8211; IBM servers with Cisco &#8220;swrouters&#8221; built-in, for example &#8211; the server itself has become more important and less relevant at the same time. Keeping it up and running <em>well</em> will require a new kind of system engineer because &#8220;the box&#8221; is now more complex. But at the same time, collections of &#8220;boxes&#8221; should be able to self-heal and adapt to the failures of others. Each system has now become disposable.</p>
<p>A large swath of the architectural literati are already deploying quantities of self-healing farms that take over the work &#8211; the very virtual machines &#8211; of failed or failing physical systems. Virtualization on its own wasn&#8217;t a game-changer. Virtualization with processor support and recognition sparked real potential. Virtualization on top of &#8220;infrastructure&#8221;-aware (e.g. heavily networked) physical systems has dramatically shifted the value of hybrid &#8220;networked systems engineers&#8221;, raised the bar for the &#8220;server guys&#8221; to get up to speed on the real internals of networking, and has provided the unprecedented opportunity to deploy redundantly resilient systems that can <em>in practice</em> achieve five-to-seven &#8220;nines&#8221; of reliability.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/09/24/disposable-appliance-computing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Rebuttal of a Rebuttal&#8230;</title>
		<link>http://mattwork.potsdam.edu/blog/2009/09/08/a-rebuttal-of-a-rebuttal/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/09/08/a-rebuttal-of-a-rebuttal/#comments</comments>
		<pubDate>Tue, 08 Sep 2009 15:30:31 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>
		<category><![CDATA[cloud computing]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=503</guid>
		<description><![CDATA[Somehow, while clicking away at the Internet, I managed to get to this blog post, which is a rebuttal of this article by Cory Doctorow about cloud computing. Please keep in mind I dislike most of the things that come off of Cory&#8217;s fingers intensely, and even elements of the aforementioned article. Where I blew [...]]]></description>
			<content:encoded><![CDATA[<p>Somehow, while clicking away at the Internet, I managed to get to <a href="http://www.nirak.net/2009/09/03/computing-in-the-clouds/">this blog post</a>, which is a rebuttal of <a href="http://www.guardian.co.uk/technology/2009/sep/02/cory-doctorow-cloud-computing">this article</a> by <a href="http://en.wikipedia.org/wiki/Cory_Doctorow">Cory Doctorow</a> about cloud computing. Please keep in mind I intensely dislike most of the things that come off of Cory&#8217;s fingers, and even elements of the aforementioned article. Where I blew an artery was at the rebutter saying:</p>
<blockquote><p>Cloud computing fosters intense competition.</p></blockquote>
<p>This is not just wrong (people are wrong all the time, and I don&#8217;t blog about it), it&#8217;s <em>dangerously</em> wrong. Nowhere in technology have we had less competition potential, and more monopoly potential, than in the realm of cloud computing. Why? Cloud computing is all about size and marketing. I remember back in the day when I was happily using <a href="http://www.webcrawler.com/">WebCrawler</a> to scour the Intertubes for what I wanted to find, when some jerks from Stanford started some crappy search engine called <a href="http://www.google.com/">Google</a>. Pffft. &#8220;There&#8217;s tons of competition in the search engine market&#8221;, the analysts said. I was concerned then, and I&#8217;m concerned now.</p>
<p>Any centralized Internet service is about size and marketing. Read up <a href="http://mattwork.potsdam.edu/blog/2009/09/02/distributed-by-design/">on my last post</a> if you want to know how I feel about centralized services in general. Right now there are three major players in the &#8220;cloud computing&#8221; market. That will grow, as did search engines, to probably 20 major players and everyone will think I&#8217;m nuts, and then they will be slammed against the wall by One.</p>
<p>One that claims not to be evil.</p>
<p>One that believes (truly) they&#8217;re doing good.</p>
<p>One that rules them all.</p>
<p>This is not some psychic prediction, this is the recognition of a pattern that has played out too many times in the young life of the commercial Internet.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/09/08/a-rebuttal-of-a-rebuttal/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Distributed By Design</title>
		<link>http://mattwork.potsdam.edu/blog/2009/09/02/distributed-by-design/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/09/02/distributed-by-design/#comments</comments>
		<pubDate>Wed, 02 Sep 2009 21:41:28 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Life]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=498</guid>
		<description><![CDATA[The Internet, by taking its name literally, means &#8220;the network between networks&#8221;. 40 years ago today, the Men and Women that first started implementing the technologies that grew up to be the Internet didn&#8217;t want to create one network- they wanted to unify distributed networks redundantly. They wanted to do it this way so that [...]]]></description>
			<content:encoded><![CDATA[<p>The Internet, by taking its name literally, means &#8220;the network between networks&#8221;. <em><strong>40 years ago today</strong></em>, the men and women who first started implementing the technologies that grew up to be the Internet didn&#8217;t want to create one network &#8211; they wanted to unify distributed networks redundantly. They wanted to do it this way so that a problem on one network had literally no impact on other networks &#8211; and especially no impact on the Greater Network.</p>
<p>Yet humans are congregational, fashionable beasts. Humans, in general, want to be where there are other humans- even virtually. They want to be a &#8220;part of something&#8221;. They want to be on Facebook or MySpace. They want an iPhone. They want a Gmail account. They want to be LinkedIn. They love fads and the feeling of faux exclusivity that they bring, like Twitter. They fight <em>against</em> distribution.</p>
<h2>The Fallout of Centralization</h2>
<p>When Gmail is offline for a few hours, it makes international news. Millions of people across the globe are without e-mail! Crisis! Panic! When Facebook &#8220;gets a virus&#8221;, it&#8217;s international news! Millions of people are at risk! Crisis! Panic! When Twitter gets knocked offline, it&#8217;s international news! Millions of people can&#8217;t tell each other what they&#8217;re eating! Crisis! Panic!</p>
<p>Why won&#8217;t they get their e-mail from a more local provider? Why won&#8217;t they get other services from a more local provider? Why do people buy Macs? Why do people like malls? They fight against distributed designs, and try to pull things together. &#8220;One-stop shopping&#8221; to a network neophyte is all giggles and rainbows. To a network architect it&#8217;s &#8220;bad design&#8221;.</p>
<p>Internet history is littered with fads and centralized solutions that inevitably fail to retain attention after the Next Big Thing arrives, but distributed solutions tend to last much, much longer. Why?</p>
<h2>The Argument for Centralization</h2>
<p>There&#8217;s only one reason Google loses more money than my lunch cost every time someone watches a YouTube video: Control. Companies offer centralized services so they can control their users, their content, and their market. Facebook doesn&#8217;t care about user privacy or quality services, they care about making money (or, in their case, losing less money). They want you to be on their site as often and for as long as possible, and they&#8217;ve <em>already</em> sold your information in order to fund more gizmos and whirligigs (&#8220;apps&#8221;, they call them) to keep you coming back and sticking around. Control.</p>
<p>If Facebook were distributed &#8211; anyone could put up their own Facebook-compatible site, and allow other geographically-similar people to use it &#8211; how would they exercise control? How would they censor you, collect your personal data and sell it? How would they push you the latest version of your favorite whirligig? How would they profit off of you (or prevent someone else from profiting off of you)? Control.</p>
<p>While Gmail is a centralized interface to a distributed system (collectively known as &#8220;e-mail&#8221;), they&#8217;d just love it if everyone used Gmail and there was no distributed e-mail system. They mine your e-mail for keywords to target ads at you, making truckloads of money at the expense of your privacy &#8211; ads that companies pay a premium for because Google can already &#8220;guarantee&#8221; you&#8217;re interested. Control.</p>
<h2>The Argument for Distribution</h2>
<p>Keeping components small, simple, and &#8220;close&#8221; is a doctrine of many disciplines, and is imperative for &#8220;critical infrastructure&#8221;. I have more than one client that intentionally buys equipment from multiple vendors, to reduce risk of single-vendor product failures. Several only update a fraction of systems and application software to new versions at a time, to reduce risk of debilitating bugs (one client, literally, has some 8-year-old operating systems running unpatched for this reason). More and more institutions have multiple WAN (&#8220;Internet&#8221;) connections, from multiple providers, to reduce risk of single-provider failures.</p>
<p>Distributed infrastructure is more about &#8220;us&#8221; (the global network users) than &#8220;you&#8221; (the individual). You don&#8217;t care whether Gmail (with a few million users) is down, or whether your ISP (with a few thousand users) is down: all you care about is that you can&#8217;t get to your e-mail. Because of this, it&#8217;s hard to convince &#8220;you&#8221; that distributed is better. You, the individual, don&#8217;t care about network survivability or how many <em>other</em> people are impacted &#8211; you only care about yourself, and your inability to access your mail.</p>
<p>Would you like to travel to your State capital to go to a centralized hospital? How about to the National Library to check out a book? When it comes to travel-related logistics, most people go &#8220;duh, local hospitals and libraries and schools and whatnot make tons of sense&#8221;. But once they go online &#8211; once that facade of geography is lifted &#8211; eYouSpaceFaceGoogleTwitterBookMailBay.com really is the best thing in the world, OMG?!!</p>
<h2>HOWTO</h2>
<p>So what can we do? What can <em>you</em> do? Appreciate and utilize local, distributed infrastructure. Don&#8217;t outsource/offshore your new whizbang application when your local hosting provider can handle it. Limit your dependence on centralized services, and find distributed substitutes where needed. Pretty much every infrastructure application has a distributed counterpart, many of them much, much less evil than their centralized cousins. I&#8217;m not saying &#8220;don&#8217;t go to website X&#8221; &#8211; I enjoy Wikipedia and CNN and all sorts of centralized information sources &#8211; but I don&#8217;t depend on them for communication, and I certainly know other places to go if they&#8217;re offline.</p>
<p>Distributed service architectures are about freedom and survivability. Centralized service architectures are about captivity and control. Happy Birthday, Internet. I look forward to another 40 years of architectural ingenuity.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/09/02/distributed-by-design/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>&#8220;Secure Programming&#8221;</title>
		<link>http://mattwork.potsdam.edu/blog/2009/05/19/secure-programming/</link>
		<comments>http://mattwork.potsdam.edu/blog/2009/05/19/secure-programming/#comments</comments>
		<pubDate>Tue, 19 May 2009 04:57:45 +0000</pubDate>
		<dc:creator>M</dc:creator>
				<category><![CDATA[Architecture]]></category>
		<category><![CDATA[Coding]]></category>
		<category><![CDATA[Linuxy]]></category>
		<category><![CDATA[Opinions]]></category>
		<category><![CDATA[Work]]></category>

		<guid isPermaLink="false">http://mattwork.potsdam.edu/blog/?p=454</guid>
		<description><![CDATA[Earlier this year I developed a two-day seminar for a client on &#8220;secure programming&#8221; techniques. The intended audience was automation programmers, but at the last minute, after seeing the &#8220;syllabus&#8221;, they decided to include their application programmers. The result was a flurry of re-working to accommodate the Java crowd in addition to the Perl/PHP/Python crowd. [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier this year I developed a two-day seminar for a client on &#8220;secure programming&#8221; techniques. The intended audience was automation programmers, but at the last minute, after seeing the &#8220;syllabus&#8221;, they decided to include their application programmers. The result was a flurry of re-working to accommodate the Java crowd in addition to the Perl/PHP/Python crowd. In the end, the bulk of the message was the same. While I can&#8217;t republish my stack for trade-secret reasons, I can dump the agnostic bits (mostly my slide notes) here. The code examples will be in human-readable Perl or pseudo-code, but will work in any language.</p>
<h2>Users Are Un[trust]worthy</h2>
<p>The first, and most important point, is that you can never trust user-supplied data. Ever. Just because your whiz-bang drop-down box only shows the user four options, doesn&#8217;t mean they can&#8217;t end up sending `rm -Rf /`. This is the <strong>single largest threat</strong> against an in-house application. User-supplied data must always always be vetted, sanitized and treated as if it could take over the world. A lot of programmers use rapid-application-development tools to create interfaces. These tools are a boon for creating slick interfaces, but generally a complete bust at data integrity enforcement. You tell it &#8220;make this a date field&#8221;, and it does&#8230; But anyone with a quick script can insert 40MB of garbage data, which will at best crash your application, and at worst, overflow a buffer that had improper bounds (see next point) and allow &#8220;remote execution of arbitrary code&#8221; (READ: You&#8217;re fucked).</p>
<p>So how do you fix this? Regardless of the languages you program in, they probably have a <a href="http://store.xkcd.com/xkcd/#RegularExpressionsShirt">regular expressions</a> (regexp) library. Many popular languages even have &#8220;Perl-Compatible Regular Expressions&#8221; (PCRE), as there is no better language for data processing than Perl, and its regexp syntax is well-defined (albeit with a moderate learning-curve). Regexps are the easiest way to enforce your intentions with data.</p>
<pre>unless($userdate =~ m#^\d\d\d\d/\d\d/\d\d$#) { Freak_Out(); }</pre>
<p>That&#8217;s a one-liner that says, exactly, &#8220;unless the data in the variable $userdate begins with four digits, followed by a forward slash, followed by two digits, followed by a forward slash, followed by two digits, and that&#8217;s it &#8211; Freak Out&#8221;. (m &#8211; we&#8217;re matching, # &#8211; start the regexp, ^ &#8211; match at the beginning of the data, \d &#8211; a digit (0-9), $ &#8211; match the end of the data, # &#8211; end the regexp).</p>
<p>You could easily write a function to do this for you:</p>
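<pre>sub Is_Date {
  # Take the value to check from the argument list.
  my ($userdate) = @_;
  # Accept only four digits, a slash, two digits, a slash, two digits.
  if(defined $userdate and $userdate =~ m#^\d\d\d\d/\d\d/\d\d$#) { return 1; }
  else { return 0; }
}</pre>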
<p>Then just call</p>
<pre>unless(Is_Date($usercrap)) { Freak_Out(); }</pre>
<p>every time you need to check that you&#8217;ve REALLY got a date. Not hard. The user could still be entering the data wrong, but that&#8217;s not the secure programmer&#8217;s job. <img src='http://mattwork.potsdam.edu/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
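<p>One caveat with the pattern above: in Perl, $ also matches just before a trailing newline, so a date followed by a newline character would still pass. A slightly stricter sketch, assuming you also want a crude range check on month and day (the name Is_Strict_Date is made up for illustration, and it still doesn&#8217;t know that February is short):</p>
<pre>use strict;
use warnings;

# Hypothetical stricter variant of Is_Date. \A and \z anchor to the true
# start and end of the string, so a trailing newline is rejected.
sub Is_Strict_Date {
  my ($userdate) = @_;
  return 0 unless defined $userdate;
  # Year: four ASCII digits. Month: 01-12. Day: 01-31.
  return $userdate =~ m#\A[0-9]{4}/(?:0[1-9]|1[0-2])/(?:0[1-9]|[12][0-9]|3[01])\z# ? 1 : 0;
}

print Is_Strict_Date("2009/05/19"), "\n";    # prints 1
print Is_Strict_Date("2009/05/19\n"), "\n";  # prints 0 (trailing newline)
print Is_Strict_Date("2009/13/40"), "\n";    # prints 0 (month and day out of range)</pre>
<p>Using [0-9] instead of \d is a deliberate choice here: on Unicode-aware strings \d can match digits from other scripts, which is rarely what a date field wants.</p>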
<p>It is imperative that your application go out of its way to recognize crap as soon as it can. Don&#8217;t pass a variable containing unchecked user-supplied data to a dozen or so functions before checking it. Accept it, and check it immediately. SQL injection attacks are another whole can of worms that can take advantage of improper sanitization, and frequently get executed in strange places. Sanitize early.</p>
<p>I know you guys aren&#8217;t writing AJAX stuff, but I&#8217;d be derelict if I didn&#8217;t hammer this out anyhow: If you trust JavaScript or any other client- or browser-executed code to do your security bits for you, you are not only a moron, but should be made to wear a Scarlet Zero for the next few years. Zero for &#8220;0wn3d&#8221;. Just because you write JavaScript, doesn&#8217;t mean that&#8217;s how the browser will execute it. ANYONE can rewrite your JavaScript to do whatever they want. The above Is_Date function, if implemented in client-side JavaScript, can easily be rewritten to simply &#8220;return 1&#8221; regardless of what&#8217;s passed. <strong>You must never trust client-supplied data</strong> regardless of whether you believe it was vetted client-side. It <em><strong>must</strong></em> be vetted server-side. Must. Must. Must. If, for user convenience, you want to implement something to trap errors quickly client-side, that&#8217;s groovy, but in no way does it excuse you from vetting the content once submitted to your application.</p>
<p>I can&#8217;t say this enough. It seems like common-sense, and a lot of you are nodding at me right now, but a month on you&#8217;re going to write a quick webform with radio-button selection, and someone in Beijing is going to own your database server because they submitted `echo root:fXAWEOo7DsNSM:0:0:0:0::: &gt; /etc/shadow` to the &#8220;What&#8217;s your favorite fruit?&#8221; question. [PAUSE] It&#8217;s funny right now, it&#8217;s not funny when you&#8217;re facing a DoD inquiry.</p>
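<p>A minimal sketch of vetting that &#8220;favorite fruit&#8221; answer server-side, assuming the form only ever offers a fixed set of choices. The fruit list and the Is_Allowed_Fruit name are invented for illustration:</p>
<pre>use strict;
use warnings;

# Hypothetical allow-list: the only values the form legitimately offers.
my %ALLOWED_FRUIT;
$ALLOWED_FRUIT{$_} = 1 for qw(apple banana cherry mango);

# Return 1 only if the submitted value is exactly one of the known options.
sub Is_Allowed_Fruit {
  my ($answer) = @_;
  return 0 unless defined $answer;
  return exists $ALLOWED_FRUIT{$answer} ? 1 : 0;
}

print Is_Allowed_Fruit("banana"), "\n";            # prints 1
print Is_Allowed_Fruit("banana; rm -Rf /"), "\n";  # prints 0</pre>
<p>When the set of valid answers is known in advance, comparing against that set is simpler and stricter than trying to strip out &#8220;bad&#8221; characters.</p>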
<h2>Data Sizes Are Critical (AKA Know Your Data)</h2>
<p>The second most important thing I can convey to you is the concept of sanitizing types. Those of us in the room that are automation programmers and write in the P-languages don&#8217;t have to give a shit about this, but those of you using Java, C or anything that comes out of Redmond, WA need to listen up: If you&#8217;re blindly stuffing data into a restricted type, you better be damn certain it&#8217;s the right size. Last year one of your competitors had a nice lad write an internal automation system that took data from a database and did things with it. Anyone know how large a MySQL blob-type is? 64k bytes. Anyone know how large a Microsoft Visual BASIC date-type is? 8 bytes. Thankfully, he wasn&#8217;t writing a user-facing application. [PAUSE] Last I heard, he was writing TARP-laundering algorithms for AIG.</p>
<p>Every time you hear about a &#8220;buffer-overflow&#8221; attack, it is because some moron didn&#8217;t check data before stuffing it into memory. The lower-level you code, the more important this is. The example I gave about the VB app only cost the contractor a few dozen-thousand dollars and a loss-of-face from their client because VB just panics and dies when you do that. If you&#8217;re coding in C and not running on a very secure Linux box with a memory-jockeying system, you may well have just overwritten your security code with:</p>
<pre>blahblaWastingSpaceUntilYourDataTypeIsOverblahblahblahCall Function DoSomethingBad</pre>
<p>Yeah, that&#8217;s what a buffer-overflow looks like. Know your data, and when in doubt <em>cast</em>. A lot of instructors shy away from casting. It slows things down, it causes a lot of compile-time checks that frequently create errors or warnings in otherwise clean code. Casting is the absolute best way to make sure you&#8217;re not blowing out a primitive type. Of course, if you&#8217;re writing into a char-array, that&#8217;s not going to help you, but that&#8217;s your problem. <img src='http://mattwork.potsdam.edu/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
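<p>Perl scalars grow as needed, so the overflow itself is hard to show in Perl, but the habit of checking size before handing data to a fixed-width consumer can be sketched. The 8-byte limit echoes the date-type mentioned above; the Fits_In_Field name is hypothetical, and the sketch assumes $data is a byte string rather than decoded text, so that length() counts bytes:</p>
<pre>use strict;
use warnings;

# Hypothetical guard: returns 1 only if $data is defined and is no longer
# than $max_bytes, otherwise 0.
sub Fits_In_Field {
  my ($data, $max_bytes) = @_;
  return 0 unless defined $data;
  return 0 if length($data) &gt; $max_bytes;
  return 1;
}

my $blob = "x" x 65535;                     # stand-in for a full-size blob
print Fits_In_Field("20090519", 8), "\n";   # prints 1
print Fits_In_Field($blob, 8), "\n";        # prints 0</pre>
<p>The same check belongs in front of any copy into a char-array or other fixed-size buffer in a lower-level language, where nothing will do it for you.</p>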
<h2>Lazy Handles</h2>
<p>You guys know all about access permissions, so I&#8217;m going to skip over a bunch of stuff. One thing I&#8217;ve seen here and elsewhere is opening handles &#8211; network sockets, database handles, file handles, directory handles, etc. &#8211; as users with way more permission than necessary. Sure, it&#8217;s easier for you, but what happens when your code gets hosed and someone from Bulgaria manages to inject themselves in before you&#8217;ve let go of that handle? Explaining to your boss that you opened the files as root &#8220;because it was easier&#8221; isn&#8217;t going to fly. Your application needs to run restricted, which you already know, and you absolutely cannot wantonly elevate handles without damn good reason. I can&#8217;t count the number of applications I&#8217;ve seen &#8211; Hell, applications I&#8217;ve <em>written</em> &#8211; that immediately connect to a resource as God himself to do some stuff that&#8217;s pretty primitive &#8211; and maybe, MAYBE one operation in fifty that actually needed that access. Which leads to the next problem here&#8230;</p>
<p>Keeping handles open longer than necessary, or in anticipation of future operations. If your application does six operations on a resource and only one needs the assistance of Angels, then that&#8217;s the only operation that gets the Silver Trumpets &#8211; the rest can stew along with the rest of us. I don&#8217;t care if you&#8217;re doing all six of those operations at once: you run the primitives as a primitive user, and then either change-user or fork or whatever you need to do as the elevated user for that one operation. Yeah, I groan about it too. But that&#8217;s it. Every moment your application has an elevated resource connection is a moment that someone is going to get their</p>
<pre>INSERT INTO USERS user='yakov',password='smirnov'; GRANT ALL FOR ALL TO USER identified by 'yakov';</pre>
<p>into your data. It&#8217;ll take your auditors a fiscal quarter before they find that account.</p>
<p>With the exception of operating-system handles, you can almost always switch out, rebind, setuid, or the like, without making a new connection.</p>
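<p>On the &#8220;don&#8217;t hold it longer than you need it&#8221; half of this, a small sketch for file handles: open with the least access the operation needs (read-only here), do the one thing, and let go. In Perl a lexical file handle is also closed automatically when its variable goes out of scope, so wrapping the operation in its own sub keeps the handle&#8217;s lifetime short. The Read_First_Line name is made up for the example, and it reads the running script itself only so the sketch has a file that is sure to exist:</p>
<pre>use strict;
use warnings;

# Hypothetical helper: open read-only, take what is needed, let go.
sub Read_First_Line {
  my ($path) = @_;
  # Three-argument open with an explicit read-only mode.
  open(my $fh, '&lt;', $path) or die "Cannot open $path: $!";
  my $line = readline($fh);
  close($fh) or die "Cannot close $path: $!";
  return $line;
}  # even without the explicit close, $fh would be released here

my $first = Read_First_Line($0);  # $0 is the path of the running script
print $first if defined $first;</pre>
<p>The same shape applies to database and network handles: acquire inside the smallest scope that needs the access, and release before moving on.</p>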
<h2>1176da21241f79203fbd93e367f35142</h2>
<p>Yeah, encrypt <em>everything</em>. I know you guys are using SSL for nearly all resource connections, which is ridiculously important. Even the server-to-server stuff that we used to shrug off and say &#8220;yeah, well it&#8217;s on a switched network, and in the same room&#8221; has <em>got</em> to be encrypted on the wire. There are too many techniques and tools out there to get in the middle of those streams, and we all know how reliable network administrators are at picking up that stuff. [PAUSE]</p>
<p>Even within your application, if you&#8217;re writing out temp data, encrypt it from possibly prying eyes and processes. If you&#8217;ve got in-memory data that&#8217;ll be sitting around for some time, and might be a candidate for swapping, encrypt it in memory. How many of you have ever even considered encrypting live data? If that data gets stale, and the operating system decides to swap out some pages, that stuff is possibly going to be visible unless your infrastructure takes into account encrypted swap and the like. In lots of languages, encryption and decryption of relatively small amounts of data (&lt; available RAM) is pretty simple. I&#8217;m not advocating it for everything, but there&#8217;s not much harm in:</p>
<pre>$data=encrypt($data);
{ #Long
  #Running
} #Block.
$data=decrypt($data);</pre>
<p>If something weird happens, and some or all of $data is swapped out because it&#8217;s not being used, it&#8217;s useless to an attacker.</p>
]]></content:encoded>
			<wfw:commentRss>http://mattwork.potsdam.edu/blog/2009/05/19/secure-programming/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
