Category: Architecture

Apr 21 2009

Beating CAPTCHA-Crackers

A CAPTCHA That "can't" be cracked

Everyone is in this arms race: those who make CAPTCHAs, and those who want to crack them.

The solution for the former is simple: animate them. I’m not talking about making a 6-frame looping GIF, where the cracker can steal a frame and crack at THAT. I’m talking about an animation where no single frame, looked at on its own, has all of the information, but the sum of viewing them makes it obvious.

There are 6 frames to the CAPTCHA on the right. The number “4” and letter “K” are normal: if a cracking algorithm ripped these frames apart, it could trivially determine those. But the “8” is made of two frames, both of the letter “O”, and the “X” is made up of two frames, one a forward slash and the other a backslash. I’m not going to claim that this exact CAPTCHA is uncrackable, but the concept, given more than 45 seconds in the Gimp, will yield a product that cannot be beaten by non-morphing algorithms, and I don’t see the CAPTCHA-cracking clique getting that sophisticated for a few more years at least.

Go Forth And Code.

UPDATE 5/11: A colleague countered that this could be beaten by a simple “flattening” algorithm that looks at all the frames at the same time. Again, the simple animation I made wasn’t meant as a true example, merely the gist. Introducing multi-color backgrounds and “erasing” parts of previous frames with later frames, among other techniques, would nullify the “flattening bypass”.
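To make the flattening argument concrete, here is a minimal sketch in Python. The frames are hypothetical 3x3 bitmaps, not the actual animation, and `flatten` is an illustrative name. It only shows that overlaying a forward-slash frame and a backslash frame recovers the “X” when nothing is ever erased.

```python
# Hypothetical 3x3 frames: '#' is ink, '.' is background.
FORWARD_SLASH = ["..#", ".#.", "#.."]
BACKSLASH = ["#..", ".#.", "..#"]


def flatten(frames):
    """Overlay frames: a pixel is ink if it is ink in any frame."""
    height = len(frames[0])
    width = len(frames[0][0])
    return [
        "".join(
            "#" if any(frame[y][x] == "#" for frame in frames) else "."
            for x in range(width)
        )
        for y in range(height)
    ]


flattened = flatten([FORWARD_SLASH, BACKSLASH])
print("\n".join(flattened))
```

Neither input frame is an “X” on its own, but the overlay is. A plain OR like this has no notion of frame order, which is why frames that erase earlier strokes get in its way.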

Jun 02 2008

Nessus 3

Nessus is the security scanner. That’s not just a tagline, it’s the truth. Yes, other people make scanners. Yes, other companies make tons of money off of their scanners, but having used ALL of them (yes, every single purported ‘security scanner’ that I have ever heard of, for the last 10 years or so. ALL), Nessus is the “best”.

Historically, one of its major values was its cost (free) as well as its source license (Free/Open). Cost is still, generally, free (restrictions and various non-free necessities apply) … but the source, and the product, are… not. This may upset some to the point of refusing to use it. I’m not one of them.

Nessus is very light-weight, its rule language (NASL) is very intuitive and powerful, and the sheer volume of support and flexibility provided is exceptional. It finds things other scanners only dream of. It can (not by default) work very stealthily, not setting off some IDSes. It can be configured not to destroy the systems it’s scanning. No one builds a better scanner. Truthfully, I wish someone would: Not because I don’t like or want to use Nessus, but because the ecosystem is very homogeneous in this space: You either use Nessus, or you may as well find a dowser with a divining rod to point out your vulnerabilities.

Over the years, one of the things that has been failing with Nessus is the interface. I enjoy command-line interfaces as much as the next UNIX junkie, but when you need a tool you can put in the hands of Joe Average, it has to have a graphical interface… It just must. Historically, back when Nessus was free and Free, there was no shortage of Tk/GTK/Web/Qt/etc. etc. interfaces: A lot of those are still around, but are generally cut off from the newer features. When your license prohibits reverse engineering, it makes it hard for people to maintain their interfaces to your product.

Recently, I re-acquired my main scanning system (after a stealth project became production, literally overnight (over a year ago)), and decided to upgrade everything… Including Nessus. The upgrade was flawless. It found my old install, upgraded it, re-registered my feed, etc. etc. Very slick. I then upgraded the client on my laptop. That was not so smart of me. Following a common trend, Tenable has ousted the old interface in favor of a “new” Qt-based interface. I have nothing against “new” or “Qt”, but the new interface is missing dozens of previously-exposed features and one… one very IMPORTANT feature has been removed from both the Qt version and the CLI version: html_graph. It appears there is no html_graph in Nessus 3.

What is html_graph? It was a report export format that presented an HTML report of the scanning results along with pie and bar graphs giving you a visual representation of what the report contained. Very very very important when presenting this information to people wearing ties.

I searched the Interlink: No vast outcry! Just one, lonely, unanswered post, reporting the problem and asking for help. I would guess that because he wrote in all lower-case, no one wanted to tell him the harsh truth: You can’t with Nessus 3. Nope. No more XML output. No more HTML+Graphs. No more. I’ll hypothesize this is because of the license changes: They can’t link against a whole slew of Free libraries that do this work for you, because of the shift from libre to locked. That’s just a guess. At first I thought it was just in the Qt-interface, which could have been simply a CBA or porting issue, still on the TODO list: But when I saw the neutered CLI – without XML or html_graph – well, that’s indicative of politics as opposed to growing pains.

So I don’t know, as I’ve said, why these are gone. But they are. Thankfully, as of right now, I can still convert the Nessus 3-generated scan files (NBE) into html_graph and XML using an older version of the Nessus CLI (version 2.something), but it’s a pain, it’s an extra step (okay, okay, I automated it with 2 lines of Perl, but it’s a pain to someone!!), and it devalues an otherwise very very valuable tool… For if you cannot show pie and bar graphs to those wearing ties, on-demand, the fact that you have found 4781 security problems is moot.

Mar 10 2008

EXT4, AKA “Watching M@ Wrestle His Demons”

As someone with a history in filesystems, I have kept abreast of the EXT4 developments, most recently through an interview with Eric Sandeen on the topic, from an RH/Fedora angle. EXT4 will have lots of features new to EXT3 (and may also pick some of the remnant EXT2 stalwarts up, shake them up a bit, and get them to convert). EXT4, unfortunately, has no features new to filesystems in general, nor even to open source ones. The latest “new feature” list in Eric’s interview read like a list of XFS features from a decade ago… Then I got to the bottom, and Eric worked @ SGI on XFS for 5 years, almost a decade ago. Go figure.

This is divisive for me. Because I’m a Darwinist: I believe that it’s perfectly fine if there are multiple organisms that do exactly the same thing, as eventually one will win and one will die. At the same time, I see all the energy being put into EXT4, and the practical side of me is screaming “HEY, YOU’RE JUST GETTING EXT TO WHERE XFS WAS 10 YEARS AGO- HOW ABOUT PUTTING YOUR ENERGY INTO XFS AND GETTING IT TO WHERE IT WILL BE 10 YEARS FROM NOW, FASTER?! KTHXBAI”. The main source for this schism is that the environment is rigged. XFS has been open source and stably available on Linux since springish of 2000 (and BSDs, shortly thereafter), yet it still has to fight tooth and nail for any recognition or prominent placement within Linux distros.

XFS has been a workhorse for those in-the-know for a long, long time. I have clients that bought SGI products because they wanted XFS, yet it has consistently been relegated to the backseat, in favor of the old-guard UFS.. er I mean EXT family. XFS continues to amaze various Linux administrators I work with, who, after having XFS inflicted on them, can’t imagine life without it. The catastrophes you can survive (albeit painfully) with XFS are totally unrecoverable with EXT. The extreme operating conditions you can endure with XFS are outside the realm where EXT can safely exist.

So there it is: my internal Filing System Fight Club. It is the beauty of open source, while at the same time the ugliness of it. Reinventing the wheel. Again. Thankfully some people are doing new things in this space.

Mar 08 2008

Introducing: RAID-E

Introduction

Do you have some data you’d love to have backed-up, in real-time, somewhere else, but don’t trust the destination? RAID-E is a great solution for you. The concept is simple: every time you modify a file, RAID-E makes a copy, encrypts it using one of numerous methods described below, encrypts the name of the file as well (by default), and then copies it off to where it should be.

RAID-E is a FUSE filesystem that can use any underlying, mountable filesystem (or folders within the filesystem) as its sources and targets. You can, for example, RAID-E your “Documents” folder on your laptop to some Windows-shared space on a file server or NetApp. RAID-E may be ported to other FUSEless operating systems someday.

RAID-E supports a rainbow of encryption algorithms and works in one of three modes:

PGP/GPG Encryption

If you have a PGP/GPG key, and would like everything to be encrypted using that, RAID-E is happy to oblige, and will use your public key to encrypt your files before copy. If you want them signed as well, then you will need to provide your keyring passphrase in order to access your private and/or signing key.

Standard Encryption

RAID-E can generate an encryption key (or you can provide one) using one of a number of user-selectable algorithms. It then encrypts that key (again with one of a number of user-selectable algorithms) using a phrase of your choice. The encrypted key can be stored on a USB stick or other flash-based removable media. When RAID-E is mounted, the phrase is required to decrypt the key, allowing files to be encrypted.

Cornucopia Encryption

Using the same concept as ‘Standard Encryption’, Cornucopia uses a number of different encryption algorithms. Individual files are encrypted using a pseudo-randomly picked algorithm. For example, one file might use AES, while the next uses Twofish. While the security advantage of this is admittedly dubious (don’t Doghouse me, Bruce – I admit it!), it won’t decrease your security and may protect fractions of your dataset against attacks directed at particular algorithms.
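How the per-file pick is made is not specified above, so the following is only a sketch of one way to do it. It assumes the choice is derived from the key and the file name with an HMAC, so that it looks random to an outsider but can be re-derived at decryption time. `ALGORITHMS` and `pick_algorithm` are hypothetical names, and no actual encryption happens here.

```python
import hashlib
import hmac

# Hypothetical algorithm list; the text names only AES and Twofish.
ALGORITHMS = ["AES", "Twofish"]


def pick_algorithm(key: bytes, file_name: str) -> str:
    """Pick an algorithm pseudo-randomly, but repeatably, per file."""
    digest = hmac.new(key, file_name.encode("utf-8"), hashlib.sha256).digest()
    # Use the first four bytes of the digest as an index into the list.
    index = int.from_bytes(digest[:4], "big") % len(ALGORITHMS)
    return ALGORITHMS[index]


key = b"example key, not a real secret"
for name in ("notes.txt", "budget.ods", "photo.jpg"):
    print(name, "->", pick_algorithm(key, name))
```

Deriving the pick from the key means nothing extra has to be stored next to the encrypted file. A real implementation could just as well record the algorithm in a header.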

Bootstrapping/Offline Synchronization

To aid in the initial bootstrapping, and to make synchronizing after making off-line changes a snap, a tool called ‘mirror-e’ is also included. ‘mirror-e’ will use the same configuration and methodology as RAID-E to encrypt and copy any changed files or new files between the source and target.

Various Configurable Defaults

By default, RAID-E will never delete files from a target.

By default, RAID-E is only concerned with file names and contents, not metadata, attributes, etc.

By default, RAID-E does not verify a copy operation.

By default, RAID-E will always overwrite a target file.
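As a rough illustration, the four defaults above could be captured in a small settings object. This is a sketch only: the class and field names are hypothetical and do not come from RAID-E itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RaidEDefaults:
    """Hypothetical settings mirroring the defaults listed above."""

    delete_from_target: bool = False  # never delete files from a target
    copy_metadata: bool = False       # names and contents only
    verify_copy: bool = False         # copies are not verified
    overwrite_target: bool = True     # always overwrite a target file


defaults = RaidEDefaults()
# Opting in to verification without touching the other defaults.
careful = replace(defaults, verify_copy=True)
print(defaults)
print(careful)
```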

Status and Errata

RAID-E and its toolset are being developed independently, and will be released under the GPLv2 license.

Mar 04 2008

Back to Billboarding

Billboarding used to be the way to share groups of read-heavy bit or small-byte data across numerous processes or systems. The concept was simple: Have a file with a bunch of zeros in it, and on occasion change one of the zeros to a one. Processes/systems reading that file would say “Hmmm, the 4th ‘slot’ is now a ‘1’, and that means [thing]”. This was used for everything from node status, to process states, to primitive anticipatory scheduling. Then objects became popular. Why read out of a billboard, when you can just share flag data across objects?
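The basic mechanism fits in a few lines. This is a minimal sketch in Python, assuming one ASCII character per slot and a throwaway file in a temporary directory. The function names are illustrative, not from any particular system.

```python
import os
import tempfile


def create_billboard(path, slots):
    """Create a billboard file with every slot set to '0'."""
    with open(path, "wb") as f:
        f.write(b"0" * slots)


def post(path, slot, value=b"1"):
    """Overwrite a single one-byte slot in place."""
    with open(path, "r+b") as f:
        f.seek(slot)
        f.write(value)


def read_slot(path, slot):
    """Read the current one-byte value of a slot."""
    with open(path, "rb") as f:
        f.seek(slot)
        return f.read(1)


path = os.path.join(tempfile.mkdtemp(), "billboard")
create_billboard(path, 8)
post(path, 3)  # the 4th slot (index 3) becomes '1'
print(read_slot(path, 3))
```

Readers never need to talk to the writer. They just open the file and look at the slot they care about whenever it suits them.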

Well they’re back, it seems. Billboards have made a small-yet-noticeable resurgence in a number of systemic regions, and more than a couple system architects have noticed clients requesting solutions that boil down to billboarding (although generally given a more sexy name like “shared state file” or “offline node graph”).

I’ve had two projects in the last 6 months that have required, in general, a billboard, and am very happy to see them come back into vogue. They’ve been “gone” long enough so that the cool kids think they’re brilliant and “out-of-the-box” when they propose the “radical concept” of the “shared state file”, and those of us who’ve been doing advanced system architecture for .. gasp .. almost 15 years now, just smile and jot “billboard” in our notepad.

Systemic Billboards

In environments where there is shared (and preferably clustered) storage, billboards make a ton of sense as a means of easily communicating with other nodes. More than just 0’s and 1’s, a node can communicate an array of information about its health, and nodes can also ask other nodes to do something. Maybe setting a node state to ‘2’ means “please restart your user services” or ‘3’ means “please reboot”. Whenever a node reads its own entry it can go “Hey cool, I’m supposed to do something.” As a means of pre-failure fencing, this is exceptionally handy, as a node does not necessarily have a connection to a greater network but may still have a connection to storage. That allows a control system to tell others “Hey, something bad is happening, please shut down”, for example if one node detects a UPS failure or pending drainage.
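A node checking its own entry might look something like the sketch below. The codes for ‘2’ and ‘3’ follow the examples in the text. Everything else (the `ACTIONS` table, treating ‘0’ as “nothing requested”, and ignoring unknown codes) is an assumption made for illustration.

```python
# Hypothetical mapping of slot values to requests. Only '2' and '3'
# come from the text; '0' as "nothing requested" is an assumption.
ACTIONS = {
    ord("0"): "nothing requested",
    ord("2"): "restart user services",
    ord("3"): "reboot",
}


def requested_action(billboard: bytes, node_index: int) -> str:
    """Look up what this node has been asked to do, if anything."""
    state = billboard[node_index]
    return ACTIONS.get(state, "unknown code, ignoring")


# Billboard contents as read from shared storage: one byte per node.
snapshot = b"0230"
for node in range(len(snapshot)):
    print(node, "->", requested_action(snapshot, node))
```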

I currently have a project that requires the assumption that if bad things happen, the only communication available between nodes will be a shared IEEE1394 drive array – A better place to use a billboard does not exist.

Non-Systemic, or Quasi-Systemic Billboards 

Given an application that may have any number of processes (for example, any web-based application), using a billboard as a light IPC for each process to communicate its state or intentions can save a lot of otherwise tricksy IPC coding. Sure, you can have a pipe dangling out there, but what happens if a process needs to skip a pipe read for a given cycle, and in the meantime the status of the pipe changes? Yes, you could have one pipe per process, but then you’re looking at a mess that could be more elegantly solved with … a billboard. With 256 possible values in a single byte (I’m not even going to get into the possibilities with Unicode), you can communicate up to 255 different things per process, all sitting happily waiting to be read whenever is convenient, or necessary, with very very little side-effect.

Yes, contention issues need to be addressed by your application.

Yes, if your application is poorly designed and process spawn rates are out-of-control, a billboard will destroy your performance.

Yes, if your storage disappears, there is a problem: A big problem that a web application cannot solve, so its inability to get to the billboard could definitely be a sign that maybe it should go into a maintenance state instead of processing transactions it can’t actually handle.

Yes, if the file becomes corrupted, Bad Things could happen: The application can/should detect such problems and Do The Right Thing.
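Detecting a damaged billboard can be as simple as checking its size and contents before trusting it. The sketch below assumes a fixed number of one-byte slots and a small, hypothetical set of valid state codes. It does not address contention, which would still need locking or similar in the application.

```python
# Hypothetical set of state codes the application understands.
# frozenset(b"0123") holds the integer byte values of '0'..'3'.
VALID_STATES = frozenset(b"0123")


def validate_billboard(data: bytes, expected_slots: int) -> bool:
    """Return True only if the billboard has the right size and
    every slot holds a known state code."""
    if len(data) != expected_slots:
        return False
    return all(byte in VALID_STATES for byte in data)


for sample in (b"0012", b"00\x002", b"001"):
    status = "ok" if validate_billboard(sample, 4) else "corrupt"
    print(sample, status)
```

What “Do The Right Thing” means on a failed check is up to the application. Dropping into a maintenance state, as suggested for missing storage, is one reasonable choice.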

There are certainly situations where billboards are not the answer. There are certainly situations where using IPC or nodal communication is a better solution. I’ve never advocated billboards as the end-all of inter-process or inter-nodal communication: Only that they shouldn’t be discounted in cases where they DO make sense.

Aug 08 2007

BitTorrent Protocol and Source Closed

Now I guess it makes sense why Bram wanted to “buy all rights” from the other quasi-silent BT devs, in particular the protocol design. Shame, Bram. You should’ve been honest.

UPDATE 8/10: After a couple days to ponder this more, I’m even more disappointed than I was. While I’m a big proponent of open source, I don’t much care what Bram does with it. The source code to BT isn’t where the value is, it’s the protocol. Closing a protocol is like telling someone they’re not allowed to talk your language anymore. It would be like the Bush Administration saying “Cubans are evil, so we’re not going to allow them to speak English anymore.” Given the historical ignorance of this White House, I’d have expected that BEFORE I expected Bram to close the BT protocol.

A protocol is a culture embedded in technology: It defines etiquette between systems; it defines grammar and speech; it defines nouns and verbs. You learn a lot about humanity when you read well-written protocols: It’s not like learning a new language, it’s like being immersed in a new culture. That culture has been taken to a private island, and no one is allowed to experience it unless you’re declared fit by the czar.

On the plus side, this will spur some new innovations. BT has some serious flaws: flaws that have been well known and accepted on the grounds of ease-of-coding and portability. The paradigm can change, the flaws can be fixed, and a new format can be created, and in a Darwinistic duel, we’ll see which succeeds. Of course, this assumes that the czar doesn’t unleash the lawyer-hounds.

Protocols should never be exclusive. Never. They should be the most open parts of our technological society.
