ON-DRIVE PROCESSING

I wasted most of the time I was going to spend writing this on fixing GophHub again, because GitHub inevitably removed newlines from their JSON output, which I was relying on to parse it in Bash. So this is the short version:

Large data storage/processing systems are based around clusters of small computers, which can execute tasks in parallel to reduce latency. When you're just doing something simple like searching for a string or moving/copying data, why not have some way to program the drive so it can do that itself? GPUs already have OpenGL for telling them to go and do repetitive mathematical transformations on their own, so why not HDDs and SSDs? Drive controllers aren't like modern GPUs in processing capacity, but just as early GPUs had simple logical manipulations that could be applied to the image buffer, drive controllers could have simple transformation routines too.

Obviously different formatting, fragmentation, and partitioning would get in the way, but provided compression or encryption isn't used, there should be some way to take advantage of this approach at a stage before the usual loops of data access/manipulation. If there were a standard way of running custom code on the drive controller, like CUDA on GPUs, you could even send the drive a minimal routine for reading/writing in the partition format.

It seems like a missed opportunity to me - instead of clusters of computer boxes full of drives, the drives become the computing clusters themselves. On a small scale maybe it wouldn't make much difference compared to adding small computer boards, but scaled up to these massive datacentres you hear about...

Anyway, this came to me while thinking about how I'd implement my big website idea. Getting the most out of hardware is important on my budget of hardly anything.
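To put a rough number on the string-search idea, here's a toy Python sketch. Everything in it is made up for illustration: `SimulatedDrive`, its `search` command, the 4096-byte block size, and the 8 bytes per returned index are all assumptions, not any real drive interface. It just counts how many bytes would cross the bus if the host reads every block and searches them itself, versus the drive scanning its own blocks and sending back only the matching block numbers. It ignores matches that straddle two blocks and the cost of sending the search string to the drive.

```python
from dataclasses import dataclass

BLOCK_SIZE = 4096  # bytes per block; an arbitrary choice for this sketch


@dataclass
class SimulatedDrive:
    """Toy model of a drive that counts bytes sent over the bus to the host."""
    blocks: list
    bytes_over_bus: int = 0

    def read_block(self, index):
        # Ordinary read: the whole block travels to the host.
        data = self.blocks[index]
        self.bytes_over_bus += len(data)
        return data

    def search(self, needle):
        # Hypothetical on-drive command: the controller scans its own blocks
        # and only the matching block indices travel to the host.
        hits = [i for i, block in enumerate(self.blocks) if needle in block]
        self.bytes_over_bus += 8 * len(hits)  # assumed 8 bytes per index
        return hits


def host_side_search(drive, needle):
    # The usual way: pull every block across the bus and search on the host.
    return [i for i in range(len(drive.blocks)) if needle in drive.read_block(i)]


def make_drive(num_blocks=256, needle=b"gopher", hit_blocks=(3, 200)):
    # Zero-filled blocks, with the needle planted at the start of a few of them.
    blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
    for i in hit_blocks:
        blocks[i] = needle + bytes(BLOCK_SIZE - len(needle))
    return SimulatedDrive(blocks)


if __name__ == "__main__":
    host_drive, smart_drive = make_drive(), make_drive()
    print(host_side_search(host_drive, b"gopher"), host_drive.bytes_over_bus)
    # prints: [3, 200] 1048576
    print(smart_drive.search(b"gopher"), smart_drive.bytes_over_bus)
    # prints: [3, 200] 16
```

Same answer both ways, but in this toy model the host-side search drags a full megabyte across the bus and the on-drive one sends 16 bytes. A real controller would be much slower at the scanning itself, so this only shows the bus traffic saved, not the overall speed.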
Unfortunately, hacking drives to add this functionality to their firmware would tie you to particular models for replacement, so really you need it to be part of the package from the manufacturer. But if you had a database and partition format in one (I'm crazy enough to be considering this), you could potentially send database commands to the drives directly. Perhaps there are some cheap microcontroller-type chips that can interface with drives while sharing the bus with another computer somehow? The main computer gives the little drive computer a command, then the drive computer stalls any further accidental drive commands from the main computer on the drive bus until it has finished the job. Presumably fancy things like DMA will make this less tidy in practice though.

- The Free Thinker