IO, IO, it’s off to work we go.
IO in computer systems, and more specifically disk IO, is probably the most overlooked and ignored cause of performance problems in modern systems. My intent through the upcoming installments of this blog post is to delve deeply into this subject and help shed some light on IO.
What is IO? In simplistic terms it is a chunk of data getting moved from one device to another. These chunks can be easily depicted by imagining moving dirt in trucks. If you take a dump truck full of dirt from one place to another, it is like one IO. Now imagine 10,000 dump trucks per second moving down the highway at light speed – that is what it’s like inside a computer.
Why is it important to think about IO in this fashion? Let me explain by using the dirt truck analogy once more. More often than not, the scooper filling the truck is matched to the size of the truck it loads. Keeping these sizes the same makes loading and unloading more efficient and cost effective. If you were to have a scooper half the size of the truck, trucks would be waiting around for a second scoop or leaving half full. If you were to have smaller trucks instead, it would take multiple trucks per scoop, which is not very efficient either. Most people don’t realize it, but these two scenarios happen inside computers all the time.
Now that the concept is hopefully clearer, let’s put some numbers to it. A standard computer hard drive is capable of sustaining around 80 IOs per second, also known as 80 IOPS. A standard IO is 4 Kilobytes, or 4KB, in size. At 80 IOPS this comes out to roughly 327 Kilobytes, or 327KB, per second (80 x 4,096 bytes). So under ideal conditions you could transfer a 327KB file in one second. However, if you are only filling each dump truck half full, you can still only process 80 of them per second, limiting your system to about 163KB per second of movable data.
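For anyone who wants to check the math, here is a minimal Python sketch of the same calculation. The function name and the fill_percent parameter are made up for illustration, and it assumes 4KB means 4,096 bytes.

```python
def useful_bytes_per_sec(iops, io_size_bytes, fill_percent=100):
    """Bytes of real data moved per second.

    iops          -- IOs (dump trucks) completed per second
    io_size_bytes -- capacity of one IO (one truck)
    fill_percent  -- how full each IO is, 0 to 100 (illustrative parameter)
    """
    # Integer math keeps the results exact for the figures used here.
    return iops * io_size_bytes * fill_percent // 100

# The hard drive from the text: 80 IOPS with 4KB (4,096-byte) IOs.
full_trucks = useful_bytes_per_sec(80, 4096)       # every truck full
half_trucks = useful_bytes_per_sec(80, 4096, 50)   # every truck half full

print(f"Full trucks: {full_trucks:,} bytes per second")  # 327,680
print(f"Half trucks: {half_trucks:,} bytes per second")  # 163,840
```

The drive does the same 80 IOs per second in both cases. Only the amount of useful data riding in each one changes.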
In typical systems we find that the dump trucks might only be filled to 10% of capacity, limiting you further. The typical response to slow file movement has been to get faster hard drives. Technology has now provided us with systems that are capable of 1 Million IOPS or more. At a million 4KB IOPS, even at 10% efficiency, you can move over 400 Million Bytes, or 400MB, in one second. These systems are rather expensive, however, with prices between $250,000 and $500,000. If you have loads of spare money, the solution is easy: buy a ridiculously fast system to process inefficient data. If you have a budget, you might want to dig into your systems and optimize your IO.
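As a rough sketch, the 10% case can be worked out the same way for both the ordinary hard drive and the million-IOPS system. The function name and the fill_percent parameter are again illustrative, and the IO size is assumed to be 4,096 bytes in both cases.

```python
def useful_bytes_per_sec(iops, io_size_bytes, fill_percent=100):
    """Bytes of real data moved per second at a given fill level (0 to 100)."""
    return iops * io_size_bytes * fill_percent // 100

IO_SIZE = 4096  # assumed: one 4KB IO is 4,096 bytes

# Trucks only 10% full, on the two kinds of systems from the text.
drive_10 = useful_bytes_per_sec(80, IO_SIZE, 10)          # 80 IOPS hard drive
array_10 = useful_bytes_per_sec(1_000_000, IO_SIZE, 10)   # 1 Million IOPS system

print(f"80 IOPS drive at 10%:    {drive_10 / 1_000:.1f} KB per second")
print(f"1 Million IOPS at 10%:   {array_10 / 1_000_000:.1f} MB per second")
```

Both systems lose the same 90% of their capacity to half-empty trucks. The expensive one just has enough trucks that you notice it less.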