Generations of Data Processing

To-morrow, and to-morrow, and to-morrow...

Macbeth, Act 5, Scene 5, 19–28

Dominic Vautier
updated 6-12
again 1-14


Unit Record - The Zero Generation

I don’t know exactly why the Unit Record era is referred to as the zero generation, but it does seem to make sense when you look at history.  Maybe it’s also because programmers have a strange way of counting from zero.  Or maybe it’s like WWI, which nobody called WWI until WWII.

Unit Record doesn't mean much today, but to many older people who were there it means a lot.  It is reverently referred to as the first stage of computer development even though there were no computers, vacuum tubes, transistors, or any emergent electronic technology.  It instead relied on common switches, relays, capacitors, resistors, solenoids, springs, and other then-available electric circuits put together in really imaginative ways.

The zero generation's building block was a card, a so-called Unit Record of information, made of thin stiff cardboard of very precise size, shape, and thickness so it could be rapidly machine-fed into readers and interpreted by electrical reading brushes.  It was more popularly known as the IBM card, the punch card, the Hollerith card, or simply the 5081.  Early Hollerith cards came in various sizes and designs, but the final design was firmly established by the late 1920s.  Each card contained up to 80 characters of either numeric or alphameric (sic) data and could be stored and organized in any pre-designed sequence.  These card files or decks could be processed by a series of machines.  The card was known as a unit record because it contained one unit of data, one collection of homogeneous information.
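
To make the card's character coding concrete, here is a small sketch of the Hollerith zone-plus-digit scheme in Python (a modern language used purely for illustration; the codes for special characters varied over the years and are left out).

    # A sketch of the Hollerith zone-plus-digit code on the 80-column card.
    # The twelve punch rows are named 12, 11, 0, and 1-9; a letter is a
    # zone punch combined with a digit punch, a digit is a single punch.
    hollerith = {}
    for i, ch in enumerate("ABCDEFGHI"):
        hollerith[ch] = (12, i + 1)    # zone 12 plus digit 1-9
    for i, ch in enumerate("JKLMNOPQR"):
        hollerith[ch] = (11, i + 1)    # zone 11 plus digit 1-9
    for i, ch in enumerate("STUVWXYZ"):
        hollerith[ch] = (0, i + 2)     # zone 0 plus digit 2-9
    for d in "0123456789":
        hollerith[d] = (int(d),)       # a single punch in rows 0-9
    print(hollerith["A"], hollerith["Z"], hollerith["7"])  # (12, 1) (0, 9) (7,)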

Hollerith immediately seized upon the idea that some information changed while other information was fairly permanent.  He came up with a way to store the permanent data separately from the more changeable data.  The permanent data could then be used over and over again.  This was a pretty brilliant idea.  Permanent data was referred to variously as master files or main files or customer files or employee files.  Less permanent data that changed daily or weekly was called all kinds of other things, such as transaction files, secondary files, update files, time cards, location cards, work-in-process cards, deduction cards, etc., because its contents were transitory.  A paper invoice document contained fixed customer information along with time and other billing data.  Separating and digitizing the permanent customer data from the rest of the data meant a huge savings in time and a gain in accuracy.  All that was needed to tie things together was a customer number or invoice number or billing number or some kind of tag.
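
The mechanics are easy to see in modern terms.  Here is a minimal sketch in Python (all names and fields hypothetical) of the idea: permanent master data stored once under a key, and transient transaction records carrying only the key plus whatever changes.

    # Master file: permanent data, keyed by customer number.
    masters = {
        "C1001": {"name": "Acme Hardware", "address": "12 Main St"},
        "C1002": {"name": "Baker Supply",  "address": "9 Elm Ave"},
    }

    # Transaction file: transitory data that carries only the key.
    transactions = [
        {"customer": "C1001", "hours": 8.0, "amount": 125.50},
        {"customer": "C1002", "hours": 3.5, "amount": 42.00},
    ]

    for t in transactions:
        m = masters[t["customer"]]   # the tag ties the two files together
        print(m["name"], m["address"], t["amount"])

The dictionary lookup here stands in for what a collator did mechanically: matching two sorted card decks on a common field.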

The separation of common data into groups or files was a key idea in data processing because it saved time, made reporting faster, reduced errors, and avoided duplication, fraud, abuse, and waste.  All this will be discussed again under Unit Record, but this concept of files and file structures was the beginning of the indexed databases used today.

The zero generation was resilient and imaginative, and it lasted for quite a long time, through two world wars and then some.  It was around for many years before what we understand today as modern information systems, and it continued to be around long after reliable computers finally did show up and began slowly taking over.  Unit Record applications lasted well into third generation computing and even beyond.  There was a lot of overhang.  Today special applications still exist involving punched cards that carry embedded codes or microfilm; some, called aperture cards, are still used because nobody can figure out any better way to do it.  Sometimes a method gets to be so good that its replacement is a step backward.  Air traffic controllers pass around little blocks of wood representing flights.  Aircraft carriers use scale model planes for hangar placement.  Not everything can be computer simulated.

In the early days of electronic data management, solid state technological leaps were more like crawls than actual leaps, and they were viewed with much skepticism by the old-timers who did Unit Record work, and with good reason, because the old ways remained very reliable for most companies.  As the new brands of “computers” came on line, they were used more for bragging rights than for any substantial effect on company operations.  Because of the conservative nature of Unit Record shops, new computers were looked upon with mistrust.  TAB men viewed the new young uppity programmers with scorn.  Unit Record continued to dominate data processing until computers could actually prove themselves.

But computers did gradually prove themselves and were integrated into systems, though only in small increments, and it was slow going.  At first they were tied somewhat loosely into existing card applications, and only because of their multiplication and division capabilities, which accounting machines did not have.  Accounting machines could add and subtract just fine, but not much more.

A common cartoon of the time showed a front office with shiny new computers and clean-shaven young men smartly at work doing nothing.  Meanwhile in the back room, out of sight, a bunch of old guys in green eyeshades ran all the key corporate functions on rows of card sorters and collators.

I remember instances where time cards were punched out on a reproducing machine in a TAB operation.  The punched cards were then read into a computer that extended out (multiplied) certain numbers like pay and overtime hours worked.  The computer punched a similar deck with the new calculated fields added, and that deck was fed back into the existing TAB process.  That was a typical integration of computer functions into a TAB operation: not very efficient, but it worked.  And it was what you would consider "integrating".
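
In modern terms that extension step amounts to little more than the following sketch (Python, with hypothetical column layouts; the real deck was fixed-format 80-column card images).

    # A rough sketch of the card "extension" step described above.  Assumed
    # layout: columns 1-5 employee number, 6-9 hours worked and 10-13 hourly
    # rate, both with two implied decimal places.
    def extend_card(card: str) -> str:
        hours = int(card[5:9]) / 100       # "0800" -> 8.00 hours
        rate = int(card[9:13]) / 100       # "1250" -> 12.50 per hour
        gross = round(hours * rate * 100)  # gross pay, two implied decimals
        # "punch" a new card image with the calculated field appended
        return (card[:13] + str(gross).rjust(7, "0")).ljust(80)

    deck_in = ["00042" + "0800" + "1250", "00043" + "0400" + "1250"]
    deck_out = [extend_card(c.ljust(80)) for c in deck_in]
    for c in deck_out:
        print(c.rstrip())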

One fairly progressive company I worked for was solely dependent on Unit Record to do its payroll, and one old TAB guy was the only one who knew how the system worked.  After 25 years he finally came down sick and payroll was late.  Needless to say, the payroll system was redone to run on IBM computers.

A more complete discussion of Unit Record can be found here.


First generation

The advent of vacuum tube technology was the first visible modern step in computing.  These first generation computer attempts did not work well, were not reliable, and are today usually consigned to museums, scrap heaps, or old Batman movies.  The period remains pretty much a reminder of our hasty, fervent, but nonetheless failed efforts to improve information processing.  Because of World War Two everything had to be done fast, and it was a time when more effort was spent on results than on real research.  Attempts to apply wonderful scientific discoveries to standard business practice were unsuccessful.  The first generation was a primitive attempt at best, and it offered little to business.

Computers were so slow that you could practically stand and watch one go through its cycles.  The languages in use were directly machine based and definitely inscrutable to most people.  They were all very low level (metal scratchers) and had very little significance to practical everyday business data processing needs.

Incidentally, most of the huge amount of work done in great secrecy under the ULTRA project during WWII was zero generation technology.  The electromechanical Bombes involved no vacuum tube development at all; they were cogs, wheels, relays, and electric circuits.  (The later Colossus machines were the exception and did use tubes.)

Reliability was the biggest problem with first generation vacuum tube computers.  Stories are told of these early machines and their propensity to break down or get “buggy” at any time and at the least provocation.  Being tube based, they were particularly prone to failure because of the aggregated heat and corrosion caused by the close placement of so many tube elements.  The heat also attracted insects.  One urban legend describes how a dead moth shorted a relay, causing a failure and giving rise to the term "bug".  These monster machines were loaded with dead bugs.

You also hear stories of how it was necessary to tip-toe around the giant Gotham-like boxes, or to wear slippers, in order not to jar or disturb the tubes.  During maintenance, large carts of new tubes were pushed around while specialists replaced any vacuum tube that looked suspicious or foggy before it could actually fail.  When a computer stopped, an onsite crew was there with tube carts to begin looking for the smoking gun (tube), or anything else that didn't quite smell right, like a blackened resistor, a blown-up capacitor, or too many dead bugs.

I think there was quite a gap between the very large computers of this time, like ENIAC, and smaller business computers.  The large scientific experimental ENIAC was just that--scientific and experimental.


Second generation

The advent of solid state technology ushered in a better and, most important, a cooler running and more reliable computer.  The famous IBM 1401 and the much faster IBM 7090 series revolutionized computing to a great extent, although few mainline card oriented applications were immediately affected, and they continued to use Unit Record for their day-to-day work.  With second generation technology, however, there was greater reliability and a higher level of interest from the business community.  But this next wave of computer technology still left much to be desired.  Programmers still needed to use low level programming languages, in IBM's case Autocoder.  It was low level because its statements corresponded directly to actual machine code, which is why such programming is sometimes referred to as “metal scratching”.  Programmers had to have a good knowledge of just how the machine decoded and executed its instruction set.

By this time IBM had gained significant market share.  People referred to the company as Snow White because it controlled most of business computing while seven other companies, the Seven Dwarfs, struggled over what was left: Burroughs, NCR, Univac, Honeywell, Control Data, RCA, and GE.  This complete dominance continued for many years.

I programmed extensively in Autocoder, the IBM assembly language for its second generation machines.  I found it useful, fun, and actually quite powerful, but it did require a good knowledge of the internals of the machine: calculations, interrupts, I/O, channel waits (TIOBs), memory access, file structure, bit manipulation.  The code was hardly portable.  Each machine needed its own programming that conformed to its hardware and device configuration.  Even a simple data move took at least two operations: a SWM command (set word mark) followed by a MWM (move to word mark).  These were some of the complexities and pains of Autocoder.  But it was so much fun to do.
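
For readers who never saw a word mark, here is a toy simulation in Python (not Autocoder) of why the move took two steps: each memory position carries a character plus a word-mark bit, and the move copies until it reaches a mark, so the mark has to be set before the move knows where to stop.  (On the real machine, fields were addressed from the low-order end and scanned toward the high-order word mark; this sketch simplifies the direction.)

    # Toy 1401-style core: each position holds [character, word-mark bit].
    memory = [[" ", False] for _ in range(40)]

    def swm(addr):                       # set word mark
        memory[addr][1] = True

    def store(addr, text):               # load a field for the demo
        for i, ch in enumerate(text):
            memory[addr + i][0] = ch

    def mwm(src, dst):                   # move to word mark
        while True:
            memory[dst][0] = memory[src][0]
            if memory[src][1] or memory[dst][1]:   # stop at the first mark
                break
            src += 1
            dst += 1

    store(0, "PAYROLL")
    swm(6)        # mark the end of the field first...
    mwm(0, 20)    # ...so the move knows where to stop
    print("".join(m[0] for m in memory[20:27]))    # -> PAYROLL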

Of all the restrictions that plagued programmers, the worst was the shortage of core memory.  As often as not we had to work with one “k” of core (1,024 bytes) or even less.  We counted our bytes for sure.  There’s not a lot that can be done with that much core.  We had to make our programs extremely small and devise methods to reduce size, often reusing the same core by overlaying instruction sets or modifying previously executed instruction areas.  It was a time of innovation at a really low level.  Such programming logistics were quite real and required much additional planning that had nothing to do with the actual problem being solved.  A lot of programming time was spent just getting things to work in the restricted computer environment available.

But then everything suddenly changed:

Things really took off during the third generation.  Those were the defining days for IBM.