LAUGHTON ELECTRONICS


Cheap Video à la Lancaster and the back story re: KIM-1

Why the main article is entitled Bride of Son of Cheap Video
At the dawn of the microcomputer era, back in the days of the Z-80 and the Motorola 6800, a certain Pioneer of the Art published plans for a remarkable microprocessor interface. Don Lancaster's book was called The Cheap Video Cookbook, and it explained an unorthodox technique that allowed rudimentary microcomputers like the Altair and the KIM-1 to generate video output. The novelty of Lancaster's approach impressed me deeply, and I took an important lesson to heart.

The title of my KimKlone article is a grateful acknowledgment to Mr Lancaster, and it takes the humorous style of the title he gave his supplementary book, Son of Cheap Video.


Background on Cheap Video and Lying To the Machine


The lowdown on Cheap Video.


Cheap Video is a means of outputting video without the need for DMA hardware or a Video Controller chip. Instead what's used is programmed I/O: the video is generated as the output of an actual program running on the computer. This would ordinarily be impossible (due to the very high data rate required), but Cheap Video has a trick up its sleeve, something devious done in hardware to fool the CPU. But first let's look at the software.

The video program takes the form of a loop within a loop. The innermost loop does the actual video output. Each iteration outputs one row of pixels, with timing to match one horizontal sweep of the CRT monitor. The CPU begins by doing a Jump To Subroutine (JSR) to the first address in a portion of memory defined as the video buffer. Rather than the video data, what it "sees" there is a subroutine composed almost entirely of NOPs. The CPU executes all the NOPs and then a Return From Subroutine (RTS); all this occurs simultaneously with one horizontal sweep of the CRT monitor. Following the RTS the CPU advances the buffer address (for the next JSR) and repeats the loop; there's no exit until there have been enough scans (horizontal lines) to refresh the entire screen from top to bottom. (This description applies to bit-mapped displays, the simplest case. If a Character Generator ROM is used then an intermediate level of looping is required for the multiple pixel rows that form each character.)

The outermost loop checks whether the program ought to terminate (in order to process a keystroke from the keyboard, perhaps). If there's no termination, the buffer address for the JSR is rolled back to its initial value, a Vertical Sync pulse is created by toggling an output port on and off, and the inner loop is called to produce another frame.
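To make the loop structure concrete, here is a minimal Python sketch of the two loops for the bit-mapped case. It is not Lancaster's code: the buffer address, the line length, the number of lines per frame and the should_stop callback are all hypothetical, and the JSR and the Vertical Sync toggle are merely logged rather than performed.

```python
def video_program(should_stop, buffer_start=0x2000,
                  bytes_per_line=32, lines_per_frame=4):
    """Log the events of the frame loop (outer) and the scan loop (inner).

    All parameter values are illustrative assumptions, not figures
    taken from Cheap Video itself.
    """
    log = []
    while not should_stop():                 # outer loop: one pass per frame
        scan_addr = buffer_start             # roll the JSR target back to the start
        log.append("VSYNC")                  # stands in for toggling an output port
        for _ in range(lines_per_frame):     # inner loop: one pass per scan line
            log.append(f"JSR ${scan_addr:04X}")  # CPU "executes" NOPs, then RTS
            scan_addr += bytes_per_line      # advance the buffer address
    return log


# Usage: draw two frames, then stop (say, because a key was pressed).
flags = iter([False, False, True])
for event in video_program(lambda: next(flags)):
    print(event)
```

A real implementation would have to count cycles so that each pass of the inner loop takes exactly one horizontal sweep; the sketch shows only the order of events.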

The devious hardware hoax mentioned earlier is what causes the CPU to see memory as containing NOPs rather than what's really there (the video data). Here's how it's done, and why:

The usual premise of computer operation is that when the CPU sends out an address, memory will faithfully reply with the byte stored at that address. But with Cheap Video a major connection (that between the data buses) gets temporarily severed. This lets Cheap Video "lie" about what's in memory. (See the diagrams on the left, Business as Usual and Cheap Video.) During each scan, the bytes fetched onto the memory data bus do not get relayed back to the CPU's data bus. Instead, the bytes of memory data (i.e., the characters we needed to fetch) get merrily shipped off to the video display. Meanwhile, some Cheap Video flimflam logic feeds the CPU bus a brazen fabrication: a persistent NOP (and eventual RTS) which appear to reside at the addresses actually containing data! The rationale is as follows.

Generating a video display requires that the microcomputer's memory be read byte-by-byte in a rapid sequence. Reading 32 bytes in a row is how you generate a video display which is 32 characters wide, for example, and the high speed is necessary to match the horizontal scan of the monitor's CRT.

Don Lancaster realized that a microprocessor is easily capable of reading 32 or more bytes in a row, even though conventional processing of memory variables can only proceed sporadically and in much smaller chunks. Prolonged sequences of memory reads do occur, however, as the chip fetches the bytes of its program. In fact, broadly speaking we can say that, if there are no branches in a program and no accesses to memory variables, sequential reads for instruction fetching will continue indefinitely. Therefore NOPs yield the desired "scan" behavior: an extended sequence of back-to-back reads of ascending memory locations. The CPU unwittingly mimics a 16-bit counter or a DMA controller, with its address bus outputting an ascending 16-bit count.
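The hoax and its payoff can be illustrated with a toy simulation in Python. This is a deliberately simplified sketch, not a model of the real hardware: it treats every fetch as a single read and ignores instruction timing, it assumes the fabricated RTS appears at the address just past the end of the line, and the function name run_scan and the buffer address are hypothetical. The opcode values are the 6502 encodings of NOP and RTS.

```python
NOP = 0xEA  # 6502 encoding of NOP
RTS = 0x60  # 6502 encoding of RTS


def run_scan(memory, start, width):
    """Simulate one scan line; return (address_trace, video_bytes).

    While the program counter is inside the line, the CPU is fed a NOP
    and the real memory byte is diverted to the video output. One
    address later (an assumption of this sketch) the CPU is fed an RTS.
    """
    pc = start
    address_trace, video_bytes = [], []
    while True:
        address_trace.append(pc)       # what the address bus shows
        real_byte = memory[pc]         # what memory actually replies with
        if pc - start < width:
            cpu_sees = NOP             # the lie told to the CPU
            video_bytes.append(real_byte)  # the truth goes to the display
        else:
            cpu_sees = RTS             # end of the fake subroutine
        pc += 1                        # sequential fetch, like a counter
        if cpu_sees == RTS:
            break
    return address_trace, video_bytes


# Usage: a 5-character "line" stored at a hypothetical buffer address.
memory = bytearray(0x10000)
memory[0x2000:0x2005] = b"HELLO"
trace, video = run_scan(memory, 0x2000, 5)
print([hex(a) for a in trace])
print(bytes(video))
```

The address trace simply counts upward from the start of the line, which is the counter-like behavior described above, while the CPU never sees a single byte of the text it has just delivered to the screen.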

I am indebted to Mr Lancaster for the lesson I learned from Cheap Video, namely that a microprocessor can readily be manipulated by hardware tricks in order to produce unusual behaviors that are useful. The KimKlone, of course, relies very heavily on this principle.

KIM (the original) and the KimKlone

My preliminary Cheap Video exploit was on my KIM-1, a classic, 1-MHz 6502 board from MOS Technology. That was in 1980 or so, and I ventured to add an extra wrinkle that used undefined opcodes to let the CPU access 128K of memory. As with the KimKlone, the banks were a full 64K in size and selected precisely as needed according to each bus cycle.

As I recall, the deal in that case was that each xxxxx011 op-code would load a certain bit pattern via a PROM into an 8-bit shift register. Each xxxxx011 op-code was a Prefix instruction, and the shift register would trot out the corresponding pattern, one bit per cycle, as the following instruction executed; this would be a normal 65xx memory reference instruction. The shift register output bit served as A16, the most-significant address line. The patterns were such that A16 would flip from one 64K bank to the other for a single cycle only, exactly as the memory reference instruction performed its fetch or store. (There were other capabilities as well: for instance you could JMP to the alternate bank and stay there. But the doc, and my memory of the exact details, are both vague and imperfect.)

Unlike those of the KK scheme, the KIM-1's prefixes were "dumb" about instruction timing (which of course varies according to address mode). For KIM-1 that increased the number of prefixes that had to be defined, and when programming you had to be careful to choose the prefix that'd yield the timing that matched the address mode you were gonna use! It seemed a shame to use all those undefined op-codes so inefficiently, but with the KIM-1 it didn't really matter because there was nothing else going on.
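The prefix mechanism can be sketched in Python as follows. Since the details of the original hardware are only half-remembered, everything specific here is an assumption made for illustration: the bit patterns, the MSB-first shift order, the idea that the first bit lines up with the first cycle of the following instruction, and the names PREFIX_PATTERNS, a16_per_cycle and bus_addresses. The cycle counts used (four cycles for an absolute-mode load, three for zero page) are standard 6502 timings.

```python
# Hypothetical PROM contents: one 8-bit pattern per address mode.
# A 1 bit means "drive A16 high during this cycle".
PREFIX_PATTERNS = {
    "absolute":  0b00010000,  # opcode, addr low, addr high, DATA
    "zero_page": 0b00100000,  # opcode, zp addr, DATA
}


def a16_per_cycle(pattern, cycles):
    """Shift an 8-bit pattern out MSB-first, one bit per bus cycle."""
    if not 0 <= cycles <= 8:
        raise ValueError("an 8-bit shift register covers at most 8 cycles")
    return [(pattern >> (7 - i)) & 1 for i in range(cycles)]


def bus_addresses(pattern, cpu_addresses):
    """Combine the CPU's 16-bit addresses with A16 to form 17-bit addresses."""
    bits = a16_per_cycle(pattern, len(cpu_addresses))
    return [(a16 << 16) | addr for a16, addr in zip(bits, cpu_addresses)]


# Usage: LDA $1234 located at $0200, preceded by the "absolute" prefix.
cycles = [0x0200, 0x0201, 0x0202, 0x1234]
for addr in bus_addresses(PREFIX_PATTERNS["absolute"], cycles):
    print(f"{addr:05X}")
```

Only the final cycle lands in the alternate bank; the instruction bytes themselves are still fetched from the bank the program is running in. Using the zero-page pattern with the absolute-mode instruction would flip A16 one cycle too early, which is the kind of mismatch the programmer had to avoid.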

Later I proved that the envelope can be pushed a great deal further indeed, as the KimKlone, a clean-sheet-of-paper design, demonstrates.

copyright notice (Jeff Laughton)