The New Peter Norton Programmer's Guide to the IBM(R) PC and PS/2(R)


════════════════════════════════════════════════════════════════════════════

By Peter Norton and Richard Wilton

════════════════════════════════════════════════════════════════════════════

PUBLISHED BY
Microsoft Press
A Division of Microsoft Corporation
16011 NE 36th Way, Box 97017, Redmond, Washington 98073-9717

First edition copyright (C) 1985 by Peter Norton
Second edition revisions copyright (C) 1988 by Microsoft Press
All rights reserved. No part of the contents of this book may be reproduced
or transmitted in any form or by any means without the written permission of
the publisher.

Library of Congress Cataloging in Publication Data

Norton, Peter, 1943- .
The new Peter Norton programmer's guide to the IBM PC and PS/2 :
the ultimate reference guide to the entire family of IBM Personal
Computers / Peter Norton and Richard Wilton.

        p.        cm.

Includes index.

1. IBM microcomputers--Programming.  2. IBM Personal Computer.
3. IBM Personal System/2 (Computer system)  I. Wilton, Richard, 1953- .
II. Title.  III. Title: Programmer's guide to the IBM Personal Computers.
QA76.8.I1015N67    1988    88-21104
005.265--dc19                CIP
ISBN 1-55615-131-4

Printed and bound in the United States of America.

1 2 3 4 5 6 7 8 9  FGFG  3 2 1 0 9 8

Distributed to the book trade in the United States by Harper & Row.

Distributed to the book trade in Canada by General Publishing Company, Ltd.

Distributed to the book trade outside the United States and Canada by
Penguin Books Ltd.

Penguin Books Ltd., Harmondsworth, Middlesex, England
Penguin Books Australia Ltd., Ringwood, Victoria, Australia
Penguin Books N.Z. Ltd., 182-190 Wairau Road, Auckland 10, New Zealand

British Cataloging in Publication Data available

Microsoft(R), Flight Simulator(R), and GW-BASIC(R) are registered trademarks
of Microsoft Corporation.

IBM(R), PC/AT(R), Personal System/2(R), and PS/2(R) are registered
trademarks, and Micro Channel(TM), PCjr(TM), and PC/XT(TM) are trademarks of
International Business Machines Corporation.

Norton Utilities(TM) is a trademark of Peter Norton.

────────────────────────────────────────────────────────────────────────────
Project Editor: Megan E. Sheppard
Technical Editors: Bob Combs and Jim Johnson
────────────────────────────────────────────────────────────────────────────



────────────────────────────────────────────────────────────────────────────
Contents

    Introduction

    1       Anatomy of the PCs and PS/2s
    2       The Ins and Outs
    3       The ROM Software
    4       Video Basics
    5       Disk Basics
    6       Keyboard Basics
    7       Clocks, Timers, and Sound Generation
    8       ROM BIOS Basics
    9       ROM BIOS Video Services
    10      ROM BIOS Disk Services
    11      ROM BIOS Keyboard Services
    12      Miscellaneous Services
    13      ROM BIOS Services Summary
    14      DOS Basics
    15      DOS Interrupts
    16      DOS Functions: Version 1
    17      DOS Functions: Versions 2.0 and Later
    18      DOS Functions Summary
    19      Program Building
    20      Programming Languages

    Appendix A: Installable Device Drivers
    Appendix B: Hexadecimal Arithmetic
    Appendix C: About Characters
    Appendix D: DOS Version 4

    Index



────────────────────────────────────────────────────────────────────────────
Introduction

    The world of personal computers has come a long way in the few years since
    the original edition of this book appeared, yet the goal of this book
    remains a simple but ambitious one: to help you master the principles of
    programming the IBM personal computer family. From the time that the first
    IBM Personal Computer (known simply as "the PC") was introduced in the
    fall of 1981, it was clear that it was going to be a very important
    computer. Later, as PC sales zoomed beyond the expectations of everyone
    (IBM included) and as the original model was joined by a sibling or two,
    the PC became recognized as the standard for serious desktop computers.
    From the original PC, a whole family of computers──a family with many
    branches──has evolved, and the importance of the PC family has grown
    along with it.

    The success and significance of the PC family have made the development
    of programs for it very important. However, because each member of the
    family differs in details and characteristics from its relatives,
    developing programs for the whole family has become increasingly complex.

    This book is about the knowledge, skills, and concepts that are needed to
    create programs for the PC family──not only for one member of the family
    (though you might cater to the peculiarities and quirks of one
    member) but for the family as a whole──in a way that is universal enough
    that your programs should work not only on all the present family members,
    but on future members as well.

    This book is for anyone involved in the development of programs for the PC
    family. It is for programmers, but not only for programmers. It is for
    anyone who is involved in or needs to understand the technical details and
    working ideas that are the basis for PC program development, including
    anyone who manages programmers, anyone who plans or designs PC programs,
    and anyone who uses PC programs and wants to understand the details behind
    them.


Some Comments on Philosophy

    One of the most important elements of this book is the discussion of
    programming philosophy. You will find throughout this book explanations of
    the ideas underlying IBM's design of the PC family and of the principles
    of sound PC programming, viewed from experience.

    If this book were to provide you with only facts──tabulations of technical
    information──it would not serve you well. That's why we've interwoven with
    the technical discussion an explanation of what the PC family is all
    about, of the principles that tie the various family members together, and
    of the techniques and methods that can help you produce programs that will
    endure and prosper along with the PC family.


How to Use This Book

    This book is both a reading book and a reference book, and you can
    approach it in at least two ways. You may want to read it as you would any
    other book, from front to back, digging in where the discussion is useful
    to you and quickly glancing through the material you don't yet need. This
    approach provides a grand overview of the workings (and the ideas behind
    the workings) of PC programs. You can also use this book as a pure
    reference, and dip into specific chapters for specific information. We've
    provided a detailed table of contents at the beginning of each chapter and
    an extensive index to help you find what you need.

    When you use this book as a random-access reference to the details of PC
    programming, you'll find that much of the material is intricately
    interrelated. To help you understand the interrelationships, we have
    repeated some details when it was practical to do so and have referred you
    to other sections when such repetition was less practical.

What's New in This Edition

    As you might guess, this edition of the Programmer's Guide has been
    brought up to date for the new generation of IBM personal computers: the
    Personal System/2 computers, or PS/2s.

    In some ways this book is more complex and more detailed than the
    original. There's a good reason for this: The newer members of the PC and
    PS/2 family are more complicated computers, and the later versions of DOS
    are more complicated and have more features than their predecessors. It
    was inevitable that this revised version of the Programmer's Guide would
    reflect this greater complexity in the hardware, the ROM BIOS, and DOS.

    Still, you'll find that a few members of the extended PC family aren't
    covered in this book. The PCjr, the XT/286, and the PC Convertible are
    used relatively infrequently, and the PS/2 Model 70 was released too
    recently to be included. Nevertheless, each of these machines is similar
    to one of the PCs or PS/2s whose innards we will examine in detail, so
    this book should be a useful guide even if you are programming a Model 70
    or one of the less widely used PCs.

    Here are some of the changes you'll find in this new edition:

    New video subsystems. Since the original edition appeared, IBM's Enhanced
    Graphics Adapter (EGA) became a de facto hardware standard for PC
    programmers and users. Then the PS/2s introduced two new video subsystems,
    the Multi-Color Graphics Array (MCGA) and the Video Graphics Array (VGA).
    These new video subsystems receive extensive coverage in Chapters 4 and
    9.

    New keyboards. IBM supports a new, extended keyboard with later versions
    of the PC/AT and with all PS/2s. Chapters 6 and 11 have been expanded to
    cover the new hardware.

    A new focus on C programming. For better or worse, the most recent
    versions of DOS have been strongly influenced by the C programming
    language. This influence is even more apparent in such operating
    environments as Microsoft Windows, UNIX, and OS/2──all of which were
    designed by C programmers. For this reason you'll find new examples of C
    programming in several different chapters. Of course, we haven't abandoned
    Pascal and BASIC──in fact, Chapter 20 examines each of these programming
    languages.

    A new perspective on DOS. DOS has evolved into a mature operating system
    whose design can now be viewed with the clarity of hindsight. The past
    several years of working with DOS have helped us view this immensely
    popular operating system with a practical perspective born of experience.
    Our discussions of DOS emphasize which of its features are obsolescent and
    which are pointers to the future.

    Despite these changes, the direction and philosophy of this book remain
    the same. When you write a program for a PC or PS/2, you can actually
    program for an entire family of computers. Each member of the family──the
    PC, the PC/XT, the PC/AT, and all PS/2s──has hardware and software
    components that are identical or similar to those in other members of the
    family. When you keep this big picture in mind, you'll be able to write
    programs that take advantage of the capabilities of the different PC and
    PS/2 models without sacrificing portability.


Other Resources

    One book, of course, can't provide you with all the knowledge that you
    might possibly need. We've made this book as rich and complete as we
    reasonably can, but there will always be a need for other kinds of
    information. Here are some of the places you might look for material to
    supplement what you find here.

    For detailed technical information about the PC family, the ultimate
    source is IBM's series of technical reference manuals. Specific technical
    reference manuals exist for the original PC, for the XT, for the AT, and
    for PS/2 models 30, 50, 60, and 80. In addition, the detailed IBM BIOS
    Interface Technical Reference Manual summarizes the capabilities of the
    Basic Input/Output System in all members of the extended PC family. You
    should know a few things about using these model-specific manuals:

    ■  Information specific to one model is not differentiated from general
        information for the whole PC family. To be sure of the differences, you
        should use common sense, compare the different manuals, and consult
        this book.

    ■  Remember that each new model in the PC family adds new features. If you
        turn to the manual for a later model, you will find information on a
        wide variety of features; if you turn to the manual for an earlier
        model, you'll avoid being distracted by features that do not apply to
        all models in the family.

    There is also an IBM Options and Adapters Technical Reference Manual for
    the various options and adapters used by the PC family, such as different
    disk drives or display screens. Technical information about this kind of
    equipment is gathered into this manual, which is updated periodically.
    (The updates are available by subscription.) Little of the information in
    this technical reference manual is of use to programmers, but you might
    find some parts of interest.

    IBM also publishes technical reference manuals for special extensions to
    the PC, such as PC Network.

    Perhaps the most important of the IBM technical reference manuals is the
    series for DOS. These manuals contain a wealth of detailed technical
    information, which we have summarized in this book.

    A number of other sources can provide information to supplement the IBM
    manuals:

    ■  For a somewhat broader perspective on the IBM Personal Computer──one
        that is not focused on programming──see Peter Norton's Inside the IBM
        Personal Computer, published by Robert J. Brady Company.

    ■  For a broader perspective on DOS, see the third edition of Van
        Wolverton's Running MS-DOS, and The MS-DOS Encyclopedia, both published
        by Microsoft Press.

    Because this book covers the subject of PC programming in a broad fashion,
    it can provide you with only a few key details about individual
    programming languages. For details on particular programming languages and
    the many specific compilers for those languages, you will need more books
    than we could begin to list or recommend.

    With these introductory remarks completed, it's time to plunge into the
    task of mastering the principles of programming the PC family!



────────────────────────────────────────────────────────────────────────────
Chapter 1  Anatomy of the PCs and PS/2s

    The Microprocessor
        The 8088 Microprocessor
        The 8086 Microprocessor
        The 80286 Microprocessor
        The 80386 Microprocessor
        The Math Coprocessor

    The Support Chips
        The Programmable Interrupt Controller
        The DMA Controller
        The Clock Generator
        The Programmable Interval Timer
        Video Controllers
        Input/Output Controllers

    Linking the Parts: The Bus
        The Address Bus
        The Data Bus
        Micro Channel Architecture

    Memory
        CPU Address Space
        The System Memory Map

    Design Philosophy

    From the programmer's point of view, all members of the PC family consist
    of a processor, memory chips, and several smart, or programmable, circuit
    chips. All the main circuit components that make the computer work are
    located on the system board; other important parts are located on
    expansion boards, which can be plugged into the system board.

    The system board (Figures 1-1 through 1-3) contains the microprocessor,
    which is tied to at least 64 KB of memory; some built-in ROM programs,
    such as BASIC and the ROM BIOS; and several very important support chips.
    Some of these chips control external devices, such as the disk drive or
    the display screen, and others help the microprocessor perform its tasks.

    In this section, we discuss each major chip and give a few important
    technical specifications. These chips are frequently known by more than
    one name. For example, some peripheral input/output hardware is supervised
    by a chip known as the 8255. This chip is also referred to as the 8255A
    and the 8255A-5. The suffixes A and 5 refer to revision numbers and to
    parts rated for operation at different speeds. For programming purposes,
    any Intel chip part number that starts with 8255 is identical to any other
    chip whose part number starts with 8255, regardless of the suffix.
    However, when you replace one of these chips on a circuit board, note the
    suffix. If the suffixes are different, the part may not operate at the
    proper speed.


The Microprocessor

    In all PCs, the microprocessor is the chip that runs programs. The
    microprocessor, or central processing unit (CPU), carries out a variety of
    computations, numeric comparisons, and data transfers in response to
    programs stored in memory.

    The CPU controls the computer's basic operation by sending and receiving
    control signals, memory addresses, and data from one part of the computer
    to another along a group of interconnecting electronic pathways called a
    bus. Located along the bus are input and output (I/O) ports that connect
    the various memory and support chips to the bus. Data passes through these
    I/O ports while it travels to and from the CPU and the other parts of the
    computer.

    In the IBM PCs and PS/2s, the CPU always belongs to the Intel 8086 family
    of microprocessors. (See Figure 1-4.) We'll point out the similarities
    and differences between the different microprocessors as we describe them.

    ┌────────────────────────────────────────────────────────────────────────┐
    │ Figure 1-1 can be found on p.3 of the printed version of the book.     │
    └────────────────────────────────────────────────────────────────────────┘

    Figure 1-1.  The IBM PC system board.

    ┌────────────────────────────────────────────────────────────────────────┐
    │ Figure 1-2 can be found on p.4 of the printed version of the book.     │
    └────────────────────────────────────────────────────────────────────────┘

    Figure 1-2.  The PC/AT system board.

    ┌────────────────────────────────────────────────────────────────────────┐
    │ Figure 1-3 can be found on p.5 of the printed version of the book.     │
    └────────────────────────────────────────────────────────────────────────┘

    Figure 1-3.  The PS/2 Model 60 system board.

    Model                                Microprocessor
    ──────────────────────────────────────────────────────────────────────────
    PC                                   8088
    PC/XT                                8088
    PC/AT                                80286
    PS/2 Models 25, 30                   8086
    PS/2 Models 50, 60                   80286
    PS/2 Model 80                        80386
    ──────────────────────────────────────────────────────────────────────────

    Figure 1-4.  Microprocessors used in IBM PCs and PS/2s.

The 8088 Microprocessor

    The 8088 is the 16-bit microprocessor that controls the standard IBM
    personal computers, including the original PC, the PC/XT, the Portable PC,
    and the PCjr. Almost every bit of data that enters or leaves the computer
    passes through the CPU to be processed.

    Inside the 8088, 14 registers provide a working area for data transfer and
    processing. These internal registers, forming an area 28 bytes in size,
    are able to temporarily store data, memory addresses, instruction
    pointers, and status and control flags. Through these registers, the 8088
    can access 1 MB (megabyte), or more than one million bytes, of memory.
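
    The arithmetic behind these figures can be checked with a short C
    sketch. The register names below are the conventional Intel names for
    the 8086 family (registers are described in Chapter 2); the structure
    Regs8088 and the helper functions are illustrative names of our own, not
    part of any IBM or DOS interface. We also assume that "1 MB" means 2 to
    the 20th power bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* A model of the fourteen 16-bit registers of the 8088/8086.
   It exists only to count registers and bytes; a program does not
   reach the real registers through a structure like this. */
typedef struct {
    uint16_t ax, bx, cx, dx;   /* general-purpose data registers */
    uint16_t sp, bp, si, di;   /* pointer and index registers    */
    uint16_t cs, ds, ss, es;   /* segment registers              */
    uint16_t ip;               /* instruction pointer            */
    uint16_t flags;            /* status and control flags       */
} Regs8088;

/* Number of 16-bit registers in the model. */
size_t reg_count(void)
{
    return sizeof(Regs8088) / sizeof(uint16_t);
}

/* Size of the whole register area in bytes. */
size_t reg_bytes(void)
{
    return sizeof(Regs8088);
}

/* Size of a 1 MB address space (assumed to be 2^20 bytes). */
unsigned long address_space_bytes(void)
{
    return 1UL << 20;
}
```

    Here reg_count() and reg_bytes() return 14 and 28, and
    address_space_bytes() returns 1,048,576, which is the "more than one
    million bytes" mentioned above.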

The 8086 Microprocessor

    The 8086 is used in the PS/2 models 25 and 30 (and also in many IBM PC
    clones). The 8086 differs from the 8088 in only one minor respect: It uses
    a full 16-bit data bus instead of the 8-bit bus that the 8088 uses. (The
    difference between 8-bit and 16-bit buses is discussed in "The Data Bus"
    later in this chapter.)
    Virtually anything that you read about the 8086 also applies to the 8088;
    for programming purposes, consider them identical.

The 80286 Microprocessor

    The 80286 is used in the PC/AT and in the PS/2 models 50 and 60. Although
    fully compatible with the 8086, the 80286 supports extra programming
    features that let it execute programs much more quickly than the 8086.
    Perhaps the most important enhancement in the 80286 is its support for
    multitasking.

    Multitasking is the ability of a CPU to perform several tasks at a time──
    such as printing a document and calculating a spreadsheet──by quickly
    switching its attention among the controlling programs.

    The 8088 used in a PC or PC/XT can support multitasking with the help of
    sophisticated control software. However, an 80286 can do a much better job
    of multitasking because it executes programs more quickly and addresses
    much more memory than the 8088. Moreover, the 80286 was designed to
    prevent tasks from interfering with each other.

    The 80286 can run in either of two operating modes: real mode or protected
    mode. In real mode, the 80286 is programmed exactly like an 8086. It can
    access the same 1 MB range of memory addresses as the 8086. In protected
    mode, however, the 80286 reserves a predetermined amount of memory for an
    executing program, preventing that memory from being used by any other
    program. This means that several programs can execute concurrently without
    the risk of one program accidentally changing the contents of another
    program's memory area. An operating system using 80286 protected mode can
    allocate memory among several different tasks much more effectively than
    can an 8086-based operating system.

The 80386 Microprocessor

    The PS/2 Model 80 uses the 80386, a faster, more powerful microprocessor
    than the 80286. The 80386 supports the same basic functions as the 8086
    and offers the same protected-mode memory management as the 80286.
    However, the 80386 offers two important advantages over its predecessors:

    ■  The 80386 is a 32-bit microprocessor with 32-bit registers. It can
        perform computations and address memory 32 bits at a time instead of 16
        bits at a time.

    ■  The 80386 offers more flexible memory management than the 80286 and
        8086.

    We'll say more about the 80386 in Chapter 2.

The Math Coprocessor

    The 8086, 80286, and 80386 can work only with integers. To perform
    floating-point computations on an 8086-family microprocessor, you must
    represent floating-point values in memory and manipulate them using only
    integer operations. During compilation, the language translator represents
    each floating-point computation as a long, slow series of integer
    operations. Thus, "number-crunching" programs can run very slowly──a
    problem if you have a large number of calculations to perform.
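
    To give a feel for what "manipulating floating-point values with integer
    operations" involves, here is a minimal C sketch that works directly on
    the bit pattern of a 32-bit floating-point value. It assumes the common
    single-precision layout (1 sign bit, 8 exponent bits biased by 127, and
    23 fraction bits). The function names are hypothetical, and only
    normalized values are handled; a real emulation library must also deal
    with zero, rounding, overflow, and full addition and multiplication,
    which is where the long series of integer operations comes from.

```c
#include <stdint.h>

#define SIGN_MASK  0x80000000u   /* bit 31: sign                 */
#define EXP_MASK   0x7F800000u   /* bits 23-30: biased exponent  */
#define EXP_SHIFT  23
#define EXP_BIAS   127

/* Return the unbiased exponent of a normalized value. */
int soft_exponent(uint32_t bits)
{
    return (int)((bits & EXP_MASK) >> EXP_SHIFT) - EXP_BIAS;
}

/* Negate a value by flipping its sign bit. */
uint32_t soft_negate(uint32_t bits)
{
    return bits ^ SIGN_MASK;
}

/* Multiply a value by 2 by adding 1 to its exponent field.
   Values that are not normalized, or whose result would overflow
   the exponent field, are returned unchanged in this sketch. */
uint32_t soft_double(uint32_t bits)
{
    uint32_t e = (bits & EXP_MASK) >> EXP_SHIFT;
    if (e == 0 || e >= 254) {
        return bits;
    }
    return bits + (1u << EXP_SHIFT);
}
```

    For example, the pattern 3F800000H represents 1.0 in this layout;
    soft_double() turns it into 40000000H (2.0) and soft_negate() turns it
    into BF800000H (-1.0), without a single floating-point instruction.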

    A good solution to this problem is to use a separate math coprocessor that
    performs floating-point calculations. Each of the 8086-family
    microprocessors has an accompanying math coprocessor: The 8087 math
    coprocessor is used with an 8086 or 8088; the 80287 math coprocessor is
    used with an 80286; and the 80387 math coprocessor is used with an 80386.
    (See Figure 1-5.) Each PC and PS/2 is built with an empty socket on its
    system board into which you can plug a math coprocessor chip.

    From a programmer's point of view, the 8087, 80287, and 80387 math
    coprocessors are fundamentally the same: They all perform arithmetic with
    a higher degree of precision and with much greater speed than is usually
    achieved with integer software emulation. In particular, programs that use
    math coprocessors to perform trigonometric and logarithmic operations can
    run up to 10 times faster than their counterparts that use integer
    emulation.

    Programming these math coprocessors in assembly language can be an
    exacting process. Most programmers rely on high-level language translators
    or commercial subroutine libraries when they write programs to run with
    the math coprocessors. The techniques of programming the math coprocessors
    directly are too specialized to cover in this book.

                            Approximate Range                                Si
    Data Type            (from)               (to)                 Bits      (d
    ───────────────────────────────────────────────────────────────────────────
    Word integer         -32,768              +32,767              16
    Short integer        -2 x 10^9            +2 x 10^9            32
    Long integer         -9 x 10^18           +9 x 10^18           64
    Packed decimal       -99...99             +99...99             80
    Short real           8.43 x 10^-37        3.37 x 10^38         32
    Long real            4.19 x 10^-307       1.67 x 10^308        64        15
    Temporary real       3.4 x 10^-4932       1.2 x 10^4932        80
    ───────────────────────────────────────────────────────────────────────────


    Figure 1-5.  The range of numeric data types supported by the 8087,
    80287, and 80387 math coprocessors.
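
    The integer ranges in Figure 1-5 follow directly from the bit widths.
    The following C sketch computes them, assuming two's-complement
    representation; the function names are our own and are used only for
    illustration.

```c
#include <stdint.h>

/* Smallest value of a two's-complement integer that is `bits` wide.
   Widths outside 2..64 are rejected by returning 0. */
int64_t int_min_for_bits(int bits)
{
    if (bits < 2 || bits > 64) {
        return 0;
    }
    if (bits == 64) {
        return INT64_MIN;              /* avoid shifting into the sign bit */
    }
    return -((int64_t)1 << (bits - 1));
}

/* Largest value of a two's-complement integer that is `bits` wide. */
int64_t int_max_for_bits(int bits)
{
    if (bits < 2 || bits > 64) {
        return 0;
    }
    if (bits == 64) {
        return INT64_MAX;
    }
    return ((int64_t)1 << (bits - 1)) - 1;
}
```

    A width of 16 gives -32,768 through +32,767; a width of 32 gives
    limits of about 2 x 10^9 in magnitude; and a width of 64 gives limits
    of about 9 x 10^18, in agreement with the first three rows of the
    figure.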


The Support Chips

    The microprocessor cannot control the entire computer without some
    help──nor should it. By delegating certain control functions to other
    chips, the CPU is free to attend to its own work. These support chips
    can be responsible for such processes as controlling the flow of
    information throughout the internal circuitry (as the interrupt
    controller and the DMA controller are) and controlling the flow of
    information to or from a particular device (such as a video display or
    disk drive) attached to the computer. These so-called device controllers
    are often mounted on a separate board that plugs into one of the PC's
    expansion slots.

    Many support chips in the PCs and PS/2s are programmable, which means
    they can be manipulated to perform specialized tasks. Although direct
    programming of these chips is generally not a good idea, the following
    descriptions will point out which chips are safe to program directly and
    which aren't. Because this book does not cover direct hardware control,
    you should look in the IBM technical manuals as well as in the chip
    manufacturers' technical literature for details about programming
    individual chips.

The Programmable Interrupt Controller

    In a PC or PS/2, one of the CPU's essential tasks is to respond to
    hardware interrupts. A hardware interrupt is a signal generated by a
    component of the computer, indicating that component's need for CPU
    attention. For example, the system timer, the keyboard, and the disk
    drive controllers all generate hardware interrupts at various times. The
    CPU responds to each interrupt by carrying out an appropriate
    hardware-specific activity, such as incrementing a time-of-day counter
    or processing a keystroke.

    Each PC and PS/2 has a programmable interrupt controller (PIC) circuit
    that monitors interrupts and presents them one at a time to the CPU. The
    CPU responds to these interrupts by executing a special software routine
    called an interrupt handler. Because each hardware interrupt has its own
    interrupt handler in the ROM BIOS or in DOS, the CPU can recognize and
    respond specifically to the hardware that generates each interrupt. In
    the PC, PC/XT, and PS/2 models 25 and 30, the PIC can handle 8 different
    hardware interrupts. In the PC/AT and PS/2 models 50, 60, and 80, two
    PICs are chained together to allow a total of 15 different hardware
    interrupts to be processed.
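
    The counts of 8 and 15, and the idea of presenting interrupts "one at a
    time," can be modeled in a few lines of C. This is only a sketch of the
    bookkeeping, not code that touches the PIC itself. It assumes that each
    PIC has 8 request lines, that chaining a second PIC uses up one line of
    the first, and that pending requests are presented in a fixed order
    with the lowest-numbered line first. The function names are
    hypothetical.

```c
#include <stdint.h>

#define LINES_PER_PIC 8

/* Number of hardware interrupts available with one PIC or with two
   PICs chained together. The second PIC occupies one request line
   of the first, so two PICs provide 8 + 8 - 1 lines. Any other
   count is outside this sketch and returns 0. */
int usable_interrupts(int pic_count)
{
    if (pic_count == 1) {
        return LINES_PER_PIC;
    }
    if (pic_count == 2) {
        return 2 * LINES_PER_PIC - 1;
    }
    return 0;
}

/* Given a mask of pending requests (bit n set = line n is waiting),
   return the line to present to the CPU next, or -1 if none is
   pending. Lowest-numbered line first is an assumption here. */
int next_interrupt(uint8_t pending)
{
    for (int line = 0; line < LINES_PER_PIC; line++) {
        if (pending & (1u << line)) {
            return line;
        }
    }
    return -1;
}
```

    With lines 1 and 2 both pending (mask 06H), next_interrupt() selects
    line 1; once that request has been serviced and cleared, a second call
    with mask 04H selects line 2.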

    Although the programmable interrupt controller is indeed programmable,
    hardware interrupt management is not a concern in most programs. The ROM
    BIOS and DOS provide nearly all of the services you'll need for managing
    hardware interrupts. If you do plan to work directly with the PIC, we
    suggest you examine the ROM BIOS listings in the IBM technical reference
    manuals for samples of actual PIC programming.

The DMA Controller

    Some parts of the computer are able to transfer data to and from the
    computer's memory without passing through the CPU. This operation is
    called direct memory access, or DMA, and it is handled by a chip known
    as the DMA controller. The main purpose of the DMA controller is to
    allow disk drives to read or write data without involving the
    microprocessor. Because disk I/O is relatively slow compared to CPU
    speeds, DMA speeds up the computer's overall performance quite a bit.

The Clock Generator

    The clock generator supplies the multiphase clock signals that
    coordinate the microprocessor and the peripherals. The clock generator
    produces a high-frequency oscillating signal. For example, in the
    original IBM PC, this frequency was 14.31818 megahertz (MHz, or million
    cycles per second); in the newer machines, the frequency is higher.
    Other chips that require a regular timing signal obtain it from the
    system clock generator by dividing the base frequency by a constant to
    obtain the frequency they need to accomplish their tasks. For example,
    the IBM PC's 8088 is driven at 4.77 MHz, one-third of the base
    frequency. The PC's internal bus and the programmable interval timer
    (discussed shortly) use a frequency of 1.193 MHz, running at a quarter
    of the 8088 rate and one-twelfth of the base rate.
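
    The division described above is easy to verify. The C sketch below
    starts from the 14.31818 MHz base frequency and derives the other two
    rates with integer division; the names are illustrative only, and the
    results are truncated to whole hertz.

```c
/* Base frequency of the original IBM PC clock generator: 14.31818 MHz. */
#define BASE_CLOCK_HZ 14318180L

/* The 8088 runs at one-third of the base frequency (about 4.77 MHz). */
long cpu_clock_hz(void)
{
    return BASE_CLOCK_HZ / 3;
}

/* The programmable interval timer runs at one-twelfth of the base
   frequency (about 1.193 MHz). */
long timer_clock_hz(void)
{
    return BASE_CLOCK_HZ / 12;
}
```

    Here cpu_clock_hz() returns 4,772,726 and timer_clock_hz() returns
    1,193,181. Dividing the first result by 4 gives the second, which
    confirms that the timer rate is a quarter of the 8088 rate.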

The Programmable Interval Timer

    The programmable interval timer generates timing signals at regular
    intervals controlled by software. The chip can generate timing signals
    on three different channels at once (four channels in the PS/2 models
    50, 60, and 80).

    The timer's signals are used for various system tasks. One essential
    timer function is to generate a clock-tick signal that keeps track of
    the current time of day. Another of the timer's output signals can be
    used to control the frequency of tones produced with the computer's
    speaker. See Chapter 7 for more information about programming the
    system timer.
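
    As a preview of the arithmetic involved, the C sketch below shows how
    a timer output frequency relates to a divisor. It assumes that each
    timer channel divides the 1.193 MHz input frequency by a 16-bit count,
    and that the clock-tick channel uses the largest possible divisor,
    65,536. The function names are hypothetical; Chapter 7 describes the
    actual programming interface.

```c
/* Timer input frequency: one-twelfth of 14.31818 MHz, in whole hertz. */
#define TIMER_CLOCK_HZ (14318180L / 12)

/* Return the divisor that produces approximately freq_hz, or -1 if
   the request cannot be met with a count between 1 and 65,535. */
long timer_divisor(long freq_hz)
{
    long divisor;

    if (freq_hz <= 0) {
        return -1;
    }
    divisor = TIMER_CLOCK_HZ / freq_hz;
    if (divisor < 1 || divisor > 65535L) {
        return -1;
    }
    return divisor;
}

/* Return the output frequency produced by a given divisor,
   or 0.0 if the divisor is not positive. */
double timer_output_hz(long divisor)
{
    if (divisor <= 0) {
        return 0.0;
    }
    return (double)TIMER_CLOCK_HZ / (double)divisor;
}
```

    For a 440 Hz tone, timer_divisor(440) returns 2711. With the assumed
    clock-tick divisor, timer_output_hz(65536L) is about 18.2, which is the
    number of clock ticks generated each second.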

Video Controllers

    The many video subsystems available with the PCs and PS/2s present a
    variety of programmable control interfaces to the video hardware. For
    example, all PC and PS/2 video subsystems have a cathode ray tube (CRT)
    controller circuit to coordinate the timing signals that control the
    video display.

    Although the video control circuits can be programmed in application
    software, all video subsystems have different programming interfaces.
    Fortunately, all PCs and PS/2s are equipped with basic video control
    routines in the ROM BIOS. We'll describe these routines in Chapter 9.

Input/Output Controllers

    PCs and PS/2s have several input/output subsystems with specialized
    control circuitry that provides an interface between the CPU and the
    actual I/O hardware. For example, the keyboard has a dedicated
    controller chip that transforms the electrical signals generated by
    keystrokes into 8-bit codes that represent the individual keys. All disk
    drives have separate controller circuitry that directly controls the
    drive; the CPU communicates with the controller through a consistent
    interface. The serial and parallel communications ports also have
    dedicated input/output controllers.

    You rarely need to worry about programming these hardware controllers
    directly because the ROM BIOS and DOS provide services that take care of
    these low-level functions. If you need to know the details of the
    interface between the CPU and a hardware I/O controller, see the IBM
    technical reference manuals and examine the ROM BIOS listings in the PC
    and PC/AT manuals.


Linking the Parts: The Bus

    As we mentioned, the PC family of computers links all internal control
    circuitry by means of a circuit design known as a bus. A bus is simply a
    shared path on the main circuit board to which all the controlling parts
    of the computer are attached. When data is passed from one component to
    another, it travels along this common path to reach its destination.
    Every microprocessor, every control chip, and every byte of memory in
    the computer is connected directly or indirectly to the bus. When a new
    adapter is plugged into one of the expansion slots, it is actually
    plugged directly into the bus, making it an equal partner in the
    operation of the entire unit.

    Any information that enters or leaves a computer system is temporarily
    stored in at least one of several locations along the bus. Data is
    usually placed in main memory, which in the PC family consists of
    thousands or millions of 8-bit memory cells (bytes). But some data may
    end up in a port or register for a short time while it waits for the CPU
    to send it to its proper location. Generally, ports and registers hold
    only 1 or 2 bytes of information at a time and are usually used as
    stopover sites for data being sent from one place to another. (Ports and
    registers are described in Chapter 2.)

    Whenever a memory cell or port is used as a storage site, its location
    is known by an address that uniquely identifies it. When data is ready
    to be transferred, its destination address is first transmitted along
    the address bus; the data follows along behind on the data bus. So the
    bus carries more than data: It carries power and control information,
    such as timing signals (from the system clock) and interrupt signals, as
    well as the addresses of the thousands or millions of memory cells and
    the many devices attached to the bus. To accommodate these four
    different functions, the bus is divided into four parts: the power
    lines, the control bus, the address bus, and the data bus. We're going
    to discuss the subjects of address and data buses in greater detail
    because they move information in a way that helps to explain some of the
    unique properties of the PC family.

The Address Bus

    The address bus in the PC, PC/XT, and PS/2 models 25 and 30 uses 20
    signal lines to transmit the addresses of the memory cells and devices
    attached to the bus. (Memory addressing is discussed more fully on page
    13 and in Chapter 3.) Because two possible values (either 1 or 0) can
    travel along each of the 20 address lines, these computers can specify
    2^20 addresses──the limit of the addressing capability of the 8088 and
    8086 microprocessors. This amounts to more than a million possible
    addresses.

    The 80286 used in the PC/AT can address 2^24 bytes of memory, so the AT
    has a 24-line address bus. The bus in the 80286-based PS/2 models 50 and
    60 also supports 24-bit memory addressing; in the 80386-based PS/2 Model
    80, the bus has 32-bit addressing capability.
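
    The relationship between the number of address lines and the size of
    the address space can be checked with a few lines of C. This is only
    an illustrative sketch: the function names address_count and
    highest_address are our own and are not part of any IBM or Intel
    interface.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct addresses that can be selected with the given
   number of address lines. Each line carries one bit, so the count
   is 2 raised to the number of lines. */
uint64_t address_count(unsigned lines)
{
    assert(lines <= 32);            /* widest bus discussed in this chapter */
    return (uint64_t)1 << lines;
}

/* Highest address value that fits on a bus of the given width. */
uint64_t highest_address(unsigned lines)
{
    return address_count(lines) - 1;
}
```

    With these definitions, address_count(20) gives 1,048,576 for the
    8088 and 8086, address_count(24) gives 16,777,216 for the 80286, and
    address_count(32) gives 4,294,967,296 for the 80386.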

The Data Bus

    The data bus works with the address bus to carry data throughout the
    computer. The PC's 8088-based system uses a data bus that has 8 signal
    lines, each of which carries a single binary digit (bit); data is
    transmitted across this 8-line bus in 8-bit (1-byte) units. The 80286
    microprocessor of the AT uses a 16-bit data bus and therefore passes
    data in 16-bit (1-word) units.

    The 8088, being a 16-bit microprocessor, can work with 16 bits of data
    at a time, exactly like its relative the 80286. Although the 8088 can
    work with 16-bit numbers internally, the size of its data bus allows the
    8088 to pass data only 8 bits at a time. This has led some people to
    comment that the 8088 is not a true 16-bit microprocessor. Rest assured
    that it is, even though it is less powerful than the 80286. The 16-bit
    data bus of the 80286 does help it move data around more efficiently
    than the 8088, but the real difference in speed between the 8088 and the
    AT comes from the AT's faster clock rate and its more powerful internal
    organization.
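
    To see roughly what the wider data bus buys, you can count the bus
    transfers needed to move a block of data. The C sketch below is a
    deliberate simplification: the name bus_transfers is hypothetical,
    and the calculation assumes that the data is aligned on a word
    boundary and ignores clock rate, wait states, and instruction
    fetches.

```c
#include <assert.h>

/* Approximate number of bus transfers needed to move byte_count bytes
   across a data bus that is bus_bits wide (8 or 16). Assumes aligned
   data; timing effects are not modeled. */
unsigned long bus_transfers(unsigned long byte_count, unsigned bus_bits)
{
    unsigned bytes_per_transfer;

    assert(bus_bits == 8 || bus_bits == 16);
    bytes_per_transfer = bus_bits / 8;

    /* Round up so that a leftover byte still costs one transfer. */
    return (byte_count + bytes_per_transfer - 1) / bytes_per_transfer;
}
```

    Under these assumptions a single 16-bit word costs two transfers on
    the 8088's 8-bit bus but only one on the 80286's 16-bit bus.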

    There is an important practical reason why so many computers, including
    the older members of the PC family, use the 8088 with its 8-bit data
    bus, rather than the 8086 with its 16-bit bus. The reason is simple
    economics. A variety of 8-bit circuitry elements are available in large
    quantities at low prices. When the PC was being designed, 16-bit
    circuitry was more expensive and was less readily available. The use of
    the 8088, rather than the 8086, was important not only to hold down the
    cost of the PC, but also to avoid a shortage of parts. The price of
    16-bit circuitry elements has decreased significantly since then,
    however, and it has become economically feasible to use the more
    efficient 80286 with its 16-bit bus. Furthermore, the 80286 is able to
    use a mixture of 8-bit parts and 16-bit parts, thereby maintaining
    compatibility within the PC family.

Micro Channel Architecture

    The PS/2 models 50, 60, and 80 introduced a new bus hardware design that
    IBM calls Micro Channel architecture. Both the Micro Channel bus in the
    PS/2s and the earlier PC and PC/AT bus accomplish the same task of
    communicating addresses and data to plug-in adapters. The Micro Channel
    bus hardware is designed to run at higher speeds than its predecessors
    as well as to allow for more flexible adapter hardware designs. The
    Micro Channel differs from the PC and PC/AT bus design both in its
    physical layout and in its signal specifications, so an adapter that can
    be used with one bus is incompatible with the other.

    The differences between the original PC bus, the PC/AT bus, and the
    Micro Channel bus are important in operating system software but not in
    applications programs. Although all programs rely implicitly on the
    proper functioning of the address and data buses, very few programs are
    actually concerned with programming the bus directly. We'll come back to
    the Micro Channel architecture only when we describe PS/2 ROM BIOS
    services that work specifically with it.


Memory

    So far, we've discussed the CPU, the support chips, and the bus, but
    we've only touched on memory. We've saved our discussion of memory for
    the end of this chapter because memory chips, unlike the other chips
    we've discussed, don't control or direct the flow of information through
    a computer system; they merely store information until it is needed.

    The number and storage capacity of memory chips that exist inside the
    computer determine the amount of memory we can use for programs and
    data. Although this may vary from one computer to another, all PCs and
    PS/2s come with at least 40 KB of read-only memory (ROM)──with space for
    more──and between 64 KB and 2 MB of random-access memory (RAM). Both ROM
    and RAM capacities can be augmented by installing additional memory
    chips in empty sockets on the motherboard as well as by installing a
    memory adapter in one of the system expansion slots. But this is only
    the physical view of memory. A program sees memory not as a set of
    individual chips, but as a set of thousands or millions of 8-bit
    (1-byte) storage cells, each with a unique address.

    Programmers must also think of memory in this way──not in terms of how
    much physical memory there is, but in terms of how much addressable
    memory there is. The 8088 and 8086 can address up to 1 MB (1024 KB, or
    exactly 1,048,576 bytes) of memory. In other words, that's the maximum
    number of addresses, and therefore the maximum number of individual
    bytes of information, the processors can refer to. Memory addressing is
    discussed in more detail in Chapter 2.

CPU Address Space

    Each byte is referred to by a 20-bit numeric address. In the 8086 memory
    scheme, the addresses are 20 bits "wide" because they must travel along
    the 20-bit address bus. This gives the 8086 an address space with
    address values that range from 00000H through FFFFFH (0 through
    1,048,575 in decimal notation). If you have trouble understanding hex
    notation, you might want to take a quick look at Appendix B.

    Similarly, the 80286's 24-bit addressing scheme lets it use extended
    address values in the range 000000H through FFFFFFH, or 16 MB. The 80386
    can use extended 32-bit addresses, so its maximum address value is
    FFFFFFFFH; that is, the 80386 can directly address up to 4,294,967,296
    bytes, or four gigabytes (GB), of memory. This is enough memory for most
    practical purposes, even for the most prolific programmer.

    Although the 80286 and 80386 can address more than 1 MB of memory, any
    program compatible with the 8086 and with DOS must limit itself to
    addresses that lie in the 1 MB range available to the 8086. When the IBM
    PC first appeared in 1981, 1 MB seemed like a lot of memory, but large
    business-applications programs, memory-resident utility programs, and
    system software required for communications and networking can easily
    fill up the entire 8086 address space.

    One way to work around the 1 MB limit is with the LIM
    (Lotus-Intel-Microsoft) Expanded Memory Specification (EMS). The EMS is
    based on special hardware and software that map additional RAM into the
    8086 address space in 16 KB blocks. The EMS hardware can map a number of
    different 16 KB blocks into the same 16 KB range of 8086 addresses.
    Although the blocks must be accessed separately, the EMS lets up to 2048
    different 16 KB blocks map to the same range of 8086 addresses. That's
    up to 32 MB of expanded memory.
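
    The arithmetic behind the 32 MB figure, and the idea of mapping one
    of many 16 KB blocks into a single window, can be modeled in C. This
    is a conceptual sketch only. The type ems_window and the functions
    ems_map and ems_translate are hypothetical names of our own; they do
    not represent the actual EMS driver interface, which is not shown
    here.

```c
#include <assert.h>

#define EMS_PAGE_SIZE  16384UL   /* one 16 KB block */
#define EMS_MAX_PAGES  2048UL    /* maximum number of blocks */

/* Total expanded memory, in bytes, provided by the given number of
   16 KB blocks. */
unsigned long ems_capacity_bytes(unsigned long pages)
{
    assert(pages <= EMS_MAX_PAGES);
    return pages * EMS_PAGE_SIZE;
}

/* Model of one 16 KB window in the 8086 address space. Only one block
   is visible through the window at any moment. */
typedef struct {
    unsigned long mapped_page;
} ems_window;

/* Make the given block visible through the window. */
void ems_map(ems_window *w, unsigned long page)
{
    assert(page < EMS_MAX_PAGES);
    w->mapped_page = page;
}

/* Translate an offset within the window into a byte position within
   the whole pool of expanded memory. */
unsigned long ems_translate(const ems_window *w, unsigned long offset)
{
    assert(offset < EMS_PAGE_SIZE);
    return w->mapped_page * EMS_PAGE_SIZE + offset;
}
```

    In this model, ems_capacity_bytes(2048) returns 33,554,432, which is
    32 MB. The same window offset reaches different bytes of expanded
    memory depending on which block was most recently mapped.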

    ────────────────────────────────────────────────────────────────────────
    NOTE:
    Don't confuse EMS "expanded" memory with the "extended" memory located
    above the first megabyte of 80286 or 80386 memory. Although many
    memory expansion adapters can be configured to serve as either
    expanded or extended memory (or both), these two memory configurations
    are very different from both a hardware and software point of view.
    ────────────────────────────────────────────────────────────────────────

The System Memory Map

    On the original IBM PC, the 1 MB address space of the 8088 was split
    into several functional areas. (See Figure 1-6.) This memory map has
    been carried forward for compatibility in all subsequent PC and PS/2
    models.

                ┌────────────────────────────────────┐
                │   PC/AT and PS/2 extended memory   │
                └────────────────────────────────────┘
    100000H ────►┌────────────────────────────────────┐
                │      Reserved for ROM BIOS         │
    E0000H ────►├────────────────────────────────────┤
                │    Reserved for installable ROM    │
    C0000H ────►├────────────────────────────────────┤
                │           Video buffers            │
    A0000H ────►└────────────────────────────────────┘
                ┌────────────────────────────────────┐──┐
                │      Transient portion of DOS      │  │
                ├────────────────────────────────────┤  │
                │                                    │  │
                │       Transient Program Area       │  │
                │      (user programs and data)      │  │
                ├────────────────────────────────────┤  │
                │       Resident portion of DOS      │  ├── System
                ├────────────────────────────────────┤  │   RAM
                │  Data area for ROM BIOS and BASIC  │  │
    00500H ────►├────────────────────────────────────┤  │
                │       Data area for ROM BIOS       │  │
    00400H ────►├────────────────────────────────────┤  │
                │          Interrupt vectors         │  │
    00000H ────►└────────────────────────────────────┘──┘

    Figure 1-6.  An outline of memory usage in PCs and PS/2s.

    Some of the layout of the PC and PS/2 memory map is a consequence of the
    design of the 8086 microprocessor. For example, the 8086 always maintains
    a list of interrupt vectors (addresses of interrupt handling routines) in
    the first 1024 bytes of RAM. Similarly, all 8086-based microcomputers have
    ROM memory at the high end of the 1 MB address space, because the 8086,
    when first powered up, executes the program that starts at address FFFF0H.
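
    The 1024-byte size of the interrupt vector area follows from the
    8086's 256 possible interrupt numbers, each of which has a 4-byte
    vector (a 2-byte offset followed by a 2-byte segment). A short C
    sketch, using names of our own choosing, shows where a given vector
    is stored.

```c
#include <assert.h>

#define VECTOR_SIZE   4u     /* 2-byte offset followed by 2-byte segment */
#define VECTOR_COUNT  256u   /* interrupt numbers 00H through FFH */

/* Address in low memory of the vector for an interrupt number. */
unsigned vector_address(unsigned interrupt_number)
{
    assert(interrupt_number < VECTOR_COUNT);
    return interrupt_number * VECTOR_SIZE;
}

/* Total size of the interrupt vector area, in bytes. */
unsigned vector_table_size(void)
{
    return VECTOR_COUNT * VECTOR_SIZE;
}
```

    For example, vector_address(0x21) returns 84H, and the last vector,
    for interrupt FFH, begins at 3FCH, just below the ROM BIOS data area
    at 00400H.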

    The rest of the memory map follows this general division between RAM at
    the bottom of the address space and ROM at the top. A maximum of 640 KB of
    RAM can exist between addresses 00000H and 9FFFFH. (This is the memory
    area described by the DOS CHKDSK program.) Subsequent memory blocks are
    reserved for video RAM (A0000H through BFFFFH), installable ROM modules
    (C0000H through DFFFFH), and permanent ROM (E0000H through FFFFFH). We'll
    explore each of these memory areas in greater detail in the chapters that
    follow.
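
    The boundaries in Figure 1-6 can be expressed as a small C function
    that reports which functional area an address belongs to. The
    enumeration and the function name classify_address are hypothetical
    and serve only to illustrate the layout.

```c
#include <assert.h>

typedef enum {
    REGION_SYSTEM_RAM,       /* 00000H through 9FFFFH */
    REGION_VIDEO_BUFFERS,    /* A0000H through BFFFFH */
    REGION_INSTALLABLE_ROM,  /* C0000H through DFFFFH */
    REGION_ROM_BIOS,         /* E0000H through FFFFFH */
    REGION_EXTENDED_MEMORY   /* 100000H and above (PC/AT and PS/2) */
} mem_region;

/* Report which area of the memory map contains a physical address. */
mem_region classify_address(unsigned long addr)
{
    assert(addr <= 0xFFFFFFFFUL);    /* at most a 32-bit address */

    if (addr <= 0x9FFFFUL) return REGION_SYSTEM_RAM;
    if (addr <= 0xBFFFFUL) return REGION_VIDEO_BUFFERS;
    if (addr <= 0xDFFFFUL) return REGION_INSTALLABLE_ROM;
    if (addr <= 0xFFFFFUL) return REGION_ROM_BIOS;
    return REGION_EXTENDED_MEMORY;
}
```

    The power-up address FFFF0H, for instance, is classified as
    REGION_ROM_BIOS, and the first address past the 640 KB RAM limit,
    A0000H, falls in REGION_VIDEO_BUFFERS.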


Design Philosophy

    Before leaping into the following chapters, we should discuss the design
    philosophy behind the PC family. This will help you understand what is
    (and what isn't) important or useful to you.

    Part of the design philosophy of the IBM personal computer family centers
    around a set of ROM BIOS service routines (see Chapters 8 through 13)
    that provide essentially all the control functions and operations that IBM
    considers necessary. The basic philosophy of the PC family is: Let the ROM
    BIOS do it; don't mess with direct control. In our judgment, this is a
    sound idea that has several beneficial results. Using the ROM BIOS
    routines encourages good programming practices, and it avoids some of the
    kludgy tricks that have been the curse of many other computers. It also
    increases the chances of your programs working on every member of the PC
    family. In addition, it gives IBM more flexibility in making improvements
    and additions to the line of PC computers. However, it would be naive for
    us to simply say to you, "Don't mess with direct control of the hardware."
    For good reasons or bad, you may want or may need to have your programs
    work as directly with the computer hardware as possible, doing what is
    colorfully called "programming down to the bare metal."

    Still, as the PC family has evolved, programmers have had the opportunity
    to work with increasingly powerful hardware and system software. The newer
    members of the PC family provide faster hardware and better system
    software, so direct programming of the hardware does not necessarily
    result in significantly faster programs. For example, with an IBM PC
    running DOS, the fastest way to display text on the video display is to
    use assembly-language routines that bypass DOS and directly program the
    video hardware. Video screen output is many times slower if you route it
    through DOS. Contrast this with a PC/AT or PS/2 running OS/2, where the
    best way to put text on the screen is to use the operating system output
    functions. The faster hardware and the efficient video output services in
    OS/2 make direct programming unnecessary.

    As you read the programming details we present in this book, keep in mind
    that you can often obtain a result or accomplish a programming task
    through several means, including direct hardware programming, calling the
    ROM BIOS, or using a DOS service. You must always balance portability,
    convenience, and performance as you weigh the alternatives. The more you
    know about what the hardware, the ROM BIOS, and the operating system can
    do, the better your programs can use them.



────────────────────────────────────────────────────────────────────────────
Chapter 2  The Ins and Outs

    How the 8086 Communicates
    The 8086 Data Formats

    How the 8086 Addresses Memory
    Segmented Addresses
    80286 and 80386 Protected-Mode Addresses
    Address Compatibility

    The 8086 Registers
    The Scratch-Pad Registers
    The Segment Registers
    The Offset Registers
    The Flags Register
    Addressing Memory Through Registers
    Rules for Using Registers

    How the 8086 Uses I/O Ports

    How the 8086 Uses Interrupts
    Software Interrupts
    Hardware Interrupts

    Generally speaking, the more you know about how your computer works, the
    more effective you'll be at writing programs for it. High-level
    programming languages, such as BASIC and C, are not designed to include
    every possible function that you might need while programming──though
    admittedly, some are better than others. At some point, you will want to
    go deeper into your system and use some of the routines the languages
    themselves use, or perhaps go even deeper and program at the hardware
    level.

    Although some languages provide limited means to talk directly to memory
    (as with PEEK and POKE in BASIC) or even to some of the chips (as with
    BASIC's INP and OUT statements), most programmers eventually resort to
    assembly language, the basic language from which all other languages and
    operating systems are built. The 8086 assembly language, like all other
    assembly languages, is composed of a set of symbolic instructions, as
    shown in Figure 2-1. An assembler translates these instructions and the
    data associated with them into a binary form, called machine language,
    that can reside in memory and be processed by the 8086 to accomplish
    specific tasks.

    Mnemonic           Full Name
    ──────────────────────────────────────────────────────────────────────────
    Instructions recognized by all 8086-family microprocessors:
    AAA                ASCII Adjust After Addition
    AAD                ASCII Adjust After Division
    AAM                ASCII Adjust After Multiplication
    AAS                ASCII Adjust After Subtraction
    ADC                ADd with Carry
    ADD                ADD
    AND                AND
    CALL               CALL
    CBW                Convert Byte to Word
    CLC                CLear Carry flag
    CLD                CLear Direction flag
    CLI                CLear Interrupt flag
    CMC                CoMplement Carry flag
    CMP                CoMPare
    CMPS               CoMPare String
    CMPSB              CoMPare String (Bytes)
    CMPSW              CoMPare String (Words)
    CWD                Convert Word to Doubleword
    DAA                Decimal Adjust After Addition
    DAS                Decimal Adjust After Subtraction
    DEC                DECrement
    DIV                Unsigned DIVide
    ESC                ESCape
    HLT                HaLT
    IDIV               Integer DIVide
    IMUL               Integer MULtiply
    IN                 INput from I/O port
    INC                INCrement
    INT                INTerrupt
    INTO               INTerrupt on Overflow
    IRET               Interrupt RETurn
    JA                 Jump if Above
    JAE                Jump if Above or Equal
    JB                 Jump if Below
    JBE                Jump if Below or Equal
    JC                 Jump if Carry
    JCXZ               Jump if CX Zero
    JE                 Jump if Equal
    JG                 Jump if Greater than
    JGE                Jump if Greater than or Equal
    JL                 Jump if Less than
    JLE                Jump if Less than or Equal
    JMP                JuMP
    JNA                Jump if Not Above
    JNAE               Jump if Not Above or Equal
    JNB                Jump if Not Below
    JNBE               Jump if Not Below or Equal
    JNC                Jump if No Carry
    JNE                Jump if Not Equal
    JNG                Jump if Not Greater than
    JNGE               Jump if Not Greater than or Equal
    JNL                Jump if Not Less than
    JNLE               Jump if Not Less than or Equal
    JNO                Jump if Not Overflow
    JNP                Jump if Not Parity
    JNS                Jump if Not Sign
    JNZ                Jump if Not Zero
    JO                 Jump if Overflow
    JP                 Jump if Parity
    JPE                Jump if Parity Even
    JPO                Jump if Parity Odd
    JS                 Jump if Sign
    JZ                 Jump if Zero
    LAHF               Load AH with Flags
    LDS                Load pointer using DS
    LEA                Load Effective Address
    LES                Load pointer using ES
    LOCK               LOCK bus
    LODS               LOaD String
    LODSB              LOaD String (Bytes)
    LODSW              LOaD String (Words)
    LOOP               LOOP
    LOOPE              LOOP while Equal
    LOOPNE             LOOP while Not Equal
    LOOPNZ             LOOP while Not Zero
    LOOPZ              LOOP while Zero
    MOV                MOVe data
    MOVS               MOVe String
    MOVSB              MOVe String (Bytes)
    MOVSW              MOVe String (Words)
    MUL                MULtiply
    NEG                NEGate
    NOP                No OPeration
    NOT                NOT
    OR                 OR
    OUT                OUTput to I/O port
    POP                POP
    POPF               POP Flags
    PUSH               PUSH
    PUSHF              PUSH Flags
    RCL                Rotate through Carry Left
    RCR                Rotate through Carry Right
    REP                REPeat
    REPE               REPeat while Equal
    REPNE              REPeat while Not Equal
    REPNZ              REPeat while Not Zero
    REPZ               REPeat while Zero
    RET                RETurn
    ROL                ROtate Left
    ROR                ROtate Right
    SAHF               Store AH into Flags
    SAL                Shift Arithmetic Left
    SAR                Shift Arithmetic Right
    SBB                SuBtract with Borrow
    SCAS               SCAn String
    SCASB              SCAn String (Bytes)
    SCASW              SCAn String (Words)
    SHL                SHift Left
    SHR                SHift Right
    STC                SeT Carry flag
    STD                SeT Direction flag
    STI                SeT Interrupt flag
    STOS               STOre String
    STOSB              STOre String (Bytes)
    STOSW              STOre String (Words)
    SUB                SUBtract
    TEST               TEST
    WAIT               WAIT
    XCHG               eXCHanGe
    XLAT               transLATe
    XOR                eXclusive OR

    Instructions recognized by the 80286 and 80386 only:
    ARPL               Adjust RPL field of selector
    BOUND              Check array index against BOUNDs
    CLTS               CLear Task-Switched flag
    ENTER              Establish stack frame
    INS                INput String from I/O port
    LAR                Load Access Rights
    LEAVE              Discard stack frame
    LGDT               Load Global Descriptor Table register
    LIDT               Load Interrupt Descriptor Table register
    LLDT               Load Local Descriptor Table register
    LMSW               Load Machine Status Word
    LSL                Load Segment Limit
    LTR                Load Task Register
    OUTS               OUTput String to I/O port
    POPA               POP All general registers
    PUSHA              PUSH All general registers
    SGDT               Store Global Descriptor Table register
    SIDT               Store Interrupt Descriptor Table register
    SLDT               Store Local Descriptor Table register
    SMSW               Store Machine Status Word
    STR                Store Task Register
    VERR               VERify a segment selector for Reading
    VERW               VERify a segment selector for Writing

    Instructions recognized by the 80386 only:
    BSF                Bit Scan Forward
    BSR                Bit Scan Reverse
    BT                 Bit Test
    BTC                Bit Test and Complement
    BTR                Bit Test and Reset
    BTS                Bit Test and Set
    CDQ                Convert Doubleword to Quadword
    CMPSD              CoMPare String (Doublewords)
    CWDE               Convert Word to Doubleword in EAX
    LFS                Load pointer using FS
    LGS                Load pointer using GS
    LSS                Load pointer using SS
    LODSD              LOaD String (Doublewords)
    MOVSD              MOVe String (Doublewords)
    MOVSX              MOVe with Sign-eXtend
    MOVZX              MOVe with Zero-eXtend
    SCASD              SCAn String (Doublewords)
    SETA               SET byte if Above
    SETAE              SET byte if Above or Equal
    SETB               SET byte if Below
    SETBE              SET byte if Below or Equal
    SETC               SET byte if Carry
    SETE               SET byte if Equal
    SETG               SET byte if Greater
    SETGE              SET byte if Greater or Equal
    SETL               SET byte if Less
    SETLE              SET byte if Less or Equal
    SETNA              SET byte if Not Above
    SETNAE             SET byte if Not Above or Equal
    SETNB              SET byte if Not Below
    SETNBE             SET byte if Not Below or Equal
    SETNC              SET byte if No Carry
    SETNE              SET byte if Not Equal
    SETNG              SET byte if Not Greater
    SETNGE             SET byte if Not Greater or Equal
    SETNL              SET byte if Not Less
    SETNLE             SET byte if Not Less or Equal
    SETNO              SET byte if Not Overflow
    SETNP              SET byte if Not Parity
    SETNS              SET byte if Not Sign
    SETNZ              SET byte if Not Zero
    SETO               SET byte if Overflow
    SETP               SET byte if Parity
    SETPE              SET byte if Parity Even
    SETPO              SET byte if Parity Odd
    SETS               SET byte if Sign
    SETZ               SET byte if Zero
    SHLD               SHift Left (Doubleword)
    SHRD               SHift Right (Doubleword)
    STOSD              STOre String (Doublewords)
    ──────────────────────────────────────────────────────────────────────────


    Figure 2-1.  The instruction set used with the 8086, 80286, and 80386.

    ──────────────────────────────────────────────────────────────────────────
    NOTE:
    Although this chapter discusses the details of 8086 programming,
    remember that we're implicitly talking about the 8088, 80286, and 80386
    as well. Information pertaining exclusively to the 80286 or 80386 will
    be noted.
    ──────────────────────────────────────────────────────────────────────────

    The operations that the 8086 instructions can perform break down into only
    a few categories. They can do simple, four-function integer arithmetic.
    They can move data around. They can, using only slightly clumsy methods,
    manipulate individual bits. They can test values and take logical action
    based on the results. And last but not least, they can interact with the
    circuitry around them. The size of each instruction varies, but generally
    the most basic and often-used instructions are the shortest.

    Assembly-language programming can be carried out on one of two levels: to
    create interface routines that will tie high-level programs to the
    lower-level DOS and ROM-BIOS routines; or to create full-fledged
    assembly-language programs that are faster and smaller than equivalent
    high-level programs, or that perform exotic tasks at the hardware level,
    perhaps accomplishing a feat that can be achieved no other way. Either
    way, to understand how to use assembly language, you must understand how
    8086-family microprocessors process information and how they work with the
    rest of the computer. The rest of this chapter describes how the
    microprocessor and the computer's other parts communicate.


How the 8086 Communicates

    The 8086, 80286, and 80386 interact with the circuitry around them in
    three ways: through direct and indirect memory access, through
    input/output (I/O) ports, and with signals called interrupts.

    The microprocessor uses memory by reading or writing values at memory
    locations that are identified with numeric addresses. The memory locations
    can be accessed in two ways: through the direct memory access (DMA)
    controller or through the microprocessor's internal registers. The disk
    drives and the serial communications ports can directly access memory
    through the DMA controller. All other devices transfer data to and from
    memory by way of the microprocessor's registers.

    Input/output ports are the microprocessor's general means of communicating
    with any computer circuitry other than memory. Like memory locations, I/O
    ports are identified by number, and data can be read from or written to
    any port. I/O port assignment is unique to the design of any particular
    computer. Generally, all members of the IBM PC family use the same port
    specifications, with just a few variations among the different models.
    (See page 37.)

    Interrupts are the means by which the circuitry outside the microprocessor
    reports that something (such as a keystroke) has happened and requests
    that some action be taken. Although interrupts are essential to the
    microprocessor's interaction with the hardware around it, the concept of
    an interrupt is useful for other purposes as well. For example, a program
    can use the INT instruction to generate a software interrupt that requests
    a service from DOS or from the system ROM BIOS. Interrupts are quite
    important when programming the PC family, so we'll devote a special
    section to them at the end of this chapter.

The 8086 Data Formats

    Numeric data. The 8086 and 80286 are able to work with only four simple
    numeric data formats, all of which are integer values. The formats are
    founded on two building blocks: the 8-bit byte and the 16-bit (2-byte)
    word. Both of these basic units are related to the 16-bit processing
    capacity of the 8086. The byte is the more fundamental unit; and when the
    8086 and 80286 address memory, bytes are the basic unit addressed. In a
    single byte, these microprocessors can work with unsigned positive numbers
    ranging in value from 0 through 255 (that is, 2^8 possibilities). If the
    number is a signed value, one of the 8 bits represents the sign, so only 7
    bits represent the value. Thus a signed byte can represent values ranging
    from -128 through +127. (See Figure 2-2.)

    The 8086 and 80286 can also operate on 16-bit signed and unsigned values,
    or words. Words are stored in memory in two adjacent bytes, with the low-
    order byte preceding the high-order byte. (See the discussion of
    "back-words storage" on page 24.)

                                     Range
    Size (bits)  Signed?     Dec                     Hex
    ──────────────────────────────────────────────────────────────────────────
    8            No          0 through 255           00H through FFH

    8            Yes         -128 through 0 through  80H through 00H through
                             +127                    7FH

    16           No          0 through 65,535        0000H through FFFFH

    16           Yes         -32,768 through 0       8000H through 0000H
                             through +32,767         through 7FFFH

    32           No          0 through 4,294,967,295 00000000H through
                                                     FFFFFFFFH

    32           Yes         -2,147,483,648 through  80000000H through
                             0 through               00000000H through
                             +2,147,483,647          7FFFFFFFH
    ──────────────────────────────────────────────────────────────────────────

    Figure 2-2.  The six data formats used in the 8086 family. (Only the 80386
    supports 32-bit formats.)

    A word interpreted as an unsigned, positive number can have 2^16 different
    values ranging from 0 through 65,535. As a signed number, the value can
    range from -32,768 through +32,767.

    The 80386 differs from its predecessors in that it can also work with
    32-bit integer values, or doublewords. A doubleword represents a signed or
    unsigned 4-byte integer with any of 2^32 (or 4,294,967,296) different
    values.
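
    The ranges in Figure 2-2 follow directly from the bit widths. As a rough
    illustration, the following Python sketch derives them and shows how one
    bit pattern reads as either a signed or an unsigned value. Python is used
    here only as a convenient calculator, the function names are our own, and
    the sketch assumes the two's-complement encoding that the hex column of
    the figure implies.

```python
def unsigned_range(bits):
    """Return (lowest, highest) for an unsigned integer of the given width."""
    return 0, (1 << bits) - 1


def signed_range(bits):
    """Return (lowest, highest) for a two's-complement signed integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def as_signed(value, bits):
    """Reinterpret an unsigned bit pattern as a two's-complement value."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


for bits in (8, 16, 32):
    print(bits, unsigned_range(bits), signed_range(bits))

print(as_signed(0x80, 8))      # bit pattern 80H read as a signed byte
print(as_signed(0xFFFF, 16))   # bit pattern FFFFH read as a signed word
```

    The same byte therefore has two readings: 80H is 128 when treated as
    unsigned and -128 when treated as signed. The microprocessor stores only
    the bits; the program decides which reading applies.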

    Character data. Character data is stored in the standard ASCII format,
    with each character occupying 1 byte. The 8086 family knows nothing about
    ASCII characters and treats them as arbitrary bytes, with one exception:
    The instruction set accommodates decimal addition and subtraction
    performed on binary coded decimal (BCD) characters. The actual arithmetic
    is done in binary, but the combination of the AF flag (see page 33) and a
    few special instructions makes it practical to work on decimal characters
    and get decimal results, which can easily be converted to ASCII.

    ──────────────────────────────────────────────────────────────────────────
    Back-Words Storage
    While the PC's memory is addressed in units of individual 8-bit bytes,
    many operations involve 16-bit words. In memory, a 16-bit word is stored
    in any two adjacent 8-bit bytes. The least-significant byte of the word
    is stored in the lower memory location, and the most significant byte is
    stored in the higher memory location. From some points of view, storing
    a word this way is the opposite of what you might expect. Due to the
    backward appearance of this storage scheme, it is sometimes whimsically
    called "back-words" storage.

    ─── Higher addresses ───────►
    ┌───────┬───────┐
    │  9C   │  E6   │                   Value of word is E69CH
    └───────┴───────┘

    ─── Higher addresses ───────►
    ┌───────┬───────┬───────┬───────┐
    │  4A   │  5B   │  00   │  12   │   Value of doubleword is 12005B4AH
    └───────┴───────┴───────┴───────┘

    If you are working with bytes and words in memory, you should take care
    not to be confused by back-words storage. The source of the confusion
    has mostly to do with how you write data. For example, if you are
    writing a word value in hex, you write it like this: ABCD. The order of
    significance is the same as if you are writing a decimal number: The
    most significant digit is written first. But a word is stored in memory
    with the lowest address location first. So, in memory, the number ABCD
    appears as CDAB, with the bytes switched.
    ──────────────────────────────────────────────────────────────────────────
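
    The byte order described in the sidebar can be checked with a few lines of
    Python. This is only a sketch that uses the standard struct module to lay
    out the sidebar's example values least-significant byte first; the helper
    name hex_bytes is our own.

```python
import struct

# "<H" packs an unsigned 16-bit word, "<I" an unsigned 32-bit doubleword;
# the "<" prefix selects least-significant-byte-first order.
word_bytes = struct.pack("<H", 0xE69C)
dword_bytes = struct.pack("<I", 0x12005B4A)


def hex_bytes(data):
    """Format bytes in increasing address order, for example '9C E6'."""
    return " ".join(f"{b:02X}" for b in data)


print(hex_bytes(word_bytes))    # bytes of E69CH as they sit in memory
print(hex_bytes(dword_bytes))   # bytes of 12005B4AH as they sit in memory

# Reading two bytes back in the same order recovers the original word.
(word,) = struct.unpack("<H", bytes([0xCD, 0xAB]))
print(f"{word:04X}")
```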

    See Appendix C for more information on ASCII and the PC family's extended
    ASCII character set.


How the 8086 Addresses Memory

    The 8086 is a 16-bit microprocessor and therefore cannot work directly
    with numbers larger than 16 bits. Theoretically, this means that the 8086
    should be able to access only 64 KB of memory. But, as we noted in the
    previous chapter, it can in fact access much more than that──1024 KB to be
    exact. This is possible because of the 20-bit addressing scheme used with
    the 8086, which expands the full range of memory locations that the 8086
    can work with from 2^16 (65,536) to 2^20 (1,048,576). But the 8086 is
    still limited by its 16-bit processing capacity. To access the 20-bit
    addresses, it must use an addressing method that fits into the 16-bit
    format.

Segmented Addresses

    The 8086 divides the addressable memory space into segments, each of which
    contains 64 KB of memory. Each segment begins at a paragraph address──that
    is, a byte location that is evenly divisible by 16. To access individual
    bytes or words, you use an offset that points to an exact byte location
    within a particular segment. Because offsets are always measured relative
    to the beginning of a segment, they are also called relative addresses or
    relative offsets.

    Together, a segment and an offset form a segmented address that can
    designate any byte in the 8086's 1 MB address space. The 8086 converts a
    given 32-bit segmented address into a 20-bit physical address by using the
    segment value as a paragraph number and adding the offset value to it. In
    effect, the 8086 shifts the segment value left by 4 bits and then adds the
    offset value to create a 20-bit address.

    Figure 2-3 shows how this is done for a segment value of 1234H and an
    offset of 4321H. The segmented address is written as 1234:4321, with
    4-digit hexadecimal values and with a colon separating the segment and
    offset.

          1234:4321
     shift│    │
      left│    │
          │    │
          ▼    │
        12340  │
       + 4321◄─┘
        ─────
        16661

    Figure 2-3.  Decoding an 8086 segmented address. The segment value 1234H
    is shifted left 4 bits (one hex digit) and added to the offset 4321H to
    give the 20-bit physical address 16661H.

    On the 8086, there's obviously a great deal of overlap in the range of
    values that can be expressed as segmented addresses. Any given physical
    address can be represented by up to 2^12 different segmented addresses.
    For example, the physical address 16661H could be represented not only as
    1234:4321, but also as 1666:0001, 1665:0011, 1664:0021, and so on.
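
    The arithmetic is simple enough to model in a few lines. The Python
    sketch below is an illustration only, with function names of our own
    choosing: it combines a segment and an offset as described above and then
    lists the segment-offset pairs that reach one physical address. The alias
    search ignores wraparound past the top of the 1 MB address space.

```python
def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    return ((segment << 4) + offset) & 0xFFFFF   # an 8086 wraps at 1 MB


def aliases(physical):
    """Yield each (segment, offset) pair that maps to a physical address.

    Wraparound past 1 MB is ignored, so this is a simplification.
    """
    for segment in range(0x10000):
        offset = physical - (segment << 4)
        if 0 <= offset <= 0xFFFF:
            yield segment, offset


print(f"{physical_address(0x1234, 0x4321):05X}")

pairs = list(aliases(0x16661))
print(len(pairs))                      # number of aliases for 16661H
for segment, offset in pairs[-3:]:
    print(f"{segment:04X}:{offset:04X}")
```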

80286 and 80386 Protected-Mode Addresses

    The 80286 also uses segmented addresses, but when the 80286 runs in
    protected mode, the addresses are decoded differently than on an 8086 or
    in 80286 real mode. The 80286 decodes protected-mode segmented addresses
    through a table of segment descriptors. The "segment" part of a segmented
    address is not a paragraph value, but a "selector" that represents an
    index into a segment descriptor table (Figure 2-4). Each descriptor in
    the table contains a 24-bit base address that indicates the actual start
    of a segment in memory. The resulting address is the sum of the 24-bit
    base address and the 16-bit offset specified in the segmented address.
    Thus, in protected mode the 80286 can access up to 2^24 bytes of memory;
    that is, physical addresses are 24 bits in size.

    This table-driven addressing scheme gives the 80286 a great deal of
    control over memory usage. In addition to a 24-bit base address, each
    segment descriptor specifies a segment's attributes (executable code,
    program data, read-only, and so on), as well as a privilege level that
    lets an operating system restrict access to the segment. This ability to
    specify segment attributes and access privileges is of great use to a
    multitasking operating system like OS/2.

    The 80386 supports both 8086 addressing and 80286 protected-mode
    addressing. The 80386 enhances the protected-mode addressing scheme by
    allowing 32-bit segment base addresses and 32-bit offsets. Thus a single
    segmented address, consisting of a 16-bit selector and a 32-bit offset,
    can specify any of 2^32 different physical addresses.

                              0038:4321
                               │    │
      │            │           │    │
      ├────────────┤           │    │
      │            │ 28        │    │
      ├────────────┤           │    │
      │            │ 30        │    │
      ├────────────┤           │    │
    ┌─┤   012340   │ 38 ◄──────┘    │
    │ ├────────────┤                │
    │ │            │ 40             │
    │ ├────────────┤                │
    │ │            │                │
    │                               │
    └──────────────────► 012340     │
                       +   4321 ◄───┘
                         ──────
                         016661

    Figure 2-4.  Decoding an 80286 protected-mode segmented address. The
    segment selector 38H indicates an entry in a segment descriptor table. The
    segment descriptor contains a 24-bit segment base address which is added
    to the offset 4321H to give the 24-bit physical address 016661H.
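
    The table lookup in Figure 2-4 can be modeled roughly as follows. This
    Python sketch is a simplification under stated assumptions: the table
    DESCRIPTOR_BASES is hypothetical and holds only the base address from the
    figure, keyed by the byte offset of the entry, and no limit or privilege
    checks are made. It also relies on a detail not covered in the text
    above: the three low-order bits of an 80286 selector are not part of the
    table index, so they are masked off before the lookup.

```python
# Hypothetical descriptor table: byte offset of each 8-byte entry -> base.
# Only the entry at 38H comes from Figure 2-4; a real descriptor also holds
# a segment limit, attributes, and a privilege level.
DESCRIPTOR_BASES = {0x38: 0x012340}


def protected_address(selector, offset, table=DESCRIPTOR_BASES):
    """Decode selector:offset into a physical address (no limit checks)."""
    entry_offset = selector & 0xFFF8    # drop the three low-order bits
    base = table[entry_offset]          # 24-bit segment base address
    return base + offset


print(f"{protected_address(0x0038, 0x4321):06X}")
```

    Unlike the real-mode calculation, nothing about the selector value itself
    reveals where the segment lies in memory; only the descriptor table does.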

    The 80386 also provides a "virtual 8086" addressing mode, in which
    addressing is the same as the usual 8086 16-bit addressing, but with the
    physical addresses corresponding to the 1 MB 8086 address space mapped
    anywhere in the 4 gigabyte (GB) 80386 address space. This lets an
    operating system execute several different 8086 programs, each in its own
    1 MB, 8086-compatible address space.

Address Compatibility

    The different addressing schemes used by the 80286 and 80386 are generally
    compatible (except, of course, for 32-bit addressing on the 80386).
    However, if you are writing an 8086 program that you intend to convert for
    use in protected mode, be careful to use segments in an orderly fashion.
    Although it's possible to specify a physical 8086 address with many
    different segment-offset combinations, you will find it easier to convert
    8086 programs to 80286 protected-mode addressing if you keep your segment
    values as constant as possible.

    For example, imagine that your program needs to access an array of
    160-byte strings of characters, starting at physical address B8000H. A
    poor way to access each string would be to exploit the fact that the
    strings are each 10 paragraphs long by using a different segment value to
    locate the start of each string:

    B800:0000H (physical address B8000H)
    B80A:0000H (physical address B80A0H)
    B814:0000H (physical address B8140H)
    B81E:0000H (physical address B81E0H)

    A better way to accomplish the same addressing would be to keep a constant
    segment value and change the offset value:

    B800:0000H (physical address B8000H)
    B800:00A0H (physical address B80A0H)
    B800:0140H (physical address B8140H)
    B800:01E0H (physical address B81E0H)

    Although the result is the same on an 8086 and in real mode on an 80286,
    you'll find that the second method is much better suited to 80286
    protected mode, where each different segment selector designates a
    different segment descriptor.
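
    The two styles can be compared side by side with a short calculation. The
    Python sketch below is illustrative only (the function names are our
    own): it generates the addresses of the first four strings both ways and
    confirms that they reach the same physical locations in real mode. The
    constant-segment form works only while the offset fits in 16 bits.

```python
STRING_LENGTH = 160          # bytes per string, as in the example above
BASE_SEGMENT = 0xB800


def by_segment(index):
    """Poor style: a new segment value for every string, offset always 0."""
    return BASE_SEGMENT + index * (STRING_LENGTH // 16), 0x0000


def by_offset(index):
    """Better style: one constant segment value and a varying offset."""
    return BASE_SEGMENT, index * STRING_LENGTH


def physical(segment, offset):
    """Real-mode physical address for a segment-offset pair."""
    return (segment << 4) + offset


for i in range(4):
    s1, o1 = by_segment(i)
    s2, o2 = by_offset(i)
    print(f"{s1:04X}:{o1:04X}  {s2:04X}:{o2:04X}  {physical(s2, o2):05X}")
```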


The 8086 Registers

    The 8086 was designed to execute instructions and perform arithmetic and
    logical operations as well as receive instructions and pass data to and
    from memory. To do this, it uses a variety of 16-bit registers.

    There are fourteen registers in all, each with a special use. Four
    scratch-pad registers are used by programs to temporarily hold the
    intermediate results and operands of arithmetic and logical operations.
    Four segment registers hold segment values. Five pointer and index
    registers hold the offsets that are used with the values in the segment
    registers to locate data in memory. Finally, one flags register contains
    nine 1-bit flags that are used to record 8086 status information and
    control 8086 operations. (See Figure 2-5.)

                                        Scratch-pad registers

                            7                      0 7                      0
                            ╓──┬──┬──┬──┬──┬──┬──┬──╥──┬──┬──┬──┬──┬──┬──┬──╖
    AX (accumulator)        ║          AH           ║          AL           ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──╫──┼──┼──┼──┼──┼──┼──┼──╢
    BX (base)               ║          BH           ║          BL           ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──╫──┼──┼──┼──┼──┼──┼──┼──╢
    CX (count)              ║          CH           ║          CL           ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──╫──┼──┼──┼──┼──┼──┼──┼──╢
    DX (data)               ║          DH           ║          DL           ║
                            ╙──┴──┴──┴──┴──┴──┴──┴──╨──┴──┴──┴──┴──┴──┴──┴──╜

                                            Segment registers

                            15                                              0
                            ╓──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──╖
    CS (code segment)       ║                                               ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──╢
    DS (data segment)       ║                                               ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──╢
    SS (stack segment)      ║                                               ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──╢
    ES (extra segment)      ║                                               ║
                            ╙──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──╜

                                            Offset registers

                            15                                              0
                            ╓──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──╖
    IP (instruction pointer)║                                               ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──╢
    SP (stack pointer)      ║                                               ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──╢
    BP (base pointer)       ║                                               ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──╢
    SI (source index)       ║                                               ║
                            ╟──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──┼──╢
    DI (destination index)  ║                                               ║
                            ╙──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──╜

                                            Flags register

                            15                                              0
                            ╓──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──┬──╖
                    Flags  ║           │OF│DF│IF│TF│SF│ZF│  │AF│  │PF│  │CF║
                            ╙──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──╜

    Figure 2-5.  The 8086 registers and flags.

The Scratch-Pad Registers

    When a computer is processing data, a great deal of the microprocessor's
    time is spent transferring data to and from memory. This access time can
    be greatly reduced by keeping frequently used operands and results inside
    the 8086. Four 16-bit registers, usually called the scratch-pad or data
    registers, are designed for this purpose.

    The scratch-pad registers are known as AX, BX, CX, and DX. Each of them
    can also be subdivided and separately used as two 8-bit registers. The
    high-order 8-bit registers are known as AH, BH, CH, and DH, and the low-
    order 8-bit registers are known as AL, BL, CL, and DL.
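
    The relationship between a 16-bit register and its two halves can be
    pictured with a small model. The Python class below is only a sketch with
    names of our own choosing; it is not how any assembler or debugger
    represents registers.

```python
class Register16:
    """A 16-bit register whose two bytes can also be used separately."""

    def __init__(self, value=0):
        self.x = value & 0xFFFF

    @property
    def high(self):
        return self.x >> 8

    @high.setter
    def high(self, byte):
        self.x = ((byte & 0xFF) << 8) | (self.x & 0x00FF)

    @property
    def low(self):
        return self.x & 0xFF

    @low.setter
    def low(self, byte):
        self.x = (self.x & 0xFF00) | (byte & 0xFF)


ax = Register16(0x1234)   # AX = 1234H, so AH = 12H and AL = 34H
ax.low = 0xFF             # changing AL leaves AH untouched
print(f"{ax.x:04X} {ax.high:02X} {ax.low:02X}")
```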

    The scratch-pad registers are used mostly as convenient temporary working
    areas, particularly for arithmetic operations. Addition and subtraction
    can be done in memory without using the registers, but the registers are
    faster.

    Although these registers are available for any kind of scratch-pad work,
    each also has some special uses:

    ■  The AX (accumulator) register is the main register used to perform
        arithmetic operations. (Although addition and subtraction can be
        performed in any of the scratch-pad or offset registers, multiplication
        and division must be done in AX or AL.)

    ■  The BX (base) register can be used to point to the beginning of a
        translation table in memory. It can also be used to hold the offset
        part of a segmented address.

    ■  The CX (count) register is used as a repetition counter for loop
        control and repeated data moves. For example, the LOOP instruction in
        assembly language uses CX to count the number of loop iterations.

    ■  The DX register is used to store data for general purposes, although
        it, too, has certain specialized functions. For example, DX contains
        the remainder of division operations performed in AX.
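
    As an illustration of the special roles of AX and DX, the following
    Python sketch models an unsigned word division. It assumes the usual
    8086 behavior for a 16-bit divisor: the dividend is the 32-bit value
    formed by DX (high word) and AX (low word), the quotient is returned in
    AX, and the remainder in DX. The function name and the use of a Python
    exception to stand in for the processor's divide-error interrupt are our
    own choices.

```python
def div16(dx, ax, divisor):
    """Unsigned divide of DX:AX by a 16-bit divisor; returns (AX, DX)."""
    dividend = (dx << 16) | ax
    if divisor == 0 or dividend // divisor > 0xFFFF:
        # The quotient must fit in AX; otherwise the 8086 signals an error.
        raise ZeroDivisionError("divide error: quotient does not fit in AX")
    return dividend // divisor, dividend % divisor


ax, dx = div16(0x0000, 1000, 7)
print(ax, dx)             # quotient in AX, remainder in DX
```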

The Segment Registers

    As we discussed earlier, the complete address of a memory location
    consists of a 16-bit segment value and a 16-bit offset within the segment.
    Four registers, called CS, DS, ES, and SS, are used to identify four
    specific segments of memory. Five offset registers, which we'll discuss
    shortly, can be used to store the relative offsets of the data within each
    of the four segments.

    Each segment register is used for a specific type of addressing:

    ■  The CS register identifies the code segment, which contains the program
        that is being executed.

    ■  The DS and ES registers identify data segments where data used in a
        program is stored.

    ■  The SS register identifies the stack segment. (See page 32 for more
        information about stacks.)

    Programs rarely use four separate segments to address four different 64 KB
    areas of memory. Instead, the four segments specified in CS, DS, ES, and
    SS usually refer to overlapping or identical areas in memory. In effect,
    the different segment registers identify areas of memory used for
    different purposes.

    For example, Figure 2-6 shows how the values in the segment registers
    correspond to the memory used in a hypothetical DOS program. The values in
    the segment registers are chosen to correspond to the start of each
    logically different area of memory, even though the 64 KB areas of memory
    identified by each segment overlap each other. (See Chapter 20 for more
    about segments and the memory layout of DOS programs.)

    All 8086 instructions that use memory have an implied use of a particular
    segment register for the operation being performed. For example, the MOV
    instruction, because it acts on data, uses the DS register. The JMP
    instruction, which affects the flow of a program, automatically uses the
    CS register.

    This means that you can address any 64 KB segment in memory by placing its
    paragraph address in the appropriate segment register. For example, to
    access data in the video buffer used by IBM's Color Graphics Adapter, you
    place the paragraph address of the start of the buffer in a segment
    register and then use the MOV instruction to transfer data to or from the
    buffer.

                 ┌─────────────────────┐        ◄───┐
        SS=2919H │       Stack         │ 29190H     ├─ 2 KB
                 ├─────────────────────┤        ◄───┤
                 │                     │            │
        DS=2419H │    Program data     │            ├─ 20 KB
                 │                     │ 24190H     │
                 ├─────────────────────┤        ◄───┤
                 │                     │            │
        CS=2019H │   Executable code   │            ├─ 16 KB
                 │                     │ 20190H     │
                 └─────────────────────┘        ◄───┘
    Segment registers              Physical addresses

    Figure 2-6.  Segment usage in a typical DOS program. Each segment register
    contains the starting paragraph of a different area of memory.

    mov ax,0B800h                   ; move the segment value into DS
    mov ds,ax
    mov al,[0000]                   ; copy the byte at B800:0000
                                    ; into AL

    In interpreted BASIC you can use this method with the DEF SEG statement:

    DEF SEG = &HB800                ' move the segment value into DS
    X = PEEK(0000)                  ' copy the byte at B800:0000 into X

The Offset Registers

    Five offset registers are used with the segment registers to contain
    segmented addresses. One register, called the instruction pointer (IP),
    contains the offset of the current instruction in the code segment; two
    registers, called the stack registers, are intimately tied to the stack;
    and the remaining two registers, called the index registers, are used to
    address strings of data.

    The instruction pointer (IP), also called the program counter (PC),
    contains the offset within the code segment where the current program is
    executing. It is used with the CS register to track the location of the
    next instruction to be executed.

    Programs do not have direct access to the IP register, but a number of
    instructions, such as JMP and CALL, change the IP value implicitly.

    The stack registers, called the stack pointer (SP) and the base pointer
    (BP), provide offsets into the stack segment. SP gives the location of the
    current top of the stack. Programs rarely change the value in SP directly.
    Instead, they rely on PUSH and POP instructions to update SP implicitly.
    BP is the register generally used to access the stack segment directly.
    You'll see BP used quite often in the assembly-language examples that
    appear in Chapters 8 through 20.

    The index registers, called the source index (SI) and the destination
    index (DI), can be used for general-purpose addressing of data. Also, all
    string move and comparison instructions use SI and DI to address data
    strings.
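
    To show how SI and DI cooperate in a string move, here is a rough Python
    model of a repeated byte move (REP MOVSB). It is a sketch under
    simplifying assumptions: source and destination lie in the same 64 KB
    segment, CX supplies the repeat count, and the df argument stands for the
    direction flag described with the flags register. The function name is
    our own.

```python
def rep_movsb(memory, si, di, cx, df=0):
    """Model REP MOVSB within one segment; returns final (SI, DI, CX)."""
    step = -1 if df else 1            # DF=1 moves toward lower addresses
    while cx != 0:
        memory[di] = memory[si]       # copy one byte from [SI] to [DI]
        si = (si + step) & 0xFFFF     # offsets wrap within 64 KB
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


memory = bytearray(64 * 1024)
memory[0x0100:0x0105] = b"HELLO"
si, di, cx = rep_movsb(memory, si=0x0100, di=0x0200, cx=5)
print(bytes(memory[0x0200:0x0205]), hex(si), hex(di), cx)
```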

The Flags Register

    The fourteenth and last register, called the flags register, is really a
    collection of individual status and control bits called flags. The flags
    are maintained in a register, so they can be either saved and restored as
    a coordinated set or inspected as ordinary data. Normally, however, the
    flags are set and tested as independent items──not as a set.

    There are nine 1-bit flags in the 8086's 16-bit flags register, leaving 7
    bits unused. (The 80286 and 80386 use some of the unused bits to support
    protected-mode operation.) The flags can be logically divided into two
    groups: six status flags, which record processor status information
    (usually indicating what happened with a comparison or arithmetic
    operation), and three control flags, which direct some of the 8086
    instructions. Be prepared to see a variety of notations for the flags,
    including distinct names for whether they are set (1) or clear (0). The
    terms used in Figures 2-7 and 2-8 are the most common.

    ──────────────────────────────────────────────────────────────────────────
    The Stack
    The stack is a built-in feature of the 8086. It provides programs with a
    place to store and keep track of work in progress. The most important
    use of the stack is to keep a record of where subroutines were invoked
    from and what parameters were passed to them. The stack can also be used
    for temporary working storage, although this is less fundamental and
    less common.

    The stack gets its name from an analogy to a spring-loaded stack of
    plates in a cafeteria: New data is "pushed" onto the top of the stack
    and old data is "popped" off. A stack always operates in
    last-in-first-out (LIFO) order. This means that when the stack is used
    to keep track of return locations, the most recent calling program is
    returned to first. This way, a stack maintains the orderly
    workings of programs, subroutines, and interrupt handlers, no matter how
    complex their operation.

    A stack is used from the bottom (highest address) to the top (lowest
    address) so that when data is pushed onto the top of the stack, it is
    stored at the memory addresses just below the current top of the stack.
    The stack grows downward so that as data is added, the location of the
    top of the stack moves to lower and lower addresses, decreasing the
    value of SP each time. You need to keep this in mind when you access the
    stack, which you are likely to do in assembly-language interface
    routines.

    Any part of any program can create a new stack space at any time, but
    this is not usually done. Normally, when a program is run, a single
    stack is created for it and used throughout the operation of the
    program.

    There is no simple way to estimate the size of stack that a program
    might need, and the 8086's design does not provide any automatic way of
    detecting when stack space is in short supply or exhausted. This can
    make programmers nervous about the amount of space that should be set
    aside for a stack. A conservative estimate of how much stack space to
    maintain is about 2 KB (2048 bytes), the default amount allocated by
    many high-level language compilers.

                                                ┌──────┐◄─── Bottom of stack
                                        :1008 │ 5E00 │
            ┌──────┐◄─── Bottom of stack       ├──────┤
    :1008 │ 5E00 │                     :1006 │ 4D00 │
            ├──────┤                           ├──────┤
    :1006 │ 4D00 │                     :1004 │ 3C00 │◄─── Old top of stack
            ├──────┤                           ├──────┤
    :1004 │ 3C00 │◄─── Top of stack    :1002 │ 2B00 │◄─── Top of stack
            ├──────┤     (SP=1004)             ├──────┤     (SP=1002)
    :1002 │      │                     :1000 │      │
            ├──────┤                           └────┘
    :1000 │      │                             ║  ║
            └──────┘                             PUSH
    a. Stack before a PUSH             b. Stack after a PUSH
    ──────────────────────             ─────────────────────

                                ┌──────┐◄─── Bottom of stack
                        :1008 │ 5E00 │
                                ├──────┤
                        :1006 │ 4D00 │
                                ├──────┤
                        :1004 │ 3C00 │◄─── Top of stack
                                ├──────┤     (SP=1004)
                        :1002 │      │
                                ├──────┤
                        :1000 │      │
                                └─║──║─┘
                                ▼  ▼
                                POP
                        c. Stack after a POP
                        ────────────────────

    ──────────────────────────────────────────────────────────────────────────
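
    The PUSH and POP sequence pictured in the sidebar can be traced with a
    small model. The Python class below is only a sketch (the class and
    method names are our own). It treats the stack segment as a 64 KB byte
    array and starts with SP at 100AH, an assumed value chosen so that three
    pushes reproduce part a of the illustration.

```python
class Stack:
    """Model of an 8086 stack within a single 64 KB stack segment."""

    def __init__(self, sp):
        self.memory = bytearray(64 * 1024)
        self.sp = sp

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF          # SP moves down first
        self.memory[self.sp] = word & 0xFF        # low byte, lower address
        self.memory[(self.sp + 1) & 0xFFFF] = (word >> 8) & 0xFF

    def pop(self):
        low = self.memory[self.sp]
        high = self.memory[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF          # SP moves back up
        return (high << 8) | low


stack = Stack(sp=0x100A)
for word in (0x5E00, 0x4D00, 0x3C00):
    stack.push(word)
print(hex(stack.sp))                    # top of stack before the PUSH
stack.push(0x2B00)
print(hex(stack.sp))                    # top of stack after the PUSH
print(hex(stack.pop()), hex(stack.sp))  # value popped, then the new top
```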

    Code        Name                     Use
    ──────────────────────────────────────────────────────────────────────────
    CF          Carry flag               Indicates an arithmetic carry
    OF          Overflow flag            Indicates signed arithmetic overflow
    ZF          Zero flag                Indicates zero result, or equal
                                        comparison
    SF          Sign flag                Indicates negative result/comparison
    PF          Parity flag              Indicates even number of 1 bits
    AF          Auxiliary carry flag     Indicates adjustment needed in
                                        binary-coded decimal (BCD) arithmetic
                                        operations
    ──────────────────────────────────────────────────────────────────────────

    Figure 2-7.  The six status flags in the 8086's flags register.

    Code        Name                     Use
    ──────────────────────────────────────────────────────────────────────────
    DF          Direction flag           Controls increment direction in
                                        string operations (CMPS, LODS, MOVS,
                                        SCAS, STOS)
    IF          Interrupt flag           Controls whether interrupts are
                                        enabled
    TF          Trap flag                Controls single-step operation (used
                                        by DEBUG) by generating an interrupt
                                        at the end of every instruction
    ──────────────────────────────────────────────────────────────────────────

    Figure 2-8.  The three control flags in the 8086's flags register.
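
    To make the status flags in Figure 2-7 concrete, the following Python
    sketch computes them for an 8-bit addition. It is an illustration of the
    flag definitions rather than a model of any particular instruction
    encoding, and the function name add8 is our own.

```python
def add8(a, b):
    """Add two bytes; return (result, flags) for the six status flags."""
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": int(total > 0xFF),                         # carry out of bit 7
        "ZF": int(result == 0),                          # result is zero
        "SF": result >> 7,                               # copy of bit 7
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
        "PF": int(bin(result).count("1") % 2 == 0),      # even number of 1s
        "AF": int(((a & 0x0F) + (b & 0x0F)) > 0x0F),     # carry out of bit 3
    }
    return result, flags


print(add8(0x7F, 0x01))   # signed overflow: +127 plus 1
print(add8(0xFF, 0x01))   # unsigned carry: 255 plus 1
```

    The two calls show why CF and OF are separate flags: the first addition
    overflows as a signed operation without producing a carry, and the second
    produces a carry without overflowing as a signed operation.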

Addressing Memory Through Registers

    We've seen that memory is always addressed by a combination of a segment
    value and a relative offset. The segment value always comes from one of
    the four segment registers.

    In contrast, the relative offset can be specified in many different ways.
    (See Figure 2-9.) For each machine instruction that accesses memory, the
    8086 computes an effective address by combining one, two, or three of the
    following:

    ■  The value in BX or BP

    ■  The value in SI or DI

    ■  A relative-offset value, called a displacement, that is part of the
        instruction itself
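
    The combination of these three components can be sketched in Python as
    follows. This is an illustration only: the function name and the register
    values are hypothetical, and the sketch assumes the 8086 convention that
    an address formed with BP uses SS as its default segment register while
    all other combinations use DS.

```python
def effective_address(base=None, index=None, displacement=0, regs=None):
    """Return (default segment register name, 16-bit offset)."""
    regs = regs or {}
    offset = displacement
    if base is not None:
        offset += regs[base]          # base is "BX" or "BP"
    if index is not None:
        offset += regs[index]         # index is "SI" or "DI"
    segment = "SS" if base == "BP" else "DS"
    return segment, offset & 0xFFFF   # offsets wrap at 64 KB


# Hypothetical register contents, chosen only for the demonstration.
regs = {"BX": 0x1000, "BP": 0x2000, "SI": 0x0010, "DI": 0x0020}

seg, off = effective_address(base="BX", displacement=2, regs=regs)
print(seg, f"{off:04X}")              # like mov ax,[bx+2]

seg, off = effective_address(base="BP", index="SI", displacement=2, regs=regs)
print(seg, f"{off:04X}")              # like mov ax,[bp+si+2]
```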

    Name               Effective Address           Example            Comments
    ───────────────────────────────────────────────────────────────────────────
    Immediate          Value "addressed" is part   mov ax,1234h       Stores 12
                        of the 8086 instruction

    Direct             Specified as part of the    mov ax,[1234h]     Copies th
                        8086 instruction                               into AX.
                                                                    register

    Register indirect  Contained in BX, SI, DI, or mov ax,[bx]        Copies th
                        BP                                             offset co
                                                                    AX. The d
                                                                    register
    Name               Effective Address           Example            Comments
    ───────────────────────────────────────────────────────────────────────────
                                                                    register
                                                                    [DI] is D
                                                                    default i

    Based              The sum of a displacement   mov ax,[bx+2]      Copies th
                        (part of the instruction)   or mov ax,2[bx]    past the
                        and the value in BX or BP                      BX into A
                                                                    segment r
                                                                    DS; for [
                                                                    SS.

    Indexed            The sum of a displacement   mov ax,[si+2]      Copies th
                        and the value in SI or DI   or mov ax,2[si]    past the
                                                                    SI into A
                                                                    segment r
    Based indexed      The sum of a displacement,  mov ax,[bp+si+2]   The offse
                        the value in SI or DI, and  or mov ax,2[bp+si] values in
                        the value in BX or BP       or                 When BX i
                                                    mov ax,2[bp][si]   segment r
    Name               Effective Address           Example            Comments
    ───────────────────────────────────────────────────────────────────────────
                                                    mov ax,2[bp][si]   segment r
                                                                    BP is use
                                                                    SS.

    String addressing  Source string: register     movsb              Copies th
                        indirect using SI                              memory at
                        Destination string:                            ES:[DI].
                        register indirect using DI
    ───────────────────────────────────────────────────────────────────────────


    Figure 2-9.  8086 Addressing Modes. In assembly language, some
    instructions can be specified in several different ways.

    Each of the various ways of forming an effective address has its uses. You
    can use the Immediate method when you know the value itself in advance and
    the Direct method when you know the offset of a particular memory location
    in advance. You must use one of the remaining methods when you can't tell
    what an address will be until your program executes. In the chapters
    ahead, you'll see examples of most of the different 8086 addressing modes.

    The notation used in specifying 8086 addresses is straightforward.
    Brackets, [ ], are used to indicate that the enclosed item specifies a
    relative offset. This is a key element of memory addressing: Without
    brackets, the actual value stored in the register is used in whatever
    operation is specified.
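
    The rules in Figure 2-9 can be modeled in a few lines. The following
    Python sketch is only an illustration, not part of any assembler: the
    function names effective_address, default_segment, and physical_address
    are invented here, and the register contents are made up. It sums the
    optional base, index, and displacement into a 16-bit offset, picks DS or
    SS as the default segment, and combines segment and offset into a 20-bit
    physical address.

```python
def effective_address(base=0, index=0, disp=0):
    """Sum base register, index register, and displacement, modulo 64 KB."""
    return (base + index + disp) & 0xFFFF

def default_segment(base_reg=None):
    """BP-based addressing defaults to SS; everything else defaults to DS."""
    return "SS" if base_reg == "BP" else "DS"

def physical_address(segment, offset):
    """Shift the segment left 4 bits and add the offset (20-bit result)."""
    return ((segment << 4) + offset) & 0xFFFFF

# Hypothetical register contents, chosen only for the example.
regs = {"BX": 0x0100, "BP": 0x2000, "SI": 0x0010, "DS": 0x1234, "SS": 0x5000}

# mov ax,[bx+2]     -> based addressing, default segment DS
off = effective_address(base=regs["BX"], disp=2)
seg = default_segment("BX")
print(f"[bx+2]    -> {seg}:{off:04X} = {physical_address(regs[seg], off):05X}")

# mov ax,[bp+si+2]  -> based indexed addressing, default segment SS
off = effective_address(base=regs["BP"], index=regs["SI"], disp=2)
seg = default_segment("BP")
print(f"[bp+si+2] -> {seg}:{off:04X} = {physical_address(regs[seg], off):05X}")
```

    Masking with FFFFH reflects the fact that an offset never exceeds 16
    bits, so a sum that overflows wraps around within the segment.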

Rules for Using Registers

    Various rules apply to the use of registers, and you need to be aware of
    them when writing assembly-language interface routines. Because the rules
    and conventions of usage vary by circumstance and by programming language,
    exact guidelines are not always available, but the general rules that
    follow apply in most cases; treat them as general, not absolute. (You will
    find additional guidance, and working models to copy, in the examples in
    Chapters 8 through 20.)

    Probably the most useful rule for using the registers is simply to use
    them for what they are designed for. The idea that each of the 8086
    registers has certain special uses may seem somewhat quirky, particularly
    to a programmer who is accustomed to working with a CPU that has a less
    specialized set of registers (such as the 68000, for example). On the
    8086, using the registers for their natural functions leads to cleaner,
    more efficient source code and ultimately to more reliable programs.

    For example, the segment registers are designed to contain segment values,
    so don't use them for anything else. (In 80286 protected mode you can't
    use them for anything else anyway without generating an error condition.)
    The BP register is intended for stack addressing; if you use it for
    anything else, you'll have to do some fancy footwork when you need to
    address values in the stack segment.

    Particular rules apply to the four segment registers (CS, DS, ES, and SS).
    The CS register should be changed only through intersegment jumps and
    subroutine calls.

    Most programmers use the DS register to point to a default data segment
    that contains the data most frequently used in a program. This means that
    the value in the DS register is usually initialized at the beginning of a
    program and then left alone. Should it be necessary to use DS to address a
    different segment, its original value is saved, the new segment is
    accessed, and then the original value is restored. In contrast, most
    people use the ES register as needed to access arbitrary segments in
    memory.

    The stack segment (SS) and stack pointer (SP) registers should usually be
    updated implicitly, either by PUSH and POP instructions or by CALL and RET
    instructions that save subroutine return addresses on the stack. When DOS
    loads a program into memory to be executed, it initializes SS and SP to
    usable values. In .COM programs, SS:SP points to the end of the program's
    default segment; in .EXE programs, SS:SP is determined explicitly by the
    size and location of the program's stack segment. In either case, it's
    rare that you need to change SS or SP explicitly.

    If you need to discard a number of values from the stack or reserve
    temporary storage space on top of the stack, you can increment or
    decrement SP directly:

    add sp,8                        ; discard four words (8 bytes)
                                    ; from stack
    sub sp,6                        ; add three empty words (6 bytes)
                                    ; to top of stack

    If you need to move the stack to a different location in memory, you must
    generally update both SS and SP at the same time:

    cli                             ; disable interrupts
    mov ss,NewStackSeg              ; update SS from a memory variable
    mov sp,NewStackPtr              ; update SP from a memory variable
    sti                             ; re-enable interrupts

    Be careful when you change SS and SP explicitly. If you modify SS but fail
    to update SP, SS will be specifying a new stack segment while SP will be
    pointing somewhere inside another stack segment──and that's asking for
    trouble the next time you use the stack.
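
    A small model makes the SS:SP bookkeeping concrete. The Python sketch
    below is an illustration under assumptions: the Stack class and its method
    names are invented for the example, and memory is a plain 1 MB byte
    array. It relies on standard 8086 behavior that the text above does not
    spell out: PUSH decrements SP by 2 before storing a word, and POP loads a
    word and then increments SP by 2.

```python
class Stack:
    """Toy model of the 8086 stack: SS:SP addresses a word in 1 MB of memory."""

    def __init__(self, ss, sp):
        self.mem = bytearray(1 << 20)
        self.ss, self.sp = ss, sp

    def _top(self):
        # Physical address of the word at SS:SP
        return ((self.ss << 4) + self.sp) & 0xFFFFF

    def push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF          # SP moves down first
        a = self._top()
        self.mem[a:a + 2] = word.to_bytes(2, "little")

    def pop(self):
        a = self._top()
        word = int.from_bytes(self.mem[a:a + 2], "little")
        self.sp = (self.sp + 2) & 0xFFFF          # then SP moves back up
        return word

st = Stack(ss=0x3000, sp=0x0100)
for value in (1, 2, 3, 4, 5):
    st.push(value)

st.sp = (st.sp + 8) & 0xFFFF      # add sp,8 : discard four words
print(hex(st.pop()))              # the first value pushed is now on top

st.ss = 0x4000                    # SS changed, SP left alone:
print(hex(st._top()))             # SP now indexes into a different segment
```

    The last two lines show the hazard described above: after SS alone
    changes, the same SP value selects a location in the new segment that has
    nothing to do with the data already pushed.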

    It's hard to be explicit about the use of the other registers. In general,
    most programmers try to minimize memory accesses by keeping the
    intermediate results of lengthy computations in registers. This is because
    it takes longer to perform a computation on a value stored in memory than
    on a value stored in a register. Of course, the 8086 has only so many
    registers to work with, so you may find yourself running out of registers
    before you run out of variables.


How the 8086 Uses I/O Ports

    The 8086-family microprocessors communicate with and control many parts of
    the computer through the use of input and output (I/O) ports. The I/O
    ports are doorways through which information passes as it travels to or
    from an I/O device, such as a keyboard or a printer. Most of the support
    chips we described in Chapter 1 are accessed through I/O ports; in fact,
    each chip may use several port addresses for different purposes.

    Each port is identified by a 16-bit port number, which can range from
    0000H through FFFFH (decimal 65,535); the CPU selects a particular port by
    this number.

    As it does when accessing memory, the CPU uses the data and address buses
    as conduits for communication with the ports. To access a port, the CPU
    first sends a signal on the system bus to notify all I/O devices that the
    address on the bus is that of a port. The CPU then sends the port address.
    The device with the matching port address responds.

    The port number addresses a memory location that is associated with an I/O
    device but is not part of main memory. In other words, an I/O port number
    is not the same as a memory address. For example, I/O port 3D8H has
    nothing to do with memory address 003D8H. To access an I/O port, you don't
    use data-transfer instructions like MOV and STOS. Instead, you use the
    instructions IN and OUT, which are reserved for I/O port access.

    ──────────────────────────────────────────────────────────────────────────
    NOTE:
    Many high-level programming languages provide functions that access I/O
    ports. The BASIC INP function and OUT statement, and the C functions inp
    and outp, are typical examples.
    ──────────────────────────────────────────────────────────────────────────
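
    To make the separation between the two address spaces concrete, here is a
    minimal Python model. It is a sketch, not a hardware interface: the names
    inp and outp simply mirror the C functions mentioned in the note, the
    dictionary standing in for the port space is an invention of this
    example, and the value returned for a port that has never been written
    (FFH) is an assumption. Real ports are device registers, so reading one
    does not necessarily return the last value written to it.

```python
memory = bytearray(1 << 20)   # 1 MB of main memory, addressed by MOV, STOS, ...
ports = {}                    # separate 64 K port space, addressed by IN and OUT

def outp(port, value):
    """Model of OUT: write a byte to an I/O port (not to memory)."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port numbers are 16 bits wide")
    ports[port] = value & 0xFF

def inp(port):
    """Model of IN: read a byte from an I/O port."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port numbers are 16 bits wide")
    # Assumption of this sketch: a port nobody has written reads as FFH.
    return ports.get(port, 0xFF)

memory[0x003D8] = 0x11        # a store to memory address 003D8H ...
outp(0x3D8, 0x29)             # ... and a write to port 3D8H do not collide
print(hex(memory[0x003D8]), hex(inp(0x3D8)))
```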

    The uses of specific I/O ports are determined by the hardware designers.
    Programs that make use of I/O ports need to be aware of the port numbers,
    as well as their use and meaning. Port number assignments differ slightly
    among the PC family members, but, in general, IBM has reserved the same
    ranges of I/O port numbers for the same input/output devices in all PCs
    and PS/2s. (See Figure 2-10.) For details on how each I/O port is used,
    see the descriptions of the various input/output devices in the IBM
    technical reference manuals.

    Description                                    I/O Port        Comment
                                                    Numbers
    ───────────────────────────────────────────────────────────────────────────
    Programmable Interrupt Controller (master)     20H─3FH

    System timer                                   40H─5FH

    Keyboard controller                            60H─6FH         On PS/2 Mode
                                                                    are reserved
                                                                    control and

    System control port B                          61H             PS/2 models

    Real-time clock, NMI mask                      70H─7FH         On PC, PC/XT
                                                                    30, NMI mask

    System control port A                          92H             PS/2 models

    Programmable Interrupt Controller (slave)      A0H─BFH         On PS/2 Mode

    Real-time clock                                B0H─BFH,        PS/2 Model 3
                                                    E0H─EFH

    Clear math coprocessor busy                    F0H

    Reset math coprocessor                         F1H

    Math coprocessor                               F8H─FFH

    Fixed-disk controller                          1F0H─1F8H

    Game control adapter                           200H─207H

    Parallel printer 3                             278H─27BH

    Serial communications 2                        2F8H─2FFH

    Fixed-disk controller                          320H─32FH       PC/XT and PS

    PC network                                     360H─363H,
                                                    368H─36BH

    Parallel printer 2                             378H─37BH

    Monochrome Display Adapter                     3B0H─3BBH       Also used by
                                                                    monochrome v

    Parallel printer 1                             3BCH─3BFH

    Enhanced Graphics Adapter (EGA),               3C0H─3CFH
    Video Graphics Array (VGA)

    Color Graphics Adapter (CGA),                  3D0H─3DFH       Also used by
    Multi-Color Graphics Array (MCGA)                              color video

    Diskette controller                            3F0H─3F7H

    Serial communications 1                        3F8H─3FFH
    ───────────────────────────────────────────────────────────────────────────


    Figure 2-10.  PC and PS/2 input/output port assignments. This table lists
    the most frequently used I/O ports. For a complete list, see the IBM
    Technical Reference manuals.
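
    A program that logs or traces port accesses might keep part of this table
    in a lookup structure. The Python sketch below is illustrative only: the
    PORT_RANGES list holds just a handful of rows copied from Figure 2-10, and
    the name describe_port is invented for the example.

```python
# (first port, last port, description), a subset of Figure 2-10
PORT_RANGES = [
    (0x020, 0x03F, "Programmable Interrupt Controller (master)"),
    (0x040, 0x05F, "System timer"),
    (0x060, 0x06F, "Keyboard controller"),
    (0x2F8, 0x2FF, "Serial communications 2"),
    (0x378, 0x37B, "Parallel printer 2"),
    (0x3B0, 0x3BB, "Monochrome Display Adapter"),
    (0x3D0, 0x3DF, "Color Graphics Adapter (CGA), MCGA"),
    (0x3F0, 0x3F7, "Diskette controller"),
    (0x3F8, 0x3FF, "Serial communications 1"),
]

def describe_port(port):
    """Return the Figure 2-10 description for a port, or None if unlisted."""
    for first, last, name in PORT_RANGES:
        if first <= port <= last:
            return name
    return None

print(describe_port(0x3D8))
```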


How the 8086 Uses Interrupts

    An interrupt is an indication to the microprocessor that its immediate
    attention is needed. The 8086-family microprocessors can respond to
    interrupts from either hardware or software. A hardware device can
    generate an interrupt signal that is processed by the programmable
    interrupt controller (PIC) and passed to the microprocessor; in software,
    the INT instruction generates an interrupt. In both cases, the
    microprocessor stops processing and executes a memory-resident subroutine
    called an interrupt handler. After the interrupt handler has performed its
    task, the microprocessor resumes processing at the point the interrupt
    occurred.

    The 8086 supports 256 different interrupts, each identified by a number
    between 00H and FFH (decimal 255). The segmented addresses of the 256
    interrupt handlers are stored in an interrupt vector table that starts at
    0000:0000H (that is, at the very beginning of available memory). Each
    interrupt vector is 4 bytes in size, so you can locate the address of any
    interrupt handler by multiplying the interrupt number by 4. You can also
    replace an existing interrupt handler with a new one by storing the new
    handler's segmented address in the appropriate interrupt vector.
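
    The multiply-by-4 rule is easy to check with a short Python sketch. This
    is a model, not code that runs on the PC: low memory is simulated with a
    byte array, the function names are invented here, and the vector value
    F000:F841H is only an example. Each vector is stored as two little-endian
    words, offset first and segment second.

```python
import struct

def vector_address(int_no):
    """Offset within segment 0000H of the 4-byte vector for an interrupt."""
    if not 0 <= int_no <= 0xFF:
        raise ValueError("interrupt numbers run from 00H through FFH")
    return int_no * 4

def read_vector(memory, int_no):
    """Return (segment, offset) of the handler stored in the vector table."""
    addr = vector_address(int_no)
    offset, segment = struct.unpack_from("<HH", memory, addr)
    return segment, offset

# Simulated low memory holding one example vector for interrupt 12H.
low_memory = bytearray(1024)                       # 256 vectors x 4 bytes
struct.pack_into("<HH", low_memory, vector_address(0x12), 0xF841, 0xF000)

seg, off = read_vector(low_memory, 0x12)
print(f"INT 12H vector at 0000:{vector_address(0x12):04X} -> {seg:04X}:{off:04X}")
```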

Software Interrupts

    Probably the most familiar interrupts are those generated by the INT
    instruction. Consider what happens when the CPU executes the following
    instruction:

    INT 12H

    The CPU pushes the current contents of the flags register, the CS (code
    segment) register, and the IP (instruction pointer) register onto the
    stack. Then it transfers control to the interrupt handler corresponding to
    interrupt number 12H, using the segmented address stored at 0000:0048H.
    The CPU then executes the interrupt 12H handler, which responds
    appropriately to interrupt 12H. The interrupt handler terminates with an
    IRET instruction that pops CS:IP and the flags back into the registers,
    thus transferring control back to the interrupted program.
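
    The sequence just described can be traced with a toy CPU model. The
    Python sketch below is a simplification under stated assumptions: the CPU
    class, its starting register values, and the example vector are all
    invented for illustration. It also includes one detail not mentioned
    above, namely that the 8086 clears the interrupt and trap flags after
    saving the flags register, so the handler starts with interrupts
    disabled.

```python
import struct

class CPU:
    """Toy model of the parts of the 8086 involved in INT and IRET."""

    IF, TF = 0x0200, 0x0100            # interrupt and trap flag bit masks

    def __init__(self):
        self.mem = bytearray(1 << 20)
        self.cs, self.ip, self.flags = 0x1000, 0x0200, 0x0202
        self.ss, self.sp = 0x2000, 0x0100

    def _push(self, word):
        self.sp = (self.sp - 2) & 0xFFFF
        struct.pack_into("<H", self.mem, (self.ss << 4) + self.sp, word)

    def _pop(self):
        (word,) = struct.unpack_from("<H", self.mem, (self.ss << 4) + self.sp)
        self.sp = (self.sp + 2) & 0xFFFF
        return word

    def int_(self, n):
        """INT n: save flags, CS, and IP, then jump through vector n."""
        self._push(self.flags)
        self._push(self.cs)
        self._push(self.ip)            # IP already points past the INT
        self.flags &= ~(self.IF | self.TF) & 0xFFFF
        self.ip, self.cs = struct.unpack_from("<HH", self.mem, n * 4)

    def iret(self):
        """IRET: restore IP, CS, and flags in the reverse order."""
        self.ip = self._pop()
        self.cs = self._pop()
        self.flags = self._pop()

cpu = CPU()
struct.pack_into("<HH", cpu.mem, 0x12 * 4, 0xF841, 0xF000)  # example vector
cpu.int_(0x12)
print(f"in handler: {cpu.cs:04X}:{cpu.ip:04X}")
cpu.iret()
print(f"back at:    {cpu.cs:04X}:{cpu.ip:04X}")
```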

Hardware Interrupts

    The microprocessor responds to a hardware interrupt in much the same way
    it responds to a software interrupt: by transferring control to an
    interrupt handler. The important difference lies in the way the interrupt
    is signalled.

    Devices such as the system timer, the hard disk, the keyboard, and the
    serial communications ports can generate interrupt signals on a set of
    reserved interrupt request (IRQ) lines. These lines are monitored by the
    PIC circuit, which assigns interrupt numbers to them. When a particular
    hardware interrupt occurs, the PIC places the corresponding interrupt
    number on the system data bus where the microprocessor can find it.

    The PIC also assigns priorities to the various interrupt requests. For
    example, the highest-priority PIC interrupt in all PCs and PS/2s is the
    timer-tick interrupt, which is signalled on interrupt request line 0
    (IRQ0) and is assigned interrupt 08H by the PIC. When the system timer
    generates a timer-tick interrupt, it does so by signalling on IRQ0; the
    PIC responds by signalling the CPU to execute interrupt 08H. If a
    lower-priority hardware interrupt request occurs while the timer-tick
    interrupt is being processed, the PIC delays the lower-priority interrupt
    until the timer interrupt handler signals that it has finished its
    processing.
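
    The priority rule can be sketched as a small state machine. The Python
    model below is a deliberate simplification, not a description of the PIC
    chip's registers: the class and method names are invented, a lower IRQ
    number is treated as a higher priority, and the CPU's own interrupt flag
    is ignored. The use of IRQ1 for the keyboard in the usage lines is an
    assumption of the example.

```python
class PIC:
    """Toy priority model: lower IRQ number = higher priority."""

    def __init__(self, base=0x08):
        self.base = base           # interrupt number assigned to IRQ0
        self.pending = set()       # requests seen but not yet delivered
        self.in_service = set()    # delivered, awaiting end-of-interrupt

    def request(self, irq):
        self.pending.add(irq)

    def next_interrupt(self):
        """Return the interrupt number to deliver now, or None to wait."""
        if not self.pending:
            return None
        irq = min(self.pending)
        if self.in_service and min(self.in_service) <= irq:
            return None            # an equal or higher priority handler runs
        self.pending.remove(irq)
        self.in_service.add(irq)
        return self.base + irq

    def end_of_interrupt(self):
        """The handler signals that the highest-priority request is done."""
        if self.in_service:
            self.in_service.remove(min(self.in_service))

pic = PIC()
pic.request(0)                     # timer tick on IRQ0
print(pic.next_interrupt())        # delivered as interrupt 08H
pic.request(1)                     # keyboard on IRQ1 arrives meanwhile
print(pic.next_interrupt())        # held until the timer handler finishes
pic.end_of_interrupt()
print(pic.next_interrupt())        # now delivered as interrupt 09H
```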

    When you cold boot the computer, the system start-up routines assign
    interrupt numbers and priorities to the hardware interrupts by
    initializing the PIC. In 8088- and 8086-based machines (PCs, PC/XTs, PS/2
    models 25 and 30), interrupt numbers 08H through 0FH are assigned to
    interrupt request levels 0 through 7 (IRQ0 through IRQ7). In PC/ATs and
    PS/2 models 50, 60, and 80, an additional eight interrupt lines (IRQ8
    through IRQ15) are assigned interrupt numbers 70H through 77H.
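
    The assignment of interrupt numbers to IRQ lines reduces to simple
    arithmetic. In this Python sketch the function name irq_to_interrupt and
    the has_slave_pic parameter are hypothetical; the parameter stands for
    the machines (PC/AT and PS/2 models 50, 60, and 80) that provide IRQ8
    through IRQ15.

```python
def irq_to_interrupt(irq, has_slave_pic=True):
    """Map an IRQ line to the interrupt number assigned at start-up."""
    if 0 <= irq <= 7:
        return 0x08 + irq
    if 8 <= irq <= 15 and has_slave_pic:
        return 0x70 + (irq - 8)
    raise ValueError(f"IRQ{irq} is not available on this machine")

for irq in (0, 1, 8, 15):
    print(f"IRQ{irq:<2} -> interrupt {irq_to_interrupt(irq):02X}H")
```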

    One hardware interrupt bypasses the PIC altogether. This is the
    non-maskable interrupt (NMI), which is assigned interrupt number 02H in
    the 8086 family. The NMI is used by devices that require absolute,
    "now-or-never" priority over all other CPU functions. In particular, when
    a hardware memory error occurs, the computer's RAM subsystem generates an
    NMI. This causes the CPU to pass control to an interrupt 02H handler; the
    default handler in the PC family resides in ROM and issues the "PARITY
    CHECK" message you see when a memory error occurs.

    When you debug a program on any member of the PC family, remember that
    hardware interrupts are occurring all the time. For example, the system
    timer-tick interrupt (interrupt 08H) occurs roughly 18.2 times per second.
    The keyboard and disk-drive controllers also generate interrupts. Each
    time these hardware interrupts occur, the 8086 uses the current stack to
    save CS:IP and the flags register. If your stack is too small, or if you
    are manipulating SS and SP when a hardware interrupt occurs, the 8086 may
    damage valuable data when it saves CS:IP and the flags.
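
    A little arithmetic shows where the 18.2 figure comes from and how much
    stack an interrupt consumes. The Python sketch below rests on two values
    that this section does not state and that are assumed here: a timer input
    clock of about 1.19318 MHz and a default timer count of 65,536. The
    handler_bytes parameter is hypothetical; it stands for whatever a given
    handler pushes for its own use.

```python
PIT_CLOCK_HZ = 1_193_180      # assumed timer input clock (about 1.19318 MHz)
DIVISOR = 65_536              # assumed default count for the timer-tick channel

ticks_per_second = PIT_CLOCK_HZ / DIVISOR
ms_between_ticks = 1000 / ticks_per_second

# Flags, CS, and IP are each one 16-bit word.
BYTES_SAVED_PER_INTERRUPT = 3 * 2

def stack_bytes_needed(nesting_depth, handler_bytes=0):
    """Stack bytes consumed if interrupts nest to the given depth."""
    return nesting_depth * (BYTES_SAVED_PER_INTERRUPT + handler_bytes)

print(f"{ticks_per_second:.2f} ticks per second, "
      f"one every {ms_between_ticks:.1f} ms")
print(stack_bytes_needed(nesting_depth=3, handler_bytes=16),
      "bytes for 3 nested interrupts")
```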

    If you look back at our example of updating SS and SP earlier in this
    chapter, you'll see that we explicitly disable hardware interrupts by
    executing the CLI instruction prior to updating SS. This prevents a
    hardware interrupt from occurring between the two MOV instructions, while
    SS:SP does not yet point to a valid stack. (Actually, this is a problem
    only in very early releases of the 8088; the chip was later redesigned to
    prevent this problem by disabling interrupts during the instruction that
    follows a data move into SS.)

    We'll talk in more detail about how PCs and PS/2s use interrupts in
    Chapters 3 and 8.



────────────────────────────────────────────────────────────────────────────
Chapter 3  The ROM Software

    The Start-Up ROM

    The ROM BIOS
    Interrupt Vectors
    Key Low-Memory Addresses
    The ROM Version and Machine-ID Markers

    The ROM BASIC

    The ROM Extensions

    Comments

    It takes software to make a computer go. And getting a computer going and
    keeping it going is much easier if some of that software is permanently
    built into the computer. That's what the ROM programs are all about. ROM
    stands for read-only memory: memory that is permanently recorded in the
    circuitry of the computer's ROM chips and can't be changed, erased, or
    lost.

    PCs and PS/2s come with a substantial amount of ROM that contains the
    programs and data needed to start and operate the computer and its
    peripheral devices. The advantage of having a computer's fundamental
    programs stored in ROM is that they are right there──built into the
    computer──and there is no need to load them into memory from disk the way
    that DOS must be loaded. Because they are permanent, the ROM programs are
    very often the foundation upon which other programs (including DOS) are
    built.

    There are four elements to the ROM in IBM's PC family: the start-up
    routines, which do the work of getting the computer started; the ROM
    BIOS──an acronym for Basic Input/Output System──which is a collection of
    machine-language routines that provide support services for the continuing
    operation of the computer; the ROM BASIC, which provides the core of the
    BASIC programming language; and the ROM extensions, which are programs
    that are added to the main ROM when certain optional equipment is added to
    the computer. We'll be examining each of these four major elements
    throughout the rest of this chapter.

    The ROM programs occupy addresses F000:0000H through F000:FFFFH in the
    PC/XT/AT family and the PS/2 models 25 and 30, and E000:0000H through
    F000:FFFFH in the other PS/2s. However, the routines themselves are not
    located at any specific addresses in ROM as they are in other computers.
    The address of a particular ROM routine varies among the different members
    of the PC/XT/AT and PS/2 families.
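
    The two address ranges correspond to 64 KB and 128 KB of ROM address
    space, which a short calculation confirms. The Python sketch below is
    only a worked example; the function names physical and region_size are
    invented for it.

```python
def physical(segment, offset):
    """Convert a segmented address to a 20-bit physical address."""
    return ((segment << 4) + offset) & 0xFFFFF

def region_size(start, end):
    """Size in bytes of the inclusive range start..end (segment, offset)."""
    return physical(*end) - physical(*start) + 1

small_rom = region_size((0xF000, 0x0000), (0xF000, 0xFFFF))
large_rom = region_size((0xE000, 0x0000), (0xF000, 0xFFFF))
print(small_rom // 1024, "KB;", large_rom // 1024, "KB")
```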

    Although the exact addresses of the ROM routines can vary, IBM provides a
    consistent interface to the ROM software by using interrupts. Later in
    this book we'll show you exactly how to use interrupts to execute the ROM
    routines.


The Start-Up ROM

    The first job the ROM programs have is to supervise the start-up of the
    computer. Unlike other aspects of the ROM, the start-up routines have
    little to do with programming the PC family──but it is still worthwhile to
    understand what they do.

    The start-up routines perform several tasks:

    ■  They run a quick reliability test of the computer (and the ROM
        programs) to ensure everything is in working order.

    ■  They initialize the chips and the standard equipment attached to the
        computer.

    ■  They set up the interrupt-vector table.

    ■  They check to see what optional equipment is attached.

    ■  They load the operating system from disk.

    The following paragraphs discuss these tasks in greater detail.

    The reliability test, part of a process known as the Power On Self Test
    (POST), is an important first step in making sure the computer is ready.
    All POST routines are quite brief except for the memory tests, which can
    be annoyingly lengthy in computers that contain a large amount of memory.

    The initialization process is slightly more complex. One routine sets the
    default values for interrupt vectors. These default values either point to
    the standard interrupt handlers located inside the ROM BIOS, or they point
    to do-nothing routines in the ROM BIOS that may later be superseded by the
    operating system or by your own interrupt handlers. Another initialization
    routine determines what equipment is attached to the computer and then
    places a record of it at standard locations in low memory. (We'll be
    discussing this equipment list in more detail later in the chapter.) How
    this information is acquired varies from model to model──for example, in
    the PC it is taken mostly from the settings of two banks of switches
    located on the computer's system board; in the PC/AT and the PS/2s, the
    ROM BIOS reads configuration information from a special nonvolatile memory
    area whose contents are initialized by special setup programs supplied by
    IBM. The POST routines learn about the computer's hardware by a logical
    inspection and test. In effect, the initialization program shouts to each
    possible option, "Are you there?", and listens for a response.

    No matter how it is acquired, the status information is recorded and
    stored in the same way for every model so that your programs can examine
    it. The initialization routines also check for new equipment and
    extensions to ROM. If they find any, they momentarily turn control over to
    the ROM extensions so that they can initialize themselves. The
    initialization routines then continue executing the remaining start-up
    routines (more on this later in the chapter).

    The final part of the start-up procedure, after the POST tests, the
    initialization process, and the incorporation of ROM extensions, is called
    the bootstrap loader. It's a short routine that loads a program from disk.
    In essence, the ROM bootstrap loader attempts to read a disk boot program
    from a disk. If the boot program is successfully read into memory, the ROM
    loader passes control of the computer to it. The disk boot program is
    responsible for loading another, larger disk program, which is usually a
    disk operating system such as DOS, but can be a self-contained and
    self-loading program, such as Microsoft Flight Simulator. If the ROM
    bootstrap loader cannot read a disk's boot program, it activates the
    built-in ROM BASIC; if the boot program it reads contains an error, it
    displays an error message instead. As soon as either of these two events
    occurs, the system start-up procedure is finished and the other programs
    take over.


The ROM BIOS

    The ROM BIOS is the part of ROM that is in active use whenever the
    computer is at work. The role of the ROM BIOS is to provide the
    fundamental services that are needed for the operation of the computer.
    For the most part, the ROM BIOS controls the computer's peripheral
    devices, such as the display screen, keyboard, and disk drives. When we
    use the term BIOS in its narrowest sense, we are referring to the device
    control programs──the programs that translate a simple command, such as
    read-something-from-the-disk, into all the steps needed to actually
    perform the command, including error detection and correction. In the
    broadest sense, the BIOS includes not only routines needed to control the
    PC's devices, but also routines that contain information or perform tasks
    that are fundamental to other aspects of the computer's operation, such as
    keeping track of the time of day.

    Conceptually, the ROM BIOS programs lie between programs that are
    executing in RAM (including DOS) and the hardware. In effect, the BIOS
    works in two directions. One side receives requests from programs to
    perform the standard ROM BIOS input/output services. A program invokes
    these services with a combination
    of an interrupt number (which indicates the subject of the service
    request, such as printer services) and a service number (which indicates
    the specific service to be performed). The other side of the ROM BIOS
    communicates with the computer's hardware devices (display screen, disk
    drives, and so on), using whatever detailed command codes each device
    requires. This side of the ROM BIOS also handles any hardware interrupts
    that a device generates to get attention. For example, whenever you press
    a key, the keyboard generates an interrupt to let the ROM BIOS know.
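
    The two-level selection, first by interrupt number and then by service
    number, resembles a pair of nested dispatch tables. The Python sketch
    below is a model only: the handler functions are stand-ins invented for
    the example, and it assumes the usual convention (not stated in this
    passage) that the service number is passed in AH, the high byte of AX.
    Interrupts 10H and 17H are used here for the video and printer services.

```python
def video_services(service, regs):
    # Stand-in for the ROM BIOS video handler (interrupt 10H).
    return f"video service {service:02X}H"

def printer_services(service, regs):
    # Stand-in for the ROM BIOS printer handler (interrupt 17H).
    return f"printer service {service:02X}H"

# One handler per interrupt number, as in the interrupt vector table.
HANDLERS = {0x10: video_services, 0x17: printer_services}

def bios_call(int_no, regs):
    """Route a request by interrupt number; the handler picks the service.

    Assumed convention: the service number is the high byte of AX (AH).
    """
    service = (regs["AX"] >> 8) & 0xFF
    return HANDLERS[int_no](service, regs)

print(bios_call(0x17, {"AX": 0x0200, "DX": 0}))
```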

    Of all the ROM software, the BIOS services are probably the most
    interesting and useful to programmers; as a matter of fact, we have
    devoted six chapters (Chapters 8 through 13) to them. Since
    we deal with them so thoroughly later on, we'll skip any specific
    discussion of what the BIOS services do and instead focus on how the BIOS
    as a whole keeps track of the computer's input and output processes.

Interrupt Vectors

    The IBM PC family, like all computers based on the Intel 8086 family of
    microprocessors, is controlled largely through the use of interrupts,
    which can be generated by hardware or software. The BIOS service routines
    are no exception; each is assigned an interrupt number that you must call
    when you want to use the service.

    When an interrupt occurs, control of the computer is turned over to an
    interrupt-handling subroutine that is often stored in the system's ROM (a
    BIOS service routine is nothing more than an interrupt handler). The
    interrupt handler is called by loading its segment and offset addresses
    into registers that control program flow: the CS (code segment) register
    and the IP (instruction pointer) register──together known as the CS:IP
    register pair. The segmented addresses that locate interrupt handlers are
    called interrupt vectors.

    During the system start-up process, the BIOS sets the interrupt vectors to
    point to the interrupt handlers in ROM. The interrupt vector table starts
    at the beginning of RAM, at address 0000:0000H. (See Chapter 2 for more
    about interrupts and interrupt vectors.) Each entry in the table is stored
    as a pair of words, with the offset portion first and the segment portion
    second. The interrupt vectors can be changed to point to a new interrupt
    handler simply by locating the vector and changing its value.
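
    Changing a vector usually goes hand in hand with saving the old one so
    that it can be restored later. The Python sketch below models that
    pattern on a simulated memory image; the function names get_vector and
    set_vector, the choice of interrupt 1CH, and both handler addresses are
    assumptions made for illustration. On a real machine a program would also
    have to keep an interrupt from arriving between the two word stores.

```python
import struct

def get_vector(mem, int_no):
    """Read (segment, offset) from the 4-byte entry at int_no * 4."""
    offset, segment = struct.unpack_from("<HH", mem, int_no * 4)
    return segment, offset

def set_vector(mem, int_no, segment, offset):
    """Store offset first, then segment, and return the previous vector."""
    old = get_vector(mem, int_no)
    struct.pack_into("<HH", mem, int_no * 4, offset, segment)
    return old

mem = bytearray(1 << 20)
set_vector(mem, 0x1C, 0xF000, 0xFF53)          # pretend ROM default
old = set_vector(mem, 0x1C, 0x2345, 0x0010)    # install a new handler
print("old handler %04X:%04X" % old,
      "new handler %04X:%04X" % get_vector(mem, 0x1C))
set_vector(mem, 0x1C, *old)                    # restore before exiting
```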

    As a general rule, PC-family interrupts can be divided into six
    categories: microprocessor, hardware, software, DOS, BASIC, and general
    use.

    Microprocessor interrupts, often called logical interrupts, are designed
    into the microprocessor. Four of them (interrupts 00H, 01H, 03H, and 04H)
    are generated by the microprocessor itself, and another (interrupt 02H,
    the nonmaskable interrupt) is activated by a signal generated by certain
    hardware devices, such as the 8087 math coprocessor.

    Hardware interrupts are built into the PC hardware. In PCs, XTs, and PS/2
    models 25 and 30, interrupt numbers 08H through 0FH are used for hardware
    interrupts; in ATs and PS/2 models 50, 60, and 80, interrupt numbers 08H
    through 0FH and 70H through 77H are reserved for hardware interrupts. (See
    Chapter 2 for more about hardware interrupts.)

    ──────────────────────────────────────────────────────────────────────────
    The Part DOS Plays
    The ROM bootstrap loader's only function is to read a bootstrap program
    from a disk and transfer control to it. On a bootable DOS disk, the disk
    bootstrap program verifies that DOS is stored on the disk by looking for
    two hidden files named IBMBIO.COM and IBMDOS.COM. If it finds them, it
    loads them into memory along with the DOS command interpreter,
    COMMAND.COM. During this loading process, optional parts of DOS, such as
    installable device drivers, may also be loaded.

    The IBMBIO.COM file contains extensions to the ROM BIOS. These
    extensions can be changes or additions to the basic I/O operations and
    often include corrections to the existing ROM BIOS, new routines for new
    equipment, or customized changes to the standard ROM BIOS routines.
    Because they are part of disk software, the IBMBIO.COM routines provide
    a convenient way to modify the ROM BIOS. All that is necessary, besides
    the new routine, is that the interrupt vectors for the previous ROM BIOS
    routines be changed to point to the location in memory where the new
    disk BIOS routines are placed. Whenever new devices are added to the
    computer, their support programs can be included in the IBMBIO.COM file
    or as installable device drivers, eliminating the need to replace ROM
    chips. See Appendix A for more on device drivers.

    You can think of the ROM BIOS routines as the lowest-level system
    software available, performing the most fundamental and primitive I/O
    operations. The IBMBIO.COM routines, being extensions of the ROM BIOS,
    are essentially on the same low level, also providing basic functions.
    By comparison, the IBMDOS.COM routines are more sophisticated; think of
    them as occupying the next level up, with applications programs on top.

    The IBMDOS.COM file contains the DOS service routines. The DOS services,
    like the BIOS services, can be called by programs through a set of
    interrupts whose vectors are placed in the interrupt-vector table in low
    memory. One of the DOS interrupts, interrupt 21H (decimal 33), is
    particularly important because when invoked, it gives you access to a
    rather large group of DOS functions. The DOS functions provide more
    sophisticated and efficient control over the I/O operations than the
    BIOS routines do, especially with regard to disk file operations. All
    standard disk processes──formatting diskettes; reading and writing data;
    opening, closing, and deleting files; performing directory searches──are
    included in the DOS functions and provide the foundation for many
    higher-level DOS programs, such as FORMAT, COPY, and DIR. Your programs
    can use the DOS services when they need more control of I/O operations
    than programming languages allow, and when you are reluctant to dig all
    the way down to the BIOS level. The DOS services are a very important
    part of this book, and we have devoted five chapters to them. (See
    Chapters 14 through 18.)

    The COMMAND.COM file is the third and most important part of DOS, at
    least from a utilitarian standpoint. This file contains the routines
    that interpret the commands you type in through the keyboard in the DOS
    command mode. By comparing your input to a table of command names, the
    COMMAND.COM program can differentiate between internal commands that are
    part of the COMMAND.COM file, such as RENAME or ERASE, and external
    commands, such as the DOS utility programs (like DEBUG) or one of your
    own programs. The command interpreter acts by executing the required
    routines for internal commands or by searching for the requested
    programs on disk and loading them into memory. The whole subject of the
    COMMAND.COM file and how it works is intriguing and well worth
    investigating──as are the other DOS programs. We recommend you read the
    DOS Technical Reference Manual or Inside the IBM PC for additional
    information.
    ──────────────────────────────────────────────────────────────────────────

    Software interrupts incorporated into the PC design are part of the ROM
    BIOS programs. ROM BIOS routines invoked by these interrupts cannot be
    changed, but the vectors that point to them can be changed to point to
    different routines. Reserved interrupt numbers are 10H through 1FH
    (decimal 16 through 31) and 40H through 5FH (decimal 64 through 95).

    DOS interrupts are always available when DOS is in use. Many programs and
    programming languages use the services provided by DOS through the DOS
    interrupts to handle basic operations, especially disk I/O. DOS interrupt
    numbers are 20H through 3FH (decimal 32 through 63).

    BASIC interrupts are assigned by BASIC itself and are always available
    when BASIC is in use. The reserved interrupt numbers are 80H through F0H
    (decimal 128 through 240).

    General-use interrupts are available for temporary use in your programs.
    The reserved interrupt numbers are 60H through 66H (decimal 96 through
    102).
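
    The numeric ranges quoted above lend themselves to a simple table lookup.
    The following Python sketch is offered only as an illustration; the names
    CATEGORIES and classify_interrupt are our own inventions, and the ranges
    are exactly those given in the text. An interrupt number that the text
    does not place in any category (05H, for example) comes back as None.

```python
# Inclusive interrupt-number ranges, taken from the categories in the text.
# The names and the table layout are illustrative assumptions.
CATEGORIES = [
    ("microprocessor", 0x00, 0x04),
    ("hardware", 0x08, 0x0F),
    ("software (ROM BIOS)", 0x10, 0x1F),
    ("DOS", 0x20, 0x3F),
    ("software (ROM BIOS)", 0x40, 0x5F),
    ("general use", 0x60, 0x66),
    ("hardware (AT, PS/2 models 50, 60, and 80)", 0x70, 0x77),
    ("BASIC", 0x80, 0xF0),
]


def classify_interrupt(number):
    """Return the category for an interrupt number, or None if unlisted."""
    if not 0 <= number <= 0xFF:
        raise ValueError("interrupt numbers run from 00H through FFH")
    for name, low, high in CATEGORIES:
        if low <= number <= high:
            return name
    return None


if __name__ == "__main__":
    for n in (0x02, 0x09, 0x10, 0x21, 0x60):
        print("%02XH: %s" % (n, classify_interrupt(n)))
```

    Keeping the ranges in one table, rather than in a chain of comparisons,
    makes it easy to check them against the text.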

    Most of the interrupt vectors used by the ROM BIOS, DOS, and BASIC contain
    the addresses of interrupt handlers. A few interrupt vectors, however,
    point to tables of useful information. For example, the interrupt 1EH
    vector contains the address of a table of diskette drive initialization
    parameters; the interrupt 1FH vector points to a table of bit patterns
    used by the ROM BIOS to display text characters; and the interrupt 41H
    and 46H vectors point to tables of fixed-disk parameters. These vectors
    are used for convenience, not for interrupts. If you tried to execute
    interrupt 1EH, for instance, you'd probably crash the system because the
    interrupt 1EH vector points to data, not to executable code.

    The interrupt vectors are stored at the lowest memory locations; the very
    first location in memory contains the vector for interrupt number 00H, and
    so on. Because each vector is two words (4 bytes) in length, you can find
    a particular interrupt vector's location in memory by multiplying its
    interrupt number by 4. For example, the vector for interrupt 05H, the
    print-screen service interrupt, would be at byte offset 20 (5 x 4 = 20);
    that is, at address 0000:0014H. You can examine the interrupt vectors by
    using DEBUG. For example, you could examine the interrupt 05H vector with
    DEBUG in the following way:

    DEBUG
    D 0000:0014 L 4

    DEBUG will show 4 bytes, in hex, like this:

    54 FF 00 F0

    Each word is stored low byte first ("back-words" storage), and the offset
    word precedes the segment word. Read that way, these 4 bytes give
    F000:FF54H as the interrupt vector for interrupt 05H; that is, the entry
    point in ROM of the print-screen service routine. (Of course, this
    address may be different in different members of the PC and PS/2
    families.) The same DEBUG command, with the appropriate offset, displays
    any other interrupt vector just as easily.
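
    The arithmetic involved is easy to check by hand, but a short sketch may
    make it clearer. The following Python fragment is an illustration only;
    the function names vector_offset and decode_vector are our own. It
    computes where a vector lives in segment 0000H and turns the 4 bytes that
    DEBUG displays into a segment and offset.

```python
import struct


def vector_offset(interrupt_number):
    """Byte offset of a vector in segment 0000H (4 bytes per vector)."""
    return interrupt_number * 4


def decode_vector(raw):
    """Decode 4 raw bytes (offset word, then segment word, low byte first)."""
    offset, segment = struct.unpack("<HH", raw)
    return segment, offset


if __name__ == "__main__":
    # The 4 bytes shown by DEBUG in the example above.
    raw = bytes([0x54, 0xFF, 0x00, 0xF0])
    segment, offset = decode_vector(raw)
    print("Vector location: 0000:%04XH" % vector_offset(0x05))
    print("Handler address: %04X:%04XH" % (segment, offset))
```

    Run as a program, this prints 0000:0014H for the vector location and
    F000:FF54H for the handler address, matching the values worked out in
    the text.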

    Figure 3-1 lists the main interrupts and their vector locations. These
    are the interrupts that programmers will probably find most useful.
    Details are available for most of these interrupts in Chapters 8 through
    18. Interrupts that are not mentioned in this list are, for the most part,
    reserved for future development by IBM.

    Interrupt   Offset in    Use
    Hex   Dec   Segment
                0000
    ──────────────────────────────────────────────────────────────────────────
    00H   0     0000         Generated by CPU when division by zero is
                            attempted
    01H   1     0004         Used to single-step through programs (as with
                            DEBUG)
    02H   2     0008         Nonmaskable interrupt (NMI)
    03H   3     000C         Used to set break-points in programs (as with
                            DEBUG)
    04H   4     0010         Generated when arithmetic result overflows
    05H   5     0014         Invokes print-screen service routine in ROM BIOS
    08H   8     0020         Generated by hardware clock tick
    09H   9     0024         Generated by keyboard action
    0EH   14    0038         Signals diskette attention (e.g. to signal
                            completion)
    0FH   15    003C         Used in printer control
    10H   16    0040         Invokes video display services in ROM BIOS
    11H   17    0044         Invokes equipment-list service in ROM BIOS
    12H   18    0048         Invokes memory-size service in ROM BIOS
    13H   19    004C         Invokes disk services in ROM BIOS
    14H   20    0050         Invokes communications services in ROM BIOS
    15H   21    0054         Invokes system services in ROM BIOS
    16H   22    0058         Invokes standard keyboard services in ROM BIOS
    17H   23    005C         Invokes printer services in ROM BIOS
    18H   24    0060         Activates ROM BASIC language
    19H   25    0064         Invokes bootstrap start-up routine in ROM BIOS
    1AH   26    0068         Invokes time and date services in ROM BIOS
    1BH   27    006C         Interrupt by ROM BIOS for Ctrl-Break
    1CH   28    0070         Interrupt generated at each clock tick
    1DH   29    0074         Points to table of video control parameters
    1EH   30    0078         Points to diskette drive parameter table
    1FH   31    007C         Points to CGA video graphics characters
    20H   32    0080         Invokes program-terminate service in DOS
    21H   33    0084         Invokes all function-call services in DOS
    22H   34    0088         Address of DOS program-terminate routine
    23H   35    008C         Address of DOS keyboard-break handler
    24H   36    0090         Address of DOS critical-error handler
    25H   37    0094         Invokes absolute disk-read service in DOS
    26H   38    0098         Invokes absolute disk-write service in DOS
    27H   39    009C         Ends program, but keeps it in memory under DOS
    2FH   47    00BC         DOS Multiplex interrupt
    41H   65    0104         Points to fixed-disk drive parameter table
    43H   67    010C         Points to video graphics characters (EGA, PS/2s)
    67H   103   019C         Invokes LIM Expanded Memory Manager
    ──────────────────────────────────────────────────────────────────────────


    Figure 3-1.  Important interrupts used in the IBM personal computer
    family.

    Changing Interrupt Vectors

    The main programming interest in interrupt vectors is not to read them but
    to change them to point to a new interrupt-handling routine. To do this,
    you must write a routine that performs a different function than the
    standard ROM BIOS or DOS interrupt handlers perform, store the routine in
    RAM, and then assign the routine's address to an existing interrupt in the
    table.

    A vector can be changed byte by byte on an assembly-language level, or by
    using a programming-language instruction like the POKE statement in BASIC.
    In some cases, there may be a danger of an interrupt occurring in the
    middle of a change to the vector. If you are not concerned about this, go
    ahead and use the POKE method. Otherwise, there are two ways to change
    a vector while minimizing the likelihood of interrupts: by suspending
    interrupts during the process, or by using a DOS interrupt specially
    designed to change vectors.

    The first method requires that you use assembly language to suspend
    interrupts while you change the interrupt vector. You can use the clear
    interrupts instruction (CLI), which suspends all maskable interrupts
    until a subsequent STI (set interrupts) instruction is executed. By
    temporarily disabling interrupts with CLI, you ensure that no maskable
    interrupt can occur while you update an interrupt vector.

    ──────────────────────────────────────────────────────────────────────────
    NOTE:
    CLI does not disable the nonmaskable interrupt (NMI). If your
    application is one of the rare ones that needs to supply its own NMI
    handler, the program should temporarily disable the NMI while changing
    the NMI interrupt vector. (See PC or PS/2 technical reference manuals
    for details.)
    ──────────────────────────────────────────────────────────────────────────

    The following example demonstrates how to update an interrupt vector with
    interrupts temporarily disabled. This example uses two MOV instructions to
    copy the segment and offset address of an interrupt handler from DS:DX
    into interrupt vector 60H:

    xor     ax,ax                   ; zero segment register ES
    mov     es,ax
    cli                             ; disable interrupts
    mov     word ptr es:[180h],dx   ; update vector offset
    mov     word ptr es:[182h],ds   ; update vector segment
    sti                             ; enable interrupts
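
    The offsets 180H and 182H in this listing are simply 60H x 4 and the
    word that follows it. The Python sketch below models the vector table as
    a 1024-byte array to show what the two MOV instructions leave in memory.
    It is a model only: set_vector and low_memory are names of our own, the
    handler address 1234:5678H is hypothetical, and nothing here touches a
    real machine or models the masking of interrupts.

```python
import struct


def set_vector(memory, interrupt_number, segment, offset):
    """Store the offset word, then the segment word, at number * 4."""
    struct.pack_into("<HH", memory, interrupt_number * 4, offset, segment)


if __name__ == "__main__":
    low_memory = bytearray(1024)          # room for 256 four-byte vectors
    # Hypothetical handler address, chosen only for illustration.
    set_vector(low_memory, 0x60, 0x1234, 0x5678)
    print(low_memory[0x180:0x184].hex(" "))
```

    The offset lands at 0000:0180H and the segment at 0000:0182H, each stored
    low byte first, just as the assembly-language version arranges.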

    The second method of updating an interrupt vector is to let DOS do it for
    you using DOS interrupt 21H, service 25H (decimal 37), which was designed
    for this purpose. There are two very important advantages to letting DOS
    set interrupt vectors for you. One advantage is that DOS takes on the task
    of putting the vector into place in the safest possible way. The other
    advantage is more far-reaching. When you use DOS service 25H to change an
    interrupt vector, you allow DOS to track changes to any interrupt vectors
    it may itself be using. This is particularly important for programs that
    might run in the DOS "compatibility box" in OS/2. Using a DOS service to
    set an interrupt vector instead of setting it yourself is only one of many
    ways that you can reduce the risk that a program will be incompatible with
    new machines or new operating-system environments.

    The following example demonstrates how to use interrupt 21H, service 25H
    to update the vector for interrupt 60H with the segment and offset of a
    handler routine, here labeled Int60Handler:

    mov     dx,seg Int60Handler     ; copy new segment to DS
    mov     ds,dx
    mov     dx,offset Int60Handler  ; store offset address in DX
    mov     al,60h                  ; interrupt number
    mov     ah,25h                  ; DOS set-interrupt function number
    int     21h                     ; DOS function-call interrupt

    This example shows, in the simplest possible way, how to use the DOS
    service. However, it glosses over an important and subtle difficulty: You
    have to load one of the addresses that you're passing to DOS into the DS
    (data segment) register──which effectively blocks normal access to data
    through the DS register. Getting around that problem requires you to
    preserve the contents of the DS register. Here is one way this can be
    done. In this example, taken from the Norton Utilities programs, the
    interrupt 09H vector is updated with the address of a special interrupt
    handler:

    push    ds                      ; save current data segment
    mov     dx,offset PGROUP:XXX    ; store handler's offset in DX
    push    cs                      ; move handler's code segment...
    pop     ds                      ; ...into DS
    mov     ah,25h                  ; request set-interrupt function
    mov     al,9                    ; change interrupt number 9
    int     21h                     ; DOS function-call interrupt
    pop     ds                      ; restore original data segment

Key Low-Memory Addresses

    Much of the operation of the PCs and PS/2s is controlled by data stored in
    low-memory locations, particularly in the two adjacent 256-byte areas
    beginning at segments 40H and 50H (addresses 0040:0000H and 0050:0000H).
    The ROM BIOS uses the 256 bytes from 0040:0000H through 0040:00FFH as a
    data area for its keyboard, video, disk, printer, and communications
    routines. The 256 bytes between 0050:0000H and 0050:00FFH are used
    primarily by BASIC, although a few ROM BIOS status variables are located
    there as well.

    Data is loaded into these areas by the BIOS during the start-up process.
    Although the control data is supposed to be the private reserve of the
    BIOS, DOS, and BASIC, your programs are allowed to inspect or even change
    it. Even if you do not intend to use the information in these control
    areas, it is worth studying because it reveals a great deal about what
    makes the PC family tick.

    The ROM BIOS Data Area

    Some memory locations in the BIOS data area are particularly interesting.
    Most of them contain data vital to the operation of various ROM BIOS and
    DOS service routines. In many instances, your programs can obtain
    information stored in these locations by invoking a ROM BIOS interrupt; in
    all cases, they can access the information directly. You can easily check
    out the values at these locations on your own computer, using either DEBUG
    or BASIC. To use DEBUG, type a command of this form:

    DEBUG
    D XXXX:YYYY L 1

    XXXX represents the segment part of the address you want to examine. (This
    would be either 0040H or 0050H, depending on the data area that interests
    you.) YYYY represents the offset part of the address. The L 1 tells DEBUG
    to display one byte. To see two or more bytes, type the number of bytes
    (in hex) you want to see after the L instruction. For example, the BIOS
    keeps track of the current video mode number in the byte at 0040:0049H. To
    inspect this byte with DEBUG, you would type

    DEBUG
    D 0040:0049 L 1

    To display the data with BASIC, use a program of the following form,
    making the necessary substitutions for segment (&H0040 or &H0050),
    number.of.bytes, and offset (the offset part of the address you want to
    inspect):

    10 DEF SEG = segment
    20 FOR I = 0 TO number.of.bytes - 1
    30   VALUE = PEEK(offset + I)
    40   IF VALUE < 16 THEN PRINT "0";    ' needed for leading zero
    50   PRINT HEX$ (VALUE);" ";
    60 NEXT I

    The following pages describe useful low-memory addresses.

    0040:0010H (a 2-byte word). This word holds the equipment-list data that
    is reported by the equipment-list service, interrupt 11H (decimal 17). The
    format of this word, shown in Figure 3-2, was established for the PC and
    XT; certain parts may appear in a different format in later models.

    0040:0013H (a 2-byte word). This word contains the usable memory size in
    KB. The ROM BIOS memory-size service, interrupt 12H (decimal 18), reports
    the value in this word.

    0040:0017H (2 bytes of keyboard status bits). These bytes are actively
    used to control the interpretation of keyboard actions by the ROM BIOS
    routines. Changing these bytes actually changes the meaning of keystrokes.
    You can freely change the first byte, at address 0040:0017H, but it is not
    a good idea to change the second byte. See pages 137 and 138 for the bit
    settings of these 2 bytes.

                    Bit
    F E D C B A 9 8    7 6 5 4 3 2 1 0   Meaning
    ──────────────────────────────────────────────────────────────────────────
    X X . . . . . .    . . . . . . . .   Number of printers installed
    . . X . . . . .    . . . . . . . .   (Reserved)
    . . . X . . . .    . . . . . . . .   1 if game adapter installed
    . . . . X X X .    . . . . . . . .   Number of RS-232 serial ports
    . . . . . . . X    . . . . . . . .   (Reserved)
    . . . . . . . .    X X . . . . . .   +1 = number of diskette drives:
                                            00 = 1 drive; 01 = 2 drives;
                                            10 = 3 drives;
                                            11 = 4 drives (see bit 0)
    . . . . . . . .    . . X X . . . .   Initial video mode:
                                            01 = 40-column color;
                                            10 = 80-column color;
                                            11 = 80-column monochrome;
                                            00 = none of the above
    . . . . . . . .    . . . . X X . .   For PC with 64 KB motherboard:
                                            Amount of system board RAM
                                            (11 = 64 KB, 10 = 48 KB,
                                            01 = 32 KB, 00 = 16 KB)
                                         For PC/AT: Not used
                                         For PS/2s: Bit 3: Not used;
                                            Bit 2: 1 = pointing device
                                            installed
    . . . . . . . .    . . . . . . X .   1 if math coprocessor installed
    . . . . . . . .    . . . . . . . X   1 if any diskette drives present
                                            (if so, see bits 7 and 6)
    ──────────────────────────────────────────────────────────────────────────


    Figure 3-2.  The coding of the equipment-list word at address 0040:0010H.
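
    Picking the fields out of the equipment-list word is a matter of shifting
    and masking. The Python sketch below follows the PC and XT layout in
    Figure 3-2; decode_equipment_word and its dictionary keys are names of
    our own. Because bits 3 and 2 mean different things on different models,
    the sketch returns them undecoded.

```python
# Meaning of bits 5 and 4, as listed in Figure 3-2.
INITIAL_VIDEO_MODES = {
    0b01: "40-column color",
    0b10: "80-column color",
    0b11: "80-column monochrome",
    0b00: "none of the above",
}


def decode_equipment_word(word):
    """Split the 16-bit equipment-list word into its fields (PC/XT layout)."""
    has_diskettes = bool(word & 0x0001)              # bit 0
    drive_field = (word >> 6) & 0x3                  # bits 7 and 6
    return {
        "printers": (word >> 14) & 0x3,              # bits F and E
        "game_adapter": bool(word & 0x1000),         # bit C
        "serial_ports": (word >> 9) & 0x7,           # bits B, A, and 9
        # Bits 7 and 6 hold (drives - 1), valid only if bit 0 is set.
        "diskette_drives": (drive_field + 1) if has_diskettes else 0,
        "initial_video_mode": INITIAL_VIDEO_MODES[(word >> 4) & 0x3],
        "bits_3_2": (word >> 2) & 0x3,               # model-dependent
        "math_coprocessor": bool(word & 0x0002),     # bit 1
    }


if __name__ == "__main__":
    # A made-up sample value, not read from any real machine.
    for field, value in decode_equipment_word(0x426D).items():
        print("%-20s %s" % (field, value))
```

    The sample value 426DH is invented for the example: it decodes as one
    printer, one serial port, two diskette drives, and an 80-column color
    display, with no game adapter or math coprocessor.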

    0040:001AH (a 2-byte word). This word points to the current head of the
    BIOS keyboard buffer at 0040:001EH, where keystrokes are stored until they
    are used.

    0040:001CH (a 2-byte word). This word points to the current tail of the
    BIOS keyboard buffer.

    0040:001EH (32 bytes, used as sixteen 2-byte entries). This keyboard
    buffer holds up to 16 keystrokes until they are read via the BIOS services
    through interrupt 16H (decimal 22). As this is a circular queue buffer,
    two pointers indicate the head and tail. It is not wise to manipulate this
    data.
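
    Although you should leave the buffer alone, it is instructive to see how
    the two pointers describe its contents. The Python sketch below counts
    the keystrokes waiting in the buffer. It rests on two assumptions that
    the description above does not spell out: that the head and tail words
    hold offsets within segment 0040H, and that the buffer is empty when the
    two are equal. The name pending_keystrokes is our own.

```python
BUFFER_START = 0x1E     # offset of the buffer within segment 0040H
BUFFER_SIZE = 32        # sixteen 2-byte entries


def pending_keystrokes(head, tail):
    """Count the entries between head and tail in the circular buffer.

    Assumes head and tail are offsets within segment 0040H and that
    head == tail means the buffer is empty.
    """
    for pointer in (head, tail):
        inside = BUFFER_START <= pointer < BUFFER_START + BUFFER_SIZE
        if not inside or (pointer - BUFFER_START) % 2:
            raise ValueError("pointer %04XH is not a buffer entry" % pointer)
    # The modulo handles the case where the tail has wrapped past the end.
    return ((tail - head) % BUFFER_SIZE) // 2


if __name__ == "__main__":
    print(pending_keystrokes(0x1E, 0x22))   # two entries, no wraparound
    print(pending_keystrokes(0x3A, 0x20))   # three entries, tail wrapped
```

    In the second call the entries sit at offsets 3AH, 3CH, and 1EH; the
    tail has wrapped around to the start of the buffer, which is what makes
    the queue circular.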

    0040:003EH (1 byte). This byte indicates if a diskette drive needs to be
    recalibrated before seeking to a track. Bits 0 through 3 correspond to
    drives 0 through 3. If a bit is clear, recalibration is needed. Generally,
    you will find that a bit is clear if there was any problem with the most
    recent use of a drive. For example, the recalibration bit will be clear if
    you try to request a directory (DIR) on a drive with no diskette, and then
    type A in response to the following display:

    Not ready reading drive A
    Abort, Retry, Fail?

    0040:003FH (1 byte). This byte holds the diskette motor status. Bits 0
    through 3 correspond to drives 0 through 3. If a bit is set, the
    corresponding diskette motor is running.
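
    The bytes at 0040:003EH and 0040:003FH use the same bit-per-drive layout
    but with opposite senses: a clear bit in the first means recalibration is
    needed, and a set bit in the second means the motor is running. The
    Python sketch below makes the difference explicit; the function names
    are our own, and the byte values passed in are invented samples.

```python
def drives_with_bit(value, bit_state):
    """List the drives (0 through 3) whose bit equals bit_state."""
    return [drive for drive in range(4) if ((value >> drive) & 1) == bit_state]


def drives_needing_recalibration(status_byte):
    """Byte at 0040:003EH: a clear bit means the drive needs recalibrating."""
    return drives_with_bit(status_byte, 0)


def drives_with_motor_running(status_byte):
    """Byte at 0040:003FH: a set bit means the drive motor is running."""
    return drives_with_bit(status_byte, 1)


if __name__ == "__main__":
    print(drives_needing_recalibration(0b0001))   # sample value
    print(drives_with_motor_running(0b0010))      # sample value
```

    Only bits 0 through 3 are examined; the sketch makes no claim about the
    upper 4 bits of either byte.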

    0040:0040H (1 byte). This byte is used by the ROM BIOS to ensure that the
    diskette drive motor is turned off. The value in this byte is decremented
    with every tick of the system clock (that is, about 18.2 times per
    second). When the value reaches 0, the BIOS turns off the drive motor.

    0040:0041H (1 byte). This byte contains the status code reported by the
    ROM BIOS after the most recent diskette operation. (See Figure 3-3.)

    0040:0042H (7 bytes). These 7 bytes hold diskette controller status
    information.

    Beginning at 0040:0049H is a 30-byte area used for video control. This is
    the first of two areas in segment 40H that the ROM BIOS uses to track
    critical video information.

    Value              Meaning
    ──────────────────────────────────────────────────────────────────────────
    00H                No error
    01H                Invalid diskette command requested
    02H                Address mark on diskette not found
    03H                Write-protect error
    04H                Sector not found; diskette damaged or not formatted
    06H                Diskette change line active
    08H                DMA diskette error
    09H                Attempt to DMA across 64 KB boundary
    0CH                Media type not found
    10H                Cyclical redundancy check (CRC) error in data
    20H                Diskette controller failed
    40H                Seek operation failed
    80H                Diskette timed out (drive not ready)
    ──────────────────────────────────────────────────────────────────────────

    Figure 3-3.  Diskette status codes in the ROM BIOS data area at
    0040:0041H.

    Although programs can safely inspect any of the video control data, you
    should modify it only when you bypass the ROM BIOS video services and
    program the video hardware directly. In such cases, you should update the
    video control data to reflect the true status of the video hardware.

    0040:0049H (1 byte). The value in this byte specifies the current video
    mode. (See Figure 3-4.) This is the same video-mode number used in the
    ROM BIOS video services. (See Chapter 9 for more on these services and
    page 72 for general information concerning video modes.)

    We've already shown how to use DEBUG to determine the current video mode
    by inspecting the byte at 0040:0049H. BASIC programs can use the following
    instructions to read this byte and determine the video mode:

    DEF SEG = &H40                  ' set BASIC data segment to 40H
    VIDEO.MODE = PEEK(&H49)         ' look at location 0040:0049H

    0040:004AH (a 2-byte word). This word indicates the number of characters
    that can be displayed in each row of text on the screen.

    0040:004CH (a 2-byte word). This word indicates the number of bytes
    required to represent one screenful of video data.

    Number             Description
    ──────────────────────────────────────────────────────────────────────────
    00H                40 x 25 16-color text
                        (CGA composite color burst disabled)
    01H                40 x 25 16-color text
    02H                80 x 25 16-color text
                        (CGA composite color burst disabled)
    03H                80 x 25 16-color text
    04H                320 x 200 4-color graphics
    05H                320 x 200 4-color graphics
                        (CGA composite color burst disabled)
    06H                640 x 200 2-color graphics
    07H                80 x 25 monochrome text
    0DH                320 x 200 16-color graphics
    0EH                640 x 200 16-color graphics
    0FH                640 x 350 monochrome graphics
    10H                640 x 350 16-color graphics
    11H                640 x 480 2-color graphics
    12H                640 x 480 16-color graphics
    13H                320 x 200 256-color graphics
    ──────────────────────────────────────────────────────────────────────────

    Figure 3-4.  BIOS video mode numbers stored at address 0040:0049H.

    0040:004EH (a 2-byte word). This word contains the starting byte offset
    into video display memory of the current display page. In effect, this
    address indicates which page is in use by giving the offset to that page.

    0040:0050H (eight 2-byte words). These words give the cursor locations for
    eight separate display pages, beginning with page 0. The first byte of
    each word gives the character column and the second byte gives the row.
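
    To make the byte order concrete, the Python sketch below splits a copy of
    these 16 bytes into eight (column, row) pairs, one for each display page.
    The name decode_cursor_positions is our own, and the sample bytes are
    invented rather than read from a real machine.

```python
def decode_cursor_positions(raw):
    """Turn the 16 bytes at 0040:0050H into eight (column, row) pairs."""
    if len(raw) != 16:
        raise ValueError("expected eight 2-byte words (16 bytes)")
    # In each word the first byte is the column and the second is the row.
    return [(raw[i], raw[i + 1]) for i in range(0, 16, 2)]


if __name__ == "__main__":
    # Sample data: page 0 cursor at column 10, row 5; other pages at 0,0.
    sample = bytes([10, 5] + [0] * 14)
    for page, (column, row) in enumerate(decode_cursor_positions(sample)):
        print("page %d: column %d, row %d" % (page, column, row))
```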

    0040:0060H (a 2-byte word). These 2 bytes indicate the size of the cursor,
    based on the range of cursor scan lines. The first byte gives the ending
    scan line, the second byte the starting scan line.

    0040:0062H (1 byte). This byte holds the current display page number.

    0040:0063H (a 2-byte word). This word stores the port address of the
    hardware CRT controller chip.

    0040:0065H (1 byte). This byte contains the current setting of the CRT
    mode register on the Monochrome Display Adapter and the Color Graphics
    Adapter.

    0040:0066H (1 byte). This byte contains the current setting of the Color
    Graphics Adapter's CRT color register. This byte ends the first block of
    ROM BIOS video control data.

    0040:0067H (5 bytes). The original IBM PC BIOS used the 5 bytes starting
    at 0040:0067H for cassette tape control. In PS/2 models 50, 60, and 80,
    which don't support a cassette interface, the 4 bytes at 0040:0067H can
    contain the address of a system reset routine that overrides the usual
    BIOS startup code. (See the BIOS technical reference manual for details.)

    0040:006CH (4 bytes stored as one 4-byte number). This area is used as a
    master clock count, which is incremented once for each timer tick. It is
    treated as if it began counting from 0 at midnight. When the count reaches
    the equivalent of 24 hours, the ROM BIOS resets the count to 0 and sets
    the byte at 0040:0070H to 1. DOS or BASIC calculates the current time from
    this value and sets the time by putting the appropriate count in this
    field.
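
    Converting the count to a time of day is a single division. The Python
    sketch below shows the idea. It assumes the usual PC timer arrangement,
    a 1,193,180 Hz input divided by 65,536, which gives the rate of about
    18.2 ticks per second; those two constants and the name ticks_to_time are
    our additions, and the sketch is not the routine DOS or BASIC actually
    uses.

```python
# Assumed timer constants (not stated in the text): 1,193,180 Hz input
# clock divided by 65,536, or roughly 18.2 ticks per second.
TIMER_INPUT_HZ = 1193180
TIMER_DIVISOR = 65536


def ticks_to_time(ticks):
    """Convert a tick count since midnight to (hours, minutes, seconds)."""
    if ticks < 0:
        raise ValueError("tick count cannot be negative")
    # Integer arithmetic avoids any floating-point rounding.
    total_seconds = ticks * TIMER_DIVISOR // TIMER_INPUT_HZ
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return hours, minutes, seconds


if __name__ == "__main__":
    print("%02d:%02d:%02d" % ticks_to_time(786521))   # just after noon
```

    Setting the time works the other way around: the desired time of day is
    converted to a tick count, which is then stored in this field.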

    0040:0070H (1 byte). This byte indicates that a clock rollover has
    occurred. When the clock count passes midnight (and is reset to 0), the
    ROM BIOS sets this byte to 1, which means that the date should be
    incremented.

    ──────────────────────────────────────────────────────────────────────────
    NOTE:
    This byte is set to 1 at midnight and is not incremented. There is no
    indication if two midnights pass before the clock is read.
    ──────────────────────────────────────────────────────────────────────────

    0040:0071H (1 byte). The ROM BIOS sets bit 7 of this byte to indicate that
    the Ctrl-Break key combination was pressed.

    0040:0072H (a 2-byte word). This word is set to 1234H after the initial
    power-up memory check. When a warm boot is instigated from the keyboard
    (via Ctrl-Alt-Del), the memory check will be skipped if this location is
    already set to 1234H.

    0040:0074H (4 bytes). These 4 bytes are used by various members of the PC
    family for diskette and fixed-disk drive control. See the IBM BIOS
    Interface Technical Reference Manual for details.

    0040:0078H (4 bytes). These bytes control time-out values for the parallel
    printers. (In the PS/2, only the first 3 bytes are used for this purpose.)

    0040:007CH (4 bytes). These bytes contain time-out values for up to four
    RS-232 serial ports.

    0040:0080H (a 2-byte word). This word points to the start of the keyboard
    buffer area.

    0040:0082H (a 2-byte word). This word points to the end of the keyboard
    buffer area.

    The next 7 bytes are used by the ROM BIOS in the EGA and PS/2s for video
    control:

    0040:0084H (1 byte). The value of this byte is one less than the number of
    character rows displayed on the screen. The BIOS can refer to this value
    to determine how many character rows of data to erase when the screen is
    cleared or how many rows to print when Shift-PrtSc is pressed.

    0040:0085H (2 bytes). This word indicates the height, in scan lines, of
    characters on the screen.

    0040:0087H (4 bytes). These 4 bytes are used by the BIOS video support
    routines to indicate the amount of video RAM available, the initial
    settings of the EGA configuration switches, and other miscellaneous video
    status information.

    0040:008BH (11 bytes). The ROM BIOS uses this data area for control and
    status information regarding the diskette and fixed-disk drives.

    0040:0098H (9 bytes). This data area is used by the PC/AT and PS/2 BIOS to
    control certain functions of the real-time clock.

    0040:00A8H (4 bytes). In the EGA and PS/2 BIOS, these bytes contain the
    segmented address of a table of video parameters and overrides for default
    ROM BIOS video configuration values. The actual contents of the table
    vary, depending on which video hardware you are using. The IBM ROM BIOS
    Interface Technical Reference Manual describes this table in detail.

    0050:0000H (1 byte). This byte is used by the ROM BIOS to indicate the
    status of a print-screen operation. Three possible hex values are stored
    in this location:

    ──────────────────────────────────────────────────────────────────────────
    00H            Indicates OK status
    01H            Indicates a print-screen operation is currently in progress
    FFH            Indicates an error occurred during a print-screen operation
    ──────────────────────────────────────────────────────────────────────────
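
    A program that wants to know how the last print-screen operation fared
    can test this byte directly. The following is a minimal sketch; the
    variable name STATUS and the message wording are ours.

```basic
10 DEF SEG = &H50
20 STATUS = PEEK(0)                ' print-screen status byte at 0050:0000H
30 IF STATUS = 0 THEN PRINT "Print-screen status is OK"
40 IF STATUS = 1 THEN PRINT "Print-screen operation in progress"
50 IF STATUS = &HFF THEN PRINT "Print-screen operation ended in error"
```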

    0050:0004H (1 byte). This byte is used by DOS when a single-diskette
    system mimics a two-diskette system. The value indicates whether the one
    physical drive is acting as drive A or drive B. These values are used:

    ──────────────────────────────────────────────────────────────────────────
    00H            Acting as drive A
    01H            Acting as drive B
    ──────────────────────────────────────────────────────────────────────────

    0050:0010H (a 2-byte word). This area is used by ROM BASIC to hold its
    default data segment (DS) value.

    BASIC lets you set your own data segment value with the DEF SEG = value
    statement. (The offset into the segment is specified by the PEEK function
    or the POKE statement.) You can also reset the data segment to its
    default setting by using the DEF SEG statement without a value. Although
    BASIC does not give you a simple way to find the default value stored in
    this location, you can get it by using this little routine:

    DEF SEG = &H50
    DATA.SEGMENT = PEEK(&H11) * 256 + PEEK(&H10)

    ──────────────────────────────────────────────────────────────────────────
    NOTE:
    BASIC administers its own internal data based on the default data
    segment value. Attempting to change this value is likely to sabotage
    BASIC's operation.
    ──────────────────────────────────────────────────────────────────────────

    0050:0012H (4 bytes). In some versions of ROM BASIC, these 4 bytes contain
    the segment and offset address of BASIC's clock-tick interrupt handler.

    ──────────────────────────────────────────────────────────────────────────
    NOTE:
    In order to perform better, BASIC runs the system clock at four times
    the standard rate, so BASIC must replace the ROM BIOS clock interrupt
    routine with its own. The standard BIOS interrupt routine is invoked by
    BASIC at the normal rate; that is, once for every four fast ticks.
    There's more about this on page 146.
    ──────────────────────────────────────────────────────────────────────────

    0050:0016H (4 bytes). This area contains the address of ROM BASIC's
    break-key handling routine.

    0050:001AH (4 bytes). This area contains the address of ROM BASIC's
    diskette error-handling routine.

    The Intra-Application Communications Area

    In the PC/XT/AT family, the 16 bytes starting at 0040:00F0H are reserved
    as an intra-application communications area (ICA). The ICA provides RAM
    at a known address that an application can use for sharing data among
    separate program modules. In the PS/2 BIOS, however, the ICA is no longer
    documented.

    Few applications actually use the ICA because the amount of RAM is so
    small and because the data within the ICA can be unexpectedly modified
    when more than one program uses it. If you do write a program that uses
    the ICA, we recommend that you include a checksum and also a signature so
    that you can ensure that the data in the ICA is yours and that it has not
    been changed by another program.
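
    One way to follow this advice is sketched below. The layout is purely
    our own assumption, not an IBM convention: a 2-byte signature ("MY"), 13
    bytes of data, and a 1-byte checksum that is the sum of the first 15
    bytes modulo 256. The variable names and the sample data are likewise
    hypothetical.

```basic
10 DEF SEG = &H40 : ICA = &HF0                  ' the ICA starts at 0040:00F0H
20 POKE ICA, ASC("M") : POKE ICA + 1, ASC("Y")  ' hypothetical 2-byte signature
30 FOR I = 2 TO 14 : POKE ICA + I, I : NEXT     ' 13 bytes of sample data
40 GOSUB 200 : POKE ICA + 15, SUM               ' store checksum in the last byte
50 ' ... later, perhaps in another program module ...
60 GOSUB 200                                    ' recompute the checksum
70 GOOD = (PEEK(ICA) = ASC("M")) AND (PEEK(ICA + 1) = ASC("Y")) AND (SUM = PEEK(ICA + 15))
80 IF GOOD THEN PRINT "ICA data is ours and intact" ELSE PRINT "ICA data is missing or changed"
90 END
200 SUM = 0                                     ' checksum of ICA bytes 0 through 14
210 FOR I = 0 TO 14 : SUM = (SUM + PEEK(ICA + I)) MOD 256 : NEXT
220 RETURN
```

    A simple sum will not catch every possible change, but together with the
    signature it makes it unlikely that another program's data will be
    mistaken for your own.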

    ──────────────────────────────────────────────────────────────────────────
    WARNING:
    The ICA is definitely located in the 16 bytes from 0040:00F0H through
    0040:00FFH. A typographic error in some editions of the IBM PC Technical
    Reference Manual places it at 0050:0000H through 0050:00FFH. This is
    incorrect.
    ──────────────────────────────────────────────────────────────────────────

    The BIOS Extended Data Area

    The PS/2 ROM BIOS start-up routines allocate an additional area of RAM for
    their own use. The BIOS routines use this extended data area for transient
    data storage. For example, the BIOS routines that support the
    pointing-device (mouse) controller hardware use part of the extended data
    area for temporary storage.

    You can determine the starting address of the extended data area by using
    a system service available through ROM BIOS interrupt 15H. (See Chapter
    12.) The first byte in the extended data area contains the size of the
    data area in KB.

The ROM Version and Machine-ID Markers

    Because the BIOS programs are fixed in memory, they can't be easily
    changed when additions or corrections are needed. This means that ROM
    programs must be tested very carefully before they are frozen onto memory
    chips. Although serious errors could easily lurk in a system's ROM
    programs, IBM has a fine track record; so far, only small and relatively
    unimportant errors have been found in the PC family's ROM programs, and
    IBM has corrected them by revising the BIOS.

    The different versions of ROM software could present a small challenge to
    programmers who discover that the differences affect the operating
    characteristics of their programs. But an even greater challenge for
    programmers is that the PC, XT, AT, and PS/2s each have a slightly
    different set of ROM BIOS routines.

    To ensure that programs can work with the appropriate ROM programs and the
    right computer, IBM has supplied two identifying markers that are
    permanently available at the end of memory in the system ROM. One marker
    identifies the ROM release date, which can be used to identify the BIOS
    version, and the other gives the machine model. These markers are always
    present in IBM's own machines and you'll also find them supplied by the
    manufacturers of a few PC compatibles. The following paragraphs describe
    these markers in detail.

    The ROM release date can be found in an 8-byte storage area from
    F000:FFF5H to F000:FFFCH, which ends 2 bytes before the machine ID byte.
    It consists of ASCII characters in the common American date format; for
    example, 06/01/83 stands for June 1, 1983. This release marker is a
    common feature of the IBM personal computers but is present in only a
    few IBM compatibles. For example, the Compaq Portable I does not have
    it, but the Panasonic Senior Partner does.

    You can look at the release date with DEBUG by using the following
    command:

    DEBUG
    D F000:FFF5 L 8

    Or you can let your program look at the bytes using this technique:

    10 DEF SEG = &HF000
    20 FOR I = 0 TO 7
    30   PRINT CHR$(PEEK(&HFFF5 + I));
    40 NEXT
    50 END

    The model ID is a byte located at F000:FFFEH. This byte identifies which
    model of PC or PS/2 you are using. (See Figure 3-5.) In addition, a ROM
    BIOS service in the PC/AT and PS/2s returns more detailed identification
    information, including the submodel byte listed in the figure. (See
    Chapter 12.)

                                                      BIOS
    Machine            Date      Model  Submodel  Revision  Notes
    ──────────────────────────────────────────────────────────────────────────
    PC                 04/24/81  FFH    --        00
                       10/19/81  FFH    --        01        Some BIOS bugs
                                                            fixed
                       10/27/82  FFH    --        02        Upgrade of PC BIOS
                                                            to XT level
    PC/XT              11/08/82  FEH    --        00
                       01/10/86  FBH    00        01        256/640 KB system
                                                            board
                       05/09/86  FBH    00        02
    PC/AT              01/10/84  FCH    --        00        6 MHz 80286
                       06/10/85  FCH    00        01
                       11/15/85  FCH    01        00        8 MHz 80286
    PS/2 Model 25      06/26/87  FAH    01        00
    PS/2 Model 30      09/02/86  FAH    00        00
                       12/12/86  FAH    00        01
    PS/2 Model 50      02/13/87  FCH    04        00
    PS/2 Model 60      02/13/87  FCH    05        00
    PS/2 Model 80      03/30/87  F8H    00        00        16 MHz 80386
    PS/2 Model 80      10/07/87  F8H    01        00        20 MHz 80386
    PCjr               06/01/83  FDH    --        00
    PC Convertible     09/13/85  F9H    00        00
    PC/XT Model 286    04/21/86  FCH    02        00
    ──────────────────────────────────────────────────────────────────────────


    Figure 3-5.  Machine and ROM BIOS version identification.

    It is possible that IBM-compatible computers can be identified in the same
    way, but we do not know of any reliable published information. You may
    need to rely on improvised methods to identify non-IBM compatibles.

    You can examine the machine ID byte with DEBUG by using the following
    command:

    DEBUG
    D F000:FFFE L 1

    A BASIC program can inspect this byte using techniques such as this:

    10 DEF SEG = &HF000
    20 MODEL = PEEK(&HFFFE)
    30 IF MODEL < &HF8 THEN PRINT "I'm not an IBM computer" : STOP
    40 ON (MODEL - &HF7) GOTO 100,110,120,130,140,150,160,170
    100 PRINT "I'm a PS/2 Model 80" : STOP
    110 PRINT "I'm a PC convertible" : STOP
    120 PRINT "I'm a PS/2 Model 25 or 30" : STOP
    130 PRINT "I'm a PC/XT" : STOP
    140 PRINT "I'm 80286-based (PC/AT, PC/XT 286, PS/2 Model 50 or 60)" : STOP
    150 PRINT "I'm a PCjr" : STOP
    160 PRINT "I'm a PC/XT" : STOP
    170 PRINT "I'm a PC" : STOP


The ROM BASIC

    Now we move on to the third element of ROM: the ROM BASIC. The ROM BASIC
    acts in two ways. First, it provides the core of the BASIC language, which
    includes most of the commands and the underlying foundation──such as
    memory management──that BASIC uses. The disk versions of interpreted
    BASIC, which are found in the program files BASIC.COM and BASICA.COM, are
    essentially supplements to ROM BASIC, and they rely on ROM BASIC to get
    much of their work done. The second role of ROM BASIC is to provide what
    IBM calls "cassette" BASIC──the BASIC that is activated when you start up
    your computer without a disk.

    Whenever you use any of the interpreted, disk-based BASICs, the ROM BASIC
    programs are also used──although there's nothing to make you aware of it.
    On the other hand, compiled BASIC programs don't make use of the ROM
    BASIC.


The ROM Extensions

    The fourth element of the ROM has more to do with the PC's design than
    with the actual contents of its memory. The PC was designed to allow for
    installable extensions to the built-in software in ROM. The additional ROM
    is usually located on a plug-in adapter such as the Enhanced Graphics
    Adapter or a fixed-disk controller card. Computers in the PC/XT/AT family
    also have empty sockets on their system boards to accommodate additional
    ROM chips. Because the original ROM BIOS could not include support
    programs for future hardware, ROM extensions are obviously a necessary and
    helpful addition.

    Several memory areas are reserved for ROM extensions. Addresses C000:0000H
    through C000:7FFFH are reserved for video adapter ROM. The area between
    C800:0000H and D000:FFFFH can be used by nonvideo adapters. (For example,
    the IBM XT fixed-disk adapter occupies addresses starting at C800:0000H.)
    Finally, ROM extensions on chips placed onto the system board of a PC, XT,
    or AT occupy the address range E000:0000H through E000:FFFFH. In the PS/2
    models 50, 60, and 80, you cannot add ROM chips to the system board. The
    system ROM in these computers occupies the entire address range between
    E000:0000H and F000:FFFFH.


Comments

    As the PC family has evolved, the amount and complexity of the ROM
    software has increased to accommodate the greater sophistication of the
    computer hardware. The source code listings in the PC, XT, and AT
    technical reference manuals consist of tens of thousands of
    assembly-language instructions. Despite the size of the ROM BIOS, a browse
    through the source code can be fun and enlightening.

    We have made every effort in this book to point out when and how to use
    the ROM BIOS routines. We recommend that you read Chapters 8 through 13
    before you begin your own exploration of the ROM BIOS.



────────────────────────────────────────────────────────────────────────────
Chapter 4  Video Basics

    The Video Subsystems
    Memory and the Video Subsystems
    Creating the Screen Image

    The Video Display Modes
    Video Mode Control
    Display Resolution

    The Use of Color
    Color-Suppressed Modes
    Color in Text and Graphics Modes

    Inside the Display Memory
    Display Pages in Text Modes
    Display Pages in Graphics Modes
    Displaying Characters in Text and Graphics Modes

    Controlling the Video Display
    Direct Hardware Control

    Compatibility Considerations

    To many people, the video display is the computer. Programs are often
    judged by their display quality and visual design alone. In this chapter,
    you'll see what kinds of video display output the IBM PC family can
    produce. More importantly, we'll describe how to manipulate the video
    displays to get the effects you want.


The Video Subsystems

    Every PC and PS/2 has a video subsystem responsible for producing the
    image that appears on the screen. At the heart of the video subsystem is
    the special-purpose circuitry that must be programmed to generate the
    electrical signals that control the video display. Most members of the
    PC/XT/AT family require you to install a display adapter, a special video
    circuit board that plugs into one of the computer's expansion slots. On
    the other hand, all PS/2s are equipped with built-in video circuitry and,
    therefore, require no display adapter.

    The video circuitry consists of a group of interrelated components that
    control signal timing, colors, and the generation of text characters. All
    IBM video subsystems have a video buffer, a block of dedicated memory that
    holds the text or graphics information displayed on the screen. The video
    subsystem performs the unique task of translating the raw data in the
    video buffer into the signals that drive the video display.

    The various video subsystems used in PCs and PS/2s all evolved from the
    two video adapters originally released by IBM for the PC: the Monochrome
    Display Adapter (MDA) and the Color Graphics Adapter (CGA). IBM later
    released its Enhanced Graphics Adapter (EGA), a more powerful successor to
    the MDA and CGA.

    When the PS/2s appeared, IBM introduced two more video subsystems: the
    Multi-Color Graphics Array (MCGA), built into the PS/2 models 25 and 30,
    and the Video Graphics Array (VGA), built into the PS/2 models 50, 60, and
    80. At the same time the PS/2s appeared, IBM introduced a VGA adapter that
    can be used in the PC/XT/AT family as well as in the PS/2 Model 30.

    We'll be discussing all five of these IBM subsystems──MDA, CGA, EGA, MCGA,
    and VGA──in this chapter. Although clear differences in hardware design
    exist between the various video subsystems, their strong family
    resemblance should encourage you to consider what they have in common
    before worrying about the differences between them.

    All but one of the five video subsystems can be programmed into two
    fundamentally different modes, called text mode and graphics mode by IBM.
    (The lone exception is the MDA, which operates only in text mode.) In
    text mode you can display only text characters, though many of these
    characters are suitable for producing simple line drawings. (See Appendix
    C for more on characters.) Graphics mode is mainly used for complex
    drawings, but you can also use it to draw text characters in a variety of
    shapes and sizes.

    The CGA can operate in both text and graphics modes to produce drawings
    and characters in several formats and colors. By contrast, the MDA can
    operate only in text mode, using a stored set of ASCII alphanumeric and
    graphics characters and displaying them in only one color. The MDA works
    only with the IBM Monochrome Monitor (or its equivalent) while the CGA
    must be connected to either a direct-drive or a composite color monitor.
    (See page 74 for more on monitors.) Many business and professional users
    prefer a monochrome display to a color display because a monochrome screen
    is easier on the eyes and less expensive than an equivalent color display.
    But in choosing monochrome, they sacrifice color, a valuable asset for any
    computer display.

    The MDA's most obvious drawback is its inability to display images in
    graphics mode. For this reason, PC/XT/AT users who prefer a monochrome
    display, yet need to view graphics, must turn to an EGA or to a non-IBM
    adapter like the Hercules Graphics Card, which emulates the MDA's text
    mode but supports a monochrome graphics mode as well.

    Roughly two-thirds of all PCs are equipped with the standard MDA and
    therefore have no graphics or color capability. While there are real
    advantages to using color and graphics, most PCs get along nicely without
    either. Although the clear trend is toward higher-performance video
    subsystems that can display graphics as well as text, keep in mind as you
    plan computer applications that many PCs display text only.

    The best way to understand the video capabilities of the PCs and PS/2s is
    to cover the features that their various video subsystems have in common.
    As we go along, we'll point out the differences and improvements that
    distinguish the newer and more complicated subsystems (EGA, MCGA, and VGA)
    from their predecessors (MDA and CGA).

Memory and the Video Subsystems

    The video buffer memory is connected directly to the display circuitry so
    that the data in the video buffer can be repeatedly read out of the buffer
    and displayed. However, the video buffer is also logically (to the CPU) a
    part of the computer's main memory address space. A full 128 KB of the
    memory address space is set aside for use as video buffers, at addresses
    A000:0000H through B000:FFFFH, but the two original display adapters use
    only two small parts of this memory area. The Monochrome Display Adapter
    (MDA) provides 4 KB of display memory located at segment B000H. The
    original CGA provides 16 KB of display memory located at segment B800H.

    With the other IBM video subsystems, the address at which video memory is
    located isn't fixed──it depends on how the subsystem is configured. For
    example, when an EGA is used with a monochrome display, its text-mode
    video buffer is placed at B000H, just as with an MDA. When an EGA is
    attached to a color display, its video buffer can be addressed at B800H.
    And when you use an EGA in non-CGA graphics modes, the starting buffer
    address is A000H. Like the EGA, the MCGA and the VGA also support this
    chameleon-like method of buffer addressing.
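
    For text modes, a program can choose the right buffer segment by checking
    the current video mode before it writes to the screen. The sketch below
    relies on two facts from outside this section: the BIOS keeps the current
    mode number in the byte at 0040:0049H, and in text modes each screen
    position is a character byte followed by an attribute byte. It also
    assumes that display page 0 is the active page. The variable name VMODE
    is ours.

```basic
10 DEF SEG = &H40                  ' ROM BIOS data area
20 VMODE = PEEK(&H49)              ' current BIOS video mode number
30 IF VMODE > 3 AND VMODE <> 7 THEN PRINT "Not in a text mode" : END
40 IF VMODE = 7 THEN DEF SEG = &HB000 ELSE DEF SEG = &HB800
50 POKE 0, ASC("A")                ' character code for row 0, column 0
60 POKE 1, 7                       ' attribute byte: normal video
```

    Mode 7 is the monochrome text mode, so its buffer is at B000H; modes 0
    through 3 are the color text modes, whose buffer is at B800H.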

Creating the Screen Image

    You can describe the screen display created by IBM video subsystems as a
    memory-mapped display, because each address in the display memory
    corresponds to a specific location on the screen. (See Figure 4-1.) The
    display circuitry repeatedly reads information from memory and places it
    on the screen. The information can be changed as quickly as the computer
    can write new information from your programs into memory. The display
    circuitry translates the stream of bits it receives from memory into
    bursts of light at particular locations on the screen.

                    ┌─────────────────────────────────────────┐
                    │ ┌─────────────────────────────────────┐ │
                    │ │ •• •••  • •••• •••• ••• Pixels or   │ │
                    │ │                         characters  │ │
                    │ │                         on screen   │ │
                    │ │                                     │ │
                    │ │                                     │ │
                    │ │                                     │ │
                    │ │                                     │ │
                    │ │                                     │ │
                    │ │                                     │ │
                    │ └─────────────────────────────────────┘ │
                    └─────────────────────────────────────────┘
                                        
                                        ║  ║
    ┌─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬
    │▓│▓│ │▓│▓│▓│ │ │▓│ │▓│▓│▓│▓│ │▓│▓│▓│▓│ │▓│▓│▓│ │ │ │ │ │ │ │ │ │ │ │ │ │ │
    └─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴
                            Successive locations in RAM

    Figure 4-1.  The memory-mapped display.

    These dots of light are called pixels and are produced by an electron beam
    striking the phosphorescent surface of the CRT. The electron beam is
    produced by an electron gun that scans the screen line by line. As the gun
    moves across and down the screen in a fixed path called a raster scan, the
    video subsystem generates video control signals that turn the beam on and
    off, matching the pattern of the bits in memory.

    The video circuitry refreshes the screen between 50 and 70 times a second
    (depending on the video mode), making the changing images appear clear and
    steady. At the end of each screen-refresh cycle, the electron beam must
    move from the bottom right corner to the top left corner of the screen to
    begin a new cycle. This movement is called the vertical retrace. During
    the retrace, the beam is blanked and no pixels are written to the screen.

    The vertical retrace period (about 1.25 milliseconds) is important to
    programmers for one main reason, which requires some explanation. The
    special dual-ported design of the video memory gives the CPU and the
    display-refresh circuitry equal access to the display memory. This allows
    the CPU and the display circuitry to access video memory at the same time.

    This causes a problem on the Color Graphics Adapter (CGA). If the CPU
    happens to read or write to the video buffer at the same time the display
    circuitry is copying data out of the buffer to display onscreen, a "snow"
    effect may briefly appear on the screen. However, if you instruct the CPU
    to access memory only during vertical retrace, when the display circuitry
    is not accessing the video buffer, then snow can be eliminated. A program
    running on a CGA can test the value of bit 3 in the adapter's I/O port at
    3DAH. This bit is set on at the beginning of vertical retrace and then set
    off at the end. During this 1.25-millisecond pause, you can have your
    programs write as much data as possible to the video display memory. At
    the end of the retrace, the display circuitry can write this data to the
    screen without snow.

    This technique is useful for any application that directly accesses data
    in the video buffer in text mode on a CGA. Fortunately, the hardware
    design of all other IBM video subsystems avoids this access conflict and
    makes this specialized programming technique unnecessary.
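
    The retrace test can be sketched in BASIC with the WAIT statement, which
    loops until selected bits of an I/O port take on a given state. This is
    an illustration only: it assumes a CGA in a text mode with its buffer at
    B800H, and interpreted BASIC is far too slow to move more than a byte or
    so during one retrace. A practical routine would be written in assembly
    language.

```basic
10 DEF SEG = &HB800                ' CGA video buffer
20 WAIT &H3DA, 8, 8                ' loop while bit 3 is on (retrace already under way)
30 WAIT &H3DA, 8                   ' loop until bit 3 turns on (retrace begins)
40 POKE 0, ASC("A")                ' write one character during the retrace
```

    Line 20 ensures that the program catches the start of a retrace rather
    than the tail end of one that is nearly over.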


The Video Display Modes

    Originally, there were eight video modes defined for the IBM personal
    computers: seven on the CGA and one on the MDA. The more sophisticated
    EGA, MCGA, and VGA introduced several new modes plus variations on the
    original eight. As a result, among the five IBM video subsystems are 12
    text and graphics modes and, depending on how you count them, seven or
    eight variations──and that's not counting the extra modes available with
    non-IBM video hardware and with defunct IBM systems like the PCjr.
    There's plenty of variety when you're working with IBM video subsystems.

    Despite the perplexing proliferation of video modes, what is striking
    about the different modes is not their differences but their similarities
    (Figure 4-2): All video modes are related in resolution and in video
    buffer organization to the original MDA and CGA modes.

    The MDA's 80-column, 25-row monochrome text mode is supported on the EGA
    and VGA. Similarly, the CGA's two text modes (40 x 25 and 80 x 25 16-color
    modes) are also supported on the EGA, MCGA, and VGA. Don't let the
    redundant mode numbers in Figure 4-2 confuse you: The difference between
    mode 0 and mode 1, for example, is that the composite color signal on the
    CGA is modified for composite monochrome monitors in mode 0. (See page
    74 for more on monitors.) With all other monitors and in all other video
    subsystems, modes 0 and 1 are the same, as are modes 2 and 3 and modes 4
    and 5.

    BIOS Mode Number
    Hex          Dec           Type         Resolution    Colors       Video Su
    ───────────────────────────────────────────────────────────────────────────
    00H,01H      0, 1          Text         40 x 25       16           CGA, EGA
    02H,03H      2, 3          Text         80 x 25       16           CGA, EGA
    04H,05H      4, 5          Graphics     320 x 200     4            CGA, EGA
    06H          6             Graphics     640 x 200     2            CGA, EGA
    07H          7             Text         80 x 25       Mono         MDA, EGA
    08H,09H,0AH  8, 9, 10                                              (PCjr on
    0BH,0CH      11, 12                                                (Used in
                                                                        BIOS)
    0DH          13            Graphics     320 x 200     16           EGA,VGA
    0EH          14            Graphics     640 x 200     16           EGA,VGA
    0FH          15            Graphics     640 x 350     Mono         EGA,VGA
    10H          16            Graphics     640 x 350     16           EGA,VGA
    11H          17            Graphics     640 x 480     2            MCGA,VGA
    12H          18            Graphics     640 x 480     16           VGA
    13H          19            Graphics     320 x 200     256          MCGA,VGA
    ───────────────────────────────────────────────────────────────────────────


    Figure 4-2.  Video modes available on IBM video subsystems.

    The evolutionary pattern is the same for graphics modes. The CGA supports
    two graphics modes, a 320 x 200 pixel, 4-color mode and a 640 x 200,
    2-color mode. These same two modes are supported on the EGA, MCGA, and
    VGA. The EGA introduced three new graphics modes with more colors and
    better resolution than the original CGA graphics modes: the 320 x 200,
    16-color; 640 x 200, 16-color; and 640 x 350, 16-color modes. The EGA also
    introduced a 640 x 350 monochrome graphics mode that could be used only
    with an MDA-compatible monochrome display.

    When the PS/2s appeared, their video subsystems supported the same modes
    as did the MDA, CGA, and EGA──but again, a few new graphics modes were
    introduced. The MCGA in the PS/2 models 25 and 30 followed the CGA
    tradition: It supported all CGA modes, plus new 640 x 480, 2-color and 320
    x 200, 256-color graphics modes. The VGA in the other PS/2 models strongly
    resembles the EGA. It provides all the EGA's text and graphics modes, the
    two new MCGA graphics modes, and one more graphics mode not supported by
    the other subsystems──a 640 x 480, 16-color mode.

    How do you know which mode to use in a program? Clearly, if broad
    compatibility is a concern, the MDA and CGA modes are the least common
    denominator. If you need more colors or better graphics resolution than
    the CGA modes provide, you can turn to one of the EGA, MCGA, or VGA
    graphics modes. Of course, if your program requires an EGA or a VGA to
    run, users who have only a CGA will be out of luck.

    Many commercial software vendors solve this problem by distributing
    installable video output routines along with their products. Before you
    can use a package like Microsoft Windows or Lotus 1-2-3, for example, you
    must run a special installation program that binds output routines for
    your particular video hardware to the software application. This approach
    is more work for both the people who write software and the people who use
    it, but it is a good way to make applications deliver the best possible
    video performance without stumbling over the diversity of video hardware
    and video modes.

Video Mode Control

    Before we get into the details about resolution and color in video modes,
    let's consider how you select which video mode to use. The most efficient
    way to set up a video mode is to use assembly language to call the ROM
    BIOS. ROM BIOS interrupt 10H (decimal 16), service 00H, provides a way to
    select a video mode using the mode numbers listed in Figure 4-2. (See
    Chapter 9 for more details on this.)
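
    From interpreted BASIC you do not call interrupt 10H yourself; the SCREEN
    and WIDTH statements select the mode for you. The sketch below switches
    to a graphics mode and then reads back the mode number so that you can
    compare it with Figure 4-2. It assumes a CGA-compatible subsystem (SCREEN
    1 fails on an MDA) and assumes that BASIC keeps the BIOS's record of the
    current mode, the byte at 0040:0049H, up to date. The variable name VMODE
    is ours.

```basic
10 SCREEN 1                        ' 320 x 200, 4-color graphics (CGA or better)
20 DEF SEG = &H40                  ' ROM BIOS data area
30 VMODE = PEEK(&H49)              ' mode number recorded by the BIOS
40 SCREEN 0 : WIDTH 80             ' return to 80-column text
50 PRINT "SCREEN 1 selected BIOS mode"; VMODE
```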

    ──────────────────────────────────────────────────────────────────────────
    Monitors
    The type of video display, or monitor, that might be used has an
    important effect on program design. Many monitors cannot produce color
    or graphics, and a few produce such a poor quality image that you can
    use only the 40-column text display format. The many kinds of monitors
    that can be used with the PC family of computers can be broken down into
    five basic types.

    Direct-drive monochrome monitors. These monitors are designed to work
    with the Monochrome Display Adapter (MDA), although you can also use
    them with an Enhanced Graphics Adapter (EGA). The green IBM Monochrome
    Display is reminiscent of IBM's 3270 series of mainframe computer
    terminals; it's no surprise that many business users are comfortable
    with the combination of an MDA and a green monochrome display.

    Composite monochrome monitors. These monitors are still among the most
    widely used and least expensive monitors available. They connect to the
    composite video output on the Color Graphics Adapter (CGA) and provide a
    fairly clear one-color image (usually green or amber). Don't confuse the
    composite monochrome monitor with the direct-drive monochrome monitor.
    The composite monochrome monitor can be attached only to the CGA,
    whereas the direct-drive monochrome monitor must be used with an MDA or
    EGA.

    Composite color monitors and TV sets. Composite color monitors use a
    single combined signal such as the composite video output of the CGA.
    The composite color monitor produces color and graphics but has
    limitations: An 80-column display is often unreadable; only certain
    color combinations work well; and graphics resolution is low in quality,
    so graphics must be kept simple by using low-resolution graphics modes.

    Although the standard television set (color or black-and-white) is
    technically a composite monitor, it usually produces an even
    lower-quality image than the dedicated composite monitor. Text displays
    must be in 40-column mode to ensure that the display is readable. TVs
    are connected to the composite video output of the CGA, but the
    composite signal must be converted by an RF adapter before going into
    the TV.

    RGB color monitors. The RGB monitors are considered the best of both
    worlds. They combine the high-quality text display of the monochrome
    monitors with high-resolution graphics and color. RGB stands for
    red-green-blue, and RGB monitors are so named because they use separate
    red, green, and blue color signals, unlike the composite monitors, which
    use only one composite signal. The image and color quality of an RGB
    monitor is much better than that available through any screen that
    connects to the composite video output.

    Variable-frequency monitors. One of the problems created by the
    proliferation of different video subsystems is that some subsystems
    produce color and timing signals with different frequencies or different
    encodings than other subsystems. For example, you cannot use a
    PS/2-compatible monitor with a CGA because the color information in the
    monitor drive signals is encoded differently by a CGA than it is by a
    PS/2 video subsystem (MCGA or VGA).

    Monitor manufacturers addressed this problem by designing
    variable-frequency RGB monitors that can be used with a wide range of
    signal frequencies and with more than one type of color signal encoding.
    For example, NEC's MultiSync monitors can adjust to the different signal
    frequencies generated by the CGA, the EGA, and the PS/2 video
    subsystems. These monitors also have a switch that lets you adapt them
    either to the digital color signal encoding used by the CGA and EGA or
    to the analog color signals used by the PS/2 subsystems.

    Many people use variable-frequency monitors because they anticipate the
    need to upgrade their video subsystems at some time in the future, and
    they don't want to be stuck with an incompatible monitor.
    ──────────────────────────────────────────────────────────────────────────

    Many programming languages also offer high-level commands that select
    video modes for you. For example, BASIC gives you control over the video
    modes through the SCREEN statement but refers to them in its own way,
    using different mode numbers than the ROM BIOS routines. You can also
    control some of the video modes through the DOS MODE command. (See Figure
    4-3.)

    BIOS Mode Number         BASIC Statement to      DOS Statement to
    Hex          Dec         Change Mode             Change Mode
    ──────────────────────────────────────────────────────────────────────────
    00H          0           SCREEN 0,0:WIDTH 40     MODE BW40
    01H          1           SCREEN 0,1:WIDTH 40     MODE CO40
    02H          2           SCREEN 0,0:WIDTH 80     MODE BW80
    03H          3           SCREEN 0,1:WIDTH 80     MODE CO80
    04H          4           SCREEN 1,0              n/a
    05H          5           SCREEN 1,1              n/a
    06H          6           SCREEN 2                n/a
    07H          7           n/a                     MODE MONO
    ──────────────────────────────────────────────────────────────────────────

    Figure 4-3.  The BASIC and DOS commands used to change video modes.

Display Resolution

    Video images consist of a large number of closely spaced pixels. The
    display resolution is defined by the number of pixel rows, or scan lines,
    from top to bottom and the number of pixels from left to right in each
    scan line. The horizontal and vertical resolution is limited by the
    capabilities of the video monitor as well as the display circuitry inside
    the computer. The video modes available on the different subsystems were
    carefully designed so that the horizontal and vertical resolution in each
    mode is within the limits imposed by the hardware.

    The MDA's single text mode has 720 x 350 pixel resolution; that is, the
    screen has 350 scan lines, each of which contains 720 pixels. Because 25
    rows of 80 characters of text are displayed in this mode, each character
    is 9 pixels wide (720 ÷ 80) and 14 pixels high (350 ÷ 25). The CGA's text
    modes are a bit lower resolution, because the CGA's pixel resolution is
    only 640 x 200. Thus the 25 rows of 80-character text on a CGA consist of
    characters that are only 8 pixels wide (640 ÷ 80) and 8 pixels high (200 ÷
    25). That's why text looks sharper on an MDA screen than on a CGA.

    The trend in the newer IBM video subsystems is to provide better vertical
    resolution. For example, the EGA's 80 x 25 text mode has 640 x 350 pixel
    resolution, so text characters are 8 x 14 pixels. On the MCGA, the default
    80 x 25 text mode has 640 x 400 resolution (8 x 16 characters), and on the
    VGA the same text mode has 720 x 400 resolution, so characters are each 9
    pixels wide and 16 pixels high. From a program's point of view, the 80 x
    25 text mode is the same on the CGA, the MCGA, and the VGA──it's display
    mode 3 in all cases──but a user sees much higher resolution when using a
    VGA or MCGA than when using one of the older subsystems.
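
    The character-cell arithmetic in the preceding paragraphs is easy to
    check in code. The following C fragment is only a sketch for
    illustration; the names cell_size and char_cell are our own inventions
    and are not part of the ROM BIOS or of any library.

```c
#include <assert.h>

/* Size of one text character cell, in pixels. */
struct cell_size {
    int width;
    int height;
};

/* Divide the pixel resolution of a text mode by its column and row
   counts to get the size of each character cell. */
struct cell_size char_cell(int h_pixels, int v_pixels, int columns, int rows)
{
    struct cell_size cell;

    assert(columns > 0 && rows > 0);
    cell.width  = h_pixels / columns;
    cell.height = v_pixels / rows;
    return cell;
}
```

    Calling char_cell(720, 350, 80, 25) yields the MDA's 9 x 14 cell, and
    char_cell(640, 200, 80, 25) yields the CGA's 8 x 8 cell.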

    You see the same trend towards better resolution when you examine the
    graphics modes available with the newer video subsystems. The VGA's 640 x
    480, 16-color mode has more than twice as many pixels on the screen as the
    original CGA's 640 x 200 graphics mode. It's ironic that this CGA mode was
    known as a "high-resolution" mode when the CGA was new.


The Use of Color

    A variety of colors is available in every video mode except of course on a
    monochrome display. You may have noticed that among the various modes
    there are substantial differences in the number of colors available. In
    this section, we will describe the color options for the video modes.

    Colors for the video display screens are produced by combinations of four
    elements: three color components──red, green, and blue──plus an intensity,
    or brightness, component. Text and graphics modes use the same colors and
    intensity options, but they combine them in different ways to produce
    their colored displays. The text modes, whose basic unit is a character
    composed of several pixels, use an entire byte to set the color, the
    intensity, and the blinking characteristics of the character and its
    background. In graphics modes, each pixel is represented by a group of 1
    through 8 bits whose value determines the color and brightness of the
    displayed pixel.

    In 16-color text and graphics modes, the four basic color and brightness
    components can be combined in 16 ways. Colors are specified by a group of
    4 bits. Each bit designates whether a particular color component is on or
    off. The result is 16 color combinations that correspond to the 16 4-bit
    binary numbers. (See Figure 4-4.)

    In some video modes, the data in the video buffer consists of 4-bit
    attribute values that correspond exactly to the 16 possible color
    combinations on the screen. In other video modes, the attribute values do
    not directly specify colors. For example, on the EGA, each attribute value
    designates one of 16 palette registers, each of which contains a color
    value. (See Figure 4-5.) It is the palette color values that determine
    the color combinations displayed on the screen.

    Intensity Red       Green    Blue      Binary    Hex       Description
    ───────────────────────────────────────────────────────────────────────────
    0         0         0        0         0000B     00H       Black
    0         0         0        1         0001B     01H       Blue
    0         0         1        0         0010B     02H       Green
    0         0         1        1         0011B     03H       Cyan (blue-green)
    0         1         0        0         0100B     04H       Red
    0         1         0        1         0101B     05H       Magenta
    0         1         1        0         0110B     06H       Brown (or dark
                                                                yellow)
    0         1         1        1         0111B     07H       Light gray (or
                                                                ordinary white)
    1         0         0        0         1000B     08H       Dark gray (black
                                                                on many screens)
    1         0         0        1         1001B     09H       Light blue
    1         0         1        0         1010B     0AH       Light green
    1         0         1        1         1011B     0BH       Light cyan
    1         1         0        0         1100B     0CH       Light red
    1         1         0        1         1101B     0DH       Light magenta
    1         1         1        0         1110B     0EH       Yellow (or light
                                                                yellow)
    1         1         1        1         1111B     0FH       Bright white
    ───────────────────────────────────────────────────────────────────────────

    Figure 4-4.  Default colors available in 16-color text and graphics modes.
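
    The table in Figure 4-4 maps directly onto four bit masks. The C
    fragment below is a minimal sketch of that mapping. The macro and
    function names are hypothetical, chosen only for this example; the
    color names are the short forms of the descriptions in the figure.

```c
#include <assert.h>

/* Bit masks for the four components of a 4-bit color value. */
#define ATTR_BLUE      0x01
#define ATTR_GREEN     0x02
#define ATTR_RED       0x04
#define ATTR_INTENSITY 0x08

/* Default color names, indexed by 4-bit value (see Figure 4-4). */
static const char *const color_names[16] = {
    "Black",      "Blue",       "Green",       "Cyan",
    "Red",        "Magenta",    "Brown",       "Light gray",
    "Dark gray",  "Light blue", "Light green", "Light cyan",
    "Light red",  "Light magenta", "Yellow",   "Bright white"
};

/* Build a 4-bit color value from its on/off components. */
unsigned make_color(int intensity, int red, int green, int blue)
{
    unsigned value = 0;

    if (intensity) value |= ATTR_INTENSITY;
    if (red)       value |= ATTR_RED;
    if (green)     value |= ATTR_GREEN;
    if (blue)      value |= ATTR_BLUE;
    return value;
}

/* Look up the default name of a 4-bit color value. */
const char *color_name(unsigned value)
{
    assert(value < 16);
    return color_names[value];
}
```

    For example, make_color(0, 1, 1, 0) returns 06H, which color_name
    reports as brown.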

    ┌──────────────────────────────────── ┌─────────────┐
    │ 0110 1010 1010 0111 0101 0101 1101  │▒▒▒▒▒▒▒▒▒▒▒▒▒│
    │                └─┬┘                 ├─────────────┤
    │                  │                  │▒▒▒▒▒▒▒▒▒▒▒▒▒│
    │                  │                  ├─────────────┤
    │                  │                  │▒▒▒▒▒▒▒▒▒▒▒▒▒│
    │                  │                  ├─────────────┤
    │                  │                  │▒▒▒▒▒▒▒▒▒▒▒▒▒│
    │                  │                  ├─────────────┤    ┌──────────
    │                  └─────────────────►│ Color value ├───►│ ┌────────
    │                                     ├─────────────┤    │ │ ▒
    │                                     │▒▒▒▒▒▒▒▒▒▒▒▒▒│    │ │
    │                                     ├─────────────┤    │ │
    │                                     │▒▒▒▒▒▒▒▒▒▒▒▒▒│    │ │
                                        ├─────────────┤
                                        │▒▒▒▒▒▒▒▒▒▒▒▒▒│
                Attribute value in         └─────────────┘     Color on
                video buffer           Palette registers     screen

    Figure 4-5.  How EGA colors are specified using palette registers. Each
    attribute value in the video buffer designates a palette register whose
    contents specify a color.

    The use of palettes makes it possible to specify one of a broad range of
    colors using relatively few bits of data in the video buffer. Each of the
    EGA's 16 palette registers, for example, can contain one of 64 different
    6-bit color values. In this way, any 2 of 64 different colors can be used
    in a 2-color EGA video mode, any 4 out of 64 can be used in a 4-color
    mode, and any 16 of 64 can be used in a 16-color mode.

    All IBM video subsystems except the MDA can use palettes to display
    colors. The CGA has three built-in, 4-color palettes for use in 320 x 200,
    4-color mode. The EGA, as we have seen, has a 16-color palette in which
    each color can be selected from a set of 64 colors. The MCGA and the VGA,
    which can display an even wider range of colors, use a separate
    palette-like component, the video digital to analog converter (video DAC),
    to send color signals to the screen.

    The video DAC contains 256 color registers, each of which contains 6-bit
    color values for red, green, and blue. Since there are 64 possible values
    for each of the RGB components, each video DAC color register can contain
    one of 64 x 64 x 64, or 262,144 different color values. That wide range of
    colors can help you display very subtle color shades and contours.
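
    The color count follows from the register layout. As a rough sketch,
    a video DAC color register can be modeled in C as three 6-bit fields.
    The structure and function names here are assumptions made for the
    example, not a description of how the hardware is programmed.

```c
#include <assert.h>

#define DAC_COMPONENT_BITS   6
#define DAC_COMPONENT_LEVELS (1L << DAC_COMPONENT_BITS)   /* 64 levels */

/* One video DAC color register: 6-bit red, green, and blue values. */
struct dac_color {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
};

/* Number of distinct colors one DAC register can hold. */
long dac_color_count(void)
{
    return DAC_COMPONENT_LEVELS * DAC_COMPONENT_LEVELS * DAC_COMPONENT_LEVELS;
}

/* A color is valid only if every component fits in 6 bits. */
int dac_color_is_valid(struct dac_color c)
{
    return c.red   < DAC_COMPONENT_LEVELS &&
           c.green < DAC_COMPONENT_LEVELS &&
           c.blue  < DAC_COMPONENT_LEVELS;
}
```

    Here dac_color_count() returns 262,144, the figure given above.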

    With the MCGA, the video DAC color registers serve much the same purpose
    as the palette registers do with the EGA. Attribute values in the video
    buffer designate video DAC color registers whose contents specify the
    colors that appear on the screen. Unfortunately, only one MCGA video mode
    can take full advantage of the video DAC's capabilities: 320 x 200,
    256-color mode. Only this video mode uses 8-bit attribute values that can
    specify all 256 of the video DAC's color registers. All remaining video
    modes use attribute values that have no more than 4 bits, so only the
    first 16 video DAC color registers are used.

    The VGA gets around this limitation (and complicates matters somewhat) by
    using a set of 16 palette registers like the EGA's, as well as a set of
    256 video DAC color registers like the MCGA's. An attribute value in the
    video buffer selects one of the 16 palette registers, whose contents
    select one of the 256 video DAC color registers, whose contents in turn
    determine the color displayed on the screen. (See Figure 4-6.)
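
    The two-step lookup on the VGA can be sketched in C as a pair of
    array indexes. This is a software model for illustration only. The
    structure vga_color_state and the function vga_resolve are
    hypothetical names, and the model assumes the simple case of a 4-bit
    attribute value.

```c
#include <assert.h>

/* One video DAC color register: 6-bit red, green, and blue values. */
struct dac_color {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
};

/* Software model of the VGA color registers (not the real hardware). */
struct vga_color_state {
    unsigned char    palette[16];  /* each entry selects a DAC register */
    struct dac_color dac[256];     /* each entry holds an RGB color     */
};

/* Follow an attribute value through the palette register to the
   video DAC color register that determines the displayed color. */
struct dac_color vga_resolve(const struct vga_color_state *state,
                             unsigned attribute)
{
    assert(attribute < 16);
    return state->dac[state->palette[attribute]];
}
```

    Changing either a palette register or the DAC register it selects
    changes the color that vga_resolve returns for the same attribute
    value, which is why the VGA scheme is more flexible than the EGA's.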

    Specifying colors on an EGA, MCGA, or VGA is clearly more complicated than
    it is on the CGA. To simplify this process, however, the ROM BIOS loads
    the palette registers (on the EGA and VGA) and the video DAC color
    registers (on the MCGA and VGA) with color values that exactly match those
    available on the CGA. If you use CGA-compatible text and graphics modes on
    the newer subsystems and ignore the palette and video DAC registers,
    you'll see the same colors you would on a CGA.

    ┌──────────────────────────────────────
    │ 0110 1010 1010 0111 0101 0101 1101
    │                └─┬┘                            •
    │                  │   ┌─────────────┐    │      •      │
    │                  │   │▒▒▒▒▒▒▒▒▒▒▒▒▒│    │      •      │
    │                  │   ├─────────────┤    ├─────────────┤
    │                  │   │▒▒▒▒▒▒▒▒▒▒▒▒▒│    │▒▒▒▒▒▒▒▒▒▒▒▒▒│
    │                  │   ├─────────────┤    ├─────────────┤
    │ Attribute value  │   │▒▒▒▒▒▒▒▒▒▒▒▒▒│    │▒▒▒▒▒▒▒▒▒▒▒▒▒│
    │ in video buffer  │   ├─────────────┤    ├─────────────┤
    │                  └──►│    Color    ├─┐  │▒▒▒▒▒▒▒▒▒▒▒▒▒│
    │                      │   register  │ │  ├─────────────┤   ┌─────────
    │                      ├─────────────┤ │  │  RGB color  ├──►│ ┌───────
    │                      │▒▒▒▒▒▒▒▒▒▒▒▒▒│ └─►│    value    │   │ │ ▒
    │                      ├─────────────┤    ├─────────────┤   │ │
                            │▒▒▒▒▒▒▒▒▒▒▒▒▒│    │▒▒▒▒▒▒▒▒▒▒▒▒▒│   │ │
                            ├─────────────┤    ├─────────────┤   │ │
                            │▒▒▒▒▒▒▒▒▒▒▒▒▒│    │▒▒▒▒▒▒▒▒▒▒▒▒▒│   │ │
                            └─────────────┘    ├─────────────┤
                            Palette registers  │▒▒▒▒▒▒▒▒▒▒▒▒▒│    Color on
                                            ├─────────────┤     screen
                                            │      •      │
                                            │      •      │
                                                    •
                                        Video DAC color registers

    Figure 4-6.  How VGA colors are specified using palette registers and the
    video DAC.

    For this reason it's usually best to ignore the palette and video DAC
    registers when you start developing an application. Once your application
    works properly with the CGA-compatible colors, you can add program code
    that changes the palette and/or the video DAC colors. The ROM BIOS
    provides a complete set of services that let you access the palette and
    video DAC registers. Chapter 9 covers these services in detail.

    The sections that follow discuss several other important color-related
    topics.

Color-Suppressed Modes

    In an effort to make the graphics modes compatible with a wide range of
    monitors, both color and monochrome, IBM included a few modes on the Color
    Graphics Adapter that do not produce color: color-suppressed modes. There
    are three color-suppressed modes: modes 0, 2, and 5. In these modes,
    colors are converted into shades of gray, or whatever color the screen
    phosphor produces. There are four gray shades in mode 5, and a variety of
    shades in modes 0 and 2. The CGA suppresses color in its composite output
    but not in its RGB output. This inconsistency is the result of an
    unavoidable technical limitation.

    ──────────────────────────────────────────────────────────────────────────
    NOTE:
    For each color-suppressed mode, there is a corresponding color mode, so
    modes 0 and 1 correspond to 40-column text, modes 2 and 3 to 80-column
    text, and modes 4 and 5 to medium-resolution graphics. The fact that
    modes 4 and 5 reverse the pattern of modes 0 and 1 and modes 2 and 3,
    where the color-suppressed mode comes first, has led to a complication
    in BASIC. The burst parameter of the BASIC SCREEN statement controls
    color. The meaning of this parameter is reversed for modes 4 and 5 so
    that the statement SCREEN,1 activates color in the text modes (0, 1, 2,
    and 3) but suppresses color in the graphics modes (4 and 5). This
    inconsistency may have been a programming error at first, but it is now
    part of the official definition of the SCREEN statement.
    ──────────────────────────────────────────────────────────────────────────

Color in Text and Graphics Modes

    Text and graphics modes use the same color-decoding circuitry, but differ
    in the way they store the color attribute data in the video buffer. In
    text modes, no matter what video subsystem you use, the foreground and
    background colors of each character are specified by two 4-bit fields in a
    single attribute byte. (See Figure 4-7.) Together, the foreground and
    background attributes describe all of a character's pixels: All foreground
    pixels are displayed with the character's foreground attribute, and all
    background pixels assume the background attribute.

        Bit
    7 6 5 4 3 2 1 0          Use
    ──────────────────────────────────────────────────────────────────────────
    1 . . . . . . .          Blinking of foreground character or intensity
                            component of background color
    . 1 . . . . . .          Red component of background color
    . . 1 . . . . .          Green component of background color
    . . . 1 . . . .          Blue component of background color
    . . . . 1 . . .          Intensity component of foreground color
    . . . . . 1 . .          Red component of foreground color
    . . . . . . 1 .          Green component of foreground color
    . . . . . . . 1          Blue component of foreground color
    ──────────────────────────────────────────────────────────────────────────

    Figure 4-7.  The coding of the color attribute byte.
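
    The layout in Figure 4-7 amounts to two 4-bit fields packed into one
    byte. A minimal C sketch of packing and unpacking follows. The
    function names are our own, introduced only for this illustration.

```c
#include <assert.h>

/* Combine 4-bit foreground and background values into one attribute
   byte: background in bits 7-4, foreground in bits 3-0. */
unsigned char make_attr(unsigned foreground, unsigned background)
{
    assert(foreground < 16 && background < 16);
    return (unsigned char)((background << 4) | foreground);
}

/* Extract the 4-bit foreground field (bits 3-0). */
unsigned attr_foreground(unsigned char attr)
{
    return attr & 0x0F;
}

/* Extract the 4-bit background field (bits 7-4). */
unsigned attr_background(unsigned char attr)
{
    return (attr >> 4) & 0x0F;
}
```

    For example, make_attr(7, 0) returns 07H (white on black), and
    make_attr(0, 7) returns 70H, the same colors exchanged.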

    In graphics modes, each pixel's attribute is determined by the contents of
    a bit field in the video buffer. The size and format of a pixel's bit
    field depend on the video mode: The smallest bit fields are only 1 bit
    wide (as in 640 x 200, 2-color mode), and the largest bit fields are 8
    bits wide (as in 320 x 200, 256-color mode).

    The reason for having both text and graphics modes becomes clear if you
    think about how much data it takes to describe the pixels on the screen.
    In graphics modes, you need between 1 and 8 bits of data in the video
    buffer for every pixel you display. In 640 x 350, 16-color mode, for
    instance, with 4 bits per pixel, you need 640 x 350 x 4 ÷ 8 (112,000)
    bytes to represent one screenful of video data. But if you display 25 rows
    of 80 characters in a text mode with the same resolution, you need only 80
    x 25 x 2, or 4000, bytes.
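
    The same arithmetic works for any mode. The following C fragment is
    a sketch of the two calculations; graphics_bytes and text_bytes are
    hypothetical helper names, and the results count only the bytes
    needed for one screenful, not the memory installed on the adapter.

```c
#include <assert.h>

/* Bytes needed for one screenful in a graphics mode. */
long graphics_bytes(long width, long height, int bits_per_pixel)
{
    assert(bits_per_pixel >= 1 && bits_per_pixel <= 8);
    return width * height * bits_per_pixel / 8;
}

/* Bytes needed for one screenful in a text mode: one character byte
   plus one attribute byte for each character position. */
long text_bytes(int columns, int rows)
{
    return (long)columns * rows * 2;
}
```

    Here graphics_bytes(640, 350, 4) returns 112,000 and
    text_bytes(80, 25) returns 4000, matching the figures above.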

    The tradeoff is clear: Text modes consume less memory and require less
    data manipulation than do graphics modes──but you can manipulate each
    pixel independently in graphics modes, as opposed to manipulating entire
    characters in text modes.

    Setting color in text modes

    Let's take a closer look at how you control colors in text modes. (We'll
    get back to graphics modes later in this chapter.) In text modes, each
    character position on the display screen is controlled by a pair of
    adjacent bytes in the video buffer. The first byte contains the ASCII code
    for the character that will be displayed. (See Appendix C for a chart of
    characters.) The second byte is the character's attribute byte. It
    controls how the character will appear, that is, its colors, brightness
    (intensity), and blinking.
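
    To make the byte pairing concrete, here is a C sketch that stores a
    character and its attribute in an ordinary array standing in for the
    video buffer. It assumes an 80-column mode and assumes that character
    positions are stored row by row starting at the upper left corner of
    the screen, a layout this passage has not yet described. The names
    cell_offset and put_cell are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TEXT_COLUMNS 80
#define TEXT_ROWS    25

/* Offset of a character position within the buffer. Each position
   occupies two adjacent bytes (assumed row-by-row layout). */
size_t cell_offset(int row, int col)
{
    assert(row >= 0 && row < TEXT_ROWS);
    assert(col >= 0 && col < TEXT_COLUMNS);
    return ((size_t)row * TEXT_COLUMNS + (size_t)col) * 2;
}

/* Store a character and its attribute at the given position. */
void put_cell(unsigned char *buffer, int row, int col,
              unsigned char ch, unsigned char attr)
{
    size_t offset = cell_offset(row, col);

    buffer[offset]     = ch;     /* first byte: ASCII code      */
    buffer[offset + 1] = attr;   /* second byte: attribute byte */
}
```

    In a real-mode program the buffer pointer would refer to the video
    buffer itself; here any array of 4000 bytes will do for
    experimenting.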

    We've already mentioned two attributes that affect a character's
    appearance: color and intensity (brightness). You can assign several other
    attributes to text characters, depending on which video subsystem you're
    using. With all IBM video subsystems, text characters can blink. On
    monochrome-capable subsystems (the MDA, EGA, and VGA), characters can also
    be underlined. Also, on some non-IBM subsystems like the Hercules Graphics
    Card Plus, characters can have attributes such as overstrike and boldface.

    In all cases, you assign these alternate attributes by using the same
    4-bit attributes that specify color. A case in point is the blinking
    attribute. Character blinking is controlled by setting a bit in a special
    register in the video subsystem. (On the CGA, for example, this
    enable-blink bit is bit 5 of the 8-bit register mapped at input/output
    port 3D8H.) When this bit is set to 1, the high-order bit of each
    character's attribute byte is not interpreted as part of the character's
    background color specification. Instead, this bit indicates whether the
    character should blink.

    If you have a CGA, watch what happens when you run the following BASIC
    program:

    10 DEF SEG = &HB800             ' point to start of video buffer
    20 POKE 0,ASC("A")              ' store the ASCII code for A in the buffer
    30 POKE 1,&H97                  ' foreground attribute = 7 (white)
                                    ' background attribute = 9 (intense blue)

    You'll see a blinking white letter A on a blue background. If you add the
    following statement to the program, you'll clear the enable-blink bit and
    cause the CGA to interpret the background attribute as intense blue:

    40 OUT &H3D8,&H09               ' clear the "enable-blink" bit
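
    The effect of the enable-blink bit on the attribute byte can also be
    expressed in C. The sketch below only decodes an attribute byte in
    software; it does not touch the hardware register. The structure
    text_attr and the function decode_attr are names invented for this
    example.

```c
/* Decoded meaning of a text-mode attribute byte. */
struct text_attr {
    unsigned foreground;   /* 4-bit foreground value       */
    unsigned background;   /* 3-bit or 4-bit background    */
    int      blink;        /* nonzero if character blinks  */
};

/* Interpret an attribute byte according to the enable-blink setting. */
struct text_attr decode_attr(unsigned char attr, int blink_enabled)
{
    struct text_attr result;

    result.foreground = attr & 0x0F;
    if (blink_enabled) {
        /* Bit 7 selects blinking; background is bits 6-4 only. */
        result.background = (attr >> 4) & 0x07;
        result.blink      = (attr & 0x80) != 0;
    } else {
        /* Bit 7 is the intensity component of the background. */
        result.background = (attr >> 4) & 0x0F;
        result.blink      = 0;
    }
    return result;
}
```

    With the attribute 97H from the BASIC program, decode_attr reports
    blinking white (7) on blue (1) when blinking is enabled, and
    nonblinking white (7) on intense blue (9) when it is disabled.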

    The default attribute used by DOS and BASIC is 07H, normal white (7) on
    black (0), without blinking, but you can use any combination of 4-bit
    foreground and background attributes for each character displayed in a
    text mode. If you exchange a character's foreground and background
    attributes, the character is displayed in "reverse video." If the
    foreground and background attributes are the same, the character is
    "invisible."

    Setting attributes in the monochrome mode

    The monochrome mode (mode 7) used by the Monochrome Display Adapter has a
    limited selection of attributes that take the place of color. Like the
    CGA, the MDA uses 4-bit foreground and background attributes, but their
    values are interpreted differently by the MDA attribute decoding
    circuitry.

    Only certain combinations of foreground and background attributes are
    recognized by the MDA. (See Figure 4-8.) Other useful combinations, like
    "invisible" (white-on-white) or a reverse-video/underlined combination,
    aren't supported by the hardware.

    Like the CGA, the MDA has an enable-blink bit that determines whether the
    high-order bit of each character's attribute byte controls blinking or the
    intensity of the background attribute. On the MDA, the enable-blink bit is
    bit 5 of the register at port 3B8H. As on the CGA, the enable-blink bit is
    set by the ROM BIOS when it establishes monochrome text mode 7, so you
    must explicitly clear this bit if you want to disable blinking and display
    characters with intensified background.

    With the EGA, MCGA, and VGA, text-mode attributes work the same as with
    the MDA and CGA. Although the enable-blink bit is not in the same hardware
    register in the newer subsystems, the ROM BIOS offers a service through
    interrupt 10H that toggles the bit on an EGA, MCGA, or VGA. (See Chapter
    9 for more information about this service.)

    Attribute  Description
    ──────────────────────────────────────────────────────────────────────────
    00H        Nondisplayed

    01H        Underlined

    07H        Normal (white on black)

    09H        High-intensity underlined

    0FH        High-intensity

    70H        White background, black foreground ("reverse video")

    87H        Blinking white on black (if blinking enabled)
               Dim background, normal foreground (if blinking
               disabled)

    8FH        Blinking high-intensity (if blinking enabled)
               Dim background, high-intensity foreground (if blinking
               disabled)

    F0H        Blinking "reverse video" (if blinking enabled)
               High-intensity background, black foreground (if blinking
               disabled)
    ──────────────────────────────────────────────────────────────────────────


    Figure 4-8.  Monochrome text-mode attributes. The appearance of some
    attributes depends on the setting of the enable-blink bit at I/O port
    3B8H.

    Setting color in graphics modes

    So far, we've seen how to set color (and the monochrome equivalent of
    color) in text modes. Setting color in graphics modes is quite different.
    In graphics modes, each pixel is associated with a color. The color is set
    the same way attributes are set in text mode, but there are important
    differences. First, since each pixel is a discrete dot of color, there is
    no foreground and background──each pixel is simply one color or another.
    Second, pixel attributes are not always 4 bits in size──we've already
    mentioned that pixel attributes can range from 1 to 8 bits, depending on
    the video mode being used. These differences give graphics-mode programs a
    subtly different "feel" from text-mode programs, both to programmers and
    to users.

    The most important difference between text-mode and graphics-mode
    attributes, however, is this: In graphics modes you can control the color
    of each pixel. This lets you use colors much more effectively than you can
    in text modes. This isn't so obvious with the CGA and its limited color
    capabilities, but with an MCGA or VGA it's quite apparent.

    Let's start with the CGA. The CGA's two graphics modes are relatively
    limited in terms of color: In 320 x 200, 4-color mode, pixel attributes
    are only 2 bits wide, and you can display only four different colors at a
    time. In 640 x 200, 2-color mode, you have only 1 bit per pixel, so you
    can display only two different colors. Also, the range of colors you can
    display in CGA graphics modes is severely limited.

    In 320 x 200, 4-color mode, pixels can have value 0, 1, 2, or 3,
    corresponding to the 2-bit binary values 00B, 01B, 10B, and 11B. You can
    assign any one of the CGA's 16 color combinations to zero-value pixels,
    but colors for nonzero pixels are derived from one of three built-in
    palettes. (See Figure 4-9.) In 640 x 200, 2-color mode, nonzero pixels
    can be assigned any one of the 16 color combinations, but zero-value
    pixels are always black. In both modes, you can assign palette colors
    using ROM BIOS interrupt 10H services described in Chapter 9.

    The EGA, MCGA, and VGA are considerably more flexible in terms of color
    management, because you can assign any color combination to any palette or
    video DAC color register. Equally important is the fact that you have
    larger pixel values and therefore more colors to work with on the screen.
    The most frequently used graphics modes on the EGA and VGA are the
    16-color modes with pixels that require 4 bits to define the colors. In
    most applications, 16 colors are adequate, because you can select those 16
    colors from the entire range of color combinations the hardware can
    display (64 colors on the EGA and 262,144 colors on the MCGA and VGA).
    Again, the ROM BIOS provides services that let you assign arbitrary color
    combinations to the palette and video DAC color registers on the EGA,
    MCGA, and VGA. See Chapter 9 for details.

    Pixel Bits               Pixel Value             Pixel Color
    ──────────────────────────────────────────────────────────────────────────
    Mode 4, palette 0:
    0 1                      1                       Green
    1 0                      2                       Red
    1 1                      3                       Yellow or brown

    Mode 4, palette 1:
    0 1                      1                       Cyan
    1 0                      2                       Magenta
    1 1                      3                       White

    Mode 5:
    0 1                      1                       Cyan
    1 0                      2                       Red
    1 1                      3                       White
    ──────────────────────────────────────────────────────────────────────────

    Figure 4-9.  Palettes in CGA 320 x 200, 4-color graphics mode.
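
    Figure 4-9 can be treated as a small lookup table. The C sketch below
    encodes it and also shows how a 2-bit pixel value might be extracted
    from a byte of the video buffer. The extraction assumes that each
    byte holds four pixels with the leftmost pixel in the two high-order
    bits, a detail this passage does not state. All of the names are
    hypothetical.

```c
#include <assert.h>

/* The three built-in palettes of Figure 4-9. */
enum cga_palette { CGA_MODE4_PALETTE0, CGA_MODE4_PALETTE1, CGA_MODE5 };

/* Colors for pixel values 1, 2, and 3 in each palette. */
static const char *const cga_colors[3][3] = {
    { "Green", "Red",     "Yellow or brown" },
    { "Cyan",  "Magenta", "White"           },
    { "Cyan",  "Red",     "White"           }
};

/* Name of the color displayed for a nonzero pixel value. */
const char *cga_pixel_color(enum cga_palette palette, unsigned pixel)
{
    assert(pixel >= 1 && pixel <= 3);
    return cga_colors[palette][pixel - 1];
}

/* Extract pixel 0..3 (left to right) from one buffer byte, assuming
   the leftmost pixel occupies bits 7-6. */
unsigned cga_pixel_in_byte(unsigned char byte, int index)
{
    assert(index >= 0 && index < 4);
    return (byte >> (6 - 2 * index)) & 0x03;
}
```

    Under that assumption, the byte 1BH (binary 00 01 10 11) holds the
    pixel values 0, 1, 2, and 3 from left to right. Pixel value 0 is not
    in the table because its color is assigned separately.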


Inside the Display Memory

    Now we come to the inner workings of the video buffer map. In this
    section, we'll see how the information in the video memory is related to
    the display screen.

    Although the video buffer memory map varies according to the video mode
    you use, a clear family resemblance exists among the video modes. In text
    modes, the video buffer map in all IBM video subsystems is the same. In
    graphics modes, there are two general layouts: a linear map based on the
    map used with the original CGA graphics modes and a parallel map that was
    first used in EGA graphics modes.

    Video Mode     Starting       Memory Used   Subsystem
                    Paragraph      (bytes)
                    Address (hex)
    ──────────────────────────────────────────────────────────────────────────
    00H, 01H       B800H             2000       CGA, EGA, MCGA, VGA
    02H, 03H       B800H             4000       CGA, EGA, MCGA, VGA
    04H, 05H       B800H           16,000       CGA, EGA, MCGA, VGA
    06H            B800H           16,000       CGA, EGA, MCGA, VGA
    07H            B000H             4000       MDA, EGA, VGA
    0DH            A000H           32,000       EGA, VGA
    0EH            A000H           64,000       EGA, VGA
    0FH            A000H           56,000       EGA, VGA
    10H            A000H          112,000       EGA, VGA
    11H            A000H           38,400       MCGA, VGA
    12H            A000H          153,600       VGA
    13H            A000H           64,000       MCGA, VGA
    ──────────────────────────────────────────────────────────────────────────

    Figure 4-10.  Video buffer addresses in IBM video modes.

    Before we examine the actual map of the video buffer, let's look at the
    addresses where the video buffer is located. (See Figure 4-10.) The
    breakdown is straightforward: Color text modes start at paragraph address
    B800H, and monochrome text mode starts at B000H. CGA-compatible graphics
    modes start at B800H. All other graphics modes start at A000H. The amount
    of RAM required to hold a screenful of data varies according to the number
    of characters or pixels displayed, and, in the case of graphics modes,
    with the number of bits that represent a pixel.
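
    The relationships behind Figure 4-10 can be sketched in a few lines of
    Python. This is only an illustration: the function names are invented
    here, and the sample geometries are the 40 x 25 and 80 x 25 text formats,
    the 320 x 200, 256-color mode, and the 640 x 350, 16-color mode.

```python
# Sketch: relate paragraph addresses and screen geometry to Figure 4-10.
# All function names are illustrative, not part of any BIOS or library.

def linear_address(paragraph):
    """Convert a paragraph (segment) address to a 20-bit linear address."""
    return paragraph * 16

def text_screen_bytes(columns, rows):
    """Each character cell takes 2 bytes: character code plus attribute."""
    return columns * rows * 2

def graphics_screen_bytes(width, height, bits_per_pixel):
    """Total bytes needed to hold one screenful of pixels."""
    return width * height * bits_per_pixel // 8

print(hex(linear_address(0xB800)))          # 0xb8000
print(text_screen_bytes(40, 25))            # 2000
print(text_screen_bytes(80, 25))            # 4000
print(graphics_screen_bytes(320, 200, 8))   # 64000
print(graphics_screen_bytes(640, 350, 4))   # 112000
```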

Display Pages in Text Modes

    The RAM physically installed in the various video subsystems can often
    hold more than one screen's worth of video data. In video modes where
    this is true, all IBM video subsystems
    support multiple display pages. When you use display pages, the video
    buffer is mapped into two or more areas, and the video hardware is set up
    to selectively display any one of these areas in the map.

    Because only one page is displayed at any given time, you can write
    information into nondisplayed pages as well as directly to the displayed
    page. Using this technique you can build a screen on an invisible page
    while another page is being displayed and then switch to the new page when
    the appropriate time comes. Switching screen images this way makes screen
    updates seem instantaneous.

    The display pages are numbered 0 through 7, with page 0 starting at the
    beginning of the video buffer. Of course, the amount of available RAM may
    be insufficient to support eight full display pages; the actual number of
    pages you can use (see Figure 4-11) depends on how much video RAM is
    available and on how much memory is required for one screenful of data.
    Each page begins on an even kilobyte memory boundary. The display page
    offset addresses are shown in Figure 4-12.

    To select a display page, use ROM BIOS interrupt 10H, service 05H. To
    determine which page is actively displayed, use interrupt 10H, service
    0FH. (See Chapter 9 for information about these ROM BIOS services.)

    In any of these modes, the memory belonging to pages that are never
    displayed can conceivably be used for data other than text or pixels,
    although this usage is neither normal nor advisable. Making any other use
    of this potentially free memory is asking for trouble in the future.

                                        Number of
    Video Mode  Subsystem                Pages       Notes
    ──────────────────────────────────────────────────────────────────────────
    00H, 01H    CGA, EGA, MCGA, VGA      8
    02H, 03H    CGA                      4
                EGA, MCGA, VGA           8
    04H, 05H    CGA, MCGA                1
                EGA, VGA                 2           Not fully supported by
                                                    ROM BIOS
    06H         CGA, EGA, MCGA, VGA      1
    07H         MDA                      1
                EGA, VGA                 8
    0DH         EGA, VGA                 8
    0EH         EGA, VGA                 4
    0FH         EGA, VGA                 2
    10H         EGA, VGA                 2
    11H         MCGA, VGA                1
    12H         VGA                      1
    13H         MCGA, VGA                1
    ──────────────────────────────────────────────────────────────────────────

    Figure 4-11.  Display pages available in IBM video subsystems.

    Page   40 x 25, 16-color      80 x 25, 16-color     80 x 25 Mono
    ──────────────────────────────────────────────────────────────────────────
    0      B800:0000H             B800:0000H            B000:0000H
    1      B800:0800H             B800:1000H            B000:1000H*
    2      B800:1000H             B800:2000H            B000:2000H*
    3      B800:1800H             B800:3000H            B000:3000H*
    4      B800:2000H             B800:4000H*           B000:4000H*
    5      B800:2800H             B800:5000H*           B000:5000H*
    6      B800:3000H             B800:6000H*           B000:6000H*
    7      B800:3800H             B800:7000H*           B000:7000H*
    ──────────────────────────────────────────────────────────────────────────
    *  Not available on the CGA (80 x 25, 16-color) or the MDA (80 x 25 Mono);
       see Figure 4-11.

    Figure 4-12.  Start addresses for text-mode display pages in IBM video
    subsystems.
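
    The offsets in Figure 4-12 follow a simple pattern, which the following
    Python sketch reproduces. It assumes that pages are spaced 0800H bytes
    apart in 40-column modes and 1000H bytes apart in 80-column modes, as the
    figure shows; the function name is invented for the example.

```python
# Sketch: start offset of each text-mode display page, assuming the page
# spacing shown in Figure 4-12 (0800H for 40 columns, 1000H for 80 columns).

def page_offset(page_number, columns):
    """Offset of a display page from the start of the video buffer."""
    spacing = 0x0800 if columns == 40 else 0x1000
    return page_number * spacing

for page in range(8):
    print(page, "%04XH" % page_offset(page, 40), "%04XH" % page_offset(page, 80))
```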

Display Pages in Graphics Modes

    For the EGA and the VGA, the page concept is as readily available in
    graphics modes as in text modes. (As Figure 4-11 shows, the MCGA provides
    only one page in each of its graphics modes.) Obviously there is no reason
    not to have graphics pages if the memory is there to support them.

    The main benefit of using multiple pages for either graphics or text is to
    be able to switch instantly from one display screen to another without
    taking the time to build the display information from scratch. In theory,
    multiple pages could be used in graphics mode to produce smooth and
    fine-grained animation effects, but there aren't enough display pages to
    take the animation very far.

Displaying Characters in Text and Graphics Modes

    As you have learned, in text modes no character images are stored in video
    memory. Instead, each character is represented in the video buffer by a
    pair of bytes containing the character's ASCII value and display
    attributes. The pixels that make up the character are drawn on the screen
    by a character generator that is part of the display circuitry. The Color
    Graphics Adapter has a character generator that produces characters in an
    8 x 8 pixel block format, while the Monochrome Display Adapter's character
    generator uses a 9 x 14 pixel block format. The larger format is one of
    the factors that makes the MDA's display output easier to read.

    The standard ASCII characters (00H through 7FH [decimal 0 through 127])
    represent only half of the ASCII characters available in the text modes.
    An additional 128 graphics characters (80H through FFH [decimal 128
    through 255]) are available through the same character generator. More
    than half of them can be used to make simple line drawings. A complete
    list of both the standard ASCII characters and the graphics characters
    provided by IBM is given in Appendix C.

    The graphics modes can also display characters, but they are produced
    quite differently. Graphics-mode characters are drawn, pixel by pixel, by
    a ROM BIOS software character generator, instead of by a hardware
    character generator. (ROM BIOS interrupt 10H provides this service; see
    Chapter 9.) The software character generator refers to a table of bit
    patterns to determine which pixels to draw for each character. The ROM of
    every PC and PS/2 contains a default table of character bit patterns, but
    you can also place a custom bit pattern table in RAM and instruct the BIOS
    to use it to display your own character set.
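
    To make the idea of a bit pattern table concrete, here is a minimal
    Python sketch of what a software character generator does with one
    table entry. It assumes an 8 x 8 character stored as 8 bytes, one byte
    per pixel row, with bit 7 as the leftmost pixel. The pattern itself is
    made up for illustration and is not copied from any ROM table.

```python
# Sketch of a software character generator: expand one character's bit
# pattern into rows of pixels. Assumes 8 bytes per character, top row
# first, bit 7 leftmost. EXAMPLE_PATTERN is a hypothetical glyph.

EXAMPLE_PATTERN = [0x18, 0x3C, 0x66, 0x66, 0x7E, 0x66, 0x66, 0x00]

def glyph_rows(pattern):
    """Return one string per pixel row: '#' for a set bit, '.' for clear."""
    rows = []
    for byte in pattern:
        rows.append("".join("#" if byte & (0x80 >> bit) else "."
                            for bit in range(8)))
    return rows

for row in glyph_rows(EXAMPLE_PATTERN):
    print(row)
```

    A real character generator would write each set bit to the video buffer
    as a pixel instead of printing it.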

    In CGA-compatible graphics modes (640 x 200, 2-color and 320 x 200,
    4-color), the bit patterns for the second 128 ASCII characters are always
    found at the address stored in the interrupt 1FH vector at 0000:007CH. If
    you store a table of bit patterns in a buffer and then store the buffer's
    segment and offset at 0000:007CH, the ROM BIOS will use the bit patterns
    in the buffer for ASCII characters 80H through FFH (decimal 128 through
    255). In other graphics modes on the EGA, MCGA, and VGA, the ROM BIOS
    provides a service through interrupt 10H that lets you pass the address of
    a RAM-based table of character bit patterns for all 256 characters.
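
    The address 0000:007CH is simply the interrupt number multiplied by 4,
    because each interrupt vector occupies 4 bytes. The Python sketch below
    shows that arithmetic and how a segment and offset are laid out in the
    4 bytes of a vector (offset word first, low byte first). The function
    names and the sample address 2000:0100H are hypothetical.

```python
import struct

# Sketch: locating and encoding an interrupt vector. Names and the sample
# buffer address are illustrative only.

def vector_address(interrupt_number):
    """Offset of a vector in segment 0000H; each vector is 4 bytes long."""
    return interrupt_number * 4

def pack_vector(segment, offset):
    """Encode a vector: offset word, then segment word, low byte first."""
    return struct.pack("<HH", offset, segment)

def unpack_vector(raw):
    """Decode 4 vector bytes into a (segment, offset) pair."""
    offset, segment = struct.unpack("<HH", raw)
    return segment, offset

print("%04XH" % vector_address(0x1F))        # 007CH
print(pack_vector(0x2000, 0x0100).hex())     # 00010020
```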

    Mapping characters in text modes

    In text modes, the memory map begins with the top left corner of the
    screen, using 2 bytes per screen position. The memory bytes for succeeding
    characters immediately follow in the order you would read them──from left
    to right and from top to bottom.

    Modes 0 and 1 are text modes with a screen format of 40 columns by 25
    rows. Each row occupies 40 x 2 = 80 bytes. A screen occupies only 2 KB in
    modes 0 and 1, which means the CGA's 16 KB memory can accommodate eight
    display pages. If the rows are numbered 0 through 24 and the columns
    numbered 0 through 39, then the offset to any screen character in the
    first display page is given by the following BASIC formula:

    CHARACTER.OFFSET = (ROW.NUMBER * 80) + (COLUMN.NUMBER * 2)

    Since the attribute byte for any character is in the memory location next
    to the ASCII character value, you can locate it by simply adding 1 to the
    character offset.
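
    The same calculation can be tried out in Python against a simulated
    buffer. In this sketch a bytearray stands in for the first 40 x 25
    display page; the function names are invented, and the attribute value
    07H is chosen only as an example.

```python
# Sketch: character and attribute offsets in a 40 x 25 text page.
# A bytearray stands in for the 2000 bytes of the first display page.

ROW_BYTES = 40 * 2   # 40 columns, 2 bytes per screen position

def character_offset(row, column):
    """Offset of the character byte for a screen position."""
    return row * ROW_BYTES + column * 2

def attribute_offset(row, column):
    """The attribute byte immediately follows the character byte."""
    return character_offset(row, column) + 1

page = bytearray(2000)
page[character_offset(1, 0)] = ord("A")   # character code
page[attribute_offset(1, 0)] = 0x07       # example attribute value

print(character_offset(1, 0), attribute_offset(1, 0))      # 80 81
print(character_offset(24, 39), attribute_offset(24, 39))  # 1998 1999
```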

    Modes 2, 3, and 7 are also text modes, but with 80 columns in each row
    instead of 40. The byte layout is the same, but each row requires twice as
    many bytes, or 80 x 2 = 160 bytes. Consequently, the 80 x 25 screen format
    uses 4 KB, and the 16 KB memory can accommodate four display pages. The
    offset to any screen location in the first display page is given by the
    following BASIC formula:

    CHARACTER.OFFSET = (ROW.NUMBER * 160) + (COLUMN.NUMBER * 2)

    The beginning of each text display page traditionally starts at an even
    kilobyte boundary. Because each screen page in the text modes actually
    uses only 2000 or 4000 bytes, some unused bytes follow each page: either
    48 or 96 bytes, depending on the size of the page. So, to locate any
    screen position on any page in text mode, use the following general
    formula:

    LOCATION = (SEGMENT.PARAGRAPH * 16)
                + (PAGE.NUMBER * PAGE.SIZE) + (ROW.NUMBER * ROW.WIDTH * 2)
                + (COLUMN.NUMBER * 2) + WHICH

    LOCATION  is the 20-bit address of the screen information.
    SEGMENT.PARAGRAPH  is the location of the video display memory
    (for example, B000H or B800H).
    PAGE.NUMBER  is in the range 0 through 3 or 0 through 7.
    PAGE.SIZE  is 2048 or 4096 (0800H or 1000H; see Figure 4-12).
    ROW.NUMBER  is from 0 through 24.
    ROW.WIDTH  is 40 or 80.
    COLUMN.NUMBER  is from 0 through 39 or 0 through 79.
    WHICH  is 0 for the display character or 1 for the display attribute.
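
    Expressed as a Python function, the general formula might look like the
    sketch below. The function name is invented. The sketch takes the
    distance between pages to be the spacing shown in Figure 4-12 (0800H
    bytes for 40-column modes, 1000H bytes for 80-column modes), so that the
    computed page start addresses agree with that figure.

```python
# Sketch: 20-bit address of a character or attribute byte on any text page.
# Page spacing follows Figure 4-12: 0800H (40 columns) or 1000H (80 columns).

def text_location(paragraph, page_number, row, column, row_width, which=0):
    """which is 0 for the character byte, 1 for the attribute byte."""
    page_spacing = 0x0800 if row_width == 40 else 0x1000
    return (paragraph * 16
            + page_number * page_spacing
            + row * row_width * 2
            + column * 2
            + which)

print("%05XH" % text_location(0xB800, 1, 0, 0, 80))       # B9000H
print("%05XH" % text_location(0xB800, 2, 0, 0, 40))       # B9000H
print("%05XH" % text_location(0xB000, 0, 24, 79, 80, 1))  # B0F9FH
```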

    Mapping pixels in graphics modes

    When you use a graphics mode, pixels are stored as a series of bit fields,
    with a one-to-one correlation between the bit fields in memory and the
    pixels on the screen. The actual mapping of bit fields in the video buffer
    depends on the video mode.

    In CGA-compatible graphics modes, the display is organized into 200 lines,
    numbered 0 through 199. Each line of pixels is represented in the video
    buffer in 80 bytes of data. In 640 x 200, 2-color mode, each bit
    represents one pixel on the screen, while in 320 x 200, 4-color mode, each
    pixel is represented by a pair of bits in the buffer. (See Figure 4-13.)
    Thus there are eight pixels to each byte in 640 x 200, 2-color mode, and
    80 x 8, or 640, pixels per row. Similarly, there are four pixels to each
    byte in 320 x 200, 4-color mode, and 80 x 4, or 320, pixels per row.

    The storage for the pixel rows is interleaved:

    ■  Pixels in even-numbered rows are stored in the first half of the video
        buffer, starting at B800:0000H.

    ■  Pixels in odd-numbered rows are stored starting at B800:2000H.

    For example, in 640 x 200, 2-color mode, the first pixel in the first row
    (in the upper-left corner of the screen) is represented by the leftmost
    bit (bit 7) in the byte at B800:0000H. The second pixel in the row is
    represented by bit 6 of the same byte. Because of the interleaved buffer
    map, however, the pixel immediately below the first pixel is represented
    in bit 7 of the byte at B800:2000H.

    640 x 200, 2-color mode            320 x 200, 4-color mode

             bit                                bit
       7 6 5 4 3 2 1 0                    7 6 5 4 3 2 1 0
    ┌───────────────────┐              ┌───────────────────┐
    │  1 0 0 0 1 1 1 1  │              │  1 1 0 0 0 1 0 1  │
    └──┼─┼─┼─┼─┼─┼─┼─┼──┘              └──┬─┬─┬─┬─┬─┬─┬─┬──┘
       │ │ │ │ │ │ │ │                    └┬┘ └┬┘ └┬┘ └┬┘
    ┌──│─│─│─│─│─│─│─│───────          ┌───│───│───│───│────────
    │┌─│─│─│─│─│─│─│─│───────          │┌──│───│───│───│────────
    ││ ▼ ▼ ▼ ▼ ▼ ▼ ▼ ▼                 ││  ▼   ▼   ▼   ▼
    ││ ▓ • • • ▓ ▓ ▓ ▓                 ││  ▒   •   ▓   ▓
    ││                                 ││
    ││                                 ││
    ││                                 ││

    Figure 4-13.  Pixel mapping in CGA-compatible graphics modes.
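
    The interleaved layout can be captured in a short Python sketch. Given
    pixel coordinates and the number of bits per pixel (1 for 640 x 200,
    2-color mode; 2 for 320 x 200, 4-color mode), it returns the byte offset
    from B800:0000H, plus the shift and mask that isolate the pixel within
    that byte. The function names are invented for the example.

```python
# Sketch: locate a pixel in the CGA-compatible interleaved buffer map.
# Even rows start at offset 0000H, odd rows at 2000H; each row is 80 bytes.

def cga_pixel_location(x, y, bits_per_pixel):
    """Return (offset, shift, mask) for the pixel at column x, row y."""
    pixels_per_byte = 8 // bits_per_pixel
    offset = (y % 2) * 0x2000 + (y // 2) * 80 + x // pixels_per_byte
    shift = (pixels_per_byte - 1 - x % pixels_per_byte) * bits_per_pixel
    mask = ((1 << bits_per_pixel) - 1) << shift
    return offset, shift, mask

def pixel_value(byte, shift, mask):
    """Extract a pixel value from a byte read from the video buffer."""
    return (byte & mask) >> shift

print(cga_pixel_location(0, 0, 1))   # (0, 7, 128)
print(cga_pixel_location(0, 1, 1))   # (8192, 7, 128)

# Decode the 4-color byte shown in Figure 4-13 (1 1 0 0 0 1 0 1).
values = []
for x in range(4):
    _, shift, mask = cga_pixel_location(x, 0, 2)
    values.append(pixel_value(0b11000101, shift, mask))
print(values)                        # [3, 0, 1, 1]
```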

    In all other graphics modes, the buffer map is linear, as it is in text
    modes. Pixels are stored from left to right in each byte, and one row of
    pixels immediately follows another in the video buffer. On the MCGA and
    VGA, for example, the 1-bit pixels in 640 x 480, 2-color mode and the
    8-bit pixels in 320 x 200, 256-color mode are stored starting at
    A000:0000H and proceeding linearly through the buffer.
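
    For these two linear modes the address arithmetic is short enough to
    show in full. The Python sketch below computes offsets from A000:0000H;
    the function names are invented, and the mode numbers in them (11H for
    640 x 480, 2-color and 13H for 320 x 200, 256-color) follow the usual
    IBM numbering.

```python
# Sketch: pixel offsets in the linear MCGA/VGA graphics modes.

def mode13_offset(x, y):
    """320 x 200, 256-color: one byte per pixel, 320 bytes per row."""
    return y * 320 + x

def mode11_location(x, y):
    """640 x 480, 2-color: 8 pixels per byte, 80 bytes per row.

    Returns (offset, mask); the leftmost pixel of a byte is bit 7.
    """
    return y * 80 + x // 8, 0x80 >> (x % 8)

print(mode13_offset(319, 199))    # 63999
print(mode11_location(0, 1))      # (80, 128)
print(mode11_location(639, 479))  # (38399, 1)
```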

    The catch is that pixel bit fields are not mapped linearly in every
    video mode. On the EGA and VGA, the video buffer in 16-color graphics
    modes is arranged as a set of four parallel memory maps. In effect, the
    video memory is configured to have four 64 KB memory maps spanning the
    same range of addresses starting at A000:0000H. The EGA and VGA have
    special circuitry that accesses all four memory maps in parallel. Thus in
    16-color EGA and VGA graphics modes, each 4-bit pixel is stored with 1 bit
    in each memory map. (See Figure 4-14.) Another way to visualize this is
    that a 4-bit pixel value is formed by concatenating corresponding bits
    from the same address in each memory map.
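
    The following Python sketch models this layout with four bytearrays,
    one per memory map. It is only a model of how the bits are arranged:
    real programs reach the maps through the adapter's control registers,
    which are not shown. The sketch assumes a 640-pixel-wide mode (80 bytes
    per row) and assumes that memory map 0 supplies the least significant
    bit of the pixel value; the function names are invented.

```python
# Sketch: a 4-bit pixel spread across four parallel memory maps.
# Assumes 80 bytes per row and that map 0 holds bit 0 of the pixel value.

BYTES_PER_ROW = 80   # 640 pixels / 8 pixels per byte

def write_planar_pixel(maps, x, y, value):
    """Store one bit of the 4-bit value in each of the four maps."""
    offset = y * BYTES_PER_ROW + x // 8
    mask = 0x80 >> (x % 8)
    for map_number in range(4):
        if (value >> map_number) & 1:
            maps[map_number][offset] |= mask
        else:
            maps[map_number][offset] &= ~mask & 0xFF

def read_planar_pixel(maps, x, y):
    """Concatenate the corresponding bit from each map into a pixel value."""
    offset = y * BYTES_PER_ROW + x // 8
    bit = 7 - x % 8
    value = 0
    for map_number in range(4):
        value |= ((maps[map_number][offset] >> bit) & 1) << map_number
    return value

maps = [bytearray(28000) for _ in range(4)]   # 640 x 350, one map each
write_planar_pixel(maps, 5, 2, 0b1010)
print(read_planar_pixel(maps, 5, 2))          # 10
```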

    There is a good reason why the EGA and VGA were designed to use parallel
    memory maps in graphics modes. Consider the situation in 640 x 350,
    16-color mode: With 4 bits per pixel, you need 640 x 350 x 4 (896,000)
    bits to store one screenful of pixels. That comes out to 112,000 bytes,
    which is bigger than the 64 KB maximum size of one 8086 segment. If you
    organize the pixel data in parallel, however, you only need 112,000 ÷ 4
    (28,000) bytes in each memory map.
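
    The arithmetic in this argument is easy to check. The short Python
    sketch below repeats it for 640 x 350, 16-color mode; the names are
    invented for the example.

```python
# Sketch: storage needed for 640 x 350, 16-color mode, total and per map.

SEGMENT_LIMIT = 64 * 1024   # maximum size of one 8086 segment, in bytes

def screen_storage(width, height, bits_per_pixel, map_count=4):
    """Return (total_bytes, bytes_per_map) for one screenful of pixels."""
    total_bytes = width * height * bits_per_pixel // 8
    return total_bytes, total_bytes // map_count

total, per_map = screen_storage(640, 350, 4)
print(total, total > SEGMENT_LIMIT)        # 112000 True
print(per_map, per_map <= SEGMENT_LIMIT)   # 28000 True
```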

    With this variety of memory maps and pixel sizes, it's