Programming Languages

A computer program—whether it's an applet, utility, application, or operating system—is nothing more than a list of instructions for the brain inside your computer, the microprocessor, to carry out. A microprocessor instruction, in turn, is a specific pattern of bits, a digital code. Your computer sends the list of instructions making up a program to its microprocessor one at a time. Upon receiving each instruction, the microprocessor looks up what function the code says to do, then it carries out the appropriate action.

Microprocessors by themselves only react to patterns of electrical signals. Reduced to its purest form, the computer program is information that finds its final representation as the ever-changing pattern of signals applied to the pins of the microprocessor. That electrical pattern is difficult for most people to think about, so the ideas in the program are traditionally represented in a form more meaningful to human beings. That representation of instructions in human-recognizable form is called a programming language.

As with a human language, a programming language is a set of symbols and the syntax for putting them together. Instead of human words or letters, the symbols of the programming language correspond to patterns of bits that signal a microprocessor exactly as letters of the alphabet represent sounds that you might speak. Of course, with the same back-to-the-real-basics reasoning, an orange is a collection of quarks squatting together with reasonable stability in the center of your fruit bowl.

The metaphor is apt. The primary constituents of an orange—whether you consider them quarks, atoms, or molecules—are essentially interchangeable, even indistinguishable. By itself, every one is meaningless. Only when they are taken together do they make something worthwhile (at least from a human perspective): the orange. The overall pattern, not the individual pieces, is what's important.

Letters and words work the same way. A box full of vowels wouldn't mean anything to anyone not engaged in a heated game of Wheel of Fortune. Match the vowels with consonants and arrange them properly, and you might make words of irreplaceable value to humanity: the works of Shakespeare, Einstein's expression of general relativity, or the formula for Coca-Cola. The meaning is not in the pieces but their patterns.

The same holds true for computer programs. The individual commands are not as important as the pattern they make when they are put together. Only the pattern is truly meaningful.

You make the pattern of a computer program by writing a list of commands for a microprocessor to carry out. At this level, programming is like writing reminder notes for a not-too-bright person—first socks, then shoes.

This step-by-step command system is perfect for control freaks but otherwise is more than most people want to tangle with. Even simple computer operations require dozens of microprocessor operations, so writing complete lists of commands in this form can be more than many programmers—let alone normal human beings—want to deal with. To make life and writing programs more understandable, engineers developed higher-level programming languages.

A higher-level language uses a vocabulary that's more familiar to people than patterns of bits, often commands that look something like ordinary words. A special program translates each higher-level command into a sequence of bit-patterns that tells the microprocessor what to do.

Machine Language

Every microprocessor understand its own repertoire of instructions, just as a dog might understands a few spoken commands. Whereas your pooch might sit down and roll over when you ask it to, your processor can add, subtract, and move bit-patterns around as well as change them. Every family of microprocessor has a set of instructions that it can recognize and carry out: the necessary understanding designed into the internal circuitry of each microprocessor chip.

The entire group of commands that a given model of microprocessor understands and can react to is called that microprocessor's instruction set or its command set. Different microprocessor families recognize different instruction sets, so the commands meant for one chip family would be gibberish to another. For example, the Intel family of microprocessors understands one command set; the IBM/Motorola PowerPC family of chips recognizes an entirely different command set. That's the basic reason why programs written for the Apple Macintosh (which is based on PowerPC microprocessors) won't work on computers that use Intel microprocessors.

That native language a microprocessor understands, including the instruction set and the rules for using it, is called machine language. The bit-patterns of electrical signals in machine language can be expressed directly as a series of ones and zeros, such as 0010110. Note that this pattern directly corresponds to a binary (or base-two) number. As with any binary number, the machine language code of an instruction can be translated into other numerical systems as well. Most commonly, machine language instructions are expressed in hexadecimal form (base-16 number system). For example, the 0010110 subtraction instruction becomes 16(hex). (The "(hex)" indicates the number is in hexadecimal notation, that is, base 16.)

Assembly Language

Machine language is great if you're a machine. People, however, don't usually think in terms of bit-patterns or pure numbers. Although some otherwise normal human beings can and do program in machine language, the rigors of dealing with the obscure codes takes more than a little getting used to. After weeks, months, or years of machine language programming, you begin to learn which numbers do what. That's great if you want to dedicate your life to talking to machines, but not so good if you have better things to do with your time.

For human beings, a better representation of machine language codes involves mnemonics rather than strictly numerical codes. Descriptive word fragments can be assigned to each machine language code so that 16(hex) might translate into SUB (for subtraction). Assembly language takes this additional step, enabling programmers to write in more memorable symbols.

Once a program is written in assembly language, it must be converted into the machine language code understood by the microprocessor. A special program, called an assembler, handles the necessary conversion. Most assemblers do even more to make the programmer's life more manageable. For example, they enable blocks of instructions to be linked together into a block called a subroutine, which can later be called into action by using its name instead of repeating the same block of instructions again and again.

Most of assembly language involves directly operating the microprocessor using the mnemonic equivalents of its machine language instructions. Consequently, programmers must be able to think in the same step-by-step manner as the microprocessor. Every action that the microprocessor does must be handled in its lowest terms. Assembly language is consequently known as a low-level language because programmers write at the most basic level.

High-Level Languages

Just as an assembler can convert the mnemonics and subroutines of assembly language into machine language, a computer program can go one step further by translating more human-like instructions into the multiple machine language instructions needed to carry them out. In effect, each language instruction becomes a subroutine in itself.

The breaking of the one-to-one correspondence between language instruction and machine language code puts this kind of programming one level of abstraction farther from the microprocessor. That's the job of the high-level languages. Instead of dealing with each movement of a byte of information, high-level languages enable the programmer to deal with problems as decimal numbers, words, or graphic elements. The language program takes each of these high-level instructions and converts it into a long series of digital code microprocessor commands in machine language.

High-level languages can be classified into two types: interpreted and compiled. Batch languages are a special kind of interpreted language.

Interpreted Languages

An interpreted language is translated from human to machine form each time it is run by a program called an interpreter. People who need immediate gratification like interpreted programs because they can be run immediately, without intervening steps. If the computer encounters a programming error, it can be fixed, and the program can be tested again immediately. On the other hand, the computer must make its interpretation each time the program is run, performing the same act again and again. This repetition wastes the computer's time. More importantly, because the computer is doing two things at once—both executing the program and interpreting it at the same time—it runs more slowly.

Today the most important interpreted computer language is Java, the tongue of the Web created by Sun Microsystems. Your computer downloads a list of Java commands and converts them into executable form inside your computer. Your computer then runs the Java code to put make some obnoxious advertisement dance and flash across your screen.

The interpreted design of Java helps make it universal. The Java code contains instructions that any computer can carry out, regardless of its operating system. The Java interpreter inside your computer converts the universal code into the specific machine language instructions your computer and its operating system understand.

Before Java, the most popular interpreted language was BASIC, an acronym for the Beginner's All-purpose Symbolic Instruction Set. BASIC was the first language for personal computers and was the foundation upon which the Microsoft Corporation was built.

In classic form, using an interpreted language involved two steps. First, you would start the language interpreter program, which gave you a new environment to work in, complete with its own system of commands and prompts. Once in that environment, you then executed your program, typically starting it with a "Run" instruction. More modern interpreted systems such as Java hide the actual interpreter from you. The Java program appears to run automatically by itself, although in reality the interpreter is hidden in your Internet browser or operating system. Microsoft's Visual Basic gets its interpreter support from a runtime module, which must be available to your computer's operating system for Visual Basic programs to run.

Compiled Languages

Compiled languages execute like a program written in assembler, but the code is written in a more human-like form. A program written with a compiled language gets translated from high-level symbols into machine language just once. The resultant machine language is then stored and called into action each time you run the program. The act of converting the program from the English-like compiled language into machine language is called compiling the program. To do this you use a language program called a compiler. The original, English-like version of the program, the words and symbols actually written by the programmer, is called the source code. The resultant machine language makes up the program's object code.

Compiling a complex program can be a long operation, taking minutes, even hours. Once the program is compiled, however, it runs quickly because the computer needs only to run the resultant machine language instructions instead of having to run a program interpreter at the same time. Most of the time, you run a compiled program directly from the DOS prompt or by clicking an icon. The operating system loads and executes the program without further ado. Examples of compiled languages include today's most popular computer programming language, C++, as well as other tongues left over from earlier days of programming—COBOL, Fortran, and Pascal.

Object-oriented languages are special compiled languages designed so that programmers can write complex programs as separate modules termed objects. A programmer writes an object for a specific, common task and gives it a name. To carry out the function assigned to an object, the programmer need only put its name in the program without reiterating all the object's code. A program may use the same object in many places and at many different times. Moreover, a programmer can put a copy of an object into different programs without the need to rewrite and test the basic code, which speeds up the creation of complex programs. C++ is object oriented.

Optimizing compilers do the same thing as ordinary compilers, but do it better. By adding an extra step (or more) to the program compiling process, the optimizing compiler checks to ensure that program instructions are arranged in the most efficient order possible to take advantage of all the capabilities of the computer's processor. In effect, the optimizing compiler does the work that would otherwise require the concentration of an assembly language programmer.

Libraries

Inventing the wheel was difficult and probably took human beings something like a million years—a long time to have your car sitting up on blocks. Reinventing the wheel is easier because you can steal your design from a pattern you already know. But it's far, far easier to simply go out and buy a wheel.

Writing program code for a specific but common task often is equivalent to reinventing your own wheel. You're stuck with stringing together a long list of program commands, just like all the other people writing programs have to do. Your biggest consolation is that you need to do it only once. You can then reuse the same set of instructions the next time you have to write a program that needs the same function.

For really common functions, you don't have to do that. Rather than reinventing the wheel, you can buy one. Today's programming languages include collections of common functions called libraries so that you don't have to bother with reinventing anything. You only need pick the prepackaged code you want from the library and incorporate it into your program. The language compiler links the appropriate libraries to your program so that you only need to refer to a function by a code name. The functions in the library become an extension to the language, called a meta-language.

Development Environments

Even when using libraries, you're still stuck with writing a program in the old-fashioned way—a list of instructions. That's an effective but tedious way of building a program. The computer's strength is taking over tedious tasks, so you'd think you could use some of that power to help you write programs more easily. In fact, you might expect someone to write a program to help you write programs.

A development environment is exactly that kind of program, one that lets you drag and drop items from menus to build the interfaces for your own applications. The environment includes not only all the routines of the library, but also its own, easy-to-use interface for putting the code in those libraries to work. Most environments let you create programs by interactively choosing the features you want—drop-down menus, dialog boxes, and even entire animated screens—and match them with operations. The environment watches what you do, remembers everything, and then kicks in a code generator, which creates the series of programming language commands that achieves the same end. One example of a development environment is Microsoft Visual Studio.

Working with a development environment is a breeze compared to traditional programming. For example, instead of writing all the commands to pop a dialog box on the screen, you click a menu and choose the kind of box you want. After the box obediently pops on the screen, you can choose the elements you want inside it from another menu, using your mouse to drag buttons and labels around inside the box. When you're happy with the results, the program grinds out the code. In a few minutes you can accomplish what it might have taken you days to write by hand.

Batch Languages

A batch language allows you to submit a program directly to your operating system for execution. That is, the batch language is a set of operating system commands that your computer executes sequentially as a program. The resultant batch program works like an interpreted language in that each step gets evaluated and executed only as it appears in the program.

Applications often include their own batch languages. These, too, are merely lists of commands for the application to carry out in the order you've listed them to perform some common, everyday function. Communications programs use this type of programming to automatically log in to the service of your choice and even retrieve files. Databases use their own sort of programming to automatically generate reports that you regularly need. The process of transcribing your list of commands is usually termed scripting. The commands that you can put in your program scripts are sometimes called the scripting language.

Scripting actually is programming. The only difference is the language. Because you use commands that are second nature to you (at least after you've learned to use the program) and follow the syntax that you've already learned running the program, the process seems more natural than writing in a programming language. That means if you've ever written a script to log on to the Internet or have modified an existing script, you're a programmer already. Give yourself a gold star.

[ Team LiB ]