Source code


Source code is what computer programmers usually write in order to make computer software. It is normally written as a plain text file and, before execution, is either translated by a compiler or assembler into the object code for a particular computer or run by an interpreter. (Note: the term is generally used as a mass noun, as in, "I wrote a lot of source code today.")

There are two fundamental ideas behind the purpose of source code: mnemonics and reusability.

Mnemonics

For a computer to be able to execute a computer program, that program needs to be in a computer-readable form, called binary code, object code, or machine code. This binary code is different for each computer architecture, which is why software for an Intel-compatible processor will not run directly on an Amiga computer or on any other architecture.

This binary code, as its name implies, is simply a series of zeroes and ones that directly tells the computer's processor what instructions to perform. For example, on the TSC Computer architecture, the binary sequence:

11110001 10000010

is broken up into five distinct parts. The first four bits, "1111", are the opcode for an R-format instruction. The next two bits, "00", represent the location of the first number, and the following two bits, "01", represent the location of the second number. The next two bits, "10", represent the location where the result is to be stored. The remaining six bits, "000010", are the function code, which instructs the computer to add the first two numbers together and place the result in the third location.
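The field layout described above can be sketched in C with shifts and masks. This is only an illustration under stated assumptions: the struct RFormat and the function decode_rformat are hypothetical names, not part of any TSC toolchain, and the bit positions are read directly off the description above.

```c
#include <stdint.h>

/* Fields of a 16-bit R-format word, laid out as in the text:
   a 4-bit opcode, three 2-bit locations, and a 6-bit function code.
   The names are hypothetical, chosen for this sketch. */
typedef struct {
    unsigned opcode;  /* bits 15..12 */
    unsigned src1;    /* bits 11..10: location of the first number  */
    unsigned src2;    /* bits  9..8 : location of the second number */
    unsigned dest;    /* bits  7..6 : location for the result       */
    unsigned funct;   /* bits  5..0 : function code                 */
} RFormat;

/* Split an instruction word into its five fields. */
RFormat decode_rformat(uint16_t word)
{
    RFormat f;
    f.opcode = (word >> 12) & 0xFu;
    f.src1   = (word >> 10) & 0x3u;
    f.src2   = (word >> 8)  & 0x3u;
    f.dest   = (word >> 6)  & 0x3u;
    f.funct  = word & 0x3Fu;
    return f;
}
```

Calling decode_rformat(0xF182), where 0xF182 is the hexadecimal form of 11110001 10000010, yields the opcode 1111, the locations 0, 1, and 2, and the function code 000010.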

Programming in zeroes and ones is tedious and error-prone. Therefore mnemonics are used in place of the zeroes and ones, and a special program called an assembler translates the mnemonic program, or source code, into the zeroes and ones. This low-level source code is called assembly language. The assembly mnemonic for the above instruction is:

ADD $2, $0, $1
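What an assembler does for this one instruction can be sketched as the reverse packing step. The constants and function names below (OPCODE_RFORMAT, FUNCT_ADD, encode_rformat, assemble_add) are hypothetical; the opcode and function-code values are taken only from the single example word in this article, and the operand order (result location first) is inferred from the mnemonic shown.

```c
#include <stdint.h>

/* Values read off the example word 11110001 10000010; assumptions for this sketch. */
enum { OPCODE_RFORMAT = 0xF, FUNCT_ADD = 0x2 };

/* Pack the five fields into one 16-bit instruction word. */
uint16_t encode_rformat(unsigned dest, unsigned src1, unsigned src2, unsigned funct)
{
    return (uint16_t)((OPCODE_RFORMAT << 12)   /* 4-bit opcode             */
                    | ((src1 & 0x3u) << 10)    /* first number's location  */
                    | ((src2 & 0x3u) << 8)     /* second number's location */
                    | ((dest & 0x3u) << 6)     /* result location          */
                    | (funct & 0x3Fu));        /* 6-bit function code      */
}

/* Translate "ADD $dest, $src1, $src2" into its machine word. */
uint16_t assemble_add(unsigned dest, unsigned src1, unsigned src2)
{
    return encode_rformat(dest, src1, src2, FUNCT_ADD);
}
```

Under these assumptions, assemble_add(2, 0, 1) reproduces the binary sequence shown earlier, 0xF182 in hexadecimal.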

Writing computer programs even at this level is still very time-consuming. Simple tasks, such as multiplying numbers or displaying output on the screen, can take tens to hundreds of lines of assembly code. Higher-level programming languages, such as Ada, C, C++, and Java, allow programmers to write a simple line of code, such as:

for i in 1 .. 10 loop null; end loop; -- Ada

for (int i = 1; i <= 10; i++); // C, C++, Java

Note: the loops above do nothing but step "i" through ten values. A more useful example is:

for i in 1 .. 10 loop
   x(i) := x(i) * 10;
end loop; -- Ada, array indexed from 1 to 10

or

for (int i = 0; i < 10; i++)
    x[i] = x[i] * 10; // C, C++, Java, array indexed from 0 to 9

The compiler then translates such code into many lines of assembly, or directly into binary machine code. This saves the programmer a great deal of time, although the result is often not quite as fast or resource-efficient as hand-written assembly would be. It is a trade-off in productivity that most programmers are willing to make. Assembly is now mostly used where resource efficiency and speed are key, such as in game engines and embedded devices.
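To give a rough feel for the translation step, the sketch below writes the same multiply-by-ten loop twice in C: once in its high-level form, and once as the explicit compare, branch, and jump steps that resemble what a compiler emits. The function names are hypothetical, and the second form is only an approximation of compiler output, not the output of any particular compiler.

```c
#include <stddef.h>

/* High-level form: multiply each of the n elements of x by 10. */
void scale_loop(int *x, size_t n)
{
    for (size_t i = 0; i < n; i++)
        x[i] = x[i] * 10;
}

/* The same loop spelled out as the individual steps a compiler
   might produce, one statement per rough machine-level action. */
void scale_lowered(int *x, size_t n)
{
    size_t i = 0;             /* initialise the counter              */
test:
    if (i >= n) goto done;    /* compare, then branch past the loop  */
    x[i] = x[i] * 10;         /* load, multiply, store               */
    i = i + 1;                /* increment the counter               */
    goto test;                /* jump back to the comparison         */
done:
    return;
}
```

Both functions leave the array in the same state; the second simply makes visible the bookkeeping that the single "for" line hides.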

Reusability

The other purpose for using source code is that it can be easily reused and reimplemented on different computer architectures. Source code for most modern languages can usually be compiled with few modifications on many different types of computers, each time resulting in machine code specific to the computer on which it is compiled. This is usually called software portability.
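As a small illustration of portable source, the hypothetical helper below reads a four-byte value stored most-significant byte first. Because it uses only shifts on individual bytes, and makes no assumption about how the host processor orders bytes in memory, the same source should compile unchanged on different architectures and give the same answer. The name read_be32 is an assumption made for this sketch.

```c
#include <stdint.h>

/* Read a 32-bit value stored most-significant byte first.
   Only arithmetic on single bytes is used, so the result does
   not depend on the byte order of the machine running the code. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         |  (uint32_t)p[3];
}
```

Given the bytes 01 02 03 04, the function returns the hexadecimal value 01020304 whatever machine the code was compiled for, although the machine code the compiler generates will differ from one architecture to the next.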

Open Source

Machine code is required to run the computer program; however, it is usually unintelligible even to highly trained humans, especially in very complex applications. Under most circumstances it also cannot be translated back into understandable source code. Even when this is possible, the generated source code often lacks human-readable elements vital to understanding, such as program comments and variable names. Therefore, programs distributed in binary form cannot easily be changed or modified.

Open source computer software is distributed with the source code available under various software licenses, thus allowing users to modify the program according to their own wishes, and under some circumstances allowing users to redistribute their changes.

Legal Issues

Currently, court systems are deciding whether source code constitutes Constitutionally protected free speech in the United States. Proponents of the free speech argument claim that because source code conveys information to programmers and can be used to play games, share humour, and pursue other artistic ends (see obfuscated code or visit PerlMonks.Org), it is a protected form of communication. The opposing view is that source code is more functional than artistic speech, and is thus not protected by the First Amendment to the U.S. Constitution.

A program has actually been written that will take meticulously written English and translate it into C source code. Here is an example:

Here we set up for an iteration loop.
We initialize by performing this instruction:
Assign to `j' the value "1".
We continue the loop as long as the following expression comes out positive:
"`j' checked to be less than or equal to `5'".
At the end of each repetition we perform this to increment things:
Increment `j' up by one.
Here we break from the current loop.

This can be written in C source code as:

for (j = 1; j <= 5; j++) break;