Machine Code

Read this article, which discusses how a MIPS assembly language statement is assembled into machine code. Carefully study the example in the article.

Machine code is a computer program written in machine language. It uses the instruction set of a particular  computer architecture. It is usually written in binary. Machine code is the lowest level of software. Other programming languages are translated into machine code so the computer can execute them.

An instruction tells the process what operation to perform. Each instruction is made up of an opcode (operation code) and operand(s). The operands are usually memory addresses or data. An instruction set is a list of the opcodes available for a computer. Machine code is what assembly code and other programming languages are compiled to or interpreted as.

Program builders turn code into another language or machine code. Machine code is sometimes called native code. This is used when talking about things that work on only some computers.


Writing machine code


Front panel of an early minicomputer, with switches for entering machine code

Machine code can be written in different forms:

  • Using a number of switches. This generates a sequence of 1 and 0. This was used in the early days of computing. Since the 1970s, it is no longer used.
  • Using a Hex editor. This allows the use of opcodes instead of the number of the command.
  • Using an Assembler. Assembly languages are simpler than opcodes. Their syntax is easier to understand than machine language but harder than high level languages. The assembler will translate the source code into machine code on its own.
  • Using a High-level programming language allows programs that use code that is easier to read and write. These programs are translated into machine code. The translation can happen in many steps. Java programs are first optimized into bytecode. Then it is translated into machine language when it is used.


Typical instructions of machine code

There are many kinds of instructions found usually found in an instruction set:

  • Arithmetical operations: Addition, subtraction, multiplication, division.
  • Logical operations: Conjunction, disjunction, negation.
  • Operations acting on single bits: Shifting bits to the left or right.
  • Operations acting on memory: copying a value from one register to another.
  • Operations that compare two values: bigger than, smaller than, equal.
  • Operations that combine other operations: add, compare, and copy if equal to some value(as one operation), jump to some point in the program if a register is zero.
  • Operations that act on program flow: jump to some address.
  • Operations that convert data types: e.g. convert a 32-bit integer to a 64-bit integer, convert a floating point value to an integer (by truncating).

Many modern processors use microcode for some of the commands. More complex commands tend to use it. This is often done with CISC architectures.


Instructions

Every processor or processor family has its own instruction set. Instructions are patterns of bits that correspond to different commands that can be given to the machine. Thus, the instruction set is specific to a class of processors using (mostly) the same architecture.

Newer processor designs often include all the instructions of a predecessor and may add additional instructions. Sometimes, a newer design will discontinue or alter the meaning of an instruction code (typically because it is needed for new purposes), affecting code compatibility; even nearly completely compatible processors may show slightly different behavior for some instructions, but this is rarely a problem.

Systems may also differ in other details, such as memory arrangement, operating systems, or peripheral devices. Because a program normally relies on such factors, different systems will typically not run the same machine code, even when the same type of processor is used.

Most instructions have one or more opcode fields. They specify the basic instruction type. Other fields may give the type of the operands, the addressing mode, and so on. There may also be special instructions that are contained in the opcode itself. These instructions are called immediates.

Processor designs can be different in other ways. Different instructions can have different lengths. Also, they can have the same length. Having all instructions have the same length can simplify the design.


Example

The MIPS architecture has instructions which are 32 bits long. This section has examples of code. The general type of instruction is in the op (operation) field. It is the highest 6 bits. J-type (jump) and I-type (immediate) instructions are fully given by op. R-type (register) instructions include the field funct. It determines the exact operation of the code.

The fields used in these types are:

  6    5    5    5      5       6 bits
[ op | rs | rt | rd | shamt | funct ] R-type
[ op | rs | rt |   address/immediate] I-type
[ op |        target address        ] J-type

rsrt, and rd indicate register operands.  shamt gives a shift amount. The address or immediate  fields contain an operand directly.

Example: add the registers 1 and 2. Place the result in register 6. It is encoded:

[   op   |   rs   |   rt   |   rd  |shamt| funct]
    0        1        2        6      0      32    decimal
 000000    00001    00010    00110  00000  100000  binary

Load a value into register 8. Take it from the memory cell 68 cells after the location listed in register 3:

[   op   |   rs   |   rt   | address/immediate]
    35       3        8             68          decimal
  100011   00011    01000    00000 00001 000100 binary

Jump to the address 1024:

[   op  |         target address      ]
    2                 1024                decimal
  000010 00000 00000 00000 10000 000000   binary

Source: Wikipedia, https://simple.wikipedia.org/wiki/Machine_code
Creative Commons License This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License.

Last modified: Wednesday, July 22, 2020, 1:53 PM