How do you explain the concept of linking and relocation?
Explain the concept of Linking and Relocation. in microprocessor and architecture
Linking and Relocation
In constructing a program some program modules may be put in the same source
module and assembled together; others may be in different source modules and assembled separately. If they are assembled separately, then the main module, which has the first instruction to be executed, must be terminated by an END statement with the entry point specified, and each of the other modules must be terminated by an END statement with no operand. In any event, the resulting object modules, some of which may be grouped into libraries, must be linked together to form a load module before the program can be executed. In addition to outputting the load module, normally the linker prints a memory map that indicates where the linked object modules will be loaded into memory. After the load module has been created it is loaded into the memory of the computer by the loader and execution begins. Although the I/O can be performed by modules within the program, normally the I/O is done by I/O drivers that are part of the operating system. All that appears in the user's program are references to the I/O drivers that cause the operating system to execute them.
The general process for creating and executing a program is illustrated in Fig 5.1. The process for a particular system may not correspond exactly to the one diagrammed in the figure, but the general concepts are the same. The arrows
indicate that corrections may be made after anyone of the major stages.
Fig. 5.1: Creation and Execution of a program
In addition to the linker commands, the ASM-86 assembler provides a means of
regulating the way segments in different object modules are organized by the linker. Sometimes segments with the same name are concatenated and sometimes they are overlaid. Just how the segments with the same name are joined together is determined by modifiers attached to the SEGMENT directives. A SEGMENT directive may have the form
Segment name SEGMENT Combine-type
where the combine-type indicates how the segment is to be located within the load module. Segments that have different names cannot be combined and segments with the same name but no combine-type will cause a linker error. The possible combine-types are:
PUBLIC-If the segments in different object modules have the same name and the combine-type PUBLIC, then they are concatenated into a single segment in the load module. The ordering in the concatenation is specified by the linker command.
COMMON-If the segments in different object modules have the same name and the combine-type is COMMON, then they are overlaid so that they have the same beginning address. The length of the common segment is that of the longest segment being overlaid.
segments in different object modules have the same name and the combine-type STACK, then they become one segment whose length is the sum of the lengths of the individually specified segments. In effect, they are combined to form one large stack.
AT-The AT combine-type is followed by an expression that evaluates to a constant which is to be the segment address. It allows the user to specify the exact location of the segment in memory..
MEMORY-This combine-type causes the segment to be placed at the last of the load module. If more than one segment with the MEMORY combine type is being linked, only the first one will be treated as having the MEMORY combine-type; the others will be overlaid as if they had COMMON combine types.
Access to External Identifiers
Clearly, object modules that are being linked together must be able to refer to each other. That is, there must be a way for a module to reference at least some of the variables and/or labels in the other modules. If an identifier is defined in an object module, then it is said to be a local (or internal) identifier relative to the module, and if it is not defined in the module but is defined in one of the other modules being linked, then It is referred to as an external (or global) identifier relative to the module.
For single-object module programs all identifiers that are referenced must be locally defined or an assembler error will occur. For multiple-module programs, the assembler must be informed in advance of any externally defined identifiers that appear in a module so that it will not treat them as being undefined. Also, in order to permit other object modules to reference some of the identifiers in a given module, the given module must include a list of the identifiers to which it will allow access. Therefore, each module may contain two lists, one containing the external identifiers it references and one containing the locally defined identifiers that can be referred to by other modules. These two lists are implemented by the EXTRN and PUBLIC directives, which have the forms:
EXTRN Identifier:Type, . . . , Identifier:Type
PUBLIC Identifier, . . ., Identifier
where the identifiers are the variables and labels being declared as external or as being available to other modules. Because the assembler must know the type of all external identifiers before it can generate the proper machine code, a type specifier must be associated with each identifier in an EXTRN statement. For a variable the type may be BYTE, WORD, or DWORD and for a label it may be NEAR or FAR. In the statement
if VAR1 is external and is associated with a word, then the module containing the statement must also contain a directive such as
EXTRN . . VAR1 :WORD.. . .
and the module in which VARl is defined must contain a statement of the form
PUBLIC .. ..VAR1….
One of the primary tasks of the linker is to verify that every identifier appearing in an EXTRN statement is matched by one in a PUBLIC statement. If this is not the case, then there will be an undefined external reference and a linker error will occur. Fig. 5.2 shows three modules and how the matching is done by the linker while joining them together.
Fig. 5.2: Illustration of the matching verified by the linker
As we have seen, there are two parts to every address, an offset and a segment address. The offsets for the local identifiers can be and are inserted by the assembler, but the offsets for the external identifiers and all segment addresses must be inserted by the linking process. The offsets associated with all external references can be assigned once all of the object modules have been found and their external symbol tables have been examined. The assignment of the segment addresses is called relocation and is done after the king process has determined exactly where each segment is to be put in memory.