# HG changeset patch # User lost # Date 1233284130 0 # Node ID f0881c115010675464c24389423e60e269f049ee # Parent 3706ede361ea66bf42c1c9d3687e625f0cd781c8 More major documentation upgrades diff -r 3706ede361ea -r f0881c115010 doc/lwlink.txt --- a/doc/lwlink.txt Fri Jan 30 01:51:41 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,51 +0,0 @@ -This is the companion linker to LWASM. It reads object files generated by -LWASM and combines them into an actual binary. - -During linking, each file is read into memory. A list of externally -referenced symbols is made along with where these symbols are referenced. -Each external reference is checked against all previously loaded files (in -order of loading) and if a match is found, a note of that fact is made and a -link between the previously loaded file and the current reference. - -Once all files are loaded, the symbol table is checked for any symbols which -are still unresolved. If any are found, the linking process complains and -bails out. - -Once all the object files have been read, the linker follows a -pre-determined script for the specified target or a script supplied by the -user to lay out the binary. The instructions from the script are followed -blindly as it is assumed the user knows what he is doing. - -For each defined section, the linker begins constructing the section data by -resolving each instance of that section in the order it was encountered. All -symbols defined by that section (local or exported) are assigned addresses. -The exact offset into the final section data is recorded for any incomplete -references in that section. All section base address references are resolved -to actual addresses at this stage. - -Once all sections have been laid out and addresses assigned to all symbols, -all incomplete references are resolved and the resulting value placed into -the appropriate data stream. If any references cannot be resolved at this -stage, the linker will complain and bail out. 
- -Once all sections, symbols, and incomplete references have been resolved, -the binary will output as appropriate for the specified target. - -See the file "scripts.txt" for information about linker scripts and the -restrictions based on the output target. - -The following output targets are supported: - -Raw: this is a raw binary with no header information, etc. Suitable for ROM -images, etc. By default, the raw target starts the binary at address 0, puts -any section named "init" first, then "code", then all other non-bss -sections, then all bss sections. Note that any "bss" type section that -exists anywhere but at the end of the binary (i.e. is between or before one -or more non-bss sections) will be included as a series of NUL bytes. - -DECB: this creates a LOADM style binary according to the linker script. By -default, this target places the sections in the same order as the raw target -but implements a load address of $2000. bss sections will not be included in -the actual output. If a bss section appears between two non-bss sections, a -new output block will be created in the output file. - diff -r 3706ede361ea -r f0881c115010 doc/manual.docbook.sgml --- a/doc/manual.docbook.sgml Fri Jan 30 01:51:41 2009 +0000 +++ b/doc/manual.docbook.sgml Fri Jan 30 02:55:30 2009 +0000 @@ -174,6 +174,17 @@ + + + + +This option specifies the name of the output file. If not specified, the +default is . + + + + + @@ -1040,7 +1051,263 @@ LWLINK +The LWTOOLS linker is called LWLINK. This chapter documents the various features +of the linker. + +
+Command Line Options + +The binary for LWLINK is called "lwlink". Note that the binary is in lower +case. lwlink takes the following command line arguments. + + + + + + + +Selects the DECB output format target. This is equivalent to + + + + + + + + + +This option specifies the name of the output file. If not specified, the +default is . + + + + + + + + + +This option specifies the output format. Valid values are +and + + + + + + + + + +This option specifies the raw output format. +It is equivalent to . +and + + + + + + + + + +This option allows specifying a linking script to override the linker's +built in defaults. + + + + + + + + + +This option increases the debugging level. It is only useful for LWTOOLS +developers. + + + + + + + + + +This provides a listing of command line options and a brief description +of each. + + + + + + + + +This will display a usage summary. +of each. + + + + + + + + + + +This will display the version of LWLINK. +of each. + + + + +
+ +
+Linker Operation + + +LWLINK takes one or more files in the LWTOOLS object file format and links +them into a single binary. While the precise method is slightly different, +linking can be conceptualized as the following steps. + + + + + +First, the linker loads a linking script. If no script is specified, it +loads a built-in default script based on the output format selected. This +script tells the linker how to lay out the various sections in the final +binary. + + + + + +Next, the linker reads all the input files into memory. At this time, it +flags any format errors in those files. It constructs a table of symbols +for each object at this time. + + + + + +The linker then proceeds with organizing the sections loaded from each file +according to the linking script. As it does so, it is able to assign addresses +to each symbol defined in each object file. At this time, the linker may +also collapse different instances of the same section name into a single +section by appending the data from each subsequent instance of the section +to the first instance of the section. + + + + + +Next, the linker looks through every object file for every incomplete reference. +It then attempts to fully resolve that reference. If it cannot do so, it +throws an error. Once a reference is resolved, the value is placed into +the binary code at the specified section. It should be noted that an +incomplete reference can reference either a symbol internal to the object +file or an external symbol which is in the export list of another object +file. + + + + + +If all of the above steps are successful, the linker opens the output file +and actually constructs the binary. + + + + +
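The section-collapsing step described above can be illustrated with a minimal Python sketch. It is not LWLINK's implementation: the function name `collapse_sections`, the tuple layout, and the assumptions that symbol names are unique and that each section's base address is already known are all hypothetical simplifications.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory form of one section instance from one object file:
# (section name, object code, symbols defined there as {name: offset}).
SectionInstance = Tuple[str, bytes, Dict[str, int]]


def collapse_sections(
    instances: List[SectionInstance],
    base_addresses: Dict[str, int],
) -> Tuple[Dict[str, bytearray], Dict[str, int]]:
    """Append same-named section instances in load order and assign addresses.

    Assumes symbol names are unique and that the base address of every
    section is already known (both are simplifications for illustration).
    """
    data: Dict[str, bytearray] = {}
    symbols: Dict[str, int] = {}
    for name, code, defined in instances:
        merged = data.setdefault(name, bytearray())
        # This instance starts where the previously merged data ends.
        instance_offset = len(merged)
        for symbol, symbol_offset in defined.items():
            symbols[symbol] = base_addresses[name] + instance_offset + symbol_offset
        merged.extend(code)
    return data, symbols


if __name__ == "__main__":
    loaded = [
        ("code", b"\x01\x02\x03", {"start": 0}),
        ("data", b"\xAA", {"tbl": 0}),
        ("code", b"\x04\x05", {"helper": 1}),
    ]
    merged, table = collapse_sections(loaded, {"code": 0x2000, "data": 0x3000})
    print(bytes(merged["code"]).hex(), {k: hex(v) for k, v in table.items()})
```

In a real link the base address of each section depends on the linking script and on the sizes of the sections placed before it, so the layout and address assignment would be interleaved rather than supplied up front.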
+ +
Linking Scripts + +A linker script is used to instruct the linker about how to assemble the +various sections into a completed binary. It consists of a series of +directives which are considered in the order they are encountered. + + +The sections will appear in the resulting binary in the order they are +specified in the script file. If a referenced section is not found, the linker will behave as though the +section did exist but had a zero size, no relocations, and no exports. +A section should only be referenced once. Any subsequent references will have +an undefined effect. + + + +All numbers in linking scripts are specified in hexadecimal. All directives +are case sensitive although the hexadecimal numbers are not. + + +If a section name is specified as "*", then any section not +already matched by the script will be matched. The "*" can be followed +by a comma and a flag to narrow the match further. +If the flag is "!bss", then any section that is not flagged as a bss section +will be matched. If the flag is "bss", then any section that is flagged as +bss will be matched. + + +The following directives are understood in a linker script. + + +section name load addr + + +This causes the section name to load at +addr. For the raw target, only one "load at" entry is +allowed for non-bss sections and it must be the first one. For raw targets, +it affects the addresses the linker assigns to symbols but has no other +effect on the output. bss sections may all have separate load addresses but +since they will not appear in the binary anyway, this is okay. + +For the decb target, each "load" entry will cause a new "block" to be +output to the binary which will contain the load address. It is legal for +sections to overlap in this manner - the linker assumes the loader will sort +everything out. + + + + +section name + + +This will cause the section name to load after the previously listed +section. 
+ + +exec addr or sym + + +This will cause the execution address (entry point) to be the address +specified (in hex) or the specified symbol name. The symbol name must +match a symbol that is exported by one of the object files being linked. +This has no effect for targets that do not encode the entry point into the +resulting file. If not specified, the entry point is assumed to be address 0, +which is probably not what you want. The default link scripts for targets +that support this directive automatically start execution at the beginning of the +first section (usually "init" or "code") that is emitted in the binary. + + + + + +pad size + +This will cause the output file to be padded with NUL bytes to be exactly +size bytes in length. This only makes sense for a raw target. + + + + + + + +
+
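As an illustration of the directive syntax described in the Linking Scripts section, here is a small Python sketch that parses a script into a simple structure. It is not LWLINK's parser: the names `SectionDirective`, `Script`, and `parse_script` are hypothetical, the sample script is made up for the example, and treating an `exec` argument as an address whenever it parses as hexadecimal is an assumption, since the text does not say how an address is told apart from a symbol name.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class SectionDirective:
    name: str                 # section name, or "*" for "not yet matched"
    flag: Optional[str]       # "bss", "!bss", or None
    load_addr: Optional[int]  # None means "load after the previous section"


@dataclass
class Script:
    sections: List[SectionDirective] = field(default_factory=list)
    exec_target: Union[int, str, None] = None  # address or exported symbol
    pad_size: Optional[int] = None


def parse_script(text: str) -> Script:
    """Parse the directives described above; all numbers are hexadecimal."""
    script = Script()
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue
        if words[0] == "section" and len(words) in (2, 4):
            name, _, flag = words[1].partition(",")
            addr = None
            if len(words) == 4:
                if words[2] != "load":
                    raise ValueError(f"expected 'load' in: {raw!r}")
                addr = int(words[3], 16)
            script.sections.append(SectionDirective(name, flag or None, addr))
        elif words[0] == "exec" and len(words) == 2:
            # Assumption: anything that parses as hex is taken as an address.
            try:
                script.exec_target = int(words[1], 16)
            except ValueError:
                script.exec_target = words[1]
        elif words[0] == "pad" and len(words) == 2:
            script.pad_size = int(words[1], 16)
        else:
            raise ValueError(f"unrecognized directive: {raw!r}")
    return script


if __name__ == "__main__":
    sample = "section init load 2000\nsection code\nsection *,!bss\nsection *,bss\nexec start\n"
    print(parse_script(sample))
```

The sample input follows the section ordering the text gives for the default layouts (init, code, other non-bss sections, then bss sections), but it is only an example and not a copy of any built-in script.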
@@ -1051,6 +1318,192 @@ hard to keep it hidden in an open source tool chain anyway. This chapter documents the object file format. + + +An object file consists of a series of sections, each of which contains a +list of exported symbols, a list of incomplete references, and a list of +"local" symbols which may be used in calculating incomplete references. Each +section will obviously also contain the object code. + + + +An exported symbol must be completely resolved to an address within the +section it is exported from. That is, an exported symbol must be a constant +rather than defined in terms of other symbols. + + +Each object file starts with a magic number and version number. The magic +number is the string "LWOBJ16" for this 16 bit object file format. The only +defined version number is currently 0. Thus, the first 8 bytes of the object +file are 4C574F424A313600 + + + +Each section has the following items in order: + + + +section name +flags +list of local symbols (and addresses within the section) +list of exported symbols (and addresses within the section) +list of incomplete references along with the expressions to calculate them +the actual object code (for non-BSS sections) + + + +The section starts with the name of the section with a NUL termination +followed by a series of flag bytes terminated by NUL. There are only two +flag bytes defined. A NUL (0) indicates no more flags and a value of 1 +indicates the section is a BSS section. For a BSS section, no actual +code is included in the object file. + + + +Either a NULL section name or end of file indicates that there are no more +sections. + + + +Each entry in the exported and local symbols table consists of the symbol +(NUL terminated) followed by two bytes which contain the value in big endian +order. The end of a symbol table is indicated by a NULL symbol name. + + + +Each entry in the incomplete references table consists of an expression +followed by a 16 bit offset where the reference goes. 
Expressions are +defined as a series of terms up to an "end of expression" term. Each term +consists of a single byte which identifies the type of term (see below) +followed by any data required by the term. The end of the list is flagged +by a NULL expression (only an end of expression term). + + +Object File Term Types + + + +TERMTYPE +Meaning + + + + +00 +end of expression + + + +01 +integer (16 bit in big endian order follows) + + +02 + external symbol reference (NUL terminated symbol name follows) + + + +03 +local symbol reference (NUL terminated symbol name follows) + + + +04 +operator (1 byte operator number) + + +05 +section base address reference + + + +
+ + +External references are resolved using other object files while local +references are resolved using the local symbol table(s) from this file. This +allows local symbols that are not exported to have the same names as +exported symbols or external references. + + +Object File Operator Numbers + + + +Number +Operator + + + + +01 +addition (+) + + +02 +subtraction (-) + + +03 +multiplication (*) + + +04 +division (/) + + +05 +modulus (%) + + +06 +integer division (\) (same as division) + + + +07 +bitwise and + + + +08 +bitwise or + + + +09 +bitwise xor + + + +0A +boolean and + + + +0B +boolean or + + + +0C +unary negation, 2's complement (-) + + + +0D +unary 1's complement (^) + + + +
+ + +An expression is represented in a postfix manner with both operands for +binary operators preceding the operator and the single operand for unary +operators preceding the operator. + +
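The term types, operator numbers, and postfix ordering described above can be tied together with a short Python sketch that decodes and evaluates one expression from raw bytes. It is an illustration only, not LWLINK's code: the names `evaluate` and `read_cstring`, the symbol-table dictionaries, and the choice to treat every value as an unsigned 16 bit quantity (wrapping after each operation) are assumptions.

```python
import struct
from typing import Dict, List, Tuple

# Operator numbers from the table above. Values are assumed unsigned 16 bit,
# so both division operators behave identically here.
BINARY_OPS = {
    0x01: lambda a, b: a + b,
    0x02: lambda a, b: a - b,
    0x03: lambda a, b: a * b,
    0x04: lambda a, b: a // b,
    0x05: lambda a, b: a % b,
    0x06: lambda a, b: a // b,
    0x07: lambda a, b: a & b,
    0x08: lambda a, b: a | b,
    0x09: lambda a, b: a ^ b,
    0x0A: lambda a, b: int(bool(a) and bool(b)),
    0x0B: lambda a, b: int(bool(a) or bool(b)),
}
UNARY_OPS = {
    0x0C: lambda a: -a,  # 2's complement negation
    0x0D: lambda a: ~a,  # 1's complement
}


def read_cstring(data: bytes, pos: int) -> Tuple[str, int]:
    """Read a NUL terminated name; return it and the position after the NUL."""
    end = data.index(b"\x00", pos)
    return data[pos:end].decode("ascii"), end + 1


def evaluate(data: bytes, pos: int, externals: Dict[str, int],
             locals_: Dict[str, int], section_base: int) -> Tuple[int, int]:
    """Evaluate one postfix expression; return (value, position after it)."""
    stack: List[int] = []
    while True:
        term = data[pos]
        pos += 1
        if term == 0x00:      # end of expression
            break
        if term == 0x01:      # 16 bit big endian integer
            (value,) = struct.unpack_from(">H", data, pos)
            pos += 2
            stack.append(value)
        elif term == 0x02:    # external symbol reference
            name, pos = read_cstring(data, pos)
            stack.append(externals[name])
        elif term == 0x03:    # local symbol reference
            name, pos = read_cstring(data, pos)
            stack.append(locals_[name])
        elif term == 0x04:    # operator
            op = data[pos]
            pos += 1
            if op in UNARY_OPS:
                stack.append(UNARY_OPS[op](stack.pop()) & 0xFFFF)
            else:
                right = stack.pop()
                left = stack.pop()
                stack.append(BINARY_OPS[op](left, right) & 0xFFFF)
        elif term == 0x05:    # section base address reference
            stack.append(section_base)
        else:
            raise ValueError(f"unknown term type {term:#04x}")
    if len(stack) != 1:
        raise ValueError("malformed or empty expression")
    return stack[0], pos


if __name__ == "__main__":
    # Encodes "sym + 2" in postfix: local symbol, integer 2, operator 01.
    expr = b"\x03sym\x00\x01\x00\x02\x04\x01\x00"
    print(evaluate(expr, 0, {}, {"sym": 0x1000}, 0))
```

An expression consisting only of the end of expression term marks the end of the reference list, so a caller would check for that case before calling `evaluate`, which rejects it in this sketch. The 16 bit offset that follows each expression would be read from the returned position.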
diff -r 3706ede361ea -r f0881c115010 doc/objectfiles.txt --- a/doc/objectfiles.txt Fri Jan 30 01:51:41 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,79 +0,0 @@ -An object file consists of a series of sections each of which contains a -list of exported symbols, a list of incomplete references, and a list of -"local" symbols which may be used in calculating incomplete references. Each -section will obviously also contain the object code. - -Exported symbols must be completely resolved to an address within the -section it is exported from. - -Each object file starts with a magic number and version number. The magic -number is the string "LWOBJ16" for this 16 bit object file format. The only -defined version number is currently 0. Thus, the first 8 bytes of the object -file are: - -4C574F424A313600 - -Each section has the following items in order: - -* section name -* flags -* list of local symbols (and addresses within the section) -* list of exported symbols (and addresses within the section) -* list of incomplete references along with the expressions to calculate them -* the actual object code - -The section starts with the name of the section with a NUL termination -followed by a series of flag bytes terminated by NUL. The following flag -bytes are defined: - -Byte Meaning -00 no more flags -01 section is BSS - no actual code is present - -Either a NULL section name or end of file indicate the presence of no more -sections. - -Each entry in the exported and local symbols table consists of the symbol -(NUL terminated) followed by two bytes which contain the value in big endian -order. The end of a symbol table is indicated by a NULL symbol name. - -Each entry in the incomplete references table consists of an expression -followed by a 16 bit offset where the reference goes. Expressions are -defined as a series of terms up to an "end of expression" term. 
Each term -consists of a single byte which identifies the type of term (see below) -followed by any data required by the term. Then end of the list is flagged -by a NULL expression (only an end of expression term). - -TERMTYPE Meaning -00 end of expression -01 integer (16 bit in big endian order follows) -02 external symbol reference (NUL term symbol) -03 local symbol reference (NUL term symbol) -04 operator (1 byte operator number - see below) -05 section base address reference - -External references are resolved using other object files while local -references are resolved using the local symbol table(s) from this file. This -allows local symbols that are not exported to have the same names as -exported symbols or external references. - -The operator numbers are: - -NUM OP -01 + (plus) -02 - (minus) -03 * (times) -04 / (divide) -05 % (modulus) -06 \ (integer division) -07 bitwise and -08 bitwise or -09 bitwise xor -0A boolean and -0B boolean or -0C - (unary negation, 2's complement) -0D ^ (unary 1's complement) - -An expression is represented in a postfix manner with both operands for -binary operators preceding the operator and the single operand for unary -operators preceding the operator. diff -r 3706ede361ea -r f0881c115010 doc/pseudoops.txt --- a/doc/pseudoops.txt Fri Jan 30 01:51:41 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,51 +0,0 @@ -The following pseudo operations are understood by LWASM. - -SECTION - -This introduces a section called . This is only valid if assembling to -an object file. Only one section can be open at any given time. Sections -may be ended with ENDSECTION. Only one section can be open at any given -time. A subsequent SECTION directive will end the previous section. It is -important to note that an end of file does not close the currently open -section. There cannot be a symbol on a SECTION line. - -ENDSECTION - -Specifies the end of a section. This is optional. There cannot be a symbol -on an ENDSECTION line. 
- -ORG - -Specifies the assembly address. For the raw target, this is advisory and -only affects the addresses of symbols. For the object file target, this can -only appear outside of all sections. For the DECB target, each ORG statement -after which any output is generated will generate a segment in the output -file. must be completely resolved during pass 1 of the assembly -process and thus may not refer to forward references or external symbols, or -other symbols that refer to such. - - EQU - -Makes equivalent to . may be an external reference -in which case any references to will also be external references. - -EXPORT [ as ] - -Marks previously defined for export. If is specified, it -will be exported as . must not be an external reference and -must be defined before EXPORT. - -EXTERN [ as ] -IMPORT [ as ] - -Marks as an external reference. If is specified, is -the local name the symbol is references as in this assembly file while - is the actual symbol to be referenced externally. - -END [] - -Marks the end of the assembly process. Immediately terminates assembly -without processing any other lines in this file or any others. It is -optional. is only allowed for the DECB target in which case it -specifies the execution address. If it is not specified, the address -defaults to 0. diff -r 3706ede361ea -r f0881c115010 doc/scripts.txt --- a/doc/scripts.txt Fri Jan 30 01:51:41 2009 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,63 +0,0 @@ -LWLINK linker scripts - -A linker script is used to instruct the linker about how to assemble the -various sections into a completed binary. It consists of a series of -directives which are considered in the order they are encountered. Any -section not referenced by a directive is assumed to be loaded after the -final section explicitly referenced. - -The sections will appear in the resulting binary in the order they are -specified in the script file. 
- -If a referenced section is not found, the linker will behave as though the -section did exist but had a zero size, no relocations, and no exports. - -A section may only be referenced once. Any subsequent references will have -no effect. - -All numbers are hexadecimal. - -section load - -This causes the section to load at . For raw target, only one -"load at" entry is allowed for non-bss sections and it must be the first -one. For raw targets, it affects the addresses the linker assigns to symbols -but has no other affect on the output. bss sections may all have separate -load addresses but since they will not appear in the binary anyway, this is -okay. - -For the DECB target, each "load" entry will cause a new "block" to be -output to the binary which will contain the load address. It is legal for -sections to overlap in this manner - the linker assumes the loader will sort -everything out. - -section - -This will cause the section to load after the previously listed -section. - -exec - -This will cause the execution address (entry point) to be the address -specified (in hex) *or* the specified symbol name. The symbol name must -match a symbol that is exported by one of the object files being linked. -This has no effect for targets that do not encode the entry point into the -resulting file. If not specified, the entry point is assumed to be address 0 -which is probably not what you want. The default link scripts for targets -that support this directive automatically starts at the beginning of the -first section (usually "init" or "code") that is emitted in the binary. - -pad - -This will cause the output file to be padded with NUL bytes to be exactly - bytes in length. This only makes sense for a raw target. - - -If is "*", then any section not already matched by the script will be -matched. For format *, can be used to select sections which have -particular flags set (or not set). 
For instance: - -*,!bss This would match all sections that do not have the bss flag set -*,bss this would match all sections that do have the bss flag set - -