Old-school assemblers were typically hand-coded in assembly language and used ad hoc parsing techniques to process assembly source lines and produce the actual object code. When the assembler syntax is simple (e.g., always OPCODE REG, OPERAND), this worked well enough.
Modern machines have messy instruction sets with many instruction and operand variations, which may be expressed with complex syntax that allows multiple index registers to participate in an operand expression. Allowing sophisticated assembly-time expressions over fixed and relocatable constants, with various kinds of addition operators, complicates this further. Sophisticated assemblers that support conditional compilation, macros, structured data declarations, and so on add still more demands on the syntax. Processing all of this syntax by ad hoc methods is very hard, and that is the reason parser generators were invented.
Using a BNF and a parser generator is a very reasonable way to build a modern assembler, even for a legacy processor such as the Z80. I have built such assemblers for Motorola 8-bit machines such as the 6800/6809, and I am getting ready to do the same for a modern x86. I think you are headed down exactly the right path.
********** EDIT **************** The OP asked for example lexer and parser definitions. I have provided both here.
These are excerpts from real specifications for a 6809 assembler. The complete definitions are 2-3 times the size of the samples here.
To save space, I have edited out much of the dark-corner complexity, which is the point of these definitions. One might be dismayed by the apparent complexity; the point is that with such definitions you are trying to describe the shape of the language, not code it procedurally. You will pay a significantly higher price in complexity if you code all of this in an ad hoc manner, and the result will be far less maintainable.
It will also help to know that these definitions are used with a high-end program analysis system that has lexing/parsing tools as subsystems, called the DMS Software Reengineering Toolkit. DMS automatically builds ASTs from the grammar rules in the parser specification, which makes it a lot easier to build parsing tools. Lastly, the parser specification contains so-called "prettyprinter" declarations, which allow DMS to regenerate source text from the ASTs. (The real purpose of the grammar was to let us build ASTs representing assembler instructions, and then spit them out to be fed to a real assembler!)
One thing of note: how lexemes and grammar rules are written (the metasyntax!) varies somewhat between different lexer/parser generator systems. The syntax of DMS-based specifications is no exception. DMS has relatively sophisticated grammar rules of its own, which really are not practical to explain in the space available here. You will have to live with the idea that other systems use similar notations: EBNF for rules and regular expression variants for lexemes.
Given the OP's interests, he can implement similar lexers/parsers with any lexer/parser generator tool, e.g., FLEX/YACC, JavaCC, ANTLR, ...
********** LEXER **************
-- M6809.lex: Lexical Description for M6809
-- Copyright (C) 1989,1999-2002 Ira D. Baxter
%%
#mainmode Label
#macro digit "[0-9]"
#macro hexadecimaldigit "<digit>|[a-fA-F]"
#macro comment_body_character "[\u0009 \u0020-\u007E]" -- does not include NEWLINE
#macro blank "[\u0000 \ \u0009]"
#macro hblanks "<blank>+"
#macro newline "\u000d \u000a? \u000c? | \u000a \u000c?" -- form feed allowed only after newline
#macro bare_semicolon_comment "\; <comment_body_character>* "
#macro bare_asterisk_comment "\* <comment_body_character>* "
...[snip]
#macro hexadecimal_digit "<digit> | [a-fA-F]"
#macro binary_digit "[01]"
#macro squoted_character "\' [\u0021-\u007E]"
#macro string_character "[\u0009 \u0020-\u007E]"
%%Label -- (First mode) processes left hand side of line: labels, opcodes, etc.
#skip "(<blank>*<newline>)+"
#skip "(<blank>*<newline>)*<blank>+"
<< (GotoOpcodeField ?) >>
#precomment "<comment_line><newline>"
#preskip "(<blank>*<newline>)+"
#preskip "(<blank>*<newline>)*<blank>+"
<< (GotoOpcodeField ?) >>
-- Note that an apparent register name is accepted as a label in this mode
#token LABEL [STRING] "<identifier>"
<< (local (;; (= [TokenScan natural] 1) ; process all string characters
(= [TokenLength natural] ?:TokenCharacterCount)=
(= [TokenString (reference TokenBodyT)] (. ?:TokenCharacters))
(= [Result (reference string)] (. ?:Lexeme:Literal:String:Value))
[ThisCharacterCode natural]
(define Ordinala #61)
(define Ordinalf #66)
(define OrdinalA #41)
(define OrdinalF #46)
);;
(;; (= (@ Result) `') ; start with empty string
(while (<= TokenScan TokenLength)
(;; (= ThisCharacterCode (coerce natural TokenString:TokenScan))
(+= TokenScan) ; bump past character
(ifthen (>= ThisCharacterCode Ordinala)
(-= ThisCharacterCode #20) ; fold to upper case
)ifthen
(= (@ Result) (append (@ Result) (coerce character ThisCharacterCode)))=
);;
)while
);;
)local
(= ?:Lexeme:Literal:String:Format (LiteralFormat:MakeCompactStringLiteralFormat 0)) ; nothing interesting in string
(GotoLabelList ?)
>>
%%OpcodeField
#skip "<hblanks>"
<< (GotoEOLComment ?) >>
#ifnotoken
<< (GotoEOLComment ?) >>
-- Opcode field tokens
#token 'ABA' "[aA][bB][aA]"
<< (GotoEOLComment ?) >>
#token 'ABX' "[aA][bB][xX]"
<< (GotoEOLComment ?) >>
#token 'ADC' "[aA][dD][cC]"
<< (GotoABregister ?) >>
#token 'ADCA' "[aA][dD][cC][aA]"
<< (GotoOperand ?) >>
#token 'ADCB' "[aA][dD][cC][bB]"
<< (GotoOperand ?) >>
#token 'ADCD' "[aA][dD][cC][dD]"
<< (GotoOperand ?) >>
#token 'ADD' "[aA][dD][dD]"
<< (GotoABregister ?) >>
#token 'ADDA' "[aA][dD][dD][aA]"
<< (GotoOperand ?) >>
#token 'ADDB' "[aA][dD][dD][bB]"
<< (GotoOperand ?) >>
#token 'ADDD' "[aA][dD][dD][dD]"
<< (GotoOperand ?) >>
#token 'AND' "[aA][nN][dD]"
<< (GotoABregister ?) >>
#token 'ANDA' "[aA][nN][dD][aA]"
<< (GotoOperand ?) >>
#token 'ANDB' "[aA][nN][dD][bB]"
<< (GotoOperand ?) >>
#token 'ANDCC' "[aA][nN][dD][cC][cC]"
<< (GotoRegister ?) >>
...[long list of opcodes snipped]
#token IDENTIFIER [STRING] "<identifier>"
<< (local (;; (= [TokenScan natural] 1) ; process all string characters
(= [TokenLength natural] ?:TokenCharacterCount)=
(= [TokenString (reference TokenBodyT)] (. ?:TokenCharacters))
(= [Result (reference string)] (. ?:Lexeme:Literal:String:Value))
[ThisCharacterCode natural]
(define Ordinala #61)
(define Ordinalf #66)
(define OrdinalA #41)
(define OrdinalF #46)
);;
(;; (= (@ Result) `') ; start with empty string
(while (<= TokenScan TokenLength)
(;; (= ThisCharacterCode (coerce natural TokenString:TokenScan))
(+= TokenScan) ; bump past character
(ifthen (>= ThisCharacterCode Ordinala)
(-= ThisCharacterCode #20) ; fold to upper case
)ifthen
(= (@ Result) (append (@ Result) (coerce character ThisCharacterCode)))=
);;
)while
);;
)local
(= ?:Lexeme:Literal:String:Format (LiteralFormat:MakeCompactStringLiteralFormat 0)) ; nothing interesting in string
(GotoOperandField ?)
>>
#token '#' "\#" -- special constant introduction (FDB)
<< (GotoDataField ?) >>
#token NUMBER [NATURAL] "<decimal_number>"
<< (local [format LiteralFormat:NaturalLiteralFormat]
(;; (= ?:Lexeme:Literal:Natural:Value (ConvertDecimalTokenStringToNatural (. format) ? 0 0))
(= ?:Lexeme:Literal:Natural:Format (LiteralFormat:MakeCompactNaturalLiteralFormat format))
);;
)local
(GotoOperandField ?)
>>
#token NUMBER [NATURAL] "\$ <hexadecimal_digit>+"
<< (local [format LiteralFormat:NaturalLiteralFormat]
(;; (= ?:Lexeme:Literal:Natural:Value (ConvertHexadecimalTokenStringToNatural (. format) ? 1 0))
(= ?:Lexeme:Literal:Natural:Format (LiteralFormat:MakeCompactNaturalLiteralFormat format))
);;
)local
(GotoOperandField ?)
>>
#token NUMBER [NATURAL] "\% <binary_digit>+"
<< (local [format LiteralFormat:NaturalLiteralFormat]
(;; (= ?:Lexeme:Literal:Natural:Value (ConvertBinaryTokenStringToNatural (. format) ? 1 0))
(= ?:Lexeme:Literal:Natural:Format (LiteralFormat:MakeCompactNaturalLiteralFormat format))
);;
)local
(GotoOperandField ?)
>>
#token CHARACTER [CHARACTER] "<squoted_character>"
<< (= ?:Lexeme:Literal:Character:Value (TokenStringCharacter ? 2))
(= ?:Lexeme:Literal:Character:Format (LiteralFormat:MakeCompactCharacterLiteralFormat 0 0)) ; nothing special about character
(GotoOperandField ?)
>>
%%OperandField
#skip "<hblanks>"
<< (GotoEOLComment ?) >>
#ifnotoken
<< (GotoEOLComment ?) >>
-- Tokens signalling switch to index register modes
#token ',' "\,"
<<(GotoRegisterField ?)>>
#token '[' "\["
<<(GotoRegisterField ?)>>
-- Operators for arithmetic syntax
#token '!!' "\!\!"
#token '!' "\!"
#token '##' "\#\#"
#token '#' "\#"
#token '&' "\&"
#token '(' "\("
#token ')' "\)"
#token '*' "\*"
#token '+' "\+"
#token '-' "\-"
#token '/' "\/"
#token '//' "\/\/"
#token '<' "\<"
#token '<' "\<"
#token '<<' "\<\<"
#token '<=' "\<\="
#token '</' "\<\/"
#token '=' "\="
#token '>' "\>"
#token '>' "\>"
#token '>=' "\>\="
#token '>>' "\>\>"
#token '>/' "\>\/"
#token '\\' "\\"
#token '|' "\|"
#token '||' "\|\|"
#token NUMBER [NATURAL] "<decimal_number>"
<< (local [format LiteralFormat:NaturalLiteralFormat]
(;; (= ?:Lexeme:Literal:Natural:Value (ConvertDecimalTokenStringToNatural (. format) ? 0 0))
(= ?:Lexeme:Literal:Natural:Format (LiteralFormat:MakeCompactNaturalLiteralFormat format))
);;
)local
>>
#token NUMBER [NATURAL] "\$ <hexadecimal_digit>+"
<< (local [format LiteralFormat:NaturalLiteralFormat]
(;; (= ?:Lexeme:Literal:Natural:Value (ConvertHexadecimalTokenStringToNatural (. format) ? 1 0))
(= ?:Lexeme:Literal:Natural:Format (LiteralFormat:MakeCompactNaturalLiteralFormat format))
);;
)local
>>
#token NUMBER [NATURAL] "\% <binary_digit>+"
<< (local [format LiteralFormat:NaturalLiteralFormat]
(;; (= ?:Lexeme:Literal:Natural:Value (ConvertBinaryTokenStringToNatural (. format) ? 1 0))
(= ?:Lexeme:Literal:Natural:Format (LiteralFormat:MakeCompactNaturalLiteralFormat format))
);;
)local
>>
-- Notice that an apparent register is accepted as a label in this mode
#token IDENTIFIER [STRING] "<identifier>"
<< (local (;; (= [TokenScan natural] 1) ; process all string characters
(= [TokenLength natural] ?:TokenCharacterCount)=
(= [TokenString (reference TokenBodyT)] (. ?:TokenCharacters))
(= [Result (reference string)] (. ?:Lexeme:Literal:String:Value))
[ThisCharacterCode natural]
(define Ordinala #61)
(define Ordinalf #66)
(define OrdinalA #41)
(define OrdinalF #46)
);;
(;; (= (@ Result) `') ; start with empty string
(while (<= TokenScan TokenLength)
(;; (= ThisCharacterCode (coerce natural TokenString:TokenScan))
(+= TokenScan) ; bump past character
(ifthen (>= ThisCharacterCode Ordinala)
(-= ThisCharacterCode #20) ; fold to upper case
)ifthen
(= (@ Result) (append (@ Result) (coerce character ThisCharacterCode)))=
);;
)while
);;
)local
(= ?:Lexeme:Literal:String:Format (LiteralFormat:MakeCompactStringLiteralFormat 0)) ; nothing interesting in string
>>
%%Register -- operand field for TFR, ANDCC, ORCC, EXG opcodes
#skip "<hblanks>"
#ifnotoken << (GotoRegisterField ?) >>
%%RegisterField -- handles registers and indexing mode syntax
-- In this mode, names that look like registers are recognized as registers
#skip "<hblanks>"
<< (GotoEOLComment ?) >>
#ifnotoken
<< (GotoEOLComment ?) >>
#token '[' "\["
#token ']' "\]"
#token '--' "\-\-"
#token '++' "\+\+"
#token 'A' "[aA]"
#token 'B' "[bB]"
#token 'CC' "[cC][cC]"
#token 'DP' "[dD][pP] | [dD][pP][rR]" -- DPR shouldn't be needed, but found one instance
#token 'D' "[dD]"
#token 'Z' "[zZ]"
-- Index register designations
#token 'X' "[xX]"
#token 'Y' "[yY]"
#token 'U' "[uU]"
#token 'S' "[sS]"
#token 'PCR' "[pP][cC][rR]"
#token 'PC' "[pP][cC]"
#token ',' "\,"
-- Operators for arithmetic syntax
#token '!!' "\!\!"
#token '!' "\!"
#token '##' "\#\#"
#token '#' "\#"
#token '&' "\&"
#token '(' "\("
#token ')' "\)"
#token '*' "\*"
#token '+' "\+"
#token '-' "\-"
#token '/' "\/"
#token '<' "\<"
#token '<' "\<"
#token '<<' "\<\<"
#token '<=' "\<\="
#token '<|' "\<\|"
#token '=' "\="
#token '>' "\>"
#token '>' "\>"
#token '>=' "\>\="
#token '>>' "\>\>"
#token '>|' "\>\|"
#token '\\' "\\"
#token '|' "\|"
#token '||' "\|\|"
#token NUMBER [NATURAL] "<decimal_number>"
<< (local [format LiteralFormat:NaturalLiteralFormat]
(;; (= ?:Lexeme:Literal:Natural:Value (ConvertDecimalTokenStringToNatural (. format) ? 0 0))
(= ?:Lexeme:Literal:Natural:Format (LiteralFormat:MakeCompactNaturalLiteralFormat format))
);;
)local
>>
... [snip]
%% -- end M6809.lex
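For readers without DMS, the central idea of the lexer above (lexical modes that change what a character sequence means depending on the field being scanned) can be sketched in plain Python. This is only an illustration under stated assumptions, not a translation of the specification: the names `lex_line`, `OPCODES`, `NAME`, and `OPERAND_RULES` are hypothetical, the opcode table is cut down to a handful of entries, the identifier pattern is a guess (the `<identifier>` macro is not shown in the excerpt), and only three modes (Label, OpcodeField, OperandField) are modeled.

```python
import re

# Hypothetical, heavily reduced opcode table; the real list is much longer.
OPCODES = {"ABA", "ABX", "ADCA", "ADCB", "ADDA", "ADDB", "ANDA", "ANDB"}

# Assumed identifier shape; the excerpt does not show the <identifier> macro.
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# Operand-field rules, tried in order: (token kind, pattern, value converter).
OPERAND_RULES = [
    ("NUMBER", re.compile(r"\$[0-9A-Fa-f]+"), lambda s: int(s[1:], 16)),
    ("NUMBER", re.compile(r"%[01]+"), lambda s: int(s[1:], 2)),
    ("NUMBER", re.compile(r"[0-9]+"), int),
    ("IDENTIFIER", NAME, str.upper),
    ("OP", re.compile(r"<<|>>|<=|>=|[-+*/#,()\[\]<>=&|!]"), str),
]


def lex_line(line):
    """Tokenize one source line, switching modes as each field is consumed."""
    tokens, pos, mode = [], 0, "Label"
    while pos < len(line):
        ch = line[pos]
        if ch in " \t":
            if mode == "Label":
                mode = "OpcodeField"  # leading blank: the line has no label
            pos += 1
        elif ch == ";" or (ch == "*" and mode == "Label"):
            break  # comment to end of line; '*' is a comment only in Label mode
        elif mode == "Label":
            m = NAME.match(line, pos)
            if not m:
                raise SyntaxError(f"bad label at column {pos}")
            tokens.append(("LABEL", m.group().upper()))  # fold to upper case
            pos, mode = m.end(), "OpcodeField"
        elif mode == "OpcodeField":
            m = NAME.match(line, pos)
            if not m or m.group().upper() not in OPCODES:
                raise SyntaxError(f"unknown opcode at column {pos}")
            tokens.append(("OPCODE", m.group().upper()))
            pos, mode = m.end(), "OperandField"
        else:  # OperandField: numbers, identifiers, operators
            for kind, pattern, convert in OPERAND_RULES:
                m = pattern.match(line, pos)
                if m:
                    tokens.append((kind, convert(m.group())))
                    pos = m.end()
                    break
            else:
                raise SyntaxError(f"unexpected character at column {pos}")
    return tokens


print(lex_line("LOOP ADDA #$1F ; add"))
```

Note how the same text is classified differently by mode: `X` in the first column becomes a LABEL, `*` in the first column starts a comment, yet `*` in the operand field is an ordinary operator token. A generator-based lexer expresses this declaratively; the hand-written loop here only mimics the effect.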
********** PARSER **************
-- M6809.ATG: Motorola 6809 assembly code parser
-- (C) Copyright 1989;1999-2002 Ira D. Baxter; All Rights Reserved
m6809 = sourcelines ;
sourcelines = ;
sourcelines = sourcelines sourceline EOL ;
<<PrettyPrinter>>: { V(CV(sourcelines[1]),H(sourceline,A<eol>(EOL))); }
-- leading opcode field symbol should be treated as keyword.
sourceline = ;
sourceline = labels ;
sourceline = optional_labels 'EQU' expression ;
<<PrettyPrinter>>: { H(optional_labels,A<opcode>('EQU'),A<operand>(expression)); }
sourceline = LABEL 'SET' expression ;
<<PrettyPrinter>>: { H(A<firstlabel>(LABEL),A<opcode>('SET'),A<operand>(expression)); }
sourceline = optional_label instruction ;
<<PrettyPrinter>>: { H(optional_label,instruction); }
sourceline = optional_label optlabelleddirective ;
<<PrettyPrinter>>: { H(optional_label,optlabelleddirective); }
sourceline = optional_label implicitdatadirective ;
<<PrettyPrinter>>: { H(optional_label,implicitdatadirective); }
sourceline = unlabelleddirective ;
sourceline = '?ERROR' ;
<<PrettyPrinter>>: { A<opcode>('?ERROR'); }
optional_label = labels ;
optional_label = LABEL ':' ;
<<PrettyPrinter>>: { H(A<firstlabel>(LABEL),':'); }
optional_label = ;
optional_labels = ;
optional_labels = labels ;
labels = LABEL ;
<<PrettyPrinter>>: { A<firstlabel>(LABEL); }
labels = labels ',' LABEL ;
<<PrettyPrinter>>: { H(labels[1],',',A<otherlabels>(LABEL)); }
unlabelleddirective = 'END' ;
<<PrettyPrinter>>: { A<opcode>('END'); }
unlabelleddirective = 'END' expression ;
<<PrettyPrinter>>: { H(A<opcode>('END'),A<operand>(expression)); }
unlabelleddirective = 'IF' expression EOL conditional ;
<<PrettyPrinter>>: { V(H(A<opcode>('IF'),H(A<operand>(expression),A<eol>(EOL))),CV(conditional)); }
unlabelleddirective = 'IFDEF' IDENTIFIER EOL conditional ;
<<PrettyPrinter>>: { V(H(A<opcode>('IFDEF'),H(A<operand>(IDENTIFIER),A<eol>(EOL))),CV(conditional)); }
unlabelleddirective = 'IFUND' IDENTIFIER EOL conditional ;
<<PrettyPrinter>>: { V(H(A<opcode>('IFUND'),H(A<operand>(IDENTIFIER),A<eol>(EOL))),CV(conditional)); }
unlabelleddirective = 'INCLUDE' FILENAME ;
<<PrettyPrinter>>: { H(A<opcode>('INCLUDE'),A<operand>(FILENAME)); }
unlabelleddirective = 'LIST' expression ;
<<PrettyPrinter>>: { H(A<opcode>('LIST'),A<operand>(expression)); }
unlabelleddirective = 'NAME' IDENTIFIER ;
<<PrettyPrinter>>: { H(A<opcode>('NAME'),A<operand>(IDENTIFIER)); }
unlabelleddirective = 'ORG' expression ;
<<PrettyPrinter>>: { H(A<opcode>('ORG'),A<operand>(expression)); }
unlabelleddirective = 'PAGE' ;
<<PrettyPrinter>>: { A<opcode>('PAGE'); }
unlabelleddirective = 'PAGE' HEADING ;
<<PrettyPrinter>>: { H(A<opcode>('PAGE'),A<operand>(HEADING)); }
unlabelleddirective = 'PCA' expression ;
<<PrettyPrinter>>: { H(A<opcode>('PCA'),A<operand>(expression)); }
unlabelleddirective = 'PCC' expression ;
<<PrettyPrinter>>: { H(A<opcode>('PCC'),A<operand>(expression)); }
unlabelleddirective = 'PSR' expression ;
<<PrettyPrinter>>: { H(A<opcode>('PSR'),A<operand>(expression)); }
unlabelleddirective = 'TABS' numberlist ;
<<PrettyPrinter>>: { H(A<opcode>('TABS'),A<operand>(numberlist)); }
unlabelleddirective = 'TITLE' HEADING ;
<<PrettyPrinter>>: { H(A<opcode>('TITLE'),A<operand>(HEADING)); }
unlabelleddirective = 'WITH' settings ;
<<PrettyPrinter>>: { H(A<opcode>('WITH'),A<operand>(settings)); }
settings = setting ;
settings = settings ',' setting ;
<<PrettyPrinter>>: { H*; }
setting = 'WI' '=' NUMBER ;
<<PrettyPrinter>>: { H*; }
setting = 'DE' '=' NUMBER ;
<<PrettyPrinter>>: { H*; }
setting = 'M6800' ;
setting = 'M6801' ;
setting = 'M6809' ;
setting = 'M6811' ;
-- collects lines of conditional code into blocks
conditional = 'ELSEIF' expression EOL conditional ;
<<PrettyPrinter>>: { V(H(A<opcode>('ELSEIF'),H(A<operand>(expression),A<eol>(EOL))),CV(conditional[1])); }
conditional = 'ELSE' EOL else ;
<<PrettyPrinter>>: { V(H(A<opcode>('ELSE'),A<eol>(EOL)),CV(else)); }
conditional = 'FIN' ;
<<PrettyPrinter>>: { A<opcode>('FIN'); }
conditional = sourceline EOL conditional ;
<<PrettyPrinter>>: { V(H(sourceline,A<eol>(EOL)),CV(conditional[1])); }
else = 'FIN' ;
<<PrettyPrinter>>: { A<opcode>('FIN'); }
else = sourceline EOL else ;
<<PrettyPrinter>>: { V(H(sourceline,A<eol>(EOL)),CV(else[1])); }
-- keyword-less directive, generates data tables
implicitdatadirective = implicitdatadirective ',' implicitdataitem ;
<<PrettyPrinter>>: { H*; }
implicitdatadirective = implicitdataitem ;
implicitdataitem = '#' expression ;
<<PrettyPrinter>>: { A<operand>(H('#',expression)); }
implicitdataitem = '+' expression ;
<<PrettyPrinter>>: { A<operand>(H('+',expression)); }
implicitdataitem = '-' expression ;
<<PrettyPrinter>>: { A<operand>(H('-',expression)); }
implicitdataitem = expression ;
<<PrettyPrinter>>: { A<operand>(expression); }
implicitdataitem = STRING ;
<<PrettyPrinter>>: { A<operand>(STRING); }
-- instructions valid for m680C (see Software Dynamics ASM manual)
instruction = 'ABA' ;
<<PrettyPrinter>>: { A<opcode>('ABA'); }
instruction = 'ABX' ;
<<PrettyPrinter>>: { A<opcode>('ABX'); }
instruction = 'ADC' 'A' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>(H('ADC','A')),A<operand>(operandfetch)); }
instruction = 'ADC' 'B' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>(H('ADC','B')),A<operand>(operandfetch)); }
instruction = 'ADCA' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>('ADCA'),A<operand>(operandfetch)); }
instruction = 'ADCB' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>('ADCB'),A<operand>(operandfetch)); }
instruction = 'ADCD' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>('ADCD'),A<operand>(operandfetch)); }
instruction = 'ADD' 'A' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>(H('ADD','A')),A<operand>(operandfetch)); }
instruction = 'ADD' 'B' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>(H('ADD','B')),A<operand>(operandfetch)); }
instruction = 'ADDA' operandfetch ;
<<PrettyPrinter>>: { H(A<opcode>('ADDA'),A<operand>(operandfetch)); }
[..snip...]
-- condition code mask for ANDCC and ORCC
conditionmask = '#' expression ;
<<PrettyPrinter>>: { H*; }
conditionmask = expression ;
target = expression ;
operandfetch = '#' expression ; --immediate
<<PrettyPrinter>>: { H*; }
operandfetch = memoryreference ;
operandstore = memoryreference ;
memoryreference = '[' indexedreference ']' ;
<<PrettyPrinter>>: { H*; }
memoryreference = indexedreference ;
indexedreference = offset ;
indexedreference = offset ',' indexregister ;
<<PrettyPrinter>>: { H*; }
indexedreference = ',' indexregister ;
<<PrettyPrinter>>: { H*; }
indexedreference = ',' '--' indexregister ;
<<PrettyPrinter>>: { H*; }
indexedreference = ',' '-' indexregister ;
<<PrettyPrinter>>: { H*; }
indexedreference = ',' indexregister '++' ;
<<PrettyPrinter>>: { H*; }
indexedreference = ',' indexregister '+' ;
<<PrettyPrinter>>: { H*; }
offset = '>' expression ; -- page zero ref
<<PrettyPrinter>>: { H*; }
offset = '<' expression ; -- long reference
<<PrettyPrinter>>: { H*; }
offset = expression ;
offset = 'A' ;
offset = 'B' ;
offset = 'D' ;
registerlist = registername ;
registerlist = registerlist ',' registername ;
<<PrettyPrinter>>: { H*; }
registername = 'A' ;
registername = 'B' ;
registername = 'CC' ;
registername = 'DP' ;
registername = 'D' ;
registername = 'Z' ;
registername = indexregister ;
indexregister = 'X' ;
indexregister = 'Y' ;
indexregister = 'U' ; -- not legal on M6811
indexregister = 'S' ;
indexregister = 'PCR' ;
indexregister = 'PC' ;
expression = sum '=' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '<<' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '</' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '<=' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '<' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '>>' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '>/' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '>=' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '>' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum '#' sum ;
<<PrettyPrinter>>: { H*; }
expression = sum ;
sum = product ;
sum = sum '+' product ;
<<PrettyPrinter>>: { H*; }
sum = sum '-' product ;
<<PrettyPrinter>>: { H*; }
sum = sum '!' product ;
<<PrettyPrinter>>: { H*; }
sum = sum '!!' product ;
<<PrettyPrinter>>: { H*; }
product = term '*' product ;
<<PrettyPrinter>>: { H*; }
product = term '||' product ; -- wrong?
<<PrettyPrinter>>: { H*; }
product = term '/' product ;
<<PrettyPrinter>>: { H*; }
product = term '//' product ;
<<PrettyPrinter>>: { H*; }
product = term '&' product ;
<<PrettyPrinter>>: { H*; }
product = term '##' product ;
<<PrettyPrinter>>: { H*; }
product = term ;
term = '+' term ;
<<PrettyPrinter>>: { H*; }
term = '-' term ;
<<PrettyPrinter>>: { H*; }
term = '\\' term ; -- complement
<<PrettyPrinter>>: { H*; }
term = '&' term ; -- not
term = IDENTIFIER ;
term = NUMBER ;
term = CHARACTER ;
term = '*' ;
term = '(' expression ')' ;
<<PrettyPrinter>>: { H*; }
numberlist = NUMBER ;
numberlist = numberlist ',' NUMBER ;
<<PrettyPrinter>>: { H*; }
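To make the `sum` / `product` / `term` rules above concrete without a parser generator, here is a small hand-written recursive-descent sketch in Python that builds an AST and then regenerates text from it, roughly what the grammar rules and prettyprinter declarations do together. Everything here is an assumption for illustration: the names `Parser`, `parse`, and `unparse` are hypothetical, the AST is made of plain tuples, only a subset of the operators is handled (`+ - * / &`, unary `+`/`-`, parentheses, and `*` as a term), and the relational `expression` level is left out. A generator such as DMS derives this machinery from the rules instead of having it written by hand.

```python
import re


class Parser:
    """Recursive-descent parser for a subset of the sum/product/term rules."""

    def __init__(self, text):
        # Numbers, identifiers, or any single non-blank character.
        self.tokens = re.findall(r"[0-9]+|[A-Za-z_][A-Za-z0-9_]*|\S", text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def sum(self):
        # sum = sum op product: the left recursion becomes a loop,
        # which keeps '+' and '-' left-associative.
        node = self.product()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.product())
        return node

    def product(self):
        # product = term op product: right-recursive, as in the grammar.
        left = self.term()
        if self.peek() in ("*", "/", "&"):
            op = self.take()
            return (op, left, self.product())
        return left

    def term(self):
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of expression")
        if tok in ("+", "-"):
            return ("unary" + tok, self.term())
        if tok == "*":
            return ("here",)  # term = '*'
        if tok == "(":
            node = self.sum()
            if self.take() != ")":
                raise SyntaxError("missing ')'")
            return ("paren", node)
        if re.fullmatch(r"[0-9]+", tok):
            return ("num", int(tok))
        if re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", tok):
            return ("id", tok.upper())
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(text)
    tree = parser.sum()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree


def unparse(node):
    """Regenerate source text from the tuple AST (a tiny 'prettyprinter')."""
    kind = node[0]
    if kind == "num":
        return str(node[1])
    if kind == "id":
        return node[1]
    if kind == "here":
        return "*"
    if kind == "paren":
        return "(" + unparse(node[1]) + ")"
    if kind in ("unary+", "unary-"):
        return kind[-1] + unparse(node[1])
    return unparse(node[1]) + kind + unparse(node[2])  # binary operator


print(unparse(parse("base + (2 * size) - *")))
```

Because parenthesized terms are kept as explicit nodes, the regenerated text preserves the grouping the programmer wrote. The shape of the tree also follows the grammar: `1-2-3` nests to the left, while `2*3*4` nests to the right, matching the left-recursive `sum` and right-recursive `product` rules.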
Dupe of http://stackoverflow.com/questions/1305091/writing-an-z80-assembler-lexing-asm-and-building-a-parse-tree-using-composition by the same user
@Butterworth: not a dupe. The other question is about passing information around whatever tree he might build using a grammar. This question is about whether he should use a grammar and, if so, what it would look like. The answer to this question is a precondition for the other one being interesting.