Unimal
Unified macro language
Language-independent macro
processor
Version 2.1
Documentation revision 2.1a
MacroExpressions http://www.macroexpressions.com
0.1.4. Repeat/While loop construct
0.1.5. Save and Restore operators
0.1.10. Bug Fixed in release 2.1 build 231
2.1. Tabulating a function: Unimal loops and built-in math
2.2. Parameters sharing among languages: Export statement
2.3. Software distribution: If statement
2.5. Highlights of other Unimal language features
2.6. Publishing the indices of array entries to a header file
2.6.1. A useful design pattern
2.6.2. Export Push/Pop and string expressions
2.7. Toward truly reusable macros: inspecting properties of the arguments
3.9. Error reporting: Unimal.err
4. Unimal language reference guide
4.3. Macro parameters (compile-time variables)
4.7. Target language interface
4.8.4. Bitwise logic expressions
4.8.5.3. Logical AND ‘&&’ and OR ‘||’ expressions
4.11.3.1. Name-to-string conversion
4.11.3.2. Substring extraction uSubstr
4.11.3.4. Simplified Concatenation
4.11.3.5. Splitting a string uSplit
4.12.14. Expand (non-recursive)
4.12.15. Expand (possibly recursive)
4.13. A useful shorthand for a list of arguments
4.14. Special macro parameters
5. Error Detection and Recovery
5.1. Error logging mechanism in Unimal
5.2. Unimal error messages reference
5.2.1. Default format of an error message
5.2.2. Error message format mimicry
5.2.3. F type: Fatal file errors
5.2.3.3. Errors 0104, 0105 (output file), 0106 (input file)
5.2.4. Special F type error (Usage syntax)
5.2.6.1. Error 2000 (The file has an unbalanced beginning or end of a block)
5.2.6.2. Error 2001 (General syntax error)
5.2.6.3. Error 2002 (Macro redefinition)
5.2.6.4. Error 2004 (Missing actual argument)
5.2.6.5. Errors 2005, 2006, 2007 (Unmatched block operators)
5.2.6.6. Error 2009 (Bad macro reference)
5.2.6.7. Error 2010 (Unexpected type)
5.2.6.8. Error 2011 (undefined parameter)
5.2.6.9. Error 2012 (formatting in composite names or target language interface)
5.2.6.10. Error 2013 (expected macro parameter)
5.2.6.11. Error 2014 (expected a numeric)
5.2.6.12. Error 2015 (undefined string expression)
5.2.6.13. Error 2016 (invalid string expression)
5.2.6.14. Error 2017 (wrong number of arguments to a function)
5.2.6.15. Error 2018 (literal number too large)
5.2.6.16. Error 2019 (string not terminated)
5.2.6.17. Error 2020 (Nested macro definition)
5.2.6.18. Error 2021 (Unmatched While)
5.2.6.19. Error 2022 (Recursive macro expansion)
5.2.8.1. Errors 3500, 3501, 3502 (Arithmetic overflows)
5.2.8.2. Error 3503 (Divide by zero)
5.2.8.3. Error 3504 (Non-positive divisor in remainder operation)
5.2.8.4. Errors 3510, 3511, 3512, 3513 (math functions errors)
6.1. (No) implementation limits
(You can skip this change log if you are new to Unimal.)
Generally, Unimal is reluctant to add new features because it wants to be very easy to learn. However, there is a balance to be found between being easy to learn and being easy to apply.
The features added in 2.1 come from real-world experience and improve Unimal’s ease of use.
Beginning with version 2.0b, Unimal addresses two usability issues related to the placement of macro definitions:
Correspondingly, the error message S2008 is removed and a new error, S2020 "Nested macro definition" is added.
In version 2.1, a traditional macro invocation generates an error if it would cause a recursion. (Prior to 2.1, any such recursion would implicitly be infinite because Unimal must expand false blocks e.g. in a search for unbalanced Endfor.)
To allow recursive macro expansion, the user must indicate that the macro (including macros contained in it) is balanced; the syntax to do so is to pass the argument list in square brackets.
Beginning with Unimal 2.0b, a Repeat/While loop construct analogous to the C do/while construct is available. It doesn’t add any new functionality because it can be emulated with a For/Endfor loop and manual manipulation of the loop counter. However, in many cases Unimal code becomes so much cleaner with the Repeat/While loop that its addition felt justified.
Beginning with version 2.1:
Unimal now allows you to stash a macro parameter away and to restore it later, selectively if so desired. For example:
#MP Save myparam
The Restore operator restores the value(s) saved.
Beginning with version 2.1:
Beginning with version 2.1:
Beginning with version 2.0b:
Beginning with Unimal 2.1:
-f<file>
reads the command line arguments from <file>; this serves as a command
line extension
-N<name>=<number>
added in version 2.0c is now checked for correctness
A new error message, F0111, is added to indicate an incorrect command line option
Beginning with Unimal 2.0c:
-S<name>=<string>
defines macro parameter <name> with the string value <string>
-N<name>=<number>
defines macro parameter <name> with the numeric value <number>
-v
Displays version information
Prior to release 2.1 (build 231), Unimal did not log more than one error per line of input. That’s because a second error is likely to be induced by the first one.
Beginning with release 2.1 (build 231), Unimal outputs all errors. Even though some errors are induced by a previous error, the error output allows better understanding of the root cause of the failed execution.
Code snippets, names of variables and the like are in Courier New font.
Of them, keywords are blue.
Formal arguments of Unimal macros are shown #red#.
Unimal comments are shown in green.
Literal strings are shown in brown.
When line numbers in a code snippet are referenced, the line numbers in parentheses are shown to the left of the code lines.
There is a QuickStart subdirectory of the Samples directory in the distribution; it contains the files referenced in the Quick Start section. If you prefer to work along, do use those files for reference.
Also, don’t miss the Application Notes; some are included in the distribution; more are available on the Web site.
In short, Unimal is an advanced language-independent macro preprocessor. That is, it is a utility that processes a source file into one (or more!) output file(s) which are, in their turn, source file(s) in one (or more!) programming languages.
The name stands for UNIfied MAcro Language. ‘Unified’ here means that the same macro processor applies to various (programming) languages, like C, Assembler, Linker command language, make files, or almost any language whatsoever.
Unimal is not made to replace any software development tools; it is to supplement them with better software management capabilities.
A software developer would use Unimal in a situation where there is a need for a powerful macro processor. The powerful features of Unimal, some of them unique, make it a macro processor of choice.
Unimal is not (perhaps unfortunately) designed with utmost elegance in mind. Instead, it is designed to be simple in accomplishing simple tasks and powerful enough to make complex solutions possible.
The next section introduces several practical problems and illustrates Unimal features facilitating the solutions.
Conceptually, the Unimal macro language is very simple: it is line-based, and each line in a source file is either a line in a target programming language or a Unimal statement. The former can contain special markup, which instructs Unimal to replace the markup with the corresponding text.
As an illustration, consider the problem of tabulating a hard-to-compute function in a constant integer array – a problem often encountered in embedded programming. In our first example, we want to tabulate, in C, a scaled sine wave, 10000*sin(x) at seven equidistant points in the segment [0, π/2]. Here is a solution:
(1) const int sinewave[] = {
(2) #MP For n=0, 6
(3) #MP val = Usin(10000, 1, n, 2*6)
(4) #mp%dval,
(5) #MP Endfor
(6) };
The Unimal output is
const int sinewave[] = {
0,
2588,
5000,
7071,
8660,
9659,
10000,
};
(If you prefer to work along, run
Unimal sine6.u
from the command line in the QuickStart folder; the result will be sent to the
standard output.)
Here is what happens:
Line 1 is normal C code; since it doesn’t have any Unimal markup, it is copied from input to output verbatim.
Line 2 starts with the marker #MP; this indicates a Unimal statement. As with all Unimal statements, nothing from this line is sent to the output. The statement itself is a loop statement; it instructs Unimal to rescan the source lines up to the matching end of loop once for each n = 0, 1, …, 6. The matching end of loop is found in line 5, so lines 3 and 4 are scanned for every value of n.
Line 3 (again, because of the #MP marker) is a Unimal statement. It is a Set statement, assigning the value of a numeric expression to the macro parameter val. In our case, the numeric expression is a built-in function, Usin, which computes a scaled integer sine:
Usin(a, b, c, d) = (a/b)*sin(π*c/d), taken as an integer
Line 4 does not start with the #MP marker, so it is considered a line in the target language and is to be sent to the output. However, it contains a Unimal markup, #mp%d, which instructs Unimal to render the macro parameter that follows (val) as a decimal number. What’s outside the markup (in this case, just a comma) is copied to the output unchanged.
As n changes with each rescan of the source, so does val. This is how the numbers in the output are produced.
Line 5 is a Unimal statement (it starts with #MP); this is the end-of-loop statement which indicates the boundary for repeated source rescan.
Finally, line 6 is a target language statement without markup; it is copied to the output.
This simple example hints at Unimal’s powers in automating static (compile-time) initialization. While this is most important in resource-constrained embedded applications, “normal” computer applications can benefit from Unimal in implementing table-driven algorithms and/or software designs.
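As a cross-check of the table above, the same seven values can be reproduced outside Unimal. The following Python sketch is not Unimal code; the helper usin_like is a hypothetical stand-in for the built-in, written under the assumption that the built-in evaluates (a/b)*sin(π*c/d) and rounds to the nearest integer. That assumption is consistent with the output shown above but is not confirmed beyond it.

```python
import math


def usin_like(a, b, c, d):
    """Hypothetical stand-in for Usin: nearest integer to (a/b)*sin(pi*c/d).

    The rounding rule and the role of b are assumptions made for this sketch.
    """
    return round(a / b * math.sin(math.pi * c / d))


def sine_table(steps=6, scale=10000):
    """Tabulate scale*sin(x) at steps+1 equidistant points of [0, pi/2]."""
    return [usin_like(scale, 1, n, 2 * steps) for n in range(steps + 1)]


table = sine_table()

# Emit the same C array text that the Unimal source produces.
print("const int sinewave[] = {")
for val in table:
    print(f"{val},")
print("};")
```

Running the sketch prints the same array initializer as the Unimal output listed earlier, which is a convenient way to sanity-check a generated table before it is compiled into the target.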
As a different example, consider sharing constant parameters across languages. Suppose we need to share a symbolic definition which in an Assembler looks like
MYTHING .equ 17
and in C,
#define MYTHING 17
For the purpose of project maintenance, we want to enter the definition only once (otherwise the copies will eventually diverge). How can this be done if C doesn’t understand Assembler definitions and vice versa?
Let’s have Unimal make an Assembler include file, mything.inc, and a C header file, mything.h, from a single Unimal definition:
Example: Sharing a definition between C and Assembler
#MP Set MYTHING = 17 ;MYTHING is a Unimal name
#MP Export (0) "mything.h"
#define MYTHING #mp%dMYTHING
#MP Export (0) "mything.inc"
MYTHING .equ #mp%dMYTHING
#MP Export (0) ""
Done exporting MYTHING to mything.inc and mything.h
(If you are working along, c-asm.u is the file.)
The first line, since it doesn’t begin with #MP, is considered to be in a target language; in this case it is English. It is sent to the output which, by default, is the standard output stream (stdout).
The second line is an already familiar Set operator. Note the keyword Set (which is optional, as we have seen). As a result, Unimal macro parameter MYTHING is assigned the numeric value 17. This is a common, language-independent definition which we want to be entered only once. Any text from a semicolon to the end of line is a comment and is ignored.
The third line is the Unimal export statement: the argument in parentheses, being a zero, instructs Unimal to overwrite the output file if it exists. The name of the output file is supplied as the second argument, so from the next line on the output is sent to the file mything.h.
Line 4 is a target language interface with a markup telling
Unimal to render the macro parameter MYTHING as a decimal number. The resulting
line,
#define MYTHING 17
is sent to mything.h.
Similarly, line 5 switches output to the file mything.inc, and line
6, transformed to
MYTHING .equ 17
is sent to mything.inc.
Line 7 switches the output again, but this time the file name is an empty string. By convention, it means the default output (stdout), and line 8 is output there.
A simple yet important application of parameter sharing in the embedded world is sharing the microcontroller memory map among the programming language(s), linker command file, and other post-link tools, like program image CRC calculations.
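The single-definition idea can be illustrated outside Unimal as well. The Python sketch below is an emulation only: the function render_exports and its lower-cased file naming are assumptions made for the illustration, not part of Unimal. It renders one value into the two target-language lines discussed above and returns them keyed by file name instead of writing to disk.

```python
def render_exports(name, value):
    """Render one definition as a C header line and an assembler include line.

    Returns a dict mapping a (hypothetical) file name to its contents.
    """
    base = name.lower()
    return {
        f"{base}.h": f"#define {name} {value}\n",
        f"{base}.inc": f"{name} .equ {value}\n",
    }


outputs = render_exports("MYTHING", 17)

# Show what each generated file would contain.
for filename, text in outputs.items():
    print(f"--- {filename} ---")
    print(text, end="")
```

The point of the design is that the number 17 appears exactly once; both generated files are derived from it, so they cannot drift apart.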
Unimal can be very useful in configuring and managing multiple projects in a family.
As an almost trivial example, consider a family of projects with customer-specific features implemented as fragments of code. Using the C/C++ preprocessor, you might write something like this:
#define CUSTOMER CUSTOMER_B
...........................
#if CUSTOMER==CUSTOMER_A
<customer A code>
#endif
#if CUSTOMER==CUSTOMER_B
<customer B code>
#endif
#if CUSTOMER==CUSTOMER_C
<customer C code>
#endif
If, however, you deliver your product in source code, you don’t necessarily want a customer to see what other customers are getting (or who they are or that they even exist). Unimal allows writing a similar thing:
#MP Set CUSTOMER = CUSTOMER_B
...........................
#MP If CUSTOMER==CUSTOMER_A
<customer A code>
#MP Endif
#MP If CUSTOMER==CUSTOMER_B
<customer B code>
#MP Endif
#MP If CUSTOMER==CUSTOMER_C
<customer C code>
#MP Endif
This snippet illustrates the If statement; it works as
intuitively expected. The result of Unimal processing of this fragment will
contain only the code for the selected customer, in this example,
<customer B code>
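The effect on the delivered source can be sketched with a small emulation. The Python function below is a deliberately simplified, hypothetical model, not Unimal itself: it handles only non-nested If/Endif statements of the form NAME==VALUE and compares the values as plain strings. It shows that statement lines and false blocks never reach the output.

```python
def filter_customer_code(lines, params):
    """Keep target-language lines; drop statement lines and false If blocks.

    Simplified model: no nesting, no Else, only 'NAME==VALUE' conditions.
    """
    output = []
    keeping = True
    for line in lines:
        words = line.split()
        if words[:2] == ["#MP", "If"]:
            name, wanted = words[2].split("==")
            keeping = params.get(name) == wanted
        elif words[:2] == ["#MP", "Endif"]:
            keeping = True
        elif keeping:
            output.append(line)
    return output


source = [
    "#MP If CUSTOMER==CUSTOMER_A",
    "<customer A code>",
    "#MP Endif",
    "#MP If CUSTOMER==CUSTOMER_B",
    "<customer B code>",
    "#MP Endif",
    "#MP If CUSTOMER==CUSTOMER_C",
    "<customer C code>",
    "#MP Endif",
]

shipped = filter_customer_code(source, {"CUSTOMER": "CUSTOMER_B"})
print("\n".join(shipped))
```

Unlike the C preprocessor version, the file that ships contains no trace of the other customers' blocks or of the conditions that selected them.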
Unimal can be extremely useful in numerous situations a programming practitioner faces. A few simpler ones are discussed above; others may be mentioned further as we go along.
Cases that are more realistic may be more complex than this introduction is comfortable with; they are covered in separate application notes.
One of the more interesting (and algorithmically complex) applications involves automatic generation of auxiliary data, such as lookup tables, along with the corresponding accessor functions.
However, technically simple applications, like forced loop unrolling, or compacting data representation, are no less useful.
And that’s the whole point of Unimal:
You decide what is to be achieved at build time. You, not bound by limits of your programming language, come up with a conceptual algorithm of realizing your goal. Unimal provides expressive power to implement your algorithm.
An interesting feature of Unimal is support of composite names. They provide functionality of arrays, sparse arrays, and associative arrays (like Perl hashes) in a uniform manner.
As a simple example, consider a disparate set of, say, character strings, which you would like to process later in a loop, so you would like to give them “indexed” names.
The following code does just that:
(1) #MP count = 0
(2) #MP Setstr str%dcount = "foo"
(3) #MP count = count+1
(4) #MP Setstr str%dcount = "bar"
(5) #MP count = count+1
(6) #MP Setstr str%dcount = "baz"
(7) #MP count = count+1
In lines 2, 4 and 6 we see a common construct, str%dcount (and a Setstr operator assigning a string value to a macro parameter). It is a base name (in our case, str), followed by one or more suffixes. A suffix is a format (such as %d) followed by a simple name (count). A base name is a simple name, or it can be empty.
When we arrive at line 2, count=0, and the name in Setstr is rendered by “printing” the suffix, according to the format, as a decimal number, resulting in str0. When we are at line 4, count is 1, so line 4 is equivalent to
#MP Setstr str1 = "bar"
Similarly, line 6 is equivalent to
#MP Setstr str2 = "baz"
As a useful side effect, after line 7, count is the number of strings we enumerated.
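For readers more at home with associative arrays, the composite-name mechanism can be modeled roughly as building a dictionary key from a base name and a formatted counter. The Python sketch below is an analogy only; the dictionary params is a hypothetical stand-in for Unimal's table of macro parameters.

```python
params = {}  # hypothetical stand-in for the macro parameter table
count = 0

for text in ("foo", "bar", "baz"):
    # Base name "str" plus the suffix: count rendered in decimal (the %d format).
    params[f"str{count:d}"] = text
    count = count + 1

print(params)
print("number of strings:", count)
```

Because the key is just rendered text, the same pattern covers dense arrays, sparse arrays and hash-like lookups without any separate array type.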
Unimal allows shaping this in a prettier and more maintainable way by using its macro facility.
In general, our definitions will have some initialization (like line 1 above), actual definition statements and perhaps some post-processing stuff as well. So, a typical data definition would look like
#MP Expand BeginData()
#MP Expand DefineData("foo")
#MP Expand DefineData("bar")
#MP Expand DefineData("baz")
#MP Expand EndData()
The Unimal operator Expand expands a named macro with arguments passed in a comma-separated list in parentheses, which may be empty. A macro name can be the name of a previously defined macro or a string expression which resolves to a name of a previously defined macro.
The keyword Expand is optional and may be omitted without restrictions. Empty parentheses may be omitted, too. So, the text above may be rewritten as
#MP BeginData
#MP DefineData("foo")
#MP DefineData("bar")
#MP DefineData("baz")
#MP EndData
It is purely a matter of style to choose one look over the other.
Let’s define the macros applicable to this example:
(1) #MP Macro BeginData ;()
(2) #MP count = 0
(3) #MP Endm
(4) #MP Macro DefineData ;(string)
(5) #MP Setstr str%dcount = #1#
(6) #MP count = count+1
(7) #MP Endm
(8) #MP Macro EndData ;()
(9) #MP NumStrings = count
(10) #MP Undef count {NUM}
(11) #MP Endm
A macro definition begins with the keyword Macro, followed by a simple name under which the macro will be known. It ends with the keyword Endm. Any lines in between are the macro body, which is substituted for the Expand operator. In the Unimal text above, lines 1, 4, 8 are beginnings of macro definitions, lines 3, 7, 11 are their respective ends, and lines 2, 5-6, 9-10 are the corresponding bodies.
Note that the number and types of arguments are not part of a macro definition; so if you write a macro expecting certain parameters, it is a good idea to indicate that in a comment.
In the second macro body, we encounter the expression #1#; it is a way to name the first argument passed to the macro.
In the third macro body, just for the sake of illustration, we save the number of strings defined and undefine the numeric (NUM) value of count. If count had a string and/or a macro value, they would still be defined. That is to say, a Unimal macro parameter can have a numeric, a string, and a macro value at the same time (one might say they belong to different namespaces; which value is used depends on the context).
We can test our macros by printing the enumerated strings (see enums.u):
We enumerated #mp%dNumStrings strings:
#MP For i=0, NumStrings -1
#mp%di. "#mp%sstr%di"
#MP Endfor
Here is the output:
We enumerated 3 strings:
0. "foo"
1. "bar"
2. "baz"
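The Begin/Define/End pattern and the test loop can be mirrored in Python to make the data flow explicit. This is an emulation under stated assumptions: the dictionary state is a hypothetical stand-in for Unimal's global macro parameters, and the three functions only imitate what the macro bodies above do.

```python
state = {}  # hypothetical stand-in for Unimal's global macro parameters


def begin_data():
    """Mirror of BeginData: reset the counter."""
    state["count"] = 0


def define_data(text):
    """Mirror of DefineData: store the argument under an indexed name."""
    name = f"str{state['count']}"
    state[name] = text
    state["count"] += 1


def end_data():
    """Mirror of EndData: save the total, then undefine the counter."""
    state["NumStrings"] = state.pop("count")


begin_data()
for text in ("foo", "bar", "baz"):
    define_data(text)
end_data()

# Equivalent of the test loop: For i=0, NumStrings-1 (bounds inclusive).
report = [f"We enumerated {state['NumStrings']} strings:"]
for i in range(state["NumStrings"]):
    name = f"str{i}"
    report.append(f'{i}. "{state[name]}"')

print("\n".join(report))
```

Note that the callers never touch the counter directly; keeping that bookkeeping inside the three helpers is what makes the definition list easy to extend.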
Typically, a definition sequence Begin/Define/End is meant to process the Define type statements more than once, i.e. in a loop. This requires wrapping the For statement in the Begin macro and the Endfor statement in End. Unimal allows this spanning of a loop (as well as of If/Else/Endif compounds) across several macros.
As a simple example, consider defining an initialized array foo of some objects in a C file foo.c, and publishing their indices in a header foo.h.
This is what it might look like:
#MP Expand BeginArray("foo")
#MP Expand ArrayEntry("MYINDEX", "myObject")
#MP Expand ArrayEntry("YOURINDEX", "yourObject")
#MP Expand ArrayEntry("HISINDEX", "hisObject")
#MP Expand ArrayEntry("HERINDEX", "herObject")
#MP Expand EndArray()
BeginArray takes the base name of a file (before extension); it is also the name of the array to generate. ArrayEntry takes a symbolic name of the index and an object expression.
Here are the macro definitions with explanations that follow (see indices.u):
#MP Macro BeginArray ;(basename)
#MP Setstr suffix0 = ".c"
#MP Setstr suffix1 = ".h"
#MP Export Push
#MP For pass = 0,1
#MP Export (0) {uJoin, #1#, suffix%dpass}
#MP count = 0
#MP If pass == 0
ob_type #mp%s#1#[] = {
#MP Endif
#MP Endm
#MP Macro ArrayEntry ;(index_name, object_name)
#MP If pass == 0
#mp%s#2#,
#MP Endif
#MP If pass == 1
#define #mp%s#1# #mp%dcount
#MP Endif
#MP count = count + 1
#MP Endm
#MP Macro EndArray ;()
#MP If pass == 0
};
#MP Endif
#MP Undef suffix%dpass {STR}
#MP Endfor
#MP Undef count {NUM}
#MP Undef pass {NUM}
#MP Export Pop
#MP Endm
BeginArray first defines two strings, suffix0 and suffix1, to be the extensions of the files we are going to make. Second, the