FSMLang

Current revision is 1.42

Philosophy

Back to TOC

FSMLang is designed to allow design work in the problem space of finite state machines without the encumbrances of any particular implementation language. Thus, FSMLang is implemented as a "pre-processor," generating code in any desired general programming language to implement the described finite state machine. FSMLang allows effort to be focused on the definition of events, states, and transitions; indeed, though the action to be taken in any particular event/state intersection is declarable (of course), the actual definition of that action is treated as a detail which falls outside the scope of FSMLang. Moreover, the mechanisms for collecting or determining events are also outside the language scope. FSMLang creates an object or objects in a target programming language which, when inserted into the larger program structure will invoke the correct actions and make the correct transitions for the events handed it.

(Though it is said, "any desired general programming language," implementation of FSMLang output in languages other than C is an exercise currently left to the reader.)

The created state machine contains a single state variable, which should not be manipulated by any user-written function. This variable is maintained in the RW data area, not on the machine's function call stack. This means that the machine must not be called recursively; neither from within any action function, nor from separate threads of execution. The keyword reentrant can be used to designate machines which are called from different execution threads. Macros will be inserted at the beginning and end of the state machine function which must be defined to properly protect the machine from re-entrance.

Example: Simple State Machine

Back to Philosophy

As an example, consider a simple communications protocol which specifies that an acknowledgement must be received for each message before another may be sent. The sender of messages in such a protocol must have at least two states: In the first state, which will be called IDLE, the sender has sent no message. In the second, which will be called AWAITING_ACK, the sender has sent a message and is awaiting an acknowledgement. The events which this automaton will see are SEND_MESSAGE, the request by a client entity for a message to be sent, and ACK, the receipt of the acknowledgment from the peer automaton.

The valid actions, then, are to send a message if one is requested, and the automaton is in the IDLE state,and to simply return to the IDLE state if an ACK is received while in the AWAITING_ACK state. Sending a message requires a transition to the AWAITING_ACK state. The receipt of an acknowledgement while in the IDLE state represents a protocol error which may be safely ignored. A request to send a message while in the AWAITING_ ACK state, however, must result in the message being queued for later transmission.

Using FSMLang, this machine can be described this way:

/** This machine manages communications using a "stop and wait" protocol. Only one message is allowed to be outstanding. */ machine simpleCommunicator { state IDLE, AWAITING_ACK; event SEND_MESSAGE, ACK; /** Since we're idle, we can simply send the message. Transitioning to the AWAITING_ACK state ensures that any other messages we're asked to send will be queued. */ action sendMessage[SEND_MESSAGE,IDLE] transition AWAITING_ACK; /** Since we're busy, we must queue the message for later sending. The queue will be checked when the ACK is received. */ action queueMessage[SEND_MESSAGE,AWAITING_ACK]; /** We've received our ACK for the previous message. It is time to check for any others. */ action checkQueue[ACK,AWAITING_ACK] transition IDLE; /* these lines are informational; they affect the html output, but do not affect any code generated. */ /** queueMessage adds a message to the queue */ queueMessage returns noEvent; /** sendMessage sends a message from the queue. The message is expected to be there, since checkQueue will have been previously called. */ sendMessage returns noEvent; /** checkQueue only checks; it does not dequeue; that is done by sendMessage. Return SEND_MESSAGE when the queue is not empty. */ checkQueue returns SEND_MESSAGE, noEvent; }

(When no transition is specified, the machine remains in the state it was in when the event occured. And, a comma seperated list of events or states enclosed within parentheses may be used in place of any single event or state designation; in which case, the action specifed will be taken in the "cross product" of the two (event and/or state) vectors so described.)

These are the header files created

This is the source file created

This is the HTML created

The action functions themselves are not generated; instead, they are written by the machine implementor. This allows FSMLang to remain focused on the state machine, rather than the details of the implementation of its purpose for existing.

Events are the life-blood of the Finite State Machine. The machines created by FSMLang depend on mechanisms created by the user to provide these events "from the outside world." However, by default, the FSMLang core FSM function expects action functions to return an event. This is a powerful technique for the state machine to immediately feed events to itself, simplifying, for example, the management of internal errors. Imagine an action function which must allocate some memory to complete its task; should that memory not be available, the function can simply exit, returning a "memory not available" event; the state machine processing for this situation can be placed in another action function, and the machine can be transitioned to an error state, should the situation be unrecoverable. The core FSM function loops, calling appropriate actions, then making designated transitions, until an action function returns noEvent.

When a reentrant machine is necessary, that is, when a machine must be called both from interrupt and non-interrupt contexts, some care should be taken to ensure that the machine will not loop excessively, since interrupts will be blocked until the core FSM function exits.

Hierarchical

Back to Philosophy

A more complex example shows how FSMLang treats hierarchical state machines. We reuse the Simple Communicator from the first example, but add the requirement to establish a session with the peer before any message is sent. One way to do this is to have a top-level state machine with two sub state machines: One machine to establish the session; the other to send the messages, again with the requirement that only one message can be in transit at any time.

The fsm code describes the machines in this way. Note the similarity of the sendMessage sub machine to the simpleCommunicator, above.

/**

This machine manages communications using a "stop and wait" protocol. Only one message is allowed to be outstanding.

Before any message can be exchanged, however, a session must be established with the peer. Establishing a connection requires several exchanges to authenticate. The connection will remain active as long as messages continue to be exchanged with a minimum frequency.

*/ machine hsmCommunicator { state IDLE, ESTABLISHING_SESSION, IN_SESSION; event SEND_MESSAGE, SESSION_ESTABLISHED, SESSION_TIMEOUT; /** Establish a connection with the peer. */ machine establishSession { state IDLE, AWAITING_RESPONSE; event ESTABLISH_SESSION_REQUEST, STEP0_RESPONSE, STEP1_RESPONSE; /** Start the session establishment process. */ action sendStep0Message[ESTABLISH_SESSION_REQUEST, IDLE] transition AWAITING_RESPONSE; /** Continue session establisment */ action sendStep1Message[STEP0_RESPONSE, AWAITING_RESPONSE]; /** Notify parent that session is established. */ action notifyParent[STEP1_RESPONSE, AWAITING_RESPONSE] transition IDLE; /* these lines are informational; they affect the html output, but do not affect any C code generated. */ sendStep0Message returns noEvent; sendStep1Message returns noEvent; notifyParent returns parent::SESSION_ESTABLISHED; } machine sendMessage { state IDLE, IN_SESSION, AWAITING_ACK; event SEND_MESSAGE, ACK, SESSION_ESTABLISHED, SESSION_TIMEOUT; /** Transmit message to the peer. */ action sendMessage[SEND_MESSAGE,IN_SESSION] transition AWAITING_ACK; /** Place message into queue. */ action queueMessage[SEND_MESSAGE,(IDLE, AWAITING_ACK)]; /** Check queue for messages; if found pop from queue and return SEND_MESSAGE. If no message is found in the queue return noEvent. */ action checkQueue[ACK,AWAITING_ACK] transition IN_SESSION , checkQueue[SESSION_ESTABLISHED, IDLE] transition IN_SESSION; transition [SESSION_TIMEOUT, (IN_SESSION, AWAITING_ACK)] IDLE; /* these lines are informational; they affect the html output, but do not affect any C code generated. */ queueMessage returns noEvent; sendMessage returns noEvent; checkQueue returns SEND_MESSAGE, noEvent; } /* these are actions of the top level machine */ /** Start the session establishment process by activating the establishSession machine. */ action startSessionEstablishment[SEND_MESSAGE, IDLE] transition ESTABLISHING_SESSION; /** Start the session timer and notify the sendMessage machine that the session is established. */ action completeSessionStart[SESSION_ESTABLISHED, ESTABLISHING_SESSION] transition IN_SESSION; /** Notify the sendMessage machine that the session has timed-out. */ action notifySessionTimeout[SESSION_TIMEOUT, (ESTABLISHING_SESSION, IN_SESSION)] transition IDLE; /** Extend the session timer and pass the message to be sent to the sendMessage machine. */ action requestMessageTransmission[SEND_MESSAGE, (ESTABLISHING_SESSION, IN_SESSION)]; /* these lines are informational; they affect the html output, but do not affect any C code generated. */ startSessionEstablishment returns SEND_MESSAGE; completeSessionStart returns sendMessage::SESSION_ESTABLISHED; notifySessionTimeout returns sendMessage::SESSION_TIMEOUT; requestMessageTransmission returns sendMessage::SEND_MESSAGE; }

The top-level machine's event enumeration contains the events of all machines so that any event can be passed to the machine. Each sub-machine has its own enumeration, but these enumerations do not start at 0. Rather, they are arranged so that each noEvent event has the same numeric value as the corresponding event in the top-level enumeration. In this way, sub-machines are easily selected based on their numeric event range.

The FSM function of a sub-machine returns a top-level machine event, providing the mechanism by which the sub-machines communicate with the top-level machine. The return value of the sub-machine FSM function is the event returned by a sub-machine action function when that event is not a sub-machine event. When the sub-machine action returns the sub-machine's noEvent that event is translated into the top-level machine's noEvent.

FSMLang does not provide macros for sub-machines to indicate that their action functions return an event from another sub-machine. Rather, sub-machine actions can return events from their own machine, or from their parent. This is by design. The top-level machine should provide actions when necessary to bridge between the activation of different sub-machines. This, of course, is not a rule, since the event enumeration is available and members can be written out by hand. But, thus chaining together sub-machines violates the "highly coherent, loosely coupled" principle, which principle has value even within a hierarchical state machine.

That being said, it is possible for a sub-machine to designate any event as being "from the parent" (event parent::e1, for example). The name, of course, must be that of an event actually declared in the parent. Moreover, by also giving the name of a data translation function (e.g. event parent::e1 data translator dt1), sub-machines can update their local data from the parent. FSMLang generates code for the parent to call the submachine when a shared event occurs. If a data translator is given, it will be called before the sub-machine is invoked. More than one sub-machine can share a parent event; the machines will be called in the order they are defined. The loop will be exited when a machine returns anything other than noEvent, so that the parent can then handle that event. Once one sub-machine does not return noEvent, no other sub-machine will have a chance to see the event.

It is possible, that the parent would want to inhibit the operation of the sub-machines. state s1 inhibits submachines will do just that; no submachines will be run by FSMLang in that state.

The html file created for the top-level machine contains a list of sub-machines. The list is hyper-linked to the html file for the respective machine.

The -ts flag must always be used to generate code due the way the event enumerations are generated for sub-machines. This limitation may be removed in later releases.

These are the header files created

These are the source files created

These are the html files created

Note that the -tp option was used to create PlantUML output, which was then processed by plantuml to produce an SVG image. The html was created using the --include-svg-img=true option to include the image in the output.

An unrealized goal of the FSMLang effort is to optimize the output machine for speed and size based on an analysis of the event-state matrix. There are command-line switches which force the creation of a compact table, or the use of switch statements instead of a table, but these are manual. One should be able to make those desicions based on the density of the event-state matrix. It may also be possible, using matrix transformations to make some part of the matrix very dense, then use a hybrid table/switch approach in the code.

State Entry and Exit Functions

Back to Philosophy

Entry and exit functions may be attached to states. For example: state some_state on entry prepare_some_state; Adds a named entry function to some_state.

Similarly, state some_state on exit tidy_up_some_state; Adds tidy_up_some_state as a named exit function to some_state

When entry or exit functions exist, code will be generated to call them appropriately. Names given in the .fsm file are prepended with the name of the machine. "Anonymous" entry or exit functions can be declared simply by ommiting the names. In this case names will be generated using the pattern <machine_name>_onEntryTo_<state_name;> for entry functions, and similarly, using onExitFrom for exit functions. When weak function generation is not disabled, weak versions of these anonymous functions are created.

Because they are powerful, entry and exit functions can be mis-used. As the names chosen here suggest, they should be used only to prepare a state or to leave it in a tidy condition. They should not be used as a substitue for the creation of sub-machines, or well-thought out action sequences.

Event Data

Back to Philosophy

Events may be declared to have associated data. For example:

event data_packet_arrived data { unsigned length; uint8_t *packet; };

When any event is declared with data, FSMLang shifts the event declaration from a simple enumeration to a structure containing the event enumeration and the union of the event data structures. This structure becomes the method for passing events in from the outside world into the state machine.

Continuing the example, event data_packet_arrived will cause the declaration of this structure for the event's data:

typedef struct _hsmCommunicator_data_packet_arrived_ { unsigned length; uint8t_t *packet; } HSM_COMMUNICATOR_DATA_PACKET_ARRIVED_DATA, *pHSM_COMMUNICATOR_DATA_PACKET_ARRIVED_DATA;

The event data union is declared:

typedef union { HSM_COMMUNICATOR_DATA_PACKET_ARRIVED_DATA data_packet_arrived_data; . . . } HSM_COMMUNICATOR_EVENT_DATA, *pHSM_COMMUNICATOR_EVENT_DATA;

Finally, the machine's event enumeration will be named <MACHINE_NAME>_EVENT_ENUM, and a structure, containing the event enumeration and the event data union will be created:

typedef struct { HSM_COMMUNICATOR_EVENT_ENUM event; HSM_COMMUNICATOR_EVENT_DATA event_data; } HSM_COMMUNICATOR_EVENT, *pHSM_COMMUNICATOR_EVENT;

As indicated above, this structure is used to communicate external events to the state machine. It is not used to communicate events internally. Internally, events are communicated only through the event enumeration, as is done by machines not having events with data. Thus, action functions are declared as returning HSM_COMMUNICATOR_EVENT_ENUM, rather than HSM_COMMUNICATOR_EVENT. Because of this, the data communicated by the (external) event must be moved into the machine's data structure, in order to be visible to the action functions and to sub-machines' data translators.

The movement of data is done through a invocation of a data translator. The translater is named in the event declaration:

event data_packet_arrived data translator copy_data_packet;

When a data translator is indicated but not named, a translator function will be declared as <machine_name>_translate_<event_name>_data.

In either case, the function argument is a pointer to the parent's data structure.

Using the State Machine

Back to TOC

C Files Produced

Back to Using the State Machine

As has been illustrated, when C language output is requested, a header/source file pair is created for each machine in the input file. The base header file names end with the string "_priv" to emphasize that they should be included only by the files implementing the various action, transition and other functions required by the machine itself. For any machine having submachines, a header with a name ending with "_submach" is created to communicate material which must be known by the submachines. This header is included in the submachine's header, and so can be seen as transparant. Finally, an additional header file for each top-level machine is created for use by other application files so that the machine may be invoked as needed in the application being created. For the top-level machine, base file names start with the base name of the input file. For any sub-machines, base file names are formed from the sub-machine names.

The source files contain all of the functions and structures necessary for the FSM itself, in addition to weak (and mostly empty) implementations of any functions expected to be created by the machine's designers. Structure definitions needed only by the FSM itself are placed in the source file. The private headers contain only those definitions needed by user functions.

The header file design is intended to keep the project's global name space free from structure and function declarations needed only by the machine. Nevertheless, since C does not support name-spaces, the machine ancestry must be prepended to many user-declared function names. Deeply nested machines can result in very long function names, prompting the creation of the convenience macros discussed later.

Adding Code to Generated Source and Header Files

Back to Using the State Machine

Declaring machine or event data often requires the inclusion of header files which define data types used. This is done using the native keyword:

native { #include <stdio.h> #include "project_header.h" #define DBG_PRINTF(...) printf(__VA_ARGS__); printf("\n"); #include "start_extern_c.h" }

Text in the native block is parsed by FSMLang in order to find the closing brace. This means that any braces which appear in the block must be balanced. The alert reader will guess that the file, start_extern_c.h contains

#ifdef __cplusplus extern "C" { #endif

This unbalanced brace would cause a parsing error if put directly in the .fsm file.

When machines have sub-machines, multiple header files are created so the native block is protected from repeated compilation.

It is also possible to have code included at the end of any generated header:

native epilogue { #include "end_extern_c.h" }

Even though this block will go at the end of generated headers, it must be placed before the machine is defined.

For the sake of symmetry, there is a keyword, prologue. But, since its use is optional, it can be said to do nothing other than to make clear which native block will be placed at the beginning of the header files.

Blocks can also be declared for inclusion at the top or bottom of the generated C source files using the keyword impl(ementation)?. Such blocks are placed after the machine name, but before the opening curley-brace.

machine some_machine native impl { #include "start_extern_c.h" } native implementation epilogue { #include "end_extern_c.h" } { /* machine declarations */ }

implementation can be shortened to impl, but no other variants are recognized. As with the native blocks, the optional keyword prologue may be used.

Writing User Functions

Back to Using the State Machine

Since the most prevelant user functions written are action functions, the files which contian their definitions are often named <machineName>_actions.c. This file must include <machineName>_priv.h. Several convenience macros are provided which facilitate the writing of function signatures and return values, as illustrated below using the establishSession machine.

Macro Definition Example
ACTION_RETURN_TYPE #define ACTION_RETURN_TYPE HSM_COMMUNICATOR_EVENT ACTION_RETURN_TYPE hsmCommunicator_establishSession_sendStep0Message(pESTABLISH_SESSION);

When events do not carry data, they are a simple enumeration; however, when they do carry data, events being handed to the state machine are unions, but the values returned by action functions are the event enumeration. Using the ACTION_RETURN_TYPE macro facilitates adding data to events without changing action function signatures.

FSM_TYPE_PTR #define FSM_TYPE_PTR pESTABLISH_SESSION ACTION_RETURN_TYPE hsmCommunicator_establishSession_sendStep0Message(FSM_TYPE_PTR);

Using FSM_TYPE_PTR facilitates shifting user function definitions to new machines, as when tweaking sub-machine design.

THIS(A) #define hsmCommunicator_establishSession_##A Best used to express event return values: return THIS(ESTABLISH_SESSION_REQUEST);

Writing the name prefix can be tedious. The THIS macro relieves that. As an added benefit, should the process of designing the FSM be iterative (when is designing code done without at least some iteration?), using THIS for return statements facilitates shifting user function definitions to new machines, as when tweaking sub-machine design.

PARENT(A) #define hsmCommunicator_##A Best used to express event return values: return PARENT(SESSION_ESTABLISHED);

Writing the name prefix can be tedious. The PARENT macro relieves that. As an added benefit, should the process of designing the FSM be iterative (when is designing code done without at least some iteration?), using PARENT for return statements facilitates shifting user function definitions to new machines, as when tweaking sub-machine design.

UFMN(A) #define UFMN(A) hsmCommunicator_establishSession_##A

Though apparently the same as THIS, UFMN can morph to:

#define UFMN(A) estabishSession_##A

when --short-user-fn-names=true is used.

This macro is intended to provide the same benefits as the use of THIS and PARENT to the definition of user functions themselves. Wrapping the function name with UFMN allows movement of user functions between machines without re-typeing their names. The example below is the most portable way to express the sendStep0Message action function's signature:

ACTION_RETURN_TYPE UFMN(sendStep0Message)(FSM_TYPE_PTR);

Invoking the State Machine

Back to Using the State Machine

To use the state machine in C code, include the generated top-level header file, and use the provided run_<machineName>

Using the Simple Communicator as an example, this line runs the machine with the ACK event:

run_simpleCommunicator(simpleCommunicator_ACK);

This would be placed in the function which receives the message. Since such functions are often IRSs, it must be remembered that the machine is not reentrant. Invocations of the machine outside of this ISR context must be properly protected from this interrupt. The needed protection can be facilitated through the use of the reentrant keyword to modify the machine declaration. Or, it may be done by simply wraping the invocation with interrupt protection.

Expoiting make

Back to Using the State Machine

Three command line options are provided specifically to exploit the power of make for FSMLang generated code. These options are controlled by the -t selection, which must appear, and must appear afterward.

-M Produces a list of the source files which would be created.
-Mh Produces a list of the header files which would be created.
-Md

Produces a recipe which specifies how to create the generated files from the .fsm source:

machine.c submachine1.c submachine2.c subsubmachine1.c: machine.fsm $(FSM) $(FSM_FLAGS) $<

Commenting and Graphical Output

Back to TOC

As shown in the examples above, "document comments" can be added to the FSMLang file. In addition to illuminating the FSMLang file itself, these comments are used in the HTML and PLANTUML output. Starting with Release 1.40 document comments are processed somewhat differently than in earlier releases.

The most visible change is that document comments appearing before action or transition matrices now adhere to the matrix. Formerly, such comments would adhere either to the action, or to the first event in the matrix, in the case of a transition only matrix. (This last circumstance was not really desireable, of course.) By adhering to the matrix, the comment addresses the involved event and state vectors (along with any associated transition), rather than the action. Previously, there was no way to make such a comment.

With this change, the comments are now used differently in the HTML output, and used for the first time in the plantuml output.

In the HTML, the comments are placed in the relevant cell of the event/state table. If the matrix is a cross-product of non-trivial event and state vectors, each cell in that product will contain the comment. If this is not desired, the vectors can be broken up into trivial vectors (i.e. individual event, state pairs), each being given the appropriate comment.

In the plantuml, these comments are used as notes on link, appearing next to the line linking two states by the transition.

HTML formatting can be controlled by providing an alternate style sheet (--css-content-filename=<filename> ). To provide independence to the HTML output, the stylesheet can be included directly into the HTML document (--css-content-internal=true).

The generated plantuml can be altered in several ways: The machine name can be set as the diagram title ( --add-plantuml-title=true). Strings may be given which will be placed after the opening @plantuml, but before any FSMLang generated content (--add-plantuml-prefix-string=<text>). And, whole files can be named which will be similarly placed (--add-plantuml-prefix-file=<filename>). The last two options can be used any number of times; the text or files will be added in the order given; but all strings will be added before any files. Finally, --add-plantuml-legend=true will add a legend, which, by default, will contain the names and associated document commnents for all events, states, and, actions of the machine. Any of these three may be excluded by --exclude-[events|states|actions]-from-plantuml-legend=true.

Making the fsm Executable

Back to TOC

The source is in a git repository at https://github.com/FSMLang/FSMLang.

The simplest way to make the executable is use a Linux installation. Create a build directory at the same level as ~/src. From that directory type

cmake .. -G "Unix Makefiles"

to create the necessary Makefile. Then, simply invoke make with no target.

To run the tests, set OUTPUT_DIR to the build directory and export it. Then, from ~/src type

make Linux.testonly

To build on Windows, without WSL, install the MinGW-64 toolset. Then, follow the instructions above using a gitbash shell. The cmake invocation must be with -G "MinGW Makefiles" in this case.

The tests do not execute properly in the MinGW environment.

The test target is supported by a legacy build system (see the Makefile in ~/src). That system is being abandoned in favor of the cmake system.

simpleCommunicator.fsm and hsmCommunicator.fsm, shown in this page, are at the top of the tree.

Command Syntax

Back to TOC

Basic Language Syntax

Back to TOC

The example machine above illustrates the major language features and syntax. There are five basic keywords: machine, state, event, action, transition. A sixth keyword, returns, was shown in the example in its use as an informational element for actions, but is not further discussed here. Identifiers follow the C rules, as do declarations, which may be comma seperated, as in the state and event declarations above, or set seperately delimited by semi-colons as are all of the action declarations. State and event declarations may be interspersed, but all must come before the first action declaration. Naming scope is within a given machine; that is, on the one hand, all states and events must be declared within a machine in order to be used by the machine, and on the other, different machines may use the same names for states and events because they will apply only within that machine definition.

The action declaration is :

action action_identifier[event_vector,state_vector];

Or,

action action_identifier[event_vector,state_vector] transition state_identifier;

Where event_vector is

event_identifier

or,

(event_identifier_list)

with event_identifier_list being

event_identifier

or

event_identifier, event_identifier_list.

The analogous definition holds for state_vectors.

It is also possible to declare an actionless transition:

transition[event_vector, state_vector] state_identifier;

More examples

There are many files with the .fsm extension included in the distribution, most found under the test directory. All illustrate valid constructions. Moreover, the tests starting with full_test create executables with action functions supplied. The effects of the use of other keywords not discussed here can also be seen, such as:

Finally, the test makefiles show how to add .fsm targets to a makefile.