Decomposition and encoding of finite state machines for FPGA implementation
Slusarczyk, A S

DOI:
10.6100/IR582591

Published: 01/01/2004

Document Version
Publisher's PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:

• A submitted manuscript is the author's version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper including the volume, issue and page numbers.

Link to publication

Citation for published version (APA):
Eindhoven: Technische Universiteit Eindhoven DOI: 10.6100/IR582591

General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
• You may not further distribute the material or use it for any profit-making activity or commercial gain
• You may freely distribute the URL identifying the publication in the public portal

Take down policy
If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.
Decomposition and Encoding of Finite State Machines for FPGA Implementation

PROEFSCHRIFT

ter verkrijging van de graad van doctor aan de
Technische Universiteit Eindhoven,
op gezag van de Rector Magnificus, prof.dr. R.A. van Santen,
voor een commissie aangewezen door het College voor
Promoties in het openbaar te verdedigen op
woensdag 15 december 2004 om 14.00 uur

door
ALEKSANDER STANISLAW ŚLUSARCZYK
geboren te Warschau, Polen
Summary

One of the most important steps in the synthesis of digital systems is the state encoding of Finite State Machines (FSMs). This step performs translation of the symbolic functional description of the system’s architecture module into its binary representation. Therefore, it has profound influence on all quality aspects of the final binary logic implementation. Unfortunately, despite many research efforts, the problem of effective state encoding has not been so far satisfactorily solved for most implementation technologies.

This is in particular the case for the modern programmable implementation platforms, such as Field Programmable Gate Arrays (FPGAs), reconfigurable systems-on-chip (SoC) platforms and embedded FPGAs. These platforms possess unique characteristics, such as the availability of universal function blocks only limited by their number of inputs and outputs and the critical importance of interconnections, that have not been adequately addressed by the existing state encoding methods.

This thesis proposes and discusses a new approach to the problem of state encoding of FSMs targeting FPGA implementation and new method and electronic design automation tool that implements this approach. The proposed approach is based on FSM decomposition. Decomposition is a natural method of handling complex problems. It consists in splitting a complex problem into smaller, interrelated sub-problems and solving the network of sub-problems. Applied to FSMs, decomposition consists in realizing the behavior of a complex FSM as a network of smaller, collaborating FSMs.

To form sound theoretical base for the research on decomposition and encoding, this thesis presents extension of the existing General Decomposition Theorem to the case of multi-state realization of incompletely specified FSMs. This theorem covers as special cases all other known decomposition structures, such as serial and parallel decomposition, and the decomposition of discrete functions and relations. The theorem explicitly treats state and input information flows within the decomposed machine or function and thus supports the novel, information-driven approach to decomposition.

In this thesis, the problem of state encoding is also considered as a special case of general decomposition. This way, the extended decomposition theorem forms the basis for the formulation of flexible conditions for valid state encoding that enable implicit minimizations of the machine in the process of encoding and, in consequence, its efficient implementation. Based on the encoding validity conditions, a generic, implementation-platform-independent state assignment method is proposed.

This generic method is further augmented with the heuristics targeting efficient FPGA implementation. The heuristics exploit the information modeling apparatus of set systems used in formulation of the General Decomposition Theorem and analysis apparatus of information relationships and measures to perform analysis and optimiza-
tion of the information flows within the encoded machine during the encoding process.
The heuristics were implemented in the software tool SECODE, which is discussed in detail.

The effectiveness of the decompositional, information-based approach to state encoding is analyzed by testing the developed encoding method and tool on a set of standard benchmarks and a number of generated FSMs exhibiting characteristics typical to the circuits encountered in various industrial applications. The presented experimental results indicate that the new approach proposed and researched in this thesis compares favorably with the existing state encoding methods.
Samenvatting

Een van de meest belangrijke stappen in de synthese van digitale systemen is toestandcodering van Finite State Machines (FSM’s). Deze stap realiseert de vertaling van de symbolische functionele beschrijving van de systeemarchitectuurmodule naar de binaire realisatie daarvan. Daarom heeft het een diepgaande invloed op alle kwaliteitsaspectsen van de definitieve implementatie van binaire logica. Helaas is ondanks vele onderzoekspogingen het probleem van de effectieve toestandcodering voor de meeste implementatie-technologieën tot nu toe nog niet naar tevredenheid opgelost.

Dit is met name het geval voor de moderne programmeerbare implementatie platforms, zoals Field Programmable Gate Arrays (FPGA’s), aangepaste systems-on-chip (SoC) platforms en ingebouwde FPGA’s. Deze platforms bezitten unieke eigenschappen zoals de beschikbaarheid van universele functieblokken die slechts beperkt zijn door het aantal ingangen en uitgangen en het kritische belang van onderlinge verbindingen die door de bestaande toestandcoderingsmethodes nog niet goed genoeg geadresseerd zijn.

Dit proefschrift introduceert en behandelt een nieuwe aanpak van het probleem van toestandcodering van FSM’s zich richtend op FPGA implementatie evenals een nieuwe methode en tool voor elektronische design automatisering die deze aanpak implementeren. De voorgestelde benadering is gebaseerd op FSM decompositie. Decompositie is een natuurlijke methode om met complexe problemen om te gaan. Het is gebaseerd op het delen van een complex probleem in kleinere, met elkaar in verband staande subproblemen en het oplossen van het netwerk van subproblemen. Toegepast op FSM’s is decompositie gebaseerd op het realiseren van het gedrag van een complex FSM als een netwerk van kleinere samenwerkende FSM’s.

Om een grondige theoretische basis voor het onderzoek naar decompositie en codering te leggen, introduceert dit proefschrift een uitbreiding van het bestaande General Decompositie Theorema voor wat betreft de multi-toestandrealisatie van incompleet gespecificeerde FSM’s. Dit theorema dekt als bijzondere gevallen alle andere bekende decompositiestructuren zoals seriële en parallelle decompositie en decompositie van discrete functies en relaties. Dit theorema behandelt toestands- en ingangsinformatiestromen binnen de gedesintegreerde machine of functie en ondersteunt op deze manier de nieuwe informatiegerechte aanpak voor decompositie.

In dit proefschrift wordt het probleem van toestandcodering ook als een bijzonder geval van algemene decompositie beschouwd. Op deze manier vormt het uitgebreide decompositie theorema de basis voor de formuleren van flexibele condities voor geldige toestandcodering die geïmpliceerde minimalisaties van de machine in het proces van codering en, als gevolg daarvan, de efficiënte implementatie daarvan mogelijk maken. Gebaseerd op de coderingsvaliditeitsvoorwaarden wordt een generieke implementatie-
platform-onafhankelijke toestandstoewijzingmethode voorgesteld.

Deze generieke methode wordt verder uitgewerkt door het ontwikkelen van de heuristische beslissingsmechanismen die zich richten op de efficiënte FPGA-implementatie. Deze heuristische beslissingsmechanismen maken gebruik van informatie-modelleringsgereedschap van set systemen dat in de formulering van het General Decomposition Theorema gebruikt wordt en tevens van analyse-gereedschap van informatierelaties en maten om analyse en optimalisering van de informatiestromen binnen de gecodeerde machine gedurende het coderingsproces ten uitvoer te brengen. De heuristische beslissingsmechanismen werden in de software tool SECODe geïmplementeerd; deze wordt in detail besproken.

De doeltreffendheid van de decompositional, informatiegebaseerde aanpak van toestandcodering wordt geanalyseerd door het testen van de ontwikkelde coderingsmethode en coderingsool op een verzameling standaard benchmarks en een aantal gegenereerde FSM’s die eigenschappen vertonen die typisch zijn voor de schakelingen waarmee men in diverse industriële toepassingen geconfronteerd werd. De nieuwe, in dit proefschrift voorgestelde en onderzochte aanpak levert betere resultaten op dan de reeds bestaande toestandcoderingsmethodes.
Acknowledgments

I would like to thank everyone who contributed to the work presented in this thesis.

I would like to thank prof. Mario Stevens, who sadly is no longer amongst us, for giving me the opportunity to carry out this research. I will remember the years in his CND group as an extremely stimulating and interesting period of my life. I also wish to thank prof. Ralph Otten for his assistance in the later stages of my work in ICS group and on this thesis.

My warmest thanks go to the co-promotor of this thesis, dr. Lech Jóźwiak, for his continuous support and for sharing his insight and enthusiasm during our countless discussions.

A special thanks is also directed to my fellow researchers – Artur Chojnacki, Szymon Biegański and Dominik Gawłowski – for the time we spent together at work and at leisure. I would also like to thank all members of CND and ES group, and especially the “coffee room club”, for all the exchanges of views we had on all topics imaginable.

Rian van Gaalen and Marja de Mol-Regels deserve especially warm thank you for all the support they provide to the whole ICS group and especially to new foreign members, often struggling in the new environment.

## Contents

### 1 Introduction

1.1 Digital Systems .............................................. 1
1.2 Implementation technologies ............................... 2
   1.2.1 ASICs ................................................. 2
   1.2.2 Programmable logic .................................. 4
   1.2.3 Summary .............................................. 8
1.3 FPGA design flow .......................................... 10
   1.3.1 State assignment ...................................... 12
1.4 General decomposition ..................................... 16
1.5 Research aims and thesis overview ....................... 18
   1.5.1 Motivation ............................................. 18
   1.5.2 Research overview .................................... 20
   1.5.3 Thesis overview ...................................... 21
   1.5.4 Thesis outline ....................................... 22

### 2 Preliminaries

2.1 Boolean functions ........................................... 25
2.2 Finite State Machines ..................................... 26
2.3 Covers ......................................................... 30
   2.3.1 Dichotomies ........................................... 32
2.4 Information analysis ....................................... 34
   2.4.1 Information model ..................................... 34
   2.4.2 Information relationships and measures .......... 41

### 3 General Decomposition Theorem

3.1 Decomposition of completely specified FSM ............... 46
   3.1.1 General Decomposition Theorem .................... 46
   3.1.2 Example .............................................. 50
3.2 Decomposition of incompletely specified FSM ............. 60
   3.2.1 Extensions of GDT ................................... 60
   3.2.2 General Decomposition Theorem .................... 66
3.3 Proof of the General Decomposition Theorem ............ 66
   3.3.1 Forward proof ....................................... 66
   3.3.2 Reverse proof ....................................... 74
3.4 General decomposition example ............................ 76
   3.4.1 Output and state behavior realization ............ 76
3.4.2 Output behavior realization ........................................ 82
3.4.3 Construction of the decomposition structure ................. 83
3.5 Conclusions ............................................................. 87

4 General Decomposition in circuit synthesis ....................... 89
  4.1 Sequential synthesis ............................................... 89
    4.1.1 State-encoding-induced decomposition ....................... 89
    4.1.2 Mechanics of state assignment ................................. 99
  4.2 Combinational synthesis ......................................... 105
  4.3 Conclusions .......................................................... 108

5 Effective and efficient state assignment for LUT-FPGAs .......... 111
  5.1 Introduction ........................................................ 111
  5.2 State assignment heuristics ..................................... 112
    5.2.1 Heuristic outline ............................................. 112
    5.2.2 Information flows example ................................... 113
    5.2.3 Additional issues .............................................. 115
  5.3 The method .......................................................... 116
    5.3.1 Outline ........................................................ 116
    5.3.2 Initial encoding .............................................. 122
    5.3.3 Input supports calculation ................................... 122
    5.3.4 Dichotomy clustering ......................................... 124
    5.3.5 Code construction ............................................. 130
  5.4 Structures and algorithms ....................................... 132
    5.4.1 Twin Graph ................................................... 132
    5.4.2 Clustering ..................................................... 138
    5.4.3 Code improvement with simulated annealing ............... 139
  5.5 Special encodings ................................................ 142
    5.5.1 One-hot encoding ............................................. 143
  5.6 Conclusions .......................................................... 144

6 Experiments ............................................................ 147
  6.1 Comparison of one-hot and min-length encodings .............. 147
  6.2 SECODE ............................................................. 149
    6.2.1 Standard benchmarks .......................................... 149
    6.2.2 Interconnections .............................................. 153
    6.2.3 Layout results ................................................. 155
    6.2.4 Comparison of synthesis chains ............................... 157
  6.3 Generated benchmarks ............................................. 159
    6.3.1 BENGEN benchmark generator ................................ 160
    6.3.2 Experimental results for generated benchmarks ........... 161

7 Conclusions and future work ......................................... 165

A Encoding results for generated FSMs ............................... 177
# List of definitions

<table>
<thead>
<tr>
<th>Section</th>
<th>Definition</th>
<th>Page</th>
</tr>
</thead>
<tbody>
<tr>
<td>2.1</td>
<td>Completely specified Boolean function</td>
<td>25</td>
</tr>
<tr>
<td>2.2</td>
<td>Incompletely specified Boolean function</td>
<td>25</td>
</tr>
<tr>
<td>2.3</td>
<td>Completely specified finite state machine (FSM) – Mealy type</td>
<td>26</td>
</tr>
<tr>
<td>2.4</td>
<td>Completely specified finite state machine (FSM) – Moore type</td>
<td>26</td>
</tr>
<tr>
<td>2.5</td>
<td>Incompletely specified sequential machine</td>
<td>28</td>
</tr>
<tr>
<td>2.6</td>
<td>Single-state output behavior realization of completely specified FSM</td>
<td>28</td>
</tr>
<tr>
<td>2.7</td>
<td>Single-state output and state realization of completely specified FSM</td>
<td>29</td>
</tr>
<tr>
<td>2.8</td>
<td>Output behavior realization of incompletely specified FSM</td>
<td>29</td>
</tr>
<tr>
<td>2.9</td>
<td>Output and state realization of incompletely specified FSM</td>
<td>29</td>
</tr>
<tr>
<td>2.10</td>
<td>Realization structure</td>
<td>30</td>
</tr>
<tr>
<td>2.11</td>
<td>Cover</td>
<td>30</td>
</tr>
<tr>
<td>2.12</td>
<td>Unique-block cover</td>
<td>30</td>
</tr>
<tr>
<td>2.13</td>
<td>Set system</td>
<td>31</td>
</tr>
<tr>
<td>2.14</td>
<td>Partition</td>
<td>31</td>
</tr>
<tr>
<td>2.15</td>
<td>Partition product</td>
<td>31</td>
</tr>
<tr>
<td>2.16</td>
<td>Unique-block cover product</td>
<td>31</td>
</tr>
<tr>
<td>2.17</td>
<td>Cover sum</td>
<td>31</td>
</tr>
<tr>
<td>2.18</td>
<td>Cover relation ≤</td>
<td>31</td>
</tr>
<tr>
<td>2.19</td>
<td>Zero-cover</td>
<td>32</td>
</tr>
<tr>
<td>2.20</td>
<td>One-cover</td>
<td>32</td>
</tr>
<tr>
<td>2.21</td>
<td>Induced cover</td>
<td>32</td>
</tr>
<tr>
<td>2.22</td>
<td>Unordered dichotomy</td>
<td>32</td>
</tr>
<tr>
<td>2.23</td>
<td>Ordered dichotomy</td>
<td>32</td>
</tr>
<tr>
<td>2.24</td>
<td>Unordered dichotomy compatibility</td>
<td>33</td>
</tr>
<tr>
<td>2.25</td>
<td>Ordered dichotomy compatibility</td>
<td>33</td>
</tr>
<tr>
<td>2.26</td>
<td>Dichotomy merging operator *</td>
<td>33</td>
</tr>
<tr>
<td>2.27</td>
<td>Dichotomy covering relation ⊆</td>
<td>33</td>
</tr>
<tr>
<td>2.28</td>
<td>Elementary information</td>
<td>35</td>
</tr>
<tr>
<td>2.29</td>
<td>Information set</td>
<td>36</td>
</tr>
<tr>
<td>2.30</td>
<td>Elementary abstraction</td>
<td>36</td>
</tr>
<tr>
<td>2.31</td>
<td>Abstraction set</td>
<td>36</td>
</tr>
<tr>
<td>2.32</td>
<td>Occurrence multiplicity of elementary information</td>
<td>42</td>
</tr>
<tr>
<td>3.1</td>
<td>Block transition function</td>
<td>47</td>
</tr>
<tr>
<td>3.2</td>
<td>Block output function</td>
<td>47</td>
</tr>
<tr>
<td>3.3</td>
<td>( S \times I = \hat{S} ) partition pair</td>
<td>47</td>
</tr>
<tr>
<td>3.4</td>
<td>( S \times I = O ) partition pair</td>
<td>47</td>
</tr>
<tr>
<td>Section</td>
<td>Title</td>
<td>Page</td>
</tr>
<tr>
<td>---------</td>
<td>----------------------------------------------------------------------</td>
<td>------</td>
</tr>
<tr>
<td>3.5</td>
<td>General composition of sequential machines</td>
<td>48</td>
</tr>
<tr>
<td>3.6</td>
<td>General full-decomposition</td>
<td>48</td>
</tr>
<tr>
<td>3.7</td>
<td>Operator $\bigcup$</td>
<td>60</td>
</tr>
<tr>
<td>3.8</td>
<td>Operator $\bigcap$</td>
<td>60</td>
</tr>
<tr>
<td>3.9</td>
<td>Cover transition function</td>
<td>61</td>
</tr>
<tr>
<td>3.10</td>
<td>Synchronized set of $S \times I - S$ cover pairs</td>
<td>62</td>
</tr>
<tr>
<td>3.11</td>
<td>Block output function</td>
<td>62</td>
</tr>
<tr>
<td>3.12</td>
<td>Cover output function</td>
<td>62</td>
</tr>
<tr>
<td>3.13</td>
<td>$S \times I - O$ cover pair</td>
<td>62</td>
</tr>
<tr>
<td>3.14</td>
<td>Cover specialization</td>
<td>63</td>
</tr>
<tr>
<td>3.15</td>
<td>Multi-state realization</td>
<td>63</td>
</tr>
<tr>
<td>3.16</td>
<td>Labeled cover transition function</td>
<td>64</td>
</tr>
<tr>
<td>3.17</td>
<td>Labeled cover output function</td>
<td>65</td>
</tr>
<tr>
<td>3.18</td>
<td>$S_l \times I - S_l$ cover pair</td>
<td>65</td>
</tr>
<tr>
<td>3.19</td>
<td>$S_l \times I - O$ cover pair</td>
<td>65</td>
</tr>
<tr>
<td>3.20</td>
<td>$S_l - S$ cover pair</td>
<td>65</td>
</tr>
<tr>
<td>4.1</td>
<td>Support pair set</td>
<td>97</td>
</tr>
<tr>
<td>5.1</td>
<td>Matching</td>
<td>133</td>
</tr>
<tr>
<td>5.2</td>
<td>Twin graph</td>
<td>133</td>
</tr>
</tbody>
</table>
Chapter 1

Introduction

1.1 Digital Systems

The past several decades have witnessed explosive growth of integrated circuit technology and its applications in our everyday life. Faithful to the famous prediction of Moore’s law, the capacity of integrated circuits (or chips) doubles every 18 months, delivering the devices that currently fit tens of millions of transistors on a few square centimeters of silicon and capable of performing billions of operations per second. The rapid technology revolution and enthusiastic acceptance of these devices have resulted in a significant price decrease that made it feasible to introduce digital systems into virtually every aspect of life.

The most prominent example of this trend has been the personal computer (PC), which evolved from an experimental, expensive curiosity to an ubiquitous and affordable work tool indispensable in a growing number of professions, as well as an entertainment and information source present in most of today’s homes. But this best known example is just a tip of an iceberg of billions of chips produced every year and installed in modern devices, machines and facilities. The integrated circuits in production and measurement equipment help in work, making it easier and increasing our productivity. Consumer electronics devices such as television sets, video recorders and digital audio and video players have leveraged the advances in integrated circuit technology to deliver higher quality video and sound at ever decreasing price. Chips help to drive cars, fly planes, connect telephone calls and save lives in medical equipment. The latest trends embodied in terms such as “ambient intelligence” or “ubiquitous computing” predict expansion of computer chips even further, to equip with “intelligence” domestic appliances, lighting, even furniture and clothes.

All these developments have led consumers to expect advanced functionality features in new consumer devices and prompted the device vendors to deliver ever more sophisticated products. Also in the market of professional applications, the demands on the functionality and, often more importantly, cost, efficiency and power consumption of integrated circuits are growing. To remain competitive, companies must develop and enhance products with unprecedented constraints on cost, functionality, power consumption and, crucially, time-to-market.

To satisfy the demand, semiconductor industry has responded with a number of tech-
nology options to implement the digital circuits embedded in the newest products. These options represent different cost, performance and flexibility characteristics and enable a designer to make a choice of the most suitable technology and make different trade-offs based on the particular application.

Unfortunately, the explosive development of technology has not been matched by an equal progress in the field of designer productivity and software tools supporting the design of digital systems. The insufficient support offered by the Electronic Design Automation (EDA) tools is responsible for increased design times and suboptimal circuit realizations. This phenomenon known as “design gap” makes it more and more difficult for designers to take full advantage of the opportunities created by the technology. This is the reason for the continuous interest that EDA field has been receiving from the research community. The work presented in this thesis is also a result of this interest.

The subject of this thesis is circuit synthesis of finite state machines (FSMs) — one of the basic forms of description for digital system behavior. An FSM describes behavior of the system in terms of distinct states that the system can be in and transitions between the states that are triggered by particular input signals. The combination of the current state of the FSM and the input signals determine the output produced by the FSM and its next state.

The particular problem addressed by this thesis is state machine encoding, i.e. translation from symbolic, conceptual form to binary functions implementing the behavior described by the FSM. When encoding the machine, we are particularly targeting the implementation of the resulting logic in Field Programmable Gate Array (FPGA).

In this chapter we will present a short introduction to the available implementation options for sequential machines and, particularly, digital controllers, the design methodology used to implement the machines and look closer at the problem of state assignment.

1.2 Implementation technologies

Modern microelectronics industry offers to a digital system designer a number of implementation options for the system. Currently the most prominent of these options are:

- Standard general purpose processors, such as microprocessors, microcontrollers, DSPs, multimedia processors, etc. (software solution).
- Application specific integrated circuits — ASICs (hardware or hardware/software solution).
- Field programmable logic (hardware or hardware/software solution).

Since the subject of this thesis is related to FSM’s hardware implementation, we will focus on the two hardware options — ASICs and programmable logic.

1.2.1 ASICs

Due to the complexity and the cost of the chip production technology, application specific integrated circuits are usually not manufactured directly by the system designer, but by a specialized silicon foundry — a firm that receives from the designer the design of the
chip and manufactures the required volume of chips based on this design. Depending on the level of detail the design requires, several options are available. They range from full custom design, where designer specifies all features of single transistors on the chip; through standard cell approach, where a number of functional cells is used in the design; to gate array, where only connections between prefabricated functional blocks can be determined.

The **full custom** option involves design of the integrated circuit at the lowest level of single transistors, or even particular features of a single transistor. On silicon transistors are manufactured in multiple layers, each layer corresponding to a part of the transistor (e.g. well, source and drain, gate, contacts) and a number of layers where interconnections are realized. Full custom design requires separate definition of each layer, which usually involves drawing rectangles corresponding to particular elements of the transistor. Hence the popular term “polygon pushing” used to describe the process. This approach allows designer to determine position and size of the transistor and therefore gives ultimate control over speed, area and power dissipation of the circuit. The price for the level of control achieved in full custom designs is the huge effort required to design any but the simplest circuits. Moreover, this effort is usually required for each separate design, as the results of the previous designs cannot be readily reused. The large non-recurring engineering (NRE) costs induced by this labour-intensive process make this approach only practical for the most performance-, density- or power-constrained devices, or for very high volume parts. It can also be applied for large regular structures such as memories, where the design of a single cell is multiplied millions of times and thus any saving gained by hand-crafting the cell offers significant reduction of the whole structure.

To avoid the effort associated with designing each separate transistor by hand, the **standard cell** approach has been introduced. In this methodology, the circuit is constructed from a number of pre-designed standard cells. The cell library, usually delivered by the silicon foundry and optimized for the foundry’s process, includes basic logic building blocks, such as logic gates, flip-flops, registers, and often more complex functions, such as multiplexers, ALUs, multipliers and memories. The library often includes several implementations of a particular function, each implementation optimized for a different parameter, such as area, delay, power, drive strength, etc. While the regular shape of the cells gives rise to regular, possibly suboptimal layout and wiring, the highly optimized implementation of particular cells may make up for that effect and often the circuits of quality close to full custom design are produced. This is achieved at a fraction of the design costs, as much of the work of mapping the design to cells, placement and routing are automated and may only require some designer intervention for particularly difficult or critical elements.

While standard cell approach significantly reduces the design cost of a digital system, it does not address another crucial cost ingredient — the fabrication. As mentioned before, transistors are realized in silicon in layers. Each layer is created in a photolithography process by exposing the silicon covered with photoresist to ultraviolet light via a mask. The exposed regions of the photoresist change their chemical properties and can be washed away with a developer solution, leaving the unexposed regions untouched. Since there are multiple layers, multiple masks are necessary to fabricate a working chip. Unfortunately, masks are high-precision, expensive devices, with the cost for a complete set of masks reaching $1 million and more. The chilling cost effect is compounded by
the fact that the masks in general cannot be corrected or modified. Therefore, if a flaw is detected in the design after the masks have been fabricated, the mask set needs to be replaced. To avoid this very costly operation, extensive tests, verification and simulations need to be performed in the design stage, increasing the development time and costs even further.

The problem of high mask costs motivated the development of Mask Programmable Gate Arrays (MPGA). In this technology, a chip containing an array of unconnected gates is prefabricated by the foundry. The customer’s design is mapped onto the gate fabric and the interconnection structure necessary to realize the desired functionality is determined. In this process chip-specific masks are only necessary for the contact and interconnection layers, thereby reducing the unique mask set to 2-5 masks. Naturally, there is very little room for optimizations in this methodology. Also, for some designs significant amount of prefabricated logic may never be used. Still, this option is attractive for less demanding applications. Except for the reduced mask cost, the chip cost is reduced and the reliability is improved due to the fact that the foundry produces large numbers of identical prefabricated wafers which can be used by multiple customers.

Recently, the MPGA idea has been revitalized by the so called structured ASIC approach. It is similar to the original MPGA idea as it also offers prefabricated logic with the programming taking place at the interconnect level. However, structured ASICs contain more coarse-grain logic. The prefabricated logic includes complex programmable functions, flip-flops, memories, etc.

1.2.2 Programmable logic

The reduced mask cost offered by the MPGA or structured ASIC methodology still does not solve most of the fundamental problems of the fixed logic implementations related to design and fabrication cost, flexibility, time-to-market etc. Still, any flaws in the original design are severely penalized if discovered in the manufacturing stages. This is also the case for other updates to the design, involving additional features or modified functionality due to changing requirements, standards, etc. Also, the typical turn around time of a foundry is measured in weeks, which hurts the increasingly important time-to-market. At the same time, foundries will set a minimum limit on the number of parts produced. In the times of fierce competition, when not all products turn out to be immediate success, this may leave the vendor with a large and expensive stock of unsold chips. Moreover, the design tools and expertise required to design an ASIC are expensive resources in general unavailable to small enterprises.

An answer to these problems is offered by programmable logic. As the name implies, logic functions of the devices in this methodology can be programmed rather than prefabricated. Moreover, the whole programming process takes place at the designer’s site, or even in the actual working system where the chip is used, rather than in the foundry. After the circuit has been synthesized and an appropriate programming file generated, the programming of the chip can be performed on-site and takes just seconds. This way, the production turn around time is dramatically reduced. Many of the programmable devices offer re-programmability as well, what eliminates problems with bugfixes and functionality updates to the chip. If a different logic behavior is required, a new programming file can be downloaded to a chip and the new functionality is available in seconds. Nowadays, these updates can be performed remotely in a working system,
via Internet, wireless communication links or other connection. Another important factor in the success of programmable logic is the availability and relative ease of use of the corresponding software design tools. There are many software tools available in the market that help design programmable circuits even by the designers with minimum experience. Moreover, these tools are very cheap when compared to any ASIC-flow tools. In fact, some device vendors offer free design packages with their products.

The price paid for low NRE costs, flexibility and short time-to-market is a lower logic density, lower speed and higher power dissipation of programmable logic as compared to ASICs. The simple and cheap push-of-a-button design tools do not produce the circuits approaching the quality or performance of the designs hand-crafted by the expert designer. Also, a single programmable chip is more expensive than a single ASIC in a long production series. While this is unacceptable for high-end manufacturers offering high volume products, for smaller vendors planning production of several hundred or thousand units the performance may be just enough and a small number of relatively expensive programmable devices may still cost less than NRE cost associated with ASIC production.

The two dominant programmable logic technologies are Complex Programmable Logic Devices (CPLDs) and Field Programmable Gate Arrays (FPGAs).

**CPLD**

The popular CPLDs evolved as an extension of simple Programmable Logic Array (PLA). These arrays are optimized for realization of logic as a two-level Boolean sum-of-products expression. PLA device (see Fig. 1.1) is composed of the programmable AND-plane, where input signals available in direct and negated polarity are combined to form product

![Figure 1.1. PLA structure](image)
terms, and programmable OR-plane which performs logical-or of the product terms from AND-plane. Often, a set of output XOR-gates is available for programmable output signal polarity. The prevalent programming technology for PLA is Electrically Erasable Programmable Read-Only Memory (EEPROM). In this technology floating gate transistors are used to program connections. A transistor can be permanently switched-off by applying appropriate voltage, thus breaking the connection. This process is fully reversible by applying the reverse voltage, making the circuit re-programmable. While PLA-based devices consist of AND-plane and OR-plane, in Programmable Array Logic (PAL, see Fig. 1.2) devices, the OR-plane is replaced by fixed OR-gates, with programmable input and output interconnections. The CPLD devices typically include a number of PLA-like or PAL-like structures called macrocells, together with programmable interconnections between the cells.

This type of architecture is represented by the popular MAX devices from Altera Cor-
1.2. IMPLEMENTATION TECHNOLOGIES

In the MAX 7000 devices (depicted in Fig. 1.3), the macrocells are organized in the Logic Array Blocks (LABs), 16 macrocells per LAB. The LABs are interconnected by means of Programmable Interconnect Array (PIA) that provides full connectivity between LABs. Each macrocell within a LAB contains a PAL element capable of realizing sum of 5 product terms of 36 external signals and 16 so-called shared expander signals. The external signals can come from the PIA or the I/O control blocks that interface with the device’s pins. Shared expanders can be used to feed output of one macrocell within a LAB to an input of another macrocell, allowing for realization of complex logic expressions. Additionally, a macrocell output signal may registered in the configurable flip-flop available in the macrocell. The devices of 7000 family include 2 to 32 LABs.

CPLD devices offer relatively small logic density and capacity, with the largest devices having logic capacity equivalent to 10,000 gates. However, the simplicity of structure results in very fast circuits and predictable timing characteristics, making CPLD suitable for synthesis of timing-critical control applications. Naturally, the small capacity is also reflected in the low price of the devices, which makes CPLDs attractive for tight-budget applications.

FPGA

A more prominent place in the programmable logic market is taken by the Field Programmable Logic Array (FPGA) technology. In these devices, a large number of configurable logic blocks capable of realizing complex functions of a small number of inputs is connected by means of programmable interconnection fabric. The typical FPGA architecture is represented by the devices from Virtex family manufactured by Xilinx. A Virtex device comprises of logic blocks laid out in regular matrix, interconnection resources routed between the matrix rows and columns and I/O blocks at the perimeter of the chip. All these elements are programmable to allow realization of complex logical functions.

Figure 1.4. Virtex II FPGA
with flexible interconnections and communication with the outside world using multitude of I/O standards. The programming of the device is achieved by filling Static RAM (SRAM) memory cells controlling the behavior of different resources.

The primary logic resource of Virtex FPGA is a slice. It includes two 4-input function generators which can be configured to be used as a 4-input Look Up Table (LUT), 16-bit shift register, or 16-bit RAM element. A half of a slice is schematically depicted in Fig. 1.4(b). In the LUT configuration, the function generator can be programmed to realize any logical function of 4 inputs, by storing output value for each of the 16 input combinations. Groups of four slices form Configurable Logic Blocks (CLBs). Slices within CLB have fast local interconnections and can be combined by means of dedicated multiplexers to realize any function of up to eight inputs, with some functions of a larger number of inputs also possible. In addition to LUTs, a slice contains two configurable flip-flops and a number of dedicated logic gates: XOR gates for implementation of adders, AND gates for simplified multiplier designer and OR gates for realization of large sum-of-product expressions. The fast local interconnections between the slices in a CLB include two carry chains with associated multiplexers used to implement addition and subtraction, or to cascade LUTs to implement wide logic functions.

In addition to CLBs for implementation of random logic, Virtex FPGAs contain a number of other dedicated, programmable resources. These include:

- Block SelectRAM — 18 Kbit memories with configurable word and address widths and dual-port capabilities
- embedded 18-bit multipliers
- Digital Clock Management circuitry for programmable clock signal division and multiplication, delay compensation and phase shifting
- (Virtex II Pro family) up to four general-purpose PowerPC processor cores integrated within the FPGA fabric

Top-of-the-range Virtex II Pro FPGA currently contains over 14,000 CLBs with logic capacity of about 8 million logic gates, 10 Mbits of Block RAM, 556 multipliers, and 4 PowerPC processor cores.

The hierarchical programmable interconnection structure connects logic elements together. In particular, in addition to fast inter-slice connections, each CLBs has direct connections to all its immediate neighbors and a limited number of connections to every second, third and sixth block away from the CLB in all four directions. For long-range communications, a number of global signals routed along the entire width and height of the chip is available. Other global communication resources include eight dedicated low-skew clock networks and three-state buses. For off-chip communications I/O blocks are available that are configurable for multiple voltage and impedance standards. Dedicated high-speed transceivers allow inter-chip serial communication at 3.125 Gbps.

1.2.3 Summary

In this section, we outlined characteristics of two basic hardware implementation strategies for digital systems: with fixed and programmable logic.
The fixed logic methodology, due to its inherently high design and manufacturing costs is especially suitable for high-performance, high-volume products. Still, the high logic density and performance levels of ASICs continue to make it the dominant technology for digital systems implementation in parallel to standard general purpose programmable processors.

However, in the recent years this picture has begun to change. Programmable logic solutions, in particular FPGAs, have reached maturity and begin to challenge ASICs. Latest families of FPGA devices reach performance and logic density levels that is sufficient for many applications. Elimination of manufacturing costs and dramatical reduction of turn-around times make FPGAs particularly attractive for short-series, high-end industrial applications. In the markets such as telecommunication, networking, broadcasting, or medical instrumentation, several thousands of dollars paid for an advanced FPGA device is more than justified by the saved design and manufacturing time and the opportunity for product updates throughout its lifetime.

Interestingly, important market segment for FPGAs is created by the ASIC designers. Programmable logic has become a standard way of emulating and testing of ASIC designs before manufacturing. Instead of performing the time-consuming software simulation of an ASIC, it can be implemented in FPGA and tested in circuit, and at speed, increasing simulation speed and reliability.

In addition to time and cost constraints, other economical and technical factors are playing in favor of FPGA adoption. FPGAs are mass-produced, highly reliable devices manufactured in the cutting edge processes, otherwise unavailable to most designs. The ability to buy and program in-house arbitrary number of FPGA devices frees designers from close ties with the foundries, allowing flexible stock control and protecting their intellectual property.

Except for the above mentioned reasons, which essentially promote FPGAs as short-series replacement for ASICs, programmable logic with its in-system, even on-the-fly re-programmability introduces a new design paradigm, unavailable for ASIC implementations. FPGA-based hardware has a potential to change its programming and, in result, behavior adopting this way to new operating conditions. Experiments in evolutionary hardware [20], where a part of FPGA is programmed to re-program other part in reaction to changing conditions, are showing just one of the possible development directions.

During the last decade the recongurable systems evolved from a cost-effective replacement of ASICs to the mainstream implementation option of application specific (embedded) systems in parallel to programmable general purpose processors and ASICs. Also, emerging FPGA-based system-on-chip structures are ideally suited for implementation of modern system-on-chip designs. Devices such as Virtex II Pro include in one chip general purpose processors for control and non-critical computations, programmable logic for hardware acceleration of timing- and power-critical tasks, reasonably large memories for storage, multipliers for arithmetical and DSP applications, and high speed transceivers for communication. These characteristics allow implementation of the entire system on just one chip, increasing its speed and reliability, decreasing power dissipation and eliminating costs and inefficiencies associated with multi-chip or general purpose processor implementation.

All of the above reasons indicate large potential and quickly growing importance of the FPGA technology and motivate intensive research in the areas associated with FPGA-based system design. In particular, much of the research effort is invested to deliver high-
1. INTRODUCTION

quality, easy to use design tools capable of producing efficient FPGA implementation.

1.3 FPGA design flow

In the previous section, we pointed out that the availability of cheap and easy to use design tools is one of the key factors to FPGA success. These tools enable description of the system in a high level language or in a graphical form and automatically perform synthesis of the circuit with the described behavior. The typical synthesis flow is presented in Fig. 1.5.

The input to a typical FPGA synthesis tool is a specification of the system using one of hardware (or system) description languages. The most popular method of description is by means of a Hardware Description Language (HDL), such as Verilog [26] or VHDL [27]. In these languages the system can be described by a mixed behavioral and structural model, with algorithmic description of operation sequences, control decisions, I/O operations etc.

Another form of behavior specification is a finite state machine (FSM), which describes system in terms of states that the system can be in, transitions between the states triggered by received input signals and the output values produced by the system based on current state and input values. An FSM can be described in a tabular form by a state transition table (STT), in a graphical form by a state transition graph (STG), or by an appropriate HDL construct.

A lower-level system specification can also be provided by means of Register Transfer Level (RTL) description. In RTL, designer specifies the system in terms of registers and operations on data flowing between the registers. These operations can be specified as abstract data operations, such as addition or multiplication, or the actual networks of gates realizing the data transformation. The RTL description can be provided in HDL or in graphical form as a circuit diagram.

Less popular specification options include formal description languages represented by generalizations of FSMs, such as statecharts [21], Estelle [28], SDL [18] or Promela [24]. For these languages, separate tools may be provided that generate equivalent HDL or netlist descriptions. The generated circuit description is then synthesized in the regular flow.

The system specification expressed in one of the abovementioned forms is processed by an appropriate compiler and converted to a network of combinational logic blocks and finite state machines realizing the required behavior. To implement the FSM behavior, the structure depicted in Fig. 1.6 is used. The current state of the machine is stored in the register. Based on the current state and the inputs, the combinational logic block computes output values and the next state of the machine, which is fed back to the register. However, the states of an FSM are present in the original design as symbolic state names. To be able to store the state in a binary register, binary vector (code) corresponding to each of the machine’s states needs to be determined. This task is performed by state assignment (or state encoding).

After state assignment, the entire system is described by an RTL-level netlist of registers and Boolean functions. (The functions that can be realized using dedicated logic resources, such as multipliers or carry chains, are usually directly instantiated in the original system description or in the RTL netlist by the design compiler and are not the
1.3. FPGA DESIGN FLOW

Figure 1.5. FPGA synthesis flow
subject of further optimization.\) In the next step, \textit{logic synthesis} processes the Boolean functions to find their effective and efficient realization. Usually, logic synthesis will only perform partial optimizations of the functions that are expected to result in efficient implementation. The task of actual realization of the functions is in this approach deferred to the \textit{technology mapping} phase, where the simplified functions are realized by logic blocks available in the target implementation platform (so-called technology primitives). In particular, in LUT-based FPGA, such as Virtex, the logic functions are mapped to 4-input LUTs, multiplexors and gates available in a slice.

The result of technology mapping is a network of technology primitives. These primitives are then assigned to particular slices within the target FPGA device and the interconnection structure between the slices is determined. These processes are referred to as \textit{placement and routing}. They ultimately determine how particular logic resources and interconnections within device need to be configured and result in so-called \textit{bitfile} or programming file that contains complete information necessary to program the target device.

\subsection*{1.3.1 State assignment}

The main subject of this thesis is the state assignment of finite state machines. Our interest in the subject stems from the fact that the influence of state assignment on the circuit realization goes far beyond simple association of state names and binary codes.

As depicted in Fig. 1.6, bits of the binary vector representing current state are used both as inputs and outputs of the combinational logic component. Consequently, they have crucial influence on the form of binary functions necessary to compute outputs and next states. As a result, the complexity of the combinational component for different encodings can differ dramatically. This is the reason why the topic has attracted attention of researchers for the past forty years.

However, even though much effort has been invested, the problem of the optimal state assignment is still far from being solved. The main reason is the complex dependency between the choice of the state codes and the resulting Boolean functions. Moreover, even after the encoding is chosen and the combinational component is determined, the main attributes of its circuit realization, such as area, speed, or power consumption cannot be readily obtained. As illustrated by Fig. 1.5, the encoded machine is further subjected to the combinational logic synthesis, technology mapping and placement and routing, each of these steps applying complex heuristics to find optimum implementation. The state assignment method has no direct or easy way of predicting the result of
these operation and therefore the task of determining codes resulting in effective and efficient implementation is extremely difficult.

The problem is further complicated by the feedback loop introduced by the state register. The fact that state-code-bits are both inputs and outputs of the combinational component implies that the consequences of choosing a particular state assignment need to be considered both from the input and output point of view. The requirements resulting from these two types of analysis may turn out to be conflicting and require intelligent resolving.

Finally, the desired form of the combinational component strongly depends on the target technology, as it is the implementation technology that ultimately determines which functions can be efficiently realized. This fact limits applicability of the state assignment methods developed for previous technologies to the new ones.

Related research

The first attempts at algorithms of state assignment for computer program implementation date back to 1960s, when the implementation technology were diodes, transistors and simple gates. Following the methods of human designers, formulated by Humphrey [25] as the “rules for state code adjacency”, the algorithms proposed in [3][2][15] are based on grouping together 1’s in the Karnaugh tables of the resulting binary output and next-state functions, and in result on maximizing the size and minimizing the number of product terms in the sum-of-product expressions describing the combinational part of the FSM realization.

Even though the technology changed, the observations formulated by Humphrey became foundation for the next generations of algorithms.

This was also the case with the family of algorithms based on symbolic minimization. This novel idea implemented by de Micheli in KISS [13] in 1985 consists in performing the logic minimization phase before the state encoding. The minimized functions could be realized in PLA technology at the cost (in terms of circuit area) determined prior to encoding, assuming that the constructed code will satisfy all encoding constraints resulting from the symbolic minimization phase. The state encoding problem is in this way reduced to satisfaction of the so called face or input constraints expressing certain relations between the codes of some groups of states.

The method implemented in KISS has however several shortcomings. First of all, in the symbolic minimization phase the binary next-state functions are assumed to have disjoint on-sets. Thus, the next-state function minimization is not taken into account. This aspect is especially apparent for counter FSMs, where minimized symbolic cover for 2p-state counter has 2p product terms, while the optimum is $O(p^2)$. Moreover, the FSM’s feedback is not properly taken into account. Also the constraint satisfaction algorithm is just a simple-minded, greedy search.

The idea corresponding to symbolic minimization (minimization of generalized implicants) was even earlier pursued in [6]. However, the computational complexity of the presented (admittedly, much more sophisticated) method made it impractical for FSMs larger that 8 states.

In 1988, Jóźwiak published the method of maximal adjacencies, MAXAD, targeting two-level logic implementations [30][31][33]. Although the method uses the “state code adjacency” concept, it differs considerably from the previous methods based on this idea.
Jóźwiak considers many sorts of adjacencies, performs a sophisticated adjacency analysis and uses the results of the analysis in a sophisticated code construction performed in the framework of an effective and efficient double-beam search algorithm. This resulted in a state assignment tool that efficiently produced much better results than any other method published at that time. Compared to KISS, the machines encoded by MAXAD have realizations with on average 13% smaller PLA area and 28% smaller feedback area.

Some of the shortcomings of KISS are addressed in its successor — NOVA [74]. NOVA takes more efficient and flexible approach to constraints satisfaction, representing it as a graph embedding problem and solving in several, heuristic strategies producing superior results and offering quality/runtime trade-offs. Also the output encoding problem is considered here, however in a marginal manner, subordinate to the KISS-like input encoding. The best strategy of NOVA – ico hybrid – produced results of comparable quality to the results of the maximal adjacency method.

The output encoding problem is addressed in two dichotomy-based methods, DUET [11] and DISA [61]. They represent constraints as dichotomies, thereby introducing uniform representation of encoding and constraints. The problem of the encoding constraint satisfaction is transformed in this way to the problem of the compatible dichotomies merging. The resulting merged dichotomies define unambiguously certain encoding, which satisfies all the constraints represented by the component dichotomies.

All of the above symbolic minimization-based methods assume two-level PLA implementation of the functions resulting from the state encoding. This assumption drives the symbolic minimization, which aims at generation of minimum-cardinality symbolic cover – a reasonably good approximation of the optimal FSM realization, especially for input-dominated FSMs. They differ from each other in the level of detail in constraint consideration and in constraint satisfaction method.

Unfortunately, no approximation as good as term number for two-level implementation has been discovered for any sort of multi-level logic (except for multiplexor networks). The possibility of realizing multiple trade-offs and, in particular, the characteristics of the new multi-level implementation platforms greatly complicate construction of good cost functions, which hinders quality estimation of the constructed code. For these reasons, symbolic minimization is difficult to realize in the multi-level domain (one of the few attempts was taken in MIS-MV [55]). Instead, most encoding methods for the multi-level implementations settle for crude predictions of the operations of subsequent logic synthesis steps, and attempt to create such an encoding, which produces output functions “easy” for a certain multi-level combinational logic minimizer targeting a particular implementation technology.

One of the earliest multi-level state assignment methods, MUSTANG [14], falls into this category. Designed to work with MIS logic synthesis system, it attempts to maximize the number and size of the common (sub-)cubes in expressions describing the output functions. These common (sub-)cubes will make it easier for MIS to realize its objective of minimizing the number of literals. The common cube maximization is realized in the process of adjacent code assignment to some selected pairs of states. The selection of the pairs, interestingly, is guided by rules similar to those of Humphrey. Two other well known methods, JEDI [57] and MUSE [16], closely follow the concepts implemented in MUSTANG, with additional improvements introduced by elements of simulated annealing and consideration of Boolean (as opposed to algebraic) operations in common-cube extraction. Since codes produced by MAXAD also result from cumulation of adjacencies,
it belongs to the same class of methods and produces comparable results for multi-level circuits.

A different approach is taken in MIS-MV [55]. It follows in the footsteps of KISS and applies symbolic minimization to the multi-level realizations. MIS-MV is actually not a state encoding method, but a multi-level logic minimizer able to handle multivalued variables. It is therefore able to minimize the combinational component of FSM before the actual state encoding, when the state variable is still in its symbolic, multivalued form. Symbolically minimized function is then encoded with a simple algorithm based on simulated annealing, guided by the number of literals as its cost function.

Some of the interesting alternative approaches to state encoding include genetic algorithms [1][10]. These methods, however, suffer from the fact that basic genetic algorithms are known to poorly handle the problems with complicated and time consuming quality function [46]. It is apparent in the run-times of the tools, which exceed in some, even small, cases those of the classical algorithms by a factor of 100. An interesting direction relevant to some implementation technologies pointed in [10] is, however, the simultaneous encoding and selection of the types of flip-flops used to store state variables (D or J/K type). In some cases, the choice of J/K flip-flops (which are known to require less complicated excitation functions) reduced the combinational component of the FSM realization by as much as 80%.

The challenges introduced by the multi-level implementations were further deepened with the advent of FPGA devices. The characteristics of FPGAs invalidate estimations of the implementation cost of the Boolean functions by the number of terms or literals. Till now, little has been done in the field of sequential synthesis targeting FPGA implementations. Notable exceptions are the programs LAX [69] – a BDD-based tool for multiplexor-based FPGA architectures and MINISUP [56] – a tool for reducing input support of the binary functions implemented in LUT-FPGAs.

A totally different view at the state encoding problem is presented in the algebraic structure theory of sequential machines due to Hartmanis and Stearns [22]. The theory utilizes the concepts of partitions and set systems to model the FSM’s information structure, and trace dependencies between the information about the FSM’s states, inputs and outputs. Observation of these dependencies gives some hints related to the direction, in which the encoding of the states should follow to reduce the dependencies of the next-state and output functions from the state and input variables. This approach seems to be especially interesting in the case of LUT-FPGAs, where reducing dependencies satisfies two important goals of reducing the functions’ input supports and interconnections.

However, the hints of Hartmanis and Stearns were limited to only some special implementation structures of sequential machines (parallel and serial decomposition), required extensive computations (computation of SP-partitions and partition pairs) and were not formulated into any method or algorithm. Moreover, although Hartmanis and Stearns understood that partitions and set systems model information, they did not discover any method of expressing what particular information is modeled, and therefore they were unable to characterize and measure the modeled information and the relationships between the modeled information stream.

Therefore, Jóźwiak formulated the theory and methodology of general decomposition [35] that give the most general known generator of correct sequential and combinational circuit structures, and explained how to use the generator for the (near-)optimal circuit construction. To enable effective and efficient circuit construction, Jóźwiak pro-
posed the notion of elementary information [36] that enables precise characterization of information as modeled, for instance, by set systems and partitions, and using this notion formulated the theory of information relationships and relationship measures [36]. The apparatus of information relationships and relationship measures enables analysis and measurement of the modeled information and relationships between information streams in a given discrete function, relation or sequential machine, as well as in the circuit under construction. The information delivered by the apparatus of information relationships and measures can be used to control the circuit generator of general decomposition in order to construct only the most promising circuit structures and arrive at some (near-)optimal circuits. In this way, information relationships and measures make operational the theory of partitions and set systems of Hartmanis and Stearns and its extensions. In particular, it enables construction of effective and efficient decomposition and encoding algorithms based on this theory and its extensions.

Some applications of the algebraic structure theory (without its extensions) to FSM decomposition or encoding were also considered by some other authors (see a.o. [4, 63, 72, 76]).

1.4 General decomposition

Our approach to state assignment is based on a special case of general decomposition of finite state machines. Decomposition is a natural method of handling complex problems. It consists in splitting a complex problem into smaller, interrelated subproblems and solving each of the subproblems separately. General decomposition of finite state machines is a theory developed by Jóźwiak that deals with realization of the behavior of large FSMs by networks of interconnected, smaller FSMs operating in parallel (so called partial machines). In [35] Jóźwiak published theorem fundamental to this theory that established conditions for valid decomposition of completely specified, deterministic finite state machines. The theorem describes decomposition by partitions – mathematical constructs with well-defined operations. This enables efficient manipulation of the decomposition structures in the domain of partitions, while preserving the behavior of the decomposed finite state machine.

The theorem considers general decomposition structure presented in Fig. 1.7, where the partial machines are fed input information processed by some primary input decoder, their interconnections are realized by means of connection modules and their output is combined by the output encoder block to produce the output of the network. The entire network, when considered from the point of view of primary inputs and outputs behaves identically as the FSM being realized.

The general decomposition theory of completely specified FSMs was later extended by Jóźwiak and the author of this dissertation to the case of incompletely specified FSMs and their multi-state realizations [48]. This extension accounts for incompletely specified output function (output don’t-cares) as well as for the incompletely specified state transition function (non-determinism) for both original machine and the partial machines. Multi-state realization (see also Definition 3.15) allows the modeling of decomposition networks having multiple states corresponding to a single state of the original machine. This way, the extended theory covers all finite state machines and a much wider range of their decompositions that include not only identical, but also compatible behavior realiza-
In a separate work [36], Jóźwiak showed correspondence between the partitions used to describe the decomposition structure and information flowing within the structure. That work introduces the apparatus of Information Relationships and Measures (IRM) that enables analysis of information in a system and in particular:

- analysis of the information flows (where and how particular information is produced, and where and how it is consumed)
- analysis of the relationships (similarity, difference) between various information channels
- quantitative analysis of the information flows and their relationships (quantity, importance of information)

General decomposition theorem and IRM together form the basis for construction of effective and efficient decomposition methods and tools, with decomposition theory providing engine for creating correct-by-construction realization structures and IRM providing measures to guide the construction process towards the most efficient realizations.

One particular special case of general decomposition is functional decomposition. It deals with state machines having a single state and trivial state behavior, i.e. with combinational functions. In this approach, a large combinational function is realized by a network of smaller subfunctions. Functional decomposition has been successfully used among others for logic synthesis for FPGA platforms [41].

However, the applications of general decomposition are not limited to circuit synthesis. It can be used in analysis and decomposition of any discrete, binary, multi-value or symbolic system in many fields of modern engineering and science. Known applications...
include pattern analysis, knowledge discovery, machine learning, neural network training, decision systems, databases, encryption, compression, encoding etc. The particular suitability of decomposition-based methods to all these fields stems from the fact that, as the name suggests, general decomposition does not make any prior assumptions about the form of the network or the partial machines (or functions). Therefore, it allows the discovery of any “natural” structure that best realizes the required behavior. This is in contrast to many current methods popular in particular in logic synthesis. Most of these methods are devoted to some very special cases of possible implementation structures involving some minimal functionally complete systems of logic gates (e.g. AND+OR+NOT, NAND, AND+EXOR, etc.). This is particularly unsuitable for implementation technologies such as FPGAs, where the constraints on the network do not involve the kind of the logic functions (LUT in FPGA is capable of realizing any function of a limited number of inputs), but rather their structural characteristics, such as number of inputs and outputs (LUT is a 4-input, 1-output universal gate), or number and length of interconnections.

In this thesis, we present extension of the General Decomposition Theorem to the case of incompletely specified, non-deterministic finite state machines with multi-state behavior realization. This extends the field of realization structures covered by GDT to include all valid implementations, not only strictly equivalent to the original specification, but all compatible with the original description. The extension allows to handle not only simple machines and functions, but also incompletely specified, multi-valued machines and relations, widening even further applicability of the theory to other fields of science and engineering.

In particular, we formulate the state assignment problem as a special case of general decomposition. Based on this formulation, we use set systems to describe state assignment of an FSM and, based on General Decomposition Theorem, we derive conditions for the encoding to be valid. These conditions give rise to a general state assignment method that is capable of not only identifying valid encodings, but also performing implicit machine simplification (e.g. implicit minimization of the number of states) in the encoding process, using freedom and redundancy present in specification. However, while the general method describes mechanics of the encoding construction process, it does not prescribe the decisions necessary to steer the construction process towards efficient implementation. These decisions require heuristics reflecting the particular encoding objectives. We implemented such heuristics for the case of FSM targeting implementation in FPGA. These heuristics basically consist in confronting the information streams in a particular FSM with the constraints imposed by the FPGA technology and the optimization objectives. The analysis of the information streams and their relationships is facilitated by the apparatus of information relationships and measures. The software tool implementing these heuristics and experimental results obtained with this tool are also discussed in this thesis.

1.5 Research aims and thesis overview

1.5.1 Motivation

Programmable logic fills the flexibility, performance, power dissipation, and development and fabrication cost gap between the application specific integrated circuits and standard (general purpose) programmable micro-processors. During the last decade it
became the mainstream implementation technology for custom computation and embedded system products in such fields as telecommunication, networking, image processing, video processing, multimedia, DSP, cryptography, embedded control etc. To efficiently develop, implement and use the systems based on programmable logic, adequate computer-aided support tools are necessary. Since most such systems are implemented using the look-up table (LUT) FPGAs, the circuit synthesis tools targeting this technology are of primary importance for their effective and efficient implementation.

However, the new set of characteristics and constraints imposed by the modern FPGA technology is not properly addressed by the available synthesis tools. These characteristics include the availability of multi-level networks of arbitrary functions limited by their dimensions (maximum number of inputs and outputs) rather than specific functions limited by their form (number of terms or literals), and the growing influence of interconnections on all important circuit characteristics (speed, area, power dissipation, etc.). To address these issues, a new family of synthesis methods based on the general or functional decomposition has been recently proposed. The decompositional approach, and in particular the general decomposition approach proposed by Jóźwiak [35], has the potential of directly building such complex multi-level networks of functions with constrained number of inputs and/or outputs, and directly controlling the number and length of interconnections. The combinational synthesis methods based on the general decomposition significantly outperform other methods used in academic and commercial tools, demonstrating the effectiveness of this approach [41]. While the problems of combinational synthesis for reconfigurable platforms have received significant attention of researchers, there is very little done in the field of sequential synthesis targeting FPGA implementation.

Meanwhile, the sequential synthesis has a profound impact on the final implementation cost. In sequential synthesis, the symbolic description of the FSM functionality is translated to a set of Boolean functions implementing this functionality. The form of these functions determines all crucial characteristics of the system’s implementation, such as area, delay and power dissipation. The importance of sequential synthesis is paired with great difficulty of the problems involved. Decomposition and encoding of FSMs are both well-known to be NP-hard problems, which makes the search for the high-quality solutions only feasible through heuristic methods. These heuristic methods are hindered by the complex evaluation criteria for particular solutions. The complexity stems from the fact that the actual implementation cost of a given solution is not known until the combinational synthesis and technology mapping were performed on the particular Boolean network produced for the given solution. Significance of the FSM decomposition and assignment problems and the lack of answers to them, especially in relation to the FPGA technology, motivated us to perform the research in the field of sequential synthesis of FSMs targeting FPGA implementation.

Recently, Jóźwiak proposed a new information-driven approach to circuit synthesis and formulated two theories that support this approach:

- theory of general decomposition of discrete functions and sequential machines [35], and
- theory of information relationships and information relationship measures [36, 37].

This way, a new theoretical and methodological framework has been established for analysis and synthesis of sequential and combinational logic networks. The framework con-
sists of the information-driven approach to circuit synthesis, theory of general decomposition, and the information modeling and analysis apparatus based on information relationships and measures. The framework was successfully applied by Jóźwiak and his collaborators to a variety of problems, including combinational synthesis [38, 41, 43, 44, 51, 52, 64, 65].

Encouraged by the success of the decompositional combinational synthesis methods and the level of control provided by general decomposition over the vital characteristics of the synthesized system, we decided to pursue the information-driven decompositional approach to sequential synthesis.

1.5.2 Research overview

The main aims of this research can be sub-divided into:

1. analysis aims, including the aims related to:
   - research field and problem analysis, and
   - research result analysis, including experimental research

2. synthesis aims, including:
   - theoretical
   - methodological, and
   - application-oriented

The main theoretical aim was to further analyze and adequately supplement the theory related to the subject field, to obtain a more complete theoretical base for this and further research in FSM decomposition and encoding.

This aim has been successfully realized. The main result is related to the theory of general decomposition of sequential machines. The central point of this theory is a constructive theorem on the existence of a general decomposition. This theorem provides a generator of correct by construction circuit structures. Originally it was formulated by Jóźwiak for the case of deterministic (completely specified) sequential machines with single-state behavior realization [35]. At the start of this research, Jóźwiak proposed the outline of the extended version of the theorem to account for the sequential machines with incompletely specified output function and transition function (nondeterministic) and their multi-state realizations and outlined the proof of the extended theorem. The final version of the extended general decomposition theorem and its complete proof have been constructed together by the author and Jóźwiak as a part of the research reported in this thesis.

The main methodological aim, being the central aim of this research, was to develop an effective and efficient FSM state assignment method for the decompositional state assignment, based on the information-driven approach to circuit synthesis, theory of general decomposition, and theory of information relationships and measures. The associated primary application-oriented aim was the implementation of the method in the form of an EDA software tool and their testing.

These aims have been fully realized. Based on the outline of the FSM state assignment method proposed by Jóźwiak, I developed and implemented in a prototype software
tool a complete effective and efficient FSM state assignment method, including its particular data structures and algorithms.

The primary analytical aim of the research was to analyze the characteristics of the modern FPGA devices and how these characteristics are addressed by the available state assignment methods. The methods used in commercial FPGA synthesis tools were of particular interest.

To this end, I reviewed the available FPGA product families and their features. I also analyzed the results of state assignment with various academic tools and the methods used in commercial synthesis software. In particular, the applicability of one-hot encoding prevalent in commercial tools was a subject of investigation. While I found one-hot encoding in most cases inadequate for FPGA implementation, some of the cases where it does produce efficient realizations led to the development of a method reported in this thesis that automatically identifies good candidate FSMs for one-hot encoding.

Another analytical aim was to support the development of the information-driven decompositional state assignment method with experimental research and analysis of its results. Within this scope, we were particularly interested in interactions between our state assignment method and the combinational synthesis methods that were used to synthesize the encoded FSMs.

This aim was realized by evaluating the synthesis results of a large number of FSMs encoded with the proposed state assignment method. I analyzed the synthesis results both after combinational synthesis of the encoded machines and after placement and routing of the synthesized circuit in the actual FPGA device. To further support this aim, we developed together with L. Jóźwiak and D. Gawlowski sequential benchmark generator BENGEN [42] that enabled efficient generation of various sorts of well-characterized FSM benchmarks and this way facilitated the analysis of effectiveness and robustness of the proposed method on a wide spectrum of FSMs. The results of this analysis were used to improve the effectiveness of the state assignment method, and to finally characterize and evaluate the method.

1.5.3 Thesis overview

In the remainder of this thesis we will discuss the results of the research. We will present the extensions we introduced to the existing general decomposition theory of completely specified FSMs to account for non-determinism, incompletely specified output functions and multi-state behavior realization. The extensions are introduced in the new formulation of General Decomposition Theorem, which we present along with the proof. The extended General Decomposition Theorem constitutes the most general known generator of correct decompositional circuit structures. It covers all other specific decomposition structures, such as parallel or serial decomposition, as well as the decomposition of discrete functions and relations, as special cases [35]. The theorem explicitly treats information flows within the decomposed machine or function and thus supports the novel, information-driven approach to decomposition.

We will then show the correspondence between the state assignment and decomposition, and how we used this correspondence to derive the conditions for valid state assignment from the General Decomposition Theorem. These conditions are more general and flexible than conditions used in other state assignment methods, and thus allow the encoding method to search in a larger solution space. They form the basis of a gen-
eral, implementation-platform-independent encoding method that constructs encoding by merging the variables of the specific initial encoding. Thanks to the flexible formulation of the validity conditions, the method enables implicit optimizations of the encoded machine during the encoding process in a manner unexplored by other methods.

Further, we will analyze the FPGA characteristics that influence the implementation efficiency of sequential circuits and, based on this analysis, we will propose novel heuristics for state assignment. These heuristics exploit the information modeling apparatus used in formulation of the General Decomposition Theorem and analysis apparatus of information relationships and measures to perform analysis and optimization of the information flows within the encoded machine during the encoding. We implemented these heuristics in the software tool SECODE, which we will discuss in some detail.

Finally, we will present and discuss the results of the experimental research we performed using the newly developed state assignment tool and some of the other popular state assignment methods. We will show that the FSM implementations resulting from the encoding process performed with SECODE are smaller and faster than those resulting from other encodings.

In the presented research, we considered the important and difficult problems of decomposition and state assignment of FSMs for the emerging, FPGA-based reconfigurable system implementation platforms. We extended the underlying theory and applied it to develop the novel state assignment method, which outperforms the currently used methods. The experimental research of the developed theory, method and prototype software tool demonstrates the effectiveness of the information-driven approach to circuit synthesis, and shows that they form a solid base for further theoretical and experimental research in this field.

1.5.4 Thesis outline

The presentation of the research is organized in the following manner.

In Chapter 2, we introduce the necessary theoretical background related to Boolean functions, finite state machines and information modeling and analysis. Based on this background, we will present in Chapter 3 the theory of general decomposition. To better introduce the complex notions of decomposition, we will first present and illustrate with an example the General Decomposition Theorem proven in [35] for the case of completely specified FSMs. Then, we will discuss modifications to the theorem necessary to account for incompletely specified output and next-state functions and multi-state behavior realization. Finally, we will prove and illustrate with another example the extended General Decomposition Theorem for incompletely specified FSMs with multi-state behavior realization.

In Chapter 4, we will show the correspondence between state assignment and decomposition of an FSM. This correspondence will allow us to derive from the General Decomposition Theorem the conditions for a valid state assignment that form the basis of a general, implementation-platform-independent encoding method. While the general encoding method presented in Chapter 4 provides mechanics of constructing a valid state assignment, it does not introduce any implementation-platform-dependent knowledge necessary to construct an encoding that will result in efficient implementation of the FSM on the target platform. In Chapter 5, we will present heuristics guiding the general state assignment method towards encodings that result in efficient FPGA imple-
mentation of the FSM, and the software tool SECODE implementing these heuristics.

The results of the experiments with SECODE will be discussed in Chapter 6. We will conclude with the summary of the research results and recommendations for future work in Chapter 7.
Chapter 2

Preliminaries

Digital circuits can be classified, with respect to their behavior, into two main classes: combinational and sequential. In combinational circuits, the output of the circuit depends only on the current value of the inputs. The functionality of a combinational circuit can be described by Boolean functions of its outputs. In the sequential circuits, the output depends not only on the current inputs but also on the history of execution, represented by the current state of the circuit. The current state is stored in some memory elements and influences the output of the circuit as well as the next state. The functionality of a sequential circuit can be described by a Finite State Machine (FSM).

Both Boolean functions and Finite State Machines can be modeled as information processing systems that require some input information, possibly use some internal state information (FSMs) and from combination of these two sources produce the output information. The analysis of the information and information flows in FSMs is the basis of the state assignment method presented in this thesis. To model the information, we use constructs called covers and to analyze the relationships and characteristics of the information modeled by covers, we use information relationships and measures.

In this chapter we will introduce basic notions related to Boolean functions, Finite State Machines and information modeling and analysis with covers and information relationships and measures.

2.1 Boolean functions

Definition 2.1 (Completely specified Boolean function). Completely specified Boolean function of $n$ variables is a mapping $B^n \rightarrow B$, where $B = \{0, 1\}$.

Definition 2.2 (Incompletely specified Boolean function). Incompletely specified Boolean function of $n$ variables is a mapping $B^n \rightarrow B^*$, where $B^* = \{0, 1, -\}$ and "-" stands for "don't care" and indicates any of the values 0 or 1.

Boolean variables will be denoted in the following by lowercase letters (e.g. $a$). A negated Boolean variable is denoted by $\overline{a}$ and represents completely specified (c.s.) Boolean function $\neg a$. A literal is a variable or its negation ($a$ or $\overline{a}$). A cube is a set of literals that represents c.s. Boolean function being the product of the literals (e.g. cube $\{a, b, \overline{c}\}$ represents function $a \land b \land \neg c$, also denoted $\overline{abc}$). A sum-of-products (SOP) expression is
2. PRELIMINARIES

a set of cubes representing c.s. function corresponding to the disjunction of the cubes’ functions (e.g. \{\{a, b, \overline{c}\}, \{d\}\} is a SOP corresponding to function \(a \land b \land \neg c \lor \neg d\), or \(abc + \overline{d}\)). A truth table is a table listing values of a Boolean function for some input combinations. For the input combinations not explicitly listed in the truth table, the don’t-care ("\(\overline{\cdot}\)"") value is assumed.

The above notions can be extended to discrete multiple-valued (symbolic) functions and relations. This is partially done in Chapter 3, when considering the general decomposition of FSMs, as discrete functions are a special case of FSMs.

2.2 Finite State Machines

Definition 2.3 (Completely specified finite state machine (FSM) – Mealy type).
Completely specified Mealy-type finite state machine \(M\) is an algebraic system defined by:

\[ M = \{I, S, O, \delta, \lambda\} \]

Where:
\(I\) – a finite set of inputs
\(S\) – a finite non-empty set of internal states
\(O\) – a finite set of outputs
\(\delta\) – the next-state function \(\delta : S \times I \rightarrow S\)
\(\lambda\) – the output function \(\lambda : S \times I \rightarrow O\)

Definition 2.4 (Completely specified finite state machine (FSM) – Moore type).
Completely specified Moore-type finite state machine \(M\) is an algebraic system defined by:

\[ M = \{I, S, O, \delta, \lambda\} \]

Where:
\(I\) – a finite set of inputs
\(S\) – a finite non-empty set of internal states
\(O\) – a finite set of outputs
\(\delta\) – the next-state function \(\delta : S \times I \rightarrow S\)
\(\lambda\) – the output function \(\lambda : S \rightarrow O\)

The two types of Finite State Machines differ in the method of producing output. In the Moore FSM, output is produced “in the state”, i.e. the output depends exclusively on the current state of the machine. A Mealy FSM, on the other hand, produces outputs “on the transition”, i.e. the output depends not only on the current state, but also on the current input and is produced concurrently with computation of the next state. The schematic structures of both types of FSMs are presented in Fig. 2.1.

For any Moore machine there exists a Mealy machine with equivalent output behavior and vice versa, which makes both types equally expressive in terms of functionality. Also, when abstracting from the above described difference in the output timing and the related physical implementation consequences, Moore machine can be considered as a special case of Mealy machine. Therefore, theory and methods developed for Mealy machines will be also applicable to Moore machines (in some cases after small modifications related to the difference in output computation timing). For this reason in the following we will focus on the Mealy-type machines, and by FSM we will mean Mealy-type FSM, unless explicitly stated otherwise.
One of the forms of the specification of a finite state machine is a State Transition Table (STT). The table consists of rows (transitions), each transition in the form \((x, s, \delta(s, x), \lambda(s, x))\), i.e. it consists of the input value, current state of the machine, the next state of the machine for this input/current state combination and the output of the machine for this input/current state combination. This specification can be easily transformed into a corresponding State Transition Graph (STG) representation. In STG, nodes represent states of the machine and directed edges represent transitions. A transition edge connects a state with its next state and is labeled with the input combination that triggers this transition and the output value produced by the FSM in these input/current state conditions.

**Ex. 2.2.1.** Consider for example the State Transition Table in Fig. 2.2(a). It describes a (Mealy) Finite State Machine with one binary input, three states \(a\), \(b\) and \(c\), and two binary outputs. The first row of the STT specifies that when the FSM is in state \(a\) and the input is 0, the machine moves to state \(c\) and produces output 00. The corresponding STG is shown in Fig. 2.2(b). The first transition is reflected by the edge from state \(a\) to \(c\) labeled with input combination 0 and the produced output value 00.

---

**Figure 2.1. Finite State Machine types**

![Mealy FSM](a)

![Moore FSM](b)

**Figure 2.2. Completely specified sequential machine**
Definition 2.5 (Incompletely specified sequential machine). Incompletely specified sequential machine $M$ is an algebraic system defined by: $M = \{I, S, O, \delta, \lambda\}$

Where:
- $I$ – a finite set of inputs
- $S$ – a finite non-empty set of internal states
- $O$ – a finite set of outputs
- $\delta$ – the next-state function $\delta : S \times I \to 2^S$
- $\lambda$ – the output function $\lambda : S \times I \to 2^O$

In incompletely specified sequential machines next-state and output functions incorporate generalized don’t-cares – for each input/state combination they return a subset (a choice) of values from state (output) alphabet. “Traditional” don’t-cares (“–”) are a special case of the generalized don’t-cares, because “–” represents a whole set of states (for next-state don’t-care) or output values (for output don’t-care). Thus, Def. 2.5 covers the traditional definition of an incompletely specified sequential machine as its special case. This definition covers also nondeterministic FSMs, which are defined as FSMs having transition function that for a given input and current state combination may return multiple next states.

<table>
<thead>
<tr>
<th>in</th>
<th>ps</th>
<th>ns</th>
<th>out</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>a</td>
<td>b, c</td>
<td>00, 11</td>
</tr>
<tr>
<td>1</td>
<td>a</td>
<td>a</td>
<td>01</td>
</tr>
<tr>
<td>0</td>
<td>b</td>
<td>a, c</td>
<td>11</td>
</tr>
<tr>
<td>1</td>
<td>b</td>
<td>b</td>
<td>01</td>
</tr>
<tr>
<td>0</td>
<td>c</td>
<td>a</td>
<td>10, 01</td>
</tr>
<tr>
<td>1</td>
<td>c</td>
<td>b</td>
<td>10, 01, 11</td>
</tr>
</tbody>
</table>

Figure 2.3. Incompletely specified sequential machine

Ex. 2.2.2. Consider the STT in Fig 2.3. Transition 1 specifies a move from state $a$ under input of 0 to either $b$ or $c$, producing output 00 or 11.

When performing any operations or modifications of an FSM, it is usually necessary to ascertain that the behavior of the modified FSM does not change. For this purpose a concept of behavior realization is introduced that indicates that the modified FSM still has the same behavior as the original machine. There exist two notions of behavior realization for FSMs: output behavior realization and state behavior realization.

Output behavior realization means that for any sequence of inputs values, the sequences of output values are identical. The following definition states that property formally and relaxes the requirement for identical or isomorphic sequences to allow for equivalent sequences, which introduces mapping functions between the input/output alphabets of the original machine and its equivalent realization.

Since a single state of the realization machine $M'$ is used here to represent a certain state of the specification machine $M$, such realizations are called single-state realizations.

Definition 2.6 (Single-state output behavior realization of completely specified FSM). Completely specified machine $M'(I', S', O', \delta', \lambda')$ is an output behavior realization of c.s.
machine $M(I, S, O, \delta, \lambda)$ if and only if functions: $\Psi : I \rightarrow I'$, $\Phi : S \rightarrow S'$, $\Theta : O' \rightarrow O$
exist, such that

$$\forall s \in S \forall x \in I \quad \delta'(\Phi(s), \Psi(x)) = \Phi(\delta(s, x)) \wedge \Theta(\lambda'(\Phi(s), \Psi(x))) = \lambda(s, x)$$

For many applications, only output behavior of the FSM is relevant, and the inter-

nal state behavior is never apparent to, or used by, the environment. However, in some

cases the current state of the machine may be used outside it. Then, the concept of the

state behavior realization is necessary that requires that the current state of the modi-

fied machine unambiguously identifies the state of the original machine (possibly, after

renaming).

**Definition 2.7 (Single-state output and state realization of completely specified FSM).**

Completely specified machine $M'(I', S', O', \delta', \lambda')$ is an output and state behavior realization

of c.s. machine $M(I, S, O, \delta, \lambda)$ if and only if functions: $\Psi : I \rightarrow I'$, $\Phi : S \rightarrow S'$, $\Theta : O' \rightarrow O$
exist, such that

$$\forall s' \in S' \forall x \in I \quad \Phi'(\delta'(s', \Psi(x))) = \delta'(\Phi'(s'), x) \wedge \Theta(\lambda'(s', \Psi(x))) = \lambda(\Phi'(s'), x)$$

For the case of incompletely specified FSMs, the above requirements for behavior

realization can be further relaxed to allow for the realization machine to use some of

the implementation freedom allowed for by the original FSM’s don’t-cares. In particular, in

given state/input conditions the next state and output of the realization machine does not

have to be exactly equivalent to the next state and output of the specification machine. It

is sufficient if they are compatible, i.e. belong to a subset of the next-states and output

of the specification machine. Also, a single state of a specification machine $M$ can be

in general represented by multiple states of a corresponding implementation machine.

As a result of the above considerations, the conditions of the behavior realization can be

relaxed to check for compatibility (i.e. inclusion) of next-states and outputs instead of

equivalence (i.e. equality).

**Definition 2.8 (Output behavior realization of incompletely specified FSM).** Incompletely

specified machine $M_r(I_r, S_r, O_r, \delta_r, \lambda_r)$ is an output behavior realization of i.s. machine

$M(I, S, O, \delta, \lambda)$ if and only if functions: $\Psi : I \rightarrow I_r$, $\Phi : 2^S \rightarrow 2^{S_r}$, $\Theta : 2^O \rightarrow 2^O_r$
exist, such that

$$\forall s \in S \forall x \in I \quad \delta_r(\Phi(s), \Psi(x)) \subseteq \Phi(\delta(s, x)) \wedge \Theta(\lambda_r(\Phi(s), \Psi(x))) \subseteq \lambda(s, x)$$

If $M_r$ is an output behavior realization of $M$ then, for all possible input sequences, the

output sequences produced by imitation $M_r$ after renaming them are subsets of the

output sequences of $M$.

**Definition 2.9 (Output and state realization of incompletely specified FSM).** Incompletely

specified machine $M_r(I_r, S_r, O_r, \delta_r, \lambda_r)$ is an output and state behavior realization of i.s. machine

$M(I, S, O, \delta, \lambda)$ if and only if functions: $\Psi : I \rightarrow I_r$, $\Phi_r : 2^{S_r} \rightarrow 2^S$, $\Theta : 2^{O_r} \rightarrow 2^O$
exist, such that

$$\forall s_r \in S_r \forall x \in I \quad \Phi_r(\delta_r(s_r, \Psi(x))) \subseteq \delta(\Phi_r(s_r), x) \wedge \Theta(\lambda_r(s_r, \Psi(x))) \subseteq \lambda(\Phi_r(s_r), x)$$
Given a specification machine $M$ and a machine $M'$ realizing its output and/or state behavior, the machine compatible with the original can be built by adding input / output / state mapping functions to the realization machine. Such structure is referred to as realization structure of $M$.

**Definition 2.10 (Realization structure).** The sequential machine composed as a structure consisting of $\Psi$, $M'$ and $\Theta$ (and $\Phi'$ for state behavior realization) is referred to as the realization structure for $M$ defined by $M'$ and denoted as $\text{str}(M')$.

Schematic representation of the realization structure for output and state and output behavior realization is presented in Fig. 2.4.

![Figure 2.4. FSM behavior realization structures](image)

**2.3 Covers**

The concepts of covers, set systems and partitions are central to the information analysis apparatus used extensively in this thesis. They are used as representation of information in the modeled circuits. These concepts were used under various names by various researchers, e.g. in [7, 8, 23]. Below, we present the nomenclature introduced in [48].

**Definition 2.11 (Cover).** Cover $\phi_S$ on $S$ is defined as

$$\phi_S = \{B_i | B_i \subseteq S \land \bigcup_i B_i = S\}$$

Cover is any set of subsets of $S$ covering $S$.

**Ex. 2.3.1.** For set $S = \{1 \ldots 5\}$ $\phi_5^1 = \{1; 12; 23; 45; 45\}$ is a cover, but $\phi_5^2 = \{1; 12; 45; 45\}$ is not.

**Definition 2.12 (Unique-block cover).** Unique-block cover $\phi_S$ on $S$ is defined as

$$\phi_S = \{B_i | B_i \subseteq S \land \bigcup_i B_i = S \land i \neq j \Rightarrow B_i \neq B_j\}$$

Unique-block cover is a cover with no identical blocks.

**Ex. 2.3.2.** $\phi_5^1$ from Example 2.3.1 is not a unique-block cover, but $\phi_5^3 = \{1; 12; 23; 45\}$ is.
Defintion 2.13 (Set system). Set system $\phi_S$ on $S$ is defined as

$$\phi_S = \left\{B_i | B_i \subseteq S \land \bigcup_i B_i = S \land i \neq j \Rightarrow B_i \nsubseteq B_j \right\}$$

Set system is a unique-block cover with no blocks entirely included in others.

Ex. 2.3.3. $\phi_2^S$ from Example 2.3.2 is not a set system, but $\phi_4^S = \{12; 23; 45\}$ is.

Defintion 2.14 (Partition). Partition $\pi_S$ on $S$ is defined as

$$\pi_S = \left\{B_i | B_i \subseteq S \land \bigcup_i B_i = S \land i \neq j \Rightarrow B_i \cap B_j = \emptyset \right\}$$

Partition is a set system with non-overlapping blocks.

Ex. 2.3.4. $\phi_2^S$ from Example 2.3.3 is not a partition, but $\phi_2^S = \{12; 3\}$ is.

For a given $s \in S$ the set of blocks of cover containing $s$ is denoted as $[s]_\phi$. Similarly, $[S]_\phi$ for $S \subseteq S$ is the set of blocks containing all of the elements $s \in S$ ($[S]_\phi = \bigcap_{s \in S} [s]_\phi$).

Defintion 2.15 (Partition product). A product of partitions $\pi_1^S$ and $\pi_2^S$ is a partition $\pi_S$ such that:

$$\pi_S = \{B_i | \exists B^1 \in \pi_1^S \exists B^2 \in \pi_2^S : B_i = B^1 \cap B^2 \land i \neq j \Rightarrow B_i \cap B_j = \emptyset\}$$

Ex. 2.3.5. $\{1, 2, 3, 4, 5\} \cdot \{1, 5, 2, 3, 4\} = \{1, 2, 3, 4, 5\}$

Defintion 2.16 (Unique-block cover product). A product of unique-block covers $\phi_1^S$ and $\phi_2^S$ is a unique-block cover $\phi_S$ such that:

$$\phi_S = \{B_i | \exists B^1 \in \phi_1^S \exists B^2 \in \phi_2^S : B_i = B^1 \cap B^2 \land i \neq j \Rightarrow B_i \neq B_j\}$$

Ex. 2.3.6. $\{1, 2, 1, 2, 3, 4, 5\} \cdot \{1, 3, 2, 3, 4, 5\} = \{1, 2, 1, 3, 2, 3, 4, 5\}$

Covers are multiplied by intersecting their blocks and removing blocks included in other blocks (for set systems) or identical blocks (for unique-block covers).

Defintion 2.17 (Cover sum). A sum of covers $\phi_1^S$ and $\phi_2^S$ is a partition $\pi_S$ such that $[s]_\phi \pi_S = [t]_\phi \pi_S$ if and only if a sequence $s_0 = s, s_1, \ldots, s_n = t, s_i \in S$ exists where either $[s_i]_\phi \phi_1^S \cap [s_{i+1}]_\phi \phi_2^S \neq \emptyset$ or $[s_i]_\phi \phi_2^S \cap [s_{i+1}]_\phi \phi_1^S \neq \emptyset$.

Ex. 2.3.7. $\{1, 2, 1, 3, 2, 3, 4, 5\} + \{1, 2, 3, 4, 5\} = \{1, 2, 3, 4, 5\}$

A sum of covers is formed by merging blocks that have any element in common.

Defintion 2.18 (Cover relation $\leq$). $\phi_1 \leq \phi_2 \iff \forall B^1 \in \phi_1 \exists B^2 \in \phi_2 : B^1 \subseteq B^2$

Ex. 2.3.8. $\{1, 2, 3, 1, 3, 4, 5\} \leq \{1, 2, 3, 1, 2, 4, 5\}$ but for $\phi_1 = \{1, 2, 3, 1, 3, 4, 5\}$ and $\phi_2 = \{1, 2, 3, 4, 5\}$ neither $\phi_1 \leq \phi_2$ nor $\phi_2 \leq \phi_1$
Definition 2.19 (Zero-cover). A zero-cover is a partition \( \pi_S(0) \) such that
\[
\pi_S(0) = \{ B_i \mid |B_i| = 1 \land \bigcup_i B_i = S \}
\]
A zero-cover puts each element of \( S \) in a separate block.

Definition 2.20 (One-cover). A one-cover is a partition \( \pi_S(1) = \{ S \} \).

A one-cover is a cover with all elements in one block.

Note that for any \( S \) on any \( S \) the following equalities hold:
\[
S + S(0) = S \\
S + S(1) = S(1) \\
S * S(0) = S(0) \\
S * S(1) = S
\]

Definition 2.21 (Induced cover). \( \phi_A \times B = ind_A^A \times B(\phi_A) = \{ X \times B \mid X \in \phi_A \} \)

\( \phi_A \times B \) is a cover on \( A \times B \) induced by \( \phi_A \) if it keeps all pairs of \( A \times B \) involving the elements of \( A \) placed in the same block of \( \phi_A \) in one block.

Ex. 2.3.9. For \( A = \{ a, b, c, d \} \), \( B = \{ 0, 1 \} \) and \( \phi_A = \{ \bar{a}, \bar{b}, \bar{c}, \bar{d} \} \), \( ind_A^A \times B(\phi_A) = \{ (a, 0)(a, 1)(b, 0)(b, 1); (c, 0)(c, 1)(d, 0)(d, 1) \} \)

2.3.1 Dichotomies

For the special case of two-block set system, a useful shorthand notation can be introduced in the form of dichotomy.

Definition 2.22 (Unordered dichotomy). An unordered dichotomy on \( S \) is a set of two disjoint subsets of \( S \).

Note that the union of blocks of a dichotomy does not necessarily cover \( S \), what is required for a cover. However, a dichotomy can be transformed into a valid set system in a straightforward manner by placing the missing symbols in both blocks. Thus, a dichotomy can be treated as a compact notation of a two-block set system. As we will show in the following, two-block set systems play major role in modeling binary information and the introduction of dichotomies and their associated operations simplifies analysis and manipulation of the information without any loss of generality.

In some applications, such as state assignment, it is necessary to establish an ordering of the dichotomy’s blocks, and associate a number (0 or 1) with each of the blocks.

Definition 2.23 (Ordered dichotomy). An ordered dichotomy on \( S \) is an ordered pair of two disjoint subsets of \( S \). The first subset is referred to as left- or zero-block, while the second is a right- or one-block.

In the following, we will mostly deal with unordered dichotomies and refer to them in short as dichotomies. For the compactness of notation and where appropriate, we will denote an unordered dichotomy \( d = \{ P, Q \} \) as \( P/Q \) and an ordered dichotomy \( e = (P, Q) \) as \( P\//Q \).
2.3. COVERS

**Ex. 2.3.10.** Consider set $S = \{1, \ldots , 5\}$ and a set system $\phi_S = \{1, 2, 5, 3, 4, 5\}$ on $S$. An unordered dichotomy corresponding to $\phi_S$ can be created by removing the repeated symbols $(3)$ from both blocks. The resulting dichotomy is $d = \{(1, 2), \{4, 5\}\} = 12/45$. The dichotomy $d$ has two ordered versions: $e = \{(1, 2), \{4, 5\}\} = 12/45$ and $f = \{(4, 5), \{1, 2\}\} = 45/12$.

**Definition 2.24 (Unordered dichotomy compatibility).** Two unordered dichotomies $d_1 = \{P_1; Q_1\}$ and $d_2 = \{P_2; Q_2\}$ are compatible (denoted as $d_1 \sim d_2$) iff $P_1 \cap Q_2 = \emptyset \wedge Q_1 \cap P_2 = \emptyset$ (direct compatibility, denoted $\sim_+$) or $P_1 \cap P_2 = \emptyset \wedge Q_1 \cap Q_2 = \emptyset$ (inverse compatibility, denoted $\sim_-$).

**Definition 2.25 (Ordered dichotomy compatibility).** Two ordered dichotomies $d_1 = (P_1, Q_1)$ and $d_2 = (P_2, Q_2)$ are compatible (denoted as $d_1 \sim d_2$) iff $P_1 \cap Q_2 = \emptyset \wedge Q_1 \cap P_2 = \emptyset$.

Note that while unordered dichotomies can be compatible either in direct or inverse form, ordered dichotomies can only be compatible in direct form.

For two compatible ordered or unordered dichotomies we define the operation of merging with as result a dichotomy with the blocks being the union of the corresponding blocks of the component dichotomies.

**Definition 2.26 (Dichotomy merging operator $\ast$).**

$$d_1 = (P_1; Q_1) \wedge d_2 = (P_2; Q_2) \wedge d_1 \sim_+ d_2 \Rightarrow d_1 \ast_+ d_2 = (P_1 \cup P_2; Q_1 \cup Q_2)$$

(direct merging of unordered dichotomies)

$$d_1 = (P_1; Q_1) \wedge d_2 = (P_2; Q_2) \wedge d_1 \sim_- d_2 \Rightarrow d_1 \ast_- d_2 = (P_1 \cup Q_2; Q_1 \cup P_2)$$

(inverse merging of unordered dichotomies)

$$d_1 = (P_1; Q_1) \wedge d_2 = (P_2; Q_2) \wedge d_1 \sim d_2 \Rightarrow d_1 \ast d_2 = (P_1 \cup P_2, Q_1 \cup Q_2)$$

(merging of ordered dichotomies)

The covering relation indicates the fact that the blocks of one dichotomy are subsets of the blocks of the other dichotomy.

**Definition 2.27 (Dichotomy covering relation $\subseteq$).**

For $d_1 = (P_1, Q_1), d_2 = (P_2, Q_2)$

$$(d_1 \subseteq d_2 \iff P_1 \subseteq P_2 \wedge Q_1 \subseteq Q_2 \vee P_1 \subseteq Q_2 \wedge Q_1 \subseteq P_2)$$

For $d_1 = (P_1, Q_1), d_2 = (P_2, Q_2)$

$$(d_1 \subseteq d_2 \iff P_1 \subseteq P_2 \wedge Q_1 \subseteq Q_2)$$

**Ex. 2.3.11.** Consider $S = \{1 \ldots 6\}$ and unordered dichotomies $d_1 = 12/45, d_2 = 3/4, d_3 = 3/6$. $d_1$ and $d_2$ are directly compatible, so they can be directly merged and form $d_{12} = d_1 \ast_+ d_2 = 123/45$. $d_1$ and $d_2$ are not inversely compatible, since the second block of $d_1$ and the second block of $d_2$ share an element $(4)$. $d_1$ and $d_3$ are both directly and inversely compatible, so they can be merged and form $d_{13} = d_1 \ast_+ d_3 = 123/456$ or $d_{31} = d_1 \ast_- d_3 = 126/345$. Also, ordered dichotomies $e_1 = 12/45$ and $e_2 = 3/6$ are compatible and can be merged to form $e_{12} = e_1 \ast e_2 = 123/456$. 


2.4 Information analysis

In the recent years, a number of promising approaches to problems in diverse fields have been proposed that are based on the analysis and manipulation of information. These approaches use set systems (also named blankets or rough sets by some authors) to model information in applications in many fields of modern engineering and science, including logic and architecture synthesis for VLSI systems \[32, 34–37, 39–41, 47, 49, 50, 66, 67\] pattern analysis, knowledge discovery, machine learning, neural network training, decision systems, data bases, encryption, compaction, encoding etc. \[9, 19, 58–60, 68, 70, 73\].

The work in information modeling with partitions and set systems was initiated by Hartmanis and Stearns who used partitions and set systems for the analysis of algebraic structure of finite state machines \[23\], and applied the analysis results to reason about special decompositions (i.e. parallel and serial) and state assignment of FSMs. Although the work of Hartmanis and many authors following him was based on intuitive understanding that set systems model some information present in the considered system, it lacked a precise apparatus for expressing what information is actually modeled and for analysis of the modeled information streams and their relationships. This problem was alleviated with the introduction of Information Relationships and Measures (IRM) by Jóźwiak \[36\].

The IRM framework introduces a concept of elementary information item or atom. Using this concept, it is possible to analyze any information flowing or used within the modeled system as a collection of atomic information items from a certain set of elementary information items. The representation of information as a set enables the application of the set-theoretical operators and methods to reason about relations and attributes of different information flows. In particular, IRM apparatus facilitates the following aspects of system analysis:

- analysis of the information flows — where and how a particular information is produced, and where and how it is consumed,
- analysis of the relationships (similarity, difference) between various information flows,
- introduction of the quantitative flavor (quantity, importance, weight) to characterize the analyzed information flows and their relationships.

The IRM apparatus forms the basis for much of the work presented in this thesis. In the remainder of this section we will therefore introduce basic notions of the information modeling and analysis. We illustrate these notions with the examples from the domain of this thesis, i.e. the synthesis of digital circuits. Note, however, that the apparatus is very general and can be used for analysis of any systems involving discrete functions, relations or finite state machines.

2.4.1 Information model

Elementary information

As discussed in the introduction, the behavior and structure of digital circuits are commonly specified using high-level (hardware) description languages. Despite different
form and syntax, various high-level language specifications of digital circuits and their compiled versions represent some networks of discrete functions, relations or sequential machines. Discrete functions, relations and sequential machines define some mappings between elements of some finite discrete sets (e.g. inputs, states, outputs). Let us use the term symbol to designate an element of such a set.

For instance, in the case of a finite state machine, the current conditions of the machine are determined by the values of the primary inputs and the present state of the machine. Therefore, we can interpret an FSM as an information processing system that combines partial information about the conditions delivered by its primary inputs with the information delivered by the current state variable to obtain the full information about the conditions. Then, FSM uses that combined information to determine the next state and the primary output. In this interpretation, introduced by Józwiak in [35], FSM processes information about “condition-symbols”, where each symbol is associated with a separate input/current-state condition. Consider, for example, the FSM in Fig. 2.5. It has two binary primary inputs, three states and a single binary primary output. In the figure, we associated with each of the seven input/current-state combinations a symbol from a set \( S = \{0 \ldots 6\} \).

To model the information about the symbols delivered by different sources (primary inputs and current state), we use the definition of information formulated in the Information Relationships and Measures apparatus. In the IRM interpretation, information about symbols pertains to the ability to distinguish certain symbols from other symbols. From this formulation follows the definition of an elementary information item as a simplest possible distinction between two particular symbols.

**Definition 2.28 (Elementary information).** An elementary information describes the ability to distinguish a certain single symbol \( s_i \) from another single symbol \( s_j \) (\( s_i, s_j \in S \) and \( s_i \neq s_j \)).

Consider, for instance, information about symbols from \( S \) delivered by the first primary input \( x_1 \) of the FSM in Fig. 2.5. The fact that the variable has different values for the conditions 0 and 1 means that the variable is able to distinguish condition 0 from 1 and, therefore, the information delivered by this variable will contain the elementary distinction 0/1. This fact means that, knowing that the current condition is either 0 or 1, the variable \( x_1 \) is capable of indicating which of the two it is. Note that this is different from saying that the variable is capable of indicating whether the current condition is 0. This would require distinguishing condition 0 from all other conditions, not just 1, and would be described by a set of elementary information items \( \{0|1; 0|2; 0|3; 0|4; 0|5; 0|6\} \).

<table>
<thead>
<tr>
<th>( x_1x_2 )</th>
<th>( p )</th>
<th>( n )</th>
<th>( y_1 )</th>
</tr>
</thead>
<tbody>
<tr>
<td>0 0</td>
<td>( s_0 )</td>
<td>( s_1 )</td>
<td>0</td>
</tr>
<tr>
<td>1 0</td>
<td>( s_0 )</td>
<td>( s_2 )</td>
<td>1</td>
</tr>
<tr>
<td>2 00</td>
<td>( s_1 )</td>
<td>( s_0 )</td>
<td>0</td>
</tr>
<tr>
<td>3 1</td>
<td>( s_1 )</td>
<td>( s_2 )</td>
<td>1</td>
</tr>
<tr>
<td>4 1</td>
<td>( s_1 )</td>
<td>( s_2 )</td>
<td>1</td>
</tr>
<tr>
<td>5 0</td>
<td>( s_2 )</td>
<td>( s_1 )</td>
<td>0</td>
</tr>
<tr>
<td>6 1</td>
<td>( s_2 )</td>
<td>( s_2 )</td>
<td>1</td>
</tr>
</tbody>
</table>

**Figure 2.5.** Example finite state machine
Analyzing the values of \( x_1 \) for the remaining conditions (symbols of \( S \)), we conclude that the two values of \( x_1 \) allow to distinguish between the following symbols: 0/1, 0/3, 1/2 and 2/3. Since the value of \( x_1 \) for condition 4 is don’t-care, which indicates 0 and 1, the variable is incapable of distinguishing 4 from any other condition. The set of atomic distinctions realized by variable \( x_1 \) is called an information set of \( x_1 \) (denoted \( IS(x_1) = \{0|1, 0|3, 1|2, 2|3\} \)) and it models the entire information about the symbols delivered by this variable.

**Definition 2.29 (Information set).** A given (partial) discrete information about a set of symbols \( S \) can be represented by an information set \( IS \) defined on \( S \times S \) as follows: \( IS = \{\{s_i, s_j\} | s_i, s_j \in S \text{ and } s_i \neq s_j\} \).

In an analogous way, the concept of abstraction was introduced in \([36]\) that expresses the inability of the given information to distinguish between symbols. The elementary item of abstraction is defined as follows.

**Definition 2.30 (Elementary abstraction).** An elementary abstraction describes the inability to distinguish a certain single symbol \( s_i \) from another single symbol \( s_j \) \((s_i, s_j \in S \text{ and } s_i \neq s_j)\).

The total abstraction associated with the given information stream can be described by an abstraction set.

**Definition 2.31 (Abstraction set).** An abstraction on a set of symbols \( S \) corresponding to a given (partial) information can be represented by an abstraction set \( AS \) defined on \( S \times S \) as follows: \( AS = \{\{s_i, s_j\} | s_i, s_j \in S \text{ and } s_i \neq s_j\} \).

Note that for a given information stream, the information and abstraction sets are complementary and their union is \( S \times S \). Therefore, information set unambiguously identifies the abstraction set and vice versa.

**Set systems**

Another alternative of expressing abstraction and, by extension, the corresponding information is to identify blocks of symbols that are not distinguished by the modeled information. In our example, the variable \( x_1 \) through its two values 0 and 1 induces two compatibility classes on the symbols: \( B^0 = \{0, 2, 4, 5, 6\} \) and \( B^1 = \{1, 3, 4, 5, 6\} \). \( x_1 \) has value 0/1 for each symbol in class \( B^0(B^1) \) (don’t-care “−” means: 0 and 1). Thus, variable \( x_1 \) is not able to distinguish between symbols 0, 2, 4 and 5, and 6, because they belong to the same compatibility class. The set of compatibility classes associated with all values of a variable forms a set system that models the information delivered by the variable. In this case, the set system is \( \phi_S(x_1) = \{0, 2, 4, 5, 6; 1, 3, 4, 5, 6\} \).

There exist operations of conversion between a set system representing given information and the corresponding information set, and vice versa. Since the sets of distinguishable symbols and non-distinguishable symbols are complementary, to convert a set system to an information set, it is sufficient to list all pairs of symbols that are not placed together in any block of the set system. Conversely, to find a set system corresponding to an information set, one needs to start with a single block containing all symbols and, for all distinctions in IS, split all blocks containing both symbols of the considered distinction into two copies, each of the copies with one of the symbols removed. In this process, some of the blocks may become a subsets of other blocks, and they need to
be removed. For instance, if we list all 21 pairs of 7 symbols in $S$, we can verify that only the four pairs of symbols present in $IS(x_1)$ are not placed together in any of the two blocks of $\phi_S(x_1)$. To perform the reverse mapping, we start with a set system containing single block $\phi^1_S = \{0, 1, 2, 3, 4, 5, 6\}$. Then, we take the first distinction from $IS(x_1)$, $0|1$ and split the single block of $\phi^1_S$ into two copies — one of the without 0, the other without 1. Thus, the set system becomes $\phi^2_S = \{0, 2, 3, 4, 5, 6; 1, 2, 3, 4, 5, 6\}$. Further, we split the first block containing both symbols of the next distinction, $0|3$, and obtain $\phi^3_S = \{0, 2, 4, 5, 6; 2, 3, 4, 5, 6; 1, 2, 3, 4, 5, 6\}$. Since the second block of $\phi^2_S$ is entirely included in the third block, it is removed and the set system becomes $\phi^5_S = \{0, 2, 4, 5, 6; 1, 2, 3, 4, 5, 6\}$. For the distinction $1|2$, the second block is split into $1, 3, 4, 5, 6$ and $2, 3, 4, 5, 6$ resulting in $\phi^6_S = \{0, 2, 4, 5, 6; 2, 3, 4, 5, 6; 1, 3, 4, 5, 6\}$. Finally, the distinction $2|3$ splits the second block into $2, 4, 5, 6$, which is covered by the first block, and $3, 4, 5, 6$, which is covered by the third block, and thus the set system $\phi_S = \{0, 2, 4, 5, 6; 1, 3, 4, 5, 6\}$ corresponding to $IS(x_1)$ is constructed.

While the procedure described above generates a unique set system for any information set, in general there are multiple set system representations for the same information set. Consider, for instance two set systems on $S = \{1 . . . 5\}$: $\phi^1_S = \{12, 34\}$ and $\phi^2_S = \{12, 23, 13, 45\}$. For both set systems, the corresponding information set is $IS = \{1|4, 1|5, 2|4, 2|5, 3|4, 3|5\}$. The issue of multiple set system representation was discussed in depth in [75]. It was shown there that for any information set there exists a canonical set system representation that is unique. The blocks of canonical set system have to be the maximal compatibility classes of symbols. In the above example, $\phi^1_S$ is the canonical representation of $IS$, while $\phi^2_S$ is not, because the three first blocks of $\phi^2_S$ can be merged to a larger compatibility class $123$. Note, however, that the conversion procedure described above will always create a canonical set system, as it only separates the symbols that cannot be in the same block, and therefore it implicitly preserves maximal compatibility classes of symbols.

At this point we note the two special set systems, zero-cover and one-cover (see Def. 2.19 and 2.20). We recall that zero-cover is in fact a partition with all symbols in separate blocks. Its corresponding information set contains therefore all distinctions of its symbols, and in this way models full information about the symbols. This situation corresponds to the modeled variable having separate value for each of the symbols. On the other end of the spectrum is one-cover, which contains all symbols in one block and, therefore, corresponds to an empty information set. It represents the variable with don’t-cares (or the same, single value) for all symbols and therefore, it does not deliver any information. Note that the initial cover in the construction of a cover corresponding to a given information set is a one-cover, so there are no distinctions represented. In the following construction steps, the covers represent information sets with one additional distinction, until all required distinctions are represented.

The above notions, illustrated with the examples of binary variables, also hold for the multi-valued variables. In any case, each value of a variable defines a single block of a set system representing the information associated with this variable. In this block, all symbols associated with this value of the variable are placed. For instance, the present-state variable $p$ of the FSM in Fig. 2.5, induced through its three values three blocks of symbols: $0, 1$ (for value $s_0$), $2, 3, 4$ (for $s_1$) and $5, 6$ (for $s_2$). Thus, the set system associated with $p$ is $\phi_S(p) = \{0, 1; 2, 3, 4, 5, 6\}$. As illustrated above, the information set corresponding to $\phi_S(p)$ can be determined by listing all symbol pairs not placed together in a block.
of \( \phi_S(p) \). This results in \( IS(p) = \{0|2, 0|3, 0|4, 0|5, 0|6, 1|2, 1|3, 1|4, 1|5, 1|6, 2|5, 2|6, 3|5, 3|6, 4|5, 4|6\} \).

While the distinctions relating to the input variables are interpreted as information delivered by the variables, the distinctions relating to the output variables can be interpreted as the information required to compute the given output. Take, for instance, the output \( y_1 \) of the FSM in Fig. 2.5. Its set system is \( \phi_{y_1} = \{0, 2, 3, 4, 6\} \), and the information set is \( IS(\phi_{y_1}) = \{0|1, 0|3, 0|4, 0|6, 1|2, 1|5, 2|3, 2|4, 2|6, 3|5, 4|5, 5|6\} \). Clearly, to be able to decide whether the output is 0 or 1, the FSM needs to know whether the current condition corresponds to one of the symbols \{0, 2, 5\} or \{1, 3, 4, 6\}. This requires the information items contained in \( IS(\phi_{y_1}) \). On the other hand, the output variable can be viewed as an input to the “user” of the FSM, and thus considered to deliver its information to the environment.

To summarize, the corresponding set systems and information sets for all (primary and state) inputs and outputs of the finite state machine shown in Fig. 2.5 are as follows:

\[
\phi_{x_1} = \{0, 2, 4, 5, 6; 1, 3, 4, 5, 6\}, \phi_{x_2} = \{0, 1, 2, 3, 5; 0, 1, 3, 4, 6\}, \\
\phi_p = \{0, 1; 2, 3, 4; 5, 6\}, \phi_o = \{2; 0, 5; 1, 3, 4, 6\}, \phi_{y_1} = \{0, 2, 5; 1, 3, 4, 6\}, \\
IS(\phi_{x_1}) = \{0|1, 0|3, 1|2, 2|3\}, IS(\phi_{x_2}) = \{2|4, 2|6, 4|5, 5|6\}, \\
IS(\phi_p) = \{0|2, 0|3, 0|4, 0|5, 0|6, 1|2, 1|3, 1|4, 1|5, 1|6, 2|5, 2|6, 3|5, 3|6, 4|5, 4|6\}, \\
IS(\phi_o) = \{0|1, 0|2, 0|3, 0|4, 0|6, 1|2, 1|5, 2|3, 2|4, 2|5, 2|6, 3|5, 4|5, 5|6\}, \\
IS(\phi_{y_1}) = \{0|1, 0|3, 0|4, 0|6, 1|2, 1|5, 2|3, 2|4, 2|6, 3|5, 4|5, 5|6\}
\]

Information-related set system operations

From the point of view of information manipulation, there are two important set system operations: multiplication of set systems (Def. 2.16) and smaller-or-equal relation (Def. 2.18).

The multiplication of set systems corresponds in the information set domain to the sum of information modeled by the set systems. In this way, it is a very useful operation that allows combining information delivered by multiple information streams. For instance, the total information delivered by the two set systems \( \phi_1^S = \{12, 34\} \) and \( \phi_2^S = \{13, 25\} \) is described by the product set system \( \phi_S = \phi_1^S \cdot \phi_2^S = \{1, 23, 34\} \). This is reflected by the fact that \( IS(\phi_S) = IS(\phi_1^S) \cup IS(\phi_2^S) \). An important result proven in [73] shows that multiplication preserves canonicity of set systems. Therefore, information streams can be combined with the multiplication operation preserving unambiguity of information representation.

The smaller-or-equal relationship between two set systems implies that the information set corresponding to the smaller set system is a superset of the information set of the larger set system. In other words, the smaller set system provides more information. This is a direct consequence of the fact that a smaller set system has smaller blocks, and therefore provides less abstraction and more information. Consider, for instance, two set systems on \( S = \{1 \ldots 4\}: \phi_1^S = \{12, 34\} \) and \( \phi_2^S = \{1, 2, 34\} \) with information sets \( IS(\phi_1^S) = \{1|3, 1|4, 2|3, 2|4\} \) and \( IS(\phi_2^S) = \{1|2, 1|3, 1|4, 2|3, 2|4\} \) Clearly, \( \phi_2^S \leq \phi_1^S \) and
IS(\phi^2_S) \supset IS(\phi^1_S). This is because the two blocks 1 and 2 of \phi^2_S that are included in the block 12 of \phi^1_S allow to distinguish symbol 1 from 2 that are indistinguishable by \phi^1_S.

Note that smaller-or-equal relationship implies information set inclusion, but is not equivalent to it. I.e. it is a sufficient but not necessary condition. This is caused by the canonicity issues of the set system representation that were discussed in the previous section. In particular, it may happen that two non-canonical set systems \phi^1_S and \phi^2_S have identical information sets, while neither \phi^1_S \leq \phi^2_S nor \phi^2_S \leq \phi^1_S. This is, for instance, the case for the following two set systems: \phi^1_S = \{123, 45, 56, 46\} and \phi^2_S = \{12, 23, 13, 456\}. However, as shown in [75], if the set systems are canonical, the smaller-or-equal relationship is both sufficient and necessary condition for the information set inclusion.

The concept of information set inclusion, and therefore of the smaller-or-equal set system relation, is crucial to the analysis of information flows. In particular, given two set systems \phi^1_S and \phi^2_S, if the information set of \phi^1_S is a superset of the information set of \phi^2_S, it is possible to build a relation (or function) that has as input the information stream \phi^1_S and produces output information stream \phi^2_S. If both set systems are canonical, this requirement is equivalent to \phi^1_S \leq \phi^2_S.

Let us illustrate the usage of multiplication and \leq operator to the analysis of information flows in a Boolean function network. Consider, for instance, the incompletely specified Boolean function in Fig. 2.6(a). It has three inputs: a, b and c and output f. In the figure, all the possible input combinations of the function were associated with symbols from set S = \{0\ldots7\}. The information about input combination required to compute the output f is given by set system \phi(f) = \{02367; 1345\}. At the same time, each of the inputs delivers a partial information about input combination given by the following set systems: \phi(a) = \{0123; 4567\}, \phi(b) = \{0145; 2367\} and \phi(c) = \{0246; 1357\}. The realization network for the function f is presented in Fig. 2.6(b). The network is partially built with two sub-functions f_1 and f_2 already built and the third sub-function f_3 being constructed. The analysis of information flows will allow us to analyze how the input information was filtered by f_1 and f_2; if the filtered information is sufficient to construct the missing sub-function; and what function has to be realized by f_3.

The input information available to f_1 is a combination of information delivered by inputs a and b. It is expressed by the input set system \phi^{in}(f_1) = \phi(a) \cdot \phi(b) = \{01(00); 23(01); 45(10); 67(11)\}. The values in parenthesis indicate what combination of the variables a and b is associated with a particular block of the input set system. The

\begin{verbatim}
\begin{array}{cccc}
  abc & f & S \\
  000 & 0 & 0 \\
  001 & 1 & 1 \\
  010 & 0 & 2 \\
  011 & 0 & 3 \\
  100 & 1 & 4 \\
  101 & 1 & 5 \\
  110 & 0 & 6 \\
  111 & 0 & 7 \\
\end{array}
\end{verbatim}

(a) Example function

\begin{verbatim}
\begin{array}{cccc}
  abc & f & S \\
  000 & 0 & 0 \\
  001 & 1 & 1 \\
  010 & 0 & 2 \\
  011 & 0 & 3 \\
  100 & 1 & 4 \\
  101 & 1 & 5 \\
  110 & 0 & 6 \\
  111 & 0 & 7 \\
\end{array}
\end{verbatim}

(b) Realization network

Figure 2.6. Example of information flow analysis
function $f_1$ assigns one output value (0) to its input combinations 00, 01 and 10. For the output set systems of function $f_1$, $\phi^{\text{out}}(f_1)$, it means that the function abstracts from any distinction between these three situations, and therefore in the output set system the three blocks of the input set system will be merged together. The fourth block of $\phi^{\text{in}}(f_1)$ is assigned a separate value 1 by $f_1$, and therefore will be preserved in the output set system. Thus, the output set system becomes $\phi^{\text{out}}(f_1) = \{012345(0); 67(1)\}$, with the output value associated with each block given in parenthesis.

In an analogous manner, the input information of $f_2$ is $\phi^{\text{in}}(f_2) = \phi(a) \cdot \phi(c) = \{02(00); 13(01); 15(10); 57(11)\}$. The function $f_2$ abstracts from input combinations 01, 10 and 11 and therefore the output set system of $f_2$ is $\phi^{\text{out}}(f_2) = \{02(0); 134567(1)\}$.

We can see that in both cases the input set systems are smaller than the output set systems. It is a natural consequence of the fact that a function can only abstract from the input information, i.e. assign same value to different input combinations, and therefore the output set system is always a result of some merging of the blocks of the input set system.

The input of $f_3$ is formed by two information streams: first from $f_1$, described by $\phi^{\text{out}}(f_1)$, and second from $f_2$, described by $\phi^{\text{out}}(f_2)$. The input information of $f_3$ is therefore $\phi^{\text{in}}(f_3) = \phi^{\text{out}}(f_1) \cdot \phi^{\text{out}}(f_2) = \{02(00); 134501; 67(11)\}$. To check whether it is possible to build a function that will produce output $f$ using this input information, we just need to check the condition $\phi^{\text{in}}(f_3) \leq \phi(f)$. In this case, we can see that the condition is fulfilled. By merging the first and third block of $\phi^{\text{in}}(f_3)$, a two-block set system $\phi^*(f) = \{0267; 1345\}$ is built that is smaller than $\phi(f)$, and therefore represents output of function $f$ with some don't-cares filled. In this case, for input condition 3, which was a don't-care condition in $f$ the value 1 was assigned, making $f'$ a completely specified realization of the incompletely specified function $f$. At the same time, the merging of blocks of $\phi^{\text{in}}(f_3)$ defines the function of the block $f_3$. Since the blocks corresponding to input combinations 00 and 11 were merged together to form the output set systems $\phi^*(f)$, the function $f_3$ will have the same output value 0 for these two input combinations. The second block of the input set system associated with input combination 01 was not modified in the output set system, and therefore the value of $f_3$ for input 01 is 1. We can see that the input combination 10 does not appear at all in the input set system, what indicates that this is a don't-care condition of $f_3$, which can be used for minimization of $f_3$.

Dichotomies

As discussed in Section 2.3.1, dichotomies can be used as a short-hand notation for two-block set systems and therefore can be used to model information delivered by binary variables. Since a dichotomy is exactly equivalent to the corresponding canonical set system, we can unambiguously associate the set system’s information set with the dichotomy.

Consider, for instance, the set system $\phi = \{123; 2345\}$. The corresponding dichotomy $d$ is created by removing repeating symbols from both blocks. Thus, $d = 1/45$. The information set of $d$ is equal to the information set of $\phi$ and is $IS = \{1|4, 1|5\}$. In the case of dichotomies the information set can be derived even more easily by simply enumerating all pairs that the symbols in the left block can form with the symbols in the right block.
Since binary variables are prevalent in today’s implementations of digital systems, the dichotomies as the information modeling tool for these variables are very useful. In particular, in this thesis we will show how dichotomies can be used to represent information about states of a finite state machine delivered by binary encoding variables.

### 2.4. INFORMATION ANALYSIS

#### 2.4.2 Information relationships and measures

Representation of information as an information set means that information relationships between variables or set systems representing various information streams can be analyzed by considering relationships between their corresponding information sets. For instance, a question about common information delivered by two variables can be answered by identifying the elementary distinctions present in the information sets of both variables. The fact that information is composed of discrete atoms makes it also possible to measure the amount of information as the number of atoms, or to associate importance, or weight, to a particular information item.

In [36][37], an appropriate analysis apparatus is proposed that exploits these characteristics: the theory of information relationships and measures. In particular, a number of relationships between information streams is defined in [36][37]:

- **common information** $CI$ (i.e. information that is present in both $\phi_1$ and $\phi_2$):
  
  $$CI(\phi_1, \phi_2) = IS(\phi_1) \cap IS(\phi_2)$$

- **total (combined) information** $TI$ (i.e. information that is present in either $\phi_1$ or $\phi_2$):
  
  $$TI(\phi_1, \phi_2) = IS(\phi_1) \cup IS(\phi_2)$$

- **missing information** $MI$ (i.e. information that is present in $\phi_1$ but missing in $\phi_2$):
  
  $$MI(\phi_1, \phi_2) = IS(\phi_1) \setminus IS(\phi_2)$$

- **extra information** $EI$ (i.e. information that is missing in $\phi_1$ but present in $\phi_2$):
  
  $$EI(\phi_1, \phi_2) = IS(\phi_2) \setminus IS(\phi_1)$$

- **different information** $DI$

  $$DI(\phi_1, \phi_2) = MI(\phi_1, \phi_2) \cup EI(\phi_1, \phi_2)$$

Also for abstraction, a number of relationships is defined in [36] that describes common, total, missing, extra and different abstraction.

Consider, for instance, the set systems on p. 38 modeling information flows in the FSM in Fig. 2.5. The information delivered by the present state variable $p$ is described by the set system $\phi_p = \{0, 1, 2, 3, 4, 5, 6\}$ with the corresponding information set $IS(\phi_p) = \{0|2, 0|3, 0|4, 0|5, 0|6, 1|2, 1|3, 1|4, 1|5, 1|6, 2|5, 2|6, 3|5, 3|6, 4|5, 4|6\}$. The information required to calculate the next state variable $n$ is modeled by the set system $\phi_n = \{2, 0|5; 1, 3, 4, 6\}$ and the information set $IS(\phi_n) = \{0|1, 0|2, 0|3, 0|4, 0|6, 1|2, 1|5, 2|3, 2|4, 2|5, 2|6, 3|5, 4|5, 5|6\}$. The common information $CI(\phi_n, \phi_p)$, i.e. the information used by $n$ and delivered by $p$ is determined by $IS(\phi_n) \cap IS(\phi_p) = \{0|2, 0|3, 0|4, 0|6, 1|2, 1|5, 2|3, 2|4, 2|5, 2|6, 3|5, 4|5, 5|6\}$. The information used by $n$ but missing in $p$ is described by $MI(\phi_n, \phi_p) = IS(\phi_n) \setminus IS(\phi_p) = \{0|1, 2|3, 2|4, 5|6\}$. The extra information present on $p$ but not used by $n$ is $EI(\phi_n, \phi_p) = IS(\phi_p) \setminus IS(\phi_n) = \{0|5, 1|3, 1|4, 3|6, 4|5|6\}$.

The strength of information or abstraction relationships can be measured in the simplest manner by measuring the amount of common, total, etc. information (or abstraction), expressed as the number of information (or abstraction) items. This way, the following measures can be defined:
\begin{itemize}
  \item information similarity \( ISIM(\phi_1, \phi_2) = |CI(\phi_1, \phi_2)| \)
  \item information difference \( IDIS(\phi_1, \phi_2) = |DI(\phi_1, \phi_2)| \)
  \item information decrease (loss) \( IDEC(\phi_1, \phi_2) = |MI(\phi_1, \phi_2)| \)
  \item information increase (growth) \( IINC(\phi_1, \phi_2) = |EI(\phi_1, \phi_2)| \)
  \item total information quantity \( TIQ(\phi_1, \phi_2) = |TI(\phi_1, \phi_2)| \)
\end{itemize}

For example, the information similarity of the above mentioned variables \( p \) and \( n \) can be measured as \( |CI(n; p)| = 10 \), the information loss: \( |MI(n; p)| = 4 \) and the information growth: \( |EI(n; p)| = 5 \).

In [36][37] some normalized and weighted measures are also defined, by associating an appropriate importance weight \( w(s_i|s_j) \) with each elementary information. The importance of information may be, for instance, related to its availability, i.e. the number of variables at which this information is present. Let \( o \) be a certain output variable, \( X \) be a set of some input variables of \( o \) and \( ISS(X) \) be the set of information sets of variables from \( X \).

**Definition 2.32 (Occurrence multiplicity of elementary information).** Occurrence multiplicity \( m \) of an elementary information \( s_i|s_j \) from \( IS(o) \) in \( ISS(X) \) is defined as follows:

\[
m(s_i|s_j)_{|IS(o)}_{IS(X)} = \sum_{x \in X} occ(s_i|s_j)_{|IS(o)}_{IS(x)}
\]

where:

\[
occ(s_i|s_j)_{|IS(o)}_{IS(x)} = \begin{cases} 
1 & \text{if } (s_i|s_j) \in IS(o) \cap IS(x) \\
0 & \text{otherwise}
\end{cases}
\]

If \( m(s_i|s_j)_{|IS(o)}_{IS(X)} = 1 \), \( s_i|s_j \) required by \( o \) is provided by only a single variable from \( X \), then \( s_i|s_j \) is called a unique information with respect to \( X \). Unique information is of primary importance. For instance, in the example FSM in Fig. 2.5, the distinction \( 2|4 \) is necessary to compute the next-state variable \( n \) \( (2|4 \in IS(n)) \). We can see that this item of information is only delivered by a single input variable \( x_2 \) \( (2|4 \in IS(x_2)) \). Therefore,

\[
m(2|4)_{|IS(n)}_{IS(x_1,x_2,p)} = 1
\]

and \( 2|4 \) is an item of unique information.

The introduction of weights gives rise to the weighted information relationships measures, such as weighted information similarity measure:

\[
WISIM(\phi_1, \phi_2) = \sum_{s_i|s_j \in CI(\phi_1, \phi_2)} w(s_i|s_j)
\]

These measures reflect not only how much information is common, or different in the two information streams, but also how important the information is.
2.4. INFORMATION ANALYSIS

Analysis of information and information relationships enabled by IRM apparatus is of primary importance for analysis and synthesis of digital information systems. In particular, knowing the required and delivered information, amount and importance of information, relationships between the information in different information streams, the strength of these relationships, and combining this knowledge with the synthesis objectives and constraints, one can effectively and efficiently make design decisions, i.e. determine the structure of the required system in order to satisfy the given constraints and optimize the given objectives. In this thesis we will demonstrate how IRM can be used to analyze information flows in the encoded finite state machine and how the result of this analysis can be used to guide the encoding process towards efficient implementation of the machine.
2. PRELIMINARIES
Chapter 3

General Decomposition Theorem

The term *composition* means the act or state of arrangement of parts into a proper relation in order to form a whole (an aggregate). The term *decomposition* means the act or state of disintegration or breakdown of a whole into a system of constituent parts (elements or simpler aggregates). These two notions are very general and can be used in a lot of various contexts. In particular, decomposition is one of the basic analysis and synthesis concepts of complex systems. It consists in breaking down a complex system into a network of less complex and relatively independent collaborating sub-systems that are easier to analyze, comprehend, synthesize or implement, in such a way, that the original system’s behavior is preserved.

In this chapter we will discuss the decomposition of finite state machines. Since a discrete relation (function) can be considered as a special case of an FSM with single state and trivial state-transition function, the results presented in this chapter also extend to the decomposition of discrete functions and relations. In the context of sequential machine analysis and synthesis, decomposition consists in representation of a given sequential machine as a composition of a number of collaborating partial machines that together realize behavior of the given machine. As discussed in Section 1.4, the behavior of the decomposed machine is realized by a network of partial machines with some interconnections between them, input encoder that distributes and encodes primary inputs of the original FSM to the partial machines and the output decoder that decodes the outputs of the partial machines to produce the primary output of the original FSM.

The central point of this chapter is a constructive theorem on the existence of general decomposition of an incompletely specified FSM. This theorem describes the decomposition in terms of covers on the input, output and state alphabets of the decomposed FSM. These covers model partial information about input, output and state of the decomposed machine that is available to the partial machines in the decomposition network. The theorem describes conditions under which the covers describe a valid decomposition. If these conditions are met, it is possible to derive from the covers state transition tables of the partial machines as well as the rules to connect them and the input encoder and output decoder functions. Thus, the theorem defines a generator of correct circuit structures. For a given function or sequential machine, correct circuits that realize the function/machine can be constructed by repetitive use of the generator or its special cases.
The theorem presented in this chapter is an extension of the General Decomposition Theorem for completely specified FSMs published by Józwiak in [35]. The extended theorem was formulated and proven by the author together with Józwiak and was originally published and discussed in [48]. The extension allows modeling of FSMs with incompletely specified output and next-state functions (i.e. non-determinism) and multi-state behavior realization (i.e. multiple states of the realization network corresponding to a single state of the realized machine). In this way it is the most general theorem describing decompositions of any finite state machine and, as a special case, of any discrete function or relation. To introduce the complex notions involved in formulation of the general case, we will start with the completely specified case and illustrate it with an example. Building on that, we will show what modifications to the formulation of the theorem are necessary to account for incompletely specified output and next-state functions and the multi-state behavior realization. This will lead us to the final formulation of the General Decomposition Theorem. Finally, we will present the proof of the extended theorem and an example to illustrate its working.

3.1 Decomposition of completely specified FSM

3.1.1 General Decomposition Theorem

In the view of the General Decomposition Theorem, the realization network is a system of information-processing sub-systems (partial machines). Each of these sub-systems requires some input information and produces some output information. The conditions stated in the General Decomposition Theorem in their essence require that the produced output information can be legally derived from the input and/or state information. These conditions are expressed in terms of partitions, set systems or covers modeling the input, state and output information streams of the realized machine and of the partial machines. In the case of completely specified FSMs, it is possible to use partitions instead of more general set systems or covers to model this information, as in the completely specified machine the distinctions between symbols are “crisp”, i.e. the compatibility between the symbols that is introduced by don’t-cares in specification is replaced with a strictly defined equivalence. Later, we will see that introduction of the output and state don’t-cares requires using covers rather than partitions for information modeling.

For the input and output information streams described by the partitions on the same sets of symbols, the fact that the output information can be legally derived from the input is described by the $\leq$ operator. As discussed in Section 2.4, the $\leq$ operator indicates that the smaller partition has smaller blocks than the larger partition, and therefore delivers more information. Hence, it is always possible to construct a function that, based on the input information, will unambiguously produce the output information. Therefore, the requirement of non-increasing output information can be expressed as $\pi_X^{\text{in}} \leq \pi_X^{\text{out}}$.

The situation is more complicated when the information about one set symbols is used to compute information about another set of symbols. Then, the $\leq$ operator is not defined and a way to map between the sets of symbols is necessary. In the case of FSM, the total input information to a machine is an information about the current state and the primary input. Therefore, the input information can be described by a partition on the Cartesian product of the state and input alphabets — $S \times I$. The output information is the information about the next state (modeled by a partition on $S$) and information about the
primary output (modeled by a partition on \( O \)). To be able to determine whether the given input information about \( S \times I \) is sufficient to compute the given output information about \( S \) or \( O \), the concept of partition pair is introduced. A given partition \( \pi_{S \times I} \) on \( S \times I \) forms a pair with a given partition \( \pi_S \) on \( S \) or \( \pi_O \) on \( O \) if the input/state combinations placed in one block on \( \pi_{S \times I} \) are mapped by \( \delta \) and \( \lambda \), respectively, to the a single block of \( \pi_S \) and \( \pi_O \). Therefore, the existence of partition pair means that knowing the input/state conditions with the precision to a block of \( \pi_{S \times I} \) makes it possible to calculate the next state or output information of a machine with the precision to a block of \( \pi_S \) or \( \pi_O \), respectively. In other words, it is possible to build a function that will map the blocks of \( \pi_{S \times I} \) to the blocks of \( \pi_S \) or \( \pi_O \), and thus the output information can be legally derived from the input information.

**Definition 3.1 (Block transition function).**

\[
\overline{\delta} : 2^{S \times I} \to 2^S \quad \text{and} \quad \overline{\delta}(D) = \{ s' | \delta(s, x) = s' \land (s, x) \in D \land D \subseteq S \times I \}
\]

**Definition 3.2 (Block output function).**

\[
\overline{\lambda} : 2^{S \times I} \to 2^O \quad \text{and} \quad \overline{\lambda}(D) = \{ o | \lambda(s, x) = o \land (s, x) \in D \land D \subseteq S \times I \}
\]

**Definition 3.3 (\( S \times I \) - \( S \) partition pair).** \( (\pi_{S \times I}, \pi_S) \) is a \( S \times I \) - \( S \) partition pair iff

\[
\forall D \in \pi_{S \times I} \exists B \in \pi_S : \overline{\delta}(D) \subseteq B
\]

**Definition 3.4 (\( S \times I \) - \( O \) partition pair).** \( (\pi_{S \times I}, \pi_O) \) is a \( S \times I \) - \( O \) partition pair iff

\[
\forall D \in \pi_{S \times I} \exists C \in \pi_O : \overline{\lambda}(D) \subseteq C
\]

**Ex. 3.1.1.** Consider the state transition table in Figure 3.1. For \( D_1 = \{(a, 0)(b, 0)\} \), \( \overline{\delta}(D_1) = \{\delta(a,0), \delta(b,0)\} = \{c\} \) and \( \overline{\lambda}(D_1) = \{\lambda(a,0), \lambda(b,0)\} = \{00, 11\} \).

For \( D_2 = \{(a,1)(b,1)(c,0)(c,1)\} \), \( \overline{\delta}(D_2) = \{a, b\} \) and \( \overline{\lambda}(D_2) = \{01, 10\} \). Therefore, \( \pi_{S \times I} = \{D_1, D_2\} \) forms a \( S \times I \) - \( S \) pair with \( \pi_S = \{a, b, c\} \) and \( S \times I \) - \( O \) pair with \( \pi_O = \{00, 11; 01, 10\} \). □

<table>
<thead>
<tr>
<th>( \text{in} )</th>
<th>( \text{ps} )</th>
<th>( \text{ns} )</th>
<th>( \text{out} )</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>a</td>
<td>c</td>
<td>00</td>
</tr>
<tr>
<td>1</td>
<td>a</td>
<td>a</td>
<td>01</td>
</tr>
<tr>
<td>0</td>
<td>b</td>
<td>c</td>
<td>11</td>
</tr>
<tr>
<td>1</td>
<td>b</td>
<td>b</td>
<td>01</td>
</tr>
<tr>
<td>0</td>
<td>c</td>
<td>a</td>
<td>10</td>
</tr>
<tr>
<td>1</td>
<td>c</td>
<td>b</td>
<td>10</td>
</tr>
</tbody>
</table>

Figure 3.1. Completely specified sequential machine

General decomposition consists of representation of a given sequential machine as a composition of a number of collaborating partial machines that together realize behavior of the given machine. The concept of general composition describes the system of partial machines together with their interconnections.
Definition 3.5 (General composition of sequential machines). A general composition of \( n \) sequential machines \( M_i \), \( GC = (\{M_i\}, \{Con_i\}) \) consists of the following objects:

1. \( \{M_i = (I_i^*, S_i, O_i, \delta_i, \lambda_i), I_i^* = I_i \times I_i', 1 \leq i \leq n\} \), a set of sequential machines referred to as component (partial) machines.

2. \( \{Con_i : O_i \rightarrow I_i', 1 \leq i, j \leq n\} \), a set of surjective functions referred to as connecting rules of the component machines.

Figure 3.2. General composition of two component machines

Definition 3.6 (General full-decomposition). The machine \( str([\{M_i\}, \{Con_i\}]) \) is a general full-decomposition of the machine \( M \) if and only if a general composition of \( M \) realizes \( M \).

Let \( \pi_i^I, \pi_i^S \) and \( \pi_{S \times I}^I \) be partitions on \( M = (I, S, O, \delta, \lambda) \) on \( I \), \( S \) and \( S \times I \), respectively. Let \( \pi_{S \times I}^S \) be a partition on \( S \times I \) such that \( \pi_{S \times I}^S \leq \pi_{S \times I}^I \) and \( \pi_{S \times I}^I = \prod_{i=1}^{n} \pi_{S \times I}^{I_i} \). Let \( \pi_{S \times I}^I = \prod_{i=1}^{n} \pi_{S \times I}^{I_i} \). Let \( \pi_{S \times I}^S \) and \( \pi_{S \times I}^I \) be partitions induced on \( S \times I \) by \( \pi_i^I \) and \( \pi_i^S \), respectively. \( \pi_{S \times I}^I = \prod_{i=1}^{n} \pi_{S \times I}^{I_i} \) and \( \pi_{S \times I}^S = \prod_{i=1}^{n} \pi_{S \times I}^{S_i} \).

Theorem 3.1 (General Decomposition Theorem for completely specified FSMs [35]). A sequential machine \( M = (I, S, O, \delta, \lambda) \) has a general full decomposition with the output behavior realization with \( n \) component machines if and only if \( n \) trinities of partitions \((\pi_i^I, \pi_i^S, \pi_{S \times I}^I)\) exist, such that:

1. \((\pi_{S \times I}^S, \pi_{S \times I}^I, \pi_{S \times I}^I, \pi_{S \times I}^I)\) is an \( S \times I - S \) partition pair

2. \( \pi_{S \times I}^S \cdot \pi_{S \times I}^I \cdot \pi_{S \times I}^I \cdot \pi_{S \times I}^I \leq \pi_{S \times I}^I \)

3. \( \pi_{S \times I}^S \cdot \pi_{S \times I}^I \leq \pi_{S \times I}^I \)

4. \((\pi_{S \times I}, \pi_O(0))\) is an \( S \times I - O \) partition pair

Additionally, if

5. \( \prod_i \pi_i^S = \pi_S(0) \)

is satisfied, then the state behavior will be realized too.
A corresponding scheme of a general decomposition of $M$ into $n$ partial machines $M_i$ is presented in Figure 3.3. It involves $n$ partial machines $M_1$ through $M_n$. The state, input and output alphabets of each of the machines are defined by their corresponding partitions $\pi^i_S$, $\pi^i_I \times \pi^i_{S \times O}$, and $\pi^i_{S \times I}$, respectively. Each partial machine $M_i$ uses its own state information ($\pi^i_S$), its primary input information ($\pi^i_I$) and information imported from some other machines ($\pi^i_{S \times I}$), after combining all this information computes its own next-state ($\pi^i_S$) and output ($\pi^i_{S \times O}$) information, being a partial state and output information of the original specification machine $M$. The connections between machines are realized by the connection blocks $\text{Con}_i$. Each block $\text{Con}_i$ delivers to its corresponding partial machine $M_i$ the imported information extracted from the combined output information of some other partial machines and represented by the partition $\pi^i_{S \times I}$. The input information of each partial machine $M_i$ ($\pi^i_I$) is extracted from the primary input of the original specification machine $M$ and appropriately encoded by the input encoder block $\Psi$. The output decoder block $\Theta$ combines the output information of the partial machines (product of $\pi^i_{S \times O}$) and translates it to the output of the specification machine $M$.

In this context, the conditions of the General Decomposition Theorem can be interpreted as follows. The condition (1) demands that for any partial state machine, its next state has to be unambiguously defined using only partial state/input information available to the machine. The condition (2) requires that this information is also sufficient to compute the output of the partial machine. To guarantee that the imported information can be produced in some partial machine, condition (3) demands that any imported
3. GENERAL DECOMPOSITION THEOREM

<table>
<thead>
<tr>
<th>$PI_1$</th>
<th>$PI_2$</th>
<th>$PS$</th>
<th>$NS$</th>
<th>$PO_1$</th>
<th>$PO_2$</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>s₁</td>
<td>s₁</td>
<td>00</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>s₁</td>
<td>s₄</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>s₁</td>
<td>s₄</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>s₁</td>
<td>s₃</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>s₂</td>
<td>s₁</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>s₂</td>
<td>s₄</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>s₂</td>
<td>s₄</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>s₂</td>
<td>s₃</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>s₃</td>
<td>s₂</td>
<td>00</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>s₃</td>
<td>s₄</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>s₃</td>
<td>s₄</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>s₃</td>
<td>s₃</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>s₄</td>
<td>s₁</td>
<td>00</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>s₄</td>
<td>s₄</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>s₄</td>
<td>s₄</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>s₄</td>
<td>s₃</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>s₅</td>
<td>s₅</td>
<td>00</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>s₅</td>
<td>s₄</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>s₅</td>
<td>s₄</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>s₅</td>
<td>s₃</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>s₆</td>
<td>s₅</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>s₆</td>
<td>s₄</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>s₆</td>
<td>s₄</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>s₆</td>
<td>s₃</td>
<td>11</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Figure 3.4. STT of the specification machine M

information has to be less than the whole state/input information available in the network. It eliminates the possibility of combinational loops and the usage of information not present in the realization network. The realization of the correct output function is guaranteed by the condition (4) that demands that the combined output of the partial machines has to provide sufficient information to compute the primary output of the realized machine. Finally, condition (5) requires that, for state behavior realization, the combined current state of the partial machines unambiguously identifies the current state of the realized machine.

For more detailed explanation and example of the general decomposition structure see the following example.

3.1.2 Example

Let us consider a completely specified machine $M = (I, S, O, \delta, \lambda)$ given in Fig. 3.4. $M$ has input alphabet $I = \{00, 01, 10, 11\}$, state alphabet $S = \{s₁, s₂, s₃, s₄, s₅, s₆\}$ and output alphabet $O = \{00, 01, 10, 11\}$.

To explain Theorem 3.1 and its use for the construction of decompositional realization structures of sequential machines, let us consider the decomposition of machine $M$ defined by two trinities of partitions: $(\pi^1_I, \pi^1_S, \pi^1_{S \times I})$ and $(\pi^2_I, \pi^2_S, \pi^2_{S \times I})$, where:
\[ \pi_1 = \{00; 01, 10, 11\} = \{i_1, i_2\} \]
\[ \pi_2 = \{s_1, s_2, s_3, s_4, s_5, s_6\} = \{a, b, c\} \]
\[ \pi_{S \times I} = \{(s_1, 00)(s_2, 00)(s_3, 00)(s_4, 00);
            (s_1, 01)(s_1, 10)(s_1, 11)(s_2, 01)(s_2, 10)(s_2, 11)(s_3, 01)(s_3, 10)(s_3, 11)\ldots
            (s_4, 01)(s_4, 10)(s_4, 11)(s_5, 01)(s_5, 10)(s_5, 11)(s_6, 01)(s_6, 10)(s_6, 11)\} =
            \{o_1, o_2, o_3\} \]
\[ \pi_{S}^{2} = \{00, 11 ; 01, 10\} = \{j_1, j_2\} \]
\[ \pi_{S}^{3} = \{s_1, s_3, s_5; s_2, s_4, s_6\} = \{x, y\} \]
\[ \pi_{S \times I}^{2} = \{(s_1, 00)(s_1, 01)(s_1, 10)(s_1, 11)(s_2, 11)(s_3, 00)(s_3, 01)(s_3, 10)(s_3, 11)\ldots
            (s_4, 00)(s_4, 11)(s_5, 00)(s_5, 01)(s_5, 10)(s_5, 11)(s_6, 11)\};
\[
            (s_2, 00)(s_2, 01)(s_2, 10)(s_4, 01)(s_4, 10)(s_4, 00)(s_6, 01)(s_6, 10)\}\} =
            \{p_1, p_2\}. \]

First, we will show that the two trinities \((\pi_1, \pi_2, \pi_{S \times I}^1)\) and \((\pi_2^2, \pi_2^3, \pi_{S \times I}^2)\) satisfy the conditions of Theorem 3.1 and therefore define a decomposition of \(M\) with two partial machines \(M_1\) and \(M_2\), each based on a corresponding trinity. We will also demonstrate how to construct the decompositional realization structure (see Fig. 3.3) for \(M\) based on these two trinities. The structure is constructed in four steps. Firstly, we build the input encoder block \(\Psi\), which transforms information delivered by primary inputs into a form suitable for partial machines. Secondly, state transition tables (STT) of partial machines \(M_i\) are constructed. Then, the network of connections between the partial machines is determined (\(Con_i\) functions). Finally, we construct the output decoder block \(\Theta\), which transforms the output information of the partial machines into the primary output of the specification machine.

Machine \(M\) computes its next state and output from information about its present state \(S\) and input \(I\). Thus, it computes two functions \(\delta : S \times I \rightarrow S\) and \(\lambda : S \times I \rightarrow O\). For its computations \(M\) uses complete information about the \(S \times I\) space. Partial machines, on the other hand, use a partial information on the points in \(S \times I\) space to compute their own next state and output. The abovementioned partition describe this partial information.

As discussed in Section 2.4, a partition models information stream that delivers information allowing to distinguish symbols placed in different blocks of the partition. The partition describing information delivered by a given variable can be derived by putting in one block all the symbols for which the value of the variable is the same. Thus, the partition has as many blocks as there are values of the variable. Applying this method in reverse, given a partition modeling given information stream, one can construct a variable delivering the information modeled by the partition by creating a variable with as many values as there are blocks in the partition and associating a single value of the variable with all the symbols placed in a single block of the partition. For example, the partial information about states modeled by \(\pi_{S}^1\) can be delivered by a three-valued variable, which assumes first value (e.g. \(a\)) for states \(s_1\) and \(s_2\), second value \((b)\) for states \(s_3\) and \(s_4\) and the third value \((c)\) for states \(s_5\) and \(s_6\).

Exploring the correspondence between partitions and variables, the input, state and
output variables of a partial machine $M_i$ can be constructed using partitions $\pi^1_i$, $\pi^2_i$ and $\pi^3_{S \times I}$, correspondingly.

Construction of the input encoder $\Psi$

The construction of the input encoder is determined by the primary input information required by the partial machines. This information is defined by the input partitions $\pi^1_i$ and $\pi^2_i$. For example, $\pi^1_i$ implies that the first partial machine $M_1$ only needs information about the primary input that allows one to distinguish input combination 00 from all the other combinations. To supply this information, the input encoder needs to compute values of a two-valued variable associated with $\pi^1_i$, which will assume its first value (e.g. $i_1$) when the input is 00, and the second value ($i_2$) for all the other input combinations.

The complete function table of the input encoder $\Psi$ is given in Figure 3.5.

<table>
<thead>
<tr>
<th>$PI_1$</th>
<th>$PI_2$</th>
<th>$\pi^1_i$</th>
<th>$\pi^2_i$</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>$i_1$</td>
<td>$j_1$</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>$i_2$</td>
<td>$j_2$</td>
</tr>
<tr>
<td>1</td>
<td>0</td>
<td>$i_2$</td>
<td>$j_2$</td>
</tr>
<tr>
<td>1</td>
<td>1</td>
<td>$i_2$</td>
<td>$j_1$</td>
</tr>
</tbody>
</table>

Figure 3.5. Function table of input encoder $\Psi$

Construction of partial machines $M_1$ and $M_2$

Information available to partial machines. The partial information about the state and input $(S \times I)$ of the specification machine $M$ available to a partial machine $M_i$ is delivered by a combination of three information sources: information about $S \times I$ derived from its own state information (given by partition $\pi^i_{S \times I}$ on $S \times I$ induced by $\pi^i_S$), information about $S \times I$ derived from its own primary input information (given by partition $\pi^i_{S \times I}$ on $S \times I$ induced by $\pi^i_I$) and information imported from other partial machines ($\pi^i_{S \times I}$).

The induced partitions can easily be determined according to Def. 2.21 from input and state partitions $\pi^i_S$ and $\pi^i_I$. For example, the block of $\pi^i_{S \times I}$ induced by the first block $(s_1, s_2)$ of $\pi^i_S$ will include all state/input combinations involving states $s_1$ and $s_2$, i.e. $(s_1, 00) | (s_1, 01) | (s_1, 10) | (s_1, 11) | (s_2, 00) | (s_2, 01) | (s_2, 10) | (s_2, 11)$. This fact has a simple interpretation: knowing only that the specification machine $M$ is in either state $s_1$ or $s_2$, we are not able to distinguish whether the current state/input situation is $(s_1, 00)$, $(s_1, 01)$ ... or $(s_2, 11)$. The induced partitions therefore are the following (the symbols in parenthesis describe the values inducing a particular block):

$$\pi^i_{S \times I} = \{(s_1, 00) | (s_1, 01) | (s_1, 10) | (s_1, 11) | (s_2, 00) | (s_2, 01) | (s_2, 10) | (s_2, 11) | (s_1, s_2);$$
$$| (s_3, 00) | (s_3, 01) | (s_3, 10) | (s_3, 11) | (s_4, 00) | (s_4, 01) | (s_4, 10) | (s_4, 11) | (s_3, s_4);$$
$$| (s_5, 00) | (s_5, 01) | (s_5, 10) | (s_5, 11) | (s_6, 00) | (s_6, 01) | (s_6, 10) | (s_6, 11) | (s_5, s_6)\}$$
As discussed in Section 2.4, we use one-partition. Both partitions imported by \( M \) finally derived from the total primary input or state information (described by interconnections between the partial machines. Between these two limits any imported partition will allow a construction of that can be imported by a particular machine \( M \) is zero information, while the lower bound and upper bound on the information imported by \( M_2 \) is the entire output information of \( M_1 \). Therefore,

\[
\begin{align*}
\pi^{1}_{S\times I} &= \pi^{21}_{S\times I} = \pi_{S\times I}(1) \\
\pi^{2}_{S\times I} &= \pi^{12}_{S\times I} = \pi^{S}_{S\times I}
\end{align*}
\]

As discussed in Section 2.4, we use one-partition \( \pi_{S\times I}(1) \) to indicate the lack of information. Both partitions \( \pi^{i}_{S\times I} \) satisfy condition (3), with the partition \( \pi^{I}_{S\times I} \cdot \pi^{S}_{S\times I} \) describing

\[
\begin{align*}
\pi^{S}_{S\times I} &= \{(s_1, 00)(s_1, 01)(s_1, 10)(s_1, 11)(s_3, 00)(s_3, 01)(s_3, 10)(s_3, 11) \\
&\quad \ldots (s_5, 00)(s_5, 01)(s_5, 10)(s_5, 11)(s_1, s_3, s_5); \\
&\quad (s_2, 00)(s_2, 01)(s_2, 10)(s_2, 11)(s_4, 00)(s_4, 01)(s_4, 10)(s_4, 11) \\
&\quad \ldots (s_6, 00)(s_6, 01)(s_6, 10)(s_6, 11)(s_2, s_4, s_6)\} \\
\pi^{I}_{S\times I} &= \{(s_1, 00)(s_2, 00)(s_3, 00)(s_4, 00)(s_5, 00)(s_6, 00)(00); \\
&\quad (s_1, 01)(s_1, 10)(s_1, 11)(s_2, 01)(s_2, 10)(s_2, 11)(s_3, 01)(s_3, 10)(s_3, 11) \\
&\quad \ldots (s_4, 01)(s_4, 10)(s_4, 11)(s_5, 01)(s_5, 10)(s_5, 11)(s_6, 01)(s_6, 10)(s_6, 11) \\
&\quad (01, 10, 11)\} \\
\pi^{I}_{S\times I} &= \{(s_1, 00)(s_1, 11)(s_2, 00)(s_2, 11)(s_3, 00)(s_3, 11)(s_4, 00)(s_4, 11) \\
&\quad \ldots (s_5, 00)(s_5, 11)(s_6, 00)(s_6, 11)(00, 11); \\
&\quad (s_1, 01)(s_1, 10)(s_2, 01)(s_2, 10)(s_3, 01)(s_3, 10)(s_4, 01)(s_4, 10) \\
&\quad \ldots (s_5, 01)(s_5, 10)(s_6, 01)(s_6, 10)(01, 10)\}.
\end{align*}
\]
the global state and input information:

\[ \pi_{S \times I}^I \cdot \pi_{S \times I}^S = \{ (s_1, 00); (s_2, 00); (s_3, 00); (s_4, 00); (s_5, 00); (s_6, 00); (s_1, 11); (s_2, 11); (s_1, 01)(s_1, 10); (s_2, 01)(s_2, 10); (s_3, 11); (s_4, 11); (s_3, 01)(s_3, 10); (s_4, 01)(s_4, 10); (s_5, 11); (s_6, 11); (s_5, 01)(s_5, 10); (s_6, 01)(s_6, 10) \} \]

Using the above results, we can derive the partitions \( \pi_{S \times I}^I \) describing the combined information about the state and input of the specification machine available to a partial machine \( M_i \).

\[
\pi_{S \times I}^I = \pi_{S \times I}^S \cdot \pi_{S \times I}^1 \cdot \pi_{S \times I}^2 =
\begin{align*}
\{ & (s_1, 00)(s_2, 00); (s_1, 00)(s_4, 00); (s_5, 00)(s_6, 00); \\
& (s_1, 01)(s_1, 10)(s_1, 11)(s_2, 01)(s_2, 10)(s_2, 11); \\
& (s_3, 01)(s_3, 10)(s_3, 11)(s_4, 01)(s_4, 10)(s_4, 11); \\
& (s_5, 01)(s_5, 10)(s_5, 11)(s_6, 01)(s_6, 10)(s_6, 11) \\
\}
\end{align*}
\]

\[
\pi_{S \times I}^2 = \pi_{S \times I}^S \cdot \pi_{S \times I}^1 \cdot \pi_{S \times I}^2 =
\begin{align*}
\{ & (s_1, 00)(s_3, 00); (s_1, 00)(s_4, 11)(s_3, 11)(s_5, 11)(s_6, 00); \\
& (s_4, 00); (s_2, 11)(s_4, 11)(s_6, 11); \\
& (s_1, 01)(s_1, 10)(s_1, 01)(s_3, 10)(s_5, 01)(s_5, 10); \\
& (s_2, 01)(s_2, 10)(s_4, 01)(s_4, 10)(s_6, 01)(s_6, 10) \\
\}
\end{align*}
\]

**Construction of \( M_1 \).** In the same fashion as the partition \( \pi_I^1 \) defines the primary input alphabet of \( M_1 \) to be \( \{ i_1, i_2 \} \) (see the input encoder construction), partitions \( \pi_S^1 \) and \( \pi_{S \times I}^I \) determine the state and output alphabet of \( M_1 \), respectively. Thus, \( M_1 \) has three states defined by the three blocks of the state partition \( \pi_S^1 \). \( M_1 \) is in the state \( a \) corresponding to the first block of the state partition \( \pi_S^1 \) whenever the specification machine is in the state \( s_1 \) or \( s_2 \). Similarly, state \( b \) of \( M_1 \) is associated with the states \( s_3 \) and \( s_4 \) of the specification machine \( M \) and \( c \) with \( s_5 \) and \( s_6 \). Partition \( \pi_{S \times I}^I \) defines a ternary output variable, which for example assumes value \( a_2 \) defined by block \( (s_3, 00)(s_4, 00) \) when the specification machine \( M \) sees 00 on its input and is in either state \( s_3 \) or \( s_4 \).

The conditions of the General Decomposition Theorem guarantee that for the partial machine \( M_1 \) defined in this way it is possible to build STT consistent with the specification machine. We will show how these conditions are verified and use the results of the verification to construct the STT.

The total information available to a partial machine \( M_i \) is described by \( \pi_{S \times I}^i \). Each block of this partition therefore defines a single indivisible input situation recognized by the partial machine. It therefore corresponds to a single transition of \( M_i \). For example, block \( (s_3, 00)(s_4, 00) \) of \( \pi_{S \times I}^i \) corresponds to the situation in which the specification machine is in state \( s_3 \) or \( s_4 \) and the input is 00. In these conditions the state variable of \( M_1 \) assumes the value associated by \( \pi_S^1 \) with states \( s_3 \) and \( s_4 \) \( b \), while the primary input variable assumes the value associated by \( \pi_I^1 \) with the input 00 \( (i_1) \). It means that \( M_1 \) is in state \( b \) and its input is \( i_1 \). In this situation, the behavior of \( M_1 \) is determined by the requirement of consistency with the specification machine \( M \). For \( M_1 \) to behave in a man-
ner consistent with the specification machine, it has to transit to the state corresponding to the subset of the next states of the specification machine in these conditions. The subset of the next states of the specification machine is determined by \( \delta((s_3, 00); (s_4, 00)) = \{ \delta(s_3, 00); \delta(s_4, 00) \} = \{ s_1, s_2 \} \). This means that in these input/state conditions the specification machine transits to either \( s_1 \) or \( s_2 \). However, since \( M_1 \) does not recognize \( s_1 \) from \( s_2 \) anyway (\( s_1 \) and \( s_2 \) are placed in the same block of \( \pi_S^1 \) ), \( M_1 \) unambiguously performs the transition to its state associated with the block \( s_1, s_2 \) (a) of \( \pi_S^1 \). Note that in this way we have shown that for block \( (s_3, 00)(s_4, 00) \) of \( \pi_{S \times I}^2 \) there is a block of \( \pi_S^1 \) including \( \delta((s_3, 00)(s_4, 00)) \), and thus verified condition (1) of the theorem for the block \( (s_3, 00)(s_4, 00) \). At the same time, we have determined that the next state of \( M_1 \) in state \( b \) under input \( i_1 \) is \( a \). To determine the output value of \( M_1 \) we need to find a block of \( \pi_{S \times I}^1 \) unambiguously indicated by the current input/state conditions \( (s_3, 00)(s_4, 00) \). In this case it is the second block \( (s_3, 00)(s_4, 00) \) associated with the output value \( o_2 \). As a result, we have determined the block of \( \pi_{S \times I}^1 \) including the considered block of \( \pi_{S \times I}^2 \), which verifies the condition (2) for \( (s_3, 00)(s_4, 00) \) and determines the output of \( M_1 \) to be \( o_2 \).

Thus, we have constructed one transition of \( M_1 \) \( (i_1, b, a, o_2) \), i.e. transition 3 in Fig. 3.6. Following the same procedure, conditions (1) and (2) can be verified for the remaining blocks of \( \pi_{S \times I}^1 \) and the other transitions of \( M_1 \) can be determined. The complete STT of \( M_1 \) is given in Fig. 3.6.

**Construction of \( M_2 \).** In the case of \( M_1 \) we were able to neglect the information imported from the partial machines, since \( M_1 \) does not import any information \( (\pi_{S \times I}^1 = \pi_{S \times I}(1)) \). \( M_2 \) on the other hand imports the information from \( M_1 \) carried by a three-valued output variable of \( M_1 \) described by \( \pi_{S \times I}^2 \). The input alphabet of \( M_2 \) therefore becomes a combination of symbols delivered by the input encoding block \( \Psi \) and imported from machine \( M_1 \). It is described by the Cartesian product \( \{ j_1, j_2 \} \times \{ j_1', j_2', j_3' \} \), where \( j_1', j_2', j_3' \) are the values of the imported variable. Since \( M_2 \) directly imports the output of \( M_1 \) without any encoding, \( \{ j_1', j_2', j_3' \} = \{ o_1, o_2, o_3 \} \) and the input alphabet of \( M_2 \) becomes \( \{ j_1, j_2 \} \times \{ o_1, o_2, o_3 \} \).

The state alphabet of \( M_2 \) is determined by the blocks of partition \( \pi_S^2 \). Therefore, \( M_2 \) has two states: \( x \) corresponding to the states \( s_1, s_3 \) and \( s_5 \) of the specification machine;

<table>
<thead>
<tr>
<th>in ( (\pi_1^1) )</th>
<th>ps ( (\pi_1^2) )</th>
<th>ns ( (\pi_3^2) )</th>
<th>out ( (\pi_{S \times I}^2) )</th>
</tr>
</thead>
<tbody>
<tr>
<td>( i_1 )</td>
<td>a</td>
<td>a</td>
<td>o_1</td>
</tr>
<tr>
<td>( i_2 )</td>
<td>a</td>
<td>b</td>
<td>o_3</td>
</tr>
<tr>
<td>( i_1 )</td>
<td>b</td>
<td>a</td>
<td>o_2</td>
</tr>
<tr>
<td>( i_2 )</td>
<td>b</td>
<td>b</td>
<td>o_3</td>
</tr>
<tr>
<td>( i_1 )</td>
<td>c</td>
<td>c</td>
<td>o_1</td>
</tr>
<tr>
<td>( i_2 )</td>
<td>c</td>
<td>b</td>
<td>o_3</td>
</tr>
</tbody>
</table>

**Figure 3.6. STT of \( M_1 \)**

<table>
<thead>
<tr>
<th>in ( (\pi_1^1) )</th>
<th>imp ( (\pi_{S \times I}^2) )</th>
<th>ps ( (\pi_1^2) )</th>
<th>ns ( (\pi_3^2) )</th>
<th>out ( (\pi_{S \times I}^2) )</th>
</tr>
</thead>
<tbody>
<tr>
<td>j_1</td>
<td>o_1</td>
<td>x</td>
<td>x</td>
<td>p_1</td>
</tr>
<tr>
<td>j_2</td>
<td>o_2</td>
<td>x</td>
<td>y</td>
<td>p_1</td>
</tr>
<tr>
<td>j_1</td>
<td>o_3</td>
<td>x</td>
<td>x</td>
<td>p_1</td>
</tr>
<tr>
<td>j_2</td>
<td>o_2</td>
<td>x</td>
<td>y</td>
<td>p_1</td>
</tr>
<tr>
<td>j_1</td>
<td>o_1</td>
<td>y</td>
<td>x</td>
<td>p_2</td>
</tr>
<tr>
<td>j_1</td>
<td>o_2</td>
<td>y</td>
<td>x</td>
<td>p_1</td>
</tr>
<tr>
<td>j_2</td>
<td>o_1</td>
<td>y</td>
<td>y</td>
<td>p_2</td>
</tr>
</tbody>
</table>

**Figure 3.7. STT of \( M_2 \)**
and y corresponding to the states s2, s4 and s6.

To explain the transition determination for M2, let us consider the block

\((s_1, 00)(s_5, 00)\) of \(\pi_{S \times I}^{in} \). It corresponds to the state s1 or s5 of the specification machine with the primary input 00. In this situation, M2 is in state \(s_5\) (s1 and s5 are included in block \(s_1, s_3, s_5\) of \(\pi_2^S\) associated with state \(s_5\)). The output-component of its input is \(j_1\) (00 is included in the block 00, \(\Pi\) of \(\pi_2^I\) associated with the value \(j_1\)) and the imported component of the input is \(o_1\) ((s1, 00)(s5, 00) is a subset of the first block of \(\pi_{S \times I}^{in} \) associated with the value \(o_1\)).

The next states of the specification machine in (s1, 00) or (s5, 00) are s5 or s1, which are included in the first block (s1, s3, s5) of \(\pi_2^S\), which corresponds to the state x of M2. The output of M2 is determined by the block of \(\pi_{S \times I}^{in} \) including (s1, 00)(s5, 00), i.e. the first block of \(\pi_2^{S \times I}\) associated with the value \(p_1\). Thus, we have verified conditions (i) and (2) for the block (s1, 00)(s5, 00) and constructed transition ((j1, o1), x, x, p1), i.e. the first transition in Fig. 3.7. The verification of the conditions for the remaining blocks of \(\pi_{S \times I}^{in} \) results in STT of M2, presented in Fig. 3.7.

Interconnection network.

The interconnection network is responsible for the exchange of information between the partial machines. The functions implemented within the interconnection network (Coni) provide the mapping between the combined output of the partial machines and the imported input of a particular partial machine Mi. In this case, M1 does not import any information, so there’s no connection from M2 to M1, while M2 imports the entire output information of M1, so function Con2 is an identity function (a multi-valued “wire” from M1 output to M2 input).

Construction of the output decoder \(\Theta\).

The task of the output decoder is to determine the primary output value of the specification machine \(M\) based on the partial information about state/input delivered by the output variables of the partial machines \(M_i\). This partial information is a combination of the output information of the partial machines \(M_i\) and it is described by the partition \(\pi_{S \times I}\).

\[
\pi_{S \times I} = \pi_{S \times I}^1 \cdot \pi_{S \times I}^2 = \{(s_1, 00)(s_5, 00)(o_1, p_1); (s_2, 00)(s_6, 00)(o_1, p_2);
\]
\[
(s_3, 00)(s_4, 00)(o_2, p_1);
\]
\[
(s_1, 01)(s_1, 10)(s_2, 11)(s_3, 01)(s_3, 10)(s_3, 11)(s_4, 11)\ldots
\]
\[
(s_5, 01)(s_5, 10)(s_5, 11)(s_6, 11)(o_1, p_1);
\]
\[
(s_2, 01)(s_2, 10)(s_4, 01)(s_4, 10)(s_6, 01)(s_6, 10)(o_3, p_2)\}
\]

The interpretation of this partition is that when the machine M1 produces output \(o_1\) and the machine M2 produces output \(p_1\), the decoder "knows" that the specification machine \(M\) is either in state s1 or s5 and the input of the network is 00. However, since the decoder cannot distinguish between these two situations, the output of the specification machine for (s1, 00) and (s5, 00) has to be the same. This is the sense of
3.1. DECOMPOSITION OF COMPLETELY SPECIFIED FSM

condition (4). Consulting the STT of the specification machine $M$, we determine that the output for both $(s_1, 00)$ and $(s_5, 00)$ is indeed the same, i.e. 00, and therefore condition (4) is satisfied for the first block of $\pi_{S \times I}$. At the same time we determine that the output of the output decoder is 00 for the output $o_1$ of $M_1$ and $p_1$ of $M_2$, and thus also the primary output of $M$ is 00. The consideration of the remaining blocks of $\pi_{S \times I}$ leads to the output decoder function presented in Fig. 3.8.

\[
\begin{array}{cccc}
\pi^1_{S \times I} & \pi^2_{S \times I} & PO_1 & PO_2 \\
o_1 & p_1 & 0 & 0 \\
o_1 & p_2 & 1 & 1 \\
o_2 & p_1 & 0 & 0 \\
o_3 & p_1 & 1 & 1 \\
o_3 & p_2 & 0 & 1 \\
\end{array}
\]

Figure 3.8. Output decoder function

State behavior.

Finally, the product of the state partitions $\pi^1_S \cdot \pi^2_S$ of the partial machines is $\pi_S(0)$, which means that the combination of the states of the partial machines always indicates only a single state of the specification machine or is empty. Thus, in accordance with condition (5), the state behavior of the specification machine $M$ is also realized.

Resulting network.

All of the above information is sufficient to construct the decomposition network composed of the partial machines $M_1$ and $M_2$, input encoder $\Psi$, output decoder $\Theta$ and interconnection $C_{}\cap 0\cap 2$, which realizes the state and output behavior of the original machine $M$. This network is depicted in Fig. 3.9.

The combined behavior of the network of partial machines $M_1$ and $M_2$ is described by a composition machine in Fig. 3.10. Let us consider, for example, the situation when $M_1$ is in state $a$, $M_2$ is in state $x$ and the output of the input encoder $\Psi$ is $(i_1, j_1)$. This corresponds to the situation when the specification machine is in state $s_1$ (the intersection of the block $s_1, s_2$ corresponding to the state $a$ of $M_1$ and block $s_1, s_3, s_5$ corresponding to the state $x$ of $M_2$) with the input $00$ (see the input encoder construction). This situation triggers the transition $(i_1, a, a, o_1)$ in $M_1$. Since the output of $M_1$ is $o_1$ in $M_2$ the transition $((j_1, o_1), x, x, p_1)$ is triggered. The resulting next state of the network therefore is $(a, x)$ (which translates to the state $s_1$ of the specification machine) and the output is $(o_1, p_1)$ (which is decoded to 00 by the output decoder block). Comparing this with Fig. 3.4 we can verify that this is consistent with the specification machine $M$.

Output-only behavior realization.

It can be verified that all conditions except (5) of Theorem 3.1 are still satisfied when $\pi^1_S$ in the trinity $(\pi^1_S, \pi^2_S, \pi^1_{S \times I})$ is replaced by $\{s_1, s_2, s_5, s_6(a'); s_3, s_4(b)\}$, without any changes to the other partitions in the two trinities. It means that with the modified $\pi^1_S$ it is possible to realize the output behavior of the specification machine without realizing
3. GENERAL DECOMPOSITION THEOREM

its state behavior. Note that this change is interpreted as merging the states of \( a(s_1, s_2) \) and \( c(s_5, s_6) \) of the original \( M_1 \) into a new state \( a' \), so it is equivalent to implicit state minimization. Indeed, upon inspection of STT for \( M_1 \) it can be observed that the states \( a \) and \( c \) are equivalent.

The partitions affected by the change of \( \phi^1_3 \) are

\[
\pi_{S \times I}^1 = \{ (s_1, 00)(s_1, 01)(s_1, 10)(s_1, 11)(s_2, 00)(s_2, 01)(s_2, 10)(s_2, 11) \ldots \\
\ldots (s_5, 00)(s_5, 01)(s_5, 10)(s_5, 11)(s_6, 00)(s_6, 01)(s_6, 10)(s_6, 11) \\
(s_1, s_2, s_5, s_6); \\
(s_3, 00)(s_3, 01)(s_3, 10)(s_3, 11)(s_4, 00)(s_4, 01)(s_4, 10)(s_4, 11)(s_3, s_4) \}
\]

\[
\pi_{S \times I}^m = \{ (s_1, 00)(s_1, 01)(s_1, 10)(s_1, 11)(s_2, 00)(s_2, 01)(s_2, 10)(s_2, 11) \\
\ldots (s_5, 00)(s_5, 01)(s_5, 10)(s_5, 11)(s_6, 00)(s_6, 01)(s_6, 10)(s_6, 11) \\
(s_3, 00)(s_3, 01)(s_3, 10)(s_3, 11)(s_4, 00)(s_4, 01)(s_4, 10)(s_4, 11) \}
\]

\[
\pi_{S \times I}^* = \{ (s_1, 00)(s_1, 01)(s_1, 10)(s_1, 11)(s_3, 00)(s_3, 01)(s_3, 10)(s_3, 11) \\
(s_2, 00)(s_2, 01)(s_2, 10)(s_2, 11)(s_4, 00)(s_4, 01)(s_4, 10)(s_4, 11) \}
\]

Figure 3.9. Scheme of the example decomposition
Considering the modified input partition $\pi_{S\times I}^{in}$ of $M_1$, we see that in the first block of $\pi_{S\times I}^{in}$ being $(s_1,00)(s_2,00)(s_5,00)(s_6,00)$ the next state for symbols $(s_1,00)$ and $(s_2,00)$ is $s_1$, while for $(s_5,00)$ and $(s_6,00)$ it is $s_5$. However, $\pi_{S}^{1}$ has states $s_1$ and $s_5$ in one block ($a'$), so the block $(s_1,00)(s_2,00)(s_5,00)(s_6,00)$ still unambiguously identifies the next state of $M_1$ to be $a'$. The remaining blocks of $\pi_{S\times I}^{in}$ lead to the states $b$, $a'$ and $b$, respectively. Thus, condition (i) remains satisfied for $\pi_{S\times I}^{in}$ . Furthermore, $\pi_{S\times I}^{in}$ is still smaller-or-equal-to $\pi_{S\times I}^{1}$, so condition (2) is also satisfied. Since $\pi_{S\times I}^{1} = \pi_{S\times I}(1)$, and $\pi_{S\times I}$ remains unchanged, conditions (3) and (4) are not affected by the state minimization. Finally, the product state partition $\pi_S = \{s_1,s_5; s_2,s_6; s_3,s_4\}$, is not a zero-partition, so the state behavior of $M$ is not realized, as the combination of the partial machines does not recognize state $s_1$ from $s_5$ and $s_2$ from $s_6$. **Figure 3.10.** STT of the composition machine

<table>
<thead>
<tr>
<th>in</th>
<th>ps</th>
<th>ns</th>
<th>out</th>
</tr>
</thead>
<tbody>
<tr>
<td>(00)</td>
<td>$i_1j_1$</td>
<td>$ax$ $(s_1)$</td>
<td>$ax$ $(s_1)$</td>
</tr>
<tr>
<td>(01)</td>
<td>$i_2j_2$</td>
<td>$ax$ $(s_1)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(10)</td>
<td>$i_2j_2$</td>
<td>$ax$ $(s_1)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(11)</td>
<td>$i_2j_1$</td>
<td>$ax$ $(s_1)$</td>
<td>$bx$ $(s_3)$</td>
</tr>
<tr>
<td>(00)</td>
<td>$i_1j_1$</td>
<td>$ay$ $(s_2)$</td>
<td>$ax$ $(s_1)$</td>
</tr>
<tr>
<td>(01)</td>
<td>$i_2j_2$</td>
<td>$ay$ $(s_2)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(10)</td>
<td>$i_2j_2$</td>
<td>$ay$ $(s_2)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(11)</td>
<td>$i_2j_1$</td>
<td>$ay$ $(s_2)$</td>
<td>$bx$ $(s_3)$</td>
</tr>
<tr>
<td>(00)</td>
<td>$i_1j_1$</td>
<td>$bx$ $(s_3)$</td>
<td>$ay$ $(s_2)$</td>
</tr>
<tr>
<td>(01)</td>
<td>$i_2j_2$</td>
<td>$bx$ $(s_3)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(10)</td>
<td>$i_2j_2$</td>
<td>$bx$ $(s_3)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(11)</td>
<td>$i_2j_1$</td>
<td>$bx$ $(s_3)$</td>
<td>$bx$ $(s_3)$</td>
</tr>
<tr>
<td>(00)</td>
<td>$i_1j_1$</td>
<td>$by$ $(s_4)$</td>
<td>$ax$ $(s_1)$</td>
</tr>
<tr>
<td>(01)</td>
<td>$i_2j_2$</td>
<td>$by$ $(s_4)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(10)</td>
<td>$i_2j_2$</td>
<td>$by$ $(s_4)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(11)</td>
<td>$i_2j_1$</td>
<td>$by$ $(s_4)$</td>
<td>$bx$ $(s_3)$</td>
</tr>
<tr>
<td>(00)</td>
<td>$i_1j_1$</td>
<td>$cx$ $(s_5)$</td>
<td>$cx$ $(s_5)$</td>
</tr>
<tr>
<td>(01)</td>
<td>$i_2j_2$</td>
<td>$cx$ $(s_5)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(10)</td>
<td>$i_2j_2$</td>
<td>$cx$ $(s_5)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(11)</td>
<td>$i_2j_1$</td>
<td>$cx$ $(s_5)$</td>
<td>$bx$ $(s_3)$</td>
</tr>
<tr>
<td>(00)</td>
<td>$i_1j_1$</td>
<td>$cy$ $(s_6)$</td>
<td>$cx$ $(s_5)$</td>
</tr>
<tr>
<td>(01)</td>
<td>$i_2j_2$</td>
<td>$cy$ $(s_6)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(10)</td>
<td>$i_2j_2$</td>
<td>$cy$ $(s_6)$</td>
<td>$by$ $(s_4)$</td>
</tr>
<tr>
<td>(11)</td>
<td>$i_2j_1$</td>
<td>$cy$ $(s_6)$</td>
<td>$bx$ $(s_3)$</td>
</tr>
</tbody>
</table>
3.2 Decomposition of incompletely specified FSM

3.2.1 Extensions of GDT

What is central to the General Decomposition Theorem (GDT) for completely specified FSMs is the concept of a partition pair, which reflects the possibility to compute some partial information from another partial information, i.e. compute information modeled by one partition using information modeled by some other partition. Let us consider how this concept changes with the introduction of nondeterminism, output don’t-cares and multi-state realizations.

Partitions are no longer sufficient to express don’t-care conditions for state and output. For this reason, they need to be replaced with more general unique-block covers (simply called covers in the following).

Nondeterminism

The definition of \( S \times I \rightarrow S \) partition pair uses block transition function \( \delta(D) \), which calculates a set of next states (combined next state) for a subset \( D \) of \( S \times I \). In the case of a nondeterministic machine each element from \( S \times I \) leads not to a single next state, but rather to a choice of next states. Conceptually, an element of \( S \times I \) can lead to any subset of its next-states-set. To establish the possible sets of next states for \( D \), we need to consider a combination of all subsets of all the next-state-sets for all \( S \times I \) elements in \( D \). Such a combination is performed by the operator \( \cup \).

**Definition 3.7 (Operator \( \cup \)).** \( \cup : 2^{S^n} \rightarrow 2^{2^S} \) and

\[
\cup([D_i]_{i=1}^n) = \{ B : \exists C^1 \subseteq D_1, C^1 \neq \emptyset \ldots \exists C^n \subseteq D_n, C^n \neq \emptyset : B = \bigcup_{i=1}^n C^n \}
\]

Operator \( \cup \) accepts a vector of subsets of \( S \). It returns all the possible combinations of sums of the non-empty subsets of \( S \), provided that each subset is taken from another vector position.

Ex. 3.2.1.

\[
\cup\left(\begin{bmatrix} \{1,2,3\} \\ \{1,4\} \end{bmatrix}\right) = \{ \{1\}, \{1,2\}, \{1,3\}, \{1,4\}, \{2,4\}, \{3,4\}, \\
\{1,2,3\}, \{1,2,4\}, \{2,3,4\}, \{1,2,3,4\}\}
\]

For the example of the application of operator \( \cup \) to the block transition function see also Example 3.2.3.

**Definition 3.8 (Operator \( \cap \)).** \( \cap : 2^{2^{S^n}} \rightarrow 2^S \) and

\[
\cap([C_i]_{i=1}^n) = \{ B : \exists C^1 \in D_1 \ldots \exists C^n \in D_n : B = \bigcap_{i=1}^n C^n \}
\]

Operator \( \cap \) accepts a vector of sets of subsets of \( S \). It returns all the possible combinations of intersections of the subsets of \( S \), provided that each intersected subset is taken from another vector position.
3.2. DECOMPOSITION OF INCOMPLETELY SPECIFIED FSM

Ex. 3.2.2. For \( S = \{1 \ldots 5\} \)
\[
\cap \left( \left[ \begin{array}{c}
\{1\} \{1, 2\} \{2, 3\} \{4\} \\
\{2, 3\} \{4, 5\} \\
\{1, 2, 4, 5\}
\end{array} \right] \right) = \{1\}, \{1, 2\}, \{2\}, \{4\}
\]

Block transition function \( \bar{\delta} \) now becomes
\[
\bar{\delta} : 2^S \times I \rightarrow 2^S \quad \text{and} \quad \bar{\delta}(D) = \bigcup_{d \in D} \delta(d)
\]

To adjust the partition pair concept to this situation, we need to use covers and ensure that at least one of the possible next-state-sets for \( D \) is included in a block of \( \phi_S \) cover, i.e. there is a choice of the next states for each state/input combination in \( D \) such that the resulting next-state-sets lead to a block of \( \phi_S \).

For this purpose we introduce a "cover transition function", which returns those next state blocks of \( \phi_S \) that are included in a block of \( \phi_S \).

Definition 3.9 (Cover transition function).
\[
\Delta^{\phi_S} : 2^{S \times I} \rightarrow 2^{\phi_S} \quad \text{and} \quad \Delta^{\phi_S}(D) = \{B' \in \phi_S | \exists B'' \in \bar{\delta}(D) : B'' \subseteq B'\}
\]

The \( S \times I - S \) cover pair condition can then be formulated as
\[
\forall D \in \phi_{S \times I} : \Delta^{\phi_S}(D) \neq \emptyset
\]

Note that if the machines are deterministic, this condition is exactly equivalent to the partition pair condition.

<table>
<thead>
<tr>
<th>in</th>
<th>ps</th>
<th>ns</th>
<th>out</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>a</td>
<td>b</td>
<td>c</td>
</tr>
<tr>
<td>1</td>
<td>a</td>
<td>a</td>
<td>01</td>
</tr>
<tr>
<td>0</td>
<td>b</td>
<td>a, c</td>
<td>11</td>
</tr>
<tr>
<td>1</td>
<td>b</td>
<td>b</td>
<td>01</td>
</tr>
<tr>
<td>0</td>
<td>c</td>
<td>a</td>
<td>10, 01</td>
</tr>
<tr>
<td>1</td>
<td>c</td>
<td>b</td>
<td>10, 01, 11</td>
</tr>
</tbody>
</table>

Figure 3.11. Incompletely specified sequential machine

Ex. 3.2.3. For machine in Fig. 3.11 and \( \phi_{S \times I} = \{(a, 0)(b, 0); (a, 1)(b, 1)(c, 0)(c, 1)\} = \{D_1; D_2\} \)
\[
\bar{\delta}(D_1) = \bigcup \left[ \begin{array}{c}
\delta(a, 0) \\
\delta(b, 0)
\end{array} \right] = \bigcup \left[ \begin{array}{c}
\{b, c\} \\
\{a, c\}
\end{array} \right] = \{a, b\}, \{a, c\}, \{b, c\}, \{c\}, \{a, b, c\}
\]
\[
\bar{\delta}(D_2) = \bigcup \left[ \begin{array}{c}
\delta(a, 1) \\
\delta(b, 1) \\
\delta(c, 0) \\
\delta(c, 1)
\end{array} \right] = \bigcup \left[ \begin{array}{c}
\{a\} \\
\{b\} \\
\{a\} \\
\{b\}
\end{array} \right] = \{a, b\}
\]

Therefore, for \( \phi_S = \{a, b, c; b, c\} \), \( \Delta^{\phi_S}(D_1) = \{a, b, c; b, c\} \neq \emptyset \) and \( \Delta^{\phi_S}(D_2) = \{a, b, c\} \neq \emptyset \), which means that \( \phi_{S \times I} \) and \( \phi_S \) form a \( S \times I - S \) pair.
Furthermore, due to the fact that there is a choice of next states for the composition machine, it is not sufficient for a partial machine to move to a state consistent with any of the possible next states of \( M \). All partial machines need to move to their respective states consistent with the same subsets of the \( M \)'s next states. Otherwise, the product of the next states of partial machines is empty and the state of \( M \) is unspecified.

This leads to the concept of a synchronized set of cover pairs. A set of cover pairs is synchronized if and only if each of the pairs is a cover pair, and each of the blocks of the first covers leads to such blocks of the second covers that the intersection of the blocks of the first covers leads to the intersection of the blocks of the second covers. Note that synchronized sets of cover pairs is a stronger requirement than the cover pair of the product of the first covers with the product of the second covers.

**Definition 3.10 (Synchronized set of \( S \times I - S \) cover pairs).** \( ([\phi_{S \times I}^i], [\phi_{S}^i]) \) is a synchronized \( S \times I - S \) cover pair if and only if

1. \( \forall i \forall D_i \in \phi_{S \times I}^i \Delta^{\phi_{S}^i}(D_i) \neq \emptyset \) (cover pair condition)
2. \( \forall i \forall D^i \in \phi_{S \times I} \exists B_{D_i}^i \in \Delta^{\phi_{S}^i}(D^i) : \forall [D^i], i \in \times_i \phi_{S \times I}^i : \bigcap_i D_i \neq \emptyset \Rightarrow \exists B_D^i \in \bigcap_i B_{D_i}^i \exists B' \in \Delta^{\Pi_i \phi_{S}^i} \left( \bigcap_i D^i \right) : B' = B_D' \) (synchronization condition)

Taking all this into account, the condition (i) of GDT can be rewritten as

\( ([\phi_{S \times I}^0], [\phi_{S}^0]) \) is a synchronized set of \( S \times I - S \) cover pairs

For an example of synchronized set of cover pairs see Section 3.4.1.

**Output don’t-cares**

Similarly, the consideration of output don’t cares requires the adjustment of a block output function \( \overline{\lambda} \) and introduction of a cover output function \( \Lambda^{\phi_O} \).

**Definition 3.11 (Block output function).**

\[ \overline{\lambda} : 2^{S \times I} \rightarrow 2^O \quad \text{and} \quad \overline{\lambda}(D) = \bigcup_{d \in D} \lambda(d) \]

\( \overline{\lambda} \) computes a set of all possible combinations of choices between values of output for elements of \( D \).

**Definition 3.12 (Cover output function).**

\[ \Lambda^{\phi_O} : 2^{S \times I} \rightarrow 2^{\phi_O} \quad \text{and} \quad \Lambda^{\phi_O}(D) = \{ C \in \phi_O \mid \exists C' \in \overline{\lambda}(D) : C' \subseteq C \} \]

\( \Lambda^{\phi_O} \) returns those blocks of \( \phi_O \) that can be calculated as outputs for input combinations in \( D \), allowing for don’t-care assignment.

**Definition 3.13 (\( S \times I - O \) cover pair).** \( (\phi_{S \times I}, \phi_O) \) is a \( S \times I - O \) cover pair if and only if

\[ \forall D \in \phi_{S \times I} : \Lambda^{\phi_O}(D) \neq \emptyset \]
Ex. 3.2.4. For machine and $\phi_{S \times I}$ from Example 3.2.3

\[
\begin{align*}
\chi(D_1) &= \bigcup \begin{bmatrix}
\lambda(a,0) \\
\lambda(b,0)
\end{bmatrix} = \bigcup \begin{bmatrix}
\{00,11\} \\
\{11\}
\end{bmatrix} = \{\{00,11\}; \{11\}\} \\
\chi(D_2) &= \bigcup \begin{bmatrix}
\lambda(a,1) \\
\lambda(b,1) \\
\lambda(c,0) \\
\lambda(c,1)
\end{bmatrix} = \bigcup \begin{bmatrix}
\{01\} \\
\{01\} \\
\{10,01\} \\
\{10,01,11\}
\end{bmatrix} = \{\{01\}; \{10,01\}; \{01,11\}; \{10,01,11\}\}
\end{align*}
\]

Therefore, for $\phi_O = \{01; 11; 00, 10\}$, $\Lambda^{\phi_O}(D_1) = \{11\} \neq \emptyset$ and $\Lambda^{\phi_O}(D_2) = \{01\} \neq \emptyset$, which means that $\phi_{S \times I}$ and $\phi_O$ form a $S \times I - O$ pair.

Since output don’t cares can prevent some output symbols from being produced, the cover specialization concept needs to be introduced to allow for the reduction of the output alphabet.

**Definition 3.14 (Cover specialization)**. Specialization of cover $\phi_S$ is a cover $\phi_{S'}$ such that

\[
\begin{align*}
S' &\subseteq S \\
&\wedge \forall A \in \phi_S \exists B \in \phi_{S'} : B \subseteq A \\
&\wedge \forall B \in \phi_{S'} \exists A \in \phi_S : A \supseteq B
\end{align*}
\]

Cover is specialized by removing some elements from its blocks.

With this in mind, condition (4) of the GDT becomes:

$\phi_O$ has specialization $\phi_{O'}$ and $(\phi_{S \times I}, \phi_{O'})$ is a $S \times I - O'$ cover pair.

**Multi-state realization**

**Definition 3.15 (Multi-state realization)**. Incompletely specified machine $M_l(I_l, S_l, O_l, \delta_l, \lambda_l)$ is a multi-state realization of i.s. machine $M(I, S, O, \delta, \lambda)$ if and only if

1. $I_l = I$
2. $O_l = O$
3. there is an injective copy-assignment function $L : S \rightarrow 2^{S_l}$, which establishes correspondence between states in $M$ and their copies in $M_l$
4. $\delta_l$ is a multi-state realization of $\delta$, i.e.
   $$\forall s \in S \forall x \in I \forall s_l \in L(s) \quad \delta(s, x) \supseteq L'()$$
5. $\lambda_l$ is a multi-state realization of $\lambda$, i.e.
   $$\forall s \in S \forall x \in I \forall s_l \in L(s) \quad \lambda(s, x) \supseteq \lambda_l(s_l, x)$$

where $L' : 2^{S_l} \rightarrow 2^S$ is inverse of $L$ defined as $L'(A_l) = \{ s \in S | L(s) \cap A_l \neq \emptyset \}$
### Ex. 3.2.5.
Consider the state transition table in Fig. 3.12. It was created by splitting states $a$ and $c$ of the machine in Fig. 3.11 into its two copies: $a_1, a_2$ and $c_1, c_2$, respectively. It can be observed that outputs of the copies of the states are subsets of the outputs of their originals (under the same input) and that the next-states of the copies of the states are some copies of the next-states of the originals.

The possibility of multi-state realization means that there are multiple states of the network of partial machines (i.e. of the composition machine) corresponding to a single state in the specification machine. We treat these multiple states as the "labeled copies" of a particular state of the specification machine. Note that while we introduce this concept for incompletely specified FSMs, it is also valid for completely specified machines as a special case — their decomposition may also have multiple states corresponding to a single state of the original machine.

The most immediate consequence for the conditions of GDT is the expansion of the machine’s state set from $S$ to $S_l$, and input space from $S \times I$ to $S_l \times I$.

This introduces additional freedom to the composition machine. For the behavior of the network to be consistent with the behavior of the specification machine, it is sufficient for copies of a certain state to transit to some copies of its next state, and produce a compatible output under the same input. This freedom can be exploited to optimize the FSM implementation or to resolve some implementation problems (e.g. race-free assignment for asynchronous level-mode circuits using multi-state assignment called “multiple-row assignment”). We recall that the partition (or cover) pair concept corresponds to the ability of partial machines to perform transitions under the given input conditions (a block of input cover $\phi_i^{in}_{S \times I}$) to their next states (blocks of state cover $\phi_i^{out}$) in a manner consistent with the behavior of the specification machine. For this consistency to be preserved in the case of labeled covers $\phi_i^{in}_{S_l \times I}$ and $\phi_i^{out}$. It is sufficient if unlabeled blocks of $\phi_i^{in}_{S_l \times I}$ transit to unlabeled blocks of $\phi_i^{out}$. This can be expressed by an adequate reformulation of the cover transition and cover output functions to correctly operate on covers on labeled copies of the states.

**Definition 3.16 (Labeled cover transition function).**

$$\Delta^{\phi_i}_{S_l} : 2^{S_l \times I} \rightarrow 2^{\phi_i} \quad \text{and} \quad \Delta^{\phi_i}_{S_l} (D) = \{ B'_l \in \phi_i | \exists B' \in \delta(L'(D)) : B' \subseteq \delta(L'(B'_l)) \}$$
Definition 3.17 (Labeled cover output function).

\[ \Lambda^{\phi_O} : 2^S \times I \rightarrow 2^\phi_O \quad \text{and} \quad \Lambda^{\phi_O}(D) = \{ C | C \in \phi_O \land \exists C' \in \overline{\Lambda(L'(D))} : C' \subseteq C \} \]

As a consequence, cover pairs for the multi-state realizations become the following:

Definition 3.18 (S_i \times I - S_i cover pair). \( (\phi_{S_i \times I}, \phi_{S_i}) \) is a \( S_i \times I - S_i \) cover pair if and only if

\[ \forall D \in \phi_{S_i \times I} : \Delta^{\phi_{S_i}}(D) \neq \emptyset \]

Definition 3.19 (S_i \times I - O cover pair). \( (\phi_{S_i \times I}, \phi_O) \) is a \( S_i \times I - O \) cover pair if and only if

\[ \forall D \in \phi_{S_i \times I} : \Lambda^{\phi_O}(D) \neq \emptyset \]

Ex. 3.2.6. Consider machine M in Fig. 3.11 with state a split (labeled) into two copies \( a1 \) and \( a2 \) and covers \( \phi_{S_i \times I} = \{ (a1, 1)(c, 0); (a2, 1)(c, 0)(b, 0); (a1, 0)(a2, 0)(b, 1)(c, 1) \} \) \( \overline{\{ D_1; D_2; D_3 \}} \) and \( \phi_{S_i} = \{ a1, c, a2, b, c \} \). The unlabeled version of \( D_1 \) is \( L'(D_1) = (a, 1)(c, 0) \). For these two input combinations \( M \) moves to

\[ \overline{\delta(L'(D_1))} = \bigcup \begin{bmatrix} \delta(a, 1) \\ \delta(c, 0) \end{bmatrix} = \bigcup \begin{bmatrix} \{ a \} \\ \{ a \} \end{bmatrix} = \{ \{ a \} \} \]

After labeling, block \( \{ a \} \) can fit into either \( a1, c \) or \( a2 \) of \( \phi_{S_i} \), so \( \Delta^{\phi_{S_i}}(D_1) = \overline{\{ a1, c, a2 \}} \). For \( D_2 \) and \( D_3 \): \( L'(D_2) = (a, 1)(b, 0)(c, 0) \) and \( L'(D_3) = (a, 0)(b, 1)(c, 1) \).

\[ \overline{\delta(L'(D_2))} = \bigcup \begin{bmatrix} \delta(a, 1) \\ \delta(b, 0) \\ \delta(c, 0) \end{bmatrix} = \bigcup \begin{bmatrix} \{ a \} \\ \{ a, c \} \\ \{ a \} \end{bmatrix} = \{ \{ a \}; \{ a, c \} \} \quad \text{and} \]

\[ \Delta^{\phi_{S_i}}(D_2) = \{ a1, c, a2 \}. \]

\[ \overline{\delta(L'(D_3))} = \bigcup \begin{bmatrix} \delta(a, 0) \\ \delta(b, 1) \\ \delta(c, 1) \end{bmatrix} = \bigcup \begin{bmatrix} \{ b, c \} \\ \{ b \} \\ \{ b \} \end{bmatrix} = \{ \{ b \}; \{ b, c \}; \} \quad \text{and} \]

\[ \Delta^{\phi_{S_i}}(D_3) = \{ b, c \}. \]

Therefore, \( \phi_{S_i \times I} \) and \( \phi_{S_i} \) form a \( S_i \times I - S_i \) pair.

Finally, the fact that there are multiple instances of the same state necessitates the introduction of the \( S_i - S \) cover pair. It expresses the fact that blocks of \( \phi_{S_i} \) only include the copies of the states that are included in the same blocks of \( \phi_S \).

Definition 3.20 (\( S_i - S \) cover pair). \( (\phi_{S_i}, \phi_S) \) is a \( S_i - S \) cover pair if and only if

\[ \forall D \in \phi_{S_i} \exists B \in \phi_S \quad L'(D) \subseteq B \]
3.2.2 General Decomposition Theorem

Using the above discussed extensions necessary to account for nondeterminism (next-state don’t-cares), incompletely specified output function (output don’t-cares) and multi-state realizations, the General Decomposition Theorem for the multi-state realizations of incompletely specified nondeterministic FSMs can be formulated as follows:

Let \( \phi_O \) and \( \phi_S \) be covers on \( M = (I, S, O, \delta, \lambda) \) on \( O \) and \( S \), respectively. Let \( \phi^i_1, \phi^j_1, \phi^i_S, \phi^j_S \) be covers on multi-state specialization \( M_i = (I, S_i, O, \delta_i, \lambda_i) \) of machine \( M \), on \( I, S_i \) and \( S_i \times I \), respectively. Let cover \( \phi_O \) be a partition defined by the following expression:

\[
|o|\phi_O = |p|\phi_O \iff \forall s \in S \forall x \in I: o \in \lambda(s, x) \iff p \in \lambda(s, x) \quad (3.2.1)
\]

(\( \phi_O \) groups in one block the output symbols, which are not distinguished by any input combination).

Let \( \phi^i_S \times I \) be covers of \( S_i \times I \) such that \( \phi^i_S \times I \geq \phi^j_S \times I \) and \( \phi^i_S \times I = \prod_{i=1}^{n} \phi^i_S \times I \) and \( \phi^j_S \times I \). Let \( \phi^i_S \), \( \phi^j_S \), \( \phi^i_X \) and \( \phi^j_X \) be covers induced on \( S_i \times I \) by \( \phi^i_S \) and \( \phi^j_S \), respectively. Let \( \phi^i_S \times I = \prod_{i=1}^{n} \phi^i_S \times I \) and \( \phi^j_S \times I = \prod_{i=1}^{n} \phi^j_S \times I \). Let \( \phi^i_S \times I = \phi^i_S \times I \cdot \phi^i_S \times I \) and \( \phi^j_S \times I = \phi^j_S \times I \cdot \phi^j_S \times I \).

**Theorem 3.2 (General full-decomposition of incompletely specified FSM).** A sequential machine \( M = (I, S, O, \delta, \lambda) \) has a general full-decomposition with the output behavior realization with \( n \) component machines if and only if \( n \) input covers \( \phi_i^1 \), \( n \) state covers \( \phi_S^i \), and \( n \) labels assignments \( \phi_{O} \) exist such that \( n \) trinities of covers \( (\phi^i_S, \phi^j_S, \phi^i_X \times I) \) exist that satisfy the following conditions:

1. \((\phi^i_S \times I, \phi^j_S \times I)\) is an \( S_i \times I - S_i \) synchronized cover set pair
2. \( \phi^i_S \times I \leq \phi^j_S \times I \)
3. \( \phi^i_S \times I \cdot \phi^j_S \times I \leq \phi^j_S \times I \)
4. \( \phi_O \) has specialization \( \phi_{O} \) and \( (\phi^i_S \times I, \phi^j_S \times I) \) is a \( S_i \times I - O' \) cover pair

Additionally, if

5. \((\prod_i \phi^i_S, \phi_S(0))\) is an \( S_i - S \) cover pair

then the state behavior of \( M \) will be realized too.

3.3 Proof of the General Decomposition Theorem

3.3.1 Forward proof

For more clarity, we will conduct the forward part of the proof in two steps. First, we will introduce a machine \( M^* \) that is built based on the covers mentioned in the Theorem and represents the combined behavior of the partial machines. We will show that, if the conditions of the theorem are fulfilled, \( M^* \) is a realization of the specification machine
M. Then, we will show that a network of partial machines built based on the covers in
the theorem is indeed equivalent to \( M^* \), and hence is a realization of M.

In each of the steps, we will propose appropriate transition and output functions (\( \delta \) and \( \lambda \) for the discussed machines and show that under the conditions of the theorem
they are well defined. Then, we will show that thus defined machines are realizations
of the specification machine in the light of the definitions of the behavior realization
(Defs. 2.8 and 2.9).

In the proof, we will use the following lemmas.

Lemma 3.3.1. \( \forall S, T : \phi_S^1 \leq \phi_S^2 \Rightarrow \text{ind}_{S \times T}^S(\phi_S^1) \leq \text{ind}_{S \times T}^S(\phi_S^2) \)

Proof.
\[ \phi_S^1 \leq \phi_S^2 \iff \forall A^1 \in \phi_S^1 \exists A^2 : A^1 \subseteq A^2 \]
additionally, \( A^1 \subseteq A^2 \Rightarrow A^1 \times T \subseteq A^2 \times T \) hence,
\[ \forall B^1 \in \{ A^1 \times T \mid A^1 \in \phi_S^1 \} \exists B^2 \in \{ A^2 \times T \mid A^2 \in \phi_S^2 \} : B^1 \subseteq B^2 \]
by definition of induced cover (Def. 2.21)
\[ \forall B^1 \in \text{ind}_{S \times T}^S(\phi_S^1) \exists B^2 \in \text{ind}_{S \times T}^S(\phi_S^2) : B^1 \subseteq B^2 \]
which means that \( \text{ind}_{S \times T}^S(\phi_S^1) \leq \text{ind}_{S \times T}^S(\phi_S^2) \)

Lemma 3.3.2.
\[ \forall \phi_S, \forall D_l \subseteq S_l \times I \forall (s_l, x) \in D_l : \Delta^{\phi_S}(D_l) \neq \emptyset \Rightarrow \]
\[ \forall B'_l \in \Delta^{\phi_S}(D_l) \exists s'_l \in B'_l : s'_l \in L(\delta(L'(s_l), x)) \]
(for each element \((s_l, x)\) of \( D_l \), each block of \( \Delta^{\phi_S}(D_l) \) contains a copy of the specification
machine’s next state for the unlabeled version of \((s_l, x)\))

Proof. By definition of \( \Delta^{\phi_S} \), it only returns those blocks of \( \phi_S \), which, after unlabeling,
entirely include a block of \( \delta(L'(D_l)) \) (i.e. contain at least copies of all elements of a block
of \( \delta(L'(D_l)) \)). By definition of \( \delta \), each block of \( \delta(D) \) contains a next state for each element
\( d \in D \). Therefore, each block of \( \delta(L'(D_l)) \) contains a next state for the unlabeled version
of each element \( d_l \in D_l \). Therefore, for each block returned by \( \Delta^{\phi_S} \), it contains a copy
of the next state for the unlabeled version of each element of \( D_l \).

Lemma 3.3.3. For the specialization \( \phi_O' \) of \( \phi_O \) defined by (3.2.1)
\[ \forall D \subseteq S \times I \forall (s_l, x) \in D : \Lambda^{\phi_O'}(D) \neq \emptyset \Rightarrow \forall Y \in \Lambda^{\phi_O'}(D) : Y \subseteq \lambda(L'(s_l), x) \]
(for each element of \( D \), each block of \( \Lambda^{\phi_O'}(D) \) is a subset of the output values for the
unlabeled version of this element in the specification machine)

Proof. Each block of \( \Lambda^{\phi_O'}(D) \) includes some output for any element of \( D \). The blocks of
\( \phi_O \), and hence \( \phi_O' \), only contain output values not distinguished by any input combination
(by definition of \( \phi_O \) and specialization). Hence, any returned block of \( \phi_O' \) only contains
outputs that are common to all combinations in \( D \).

Lemma 3.3.4.
\[ \forall D^1 \subseteq S \times I \forall D^2 \subseteq S \times I : D^1 \subseteq D^2 \Rightarrow \Delta^{\phi_S}(D^1) \supseteq \Delta^{\phi_S}(D^2) \]
First, we will show that machine Composition machine Proof.

\[ \text{by definition of } \delta : D^1 \subseteq D^2 \Rightarrow \forall B^2 \in \delta(D^2) \exists B^1 \in \delta(D^1) : B^1 \subseteq B^2 \]

hence, \( \forall B^2 \in \delta(D^2) \forall B' \in \phi_S : (B^2 \subseteq B') \Rightarrow \exists B^1 \in \delta(D^1) : B^1 \subseteq B' \)

by the definition of \( \Delta^{\phi_S} (D) \forall B' \in \Delta^{\phi_S} (D^2) : B' \in \Delta^{\phi_S} (D^1) \)

and \( \Delta^{\phi_S} (D^1) \supseteq \Delta^{\phi_S} (D^2) \)

Composition machine

First, we will show that machine \( \mathbf{M}^* \), defined by the trinity of covers \( (\prod_i \phi^I_i, \prod_i \phi^S_i, \prod_i \phi^i_{S \times I}) \) that satisfy conditions of Theorem 3.2, is a realization of \( \mathbf{M} \).

Let functions \( \Phi^*, \Psi^* \) and \( \Theta^* \) be defined as follows:

\[ \Phi^* : 2^S \to 2^{\phi_{S_i}} \quad \text{and} \quad \Phi^*(S) = \bigcup_{s_i \in L(S)} [s_i] \phi_{S_i} \]  

(3.3.1)

\( \Phi^* \) returns all the blocks of \( \phi_{S_i} \) including copies of elements of \( S \).

\[ \Psi^* : I \to 2^{\phi_I} \quad \text{and} \quad \Psi^*(x) = [x] \phi_I \]  

(3.3.2)

\( \Psi^* \) returns all the blocks of \( \phi_I \) including \( x \).

\[ \Theta^* : 2^{\phi_{S \times I}} \to 2^O \quad \text{and} \quad \Theta^*(D) = \bigcup_{D \in \mathcal{D}} \Lambda^{\phi^O}(D) \]  

(3.3.3)

Let \( \mathbf{M}^* = (\phi_I, \phi_{S_i}, \phi_{S \times I}, \delta^*, \lambda^*) \) be the machine describing the combined behavior of the composition of \( n \) component machines. Let:

\[ \phi_I = \prod_i \phi^I_i \]  

(3.3.4)

\[ \phi_{S_i} = \prod_i \phi^S_i \]  

(3.3.5)

\[ \phi_{S \times I} = \prod_i \phi^i_{S \times I} \]  

(3.3.6)

\[ \delta^* : \phi_{S \times I} \times \phi_I \to \phi_{S_i} \quad \text{and} \quad \delta^*(B, A) = \Delta^{\phi_S} (B \times A) \]  

(3.3.7)

\[ \lambda^* : \phi_{S \times I} \times \phi_I \to \phi_{S \times I} \quad \text{and} \quad \lambda^*(B, A) = [B \times A] \phi_{S \times I} \]  

(3.3.8)

Since \( \phi_I = \prod_i \phi^I_i \leq \phi_I^* \), by Lemma 3.3.1, \( ind^I_{S \times I}(\phi_I) \leq ind^I_{S \times I}(\phi^I_i) \) and \( \forall A \in \phi_I \exists A^i \in \phi^i_{S \times I} : S_i \times A \subseteq A^i \). Similarly, \( \phi_{S_i} = \prod_i \phi^S_i \leq \phi^S_i \), \( ind^S_{S \times I}(\phi_{S_i}) \leq ind^S_{S \times I}(\phi^S_i) \) and \( \forall B \in \phi_{S_i} \exists B^i \in \phi^i_{S \times I} : B \times I \subseteq B^i \). Finally, \( ind^i_{S \times I}(\phi_{S_i}) \) \( \leq \phi^i_{S \times I} \) (by Condition (3)), so \( \forall A \in \phi_I \forall B \in \phi_{S_i} \exists C^i \in \phi^i_{S \times I} : B \times A \subseteq C^i \).

From all of the above:

\[ \forall A \in \phi_I \forall B \in \phi_{S_i} \forall i \]

\[ \exists D^i \in \phi^i_{S \times I} : (A^i \supseteq S_i \times A) \wedge (B^i \supseteq B \times I) \wedge (C^i \supseteq B \times A) \wedge \]

\[ \wedge (D^i = A^i \cap B^i \cap C^i \supseteq B \times A) \]  

(3.3.9)
Hence,
\[ \forall A \in \phi \forall B \in \phi_2 \exists i[D^i] \in \times_i \phi_{S_i}^m : B \times A \subseteq \bigcap_i D^i \]

Furthermore, condition (i) guarantees that
\[ \forall [D^i] \in \times_i \phi_{S_i}^m : \bigcap_i D^i \neq \emptyset \Rightarrow \Delta_{\Pi_i \phi_{S_i}}(\bigcap_i D^i) \neq \emptyset \quad (3.3.10) \]

From Lemma 3.3.4,
\[ B \times A \subseteq \bigcap_i D^i \Rightarrow \Delta_{\phi_{S_i}}(B \times A) \supseteq \Delta_{\Pi_i \phi_{S_i}}(\bigcap_i D^i) \neq \emptyset \]

Hence,
\[ \forall A \in \phi \forall B \in \phi_2 : \Delta_{\phi_{S_i}}(B \times A) \neq \emptyset \Rightarrow \delta^*(B, A) \neq \emptyset \]

Similarly, from (3.3.9), condition (2), and the fact that \( \phi_{S_i} \times I \prod_i \phi_{S_i}^m \), we conclude that \( B \times A \) is included in some block of \( \phi_{S_i} \times I \), so \( \lambda^*(B, A) \neq \emptyset \).

We will now show that \( M^* \) is a realization of \( M \).
Let \( \delta^*(\Phi^*(s), \Psi^*(x)) = \bigcup_{B \in \Phi^*(s), A \in \Psi^*(x)} \delta^*(B, A) \).

by definition of \( \Phi^* : \forall B \in \Phi^*(s) \exists s_1 \in L(s) : s_1 \in B \)
by definition of \( \Psi^* : \forall A \in \Psi^*(x) : x \in A \)
hence, \( \forall B \in \Phi^*(s) \forall A \in \Psi^*(x) \exists s_1 \in L(s) : (s_1, x) \in B \times A \)
by Lemma 3.3.2 \( \forall B' \in \Delta_{\phi_{S_i}}(B \times A) \exists s'_{1} \in B'_i : s'_1 \in L(\delta(L'(s_1), x)) = L(\delta(s, x)) \)
from that and the definition of \( \delta^* \) \( \forall B' \in \delta^*(B, A) \exists s'_{1} \in L(\delta(s, x)) : s'_1 \in B'_i \)
hence, \( \delta^*(B, A) \subseteq \bigcup_{s'_1 \in L(\delta(s, x))} [s'_1]_{\phi_{S_i}} = \Phi^*(\delta(s, x)) \)

Finally,
\[ \forall s \in S \forall x \in I : \delta^*(\Phi^*(s), \Psi^*(x)) \subseteq \Phi^*(\delta(s, x)) \quad (3.3.11) \]

Further,
\[ \forall B \in \Phi^*(s) \forall A \in \Psi^*(x) : \Theta^*(\lambda^*(B, A)) = \Theta^*(\Delta_{\phi_{S_i}^m}(B \times A)) = \bigcup_{D \in \Delta_{\phi_{S_i}^m}(B \times A)} \left( \Lambda_{\phi_{S_i}^m}(D) \right) \]

By Lemma 3.3.3:
\[ \forall (s_1, x) \in D \forall Y \in \Lambda_{\phi_{S_i}^m}(D) : Y \subseteq \lambda(L'(s_1), x) \quad (3.3.12) \]

By definition of \( \Phi^* \), \( \exists s_1 \in L(s) : (s_1, x) \in B \times A \),
\[ \forall D \in [B \times A]_{\phi_{S_i}^m} : (s_1, x) \in D \quad (3.3.13) \]
From (3.3.12) and (3.3.13) it follows that
\[ \forall D \in [B \times A] \phi_{S_i \times I} \forall Y \in \Lambda^{\phi_{o'}}(D) : Y \subseteq \lambda(L'(s_i), x) = \lambda(s, x) \]

Hence
\[ \bigcup_{D \in [B \times A] \phi_{S_i \times I}} \Lambda^{\phi_{o'}}(D) \subseteq \lambda(s, x) \]

and
\[ \Theta^*(\lambda^*(\Phi^*(s_i), \Psi^*(x))) \subseteq \lambda(s, x) \]

(3.3.14)

From (3.3.11), (3.3.14) and definition 2.8 it follows that \( M^* \) is an output behavior realization of \( M \).

**State behavior realization**

We will show that under the additional condition (5) of the Theorem, the composition machine \( M^* \) is a state behavior realization (s.b.r.) of \( M \).

**Mapping function:**
\[ \Phi^*: 2^{\phi_{S_i}} \rightarrow 2^{S} \quad \text{and} \quad \Phi^*(B) = \bigcup_{B \in B} L'(B) \]

Then
\[ \forall B \in \phi_{S_i} \forall x \in I : \Phi^*(\delta^*(B, \Psi^*(x))) = \Phi^*(\delta^*(B, [x]\phi_I)) \]
\[ \forall A \in [x]\phi_I : \Phi^*(\delta^*(B, A)) = \Psi^*(\Delta^{\phi_{S_i}}(B \times A)) \]

From (5) it follows that
\[ \forall B \in \phi_{S_i} \exists s_B \in S : L'(B) = \{s_B\} \]

(3.3.15)

so,
\[ \forall B \in \phi_{S_i} \forall s_i \in B \exists s_B \in S : L'(B) = L'(s_i) = s_B \]

(3.3.16)

Hence,
\[ \forall (s_i, x) \in B \times A : L'(s_i) = s_B \]

then, by Lemma 3.3.2
\[ \forall B' \in \Delta^{\phi_{S_i}}(B \times A) \exists s'_i \in B' : s'_i \in L(\delta(L'(s_i), x)) = L(\delta(s_B, x)) \]

hence, since \( B' \in \phi_{S_i} \) and (3.3.16)
\[ \forall B' \in \Delta^{\phi_{S_i}}(B \times A) \forall s'_i \in B' : s'_i \in L(\delta(s_B, x)) \]
Finally,
\[ \forall B' \in \Delta^{\phi_{S_i}}(B \times A) : L'(B') \subseteq \delta(s_B, x) = \delta(\Psi^*(B), x) \]
\[ \forall B \in \phi_{S_i} \forall x \in I : \bigcup_{A \in \Psi^*(x)} \bigcup_{B' \in \Delta^{\phi_{S_i}}(B \times A)} L'(B') \subseteq \delta(\Psi^*(B), x) \]
and hence
\[ \Psi^*(\delta^*(B, \Psi^*(x))) \subseteq \delta(\Psi^*(B), x) \quad (3.3.17) \]

Additionally, by Lemma 3.3.3
\[ \forall B \in \phi_{S_i} \forall x \in I \forall A \in \Psi^*(x) \forall D \in \lambda^*(B, A) \]
\[ \forall(s_t, x) \in D \forall Y \in \Lambda^{\psi^l} (D) : \]
\[ Y \subseteq \lambda(L'(s_t), x) \quad (3.3.16) \lambda(L'(B), x) \]
hence,
\[ \forall B \in \phi_{S_i} \forall x \in I : \bigcup_{D \in \lambda^*(B, \Psi^*(x))} \Lambda^{\psi^l} (D) \subseteq \lambda(L'(B), x) = \lambda(\Psi^*(B), x) \]
and
\[ \Theta^*(\lambda^*(B, \Psi^*(x))) \subseteq \lambda(\Psi^*(B), x) \quad (3.3.18) \]

By (3.3.17),(3.3.18) and definition 2.9, machine \( M^* \) therefore is a state behavior realization of \( M \). Since, as shown in the following section, the network of partial machines \( M_i \) always is a state realization of \( M^* \), we conclude that if condition (5) is met, the network of partial machines \( M_i \) is a state and output behavior realization of \( M \).

Component machines

Let \( M_i = \left( \phi^i_{S_i}, \phi^i_{S_i \times I}, \phi^i_{S_i \times I}, \delta^i, \lambda^i \right) \) be a component sequential machine for which the following conditions are satisfied:

1. \( (\phi^i_{S_i}, \phi^i_{S_i \times I}) \) satisfy the conditions of Theorem 3.2

2. 
\[ \forall B \in \phi_{S_i} \forall A \in \phi^i_{S_i} \forall B^i \in [B]_{\phi^i_{S_i}} \forall A^i \in [A]_{\phi^i_{S_i}} \forall C^i \in [B \times A]_{\phi^i_{S_i \times I}} : \]
\[ \delta^i (B^i, (A^i, C^i)) = \left\{ B'_i \mid B'_i \in \Delta^{\phi^i_{S_i}}(D^i) \land \right. \]
\[ \land D^i \in [(B^i \times A^i) \cap C^i]_{\phi^i_{S_i \times I}} \land \forall [D^i]_j \in \times_j \phi^i_{S_i \times I} \} : \]
\[ \bigcap_j D^i \neq \emptyset \Rightarrow \forall B' \in \bigcap_j \delta^i (D^j) \exists B^* \in \delta^i (B, A) : B^* \supseteq B' \right\} \quad (3.3.19) \]

3. 
\[ \lambda^i (B^i, (A^i, C^i)) = \left( (B^i \times A^i) \cap C^i \right)_{\phi^i_{S_i \times I}} \quad (3.3.20) \]
The two conditions for the next state of component machine \( (B'_i) \) may interpreted as follows:

- The first \( (B'_i \in \Delta \phi_{si} (\ldots)) \) is a local condition and ensures that the component machine may only perform a transition to its legal state (a block of \( \phi_{si} \)).
- The second is a global synchronization condition – it ensures that all the component machines make the choice between their next-states in such way that the resulting state of the network is a legal state (a block of \( \phi_{Si} \)).

Since \( (B^i \times A^i) \cap C^i \) is included in a block of \( \phi_{Si}^{in} \) (by definition of \( \phi_{Si}^{in} \)), \([B^i \times A^i) \cap C^i] \phi_{Si}^{in} \neq \emptyset \). From condition (1.1) it follows that there is a block of \( \phi_{Si} \) that includes next-states of \( D^i \in [(B^i \times A^i) \cap C^i] \phi_{Si}^{in} \). Hence, \( \Delta \phi_{si} \; (D^i) \neq \emptyset \). Also, there is a block of \( \phi_{Si}^{in} \) containing \( (B^i \times A^i) \cap C^i \) (from condition (2)).

Since \( \delta^*(B, A) = \Delta \Pi \phi_{si} (B \times A) \), condition (1.2) guarantees that

\[
\exists B'_i \in \Delta \phi_{si} \; (D^i) \quad : \quad \forall [D^i]_i \in \times j [B \times A] \phi_{Si}^{in} \; j : \\
\bigcap_j D^i \neq \emptyset \Rightarrow \forall B'_i \in \bigcap_j \delta^i \; (D^i) \exists B^* \in \delta^*(B, A) : B^* = B'
\]

hence,

\[
\forall B \in \phi_{Si} \forall A \in \phi_{I} \; \forall B^i \in [B] \phi_{Si} \; \forall A^i \in [A] \phi_{I} \forall C^i \in [B \times A] \phi_{Si}^{in} \; i : \\
\delta^i \; (B^i, (A^i, C^i)) \neq \emptyset \wedge \lambda^i \; (B^i, (A^i, C^i)) \neq \emptyset
\]

The behavior of the net of component machines is described by the following functions:

\[
\delta_{GC} : \times_i \phi_{Si}^{in} \times_i \phi_{I} \rightarrow \times_i \phi_{Si}^{in} \text{ and } \\
\lambda_{GC} : \times_i \phi_{Si}^{in} \times_i \phi_{I} \rightarrow \times_i \phi_{Si}^{in} \text{ and }
\]

\[
\delta_{GC} ([B^i]_i, [A^i]_i) = \left[ \bigcup_{C^i \in [B \times A] \phi_{Si}^{in} \; i} \delta^i \; (B^i, (A^i, C^i)) \right]_i \\
\lambda_{GC} ([B^i]_i, [A^i]_i) = \left[ \bigcup_{C^i \in [B \times A] \phi_{Si}^{in} \; i} \lambda^i \; (B^i, (A^i, C^i)) \right]_i
\]

Let the mapping functions be:

\[
\Phi : \phi_{Si} \rightarrow \times_i 2^{\phi_{si}} \quad \text{and} \quad \Phi (B) = \left[ [B] \phi_{Si}^{in} \right]_i \quad (3.3.21) \\
\Psi : \phi_{I} \rightarrow \times_i 2^{\phi_{si}} \quad \text{and} \quad \Psi (A) = \left[ [A] \phi_{I}^{in} \right]_i \quad (3.3.22) \\
\Theta : \times_i 2^{\phi_{Si} \times I} \rightarrow 2^{\phi_{Si} \times I} \quad \text{and} \quad \Theta \left( [D^i]_i \right) = \bigcup_{[D^i]_i \times D^i \; i} \left[ \bigcap_j D^i \right] \phi_{Si} \times I \quad (3.3.23) \\
\Phi' : \times_i \phi_{Si}^{in} \rightarrow 2^{\phi_{si}} \quad \text{and} \quad \Phi' ([B^i]_i) = \left[ \bigcap_i B^i \right] \phi_{Si} \quad (3.3.24)
\]
3.3. PROOF OF THE GENERAL DECOMPOSITION THEOREM

From the above it follows that

\[ B \in \phi_{S_i}, \forall [B^i]_i \in \times_i [B]_{\phi_{S_i}} \forall A \in \phi_I : \Phi' \left( \delta_{GC} ([B^i]_i, \Psi(A)) \right) = \Phi' \left( \delta_{GC} ([B^i]_i, [A]_{\phi_I}) \right) \]

and

\[ \forall [A^i]_i \in \times_i [A]_{\phi_I} : \delta_{GC} ([B^i]_i, [A^i]_i) = \left[ \bigcup_{C^i \in [B \times A]_{\phi_{S_i}} \times I^i} \delta^i \left( B^i, (A^i, C^i) \right) \right]_i \]

By definition of \( \delta^i \)

\[ \forall B' \in \bigcap_i \delta^i (B^i, (A^i, C^i)) \exists B^* \in \delta^* (B, A) : B^* = B' \]

hence,

\[ \forall B' \in \bigcap_i \left[ \bigcup_{C^i \in [B \times A]_{\phi_{S_i}} \times I^i} \delta^i (B^i, (A^i, C^i)) \right] \exists B^* \in \delta^* (B, A) : B^* = B' \]

Since \( \forall B^* \in \delta^* : B^* \in \phi_{S_i} : \)

\[ \forall B' \in \bigcap_i \left[ \bigcup_{C^i \in [B \times A]_{\phi_{S_i}} \times I^i} \delta^i (B^i, (A^i, C^i)) \right] \exists B^* \in \delta^* (B, A) : [B']_{\phi_{S_i}} = [B^*]_{\phi_{S_i}} = B^* \]

Therefore,

\[ \Phi' \left( \delta_{GC} ([B^i]_i, [A^i]_i) \right) \subseteq \delta^* (B, A) = \delta^* (\Phi' ([B^i]_i), A) \]

and

\[ \forall B \in \phi_{S_i}, \forall [B^i]_i \in \times_i [B]_{\phi_{S_i}} \forall A \in \phi_I : \Phi' \left( \delta_{GC} ([B^i]_i, \Psi(A)) \right) \subseteq \delta^* (\Phi' ([B^i]_i), A) \]

(3.3.25)

Further,

\[ \forall B \in \phi_{S_i}, \forall [B^i]_i \in \times_i [B]_{\phi_{S_i}} \forall A \in \phi_I : \Theta \left( \lambda_{GC} ([B^i]_i, \Psi(A)) \right) = \Theta \left( \lambda_{GC} ([B^i]_i, [A]_{\phi_I}) \right) \]

and

\[ \forall [A^i]_i \in \times_i [A]_{\phi_I} : \lambda_{GC} ([B^i]_i, [A^i]_i) = \left[ \bigcup_{C^i \in [B \times A]_{\phi_{S_i}} \times I^i} \lambda^i (B^i, (A^i, C^i)) \right]_i \]
Since \((B^i \times A^i) \cap C^i \supseteq B \times A\)
\[
\forall [Y^i]_i, \in \lambda_{GC}([B^i]_i, [A^i]_i) Y^i \supseteq B \times A
\]
therefore,
\[
\forall [Y^i]_i, \in \lambda_{GC}([B^i]_i, [A^i]_i) \bigcap_i Y^i \supseteq B \times A
\]
and
\[
\bigcap_i Y^i[\phi_{S_i \times I}] \subseteq [B \times A][\phi_{S_i \times I}] = \lambda^*(B, A)
\]
This leads to
\[
\forall B \in \phi_{S_i} [B^i]_i \in \times_i [B] \phi_{S_i} \forall A \in \phi_I : \Theta \left(\lambda_{GC}([B^i]_i, \Psi(A)) \right) \subseteq \lambda^*(\Phi'([B^i]_i), A)
\]
(3.3.26)

From (3.3.25) and (3.3.26) it follows that the net of component machines is an unconditional state and output behavior realization of the composition machine \(M^*\). Therefore, it is an output behavior realization of \(M\) whenever \(M^*\) is (i.e. under conditions (i)-(4)) and state and output behavior realization, whenever \(M^*\) is (i.e. under conditions (i)-(5))

3.3.2 Reverse proof

Let the output behavior of \(M\) be realized by a general composition \(M^*\) of \(n\) machines \(M_i = (I^*_i, S_i, O_i, \delta^i, \lambda^i)\), where:
\[
\begin{align*}
I^*_i &= I'_i \times I_i \\
Con_i : \times 2^{O_i} &\rightarrow I'_i \quad \text{is a surjective function}
\end{align*}
\]
Let:
\[
\begin{align*}
\Psi : I &\rightarrow \times I_i \quad \text{be a function} \\
\Phi : S &\rightarrow \times 2^{S_i} \quad \text{be a function} \\
\Theta : \times 2^{O_i} &\rightarrow 2^O \quad \text{be a surjective partial function}
\end{align*}
\]
In any case, there is a multi-state specialization \(M_i\) of \(M\) such that \(M^*\) is an output behavior realization of \(M_i\) (in a trivial case, \(M_i \equiv M\)). Hence, it is possible to identify two functions : \(\Phi_i : S \rightarrow 2^{S_i}\) and \(\Phi^* : S_i \rightarrow \times S_i\) such that \(\Phi(s) = \Phi^*(\Phi_i(s))\). Intuitively, the process of mapping of states of \(M\) to (sets of) states of \(M^*\) is divided in two parts: first, a state of \(M\) is mapped to a set of states of its multi-state specialization \(M_i\), and then a state of \(M_i\) is mapped to a state of composition machine \(M^*\). Again, in a trivial case, where \(M_i \equiv M, S_i = S, \Phi_i\) is the identity function and \(\Phi^* = \Phi\).

Then, the mappings \(\Psi\) and \(\Phi^*\) introduce the following covers on \(I\) and \(S_i\), respectively:
\[
\begin{align*}
\phi_i^t : [x] &\phi_i^t \cap [z] \phi_i^t \not\subseteq x_i = z_i \\
& \text{where} \ (x_1, \ldots , x_i, \ldots , x_n) = \Psi(x) \text{ and } (z_1, \ldots , z_i, \ldots , z_n) = \Psi(z)
\end{align*}
\]
\[
\begin{align*}
\phi_{S_i}^t : [s] &\phi_{S_i}^t \cap [t] \phi_{S_i}^t \not\subseteq s_i = t_i \\
& \text{where} \ (s_1, \ldots , s_i, \ldots , s_n) = \Phi^*(s) \text{ and } (t_1, \ldots , t_i, \ldots , t_n) = \Phi^*(t)
\end{align*}
\]
If the composition of $M_i$ is legal, then the output function $\lambda^i$ of each component machine computes its values from (a part of) the original primary input and state information present in the composition machine (directly or imported). Therefore, $\lambda^i$ can be considered as a function $\lambda^i : S_i \times I \rightarrow 2^{O_i}$. Thus, it introduces a cover $\phi^i_{S_i \times I}$ on $S_i \times I$ such that

$$[(s_i, x)]\phi^i_{S_i \times I} \cap [(t_i, z)]\phi^i_{S_i \times I} \neq \emptyset \iff \lambda^i(s_i, x) \cap \lambda^i(t_i, z) \neq \emptyset$$

The values of $\lambda^i$ (the elements of $O_i$) can be considered as the blocks of $\phi^i_{S_i \times I}$, or the names of the blocks of $\phi^i_{S_i \times I}$.

In a similar way, the connection functions $Com_i$ introduce the covers $\phi^i_{S_i \times I} = \prod_j \phi^i_{S_i \times I}$.

Since $\lambda^i : S_i \times I \times I'_i \rightarrow 2^{O_i}$ is a function, values of $\lambda^i$ (i.e. blocks of $\phi^i_{S_i \times I}$) must be computed from $\phi^i_{S_i \times I} \left(\phi^i_{S_i \times I}\right)$, $\phi^i_{S_i \times I}$, and $\phi^i_{S_i \times I}$. This is equivalent to condition (2). Similarly, since $\delta^i : S_i \times I \times I'_i \rightarrow 2^{S_i}$ is a function, the blocks of $\phi^i_{S_i}$ must be computed from $\phi^i_{S_i} \left(\phi^i_{S_i \times I}\right)$, $\phi^i_{S_i \times I}$, and $\phi^i_{S_i \times I}$. This requirement is stated in condition (1 p.1).

The condition (1 p.1), however, only guarantees that for any state/input combination for each partial machine $M_i$, there is a valid next-state that the partial machine can transit to. The fact that the partial machines transit to their valid states does not guarantee that the composition machine does. If the intersection of the current states of partial machines (i.e. intersection of blocks of $\phi^i_{S_i}$) is empty, then the present state of the composition machine (i.e. a block of $\phi_{S_i}$) is undefined. To avoid such a situation, it must be guaranteed that for any state/input combination there is a choice of next-states in the partial machines such that the intersection of the chosen next-states results in a valid state of the composition machine. This synchronization requirement is fulfilled by the condition (1 p.2).

If the composition does not contain combinational loops, and thus is legal, then values of each connection function, and so the values of $\phi^i_{S_i \times I}$ must only be computed (directly or indirectly) from the original primary input and state information present in the composition. Since the total primary input information in the composition is defined by $\phi^i_{S_i \times I}$ and the total state information by $\phi^i_{S_i \times I}$, condition (3) must be satisfied.

The output information produced by all the component machines and described by $\phi^i_{S_i \times I}$ enables the computation of the output information of the original machine $M$. Let us consider the original output function $\lambda : S \times I \rightarrow 2^O$ of $M$. $\lambda$ introduces the output cover (in fact, it always is a partition) $\phi_O$ such that

$$[0] \phi_O = [p] \phi_O \iff \forall s \in S \forall x \in I : o \in \lambda(s, x) \iff p \in \lambda(s, x)$$

The blocks of $\phi_O$ thus are the blocks of values of $\lambda$ not distinguished from each other by any input/state combination (in most cases, $\phi_O = \phi_O(0)$). To calculate the output we need to be able to calculate either a block of $\phi_O$ or a block of a cover smaller than $\phi_O$ (with smaller blocks - removing output value from a block of $\phi_O$ is equivalent to filling in the don’t-care). This is guaranteed by condition (4).

If additionally the state behavior of $M$ is realized, then the state information of all the component machines enables the unambiguous computation of the state for the specification machine $M$, i.e. a surjective partial function $\Phi^i : S_i \rightarrow S$ exists. Such a function
introduces the following $n$ covers $\phi^i_{S_i}$ on $S$:

$$\phi^i_{S_i} : [s] \phi^i_{S_i} \cap [t] \phi^i_{S_i} \neq \emptyset \iff s_i = t_i$$

where $s = \Phi'(s_1, \ldots, s_i, \ldots, s_n), t = \Phi'(t_1, \ldots, t_i, \ldots, t_n)$

Since $\Phi'$ is a surjective partial function, each element from $S$ must be unambiguously defined by $n$-tuples of elements from $\times \phi^i_{S_i}$, i.e. elements of $\pi_S(0)$ must be calculated from the elements of $\bigcap_i \phi^i_{S_i}$. This is equivalent to stating condition ($5$).

Summarizing, if a sequential machine $M$ has a general full-decomposition then $n$ triplets of partitions $(\phi^i_I, \phi^i_{S_i}, \phi^i_{S_i \times I})$ exist and they satisfy conditions (1)–(5) of Theorem 3.2. This ends the proof.

### 3.4 General decomposition example

#### 3.4.1 Output and state behavior realization

<table>
<thead>
<tr>
<th>$s_i t_i$</th>
<th>$p s$</th>
<th>$n s$</th>
<th>$o_0 o_1$</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>1</td>
<td>1</td>
<td>00</td>
</tr>
<tr>
<td>01</td>
<td>1</td>
<td>1</td>
<td>11,01</td>
</tr>
<tr>
<td>10</td>
<td>1</td>
<td>1</td>
<td>11,01</td>
</tr>
<tr>
<td>11</td>
<td>1</td>
<td>3</td>
<td>11</td>
</tr>
<tr>
<td>00</td>
<td>2</td>
<td>1,2</td>
<td>11</td>
</tr>
<tr>
<td>01</td>
<td>2</td>
<td>1</td>
<td>01</td>
</tr>
<tr>
<td>10</td>
<td>2</td>
<td>1</td>
<td>01</td>
</tr>
<tr>
<td>11</td>
<td>2</td>
<td>3</td>
<td>11</td>
</tr>
<tr>
<td>00</td>
<td>3</td>
<td>2</td>
<td>00</td>
</tr>
<tr>
<td>01</td>
<td>3</td>
<td>1</td>
<td>11,01</td>
</tr>
<tr>
<td>10</td>
<td>3</td>
<td>1</td>
<td>11,01</td>
</tr>
<tr>
<td>11</td>
<td>3</td>
<td>3</td>
<td>11</td>
</tr>
<tr>
<td>00</td>
<td>4</td>
<td>4</td>
<td>00</td>
</tr>
<tr>
<td>01</td>
<td>4</td>
<td>1</td>
<td>11,01</td>
</tr>
<tr>
<td>10</td>
<td>4</td>
<td>1</td>
<td>11,01</td>
</tr>
<tr>
<td>11</td>
<td>4</td>
<td>3</td>
<td>11</td>
</tr>
<tr>
<td>00</td>
<td>5</td>
<td>4,5</td>
<td>11</td>
</tr>
<tr>
<td>01</td>
<td>5</td>
<td>1</td>
<td>01</td>
</tr>
<tr>
<td>10</td>
<td>5</td>
<td>1</td>
<td>01</td>
</tr>
<tr>
<td>11</td>
<td>5</td>
<td>3</td>
<td>11</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>$s_i$</th>
<th>$I$</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>0a</td>
</tr>
<tr>
<td>01</td>
<td>1a</td>
</tr>
<tr>
<td>10</td>
<td>1b</td>
</tr>
<tr>
<td>11</td>
<td>2a</td>
</tr>
<tr>
<td>00</td>
<td>3a</td>
</tr>
<tr>
<td>01</td>
<td>3b</td>
</tr>
<tr>
<td>10</td>
<td>3c</td>
</tr>
<tr>
<td>11</td>
<td>3d</td>
</tr>
<tr>
<td>00</td>
<td>4a</td>
</tr>
<tr>
<td>01</td>
<td>4b</td>
</tr>
<tr>
<td>10</td>
<td>4c</td>
</tr>
<tr>
<td>11</td>
<td>4d</td>
</tr>
<tr>
<td>00</td>
<td>5a</td>
</tr>
<tr>
<td>01</td>
<td>5b</td>
</tr>
<tr>
<td>10</td>
<td>5c</td>
</tr>
<tr>
<td>11</td>
<td>5d</td>
</tr>
</tbody>
</table>

Figure 3.14. $S_i \times I$ space mapping

Figure 3.13. Example FSM

Let us now illustrate with an example the General decomposition theorem for the multi-state realizations of incompletely specified nondeterministic FSMs and its application for the construction of the decompositional FSM structures.

Let us consider an incompletely specified finite state machine $M$ in Fig. 3.13 and its
where:

\[ \phi_S^1 = \{1, 2, 3, 4, 5\} \quad \text{and} \quad \phi_S^2 = \{00; 01, 10, 11\} \quad \text{for } M_1 \]

and \[ \phi_S^3 = \{1, 3, 4, 1, 2, 5\} \quad \text{and} \quad \phi_S^4 = \{00, 11; 01, 10\} \quad \text{for } M_2 \]

After the labeling of states this introduces two copies of state 1: 1a and 1b and leaves other states unchanged. Two trinities of covers are given: \( (\phi_I^1, \phi_{S_1}^1, \phi_{S_1 \times I}^1) \) and \( (\phi_I^2, \phi_{S_1}^2, \phi_{S_1 \times I}^2) \),

where:

\[
\begin{align*}
\phi_I^1 &= \{00; 01, 10, 11\} = \{i_1; i_2\} \quad \phi_{S_1}^1 = \{1a, 2b, 3; 4, 5\} = \{a; b; c\} \\
\phi_{S_1 \times I}^1 &= \{0, 8, 16, 20, 4, 12; \\
&\quad 1, 2, 3, 5, 6, 7, 9, 10, 11, 13, 14, 15, 17, 18, 19, 21, 22, 23\} \\
&= \{o_1; o_2; o_3\} \\
\phi_I^2 &= \{00, 11; 01, 10\} = \{j_1; j_2\} \\
\phi_{S_1}^2 &= \{1a, 3, 4; 1b, 2, 5\} = \{x; y\} \\
\phi_{S_1 \times I}^2 &= \{0, 1, 2, 3, 4, 7, 11, 12, 13, 14, 15, 16, 17, 18, 19, 23; \\
&\quad 1, 2, 5, 6, 8, 9, 10, 13, 14, 17, 18, 20, 21, 22\} = \{p_1; p_2\}\end{align*}
\]

In the above, for the conciseness of notation, the elements of the \( S_1 \times I \) space were replaced with the numbers assigned to them by the mapping introduced in Fig. 3.14.

The interpretation of this situation is as follows. Machine \( M \) has a multi-state realization \( M_L \), which in turn is composed of two partial machines: \( M_1 \) and \( M_2 \). States of the partial machines are described by covers \( \phi_{S_i}^i \) \( (i = 1, 2) \). Each cover \( \phi_{S_i}^i \) groups in one block the states of the machine \( M_L \), which correspond to a single state of a partial machine \( M_i \) defined by this block. For example, \( \phi_{S_2}^2 \) defines a two-state machine \( M_2 \), which is in its first state \( 1a, 3, 4(x) \), whenever the machine \( M_L \) is in state 1a, 3, or 4, and it is in its second state \( 1b, 2, 5(y) \) whenever the machine \( M_L \) is in one of the states 1b, 2 or 5.

Similarly, covers \( \phi_I^i \) \( (i = 1, 2) \) define (multi-valued) input variables to the partial machines \( M_i \). Values of the original inputs of \( M \) grouped in a single block of \( \phi_I^i \) correspond to the value of \( i_i \) associated with this block. Note that except for primary input information delivered to partial machines and modeled by \( \phi_I^i \), the input space of partial machines is expanded by information imported from other partial machines, and modeled by \( \phi_{S_i \times I}^i \).

Multi-valued output variables of the partial machines \( M_i \) defined by the blocks of covers \( \phi_{S_i \times I}^i \) \( (i = 1, 2) \) convey partial information about the current state and input of the machine \( M_L \), which is computed from the total input and state information of \( M_i \) and transferred to \( M_i \)'s output. This information is used by function \( \Theta \) to determine the values of the primary outputs of \( M_L \) and by other partial machines to acquire the state/input information necessary for their computations and not produced locally. The scheme of this decomposition is presented in Fig. 3.15. In the following we will show how to determine the encoder/decoder functions and the STTs of the partial machines.

First, let us check whether the trinities define a valid decomposition. The derived covers mentioned in Theorem 2 are the following:

\[ \phi_D = \{00; 11; 01\} \]
$M_1 \quad S_I = S_1 \quad I_1 = I \quad O_1 = O$

$M^* \quad S^* = \phi_{S_I} \quad I^* = \phi_I \quad O = \phi_{S_I \times I}$

Figure 3.15. Scheme of the example decomposition
\[ \phi_O \] groups in one block these output symbols, which are not distinguished by any input combination; in this case there are no such symbols and \( \phi_O \) is a zero-partition;

\[
\phi_{S_1 \times I}^1 = \{0, 4, 8, 12, 16, 20(00); 1, 2, 5, 6, 9, 10, 11, 13, 14, 15, 17, 18, 21, 22, 23(01, 10, 11)\}
\]

is the state/input information derived from the primary input information available to machine \( M_1 \);

\[
\phi_{S_1 \times I}^2 = \{0, 3, 4, 7, 8, 11, 12, 15, 16, 19, 20, 23(00, 11); 1, 2, 5, 6, 9, 10, 13, 14, 17, 18, 21, 22(01, 10)\}
\]

is the state/input information derived from the primary input information available to machine \( M_2 \);

\[
\phi_{S_1 \times I}^1 = \{0, 1, 2, 3, 8, 9, 10, \Pi(a); 4, 5, 6, 7, 12, 13, 14, 15(b); 16, 17, 18, 19, 20, 21, 22, 23(c)\}
\]

is the state/input information derived from the state information available to machine \( M_1 \);

\[
\phi_{S_1 \times I}^2 = \{0, 1, 2, 3, 12, 13, 14, 15, 16, 17, 18, 19(x); 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 17, 18, 19(y)\}
\]

is the state/input information derived from the state information available to machine \( M_2 \);

\[
\phi_{S_1 \times I}^1 = \phi_{S_1 \times I}^1 \cdot \phi_{S_1 \times I}^2 = \{0, 4, 8, 12, 16, 20(00); 3, 7, 11, 15, 19, 23(11); 1, 2, 5, 6, 9, 10, 13, 14, 17, 18, 21, 22(01, 10)\}
\]

\[
\phi_{S_1 \times I}^2 = \phi_{S_1 \times I}^1 \cdot \phi_{S_1 \times I}^2 = \{0, 1, 2, 3(1a); 4, 5, 6, 7(1b); 8, 9, 10, 11\}
\]

\[
\phi_{S_1 \times I}^1 = \phi_{S_1 \times I}^1 \cdot \phi_{S_1 \times I}^2 = \{0, 1, 2, 3, 12, 13, 14, 15, 16, 17, 18, 19(1); 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 17, 18, 19(1); 20, 21, 22, 23(5)\}
\]

\[ \varphi_{S_i} \] is the state information derived from the state information available to machine \( M_i \); \( i = 1 \) or \( 2 \); \( \varphi_{S_1} \) is defined as in Table 3.4.

\[
\varphi_{S_1} = \{1\varphi(a, b); 2\varphi(a, b); 3\varphi(a, b); 4\varphi(a, b); 5\varphi(a, b); 6\varphi(a, b); 7\varphi(a, b); 8\varphi(a, b); 9\varphi(a, b); 10\varphi(a, b)\}
\]

\[ \varphi_{S_1} = \{1\varphi(a, b) + 2\varphi(a, b) + 3\varphi(a, b) + 4\varphi(a, b) + 5\varphi(a, b) + 6\varphi(a, b) + 7\varphi(a, b) + 8\varphi(a, b) + 9\varphi(a, b) + 10\varphi(a, b)\}
\]

\[
\varphi_{S_1} = \{1\varphi(a, b) + 2\varphi(a, b) + 3\varphi(a, b) + 4\varphi(a, b) + 5\varphi(a, b) + 6\varphi(a, b) + 7\varphi(a, b) + 8\varphi(a, b) + 9\varphi(a, b) + 10\varphi(a, b)\}
\]

\[
\varphi_{S_1} = \{1\varphi(a, b) + 2\varphi(a, b) + 3\varphi(a, b) + 4\varphi(a, b) + 5\varphi(a, b) + 6\varphi(a, b) + 7\varphi(a, b) + 8\varphi(a, b) + 9\varphi(a, b) + 10\varphi(a, b)\}
\]

Condition (r p.1) requires that the entire information available to a partial machine \( M_i \) (i.e. \( \phi_{S_1 \times I}^1 \cdot \phi_{S_1 \times I}^1 \cdot \phi_{S_1 \times I}^1 \)) is sufficient to unambiguously calculate the next state of
M₁. Similarly, condition (2) requires that this information is sufficient to unambiguously calculate the output of M₁. The total information available to the partial machines M₁ and M₂ is as follows:

\[
\phi_{S_1 \times I}^{in} \quad = \quad \phi^I_{S_1 \times I} \cdot \phi^{S_1}_{S_1 \times I} \cdot \phi^{I}_{S_1 \times I} = \\
\{0,8;4,12;16,20;1,2,3,9,10,11;5,6,7,13,14,15;17,18,19,21,22,23\} \text{ – for } M₁
\]

\[
\phi_{S_1 \times I}^{in} \quad = \quad \phi^I_{S_1 \times I} \cdot \phi^{S_1}_{S_1 \times I} \cdot \phi^{I}_{S_1 \times I} = \\
\{0,16,12;3,15,19;8,20;4,7,11,23;1,2,13,14,17,18;5,6,9,10,21,22\} \text{ – for } M₂
\]

It is easy to verify that the condition (2) is met, i.e. \(\phi_{S_1 \times I}^{in} \subseteq \phi_{S_1 \times I}^I\text{ and }\phi_{S_1 \times I}^{in} \subseteq \phi_{S_1 \times I}^I\).

To show how to verify condition (1), let us consider the first block of \(\phi_{S_1 \times I}^{in} \cup \{0,8\}\). First, we check whether this block leads to a block of \(\phi_{S_1}^I\). For this purpose, we calculate the value of \(\Delta^a_{S_1}(0,8)\) (Def. 3.16). Recall that \(\Delta^a_{S_1}(D)\) returns a block of \(\phi_{S_1}\), which includes possible next states for all elements of D. Since \(D = 0,8 = (1a,00)(2,00)\), its unlabeled version is \(L'(D) = (1,00)(2,00)\). In the specification machine M this block can lead to the blocks described by

\[
\overline{\delta((1,00)(2,00))} = \bigcup \left[ \delta(1,00) \bigcup \{1\}; \delta(2,00) \bigcup \{1,2\} \right] = \{1, \{1, 2\}\}.
\]

For the composition machine to be consistent with the specification machine, the next state of \(M₁(B'₁ \in \phi_{S_1}^I)\) must include copies of the states in \(\delta(0,8)\), i.e. \(\{1\} \subseteq L'(B'₁)\) or \(\{1,2\} \subseteq L'(B'₁)\). The blocks of \(\phi_{S_1}\) satisfying this condition are \(1a, 2 \text{ and } 1b, 3\). Thus, \(\Delta^a_{S_1}(0,8) = \{1a, 2; 1b, 3\} \neq \emptyset\) and the first part of condition (1) (cover pair condition) for the block 0,8 is satisfied.

Similarly, for the first block of \(\phi_{S_1 \times I}^{in} \cup \{0,16\}\), \(\overline{\delta((1,00)(4,00))} = \bigcup \left[ \delta(1,00) \bigcup \{1\}; \delta(4,00) \bigcup \{4\} \right] = \{1, \{4\}\}\), which is consistent with \(\{1a, 3, 4\}\), and therefore \(\Delta^a_{S_1}(0,16) = \{1a, 3, 4\} \neq \emptyset\).

The synchronization criterion of the synchronized sets of cover pairs additionally requires that the intersection of the blocks of the input covers of the partial machines \((\cap, D^I \in \phi_{S_1 \times I}^I)\) leads to a block being an intersection of the partial machine's next states. In the case of the previously discussed blocks, for block \(0,8\) of \(\phi_{S_1 \times I}^{in} \cup \{0,8\}\) there are two blocks of \(\phi_{S_1 \times I}^{in} \cup \{0,16\}\) with nonempty intersections: 0,16 and 8,20. For the first combination \(D^I = 0,8, D^I = 0,16\), \(\Delta^a_{S_1}(D^I) = \{1a, 2; 1b, 3\}, \Delta^a_{S_1}(D^I) = \{1a, 3, 4\}\), i.e. in these input conditions, \(M₁\) can perform the transition to state \(1a, 2\) or \(1b, 3\), and \(M₂\)
in a block of some specialization have a common subset of output values in the specification machine by elements of $S_i$. This means that the combination of the two partial machines can move to either $1a$ or $\overline{3}$ in these conditions. Meanwhile, the product of the input conditions $\bigcap_i D_i = \emptyset$ should lead to the state $\Delta S_i$ in the composition machine. $\delta(L'(\emptyset)) = \delta(1,00) = 1$, so $\Delta S_i = \{1a; 1b\}$. This means that for the composition machine to be consistent with the specification machine, it has to move to either $1a$ or $1b$. The confrontation of these results shows that it is possible for the combination of partial machines to move to a state inconsistent with the specification machine $\emptyset$. In terms of the definition of the synchronized set of cover pairs it means that for $B'_{D_1} = 1b, 3 \in \Delta \phi D_1$ and $B'_{D_2} = 1a, 3, 4 \in \Delta \phi D_2$ their intersection $\bigcap i B'_{D_i} = \emptyset$ is not included in any block of $\Delta S_i \cap \bigcap_i D_i = \{1a; 1b\}$.

However, if we limit the freedom of $S_i$ by removing the state $1b, 3$ from the set of its possible next states, the synchronization condition for the remaining state of $S_i$ $\{1a, 2\}$ is satisfied, because for $B'_{D_1} = 1a, 2 \in \Delta \phi D_1$ and $B'_{D_2} = 1a, 3, 4 \in \Delta \phi D_2$ their intersection $\bigcap i B'_{D_i} = \emptyset$ is included in a block of $\Delta S_i \cap \bigcap_i D_i$.

Following this procedure we can verify that the next state blocks for the blocks of $\phi_{S_i \times I}$ are as follows: $\{1a, 2(a), 1b, 3(b)\}$ for $0, 8$ (and $1b, 3(b)$ is then removed by the synchronization criterion); $a$ for $4, 12$; $c$ for $16, 20$; $b$ for $1, 2, 3, 9, 10, 11$; $b$ for $5, 6, 7, 13, 14, 15$; and for $17, 18, 19, 21, 22, 23$. For the blocks of $\phi_{I \times S_i}$ the next state blocks are as follows: for $0, 16$; $y$ for $12$; $x$ for $3, 15, 19$; $x; y$ for $8, 20$; $\{x; y\}$ for $\overline{4}$ and $y$ is removed by the synchronization criterion; $x$ for $7, 11, 23$; $\{x; y\}$ for $1, 2, 3, 14, 17, 18$ (and $x$ is removed by the synchronization criterion); $\{x; y\}$ for $5, 6, 9, 10, 21, 22$ (and $x$ is removed by the synchronization criterion).

Condition (3) requires that the information imported by any partial machine does not exceed the total information available to all partial machines. It is satisfied, because

$$
\phi_{S_i \times I} \cdot \phi_{S_i \times I} = \{0; 1, 2, 3; 4, 5, 6; 7, 8, 9, 10; 11, 12, 13, 14, 15, 16; 17, 18, 19, 20, 21, 22, 23\}
$$

is smaller than both $\phi_{S_i \times I}$. The condition (4) is met when all elements in a block of $\phi_{S_i \times I}$ (after unlabeled) have a common subset of output values in the specification machine $M$ that is included in a block of some specialization $\phi_{O'}$ of $\phi_{O}$. Recall that $\phi_{S_i \times I} = \{0, 16, 8, 20, 4, 12, 1, 2, 3, 7, 11, 13, 14, 15, 17, 18, 19, 23\}$ and $\phi_{O'} = \{\overline{0}; 11; 01\}$. Let $\phi_{O'} = \phi_{O}$. To check condition (4) we need to compute values of $\Delta_{\phi_{O'}}(D)$ for each $D \in \phi_{S_i \times I}$.

For example, for block $D = 1, 2, 3, 7, 11, 13, 14, 15, 17, 18, 19, 23$, the value of $\Delta_{\phi_{O'}}(D)$ is the combination of all possible output values produced in the specification machine by elements of $D$. For the unlabeled elements $3, 7, 11, 15, 19, 23$ the output value of the specification machine is 11. The unlabeled versions of elements $1, 2, 13, 14, 17, 18$ produce output 01 or 11 in the specification machine. Therefore, $\Delta_{\phi_{O'}}(D) = \{\overline{11}; 11, 01\}$. 

3.4. GENERAL DECOMPOSITION EXAMPLE
Since \( \Pi \) fits in the second block of \( \phi^D \), so \( \Lambda^{\phi^D}(D) = \Pi \neq \emptyset \). For the other blocks of \( \phi_{ST \times I} \), the values of \( \Lambda^{\phi^D}(D) \) are: 00 for 0, 16, 11 for 8, 20, 00 for 4, 12 and 01 for 1, 2, 5, 6, 9, 10, 13, 14, 17, 18, 21, 22.

Condition (5) must only be met for the state behavior realization. It states that each block of the product state cover of the partial machines needs to contain exclusively copies of the same state of \( M \), so that the state of the specification machine can be determined unambiguously. In this case, \( \phi_S^1 \cdot \phi_S^2 = \phi_S(0) \) so the condition is met and the decomposition is both state and output behavior realization.

Thus we have demonstrated that the above given covers indeed define a valid decomposition of \( M \).

### 3.4.2 Output behavior realization

Similarly as in the example in Section 3.1.2, the state space of machine \( M^* \) can be minimized without affecting its output behavior. If the state space of \( M_1 \) is defined by the cover \( \phi_{S_1} = \{1a, 2, 4, 5a; 1b, 3(b)\} \), the product state cover of both partial machines becomes \( \phi_{S_2} = \{1a, 4(a); 2, 5(a); 3(b); 1b(b)\} \), which does not form a \( S_1 - S \) pair with \( \phi_{S_1}(0) \), since the first block contains copies of two different states 1 and 4 for example. This is a violation of (5), and so the state behavior of \( M \) is not realized.

If the input and output covers remain unchanged, the only covers affected by the change of \( \phi_{S_1}^1 \) are the following:

\[
\begin{align*}
\phi_{S_1 \times I}^1 & = \{0, 1, 2, 3, 8, 9, 10, 11, 16, 17, 18, 19, 20, 21, 22, 23(a); 4, 5, 6, 7, 12, 13, 14, 15(b)\} \\
\phi_{S_1 \times I}^2 & = \{0, 8, 16, 20; 1, 2, 3, 9, 10, 11, 17, 18, 19, 21, 22, 23; 4, 12; 5, 6, 7, 13, 14, 15\} \\
\phi_{S_1 \times I}^3 & = \{0, 16; 1, 2, 17, 18, 3, 19, 4; 5, 6; 7, 8, 20; 9, 10, 21, 22; 11, 23; 12, 13, 14, 15\}
\end{align*}
\]

The verification of condition (t) proceeds along the lines outlined in Section 3.4.1. For example, let us consider block \( D = 0, 8, 16, 20 \), \( \Delta^{\phi_{S_1}}(D) = \{B^t \in \phi_{S_1} | \exists B' \in \mathfrak{S}(L'(D)) : B' \subseteq L'(B^t)\} \). The possible next state of the specification machine for this block is given by \( \mathfrak{S}(L'(D)) \). Since \( L'(D) = L'((1a, 00)(2, 00)(4, 00)(5, 00)) = (1, 00)(2, 00)(4, 00)(5, 00), \)

\[
\mathfrak{S}(L'(D)) = \bigcup \begin{bmatrix}
\delta(1, 00) \\
\delta(2, 00) \\
\delta(4, 00) \\
\delta(5, 00)
\end{bmatrix} = \bigcup \begin{bmatrix}
\{1\} \\
\{1, 2\} \\
\{4\} \\
\{4, 5\}
\end{bmatrix} = \{1, 4\}; \{1, 2, 4\}; \{1, 2, 5\}; \{1, 4, 5\}; \{1, 2, 4, 5\}.
\]

Since any of these blocks fits in the unlabeled version of the first block of \( \phi_{S_1}^a \) \( (L'(1a, 2, 4, 5) = 1, 2, 4, 5) \), the next state of \( M_1 \) for the input situation described by \( D \) will be \( a \). For the other blocks of \( \phi_{S_1 \times I}^1 \), the next states are \( b, a \) and \( b \), respectively.

As the cover \( \phi_{S_1 \times I}^2 \) did not change, the condition of \( S_1 \times I - S_1 \) pair for \( \phi_{S_1 \times I}^2 \) and \( \phi_{S_1}^2 \) is still satisfied. However, the change of \( \phi_{S_1 \times I}^1 \) can influence the synchronization
criterion of the synchronized set of cover pairs. In this case, the synchronization criterion is still satisfied, so condition (1) is still met for both partial machines. Also, $\phi^{in}_{S_i \times I} \leq 1$ is still smaller-or-equal-to $\phi^1_{S_i \times I}$, so condition (2) is met. Since $\phi^1_{S_i \times I} = \phi_{S_i \times I}(1)$, condition (3) is still satisfied. Finally, the output cover of $M_1$ did not change, so condition (4) remains satisfied.

Thus, if $\phi^1_{S_i} = \{1a, 2, 4, 5(a); 1b, 3(b)\}$ the covers satisfy all conditions of General Decomposition Theorem except for (5) and the network of partial machines described by these covers realizes output behavior of $M$ without realizing its state behavior.

3.4.3 Construction of the decomposition structure

Composition machine $M^*$

We will show, how the machines $M^*$ and $M_i$ considered in the proof of the General Decomposition Theorem can actually be constructed from their cover descriptions, and that they indeed constitute a realization of machine $M$.

Machine $M^*$ is defined by a quintuple $(\phi_I, \phi_{S_i}, \phi_{S_i \times I}, \delta^*, \lambda^*)$. Its input alphabet is described by the cover $\phi_I = \phi^1_I \cdot \phi^2_I = \{00; 11; 01, 10\}$. It therefore has three distinct input values. Similarly, its state cover $\phi_{S_i} = \phi^1_{S_i} \cdot \phi^2_{S_i} = \{1a; 1b; 2, 3, 4, 5\}$, describes a 6-valued state variable. Finally, output of the machine is described by the output cover

$$\phi_{S_i \times I} = \phi^1_{S_i \times I} \cdot \phi^2_{S_i \times I} = \{0, 16(r_1); 8, 20(r_2); 4, 12(r_3); 1, 2, 3, 7, 11, 13, 14, 15, 17, 18, 19, 23(r_4); 1, 2, 5, 6, 9, 10, 13, 14, 17, 18, 21, 22(r_5)\}$$

that defines a 5-valued output variable.

To build the state transition table (STT) for machine $M^*$, we need to calculate values of its next-state and output functions for all combinations of input (blocks of $\phi_I$) and state (blocks of $\phi_{S_i}$). Recall that according to definitions (3.3.7) and (3.3.8), the functions are defined as follows:

$$\forall B \in \phi_{S_i} \forall A \in \phi_I : \delta^*(B, A) = \Delta^\phi_{S_i}(B \times A) and \lambda^*(B, A) = [B \times A]_{\phi_{S_i \times I}}$$

Let us consider the behavior of $M^*$ in the state $B = 1a$ under the input $A = 01, 10$. The total state/input information available in these conditions is $B \times A = (1a, 01)(1a, 10) = \overline{1a, 2}$. Next state of the machine is calculated by calculating the next state for this block as $\Delta^\phi_{S_i}$. In this case, the unlabeled version of this block is $(1a, 01)(1a, 10)$, which leads to state $1$ in the specification machine. In $M^*$ state $1$ can be labeled to fit either in the block $\overline{1a}$ or $\overline{1b}$ of $\phi_{S_i}$. Therefore, $\delta^*(B, A) = \{1a, 1b\}$. The output of $M^*$ in this situation is established as a set of the blocks of the output cover including $B \times A$. In this case, these are blocks $4$ and $5$, so $\lambda^*(B, A) = r_4, r_5$.

Following this procedure, the STT in Fig. 3.16 is built. Since a check was done in the verification process of the condition (4) to see whether for any block of $\phi_{S_i \times I}$ an output value of the specification machine can be computed, the corresponding output values are given in parenthesis. It can be easily verified that machine in Fig. 3.16 is a multi-state and output behavior realization of machine in Fig. 3.13, because for any copy of a state of $M$, $M^*$ moves under the same input conditions to some copy (or copies) of the next states of $M$ and produces a subset of output values.
Partial machines $M_1$

Remember that the partial machine $M_1$ has three states associated with the states of the original machine in the way defined by cover $\phi_{S_l}$. We will refer to those states as $a$, $b$ and $c$. The input alphabet of $M_1$ is determined by combination of $\phi_1^1$ and $\phi_1^2$. In this case, since $\phi_1^1 = \phi_{S_l}(0)$ imported cover does not introduce new information, and may be omitted in consideration. The possible input values for $M_1$ therefore are $i_1$ and $i_2$. Let us establish functions $\delta_1^1$ and $\lambda_1$, as defined by expressions (3.3.19) and (3.3.20).

For example, let us consider behavior of $M_1$ in state $a$ under input $i_1$. The current state and input of the specification machine available to the partial machine $M_1$ is given by the expression $(a, i_1) \cap (S_l \times I)$. The first component of the expression is the information available locally in the machine (combination of local state information and local input information), while the second component expresses a refinement of the information available through imported information. In this case, since no information is imported, the lack of refinement is indicated by an intersection with the entire space $(S_l \times I)$. The information available to the machine therefore is $(a \times i_1) = (a, 00); (a, 00); (2, 00)$ or, in short, $0, 8$. To establish the next state for these input conditions, we follow the procedure used for verification of condition (1) in Section 3.4.1. Since each distinct input situation of a partial machine corresponds to a block of $\phi_1^{1,2}$, condition (1) guarantees that a next state can be found for this input condition. In this case, in Section 3.4.1 we have shown that the next state in input conditions $0, 8$ is $a$ or $b$, but the synchronization criterion excludes $b$. Thus, $\delta_1^1(a, i_1) = a$. Condition (2) guarantees that $0, 8$ is included in some block of $\phi_1^{1,2}$ (in this case, the first), and the output of $M_1$
therefore is \( \lambda^1(a, i_1) = o_1 \). Following the same procedure we obtain the complete state transition table of machine \( M_1 \) given in Fig. 3.17.

Similarly, machine \( M_2 \) has two states \( x \) and \( y \) defined by blocks of \( \phi_{S_1}^2 \). However, since \( M_2 \) imports information from \( M_1 \), its input alphabet is a combination of symbols of \( \phi_{S_1}^1 \) and \( \phi_{S_1}^2 \): \( \{ j_1, j_2 \} \times \{ J'_1, J'_2, J'_3 \} \). Since \( \phi_{S_1}^{2} = \phi_{S_1}^{1} \), \( J'_1 = o_1, J'_2 = o_2 \) and \( J'_3 = o_3 \), and hence \( I^2 = \{ (j_1, o_1); (j_1, o_2); (j_1, o_3); (j_2, o_1); (j_2, o_2); (j_2, o_3) \} \). Let us consider, for example, the transition of \( M_2 \) from state \( x \) under input \( (j_1, o_1) \). The input/state of the specification machine is given by \( (x \times j_1) \cap o_1 = (Ia, 3, 4 \times 00, 11) \cap o_8 \). In the same fashion, the remaining transitions of \( M_2 \) can be determined. The resulting STT of \( M_2 \) is given in Fig. 3.18. Note that four of the transitions do not have next-state/output values specified. These transitions correspond to the input combinations of \( j_2 \) and \( o_1 \) or \( o_2 \), which actually cannot occur. This stems from the fact that \( j_2 \) corresponds to primary input values \( o1 \) or \( 10 \), while both \( o_1 \) and \( o_2 \) may only be produced by \( M_1 \) under input \( i_1 \), which corresponds to primary input value \( \infty \). These two input conditions cannot occur at the same time, and therefore the four transitions may be treated as don’t-cares and used for minimization of \( M_2 \).

The combination of the above-constructed partial machines \( M_1 \) and \( M_2 \) (the machine \( M_{GC} \)) has a transition table defined by functions \( \phi_{GC} \) and \( \lambda_{GC} \). To illustrate determination of these functions, let us consider a situation, in which machine \( M_1 \) is in state \( a \), \( M_2 \) in state \( x \), and the value of function \( \Psi \) is \( i_1, j_1 \).

<table>
<thead>
<tr>
<th>in</th>
<th>imp</th>
<th>ps</th>
<th>ns</th>
<th>out</th>
</tr>
</thead>
<tbody>
<tr>
<td>( i_1 )</td>
<td>a</td>
<td>a</td>
<td>o_1</td>
<td></td>
</tr>
<tr>
<td>( i_2 )</td>
<td>a</td>
<td>b</td>
<td>o_3</td>
<td></td>
</tr>
<tr>
<td>( i_1 )</td>
<td>b</td>
<td>a</td>
<td>o_2</td>
<td></td>
</tr>
<tr>
<td>( i_2 )</td>
<td>b</td>
<td>b</td>
<td>o_3</td>
<td></td>
</tr>
<tr>
<td>( i_1 )</td>
<td>c</td>
<td>c</td>
<td>o_1</td>
<td></td>
</tr>
<tr>
<td>( i_2 )</td>
<td>c</td>
<td>b</td>
<td>o_3</td>
<td></td>
</tr>
</tbody>
</table>

Figure 3.17. STT of \( M_1 \)

<table>
<thead>
<tr>
<th>in</th>
<th>imp</th>
<th>ps</th>
<th>ns</th>
<th>out</th>
</tr>
</thead>
<tbody>
<tr>
<td>( j_1 )</td>
<td>o_1</td>
<td>x</td>
<td>x</td>
<td>p_1</td>
</tr>
<tr>
<td>( j_1 )</td>
<td>o_2</td>
<td>x</td>
<td>y</td>
<td>p_1</td>
</tr>
<tr>
<td>( j_1 )</td>
<td>o_3</td>
<td>x</td>
<td>x</td>
<td>p_1</td>
</tr>
<tr>
<td>( j_2 )</td>
<td>o_1</td>
<td>x</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>( j_2 )</td>
<td>o_2</td>
<td>x</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>( j_2 )</td>
<td>o_3</td>
<td>x</td>
<td>y</td>
<td>p_1,p_2</td>
</tr>
<tr>
<td>( j_1 )</td>
<td>o_1</td>
<td>y</td>
<td>x,y</td>
<td>p_2</td>
</tr>
<tr>
<td>( j_1 )</td>
<td>o_2</td>
<td>y</td>
<td>x</td>
<td>p_1</td>
</tr>
<tr>
<td>( j_1 )</td>
<td>o_3</td>
<td>y</td>
<td>x</td>
<td>p_1</td>
</tr>
<tr>
<td>( j_2 )</td>
<td>o_1</td>
<td>y</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>( j_2 )</td>
<td>o_2</td>
<td>y</td>
<td>*</td>
<td>*</td>
</tr>
<tr>
<td>( j_2 )</td>
<td>o_3</td>
<td>y</td>
<td>y</td>
<td>p_2</td>
</tr>
</tbody>
</table>

Figure 3.18. STT of \( M_2 \)
\[ \delta_{GC} \left( \begin{bmatrix} a \\ x \end{bmatrix}, \begin{bmatrix} i_1 \\ j_1 \end{bmatrix} \right) = \left[ \begin{array}{c} \cup_{C^1 \in [a \cap x] \times (i_1 \cap j_1)} \delta_1(a, (i_1, C^1)) \\ \cup_{C^2 \in [a \cap x] \times (i_1 \cap j_1)} \delta_2(x, (j_1, C^2)) \end{array} \right] \]

\[ \lambda_{GC} \left( \begin{bmatrix} a \\ x \end{bmatrix}, \begin{bmatrix} i_1 \\ j_1 \end{bmatrix} \right) = \left[ \begin{array}{c} \cup_{C^1 \in [a \cap x] \times (i_1 \cap j_1)} \lambda_1(a, (i_1, C^1)) \\ \cup_{C^2 \in [a \cap x] \times (i_1 \cap j_1)} \lambda_2(x, (j_1, C^2)) \end{array} \right] \]

This situation triggers transition \((i_1, a, a, o_1)\) in \(M_1\). Since the output of \(M_1\) is \(o_1\), the transition \((j_1, o_1, x, x, p_1)\) is triggered in \(M_2\). Combined, this results in transition \((i_1, j_1), [a, x], [a, x], [o_1, p_1]\). The entire STT of \(M_{GC}\) is given in Fig. 3.19. Taking into consideration the mapping functions \(\Psi, \Phi\) and \(\Theta\), we obtain corresponding values of input, state and output of the machine \(M^*\) given in parenthesis. Again, it can be verified that the composition of partial machines \(M_i\) is a state and output behavior realization of \(M^*\) and of \(M\).

In this way, we have demonstrated how the trinities of covers from Theorem 3.2 determine the multi-state decompositional realization structures of incompletely specified sequential machines.

<table>
<thead>
<tr>
<th>in</th>
<th>ps</th>
<th>ns</th>
<th>out</th>
</tr>
</thead>
<tbody>
<tr>
<td>(<a href="00">i_1, j_1</a>)</td>
<td>(<a href="1a">a, x</a>)</td>
<td>(<a href="1a">a, x</a>)</td>
<td>(<a href="r_1">o_1, p_1</a>)</td>
</tr>
<tr>
<td>([i_2, j_2](01, 10))</td>
<td>(<a href="1a">a, x</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="r_4">o_3, p_2</a>, <a href="r_5">o_3, p_2</a>)</td>
</tr>
<tr>
<td>(<a href="11">i_2, j_1</a>)</td>
<td>(<a href="1a">a, x</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>)</td>
</tr>
<tr>
<td>(<a href="00">i_1, j_1</a>)</td>
<td>(<a href="2">a, y</a>)</td>
<td>(<a href="1a">a, x</a>, <a href="2">a, y</a>)</td>
<td>(<a href="r_2">o_3, p_2</a>)</td>
</tr>
<tr>
<td>([i_2, j_2](01, 10))</td>
<td>(<a href="2">a, y</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="r_5">o_3, p_2</a>)</td>
</tr>
<tr>
<td>(<a href="11">i_2, j_1</a>)</td>
<td>(<a href="2">a, y</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>)</td>
</tr>
<tr>
<td>(<a href="00">i_1, j_1</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="2">a, y</a>)</td>
<td>(<a href="r_3">o_2, p_1</a>)</td>
</tr>
<tr>
<td>([i_2, j_2](01, 10))</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>, <a href="r_5">o_3, p_2</a>)</td>
</tr>
<tr>
<td>(<a href="11">i_2, j_1</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>)</td>
</tr>
<tr>
<td>(<a href="00">i_1, j_1</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="1a">a, x</a>)</td>
<td>(<a href="r_3">o_2, p_1</a>)</td>
</tr>
<tr>
<td>([i_2, j_2](01, 10))</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="r_5">o_3, p_2</a>)</td>
</tr>
<tr>
<td>(<a href="11">i_2, j_1</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>)</td>
</tr>
<tr>
<td>(<a href="00">i_1, j_1</a>)</td>
<td>(<a href="4">c, x</a>)</td>
<td>(<a href="4">c, x</a>)</td>
<td>(<a href="r_1">o_1, p_1</a>)</td>
</tr>
<tr>
<td>([i_2, j_2](01, 10))</td>
<td>(<a href="4">c, x</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>, <a href="r_5">o_3, p_2</a>)</td>
</tr>
<tr>
<td>(<a href="11">i_2, j_1</a>)</td>
<td>(<a href="4">c, x</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>)</td>
</tr>
<tr>
<td>(<a href="00">i_1, j_2</a>)</td>
<td>(<a href="5">c, y</a>)</td>
<td>(<a href="4">c, x</a>)</td>
<td>(<a href="r_2">o_1, p_2</a>)</td>
</tr>
<tr>
<td>(<a href="00">i_1, j_1</a>)</td>
<td>(<a href="5">c, y</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="r_5">o_3, p_2</a>)</td>
</tr>
<tr>
<td>([i_2, j_2](01, 10))</td>
<td>(<a href="5">c, y</a>)</td>
<td>(<a href="1b">b, y</a>)</td>
<td>(<a href="r_5">o_3, p_2</a>)</td>
</tr>
<tr>
<td>(<a href="11">i_2, j_1</a>)</td>
<td>(<a href="5">c, y</a>)</td>
<td>(<a href="3">b, x</a>)</td>
<td>(<a href="r_4">o_3, p_1</a>)</td>
</tr>
<tr>
<td>(<a href="0">i_1, j_2</a>)</td>
<td>(_)</td>
<td>(_)</td>
<td>(_)</td>
</tr>
</tbody>
</table>

**Figure 3.19. STT of \(M_{GC}\)**


3.5 Conclusions

In this chapter, we have presented the theory of decomposition of finite state machines based on information modeling with covers. We have presented the theorem about existence of the general decomposition of an incompletely specified FSM into a network of partial FSMs interconnected in an arbitrary fashion, with input encoder and output decoder and with multi-state realization of the FSM. The theorem views all elements of the decomposition (i.e. partial machines and coders) as information processing sub-systems, where the input and output information streams are modeled by covers. The theoremformulates conditions that these input/output covers need to fulfill in order to define a valid decomposition. Given a set of covers fulfilling the conditions of the theorem, we are guaranteed to be able to build an actual network of partial FSMs that will realize the behavior of the original specification machine. We demonstrated with an example, how the covers describing the decomposition relate to the particular partial machines and coder blocks in the decomposition network.

The decomposition scheme described by the theorem is the most general known decomposition scheme for any finite state machine. It includes multi-state behavior realizations and generalized don’t-cares in both output and next-state functions. Thus, it covers also nondeterministic FSMs. Since Boolean functions are a special case of FSMs with a single state and trivial next-state function, the theorem also covers decompositions of Boolean functions. In [35], Jóźwiak showed that other well known decomposition structures, such as serial and parallel decomposition, are some special cases of the structures described by the General Decomposition Theorem.

In the following chapters we will focus on one specific special case of the General Decomposition Theorem that is related to FSM state assignment, and we will use the conditions of the theorem to formulate conditions for a valid assignment. We will also mention other practical application of the theorem to functional decomposition of Boolean functions. However, the applicability of the theorem goes far beyond these two applications. It can be used in any field dealing with discrete systems involving finite state machines, functions or relations, and their decompositions.
Chapter 4

General Decomposition in circuit synthesis

4.1 Sequential synthesis

The goal of FSM state encoding is to assign binary codes to the symbolic states of a sequential machine in such a way that the resulting binary next-state and output functions can be effectively and efficiently realized in the target implementation technology. Regardless of the method used to assign the codes, a sequential machine encoded with a valid encoding is functionally equivalent or compatible with the original, symbolic machine. If no state minimization is performed by the state encoding, the encoded machine is a state and output behavior realization of the symbolic machine. Otherwise it is an output behavior realization. We can interpret the encoded machine structure as a special case of the general decomposition of the original, symbolic machine into a network of two-state (binary) partial machines.

In the following, we will show how a particular state assignment induces the corresponding decomposition of the symbolic machine. By applying conditions of the General Decomposition Theorem to this specific decomposition, we derive conditions for a valid state assignment. These conditions, which are more general and flexible than conditions used in current state assignment methods, form the basis of a general, implementation-platform-independent encoding method. As a result of the underlying flexible conditions, the method is able to exploit some implicit optimizations of the encoded machine, making its implementation more efficient.

4.1.1 State-encoding-induced decomposition

Let us consider how state encoding induces decomposition of a symbolic finite state machine. First, we will discuss encoding of completely specified machines and then extend the results to the incompletely specified case.

To illustrate the process, we will consider the completely specified FSM given in Fig. 4.1(a). The example state assignment for this machine, given in Fig. 4.1(b), minimized the state space of the machine by assigning the same code to states $b$ and $c$. The gate implementation of the example machine is shown in Fig. 4.2. We superimposed on
the circuit the outline of the decomposition structure introduced by the encoding. Let us discuss how this structure was derived.

**Decomposition structure**

The current state of the symbolic machine is determined by the value of a single symbolic (multi-valued) state variable. In the case of encoded machine, the current state is a composition of the states of the binary encoding variables. For instance, in the encoded machine in Fig. 4.1 the current state \( a \) is composition of the state ‘0’ of the first encoding bit and state ‘1’ of the second encoding bit, while state \( b \) is indicated by state ‘1’ of the first encoding bit and ‘1’ of the second. In this formulation, we can view each binary encoding variable as a state variable of a partial, two-state machine. The composition of the current states of these partial machines gives the current state of the realized original symbolic machine.

In addition to the current state computation performed by the partial state machines, the encoded machine has to produce the primary output of the realized machine. This task is performed by the output decoder block \( \Theta \). This combinational block uses outputs of the partial machines (binary encoding bits), which deliver the state information, and the primary inputs to compute the primary outputs. (In general, the output decoder uses some sub-sets of the state variables and input variables to compute a particular primary output). Naturally, as in the case of any combinational function, the output encoder block can be viewed as a special case of sequential machine with trivial state behavior. This way, it is subject to further decomposition according to General Decomposition Theorem. In particular, it can be decomposed into a set of binary functions (combinational partial machines), each producing a single binary primary output of the original machine.

The last element of general decomposition is the input encoder block \( \Psi \). Since state assignment does not introduce any input encoding, the role of the input encoder is limited to distribution of the binary inputs to the particular partial machines and the output decoder sub-circuits. The connection between the input encoder and the output decoder blocks is realized by a trivial combinational partial machine that simply transfers inputs received from the input encoder block to the output decoder block (identity function). Note that for a Moore machine, where the output only depends on the current state, this trivial machine is not needed, as the output decoder does not need any primary input information.

The resulting decomposition structure is shown in Fig. 4.3. The partial machine \( M_1 \) is the sequential machine associated with the first encoding bit. Therefore, the machine has two states: ‘0’ for the original machine’s states \( a \) and \( d \) and ‘1’ for states \( b \), \( c \) and \( e \) (see encoding table in Fig. 4.1). Thus, its state space is described by the set system \( \pi_1 = \{a, d; b, c, e\} \). The second state-bit-machine, \( M_2 \), has the state space defined by the second encoding bit — it is in state ‘0’ for the states \( c \) and \( d \) and in state ‘1’ for the states \( a \), \( b \) and \( e \). Therefore, \( \pi_2 = \{c, d; a, b, e\} \).

Since in general state assignment does not introduce any encoding of the inputs or state information flowing between the machines, the role of both the input encoder block \( \Psi \) and the inter-machine connection blocks \( \text{Con}_{i,j} \) is reduced to distribution of the binary input and state variables. The selection of particular binary signals to distribute to partial machines is determined by the input supports of the combinational functions associated with each of the machines. These input supports can be derived from the encoded form...
### 4.1. SEQUENTIAL SYNTHESIS

#### (a) Transition table

<table>
<thead>
<tr>
<th>$pi_1$</th>
<th>$pi_2$</th>
<th>$ps$</th>
<th>$ns$</th>
<th>$po_1$</th>
<th>$po_2$</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>a</td>
<td>d</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>a</td>
<td>a</td>
<td>00</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>a</td>
<td>c</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>a</td>
<td>b</td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>b</td>
<td>a</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>b</td>
<td>d</td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>b</td>
<td>a</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>b</td>
<td>d</td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>c</td>
<td>d</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>c</td>
<td>a</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>c</td>
<td>d</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>c</td>
<td>a</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>d</td>
<td>a</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>d</td>
<td>d</td>
<td>01</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>d</td>
<td>e</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>d</td>
<td>c</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>00</td>
<td>e</td>
<td>a</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>01</td>
<td>e</td>
<td>d</td>
<td>10</td>
<td></td>
<td></td>
</tr>
<tr>
<td>10</td>
<td>e</td>
<td>a</td>
<td>11</td>
<td></td>
<td></td>
</tr>
<tr>
<td>11</td>
<td>e</td>
<td>d</td>
<td>10</td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>$pi_1$</th>
<th>$pi_2$</th>
<th>$sb_1$</th>
<th>$sb_2$</th>
<th>$sh_1$</th>
<th>$sh_2$</th>
<th>$po_1$</th>
<th>$po_2$</th>
</tr>
</thead>
<tbody>
<tr>
<td>00</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>01</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>10</td>
<td>a</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>11</td>
<td>a</td>
<td>b</td>
<td>0</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>00</td>
<td>b</td>
<td>a</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>01</td>
<td>b</td>
<td>d</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>10</td>
<td>b</td>
<td>a</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>11</td>
<td>b</td>
<td>d</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>00</td>
<td>c</td>
<td>d</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>01</td>
<td>c</td>
<td>a</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>10</td>
<td>c</td>
<td>d</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>11</td>
<td>c</td>
<td>a</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>00</td>
<td>d</td>
<td>a</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>01</td>
<td>d</td>
<td>d</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>10</td>
<td>d</td>
<td>e</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>11</td>
<td>d</td>
<td>c</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>00</td>
<td>e</td>
<td>a</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>01</td>
<td>e</td>
<td>d</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>10</td>
<td>e</td>
<td>a</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>11</td>
<td>e</td>
<td>d</td>
<td>1</td>
<td>0</td>
<td>1</td>
<td>1</td>
<td>0</td>
</tr>
</tbody>
</table>

#### (b) Encoding

<table>
<thead>
<tr>
<th>state code</th>
<th>01</th>
<th>00</th>
<th>01</th>
<th>10</th>
<th>11</th>
</tr>
</thead>
<tbody>
<tr>
<td>$a$</td>
<td>b</td>
<td>c</td>
<td>d</td>
<td>e</td>
<td>f</td>
</tr>
<tr>
<td>$b$</td>
<td>a</td>
<td>d</td>
<td>c</td>
<td>f</td>
<td>e</td>
</tr>
<tr>
<td>$c$</td>
<td>b</td>
<td>e</td>
<td>a</td>
<td>f</td>
<td>d</td>
</tr>
<tr>
<td>$d$</td>
<td>c</td>
<td>f</td>
<td>b</td>
<td>e</td>
<td>a</td>
</tr>
<tr>
<td>$e$</td>
<td>d</td>
<td>f</td>
<td>c</td>
<td>a</td>
<td>b</td>
</tr>
<tr>
<td>$f$</td>
<td>e</td>
<td>a</td>
<td>d</td>
<td>b</td>
<td>c</td>
</tr>
</tbody>
</table>

#### (c) Encoded machine

Figure 4.1. Example finite state machine

Figure 4.2. Circuit implementing the encoded machine
of the finite state machine, such as the one presented in Fig. 4.1(c). Using any input support minimization method we can determine that the supports of the binary outputs are as follows:

\[
\begin{align*}
 sb_1 & : \{pi_1, sb_1\} \\
 sb_2 & : \{pi_2, sb_1, sb_2\} \\
 po_1 & : \{pi_1, sb_1\} \\
 po_2 & : \{pi_2, sb_2\}
\end{align*}
\]

We may further differentiate between the primary inputs and state inputs in the input support:

\[
\begin{align*}
 sb_1 & : \{IS_1, SS_1\} = \{\{pi_1\}, \{sb_1\}\} \\
 sb_2 & : \{IS_2, SS_2\} = \{\{pi_2\}, \{sb_1, sb_2\}\} \\
 po_1 & : \{IS_3, SS_3\} = \{\{pi_1\}, \{sb_1\}\} \\
 po_2 & : \{IS_4, SS_4\} = \{\{pi_2\}, \{sb_2\}\}
\end{align*}
\]

This way we determine that the primary input information delivered to M1 by the \( \Psi \) block is equal to the set system induced on the input symbols by the first input bit, i.e. \( \pi_1 = \{00, 01, 10, 11\} \). In general, the primary input information available to a partial

\[
\begin{align*}
 M & = \{M_1, M_2, M_3, M_4\} \\
 I & = (\pi_1^{B1}, \pi_1^{B2}) \\
 O & = \pi_1^{B1} \\
 I & = (\pi_1^{B2}, \pi_1^{B2}) \\
 O & = \pi_1^{B2} \\
 I & = (\pi_1^{B3}, \pi_1^{B4}) \\
 O & = \pi_1^{B3} \\
 I & = (\pi_1^{B4}, \pi_1^{B4}) \\
 O & = \pi_1^{B4}
\end{align*}
\]

\[\text{Figure 4.3. Encoding-induced decomposition}\]
machine can be described by the product of set systems induced by the primary inputs in the input support of the state bit associated with this machine, i.e.

$$\pi_I = \prod_{p_i \in IS_i} \pi_I^{IB}$$  \hspace{1cm} (4.1.2)

where:
- $IS_i$ are the primary inputs in the support of the state bit $i$ (primary support)
- $\pi_I^{IB}x$ is the set system induced on the input symbols by the primary input bit $x$.

Similarly, the state information imported by a particular partial machine is a product of the state set systems associated with the state bits in the state input support of the partial machine, i.e.

$$\pi_S = \prod_{sb_j \in SS_i \setminus \{sb_h\}} \pi_S^{j}$$  \hspace{1cm} (4.1.3)

where $SS_i$ are the state inputs in the support of the state bit $i$ (state support)

In the example above, the state support of $M2$ contains $sb_1$, and therefore the state information imported by $M2$ is $\pi_S = \{ad, bce\}$.

This example brings us to another specific feature of the encoding-induced decomposition. Since the output of a partial sequential machine is in fact an encoding bit, it only conveys state information, without any partial primary input information. Therefore, instead of describing the output of a partial machine by $\pi_S^{'}$, as in General Decomposition Theorem, we can use simplified representation by the state set system of the partial machine $\pi_S$.

Let us consider now the combinational machines $M_1$ and $M_2$ producing the primary outputs. As already mentioned, they are the result of decomposition of the combinational output decoder block $\Theta$. As for any combinational machine, the state space of $\Theta$ is a single state that does not change, and therefore the state behavior of the output machines is trivial and can be omitted. The input of $\Theta$ is composed of state bits (outputs of sequential partial machines $M_1$ and $M_2$) and the primary inputs forwarded by the trivial machine $M_3$. These inputs are distributed without any encoding to the output machines according to the input supports of the primary output functions. Therefore, as in the case of sequential machines, the input information of an output machine has two components: the primary input component being the combination of primary input information delivered by the primary input support of the binary output function computed by the machine, and the state component being the combination of state information delivered by the state bits in the input support of the binary output function. Since the output of each output machine is a primary binary output, the output information of the machine is described by the set system on the state/input space induced by the corresponding binary output variable $\pi_S^{OB_i}$. 

**Impact on General Decomposition Theorem**

The structure described in the previous section is a special case of the general decomposition of the symbolic machine. Therefore, as for any other decomposition, the conditions imposed by the General Decomposition Theorem on the set systems describing the
information flows within the decomposition structure have to hold. Now, we can formulate the conditions imposed by the General Decomposition Theorem on the set systems defining the encoding. Let us analyze how the above mentioned specific features of the encoding-induced decomposition influence the specific formulation of General Decomposition Theorem.

**Partial machines** In our analysis we have to distinguish between the two types of machine present in the encoded network: the sequential partial machines (state-bit machines) implementing the next-state function, and the combinational partial machines (output machines) implementing the output decoder, i.e. the output function. We will omit the trivial machine that forwards the input information to the block (M3 in Fig.4.3), as it does not perform any information processing, and therefore does not introduce any new set systems.

The most significant impact on the formulation of General Decomposition Theorem is associated with the fact that the trinity of partitions \((\pi_S^i, \pi_S^{i|}, \pi_S^{i|S})\) is no longer necessary to define a partial machine. In the case of state-bit machines, their input information is unambiguously defined by their input supports, and the state information is defined by the state partition. The output is equal to the state partition, and so \(\pi_S^{i|S}\) is reduced to \(\pi_S^i\). A state-bit machine can be therefore described by a pair \((IS_i, SS_i, \pi_S^i)\). For each output machine, its input information is also defined by its input support, and its output information is induced by the corresponding primary output variable. Thus, an output machine can be described by a pair \((IS_o, SS_o, \pi_S^{OBo})\).

**Input information of a partial machine** This influences the expression for the total input and state information available to a partial machine \((\pi_S^{in}, \pi_S^{i|}, \pi_S^{i|S})\). For state-bit machines, the last factor in this expression (the imported information) becomes now the input/state information induced by the imported state information, i.e.

\[
\pi_S^{i|S} = \text{ind}^S_{S \times I} (\pi_S^{i|S}) = \text{ind}^S_{S \times I} \left( \prod_{sb_i \in SS_i} \pi_S^i \right) \tag{4.1.4}
\]

This expression contains the information about the machine’s own state if it is used by the machine \((\pi_S^i if sb_i \in SS_i)\), so the first factor of the input information \(\pi_S^{S \times I} = \text{ind}^S_{S \times I} (\pi_S^i)\) is covered by the third and can be removed.

The output machines do not import information from each other, but to introduce a uniform notation we may treat the state-bit inputs in a support of an output machine as an imported information. Then, the imported part of the input information is described with the same expression as for state-bit machines. The output machines do not have any own state information, so the first factor of input information can also be removed.

The second factor, the state/input information delivered by primary inputs can be described for both state-bit and output machines by

\[
\pi_S^{I \times I} = \text{ind}^I_{S \times I} (\pi_I^i) = \text{ind}^I_{S \times I} \left( \prod_{p_i \in IS_i} \pi_I^{IB} \right) \tag{4.1.5}
\]
In summary, the total state/input information available to either state-bit or output partial machine is

\[ \pi_{S \times I}^{in} = ind_{S \times I}^{I} \left( \prod_{\pi_j \in S_i} \pi_{I}^{IB_j} \right) \cdot ind_{S \times I}^{S} \left( \prod_{\phi_j \in S_i} \pi_{S}^{j} \right) \] (4.1.6)

This expression can be used to formulate the specific modified conditions (1) and (2) of the General Decomposition Theorem.

In the example we follow, the state information available to \( M_1 \) is determined by its state input support (\( \{sb_1\} \), see Eq.4.1.1) and is equal to \( \pi_{S}^{1} \). The primary input information is determined by the primary input support (\( \{i_1\} \)) of the state bit associated with the machine and is equal to \( \pi_{I}^{IB_1} \). Therefore, the total input information available to \( M_1 \) is

\[ \pi_{S \times I}^{in} = ind_{S \times I}^{I} \left( \pi_{I}^{IB_1} \right) \cdot ind_{S \times I}^{S} \left( \pi_{S}^{1} \right) = \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
\[ \cdot \]
For machines $M_{01}$ and $M_{02}$ input information is

$$\pi_{S \times I}^{in} = \text{ind}_{S \times I}^{I^{B1}}(\pi_{S}^{I^{B1}}) \cdot \text{ind}_{S \times I}^{S}(\pi_{S}^{I}) = \text{ind}_{S \times I}^{I}(\{00, 01; 10, 11\}) \cdot \text{ind}_{S \times I}^{S}(\{ad; be\})$$

$$\pi_{S \times I}^{in} = \{ (a, 00)(a, 01)(d, 00)(d, 01); (a, 10)(a, 11)(d, 10)(d, 11); (b, 00)(b, 01)(c, 00)(c, 01)(e, 00)(e, 01); (b, 10)(b, 11)(c, 10)(c, 11)(e, 10)(e, 11) \} \quad (4.1.9)$$

Output of the partial machines The condition (2) of the General Decomposition Theorem requires in general case that the output information of a machine can be computed from its input information. For state-bit machines, the output of a machine is its state, so a machine is capable of producing the output as long as it is capable of producing the state. Thus, condition (2) is covered by condition (1) and can be left out. For output-bit machines, the input information has to be sufficient to compute the binary primary output of the machine that is associated with the particular partial machine. This can be described as

$$\pi_{S \times I}^{in} \cdot \pi_{S \times I}^{o} \leq \pi_{S \times I}^{OB} \quad (4.1.10)$$

The $n + o$ index reflects the numbering of the partial machines, such that the machines 1, ..., $n$ (where $n$ is the code length) are the state-bit partial machines, while $n+1, \ldots, n+l$ (where $l$ is the number of primary output bits) are the output-bit partial machines.

Remaining conditions The condition (3), which prevents forming of combinational loops through imported information, can be removed as only state information is transferred between the machines and the feasibility of producing this information is already ascertained by condition (1).

The condition (4) of General Decomposition Theorem describes the conditions for the network to produce valid primary output of the realized machine. This is, however, already guaranteed by the validity of the output machines, i.e. condition (2). Therefore, condition (4) can be omitted.

The condition (5) of General Decomposition Theorem remains unaffected by the specific nature of the decomposition. For the network to realize the state behavior of the symbolic machine, the combination of state information delivered by particular machines still has to allow for unambiguous identification of the current state of the symbolic machine.

State assignment theorem As discussed above, the specific structure of the encoding-induced decomposition gives rise to specific reformulation of the conditions of the General Decomposition Theorem.
The particular form of partial machines and the structure of their interconnections allows to express the input information of the partial machines as a function of the supports of the binary functions realized by these machines. This influences the formulation of the conditions expressing the relationship of input and output information in the partial machine. Moreover, the distinction between the state-bit and output-bit partial machines allows to omit some of the conditions concerning state or output behavior. In particular, the output behavior of state-bit machines is the direct consequence of their state behavior, so the output-behavior condition (2) for these machines can be omitted. On the other hand, the output-bit machines are purely combinational, so the state-behavior condition (1) can be omitted for them.

Also, the conditions (3) and (4) turn out to be superseded by the other conditions, and thus can be removed. The only condition that remains essentially unaffected by the encoding-specific structure is the condition (5).

Taking into account all of the above, we can derive from the General Decomposition Theorem the conditions for a valid state assignment described by a set of two-block set systems.

Let \( M = \{ I, S, O, \delta, \lambda \} \) be a completely specified finite state machine with a set of \( k \) binary inputs \( P_{IM} = \{ p_{i1}, \ldots, p_{ik} \} \) and a set of \( l \) binary outputs \( PO_M = \{ po_{i1}, \ldots, po_{il} \} \). Let \( \pi_{IBi}^j \) for \( i = 1, \ldots, k \) be a partition induced on the input alphabet symbols by the binary primary bit input number \( i \) and \( \pi_{IBO}^j \) for \( o = 1, \ldots, l \) be a partition induced on the output alphabet symbols by the binary primary output bit number \( o \). Let encoding \( E_M = \{ \pi_S^1, \ldots, \pi_S^n \} \) be a set of \( n \) two-block partitions on the set of states \( S \) of \( M \). Let \( SB_M(E_M) = \{ sb_1, \ldots, sb_n \} \) be a set of binary state variables, such that a variable \( sb_i \) assumes value 0 for states that are present in the first block of \( \pi_S^i \) and value 1 for states that are present in the second block of \( \pi_S^i \).

**Definition 4.1 (Support pair set).** A support pair set for a \( k \)-input, \( l \)-output machine \( M \) encoded with \( n \)-bit encoding \( E_M \) is a set of pairs \((IS_i, SS_i)\) for \( i = 1, \ldots, n, \ldots, n + l \), such that \( IS_i \subseteq P_{IM}, SS_i \subseteq SB_M(E_M) \) and \( IS_i \cup SS_i \) is an input support of the \( i \)-th binary next-state \((i = 1, \ldots, n) \) or output function \((i = n + 1, \ldots, n + l)\) of the encoded \( M \).

For a given encoding \( E_M \) and some support pair set \( \{(IS_i, SS_i) \mid i = 1, \ldots, n + l\} \) for \( M \) encoded with \( E_M \), let

\[
\pi_S^{in}_{i} = \text{ind}_{S \times I} \left( \prod_{p_{ij} \in IS_i} \pi_{IBi}^j \right) \cdot \text{ind}_{S \times O} \left( \prod_{sb_j \in SS_i} \pi_{IBO}^j \right) \quad (4.1.11)
\]

denote the input information of a partial state-bit machine (for \( i = 1, \ldots, n \)) or a partial output-bit machine (for \( i = n + 1, \ldots, n + l \)).

**Theorem 4.1 (Encoding of completely specified FSM).** Let \( M = \{ I, S, O, \delta, \lambda \} \) be a completely specified finite state machine with \( k \) binary inputs and \( l \) binary outputs. A set of two-block partitions \( E = \{ \pi_S^1, \ldots, \pi_S^n \} \) defines a valid \( n \)-bit encoding with output behavior realization of \( M \) if and only if a support pair set \( \{(IS_i, SS_i) \mid i = 1, \ldots, n + l\} \) exists for \( M \) encoded with \( E \) such that

1. \( \forall i \in \{1 \ldots n\} \quad (\pi_S^{in}_{i}, \pi_S^i) \) is a \((S \times I - S)\) partition pair
2. \( \forall o \in \{1 \ldots l\} \quad \pi_S^{in}_{o+n} \leq \pi_{BO}^o \)
Additionally, if

\[ \prod_{i \in \{1, \ldots, n\}} \pi_S^i = \pi_S(0) \]

then the state behavior of \( M \) will be realized too.

To verify the conditions of the above theorem for the example machine, we first analyze the next states of the blocks of \( \pi_{S \times I}^{\text{in}} \) (Eq. 4.4.7). For elements of the first block \( (a, 00)(a, 01)(d, 00)(d, 01) \) the next states of the realized machine are \( d, a, a, d \), respectively. Therefore, \( \mathcal{S}(a, 00)(a, 01)(d, 00)(d, 01) = \{a, d\} \) is a subset of the first block of \( \pi_S^1 \). Similarly, for the remaining blocks of \( \pi_{S \times I}^{\text{in}} \) the next state sets are \( \{b, c, e\}, \{a, d\}, \) and \( \{a, d\} \), respectively. Each of these next state sets is included in a block of \( \pi_S^1 \), so \( (\pi_{S \times I}^1, \pi_S^1) \) is a \( (S \times I - S) \) pair. In the same manner it can be verified that \( (\pi_{S \times I}^2, \pi_S^2) \) is a \( (S \times I - S) \) pair, and therefore the condition (1) is fulfilled.

Condition (2) requires that the input partitions of the output machines \( \pi_{OB}^{\text{in}1} \) and \( \pi_{OB}^{\text{in}2} \), see Eq. 4.4.9) are smaller or equal to the partitions induced by the values of the primary output bits. We can derive the induced partitions directly from the STT in Fig. 4.1.

\[
\pi_{S \times I}^{OB1} = \{(a, 00)(a, 01)(d, 00)(d, 01); \\
(a, 10)(a, 11)(b, 00)(b, 01)(b, 10)(b, 11)(c, 00)(c, 01)(c, 10)(c, 11)(d, 00)(d, 01)(d, 10)(d, 11) \cdots \}
\]

\[
\pi_{S \times I}^{OB2} = \{(a, 01)(a, 11)(b, 00)(b, 01)(b, 10)(b, 11)(e, 00)(e, 01)(e, 10)(e, 11) \cdots \}
\]

and verify that indeed every block of \( \pi_{S \times I}^{\text{in}1} \) fits into a block of \( \pi_{S \times I}^{OB1} \) and every block of \( \pi_{S \times I}^{\text{in}2} \) fits into a block of \( \pi_{S \times I}^{OB2} \).

Finally, the product partition of the state partitions \( \pi_S^1 \cdot \pi_S^2 = \{a; b; c; d\} \) is not a zero-partition, so the encoded machine only realizes the output behavior of the symbolic machine, without realizing its state behavior.

Multi-state incompletely specified encoding We can also apply the above reasoning to the General Decomposition Theorem for the incompletely specified, non-deterministic machines. In this case, the theorem reads as follows.

Let \( M = \{I, S, O, \delta, \lambda\} \) be an incompletely specified finite state machine with a set of \( k \) binary inputs \( P_I = \{p_{i1}, \ldots, p_{ik}\} \) and a set of \( l \) binary outputs \( P_O = \{p_{o1}, \ldots, p_{o1}\} \). Let \( \phi_i^{OB} \) for \( i = 1, \ldots, k \) be a set system induced on the input alphabet symbols by the binary primary bit input number \( i \) and \( \phi_o^{OB} \) for \( o = 1, \ldots, l \) be a set system induced on the output alphabet symbols by the binary primary output bit number \( o \). Let encoding \( E_M = \{\phi_S^1, \ldots, \phi_S^n\} \) be a set of \( n \) two-block set systems on the set of states \( S \) of \( M \). Let \( SB_M(E_M) = \{s_{b1}, \ldots, s_{bn}\} \) be a set of binary state variables, such that a variable \( sb \) assumes value 0 for states that are present in the first block of \( \phi_S^i \), value 1 for states that are present in the second block of \( \phi_S^i \) and any of the values 0 or 1 (denoted “–”) for states that are present in both blocks of \( \phi_S^i \). Let \( L^i \) for \( i = 1, \ldots, n \) be
4.1. SEQUENTIAL SYNTHESIS

a set of mappings of blocks of $\phi^S_S$ containing states of $M$ to the blocks containing some copies of the same states in a multi-state realization $M_l = \{I, S_l, O, \delta_l, \lambda_l\}$ of $M$.

For a given encoding $E_M$ and some support pair set $\{(IS_i, SS_i) \mid i = 1, \ldots, n + l\}$ for $M$ encoded with $E_M$ let

$$\phi^{in}_{S_i \times I} = \text{ind}_{S_i \times I} \left( \prod_{p_i \in IS_i} \phi^{IB_i}_I \right) \cdot \text{ind}_{S_i \times I} \left( \prod_{s_{I_i} \in SS_i} L^j(\phi^S_S) \right) \quad (4.1.13)$$

denote the input information of a partial state-bit machine (for $i = 1, \ldots, n$) or a partial output-bit machine (for $i = n + 1, \ldots, n + l$).

**Theorem 4.2 (Encoding of incompletely specified FSM).** Let $M = \{I, S, O, \delta, \lambda\}$ be an incompletely specified non-deterministic finite state machine with $k$ binary inputs and $l$ binary outputs. A set of two-block set systems $E = \{\phi^S_S, \ldots, \phi^S_S\}$ defines a valid $n$-bit encoding with output behavior realization of $M$ if and only if $n + l$ support pair sets $(IS_i, SS_i)$ and $n$ label assignments $L^i : \phi^S_S \rightarrow \phi^S_S$ exist such that

1. $\forall i \in \{1, \ldots, n\}$, $\left( \{\phi^{in}_{S_i \times I}, \phi^S_S\} \right)$ is a synchronized $(S_i \times I - S_i)$ cover pair
2. $\forall o \in \{1, \ldots, l\}$, $L^{-1}(\phi^{in}_{S_i \times I} + n) \leq \phi^O_{S \times I}$

Additionally, if

3. $\left( \prod_{i \in \{1, \ldots, n\}} \phi^S_S, \phi^S_S(0) \right)$ is a $S_i - S$ cover pair

then the state behavior of $M$ will be realized too.

**4.1.2 Mechanics of state assignment**

The example in the previous section illustrated the specific structure of the encoding-induced decomposition. In this structure, a single partial machine is present for each of the binary next-state and output functions induced by the encoding. The partial machines are interconnected with binary wires, without any interconnection encoders.

Therefore, the problem of state encoding can be formulated as the problem of finding the specific decomposition of a symbolic sequential machine to partial machines with at most two states (two states for sequential machines and one state for combinational). Additionally, the output set systems of the combinational partial machines are constrained to the set systems induced by the primary output variables (output of a combinational machine is a binary primary output), and the output set systems of the sequential machines are constrained to the state set systems $\pi^S_S$ (output of a sequential machine is a binary state bit). The set systems are subject to conditions formulated in Theorem 4.2.

This formulation is a generalization of the formulation used by the dichotomy-based state assignment methods, which search for a set of dichotomies defining particular encoding bits (equivalent to 2-block set systems defining the state spaces of the sequential partial machines). However, most current dichotomy-based methods require that the product of all dichotomies is equal to zero-partition. In other words, they require a unique code for each of the states. This requirement is equivalent to the condition (3) of Theorem 4.2, i.e. the state behavior realization. Meanwhile, usually it is not necessary to realize state behavior of a sequential machine as long as the output behavior is
realized. The condition (3) can be therefore dropped. Such relaxation of the conditions means that the encoded machine is only required to realize the output behavior of the symbolic machine, and therefore allows for implicit state minimization during state assignment. Under this formulation, very flexible state assignment is possible, allowing for state minimization and incompletely specified codes (state set systems with overlapping blocks).

In the following, we will describe mechanics of the construction of a valid state encoding of a finite state machine, which exploit the freedom afforded by the Theorem 4.2. This mechanics are independent of the target implementation platform, as they only consider validity of the encoding, without attempt at optimizing the implementation effectiveness or cost. On top of this mechanics, platform-specific heuristics have to be used to guide the encoding construction process towards efficient implementations.

State assignment construction

The goal of a state encoding method is the construction of a set of two-block set systems fulfilling the conditions of Theorem 4.2. In this process we propose to exploit the correspondence between a set system and its information set.

A symbolic FSM state variable has a single value to represent each state of the machine, e.g. the state variable of the example FSM in Fig. 4.1 has a separate value $a, b, c, d$ or $e$ for each of the machine’s states. Therefore, if we denote the set of states \{a, b, c, d, e\} as $T$, the symbolic state variable delivers the full information about $T$, described as a zero partition $\pi_T(0) = \{a; b; c; d; e\}$ or an information set of all distinctions $IS(\pi_T(0)) = \{a/b, a/c, a/d, a/e, b/c, b/d, b/e, c/d, c/e, d/e\}$. After encoding, this information is delivered by a set of binary state variables, each of which can be represented by their corresponding two-block set system or information set. Therefore, the construction of the two-block set systems that define the encoding can be described as a problem of finding a set of two-block set systems that in product give the zero partition describing the information delivered by the symbolic encoding variable. Alternatively, it is equivalent to the problem of distribution of the distinctions from $IS(\pi_T(0))$ between the information sets of the encoding set systems.

The above formulation obeys the condition (3) of Theorem 4.2, which is required for state behavior realization. If we drop this requirement and only require output behavior realization of the encoded machine, the product of the encoding set systems does not need to be a zero partition. In the information set formulation it means that the sum of information sets of the encoding variables does not need to include all distinctions from $IS(\pi_T(0))$. Therefore, a valid encoding can be constructed by distribution of some of the state distinctions realized by the symbolic encoding variable between the information sets of the encoding variables in such way that the set systems corresponding to these information sets fulfill the conditions of Theorem 4.2.

For instance, to construct the encoding in Fig. 4.1(b), the distinctions from $IS(\pi_T(0))$ can be distributed over the two information sets $IS_1 = \{a/b, a/c, a/e, b/d, c/e, d/e\}$ and $IS_2 = \{a/c, a/d, b/c, b/d, c/e, d/e\}$. Their corresponding set systems are $\Phi_1 = \{a; d, b, c, e\}$ and $\Phi_2 = \{c, d; a, b, e\}$, which we recognize as the encoding set systems of the example machine. Note that the information sets need not be disjoint and that not all distinctions from $IS(\pi_T(0))$ are present in the union of $IS_1$ and $IS_2$. In particular, the distinction $b/e$ is missing from the union, which manifests itself in the encoding by the
fact that the states b and e have the same code.

The construction procedure of the encoding set systems performs simultaneous selection of the distinctions for the information set and construction of the corresponding encoding set system. For this purpose, dichotomies (see Def. 2.22) are used. We recall that dichotomy is an equivalent shorthand notation for a two-block set system and therefore is suitable for encoding representation. The procedure starts with a set of atomic dichotomies. Atomic dichotomy represents elementary information and is equivalent to set systems delivering single distinction between two particular states. A set system delivering a single distinction is a set system that has the information set containing exactly one elementary information (distinction).

Take for instance the FSM in Fig. 4.4. It has four states a, b, c and d, and therefore all state distinctions are \( \{a/b, a/c, a/d, b/c, b/d, c/d\} \). The set system delivering a single distinction \( a/b \) is \( \phi_{a/b} = \{a, c, d; b, c, d\} \) and it has an equivalent atomic dichotomy \( a/b \).

At this point the set of atomic dichotomies already defines a valid encoding (so called atomic encoding, see Fig. 4.5), as the union of their information sets gives the information set containing all state distinctions, and therefore the encoded machine is state and output behavior realization of the symbolic machine. However, the length of the encoding \( s \cdot (s - 1)/2 \), where \( s \) is the number of states) is likely to introduce prohibitive cost of the realization. Therefore, a more compact encoding is constructed by combining some atomic dichotomies what corresponds to combining information sets of the combined atomic dichotomies, i.e. by construction of encoding set systems with information sets containing more elementary information items (more state distinctions). Naturally, the more state distinctions a single encoding set system realizes, the less set systems are necessary to cover all required distinctions, and therefore the less the length of the encoding. The process of combining the information sets of particular dichotomies corresponds to merging of the dichotomies involved (see Def. 2.26). For example, joining informa-

<table>
<thead>
<tr>
<th>i</th>
<th>ps</th>
<th>ns</th>
<th>o</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>a</td>
<td>c</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>a</td>
<td>b</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>b</td>
<td>c</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>b</td>
<td>a</td>
<td>0</td>
</tr>
<tr>
<td>0</td>
<td>c</td>
<td>a</td>
<td>0</td>
</tr>
<tr>
<td>1</td>
<td>c</td>
<td>d</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>d</td>
<td>c</td>
<td>1</td>
</tr>
<tr>
<td>1</td>
<td>d</td>
<td>b</td>
<td>1</td>
</tr>
</tbody>
</table>

Figure 4.4. Example FSM

<table>
<thead>
<tr>
<th>dich</th>
<th>a/b</th>
<th>a/c</th>
<th>a/d</th>
<th>b/c</th>
<th>b/d</th>
<th>c/d</th>
</tr>
</thead>
<tbody>
<tr>
<td>( \phi )</td>
<td>{acd; bcd}</td>
<td>{abd; bcd}</td>
<td>{abc; bcd}</td>
<td>{abd; acd}</td>
<td>{abc; acd}</td>
<td>{abc; abd}</td>
</tr>
<tr>
<td>a</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>--</td>
<td>--</td>
<td>--</td>
</tr>
<tr>
<td>b</td>
<td>1</td>
<td>--</td>
<td>--</td>
<td>0</td>
<td>0</td>
<td>--</td>
</tr>
<tr>
<td>c</td>
<td>--</td>
<td>1</td>
<td>--</td>
<td>1</td>
<td>--</td>
<td>0</td>
</tr>
<tr>
<td>d</td>
<td>--</td>
<td>--</td>
<td>1</td>
<td>--</td>
<td>1</td>
<td>1</td>
</tr>
</tbody>
</table>

Figure 4.5. Atomic encoding of the example machine
tion sets of the atomic dichotomies \( \frac{a}{c} \) and \( \frac{c}{d} \) corresponds to merging these two dichotomies to a dichotomy \( \{a, d; c\} \) (or \( \frac{ad}{c} \)) with information set \( IS(ad/c) = \{a/c, c/d\} \).

Additionally, assuming that only output behavior realization is required, the encoding can be compacted by removing some of the atomic dichotomies from the encoding, as long as the conditions (1) and (2) of the Theorem 4.2 are fulfilled by the remaining set of dichotomies.

For instance, the dichotomy \( \frac{a}{b} \) can be removed from the atomic encoding without violating the first two conditions of the Theorem 4.2. The remaining dichotomies are then \( \frac{ad}{c}, \frac{a/d}{b/c} \) and \( \frac{b/d}{c} \), and the corresponding encoding set systems: \( \phi^1 = \{abd; bc\}, \phi^2 = \{abc; bcd\}, \phi^3 = \{abd; acd\} \) and \( \phi^4 = \{abc; acd\} \) define four partial state machines associated with four state bits \( \{sb_1, sb_2, sb_3, sb_4\} \). After encoding the FSM with the encoding defined by these set systems (see Fig. 4.6), input supports of the state bits and output are as follows

\[
\begin{align*}
\text{sb}_1 : IS_1, SS_1 & = \{i\}, \{sb_1, sb_3\} \\
\text{sb}_2 : IS_2, SS_2 & = \{i\}, \{sb_3\} \\
\text{sb}_3 : IS_3, SS_3 & = \{i\}, \{sb_3\} \\
\text{sb}_4 : IS_4, SS_4 & = \{i\}, \{sb_1\} \\
o : IS_5, SS_5 & = \{i\}, \{sb_1, sb_2, sb_3, sb_4\} \\
\end{align*}
\]

Let us assume that encoding does not introduce any multi-state realization for any of the states of the symbolic FSM. Then, all mappings \( L_i \) are identity functions and all input and output set systems in Theorem 4.2 are on \( S \times I \) instead of \( S_i \times I \). The input

<table>
<thead>
<tr>
<th>dich</th>
<th>( \phi^1 )</th>
<th>( \phi^2 )</th>
<th>( \phi^3 )</th>
<th>( \phi^4 )</th>
</tr>
</thead>
<tbody>
<tr>
<td>( a )</td>
<td>0</td>
<td>0</td>
<td>–</td>
<td>–</td>
</tr>
<tr>
<td>( b )</td>
<td>–</td>
<td>–</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>( c )</td>
<td>1</td>
<td>–</td>
<td>1</td>
<td>–</td>
</tr>
<tr>
<td>( d )</td>
<td>0</td>
<td>1</td>
<td>–</td>
<td>1</td>
</tr>
</tbody>
</table>

**Figure 4.6. Modified encoding of the example machine**
set systems of the four state bit machines and one output machine are then

\[
\phi_{S \times I}^1 = \text{ind}_{S \times I}^I (\phi_i^{B_1}) \cdot \text{ind}_{S \times I}^S (\phi^B_1 \cdot \phi^S_1)
\]

\[
= \text{ind}_{S \times I}^I (\{0,1\}) \cdot \text{ind}_{S \times I}^S \left( \{\text{abd}; \text{bc}\} \cdot \{\text{abc}; \text{acd}\} \right)
\]

\[
= \{(a,0)(b,0)(c,0)(d,0); (a,1)(b,1)(c,1)(d,1)\}
\]

\[
\cdot \{(a,0)(a,1)(b,0)(b,1)(d,0)(d,1); (c,0)(c,1)\}
\]

\[
= \{(a,0)(b,0)(0,0); (c,0); (a,1)(b,1)(d,1); (c,1)\}
\]

\[
\phi_{S \times I}^2 = \{(a,0)(b,0)(d,0); (a,0)(c,0)(d,0); (a,1)(b,1)(d,1); (a,1)(c,1)(d,1)\}
\]

\[
\phi_{S \times I}^3 = \{(a,0)(b,0)(c,0)(d,0); (a,1)(b,1)(c,1)(d,1)\}
\]

\[
\phi_{S \times I}^4 = \{(a,0)(a,1)(b,0)(b,1)(d,0)(d,1); (b,0)(b,1)(c,0)(c,1)\}
\]

\[
\phi_{S \times I}^5 = \{(a,0)(b,0); (c,0); (d,0); (a,1)(b,1); (c,1); (d,1)\}
\]

It can be verified that the next state sets of the blocks of \(\phi_{S \times I}^1\) are \{c\}, \{a\}, \{a, b\} and \{d\}, respectively, which fit into blocks 2,1,1, and 2 of \(\phi^S_5\), respectively. \(\phi_{S \times I}^1\) and \(\phi^S_5\) are therefore a \(S \times I - S\) cover pair. Since there is no non-determinism present in the FSM, this is sufficient to verify the condition (i) of Theorem 4.2 for \(i = 1\). The verification for \(i = 2, 3, 4\) is left as an exercise. To verify condition (2), we have to show that \(L^{-1}(\phi_{S \times I}^m) \leq \phi_{S \times I}^{B_1}\). This is true, since

\[
\phi_{S \times I}^{B_1} = \{(a,1)(b,1)(c,0); (a,0)(b,0)(c,1)(d,0)(d,1)\}
\]

and \(L^{-1}\) is the identity function. Thus, we have verified that removing atomic dichotomy \(a/b\) from the set of encoding dichotomies, and therefore assigning overlapping codes to states \(a\) and \(b\), still preserves the output behavior realization of the original machine.

Further, dichotomy \(ad/c\) can be merged with \(b/c\) and \(a/d\) with \(b/d\), and thus the final encoding defined by two dichotomies \(ab/c\) and \(ab/d\) is constructed (Fig. 4.7).

Additional considerations

The method of combining information sets of the encoding dichotomies, or merging the dichotomies, offers a significant flexibility in the construction of the encoding. In fact, using this method, any valid non-redundant encoding of the FSM can be constructed.

<table>
<thead>
<tr>
<th>dich</th>
<th>(sb_1: ab/c)</th>
<th>(sb_2: ab/d)</th>
</tr>
</thead>
<tbody>
<tr>
<td>(\phi)</td>
<td>{abd, c}</td>
<td>{abc, cd}</td>
</tr>
<tr>
<td>a</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>b</td>
<td>0</td>
<td>0</td>
</tr>
<tr>
<td>c</td>
<td>1</td>
<td>-</td>
</tr>
<tr>
<td>d</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>

Figure 4.7. Final encoding of the example machine
However, the merging of information sets is not entirely arbitrary. There are two important aspects that have to be considered: compatibility and implied information (distinctions).

The compatibility consideration has to do with the fact that the merged information set has to correspond to a two-block set system to have a valid dichotomy representation and define a valid encoding bit. However, for some sets of information (distinctions) it is impossible to build a corresponding two-block set system. Consider, for example, the information set $\text{IS}_{\text{acd}} = \{a/c, a/d, c/d\}$. The corresponding set system with the minimum of blocks $\{ab; bc; bd\}$ has three blocks rather than two, and therefore it does not define a single valid binary encoding variable. In the case of code construction by pairwise merging of the encoding dichotomies this problem is alleviated by considering dichotomy compatibility. If two dichotomies are compatible (Def. 2.24) they can be merged to a single dichotomy and therefore their combined information set is guaranteed to have a corresponding two-block set system. If we consider a pair of dichotomies $a/cd$ and $c/d$, with corresponding information sets $\{a/c, a/d\}$ and $\{c/d\}$, the compatibility test reveals that both blocks of dichotomy $c/d$ intersect the right-block of dichotomy $a/cd$, and therefore the two dichotomies are incompatible and their combined information set $\{a/c, a/d, c/d\}$ does not have a corresponding two-block set system. If, on the other hand, the second dichotomy was $a/b$, the pair of dichotomies $a/cd$ and $a/b$ would be compatible and the combined information set $\{a/b, a/c, a/d\}$ would have a corresponding two-block set system $\{a/cd\}$. The compatibility test fails however for the sets of dichotomies larger than two. For example, the three atomic dichotomies defined by the distinctions in $\text{IS}_{\text{acd}}$ cannot be merged together, even though they are pairwise-compatible. The problem of compatibility of an arbitrary set of dichotomies is in general equivalent to graph coloring, and therefore it is NP-hard.

The second consideration when merging dichotomies and their information sets is the implied information (implied distinctions). The implied information consists of all elementary information items in the information set of a dichotomy that were not present in the information sets of the dichotomies that were merged to obtain this dichotomy. Consider, for example, two dichotomies $a/b$ and $c/d$. When merged, they result in dichotomy $ac/bd$ (or $ad/bc$), which has information set $\{a/b, a/d, b/c, c/d\}$. Since the only elementary distinctions present in the original information sets were $a/b$ and $c/d$, the remaining two distinctions $a/d$ and $b/c$ are implied. The implied information is the result of the requirement that the information set of a dichotomy has to have a corresponding two-block set system, the same requirement that introduced the compatibility problem. In the above example, naturally it is possible to just merge the information sets of the dichotomies, but the set system corresponding to the resulting information set $\{a/b, c/d\}$ is $\{ac; ad; bc; bd\}$ – a four-block set system, which does not have a dichotomy representation and does not define a single valid binary encoding bit.

As a result of the implied information, the encoding bits defined by the merged dichotomies may deliver elementary distinctions that they were not intended to deliver. In some cases this may be beneficial to the encoding process. In particular, this means that there is more information compressed on a single binary state variable, which may minimize both interconnections and input supports of the functions using this variable. This is the case when the implied information is relevant, i.e. it can be used by the functions using this variable. However, if the implied information is redundant, it may actually complicate implementation of the functions using this variable. Then, the implementa-
tion networks of the functions have to “sort out” the redundant information, which may result in more complex network structures. Also, the fact that a state variable delivers some extra information may have influence on the information that is necessary to compute this variable. In particular, the fact that a variable delivers more information may mean that more information is needed to compute this variable. In this way the implementation of the function that computes this variable may be more complicated and less efficient.

Summary

In this section, we have outlined a general method that constructs a state assignment of an FSM. The method uses dichotomies to represent the encoding. It starts with the dichotomies representing the encoding in which each state variable delivers a single elementary distinction between the states. In the code construction process, the encoding dichotomies are merged together to compress the state information delivered by the merged variables on a single variable. To ascertain validity of the constructed encoding, the method uses correspondence between dichotomies and two-block set systems. This correspondence allows to check the validity of the encoding defined by the dichotomies (and their corresponding set systems) against the conditions of Theorem 4.2.

The fact that Theorem 4.2 covers all valid state assignments, including state assignments reducing (for output-only behavior realization) and expanding (for multi-state realizations) the state space of machine, allows the method to explore the solution space unavailable to most other state assignment methods. Starting from the atomic encoding and using dichotomy merging to modify it makes it possible to reach any point of this solution space and provides fine-grained control over the form of constructed encoding. As a result, the method is extremely flexible and provides any control algorithm guiding it towards high-quality solution with a great deal of information about the structure of the FSM and information flows within it. In this thesis, we will show how the flexibility of the method and the information provided by it can be utilized as the basis for effective and efficient target-specific state assignment method.

4.2 Combinational synthesis

In the previous sections we described how the General Decomposition Theorem defines conditions that must be satisfied by a valid state assignment of a finite state machine. In this section, we will briefly discuss another special case of general decomposition and its application to combinational logic synthesis. The method, authored by Artur Chojnacki and Lech Jóźwiak and implemented in the software tool IRMA2FPGA is described in detail in [41].

A combinational function can be interpreted as a special case of finite state machine with a single state and trivial state transition function. In this interpretation, the General Decomposition Theorem can be used to describe a decomposition of the function into a network of partial combinational functions. The synthesis method is based on the recursive application of the serial decomposition scheme, in which at each step the considered function is decomposed into two sub-functions (partial combinational machines): predecessor sub-function $g$ and successor sub-function $h$ (see Fig. 4.8). The selection of function $g$ is driven by the choice of the input and output set systems of the function. At the
same time, the objectives and constraints of the target implementation architecture are taken into account and $g$ functions are built that fulfill these constraints and optimize the objectives. Therefore, the decomposed network is directly mappable, as it is composed of valid and optimized technology primitives. This approach of combination of global information processing optimization with platform-specific local optimization decisions gives superior synthesis results, as reported in [41].

In the combinational network built by the method no state information is present. Therefore, the set systems describing input and output information of the partial machines (functions) are reduced from the set $S \times I$ to $I$, i.e. only information about the primary input symbols is present and processed by the network. Therefore, the input information available to a partial machine is composed of the primary input information $\phi_1$ augmented by the information imported from other machines ($\phi_{S \times I}^I$), which in a combinational network become $\phi_1^I$. Since the network is built bottom-up (from primary inputs towards the outputs) and no loops are allowed in the circuit, the signals available as inputs to a particular $g$ function being built are the primary inputs and the outputs of the blocks built earlier (at the lower levels of the circuit). All these signals can be treated uniformly as delivering information about inputs described by set systems, and thus the input set system of the function $\phi_1^{in}$ can be determined as the product of the set systems associated with each of the variables in the bound set. The output of a partial machine is described in General Decomposition Theorem by the output set system $\phi_1^{out}$. Again, due to lack of state information, the symbol set of this set system is reduced from $S$ to $I$, resulting in the output set system $\phi_1^{out}$. The structure of an example combinational decomposition is shown in Fig. 4.9.

Note that a truth-table (or PLA) description of a Boolean function specifies output values of the function for different input combinations. Each of such (not necessarily disjoint) input combinations (referred to as term or cube) can be treated as a separate symbol from the input alphabet $I$ of the function and, therefore, the above mentioned set systems used to analyze the information flows in the function can be defined on the set of terms of the function. This approach is illustrated by the example of functional decomposition of a Boolean function presented in Section 2.4.1.

In [41], the conditions for existence of serial decomposition are formulated as a special case of the General Decomposition Theorem. Let $f$ be a $k$-input, $n$-output incompletely specified Boolean function. Let $X = \{x_i \mid i = 1 \ldots k\}$ be the set of binary input variables

![Figure 4.8. Serial decomposition](image-url)
of \( f \) and \( \phi_x \), \((i = 1 \ldots k)\) be the set systems induced by the particular input variables \( x_i \) on the set of the terms (cubes) of function \( f \). Let \( Y = \{ y_i \mid i = 1 \ldots n \} \) be the set of binary output variables of \( f \) and \( \phi_y \), \((i = 1 \ldots n)\) be the set systems induced by the particular output variables \( y_i \) on the set of the terms of \( f \). Let \( \phi_Y = \bigotimes_{i=1}^n \phi_{y_i} \) be the product set system induced by all binary output variables on the set of the terms of \( f \).

Let \( U \) and \( V \) be the subsets \( X \), referred to as bound set and free set, respectively, and the set systems \( \phi_U = \bigotimes_{x_i \in U} \phi_{x_i} \) and \( \phi_V = \bigotimes_{x_i \in V} \phi_{x_i} \) be the set systems on the terms of \( f \) induced by the input variables in \( U \) and \( V \), respectively.

**Theorem 4.3 (Existence of serial decomposition [41]).** If there exists a set system \( \phi_g \) on the set of terms of \( f \), such that \( \phi_U \leq \phi_g \) and \( \phi_g \cdot \phi_V \leq \phi_Y \), then the function \( f \) has a serial functional decomposition with respect to \((U, V)\) in the form \( f = h(g(U), V) \).

This theorem is recursively applied to the function \( f \) and then sub-functions \( g \) and \( h \) until the entire network is composed of the functions directly implementable in the target architecture.

The construction of function \( g \) can be completed in the following three steps: selection of an adequate input support of \( g \) (bound-set selection); construction of multi-valued function \( g \) with this support (construction of the output set system \( \phi_{\text{out}}^g \), or \( \phi_g \) in the Theorem 4.3); and implementation of the multi-valued function as a set of binary function(s) (encoding). Based on information relationship analysis, a limited set of the promising bound sets \( U \) is selected. For each bound set, the input set system \( \phi_{\text{in}}^g \) is calculated as a product of the set systems associated with the variables in the bound set. Then, the output set system is built. The condition (2) of General Decomposition Theorem requires that the output set system of a partial machine is larger than the input set system (i.e. provides less information). Therefore, any legal output set system can be constructed by

![Figure 4.9. Combinational decomposition](image-url)
merging the blocks of the input set system. In this method, the merging is performed in such a way as to minimize the number of blocks in the output set system (i.e. the number of output values of the multi-valued function $g$) without losing any unique distinctions (the distinctions only present on the inputs to the $g$ block). The selection of the best bound set favors the bound sets, which result in the output set systems with the smallest number of blocks, as this number determines the number of binary outputs of the block ($\lceil \log_2(|\phi_{out}^i|) \rceil$), and therefore the size of implementation and the convergence of the block. In addition to convergence criterion, a number of information structuring criteria is used to break the ties.

The choice of the input support and the output set system defines a multi-valued function $g$ with binary inputs. To implement this function in hardware, it has to be transformed into a set of binary function, i.e. the output values of the function have to be assigned with binary codes. As in the case of state assignment, the problem of encoding can be expressed as the problem of finding a set of two-block set systems that in product give the multi-block set system to encode ($\phi_{out}^i$). In this process the sets of two-block set systems are favored that repeat the least available (“unique”) information from the bound set on as many binary output variables as possible. In this way, the further decomposition of function $h$ is facilitated by eliminating the constraints associated with unique information transmission.

4.3 Conclusions

In this chapter, we have discussed some special cases of the General Decomposition Theorem that were used in the research reported in this thesis. In particular, using the conditions of the theorem, we derived conditions for a valid state assignment of FSM and based on that proposed a general, target-independent method of building an encoding. The method starts with the encoding defined by the elementary state information items (atomic dichotomies) and constructs a more compact encoding by merging compatible information items (dichotomies). The conditions formulated in this chapter for a valid encoding are more general than those used in other state assignment methods and therefore they allow to consider a larger number of valid encodings. In particular, the method allows considering compatible machines with reduced and expanded state sets (state minimization and multi-state realization, respectively). This potentially enables identification of efficient assignments unattainable for other methods.

The general encoding method provides mechanics for construction of a valid encoding. To ascertain that the valid encoding also results in an efficient implementation of the encoded machine on the target platform, platform-specific heuristics have to be used on top of this mechanics to guide the encoding construction process. In the next chapter, we will discuss such heuristics leading to the encodings resulting in efficient implementation in FPGA. We will present a complete FPGA-targeted state assignment method that is the result of implementing the heuristics in the framework of the general encoding method.

In this chapter we also showed that the functional decomposition of Boolean functions (combinational machines) is a special case of the general decomposition of sequential machines and outlined a combinational circuit synthesis method based on the functional decomposition. The combinational synthesis method is compatible with the
FPGA-targeting state assignment method we will propose in the next chapter, as they are both based on the same information-driven circuit synthesis paradigm and on the general decomposition theory. This makes this method particularly suitable for combinational synthesis of the Boolean functions resulting from the state assignment of FSM performed with the proposed state assignment method. Therefore, in our experiments with state assignment we will use the software tool implementing the combinational synthesis method, IRMA2FPGA, to implement the encoded FSMs.
4. GENERAL DECOMPOSITION IN CIRCUIT SYNTHESIS
In Chapter 4 we outlined the general, target-independent mechanics of a state assignment procedure based on a bitwise code construction using atomic state information and dichotomies. We pointed out that this procedure does not include any target-specific criteria to arrive at an encoding resulting in an efficient FSM implementation. It only assures that the resulting encoding is a valid one. The target-specific criteria need to be applied by particular encoding methods using this general scheme.

In this chapter, we will describe the criteria and heuristics guiding the general code construction procedure to achieve encoding resulting in an efficient implementation of the FSM in look-up-table-based FPGA. We will show how these heuristics are implemented in a state assignment software tool – SECODE. We will present the outline and an example of the operations of the tool and then focus on some specific data structures and algorithms crucial to the effectiveness and efficiency of the tool. We will also propose a dedicated analysis procedure that will allow us to address a specific encoding strategy – one-hot encoding.

5.1 Introduction

The choice of particular encoding ultimately determines the number of flip-flops and the binary functions (combinational components of the partial machines), and hence the form and cost of the circuit’s implementation. Therefore, the primary concern of any state assignment method is the creation of the encoding that will minimize the implementation cost (in terms of area, delay, power dissipation, etc.). To achieve this goal, state assignment needs to consider three important factors of the implementation process: the target implementation platform, the optimization objectives, and the combinational synthesis method that follows the encoding.

Clearly, the target implementation platform has decisive influence on the cost of the implementation. Depending on the platform different features of binary functions determine their implementation cost. For instance, for PLA structures the circuit area depends on the number and size of the product terms in the minimized sum-of-products
description of the binary function. PLA-targeting state assignment aims therefore on the encoding, which induces binary next-state and output functions described by small sum-of-products expressions. On the other hand, in LUT-based FPGAs the number or the size of terms do not play any role — a building block of this architecture – LUT or CLB – is capable of realizing any function with a fixed number of inputs (typically, 4 to 6). However, a totally different set of objectives and cost considerations may be applicable if the optimization objective changes, and in place of area, e.g. the delay or power is to be optimized.

In addition to platform characteristics, state assignment has to account for the method used to implement the combinational component on the target architecture. With combinational synthesis methods applying elaborate heuristics to find a near-optimal realization, state assignment cannot easily predict, how a particular binary function will be realized, and therefore its cost. Traditionally, state assignment methods deal with this uncertainty by taking into account main heuristics of a particular combinational synthesis method and searching for an encoding that results in binary functions easy to implement for the combinational synthesis method. JEDI is an example of this approach. It assumes that the combinational synthesis is based on extraction of common cubes in sum-of-products expression, and therefore creates encoding that maximizes potential for common cube extraction. Therefore, it is particularly suitable for use with division-based logic synthesis systems, such as SIS [71].

In our experience, however, SIS handles FPGA synthesis inadequately. Therefore, we are targeting logic synthesis with IRMA2FPGA — the logic synthesis method based on functional decomposition and outlined in Section 4.2. This tool significantly outperforms state-of-the-art academic and commercial tools, synthesizing circuits on average about 2 times smaller and 50% faster, and reaching up to 5 times area and 3 times delay reduction for some specific function classes (such as symmetric functions and functions with many don’t-cares) [41].

5.2 State assignment heuristics

5.2.1 Heuristic outline

In the previous chapter, we explained that to find a valid state encoding for a symbolic finite state machine it is sufficient to find a set of \( n \) (for \( n \)-bit encoding) 2-block set systems (or dichotomies), and we formulated conditions on these set systems in Theorem 4.2. When building a set system describing an encoding bit we have to have in mind two important considerations: the production and consumption of the information associated with this bit. We recall from Section 4.1.1 that each state bit is realized by a partial two-state sequential machine. Each such state-bit-machine is implemented as a flip-flop driven by the output of a combinational circuit implementing certain Boolean function of some primary inputs and state bits. From the viewpoint of the production of the information for the particular encoding bit it is therefore vital that the related combinational logic can be efficiently implemented in the target architecture. On the other hand, from the viewpoint of the consumption of the information from a particular encoding bit, the bit is used as an input to some other partial machines. Therefore, the information delivered by this bit has influence on the implementation of other partial machines.

As mentioned earlier, the character of this influence heavily depends on the target
implementation platform. In the case of LUT-FPGAs there are two important factors in the implementation of a circuit: size of the input supports of the functions and the number and length of interconnections. Since LUT is capable of realizing any function of no more than a certain maximal number of variables, the size of the input support of a function becomes a much better indicator of the realization complexity than number of terms or literals. Additionally, the interconnections (especially long), which are relatively scarce and slow on the FPGA platform, are often the determining factor in the circuit's delay and area. Thus, to facilitate efficient implementation of a circuit, (long) interconnections and input supports of particular functions should be minimized.

Therefore, the heuristic of optimized information flows for the selection of encoding set systems was proposed. In this context, optimization of information flows means distribution of state information between the state variables aiming at optimization of production and consumption of this information. From the viewpoint of information production, it is advantageous to group together the elementary information items (distinctions) which require similar information for their production. Then, the state bit being the composition of these elementary information items will require relatively little and relatively similar input information. This results in a compact input support and little interconnections required by partial machine realizing this state bit. From the point of view of information consumption, the information (distinctions) used together, especially by many functions, should be delivered together on the same state bit. Then, the functions using this information require fewer input support bits and fewer connections.

To properly analyze production and consumption of particular atoms of state information, we begin with FSM encoded with the atomic encoding and calculate input supports of primary and state output functions in this encoding. The input supports of the functions supply us with detailed information about how a particular atomic information is produced (which primary inputs and atomic state variable are in the support of the output function corresponding to the atomic state bit associated with this information) and how it is used (which primary and state output functions use the input bit corresponding to the atomic state variable associated with this information). This analysis is a basis for the creation of clusters of atomic dichotomies that should be conveyed together by the same state variable for the benefit of state information production, distribution and consumption. Once the clusters of atomic dichotomies are identified, their relative importance or quality is estimated and the actual encoding is constructed by considering clusters of dichotomies in descending order of their quality and merging the dichotomies in the considered cluster together on the same encoding dichotomy, or, if this turns out to be impossible, on a smallest possible set of dichotomies.

5.2.2 Information flows example

To illustrate the ideas behind information flow optimization with an example, let us consider again the FSM in Fig. 4.1(a). Fig. 5.1 shows the graph of information flows between the particular elementary state distinctions and primary inputs and outputs. As discussed above, the graph was determined by encoding the FSM with atomic encoding and computing input supports of all atomic encoding variables and primary output variables. Analysis of the graph reveals the structure of the information flows. We see very strongly connected subgraphs of the graph marked A and B. Within and between these subgraphs there is a lot of state information exchanged. Moreover, both subgraphs use
the same primary input \( p_{i1} \) and are used together to compute the primary output \( p_{o1} \). From this we conclude that if the atomic dichotomies in subgraphs A and B are merged together, the resulting code bit(s) will be very independent (they will only depend on one primary input and themselves) and they will provide compressed information to compute the primary output \( p_{o1} \). Clearly, this makes the dichotomies very good candidates for merging.

The fact that the coherent system of six dichotomies was split into two subgraphs A and B is the consequence of another information flow consideration. We can see from the graph that the three dichotomies in the subgraph A deliver state information to the other subgraph of the information flow graph — C. The subgraph C forms another coherent and relatively independent component of the graph. The dichotomies in the subgraph C depend on the same primary input \( p_{i2} \) and on each other. They are also all used to compute the primary output \( p_{o2} \). Therefore, they are good candidates to form a separate subsystem. As a consequence, all external information delivered the potential subsystem (i.e. information produced in subgraph A) should be compressed to optimize its usage and minimize interconnections. Another motivation to consider subgraphs A and B

![Figure 5.1. Information flows in the example FSM](image-url)
separately is the fact that dichotomies in B are used to compute both primary outputs, while the dichotomies in A are only used by output $po_1$.

Finally, the isolated dichotomy $b/e$, which is not used by any other node in the graph, indicates that this particular state distinction is not necessary to compute any other state or output information, and thus suggests possible state space minimization.

Given the above analysis, we can reason about the most beneficial merging of the atomic dichotomies. The prime candidate for merging is the subgraph A. There are two possible ways of merging the dichotomies in A on one dichotomy – $ac=bde$ or $ad=bce$.

We will evaluate these two options with the information flows in mind. For that purpose, let us analyze the complete information delivered by the encoding dichotomies, i.e. the implied dichotomies (see Section 4.1.2) as well as the merged atomic dichotomies. For the encoding dichotomy $ac=bde$, the information set is $\{a|b, a|d, a|e, b|c, c|d, c|e\}$, while for $ad=bce$ it is $\{a|b, a|c, a|e, b|d, c|d, d|e\}$. In the first case, we can see that the implied dichotomies ($a|d, b|c, c|e$) are not closely related (as far as information flows are concerned) to the dichotomies in the subgraph A. They use different input information and are used by different outputs. In the case of the second merging option, on the other hand, the implied dichotomies ($a|c, b|d, d|e$) are all contained in the subgraph B that is very closely related to A. Clearly, this is the preferred merging option.

When merging the dichotomies in subgraph C, we are also confronted with two merging choices: $abe=cd$ or $ac=bde$. In this case both options introduce implied distinctions that are not closely related to the distinctions in C. In the first case, these are the distinctions in subgraph B, in the second case – the distinctions in subgraph A. At this point, we may consider the consumption of the resulting information by primary outputs. We may observe that the primary output $po_2$ that uses information produced in C also uses information from B. This indicates that if C is merged in such way that it implies B, the resulting state bit will deliver complete state information consumed by the primary output $po_2$.

### 5.2.3 Additional issues

In the process of identification and merging of dichotomy groups, we have to consider additional issues discussed in Section 4.1.2 – the compatibility problem and the implied dichotomies. The problem of implied dichotomies was already discussed and illustrated in the above example. The compatibility issue, on the other hand, means that we cannot assume that any set of distinctions (atomic dichotomies) can be grouped (merged) together and represented by a valid encoding dichotomy with two blocks. To circumvent this issue, set systems with more than two blocks can be initially allowed to define the sequential partial machines. In this case, the encoding is not a binary encoding – an encoding variable has as many values as there are blocks in the encoding set system. Returning to the analogy between encoding and decomposition, the partial state-bit-machines have more than two states (they have as many states as there are blocks in the set system) and the encoding becomes in this way a specific decomposition, which can be transformed to a valid encoding structure by encoding the partial non-binary sequential machines. This concession gives additional flexibility to the process of collecting related distinctions and therefore allows determination of a more natural, non-constrained structure of the finite state machine. The resulting larger partial machines form then “subsystems” processing and producing related information. This can be beneficial from the point of view of
minimizing long interconnections (much of the information is processed locally by the partial machines) as well as the potential for common sub-functions. The latter stems from the fact that such subsystems have several binary outputs requiring similar information, which provides room for common sub-functions.

Since the ultimate target remains the binary encoding, each of the non-binary partial machines has to be further encoded. However, if the initial decomposition does not depart too far from the target binary decomposition, i.e. the sequential partial machines are not too large, the encoding problem for the partial machines is much smaller and easier than the original encoding problem. This problem can be again formulated as the distribution of the distinctions present in the multi-block set system defining a particular partial machine among several two-block set systems. In this case, the distribution can be determined by the local analysis of the relations of the information (e.g. some distinctions may be “more related” than others).

5.3 The method

5.3.1 Outline

The goal of the information-driven state encoding that we propose is to assign binary codes to the symbolic states of a machine in such a manner that the resulting network of the binary output- and next-state functions is composed of small, coherent, and relatively independent parts, with particular binary next-state or output functions depending on a small number of inputs. To achieve this goal, the state information is assigned to particular binary state variables in such a way that the information using particular inputs, as well as the information used by particular outputs is grouped on as few as possible state variables, thus reducing the cost of information production and distribution.

The binary state variables are represented using dichotomies. The state information delivered by a particular binary variable is given by the atomic dichotomies covered by the encoding dichotomy representing this variable. We construct the adequate encoding dichotomies for particular state variables delivering a certain desired information by merging together the atomic dichotomies corresponding to this information. The choice of the atoms of information to be merged together is guided by their affinity in relation to a combination of their common source (common inputs used to produce particular atoms of information) and common destination (common outputs using the atoms).

The state encoding process is divided in two phases (see Fig. 5.2):

- analysis phase – computation of the affinities for the atomic state information
- code construction phase – using the affinity information for code construction

Analysis phase

The analysis phase is composed of four steps:

1. Initial atomic encoding of the FSM.

2. Input support computation of the next-state and output variables of the initially encoded FSM.
5.3. THE METHOD

Analysis phase

state variable clustering
ordering
codeweight calculation
input combination and sub-function clusters
input clusters
output clusters

initial encoding
support computation
input weight calculation

.KISS file

cluster combination and ordering

code construction

Figure 5.2. Outline of the state assignment method
3. Input weight calculation.

4. Clustering of the state variables from the initial encoding based on the results of steps 2 and 3.

The clustering results are then used to determine which state variables of the initial encoding will be merged together in the code construction phase.

The analysis of the source and destination of a particular atomic information is performed by encoding the machine with the atomic encoding, i.e. the encoding in which each state variable is defined by an atomic dichotomy, thus delivering atomic state information (i.e. a single distinction between two particular states). Analysis of the input supports in such encoded machine supplies us with the detailed information about how a particular atomic information is produced (which primary inputs and atomic state variables are in its support) and how it is used (in which supports it appears). This analysis is augmented by considering not only which inputs are present in particular input supports, but also their relative importance. The importance of an input is expressed as its input weight. The weight reflects the influence of the input on the quality of the solution.

Consider, for example, the FSM in Fig. 5.3. The initial atomic encoding for the five states of the machine is given in Fig. 5.4. Encoding the machine with the atomic encoding results in the PLA description (truth table) of the combinational logic necessary to realize the next state and output behavior of the machine. The PLA is presented in Fig. 5.5. Given the binary next state and output function, we calculate minimum input supports for the functions, shown in Fig. 5.6. These supports are the basis for information flow analysis. The analysis informs us, for example, that to produce an atom of information distinguishing states \( a \) and \( b \) we require only the distinction \( d = e \). The distinction \( a/b \) is then used to calculate distinction \( c/d \) and the primary output \( o_1 \). For clarity, we will not consider input weights in this example. The issue of input weights is discussed in detail in Section 5.3.3.

Using the results of the input support analysis, we identify clusters of atomic dichotomies that need to be merged together to optimize information flow. For the sake of optimized information distribution and consumption, dichotomies that are used by the outputs that may form independent subsystems should be conveyed together. Then,
5.3. THE METHOD

the subsystem formed by the related outputs is fed with compressed, relevant information. To identify such related outputs, we analyze information consumed by the outputs and cluster together outputs with high similarity of consumed information, or with high potential for extracting common subfunctions.

In the example we are considering, we observe that the outputs $o_2, a/b, a/c, a/d$ and $b/e$ use similar information. They have therefore the potential to form a separate, coherent sub-network. From the viewpoint of the state information transmission and consumption it would be therefore advantageous to combine the information consumed by these outputs and deliver it in as compressed form as possible. The information used by these outputs is $a/d, a/e, b/d, b/e, c/d, c/e$ and $d/e$. These atomic distinction form therefore a dichotomy cluster that is a candidate for merging together.

On the other hand, from the viewpoint of the state information production, we would like the dichotomies (state information) produced using the same information to be

<table>
<thead>
<tr>
<th>$i$</th>
<th>$a$</th>
<th>$b$</th>
<th>$c$</th>
<th>$d$</th>
<th>$e$</th>
<th>$o_1$</th>
<th>$o_2$</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
<tr>
<td>0</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>

Figure 5.5. PLA of the FSM with atomic encoding

\[
\begin{array}{c|cccc}
\text{acd/be} & a & b & c & d & e \\
\hline
a & 0 & 0 & 0 & 0 & 0 \\
b & 1 & 0 & 0 & 0 & 0 \\
c & 0 & 0 & 0 & 0 & 0 \\
d & 0 & 1 & 0 & 0 & 0 \\
e & 1 & 1 & 0 & 0 & 0 \\
\end{array}
\]

Figure 5.6. Initial input supports

\[
\begin{array}{c|cccc}
\text{abc/de} & a & b & c & d & e \\
\hline
a & 0 & 0 & 0 & 0 & 0 \\
b & 1 & 0 & 0 & 0 & 0 \\
c & 0 & 0 & 0 & 0 & 0 \\
d & 0 & 1 & 0 & 0 & 0 \\
e & 1 & 1 & 0 & 0 & 0 \\
\end{array}
\]

Figure 5.7. Final encoding

\[
\begin{array}{c|cccc}
\text{d/e} & a & b & c & d & e \\
\hline
a & 0 & 0 & 0 & 0 & 0 \\
b & 1 & 0 & 0 & 0 & 0 \\
c & 0 & 0 & 0 & 0 & 0 \\
d & 0 & 1 & 0 & 0 & 0 \\
e & 1 & 1 & 0 & 0 & 0 \\
\end{array}
\]

Figure 5.6. Final encoding
merged, so that the distribution and usage of the input information is optimized. Therefore, we construct clusters of state outputs using similar input information. The dichotomies represented by these outputs are candidates for merging. In our example, the obvious candidates for merging are dichotomies \(a/e, b/c, b/d, c/e\) and \(d/e\), which all use the same, single primary input. If they are merged together, the resulting code bit can potentially only depend on one input, thus making its implementation very efficient.

Once the clusters are determined, they are combined and ordered according to a cluster quality measure. The measure estimates the saved interconnections and circuitry if the dichotomies in the cluster are merged together. For instance, the dichotomy cluster \(\{a/e, b/c, b/d, c/e, d/e\}\) has a very high quality, as it results in a very small implementation of the code bit. This is the consequence of the fact that, when merged together, the dichotomies form an encoding bit that can be computed using only a single primary input.

**Code construction**

In the code construction phase, the atomic dichotomy clusters sorted in the descending order of their quality are merged in order to realize the gains predicted by the cluster quality. The encoding dichotomy(-ies) resulting from the merging of particular clusters define consecutive state variables of the final encoding. Since some atomic dichotomy clusters cannot be merged to one dichotomy due to compatibility constraints, the distribution of the clusters’ atomic dichotomies over the two or more encoding dichotomies has to be established. This process is driven by the atomic-dichotomy-affinity measure, which for a given pair of atomic dichotomies reflects how important it is for the two dichotomies to be merged together. This measure is calculated based on the number and quality of the atomic dichotomy clusters, in which the pair of dichotomies appears together.

The merging process stops when all the atomic dichotomies that appear in any input support are realized by some of the current encoding dichotomies. Note, that this is not necessarily equivalent to realizing all the atomic dichotomies, since some state distinctions may not be needed (as is the case in the not minimized sequential machines). This relaxation of the constraint to realize all the atomic dichotomies is a consequence of Theorem 4.2 and means that our method is capable of implicit state minimization of the encoded machine by assigning overlapping, incompletely specified codes to equivalent states.

For instance, the dichotomies in the cluster \(\{a/e, b/c, b/d, c/e, d/e\}\) can be merged together to a single encoding dichotomy \(a/cd/be\). However, the dichotomies in the second cluster \(\{a/d, a/e, b/d, b/e, c/d, c/e, d/e\}\) cannot be merged to one bit, due the fact that three dichotomies \(a/d\), \(a/e\) and \(d/e\) can never be realized on a single binary state variable. The maximum compression of this information is achieved when the atomic dichotomies are merged to two dichotomies: \(abc/de\) and \(d/e\). In this way, we have obtained three encoding dichotomies: \(abc/de\), \(d/e\) and \(acd/be\). Since the second dichotomy \(d/e\) is covered by the third, we may remove it. Finally, we have obtained two encoding dichotomies, which define the encoding showed in Fig. 5.7. We may observe that the states \(a\) and \(c\) were assigned with the same code. This is because the distinction \(a/c\) is not realized by any of the encoding dichotomies. However, if we analyze the input supports in Fig. 5.4, we can see that the distinction \(a/c\) is not used for calculation of any output,
so it needs not to be computed. In this way, an implicit state minimization is performed: states $a$ and $c$ are both represented by a single state $00$ in the final encoding.

The realization structure of the resulting binary machine is shown in Fig. 5.8. We can observe that the encoding introduced a serial decomposition of the realization structure, with two distinct submachines. First submachine, associated with the encoding bit defined by dichotomy $acd/be$, only uses primary input and local state information to produce the encoding bit and the first primary output. The second submachine, associated with encoding bit $abc/de$, imports state information from the first submachine and produces the second encoding bit and the second primary output.

In contrast, in Fig. 5.9, the realization structure for the same machine encoded with the natural binary encoding is shown. Not only does it require more memory elements, as it does not benefit from the implicit state minimization performed by our method, but also the supports of the functions are larger and the interconnection structure is much more complicated. This results in a much less efficient implementation.

![Figure 5.8. Realization structure of the encoded machine](image)

![Figure 5.9. Realization structure of the binary-encoded machine](image)

**Summary**

In this section, we have outlined the state assignment method based on information flow analysis and optimization. The method encodes the FSM with the initial encoding based on atomic dichotomies. It then computes the input supports of the binary functions in thus encoded machine to analyze the origin and destination of particular state information items. It also determines the relative importance of the information flowing within the network by assigning weights to signals carrying this information. Based on that analysis, the method determines clusters of related logic that may form coherent subsystems and the clusters of atomic dichotomies that need to be delivered together to optimize production and consumption of information in the subsystems. The atomic dichotomies in the clusters are merged together to compress the information represented by them on as few encoding variables as possible. In the following, we will describe in more detail the abovementioned steps.
5.3.2 Initial encoding

The input of the method is a Mealy-type finite state machine in KISS format. In this format the transitions of the machine are listed in the form

\[ \text{< input combination > < present state > < next state > < output >} \]

As described in Section 4.1.2, the machine is initially encoded with the atomic encoding. This encoding is defined by \( s(s - 1)/2 \) atomic dichotomies (where \( s \) is the number of states). An atomic dichotomy is constructed for each pair \( (a, b) \) of states by placing the first state in the left block of the dichotomy, and the second in the right block. An encoding bit defined by an atomic dichotomy \( (a/b) \) has 0 for state \( a \), 1 for state \( b \), and don’t-care for the rest of states. Thus, a symbolic FSM is transformed to a table of the binary output and next state functions by replacing the state symbols (names) in the state transition table with binary vectors (codes of the states). The result is an incompletely specified Boolean function with \( i + s_b \) inputs (\( i \) is number of the FSM inputs, \( s_b \) is the number of state bits in initial encoding) and \( s_b + o \) outputs (\( o \) is number of outputs of the FSM).

5.3.3 Input supports calculation

To analyze production and consumption of the atomic information about states, the minimum input supports of the binary next state and primary output functions resulting from the atomic encoding are calculated.

Minimum input support selection is formulated as row covering problem. For each binary output function a set of distinction between the STT rows necessary to calculate this output is found. A distinction between rows \( r_i \) and \( r_j \) is necessary to calculate an output \( o \) if and only if the values of the output \( o \) are distinct (i.e. both are specified – not don’t-care – and different) for rows \( r_i \) and \( r_j \). For each row distinction a set of inputs capable of making this distinction is found. An input \( i \) is capable of making a distinction between rows \( r_i \) and \( r_j \) if and only if the values of the input \( i \) are distinct for rows \( r_i \) and \( r_j \). (Minimum) input support of output \( o \) is a (minimum cardinality) set of inputs such that each of the row distinctions necessary for the output function is realized by at least one of the inputs. The choice of the input set is made by constructing a cover matrix \( C \) with columns labeled by the inputs and rows labeled by the distinctions required for the output. Matrix \( C \) has 1 in position \( (i, j) \) if input \( j \) is capable of realizing distinction \( i \). Therefore the input support problem may be reformulated as finding (minimum) number of columns (inputs) such that at least one 1 is present in each of the rows (each distinction is realized by at least one input). In our state assignment tool, the task of column selection is realized by QuickScan [45] algorithm.

Inputs in each of the input supports may be divided in two groups: primary inputs and state inputs. The fact that each of the state bits is defined by an atomic dichotomy, and therefore may be interpreted as delivering an elementary information about states (single distinction between two states) allows us to interpret the state input support as the state information required for the “production” of the function. Therefore, the production of a particular state distinction is determined by the state distinctions present in the input support of the output state bit defined by the distinction. Its consumption is then determined by the input supports in which the input state bit defined by the distinction appears.
5.3. THE METHOD

This type of analysis provides us with very detailed information about the information flows and dependencies in the analyzed machine.

**Input weights** The information about production and consumption of the state information also allows us to make informed decision about the relative importance of the state distinctions and primary inputs. The importance is related to the influence of a particular input on the cost of the solution. We recall that our main stated objective is the reduction of the input supports of the particular binary functions and interconnections, as this has positive influence on the implementation cost of the circuit. Therefore, the importance of an input (or input weight) is related to the predicted input support saving associated with the input.

Intuitively, the weight reflects how important a particular input is from the viewpoint of the input support reduction, and hence how important it is to merge two dichotomies sharing this input. The weights depend on the frequency of occurrence of a particular input in the input supports. The rationale behind this approach is that the inputs that are used by most of the outputs will have to appear in all or almost all of the input supports anyway, so no significant gain can be realized. Also, the inputs which appear only in one or two supports offer no gain, since they will be used by one output only anyway the merging proceeds. For these inputs the potential gains are low, hence their low weights. On the other hand, high gains may be realized in the case of inputs that appear in about 30% of input supports. If the dichotomies that use them are merged together, those inputs may potentially be used by very few outputs. However, if the dichotomies that use the outputs are not merged together, the inputs may be used everywhere in the circuit. It is therefore very important to group outputs having common high-weight input bits in supports, because this is where the savings lay.

To determine the weights, we calculate for each input bit a number of input supports in which the input is expected to appear. This figure has two components: input supports of primary outputs (these are the certain part, and will not change), and input supports of state bits (which are still to be merged, so the final input supports are not certain). The first component is calculated by simple counting the primary outputs, which have the input bit in the support. For estimation of the second component, we take the percentage of the expected final number of the state bits that is proportional to the percentage of all atomic state bits using the particular input.

Once the expected number of the outputs using given input is calculated, the input weight is determined by the weight mapping function that for a given percentage of

![Figure 5.10. Weight mapping function](attachment:image.png)
output number returns the input weight. The function is built based on the reasoning outlined above: the inputs with very low or very high utilization are assigned low weights. The maximum weight is assigned to the inputs with about 30% occurrence rate. In our experience, good results are achieved for the weight mapping functions of the shape similar to the function in Fig. 5.10.

5.3.4 Dichotomy clustering

The main problem in the general encoding scheme presented in Section 4.1.2 is the selection of the groups of atomic dichotomies to be merged together on the encoding dichotomies. In our method this selection is performed by clustering the atomic dichotomies in groups based on several information-flow-based criteria. In this procedure we identify several types of clusters of output functions, each reflecting different aspect of possible information flow optimization:

- **output clusters (input approach)** The output logic is analyzed to identify groups of outputs with large common subsets of input supports. Such relationship can indicate significant similarity between the outputs and possible realization as a separate, coherent and relatively independent subsystem. During implementation, the circuitry for such group of outputs is likely to be placed together in the layout. To optimize information delivery to such prospective subsystem, the information consumed by the subsystem's functions should be compressed. Therefore, the atomic dichotomies in the supports of the clustered outputs are candidates to be merged together.

- **output clusters (sub-function approach)** The output logic is analyzed to identify potential groups of outputs with large common sub-functions. Such groups are very good candidates for subsystems within the circuit. Following the reasoning of compressed information for subsystems, the atomic dichotomies in the supports of such related outputs are good candidates to be merged together.

- **next-state-bit clusters (output approach)** To optimize information production, it is advisable to compute together state information that consumes similar information. Then, the resulting next state functions will have compact input supports and will be relatively independent of the other next state functions, thus reducing inter-connections. This is reflected by the dichotomy clusters that group together atomic dichotomies with similar input supports.

The clusters are created by two general clustering approaches: hierarchical and seed clustering. The hierarchical clustering creates non-disjoint clusters of elements by iterative merging of the current set of clusters. The seed clustering, on the other hand, initially identifies a number of “seeds” that initialize the set of clusters and then assigns the remaining elements to one of the clusters. Using different affinity functions and cluster selection criteria, the two general approaches are used to create the output and dichotomy clusters. For more detailed discussion of clustering techniques see Section 5.4.2.

The clustering procedure produces three different types of clusters, with different quality criteria and different impact on the implementation cost. Also, some of the clusters identified may not be useful, e.g. because they are too large or too small and thus impose too many or too few constraints on the encoding to be meaningful. To build
a framework for unified analysis of the clusters and their relative importance, a single cluster quality measure is introduced. It is related to the estimated size of saved circuitry and connections in the final circuit if the atomic dichotomies associated with the cluster are merged to one encoding dichotomy in comparison to their distribution over more encoding dichotomies. Based on their qualities, the different clusters are ordered.

In the following we will discuss the steps of the clustering and cluster ordering process: creation of the three types of clusters, ordering of the different types of clusters based on a common quality measure, transformation of the different clusters to dichotomy clusters, and the final ordering of the dichotomy clusters.

**Input-oriented clustering**

In the input-oriented clustering we identify groups of outputs that use similar state and input information and therefore have a chance of becoming coherent subsystems in the circuit. The similarity of the consumed information is estimated based on the input supports of the output functions. The clustering is performed in two steps. First, hierarchical clustering is applied to the set of singleton clusters. The affinity measure (TotalDifferenceAffinity) used in this step favors clustering outputs with similar supports (small difference). This phase, however, fails to identify relationships between outputs with large and small supports, even if small support is completely included in the large one. Therefore, the resulting clusters of the first phase are subjected to hierarchical clustering with inclusion-favoring measure (InclusionAffinity).

The reason for this two-phase approach is the tendency for the primary outputs to dominate the clustering process. We recall that at this point the state outputs are defined by atomic dichotomies. This means that they require relatively little information to be computed and therefore have small input supports, when compared to the primary outputs. Such small supports tend to be easily dominated by the large supports of the primary outputs. The result are large clusters built around particular primary outputs. To avoid this effect, in the first phase clusters of outputs with similar size of supports are favored.

**Phase I.** TotalDifferenceAffinity calculates similarity of two clusters as a difference between the weight of the inputs in the common input support and the weight of the remaining inputs in the supports. This value is given by $2 \cdot \text{common\_weight} - \text{summary\_weight}$.

Both common and summary weight are calculated as a sum of two components: weight of primary inputs in the common or summary support and estimated weight of state inputs in the common or summary support. The state input weight is estimated by multiplying average weight of state input in the state supports by the expected number of state bits, on which the state bits will be merged. The expected number of merged bits is directly proportional to the number of atomic dichotomies in the supports. In this phase criteria allow clustering of the clusters with positive affinity, i.e. only outputs that have more common inputs bits that different are clustered together.

**Phase II.** InclusionAffinity detects inclusion of small supports in larger ones. It reflects the percentage of the small support that is included in the large support. The
value of the affinity is calculated as

\[
1 - \frac{\text{smaller\_tail\_size}}{\text{smaller\_IS\_size}}
\]

Where: smaller\_IS\_size is the number of inputs in the smaller of the two supports.

smaller\_tail\_size is the number of inputs present in smaller of the input supports and not present in the larger one

Since in this step we are interested in clustering outputs with fully or almost fully covered input supports, the affinity criteria allow for clustering of outputs with affinity above 0.8.

Sub-function clustering

This procedure identifies clusters of outputs with potentially large common subfunctions. The clusters are created in hierarchical fashion (see 5.4.2), with the initial cluster set consisting of singleton clusters. The affinity measure guiding the clustering algorithm is based on the number of common row distinctions between the output functions. Since the combinational synthesis method implemented in IRMA2FPGA is based on analysis of information expressed as term (row) distinctions, common row distinctions are consistent with the notion of common information central to the synthesis. Therefore, it is likely that the combinational synthesis detects and extracts common logic from two functions with high affinity measure.

The affinity measure is calculated as the ratio of common distinctions to the sum of distinctions, i.e. for two outputs \( o_i \) and \( o_j \), the affinity measure is

\[
\text{aff}(o_i, o_j) = \frac{|IS(\phi_{S\times I}(o_i)) \cap IS(\phi_{S\times I}(o_j))|}{|IS(\phi_{S\times I}(o_i)) \cup IS(\phi_{S\times I}(o_j))|}
\]

As described in Section 5.4.2, the hierarchical clustering procedure uses affinity criteria to determine which affinities between the clusters are high enough to merge the clusters together. In the subfunction-oriented clustering, the affinity criteria allow for merging of the clusters with the minimum amount of 70% of common distinctions.

Next-state-bit clustering

In the next-state-bit clustering step the next state outputs consuming similar information are clustered. The functions that compute these outputs have the potential for being compact, independent subsystems, and therefore the dichotomies that define them will be merged together. When clustering the state bits we are interested in a partition of dichotomies into non-overlapping blocks, and, preferably, each block corresponding to a state bit. For this reason, seed clustering (see Section 5.4.2) is performed rather than the hierarchical clustering.

The seed clustering starts by selecting a number of “seed” state bits around which the clusters will form. Since the target number of clusters is close to the number of expected encoding bits, the number of initial seeds is equal to the minimum encoding length of
5.3. THE METHOD

The remaining state bits are then assigned to one of the existing clusters or, if they do not fit into any cluster, a new cluster is created.

This process is guided by the affinity measure reflecting similarity of input supports. This measure is the weighted average of the affinities of the primary and state input supports. The weight is related to the ratio of primary to state inputs in the encoded machine. The support similarity is evaluated as the ratio of common inputs in the two supports to the sum of all inputs in the two supports. Thus, the affinity measure is

\[
\left( p \cdot \frac{\text{common\_primary\_weight}}{\text{summary\_primary\_weight}} + (1 - p) \cdot \frac{\text{common\_state\_weight}}{\text{summary\_state\_weight}} \right) \cdot (1 - \text{size\_penalty})
\]

where \( p \) is the percent of primary inputs in the final number of inputs given by

\[
\frac{\text{primary\_inputs}}{\text{primary\_inputs} + \left\lceil \log_2 \#\text{states} \right\rceil}
\]

The additional factor \( \text{size\_penalty} \) can be introduced to prevent too large clusters from forming. The penalty is related to the size of the cluster size. In our experiments we used linear penalty ranging from 0 to 1 for clusters of the sizes between \( m \) and \( M \), where \( M \) is the number of all dichotomies and \( m \) is the minimum penalized size, e.g. 50% above the cluster size that would result from the balanced partition to equally sized clusters.

Output cluster ordering

The output groups identified by the clustering procedures described in the previous sections represent three different approaches to the optimization of the encoded machine. To reconcile the three views and compare their relative influence on the cost of implementation, a combined quality measure is necessary. The measure reflects savings in the realization cost if a given cluster is taken into account in the code creation. Of course, since each of the cluster types was created with different optimizations in mind, the quality estimation procedures will have to consider different saving sorts for different cluster types.

The procedures described below analyze possible savings in the realization logic if the dichotomies associated with a particular cluster are merged together. These savings involve both primary output and next state logic. To estimate the savings, the procedures analyze reductions of the input supports. These reductions involve primary input bits and present state bits. However, at this stage of the analysis, the state bits are defined by atomic dichotomies. This means that their number is much larger than it is going to be in the final encoding, but also that the state bits will change significantly, so any savings associated with them have to be considered as uncertain. Therefore, the identified savings have two distinct components: the primary savings – associated with primary inputs and outputs – and the state savings – associated with the state bits.

Output clusters The output clusters (both subfunction-based and input-based) are created to optimize information delivery to possible subsystems in the realization structure. The aim is to merge the dichotomies used by the clustered outputs to make the output's
supports compact and delivering compressed, relevant information. Therefore, the quality measure for output clusters reflects savings in the input supports of the clustered outputs if the used dichotomies are merged together.

These savings are evaluated as a difference between worst-case and best-case sizes of state bit support for the outputs in the cluster. The worst-case is when all the dichotomies in the support in the final realization are on separate state bits (limited by the expected number of state bits). Best-case is determined by the coloring of (no preference) twin graph of dichotomies. This procedure returns (close to) minimum number of dichotomies on which the given dichotomies may be merged (for details see 5.4.1).

**Next-state-bit clusters** We recall that the next-state-bit clusters are created to group together the state bits that share common input information and therefore make good candidates to be realized together. The savings associated with merging of such a group of state bits are twofold. On the one hand, the shared information is used by a single state bit (or a small number of bits) created by the merging of the dichotomies, thus making its distribution easier and promoting decomposition of the circuit into independent subsystems. On the other hand, we have to consider the influence of the created state bit on the supports of other functions, as it will be used by the functions instead of the individual dichotomies in the cluster.

Therefore, the quality measure for the state bit clusters has two components: input gain and output gain. Input gain expresses gain in terms of number of inputs saved if the state bits in the cluster are merged, and therefore calculated as one state bit, instead of being realized separately, on different state bits. Output gain expresses gain in terms of the number of connections saved if the state bits are sent to the functions that use them on one bit, instead of being sent on many bits. The above values are estimated as follows:

- **input gain**

  is calculated as a difference between estimated worst-case and best-case scenario. The worst-case input support for the dichotomies in the cluster is calculated by randomly placing the dichotomies on all (predicted) state bits, and assuming the input support of the cluster to be the sum of the input supports of thus created state bits. A state bit support is the sum of the supports of component dichotomies. Best-case is placing the dichotomies on as few bits as possible (no preference twin-graph coloring) and calculating the resulting input support as above.

- **output gain** For all outputs including in their supports some of the dichotomies in the cluster, the gain is estimated as the difference between the case, when all the used dichotomies are on maximum versus minimum possible number of bits.

**Cluster transformation**

At this point, the three types of output clusters exist and have cluster qualities associated with them. We recall, however, that for different clusters different sets of dichotomies need to be merged to realize the savings associated with the cluster. In this step, the groups of dichotomies associated with each cluster are created and organized as dichotomy clusters.
The transformation of clusters resulting from sub-function clustering and input-oriented clustering is performed in the following manner. Dichotomies (state input bits) in the input support of the outputs in the cluster are extracted. Intersections of the dichotomy supports form a partition into the dichotomy groups. Each group has a “depth” associated with the number of supports intersected, i.e. the number of outputs in the cluster using the dichotomies in the group. The graphical illustration of dichotomy groups is presented in Fig. 5.11. The sum of the dichotomy groups forms the dichotomy cluster.

![Figure 5.11. Dichotomy groups](image)

The clusters resulting from output-oriented clustering already include the dichotomies we wish to merge together. Therefore, the transformation of these clusters only involves replacing the atomic state bits in the clusters with the atomic dichotomies corresponding to the bits and placing the atomic dichotomies in a single dichotomy group that makes up the dichotomy cluster.

**Dichotomy cluster ordering**

Finally, the identified dichotomy clusters are ordered according to their quality. We recall that the estimated quality of clusters has two components: primary savings, involving primary inputs and outputs and therefore relatively easy to predict; and state savings, which involve atomic state bits and can therefore significantly change in the dichotomy merging process. The ordering criterion is therefore the “reliable” primary gain on the cluster. The uncertain state gain is used as tie-breaker. It can only cause a change in the order introduced by certain gain if the difference in uncertain gain is very large (two times), and the difference in certain gain – small (10 percent).

**Clustering process summary**

In the above we have described the steps of the clustering of atomic dichotomies that should be conveyed together on the same encoding variables to optimize the implementation of the encoded FSM.

To find the dichotomy clusters, we identified three types of output function clusters that correspond to different potential optimizations. The input-oriented clusters focus...
on the output functions that share common inputs and therefore may form a coherent sub-system. The sub-function-clusters reflect the term distinctions common to different output functions, and thus the potential for common subfunctions for these functions. Finally, the next-state-bit clusters identify atomic dichotomies that share input information and therefore should be realized together.

The identified clusters are ordered by introducing the common cluster quality measure related to estimated savings if the cluster is considered in the code construction. The ordered clusters are then transformed to corresponding clusters of dichotomies that should be merged, and the dichotomy clusters are finally ordered to prepare them for the code construction process.

5.3.5 Code construction

In this phase the created clusters of dichotomies are considered in the descending order of quality and the dichotomies in the cluster under consideration are merged on a (close to) minimum set of bits. Thus, an attempt is made to realize the gains estimated by quality of the clusters. The final result is a set of $n$ encoding dichotomies, which defines the $n$-bit encoding.

In the beginning of the procedure, the preferences for each pair of the dichotomies to be realized on one bit are gathered over all the dichotomy clusters. This is done by considering number and quality of the clusters, in which the dichotomies appear together. The result is the dichotomy-affinity matrix, containing in position $(i,j)$ the level of preference for dichotomies $i$ and $j$ to be realized on one bit.

Then, the code is created. In each step a set of already created encoding dichotomies (partial code) is available. The highest quality, yet not merged cluster is selected, and the dichotomies in it are merged. The merging procedure has as the objective to fit the merged dichotomies on a minimum number of bits (encoding dichotomies). If the merged dichotomies do not fit on one bit (due to incompatibility), the distribution of dichotomies on the set of bits is guided by the dichotomy-affinity matrix, in such way that the distribution chosen (heuristically) maximizes summary preference over all pairs over all created bits. The merging procedure is stopped, when the current set of encoding dichotomies realizes all the atomic dichotomies.

**Dichotomy-affinity matrix** The dichotomy-affinity matrix reflects number and quality of the clusters that the particular pair of dichotomies is placed in. Therefore, it reflects preference for these two dichotomies to be realized on one code bit.

The matrix is initialized with zero values, and then updated by analysis of the dichotomy cluster set. For each dichotomy group in each dichotomy cluster, all the pairs of dichotomies formed within the group are enumerated and each the affinity of each of the pairs is increased with the value of cluster quality (predicted gain) multiplied by group depth. Additionally, for all the pairs in the cluster (even if they are not within one group) the affinity is increased by cluster quality value.

**Cluster merging** In this phase the code is constructed in such way that the dichotomies placed in one cluster are (preferably) put on a minimum number of code bits. To achieve this the clusters are considered in descending order of quality. For each cluster the dichotomies in the cluster are merged on a minimum number of encoding dichotomies.
If the number of resulting dichotomies is larger than one, the distribution of the atomic dichotomies over the encoding dichotomies is related to the pairwise affinity of the dichotomies expressed in the affinity matrix. This task is performed by the twin graph coloring procedure (see Section 5.4.1) with the preferences between dichotomies determined by the dichotomy-affinity-matrix.

After merging some encoding dichotomies may include not all state symbols. This is especially the case if the merged cluster was relatively small or if it did not fit on one encoding dichotomy and a small number of atomic dichotomies had to be merged on a separate encoding dichotomy. Such “low-density” encoding dichotomies realize only a small number of state distinctions, so they contribute to the growing of the number of state bits in the final encoding. In general, we want to avoid this effect, unless the low-density dichotomy is of such high quality that the code length increase is justified.

To increase the density of the encoding dichotomy, we select a limited number of not yet merged dichotomy clusters and attempt to add the not merged atomic dichotomies to the encoding dichotomy. This solution is only considered if the dichotomies can be merged on the same number of encoding dichotomies, as the existing cluster, i.e. the compactness of representation for the previously merged atomic dichotomies is not affected. This is especially important when we consider that the previously merged clusters have higher quality, and hence are more important to realize compactly. If the equally compact representation is possible, the quality of the two competing solutions is compared. The quality measure involves the increase of the density of the encoding dichotomy, weighted against the increased input support of the dichotomy.

**Stepwise merging**

As an alternative to the regular cluster-based merging procedure so called *stepwise merging* was developed. The method is similar to the approach proposed in [56], where at each step a pair of state bits occurring together in a large number of input supports is merged. In this way, each of the supports where the pair occurs is reduced by one bit.

Our method selects at each step a pair of compatible state bits (a pair of compatible current encoding dichotomies) with the highest affinity and merges them together. The affinity measure considers similar aspects as the clustering methods. It takes into account the aspects important for efficient realization of the resulting binary functions: the production and consumption of the state information.

The efficient consumption of the created encoding bit is taken into account in two ways. Firstly, the candidates for merging are selected from the pairs of encoding bits which occur together in a large number of input supports. These bits are often used together and therefore it is advantageous to merge them together. The rationale behind it is that the two merged state bits will be replaced by one bit in all the supports where the pair occurs and therefore the supports will be reduced by one bit. Furthermore, the number of interconnections routed to each of the functions will decrease, further simplifying implementation. When considering the input support reduction as a result of merging, one needs to take into account not only the two merged dichotomies, but also all the dichotomies implied by the merged dichotomy. For instance, merging encoding bits defined by dichotomies \( a/b \) and \( c/d \) will produce an encoding dichotomy \( ac/bd \) which will replace in the input supports not only \( a/b \) and \( c/d \), but also \( a/d \) and \( b/c \). Therefore,
the total input support saving introduced by merging two dichotomies $d_1$ and $d_2$ is

$$iss(d_1, d_2) = \sum_{i \in IS, d_1 \in iS \land d_2 \in iS} \{d \mid d \in iS \land d \subseteq d_1 \lor d_2\} - 1$$

The second aspect of information consumption considered when merging state bits is the number of term distinctions realized by the new bit that are required by output functions. The larger the number, the more information required by the output is delivered by the input bit and therefore the fewer other inputs may be required by the output.

From the point of view of information production, the two merged bits (and all the implied dichotomies) should share the common input information, so that the resulting next-state function has an efficient implementation. To measure this aspect of state bit's affinity, the input support affinity measure is used (see 5.3.4). Its value, calculated for all pairs of dichotomies covered by the merged one, reflects how much the supports of the component dichotomies have in common.

The merging procedure selects from the current set of encoding dichotomies the dichotomy pairs that appear together in the largest number of supports. Then it analyzes the other encoding dichotomies covered by the merged pair and the supports they are used in. During this analysis, the pairwise input support affinities of the covered dichotomies are calculated. Finally, the state input bit resulting from merging the dichotomies is constructed and the term distinction realized by it and used by outputs are identified. The above measures are normalized to the range of \([0.0; 1.0]\) and multiplied to arrive at the composite affinity measure of the two considered dichotomies. The pair of encoding dichotomies with the highest composite affinity measure is merged and the data structures are updated to include the new bit in the input supports.

The update procedure is crucial to the efficiency of the method. In general, to determine input supports of the output functions at each step of the merging procedure, the machine should be encoded with the current encoding and time consuming input support procedure should be called. However, considering that the merging process is composed of relatively small changes to the encoding, most of the input support information in a given step can be obtained by appropriate modification of the input supports from the previous step. During the affinity analysis we already established the current encoding bits covered by the newly created bit. These bits can all be removed from the current encoding and they can be replaced in the input supports of the remaining outputs by the new bit. The input support of the new bit can be either established as a sum of the supports of the covered dichotomies or, for more precision, its input support may be calculated by the regular input support routine. The recalculation of just one input support at each merging step does not introduce severe performance penalty.

### 5.4 Structures and algorithms

In this section, we will discuss in more detail some of the fundamental data structures and algorithms used in SECODE.

#### 5.4.1 Twin Graph

Twin graph is a structure first introduced in \([12]\). It is an undirected and self-loop-free graph with two types of edges: regular and “twin edges”. Each node of the graph has
5.4. STRUCTURES AND ALGORITHMS

exactly one “twin node” that is connected to it with a twin edge. Additionally, every node can be connected with arbitrary other nodes by undirected regular edges. In this section we will present the formal definitions of twin graph and their application to dichotomy merging and the heuristics we developed to efficiently merge dichotomies with given preferences.

**Definition 5.1 (Matching).** Let $G = (V, E)$ be an undirected, self-loop-free graph. A matching on a set of vertices $X \subseteq V$ is a set of edges $T \subseteq E$ such that

$$\forall x \in X \quad \exists \{v, v'\} \in T : x = v \oplus x = v'$$

$$\forall \{v, v'\} \in T \quad \forall \{w, w'\} \in T \setminus \{v, v'\} : \{v, v'\} \cap \{w, w'\} = \emptyset$$

In other words, a matching is a set of edges that group the vertices from $X$ into unique, disjoint pairs.

**Definition 5.2 (Twin graph).** A twin graph is a pair $(G; T)$, where $G = (V, E)$ is an undirected, self-loop-free graph and $T$ is a matching on $V$. Each pair $\{v, v'\} \in T$ is called a twin couple and $v$ is called a twin of $v'$.

The graphical representation of a twin graph $(G, T)$, with $G = (V, E), V = \{a, b, c, d, e\}, E = \{\{a, b\}, \{a, d\}, \{c, d\}, \{d, e\}, \{e, f\}\}, T = \{\{a, b\}, \{c, d\}, \{e, f\}\}$ is shown in Fig 5.12(a). The twin edges are represented with thick lines.

An instance graph of a twin graph $(G, T)$ is a graph $G \setminus V'$, where $V'$ is a set of twin vertices that does not contain any twin couple and $|V'| = |T|$. In other words, an instance graph is a graph with exactly one vertex from each twin couple removed (see Fig. 5.12(b)). The remaining twin vertex is called twin representative. A coloring of a twin graph is a coloring of one of its instance graphs and minimum coloring is the minimum cardinality coloring that can be obtained for a twin graph over all its instance graphs (Fig. 5.12(c)).

**Dichotomy merging with twin graphs**

Twin graphs are particularly suitable for merging of sets of unordered dichotomies. For any set of unordered dichotomies, a twin graph can be built with a couple of twin nodes for each of the dichotomies. Each of the twins corresponds to one of the two possible orderings of the unordered dichotomy. For instance, the unordered dichotomy $ab//cd$ generates a pair of twin nodes with associated ordered dichotomies $ab//cd$ and $cd//ab$. Then,
a regular edge between the nodes indicates incompatibility of the ordered dichotomies associated with the nodes connected by the edge. The minimum coloring of the twin graph constructed in this way gives the minimum number of dichotomies that the initial dichotomies can be merged to. The resulting merged dichotomies are obtained by merging together all ordered dichotomies associated with the nodes colored to the same color.

Consider, for example, the set of unordered dichotomies \{ae=bd, a/c, b/c, c/de\}. The twin graph for this set is shown in Fig. 5.13(a). The colored instance graph is given in Fig. 5.13(b). The dichotomies ae=bd and a/c were selected in the positive polarity (i.e. ae//bd and a//c) and given the same color. They are therefore merged to form the dichotomy ae//bcd. Note that due to the symmetry of the dichotomy merging problem, any instance graph $G \setminus V'$ has the same cardinality coloring as the instance graph $V'$. In other words, given a colored instance graph, same cardinality coloring can be obtained for the instance graph created by replacing each node by its removed twin. In this example, in such inverse graph, the dichotomies ae=bd and a/c would be both present in negative polarity (bd//ae and c//a), colored to the same color and merged to dichotomy bcd//ae. Since merging in both polarities is possible, the result is an unordered dichotomy ae//bcd. The same reasoning applies to the dichotomies b/c and c/de, which can be merged in negative-positive polarity c//b, c//de or positive-negative (b//c, de//c), so the result is the dichotomy c/bde.

**Heuristics for twin graph coloring**

The problem of graph coloring is known to be NP-complete. Therefore, efficient heuristics are necessary to solve it in reasonable time. In the case of twin graph coloring, the
process is additionally complicated by the necessity to select one vertex out of each twin couple.

The basic heuristic is derived from the well-known DSATUR \([5]\) graph coloring heuristic. It is based on sequential selection of the “most difficult to color” vertices and coloring them to the first available color. This way, the most constrained vertices are dealt with in the early stages of coloring, when the freedom of color’s choice is not yet limited by the previous choices. In this context, the most difficult vertices are the ones that already have many colored neighbors (they have high saturation), so the number of colors allowed for these vertices is small. As a tie breaker, the number of edges incident to the vertex is used. This follows the rationale that vertices that impose many constraints should be colored early on, so that the imposed constraints can be addressed in the early stages of the coloring procedure.

On top of that, the heuristic for twin representative selection needs to be introduced. In this case, the obvious choice is the selection of the less constrained of the two twins (i.e. the twin with a lower DSATUR-score).

These two heuristics are intertwined in the framework of the coloring method. The strategy that proves to work well is the selection at each step of the most constrained twin couple (i.e. the couple for which the representative is more constrained than representatives for other couples), removing the more constrained twin, and coloring of the remaining representative.

Consider the initial score values for the twin graph in Fig. 5.13(a) (scores for particular nodes are given in Fig. 5.14(a) in form saturation, number of neighbors). The most constrained twin pairs are the pairs associated with dichotomies \(ae=bd\) and \(c=de\). Both twin vertices in each of these pairs can be considered a representative with no colored neighbors (saturation = 0) and four incident regular edges. Let us select the pair \(ae=bd\) and the vertex \(ae=bd\) to be the representative of the pair. The twin \(bd=ae\) is then removed and \(ae=bd\) is colored to the first color (gray). The resulting saturations are shown in Fig. 5.14(b). Now, both pairs \(a=c\) and \(b=c\) have a representative with score \((0,2)\). In the pair \(c=de\), on the other hand, both vertices have score \((1,3)\), so this pair is selected as the most constrained, the vertex \(c=de\) is arbitrarily selected as the representative and colored to the second color (black). Further, the pair \(a=c\) has two representatives with score \((1,2)\) and the pair \(b=c\) a representative with score \((0,1)\). \(a=c\) is selected as the representative of the more constrained pair and colored to the first color. Finally, \(c=b\) is the representative of the remaining pair, and is colored to the second color, resulting in the coloring in Fig. 5.13(b).

**Heuristics specific to dichotomy merging**

As illustrated by the above example, even with multiple constraints, the choice of equally constrained vertices to color can be quite large. Furthermore, in some cases a choice of colors can be available for a selected vertex. For instance, when coloring the pair \(a=c\) in the above example, both twin vertices were equally constrained with the score of \((1,2)\), but the vertex \(a=c\) could only be colored to the first color if selected as representative, while the vertex \(c=a\) could only be colored to the second color. Also, if there were more than two colors currently used in the graph, both of the vertices would have additional color to choose from. In situations like that, additional criteria associated with particular problem can be used to break the ties.
Figure 5.14. Twin graph coloring process
In the case of dichotomy merging, one of such criteria can be the preference of particular dichotomies to be merged together. This preference is given as an affinity matrix, where the pairwise preferences for all pairs of dichotomies are recorded. For instance, the choice of coloring $a/c$ to the first color over coloring $c/a$ to the second color could be dictated by the fact that the dichotomy $a/c$ has more preferences towards merging with $ae/bd$ rather than with $c/de$. Even more detail is introduced by analyzing atomic dichotomies resulting from coloring a particular vertex (i.e. merging it with other dichotomies already colored to the same color). Returning to the previous example, it might be the case that the atomic dichotomy $a/c$ has more affinity towards the atomic dichotomies $a/b$, $a/d$, $b/c$, and $d/e$ represented by $ae/bd$, than towards $c/d$ and $c/e$ represented by $c/de$.

Another criterion for the selection of colors can be the atomic dichotomies implied and forbidden by the choice of particular coloring. In general, the goal of dichotomy merging is to determine as few as possible dichotomies that cover all the initial dichotomies. In some cases, it may additionally be desirable that the merged dichotomies cover as few as possible atomic dichotomies outside of the initial set, provided that the minimum cardinality of the coloring is still preserved. An example of such case is the encoding construction, where a number of atomic dichotomies is selected to be merged together on as few as possible encoding dichotomies, and the preference is that only these atomic dichotomies are present on the resulting encoding dichotomies. Consider, for example, the set of atomic dichotomies $a/b$, $a/c$, $b/c$, and $c/d$, and assume that $a/b$ and $a/c$ where both colored to color 1 and $b/c$ to color 2. This means that $d/c$ has a choice of being colored to any of these colors. However, if colored to color 1, the resulting merged dichotomy $ad/c$ would include additional implied atomic dichotomy $b/d$ and would forbid the dichotomy $a/d$ from being realized on the same bit. On the other hand, if $d/c$ is colored to color 2, the resulting merged dichotomy $bd/e$ does not introduce any implied or forbidden dichotomies.

**Twin graph implementation**

The abovementioned heuristics were implemented in TwinGraph class. The class constructor accepts a list of unordered dichotomies and optionally the affinity matrix for the dichotomies. The affinity matrix can include positive affinities as well as negative (indicating that the dichotomies should not merged if possible). The negative affinity value INT_MIN is reserved to indicate that the two dichotomies should never be merged together. For each unordered dichotomy, a couple of twin nodes is constructed and connected with a twin edge. Then, the incompatibilities between the ordered dichotomies represented by nodes are determined and represented as regular edges in the graph. The coloring procedure is described in Alg. 5.1.

The procedure at each step selects the best candidate for coloring and colors it to the best color. The candidates for coloring are selected from the group of twin couple representatives with the highest DSATUR score (line 3). All the candidates are required to have the same saturation value, but the number of neighbors can differ within a given margin of the highest neighbors number in the group. Then, the candidate list is sorted in the descending order of preference level (line 6). The preference level is a sum of absolute values of positive and negative affinities that the candidate vertex has to other vertices in the graph. A large value of preference level indicates that the vertex has strong
Algorithm 5.1 Twin graph coloring

1: while( uncolored vertices present )
{
3:     candidates = list of twin representatives within a margin
      of the highest DSATUR score;

6:     if( preferences present ) sort candidates by preference level;
       else sort candidates by DSATUR score;

9:     candidates = a number of candidates from the front of the list;

12:    for each v in candidates {
13:        determine colors available for v;
14:        determine preferences of v for each of the available colors;
15:        colors = select a number of colors with highest preference;
16:        for each c in colors {
17:            analyze dichotomies implied and forbidden by coloring
                vertex v to color c;
18:            if( better than current best ) bestv, bestc = v, c
19:        }
22:    remove twin of bestv;
23:    color bestv to bestc;
}

preferences in the selection of a color for it, so it should be considered first. Depending
on the size of the problem, some of the less promising candidates can be discarded to save
runtime (line 9). For each of the remaining candidates, available colors and preferences
to the vertices already colored to the available colors are determined (lines 12 and 13). The
most preferred colors are then analyzed to determine, what dichotomies would be
implied and forbidden if the vertex was colored to a particular color (line 16). The best
vertex and its best color with high preference and low implied/forbidden score is selected
(line 18), the vertex's twin is removed and the vertex is colored (line 22).

5.4.2 Clustering

A major part of the analysis phase of SECODE is finding groups of outputs and atomic
dichotomies that are related as far as consumed and produced information is concerned.
To find these groups, we employ clustering techniques. Clustering of data is a well known
and much studied problem used in diverse fields, such as statistics, social sciences, mar-
ting, geography, biology and many others. An extensive overview of clustering clas-
sification and techniques can be found in [29] and some practical implementations of
clustering are discussed in [53].

In our problem, we applied two different clustering strategies. First of them is a
variation of hierarchical clustering, modified to handle cases of non-disjoint clusters. The
other one is a partitional (disjoint) clustering method based on growing clusters around
Hierarchical clustering

The hierarchical clustering algorithm accepts as input initial clusters (usually, one-element clusters) and merges them into larger ones. The resulting clusters constitute the input to the next level of clustering. The clustering stops when no merging is possible. The criteria for merging are based on similarity measure between clusters. At each level, similarities between all pairs of clusters are calculated. The similarity-measuring-function is a parameter to the clustering algorithm. The resulting similarities are presented as a graph (clusters are nodes, similarities are the weights of the edges). Then, the criterion-function, which also is a parameter to the algorithm, is applied to eliminate some of the edges. This way, a sort of threshold graph is constructed. However, the application of criterion-function rather than a fixed threshold makes the selection much more flexible and capable of dynamic adjustment of thresholds, depending on the characteristics of the problem or on how advanced the solution is. In the resulting graph the cliques (complete sub-graphs) are identified, and the clusters in the cliques are merged together.

Seed clustering

The seed clustering algorithm identifies (based on similarity matrix) seeds of the clusters (single objects, mutually distant and with strong connections to other objects). The seeds form initial (one-element) clusters. Then, each of the objects left (one at a time) is assigned to one of the existing clusters. If for an object no cluster exists to which the object could be added, a new cluster is created, with the object as an element. The seed selection procedure starts with identification of the object with strongest relations (largest average affinity to its neighbors). Then, one-by-one, the remaining seeds are selected as the objects with lowest summary affinity to the current seed set (with 20% tolerance) and (tie-breaker) strongest relations. In the object assignment phase, an object with the strongest relations is selected. Its affinity to current clusters is determined as average affinity to the cluster’s elements. If the affinity to a cluster is larger than affinity to remaining neighbors of the object, the object is placed in the cluster. Otherwise, a new cluster is created, with the object as a seed.

5.4.3 Code improvement with simulated annealing

In many problems with complicated characteristics, after a solution has been found, it is often a good idea to perform a local search around the solution to determine if no better solutions similar to the original one are present. Since the concerned neighborhood is much smaller than the entire solution space, more detailed and more time-consuming criteria can be applied to search for an improved solution, without a prohibitive performance penalty.

Simulated annealing

Simulated annealing is an optimization technique based on the analogy to the physical process of metal recrystallization [54]. In this process, a metal is heated to high tempera-
ture, at which it enters a highly disordered, high energy state. At any given temperature, the melt is allowed to stabilize, i.e. to achieve the lowest possible energy state at this temperature. Due to the nature of the process, the route to the stable low energy state may lead via some higher energy states. Once the equilibrium is achieved, the temperature is lowered and the melt is allowed to reach equilibrium at the new temperature. As cooling proceeds, the system becomes more ordered and approaches a “frozen”, crystalline minimum energy state.

Optimization methods based on simulated annealing draw the analogy between the state of the thermodynamic system and the solution of the optimization problem at hand. The energy of the system is reflected by the cost of the solution. Following the analogy, at each temperature the solution is perturbed to form a new candidate solution. If the candidate solution is better (has a lower energy, i.e. lower cost), it is accepted. However, even if it is worse, it can still be accepted with the probability given by the Boltzman factor $\exp(-\Delta E/T)$, where $\Delta E$ denotes the energy (cost) increase and $T$ is current temperature. Note that, faithful to the physical analogy, the procedure will allow higher cost solutions to be accepted at higher temperatures, but with cooling the probability of such an event decreases. The outline of a typical simulated annealing algorithm is presented in Alg. 5.2.

```
Algorithm 5.2 Simulated annealing

determine initial solution $S$;
$T = T_0$;
repeat {
    while (not an equilibrium) {
    perturb $S$ to get a new solution $S'$;
    $\Delta E = E(S') - E(S)$;
    if ($\Delta E < 0$)
        replace $S$ with $S'$;
    else
        replace $S$ with $S'$ with probability $\exp(-\Delta E/T)$;
    }
    $T = T \cdot \alpha$; // $0 < \alpha < 1$
} until (freeze);
```

A number of variations on this general scheme is possible. Some approaches use a fixed number of perturbations at each temperature, without the requirement for equilibrium. This saves significant amount of runtime, especially in the early stages of the algorithm, when the system is unstable due to high temperature. This savings come naturally at the expense of good statistical properties of the solution, but these may have small effect on the quality, especially if the initial solution was already a good one, so no large perturbations and no extensive search of the search space is required. It is also possible to dynamically adjust the number of perturbations at a given temperature, depending on the current conditions. In the early stages, it is also possible to use less precise cost functions, as the high temperature and the resulting high acceptance rate introduce an element of error that may offset small errors in the cost function [17].

One of the most important aspects of the algorithm is so called “annealing schedule”,
that is the choice of initial temperature and the method to decrease it in the subsequent steps of the algorithm. The simple-minded approach is the linear decrease of temperature by a factor $\alpha \in (0, 1)$ (as in Alg. 5.2). More elaborate schedules are also possible. The most popular is the schedule that saves the runtime of the algorithm by quick cooling at the initial temperatures (e.g. by a factor of $\alpha^4$) and slower cooling in the finishing stages (e.g. by $\alpha$).

**Code improvement**

The code improvement procedure was implemented as simulated annealing in the scheme presented in Alg. 5.2. The solution is a given encoding, with the initial solution being a result of the code construction stage of SECODE. The initial temperature is selected in such way that a solution with the quality 10% worse that the initial solution is accepted with probability of 50%. This value is given by equation

$$T_0 = \text{initial\_cost} \cdot \frac{0.1}{\ln(2)}$$

A current encoding is perturbed in a random manner by adding or removing a symbol from a block of one of the encoding dichotomies. If a symbol is removed, the atomic dichotomies lost by removing this symbol and not realized by other encoding dichotomies are recovered by adding appropriate symbols to the other encoding dichotomies. At each temperature, a limited number of perturbations is generated. This number increases logarithmically from the number of perturbations equal to the length of the encoding at the highest temperature (i.e. statistically, each bit is perturbed once) to 10 times the length of the encoding at the temperature 1. The number of perturbations at a given temperature is therefore given by

$$\text{num\_perturbations}(T) = \text{encoding\_length} \cdot \left(10 - 9 \cdot \frac{\ln(T)}{\ln(T_0)}\right)$$

The new solution is then evaluated by a cost function. The cost function is a parameter to the annealing procedure. We implemented two basic cost functions: the dichotomy-affinity-cost-function and the input-support-cost-function. The dichotomy-affinity-cost-function evaluates a solution based on a dichotomy affinity matrix, which reflects pairwise affinities of atomic dichotomies to be merged together. A solution is evaluated by enumeration of atomic dichotomies realized by the encoding dichotomies and summing up pairwise affinities of the atomic dichotomies realized on the same encoding dichotomy. The input-support-cost-function performs more precise evaluation of the solution by determining input supports of the output and next-state functions resulting from encoding the FSM with the encoding represented by the evaluated solution. The cost is then determined as the sum of sizes of input support of the functions. This procedure in general can be time consuming. However, taking into account that subsequent solutions are produced by small perturbations of the previous solutions, caching is extensively used to only recalculate the input supports actually affected by the perturbation, and to cache the unchanged rest.
5.5 Special encodings

When designing any computer optimization method it is usually desirable to make the
method as general as possible, with few assumptions about the character of the sought
solution. Any assumptions inevitably limit the explored solution space and may there-
fore preclude some very good, but “unexpected” solutions. To avoid this effect, a method
should strive to search the solutions that optimize objective quality criteria, rather than
the ones that follow a presumed pattern. On the other hand, such general, unconstrained
methods usually have trouble finding some very specific, characteristic solutions that
may be good in the particular domain. Naturally, when exploring a huge and irregular,
discrete solution space, chance of finding one very specific, isolated point in it is small,
however sophisticated the method. Therefore, practical methods often implement a gen-
eral search algorithm coupled with a separate analysis of some few specific, well known
solutions.

This is unfortunately not the case in state encoding methods implemented in most
of the current commercial FPGA synthesis tools. They essential limit themselves to the
analysis of only some specific encodings, namely: one-hot, sequential (natural binary),
and Gray code, or even leave the choice entirely to the user. No general search for a good
encoding is performed. Undoubtedly, this choice was dictated by the difficulty of the
encoding problem and the perception that finding a good code is too great a burden on
the runtime of the tool. However, a quick but naive choice of state assignment may result
in a circuit which is not only much larger and slower, but also, due to its complexity, much
harder to synthesize. The additional effort required from the combinational synthesis
method may in this case exceed the effort required to find a more suitable encoding.

However limited this simplistic approach to encoding is, it stems from a valid obser-
vation that a certain limited number of practical, industrial benchmarks have good im-
plementations when encoded with one of these specific encodings. This is particularly
true for one-hot encoding, which is known to result in combinational logic described
by very simple sum-of-product (SOP) expressions. While usually sum-of-product form
is not the best indicator of the complexity of FPGA implementation, the specific form
of the one-hot-induced functions makes them more often good candidates for an effi-
cient implementation. In particular, the SOPs resulting from one-hot encoding often
depend on less than all state variables and take form of a sum of very small (sometimes
single-literal) product terms, with little overlap between the terms. This may result in
the Boolean functions that are unate in some of their input variables and possess good
disjoint decompositions. Both of these traits simplify logic synthesis of the Boolean func-
tions and may result in effective circuit implementations. On the other hand, in many
cases the sheer number of functions and memory elements required to implement a
one-hot encoded machine is so large that the resulting realization is very large, however
simple the particular functions are. Also, even the simplest functions with very wide
supports still require a large number of LUTs to realize, as the LUTs are limited by the
maximum number of inputs. As a result, one-hot encoding can only be applied to a small
subset of all finite state machines and is not a universal method.

In this section we will describe a method of determining good candidates for one-hot
encoding before the actual encoding. With this method we are able to efficiently analyze
if a machine has a potential for good one-hot realization and if so, encode it without the
relatively costly general encoding procedure, yielding good results in a fraction of the
5.5. SPECIAL ENCODINGS

runtime.

5.5.1 One-hot encoding

Our method is based on the prediction of the number of look-up-tables required for realization of the machine encoded with one-hot encoding and with some minimum-length encoding. Since our general encoding method results in close-to-minimum encoding length, it is a reasonable approximation of the possible state assignment result. The estimation of the number of LUTs is based on the number of inputs required to compute each function.

One-hot encoding cost

In the case of one-hot encoding, the inputs required to calculate each particular output and state bit are easy to estimate by analyzing the state transition table. Since each state will be represented in the one-hot encoded machine by a separate bit, the state bit support for a primary output bit consists of state bits corresponding to the states, in which the particular output is active. Also, for a next-state-bit, the previous states of the state corresponding to this particular bit will form its state-bit-support. In the cases where thus assessed state bit support is very large, it should be reduced to account for the possibility of expressing the state conditions in a complementary fashion. For instance, for a 5-state machine the output that is active for the states 1 through 4 (and therefore depends on the first four state bits) can be expressed as an output active when the state is not 5, and therefore depends solely on the fifth state bit.

To estimate the primary inputs involved in the computation of a particular output, the primary input cubes for which the output is active can be analyzed to identify the active inputs. Of course, due to possible redundancy of representation in state transition table some of the seemingly active inputs may be not actually used by the output. For instance, an output active for input cubes 111 and 110 really only depends on the first two input bits, as the value of the third bit does not influence the value of the output. To identify such situation, a quick distance-1 merge can be performed among the primary input cubes. For more precision, actual input supports of the output functions can also be computed using a simple minimum input support algorithm (e.g. best first search).

Once the number of inputs used by an output is determined, the expected realization size is estimated by the expression

\[
\text{lut\_count\_hot}(in) = \begin{cases} 
0 & \text{in } \leq 1 \\
1 & 2 \leq \text{in } \leq 5 \\
2 + \frac{\text{in} - 6}{2} & \text{in } \geq 6
\end{cases}
\]

Note that the expression introduces linear dependency between the support size and the size of the realization. While in general it is far from true, the specific form of the Boolean functions resulting from one-hot encoding is such that their realization is very simple and requires little logic. In particular, one-hot encoded Boolean functions have very often very good disjoint decompositions, which simplifies functional decomposition of the circuit performed in the combinational synthesis and results in compact realization.
Minimum length encoding cost

The estimation of realization size for the machine encoded with a minimum length encoding is much more difficult. Naturally, the realizations for different encodings of the same length differ very considerably, which is exactly the reason that fuels state assignment research. Therefore, any estimations dealing with the precise size of the realization have to be wildly inaccurate.

However, in this case we are not interested in the exact size of the realization, but rather with the relation between the size of one-hot and some min-length realization. We are looking for a verdict that will indicate whether one of the realizations is clearly better, or the difference is too small to determine by such an inaccurate method. This approach calls for much smaller precision and, as we found out in our experiments, the rough estimation based on input support size is in most cases sufficient.

As in the case of one-hot realization, the min-length realization size is estimated based on predicted input supports of the binary next state and output functions. When determining the supports, we assume that a primary output function will use the same primary inputs it used in one-hot realization plus all encoded state bits. The next state functions have all the same support consisting of all primary inputs used by the state logic in one-hot encoding plus all the encoded state bits.

Once the predicted input supports of the output functions are determined, the size of realization is estimated by the sum of realizations of particular primary output and next state functions. The realization cost of a single output function is a function of its input support size and is given by the expression

$$lut\_count\_minlen(in) = \begin{cases} 0 & in \leq 1 \\ 1 & 2 \leq in \leq 5 \\ (2 + \frac{in-6}{2}) \cdot \frac{in}{6} & in \geq 6 \end{cases}$$

As can be observed, this expression has an additional factor of $\frac{in}{6}$ when compared with the expression used for evaluation of the size of one-hot realization. This factor introduces non-linear dependency of realization size on the number of inputs. This is necessary for functions resulting from min-length encoding, as they are much more complicated to realize than the one-hot functions. This is due to the fact that the information about states is encoded on the state bits in a more complicated fashion than in one-hot encoding, so additional circuitry is necessary to decode the information necessary to calculate the output functions.

5.6 Conclusions

In this chapter, we have discussed the state assignment method implemented in software tool SECODE. The method builds upon the general set-system-based encoding procedure proposed in Chapter 4. It uses the information relationships and measures apparatus to analyze relationships between information flows within the encoded machine and constructs the final encoding in such manner, as to optimize the information flows. To this end, it considers production and consumption of state and primary input information by the next-state and primary output functions. It then identifies groups of items of elementary state information that should be conveyed by the same binary encoding bits because
5.6. CONCLUSIONS

of the common source or destination of the elementary state information. Finally, it creates an encoding by explicitly constructing binary encoding bits conveying the desired elementary state information.

The information flow optimization implemented in SECODE realizes a general goal of finding a “natural” decomposition structure of the realization circuit, with separate, relatively independent, coherent sub-circuits. The optimized information flows deliver to these sub-circuits compressed, relevant information that can be efficiently processed. This approach is beneficial to any logic synthesis method or target implementation platform, but it addresses particularly the characteristics of LUT-based FPGAs and IRMA\textsubscript{2}FPGA as a logic synthesis method. In LUT-FPGAs, limits are placed on the number of inputs to a logic block and on the interconnection structure. Compression of relevant information results in Boolean functions with few inputs and the separation of independent sub-systems reduces the inter-system communication and, thus, the non-local interconnections. This approach is also compatible with the logic synthesis method implemented in IRMA\textsubscript{2}FPGA. As discussed in Section 4.2, this functional decomposition method is based on analysis of information delivered by particular inputs of a Boolean function and required by the output of the function. The method builds a network that removes the redundant input information and reorganizes it to form the desired output information. If the inputs deliver mostly relevant information, the process of removing the redundant information is greatly simplified, and smaller realization circuits can be found.

In this way, SECODE and IRMA\textsubscript{2}FPGA form a coherent synthesis chain for FPGA implementations of sequential circuits. In the next chapter, we will present experimental results testifying to the effectiveness of this approach.
Chapter 6

Experiments

In the previous chapters, we described the state assignment method based on set system (or dichotomy) representation of encoding and the heuristics that are used in the framework of this method to find encodings resulting in effective state assignment. To test the effectiveness of the proposed approach, we implemented this encoding method in the experimental EDA software tool SECODE. In this chapter, we will present and discuss the results of encoding experiments that we performed with the tool using a large set of benchmarks.

6.1 Comparison of one-hot and min-length encodings

In Section 5.5.1, we indicated that while generality is a very desired quality of a synthesis method, it comes at an expense of having to deal with a large solution space. In this huge space, a general synthesis method being guided only by some general heuristics may have a difficulty to reach some very specific solution points, even if they are known to often result in high quality circuits. Therefore, we proposed to accompany our general encoding method by a dedicated procedure to test in relation to a given FSM for high quality of some specific encodings, and in particular of the one-hot encoding. We proposed the method of evaluating the size of implementation of a one-hot-encoded FSM as compared to the FSM encoded with a minimum length encoding. This method enables us to predict with a reasonably high accuracy, which sort of encoding will be better for a given FSM: the specific one-hot encoding or the general encoding.

In Tab. 6.1 we present experimental results of the proposed method on the standard set of IWLS benchmarks [62]. In these experiments, we confronted the predictions of the method with the actual synthesis results of machines encoded both with the one-hot and min-length encodings. For min-length encoding we used JEDI, which is known to produce high-quality minimum length encodings. The encoded FSMs were synthesized with IRMA2FPGA, the synthesis method targeting LUT-FPGAs and based on functional decomposition [41]. IRMA2FPGA produces on average much better circuits than any other available academic or commercial combinational synthesis method [41]. In our experiments with IWLS benchmarks, it outperformed the popular academic logic synthesis tool, SIS [71], on average by 25% in terms of area of the synthesized circuits and by 30% in terms of delay measured by the number of levels.
The parameters of the synthesized circuits are given in Tab. 6.1. The columns *jedi* and *hot* give size (in 5-input LUTs) of the implementation of FSMs encoded with *JEDI* and 1-hot, respectively. The column *ind* shows the results of our cost estimation analysis. ‘+’ stands for predicted increase of size in one-hot implementation, ‘−’ for decrease in size and ‘?’ for uncertain. The uncertain status was assigned to cases where the predicted difference in implementation size was under 15%. The *real* column shows real increase in the size of the realization when one-hot encoding is used instead of min-length.

We can observe that in over 80% of the cases the relationship between the size of one-hot and min-length realization is evaluated correctly. Especially for larger machines, the estimations agree with reality (e.g. *planet*, *s1488* or *s820*).

In most cases where the estimation is wrong, it overestimates the size of min-length realization and thus incorrectly indicates that one-hot encoding is better. In these cases, *JEDI* is able to find some optimization very specific to the given machine, which is impossible to predict just analyzing the input supports. This is particularly glaring in the case of the *s1* FSM. The procedure estimated the sizes of both realizations as comparable, while in reality the one-hot realization is 185% larger than min-length. In this particular example, however, the large area reduction is due to very large number of common sub-

<table>
<thead>
<tr>
<th>fsm</th>
<th>jedi</th>
<th>hot</th>
<th>ind</th>
<th>real</th>
<th>fsm</th>
<th>jedi</th>
<th>hot</th>
<th>ind</th>
<th>real</th>
</tr>
</thead>
<tbody>
<tr>
<td>a02</td>
<td>22</td>
<td>32</td>
<td>−</td>
<td>45%</td>
<td>opus2</td>
<td>26</td>
<td>21</td>
<td>−</td>
<td>−19%</td>
</tr>
<tr>
<td>a03</td>
<td>15</td>
<td>16</td>
<td>?</td>
<td>7%</td>
<td>patgen</td>
<td>29</td>
<td>36</td>
<td>−</td>
<td>24%</td>
</tr>
<tr>
<td>a04</td>
<td>43</td>
<td>53</td>
<td>+</td>
<td>23%</td>
<td>patrec</td>
<td>49</td>
<td>46</td>
<td>−</td>
<td>−6%</td>
</tr>
<tr>
<td>a05</td>
<td>11</td>
<td>16</td>
<td>+</td>
<td>45%</td>
<td>percent</td>
<td>15</td>
<td>20</td>
<td>+</td>
<td>33%</td>
</tr>
<tr>
<td>bbaza</td>
<td>15</td>
<td>20</td>
<td>+</td>
<td>33%</td>
<td>planet</td>
<td>119</td>
<td>88</td>
<td>−</td>
<td>−26%</td>
</tr>
<tr>
<td>bssse</td>
<td>29</td>
<td>31</td>
<td>?</td>
<td>7%</td>
<td>pma</td>
<td>43</td>
<td>47</td>
<td>+</td>
<td>9%</td>
</tr>
<tr>
<td>bbtas</td>
<td>5</td>
<td>8</td>
<td>+</td>
<td>60%</td>
<td>s1</td>
<td>33</td>
<td>94</td>
<td>?</td>
<td>185%</td>
</tr>
<tr>
<td>becnt</td>
<td>9</td>
<td>13</td>
<td>+</td>
<td>44%</td>
<td>s1488</td>
<td>164</td>
<td>119</td>
<td>−</td>
<td>−27%</td>
</tr>
<tr>
<td>coffee</td>
<td>23</td>
<td>20</td>
<td>−</td>
<td>−13%</td>
<td>s1494</td>
<td>165</td>
<td>120</td>
<td>−</td>
<td>−27%</td>
</tr>
<tr>
<td>cse</td>
<td>53</td>
<td>56</td>
<td>?</td>
<td>6%</td>
<td>s208</td>
<td>20</td>
<td>25</td>
<td>+</td>
<td>25%</td>
</tr>
<tr>
<td>dk14</td>
<td>21</td>
<td>29</td>
<td>+</td>
<td>38%</td>
<td>s27</td>
<td>4</td>
<td>15</td>
<td>+</td>
<td>275%</td>
</tr>
<tr>
<td>dk15</td>
<td>7</td>
<td>15</td>
<td>+</td>
<td>114%</td>
<td>s386</td>
<td>41</td>
<td>29</td>
<td>−</td>
<td>−29%</td>
</tr>
<tr>
<td>dk17</td>
<td>6</td>
<td>16</td>
<td>+</td>
<td>167%</td>
<td>s420</td>
<td>19</td>
<td>26</td>
<td>+</td>
<td>37%</td>
</tr>
<tr>
<td>dk27</td>
<td>5</td>
<td>8</td>
<td>+</td>
<td>60%</td>
<td>s510</td>
<td>64</td>
<td>67</td>
<td>−</td>
<td>5%</td>
</tr>
<tr>
<td>dk312</td>
<td>7</td>
<td>19</td>
<td>+</td>
<td>171%</td>
<td>s8</td>
<td>5</td>
<td>8</td>
<td>+</td>
<td>60%</td>
</tr>
<tr>
<td>ex1</td>
<td>84</td>
<td>71</td>
<td>?</td>
<td>−15%</td>
<td>s820</td>
<td>114</td>
<td>65</td>
<td>−</td>
<td>−43%</td>
</tr>
<tr>
<td>ex4</td>
<td>15</td>
<td>20</td>
<td>?</td>
<td>33%</td>
<td>s812</td>
<td>107</td>
<td>71</td>
<td>−</td>
<td>−34%</td>
</tr>
<tr>
<td>ex5</td>
<td>11</td>
<td>15</td>
<td>+</td>
<td>36%</td>
<td>sand</td>
<td>191</td>
<td>174</td>
<td>?</td>
<td>−9%</td>
</tr>
<tr>
<td>ex6</td>
<td>24</td>
<td>25</td>
<td>+</td>
<td>4%</td>
<td>shiftreg</td>
<td>4</td>
<td>9</td>
<td>+</td>
<td>125%</td>
</tr>
<tr>
<td>example</td>
<td>9</td>
<td>11</td>
<td>?</td>
<td>22%</td>
<td>sse</td>
<td>29</td>
<td>31</td>
<td>?</td>
<td>7%</td>
</tr>
<tr>
<td>keyb</td>
<td>50</td>
<td>67</td>
<td>+</td>
<td>34%</td>
<td>tav</td>
<td>7</td>
<td>4</td>
<td>−</td>
<td>−43%</td>
</tr>
<tr>
<td>lion</td>
<td>3</td>
<td>5</td>
<td>+</td>
<td>67%</td>
<td>tbk</td>
<td>58</td>
<td>207</td>
<td>+</td>
<td>257%</td>
</tr>
<tr>
<td>lion9</td>
<td>5</td>
<td>10</td>
<td>+</td>
<td>100%</td>
<td>tma</td>
<td>21</td>
<td>32</td>
<td>+</td>
<td>52%</td>
</tr>
<tr>
<td>mark1</td>
<td>22</td>
<td>25</td>
<td>−</td>
<td>14%</td>
<td>train11</td>
<td>6</td>
<td>13</td>
<td>+</td>
<td>117%</td>
</tr>
<tr>
<td>mc</td>
<td>7</td>
<td>7</td>
<td>?</td>
<td>0%</td>
<td>vtiidec</td>
<td>53</td>
<td>56</td>
<td>−</td>
<td>6%</td>
</tr>
<tr>
<td>opus</td>
<td>23</td>
<td>18</td>
<td>−</td>
<td>−22%</td>
<td>vtiuar</td>
<td>71</td>
<td>66</td>
<td>−</td>
<td>−7%</td>
</tr>
</tbody>
</table>

Table 6.1. Realization cost of one-hot vs min-length encoding
functions in the realization of min-length encoding. This type of optimization is naturally very difficult to accurately predict with our simple, but fast, estimation method.

Our assignment cost prediction method is very fast and accurate enough to be used in practical settings. Moreover, in the uncertain cases, we can always check both assignment methods.

6.2  SECODE

In this section, we will discuss the results of experiments with our information-driven state assignment method described in Chapter 5 and implemented in the software tool SECODE. First, we will present the results for the set of standard IWLS benchmarks, followed by the discussion of results for a large set of FSMs generated with benchmark generation tool BENG that we developed. BENG enables us to generate diverse FSMs having characteristics typical to circuits encountered in various industrial applications.

6.2.1 Standard benchmarks

For comparison of the encoding effectiveness, we used JEDI state assignment tool, as in our experiments it consistently produced better results than other available tools, such as NOVA or MUSTANG. Again, as in the case of one-hot experiments described in the previous section, we used IRMA2FPGA for combinational synthesis. The results of synthesis of the IWLS FSMs encoded with SECODE and JEDI are presented in Tab. 6.2. For each method, we give the number of 5-LUTs in the realization (column a) and the depth of the circuit (column d). Additionally, the percentage of difference with JEDI in area and depth is given in columns a% and d%, respectively.

<table>
<thead>
<tr>
<th>FSM</th>
<th>JEDI</th>
<th>SECODE</th>
<th>SECODE'</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>a</td>
<td>d</td>
<td>a%</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>a02</td>
<td>22</td>
<td>3</td>
<td>20</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-9</td>
</tr>
<tr>
<td>a03</td>
<td>15</td>
<td>2</td>
<td>14</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-7</td>
</tr>
<tr>
<td>a04</td>
<td>43</td>
<td>3</td>
<td>44</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>2</td>
</tr>
<tr>
<td>a05</td>
<td>11</td>
<td>2</td>
<td>8</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-27</td>
</tr>
<tr>
<td>bbara</td>
<td>15</td>
<td>2</td>
<td>12</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-20</td>
</tr>
<tr>
<td>bbse</td>
<td>29</td>
<td>3</td>
<td>26</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-10</td>
</tr>
<tr>
<td>bbtas</td>
<td>5</td>
<td>1</td>
<td>5</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0</td>
</tr>
<tr>
<td>beecn</td>
<td>9</td>
<td>2</td>
<td>6</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-33</td>
</tr>
<tr>
<td>coffee</td>
<td>23</td>
<td>2</td>
<td>18</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-22</td>
</tr>
<tr>
<td>cse</td>
<td>53</td>
<td>3</td>
<td>49</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-8</td>
</tr>
<tr>
<td>dk14</td>
<td>21</td>
<td>2</td>
<td>18</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-14</td>
</tr>
<tr>
<td>dk15</td>
<td>7</td>
<td>1</td>
<td>7</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0</td>
</tr>
<tr>
<td>dk17</td>
<td>6</td>
<td>1</td>
<td>6</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0</td>
</tr>
<tr>
<td>dk27</td>
<td>5</td>
<td>1</td>
<td>5</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0</td>
</tr>
<tr>
<td>dk512</td>
<td>7</td>
<td>1</td>
<td>7</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>0</td>
</tr>
<tr>
<td>ex1</td>
<td>84</td>
<td>3</td>
<td>50</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-40</td>
</tr>
<tr>
<td>ex4</td>
<td>15</td>
<td>2</td>
<td>14</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-7</td>
</tr>
<tr>
<td>ex5</td>
<td>11</td>
<td>2</td>
<td>7</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-36</td>
</tr>
<tr>
<td>ex6</td>
<td>24</td>
<td>2</td>
<td>21</td>
</tr>
<tr>
<td></td>
<td></td>
<td></td>
<td>-13</td>
</tr>
</tbody>
</table>
As we can see in the table, in 65% of cases, SECODE manages to find encodings that result in smaller circuits than JEDI and for further 20% the results are the same. The identical results occur in most cases for the FSMs so small that no further improvement is possible. In some non-trivial examples, such as ex1, s386, s820, s832 or vtiuar, the improvement of area reaches and exceeds one-third. Crucially, the area improvement does not come at a cost of delay increase. In fact, in the case of s832 and vtiuar the 35% area improvement is accompanied by 50% delay improvement. The global improvement (summary over all machines) delivered by SECODE is equal to 8% for area and 10% for delay. On average (average percentage improvement per machine), the area of the implementations of FSMs encoded with SECODE is 14% smaller and the delay 9% smaller than of those encoded with JEDI.

Careful consideration of the results indicates that the global improvement is adversely...
affected by just a few relatively large FSMS (such as planet, s1488 or s1494), for which SECODE encoding is worse than JEDI. Further analysis of the results reveals that these are the machines that have a good one-hot encoding. It turns out that SECODE performs encoding that heads towards one-hot, however, in the process it is discouraged by the rapidly growing encoding length and applies more aggressive merging strategy that limits the further growth of the number of state bits. The result is the encoding that offers neither the simplicity of functions in one-hot encoding, nor the small number of functions of near-minimum-length encoding.

While we could change the heuristic to account for this effect, this was not necessary, as the functions with good one-hot realizations can be efficiently detected by the procedure described in the previous section. In the experiments reported in the third column of Tab. 6.2 (secode), we integrated the one-hot analysis with the SECODE tool. If the analysis indicated strong preference for one-hot realization (‘−’ in Tab. 6.1), one-hot encoding was performed. Otherwise, the FSM was encoded with the general SECODE procedure. As discussed in Section 6.1, in some cases the one-hot analysis method incorrectly identifies good one-hot candidates. For instance, for a02, it indicated good one-hot realization, which is fact is 45% worse in terms of area. However, on the majority of benchmarks, the

<table>
<thead>
<tr>
<th>fsm</th>
<th>JEDI</th>
<th>bin</th>
<th>Gray</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>a</td>
<td>d</td>
<td>a%</td>
</tr>
<tr>
<td>a02</td>
<td>22</td>
<td>3</td>
<td>0</td>
</tr>
<tr>
<td>a03</td>
<td>15</td>
<td>2</td>
<td>0</td>
</tr>
<tr>
<td>a04</td>
<td>43</td>
<td>3</td>
<td>16</td>
</tr>
<tr>
<td>a05</td>
<td>11</td>
<td>2</td>
<td>18</td>
</tr>
<tr>
<td>bbba</td>
<td>15</td>
<td>2</td>
<td>0</td>
</tr>
<tr>
<td>bbseau</td>
<td>29</td>
<td>3</td>
<td>34</td>
</tr>
<tr>
<td>bbtaa</td>
<td>5</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>beecnnt</td>
<td>9</td>
<td>2</td>
<td>11</td>
</tr>
<tr>
<td>ccoffee</td>
<td>23</td>
<td>2</td>
<td>-9</td>
</tr>
<tr>
<td>cse</td>
<td>53</td>
<td>3</td>
<td>-9</td>
</tr>
<tr>
<td>dk14</td>
<td>21</td>
<td>2</td>
<td>24</td>
</tr>
<tr>
<td>dk15</td>
<td>7</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>dk17</td>
<td>6</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>dk27</td>
<td>5</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>dk512</td>
<td>7</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>exi</td>
<td>84</td>
<td>3</td>
<td>-10</td>
</tr>
<tr>
<td>ex4</td>
<td>15</td>
<td>2</td>
<td>7</td>
</tr>
<tr>
<td>ex5</td>
<td>11</td>
<td>2</td>
<td>-9</td>
</tr>
<tr>
<td>ex6</td>
<td>24</td>
<td>2</td>
<td>21</td>
</tr>
<tr>
<td>example</td>
<td>9</td>
<td>2</td>
<td>11</td>
</tr>
<tr>
<td>keyb</td>
<td>50</td>
<td>4</td>
<td>25</td>
</tr>
<tr>
<td>lion</td>
<td>3</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>lion9</td>
<td>5</td>
<td>1</td>
<td>-20</td>
</tr>
<tr>
<td>marki</td>
<td>22</td>
<td>2</td>
<td>9</td>
</tr>
<tr>
<td>mc</td>
<td>7</td>
<td>1</td>
<td>0</td>
</tr>
<tr>
<td>opus</td>
<td>23</td>
<td>2</td>
<td>30</td>
</tr>
<tr>
<td>opus2</td>
<td>26</td>
<td>3</td>
<td>-4</td>
</tr>
<tr>
<td>patgen</td>
<td>29</td>
<td>3</td>
<td>72</td>
</tr>
<tr>
<td>patrec</td>
<td>49</td>
<td>3</td>
<td>33</td>
</tr>
</tbody>
</table>
indications of the one-hot analysis method are correct. Therefore, it complements very well the general method and the combined procedures achieve global 16% area reduction and 17% delay reduction on the IWLS benchmarks.

In the next series of experiments reported in Tab. 6.3 we evaluated two encoding strategies prevalent in commercial synthesis tools: sequential binary encoding that assigns to states successive minimum length binary vectors representing natural numbers, and Gray encoding – another min-length encoding that assigns to subsequent states codes that differ on exactly one bit. Together with the earlier discussed one-hot encoding, these two approaches give the comparison of our method to the state assignment methods currently used in industrial synthesis tools.

As we can see from the results, sequential and Gray encodings are clearly inferior to all other presented methods. On average, sequential encoding results in circuits that are 17% larger and 13% slower than those encoded with JEDI, and over 40% larger and slower than encoded with our method. For Gray, the respective numbers are: 14% area increase w.r.t. JEDI and 46% w.r.t. SECODE, and, respectively, 10% and 35% delay increase. This result is not surprising, as both encodings, although having some specific patterns, are in fact random encodings for a particular FSM and can hardly be expected to produce consistently good results. Such encoding strategies are clearly not able to identify any optimizations of the encoded machine that the constructive methods such as JEDI and SECODE realize. Being minimum-length encodings they also do not benefit from the simplified output function form that in some cases favors one-hot encoding.

Table 6.3. Comparison JEDI with binary encoding
6.2.2 Interconnections

The results presented in the previous section evaluate the quality of implementation based on the number of blocks and the depth of the realization network for the encoded machine. While the number of blocks is a very good indicator of the area of the circuit (there is a direct correspondence with FPGA slices), the depth of the circuit does not necessarily reflect the actual delay of the circuit once it is implemented in FPGA. This is due to the fact that in many circuits implemented in the modern FPGA devices interconnections become the dominant contributor to the overall delay of the circuit, instead of the logic. This is caused, among others, by the delay introduced by switches that the longer, programmable interconnections have to pass through. This switch delay is comparable to the LUT delay.

For this reason we indicated in Section 5.2.1 that one of the targets of our encoding heuristic is minimization of interconnections in the circuit, and particularly — long interconnections. In this section we discuss the results of the comparison of interconnection complexity encountered in circuits encoded with SECODE and JEDI.

Table 6.4 shows the results of analysis of interconnection structure in the circuits

```
sweep
xl_part_coll -m -g 2
xl_coll_ck
xl_partition -m
simplify
xl_imp
xl_partition -t
xl_cover -e 30 -u 200
xl_coll_ck -k
xl_merge -u 8 -o doc -l
```

![Figure 6.1. SIS script](image)

<table>
<thead>
<tr>
<th>fsm</th>
<th>jedi+irma</th>
<th>secode+irma</th>
<th>jedi+sis</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>c</td>
<td>lc</td>
<td>c</td>
</tr>
<tr>
<td>a02</td>
<td>99</td>
<td>7</td>
<td>116</td>
</tr>
<tr>
<td>a03</td>
<td>69</td>
<td>0</td>
<td>66</td>
</tr>
<tr>
<td>a04</td>
<td>200</td>
<td>3</td>
<td>208</td>
</tr>
<tr>
<td>a05</td>
<td>52</td>
<td>0</td>
<td>41</td>
</tr>
<tr>
<td>bbara</td>
<td>73</td>
<td>0</td>
<td>57</td>
</tr>
<tr>
<td>bbase</td>
<td>139</td>
<td>0</td>
<td>126</td>
</tr>
<tr>
<td>bbtas</td>
<td>30</td>
<td>0</td>
<td>26</td>
</tr>
<tr>
<td>beecnt</td>
<td>42</td>
<td>0</td>
<td>36</td>
</tr>
<tr>
<td>coffee</td>
<td>110</td>
<td>0</td>
<td>100</td>
</tr>
<tr>
<td>cse</td>
<td>252</td>
<td>4</td>
<td>231</td>
</tr>
<tr>
<td>dk14</td>
<td>94</td>
<td>0</td>
<td>86</td>
</tr>
<tr>
<td>dk15</td>
<td>42</td>
<td>0</td>
<td>42</td>
</tr>
<tr>
<td>dk17</td>
<td>36</td>
<td>0</td>
<td>36</td>
</tr>
<tr>
<td>dk27</td>
<td>25</td>
<td>0</td>
<td>25</td>
</tr>
<tr>
<td>dk512</td>
<td>42</td>
<td>0</td>
<td>42</td>
</tr>
</tbody>
</table>
### Table 6.4. Interconnection complexity for different encoding methods

<table>
<thead>
<tr>
<th>ex1</th>
<th>382</th>
<th>10</th>
<th>234</th>
<th>0</th>
<th>526</th>
<th>21</th>
</tr>
</thead>
<tbody>
<tr>
<td>ex4</td>
<td>78</td>
<td>0</td>
<td>76</td>
<td>0</td>
<td>104</td>
<td>1</td>
</tr>
<tr>
<td>ex5</td>
<td>52</td>
<td>0</td>
<td>34</td>
<td>0</td>
<td>66</td>
<td>0</td>
</tr>
<tr>
<td>ex6</td>
<td>117</td>
<td>0</td>
<td>95</td>
<td>0</td>
<td>146</td>
<td>11</td>
</tr>
<tr>
<td>example</td>
<td>39</td>
<td>0</td>
<td>31</td>
<td>0</td>
<td>80</td>
<td>0</td>
</tr>
<tr>
<td>keyb</td>
<td>214</td>
<td>16</td>
<td>180</td>
<td>15</td>
<td>274</td>
<td>19</td>
</tr>
<tr>
<td>lion</td>
<td>15</td>
<td>0</td>
<td>15</td>
<td>0</td>
<td>14</td>
<td>0</td>
</tr>
<tr>
<td>lion9</td>
<td>20</td>
<td>0</td>
<td>18</td>
<td>0</td>
<td>49</td>
<td>3</td>
</tr>
<tr>
<td>mark1</td>
<td>93</td>
<td>0</td>
<td>117</td>
<td>0</td>
<td>113</td>
<td>0</td>
</tr>
<tr>
<td>mc</td>
<td>42</td>
<td>0</td>
<td>42</td>
<td>0</td>
<td>24</td>
<td>0</td>
</tr>
<tr>
<td>opus</td>
<td>105</td>
<td>0</td>
<td>94</td>
<td>0</td>
<td>118</td>
<td>0</td>
</tr>
<tr>
<td>opus2</td>
<td>104</td>
<td>8</td>
<td>99</td>
<td>0</td>
<td>159</td>
<td>5</td>
</tr>
<tr>
<td>patgen</td>
<td>139</td>
<td>1</td>
<td>166</td>
<td>0</td>
<td>168</td>
<td>7</td>
</tr>
<tr>
<td>patrec</td>
<td>236</td>
<td>6</td>
<td>229</td>
<td>0</td>
<td>306</td>
<td>15</td>
</tr>
<tr>
<td>planet</td>
<td>73</td>
<td>0</td>
<td>37</td>
<td>0</td>
<td>96</td>
<td>4</td>
</tr>
<tr>
<td>pma</td>
<td>533</td>
<td>29</td>
<td>414</td>
<td>6</td>
<td>843</td>
<td>60</td>
</tr>
<tr>
<td>opus</td>
<td>204</td>
<td>5</td>
<td>170</td>
<td>1</td>
<td>702</td>
<td>63</td>
</tr>
<tr>
<td>si</td>
<td>138</td>
<td>14</td>
<td>123</td>
<td>4</td>
<td>193</td>
<td>9</td>
</tr>
<tr>
<td>s1488</td>
<td>734</td>
<td>42</td>
<td>511</td>
<td>10</td>
<td>909</td>
<td>80</td>
</tr>
<tr>
<td>s1494</td>
<td>731</td>
<td>46</td>
<td>512</td>
<td>14</td>
<td>820</td>
<td>62</td>
</tr>
<tr>
<td>s258</td>
<td>86</td>
<td>1</td>
<td>72</td>
<td>0</td>
<td>118</td>
<td>4</td>
</tr>
<tr>
<td>s27</td>
<td>19</td>
<td>0</td>
<td>19</td>
<td>0</td>
<td>29</td>
<td>0</td>
</tr>
<tr>
<td>s386</td>
<td>175</td>
<td>7</td>
<td>133</td>
<td>0</td>
<td>184</td>
<td>7</td>
</tr>
<tr>
<td>s420</td>
<td>89</td>
<td>0</td>
<td>75</td>
<td>2</td>
<td>140</td>
<td>7</td>
</tr>
<tr>
<td>s510</td>
<td>273</td>
<td>18</td>
<td>271</td>
<td>2</td>
<td>345</td>
<td>23</td>
</tr>
<tr>
<td>s8</td>
<td>21</td>
<td>0</td>
<td>5</td>
<td>0</td>
<td>45</td>
<td>13</td>
</tr>
<tr>
<td>s820</td>
<td>531</td>
<td>23</td>
<td>297</td>
<td>2</td>
<td>463</td>
<td>16</td>
</tr>
<tr>
<td>s832</td>
<td>482</td>
<td>35</td>
<td>320</td>
<td>2</td>
<td>483</td>
<td>20</td>
</tr>
<tr>
<td>sand</td>
<td>752</td>
<td>137</td>
<td>820</td>
<td>110</td>
<td>692</td>
<td>49</td>
</tr>
<tr>
<td>shiftreg</td>
<td>20</td>
<td>0</td>
<td>20</td>
<td>0</td>
<td>10</td>
<td>0</td>
</tr>
<tr>
<td>sse</td>
<td>139</td>
<td>0</td>
<td>126</td>
<td>0</td>
<td>183</td>
<td>4</td>
</tr>
<tr>
<td>tav</td>
<td>29</td>
<td>0</td>
<td>32</td>
<td>0</td>
<td>43</td>
<td>0</td>
</tr>
<tr>
<td>tbatk</td>
<td>108</td>
<td>0</td>
<td>107</td>
<td>0</td>
<td>324</td>
<td>14</td>
</tr>
<tr>
<td>train11</td>
<td>25</td>
<td>0</td>
<td>15</td>
<td>0</td>
<td>52</td>
<td>0</td>
</tr>
<tr>
<td>vtiidc</td>
<td>250</td>
<td>10</td>
<td>268</td>
<td>5</td>
<td>358</td>
<td>3</td>
</tr>
<tr>
<td>vtiuar</td>
<td>347</td>
<td>8</td>
<td>320</td>
<td>8</td>
<td>442</td>
<td>13</td>
</tr>
</tbody>
</table>

| Σ     | 8971 | 447 | 7611 | 204 | 11790 | 656 |
| j+iΔ% | 15   | -54 | 31   | 47  |
| s+iΔ% | 55   | 222 |

encoded and synthesized with different encoding and synthesis methods. For each encoding/synthesis combination we give the number of resulting short and long interconnections between LUTs in the network (columns c and lc, respectively). For the purposes of this comparison, we define a connection as a single path from an output of a LUT to an input of another LUT. To evaluate the length of connections, we adopted a simplifying assumption that a short interconnection is a connection between two LUTs on neighboring levels of the network (produced output is consumed directly on the next level), while
6.2. SECODE

6.2. SECODE

a long interconnection has to travel over more than one level. In the comparison we used IRMA2FPGA and the popular division-based combinational synthesis tool SIS [71]. The SIS script used for FPGA synthesis was the “good” script proposed in [71], and repeated in Figure 6.1.

The results indicate that our encoding method indeed reduces number of interconnections, in particular the long ones. As we can see in the table, number of short interconnections is 15% smaller in SECODE-encoded circuits than in circuits encoded by JEDI, when both circuits are synthesized with IRMA2FPGA. For long interconnections, this reduction increases to 54%.

At the same time, we can see that the overall number of long interconnections resulting from synthesis with IRMA2FPGA is very small. This is an expected result, as interconnection reduction is one of the fundamental targets of the functional decomposition method implemented in IRMA2FPGA. Comparing the results of synthesis of the same JEDI encoding with IRMA2FPGA versus SIS, we can see that the synthesis method implemented in SIS introduced 31% more short interconnections, and 47% more long interconnections.

The combination of interconnection reduction introduced by the whole decomposition-based synthesis chain (SECODE + IRMA2FPGA) is clearly visible in comparison with the results of the division-based synthesis chain (JEDI + SIS). The achieved reduction reaches 55% percent for short and as much as 222% for long interconnections. This result is the testimony to the effectiveness, with which the information-driven, decomposition-based sequential synthesis chain addresses interconnections - the crucial aspect of modern digital circuits.

6.2.3 Layout results

In the previous sections we showed that SECODE generates encodings that result in circuits with smaller number of LUTs, smaller depth and less interconnections (particularly, long interconnections) in comparison with JEDI. However, to conclusively evaluate the quality of the synthesized circuits, we need to consider the area and the delay of the circuits after placement and routing on the target FPGA device. Only then are the final placement of the logic blocks and the resulting interconnections between them fully known.

To perform this comparison, we generated layouts for the finite state machines from the IWLS benchmark, encoded with JEDI and SECODE and synthesized with IRMA2FPGA.

<table>
<thead>
<tr>
<th>fsm</th>
<th>JEDI</th>
<th>SECODE</th>
<th>SECODE’</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>a</td>
<td>d</td>
<td>a%</td>
</tr>
<tr>
<td>a02</td>
<td>18</td>
<td>4</td>
<td>16</td>
</tr>
<tr>
<td>a05</td>
<td>10</td>
<td>3.1</td>
<td>10</td>
</tr>
<tr>
<td>a04</td>
<td>34</td>
<td>5.8</td>
<td>39</td>
</tr>
<tr>
<td>a05</td>
<td>8</td>
<td>3.0</td>
<td>7</td>
</tr>
<tr>
<td>bbara</td>
<td>12</td>
<td>4</td>
<td>9</td>
</tr>
<tr>
<td>bbse</td>
<td>23</td>
<td>5</td>
<td>22</td>
</tr>
<tr>
<td>bbta</td>
<td>5</td>
<td>3.2</td>
<td>4</td>
</tr>
<tr>
<td>beecnt</td>
<td>6</td>
<td>2</td>
<td>6</td>
</tr>
<tr>
<td>coffee</td>
<td>18</td>
<td>5</td>
<td>14</td>
</tr>
<tr>
<td>cse</td>
<td>46</td>
<td>6.4</td>
<td>42</td>
</tr>
</tbody>
</table>
The output networks of 5-LUTs synthesized by IRMA2FPGA were converted to the networks of FPGA primitives in EDIF format and fed to Xilinx ISE 5.2 synthesis system for placement and routing on the VirtexII FPGA (device 2V1000FG256-4). The resulting...
area of the circuit (in slices) and the clock period (in nanoseconds) are given in Tab. 6.5. As we can see, the layout results confirm the improvements achieved using SECODE encoding instead of JEDI. The average improvements of 15% and 14% for area and delay remained after layout essentially the same at 14% and 18%. Of course, calculation of improvement as an average of per-machine improvements increases the influence of smaller machines, which form the majority of IWLS benchmark. In the average, reduction of 1 block in a 10-block machine will weigh the same as reduction of 10 blocks in a 100-block machine (both represent 10% improvement). Meanwhile, we can see that for many of the small machines no further reduction is possible. On the other hand, SECODE manages to achieve significant improvements primarily for larger machines, where the optimization potential is the greatest. These improvements are more clearly visible in the summary numbers over all machines, where each machines weighs according to its size. In terms of summary area and delay, the improvement introduced by SECODE reached 23% and 25%, respectively.

We can also observe that after layout the summary improvements have increased both for area and for delay when compared to the results after synthesis only. The additional area improvements stem from the difference between a 5-input LUT and a slice. While a slice (see Section 1.2.2) is usually treated as a single 5-input LUT, in fact it is composed of two 4-input LUTs, which can be configured to use separately. As we can see in Tab. 6.5, using number of slices as an area measure, SECODE outperforms JEDI by 23%, compared to 16% improvement in terms of 5-LUTs (Tab. 6.2). This indicates that the function blocks in synthesized JEDI circuits often use all 5 inputs, so they use a whole slice to implement. Meanwhile, blocks in SECODE circuits are not only fewer, but also smaller and use more often functions of 4 and less inputs. This enables implementation of more than one function in a single slice, and constitutes an additional advantage of SECODE.

We can also see that the summary delay improvement of 17% that SECODE achieves versus JEDI after logic synthesis, improves to 25% after layout. This can be contributed to the fact that the SECODE-encoded circuits have not only less levels, but also less interconnections.

### 6.2.4 Comparison of synthesis chains

In the previous sections we focused on the comparison of various state assignment techniques — the main subject of this thesis. To obtain a full picture of the synthesis, we performed additional evaluation of the complete synthesis chains — from FSM transition table to mapped LUT network.

As we indicated in Section 5.1, good cooperation between state assignment and combinational synthesis is essential to obtaining good quality results. Both steps of the synthesis trajectory need to have similar objectives and follow compatible heuristics. Otherwise, state assignment may produce functions difficult for synthesis and the combinational synthesis may not fully realize optimization potential introduced by the state assignment. Therefore, in this section we present the results of the experiments with complete synthesis chains, i.e. state assignment and combinational synthesis methods implemented in the same packages or following the same heuristics.

In Table 6.6 we present the results of comparison of four synthesis chains: information-driven, decomposition-based flow represented by SECODE and IRMA2FPGA; division-based flow represented by JEDI and SIS; and two popular commercial FPGA synthesis
tools, denoted A and B. In all cases, the input to the flow was transition table of the FSM expressed in KISS format or equivalent Verilog description (for commercial tools). All four synthesis chains were then used to perform state assignment (commercial tools were setup to automatically determine the best state assignment) and synthesize and map the network. Synthesis setup was as follows: for SIS we used script described in Figure 6.1; since the main target objective of IRMA2FPGA is the delay reduction, for commercial tools we used the highest available effort settings for speed-driven optimization and the target clock speed of 1 GHz. The resulting networks were placed and routed on VirtexII-1000-fg256-4 device with Xilinx ISE 5.2. The results are reported as the number of slices (column sl) and clock period (column clk) in the resulting circuit.

As we can see, the uniform, information-driven sequential synthesis chain significantly outperforms all other synthesis flows. In particular, division-based approach produced circuits in total 60% larger and slower than our methods. The comparison is more favorable for the evaluated commercial methods. Tool A produced circuits 15% larger and 7% slower than the decompositional methods. Tool B managed to produce circuits fraction faster than our methods, but at the expense of 50% area increase.

We should note, however, that the actual performance of state assignment and logic synthesis alone are difficult to evaluate in case of commercial tools. Their closed architecture does not allow to precisely determine what steps are performed to produce the final network. These tools utilize additional optimization steps at the network level, such as retiming, and the intimate knowledge of the target architecture, to produce best circuits. In particular, they are able to produce networks composed not only of LUTs, but also of specific dedicated resources available at the target FPGA device (e.g. additional logic gates, carry chains, special fast local interconnections, etc). In comparison, IRMA2FPGA, or SIS produce very simple networks composed of basic building blocks alone. It is very likely that if our synthesis methods were better integrated in the mapping process and thus were able to additionally benefit from device-dependent optimizations performed by commercial tools, the results would significantly improve.

<table>
<thead>
<tr>
<th></th>
<th>secode+irma</th>
<th>jedi+sis</th>
<th>A</th>
<th>B</th>
</tr>
</thead>
<tbody>
<tr>
<td>sl</td>
<td>clk</td>
<td>sl</td>
<td>clk</td>
<td>sl</td>
</tr>
<tr>
<td>a02</td>
<td>25 4,274</td>
<td>37 6,914</td>
<td>42 5,414</td>
<td>45 5,394</td>
</tr>
<tr>
<td>a03</td>
<td>10 2,818</td>
<td>23 8,255</td>
<td>12 3,138</td>
<td>32 4,997</td>
</tr>
<tr>
<td>a04</td>
<td>39 5,668</td>
<td>69 11,076</td>
<td>56 7,791</td>
<td>42 4,248</td>
</tr>
<tr>
<td>a05</td>
<td>7 3,525</td>
<td>13 4,932</td>
<td>21 3,549</td>
<td>19 3,129</td>
</tr>
<tr>
<td>bhara</td>
<td>9 3,569</td>
<td>14 5,798</td>
<td>8 3,285</td>
<td>13 4,136</td>
</tr>
<tr>
<td>bbsse</td>
<td>22 3,971</td>
<td>30 7,06</td>
<td>25 4,362</td>
<td>19 4,07</td>
</tr>
<tr>
<td>bbtas</td>
<td>4 3,201</td>
<td>4 3,207</td>
<td>3 1,93</td>
<td>5 2,146</td>
</tr>
<tr>
<td>beecnt</td>
<td>6 2,647</td>
<td>7 3,957</td>
<td>10 3,227</td>
<td>12 3,397</td>
</tr>
<tr>
<td>coffee</td>
<td>13 4,111</td>
<td>19 6,543</td>
<td>12 4,285</td>
<td>11 3,653</td>
</tr>
<tr>
<td>cse</td>
<td>42 6,305</td>
<td>47 7,97</td>
<td>41 5,815</td>
<td>34 4,21</td>
</tr>
<tr>
<td>dk14</td>
<td>12 3,361</td>
<td>19 3,67</td>
<td>24 3,856</td>
<td>25 3,961</td>
</tr>
<tr>
<td>dk15</td>
<td>7 2,633</td>
<td>7 3,254</td>
<td>7 2,389</td>
<td>17 3,801</td>
</tr>
<tr>
<td>dk17</td>
<td>6 3,215</td>
<td>6 2,674</td>
<td>10 3,333</td>
<td>12 3,399</td>
</tr>
<tr>
<td>dk27</td>
<td>3 1,967</td>
<td>3 2,423</td>
<td>3 2,184</td>
<td>4 2,815</td>
</tr>
<tr>
<td>dk512</td>
<td>7 3,215</td>
<td>6 3,207</td>
<td>7 2,593</td>
<td>9 2,852</td>
</tr>
</tbody>
</table>
6.3 Generated benchmarks

In the previous sections, we performed various comparisons of sequential and combinational synthesis methods on the set of standard IWLS benchmarks. To further evaluate
the effectiveness of our encoding method on a larger set of benchmarks with various characteristics, we developed together with Lech Jóźwiak and Dominik Gawłowski the benchmark generation software \textsc{BEN\textsc{EN}}. With this tool, we generated 350 FSMs exhibiting characteristics typical to the circuits encountered in various industrial applications, and compared the different encoding results for these machines.

In the following, we will shortly introduce the benchmark generator \textsc{BEN\textsc{EN}}. The more detailed discussion of the tool can be found in [42]. Further, we will discuss the characteristics of the generated benchmarks and analyze the results of encoding of the machines with various encoding methods.

\section*{6.3.1 \textsc{BEN\textsc{EN}} benchmark generator}

The benchmark generator was developed in response to the shortage of typical industrial benchmarks. Such benchmarks are necessary to evaluate the effectiveness of the state assignment method on relevant machines commonly encountered in real-life systems. Furthermore, to assess robustness of a method, it is necessary to have access to a large number of benchmarks. Unfortunately for the EDA research community, such benchmarks are guarded by the designers of the systems, as well as by the vendors of EDA tools that usually have access to the circuit libraries of their clients.

To address this issue, we developed the benchmark generator \textsc{BEN\textsc{EN}} that enables us to generate large sets of FSMs with various characteristics. These include:

- FSMs with different number of states and various transition patterns between the states (e.g. chains of states with forward and/or backward transitions, loops, conditional “case” structures etc. and their combinations);

- FSMs with different numbers of inputs and outputs, and different proportions between the next-state and output logic (state-dominated, balanced or output-dominated), as well as, between the primary-input and state-input (input-dominated, balanced, state-dominated), and their mixtures;

- FSMs with various dependence of particular transitions and output variables on the number of inputs and input conditions;

- completely, incompletely and weakly specified FSMs.

This also includes FSMs representative to various typical industrial application areas, as for instance, having typical structure of controllers from various application areas, or representing various sequential data-path circuits (e.g. counters).

\textsc{BEN\textsc{EN}} enables us to efficiently construct FSMs, but also to modify the constructed or industrial FSMs, and to very precisely "fine-tune" the benchmarks. This last feature is extremely useful in sensitivity analysis of state assignment to the changes in the input data, i.e. small changes in the FSM characteristics.

The generation process is based on guided random generation of branches of an FSM. A branch is a series of states following one another, with possible backward transitions from next states back to the previous, and state self-loops indicating that FSM stays in the same state. By choosing the first and the last state of different loops, their length and patterns of backward- and self-transitions, arbitrary FSM state transition structures
can be constructed. The primary input values that trigger particular transitions are also generated according to user preferences.

**BenGen** has two work modes: batch and interactive mode. In the batch mode, the parameters of the FSM to generate are supplied in a script file. These parameters include: the number of inputs and outputs of the FSM; the number and length of the FSM’s branches; the characteristics of a branch, such as the number of backward transitions or loops; the number of inputs and outputs active for a given branch, etc. Most of these parameters can be specified in the form of a probability distribution to randomize their values in the specific instances of the generated FSMs. As a result, in the batch mode BenGen can be used to easily generate large sets of FSM benchmarks with certain characteristics using the same script file. The script file can be easily modified to generate a next batch of somewhat different FSMs.

The interactive mode of BenGen provides more control over the generation process. The user can interactively enter any of the parameters available in the batch mode. In addition to that, operations allowing modifications of single branches, or transitions are provided. This makes BenGen in interactive mode an ideal tool for fine-tuning of the generated or industrial FSMs for the specific characteristics, required for instance to check the behavior or sensitivity of a method or tool in relation to a certain aspect.

In both modes, BenGen takes away the burden of tedious specification of single transitions, or checking the consistency of the constructed FSM. Instead, the user can focus on specifying high-level characteristics of the machine. Given the global machine’s characteristics, BenGen generates the required number of state chains, with the requested number of states, and appropriate backward transitions between and self-loops in the states, if needed. Given the state-transition behaviour defined in abstract terms, BenGen generates the input conditions for particular transitions, etc. In this process not only does it consider user’s requirements concerning the active inputs for a given branch, but also ascertains that the machine is consistent, i.e. distinct transitions have disjoint input conditions and all possible input conditions are specified (for completely specified FSMs). The generator also determines the output values for the transitions, taking into account the outputs active for a given branch.

With the above characteristics, BenGen is a very useful tool enabling efficient generation of an arbitrary number of various sorts of well-characterized FSM benchmarks, with the minimum effort of the user. On the other hand, it also enables fine-grained control over the generation process and editing of the generated or industrial FSMs. Appropriately used, BenGen enables generation of FSMs representative to many practical applications. Examples of such machines are discussed in the following section.

### 6.3.2 Experimental results for generated benchmarks

To evaluate the effectiveness of the proposed state assignment method, we used BenGen to generate 350 FSMs that exhibit characteristics typical to FSMs encountered in various real-life industrial applications. We identified a number of typical schemes of sequential behavior and for each scheme generated a set of benchmarks of different sizes and with differing proportions of input, state and output logic. For instance, the sequential behavior schemes included: a single loop of states (typical for a counter or simple sequencer); a number of loops starting from a common initial state (typical for a controller realizing a few different control programs for different operation modes); a single loop with
sub-loops attached to states along the main loop (a main control "program" with "sub-
routines" - the subroutine loops may have their own sub-subroutine loops); and more
complicated cases of sequential behavior. Within each scheme, we varied proportions of
backward transitions and state self-loops within loops and branches generated.

The generated machines were encoded with four encoding methods: JEDI, SECODE,
one-hot and sequential binary encoding. The results of logic synthesis of the encoded
machines with IRMA2FPGA are presented in detail in Appendix A. For clarity of presen-
tation in this section, we categorized the generated machines according to three criteria:
the size, the proportion of the number of primary input bits to the number of state bits,
and the proportion the number of primary output bits to the number of state bits. The
size criterion divides FSMs into small (max. 8 states), medium (9 to 32 states) and large
(more than 32 states). The proportion of the number of primary input/output bits to state
bits categorizes the FSMs as input/output dominated if the number of input/output bits
is 50% larger than the number of state bits, state dominated if the number of state bits
is larger than the number of input/output bits and balanced otherwise. In the generated
benchmark, we made sure that each category is well represented, with some discrepan-
cies between the number of machines in each category consistent with the occurrence
frequency in real-life applications. Numbers of machines in each category are summa-
rized in Table 6.7, in the row marked with #.

The overall picture of efficiency of SECODE encoding is consistent with the results
achieved for standard IWLS benchmarks. Out of 350 generated benchmarks, SECODE
produced encodings resulting in smaller realization circuits than JEDI in 278 cases (80%),
153 of these with the improvement of over 10%. In 44 cases, the results for both encoding
methods were equal, and only for 28 machines (8%), JEDI outperformed SECODE. The
circuits encoded with SECODE achieved area results between the reduction of 40% and
15% area increase. Globally, the circuits encoded with SECODE were 7% smaller and 4%
faster than those encoded with JEDI.

<table>
<thead>
<tr>
<th>size</th>
<th>small</th>
<th>medium</th>
<th>large</th>
</tr>
</thead>
<tbody>
<tr>
<td>#</td>
<td>87</td>
<td>219</td>
<td>44</td>
</tr>
<tr>
<td>Δ</td>
<td>-10.3%</td>
<td>-9.2%</td>
<td>-4.4%</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>input</th>
<th>input dom.</th>
<th>balanced</th>
<th>state dom.</th>
</tr>
</thead>
<tbody>
<tr>
<td>#</td>
<td>105</td>
<td>156</td>
<td>89</td>
</tr>
<tr>
<td>Δ</td>
<td>-9.3%</td>
<td>-5.9%</td>
<td>-5.3%</td>
</tr>
</tbody>
</table>

<table>
<thead>
<tr>
<th>output</th>
<th>output dom.</th>
<th>balanced</th>
<th>state dom.</th>
</tr>
</thead>
<tbody>
<tr>
<td>#</td>
<td>125</td>
<td>149</td>
<td>76</td>
</tr>
<tr>
<td>Δ</td>
<td>-5.4%</td>
<td>-9.4%</td>
<td>-8.2%</td>
</tr>
</tbody>
</table>

Table 6.7. Summary results for generated benchmarks

Table 6.7 summarizes the comparison between the results achieved with SECODE and
JEDI encodings in each of the identified FSM categories — with respect to size, input and
output characteristics. For each category, in the row marked with Δ, two numbers indicate
the difference within the category between the size and the depth of the circuits for
FSMs encoded with JEDI and SECODE. The analysis of the results confirms that SECODE
is a stable method achieving good results regardless of the characteristics of the encoded
FSM. The overall area improvement of 7% fluctuated between the categories from 4.4%
6.3. GENERATED BENCHMARKS

However, SECODE still produced better results than JEDI in all categories. Also the delay improvement remained in all categories in the region of overall improvement of 4%, ranging from 2.6% to 6.1%.

In another comparison, we augmented our encoding approach with the method for identifying efficient one-hot encodings described in Section 5.5.1. The FSMs identified by the method as having potentially efficient one-hot encoding are marked in the table in Appendix A with “+” in the last column (l+). As we can see in the table, the method has correctly predicted efficiency of one-hot encoding in 301 out of 350 of the cases (86%). When the FSMs identified as having good one-hot encoding are encoded with one-hot instead of SECODE, the improvement of the area over JEDI grows from 7% for SECODE alone to 8% for the combined SECODE/one-hot. Unfortunately, this 1% of area reduction is accompanied by 1% delay increase — from 4% reduction to 3% reduction.

The relatively small improvement achieved by introduction of one-hot results is the consequence of the overall poor performance of one-hot encoding. On the generated benchmarks, only for 38 machines one-hot produced better circuit than SECODE. In total, one-hot encoding increased the area by over 9% when compared with JEDI and over 17% when compared with SECODE. The delay was also only marginally (under 1%) better when compared with JEDI, but clearly worse (4%) when compared with SECODE. These results confirm our observation that one-hot encoding is in general not effective for a wide variety of FSM types. As discussed earlier and confirmed by these results, only a small proportion of specific FSMs can benefit from simplified function form achieved in one-hot encoding sufficiently to offset the increased number of output functions and inputs to the functions introduced by this encoding.

Not surprisingly, even worse results were recorded for sequential binary encoding. Consistently with the results for IWLS benchmarks, binary encoding produced circuits 21% larger and 15% slower than JEDI and 30% larger and 20% slower than SECODE. An aspect that deserves a comment here is the difference between improvement achieved by our method with respect to essentially random binary encoding in the case of industrial IWLS benchmarks and the generated benchmarks (40% versus 30%). This discrepancy is the result of special characteristics and additional optimization opportunities encountered in real-life FSMs that are very difficult to emulate in partially random machines. However, the fact that the overall trend remains the same for IWLS and generated benchmarks supports our claim that the machines generated with BENGEn reflect well the characteristics of real-life industrial circuits.

Even assuming that for each machine full synthesis is performed for the one-hot and binary encodings and the better result is chosen (even though that would double the synthesis time), the combined methods are still inferior to SECODE. The combined results of the one-hot and binary encoding are on average 10% larger and 6% slower than the circuits resulting from SECODE encoding. In our experiments, next to the two above methods, we also considered Gray encoding — the minimum-length encoding, where the consecutive state codes differ on exactly one code bit. The results for this method (not reported here) show similar picture to the sequential binary encoding. This is consistent with our expectations, as Gray encoding is in fact also a random encoding method that cannot be expected to consistently deliver effective results with respect to area and speed of the synthesized circuits. It is, on the other hand, known to reduce the power consumed by the circuits, provided that the states are assigned consecutive Gray codes in correct order.
Summing up the above analysis, we conclude that our encoding method implemented in the prototype tool SECODE is more effective with respect to the area and speed of the resulting circuits than the most popular current academic (represented by JEDI) and industrial (represented by one-hot, binary and Gray) encoding methods.
Chapter 7

Conclusions and future work

In this thesis we have discussed the problems of decomposition and state encoding of Finite State Machines. In the introduction, we pointed out that FSMs are universally accepted as a functional description of sequential digital circuits. In the face of dynamic growth of the number and importance of digital circuit applications, the efficient synthesis of FSMs in hardware became a problem of primary practical importance. Within this domain, we are particularly interested in hardware realizations in the emerging FPGA-based reconfigurable platforms, such as reconfigurable System-on-Chip (SoC) platforms, and SoCs with embedded FPGAs. These platforms have recently gained much popularity due to their low costs for short and medium production series and inherent flexibility allowing virtually unlimited modifications to the already produced circuit or system. With the logic densities and speed of new generations of reconfigurable platforms growing rapidly, we expect that their challenge to hardwired ASICs will continue and gain momentum.

The main contributions of this work are as follows:

- Together with Jóźwiak, I formulated the final version and developed the precise proof of the General Decomposition Theorem of incompletely specified, non-deterministic FSMs with multi-state behavior realization, based on the initial version of this theorem and the outline of its proof proposed by Jóźwiak as an extension of his General Decomposition Theorem for completely specified FSMs [35]. This theorem states conditions for a legal decomposition of an FSM into a network of cooperating partial FSMs and this way defines the most general known generator of correct circuit structures (decompositions) for FSMs. It covers as special cases all other known decomposition structures, such as serial or parallel FSM decomposition, and the decomposition of discrete functions and relations. This way, a sound theoretical base was created for research in decomposition of FSMs and discrete relations.

- Based on the extended GDT, we expressed the state assignment of an FSM as a special case of the general decomposition and formulated the problem of state assignment in terms of finding and appropriately implementing of some specific collections of two-block set systems, or dichotomies, to define the corresponding partial machines and their interconnections. We have shown that the new formulation allows for more flexibility in state assignment than in the previously proposed
methods, permitting among others explicit consideration of information flows and their relationships and allowing for incompletely specified, overlapping codes. The formulation enables the explicit consideration of the interconnection structure, input supports of particular partial machines and sub-functions, as well as implicit behavior-preserving optimizations, such as state minimization during the state assignment process.

- Based on the new formulation of the conditions for a valid state assignment, we proposed a generic state assignment method. The method constructs a valid encoding by forming the encoding dichotomies as a combination of smaller “atomic” dichotomies. In this approach, very detailed analysis of information dependencies is possible, using simple and efficient means of input support minimization. Also, by merging atomic dichotomies any encoding can be constructed, including encodings with incompletely specified, overlapping state codes. This features allows exploitation of the encoding freedom afforded by the flexible formulation of the conditions for valid state assignment discussed above.

- On top of the generic state assignment method, we have proposed a system of novel heuristics that guide the encoding construction towards encoded FSMs that have efficient implementation in FPGAs. The heuristics are based on the analysis and optimization of the information flows within the encoded machine. This analysis is supported by the application of the apparatus of Information Relationships and Measures. The generic state assignment method equipped with the heuristics constitutes a complete effective and efficient FPGA-targeted FSM state encoding method.

- I implemented the FPGA-targeted FSM state assignment method in the form of the prototype EDA software tool – SECODE – that significantly outperforms other popular academic and industrial state assignment approaches and tools by producing encodings resulting in smaller and faster FPGA realizations of the FSMs.

- As a result of the above developments and experimental research, we demonstrated that the information-driven approach based on the general decomposition and the apparatus of information relationships and measures is an effective and efficient approach to sequential circuit synthesis.

While in this thesis we applied the theories of general decomposition and information relationships and measures to state assignment of FSMs, the practical usefulness of these theories extends far beyond that problem. In particular, the theories have been successfully applied in a separate work to circuit synthesis of Boolean functions [41]. Furthermore, general decomposition can be applied in any field of modern engineering and science that deals with finite state machines, discrete functions and relations.

Other contributions of the work include:

- We have evaluated and debunked the common conviction that any particular encoding method, be it one-hot, sequential binary or any other particular sort of semi-random encoding can universally produce a high-quality encoding, especially in relation to the FSM implementations in FPGAs, with their complex characteristics. Various encodings are good for various FSMs, different implementation technologies and various optimization objectives. Even for a particular implementation
technology and specific objectives (as in the case of FPGA implementations considered in this thesis), a specific encoding (e.g. one-hot or Gray) is only good for a certain specific subset of FSMs.

- The above argument notwithstanding, we recognize that for a certain class of FSMs, one-hot encoding can lead to efficient FPGA implementation in terms of area and speed. We have analyzed this problem and proposed a reasonably effective heuristic to identify good candidates for one-hot encoding prior to performing the actual encoding. Thus, we are able to find efficient encodings for a group of FSMs at a fraction of the effort required to perform the full encoding.

- To address the issue of low availability of industrial FSM benchmarks, we developed together with L. Jóźwiak and D. Gawlowski the benchmark generator software \textsc{BenGen}. This software enables efficient generation of an arbitrary number of various sorts of well-characterized FSM benchmarks, with the minimum effort of the user, while retaining fine-grained control over the generation process and editing of the generated or industrial FSMs.

- The implementation of the prototype encoding tool \textsc{SeCode} resulted in the development of a software library that efficiently implements various operations frequently encountered in logic synthesis, and in particular in decomposition and state assignment. The contents of the library ranges from basic data structures, such as graphs or efficient vectors of Boolean and logic variables with their related operations, to generic and extensible implementations of algorithms for problems from graph theory, clustering, simulated annealing, etc. Within the library, a number of generic algorithms was extended with heuristics relevant to the subject field. For instance, the twin graph coloring approach was extended with the heuristics for efficient merging of dichotomies, while considering their mutual affinities. Also, an incremental input support minimization routine was developed that saves significant computing effort by evaluating input supports of a modified output function as a derivative of the original function, instead of fully recomputing the input support.

  The library can be useful for further research in this field, providing basis for rapid development of new algorithms and enabling quick experimental evaluation of new research ideas. Also, the \textsc{SeCode} software itself is a good framework for further analysis and experiments with encoding and decomposition of FSMs.

While the theory of general decomposition, which has been developed during the past 15 years, is quite mature, the newly developed state assignment method and software is in the prototype stage and would benefit from \textbf{further development}.

In particular, an important role in the analysis process is played by the input support computation. Quite often, especially in optimized machines, there is a unique input support for a given output function. In other cases, however, there is a choice of several minimum or close-to-minimum input supports for the given function. In these cases, the selection of one of the supports may have influence on further analysis and hence on the final encoding. It would be interesting to thoroughly investigate and exploit the influence of these choices on the results. A particularly interesting choice is between the primary and state variables in the support. It may occur that, for instance, two primary
input variables in the minimum input support may be replaced with several atomic state variables. In terms of decomposition, it means that the partial machine imports some state/input information from other partial machines rather than deriving it from the primary input information. The input support with state variables is then no longer minimum. However, after the atomic state variables are merged, it may turn out that the atomic state variables in the input support are merged to a single encoding variable and, therefore, in the final input support the two primary inputs are replaced with a single merged state variable. In this context, it is even possible to store multiple input supports for an output function and determine its affinity to other functions based on the support that fits best with the other function.

It would be also beneficial to continue research into predicting good specific encodings, such as one-hot or Gray encoding. More sophisticated cost estimation for one-hot encoding are certainly possible. Also, for Gray encoding, in which the consecutive codes have Hamming distance-1, one could analyze the best order of states for a given machine.

An important issue is compatibility of the state assignment method with the combinational synthesis method used to synthesize the encoded FSMs. As discussed in Chapter 5, the state assignment method has to be aware of the combinational synthesis to be able to produce binary functions that will have efficient implementations when using a particular combinational synthesis method targeting a particular implementation technology. Therefore, the encoding method would certainly benefit from further research into the desired characteristics of Boolean functions that yield efficient FPGA realizations. In particular, such research would be beneficial for the case of our combinational method of choice – the functional decomposition implemented in software tool IRMA2FPGA. Knowing more precisely characteristics of functions that are “simple” for IRMA2FPGA to synthesize would make it possible to favor such functions when encoding the FSM.

Finally, there is a number of possible optimization objectives that can drive the state assignment process. In our method we addressed the objectives that are important for most applications – the area and the delay of the circuit implementation. As discussed in Chapter 5, the area is addressed by limiting the interconnection and compressing the information relevant to particular functions on as few input variables as possible, and in this way reducing the input supports. Delivering compressed, relevant information to functions also tends to simplify the processing of the information, thus it may reduce the depth and hence the delay of the circuit. Delay is also reduced by limiting interconnections and especially avoiding long interconnections. In the further research on the topic of state assignment, some other objectives could be considered, such as power dissipation or testability of the circuit, even though they are perhaps more relevant to other target implementation platforms, such as ASICs involving CMOS gate networks. The other objectives can be reasonably easy to account for in our approach by adding some extra constraints and heuristics to the existing FSM assignment method and tool, and should not require any changes to the underlying theories or generic assignment method.

Summing up, the research presented in this thesis resulted in a sound theoretical base for research in FSM decomposition and encoding, generic information-driven FSM state assignment method, complete heuristic FPGA-targeted state assignment method and the prototype EDA software tool that implements the method. Using our encoding tool and a large set of benchmark FSMs, we performed an extensive experimental
research. The results of the experimental research clearly demonstrate that our tool significantly outperforms other popular academic and industrial FSM state assignment approaches and tools. Consequently, the research presented in this thesis demonstrates that the information-driven approach to circuit synthesis based on the general decomposition and information relationships and measures is an effective and efficient approach to sequential circuit synthesis. This way, the aims of the research reported in this thesis have been fully realized.
Bibliography

finite state machines using a genetic algorithm. IEE Proc. on Computers and Digital

[2] D. B. Armstrong. On the efficient assignment of internal codes to sequential ma-


of finite state machines. Proc. of Int. Conf. on Systems, Man and Cybernetics, pages

function decomposition, pages 262–276. To check, 2000?

University of Waterloo Research Report, CS-97-01, Waterloo, Canada, January (re-

[9] M. Burns, M. Perkowski, and L. Jóźwiak. An efficient approach to decomposi-

grated state assignment and flipflop selection in finite state machine synthesis. Proc.


[62] K. McElvain. IWLS’93 Benchmark Set: Version 4.0. Distributed as a part of IWLS’93


[64] M. Perkowski, M. Marek-Sadowska, L. Jóźwiak, T. Luba, S. Grygiel, M. Nowicka,.,

[65] M. Rawski, L. Jóźwiak, and T. Luba. The influence of the number of values in sub-
functions on the effectiveness and efficiency of the functional decomposition. In

[66] M. Rawski, L. Jóźwiak, and T. Luba. Functional decomposition with an efficient in-
put support selection for sub-functions based on information relationship measures.

boolean functions and its application in FPGA-oriented technology mapping. In

Paradigm for Algorithm Design. Final Technical Report, Wright Laboratories,

[69] C. Sarwary, E. Prado Lopes, L. Burgun, and A. Greiner. FSM synthesis on FPGA

[70] H. Selvaraj, H. Niewiadomska, M. Pleban, and P. Sapiecha. Decomposition of digital
Synthesis at The Fifth Multi-Conference on Systemics, Cybernetics and Informatics - SCI'2001,

[71] E. Sentovich, K. Singh, L. Lavagno, C. Moon, R. Murgai, A. Saldanha, H. Savoj,
P. Stephan, R. Brayton, and A. Sangiovanni-Vincentelli. SIS: A System for Sequential
Circuit Synthesis, Memorandum No. UCB/ERL M92/41. University of California,


decomposition. In Proc. International Conference on Computational Intelligence and

[74] T. Villa and A. Sangiovanni-Vincentelli. NOVA: state assignment of finite state
machines for optimal two-level logic implementation. IEEE Trans. on CAD, pages 905–
924, 1990.

Appendix A

Encoding results for generated FSMs

The following table summarizes the synthesis results for generated benchmark FSMs encoded with four encoding methods: popular multi-level encoding method JEDI, the method presented in this thesis – SECODE, one-hot and sequential binary encoding. All encoded FSMs were synthesized with IRMA2FPGA [41] logic synthesis method as a network of 5-input LUTs.

For each FSM, the parameters of the machines are listed — number of states (column s), number of primary input (column i) and number of primary outputs (column o). Further, for each encoding methods the results of logic synthesis of the encoded machine are reported as the number of LUTs and number of levels of the LUT-network. Finally, the last column (h+) marks with “+” the machines, for which the algorithm discussed in Section 5.5.1 indicated efficient one-hot realization.

<table>
<thead>
<tr>
<th>name</th>
<th>s</th>
<th>i</th>
<th>o</th>
<th>jedi lut#</th>
<th>jedi lvls</th>
<th>sec ode lut#</th>
<th>sec ode lvls</th>
<th>hot lut#</th>
<th>hot lvls</th>
<th>bin lut#</th>
<th>bin lvls</th>
<th>h+</th>
</tr>
</thead>
<tbody>
<tr>
<td>fsn000</td>
<td>19</td>
<td>3</td>
<td>2</td>
<td>24</td>
<td>3</td>
<td>21</td>
<td>2</td>
<td>34</td>
<td>2</td>
<td>31</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsn001</td>
<td>13</td>
<td>4</td>
<td>12</td>
<td>24</td>
<td>12</td>
<td>21</td>
<td>12</td>
<td>34</td>
<td>2</td>
<td>42</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn002</td>
<td>11</td>
<td>4</td>
<td>4</td>
<td>16</td>
<td>4</td>
<td>16</td>
<td>4</td>
<td>17</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn003</td>
<td>6</td>
<td>8</td>
<td>4</td>
<td>16</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>26</td>
<td>3</td>
<td>25</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsn004</td>
<td>1</td>
<td>7</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>4</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>fsn005</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>16</td>
<td>3</td>
<td>18</td>
<td>3</td>
<td>25</td>
<td>3</td>
<td>21</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsn006</td>
<td>20</td>
<td>2</td>
<td>8</td>
<td>40</td>
<td>2</td>
<td>40</td>
<td>2</td>
<td>45</td>
<td>2</td>
<td>39</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn007</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>10</td>
<td>2</td>
<td>7</td>
<td>1</td>
<td>12</td>
<td>2</td>
<td>9</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn008</td>
<td>12</td>
<td>4</td>
<td>4</td>
<td>37</td>
<td>3</td>
<td>33</td>
<td>3</td>
<td>53</td>
<td>3</td>
<td>48</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsn009</td>
<td>6</td>
<td>4</td>
<td>4</td>
<td>22</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>33</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn010</td>
<td>10</td>
<td>12</td>
<td>16</td>
<td>24</td>
<td>3</td>
<td>24</td>
<td>3</td>
<td>32</td>
<td>3</td>
<td>30</td>
<td>5</td>
<td></td>
</tr>
<tr>
<td>fsn011</td>
<td>9</td>
<td>4</td>
<td>4</td>
<td>17</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>30</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn012</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>17</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>30</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn013</td>
<td>15</td>
<td>2</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsn014</td>
<td>10</td>
<td>4</td>
<td>4</td>
<td>43</td>
<td>3</td>
<td>38</td>
<td>3</td>
<td>66</td>
<td>3</td>
<td>51</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsn015</td>
<td>15</td>
<td>8</td>
<td>4</td>
<td>65</td>
<td>5</td>
<td>46</td>
<td>4</td>
<td>49</td>
<td>3</td>
<td>61</td>
<td>5</td>
<td></td>
</tr>
<tr>
<td>fsm</td>
<td>20</td>
<td>2</td>
<td>8</td>
<td>23</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>34</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>-----</td>
<td>-----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>---</td>
<td></td>
</tr>
<tr>
<td>fsm016</td>
<td>14</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td>10</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsm017</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>11</td>
<td>2</td>
<td>10</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm018</td>
<td>6</td>
<td>4</td>
<td>12</td>
<td>20</td>
<td>2</td>
<td>15</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm019</td>
<td>20</td>
<td>2</td>
<td>8</td>
<td>35</td>
<td>2</td>
<td>33</td>
<td>2</td>
<td>37</td>
<td>2</td>
<td>34</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm020</td>
<td>10</td>
<td>12</td>
<td>16</td>
<td>37</td>
<td>3</td>
<td>27</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>32</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm021</td>
<td>31</td>
<td>2</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td>36</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm022</td>
<td>38</td>
<td>4</td>
<td>10</td>
<td>168</td>
<td>5</td>
<td>132</td>
<td>4</td>
<td>154</td>
<td>5</td>
<td>156</td>
<td>6</td>
<td></td>
</tr>
<tr>
<td>fsm023</td>
<td>12</td>
<td>4</td>
<td>4</td>
<td>25</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td>25</td>
<td>2</td>
<td>26</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm024</td>
<td>20</td>
<td>2</td>
<td>8</td>
<td>35</td>
<td>2</td>
<td>30</td>
<td>2</td>
<td>42</td>
<td>2</td>
<td>34</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm025</td>
<td>10</td>
<td>4</td>
<td>4</td>
<td>19</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm026</td>
<td>17</td>
<td>8</td>
<td>96</td>
<td>4</td>
<td>81</td>
<td>4</td>
<td>87</td>
<td>4</td>
<td>126</td>
<td>6</td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsm027</td>
<td>25</td>
<td>2</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td>32</td>
<td>2</td>
<td>22</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm028</td>
<td>10</td>
<td>4</td>
<td>4</td>
<td>26</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>25</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm029</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>17</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>30</td>
<td>3</td>
<td>19</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm030</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>155</td>
<td>7</td>
<td>133</td>
<td>7</td>
<td>171</td>
<td>7</td>
<td>168</td>
<td>7</td>
<td></td>
</tr>
<tr>
<td>fsm031</td>
<td>14</td>
<td>4</td>
<td>4</td>
<td>18</td>
<td>2</td>
<td>18</td>
<td>2</td>
<td>29</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm032</td>
<td>30</td>
<td>2</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>39</td>
<td>2</td>
<td>24</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm033</td>
<td>17</td>
<td>8</td>
<td>8</td>
<td>115</td>
<td>4</td>
<td>96</td>
<td>4</td>
<td>104</td>
<td>5</td>
<td>112</td>
<td>4</td>
<td></td>
</tr>
<tr>
<td>fsm034</td>
<td>13</td>
<td>12</td>
<td>4</td>
<td>46</td>
<td>4</td>
<td>40</td>
<td>3</td>
<td>39</td>
<td>2</td>
<td>57</td>
<td>4</td>
<td></td>
</tr>
<tr>
<td>fsm035</td>
<td>9</td>
<td>4</td>
<td>4</td>
<td>12</td>
<td>2</td>
<td>11</td>
<td>2</td>
<td>15</td>
<td>2</td>
<td>15</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm036</td>
<td>15</td>
<td>2</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td>18</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm037</td>
<td>6</td>
<td>8</td>
<td>4</td>
<td>31</td>
<td>4</td>
<td>27</td>
<td>3</td>
<td>42</td>
<td>4</td>
<td>44</td>
<td>4</td>
<td></td>
</tr>
<tr>
<td>fsm038</td>
<td>12</td>
<td>4</td>
<td>4</td>
<td>44</td>
<td>3</td>
<td>42</td>
<td>3</td>
<td>60</td>
<td>3</td>
<td>43</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm039</td>
<td>14</td>
<td>4</td>
<td>4</td>
<td>22</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>29</td>
<td>2</td>
<td>27</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm040</td>
<td>12</td>
<td>4</td>
<td>4</td>
<td>23</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm041</td>
<td>12</td>
<td>8</td>
<td>12</td>
<td>94</td>
<td>3</td>
<td>81</td>
<td>4</td>
<td>70</td>
<td>3</td>
<td>96</td>
<td>4</td>
<td></td>
</tr>
<tr>
<td>fsm042</td>
<td>28</td>
<td>3</td>
<td>2</td>
<td>43</td>
<td>3</td>
<td>41</td>
<td>3</td>
<td>42</td>
<td>3</td>
<td>45</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm043</td>
<td>8</td>
<td>4</td>
<td>12</td>
<td>34</td>
<td>2</td>
<td>32</td>
<td>2</td>
<td>46</td>
<td>2</td>
<td>36</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm044</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>12</td>
<td>2</td>
<td>10</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm045</td>
<td>11</td>
<td>4</td>
<td>4</td>
<td>26</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>26</td>
<td>2</td>
<td>28</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm046</td>
<td>64</td>
<td>4</td>
<td>4</td>
<td>272</td>
<td>6</td>
<td>269</td>
<td>6</td>
<td>337</td>
<td>8</td>
<td>291</td>
<td>6</td>
<td></td>
</tr>
<tr>
<td>fsm047</td>
<td>37</td>
<td>12</td>
<td>12</td>
<td>226</td>
<td>6</td>
<td>204</td>
<td>4</td>
<td>144</td>
<td>3</td>
<td>315</td>
<td>7</td>
<td></td>
</tr>
<tr>
<td>fsm048</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>14</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm049</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>25</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>47</td>
<td>2</td>
<td>26</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm050</td>
<td>16</td>
<td>4</td>
<td>4</td>
<td>25</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>31</td>
<td>3</td>
<td>34</td>
<td>3</td>
<td></td>
</tr>
<tr>
<td>fsm051</td>
<td>19</td>
<td>2</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm052</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>38</td>
<td>4</td>
<td>36</td>
<td>3</td>
<td>39</td>
<td>3</td>
<td>51</td>
<td>5</td>
<td></td>
</tr>
<tr>
<td>fsm053</td>
<td>13</td>
<td>8</td>
<td>16</td>
<td>96</td>
<td>3</td>
<td>83</td>
<td>4</td>
<td>85</td>
<td>3</td>
<td>113</td>
<td>4</td>
<td></td>
</tr>
<tr>
<td>fsm054</td>
<td>40</td>
<td>8</td>
<td>12</td>
<td>106</td>
<td>3</td>
<td>105</td>
<td>4</td>
<td>102</td>
<td>3</td>
<td>118</td>
<td>5</td>
<td></td>
</tr>
<tr>
<td>fsm055</td>
<td>7</td>
<td>1</td>
<td>1</td>
<td>4</td>
<td>1</td>
<td>4</td>
<td>1</td>
<td>6</td>
<td>2</td>
<td>4</td>
<td>1</td>
<td></td>
</tr>
<tr>
<td>fsm056</td>
<td>9</td>
<td>4</td>
<td>12</td>
<td>27</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>28</td>
<td>2</td>
<td>27</td>
<td>2</td>
<td></td>
</tr>
<tr>
<td>fsm057</td>
<td>36</td>
<td>12</td>
<td>25</td>
<td>96</td>
<td>3</td>
<td>89</td>
<td>4</td>
<td>81</td>
<td>3</td>
<td>113</td>
<td>5</td>
<td></td>
</tr>
<tr>
<td>fsm058</td>
<td>10</td>
<td>8</td>
<td>2</td>
<td>15</td>
<td>2</td>
<td>11</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td>16</td>
<td>3</td>
<td></td>
</tr>
</tbody>
</table>
| fsm059 | 39  | 12 | 12 | 481| 8  | 481| 9  | 406| 10 | 1012| 9 | +
| fsm060 |                  |                  |                  |                  |                  | + |
| fsm061 |                  |                  |                  |                  |                  | + |
| fnm108 | 15 8 4 | 48 3 | 50 3 | 48 3 | 59 4 |
| fnm109 | 7 8 2 | 14 2 | 15 2 | 22 2 | 24 4 |
| fnm110 | 15 2 2 | 16 2 | 15 2 | 20 2 | 19 2 |
| fnm111 | 14 2 2 | 15 2 | 13 2 | 20 2 | 19 2 |
| fnm112 | 39 12 8 | 87 4 | 82 3 | 68 3 | 114 5 |
| fnm113 | 13 4 12 | 25 2 | 23 2 | 34 2 | 31 3 |
| fnm114 | 12 4 4 | 52 3 | 44 3 | 68 4 | 62 4 |
| fnm115 | 20 2 8 | 22 2 | 20 2 | 36 2 | 22 2 |
| fnm116 | 8 4 4 | 16 2 | 18 3 | 38 2 | 18 3 |
| fnm117 | 10 4 4 | 42 3 | 39 3 | 59 3 | 46 3 |
| fnm118 | 39 4 16 | 265 5 | 232 4 | 289 5 | 266 6 |
| fnm119 | 6 8 4 | 19 2 | 15 2 | 20 2 | 18 3 |
| fnm120 | 10 12 16 | 38 4 | 28 2 | 27 2 | 38 4 |
| fnm121 | 12 4 12 | 54 2 | 53 2 | 57 2 | 60 3 |
| fnm122 | 11 4 4 | 31 3 | 30 3 | 46 3 | 42 3 |
| fnm123 | 71 4 16 | 500 7 | 486 7 | 457 6 | 469 7 |
| fnm124 | 34 4 12 | 94 3 | 84 3 | 86 3 | 85 5 |
| fnm125 | 6 4 4 | 13 2 | 13 2 | 25 2 | 15 2 |
| fnm126 | 25 2 2 | 25 2 | 26 2 | 32 2 | 27 3 |
| fnm127 | 10 8 2 | 13 2 | 11 2 | 14 2 | 17 4 |
| fnm128 | 39 8 12 | 103 3 | 102 4 | 99 3 | 118 5 |
| fnm129 | 10 4 4 | 45 3 | 40 3 | 60 3 | 52 3 |
| fnm130 | 18 12 16 | 98 3 | 99 3 | 81 3 | 124 5 |
| fnm131 | 11 2 2 | 15 2 | 14 2 | 14 2 | 17 2 |
| fnm132 | 92 4 16 | 673 7 | 680 7 | 614 8 | 663 7 |
| fnm133 | 7 1 1 | 4 1 | 4 1 | 5 1 | 4 1 |
| fnm134 | 14 8 2 | 49 4 | 38 3 | 46 3 | 65 5 |
| fnm135 | 17 8 4 | 113 4 | 98 6 | 89 4 | 143 6 |
| fnm136 | 13 4 4 | 48 4 | 41 3 | 58 3 | 48 4 |
| fnm137 | 11 4 4 | 18 2 | 18 2 | 21 2 | 23 2 |
| fnm138 | 12 8 4 | 58 3 | 53 4 | 53 3 | 75 5 |
| fnm139 | 7 4 12 | 16 2 | 17 2 | 23 2 | 20 2 |
| fnm140 | 7 4 4 | 11 2 | 9 2 | 14 2 | 11 2 |
| fnm141 | 14 2 2 | 15 2 | 12 2 | 16 2 | 16 2 |
| fnm142 | 8 8 8 | 38 3 | 34 3 | 50 3 | 46 4 |
| fnm143 | 5 4 12 | 14 2 | 10 1 | 17 2 | 14 2 |
| fnm144 | 20 2 8 | 34 2 | 35 2 | 40 2 | 38 3 |
| fnm145 | 8 4 4 | 13 2 | 12 2 | 22 2 | 15 2 |
| fnm146 | 7 1 1 | 4 1 | 1 1 | 4 1 | 4 1 |
| fnm147 | 41 12 8 | 135 4 | 128 5 | 98 4 | 137 5 |
| fnm148 | 20 2 8 | 35 2 | 33 2 | 44 2 | 35 2 |
| fnm149 | 13 4 12 | 37 2 | 38 3 | 45 2 | 45 3 |
| fnm150 | 14 8 4 | 195 7 | 185 7 | 212 7 | 193 7 |
| fnm151 | 23 2 2 | 21 2 | 21 2 | 28 2 | 19 3 |
| fnm152 | 13 4 12 | 41 2 | 37 2 | 40 2 | 41 3 |
| fnm153 | 19 2 2 | 18 2 | 18 2 | 20 2 | 18 2 |
| fsn154 | 13 4 12 | 27 2 | 25 2 | 33 2 | 33 3 | + |
| fsn155 | 10 4 4 | 16 2 | 13 2 | 21 3 | 16 2 | |
| fsn156 | 30 4 12 | 214 6 | 190 5 | 206 5 | 231 5 | |
| fsn157 | 8 4 12 | 21 2 | 21 2 | 28 2 | 23 2 | |
| fsn158 | 26 2 2 | 27 2 | 25 2 | 36 2 | 26 3 | |
| fsn159 | 7 8 12 | 25 3 | 22 3 | 41 4 | 29 3 | |
| fsn160 | 13 4 12 | 34 2 | 33 2 | 39 2 | 36 2 | |
| fsn161 | 13 4 4 | 57 4 | 55 3 | 82 3 | 78 4 | |
| fsn162 | 14 4 12 | 43 2 | 44 2 | 47 2 | 50 3 | |
| fsn163 | 13 8 4 | 52 4 | 46 3 | 50 2 | 60 5 | |
| fsn164 | 14 4 4 | 56 4 | 54 4 | 80 4 | 78 4 | |
| fsn165 | 25 2 2 | 21 2 | 20 2 | 33 2 | 17 2 | |
| fsn166 | 13 8 4 | 51 4 | 45 3 | 49 3 | 66 4 | |
| fsn167 | 10 8 16 | 32 3 | 26 2 | 23 2 | 32 3 | |
| fsn168 | 23 2 2 | 26 2 | 25 2 | 37 3 | 24 3 | |
| fsn169 | 11 4 4 | 33 3 | 33 3 | 47 3 | 36 3 | |
| fsn170 | 19 2 2 | 22 2 | 19 2 | 25 2 | 21 2 | |
| fsn171 | 10 8 2 | 19 3 | 19 2 | 24 3 | 28 4 | |
| fsn172 | 12 12 4 | 145 6 | 124 6 | 142 7 | 200 9 | |
| fsn173 | 38 4 10 | 189 5 | 176 5 | 204 5 | 235 6 | |
| fsn174 | 9 4 4 | 16 2 | 11 2 | 16 2 | 17 2 | |
| fsn175 | 17 2 2 | 18 2 | 14 2 | 20 2 | 15 2 | |
| fsn176 | 13 8 4 | 36 4 | 35 3 | 38 2 | 44 4 | |
| fsn177 | 26 2 2 | 25 2 | 25 2 | 31 2 | 26 3 | |
| fsn178 | 13 8 16 | 118 4 | 110 4 | 131 5 | 139 5 | |
| fsn179 | 29 2 2 | 27 3 | 25 2 | 36 2 | 29 3 | |
| fsn180 | 7 4 4 | 16 2 | 14 2 | 28 2 | 16 2 | |
| fsn181 | 5 8 12 | 29 2 | 27 2 | 35 3 | 35 3 | |
| fsn182 | 10 4 4 | 17 2 | 16 2 | 22 2 | 19 2 | |
| fsn183 | 21 2 2 | 20 2 | 20 2 | 27 2 | 19 2 | |
| fsn184 | 12 4 4 | 25 3 | 24 2 | 42 3 | 33 3 | |
| fsn185 | 13 4 4 | 36 3 | 31 2 | 45 3 | 37 3 | |
| fsn186 | 6 8 4 | 28 4 | 24 3 | 39 4 | 38 4 | |
| fsn187 | 13 8 4 | 36 4 | 33 3 | 43 4 | 49 4 | |
| fsn188 | 8 4 4 | 15 2 | 11 2 | 17 2 | 14 2 | |
| fsn189 | 12 4 4 | 46 3 | 47 3 | 74 3 | 56 4 | |
| fsn190 | 7 8 12 | 26 2 | 23 2 | 27 2 | 21 2 | |
| fsn191 | 13 4 4 | 37 3 | 36 3 | 47 3 | 44 3 | |
| fsn192 | 16 8 2 | 168 7 | 156 8 | 217 8 | 212 7 | |
| fsn193 | 16 4 4 | 20 2 | 18 2 | 29 3 | 28 3 | |
| fsn194 | 7 4 4 | 7 1 | 7 1 | 11 2 | 9 2 | |
| fsn195 | 8 4 4 | 23 2 | 19 3 | 34 2 | 24 2 | |
| fsn196 | 8 4 4 | 24 3 | 21 2 | 41 3 | 28 3 | |
| fsn197 | 59 4 4 | 209 6 | 217 6 | 269 7 | 256 6 | |
| fsn198 | 37 12 12 | 255 11 | 240 9 | 218 5 | 589 9 | |
| fsn199 | 13 4 4 | 49 3 | 40 3 | 59 3 | 44 3 | |
### A. Encoding Results for Generated FSMS

<table>
<thead>
<tr>
<th>FSM</th>
<th>4</th>
<th>4</th>
<th>9</th>
<th>2</th>
<th>8</th>
<th>2</th>
<th>12</th>
<th>2</th>
<th>10</th>
<th>2</th>
</tr>
</thead>
<tbody>
<tr>
<td>FSM200</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>9</td>
<td>2</td>
<td>8</td>
<td>2</td>
<td>12</td>
<td>2</td>
<td>10</td>
</tr>
<tr>
<td>FSM201</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>76</td>
<td>4</td>
<td>67</td>
<td>4</td>
<td>73</td>
<td>4</td>
<td>91</td>
</tr>
<tr>
<td>FSM202</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>19</td>
<td>2</td>
<td>18</td>
<td>3</td>
<td>25</td>
<td>3</td>
<td>21</td>
</tr>
<tr>
<td>FSM203</td>
<td>7</td>
<td>8</td>
<td>12</td>
<td>24</td>
<td>3</td>
<td>22</td>
<td>2</td>
<td>32</td>
<td>3</td>
<td>28</td>
</tr>
<tr>
<td>FSM204</td>
<td>19</td>
<td>10</td>
<td>12</td>
<td>36</td>
<td>2</td>
<td>31</td>
<td>2</td>
<td>26</td>
<td>1</td>
<td>44</td>
</tr>
<tr>
<td>FSM205</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>57</td>
<td>4</td>
<td>61</td>
<td>4</td>
<td>73</td>
<td>4</td>
<td>64</td>
</tr>
<tr>
<td>FSM206</td>
<td>14</td>
<td>4</td>
<td>12</td>
<td>27</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>38</td>
<td>2</td>
<td>29</td>
</tr>
<tr>
<td>FSM207</td>
<td>36</td>
<td>4</td>
<td>10</td>
<td>109</td>
<td>3</td>
<td>101</td>
<td>3</td>
<td>92</td>
<td>3</td>
<td>134</td>
</tr>
<tr>
<td>FSM208</td>
<td>11</td>
<td>4</td>
<td>4</td>
<td>18</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>21</td>
</tr>
<tr>
<td>FSM209</td>
<td>29</td>
<td>2</td>
<td>2</td>
<td>23</td>
<td>3</td>
<td>20</td>
<td>2</td>
<td>37</td>
<td>2</td>
<td>25</td>
</tr>
<tr>
<td>FSM210</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>8</td>
<td>2</td>
<td>7</td>
<td>2</td>
<td>11</td>
<td>2</td>
<td>10</td>
</tr>
<tr>
<td>FSM211</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>17</td>
<td>2</td>
<td>15</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>16</td>
</tr>
<tr>
<td>FSM212</td>
<td>45</td>
<td>8</td>
<td>16</td>
<td>982</td>
<td>9</td>
<td>980</td>
<td>9</td>
<td>1567</td>
<td>12</td>
<td>1895</td>
</tr>
<tr>
<td>FSM213</td>
<td>9</td>
<td>8</td>
<td>4</td>
<td>70</td>
<td>6</td>
<td>59</td>
<td>6</td>
<td>92</td>
<td>5</td>
<td>92</td>
</tr>
<tr>
<td>FSM214</td>
<td>6</td>
<td>8</td>
<td>8</td>
<td>19</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>25</td>
<td>2</td>
<td>28</td>
</tr>
<tr>
<td>FSM215</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>45</td>
<td>3</td>
<td>39</td>
<td>3</td>
<td>59</td>
<td>3</td>
<td>52</td>
</tr>
<tr>
<td>FSM216</td>
<td>36</td>
<td>12</td>
<td>8</td>
<td>166</td>
<td>5</td>
<td>159</td>
<td>4</td>
<td>112</td>
<td>4</td>
<td>214</td>
</tr>
<tr>
<td>FSM217</td>
<td>10</td>
<td>4</td>
<td>4</td>
<td>15</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>20</td>
</tr>
<tr>
<td>FSM218</td>
<td>36</td>
<td>8</td>
<td>12</td>
<td>174</td>
<td>5</td>
<td>150</td>
<td>5</td>
<td>156</td>
<td>5</td>
<td>209</td>
</tr>
<tr>
<td>FSM219</td>
<td>36</td>
<td>4</td>
<td>10</td>
<td>78</td>
<td>4</td>
<td>64</td>
<td>3</td>
<td>72</td>
<td>2</td>
<td>77</td>
</tr>
<tr>
<td>FSM220</td>
<td>7</td>
<td>8</td>
<td>2</td>
<td>12</td>
<td>2</td>
<td>12</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>20</td>
</tr>
<tr>
<td>FSM221</td>
<td>7</td>
<td>8</td>
<td>4</td>
<td>12</td>
<td>2</td>
<td>10</td>
<td>2</td>
<td>12</td>
<td>2</td>
<td>13</td>
</tr>
<tr>
<td>FSM222</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>18</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>21</td>
</tr>
<tr>
<td>FSM223</td>
<td>9</td>
<td>4</td>
<td>12</td>
<td>26</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>30</td>
<td>2</td>
<td>25</td>
</tr>
<tr>
<td>FSM224</td>
<td>30</td>
<td>2</td>
<td>2</td>
<td>26</td>
<td>2</td>
<td>25</td>
<td>2</td>
<td>36</td>
<td>2</td>
<td>28</td>
</tr>
<tr>
<td>FSM225</td>
<td>31</td>
<td>2</td>
<td>2</td>
<td>27</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>39</td>
<td>2</td>
<td>25</td>
</tr>
<tr>
<td>FSM226</td>
<td>9</td>
<td>8</td>
<td>16</td>
<td>40</td>
<td>4</td>
<td>33</td>
<td>2</td>
<td>34</td>
<td>2</td>
<td>45</td>
</tr>
<tr>
<td>FSM227</td>
<td>20</td>
<td>4</td>
<td>4</td>
<td>75</td>
<td>3</td>
<td>69</td>
<td>3</td>
<td>99</td>
<td>4</td>
<td>95</td>
</tr>
<tr>
<td>FSM228</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>57</td>
<td>4</td>
<td>55</td>
<td>4</td>
<td>58</td>
<td>4</td>
<td>66</td>
</tr>
<tr>
<td>FSM229</td>
<td>11</td>
<td>4</td>
<td>12</td>
<td>24</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>25</td>
</tr>
<tr>
<td>FSM230</td>
<td>9</td>
<td>4</td>
<td>12</td>
<td>27</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>30</td>
<td>2</td>
<td>32</td>
</tr>
<tr>
<td>FSM231</td>
<td>13</td>
<td>4</td>
<td>12</td>
<td>26</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>33</td>
<td>2</td>
<td>28</td>
</tr>
<tr>
<td>FSM232</td>
<td>15</td>
<td>2</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>14</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>20</td>
</tr>
<tr>
<td>FSM233</td>
<td>14</td>
<td>2</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>22</td>
</tr>
<tr>
<td>FSM234</td>
<td>37</td>
<td>8</td>
<td>12</td>
<td>80</td>
<td>4</td>
<td>81</td>
<td>3</td>
<td>81</td>
<td>3</td>
<td>97</td>
</tr>
<tr>
<td>FSM235</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>43</td>
<td>3</td>
<td>40</td>
<td>3</td>
<td>62</td>
<td>3</td>
<td>50</td>
</tr>
<tr>
<td>FSM236</td>
<td>11</td>
<td>4</td>
<td>4</td>
<td>30</td>
<td>3</td>
<td>27</td>
<td>2</td>
<td>37</td>
<td>3</td>
<td>28</td>
</tr>
<tr>
<td>FSM237</td>
<td>24</td>
<td>4</td>
<td>4</td>
<td>105</td>
<td>5</td>
<td>93</td>
<td>5</td>
<td>124</td>
<td>5</td>
<td>99</td>
</tr>
<tr>
<td>FSM238</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>56</td>
<td>4</td>
<td>52</td>
<td>4</td>
<td>104</td>
<td>5</td>
<td>79</td>
</tr>
<tr>
<td>FSM239</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>12</td>
<td>2</td>
<td>11</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>17</td>
</tr>
<tr>
<td>FSM240</td>
<td>39</td>
<td>4</td>
<td>10</td>
<td>94</td>
<td>3</td>
<td>99</td>
<td>3</td>
<td>91</td>
<td>3</td>
<td>114</td>
</tr>
<tr>
<td>FSM241</td>
<td>10</td>
<td>8</td>
<td>4</td>
<td>32</td>
<td>3</td>
<td>26</td>
<td>2</td>
<td>31</td>
<td>2</td>
<td>45</td>
</tr>
<tr>
<td>FSM242</td>
<td>7</td>
<td>8</td>
<td>8</td>
<td>15</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>16</td>
</tr>
<tr>
<td>FSM243</td>
<td>41</td>
<td>12</td>
<td>8</td>
<td>387</td>
<td>8</td>
<td>337</td>
<td>8</td>
<td>330</td>
<td>9</td>
<td>622</td>
</tr>
<tr>
<td>FSM244</td>
<td>20</td>
<td>2</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>15</td>
<td>2</td>
<td>25</td>
<td>2</td>
<td>17</td>
</tr>
<tr>
<td>FSM245</td>
<td>36</td>
<td>12</td>
<td>12</td>
<td>85</td>
<td>3</td>
<td>79</td>
<td>3</td>
<td>80</td>
<td>3</td>
<td>85</td>
</tr>
<tr>
<td>FN</td>
<td>17</td>
<td>4</td>
<td>4</td>
<td>50</td>
<td>3</td>
<td>38</td>
<td>3</td>
<td>43</td>
<td>3</td>
<td>42</td>
</tr>
<tr>
<td>-----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>----</td>
<td>---</td>
<td>----</td>
<td>---</td>
<td>----</td>
<td>---</td>
<td>----</td>
</tr>
<tr>
<td>FN1</td>
<td>37</td>
<td>12</td>
<td>12</td>
<td>47</td>
<td>9</td>
<td>46</td>
<td>3</td>
<td>46</td>
<td>3</td>
<td>42</td>
</tr>
<tr>
<td>FN2</td>
<td>19</td>
<td>3</td>
<td>2</td>
<td>29</td>
<td>2</td>
<td>26</td>
<td>2</td>
<td>36</td>
<td>2</td>
<td>32</td>
</tr>
<tr>
<td>FN3</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>36</td>
<td>3</td>
<td>34</td>
<td>3</td>
<td>42</td>
<td>3</td>
<td>60</td>
</tr>
<tr>
<td>FN4</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>48</td>
<td>3</td>
<td>46</td>
<td>4</td>
<td>56</td>
<td>3</td>
<td>74</td>
</tr>
<tr>
<td>FN5</td>
<td>21</td>
<td>3</td>
<td>2</td>
<td>36</td>
<td>3</td>
<td>32</td>
<td>3</td>
<td>37</td>
<td>3</td>
<td>33</td>
</tr>
<tr>
<td>FN6</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>7</td>
<td>1</td>
<td>7</td>
<td>1</td>
<td>10</td>
<td>1</td>
<td>9</td>
</tr>
<tr>
<td>FN7</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>16</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>20</td>
</tr>
<tr>
<td>FN8</td>
<td>18</td>
<td>4</td>
<td>4</td>
<td>70</td>
<td>4</td>
<td>71</td>
<td>3</td>
<td>104</td>
<td>4</td>
<td>110</td>
</tr>
<tr>
<td>FN9</td>
<td>13</td>
<td>8</td>
<td>4</td>
<td>49</td>
<td>4</td>
<td>36</td>
<td>3</td>
<td>46</td>
<td>3</td>
<td>64</td>
</tr>
<tr>
<td>FNa</td>
<td>15</td>
<td>8</td>
<td>4</td>
<td>59</td>
<td>3</td>
<td>51</td>
<td>3</td>
<td>47</td>
<td>3</td>
<td>61</td>
</tr>
<tr>
<td>FNb</td>
<td>10</td>
<td>12</td>
<td>16</td>
<td>29</td>
<td>2</td>
<td>26</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>30</td>
</tr>
<tr>
<td>FNC</td>
<td>19</td>
<td>2</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>18</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>18</td>
</tr>
<tr>
<td>FNd</td>
<td>60</td>
<td>4</td>
<td>4</td>
<td>246</td>
<td>6</td>
<td>236</td>
<td>6</td>
<td>274</td>
<td>8</td>
<td>231</td>
</tr>
<tr>
<td>FNf</td>
<td>20</td>
<td>2</td>
<td>8</td>
<td>24</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>37</td>
<td>2</td>
<td>23</td>
</tr>
<tr>
<td>FNg</td>
<td>11</td>
<td>4</td>
<td>4</td>
<td>34</td>
<td>3</td>
<td>31</td>
<td>2</td>
<td>42</td>
<td>2</td>
<td>41</td>
</tr>
<tr>
<td>FNh</td>
<td>14</td>
<td>4</td>
<td>4</td>
<td>56</td>
<td>3</td>
<td>50</td>
<td>3</td>
<td>76</td>
<td>3</td>
<td>56</td>
</tr>
<tr>
<td>FNI</td>
<td>7</td>
<td>8</td>
<td>12</td>
<td>24</td>
<td>3</td>
<td>24</td>
<td>3</td>
<td>32</td>
<td>3</td>
<td>31</td>
</tr>
<tr>
<td>FNj</td>
<td>14</td>
<td>8</td>
<td>4</td>
<td>78</td>
<td>4</td>
<td>71</td>
<td>4</td>
<td>79</td>
<td>4</td>
<td>97</td>
</tr>
<tr>
<td>FNk</td>
<td>33</td>
<td>2</td>
<td>2</td>
<td>43</td>
<td>3</td>
<td>30</td>
<td>3</td>
<td>44</td>
<td>2</td>
<td>40</td>
</tr>
<tr>
<td>FNl</td>
<td>12</td>
<td>8</td>
<td>4</td>
<td>162</td>
<td>7</td>
<td>155</td>
<td>7</td>
<td>207</td>
<td>8</td>
<td>183</td>
</tr>
<tr>
<td>FNm</td>
<td>25</td>
<td>4</td>
<td>4</td>
<td>85</td>
<td>4</td>
<td>70</td>
<td>4</td>
<td>100</td>
<td>5</td>
<td>94</td>
</tr>
<tr>
<td>FNn</td>
<td>15</td>
<td>4</td>
<td>4</td>
<td>44</td>
<td>3</td>
<td>46</td>
<td>3</td>
<td>66</td>
<td>3</td>
<td>52</td>
</tr>
<tr>
<td>FNo</td>
<td>8</td>
<td>8</td>
<td>8</td>
<td>59</td>
<td>3</td>
<td>50</td>
<td>4</td>
<td>64</td>
<td>3</td>
<td>58</td>
</tr>
<tr>
<td>FNP</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>21</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>34</td>
<td>2</td>
<td>20</td>
</tr>
<tr>
<td>FNQ</td>
<td>7</td>
<td>8</td>
<td>12</td>
<td>33</td>
<td>2</td>
<td>32</td>
<td>2</td>
<td>47</td>
<td>3</td>
<td>45</td>
</tr>
<tr>
<td>FNR</td>
<td>26</td>
<td>2</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>33</td>
<td>2</td>
<td>21</td>
</tr>
<tr>
<td>FNS</td>
<td>10</td>
<td>4</td>
<td>12</td>
<td>42</td>
<td>2</td>
<td>41</td>
<td>3</td>
<td>51</td>
<td>2</td>
<td>47</td>
</tr>
<tr>
<td>FNT</td>
<td>9</td>
<td>4</td>
<td>4</td>
<td>36</td>
<td>3</td>
<td>27</td>
<td>2</td>
<td>43</td>
<td>3</td>
<td>39</td>
</tr>
<tr>
<td>FNU</td>
<td>7</td>
<td>8</td>
<td>12</td>
<td>36</td>
<td>3</td>
<td>30</td>
<td>2</td>
<td>36</td>
<td>3</td>
<td>40</td>
</tr>
<tr>
<td>FNV</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>15</td>
<td>2</td>
<td>11</td>
<td>2</td>
<td>18</td>
<td>2</td>
<td>17</td>
</tr>
<tr>
<td>FNW</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>12</td>
<td>2</td>
<td>11</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>13</td>
</tr>
<tr>
<td>FNX</td>
<td>38</td>
<td>4</td>
<td>12</td>
<td>193</td>
<td>5</td>
<td>175</td>
<td>4</td>
<td>189</td>
<td>5</td>
<td>202</td>
</tr>
<tr>
<td>FNY</td>
<td>15</td>
<td>4</td>
<td>4</td>
<td>40</td>
<td>3</td>
<td>40</td>
<td>3</td>
<td>51</td>
<td>3</td>
<td>47</td>
</tr>
<tr>
<td>FNZ</td>
<td>28</td>
<td>3</td>
<td>3</td>
<td>42</td>
<td>3</td>
<td>34</td>
<td>3</td>
<td>43</td>
<td>3</td>
<td>40</td>
</tr>
<tr>
<td>FNAA</td>
<td>33</td>
<td>2</td>
<td>2</td>
<td>41</td>
<td>3</td>
<td>26</td>
<td>3</td>
<td>41</td>
<td>2</td>
<td>35</td>
</tr>
<tr>
<td>FNAB</td>
<td>14</td>
<td>4</td>
<td>4</td>
<td>17</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>21</td>
</tr>
<tr>
<td>FNAC</td>
<td>20</td>
<td>2</td>
<td>2</td>
<td>24</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>28</td>
<td>2</td>
<td>21</td>
</tr>
<tr>
<td>FNAD</td>
<td>18</td>
<td>4</td>
<td>12</td>
<td>60</td>
<td>3</td>
<td>56</td>
<td>3</td>
<td>64</td>
<td>3</td>
<td>67</td>
</tr>
<tr>
<td>FNAE</td>
<td>5</td>
<td>8</td>
<td>12</td>
<td>18</td>
<td>2</td>
<td>18</td>
<td>2</td>
<td>27</td>
<td>3</td>
<td>24</td>
</tr>
<tr>
<td>FNAF</td>
<td>6</td>
<td>4</td>
<td>12</td>
<td>13</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>17</td>
</tr>
<tr>
<td>FNAG</td>
<td>40</td>
<td>4</td>
<td>10</td>
<td>245</td>
<td>6</td>
<td>243</td>
<td>6</td>
<td>280</td>
<td>7</td>
<td>298</td>
</tr>
<tr>
<td>FNAH</td>
<td>37</td>
<td>8</td>
<td>12</td>
<td>421</td>
<td>8</td>
<td>403</td>
<td>7</td>
<td>401</td>
<td>10</td>
<td>632</td>
</tr>
<tr>
<td>FNAI</td>
<td>20</td>
<td>4</td>
<td>4</td>
<td>66</td>
<td>3</td>
<td>58</td>
<td>3</td>
<td>69</td>
<td>3</td>
<td>85</td>
</tr>
<tr>
<td>FNAJ</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>41</td>
<td>3</td>
<td>39</td>
<td>3</td>
<td>55</td>
<td>3</td>
<td>42</td>
</tr>
<tr>
<td>FNAK</td>
<td>17</td>
<td>4</td>
<td>4</td>
<td>34</td>
<td>3</td>
<td>25</td>
<td>2</td>
<td>38</td>
<td>2</td>
<td>26</td>
</tr>
<tr>
<td>fsm</td>
<td>9</td>
<td>4</td>
<td>4</td>
<td>38</td>
<td>3</td>
<td>36</td>
<td>3</td>
<td>60</td>
<td>4</td>
<td>47</td>
</tr>
<tr>
<td>-----</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
<td>----------------</td>
</tr>
<tr>
<td>fsm292</td>
<td>13</td>
<td>8</td>
<td>12</td>
<td>43</td>
<td>5</td>
<td>38</td>
<td>3</td>
<td>47</td>
<td>2</td>
<td>53</td>
</tr>
<tr>
<td>fsm293</td>
<td>13</td>
<td>4</td>
<td>12</td>
<td>48</td>
<td>2</td>
<td>41</td>
<td>2</td>
<td>48</td>
<td>2</td>
<td>52</td>
</tr>
<tr>
<td>fsm294</td>
<td>9</td>
<td>8</td>
<td>8</td>
<td>30</td>
<td>3</td>
<td>30</td>
<td>3</td>
<td>35</td>
<td>3</td>
<td>37</td>
</tr>
<tr>
<td>fsm295</td>
<td>6</td>
<td>8</td>
<td>4</td>
<td>20</td>
<td>2</td>
<td>19</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>22</td>
</tr>
<tr>
<td>fsm296</td>
<td>35</td>
<td>10</td>
<td>8</td>
<td>86</td>
<td>4</td>
<td>74</td>
<td>3</td>
<td>67</td>
<td>3</td>
<td>114</td>
</tr>
<tr>
<td>fsm297</td>
<td>35</td>
<td>10</td>
<td>8</td>
<td>86</td>
<td>4</td>
<td>36</td>
<td>5</td>
<td>63</td>
<td>4</td>
<td>62</td>
</tr>
<tr>
<td>fsm298</td>
<td>39</td>
<td>4</td>
<td>12</td>
<td>187</td>
<td>4</td>
<td>192</td>
<td>4</td>
<td>202</td>
<td>4</td>
<td>225</td>
</tr>
<tr>
<td>fsm299</td>
<td>13</td>
<td>12</td>
<td>4</td>
<td>32</td>
<td>4</td>
<td>27</td>
<td>3</td>
<td>38</td>
<td>3</td>
<td>49</td>
</tr>
<tr>
<td>fsm300</td>
<td>20</td>
<td>2</td>
<td>8</td>
<td>34</td>
<td>2</td>
<td>33</td>
<td>2</td>
<td>43</td>
<td>2</td>
<td>33</td>
</tr>
<tr>
<td>fsm301</td>
<td>6</td>
<td>8</td>
<td>8</td>
<td>25</td>
<td>2</td>
<td>23</td>
<td>2</td>
<td>31</td>
<td>2</td>
<td>33</td>
</tr>
<tr>
<td>fsm302</td>
<td>6</td>
<td>8</td>
<td>8</td>
<td>42</td>
<td>4</td>
<td>32</td>
<td>3</td>
<td>46</td>
<td>4</td>
<td>52</td>
</tr>
<tr>
<td>fsm303</td>
<td>13</td>
<td>4</td>
<td>12</td>
<td>26</td>
<td>2</td>
<td>26</td>
<td>2</td>
<td>33</td>
<td>2</td>
<td>32</td>
</tr>
<tr>
<td>fsm304</td>
<td>14</td>
<td>4</td>
<td>4</td>
<td>40</td>
<td>3</td>
<td>40</td>
<td>3</td>
<td>60</td>
<td>3</td>
<td>51</td>
</tr>
<tr>
<td>fsm305</td>
<td>12</td>
<td>4</td>
<td>4</td>
<td>20</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>22</td>
<td>2</td>
<td>19</td>
</tr>
<tr>
<td>fsm306</td>
<td>36</td>
<td>8</td>
<td>12</td>
<td>250</td>
<td>6</td>
<td>245</td>
<td>7</td>
<td>233</td>
<td>6</td>
<td>383</td>
</tr>
<tr>
<td>fsm307</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>18</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>28</td>
<td>2</td>
<td>19</td>
</tr>
<tr>
<td>fsm308</td>
<td>12</td>
<td>8</td>
<td>12</td>
<td>48</td>
<td>4</td>
<td>45</td>
<td>4</td>
<td>55</td>
<td>5</td>
<td>52</td>
</tr>
<tr>
<td>fsm309</td>
<td>7</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
<td>1</td>
</tr>
<tr>
<td>fsm310</td>
<td>6</td>
<td>8</td>
<td>4</td>
<td>18</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>25</td>
<td>3</td>
<td>22</td>
</tr>
<tr>
<td>fsm311</td>
<td>34</td>
<td>4</td>
<td>12</td>
<td>141</td>
<td>4</td>
<td>116</td>
<td>3</td>
<td>107</td>
<td>3</td>
<td>127</td>
</tr>
<tr>
<td>fsm312</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>61</td>
<td>4</td>
<td>58</td>
<td>4</td>
<td>101</td>
<td>4</td>
<td>80</td>
</tr>
<tr>
<td>fsm313</td>
<td>14</td>
<td>8</td>
<td>4</td>
<td>180</td>
<td>7</td>
<td>171</td>
<td>8</td>
<td>249</td>
<td>7</td>
<td>221</td>
</tr>
<tr>
<td>fsm314</td>
<td>8</td>
<td>4</td>
<td>12</td>
<td>17</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td>17</td>
</tr>
<tr>
<td>fsm315</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>23</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>34</td>
<td>2</td>
<td>23</td>
</tr>
<tr>
<td>fsm316</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>30</td>
<td>3</td>
<td>29</td>
<td>3</td>
<td>46</td>
<td>3</td>
<td>31</td>
</tr>
<tr>
<td>fsm317</td>
<td>12</td>
<td>4</td>
<td>4</td>
<td>38</td>
<td>3</td>
<td>37</td>
<td>3</td>
<td>56</td>
<td>3</td>
<td>49</td>
</tr>
<tr>
<td>fsm318</td>
<td>16</td>
<td>8</td>
<td>2</td>
<td>70</td>
<td>5</td>
<td>60</td>
<td>4</td>
<td>67</td>
<td>4</td>
<td>77</td>
</tr>
<tr>
<td>fsm319</td>
<td>14</td>
<td>8</td>
<td>4</td>
<td>92</td>
<td>7</td>
<td>79</td>
<td>7</td>
<td>106</td>
<td>7</td>
<td>110</td>
</tr>
<tr>
<td>fsm320</td>
<td>8</td>
<td>4</td>
<td>12</td>
<td>17</td>
<td>2</td>
<td>16</td>
<td>2</td>
<td>25</td>
<td>2</td>
<td>17</td>
</tr>
<tr>
<td>fsm321</td>
<td>10</td>
<td>4</td>
<td>4</td>
<td>49</td>
<td>3</td>
<td>36</td>
<td>3</td>
<td>59</td>
<td>3</td>
<td>44</td>
</tr>
<tr>
<td>fsm322</td>
<td>13</td>
<td>4</td>
<td>12</td>
<td>40</td>
<td>2</td>
<td>41</td>
<td>2</td>
<td>49</td>
<td>2</td>
<td>44</td>
</tr>
<tr>
<td>fsm323</td>
<td>102</td>
<td>4</td>
<td>16</td>
<td>692</td>
<td>7</td>
<td>714</td>
<td>7</td>
<td>664</td>
<td>7</td>
<td>720</td>
</tr>
<tr>
<td>fsm324</td>
<td>5</td>
<td>4</td>
<td>12</td>
<td>17</td>
<td>2</td>
<td>13</td>
<td>1</td>
<td>19</td>
<td>2</td>
<td>16</td>
</tr>
<tr>
<td>fsm325</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>13</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td>17</td>
<td>2</td>
<td>16</td>
</tr>
<tr>
<td>fsm326</td>
<td>12</td>
<td>4</td>
<td>12</td>
<td>30</td>
<td>2</td>
<td>29</td>
<td>2</td>
<td>38</td>
<td>2</td>
<td>35</td>
</tr>
<tr>
<td>fsm327</td>
<td>7</td>
<td>4</td>
<td>4</td>
<td>15</td>
<td>2</td>
<td>12</td>
<td>2</td>
<td>21</td>
<td>2</td>
<td>17</td>
</tr>
<tr>
<td>fsm328</td>
<td>7</td>
<td>8</td>
<td>12</td>
<td>29</td>
<td>3</td>
<td>25</td>
<td>2</td>
<td>29</td>
<td>3</td>
<td>30</td>
</tr>
<tr>
<td>fsm329</td>
<td>38</td>
<td>10</td>
<td>8</td>
<td>92</td>
<td>4</td>
<td>90</td>
<td>3</td>
<td>84</td>
<td>3</td>
<td>135</td>
</tr>
<tr>
<td>fsm330</td>
<td>8</td>
<td>4</td>
<td>4</td>
<td>12</td>
<td>2</td>
<td>11</td>
<td>2</td>
<td>18</td>
<td>2</td>
<td>12</td>
</tr>
<tr>
<td>fsm331</td>
<td>13</td>
<td>4</td>
<td>4</td>
<td>29</td>
<td>3</td>
<td>26</td>
<td>2</td>
<td>36</td>
<td>2</td>
<td>36</td>
</tr>
<tr>
<td>fsm332</td>
<td>38</td>
<td>8</td>
<td>10</td>
<td>111</td>
<td>4</td>
<td>109</td>
<td>3</td>
<td>91</td>
<td>3</td>
<td>124</td>
</tr>
<tr>
<td>fsm333</td>
<td>13</td>
<td>4</td>
<td>12</td>
<td>42</td>
<td>2</td>
<td>42</td>
<td>2</td>
<td>44</td>
<td>2</td>
<td>47</td>
</tr>
<tr>
<td>fsm334</td>
<td>17</td>
<td>2</td>
<td>2</td>
<td>15</td>
<td>2</td>
<td>13</td>
<td>2</td>
<td>20</td>
<td>2</td>
<td>13</td>
</tr>
<tr>
<td>fsm335</td>
<td>33</td>
<td>12</td>
<td>8</td>
<td>212</td>
<td>5</td>
<td>173</td>
<td>5</td>
<td>146</td>
<td>5</td>
<td>318</td>
</tr>
<tr>
<td>fsm336</td>
<td>9</td>
<td>8</td>
<td>4</td>
<td>152</td>
<td>6</td>
<td>130</td>
<td>8</td>
<td>158</td>
<td>5</td>
<td>155</td>
</tr>
<tr>
<td>fsn338</td>
<td>16  8  16</td>
<td>135  4</td>
<td>130  4</td>
<td>126  4</td>
<td>150  6</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>--------</td>
<td>----------</td>
<td>-------</td>
<td>-------</td>
<td>-------</td>
<td>-------</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn339</td>
<td>17  4  4</td>
<td>82   4</td>
<td>77   4</td>
<td>106  4</td>
<td>96   5</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn340</td>
<td>7   1   1</td>
<td>4    1</td>
<td>4    1</td>
<td>5    1</td>
<td>4    1</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn341</td>
<td>9   4   12</td>
<td>35   2</td>
<td>32   2</td>
<td>36   2</td>
<td>36   2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn342</td>
<td>9   4   12</td>
<td>33   2</td>
<td>29   2</td>
<td>33   2</td>
<td>34   2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn343</td>
<td>7   4   12</td>
<td>23   2</td>
<td>21   2</td>
<td>31   2</td>
<td>29   2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn344</td>
<td>7   4   12</td>
<td>22   2</td>
<td>20   2</td>
<td>27   2</td>
<td>24   2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn345</td>
<td>13  4   12</td>
<td>42   2</td>
<td>37   2</td>
<td>47   2</td>
<td>41   2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn346</td>
<td>38  10  10</td>
<td>131   5</td>
<td>127   4</td>
<td>105   3</td>
<td>166   6</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn347</td>
<td>25   3    3</td>
<td>49    3</td>
<td>44    3</td>
<td>49    3</td>
<td>47    3</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn348</td>
<td>14   4    4</td>
<td>24    2</td>
<td>23    2</td>
<td>31    2</td>
<td>28    2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>fsn349</td>
<td>14   4   12</td>
<td>40    2</td>
<td>35    2</td>
<td>41    2</td>
<td>39    2</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

| Σ      | 21682 | 1051  | 20167 | 1009  | 23714 | 1044  | 26336 | 1204  |
| Δ%     | -7.0  | -4.0  | 9.4   | -0.7  | 21.5  | 14.6  |
Biography

Aleksander Ślusarczyk was born in 1974 in Warsaw, Poland.

After receiving primary and secondary education in Warsaw, he enrolled in 1993 at the Faculty of Electronics and Information Techniques at the Warsaw University of Technology. During academic year 1996/7 he studied at the University of Wales Swansea in Great Britain. In 1998, he graduated with honours from Warsaw University of Technology in the specialty Computer Construction and Programming.

In the same year, he joined the faculty of Electronics of Technische Universiteit Eindhoven in Eindhoven, the Netherlands as a Ph.D. candidate. He performed there the research on Finite State Machine decomposition and encoding for FPGA implementation under the supervision of prof. Mario Stevens and dr. Lech Jozwiak. He hopes to be able to defend the thesis on this subject on 15 December 2004.

In June 2004 Aleksander Ślusarczyk joined Topic Automatisering, where he is currently active in the area of system design of medical equipment.