

### Compiling Esterel into Better Circuits and Faster Simulations

### Stephen A. Edwards

Department of Computer Science, Columbia University

www.cs.columbia.edu/~sedwards

sedwards@cs.columbia.edu

### **An Overview of Esterel**

Synchronous model of time: implicit global clock

Communication through wire-like signals

Two flavors of statement:

### **Combinational**

Execute in one cycle

emit present / if loop

### **Sequential**

Take multiple cycles

pause await sustain

### Sequencing and Decisions

```
emit A;
emit B;
pause;
loop
  present C then emit D end;
  present E then emit F end;
  pause;
end
         Ε
```

### Concurrency

```
    Parallel statements

                               start in same cycle
  await A; emit C
                               Block terminates once
  await B; emit D
                               all have terminated
];
emit E
             В
```

### The Abort Statement

**Normal Termination** Α abort pause; pause; В emit A when B; emit C B



### The Suspend Statement

```
suspend
  loop
    emit A; pause; pause
  end
when B
 A A
            В
                           В
                              B prevents A
                              from being emitted here;
                              resumed next cycle
                        B delays emission
                        of A by one cycle
```

# **Esterel for Hardware Synthesis**

### **Esterel for Hardware Specification**

### Why consider Esterel?

- Semantics more abstract than RTL
   More succinct descriptions faster and easier to write
- High-level semantics enable high-level optimizations
   State assignment a hierarchical problem
- High-level semantics enable more efficient simulation
   Semantics are more like an imperative program
- Esterel's semantics are deterministic
   Simulation-synthesis mismatches eliminated

### **Applications of Esterel**

Systems with complex (non-pipelined) control-behavior:

- DMA controllers
- Cache controllers
- Communication protocols

(Not processors)

### **Verilog More Verbose Than Esterel**

```
ase (cur_state)
                                         loop
                 // synopsys parallel_case
  if (pcsu_powerdown & !jmp_e & !valid_diag_window) begin
                                             await
  next_state = STANDBY_PWR_DN;
                                                 case [icu miss and
  else if (valid_diag_window | ibuf_full |
         jmp_e) begin
  next_state = cur_state;
                                                              not cacheable] do
  end
  else if(icu_miss&!cacheable) begin
  next_state = NC_REQ_STATE ;
                                                      await [normal ack or error ack]
  else if (icu_miss&cacheable) begin
  next_state = REQ_STATE;
                                                 end
  end
  else
        next_state = cur_state ;
                                                 case / [icu miss and
NC_REQ_STATE: begin
  if(normal_ack| error_ack) begin
next_state = IDLE;
                                                              cacheable] do
  end
        next state = cur state :
                                                      abort
REQ_STATE: begin
  if (normal ack) begin
                                                          await 4 normal ack;
  next_state = FILL_2ND_WD;
  else if (error_ack) begin
                                                      when error ack
  next_state = IDLE ;
  else next_state = cur_state ;
                                                 end
FILL_2ND_WD: begin
  if(normal_ack) begin
                                                 case [pcsu powerdown and
  next_state = REQ_STATE2;
  end
  else if (error ack) begin
                                                              not jmp e and
  next_state = IDLE ;
  end
  else next state = cur state ;
                                                              not valid diag window] do
end
REQ_STATE2: begin
   if(normal_ack) begin
                                                      await [pcsu powerdown and
  next_state = FILL_4TH_WD;
  else if (error_ack) begin
                                                                     not jmp e]
  next_state = IDLE ;
  else next_state = cur_state ;
                                                 end
FILL_4TH_WD: begin
                                             end;
  if(normal_ack| error_ack) begin
  next_state = IDLE;
  else next_state = cur_state ;
                                             pause
STANDBY_PWR_DN: begin
  if(!pcsu_powerdown | jmp_e ) begin
                                         end
  next_state = IDLE;
  else next_state = STANDBY PWR DN;
default: next_state = 7'bx;
```

endcase

### Why is Esterel More Succinct?

### Verilog:

```
Esterel:
```

```
REQ_STATE2: begin
  if(normal_ack) begin
  next_state = FILL_4TH_WD;
  end
  else if (error_ack) begin
  next_state = IDLE;
  end
  else next_state = cur_state;
end
```

abort
 await normal\_ack
when error\_ack

- Esterel provides cross-clock control-flow
- State machine logic represented implicitly
- Higher-level constructs like await

## **Generating Fast Circuits**

### **Basic Circuit Generation**



### **Basic Circuit Generation**

Berry's technique [1992] works, but is fairly inefficient:

Many combinational redundancies. E.g.,
 present A then emit B end;
 present C then emit D end
 produces two redundant OR gates



Many sequential redundancies
 One flop per pause can be very wasteful
 Touati, Toma, Sentovich, and Berry [1993–1997]
 proposed techniques to eliminate many, but requires reachable state space and only works on circuit.

### **Generating Fast Circuits**

Esterel's semantics match hardware. Translation is straightforward.

Nice feature: state space is well-defined and hierarchical (e.g., due to abort and concurrency).

Enables a hierarchical state assignment/synthesis procedure.

### A State Assignment Example

```
abort
    await A; await B
    await C
when D;
emit E;
pause;
  await F
  await G
```

### **Hierarchical States**

```
abort
              await B
    await A;
    await C
when D;
emit E;
pause;
  await F
  await G
```

Five Simple FSMs



Five Simple FSMs



Obvious questions:

- How should each state machine be encoded?
- Should state be shared between the AB/F and C/G machines?

### **General Problem Statement**

States in an Esterel program an arbitrary tree of sequential and parallel state machines. pause end end end end end

**Choosing an Encoding** 



- How should  $s_1, \ldots, s_4$  be encoded?
- Should  $s_2$  or  $s_3$  be shared with  $s_4$  or  $s_5$ ?

### **Choosing a Good Encoding**

Goal: The smallest circuit that meets a timing constraint

- 1. Start with largest, fastest circuit (one-hot, no sharing)
- 2. Estimate the slack at each state decision point by estimating how much the delay could be increased at that point while still meeting the timing requirement
- 3. Attempt to share states at the lowest decision point with the largest slack or reencode the widest-fanout decision point with sufficient slack.
- 4. Repeat steps 2–3 until no further gain possible



### **Automata Compilers**

Esterel is a finite-state language, so build an automata:

```
loop switch (s) { emit A; await C; case 0: A = 1; s = 1; break; emit B; pause case 1: if (C) { B = 1; s = 0; } break; end }
```

V1, V2, V3 (INRIA/CMA) [Berry, Gonthier 1992]

Fastest known code; great for programs with few states.

Does not scale; concurrency causes state explosion.

### **Netlist-based Compilers**



### **Discrete-Event Based Compilers**

SAXO-RT [Weil et al. 2000] Divides Esterel program into event functions dispatched by a fixed scheduler.

unsigned curr = 0x1;

if (curr & 0x2) f2();

curr |= next;

next = 0;

unsigned next = 0;

```
static void f1() {
    A = 1;
    curr &= ~0x1; next |= 0x2;

loop

emit A; await C;
emit B; pause
end

static void f2() {
    if (!C) return;
    B = 1;
    curr &= ~0x2; next |= 0x1;
}

void tick() {
    if (curr & 0x1) f1();
```

## **Generating Fast Simulations**

### My Previous Technique

```
if ((s0 \& 3) == 1) {
every R do
                                                      if (S) {
 loop
                                                         s3 = 1; s2 = 1; s1 = 1;
   await A;
   emit B;
                                                       } else
   present C then
                                                         if (s1 >> 1)
    emit D end:
   pause
                                                           s1 = 3;
 end
                                                        else {
 loop
                                                           if ((s3 \& 3) == 1) {
   present B then
    emit C end:
                                                              s3 = 2; t3 = L1;
   pause
                                                           } else {
 end
end
                                                              t3 = L2;
                                      s=2
   Esterel
                Concurrent Sequential
                                                           C code
   Source
                      CFG
                                       CFG
```

### My Previous Technique

- 1. Translate Esterel into a concurrent control-flow graph
- 2. Analyze static data dependencies
- 3. Schedule
- 4. Generate sequential control-flow graph by inserting context-switching code
- 5. Translate to C

Average Cycle Times on an UltraSparc-II



### **Comments on Previous Technique**

Much more efficient (can be  $100\times$ ) than netlist simulation.

Currently used within Synopsys' CoCentric System Studio for control-code generation.

Context-switching idea powerful, but does not have much insight into program behavior.

Adheres too closely to control dependencies; many more opportunities to reorder code and simplify scheduling.

### **New Technique**

- 1. Translate Esterel into a concurrent control-flow graph
- 2. Transform into Program Dependence Graph
- 3. Analyze static data dependencies
- 4. Insert control predicates to enable scheduling
- 5. Schedule
- 6. Generate sequential control-flow graph
- 7. Translate to C

### A Code-Generation Example

```
loop
   await A;
   emit B;
   present C then
      emit D end;
   present E then
      emit F end;
   pause
 end
 loop
   present B then
     emit C /end;
   pause
 end
```

### **Concurrent Control-Flow and the PDG**



### Splitting the PDG for Scheduling



### **Generating Code**



### Summary

Introduction to Esterel

Synchronous, Concurrent, Textual Language

Combinational and Sequential Statements

**Esterel for Hardware Synthesis** 

Higher-level, deterministic

More succinct than Verilog

Generating Fast Circuits

Existing translation produces many redundancies

Hierarchical state assignment problem

### Summary

### Simulating Esterel

Automata, Netlist, Discrete-Event, and Context-switching techniques

New Technique: Transform to PDG, insert predicates, schedule, generate code