______________________________________________________________________
______________________________________________________________________

1/28/16		CSEE E6861y: CAD for Digital Systems	Handout #9a

______________________________________________________________________
______________________________________________________________________

HW #1, PROBLEM #1: INTRODUCTION TO ESPRESSO and ESPRESSO-EXACT.
______________________________________________________________________
______________________________________________________________________

This file gives an introduction to use of the CAD tools, "espresso"
and "espresso-exact", in the Berkeley SIS package.  It also gives 
details on HW#1 problem #1.

This assignment is designed to get you familiar with using CAD tools
for 2-level logic minimization (both heuristic and exact).  You will 
also gain some intuition as to the complexity of the problems 
that can be solved, and the pluses/minuses of the different algorithms.
But the main point of this assignment is just to get you started
with the CAD package itself.

THE ASSIGNMENT SHOULD NOT TAKE MUCH WORK!  You are basically learning
how to run the package.  You will simply apply 4 logic minimization tools 
to 12 different large benchmark combinational examples and collect results.  
You will then analyze and interpret the results, and hand in a short summary. 

It should take you 4-6 hours maximum to learn the tools and run them
on these examples, after you have downloaded the CAD tool.  It will
then take several more hours to put your results in a table, perform
some initial analysis, and write it up.

______________________________________________________________________
______________________________________________________________________

		      THE ESPRESSO TOOL

You will be using the CAD tool, "espresso", in the Berkeley SIS package.  
Actually, the espresso program is a SET OF TOOLS; you can invoke different 
operation by giving the program different "switches" (described below).

You will be running the espresso program to invoke 4 separate 
two-level logic minimization algorithms.

	1.  espresso-exact ("mincov"):		exact solution

	2.  espresso-exact, single-output:  	exact solution 

	3.  espresso:				heuristic solution

	4.  espresso, fast mode			heuristic solution 

Algorithms #1, #3, and #4 all target 'multi-output minimization',
where products (AND gates) can be shared among multiple
output functions.  

Algorithm #2 targets 'single-output minimization', where each 
individual output function is separately optimized, and there 
is no sharing of  products (AND gates) between
individual output functions.

Each of the heuristic algorithms has a "primary cost," i.e. target, 
of minimizing the number of products (i.e. cubes, implicants).  As a 
"secondary cost," i.e. once the # of products is minimized, each 
heuristic algorithm also tries to minimize the number of gate 
inputs to the AND and OR gates (i.e. # of literals).

______________________________________________________________________

ACCESSING SIS:

See the SIS web page in the class directory.

You have three options:  (i) download SIS package for Linux; 
(ii) download SIS package for PC's; (iii) run SIS on Acis machines.

______________________________________________________________________

ALGORITHM #1:  ESPRESSO-EXACT (multi-output mode).  

This algorithm is a more modern and improved version of the Quine-McCluskey 
version presented in class.  It is also described in Rudell's PhD thesis, 
chapter 2. 

It is *NOT* 'espresso' (which is heuristic only)!  However, for convenience,
the name 'espresso-exact' is used in the tool.  It is more commonly known
in the research literature as "MINCOV" (minimum-covering).

You do not need to know all the details of this algorithm, just
to understand that it (i) generates all primes; (ii) produces a prime
implicant table; (iii) reduces the table, by identifying essentials
and performing row and column dominance; and (iv) uses a sophisticated
"branch-and-bound" method to solve any remaining "cyclic core".

To call espresso-exact (after setting up your paths, discussed below), 
type the command:

		espresso -Dexact -t

The optional -t switch means "trace mode"; it outputs many useful 
statistics on the run including runtime breakdown of steps, and
cost of the final "cover" (AND-OR implementation) produced.  

______________________________________________________________________

ALGORITHM #2:  ESPRESSO-EXACT (single-output mode). 

 The same as #1 above, but now the algorithm is run multiple times, once 
for each individual output function.  Effectively, if the function has 
m inputs and n outputs, the algorithm treats it as n *distinct functions*, 
each with one output, and runs the algorithm in turn for each of these 
m functions, which have m inputs and 1 output.  No multi-output primes 
are generated, and no sharing of products across functions is allowed.  

To call espresso-exact in single-output mode, type the command:
 
		espresso -Dso -S1 -t

______________________________________________________________________

ALGORITHM #3:  ESPRESSO.  

This is the heuristic minimization algorithm which is being presented 
in class.

The algorithm begins with any initial "seed" cover, i.e.
a legal (but potentially sub-optimal) sum-of-products implementation 
listed in PLA format (see below).  It then performs an
(i) EXPAND operation, followed by (ii) IRREDUNDANT operation. 
It then has an inner loop which repeats 3 operations until 
there is no change:  REDUCE; EXPAND; IRREDUNDANT.  
This inner loop is used to improve the quality of the solution.

To call espresso, type the command:

		espresso -t

Note that in trace mode, you also can see how many iterations of
the espresso loop were executed.

______________________________________________________________________

ALGORITHM #4: ESPRESSO, FAST MODE.  

This heuristic algorithm is a simple modification of Algorithm #3, to 
have faster runtime.  However, the quality of results may be worse.

ESPRESSO-FAST MODE, also begins with an initial (i) EXPAND operation, 
followed by (ii) IRREDUNDANT.  However, at this point, the algorithm 
*terminates* and returns the result.  THERE IS NO INNER LOOP!  To call 
espresso, fast mode, type the command:

		espresso -efast -t

______________________________________________________________________

ESPRESSO HELP OPTION:

If you are interested, you can see a list of other espresso options
by typing the help command:	

		espresso -h

In addition, on the "Getting Started with SIS 1.3" web page, in the 
class directory, there is a link to a detailed and very useful manual
page; click on:

		"Espresso Man Page"

______________________________________________________________________

OPTIMIZATION MODES:

	a) OUTPUT PHASE OPTIMIZATION:  

	   	See the "Espresso Man Page" (listed above).  
		This optimization uses the
		command switch -Dopo.  It attempts to *complement*
		different combinations of the outputs, and implement
		some outputs complemented and others non-complemented.
		The SOP implementations for the complemented outputs
		then get added inverters.  In some cases, it is
		much simpler to implement the complement f' of 
		a function f.  Hence, this optimization considers
		the inversion of different combinations of the outputs
		and picks the best pattern of complemented and 
		non-complemented outputs.

		For example, to run Algorithm #3, espresso, in 
		this optimization  mode, you would type:

			espresso -t -Dopo <filename>


	b) USE 2-to-4 DECODERS ON SOME INPUTS:  

	        See the "Espresso Man Page" (listed above).  This 
		optimization uses the command switch -Dpair.  

		NOTE:  unlike optimization a),
		this optimization should be applied *after*
		you have already minimized the function (with one
		of the above 4 algorithms); the optimization is 
		applied on the resulting minimized PLA file, and
		attempts to do further minimization.

		This optimization heuristically looks for pairs
		of inputs, for example A and C, feeds this pair
		through a 2-to-4 decoder, and then takes the 
		decoded *4-BIT OUTPUT* of the decoder, e.g. Z0, Z1,
		Z2, and Z3, and modifies the original function 
		to replace the pair of original inputs with the
		4-bit decoder output.

		Various other input pairs are also identified for
		decoding.

		The idea is that sometimes, decoders can be added
		to inputs, thus implementing a "predecoding" step".  
		The larger number of resulting decoder outputs is then
		used as input to a NEW FUNCTION (i.e. a function
		block which takes inputs *after* the initial decoding).
		In a number of cases, even after counting the cost
		of the extra input decoders, the resulting function
		block may have a lower cost implementation.

		For example, to run Algorithm #3, espresso, in 
		this optimization  mode, you would type:

			espresso -t -Dpair <filename>
		
______________________________________________________________________
______________________________________________________________________

			PLA FILE FORMAT

   (also known as "CUBICAL REPRESENTATION" or "CUBICAL COMPLEX")
			
Both the input file and output file of the espresso tools are a multi-output
function, in "PLA format" (also called "cubical representation"), as 
discussed in class.  See also the SIS home page, which includes links
on PLA format.

A PLA file has a header, body and footer (the footer is sometimes 
optional).  The body describes products or implicants.

--------------
PLA Input File
--------------

A PLA input file for a multi-output function typically describes 2 SETS:
the (i) ON-set ("F") and (ii) DC-set ("D").  

This "FD mode" is the DEFAULT MODE for espresso.  Any minterm which is 
not specified in the ON-set or DC-set is implicitly assumed to belong 
to the OFF-set ("R").

(Be careful: there are slightly different meanings to the 1/0/- of outputs
in other modes such as FR.  We're not dealing with those modes now.)

A "1" in an output field describes an ON-set cube,
and a "-" in an output field describes an DC-set cube.

IMPORTANT:  Remember that a "0" in an output field 
simply means that the cube "does not contribute" to 
that output; it DOES *NOT* INDICATE AN OFF-SET MINTERM!

NOTE: In old notation, the "-" is sometimes indicated by "2".

For example, consider a following truth table for a function
with 3 inputs (a,b,c) and 2 outputs (f1 f2):

		a  b  c  |  f1 f2
	      ----------------------
		0  0  0  |  1  0 
		0  0  1  |  1  1 
		0  1  0  |  0  0 
		0  1  1  |  0  0 
		1  0  0  |  0  - 
		1  0  1  |  1  1 
		1  1  0  |  1  1 
		1  1  1  |  1  0 

A PLA input file for this function is the following:

---------
test1.pla
---------

.i 3
.o 2
.ilb a b c
.ob f1 f2
.p 6
00- 10
-01 11
1-1 10
11- 10
110 11
100 0-
.e

Field ".i" indicates number of inputs, ".o" indicates number of 
outputs, and ".p" indicates number of cubes (products) listed.  
Note that some of the header fields are *optional*, including 
".ilb" (names of inputs) and ".ob" (names of outputs).  

---------------
PLA Output File
---------------

A PLA output file for a multi-output function is used to describe
a COVER of the function, i.e. the sum-of-products solution, or LIST
OF CUBES.  The format is similar to a PLA input file, but the meaning 
is different.  A cover simply consists of a list of prime implicants.  If 
an implicant contributes to an output, there is a "1" in the output field.  
If the implicant does not contribute to the output, the output field 
has a "0".  

Inherently, a cover -- which represents an implementation -- 
has *no DC's* in the output part.  Each cube listed either contributes
or does not contribute to particular outputs, and the PLA output file
simply enumerates which cubes are selected, and to which outputs
they contribute. 

To generate a PLA output file for this "test1.pla" example,
you can run espresso-exact (Algorithm #1) on the input file.
To do this, first copy test1.pla to your own directly, then type:

	espresso -Dexact test1.pla > test1.sol

For simplicity, the "-t" trace option is omitted.
You can run with the option as well:

	espresso -Dexact -t test1.pla > test1.sol

______________________________________________________________________
______________________________________________________________________

			HOMEWORK #1, PROBLEM #1  


SET-UP:

You will be running the tools on examples from the SIS benchmark
suite.  These examples are accessible by following directions
on the SIS page.  They are downloadable as a tar file, or as
individual pla files.

These benchmark examples are also discussed in detail in Rudell's thesis,
at the end of chapter 2, which you are being given as Handout #8.

You will be looking at benchmark examples in three subdirectories
of the "ex" directory:  "indust", "math" and "random".  

Rudell classifies the examples from "indust" and "math"
into 5 categories (Table 2.2, p. 35):

		trivial
		noncyclic
		cyclic-solved ("cyclic-s")
		cyclic-unsolved ("cyclic-us")
		too many primes ("primes")

You will use the following 12 examples from subdirectories 
"indust", "math" and "random", and from the class web page; 
their classifications are also listed:

	"indust":

		prom1 (non-cyclic)
		bcc (cyclic-s)

		lin.rom (cyclic-s)
		x6dn (cyclic-s)

		max1024 (cyclic-us)
		cps (cyclic-s)

		spla  (cyclic-s)
		x7dn (primes)

		jbp (primes)

	"math":  

		add6 (noncyclic)
		tial (cyclic-s)

	"random":  
		
		test1


Note that 3 problems (x7dn, jbp, max1024) are classified as "unsolvable" 
in Rudell's table, using exact minimization.   In addition, 1 more example
is not listed in Rudell's table (test1).   
In principle, you should not be able to synthesize an 'unsolvable'
examplle using espresso-exact (Algorithm #1).

HOWEVER:  you should still still try running these benchmarks
for Algorithm #1, WITH A TIMEOUT OF 5 MINUTES.  In some cases, you may
find that earlier metrics for 'unsolvable' (from 1980's) may no
longer apply for modern machines! 

You can also find all of them in Rudell's PhD Thesis, pp. 38-40
(except the random "test1" benchmark).    
This table lists relevant information on the examples, as well 
as the results of runs using espresso-exact:  

	# inputs/outputs, 
	# initial product terms (in the *initial* unoptimized cover,
		i.e. the given PLA file in the examples subdirectory), 
	# primes implicants, 
	# essential prime implicants, 
	# products (i.e. primes) in the exact minimum-cost solution.   

Also, take a look at the actual .pla files for these examples,
before running them.  You should also read the relevant discussion 
in Rudell, pp. 31-36.  

______________________________________________________________________

DO THE FOLLOWING:

COPY the above 12 example pla files into your own directory.  
Also, be sure you have access to the espresso program.  (See above.)

    ==================================================================  

    A.  RUNS.  For each of the 12 examples, run the 4 algorithms
	described above.  Redirect the output into a file for each run.  
	The result will be 48 PLA output files, each describing a cover 
	for one of the examples using one of the algorithms.

	NOTE:  for the 3 unsolved benchmarks, and the 1 benchmark
	with unknown results (test1), see above.  You may or may not be 
	able to complete a run under 'espresso-exact'.  You still should
	do these runs, in verbose mode, with a TIMEOUT OF 5 MINUTES
	(as indicated above), to see where the algorithm gets stuck,
	or whether you can solve them.

	EXAMPLE:  If the PLA file bcc is in your current 
	directory, and you want to run Algorithm #1, espresso-exact, 
	type:
		espresso -Dexact -t bcc > bcc.sol1

	where sol1 indicates solution using Algorithm #1.  

		==> DO **NOT** HAND IN THESE FILES.

    ==================================================================  

    B.  RESULTS TABLE.  Using the header information in each 
    	of the 48 solution (.sol) files, set up the following table
	(again, the 3 'unsolved' benchmarks, you won't be able
	to complete an 'espresso-exact' run).


	1. FOR EACH OF THE 12 EXAMPLES, LIST:

		(a)  # INPUTS/# OUTPUTS

		(b) TOTAL # OF PRIMES* 

		(c) TOTAL # OF ESSENTIAL PRIMES* 

	NOTE:  The (a), (b) and (c) values are also listed
	Rudell's table (pp. 38-40).  

	*NOTE:  for Algorithm #2, which is the only algorithm
	you run which uses single-output minimization, there
	will be separate primes and essential primes for each 
	individual output function, and so the count will grow 
	very large.  Therefore, for (b) and (c), only write
	the single result for multi-output minimization, as 
	listed in Rudell's table.

	Also, the random benchmark "test1" is not included in Rudell.
	You will have to obtain these #'s from the 'espresso-exact'
	output file.


		(d)  I+O = TOTAL I/O 
		      The 'total I/O' = # INPUTS + OUTPUTS:  add these 
		      two numbers, and enter the single resulting sum 
		      in a  separate column.  This number is one metric 
  		      of the 'size' of the function.

		(e)  E/P RATIO: 
		      The E/P ratio =  total # essentials/total # primes,
		      i.e. (c)/(b).  This ratio is one metric of how
		      much the minimization problem reduces to the
		      problem of finding the function's essential
		      primes.

		(f)  E/C RATIO:
		      The E/C ratio =  
		           total # essentials/# products in exact solution
		      Here, the # products in exact solution is the result
		      of running Algorithm #1 on each benchmark.
		      The total # of essentials is given by (c).
		      This ratio is another metric of how
		      much the minimization problem reduces to the
		      problem of finding the function's essential
		      primes.


	2. FOR *EACH* OF THE 4 RUNS ON EACH EXAMPLE, LIST:

		(a) TOTAL RUNTIME
		    See the output of the run, as follows:
			Alg. #1/#2:    "exact     Time was [XXX] sec..."
			Alg. #3/#4:    "ESPRESSO  Time was [XXX] sec..."

		(b) RUNTIME RATIO
		    For Algorithms #2/3/4, indicate the ratio of
		    its runtime to that of Algorithm #1.
		    For example, for each benchmark, for Algorithm #2
		    	the ratio would be: 
		    	runtime(algorithm #2)/runtime(algorithm #1).

		(c) COVER SIZE (= quality of solution)
		    The cover size is the # of products in the 
		    final solution.  See the same line of output as (a):
		      "... Time was XXX sec, cost is c=X(Y)"; 
		       Here, 'X' is the answer you want =
			  # of products (cubes); it is also
			  indicated by the ".p" field
			  at top of the resulting pla solution file.

		(d) COVER SIZE RATIO
		    For Algorithms #2/3/4, indicate the ratio of
		    its cover size to that of Algorithm #1.
		    
		    For example, for each benchmark, for Algorithm #2
		    	the ratio would be: 
		    	cover-size(algorithm #2)/cover-size(algorithm #1)

		==> HAND IN THIS TABLE.

    ==================================================================  

     C. ANALYSIS OF RESULTS

     	In this part, you are to evaluate several parameters of each
	benchmark, and propose any correlation (or lack thereof)
	between the runtime improvements and quality of results of 
	algorithms #2-4, and these parameters.

	WHAT TO DO:  

	FOR EACH QUESTION BELOW, clearly analyze the results, highlighting and 
	justifying any trends (or lack of trend), indicating any anomalies, 
	and providing any explanation of the observed patterns of behavior.  

	==> Be sure to discuss results for several individual benchmarks,
	    to justify your arguments.

	GRADING NOTE:  Your writeup will be graded on how thoroughly and
	insightfully you cover the different comparisons.
	(A quick cursory evaluation will have points deducted.)

	FACTORS TO CONSIDER:
		     - total # of primes  
		     - total # of essentials
		     - total I/O
		     - E/P ratio
		     - E/C ratio
		       ... anything else of interest

	(a) ALGORITHM #1 vs. ALGORITHM #2:   EXACT METHOD COMPARISON

	Compare RUNTIME and COVER SIZE results for the benchmarks,
	and explain the differences observed.
	(3-4 SENTENCES)

	(b) ALGORITHM #1 vs. ALGORITHM #3:  EXACT vs. HEURISTIC COMPARISON

	Consider all of the above "FACTORS".  Highlight any 
	interesting cases, where the two algorithms have similar
	results, where they diverge significantly.
	Conclude with any trends, or partial trends, or lack of
	trends, as well as hypotheses on observed behavior.
	Also, draw any overall conclusions about the tradeoffs of 
	espresso vs. espresso-exact.
	(6-8 SENTENCES)

	(c) ALGORITHM #3 vs. ALGORITHM #4:  HEURISTIC ALTERNATIVES COMPARISON

	Do similar analysis to (b), again, but now comparing the
	baseline heuristic algorithm, espresso (#3), with the fast version (#4).
	Draw any overall conclusions about the tradeoffs of espresso vs. 
	espresso fast mode.
	(6-8 SENTENCES)
	
	==> HAND IN THIS WRITTEN ANALYSIS. 


    ==================================================================  
	
     D.  OPTIMIZATION EXPERIMENTS

	Try out the two optimizations listed above, -Dopo and -Dpair, 
 	on 2 examples:  
	
		tial (math)
		add6 (math)

	For each of the two benchmark examples, do the following:  

		 (a) OUTPUT PHASE OPTIMIZATION:
		     Re-run espresso (Algorithm #3) on the original given
			PLA file with the '-Dopo' option
			(see "Optimization Modes" (a) above).

			NOTE:  the selected phase assignment for the outputs
			is given by the line: "#.phase...".

		 (b) USING 2-to-4 INPUT DECODERS: 
		    first, run espresso (Algorithm #3) on the original given
		     PLA file and save it: 
		     	 e.g. espresso -t tial > tial.sol.
		     Then run espresso in -Dpair mode *on this
		     already-minimized initial solution*:
		     	 e.g. espresso -t -Dpair tial.sol.
		     (see "Optimization Modes" (b) above;
		       the man page indicates that -Dpair should
		       be applied after initial minimization).
			 

	Report these two results, for these 2 examples.
	Briefly indicate any trends you see, as to quality of
	improvements.
	
______________________________________________________________________
______________________________________________________________________