Mertcan Temel

MULTIPLIER EXPERIMENT NOTES 2023

I collected my notes about gathering benchmarks and running tools for multiplier verification.

Gathering Benchmarks

Here is the list of generators we used when generating the designs given in the paper:

All combinations from multiplier generator from Homma/AOKI Labotory. The benchmarks used to be obtained here: https://www.ecsis.riec.tohoku.ac.jp/views/amg-e. However, it seems this website is taken down now. For 64x64, we gathered all the possible combinations when running our experiments, which amounts to 192 different design samples.
A small set of benchmarks are generated from https://github.com/amahzoon/genmul. This generator does not have Booth encoding, and it was very slow when generating larger multipliers.
Finally, a large set of multiplier are generated using https://github.com/temelmertcan/multgen. For isolated nxn-bit multipliers, we have all the possible combinations for all sizes up to 256x256-bit multipliers. This generator supports Booth radix-2/4/8/16. This generator also creates dot-product, multiply-add, and truncated and/or shifted multipliers.
I used a python script to also create over 10000 randomly parameterized benchmarks from this generator. These include multiply-add, dot-product, truncation, shifting, and random operand sizes. We used these to stress-test our tool. Other tools do not support these configurations.

As an example, below are the commands to generate an example testbench: git clone https://github.com/temelmertcan/multgen tem-multgen cd tem-multgen make ./multgen -type FourMult -tree WT -pp SB4 -adder KS -in1size 32 -in2size 32 This will generate a 32x32-bit multiplier with signed Booth radix-4 encoding with Wallace tree and Kogge-Stone adder. Name of the output file is Merged_WT_SB4_KS_32x32_noX_multgen.sv with top module name: Merged_WT_SB4_KS_32x32_noX

The noX suffix indicate that the design is free of Verilog don't-care (X) values. VeSCmul can handle these cases but it seems the yosys/abc route to generate the AIGs for other tools do not support Xes in the design, so this generator is instructed to output a design that specifically excludes X values. This is a subtle hardware verification problem that does not seem to be addressed frequently in publications.

Running VeSCmul

Thousands of experiments are run for VeSCmul, including randomly genererated designs. These are packaged into a zip file for others to run easily. A makefile and some python scripts create random designs, run the proofs automatically, and collect them into a single text file. Only thing that users need to do is build ACL2, which should be a more or less straightforward process.

Running RevSCA2

First step is to create AIGs from Verilog designs. There are various ways to do this. I used yosys and abc because they are easier to call from shell when running hundreds or thousands of experiments. Apparently, this is what the owners of RevSCA and Amulet also use.
Below are the commands we used to parse combinational Verilog designs and convert them to AIGs:
yosys Merged_WT_SB4_KS_32x32_noX_multgen.sv -p "hierarchy -top Merged_WT_SB4_KS_32x32_noX" -p proc -p techmap -p flatten -o tmp2.blif abc read "tmp2.blif" strash write "origdesign.aig"
These commands will create an AIG file named origdesign.aig.
I pulled revsca from this git repo: https://github.com/amahzoon/RevSCA-2.0. It appears that the source code is not available and only an executable (that runs on Linux) is given.
Then revsca2 can be run as follows:

./revsca2 origdesign.aig revsca.out -s

Some notes and observations:

I measured the time consumed only when revsca is running for proof-time averages and not Verilog processing times.
For some multipliers (e.g., Merged_WT_SB4_KS_32x32_noX), revsca2 claimed them to be buggy, and printed: "The multiplier is buggy! Remainder:[1;37m-92233...."
This seems to be a false-negative as Amulet and VeSCmul could prove this multiplier correct. For other similar configurations, revsca said some of them to be correct and some to be buggy, where all of them are actually proved to be correct.
To run the 1000+ experiments given in the paper, I created a python script to run all of them and collect the results in a systematic manner.

Running AMulet2

Followed the same steps when running Revsca to create AIGs from Verilog: yosys Merged_WT_SB4_KS_32x32_noX_multgen.sv -p "hierarchy -top Merged_WT_SB4_KS_32x32_noX" -p proc -p techmap -p flatten -o tmp2.blif abc read "tmp2.blif" strash write "origdesign.aig"
These commands will create an AIG file named origdesign.aig.
Pull AMulet2 from https://github.com/d-kfmnn/amulet2.
Amulet2 requires other tools to be installed for the complete verification flow. These are:
- libgmp (https://gmplib.org/) to build amulet2
- Cadical SAT solver to check equivalence after adder substitution. (we used version 1.2.1)
- DRAT-trim to check the proof done by Cadical
- Pacheck2 to check the correctness of proof run by amulet2 (https://github.com/d-kfmnn/pacheck2)
After building all the tools, below commands are used to verify multipliers and check the proof certificates: src/amulet2/amulet -substitute origdesign.aig out2.cnf out2.aig src/cadical/build/cadical out2.cnf out2.proof src/amulet2/amulet -certify out2.aig out2.polys out2.pac out2.spec -signed src/drat-trim/drat-trim out2.cnf out2.proof src/pacheck2/pacheck out2.polys out2.pac out2.spec

Some notes and observations:

I measured the total time from these events excluding Verilog processing times.
Amulet2 sometimes returned 1, which usually indicate an error in unix, but apparently this program does not actually mean an error. For example, in the substitute stage, it looks like returning 0 vs 1 is to indicate that the program made a change in the AIG.
Certificates seem to occupy a large space in disk. I did not save the exact numbers but as multipliers grow, I was running out of disk space (allocated 200GB).
Some problem in Amulet2 seems to be causing timeout in a large majority of the benchmarks I tried. It looks like amulet2 is working ok for Homma designs (#1 multiplier generator above) but not others (at least #3) . The proofs were running for a very long time even for simple 8x8 multipliers. This is observed in April 2023. I notified the owner of the amulet2 repo of the problem. Because of this, I ended up using Amulet1 when I collected proof-time results.

Running AMulet1

Very similar to the commands for amulet2. Details may be added later.