MULTIPLIER EXPERIMENT NOTES 2023
I collected my notes about gathering benchmarks and running tools for multiplier verification.Gathering Benchmarks
Here is the list of generators we used when generating the designs given in the paper:- All combinations from multiplier generator from Homma/AOKI Labotory. The benchmarks used to be obtained here: https://www.ecsis.riec.tohoku.ac.jp/views/amg-e. However, it seems this website is taken down now. For 64x64, we gathered all the possible combinations when running our experiments, which amounts to 192 different design samples.
- A small set of benchmarks are generated from https://github.com/amahzoon/genmul. This generator does not have Booth encoding, and it was very slow when generating larger multipliers.
- Finally, a large set of multiplier are generated using
https://github.com/temelmertcan/multgen.
For isolated nxn-bit multipliers, we have all the possible combinations for
all sizes up to 256x256-bit multipliers. This generator supports Booth
radix-2/4/8/16. This generator also creates dot-product, multiply-add, and
truncated and/or shifted multipliers.
I used a python script to also create over 10000 randomly parameterized benchmarks from this generator. These include multiply-add, dot-product, truncation, shifting, and random operand sizes. We used these to stress-test our tool. Other tools do not support these configurations.
As an example, below are the commands to generate an example testbench:
git clone https://github.com/temelmertcan/multgen tem-multgen
cd tem-multgen
make
./multgen -type FourMult -tree WT -pp SB4 -adder KS -in1size 32 -in2size 32
This will generate a 32x32-bit multiplier with signed Booth radix-4 encoding with Wallace tree and Kogge-Stone adder. Name of the output file isMerged_WT_SB4_KS_32x32_noX_multgen.sv
with top module name:Merged_WT_SB4_KS_32x32_noX
The noX suffix indicate that the design is free of Verilog don't-care (X) values. VeSCmul can handle these cases but it seems the yosys/abc route to generate the AIGs for other tools do not support Xes in the design, so this generator is instructed to output a design that specifically excludes X values. This is a subtle hardware verification problem that does not seem to be addressed frequently in publications.
Running VeSCmul
Thousands of experiments are run for VeSCmul, including randomly genererated designs. These are packaged into a zip file for others to run easily. A makefile and some python scripts create random designs, run the proofs automatically, and collect them into a single text file. Only thing that users need to do is build ACL2, which should be a more or less straightforward process.
Running RevSCA2
- First step is to create AIGs from Verilog designs. There are various
ways to do this. I used yosys and abc because they are easier to call from
shell when running hundreds or thousands of experiments. Apparently, this
is what the owners of RevSCA and Amulet also use.
Below are the commands we used to parse combinational Verilog designs and convert them to AIGs:
yosys Merged_WT_SB4_KS_32x32_noX_multgen.sv -p "hierarchy -top Merged_WT_SB4_KS_32x32_noX" -p proc -p techmap -p flatten -o tmp2.blif
abc
read "tmp2.blif" strash write "origdesign.aig" These commands will create an AIG file named origdesign.aig.
- I pulled revsca from this git repo: https://github.com/amahzoon/RevSCA-2.0. It appears that the source code is not available and only an executable (that runs on Linux) is given.
- Then revsca2 can be run as follows:
./revsca2 origdesign.aig revsca.out -s
Some notes and observations:
- I measured the time consumed only when revsca is running for proof-time averages and not Verilog processing times.
- For some multipliers (e.g., Merged_WT_SB4_KS_32x32_noX), revsca2 claimed them to
be buggy, and printed:
"The multiplier is buggy!
Remainder:[1;37m-92233...."This seems to be a false-negative as Amulet and VeSCmul could prove this multiplier correct. For other similar configurations, revsca said some of them to be correct and some to be buggy, where all of them are actually proved to be correct.
- To run the 1000+ experiments given in the paper, I created a python script to run all of them and collect the results in a systematic manner.
Running AMulet2
- Followed the same steps when running Revsca to create AIGs from Verilog:
yosys Merged_WT_SB4_KS_32x32_noX_multgen.sv -p "hierarchy -top Merged_WT_SB4_KS_32x32_noX" -p proc -p techmap -p flatten -o tmp2.blif
abc
read "tmp2.blif" strash write "origdesign.aig" These commands will create an AIG file named origdesign.aig.
- Pull AMulet2 from https://github.com/d-kfmnn/amulet2.
- Amulet2 requires other tools to be installed for the complete verification flow. These are:
- libgmp (https://gmplib.org/) to build amulet2
- Cadical SAT solver to check equivalence after adder substitution. (we used version 1.2.1)
- DRAT-trim to check the proof done by Cadical
- Pacheck2 to check the correctness of proof run by amulet2 (https://github.com/d-kfmnn/pacheck2)
- After building all the tools, below commands are used to verify multipliers and check the proof certificates:
src/amulet2/amulet -substitute origdesign.aig out2.cnf out2.aig
src/cadical/build/cadical out2.cnf out2.proof
src/amulet2/amulet -certify out2.aig out2.polys out2.pac out2.spec -signed
src/drat-trim/drat-trim out2.cnf out2.proof
src/pacheck2/pacheck out2.polys out2.pac out2.spec
Some notes and observations:
- I measured the total time from these events excluding Verilog processing times.
- Amulet2 sometimes returned 1, which usually indicate an error in unix, but apparently this program does not actually mean an error. For example, in the substitute stage, it looks like returning 0 vs 1 is to indicate that the program made a change in the AIG.
- Certificates seem to occupy a large space in disk. I did not save the exact numbers but as multipliers grow, I was running out of disk space (allocated 200GB).
- Some problem in Amulet2 seems to be causing timeout in a large majority of the benchmarks I tried. It looks like amulet2 is working ok for Homma designs (#1 multiplier generator above) but not others (at least #3) . The proofs were running for a very long time even for simple 8x8 multipliers. This is observed in April 2023. I notified the owner of the amulet2 repo of the problem. Because of this, I ended up using Amulet1 when I collected proof-time results.