Artifact for Automated and Scalable Verification of Integer Multipliers by M. Temel, A. Slobodova and W. Hunt
This page serves as a guidance to reproduce the experiment results and test our tool for new inputs as described in our paper "Automated and Scalable Verification of Integer Multipliers" by Mertcan Temel, Anna Slobodova, Warren A. Hunt, Jr. to be published in CAV 2020 [available here]. In this paper, we present a new method to verify various integer multipliers efficiently. We implemented and tested our method using the ACL2 theorem prover. We ran experiments on various multipliers including Wallace-tree like multipliers with Booth encoding with sizes up to 1024x1024. In the first section below, we list the steps to be taken in order to run the same experiments as given in the paper to reproduce the results. In the following section, we describe how users may verify some other integer multiplier circuits.
1. REPRODUCING THE RESULTS
1.1 Running the benchmarks
We provide two options for users to run our benchmarks. First option is to use a VM image that has all the necessary tools and multiplier designs. Users may reproduce our results without having to install anything else. However, the size of this VM image is very large (~8.7GB) and this may cause issues for some users. The second option has only smaller multiplier designs (64x64-bit and 128x128-bit) and utilizes user's own ACL2 installation. This option may be a better alternative for people with more limited disk space and bandwidth (download size is 68MB).
1.1.1 Option 1: From a VM image
We provide a zipped VM image in OVA format generated with VirtualBox. [Download ~8.7GB]. This is a minimal Linux Debian image without any GUI, and it has all the tools and benchmarks included. It is set up for anyone to rerun our experiments easily.
We have tested this image on a Late 2014 model iMac 4 GHz Quad-Core Intel Core i7 with a 32 GB memory on OS X Catalina with VirtualBox 6.1. We set the memory of this image to be 24GB. Our experiments for very large circuits require a lot of memory. If you want to run the experiments for 512x512 or 1024x1024-bit multipliers, you should not reduce this memory limit and your host machine should have at least 32 GB of memory. If you do not have access to a system with this much memory, then you can reduce the memory of the image with respect to the available sources in your host machine, and still run some of the smaller experiments. The CPU speed does not need to be as high for any of the experiments.
Below are the steps to reproduce our results:
Step 1. Load the provided VM image using Virtual Box, or a similar program of your preference. Upon booting the image, you should see a terminal prompting for login credentials. Username is "root" and the password is "mert" (without quotes). If you need to use multiple terminals, you can switch to a different one with alt+right or alt+left.
debian login: root
Step 2. The home directory is /root/. Benchmarks and a Makefile are located under /root/benchmarks/. Type:
Step 3. ACL2 has a mechanism to save certificates for finished proofs so that the user wouldn't have to run the same proofs again. We released this VM image with proofs already run and certificates saved. You may see the timing results from our run in "results.txt" (use "less results.txt" to view the contents of that file). See Section 1.2 to interpret the results. In order to clean those certificates and run the experiments again, type "make clean".
Step 4. In order to reduce the size of the VM image, we zipped some very large design files. Before running the proofs again, you need to unzip them. If you would like to unzip only the files for 64x64 and 128x128-bit multipliers, you can type "make unzip-small". If you would like to unzip everything, you can type "make unzip"
Step 5. You may run different make commands to run the experiments. If you want to run all the experiments you may run "make all". If you want to run a specific size of multipliers you may run "make 64" (for 64x64-bit multipliers), "make 128", "make 256", "make 512" or "make 1024". If you cannot run larger multipliers, start from the smallest and increase as memory allows (if memory runs out, the program may be terminated by the OS without any warning). All the experiments finished in around 90 minutes on our machine. Note that there is a constant overhead while starting ACL2, therefore the total run time will be greater than the total accumulated verification time.
make 64 && make 128 ...
Step 6. The timing results from these experiments are parsed using a python script that goes through the certificate files and puts all the timing results in "result.txt" on the same directory (/root/benchmarks/). See the Section 1.2 to interpret the results.
"result.txt" will only have the timing results for each benchmark. If you would like to see the actual theorems proved for these multipliers, see the discussion in Section 1.4. Additionally, "result.txt" will only have results from our tool. We did not set up the tools from related work in this image to save space. You can set up these tools on the same image if you'd like.
1.1.2 Option 2: Using a local ACL2 installation
As a more convenient alternative to the VM image to produce our experiment results, we provide only the smaller benchmarks (64x64 and 128x128-bit multipliers) that you can verify with a local ACL2 installation. These benchmarks are available at: [Download 68MB]
In order to verify these benchmarks, follow the steps below:
Step 1. Install the most recent ACL2 from GitHub. More detailed instructions can be found here We recommended using SBCL as the Common Lisp implementation (faster) but CCL has been shown to work as well.
Step 2. Add ACL2 executable and cert.pl (acl2-directory/books/build/cert.pl) to PATH such that the commands acl2 and cert.pl execute respective utilities.
Step 3. Unzip the downloaded benchmark file. A Makefile and some LISP files (ACL2 scripts) are prepared to verify various 64x64 and 128x128-bit multiplier designs from three different multiplier generators. If you execute "make", your ACL2 will verify these designs.
Step 4. The timing results from these experiments are parsed using a python script. This script goes through the certificate files and puts all the timing results in "result.txt" on the same directory. See the Section 1.2 to interpret the results.
You may use these scripts to verify other designs automatically from the same benchmarks. Section 1.3 details how the files are organized.
1.2 Interpreting "result.txt"
When experiments finished running, the timing results for each multiplier correctness proofs are collected and saved in file "result.txt" under the main directory of the benchmarks. You may see the content of that file by typing "less result.txt".
An example snippet from "result.txt":
If you ran all the experiments, you will see timing results for almost 100 different multiplier circuits. Note that there are more benchmarks here than we listed in our paper, so you may see some results that are not present in our paper.
In result.txt, a descriptor line for a benchmark is followed by its timing results. An example descriptor line: "64x64 - UNSIGNED - SCA BP-4:2-CLA". This format and abbreviations are the same as in our paper. First part of the descriptor is the multiplication size (can be 64x64 through 1024x1024), the second is signed or unsigned, the third is the generator name (HOM, TEM, or SCA) and the last one is the type of multiplier with the same acronyms as listed in the paper. The complete list of such acronyms is given as follows.
Partial Product Encoding:
- BP = Booth Encoding
- SP = Simple Partial Products
- WT = Wallace Tree
- DT = Dadda Tree
- OS = Overturned Stairs Tree
- BDT = Balanced Delay Tree
- 4:2 - 4:2 Compressor Tree
- CWT = Counter-based Wallace Tree
- AR = Array Multiplier (this is a regular shaped multiplier, not Wallace-tree like)
Final Stage Adder:
- BK = Brent-Kung
- HC = Han-Carlson
- KS = Kogge-Stone
- LF = Ladner-Fischer
- RBCLA = Ripple-block Carry-look-ahead
- CLA = Carry Look-ahead
- CSKF = Carry skip adder with fixed block size
- RC or RP = Ripple-carry
Note that the timing results we give in our paper are produced on a machine with 32 GB memory and Intel(R) Xeon(R) CPU E1270 @ 3.50GHz (a 2011 model). If you run these experiments with a more limited memory (say 24GB), the garbage collector will be deployed more frequently, and you may see slightly poorer results for some larger multipliers.
1.3 Organization of Benchmarks
In this section, we describe how benchmarks are organized and how our Makefile runs and collects the results.
There are three sub-directories for each multiplier benchmark generator. These are: tem, sca-genmul and homma. Each of these folders have three relevant sub-directories: verilog-files, parsed-modules, and proofs3.
All the benchmarks in their originally generated forms (Verilog) are stored under <generator-name>/verilog-files/.
We need to convert these Verilog files to SVL format (refer to the preliminaries in the paper) in order to reason about them in ACL2. This is done with *.lisp files under <generator-name>/parsed-modules/. Those *.lisp files are ACL2 "books" that can be included by other books or during an interactive ACL2 session, and they contain parsed and SVL versions of each benchmark. You may view one of those LISP files by going into those directories in order to see how we parsed those Verilog designs.
The ACL2 books for proofs are located under <generator-name>/proofs3/. Here you will see various sets of LISP files, each stating (provable) conjectures for correctness of the benchmarks. Our Makefile calls ACL2 to certify these books, and calls a python script, which parses all the *.cert.out files under these directories to collect the timing results for each multiplier proof in a single file "result.txt". You may view these LISP files to see what is proved, but the theorems are generated automatically using a macro. Translating these macros may be challenging for users who are not familiar with ACL2. These macros also save the theorems in a separate file in a more user-friendly fashion. See Section 1.4 for a more detailed discussion.
Also note that we use the utility "cert.pl" to have ACL2 save certificates for *.lisp. Our Makefile actually calls "cert.pl" to save the certificates for each *.lisp book. If you would like to certify a book, you can type "cert.pl <file-name>".
1.4 Viewing the Statement of Actual Theorems
In this section, we describe how users may view the statement of theorems proved for multiplier designs.
A static library we proved and use to verify multiplier designs are distributed with ACL2. Users may view the LISP files under /root/acl2/books/projects/rp-rewriter/lib/mult3. The events in these files follow the rewriting algorithm described in our paper. However, we implemented our rewriting method using some complex meta rules. Therefore, the events in these files may be very difficult to interpret for an inexperienced ACL2 user.
During our proofs, we do not introduce any axioms, and we prove everything with the default configuration of ACL2. The discussion of axioms that ACL2 is built upon are given in the source file /root/benchmarks/acl2/axioms.lisp. The only part we have not verified is translation of Verilog designs into SVL designs, which is out of scope of our multiplier verification work.
When we run a proof for a specific design in our benchmarking tests, a macro automatically generates the conjectures to prove. This macro also saves the statement of these conjectures in a separate file when all the proofs are finished. After you follow the instructions in Section 1.1 and run the make command, *.summary files will be generated and saved under /root/benchmark/<generator-name>/proofs3/ directory. For example, if you run "make" or "make 64", you will see "64x64-proof.summary" files for each generator. You may view these files to see the theorems proved for each multiplier design. A snippet from such a file is given below.
This snippet has some user-friendly comments and two events stating the conjectures to be proved specific for a multiplier module (in this case, a 64x64 Dadda Tree multiplier with simple partial products and Brent-Kung final stage adder). For our method, we reason about adder modules before the multiplier module, so we first prove that the final stage adder is equivalent to our adder specification. Then, we prove the multiplier design correct. The *.summary file also has the theorems for half/full-adder but we omit these here for brevity.
This snippet has more components than described in our paper (we had summarized it for simplicity). In this snippet, "defthmrp" calls our verified clause processor that is designed specifically for conjectures that can grow into very large terms as multiplier designs. The keyword next to "defthmrp" is a unique name, and will be the name of the theorem proved. "svl::svl-run-phase-wog" is a primitive function that "svl-run" (described in our paper) calls. Since multiplier designs are combinational we call this function instead of "svl-run" to have a slightly better performance. "4vec-adder" is our specification function for vector-adder functions, and it is later rewritten in terms of s and c functions. "svl::bits" selects a portion of the given vector. "loghead" is equivalent to "truncate". And, the other part this snippet are fixed components that we have for all the multiplier design proofs.
2. USING THE TOOL FOR OTHER MULTIPLIER DESIGNS (NEW INPUT)
You need to have the most recent ACL2 installed from GitHub development snapshot. Alternatively, you can use the ACL2 installation in the VM image. However, it is recommended that you install ACL2 in your own machine for better performance. A detailed documentation to use our multiplier verification tool is available online. If you are familiar with ACL2, this documentation should be enough to get you started using our tool on new designs.
Users who have not installed ACL2 before can do so by following the instructions here. For the Common Lisp implementation, we recommend SBCL (faster for our proofs) or CCL (manages memory better which is essential for large proofs).
After you build ACL2, it is recommended to set up shortcuts for the ACL2 executable and book-certifier. They are located under your ACL2 directory (saved_acl2) and (books/build/cert.pl).
ACL2 contains many books that are developed by its users over the years. If you certify all the books, it would take around 4 hours. To run our tool, you do not need to certify all the books but some very large libraries that parses Verilog designs. To make sure that all the necesary libraries are certified, navigate to books/projects/rp-rewriter/lib/mult3/demo/ and execute "cert.pl *.lisp -j 4" (without quotes). You can change the argument to 'j' depending on the CPU cores available in your system. This might take almost an hour, but it will certify all the necessary books to use our tool. If you see "Built" messages for the demo files in this directory, then the result will be a success. If you encounter any issues, you can send an email. It is also worth noting that many users choose Emacs when working with ACL2. There is also a more user-friendly ACL2 version (ACL2s) that some users might prefer.
You can look at the online documentation or read the lisp files under books/projects/rp-rewriter/lib/mult3/demo/. We hope that it is enough information for users to test our tool for new inputs. We would also appreciate any feedback to improve this documentation as well as the tool.
If you would like to use the provided VM image, you can start ACL2 by simply typing "acl2" (without quotes). If you plan to test a large multiplier (say 512x512), it is recommended to use "acl2-ccl" (and use set-max-mem as described in the demo web-site above), which is slower but manages memory better. We use this for 512x512 and 1024x1024-bit multipliers. The "acl2" command starts ACL2 compiled with SBCL, and "acl2-ccl" starts ACL2 compiled with CCL.
If you plan to test some other circuits using the same generators as we have, you may simply copy/modify the existing files and add your circuit. Please refer to Section 1.3 to understand the organization of our benchmarks.