Draft Plonk intermediate opcode representation


Hi all,

This is a very rough draft of a possible ‘intermediate representation’ to represent Plonk programs. The goal is not to write circuits in this language, but to present an abstraction layer that higher level languages can compile down to.

The IR will then need to be transformed into a lower-level polynomial language before finally being converted into a proving key/verification key and any associated proofs.

The goal of the IR is to be the lowest-level representation that preserves the ‘intent’ of a program. i.e. the IR describes ‘what’ to implement in a circuit, but not ‘how’.

Spec is at https://hackmd.io/@zacwilliamson/r1z1yZTs8


Looks really nice!

A couple of things I would like to point out.

Addition

Should we just use fan-in-3 and accept variadic arguments for the add inputs:
add -> o = a + b + c (optional).
Looks reasonable to me, but not sure if there’s anything that I’m missing.
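
For concreteness, here is a minimal Rust sketch of what a fan-in-3 add opcode could look like in the IR, with the third input optional; the Wire/Opcode names are mine, not from the spec:

    // Hypothetical IR encoding of the fan-in-3 `add` opcode; the third
    // input is optional, matching `o = a + b + c (optional)` above.
    struct Wire(usize);

    enum Opcode {
        Add { a: Wire, b: Wire, c: Option<Wire>, o: Wire },
    }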

PartialOrd ops

With the current definition/implementation that we have for PLONK, we should specify whether the operations performed by each opcode will provide an output, or simply write the corresponding constraints into the program memory. We may want to force them to provide an output, for example a boolean identifying whether the inequality holds or not, so that something can be done with the result.

The problem is that gates like the boolean_gate are good only for constraining, not for evaluating and providing a “true” or “false” result.
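
To illustrate the distinction (opcode names are hypothetical, not from the spec): a constrain-only opcode writes a constraint and returns nothing, while a computing opcode allocates an output wire that later gates can consume.

    struct Wire(usize);

    enum Opcode {
        /// Constrain-only: enforce a * a - a = 0; there is no result
        /// to branch on or combine later.
        AssertBool { a: Wire },
        /// Computing: allocate `o` and constrain o = (a < b), so the
        /// result can feed AND/OR gates or a final assertion.
        LessThan { a: Wire, b: Wire, o: Wire },
    }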

I have some other examples like these. Do you have any thoughts on that?

uint8/16/32/64
int8/16/32/64

I don’t see any reason to limit these types to power-of-2 widths. Integers with arbitrary bit widths are incredibly useful in circuit programming. bool can then be a synonym for u1.
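
A sketch of how width-parametrised integer types could be encoded, with bool as a synonym for u1 (type names are my own, not from the spec):

    // Integer types carry an arbitrary bit width rather than being
    // restricted to 8/16/32/64.
    struct UintTy { width: u32 } // uN for any N >= 1
    struct IntTy { width: u32 }  // iN likewise

    // `bool` is then just u1.
    const BOOL: UintTy = UintTy { width: 1 };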

@CPerezz is right about the boolean ops. Relation constraints should be expressed directly rather than in terms of boolean outputs. More generally, I personally favour embracing constraint programming, rather than imposing an input/output structure.

Types are needed for immutable and mutable tables (i.e. maps from an integer type to another arbitrary type).

I’m not sure I see the difference between writing an output and constraining.
When writing a SNARK program, I think it’s natural that
a=b
(where b is some expression in previous vars)
means we define a new var a and it is constrained to be equal to b

Hi, I have a proposal for floating point that uses 6 width-4 witness rows. This is on the arithmetic circuit side, and the gates require additional table lookups and range proofs, which also cost exponentiations.

Subject to corrections.

A floating point value is a triple, e.g. (x_1, x_2, x_3), representing sign, significand and exponent.
For floating point values, the range checks are (sketched in code after the list):

  1. x_1, y_1, z_1 in \{p-1, 1\}
  2. x_2, y_2, z_2 in significand range (e.g. 2^{22} \leq x < 2^{23})
  3. x_3, y_3, z_3 in exponent range (e.g. -2^{8}+2 \leq x < 2^{8}-1, with negative values -n represented by p-n)
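
Here is that check sketched out of circuit, mapping the field representation (sign in \{1, p-1\}, negative exponents -n stored as p-n) to signed integers; all names are illustrative:

    // Validate a (sign, significand, exponent) triple against the three
    // range checks above, using i64 stand-ins for field elements.
    fn in_range(sign: i64, sig: i64, exp: i64) -> bool {
        let sign_ok = sign == 1 || sign == -1;                         // check 1
        let sig_ok = (1i64 << 22) <= sig && sig < (1i64 << 23);        // check 2
        let exp_ok = -(1i64 << 8) + 2 <= exp && exp < (1i64 << 8) - 1; // check 3
        sign_ok && sig_ok && exp_ok
    }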

Addition gate: (6 rows)
Wire values: x_1, x_2, x_3, y_1, y_2, y_3, z_1, z_2, z_3, e, r, d, v, v', s_a, s_b, s_c, s_1, s_2, s_3, s_4, s_5, s_6, i. (Total: 24)
Total 6 rows of 4 witness wires each for 1 floating point gate. Max degree (including mul/add selector & 2 (dynamic) branch selectors): 4.

d = x_3 - y_3; the branch indicator i is obtained by looking up d in a table (the cases i = 1..4 below)

0 < d < 23 (s_1) and
s_3(z_2 = x_2 + v' and z_3 = x_3)
+s_4(2\times z_2 = x_2 + v' and z_3 = x_3 + 1)
+s_5(2\times z_2 = x_2 + v' + 1 and z_3 = x_3 + 1)
+s_6(z_2 = 2\times(x_2 + v') and z_3 = x_3 - 1)

i=1
v' = (y_1 \times x_1) \times v
y_2 = v \times e + r
0\leq r < e
Powers of two table: (d, e = 2^d) \in t^*
e=1, r=0 for d=0

Flip x, y for -23 < d < 0 (s_2, i=2).

23 \leq d < 255 (s_a): z_j = x_j, i=3
-255 < d \leq -23 (s_b): z_j = y_j, i=4

d = 0, s_c:
TODO

Require range proofs for d.
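
To make the branch structure concrete, here is a witness-computation sketch for the aligned branch (0 < d < 23, i.e. s_1) in plain i64 arithmetic; the struct and function names are mine, and the sub-cases mirror the s_3..s_6 selectors above (note that s_6 as written performs only a single renormalisation step):

    // Witness computation for the 0 < d < 23 (s_1) branch of the
    // addition gate, with i64 stand-ins for field elements.
    struct Fp {
        sign: i64, // x_1 in {-1, 1}
        sig: i64,  // x_2 in [2^22, 2^23)
        exp: i64,  // x_3
    }

    fn add_aligned(x: &Fp, y: &Fp) -> Fp {
        let d = x.exp - y.exp;
        assert!(0 < d && d < 23, "sketch covers only the s_1 branch");

        let e = 1i64 << d;                 // powers-of-two table: (d, e = 2^d)
        let v = y.sig / e;                 // y_2 = v*e + r with 0 <= r < e
        let v_prime = x.sign * y.sign * v; // v' = (y_1 * x_1) * v

        let t = x.sig + v_prime;           // unnormalised significand
        let (z2, z3) = if t >= 1i64 << 23 {
            ((t + (t & 1)) / 2, x.exp + 1) // s_4 (t even) / s_5 (t odd)
        } else if t >= 1i64 << 22 {
            (t, x.exp)                     // s_3: already normalised
        } else {
            (2 * t, x.exp - 1)             // s_6: one renormalisation step
        };

        // x dominates in magnitude when d > 0, so the result keeps x's sign.
        Fp { sign: x.sign, sig: z2, exp: z3 }
    }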

Multiplication gate: (3 rows)
Wire values: x_1, x_2, x_3, y_1, y_2, y_3, z_1, z_2, z_3, s_1, r. (Total: 11)

Sign check: x_1 \times y_1 = z_1
Note that 2^{44} \leq x_2 \times y_2 < 2^{46}.

If 2^{45} \leq x_2 \times y_2 < 2^{46} (s_1):
x_2 \times y_2 = z_2 \times 2^{23} + r
0 \leq r < 2^{23}
x_3 + y_3 + 1 = z_3

Else 2^{44} \leq x_2 \times y_2 < 2^{45} (1-s_1):
x_2 \times y_2 = z_2 \times 2^{22} + r
0 \leq r < 2^{22}
x_3 + y_3 = z_3

Note that for the multiplication gate, only one of the two branches can be satisfied. Hence only the single additional selector s_1 is needed, and no extra range checks are required to separate the branches.

For simplicity, any overflows result in an unsatisfied constraint.
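
The multiplication gate is simpler; a matching out-of-circuit sketch (same illustrative Fp struct as in the addition sketch) showing how the s_1 selector picks which power of two to divide back out:

    // Witness computation for the multiplication gate: the product of two
    // significands lands in [2^44, 2^46), and exactly one branch applies.
    struct Fp {
        sign: i64, // in {-1, 1}
        sig: i64,  // in [2^22, 2^23)
        exp: i64,
    }

    fn mul(x: &Fp, y: &Fp) -> Fp {
        let p = x.sig * y.sig;           // 2^44 <= p < 2^46, fits in i64
        let (z2, z3) = if p >= 1i64 << 45 {
            (p >> 23, x.exp + y.exp + 1) // s_1: x_2*y_2 = z_2*2^23 + r
        } else {
            (p >> 22, x.exp + y.exp)     // 1 - s_1: x_2*y_2 = z_2*2^22 + r
        };
        Fp { sign: x.sign * y.sign, sig: z2, exp: z3 }
    }

For example, squaring (1, 2^{22}, 0) gives p = 2^{44} < 2^{45}, so the 1 - s_1 branch applies and z = (1, 2^{22}, 0).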


Sorry @daira and @arielgabizon, maybe I didn’t express myself correctly, my bad.
My point was about going in the input/output direction.

I had a long discussion with Kev where I expressed the following:

Suppose you have something like:
a < b AND c = d OR e < b AND c < d.

That’s not possible to express with regular PLONK unless you write specific gadgets for it. And the main problem is that, even though one side of the OR is allowed not to hold (as long as the other does), the way PLONK gates are applied in the program memory will cause the whole proof to fail if any of the 4 checks does not hold.

This is a problem. I think an easy way to solve this kind of problem, and abstract it away from the user, is to allow them to get a boolean variable as the result of each <, >, = check.

That way, the user is able to AND, OR and XOR the results, and is also able to constrain them to be true or false (i.e. to hold or not).

This would be similar to writing if in your programs; it doesn’t force a dynamic circuit, but it allows you to evaluate multi-conditional statements.
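
As a sketch of the lowering this would enable (opcode names hypothetical), the example predicate becomes five computing gates plus one final assertion, so either side of the OR may fail without breaking the proof:

    struct Wire(usize);

    enum Opcode {
        LessThan { a: Wire, b: Wire, o: Wire }, // o = (a < b)
        Equal { a: Wire, b: Wire, o: Wire },    // o = (a == b)
        And { a: Wire, b: Wire, o: Wire },
        Or { a: Wire, b: Wire, o: Wire },
        AssertTrue { a: Wire },                 // the only opcode that forces anything
    }

    // a < b AND c = d OR e < b AND c < d lowers to:
    //   p = And(LessThan(a, b), Equal(c, d))
    //   q = And(LessThan(e, b), LessThan(c, d))
    //   AssertTrue(Or(p, q))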

Let \mathcal{R} be a relation on a set X, and let T, F \in X be some values such that \mathcal{R}(T) holds but \mathcal{R}(F) does not. If you have two circuits, one that asserts \mathcal{R}(x) and another that asserts ¬\mathcal{R}(x), then a circuit that computes b : \mathsf{bool} = \mathcal{R}(x) can be constructed like this:

  • assert b : \mathsf{bool}
  • assert \phantom{¬}\mathcal{R}(b\ ?\ x : T)
  • assert ¬\mathcal{R}(b\ ?\ F : x)

Notice that this generic construction is concretely quite inefficient; it costs 3 + \mathsf{cost}(\mathcal{R}) + \mathsf{cost}(¬\mathcal{R}) constraints in an R1CS-like circuit model. It is fairly easy to see that, if the circuits to assert \mathcal{R} and ¬\mathcal{R} are well-optimized, then a lower bound on the cost of a circuit to compute \mathcal{R} should be \max\big(\mathsf{cost}(\mathcal{R}), \mathsf{cost}(¬\mathcal{R})\big) constraints (otherwise we would be able to use it to further optimize the assertion circuits). But there may be a significant gap (up to about a factor of 2) between the above generic solution, and an optimal one for a particular \mathcal{R}. Indeed we can improve on the above solution for range constraints, but I know of no way to do so generically for any relation.

I conclude that, if we care about circuit cost up to a factor of 2, then a general purpose circuit language and/or gadget library must distinguish between the "assert \mathcal{R}(x)", "assert ¬\mathcal{R}(x)", and "compute b = \mathcal{R}(x)" cases, allow them to be separately optimized, and ideally provide built-in/preoptimized implementations of all three for relations that are most commonly used as building blocks.
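
To make this concrete, consider the hypothetical instance \mathcal{R}(x) := (x = 0) over a prime field, with T = 0 and F = 1. "assert \mathcal{R}(x)" is the single constraint x = 0, and "assert ¬\mathcal{R}(x)" is x \cdot x_{\mathsf{inv}} = 1 for a prover-supplied witness x_{\mathsf{inv}}. The generic construction for b = \mathcal{R}(x) then becomes:

  • b \cdot (b - 1) = 0 (b is boolean)
  • (b\ ?\ x : T) = b \cdot x, so the first assertion becomes b \cdot x = 0
  • (b\ ?\ F : x) = b + x - b \cdot x, so the second becomes (b + x - b \cdot x) \cdot w = 1 for a witness w

i.e. three constraints in total here (the selections fold into the assertions), whereas the well-known two-constraint is_zero gadget (b = 1 - x \cdot x_{\mathsf{inv}} together with x \cdot b = 0) computes the same b, illustrating the gap described above.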
