R1CS-like representations of PLONK programs

Here is an attempt to define a relatively low level language/intermediary representation (IR)
for turboPLONK programs that is similar to r1cs, in the sense of being defined by low-degree constraints between the program variables.
It is related but more high level to the definition of turboPLONK programs here

A program is defined by:
Selector Variables: q_1,...,q_l
Program degree d (usually d\in \{3,4\})
Program width w (usually w\in \{3,4\})
Gates: G_1,..,G_s
Each gate is a polynomial G(q_1,\ldots,q_\ell,X_1,\ldots,X_v) of degree at most d in the selector variables q_1,\ldots,q_l
and at most 2w additional variables.

Similar to r1cs, a program will have public variables, x_1,\ldots,x_\ell and private variables x_{\ell+1},\ldots,x_n.

Now the turboPLONK program will consist of a set of constraints of the following form:
Each constraint consists of a choice of gate G among G_1,\ldots,G_s, a choice of values
a_1,\ldots, a_\ell for the selector variables, a choice of program variables x_{i_1},\ldots,x_{i_v} to constrain.
The constraint itself enforces:
G(a_1,\ldots,a_\ell, x_{i_1},\ldots,x_{i_v}) =0

Given such a set of constraints and values c_1,\ldots,c_\ell for the public variables, TurboPLONK can be used to generate a zero-knowledge proof of knowledge for values c_{\ell+1},\ldots,c_n such that
c_1,\ldots,c_n satisfy all constraints.

Example:
This is what we call “Standard PLONK”.
We can use d=3,w=3 and only one gate, the arithmetic gate.
G(q_L,q_R,q_O,q_M,q_C,X_1,X_2,X_3) = q_L X_1 + q_R X_2 + q_O X_3 + q_M X_1 X_2 + q_C

This one gate, allows us in particular to do addition and multiplication.
E.g., if for program variables x,y,z we want to check
x+y = z.
We set q_L=1,q_R=1, q_O=-1, q_M=0,q_C=0 (and obviously the choice of program variables for the constraint to x,y,z).
To check x\cdot y =z we set
q_L=0,q_R=0, q_O=-1, q_M=1,q_C=0.

Maybe it’s more accurate to say a “constraint” is a subset of the gates G_1, \dots, G_s rather than a single gate? That way the cost is proportional to the number of constraints rather than proportional to the number of sets of constraints partitioned by the variables that they use?

Or maybe even more general than a subset, a map [s] \to \mathbb{F} but I guess this is not really useful since you can scale the coefficients in the gate if you want to.

So gates on same program variables can be combined without extra cost if they use disjoint selector variables; the compiler should perhaps check for such optimizations, but it’s not the case in general, so don’t know if it should be defined with subsets.
(I haven’t figured out a formula yet for the exact prover complexity as function of the gates.)

I think also for the programmer inserting the constraints, it’s perhaps easier not to think of the whole subset at once.

Here is a little thing we made for trying out writing programs with the standard plonk gate