Tags: tutorial
Tutorials 01–07 used the transpiler as a black box: write a @qkernel, call
transpiler.transpile(...), get an executable. This chapter opens the box.
It is aimed at contributors — readers who want to:

- Debug a kernel that fails somewhere between tracing and emission
- Write a custom compiler pass
- Add a new backend (e.g., a different quantum SDK)
- Simply understand what transpile() actually does
We will walk a small @qkernel through the pipeline stage by stage using the
step-by-step public API on Transpiler, inspect the intermediate
representation at each step, and compare how two backends (Qiskit and
QURI Parts) turn the same plan into different circuits.
# Install the latest Qamomile through pip!
# !pip install qamomile

1. The Pipeline at a Glance¶
Transpiler.transpile() is documented in
qamomile/circuit/transpiler/transpiler.py as a composition of ten passes.
The stages fall into four bands — frontend → inlining → analysis →
emission — separated by BlockKind transitions:
QKernel
│ to_block (tracing: Python AST → IR)
▼
Block [HIERARCHICAL]
│ substitute (optional rule-based replacement)
│ resolve_parameter_shapes (concretise Vector shape dims)
│ inline (remove CallBlockOperations)
▼
Block [AFFINE]
│ unroll_recursion (iterated inline ↔ partial_eval)
│ affine_validate (safety net for affine types)
│ partial_eval (constant fold + compile-time ifs)
│ analyze (dependency graph + I/O validation)
▼
Block [ANALYZED]
│ validate_symbolic_shapes (reject unresolved Vector dims)
│ plan (segment into C→Q→C)
▼
ProgramPlan
│ emit (backend-specific code generation)
▼
ExecutableProgram[T]

Every pass is idempotent and exposed as a public method on Transpiler, so
you can run them one at a time and print the Block in between. That is the
single most useful debugging technique in Qamomile.
2. IR Vocabulary¶
Before we run any pass, let’s name the things we will be printing.
Block (qamomile.circuit.ir.block) is the container that flows through
the pipeline. It holds:
- operations: ordered list of Operation instances
- input_values / output_values: SSA Values for the kernel's signature
- parameters: dict of unbound parameter names to their Values
- kind: a BlockKind tag (TRACED, HIERARCHICAL, AFFINE, or ANALYZED) indicating which invariants currently hold
BlockKind is the pipeline’s state machine. Each pass has a precondition
on kind and advances it on success. The progression is monotone:
TRACED → HIERARCHICAL → AFFINE → ANALYZED

Aside: why is it called AFFINE?¶
In programming-language type theory, types are classified by how many times a value may be used:
| Flavour | Uses allowed | Example |
|---|---|---|
| Unrestricted (ordinary types) | 0 or more times | Python int, a classical bit — values you can copy freely. |
| Affine | at most once (0 or 1 times) | A qubit. |
| Linear | exactly once (discarding is forbidden) | Values you must consume. |
Qubits are affine because of the no-cloning theorem: the same quantum
state cannot be duplicated. Once q has been consumed by qmc.h(q), the
old q value is gone — only the new version q' is usable. That is
exactly what “at most once” means.
Discarding a quantum value without using it is still allowed (the final measurement is what typically consumes it, but forgetting a qubit is not a type error). This is why Qamomile picks affine rather than linear — linear types would require every qubit to be explicitly consumed.
BlockKind.AFFINE means the block has reached a shape where this affine
invariant (“each quantum value is used at most once”) can be checked.
AffineValidationPass does the actual check and raises AffineTypeError
on a violation.
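The invariant itself is easy to state in plain Python. The use-once checker below is purely illustrative (a toy, not how AffineValidationPass is implemented), but it captures what "each quantum value is used at most once" means operationally:

```python
# Toy affine-use checker: each SSA version of a quantum value may be
# consumed at most once. Discarding (never using) a value is fine.

class AffineTypeError(Exception):
    pass

def check_affine(ops):
    """ops: list of (consumed_values, produced_values) pairs."""
    consumed = set()
    for uses, _produced in ops:
        for v in uses:
            if v in consumed:
                raise AffineTypeError(f"{v} used more than once")
            consumed.add(v)

# q@v0 is consumed by h, which produces q@v1 -- fine:
check_affine([(["q@v0"], ["q@v1"]), (["q@v1"], ["q@v2"])])

# Reusing the already-consumed q@v0 violates "at most once":
try:
    check_affine([(["q@v0"], ["q@v1"]), (["q@v0"], ["q@v2"])])
except AffineTypeError as e:
    print("rejected:", e)
```

Note that a value that is produced but never consumed passes the check, which is exactly the affine (rather than linear) choice described above.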
Value (qamomile.circuit.ir.value) is an SSA-style typed value. Not
only Qubits but every IR value — Float, UInt, Bit, and so on — is
represented as a Value. Whenever the value is updated (by a gate, a
classical operation, or assignment), Value.next_version() produces a
fresh copy with a new version and uuid; the logical_id, type, and
metadata are preserved.
logical_id is a stable identifier that says “this is still the same
logical variable across SSA versions” — e.g. q = qmc.h(q) creates a new
Value whose logical_id matches the old one. It is not a mapping to
a physical qubit; backend qubit allocation happens later in emit via the
ResourceAllocator. The same mechanism is reused for classical values
such as Float parameters and Bit measurement results.
Metadata can tag a value as a parameter (with_parameter("theta")) or a
constant (with_const(2.0)).
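The versioning scheme is small enough to sketch. ToyValue below is a hypothetical stand-in for qamomile.circuit.ir.value.Value, showing only the next_version() behaviour described above (field names are illustrative):

```python
import uuid as uuid_mod
from dataclasses import dataclass, field, replace

@dataclass(frozen=True)
class ToyValue:
    logical_id: str   # stable across SSA versions
    type_name: str
    version: int = 0
    uuid: str = field(default_factory=lambda: uuid_mod.uuid4().hex)

    def next_version(self) -> "ToyValue":
        # Fresh version + uuid; logical_id and type are preserved.
        return replace(self, version=self.version + 1,
                       uuid=uuid_mod.uuid4().hex)

q0 = ToyValue(logical_id="q", type_name="Qubit")
q1 = q0.next_version()   # roughly what `q = qmc.h(q)` does to the IR value
assert q1.logical_id == q0.logical_id and q1.version == q0.version + 1
assert q1.uuid != q0.uuid
```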
Operation is the base of the operation hierarchy. Subclasses include:
| Subclass | Purpose | File |
|---|---|---|
| GateOperation | H, RX, CX, … | ir/operation/gate.py |
| MeasureOperation | Measurement | ir/operation/measurement.py |
| ForOperation, IfOperation, WhileOperation | Control flow | ir/operation/control_flow.py |
| CallBlockOperation | Call to another Block (removed by inline) | ir/operation/call_block_ops.py |
All control-flow ops implement the HasNestedOps protocol
(nested_op_lists() / rebuild_nested()) so passes can walk into loop and
branch bodies uniformly, without special-casing each operation type.
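The payoff is that generic traversals need no per-type knowledge. A self-contained sketch with toy classes (not Qamomile's), counting every operation including those nested inside loop bodies:

```python
# Toy ops: anything exposing nested_op_lists() can be walked generically.

class Gate:
    def __init__(self, name):
        self.name = name

class For:
    def __init__(self, body):
        self.body = body
    def nested_op_lists(self):
        return [self.body]

def count_ops(ops):
    total = 0
    for op in ops:
        total += 1
        # Same recursion works for any future control-flow op type.
        for child in getattr(op, "nested_op_lists", lambda: [])():
            total += count_ops(child)
    return total

prog = [Gate("h"), For([Gate("h"), Gate("cx"), For([Gate("rz")])])]
assert count_ops(prog) == 6  # 3 top/body gates + 2 fors + 1 nested gate
```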
Every Operation also reports an operation_kind (QUANTUM, CLASSICAL,
HYBRID, CONTROL) — this is what the plan stage uses to segment the
block into classical / quantum / expval steps.
import qamomile.circuit as qmc
from qamomile.circuit.ir import pretty_print_block
from qamomile.circuit.ir.operation.call_block_ops import CallBlockOperation
from qamomile.circuit.ir.operation.control_flow import ForOperation
from qamomile.qiskit import QiskitTranspiler
transpiler = QiskitTranspiler()

3. The Running Example¶
We need a kernel small enough to print but rich enough to exercise multiple stages:
- A helper @qkernel (to exercise inline)
- A UInt parameter that drives qmc.range(n) (to exercise partial_eval)
- A Float parameter we will keep unbound (to exercise emit's parameter handling)
@qmc.qkernel
def entangle_pair(q0: qmc.Qubit, q1: qmc.Qubit) -> tuple[qmc.Qubit, qmc.Qubit]:
    """Helper subroutine. Inlined into its caller."""
    q0 = qmc.h(q0)
    q0, q1 = qmc.cx(q0, q1)
    return q0, q1

@qmc.qkernel
def demo_kernel(n: qmc.UInt, theta: qmc.Float) -> qmc.Vector[qmc.Bit]:
    q = qmc.qubit_array(n, name="q")
    q[0] = qmc.h(q[0])
    for i in qmc.range(n - 1):
        q[i], q[i + 1] = entangle_pair(q[i], q[i + 1])
        q[i + 1] = qmc.rz(q[i + 1], theta)
    return qmc.measure(q)

We will transpile with n=3 bound at compile time and theta kept as a
backend parameter.
def summarise(block):
    """Compact summary of a Block — we will call this after every pass."""
    by_kind = {}
    for op in block.operations:
        by_kind[type(op).__name__] = by_kind.get(type(op).__name__, 0) + 1
    return (
        f"kind={block.kind.name:13s} "
        f"ops={len(block.operations):>2d} "
        f"breakdown={by_kind}"
    )

When the one-line summary is not enough,
qamomile.circuit.ir.pretty_print_block returns an MLIR-style textual
dump of the block — the fastest way to see what changed between two
passes. The depth argument controls how many layers of
CallBlockOperation to expand inline, so e.g. depth=1 previews what
inline will produce without running it.
4. Stage-by-Stage Walkthrough¶
We will now run each pass by hand. The summarise helper prints one line
per stage so the BlockKind and operation mix are easy to compare; drop
in pretty_print_block wherever you want the full picture.
4.1 to_block — tracing the Python function¶
to_block executes the decorated function under a tracer context. Every
qmc.h(...), qmc.range(...), and entangle_pair(...) call records an
Operation into the Block. Calls to other @qkernels become
CallBlockOperations — the body is not inlined yet.
bindings = {"n": 3}
parameters = ["theta"]
block = transpiler.to_block(demo_kernel, bindings=bindings, parameters=parameters)
print("after to_block: ", summarise(block))
print("parameters: ", list(block.parameters))
print(
    "CallBlockOps: ",
    sum(1 for op in block.operations if isinstance(op, CallBlockOperation)),
)
# Note: `CallBlockOperation`s may live inside a `ForOperation` body too —
# they are not necessarily in the top-level list.

after to_block:  kind=HIERARCHICAL ops= 5 breakdown={'QInitOperation': 1, 'GateOperation': 1, 'BinOp': 1, 'ForOperation': 1, 'MeasureVectorOperation': 1}
parameters: ['theta']
CallBlockOps: 0
Let’s look at the block itself. pretty_print_block renders it as
MLIR-style text; you can see that the for body still contains a live
call entangle_pair(...).
print(pretty_print_block(block))

block demo_kernel [HIERARCHICAL] (n: UIntType, theta: FloatType) -> BitType {
  parameters: [theta]
  %q@v0 = QInitOperation()
  %q[const(0)]@v1 = h(%q[const(0)]@v0)
  %_@v0 = const(3) - const(1)
  for %i in range(const(0), %_@v0, const(1)) {
    %_@v0 = %i@v0 + const(1)
    call entangle_pair(%q[%i@v0]@v0, %q[%_@v0]@v0) -> (%q[%i@v0]@v1, %q[%_@v0]@v1)
    %_@v0 = %i@v0 + const(1)
    %_@v0 = %i@v0 + const(1)
    %q[%_@v0]@v1 = rz(%q[%_@v0]@v0, θ=param(theta))
    %_@v0 = %i@v0 + const(1)
  }
  %q_measured@v0 = measure_vector(%q@v0)
}
With depth=1, the CallBlockOperation is expanded inside its call line
— the same shape inline will produce in the next stage, so you can
preview it without actually running the pass.
print(pretty_print_block(block, depth=1))

block demo_kernel [HIERARCHICAL] (n: UIntType, theta: FloatType) -> BitType {
  parameters: [theta]
  %q@v0 = QInitOperation()
  %q[const(0)]@v1 = h(%q[const(0)]@v0)
  %_@v0 = const(3) - const(1)
  for %i in range(const(0), %_@v0, const(1)) {
    %_@v0 = %i@v0 + const(1)
    call entangle_pair(%q[%i@v0]@v0, %q[%_@v0]@v0) -> (%q[%i@v0]@v1, %q[%_@v0]@v1) {
      %q0@v1 = h(%q0@v0)
      %q0@v2, %q1@v1 = cx(%q0@v1, %q1@v0)
      return %q0@v2, %q1@v1
    }
    %_@v0 = %i@v0 + const(1)
    %_@v0 = %i@v0 + const(1)
    %q[%_@v0]@v1 = rz(%q[%_@v0]@v0, θ=param(theta))
    %_@v0 = %i@v0 + const(1)
  }
  %q_measured@v0 = measure_vector(%q@v0)
}
The block is HIERARCHICAL: it may still contain calls to other blocks and
composite gates. block.parameters mirrors the parameters=["theta"]
argument we passed in. Any input not in parameters must either be bound
in bindings (like n) or consumed by trace-time Python code.
4.2 inline — flattening nested block calls¶
inline replaces every CallBlockOperation with the operations of the
target block, substituting SSA values so the result stays well-formed. Once
no CallBlockOperation remains, the block transitions to AFFINE.
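The substitution itself is mechanical. The toy sketch below (ops modelled as plain tuples, nothing from Qamomile's API) shows the core move: splice the callee's operations into the caller, renaming the callee's formal values to the call-site arguments:

```python
# Toy inliner: replace a call op with the callee's body, substituting values.

def inline_call(caller_ops, call_index, callee_ops, arg_map):
    """Ops are (name, operands) tuples; arg_map maps callee formal value
    names to the caller's actual value names."""
    body = [
        (name, tuple(arg_map.get(v, v) for v in operands))
        for name, operands in callee_ops
    ]
    return caller_ops[:call_index] + body + caller_ops[call_index + 1:]

caller = [("h", ("q0",)), ("call", ("q0", "q1"))]
callee = [("h", ("a",)), ("cx", ("a", "b"))]   # entangle_pair, roughly
flat = inline_call(caller, 1, callee, {"a": "q0", "b": "q1"})
assert flat == [("h", ("q0",)), ("h", ("q0",)), ("cx", ("q0", "q1"))]
```

The real pass additionally re-versions SSA values so the spliced body stays well-formed; the renaming shown here is the essence.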
def count_calls(ops):
    total = 0
    for op in ops:
        if isinstance(op, CallBlockOperation):
            total += 1
        # Walk nested control-flow bodies so we count calls inside loops.
        for child in getattr(op, "nested_op_lists", lambda: [])():
            total += count_calls(child)
    return total

block = transpiler.inline(block)
print("after inline: ", summarise(block))
print("CallBlockOps (deep):", count_calls(block.operations))
print("is_affine: ", block.is_affine())

after inline:  kind=AFFINE ops= 5 breakdown={'QInitOperation': 1, 'GateOperation': 1, 'BinOp': 1, 'ForOperation': 1, 'MeasureVectorOperation': 1}
CallBlockOps (deep): 0
is_affine:  True
Pretty-printing again confirms that call entangle_pair(...) has vanished
and its body (h / cx) sits directly inside the for. The block’s
kind has advanced to AFFINE.
print(pretty_print_block(block))

block demo_kernel [AFFINE] (n: UIntType, theta: FloatType) -> BitType {
  parameters: [theta]
  %q@v0 = QInitOperation()
  %q[const(0)]@v1 = h(%q[const(0)]@v0)
  %_@v0 = const(3) - const(1)
  for %i in range(const(0), %_@v0, const(1)) {
    %_@v0 = %i@v0 + const(1)
    %q0@v1 = h(%q[%i@v0]@v0)
    %q0@v2, %q1@v1 = cx(%q0@v1, %q[%_@v0]@v0)
    %_@v0 = %i@v0 + const(1)
    %_@v0 = %i@v0 + const(1)
    %q[%_@v0]@v1 = rz(%q[%_@v0]@v0, θ=param(theta))
    %_@v0 = %i@v0 + const(1)
  }
  %q_measured@v0 = measure_vector(%q@v0)
}
Notice what inline does not do: the loop is still a single ForOperation.
Its body now contains the GateOperations from entangle_pair's body, executed
once per iteration. Inlining preserves control flow; unrolling, if any,
happens later in emit.
4.3 partial_eval — constant folding and compile-time if removal¶
partial_eval is composed of two sub-passes:
- ConstantFoldingPass — folds BinOp / CompOp nodes whose operands are all constants (or bound parameters) into literal values. Because we bound n=3, the n - 1 inside qmc.range(n - 1) collapses to 2, making the ForOperation bounds concrete.
- CompileTimeIfLoweringPass — when an IfOperation's condition resolves at compile time, it is replaced by the selected branch's operations. Measurement-backed IfOperations are left alone.
Note that ForOperation itself is not unrolled here. Loop unrolling, if
needed, is decided later by LoopAnalyzer during emit (see section 5),
so the ForOperations count does not drop in this stage.
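Constant folding is a small recursive rewrite. A self-contained toy version (expression trees as tuples; not ConstantFoldingPass's actual code) shows how a bound parameter behaves like a constant while an unbound one stays symbolic:

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def fold(expr, bindings):
    """expr: ("binop", op, lhs, rhs) | ("const", v) | ("param", name)."""
    tag = expr[0]
    if tag == "const":
        return expr
    if tag == "param":
        # A bound parameter behaves like a constant at compile time.
        name = expr[1]
        return ("const", bindings[name]) if name in bindings else expr
    _, op, lhs, rhs = expr
    lhs, rhs = fold(lhs, bindings), fold(rhs, bindings)
    if lhs[0] == rhs[0] == "const":
        return ("const", OPS[op](lhs[1], rhs[1]))
    return ("binop", op, lhs, rhs)

# With n bound to 3, the loop bound n - 1 collapses to the literal 2:
assert fold(("binop", "-", ("param", "n"), ("const", 1)), {"n": 3}) == ("const", 2)
# Unbound theta stays symbolic:
assert fold(("binop", "*", ("param", "theta"), ("const", 2)), {"n": 3})[0] == "binop"
```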
block = transpiler.partial_eval(block, bindings=bindings)
print("after partial_eval:", summarise(block))
print(
    "ForOperations: ",
    sum(1 for op in block.operations if isinstance(op, ForOperation)),
)

after partial_eval: kind=AFFINE ops= 4 breakdown={'QInitOperation': 1, 'GateOperation': 1, 'ForOperation': 1, 'MeasureVectorOperation': 1}
ForOperations:  1
If you left a UInt unbound and tried to use it as a loop bound, the
downstream validate_symbolic_shapes pass would raise
QamomileCompileError with the name of the offending value. That is the
pass whose job it is to convert “this kernel isn’t actually compile-time
structured” into a readable error rather than a confusing crash later.
4.4 analyze — dependency graph and I/O validation¶
analyze builds a dependency graph over values and checks two invariants:

- The block's inputs and outputs are classical (quantum I/O is only allowed for subroutine blocks, not entrypoints).
- No OperationKind.QUANTUM operation receives a classical-typed operand whose value was computed from a measurement — concretely, the rotation angle in rx(q, theta) cannot be a classical value derived from an earlier measurement, because the backend would have to JIT a classical computation between measurement and gate.
This rule does not forbid dynamic quantum circuits: IfOperation and
WhileOperation are OperationKind.CONTROL, not QUANTUM, so control flow
conditioned on a measurement Bit (if bit: ..., while bit: ...)
passes the check. Quantum-typed values that survive a phi merge are also
explicitly exempt. Section 5 walks through which dynamic patterns are
allowed and which are rejected, with code examples.
On success, the block transitions to ANALYZED.
block = transpiler.analyze(block)
print("after analyze: ", summarise(block))

after analyze:  kind=ANALYZED ops= 4 breakdown={'QInitOperation': 1, 'GateOperation': 1, 'ForOperation': 1, 'MeasureVectorOperation': 1}
4.5 plan — segmenting into a ProgramPlan¶
plan walks the analyzed block, groups operations by OperationKind, and
assembles a ProgramPlan of ClassicalStep / QuantumStep / ExpvalStep
entries. The NisqSegmentationStrategy used by the default transpilers
enforces at most one QuantumStep — the canonical C→Q→C pattern.
plan = transpiler.plan(block)
for i, step in enumerate(plan.steps):
    seg = step.segment
    print(
        f" step {i}: {type(step).__name__} ({type(seg).__name__}, {len(seg.operations)} ops)"
    )
print("total unbound parameters:", list(plan.parameters))

 step 0: QuantumStep (QuantumSegment, 4 ops)
total unbound parameters: ['theta']
The quantum segment also carries qubit_values and num_qubits so emit
knows how many qubit lines the backend circuit needs before it starts
placing gates.
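The segmentation rule itself is easy to state as a toy function: group consecutive operations by kind and reject plans that would need more than one quantum segment. Everything below is illustrative, not NisqSegmentationStrategy's code:

```python
# Toy NISQ segmentation over a list of per-op kind tags ("C" / "Q").

def segment(kinds):
    segments = []
    for k in kinds:
        if segments and segments[-1][0] == k:
            segments[-1][1] += 1          # extend the current run
        else:
            segments.append([k, 1])       # start a new segment
    if sum(1 for k, _ in segments if k == "Q") > 1:
        raise ValueError("NISQ plan allows at most one quantum segment")
    return [(k, n) for k, n in segments]

# The canonical C -> Q -> C pattern is fine:
assert segment(["C", "Q", "Q", "Q", "C"]) == [("C", 1), ("Q", 3), ("C", 1)]
# Q -> C -> Q would need two quantum segments and is rejected:
try:
    segment(["Q", "C", "Q"])
except ValueError as e:
    print(e)
```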
4.6 emit — backend-specific code generation¶
emit hands the plan to an EmitPass for the target backend. The emit pass
allocates concrete qubit indices, then walks the quantum segment and calls
the backend’s GateEmitter protocol methods (emit_h, emit_rx, …) to
construct a native circuit.
executable = transpiler.emit(plan, bindings=bindings, parameters=parameters)
print("parameter_names: ", executable.parameter_names)
print()
print(executable.quantum_circuit)

parameter_names:  ['theta']
┌───┐┌───┐ ┌─┐
q_0: ┤ H ├┤ H ├──■───────────────┤M├─────────────────────────────
└───┘└───┘┌─┴─┐┌───────────┐└╥┘┌───┐ ┌─┐
q_1: ──────────┤ X ├┤ Rz(theta) ├─╫─┤ H ├──■───────────────┤M├───
└───┘└───────────┘ ║ └───┘┌─┴─┐┌───────────┐└╥┘┌─┐
q_2: ─────────────────────────────╫──────┤ X ├┤ Rz(theta) ├─╫─┤M├
║ └───┘└───────────┘ ║ └╥┘
q_3: ─────────────────────────────╫─────────────────────────╫──╫─
║ ║ ║
c: 3/═════════════════════════════╩═════════════════════════╩══╩═
0 1 2
The surviving parameter is exactly the one we kept (theta). All structural
decisions — qubit count, loop unrolling, which CX sits where — were resolved
at compile time.
4.7 Passes we skipped¶
Five passes are part of transpile() but we did not call them explicitly:
- substitute — applies user-configured SubstitutionRules to replace block targets or override composite-gate strategies. No-op when the TranspilerConfig has no rules.
- resolve_parameter_shapes — fills in {name}_dim{i} shape dims when bindings provides a concrete Vector or Matrix value, so that arr.shape[0] resolves to a concrete UInt downstream.
- unroll_recursion — fixed-point loop of inline ↔ partial_eval for self-recursive @qkernels (e.g. Suzuki–Trotter — see Tutorial 07). Terminates when the recursion bottoms out, or raises if the bindings do not make the base case reachable.
- affine_validate — safety net that catches affine-type violations that slipped past frontend checks.
- validate_symbolic_shapes — rejects unresolved Vector shape dims reaching a ForOperation bound, with an actionable error message.
They are idempotent and cheap, so transpile() always runs them. As a pass
author you mostly care about the order: substitute and
resolve_parameter_shapes run before inline; affine_validate runs
after it; validate_symbolic_shapes runs after analyze so the
dependency graph is available.
5. Control Flow (if / for / while) Through the Pipeline¶
How the pipeline handles control flow spans several layers — what the frontend accepts, how each pass transforms it, and whether the backend supports runtime branching. This section ties those layers together. See tutorial 05 for the user-facing patterns; here we focus on the compiler’s view.
5.1 What the Frontend Accepts¶
@qkernel rewrites the function AST before tracing (see
ControlFlowTransformer in qamomile/circuit/frontend/ast_transform.py):
- Python if → emit_if(cond, true_branch, false_branch, ...)
- Python for → for_loop(start, stop, step) or for_items(dict) context managers

Because of this, using native Python if / for on runtime values behaves
intuitively: both branches / every iteration are traced. Supported loop
sources:

- qmc.range(n) — symbolic bounds; n can stay unbound and survive as a ForOperation in the IR.
- qmc.items(d) — for dicts / sparse data. Always unrolled at compile time (ForItemsOperation).
- A bare for i in <runtime_value>: — rejected. Always go through qmc.range(...) or qmc.items(...).
while loops use the while_loop context manager. The condition must be
a measurement-backed Bit — classical variables or constants are
rejected downstream by ValidateWhileContractPass.
5.2 IR Representation¶
From qamomile/circuit/ir/operation/control_flow.py:
| Operation | Nested lists | Condition / bounds | Notes |
|---|---|---|---|
| ForOperation | operations (body) | operands = [start, stop, step] (all UInt) | carries loop_var name |
| ForItemsOperation | operations (body) | operands[0] is a DictValue | always unrolled at transpile time |
| IfOperation | true_operations, false_operations | operands[0] is a Bit | phi_ops merge values post-branch |
| WhileOperation | operations (body) | operands[0] (initial), operands[1] (loop-carried) | measurement-backed Bit required; optional max_iterations hint |
All four implement HasNestedOps, so passes walk into bodies uniformly via
nested_op_lists() / rebuild_nested() — no isinstance chains.
IfOperation carries Phi nodes (PhiOp) to merge values. When both
branches update the same logical value, readers past the if refer through
the phi to know which branch’s version they get.
5.3 Per-Pass Behavior¶
| Pass | IfOperation | ForOperation | WhileOperation |
|---|---|---|---|
| inline | recurses into both branch bodies | recurses into body | recurses into body |
| partial_eval | constant condition → replaced by selected branch (CompileTimeIfLoweringPass); measurement condition preserved | bound BinOps folded. No unrolling | untouched here |
| analyze | phi edges enter the dependency graph | loop_var enters body deps | measurement-condition treated like a quantum operand |
| validate_symbolic_shapes | — | unresolved Vector shape dim as a bound → rejected | — |
| plan | OperationKind.CONTROL — creates a segment boundary | same | same |
| emit | emitted as runtime if if the backend supports it | LoopAnalyzer.should_unroll() decides unroll vs native-loop | emitted as runtime while |
LoopAnalyzer.should_unroll()
(transpiler/passes/emit_support/loop_analyzer.py) unrolls when:
- Loop bounds depend on an outer loop variable (dynamic nesting)
- The body indexes an array with loop_var (e.g. q[i])
- loop_var appears in a BinOp (e.g. i + 1, 2 * i)
Our demo_kernel uses both q[i] and q[i + 1], so the indexing and BinOp
triggers fire and the loop is unrolled at emit time — that’s why
executable.quantum_circuit is a flat sequence of CXs for two iterations.
A loop that hits none of these stays in the circuit as a native runtime
loop for backends that support it.
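As a self-contained illustration of the indexing and BinOp triggers, here is a toy predicate over Python source snippets. The real LoopAnalyzer inspects IR operations rather than source text, so treat this purely as a sketch of the decision shape:

```python
import ast

def should_unroll(body_src: str, loop_var: str) -> bool:
    """Toy heuristic: unroll when loop_var feeds array indexing or
    arithmetic anywhere in the body."""
    tree = ast.parse(body_src)
    for node in ast.walk(tree):
        # loop_var used as (part of) an array index, e.g. q[i] or q[i + 1]
        if isinstance(node, ast.Subscript):
            for inner in ast.walk(node.slice):
                if isinstance(inner, ast.Name) and inner.id == loop_var:
                    return True
        # loop_var inside arithmetic, e.g. i + 1 or 2 * i
        if isinstance(node, ast.BinOp):
            for inner in (node.left, node.right):
                if isinstance(inner, ast.Name) and inner.id == loop_var:
                    return True
    return False

assert should_unroll("q[i] = h(q[i])", "i") is True       # indexing with i
assert should_unroll("acc = acc + delta", "i") is False   # i not used
```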
5.4 Quantum ↔ Classical Dependency Rule (analyze)¶
analyze enforces the invariant that quantum operations must not depend
on classical values derived from measurements:
# OK: a measurement Bit conditions a quantum gate
b = qmc.measure(q)
if b:
    q = qmc.x(q)

# NG: a classical value derived from a measurement feeds a quantum gate
b = qmc.measure(q)
x = some_classical(b)
q = qmc.rx(q, x)  # rejected by analyze

In the first case the Bit is only used as an IfOperation condition —
no quantum operand type is rewritten. The second case would require JIT
compilation, which Qamomile does not support today. The plan stage
enforcing a single quantum segment is the other side of this guarantee.
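The rule amounts to a small taint analysis: values produced by a measurement (and anything computed from them) are "tainted", and a tainted value may not appear as a quantum operand. The sketch below models operation kinds as strings and is not the analyze pass's implementation:

```python
def check_measurement_taint(ops):
    """ops: list of (kind, inputs, outputs) with kind in
    {"QUANTUM", "CLASSICAL", "MEASURE", "CONTROL"}.
    Returns the offending value name, or None if the block is clean."""
    tainted = set()
    for kind, ins, outs in ops:
        if kind == "QUANTUM" and tainted & set(ins):
            return (tainted & set(ins)).pop()
        # Measurement results, and values derived from them, are tainted.
        if kind == "MEASURE" or tainted & set(ins):
            tainted.update(outs)
    return None

ok = [("MEASURE", ["q"], ["b"]),
      ("CONTROL", ["b"], [])]             # if b: ...  (allowed: CONTROL)
bad = [("MEASURE", ["q"], ["b"]),
       ("CLASSICAL", ["b"], ["x"]),       # x derived from a measurement
       ("QUANTUM", ["q2", "x"], ["q2"])]  # rx(q2, x): rejected
assert check_measurement_taint(ok) is None
assert check_measurement_taint(bad) == "x"
```

Note how the CONTROL kind escapes the check, matching the exemption for measurement-conditioned if / while described above.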
5.5 Backend Runtime-Branching Support¶
Whether runtime if / while survives into the emitted circuit depends
on the backend’s MeasurementMode
(qamomile/circuit/transpiler/gate_emitter.py):
| Mode | Runtime if/while | Example |
|---|---|---|
| NATIVE | Supported — conditional gates are emitted explicitly | Qiskit (e.g. QuantumCircuit.if_test) |
| STATIC | Not supported — returns the pre-measurement state / operator | QURI Parts |
| RUNNABLE | Fully supported, including runtime loops / branches | CUDA-Q (cudaq.run() path) |
Compiling a kernel that contains an IfOperation / WhileOperation on a
non-supporting backend will raise at emit time. It is up to the
contributor to know which mode applies when writing runtime branching.
5.6 Common Errors¶
- ValidationError (analyze) — a classical value derived from a measurement was used as a quantum operand. Rewrite the pattern, or redesign to keep state on the quantum side.
- ValidateWhileContractPass error — the while condition is not a measurement-backed Bit. Classical variables or constants are not supported as while conditions.
- QamomileCompileError (validate_symbolic_shapes) — an unresolved Vector shape dim reached a ForOperation bound. Concretise the Vector via bindings, or switch to qmc.items.
- Emit-time error — a runtime if reached a MeasurementMode.STATIC backend. Switch backend, or express the kernel differently.
6. Case Study: How MeasureQFixed Gets Compiled¶
For cases like reading out a Quantum Phase Estimation result, Qamomile exposes a one-line API for measuring a quantum register as a fixed-point floating value:
qf = qmc.cast(q, qmc.QFixed, int_bits=0)  # Vector[Qubit] -> QFixed
phase = qmc.measure(qf)                   # QFixed -> Float

Let’s follow what happens at the IR level. MeasureQFixedOperation
enters the pipeline as a single OperationKind.HYBRID node — a logical
combination of a quantum measurement and a classical decode — and is
later split into the two halves just before segmentation.
6.1 A minimal demo kernel¶
A 3-qubit Hadamard superposition measured as a QFixed is enough to
exercise every step.
@qmc.qkernel
def qfixed_demo(n: qmc.UInt) -> qmc.Float:
    q = qmc.qubit_array(n, name="q")
    for i in qmc.range(n):
        q[i] = qmc.h(q[i])
    qf = qmc.cast(q, qmc.QFixed, int_bits=0)
    return qmc.measure(qf)

Run the pipeline up through analyze and inspect the block.
qfixed_bindings = {"n": 3}
qfixed_block = transpiler.to_block(qfixed_demo, bindings=qfixed_bindings)
qfixed_block = transpiler.inline(qfixed_block)
qfixed_block = transpiler.partial_eval(qfixed_block, bindings=qfixed_bindings)
qfixed_block = transpiler.analyze(qfixed_block)
print(pretty_print_block(qfixed_block))

block qfixed_demo [ANALYZED] (n: UIntType) -> FloatType {
  %q@v0 = QInitOperation()
  for %i in range(const(0), const(3), const(1)) {
    %q[%i@v0]@v1 = h(%q[%i@v0]@v0)
  }
  %q_as_qfixed@v0 = cast %q@v0 to QFixed[0.3]
  %qfixed_measured@v0 = measure_qfixed(%q_as_qfixed@v0)
}
The trailing operation is a single measure_qfixed, with the
cast %q to QFixed[0.3] immediately above it. MeasureQFixedOperation
carries operation_kind=HYBRID; real backends only have quantum-side
measurement instructions, so somewhere in the pipeline the
“measure N qubits” and “decode bits to float” halves must be pulled
apart.
6.2 Where the split happens — plan’s pre-segmentation lowering¶
The split is done by lower_operations() in
qamomile/circuit/transpiler/passes/separate.py, which the plan
pass runs before segmentation. Every MeasureQFixedOperation is
replaced by:
- MeasureVectorOperation — measures all qubits of the QFixed register as a Vector[Qubit] into a Vector[Bit] (OperationKind.HYBRID, grouped at the tail of the quantum segment).
- DecodeQFixedOperation — decodes the bit array into a Float using the fixed-point encoding (OperationKind.CLASSICAL).
After this rewrite, segmentation simply places the two ops in their natural homes: the measurement at the end of the quantum segment, the decode in a classical segment after it.
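The classical half is ordinary fixed-point arithmetic. A toy decoder follows; the bit-order convention here (first bit most significant) is an assumption for illustration, not Qamomile's documented encoding:

```python
def decode_qfixed(bits, int_bits: int) -> float:
    """Interpret bits as an unsigned fixed-point number with int_bits
    integer bits; the remaining bits are fractional."""
    value = 0
    for b in bits:                    # assumed MSB-first for this sketch
        value = (value << 1) | b
    return value / (1 << (len(bits) - int_bits))

# With int_bits=0, three bits encode eighths: 0b101 -> 5/8
assert decode_qfixed([1, 0, 1], int_bits=0) == 0.625
# With int_bits=1: 0b101 -> 5/4
assert decode_qfixed([1, 0, 1], int_bits=1) == 1.25
```

This is the kind of computation the classical_executor performs on the measured bit string at run time.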
You can watch the rewrite yourself by calling lower_operations
directly on the analyzed block.
from qamomile.circuit.transpiler.passes.separate import lower_operations
lowered = lower_operations(qfixed_block)
print(pretty_print_block(lowered))

block qfixed_demo [ANALYZED] (n: UIntType) -> FloatType {
  %q@v0 = QInitOperation()
  for %i in range(const(0), const(3), const(1)) {
    %q[%i@v0]@v1 = h(%q[%i@v0]@v0)
  }
  %q_as_qfixed@v0 = cast %q@v0 to QFixed[0.3]
  %qfixed_bits@v0 = measure_vector(%qfixed_qubits@v0)
  %qfixed_measured@v0 = decode_qfixed(%qfixed_bits@v0)
}
measure_qfixed has disappeared and been replaced by
measure_vector(...) + decode_qfixed(...). The final Float keeps
the same logical_id because lower_measure_qfixed reuses the
original result values, so value-based analyses are unaffected by
the split.
6.3 What reaches emit and the runtime¶
After lowering, the ProgramPlan lays out (conceptually):

- QuantumStep (quantum segment): the gate sequence plus the MeasureVectorOperation. The backend's emit_measure_vector iterates the qubits and issues a normal per-qubit emit_measure, writing results into classical registers.
- ClassicalStep (role=post) (classical segment): just the DecodeQFixedOperation. At run time, qamomile/circuit/transpiler/classical_executor.py takes the measured bit string and decodes it to a Float.
From the backend’s point of view there is no “QFixed measurement” instruction; the abstraction exists only in Qamomile’s IR. The quantum hardware only ever sees ordinary measurement instructions.
6.4 Which pass touches which IR¶
| Pass | MeasureQFixedOperation | MeasureVectorOperation | DecodeQFixedOperation |
|---|---|---|---|
| to_block | created | — | — |
| inline / partial_eval / analyze | passes through | — | — |
| plan (pre-segmentation lowering) | split and removed | created here | created here |
| emit | — | per-qubit emit_measure | not touched (lives in the classical step) |
| runtime | — | backend runner executes measurements | classical_executor decodes bits to Float |
Together with CastOperation (which re-labels a value’s type without
allocating fresh qubits), this is a clean illustration of Qamomile’s
“reinterpret the classical meaning of a quantum register without
disturbing the quantum resources” design pattern.
7. Backend Emission: Qiskit vs QURI Parts¶
Every backend plugs into the pipeline by implementing two protocols defined
in qamomile/circuit/transpiler/:
- GateEmitter[T] (gate_emitter.py): the “how do I draw a gate” API. Methods include create_circuit(num_qubits, num_clbits) -> T, create_parameter(name) -> Any, and ~40 per-gate entry points (emit_h, emit_rx, emit_cx, …). It also advertises a measurement_mode: MeasurementMode:

| Mode | Meaning | Used by |
|---|---|---|
| NATIVE | Backend has an explicit measurement instruction the emit pass should call. | Qiskit |
| STATIC | Backend takes the unmeasured state vector/operator; the sampler handles measurement externally. | QURI Parts |
| RUNNABLE | Backend supports mid-circuit measurement with runtime control flow. | CUDA-Q (cudaq.run() path) |

- CompositeGateEmitter[C] (passes/emit.py): optional. Lets a backend short-circuit composite gates (QFT, QPE, …) with a native implementation. The can_emit(gate_type) -> bool / emit(...) -> bool contract returns False to opt out, in which case the emit pass falls back to the library-level decomposition.
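To make the shape of the protocol concrete, here is a toy emitter whose “circuit” is just a list of instruction strings. The method names (create_circuit, create_parameter, emit_h, emit_cx) mirror those listed above; everything else is illustrative:

```python
from typing import List

class ListEmitter:
    """Toy GateEmitter-style backend: T = List[str]."""
    measurement_mode = "NATIVE"  # this toy backend has a measure instruction

    def create_circuit(self, num_qubits: int, num_clbits: int) -> List[str]:
        return []

    def create_parameter(self, name: str):
        return f"param:{name}"

    def emit_h(self, circuit: List[str], qubit: int) -> None:
        circuit.append(f"h q{qubit}")

    def emit_cx(self, circuit: List[str], control: int, target: int) -> None:
        circuit.append(f"cx q{control} q{target}")

emitter = ListEmitter()
circ = emitter.create_circuit(num_qubits=2, num_clbits=2)
emitter.emit_h(circ, 0)
emitter.emit_cx(circ, 0, 1)
assert circ == ["h q0", "cx q0 q1"]
```

The emit pass walks the quantum segment and drives exactly this kind of interface; swapping the emitter swaps the target SDK without touching the pipeline.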
A Transpiler subclass wires these together by overriding
_create_segmentation_pass and _create_emit_pass, plus executor() for
the runtime side. qamomile/qiskit/transpiler.py is the canonical ~50-line
reference implementation.
Let’s transpile the same kernel through QURI Parts and compare.
QURI Parts is an optional dependency — install with
pip install 'qamomile[quri_parts]' to reproduce this section locally.
try:
    from qamomile.quri_parts import QuriPartsTranspiler

    quri_transpiler = QuriPartsTranspiler()
    quri_exe = quri_transpiler.transpile(
        demo_kernel, bindings=bindings, parameters=parameters
    )
    print("backend circuit type: ", type(quri_exe.quantum_circuit).__name__)
    print("parameter_names: ", quri_exe.parameter_names)
    print()
    for gate in quri_exe.quantum_circuit.gates:
        print("  ", gate)
except ModuleNotFoundError:
    # ``qamomile[quri_parts]`` is an optional dependency group — skip the
    # side-by-side when it's not installed so this notebook still runs.
    print("QURI Parts is not installed; skipping the side-by-side output.")

backend circuit type:  LinearMappedParametricQuantumCircuit
parameter_names: ['theta']
QuantumGate(name='H', target_indices=(0,), control_indices=(), classical_indices=(), params=(), pauli_ids=(), unitary_matrix=())
QuantumGate(name='H', target_indices=(0,), control_indices=(), classical_indices=(), params=(), pauli_ids=(), unitary_matrix=())
QuantumGate(name='CNOT', target_indices=(1,), control_indices=(0,), classical_indices=(), params=(), pauli_ids=(), unitary_matrix=())
ParametricQuantumGate(name='ParametricRZ', target_indices=(1,), control_indices=(), pauli_ids=())
QuantumGate(name='H', target_indices=(1,), control_indices=(), classical_indices=(), params=(), pauli_ids=(), unitary_matrix=())
QuantumGate(name='CNOT', target_indices=(2,), control_indices=(1,), classical_indices=(), params=(), pauli_ids=(), unitary_matrix=())
ParametricQuantumGate(name='ParametricRZ', target_indices=(2,), control_indices=(), pauli_ids=())
Three differences worth calling out:
1. Circuit type. Qiskit emits a QuantumCircuit with embedded Parameter objects; QURI Parts emits a LinearMappedParametricQuantumCircuit whose parameters are QURI Parts Parameter instances. Both round-trip through Qamomile's parameter_names the same way.
2. Measurement. Qiskit's circuit ends in measure instructions (measurement_mode=NATIVE). QURI Parts' circuit has no measurement gates — its executor handles sampling at run time (measurement_mode=STATIC).
3. Composite gates. If the kernel used qmc.qft(...), Qiskit's QiskitQFTEmitter would drop in a QFTGate box, whereas the QURI Parts backend decomposes via the library pass — same IR, different realised circuit. You can override this per kernel via TranspilerConfig.with_strategies({"qft": "approximate"}).
8. Pointers for Contributors¶
Writing a custom pass. Put it in qamomile/circuit/transpiler/passes/,
take a Block in and return a Block out, and assert your input kind
precondition up front. When you walk operations, use HasNestedOps —
never isinstance(op, ForOperation) chains — so future control-flow ops
are handled automatically:
def rewrite(ops):
    new_ops = []
    for op in ops:
        if hasattr(op, "nested_op_lists"):
            op = op.rebuild_nested([rewrite(child) for child in op.nested_op_lists()])
        new_ops.append(transform(op))
    return new_ops

Adding a new backend. Minimum checklist:
1. Implement GateEmitter[T] for your target SDK (T is the SDK's circuit type). Start from qamomile/qiskit/emitter.py.
2. Subclass Transpiler[T] and implement _create_segmentation_pass (use NisqSegmentationStrategy unless you need something else) and _create_emit_pass returning StandardEmitPass(your_emitter).
3. Implement a QuantumExecutor[T] subclass so users can call executor().
4. Optional: add CompositeGateEmitters for QFT/QPE/etc. to preserve high-level structure in the emitted circuit.
Debugging a transpile error. Run the passes one at a time with
summarise(block) between them to track counts, and reach for
pretty_print_block(block) whenever you need to see the actual IR. The
stage where BlockKind fails to advance, the operation count explodes,
or an exception is raised is the stage you should look at first. Comparing
pretty_print_block(block, depth=N) output before and after inline makes it
much easier to spot where a value got disconnected or a phi got dropped.
9. Summary¶
The pipeline is an SSA-style IR moving through four kinds:
- HIERARCHICAL — the raw trace, with block calls still unexpanded
- AFFINE — flat operations + control flow, no block calls
- ANALYZED — validated, dependency-graphed, ready to segment
- ProgramPlan → ExecutableProgram[T] — segmented and emitted
Each pass has a narrow job and a precondition on BlockKind. The step-by-step
API on Transpiler exposes every pass publicly — treat it as your primary
debugging tool when a kernel misbehaves, and as the extension surface when
adding a pass or a backend.
Control-flow highlights:
- if / for are rewritten at trace time into IfOperation / ForOperation / ForItemsOperation / WhileOperation IR nodes
- partial_eval removes compile-time ifs; for-loop unrolling is decided later by LoopAnalyzer during emit
- analyze guarantees that quantum operations do not depend on classical values derived from measurements
- Whether runtime branching survives into the circuit depends on the backend's MeasurementMode (NATIVE or RUNNABLE required)