Line | Branch | Decision | Exec | Source |
---|---|---|---|---|
1 | // Copyright 2019-2022 Cambridge Quantum Computing | |||
2 | // | |||
3 | // Licensed under the Apache License, Version 2.0 (the "License"); | |||
4 | // you may not use this file except in compliance with the License. | |||
5 | // You may obtain a copy of the License at | |||
6 | // | |||
7 | // http://www.apache.org/licenses/LICENSE-2.0 | |||
8 | // | |||
9 | // Unless required by applicable law or agreed to in writing, software | |||
10 | // distributed under the License is distributed on an "AS IS" BASIS, | |||
11 | // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | |||
12 | // See the License for the specific language governing permissions and | |||
13 | // limitations under the License. | |||
14 | ||||
15 | #include "GateNode.hpp" | |||
16 | ||||
17 | #include <tkassert/Assert.hpp> | |||
18 | ||||
19 | #include "BitOperations.hpp" | |||
20 | ||||
21 | /* | |||
22 | The following is intended to be helpful, but there are no guarantees | |||
23 | (that it IS helpful)... | |||
24 | ||||
25 | Suppose that a circuit has n qubits, and a gate acts on k qubits [q0, q1, ...], | |||
26 | where n >= k. Let M be the (2^n)*(2^n) unitary matrix representing the gate. | |||
27 | ||||
28 | Let U be the (2^k)*(2^k) unitary matrix which would represent the same gate | |||
29 | acting on qubits [0,1,2,...,k-1] of a k-qubit circuit. | |||
30 | ||||
31 | How do we obtain M from U, q0, q1, ..., k, n? Use ILO-BE convention. | |||
32 | ||||
33 | For a binary string 001010010... of length n, the vector | |||
34 | |001010010...> corresponds to an element of C^(2^n), | |||
35 | a 2^n-dimensional vector space, as follows: | |||
36 | ||||
37 | Replace each 0 by e(0) = (1 0)^T, and each 1 by e(1) = (0 1)^T, in C^2. | |||
38 | ||||
39 | Between these 2D vectors, insert tensor product symbols | |||
40 | (still in the same order, from left to right). | |||
41 | ||||
42 | The resulting vector in C^(2^n) has exactly one row, say p, containing 1, | |||
43 | and 0 everywhere else; and therefore equals e(p) in C^(2^n), | |||
44 | for some 0 <= p < 2^n. | |||
45 | ||||
46 | p is just the number of zeros above the entry 1. But, how to find p? | |||
47 | ||||
48 | p is exactly the usual integer value of the original binary string 001010010... | |||
49 | used to create the vector with tensor products. | |||
50 | (This is easy to prove, by induction on n). | |||
51 | ||||
52 | We need to find M|abcdefgh...z> for each length-n binary string abc...z. | |||
53 | ||||
54 | Now, "a" corresponds to qubit 0, "b" to qubit 1, ... Mark the bits | |||
55 | in positions [q0, q1,...] to get abc@e@@h...z, using "@" to denote a marked bit. | |||
56 | ||||
57 | M|abc@e@@h...z> comes from calculating U|###> on the bits @@@ alone, | |||
58 | where ### are the bits @@@, but permuted if necessary (because the gate acts on | |||
59 | qubits [q0, q1, q2, ...] which is not necessarily [0,1,2,...]). | |||
60 | ||||
61 | We "tear up" the values of U|###> and copy/move them around | |||
62 | to fill up the (2^n)*(2^n) matrix. | |||
63 | ||||
64 | In the final matrix M, there are 2^(n-k) copies of each nonzero entry u_{i,j} of | |||
65 | U, placed in various positions. The positions depend only on q0, q1, ... and | |||
66 | i,j,k,n, not on the value u_{i,j}. There are no collisions. Thus, M is given by | |||
67 | a linear function of the entries u_{i,j}. | |||
68 | ||||
69 | For each i,j with 0 <= i,j < k, define the (2^k)*(2^k) matrix D = D(i,j) by | |||
70 | ||||
71 | D_{r,s} = 1 if (r,s)=(i,j), and 0 otherwise. | |||
72 | ||||
73 | Thus, U = sum_{i,j} u_{i,j} D(i,j). | |||
74 | ||||
75 | So now, if we fix (i,j), replace U by D(i,j), and allow all bits to vary, | |||
76 | it defines a linear transformation represented by a matrix M(i,j), | |||
77 | and then the final matrix M we desire is just M = sum_{i,j} u_{i,j} M(i,j). | |||
78 | ||||
79 | [Note that the D matrices are not unitary any more, but that's OK; | |||
80 | the formula still makes sense, it is all just linear equations]. | |||
81 | ||||
82 | So, what happens with D? An alternative characterisation of | |||
83 | D = D(i,j) is: D(e(j)) = e(i), and D(e(t)) = 0 for all other t. | |||
84 | ||||
85 | So now, fix (i,j). The only way for M acting on |***@**@*@@***> | |||
86 | to be nonzero is for the bits @@@ to be (a suitable permutation of) | |||
87 | the bits ### which represent j in binary (because, then |###> = e(j)). | |||
88 | ||||
89 | Now D|###> = e(i). Hence, when M acts on the vector |***@**@*@@***>, | |||
90 | it changes the bits @@@ to (a suitable permutation of) the bits representing i. | |||
91 | ||||
92 | This, then, is the algorithm: | |||
93 | ||||
94 | - For any binary string T_k = b0.b1.b2.... of length k, | |||
95 | and any length n binary string S_n, | |||
96 | let f(T_k, S_n) be the length n binary string | |||
97 | obtained by overwriting the bit in position q(r) with b(r). | |||
98 | (Counting positions left to right). | |||
99 | ||||
100 | - Let g be the reverse operation: given any length n binary string S_n, | |||
101 | let g(S_n) = T_k be the unique length k binary string such that | |||
102 | f(T_k, S_n) = S_n. | |||
103 | (Of course, there are 2^{n-k} different S_n giving the same T_k). | |||
104 | ||||
105 | - Pick any nonzero element u_{i,j} of U. Thus 0 <= i,j < 2^k. | |||
106 | Consider any fixed x = S_n for which g(S_n) = T_k | |||
107 | represents the integer j. | |||
108 | Let y = f(T', x) where T' is the binary string | |||
109 | representing the integer i. | |||
110 | The linear transformation x -> y which maps all other length n strings | |||
111 | to zero (regarding them as e(p) in C^{2^n}, where p is exactly the integer | |||
112 | value of the binary string) is a (2^n * 2^n) matrix W with 1 in exactly one | |||
113 | position, in fact (y,x) exactly (converting the binary strings x,y to | |||
114 | integers), because W(e(x)) = e(y). | |||
115 | ||||
116 | - To build up M, let the value u_{i,j} be copied to position (y,x). | |||
117 | ||||
118 | - However, we can vary those bits of S_n which are NOT in the positions | |||
119 | [q0, q1, ...] to get different (y,x), but still with the same i,j | |||
120 | (because the i,j depend only on the bits at [q0, q1, ...]). | |||
121 | The x values are all distinct from each other (and so, too, are the y), | |||
122 | so we get 2^{n-k} distinct positions (y,x) where we place the value u_{i,j}. | |||
123 | ||||
124 | SPARSITY: U is (2^k)*(2^k). Let 0 < s <= 1 be the sparsity of U, | |||
125 | i.e. U has s.4^k nonzero entries. The final matrix M then has | |||
126 | 2^{n-k}.s.4^k = s.2^{n+k} nonzero entries, so dividing by 4^n, | |||
127 | the sparsity of M is s.2^{k-n}. Thus, even if U is dense, in most applications | |||
128 | M becomes very sparse as n increases since k <= 2 for almost all gates. | |||
129 | (In the few gates where k > 2, usually U is sparse, so that s << 1. | |||
130 | e.g., CCX has an 8x8 matrix U, with 8 nonzero entries, so s=1/8). | |||
131 | ||||
132 | This sparsity is essential. On an ordinary 2020 laptop, | |||
133 | for n=8, with 560 gates, the full unitary M is found in <1 second. | |||
134 | A single statevector for n=16, with 1120 gates, is found in ~3 seconds. | |||
135 | This is without any really fancy, clever optimisation. | |||
136 | ||||
137 | */ | |||
138 | ||||
139 | namespace tket { | |||
140 | namespace tket_sim { | |||
141 | namespace internal { | |||
142 | ||||
143 | namespace { | |||
144 | /** There are k distinct qubits [q0, q1, q2, ...] chosen from the set | |||
145 | * {0,1,2,...,n-1}. We want each binary string x of length k to have its bits | |||
146 | * extracted and moved to the positions [q0, q1, q2, ...] within a larger binary | |||
147 | * string of length n. | |||
148 | * | |||
149 | * Thus 0 <= x < 2^k if we interpret x as an unsigned integer in the usual way. | |||
150 | * | |||
151 | * Element[x] of the returned vector gives exactly the bits corresponding to x. | |||
152 | */ | |||
153 | struct LiftedBitsResult { | |||
154 | // Element[x] tells us exactly the result of translating the bits of x. | |||
155 | std::vector<SimUInt> translated_bits; | |||
156 | ||||
157 | // Simply the OR of all translated bits; has a "1" everywhere there is a slot | |||
158 | // for a translated bit to appear. | |||
159 | SimUInt translated_bits_mask; | |||
160 | ||||
161 | /** Set the data members, given the list of relevant qubits, and the full | |||
162 | * number of qubits from which they are chosen. | |||
163 | * @param qubits The distinct qubit indices chosen from {0,1,2,...,n-1}. | |||
164 | * @param full_number_of_qubits The value n, the number of qubits in the full | |||
165 | * set, from which "qubits" is extracted. | |||
166 | */ | |||
167 | void set(const std::vector<unsigned>& qubits, unsigned full_number_of_qubits); | |||
168 | }; | |||
169 | ||||
170 | 191037 | void LiftedBitsResult::set( | ||
171 | const std::vector<unsigned>& qubits, unsigned full_number_of_qubits) { | |||
172 | TKET_ASSERT(full_number_of_qubits >= qubits.size()); | |||
173 | TKET_ASSERT(full_number_of_qubits < 32); | |||
174 | ||||
175 |
2/4✓ Branch 2 taken 191037 times.
✗ Branch 3 not taken.
✓ Branch 5 taken 191037 times.
✗ Branch 6 not taken.
|
191037 | translated_bits.assign(get_matrix_size(qubits.size()), 0); | |
176 | ||||
177 | // Build up the elements by ORing with shifted bits. | |||
178 | 191037 | translated_bits_mask = 0; | ||
179 | 191037 | SimUInt k_string_bit = 1; | ||
180 | ||||
181 |
2/2✓ Branch 1 taken 274683 times.
✓ Branch 2 taken 191037 times.
|
2/2✓ Decision 'true' taken 274683 times.
✓ Decision 'false' taken 191037 times.
|
465720 | for (unsigned count = 0; count < qubits.size(); ++count) { |
182 | TKET_ASSERT(full_number_of_qubits >= qubits[count] + 1); | |||
183 | ||||
184 | // This will be a bit within the length n string. | |||
185 | 274683 | SimUInt long_string_bit = 1; | ||
186 | 274683 | long_string_bit <<= | ||
187 | 274683 | (full_number_of_qubits - (qubits[qubits.size() - 1 - count] + 1)); | ||
188 | ||||
189 | 274683 | translated_bits_mask |= long_string_bit; | ||
190 | ||||
191 | // Go through the entries and add the new long string bit | |||
192 | // when appropriate | |||
193 | // (i.e., if "k_string_bit" occurs in the binary expansion of i). | |||
194 |
2/2✓ Branch 1 taken 892206 times.
✓ Branch 2 taken 274683 times.
|
2/2✓ Decision 'true' taken 892206 times.
✓ Decision 'false' taken 274683 times.
|
1166889 | for (SimUInt ii = 0; ii < translated_bits.size(); ++ii) { |
195 | // If "ii" has a 1 in the appropriate position, | |||
196 | // insert a "1" in the length n string for this ii. | |||
197 | ||||
198 | // Trick: convert 0 to 0, but convert any nonzero number to 1. | |||
199 | // This hopefully is faster by avoiding branching. | |||
200 | 892206 | const SimUInt has_bit = std::min(ii & k_string_bit, SimUInt(1)); | ||
201 | 892206 | translated_bits[ii] |= has_bit * long_string_bit; | ||
202 | } | |||
203 | 274683 | k_string_bit <<= 1; | ||
204 | } | |||
205 | 191037 | } | ||
206 | } // namespace | |||
207 | ||||
208 | 191037 | static void set_lifted_triplets( | ||
209 | const std::vector<TripletCd>& triplets, LiftedBitsResult& lifted_bits, | |||
210 | std::vector<TripletCd>& lifted_triplets, ExpansionData& expansion_data, | |||
211 | const std::vector<unsigned>& qubits, unsigned full_number_of_qubits) { | |||
212 | // translated_bits[j] gives the bits within the length n binary string, | |||
213 | // which correspond to the length k binary representation of j, | |||
214 | // but permuted and moved around so as to fit in the "slots" for qubits | |||
215 | // [q0, q1, q2,...]. | |||
216 | // Since U is unitary of size 2^k, no column or row can be all zeros, | |||
217 | // so every index j will be needed at least once. | |||
218 | // Hence, it is not a waste of time to compute all, | |||
219 | // even if U is very sparse. | |||
220 | 191037 | lifted_bits.set(qubits, full_number_of_qubits); | ||
221 | 191037 | lifted_triplets.clear(); | ||
222 | ||||
223 | 382074 | expansion_data = get_expansion_data( | ||
224 | 191037 | lifted_bits.translated_bits_mask, full_number_of_qubits); | ||
225 | ||||
226 | const SimUInt free_bits_limit = | |||
227 | 191037 | get_matrix_size(full_number_of_qubits - qubits.size()); | ||
228 | TKET_ASSERT(free_bits_limit != 0 || !"Too many bits"); | |||
229 | ||||
230 |
2/2✓ Branch 0 taken 8255694 times.
✓ Branch 1 taken 191037 times.
|
2/2✓ Decision 'true' taken 8255694 times.
✓ Decision 'false' taken 191037 times.
|
8446731 | for (SimUInt free_bits = 0; free_bits < free_bits_limit; ++free_bits) { |
231 | const SimUInt expanded_free_bits = | |||
232 | 8255694 | get_expanded_bits(expansion_data, free_bits); | ||
233 | ||||
234 | // Now go down the U matrix entries and copy them to the appropriate | |||
235 | // positions. | |||
236 |
2/2✓ Branch 4 taken 26635291 times.
✓ Branch 5 taken 8255694 times.
|
2/2✓ Decision 'true' taken 26635291 times.
✓ Decision 'false' taken 8255694 times.
|
34890985 | for (const auto& triplet : triplets) { |
237 | 26635291 | lifted_triplets.emplace_back( | ||
238 |
2/4✓ Branch 1 taken 26635291 times.
✗ Branch 2 not taken.
✓ Branch 4 taken 26635291 times.
✗ Branch 5 not taken.
|
26635291 | lifted_bits.translated_bits.at(triplet.row()) | expanded_free_bits, | |
239 |
1/2✓ Branch 2 taken 26635291 times.
✗ Branch 3 not taken.
|
53270582 | lifted_bits.translated_bits.at(triplet.col()) | expanded_free_bits, | |
240 | triplet.value()); | |||
241 | } | |||
242 | } | |||
243 | 191037 | } | ||
244 | ||||
245 | namespace { | |||
246 | // Contains data potentially of size roughly 2^n or larger, | |||
247 | // to avoid expensive memeory reallocation. | |||
248 | struct LargeWorkData { | |||
249 | LiftedBitsResult lifted_bits; | |||
250 | std::vector<TripletCd> lifted_triplets; | |||
251 | SparseMatrixXcd sparse_matrix; | |||
252 | ExpansionData expansion_data; | |||
253 | ||||
254 | static LargeWorkData& get_work_data(); | |||
255 | }; | |||
256 | ||||
257 | 191037 | LargeWorkData& LargeWorkData::get_work_data() { | ||
258 |
4/8✓ Branch 0 taken 1 times.
✓ Branch 1 taken 191036 times.
✓ Branch 3 taken 1 times.
✗ Branch 4 not taken.
✓ Branch 6 taken 1 times.
✗ Branch 7 not taken.
✗ Branch 10 not taken.
✗ Branch 11 not taken.
|
191037 | static LargeWorkData data; | |
259 | 191037 | return data; | ||
260 | } | |||
261 | } // namespace | |||
262 | ||||
263 | 191037 | void GateNode::apply_full_unitary( | ||
264 | Eigen::MatrixXcd& matr, unsigned full_number_of_qubits) const { | |||
265 | 191037 | auto& work_data = LargeWorkData::get_work_data(); | ||
266 | ||||
267 | 191037 | set_lifted_triplets( | ||
268 | 191037 | triplets, work_data.lifted_bits, work_data.lifted_triplets, | ||
269 | 191037 | work_data.expansion_data, qubit_indices, full_number_of_qubits); | ||
270 | ||||
271 | work_data.sparse_matrix = | |||
272 |
1/2✓ Branch 3 taken 191037 times.
✗ Branch 4 not taken.
|
191037 | get_sparse_square_matrix(work_data.lifted_triplets, matr.rows()); | |
273 | ||||
274 |
1/2✓ Branch 2 taken 191037 times.
✗ Branch 3 not taken.
|
191037 | matr = work_data.sparse_matrix * matr; | |
275 | 191037 | } | ||
276 | ||||
277 | } // namespace internal | |||
278 | } // namespace tket_sim | |||
279 | } // namespace tket | |||
280 |