Crypto++ 8.7
Free C++ class library of cryptographic schemes
integer.cpp
1// integer.cpp - originally written and placed in the public domain by Wei Dai
2// contains public domain code contributed by Alister Lee and Leonard Janke
3
4// Notes by JW: The Integer class needs to do two things. First, it needs
5// to set function pointers on some platforms, like X86 and X64. The
6// function pointers select a fast multiply and addition based on the cpu.
7// Second, it wants to create Integer::Zero(), Integer::One() and
8// Integer::Two().
9// The function pointers are initialized in the InitializeInteger class by
10// calling SetFunctionPointers(). The call to SetFunctionPointers() is
11// guarded to run once using a double-checked pattern. We don't use C++
12// std::call_once due to bad interactions between libstdc++, glibc and
13// pthreads. The bad interactions were causing crashes for us on platforms
14// like Sparc and PowerPC. Since we are only setting function pointers we
15// don't have to worry about leaking memory. The worst case seems to be the
16// pointers gets written twice until the init flag is set and visible to
17// all threads.
18// For Integer::Zero(), Integer::One() and Integer::Two(), we use one of three
19// strategies. First, if initialization priorities are available then we use
20// them. Initialization priorities are init_priority() on Linux and init_seg()
21// on Windows. OS X and several other platforms lack them. Initialization
22// priorities are platform specific but they are also the most trouble free
23// with deterministic destruction.
24// Second, if C++11 dynamic initialization is available, then we use it. After
25// the std::call_once fiasco we moved to dynamic initialization to avoid
26// unknown troubles platforms that are tested less frequently. In addition
27// Microsoft platforms mostly do not provide dynamic initialization.
28// The MSDN docs claim they do but they don't in practice because we need
29// Visual Studio 2017 and Windows 10 or above.
30// Third, we fall back to Wei's original code of a Singleton. Wei's original
31// code was much simpler. It simply used the Singleton pattern, but it always
32// produced memory findings on some platforms. The Singleton generates memory
33// findings because it uses a Create On First Use pattern (a dumb Nifty
34// Counter) and the compiler had to be smart enough to fold them to return
35// the same object. Unix and Linux compilers do a good job of folding objects,
36// but Microsoft compilers do a rather poor job for some versions of the
37// compiler.
38// Another problem with the Singleton is resource destruction requires running
39// resource acquisition in reverse. For resources provided through the
40// Singletons, there is no way to express the dependency order to safely
41// destroy resources. (That's one of the problems C++11 dynamic
42// initialization with concurrent execution is supposed to solve).
43// The final problem with Singletons is resource/memory exhaustion in languages
44// like Java and .Net. Java and .Net load and unload a native DLL hundreds or
45// thousands of times during the life of a program. Each load produces a
46// memory leak and they can add up quickly. If they library is being used in
47// Java or .Net then Singleton must be avoided at all costs.
48//
49// The code below has a path cut-in for BMI2 using mulx and adcx instructions.
50// There was a modest speedup of approximately 0.03 ms in public key Integer
51// operations. We had to disable BMI2 for the moment because some OS X machines
52// were advertising BMI/BMI2 support but caused SIGILL's at runtime. Also see
53// https://github.com/weidai11/cryptopp/issues/850.
54
55#include "pch.h"
56#include "config.h"
57
58#ifndef CRYPTOPP_IMPORTS
59
60#include "integer.h"
61#include "secblock.h"
62#include "modarith.h"
63#include "nbtheory.h"
64#include "smartptr.h"
65#include "algparam.h"
66#include "filters.h"
67#include "stdcpp.h"
68#include "asn.h"
69#include "oids.h"
70#include "words.h"
71#include "pubkey.h" // for P1363_KDF2
72#include "sha.h"
73#include "cpu.h"
74#include "misc.h"
75
76#include <iostream>
77
78#if (_MSC_VER >= 1400) && !defined(_M_ARM)
79 #include <intrin.h>
80#endif
81
82#ifdef __DECCXX
83 #include <c_asm.h>
84#endif
85
86// "Error: The operand ___LKDB cannot be assigned to",
87// http://github.com/weidai11/cryptopp/issues/188
88#if (__SUNPRO_CC >= 0x5130)
89# define MAYBE_CONST
90# define MAYBE_UNCONST_CAST(x) const_cast<word*>(x)
91#else
92# define MAYBE_CONST const
93# define MAYBE_UNCONST_CAST(x) x
94#endif
95
96// "Inline assembly operands don't work with .intel_syntax",
97// http://llvm.org/bugs/show_bug.cgi?id=24232
98#if CRYPTOPP_BOOL_X32 || defined(CRYPTOPP_DISABLE_MIXED_ASM)
99# undef CRYPTOPP_X86_ASM_AVAILABLE
100# undef CRYPTOPP_X32_ASM_AVAILABLE
101# undef CRYPTOPP_X64_ASM_AVAILABLE
102# undef CRYPTOPP_SSE2_ASM_AVAILABLE
103# undef CRYPTOPP_SSSE3_ASM_AVAILABLE
104#else
105# define CRYPTOPP_INTEGER_SSE2 (CRYPTOPP_SSE2_ASM_AVAILABLE && (CRYPTOPP_BOOL_X86))
106#endif
107
108// ***************** C++ Static Initialization ********************
109
110NAMESPACE_BEGIN(CryptoPP)
111
112// Function body near the middle of the file
113static void SetFunctionPointers();
114
115// Use a double-checked pattern. We are not leaking anything so it
116// does not matter if a pointer is written twice during a race.
117// Avoid std::call_once due to too many problems on platforms like
118// Solaris and Sparc. Also see
119// http://gcc.gnu.org/bugzilla/show_bug.cgi?id=66146 and
120// http://github.com/weidai11/cryptopp/issues/707.
121InitializeInteger::InitializeInteger()
122{
123 static bool s_flag;
125 if (s_flag == false)
126 {
127 SetFunctionPointers();
128 s_flag = true;
130 }
131}
132
133template <long i>
134struct NewInteger
135{
136 Integer * operator()() const
137 {
138 return new Integer(i);
139 }
140};
141
142// ***************** Library code ********************
143
144inline static int Compare(const word *A, const word *B, size_t N)
145{
146 while (N--)
147 if (A[N] > B[N])
148 return 1;
149 else if (A[N] < B[N])
150 return -1;
151
152 return 0;
153}
154
155inline static int Increment(word *A, size_t N, word B=1)
156{
158 word t = A[0];
159 A[0] = t+B;
160 if (A[0] >= t)
161 return 0;
162 for (unsigned i=1; i<N; i++)
163 if (++A[i])
164 return 0;
165 return 1;
166}
167
168inline static int Decrement(word *A, size_t N, word B=1)
169{
171 word t = A[0];
172 A[0] = t-B;
173 if (A[0] <= t)
174 return 0;
175 for (unsigned i=1; i<N; i++)
176 if (A[i]--)
177 return 0;
178 return 1;
179}
180
181static void TwosComplement(word *A, size_t N)
182{
183 Decrement(A, N);
184 for (unsigned i=0; i<N; i++)
185 A[i] = ~A[i];
186}
187
188static word AtomicInverseModPower2(word A)
189{
190 CRYPTOPP_ASSERT(A%2==1);
191
192 word R=A%8;
193
194 for (unsigned i=3; i<WORD_BITS; i*=2)
195 R = R*(2-R*A);
196
197 CRYPTOPP_ASSERT(R*A==1);
198 return R;
199}
200
201// ********************************************************
202
203#if !defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE) || ((defined(__aarch64__) || defined(__x86_64__)) && defined(CRYPTOPP_WORD128_AVAILABLE))
204 #define TWO_64_BIT_WORDS 1
205 #define Declare2Words(x) word x##0, x##1;
206 #define AssignWord(a, b) a##0 = b; a##1 = 0;
207 #define Add2WordsBy1(a, b, c) a##0 = b##0 + c; a##1 = b##1 + (a##0 < c);
208 #define LowWord(a) a##0
209 #define HighWord(a) a##1
210 #ifdef _MSC_VER
211 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = _umul128(a, b, &p1);
212 #ifndef __INTEL_COMPILER
213 #define Double3Words(c, d) d##1 = __shiftleft128(d##0, d##1, 1); d##0 = __shiftleft128(c, d##0, 1); c *= 2;
214 #endif
215 #elif defined(__aarch32__) || defined(__aarch64__)
216 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; asm ("umulh %0,%1,%2" : "=r"(p1) : "r"(a), "r"(b));
217 #elif defined(__DECCXX)
218 #define MultiplyWordsLoHi(p0, p1, a, b) p0 = a*b; p1 = asm("umulh %a0, %a1, %v0", a, b);
219 #elif defined(__x86_64__)
220 #if defined(__SUNPRO_CC) && __SUNPRO_CC < 0x5100
221 // Sun Studio's gcc-style inline assembly is heavily bugged as of version 5.9 Patch 124864-09 2008/12/16, but this one works
222 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "r"(b) : "cc");
223 #elif defined(__BMI2__) && 0
224 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulxq %3, %0, %1" : "=r"(p0), "=r"(p1) : "d"(a), "r"(b));
225 #define MulAcc(c, d, a, b) asm ("mulxq %6, %3, %4; addq %3, %0; adcxq %4, %1; adcxq %7, %2;" : "+&r"(c), "+&r"(d##0), "+&r"(d##1), "=&r"(p0), "=&r"(p1) : "d"(a), "r"(b), "r"(W64LIT(0)) : "cc");
226 #define Double3Words(c, d) asm ("addq %0, %0; adcxq %1, %1; adcxq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
227 #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcxq %3, %1;" : "+&r"(a##0), "+r"(a##1) : "r"(b), "r"(W64LIT(0)) : "cc");
228 #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcxq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
229 #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcxq %6, %1; adcxq %7, %2;" : "+r"(c), "=&r"(e##0), "=&r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1), "r"(W64LIT(0)) : "cc");
230 #else
231 #define MultiplyWordsLoHi(p0, p1, a, b) asm ("mulq %3" : "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
232 #define MulAcc(c, d, a, b) asm ("mulq %6; addq %3, %0; adcq %4, %1; adcq $0, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1), "=a"(p0), "=d"(p1) : "a"(a), "g"(b) : "cc");
233 #define Double3Words(c, d) asm ("addq %0, %0; adcq %1, %1; adcq %2, %2;" : "+r"(c), "+r"(d##0), "+r"(d##1) : : "cc");
234 #define Acc2WordsBy1(a, b) asm ("addq %2, %0; adcq $0, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b) : "cc");
235 #define Acc2WordsBy2(a, b) asm ("addq %2, %0; adcq %3, %1;" : "+r"(a##0), "+r"(a##1) : "r"(b##0), "r"(b##1) : "cc");
236 #define Acc3WordsBy2(c, d, e) asm ("addq %5, %0; adcq %6, %1; adcq $0, %2;" : "+r"(c), "=r"(e##0), "=r"(e##1) : "1"(d##0), "2"(d##1), "r"(e##0), "r"(e##1) : "cc");
237 #endif
238 #endif
239 #define MultiplyWords(p, a, b) MultiplyWordsLoHi(p##0, p##1, a, b)
240 #ifndef Double3Words
241 #define Double3Words(c, d) d##1 = 2*d##1 + (d##0>>(WORD_BITS-1)); d##0 = 2*d##0 + (c>>(WORD_BITS-1)); c *= 2;
242 #endif
243 #ifndef Acc2WordsBy2
244 #define Acc2WordsBy2(a, b) a##0 += b##0; a##1 += a##0 < b##0; a##1 += b##1;
245 #endif
246 #define AddWithCarry(u, a, b) {word t = a+b; u##0 = t + u##1; u##1 = (t<a) + (u##0<t);}
247 #define SubtractWithBorrow(u, a, b) {word t = a-b; u##0 = t - u##1; u##1 = (t>a) + (u##0>t);}
248 #define GetCarry(u) u##1
249 #define GetBorrow(u) u##1
250#else
251 #define Declare2Words(x) dword x;
252 #if _MSC_VER >= 1400 && !defined(__INTEL_COMPILER) && (defined(_M_IX86) || defined(_M_X64) || defined(_M_IA64))
253 #define MultiplyWords(p, a, b) p = __emulu(a, b);
254 #else
255 #define MultiplyWords(p, a, b) p = (dword)a*b;
256 #endif
257 #define AssignWord(a, b) a = b;
258 #define Add2WordsBy1(a, b, c) a = b + c;
259 #define Acc2WordsBy2(a, b) a += b;
260 #define LowWord(a) word(a)
261 #define HighWord(a) word(a>>WORD_BITS)
262 #define Double3Words(c, d) d = 2*d + (c>>(WORD_BITS-1)); c *= 2;
263 #define AddWithCarry(u, a, b) u = dword(a) + b + GetCarry(u);
264 #define SubtractWithBorrow(u, a, b) u = dword(a) - b - GetBorrow(u);
265 #define GetCarry(u) HighWord(u)
266 #define GetBorrow(u) word(u>>(WORD_BITS*2-1))
267#endif
268#ifndef MulAcc
269 #define MulAcc(c, d, a, b) MultiplyWords(p, a, b); Acc2WordsBy1(p, c); c = LowWord(p); Acc2WordsBy1(d, HighWord(p));
270#endif
271#ifndef Acc2WordsBy1
272 #define Acc2WordsBy1(a, b) Add2WordsBy1(a, a, b)
273#endif
274#ifndef Acc3WordsBy2
275 #define Acc3WordsBy2(c, d, e) Acc2WordsBy1(e, c); c = LowWord(e); Add2WordsBy1(e, d, HighWord(e));
276#endif
277
278class DWord
279{
280public:
281#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
282 DWord() {std::memset(&m_whole, 0x00, sizeof(m_whole));}
283#else
284 DWord() {std::memset(&m_halfs, 0x00, sizeof(m_halfs));}
285#endif
286
287#ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
288 explicit DWord(word low) : m_whole(low) { }
289#else
290 explicit DWord(word low)
291 {
292 m_halfs.high = 0;
293 m_halfs.low = low;
294 }
295#endif
296
297#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
298 DWord(word low, word high) : m_whole()
299#else
300 DWord(word low, word high) : m_halfs()
301#endif
302 {
303#if defined(CRYPTOPP_NATIVE_DWORD_AVAILABLE)
304# if (CRYPTOPP_LITTLE_ENDIAN)
305 const word t[2] = {low,high};
306 memcpy(&m_whole, t, sizeof(m_whole));
307# else
308 const word t[2] = {high,low};
309 memcpy(&m_whole, t, sizeof(m_whole));
310# endif
311#else
312 m_halfs.low = low;
313 m_halfs.high = high;
314#endif
315 }
316
317 static DWord Multiply(word a, word b)
318 {
319 DWord r;
320 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
321 r.m_whole = (dword)a * b;
322 #elif defined(MultiplyWordsLoHi)
323 MultiplyWordsLoHi(r.m_halfs.low, r.m_halfs.high, a, b);
324 #else
326 #endif
327 return r;
328 }
329
330 static DWord MultiplyAndAdd(word a, word b, word c)
331 {
332 DWord r = Multiply(a, b);
333 return r += c;
334 }
335
336 DWord & operator+=(word a)
337 {
338 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
339 m_whole = m_whole + a;
340 #else
341 m_halfs.low += a;
342 m_halfs.high += (m_halfs.low < a);
343 #endif
344 return *this;
345 }
346
347 DWord operator+(word a)
348 {
349 DWord r;
350 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
351 r.m_whole = m_whole + a;
352 #else
353 r.m_halfs.low = m_halfs.low + a;
354 r.m_halfs.high = m_halfs.high + (r.m_halfs.low < a);
355 #endif
356 return r;
357 }
358
359 DWord operator-(DWord a)
360 {
361 DWord r;
362 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
363 r.m_whole = m_whole - a.m_whole;
364 #else
365 r.m_halfs.low = m_halfs.low - a.m_halfs.low;
366 r.m_halfs.high = m_halfs.high - a.m_halfs.high - (r.m_halfs.low > m_halfs.low);
367 #endif
368 return r;
369 }
370
371 DWord operator-(word a)
372 {
373 DWord r;
374 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
375 r.m_whole = m_whole - a;
376 #else
377 r.m_halfs.low = m_halfs.low - a;
378 r.m_halfs.high = m_halfs.high - (r.m_halfs.low > m_halfs.low);
379 #endif
380 return r;
381 }
382
383 // returns quotient, which must fit in a word
384 word operator/(word divisor);
385
386 word operator%(word a);
387
388 bool operator!() const
389 {
390 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
391 return !m_whole;
392 #else
393 return !m_halfs.high && !m_halfs.low;
394 #endif
395 }
396
397 // TODO: When NATIVE_DWORD is in effect, we access high and low, which are inactive
398 // union members, and that's UB. Also see http://stackoverflow.com/q/11373203.
399 word GetLowHalf() const {return m_halfs.low;}
400 word GetHighHalf() const {return m_halfs.high;}
401 word GetHighHalfAsBorrow() const {return 0-m_halfs.high;}
402
403private:
404 // Issue 274, "Types cannot be declared in anonymous union",
405 // http://github.com/weidai11/cryptopp/issues/274
406 // Thanks to Martin Bonner at http://stackoverflow.com/a/39507183
407 struct half_words
408 {
409 #if (CRYPTOPP_LITTLE_ENDIAN)
410 word low;
411 word high;
412 #else
413 word high;
414 word low;
415 #endif
416 };
417 union
418 {
419 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
420 dword m_whole;
421 #endif
422 half_words m_halfs;
423 };
424};
425
426class Word
427{
428public:
429 Word() : m_whole(0) {}
430 Word(word value) : m_whole(value) {}
431 Word(hword low, hword high) : m_whole(low | (word(high) << (WORD_BITS/2))) {}
432
433 static Word Multiply(hword a, hword b)
434 {
435 Word r;
436 r.m_whole = (word)a * b;
437 return r;
438 }
439
440 Word operator-(Word a)
441 {
442 Word r;
443 r.m_whole = m_whole - a.m_whole;
444 return r;
445 }
446
447 Word operator-(hword a)
448 {
449 Word r;
450 r.m_whole = m_whole - a;
451 return r;
452 }
453
454 // returns quotient, which must fit in a word
455 hword operator/(hword divisor)
456 {
457 return hword(m_whole / divisor);
458 }
459
460 bool operator!() const
461 {
462 return !m_whole;
463 }
464
465 word GetWhole() const {return m_whole;}
466 hword GetLowHalf() const {return hword(m_whole);}
467 hword GetHighHalf() const {return hword(m_whole>>(WORD_BITS/2));}
468 hword GetHighHalfAsBorrow() const {return 0-hword(m_whole>>(WORD_BITS/2));}
469
470private:
471 word m_whole;
472};
473
474// do a 3 word by 2 word divide, returns quotient and leaves remainder in A
475template <class S, class D>
476S DivideThreeWordsByTwo(S *A, S B0, S B1, D *dummy=NULLPTR)
477{
478 CRYPTOPP_UNUSED(dummy);
479
480 // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a S
481 CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
482
483 // Profiling guided the flow below.
484
485 // estimate the quotient: do a 2 S by 1 S divide.
486 S Q; bool pre = (S(B1+1) == 0);
487 if (B1 > 0 && !pre)
488 Q = D(A[1], A[2]) / S(B1+1);
489 else if (pre)
490 Q = A[2];
491 else
492 Q = D(A[0], A[1]) / B0;
493
494 // now subtract Q*B from A
495 D p = D::Multiply(B0, Q);
496 D u = (D) A[0] - p.GetLowHalf();
497 A[0] = u.GetLowHalf();
498 u = (D) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - D::Multiply(B1, Q);
499 A[1] = u.GetLowHalf();
500 A[2] += u.GetHighHalf();
501
502 // Q <= actual quotient, so fix it
503 while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
504 {
505 u = (D) A[0] - B0;
506 A[0] = u.GetLowHalf();
507 u = (D) A[1] - B1 - u.GetHighHalfAsBorrow();
508 A[1] = u.GetLowHalf();
509 A[2] += u.GetHighHalf();
510 Q++;
511 CRYPTOPP_ASSERT(Q); // shouldn't overflow
512 }
513
514 return Q;
515}
516
517// do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
518template <class S, class D>
519inline D DivideFourWordsByTwo(S *T, const D &Al, const D &Ah, const D &B)
520{
521 // Profiling guided the flow below.
522
523 if (!!B)
524 {
525 S Q[2];
526 T[0] = Al.GetLowHalf();
527 T[1] = Al.GetHighHalf();
528 T[2] = Ah.GetLowHalf();
529 T[3] = Ah.GetHighHalf();
530 Q[1] = DivideThreeWordsByTwo<S, D>(T+1, B.GetLowHalf(), B.GetHighHalf());
531 Q[0] = DivideThreeWordsByTwo<S, D>(T, B.GetLowHalf(), B.GetHighHalf());
532 return D(Q[0], Q[1]);
533 }
534 else // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
535 {
536 return D(Ah.GetLowHalf(), Ah.GetHighHalf());
537 }
538}
539
540// returns quotient, which must fit in a word
541inline word DWord::operator/(word a)
542{
543 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
544 return word(m_whole / a);
545 #else
546 hword r[4];
547 return DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a).GetWhole();
548 #endif
549}
550
551inline word DWord::operator%(word a)
552{
553 #ifdef CRYPTOPP_NATIVE_DWORD_AVAILABLE
554 return word(m_whole % a);
555 #else
556 if (a < (word(1) << (WORD_BITS/2)))
557 {
558 hword h = hword(a);
559 word r = m_halfs.high % h;
560 r = ((m_halfs.low >> (WORD_BITS/2)) + (r << (WORD_BITS/2))) % h;
561 return hword((hword(m_halfs.low) + (r << (WORD_BITS/2))) % h);
562 }
563 else
564 {
565 hword r[4];
566 DivideFourWordsByTwo<hword, Word>(r, m_halfs.low, m_halfs.high, a);
567 return Word(r[0], r[1]).GetWhole();
568 }
569 #endif
570}
571
572// ********************************************************
573
574// Use some tricks to share assembly code between MSVC, GCC, Clang and Sun CC.
575#if defined(__GNUC__)
576 #define AddPrologue \
577 int result; \
578 __asm__ __volatile__ \
579 ( \
580 INTEL_NOPREFIX
581 #define AddEpilogue \
582 ATT_PREFIX \
583 : "=a" (result)\
584 : "d" (C), "a" (A), "D" (B), "c" (N) \
585 : "%esi", "memory", "cc" \
586 );\
587 return result;
588 #define MulPrologue \
589 __asm__ __volatile__ \
590 ( \
591 INTEL_NOPREFIX \
592 AS1( push ebx) \
593 AS2( mov ebx, edx)
594 #define MulEpilogue \
595 AS1( pop ebx) \
596 ATT_PREFIX \
597 : \
598 : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B) \
599 : "%esi", "memory", "cc" \
600 );
601 #define SquPrologue MulPrologue
602 #define SquEpilogue \
603 AS1( pop ebx) \
604 ATT_PREFIX \
605 : \
606 : "d" (s_maskLow16), "c" (C), "a" (A) \
607 : "%esi", "%edi", "memory", "cc" \
608 );
609 #define TopPrologue MulPrologue
610 #define TopEpilogue \
611 AS1( pop ebx) \
612 ATT_PREFIX \
613 : \
614 : "d" (s_maskLow16), "c" (C), "a" (A), "D" (B), "S" (L) \
615 : "memory", "cc" \
616 );
617#else
618 #define AddPrologue \
619 __asm push edi \
620 __asm push esi \
621 __asm mov eax, [esp+12] \
622 __asm mov edi, [esp+16]
623 #define AddEpilogue \
624 __asm pop esi \
625 __asm pop edi \
626 __asm ret 8
627 #define SaveEBX
628 #define RestoreEBX
629 #define SquPrologue \
630 AS2( mov eax, A) \
631 AS2( mov ecx, C) \
632 SaveEBX \
633 AS2( lea ebx, s_maskLow16)
634 #define MulPrologue \
635 AS2( mov eax, A) \
636 AS2( mov edi, B) \
637 AS2( mov ecx, C) \
638 SaveEBX \
639 AS2( lea ebx, s_maskLow16)
640 #define TopPrologue \
641 AS2( mov eax, A) \
642 AS2( mov edi, B) \
643 AS2( mov ecx, C) \
644 AS2( mov esi, L) \
645 SaveEBX \
646 AS2( lea ebx, s_maskLow16)
647 #define SquEpilogue RestoreEBX
648 #define MulEpilogue RestoreEBX
649 #define TopEpilogue RestoreEBX
650#endif
651
652#ifdef CRYPTOPP_X64_MASM_AVAILABLE
653extern "C" {
654int Baseline_Add(size_t N, word *C, const word *A, const word *B);
655int Baseline_Sub(size_t N, word *C, const word *A, const word *B);
656}
657#elif defined(CRYPTOPP_X64_ASM_AVAILABLE) && defined(__GNUC__) && defined(CRYPTOPP_WORD128_AVAILABLE)
658int Baseline_Add(size_t N, word *C, const word *A, const word *B)
659{
660 word result;
661 __asm__ __volatile__
662 (
663 INTEL_NOPREFIX
664 AS1( neg %1)
665 ASJ( jz, 1, f)
666 AS2( mov %0,[%3+8*%1])
667 AS2( add %0,[%4+8*%1])
668 AS2( mov [%2+8*%1],%0)
669 ASL(0)
670 AS2( mov %0,[%3+8*%1+8])
671 AS2( adc %0,[%4+8*%1+8])
672 AS2( mov [%2+8*%1+8],%0)
673 AS2( lea %1,[%1+2])
674 ASJ( jrcxz, 1, f)
675 AS2( mov %0,[%3+8*%1])
676 AS2( adc %0,[%4+8*%1])
677 AS2( mov [%2+8*%1],%0)
678 ASJ( jmp, 0, b)
679 ASL(1)
680 AS2( mov %0, 0)
681 AS2( adc %0, %0)
682 ATT_NOPREFIX
683 : "=&r" (result), "+c" (N)
684 : "r" (C+N), "r" (A+N), "r" (B+N)
685 : "memory", "cc"
686 );
687 return (int)result;
688}
689
690int Baseline_Sub(size_t N, word *C, const word *A, const word *B)
691{
692 word result;
693 __asm__ __volatile__
694 (
695 INTEL_NOPREFIX
696 AS1( neg %1)
697 ASJ( jz, 1, f)
698 AS2( mov %0,[%3+8*%1])
699 AS2( sub %0,[%4+8*%1])
700 AS2( mov [%2+8*%1],%0)
701 ASL(0)
702 AS2( mov %0,[%3+8*%1+8])
703 AS2( sbb %0,[%4+8*%1+8])
704 AS2( mov [%2+8*%1+8],%0)
705 AS2( lea %1,[%1+2])
706 ASJ( jrcxz, 1, f)
707 AS2( mov %0,[%3+8*%1])
708 AS2( sbb %0,[%4+8*%1])
709 AS2( mov [%2+8*%1],%0)
710 ASJ( jmp, 0, b)
711 ASL(1)
712 AS2( mov %0, 0)
713 AS2( adc %0, %0)
714 ATT_NOPREFIX
715 : "=&r" (result), "+c" (N)
716 : "r" (C+N), "r" (A+N), "r" (B+N)
717 : "memory", "cc"
718 );
719 return (int)result;
720}
721#elif defined(CRYPTOPP_X86_ASM_AVAILABLE) && CRYPTOPP_BOOL_X86
722CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
723{
724 AddPrologue
725
726 // now: eax = A, edi = B, edx = C, ecx = N
727 AS2( lea eax, [eax+4*ecx])
728 AS2( lea edi, [edi+4*ecx])
729 AS2( lea edx, [edx+4*ecx])
730
731 AS1( neg ecx) // ecx is negative index
732 AS2( test ecx, 2) // this clears carry flag
733 ASJ( jz, 0, f)
734 AS2( sub ecx, 2)
735 ASJ( jmp, 1, f)
736
737 ASL(0)
738 ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
739 AS2( mov esi,[eax+4*ecx])
740 AS2( adc esi,[edi+4*ecx])
741 AS2( mov [edx+4*ecx],esi)
742 AS2( mov esi,[eax+4*ecx+4])
743 AS2( adc esi,[edi+4*ecx+4])
744 AS2( mov [edx+4*ecx+4],esi)
745 ASL(1)
746 AS2( mov esi,[eax+4*ecx+8])
747 AS2( adc esi,[edi+4*ecx+8])
748 AS2( mov [edx+4*ecx+8],esi)
749 AS2( mov esi,[eax+4*ecx+12])
750 AS2( adc esi,[edi+4*ecx+12])
751 AS2( mov [edx+4*ecx+12],esi)
752
753 AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
754 ASJ( jmp, 0, b)
755
756 ASL(2)
757 AS2( mov eax, 0)
758 AS1( setc al) // store carry into eax (return result register)
759
760 AddEpilogue
761
762 // http://github.com/weidai11/cryptopp/issues/340
763 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
764 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
765}
766
767CRYPTOPP_NAKED int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
768{
769 AddPrologue
770
771 // now: eax = A, edi = B, edx = C, ecx = N
772 AS2( lea eax, [eax+4*ecx])
773 AS2( lea edi, [edi+4*ecx])
774 AS2( lea edx, [edx+4*ecx])
775
776 AS1( neg ecx) // ecx is negative index
777 AS2( test ecx, 2) // this clears carry flag
778 ASJ( jz, 0, f)
779 AS2( sub ecx, 2)
780 ASJ( jmp, 1, f)
781
782 ASL(0)
783 ASJ( jecxz, 2, f) // loop until ecx overflows and becomes zero
784 AS2( mov esi,[eax+4*ecx])
785 AS2( sbb esi,[edi+4*ecx])
786 AS2( mov [edx+4*ecx],esi)
787 AS2( mov esi,[eax+4*ecx+4])
788 AS2( sbb esi,[edi+4*ecx+4])
789 AS2( mov [edx+4*ecx+4],esi)
790 ASL(1)
791 AS2( mov esi,[eax+4*ecx+8])
792 AS2( sbb esi,[edi+4*ecx+8])
793 AS2( mov [edx+4*ecx+8],esi)
794 AS2( mov esi,[eax+4*ecx+12])
795 AS2( sbb esi,[edi+4*ecx+12])
796 AS2( mov [edx+4*ecx+12],esi)
797
798 AS2( lea ecx,[ecx+4]) // advance index, avoid inc which causes slowdown on Intel Core 2
799 ASJ( jmp, 0, b)
800
801 ASL(2)
802 AS2( mov eax, 0)
803 AS1( setc al) // store carry into eax (return result register)
804
805 AddEpilogue
806
807 // http://github.com/weidai11/cryptopp/issues/340
808 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
809 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
810}
811
812#if CRYPTOPP_INTEGER_SSE2
813CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Add(size_t N, word *C, const word *A, const word *B)
814{
815 AddPrologue
816
817 // now: eax = A, edi = B, edx = C, ecx = N
818 AS2( lea eax, [eax+4*ecx])
819 AS2( lea edi, [edi+4*ecx])
820 AS2( lea edx, [edx+4*ecx])
821
822 AS1( neg ecx) // ecx is negative index
823 AS2( pxor mm2, mm2)
824 ASJ( jz, 2, f)
825 AS2( test ecx, 2) // this clears carry flag
826 ASJ( jz, 0, f)
827 AS2( sub ecx, 2)
828 ASJ( jmp, 1, f)
829
830 ASL(0)
831 AS2( movd mm0, DWORD PTR [eax+4*ecx])
832 AS2( movd mm1, DWORD PTR [edi+4*ecx])
833 AS2( paddq mm0, mm1)
834 AS2( paddq mm2, mm0)
835 AS2( movd DWORD PTR [edx+4*ecx], mm2)
836 AS2( psrlq mm2, 32)
837
838 AS2( movd mm0, DWORD PTR [eax+4*ecx+4])
839 AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
840 AS2( paddq mm0, mm1)
841 AS2( paddq mm2, mm0)
842 AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
843 AS2( psrlq mm2, 32)
844
845 ASL(1)
846 AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
847 AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
848 AS2( paddq mm0, mm1)
849 AS2( paddq mm2, mm0)
850 AS2( movd DWORD PTR [edx+4*ecx+8], mm2)
851 AS2( psrlq mm2, 32)
852
853 AS2( movd mm0, DWORD PTR [eax+4*ecx+12])
854 AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
855 AS2( paddq mm0, mm1)
856 AS2( paddq mm2, mm0)
857 AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
858 AS2( psrlq mm2, 32)
859
860 AS2( add ecx, 4)
861 ASJ( jnz, 0, b)
862
863 ASL(2)
864 AS2( movd eax, mm2)
865 AS1( emms)
866
867 AddEpilogue
868
869 // http://github.com/weidai11/cryptopp/issues/340
870 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
871 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
872}
873CRYPTOPP_NAKED int CRYPTOPP_FASTCALL SSE2_Sub(size_t N, word *C, const word *A, const word *B)
874{
875 AddPrologue
876
877 // now: eax = A, edi = B, edx = C, ecx = N
878 AS2( lea eax, [eax+4*ecx])
879 AS2( lea edi, [edi+4*ecx])
880 AS2( lea edx, [edx+4*ecx])
881
882 AS1( neg ecx) // ecx is negative index
883 AS2( pxor mm2, mm2)
884 ASJ( jz, 2, f)
885 AS2( test ecx, 2) // this clears carry flag
886 ASJ( jz, 0, f)
887 AS2( sub ecx, 2)
888 ASJ( jmp, 1, f)
889
890 ASL(0)
891 AS2( movd mm0, DWORD PTR [eax+4*ecx])
892 AS2( movd mm1, DWORD PTR [edi+4*ecx])
893 AS2( psubq mm0, mm1)
894 AS2( psubq mm0, mm2)
895 AS2( movd DWORD PTR [edx+4*ecx], mm0)
896 AS2( psrlq mm0, 63)
897
898 AS2( movd mm2, DWORD PTR [eax+4*ecx+4])
899 AS2( movd mm1, DWORD PTR [edi+4*ecx+4])
900 AS2( psubq mm2, mm1)
901 AS2( psubq mm2, mm0)
902 AS2( movd DWORD PTR [edx+4*ecx+4], mm2)
903 AS2( psrlq mm2, 63)
904
905 ASL(1)
906 AS2( movd mm0, DWORD PTR [eax+4*ecx+8])
907 AS2( movd mm1, DWORD PTR [edi+4*ecx+8])
908 AS2( psubq mm0, mm1)
909 AS2( psubq mm0, mm2)
910 AS2( movd DWORD PTR [edx+4*ecx+8], mm0)
911 AS2( psrlq mm0, 63)
912
913 AS2( movd mm2, DWORD PTR [eax+4*ecx+12])
914 AS2( movd mm1, DWORD PTR [edi+4*ecx+12])
915 AS2( psubq mm2, mm1)
916 AS2( psubq mm2, mm0)
917 AS2( movd DWORD PTR [edx+4*ecx+12], mm2)
918 AS2( psrlq mm2, 63)
919
920 AS2( add ecx, 4)
921 ASJ( jnz, 0, b)
922
923 ASL(2)
924 AS2( movd eax, mm2)
925 AS1( emms)
926
927 AddEpilogue
928
929 // http://github.com/weidai11/cryptopp/issues/340
930 CRYPTOPP_UNUSED(A); CRYPTOPP_UNUSED(B);
931 CRYPTOPP_UNUSED(C); CRYPTOPP_UNUSED(N);
932}
933#endif // CRYPTOPP_INTEGER_SSE2
934#else // CRYPTOPP_SSE2_ASM_AVAILABLE
935int CRYPTOPP_FASTCALL Baseline_Add(size_t N, word *C, const word *A, const word *B)
936{
937 CRYPTOPP_ASSERT (N%2 == 0);
938
939 Declare2Words(u);
940 AssignWord(u, 0);
941 for (size_t i=0; i<N; i+=2)
942 {
943 AddWithCarry(u, A[i], B[i]);
944 C[i] = LowWord(u);
945 AddWithCarry(u, A[i+1], B[i+1]);
946 C[i+1] = LowWord(u);
947 }
948 return int(GetCarry(u));
949}
950
951int CRYPTOPP_FASTCALL Baseline_Sub(size_t N, word *C, const word *A, const word *B)
952{
953 CRYPTOPP_ASSERT (N%2 == 0);
954
955 Declare2Words(u);
956 AssignWord(u, 0);
957 for (size_t i=0; i<N; i+=2)
958 {
959 SubtractWithBorrow(u, A[i], B[i]);
960 C[i] = LowWord(u);
961 SubtractWithBorrow(u, A[i+1], B[i+1]);
962 C[i+1] = LowWord(u);
963 }
964 return int(GetBorrow(u));
965}
966#endif
967
968static word LinearMultiply(word *C, const word *AA, word B, size_t N)
969{
970 // http://github.com/weidai11/cryptopp/issues/188
972
973 word carry=0;
974 for(unsigned i=0; i<N; i++)
975 {
976 Declare2Words(p);
977 MultiplyWords(p, A[i], B);
978 Acc2WordsBy1(p, carry);
979 C[i] = LowWord(p);
980 carry = HighWord(p);
981 }
982 return carry;
983}
984
985#ifndef CRYPTOPP_DOXYGEN_PROCESSING
986
987#define Mul_2 \
988 Mul_Begin(2) \
989 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
990 Mul_End(1, 1)
991
992#define Mul_4 \
993 Mul_Begin(4) \
994 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
995 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
996 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
997 Mul_SaveAcc(3, 1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
998 Mul_SaveAcc(4, 2, 3) Mul_Acc(3, 2) \
999 Mul_End(5, 3)
1000
1001#define Mul_8 \
1002 Mul_Begin(8) \
1003 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1004 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1005 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1006 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1007 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1008 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1009 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1010 Mul_SaveAcc(7, 1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1011 Mul_SaveAcc(8, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1012 Mul_SaveAcc(9, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1013 Mul_SaveAcc(10, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1014 Mul_SaveAcc(11, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1015 Mul_SaveAcc(12, 6, 7) Mul_Acc(7, 6) \
1016 Mul_End(13, 7)
1017
1018#define Mul_16 \
1019 Mul_Begin(16) \
1020 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1021 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1022 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1023 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1024 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1025 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1026 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1027 Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1028 Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1029 Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1030 Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1031 Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1032 Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1033 Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1034 Mul_SaveAcc(14, 0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1035 Mul_SaveAcc(15, 1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1036 Mul_SaveAcc(16, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1037 Mul_SaveAcc(17, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1038 Mul_SaveAcc(18, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1039 Mul_SaveAcc(19, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1040 Mul_SaveAcc(20, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1041 Mul_SaveAcc(21, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1042 Mul_SaveAcc(22, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1043 Mul_SaveAcc(23, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1044 Mul_SaveAcc(24, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1045 Mul_SaveAcc(25, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1046 Mul_SaveAcc(26, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1047 Mul_SaveAcc(27, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1048 Mul_SaveAcc(28, 14, 15) Mul_Acc(15, 14) \
1049 Mul_End(29, 15)
1050
1051#define Squ_2 \
1052 Squ_Begin(2) \
1053 Squ_End(2)
1054
1055#define Squ_4 \
1056 Squ_Begin(4) \
1057 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1058 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1059 Squ_SaveAcc(3, 1, 3) Squ_Diag(2) \
1060 Squ_SaveAcc(4, 2, 3) Squ_NonDiag \
1061 Squ_End(4)
1062
1063#define Squ_8 \
1064 Squ_Begin(8) \
1065 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1066 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1067 Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1068 Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1069 Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1070 Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1071 Squ_SaveAcc(7, 1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1072 Squ_SaveAcc(8, 2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1073 Squ_SaveAcc(9, 3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1074 Squ_SaveAcc(10, 4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1075 Squ_SaveAcc(11, 5, 7) Squ_Diag(6) \
1076 Squ_SaveAcc(12, 6, 7) Squ_NonDiag \
1077 Squ_End(8)
1078
1079#define Squ_16 \
1080 Squ_Begin(16) \
1081 Squ_SaveAcc(1, 0, 2) Squ_Diag(1) \
1082 Squ_SaveAcc(2, 0, 3) Squ_Acc(1, 2) Squ_NonDiag \
1083 Squ_SaveAcc(3, 0, 4) Squ_Acc(1, 3) Squ_Diag(2) \
1084 Squ_SaveAcc(4, 0, 5) Squ_Acc(1, 4) Squ_Acc(2, 3) Squ_NonDiag \
1085 Squ_SaveAcc(5, 0, 6) Squ_Acc(1, 5) Squ_Acc(2, 4) Squ_Diag(3) \
1086 Squ_SaveAcc(6, 0, 7) Squ_Acc(1, 6) Squ_Acc(2, 5) Squ_Acc(3, 4) Squ_NonDiag \
1087 Squ_SaveAcc(7, 0, 8) Squ_Acc(1, 7) Squ_Acc(2, 6) Squ_Acc(3, 5) Squ_Diag(4) \
1088 Squ_SaveAcc(8, 0, 9) Squ_Acc(1, 8) Squ_Acc(2, 7) Squ_Acc(3, 6) Squ_Acc(4, 5) Squ_NonDiag \
1089 Squ_SaveAcc(9, 0, 10) Squ_Acc(1, 9) Squ_Acc(2, 8) Squ_Acc(3, 7) Squ_Acc(4, 6) Squ_Diag(5) \
1090 Squ_SaveAcc(10, 0, 11) Squ_Acc(1, 10) Squ_Acc(2, 9) Squ_Acc(3, 8) Squ_Acc(4, 7) Squ_Acc(5, 6) Squ_NonDiag \
1091 Squ_SaveAcc(11, 0, 12) Squ_Acc(1, 11) Squ_Acc(2, 10) Squ_Acc(3, 9) Squ_Acc(4, 8) Squ_Acc(5, 7) Squ_Diag(6) \
1092 Squ_SaveAcc(12, 0, 13) Squ_Acc(1, 12) Squ_Acc(2, 11) Squ_Acc(3, 10) Squ_Acc(4, 9) Squ_Acc(5, 8) Squ_Acc(6, 7) Squ_NonDiag \
1093 Squ_SaveAcc(13, 0, 14) Squ_Acc(1, 13) Squ_Acc(2, 12) Squ_Acc(3, 11) Squ_Acc(4, 10) Squ_Acc(5, 9) Squ_Acc(6, 8) Squ_Diag(7) \
1094 Squ_SaveAcc(14, 0, 15) Squ_Acc(1, 14) Squ_Acc(2, 13) Squ_Acc(3, 12) Squ_Acc(4, 11) Squ_Acc(5, 10) Squ_Acc(6, 9) Squ_Acc(7, 8) Squ_NonDiag \
1095 Squ_SaveAcc(15, 1, 15) Squ_Acc(2, 14) Squ_Acc(3, 13) Squ_Acc(4, 12) Squ_Acc(5, 11) Squ_Acc(6, 10) Squ_Acc(7, 9) Squ_Diag(8) \
1096 Squ_SaveAcc(16, 2, 15) Squ_Acc(3, 14) Squ_Acc(4, 13) Squ_Acc(5, 12) Squ_Acc(6, 11) Squ_Acc(7, 10) Squ_Acc(8, 9) Squ_NonDiag \
1097 Squ_SaveAcc(17, 3, 15) Squ_Acc(4, 14) Squ_Acc(5, 13) Squ_Acc(6, 12) Squ_Acc(7, 11) Squ_Acc(8, 10) Squ_Diag(9) \
1098 Squ_SaveAcc(18, 4, 15) Squ_Acc(5, 14) Squ_Acc(6, 13) Squ_Acc(7, 12) Squ_Acc(8, 11) Squ_Acc(9, 10) Squ_NonDiag \
1099 Squ_SaveAcc(19, 5, 15) Squ_Acc(6, 14) Squ_Acc(7, 13) Squ_Acc(8, 12) Squ_Acc(9, 11) Squ_Diag(10) \
1100 Squ_SaveAcc(20, 6, 15) Squ_Acc(7, 14) Squ_Acc(8, 13) Squ_Acc(9, 12) Squ_Acc(10, 11) Squ_NonDiag \
1101 Squ_SaveAcc(21, 7, 15) Squ_Acc(8, 14) Squ_Acc(9, 13) Squ_Acc(10, 12) Squ_Diag(11) \
1102 Squ_SaveAcc(22, 8, 15) Squ_Acc(9, 14) Squ_Acc(10, 13) Squ_Acc(11, 12) Squ_NonDiag \
1103 Squ_SaveAcc(23, 9, 15) Squ_Acc(10, 14) Squ_Acc(11, 13) Squ_Diag(12) \
1104 Squ_SaveAcc(24, 10, 15) Squ_Acc(11, 14) Squ_Acc(12, 13) Squ_NonDiag \
1105 Squ_SaveAcc(25, 11, 15) Squ_Acc(12, 14) Squ_Diag(13) \
1106 Squ_SaveAcc(26, 12, 15) Squ_Acc(13, 14) Squ_NonDiag \
1107 Squ_SaveAcc(27, 13, 15) Squ_Diag(14) \
1108 Squ_SaveAcc(28, 14, 15) Squ_NonDiag \
1109 Squ_End(16)
1110
1111#define Bot_2 \
1112 Mul_Begin(2) \
1113 Bot_SaveAcc(0, 0, 1) Bot_Acc(1, 0) \
1114 Bot_End(2)
1115
1116#define Bot_4 \
1117 Mul_Begin(4) \
1118 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1119 Mul_SaveAcc(1, 2, 0) Mul_Acc(1, 1) Mul_Acc(0, 2) \
1120 Bot_SaveAcc(2, 0, 3) Bot_Acc(1, 2) Bot_Acc(2, 1) Bot_Acc(3, 0) \
1121 Bot_End(4)
1122
1123#define Bot_8 \
1124 Mul_Begin(8) \
1125 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1126 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1127 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1128 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1129 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1130 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1131 Bot_SaveAcc(6, 0, 7) Bot_Acc(1, 6) Bot_Acc(2, 5) Bot_Acc(3, 4) Bot_Acc(4, 3) Bot_Acc(5, 2) Bot_Acc(6, 1) Bot_Acc(7, 0) \
1132 Bot_End(8)
1133
1134#define Bot_16 \
1135 Mul_Begin(16) \
1136 Mul_SaveAcc(0, 0, 1) Mul_Acc(1, 0) \
1137 Mul_SaveAcc(1, 0, 2) Mul_Acc(1, 1) Mul_Acc(2, 0) \
1138 Mul_SaveAcc(2, 0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1139 Mul_SaveAcc(3, 0, 4) Mul_Acc(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) Mul_Acc(4, 0) \
1140 Mul_SaveAcc(4, 0, 5) Mul_Acc(1, 4) Mul_Acc(2, 3) Mul_Acc(3, 2) Mul_Acc(4, 1) Mul_Acc(5, 0) \
1141 Mul_SaveAcc(5, 0, 6) Mul_Acc(1, 5) Mul_Acc(2, 4) Mul_Acc(3, 3) Mul_Acc(4, 2) Mul_Acc(5, 1) Mul_Acc(6, 0) \
1142 Mul_SaveAcc(6, 0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1143 Mul_SaveAcc(7, 0, 8) Mul_Acc(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) Mul_Acc(8, 0) \
1144 Mul_SaveAcc(8, 0, 9) Mul_Acc(1, 8) Mul_Acc(2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) Mul_Acc(8, 1) Mul_Acc(9, 0) \
1145 Mul_SaveAcc(9, 0, 10) Mul_Acc(1, 9) Mul_Acc(2, 8) Mul_Acc(3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) Mul_Acc(8, 2) Mul_Acc(9, 1) Mul_Acc(10, 0) \
1146 Mul_SaveAcc(10, 0, 11) Mul_Acc(1, 10) Mul_Acc(2, 9) Mul_Acc(3, 8) Mul_Acc(4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) Mul_Acc(8, 3) Mul_Acc(9, 2) Mul_Acc(10, 1) Mul_Acc(11, 0) \
1147 Mul_SaveAcc(11, 0, 12) Mul_Acc(1, 11) Mul_Acc(2, 10) Mul_Acc(3, 9) Mul_Acc(4, 8) Mul_Acc(5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) Mul_Acc(8, 4) Mul_Acc(9, 3) Mul_Acc(10, 2) Mul_Acc(11, 1) Mul_Acc(12, 0) \
1148 Mul_SaveAcc(12, 0, 13) Mul_Acc(1, 12) Mul_Acc(2, 11) Mul_Acc(3, 10) Mul_Acc(4, 9) Mul_Acc(5, 8) Mul_Acc(6, 7) Mul_Acc(7, 6) Mul_Acc(8, 5) Mul_Acc(9, 4) Mul_Acc(10, 3) Mul_Acc(11, 2) Mul_Acc(12, 1) Mul_Acc(13, 0) \
1149 Mul_SaveAcc(13, 0, 14) Mul_Acc(1, 13) Mul_Acc(2, 12) Mul_Acc(3, 11) Mul_Acc(4, 10) Mul_Acc(5, 9) Mul_Acc(6, 8) Mul_Acc(7, 7) Mul_Acc(8, 6) Mul_Acc(9, 5) Mul_Acc(10, 4) Mul_Acc(11, 3) Mul_Acc(12, 2) Mul_Acc(13, 1) Mul_Acc(14, 0) \
1150 Bot_SaveAcc(14, 0, 15) Bot_Acc(1, 14) Bot_Acc(2, 13) Bot_Acc(3, 12) Bot_Acc(4, 11) Bot_Acc(5, 10) Bot_Acc(6, 9) Bot_Acc(7, 8) Bot_Acc(8, 7) Bot_Acc(9, 6) Bot_Acc(10, 5) Bot_Acc(11, 4) Bot_Acc(12, 3) Bot_Acc(13, 2) Bot_Acc(14, 1) Bot_Acc(15, 0) \
1151 Bot_End(16)
1152
1153#endif
1154
1155#if 0
1156#define Mul_Begin(n) \
1157 Declare2Words(p) \
1158 Declare2Words(c) \
1159 Declare2Words(d) \
1160 MultiplyWords(p, A[0], B[0]) \
1161 AssignWord(c, LowWord(p)) \
1162 AssignWord(d, HighWord(p))
1163
1164#define Mul_Acc(i, j) \
1165 MultiplyWords(p, A[i], B[j]) \
1166 Acc2WordsBy1(c, LowWord(p)) \
1167 Acc2WordsBy1(d, HighWord(p))
1168
1169#define Mul_SaveAcc(k, i, j) \
1170 R[k] = LowWord(c); \
1171 Add2WordsBy1(c, d, HighWord(c)) \
1172 MultiplyWords(p, A[i], B[j]) \
1173 AssignWord(d, HighWord(p)) \
1174 Acc2WordsBy1(c, LowWord(p))
1175
1176#define Mul_End(n) \
1177 R[2*n-3] = LowWord(c); \
1178 Acc2WordsBy1(d, HighWord(c)) \
1179 MultiplyWords(p, A[n-1], B[n-1])\
1180 Acc2WordsBy2(d, p) \
1181 R[2*n-2] = LowWord(d); \
1182 R[2*n-1] = HighWord(d);
1183
1184#define Bot_SaveAcc(k, i, j) \
1185 R[k] = LowWord(c); \
1186 word e = LowWord(d) + HighWord(c); \
1187 e += A[i] * B[j];
1188
1189#define Bot_Acc(i, j) \
1190 e += A[i] * B[j];
1191
1192#define Bot_End(n) \
1193 R[n-1] = e;
1194#else
1195#define Mul_Begin(n) \
1196 Declare2Words(p) \
1197 word c; \
1198 Declare2Words(d) \
1199 MultiplyWords(p, A[0], B[0]) \
1200 c = LowWord(p); \
1201 AssignWord(d, HighWord(p))
1202
1203#define Mul_Acc(i, j) \
1204 MulAcc(c, d, A[i], B[j])
1205
1206#define Mul_SaveAcc(k, i, j) \
1207 R[k] = c; \
1208 c = LowWord(d); \
1209 AssignWord(d, HighWord(d)) \
1210 MulAcc(c, d, A[i], B[j])
1211
1212#define Mul_End(k, i) \
1213 R[k] = c; \
1214 MultiplyWords(p, A[i], B[i]) \
1215 Acc2WordsBy2(p, d) \
1216 R[k+1] = LowWord(p); \
1217 R[k+2] = HighWord(p);
1218
1219#define Bot_SaveAcc(k, i, j) \
1220 R[k] = c; \
1221 c = LowWord(d); \
1222 c += A[i] * B[j];
1223
1224#define Bot_Acc(i, j) \
1225 c += A[i] * B[j];
1226
1227#define Bot_End(n) \
1228 R[n-1] = c;
1229#endif
1230
1231#define Squ_Begin(n) \
1232 Declare2Words(p) \
1233 word c; \
1234 Declare2Words(d) \
1235 Declare2Words(e) \
1236 MultiplyWords(p, A[0], A[0]) \
1237 R[0] = LowWord(p); \
1238 AssignWord(e, HighWord(p)) \
1239 MultiplyWords(p, A[0], A[1]) \
1240 c = LowWord(p); \
1241 AssignWord(d, HighWord(p)) \
1242 Squ_NonDiag \
1243
1244#define Squ_NonDiag \
1245 Double3Words(c, d)
1246
1247#define Squ_SaveAcc(k, i, j) \
1248 Acc3WordsBy2(c, d, e) \
1249 R[k] = c; \
1250 MultiplyWords(p, A[i], A[j]) \
1251 c = LowWord(p); \
1252 AssignWord(d, HighWord(p)) \
1253
1254#define Squ_Acc(i, j) \
1255 MulAcc(c, d, A[i], A[j])
1256
1257#define Squ_Diag(i) \
1258 Squ_NonDiag \
1259 MulAcc(c, d, A[i], A[i])
1260
1261#define Squ_End(n) \
1262 Acc3WordsBy2(c, d, e) \
1263 R[2*n-3] = c; \
1264 MultiplyWords(p, A[n-1], A[n-1])\
1265 Acc2WordsBy2(p, e) \
1266 R[2*n-2] = LowWord(p); \
1267 R[2*n-1] = HighWord(p);
1268
1269
1270void Baseline_Multiply2(word *R, const word *AA, const word *BB)
1271{
1272 // http://github.com/weidai11/cryptopp/issues/188
1275
1276 Mul_2
1277}
1278
1279void Baseline_Multiply4(word *R, const word *AA, const word *BB)
1280{
1281 // http://github.com/weidai11/cryptopp/issues/188
1284
1285 Mul_4
1286}
1287
1288void Baseline_Multiply8(word *R, const word *AA, const word *BB)
1289{
1290 // http://github.com/weidai11/cryptopp/issues/188
1293
1294 Mul_8
1295}
1296
1297void Baseline_Square2(word *R, const word *AA)
1298{
1299 // http://github.com/weidai11/cryptopp/issues/188
1301
1302 Squ_2
1303}
1304
1305void Baseline_Square4(word *R, const word *AA)
1306{
1307 // http://github.com/weidai11/cryptopp/issues/188
1309
1310 Squ_4
1311}
1312
1313void Baseline_Square8(word *R, const word *AA)
1314{
1315 // http://github.com/weidai11/cryptopp/issues/188
1317
1318 Squ_8
1319}
1320
1321void Baseline_MultiplyBottom2(word *R, const word *AA, const word *BB)
1322{
1323 // http://github.com/weidai11/cryptopp/issues/188
1326
1327 Bot_2
1328
1329// http://github.com/weidai11/cryptopp/issues/340
1330#if defined(TWO_64_BIT_WORDS)
1331 CRYPTOPP_UNUSED(d0); CRYPTOPP_UNUSED(d1);
1332#endif
1333}
1334
1335void Baseline_MultiplyBottom4(word *R, const word *AA, const word *BB)
1336{
1337 // http://github.com/weidai11/cryptopp/issues/188
1340
1341 Bot_4
1342}
1343
1344void Baseline_MultiplyBottom8(word *R, const word *AA, const word *BB)
1345{
1346 // http://github.com/weidai11/cryptopp/issues/188
1349
1350 Bot_8
1351}
1352
1353#define Top_Begin(n) \
1354 Declare2Words(p) \
1355 word c; \
1356 Declare2Words(d) \
1357 MultiplyWords(p, A[0], B[n-2]);\
1358 AssignWord(d, HighWord(p));
1359
1360#define Top_Acc(i, j) \
1361 MultiplyWords(p, A[i], B[j]);\
1362 Acc2WordsBy1(d, HighWord(p));
1363
1364#define Top_SaveAcc0(i, j) \
1365 c = LowWord(d); \
1366 AssignWord(d, HighWord(d)) \
1367 MulAcc(c, d, A[i], B[j])
1368
1369#define Top_SaveAcc1(i, j) \
1370 c = L<c; \
1371 Acc2WordsBy1(d, c); \
1372 c = LowWord(d); \
1373 AssignWord(d, HighWord(d)) \
1374 MulAcc(c, d, A[i], B[j])
1375
1376void Baseline_MultiplyTop2(word *R, const word *A, const word *B, word L)
1377{
1378 CRYPTOPP_UNUSED(L);
1379 word T[4];
1380 Baseline_Multiply2(T, A, B);
1381 R[0] = T[2];
1382 R[1] = T[3];
1383}
1384
1385void Baseline_MultiplyTop4(word *R, const word *AA, const word *BB, word L)
1386{
1387 // http://github.com/weidai11/cryptopp/issues/188
1390
1391 Top_Begin(4)
1392 Top_Acc(1, 1) Top_Acc(2, 0) \
1393 Top_SaveAcc0(0, 3) Mul_Acc(1, 2) Mul_Acc(2, 1) Mul_Acc(3, 0) \
1394 Top_SaveAcc1(1, 3) Mul_Acc(2, 2) Mul_Acc(3, 1) \
1395 Mul_SaveAcc(0, 2, 3) Mul_Acc(3, 2) \
1396 Mul_End(1, 3)
1397}
1398
1399void Baseline_MultiplyTop8(word *R, const word *AA, const word *BB, word L)
1400{
1401 // http://github.com/weidai11/cryptopp/issues/188
1404
1405 Top_Begin(8)
1406 Top_Acc(1, 5) Top_Acc(2, 4) Top_Acc(3, 3) Top_Acc(4, 2) Top_Acc(5, 1) Top_Acc(6, 0) \
1407 Top_SaveAcc0(0, 7) Mul_Acc(1, 6) Mul_Acc(2, 5) Mul_Acc(3, 4) Mul_Acc(4, 3) Mul_Acc(5, 2) Mul_Acc(6, 1) Mul_Acc(7, 0) \
1408 Top_SaveAcc1(1, 7) Mul_Acc(2, 6) Mul_Acc(3, 5) Mul_Acc(4, 4) Mul_Acc(5, 3) Mul_Acc(6, 2) Mul_Acc(7, 1) \
1409 Mul_SaveAcc(0, 2, 7) Mul_Acc(3, 6) Mul_Acc(4, 5) Mul_Acc(5, 4) Mul_Acc(6, 3) Mul_Acc(7, 2) \
1410 Mul_SaveAcc(1, 3, 7) Mul_Acc(4, 6) Mul_Acc(5, 5) Mul_Acc(6, 4) Mul_Acc(7, 3) \
1411 Mul_SaveAcc(2, 4, 7) Mul_Acc(5, 6) Mul_Acc(6, 5) Mul_Acc(7, 4) \
1412 Mul_SaveAcc(3, 5, 7) Mul_Acc(6, 6) Mul_Acc(7, 5) \
1413 Mul_SaveAcc(4, 6, 7) Mul_Acc(7, 6) \
1414 Mul_End(5, 7)
1415}
1416
1417#if !CRYPTOPP_INTEGER_SSE2 // save memory by not compiling these functions when SSE2 is available
1418void Baseline_Multiply16(word *R, const word *AA, const word *BB)
1419{
1420 // http://github.com/weidai11/cryptopp/issues/188
1423
1424 Mul_16
1425}
1426
1427void Baseline_Square16(word *R, const word *AA)
1428{
1429 // http://github.com/weidai11/cryptopp/issues/188
1431
1432 Squ_16
1433}
1434
1435void Baseline_MultiplyBottom16(word *R, const word *AA, const word *BB)
1436{
1437 // http://github.com/weidai11/cryptopp/issues/188
1440
1441 Bot_16
1442}
1443
1444void Baseline_MultiplyTop16(word *R, const word *AA, const word *BB, word L)
1445{
1446 // http://github.com/weidai11/cryptopp/issues/188
1449
1450 Top_Begin(16)
1451 Top_Acc(1, 13) Top_Acc(2, 12) Top_Acc(3, 11) Top_Acc(4, 10) Top_Acc(5, 9) Top_Acc(6, 8) Top_Acc(7, 7) Top_Acc(8, 6) Top_Acc(9, 5) Top_Acc(10, 4) Top_Acc(11, 3) Top_Acc(12, 2) Top_Acc(13, 1) Top_Acc(14, 0) \
1452 Top_SaveAcc0(0, 15) Mul_Acc(1, 14) Mul_Acc(2, 13) Mul_Acc(3, 12) Mul_Acc(4, 11) Mul_Acc(5, 10) Mul_Acc(6, 9) Mul_Acc(7, 8) Mul_Acc(8, 7) Mul_Acc(9, 6) Mul_Acc(10, 5) Mul_Acc(11, 4) Mul_Acc(12, 3) Mul_Acc(13, 2) Mul_Acc(14, 1) Mul_Acc(15, 0) \
1453 Top_SaveAcc1(1, 15) Mul_Acc(2, 14) Mul_Acc(3, 13) Mul_Acc(4, 12) Mul_Acc(5, 11) Mul_Acc(6, 10) Mul_Acc(7, 9) Mul_Acc(8, 8) Mul_Acc(9, 7) Mul_Acc(10, 6) Mul_Acc(11, 5) Mul_Acc(12, 4) Mul_Acc(13, 3) Mul_Acc(14, 2) Mul_Acc(15, 1) \
1454 Mul_SaveAcc(0, 2, 15) Mul_Acc(3, 14) Mul_Acc(4, 13) Mul_Acc(5, 12) Mul_Acc(6, 11) Mul_Acc(7, 10) Mul_Acc(8, 9) Mul_Acc(9, 8) Mul_Acc(10, 7) Mul_Acc(11, 6) Mul_Acc(12, 5) Mul_Acc(13, 4) Mul_Acc(14, 3) Mul_Acc(15, 2) \
1455 Mul_SaveAcc(1, 3, 15) Mul_Acc(4, 14) Mul_Acc(5, 13) Mul_Acc(6, 12) Mul_Acc(7, 11) Mul_Acc(8, 10) Mul_Acc(9, 9) Mul_Acc(10, 8) Mul_Acc(11, 7) Mul_Acc(12, 6) Mul_Acc(13, 5) Mul_Acc(14, 4) Mul_Acc(15, 3) \
1456 Mul_SaveAcc(2, 4, 15) Mul_Acc(5, 14) Mul_Acc(6, 13) Mul_Acc(7, 12) Mul_Acc(8, 11) Mul_Acc(9, 10) Mul_Acc(10, 9) Mul_Acc(11, 8) Mul_Acc(12, 7) Mul_Acc(13, 6) Mul_Acc(14, 5) Mul_Acc(15, 4) \
1457 Mul_SaveAcc(3, 5, 15) Mul_Acc(6, 14) Mul_Acc(7, 13) Mul_Acc(8, 12) Mul_Acc(9, 11) Mul_Acc(10, 10) Mul_Acc(11, 9) Mul_Acc(12, 8) Mul_Acc(13, 7) Mul_Acc(14, 6) Mul_Acc(15, 5) \
1458 Mul_SaveAcc(4, 6, 15) Mul_Acc(7, 14) Mul_Acc(8, 13) Mul_Acc(9, 12) Mul_Acc(10, 11) Mul_Acc(11, 10) Mul_Acc(12, 9) Mul_Acc(13, 8) Mul_Acc(14, 7) Mul_Acc(15, 6) \
1459 Mul_SaveAcc(5, 7, 15) Mul_Acc(8, 14) Mul_Acc(9, 13) Mul_Acc(10, 12) Mul_Acc(11, 11) Mul_Acc(12, 10) Mul_Acc(13, 9) Mul_Acc(14, 8) Mul_Acc(15, 7) \
1460 Mul_SaveAcc(6, 8, 15) Mul_Acc(9, 14) Mul_Acc(10, 13) Mul_Acc(11, 12) Mul_Acc(12, 11) Mul_Acc(13, 10) Mul_Acc(14, 9) Mul_Acc(15, 8) \
1461 Mul_SaveAcc(7, 9, 15) Mul_Acc(10, 14) Mul_Acc(11, 13) Mul_Acc(12, 12) Mul_Acc(13, 11) Mul_Acc(14, 10) Mul_Acc(15, 9) \
1462 Mul_SaveAcc(8, 10, 15) Mul_Acc(11, 14) Mul_Acc(12, 13) Mul_Acc(13, 12) Mul_Acc(14, 11) Mul_Acc(15, 10) \
1463 Mul_SaveAcc(9, 11, 15) Mul_Acc(12, 14) Mul_Acc(13, 13) Mul_Acc(14, 12) Mul_Acc(15, 11) \
1464 Mul_SaveAcc(10, 12, 15) Mul_Acc(13, 14) Mul_Acc(14, 13) Mul_Acc(15, 12) \
1465 Mul_SaveAcc(11, 13, 15) Mul_Acc(14, 14) Mul_Acc(15, 13) \
1466 Mul_SaveAcc(12, 14, 15) Mul_Acc(15, 14) \
1467 Mul_End(13, 15)
1468}
1469#endif
1470
1471// ********************************************************
1472
1473#if CRYPTOPP_INTEGER_SSE2
1474
1475CRYPTOPP_ALIGN_DATA(16)
1477const word32 s_maskLow16[4] = {
1478 0xffff,0xffff,0xffff,0xffff
1479};
1480
1481#undef Mul_Begin
1482#undef Mul_Acc
1483#undef Top_Begin
1484#undef Top_Acc
1485#undef Squ_Acc
1486#undef Squ_NonDiag
1487#undef Squ_Diag
1488#undef Squ_SaveAcc
1489#undef Squ_Begin
1490#undef Mul_SaveAcc
1491#undef Bot_Acc
1492#undef Bot_SaveAcc
1493#undef Bot_End
1494#undef Squ_End
1495#undef Mul_End
1496
1497#define SSE2_FinalSave(k) \
1498 AS2( psllq xmm5, 16) \
1499 AS2( paddq xmm4, xmm5) \
1500 AS2( movq QWORD PTR [ecx+8*(k)], xmm4)
1501
1502#define SSE2_SaveShift(k) \
1503 AS2( movq xmm0, xmm6) \
1504 AS2( punpckhqdq xmm6, xmm0) \
1505 AS2( movq xmm1, xmm7) \
1506 AS2( punpckhqdq xmm7, xmm1) \
1507 AS2( paddd xmm6, xmm0) \
1508 AS2( pslldq xmm6, 4) \
1509 AS2( paddd xmm7, xmm1) \
1510 AS2( paddd xmm4, xmm6) \
1511 AS2( pslldq xmm7, 4) \
1512 AS2( movq xmm6, xmm4) \
1513 AS2( paddd xmm5, xmm7) \
1514 AS2( movq xmm7, xmm5) \
1515 AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1516 AS2( psrlq xmm6, 16) \
1517 AS2( paddq xmm6, xmm7) \
1518 AS2( punpckhqdq xmm4, xmm0) \
1519 AS2( punpckhqdq xmm5, xmm0) \
1520 AS2( movq QWORD PTR [ecx+8*(k)+2], xmm6) \
1521 AS2( psrlq xmm6, 3*16) \
1522 AS2( paddd xmm4, xmm6) \
1523
1524#define Squ_SSE2_SaveShift(k) \
1525 AS2( movq xmm0, xmm6) \
1526 AS2( punpckhqdq xmm6, xmm0) \
1527 AS2( movq xmm1, xmm7) \
1528 AS2( punpckhqdq xmm7, xmm1) \
1529 AS2( paddd xmm6, xmm0) \
1530 AS2( pslldq xmm6, 4) \
1531 AS2( paddd xmm7, xmm1) \
1532 AS2( paddd xmm4, xmm6) \
1533 AS2( pslldq xmm7, 4) \
1534 AS2( movhlps xmm6, xmm4) \
1535 AS2( movd DWORD PTR [ecx+8*(k)], xmm4) \
1536 AS2( paddd xmm5, xmm7) \
1537 AS2( movhps QWORD PTR [esp+12], xmm5)\
1538 AS2( psrlq xmm4, 16) \
1539 AS2( paddq xmm4, xmm5) \
1540 AS2( movq QWORD PTR [ecx+8*(k)+2], xmm4) \
1541 AS2( psrlq xmm4, 3*16) \
1542 AS2( paddd xmm4, xmm6) \
1543 AS2( movq QWORD PTR [esp+4], xmm4)\
1544
1545#define SSE2_FirstMultiply(i) \
1546 AS2( movdqa xmm7, [esi+(i)*16])\
1547 AS2( movdqa xmm5, [edi-(i)*16])\
1548 AS2( pmuludq xmm5, xmm7) \
1549 AS2( movdqa xmm4, [ebx])\
1550 AS2( movdqa xmm6, xmm4) \
1551 AS2( pand xmm4, xmm5) \
1552 AS2( psrld xmm5, 16) \
1553 AS2( pmuludq xmm7, [edx-(i)*16])\
1554 AS2( pand xmm6, xmm7) \
1555 AS2( psrld xmm7, 16)
1556
1557#define Squ_Begin(n) \
1558 SquPrologue \
1559 AS2( mov esi, esp)\
1560 AS2( and esp, 0xfffffff0)\
1561 AS2( lea edi, [esp-32*n])\
1562 AS2( sub esp, 32*n+16)\
1563 AS1( push esi)\
1564 AS2( mov esi, edi) \
1565 AS2( xor edx, edx) \
1566 ASL(1) \
1567 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1568 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1569 AS2( movdqa [edi+2*edx], xmm0) \
1570 AS2( psrlq xmm0, 32) \
1571 AS2( movdqa [edi+2*edx+16], xmm0) \
1572 AS2( movdqa [edi+16*n+2*edx], xmm1) \
1573 AS2( psrlq xmm1, 32) \
1574 AS2( movdqa [edi+16*n+2*edx+16], xmm1) \
1575 AS2( add edx, 16) \
1576 AS2( cmp edx, 8*(n)) \
1577 ASJ( jne, 1, b) \
1578 AS2( lea edx, [edi+16*n])\
1579 SSE2_FirstMultiply(0) \
1580
1581#define Squ_Acc(i) \
1582 ASL(LSqu##i) \
1583 AS2( movdqa xmm1, [esi+(i)*16]) \
1584 AS2( movdqa xmm0, [edi-(i)*16]) \
1585 AS2( movdqa xmm2, [ebx]) \
1586 AS2( pmuludq xmm0, xmm1) \
1587 AS2( pmuludq xmm1, [edx-(i)*16]) \
1588 AS2( movdqa xmm3, xmm2) \
1589 AS2( pand xmm2, xmm0) \
1590 AS2( psrld xmm0, 16) \
1591 AS2( paddd xmm4, xmm2) \
1592 AS2( paddd xmm5, xmm0) \
1593 AS2( pand xmm3, xmm1) \
1594 AS2( psrld xmm1, 16) \
1595 AS2( paddd xmm6, xmm3) \
1596 AS2( paddd xmm7, xmm1) \
1597
1598#define Squ_Acc1(i)
1599#define Squ_Acc2(i) ASC(call, LSqu##i)
1600#define Squ_Acc3(i) Squ_Acc2(i)
1601#define Squ_Acc4(i) Squ_Acc2(i)
1602#define Squ_Acc5(i) Squ_Acc2(i)
1603#define Squ_Acc6(i) Squ_Acc2(i)
1604#define Squ_Acc7(i) Squ_Acc2(i)
1605#define Squ_Acc8(i) Squ_Acc2(i)
1606
1607#define SSE2_End(E, n) \
1608 SSE2_SaveShift(2*(n)-3) \
1609 AS2( movdqa xmm7, [esi+16]) \
1610 AS2( movdqa xmm0, [edi]) \
1611 AS2( pmuludq xmm0, xmm7) \
1612 AS2( movdqa xmm2, [ebx]) \
1613 AS2( pmuludq xmm7, [edx]) \
1614 AS2( movdqa xmm6, xmm2) \
1615 AS2( pand xmm2, xmm0) \
1616 AS2( psrld xmm0, 16) \
1617 AS2( paddd xmm4, xmm2) \
1618 AS2( paddd xmm5, xmm0) \
1619 AS2( pand xmm6, xmm7) \
1620 AS2( psrld xmm7, 16) \
1621 SSE2_SaveShift(2*(n)-2) \
1622 SSE2_FinalSave(2*(n)-1) \
1623 AS1( pop esp)\
1624 E
1625
1626#define Squ_End(n) SSE2_End(SquEpilogue, n)
1627#define Mul_End(n) SSE2_End(MulEpilogue, n)
1628#define Top_End(n) SSE2_End(TopEpilogue, n)
1629
1630#define Squ_Column1(k, i) \
1631 Squ_SSE2_SaveShift(k) \
1632 AS2( add esi, 16) \
1633 SSE2_FirstMultiply(1)\
1634 Squ_Acc##i(i) \
1635 AS2( paddd xmm4, xmm4) \
1636 AS2( paddd xmm5, xmm5) \
1637 AS2( movdqa xmm3, [esi]) \
1638 AS2( movq xmm1, QWORD PTR [esi+8]) \
1639 AS2( pmuludq xmm1, xmm3) \
1640 AS2( pmuludq xmm3, xmm3) \
1641 AS2( movdqa xmm0, [ebx])\
1642 AS2( movdqa xmm2, xmm0) \
1643 AS2( pand xmm0, xmm1) \
1644 AS2( psrld xmm1, 16) \
1645 AS2( paddd xmm6, xmm0) \
1646 AS2( paddd xmm7, xmm1) \
1647 AS2( pand xmm2, xmm3) \
1648 AS2( psrld xmm3, 16) \
1649 AS2( paddd xmm6, xmm6) \
1650 AS2( paddd xmm7, xmm7) \
1651 AS2( paddd xmm4, xmm2) \
1652 AS2( paddd xmm5, xmm3) \
1653 AS2( movq xmm0, QWORD PTR [esp+4])\
1654 AS2( movq xmm1, QWORD PTR [esp+12])\
1655 AS2( paddd xmm4, xmm0)\
1656 AS2( paddd xmm5, xmm1)\
1657
1658#define Squ_Column0(k, i) \
1659 Squ_SSE2_SaveShift(k) \
1660 AS2( add edi, 16) \
1661 AS2( add edx, 16) \
1662 SSE2_FirstMultiply(1)\
1663 Squ_Acc##i(i) \
1664 AS2( paddd xmm6, xmm6) \
1665 AS2( paddd xmm7, xmm7) \
1666 AS2( paddd xmm4, xmm4) \
1667 AS2( paddd xmm5, xmm5) \
1668 AS2( movq xmm0, QWORD PTR [esp+4])\
1669 AS2( movq xmm1, QWORD PTR [esp+12])\
1670 AS2( paddd xmm4, xmm0)\
1671 AS2( paddd xmm5, xmm1)\
1672
1673#define SSE2_MulAdd45 \
1674 AS2( movdqa xmm7, [esi]) \
1675 AS2( movdqa xmm0, [edi]) \
1676 AS2( pmuludq xmm0, xmm7) \
1677 AS2( movdqa xmm2, [ebx]) \
1678 AS2( pmuludq xmm7, [edx]) \
1679 AS2( movdqa xmm6, xmm2) \
1680 AS2( pand xmm2, xmm0) \
1681 AS2( psrld xmm0, 16) \
1682 AS2( paddd xmm4, xmm2) \
1683 AS2( paddd xmm5, xmm0) \
1684 AS2( pand xmm6, xmm7) \
1685 AS2( psrld xmm7, 16)
1686
1687#define Mul_Begin(n) \
1688 MulPrologue \
1689 AS2( mov esi, esp)\
1690 AS2( and esp, 0xfffffff0)\
1691 AS2( sub esp, 48*n+16)\
1692 AS1( push esi)\
1693 AS2( xor edx, edx) \
1694 ASL(1) \
1695 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1696 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1697 ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1698 AS2( movdqa [esp+20+2*edx], xmm0) \
1699 AS2( psrlq xmm0, 32) \
1700 AS2( movdqa [esp+20+2*edx+16], xmm0) \
1701 AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1702 AS2( psrlq xmm1, 32) \
1703 AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1704 AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1705 AS2( psrlq xmm2, 32) \
1706 AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1707 AS2( add edx, 16) \
1708 AS2( cmp edx, 8*(n)) \
1709 ASJ( jne, 1, b) \
1710 AS2( lea edi, [esp+20])\
1711 AS2( lea edx, [esp+20+16*n])\
1712 AS2( lea esi, [esp+20+32*n])\
1713 SSE2_FirstMultiply(0) \
1714
1715#define Mul_Acc(i) \
1716 ASL(LMul##i) \
1717 AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1718 AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1719 AS2( movdqa xmm2, [ebx]) \
1720 AS2( pmuludq xmm0, xmm1) \
1721 AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1722 AS2( movdqa xmm3, xmm2) \
1723 AS2( pand xmm2, xmm0) \
1724 AS2( psrld xmm0, 16) \
1725 AS2( paddd xmm4, xmm2) \
1726 AS2( paddd xmm5, xmm0) \
1727 AS2( pand xmm3, xmm1) \
1728 AS2( psrld xmm1, 16) \
1729 AS2( paddd xmm6, xmm3) \
1730 AS2( paddd xmm7, xmm1) \
1731
1732#define Mul_Acc1(i)
1733#define Mul_Acc2(i) ASC(call, LMul##i)
1734#define Mul_Acc3(i) Mul_Acc2(i)
1735#define Mul_Acc4(i) Mul_Acc2(i)
1736#define Mul_Acc5(i) Mul_Acc2(i)
1737#define Mul_Acc6(i) Mul_Acc2(i)
1738#define Mul_Acc7(i) Mul_Acc2(i)
1739#define Mul_Acc8(i) Mul_Acc2(i)
1740#define Mul_Acc9(i) Mul_Acc2(i)
1741#define Mul_Acc10(i) Mul_Acc2(i)
1742#define Mul_Acc11(i) Mul_Acc2(i)
1743#define Mul_Acc12(i) Mul_Acc2(i)
1744#define Mul_Acc13(i) Mul_Acc2(i)
1745#define Mul_Acc14(i) Mul_Acc2(i)
1746#define Mul_Acc15(i) Mul_Acc2(i)
1747#define Mul_Acc16(i) Mul_Acc2(i)
1748
1749#define Mul_Column1(k, i) \
1750 SSE2_SaveShift(k) \
1751 AS2( add esi, 16) \
1752 SSE2_MulAdd45\
1753 Mul_Acc##i(i) \
1754
1755#define Mul_Column0(k, i) \
1756 SSE2_SaveShift(k) \
1757 AS2( add edi, 16) \
1758 AS2( add edx, 16) \
1759 SSE2_MulAdd45\
1760 Mul_Acc##i(i) \
1761
1762#define Bot_Acc(i) \
1763 AS2( movdqa xmm1, [esi+i/2*(1-(i-2*(i/2))*2)*16]) \
1764 AS2( movdqa xmm0, [edi-i/2*(1-(i-2*(i/2))*2)*16]) \
1765 AS2( pmuludq xmm0, xmm1) \
1766 AS2( pmuludq xmm1, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1767 AS2( paddq xmm4, xmm0) \
1768 AS2( paddd xmm6, xmm1)
1769
1770#define Bot_SaveAcc(k) \
1771 SSE2_SaveShift(k) \
1772 AS2( add edi, 16) \
1773 AS2( add edx, 16) \
1774 AS2( movdqa xmm6, [esi]) \
1775 AS2( movdqa xmm0, [edi]) \
1776 AS2( pmuludq xmm0, xmm6) \
1777 AS2( paddq xmm4, xmm0) \
1778 AS2( psllq xmm5, 16) \
1779 AS2( paddq xmm4, xmm5) \
1780 AS2( pmuludq xmm6, [edx])
1781
1782#define Bot_End(n) \
1783 AS2( movhlps xmm7, xmm6) \
1784 AS2( paddd xmm6, xmm7) \
1785 AS2( psllq xmm6, 32) \
1786 AS2( paddd xmm4, xmm6) \
1787 AS2( movq QWORD PTR [ecx+8*((n)-1)], xmm4) \
1788 AS1( pop esp)\
1789 MulEpilogue
1790
1791#define Top_Begin(n) \
1792 TopPrologue \
1793 AS2( mov edx, esp)\
1794 AS2( and esp, 0xfffffff0)\
1795 AS2( sub esp, 48*n+16)\
1796 AS1( push edx)\
1797 AS2( xor edx, edx) \
1798 ASL(1) \
1799 ASS( pshufd xmm0, [eax+edx], 3,1,2,0) \
1800 ASS( pshufd xmm1, [eax+edx], 2,0,3,1) \
1801 ASS( pshufd xmm2, [edi+edx], 3,1,2,0) \
1802 AS2( movdqa [esp+20+2*edx], xmm0) \
1803 AS2( psrlq xmm0, 32) \
1804 AS2( movdqa [esp+20+2*edx+16], xmm0) \
1805 AS2( movdqa [esp+20+16*n+2*edx], xmm1) \
1806 AS2( psrlq xmm1, 32) \
1807 AS2( movdqa [esp+20+16*n+2*edx+16], xmm1) \
1808 AS2( movdqa [esp+20+32*n+2*edx], xmm2) \
1809 AS2( psrlq xmm2, 32) \
1810 AS2( movdqa [esp+20+32*n+2*edx+16], xmm2) \
1811 AS2( add edx, 16) \
1812 AS2( cmp edx, 8*(n)) \
1813 ASJ( jne, 1, b) \
1814 AS2( mov eax, esi) \
1815 AS2( lea edi, [esp+20+00*n+16*(n/2-1)])\
1816 AS2( lea edx, [esp+20+16*n+16*(n/2-1)])\
1817 AS2( lea esi, [esp+20+32*n+16*(n/2-1)])\
1818 AS2( pxor xmm4, xmm4)\
1819 AS2( pxor xmm5, xmm5)
1820
1821#define Top_Acc(i) \
1822 AS2( movq xmm0, QWORD PTR [esi+i/2*(1-(i-2*(i/2))*2)*16+8]) \
1823 AS2( pmuludq xmm0, [edx-i/2*(1-(i-2*(i/2))*2)*16]) \
1824 AS2( psrlq xmm0, 48) \
1825 AS2( paddd xmm5, xmm0)\
1826
1827#define Top_Column0(i) \
1828 AS2( psllq xmm5, 32) \
1829 AS2( add edi, 16) \
1830 AS2( add edx, 16) \
1831 SSE2_MulAdd45\
1832 Mul_Acc##i(i) \
1833
1834#define Top_Column1(i) \
1835 SSE2_SaveShift(0) \
1836 AS2( add esi, 16) \
1837 SSE2_MulAdd45\
1838 Mul_Acc##i(i) \
1839 AS2( shr eax, 16) \
1840 AS2( movd xmm0, eax)\
1841 AS2( movd xmm1, [ecx+4])\
1842 AS2( psrld xmm1, 16)\
1843 AS2( pcmpgtd xmm1, xmm0)\
1844 AS2( psrld xmm1, 31)\
1845 AS2( paddd xmm4, xmm1)\
1846
1847void SSE2_Square4(word *C, const word *A)
1848{
1849 Squ_Begin(2)
1850 Squ_Column0(0, 1)
1851 Squ_End(2)
1852}
1853
1854void SSE2_Square8(word *C, const word *A)
1855{
1856 Squ_Begin(4)
1857#ifndef __GNUC__
1858 ASJ( jmp, 0, f)
1859 Squ_Acc(2)
1860 AS1( ret) ASL(0)
1861#endif
1862 Squ_Column0(0, 1)
1863 Squ_Column1(1, 1)
1864 Squ_Column0(2, 2)
1865 Squ_Column1(3, 1)
1866 Squ_Column0(4, 1)
1867 Squ_End(4)
1868}
1869
1870void SSE2_Square16(word *C, const word *A)
1871{
1872 Squ_Begin(8)
1873#ifndef __GNUC__
1874 ASJ( jmp, 0, f)
1875 Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1876 AS1( ret) ASL(0)
1877#endif
1878 Squ_Column0(0, 1)
1879 Squ_Column1(1, 1)
1880 Squ_Column0(2, 2)
1881 Squ_Column1(3, 2)
1882 Squ_Column0(4, 3)
1883 Squ_Column1(5, 3)
1884 Squ_Column0(6, 4)
1885 Squ_Column1(7, 3)
1886 Squ_Column0(8, 3)
1887 Squ_Column1(9, 2)
1888 Squ_Column0(10, 2)
1889 Squ_Column1(11, 1)
1890 Squ_Column0(12, 1)
1891 Squ_End(8)
1892}
1893
1894void SSE2_Square32(word *C, const word *A)
1895{
1896 Squ_Begin(16)
1897 ASJ( jmp, 0, f)
1898 Squ_Acc(8) Squ_Acc(7) Squ_Acc(6) Squ_Acc(5) Squ_Acc(4) Squ_Acc(3) Squ_Acc(2)
1899 AS1( ret) ASL(0)
1900 Squ_Column0(0, 1)
1901 Squ_Column1(1, 1)
1902 Squ_Column0(2, 2)
1903 Squ_Column1(3, 2)
1904 Squ_Column0(4, 3)
1905 Squ_Column1(5, 3)
1906 Squ_Column0(6, 4)
1907 Squ_Column1(7, 4)
1908 Squ_Column0(8, 5)
1909 Squ_Column1(9, 5)
1910 Squ_Column0(10, 6)
1911 Squ_Column1(11, 6)
1912 Squ_Column0(12, 7)
1913 Squ_Column1(13, 7)
1914 Squ_Column0(14, 8)
1915 Squ_Column1(15, 7)
1916 Squ_Column0(16, 7)
1917 Squ_Column1(17, 6)
1918 Squ_Column0(18, 6)
1919 Squ_Column1(19, 5)
1920 Squ_Column0(20, 5)
1921 Squ_Column1(21, 4)
1922 Squ_Column0(22, 4)
1923 Squ_Column1(23, 3)
1924 Squ_Column0(24, 3)
1925 Squ_Column1(25, 2)
1926 Squ_Column0(26, 2)
1927 Squ_Column1(27, 1)
1928 Squ_Column0(28, 1)
1929 Squ_End(16)
1930}
1931
1932void SSE2_Multiply4(word *C, const word *A, const word *B)
1933{
1934 Mul_Begin(2)
1935#ifndef __GNUC__
1936 ASJ( jmp, 0, f)
1937 Mul_Acc(2)
1938 AS1( ret) ASL(0)
1939#endif
1940 Mul_Column0(0, 2)
1941 Mul_End(2)
1942}
1943
1944void SSE2_Multiply8(word *C, const word *A, const word *B)
1945{
1946 Mul_Begin(4)
1947#ifndef __GNUC__
1948 ASJ( jmp, 0, f)
1949 Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1950 AS1( ret) ASL(0)
1951#endif
1952 Mul_Column0(0, 2)
1953 Mul_Column1(1, 3)
1954 Mul_Column0(2, 4)
1955 Mul_Column1(3, 3)
1956 Mul_Column0(4, 2)
1957 Mul_End(4)
1958}
1959
1960void SSE2_Multiply16(word *C, const word *A, const word *B)
1961{
1962 Mul_Begin(8)
1963#ifndef __GNUC__
1964 ASJ( jmp, 0, f)
1965 Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1966 AS1( ret) ASL(0)
1967#endif
1968 Mul_Column0(0, 2)
1969 Mul_Column1(1, 3)
1970 Mul_Column0(2, 4)
1971 Mul_Column1(3, 5)
1972 Mul_Column0(4, 6)
1973 Mul_Column1(5, 7)
1974 Mul_Column0(6, 8)
1975 Mul_Column1(7, 7)
1976 Mul_Column0(8, 6)
1977 Mul_Column1(9, 5)
1978 Mul_Column0(10, 4)
1979 Mul_Column1(11, 3)
1980 Mul_Column0(12, 2)
1981 Mul_End(8)
1982}
1983
1984void SSE2_Multiply32(word *C, const word *A, const word *B)
1985{
1986 Mul_Begin(16)
1987 ASJ( jmp, 0, f)
1988 Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
1989 AS1( ret) ASL(0)
1990 Mul_Column0(0, 2)
1991 Mul_Column1(1, 3)
1992 Mul_Column0(2, 4)
1993 Mul_Column1(3, 5)
1994 Mul_Column0(4, 6)
1995 Mul_Column1(5, 7)
1996 Mul_Column0(6, 8)
1997 Mul_Column1(7, 9)
1998 Mul_Column0(8, 10)
1999 Mul_Column1(9, 11)
2000 Mul_Column0(10, 12)
2001 Mul_Column1(11, 13)
2002 Mul_Column0(12, 14)
2003 Mul_Column1(13, 15)
2004 Mul_Column0(14, 16)
2005 Mul_Column1(15, 15)
2006 Mul_Column0(16, 14)
2007 Mul_Column1(17, 13)
2008 Mul_Column0(18, 12)
2009 Mul_Column1(19, 11)
2010 Mul_Column0(20, 10)
2011 Mul_Column1(21, 9)
2012 Mul_Column0(22, 8)
2013 Mul_Column1(23, 7)
2014 Mul_Column0(24, 6)
2015 Mul_Column1(25, 5)
2016 Mul_Column0(26, 4)
2017 Mul_Column1(27, 3)
2018 Mul_Column0(28, 2)
2019 Mul_End(16)
2020}
2021
2022void SSE2_MultiplyBottom4(word *C, const word *A, const word *B)
2023{
2024 Mul_Begin(2)
2025 Bot_SaveAcc(0) Bot_Acc(2)
2026 Bot_End(2)
2027}
2028
2029void SSE2_MultiplyBottom8(word *C, const word *A, const word *B)
2030{
2031 Mul_Begin(4)
2032#ifndef __GNUC__
2033 ASJ( jmp, 0, f)
2034 Mul_Acc(3) Mul_Acc(2)
2035 AS1( ret) ASL(0)
2036#endif
2037 Mul_Column0(0, 2)
2038 Mul_Column1(1, 3)
2039 Bot_SaveAcc(2) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2040 Bot_End(4)
2041}
2042
2043void SSE2_MultiplyBottom16(word *C, const word *A, const word *B)
2044{
2045 Mul_Begin(8)
2046#ifndef __GNUC__
2047 ASJ( jmp, 0, f)
2048 Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2049 AS1( ret) ASL(0)
2050#endif
2051 Mul_Column0(0, 2)
2052 Mul_Column1(1, 3)
2053 Mul_Column0(2, 4)
2054 Mul_Column1(3, 5)
2055 Mul_Column0(4, 6)
2056 Mul_Column1(5, 7)
2057 Bot_SaveAcc(6) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2058 Bot_End(8)
2059}
2060
2061void SSE2_MultiplyBottom32(word *C, const word *A, const word *B)
2062{
2063 Mul_Begin(16)
2064#ifndef __GNUC__
2065 ASJ( jmp, 0, f)
2066 Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2067 AS1( ret) ASL(0)
2068#endif
2069 Mul_Column0(0, 2)
2070 Mul_Column1(1, 3)
2071 Mul_Column0(2, 4)
2072 Mul_Column1(3, 5)
2073 Mul_Column0(4, 6)
2074 Mul_Column1(5, 7)
2075 Mul_Column0(6, 8)
2076 Mul_Column1(7, 9)
2077 Mul_Column0(8, 10)
2078 Mul_Column1(9, 11)
2079 Mul_Column0(10, 12)
2080 Mul_Column1(11, 13)
2081 Mul_Column0(12, 14)
2082 Mul_Column1(13, 15)
2083 Bot_SaveAcc(14) Bot_Acc(16) Bot_Acc(15) Bot_Acc(14) Bot_Acc(13) Bot_Acc(12) Bot_Acc(11) Bot_Acc(10) Bot_Acc(9) Bot_Acc(8) Bot_Acc(7) Bot_Acc(6) Bot_Acc(5) Bot_Acc(4) Bot_Acc(3) Bot_Acc(2)
2084 Bot_End(16)
2085}
2086
2087void SSE2_MultiplyTop8(word *C, const word *A, const word *B, word L)
2088{
2089 Top_Begin(4)
2090 Top_Acc(3) Top_Acc(2) Top_Acc(1)
2091#ifndef __GNUC__
2092 ASJ( jmp, 0, f)
2093 Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2094 AS1( ret) ASL(0)
2095#endif
2096 Top_Column0(4)
2097 Top_Column1(3)
2098 Mul_Column0(0, 2)
2099 Top_End(2)
2100}
2101
2102void SSE2_MultiplyTop16(word *C, const word *A, const word *B, word L)
2103{
2104 Top_Begin(8)
2105 Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2106#ifndef __GNUC__
2107 ASJ( jmp, 0, f)
2108 Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2109 AS1( ret) ASL(0)
2110#endif
2111 Top_Column0(8)
2112 Top_Column1(7)
2113 Mul_Column0(0, 6)
2114 Mul_Column1(1, 5)
2115 Mul_Column0(2, 4)
2116 Mul_Column1(3, 3)
2117 Mul_Column0(4, 2)
2118 Top_End(4)
2119}
2120
2121void SSE2_MultiplyTop32(word *C, const word *A, const word *B, word L)
2122{
2123 Top_Begin(16)
2124 Top_Acc(15) Top_Acc(14) Top_Acc(13) Top_Acc(12) Top_Acc(11) Top_Acc(10) Top_Acc(9) Top_Acc(8) Top_Acc(7) Top_Acc(6) Top_Acc(5) Top_Acc(4) Top_Acc(3) Top_Acc(2) Top_Acc(1)
2125#ifndef __GNUC__
2126 ASJ( jmp, 0, f)
2127 Mul_Acc(16) Mul_Acc(15) Mul_Acc(14) Mul_Acc(13) Mul_Acc(12) Mul_Acc(11) Mul_Acc(10) Mul_Acc(9) Mul_Acc(8) Mul_Acc(7) Mul_Acc(6) Mul_Acc(5) Mul_Acc(4) Mul_Acc(3) Mul_Acc(2)
2128 AS1( ret) ASL(0)
2129#endif
2130 Top_Column0(16)
2131 Top_Column1(15)
2132 Mul_Column0(0, 14)
2133 Mul_Column1(1, 13)
2134 Mul_Column0(2, 12)
2135 Mul_Column1(3, 11)
2136 Mul_Column0(4, 10)
2137 Mul_Column1(5, 9)
2138 Mul_Column0(6, 8)
2139 Mul_Column1(7, 7)
2140 Mul_Column0(8, 6)
2141 Mul_Column1(9, 5)
2142 Mul_Column0(10, 4)
2143 Mul_Column1(11, 3)
2144 Mul_Column0(12, 2)
2145 Top_End(8)
2146}
2147
2148#endif // #if CRYPTOPP_INTEGER_SSE2
2149
2150// ********************************************************
2151
2152typedef int (CRYPTOPP_FASTCALL * PAdd)(size_t N, word *C, const word *A, const word *B);
2153typedef void (* PMul)(word *C, const word *A, const word *B);
2154typedef void (* PSqu)(word *C, const word *A);
2155typedef void (* PMulTop)(word *C, const word *A, const word *B, word L);
2156
2157#if CRYPTOPP_INTEGER_SSE2
2158static PAdd s_pAdd = &Baseline_Add, s_pSub = &Baseline_Sub;
2159static size_t s_recursionLimit = 8;
2160#else
2161static const size_t s_recursionLimit = 16;
2162#endif // CRYPTOPP_INTEGER_SSE2
2163
2164static PMul s_pMul[9], s_pBot[9];
2165static PSqu s_pSqu[9];
2166static PMulTop s_pTop[9];
2167
2168void SetFunctionPointers()
2169{
2170 s_pMul[0] = &Baseline_Multiply2;
2171 s_pBot[0] = &Baseline_MultiplyBottom2;
2172 s_pSqu[0] = &Baseline_Square2;
2173 s_pTop[0] = &Baseline_MultiplyTop2;
2174 s_pTop[1] = &Baseline_MultiplyTop4;
2175
2176#if CRYPTOPP_INTEGER_SSE2
2177 if (HasSSE2())
2178 {
2179 if (IsP4())
2180 {
2181 s_pAdd = &SSE2_Add;
2182 s_pSub = &SSE2_Sub;
2183 }
2184
2185 s_recursionLimit = 32;
2186
2187 s_pMul[1] = &SSE2_Multiply4;
2188 s_pMul[2] = &SSE2_Multiply8;
2189 s_pMul[4] = &SSE2_Multiply16;
2190 s_pMul[8] = &SSE2_Multiply32;
2191
2192 s_pBot[1] = &SSE2_MultiplyBottom4;
2193 s_pBot[2] = &SSE2_MultiplyBottom8;
2194 s_pBot[4] = &SSE2_MultiplyBottom16;
2195 s_pBot[8] = &SSE2_MultiplyBottom32;
2196
2197 s_pSqu[1] = &SSE2_Square4;
2198 s_pSqu[2] = &SSE2_Square8;
2199 s_pSqu[4] = &SSE2_Square16;
2200 s_pSqu[8] = &SSE2_Square32;
2201
2202 s_pTop[2] = &SSE2_MultiplyTop8;
2203 s_pTop[4] = &SSE2_MultiplyTop16;
2204 s_pTop[8] = &SSE2_MultiplyTop32;
2205 }
2206 else
2207#endif // CRYPTOPP_INTEGER_SSE2
2208 {
2209 s_pMul[1] = &Baseline_Multiply4;
2210 s_pMul[2] = &Baseline_Multiply8;
2211
2212 s_pBot[1] = &Baseline_MultiplyBottom4;
2213 s_pBot[2] = &Baseline_MultiplyBottom8;
2214
2215 s_pSqu[1] = &Baseline_Square4;
2216 s_pSqu[2] = &Baseline_Square8;
2217
2218 s_pTop[2] = &Baseline_MultiplyTop8;
2219
2220#if !CRYPTOPP_INTEGER_SSE2
2221 s_pMul[4] = &Baseline_Multiply16;
2222 s_pBot[4] = &Baseline_MultiplyBottom16;
2223 s_pSqu[4] = &Baseline_Square16;
2224 s_pTop[4] = &Baseline_MultiplyTop16;
2225#endif // !CRYPTOPP_INTEGER_SSE2
2226 }
2227}
2228
2229inline int Add(word *C, const word *A, const word *B, size_t N)
2230{
2231#if CRYPTOPP_INTEGER_SSE2
2232 return s_pAdd(N, C, A, B);
2233#else
2234 return Baseline_Add(N, C, A, B);
2235#endif // CRYPTOPP_INTEGER_SSE2
2236}
2237
2238inline int Subtract(word *C, const word *A, const word *B, size_t N)
2239{
2240#if CRYPTOPP_INTEGER_SSE2
2241 return s_pSub(N, C, A, B);
2242#else
2243 return Baseline_Sub(N, C, A, B);
2244#endif // CRYPTOPP_INTEGER_SSE2
2245}
2246
2247// ********************************************************
2248
2249
2250#define A0 A
2251#define A1 (A+N2)
2252#define B0 B
2253#define B1 (B+N2)
2254
2255#define T0 T
2256#define T1 (T+N2)
2257#define T2 (T+N)
2258#define T3 (T+N+N2)
2259
2260#define R0 R
2261#define R1 (R+N2)
2262#define R2 (R+N)
2263#define R3 (R+N+N2)
2264
2265// R[2*N] - result = A*B
2266// T[2*N] - temporary work space
2267// A[N] --- multiplier
2268// B[N] --- multiplicant
2269
2270void RecursiveMultiply(word *R, word *T, const word *A, const word *B, size_t N)
2271{
2272 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2273
2274 if (N <= s_recursionLimit)
2275 s_pMul[N/4](R, A, B);
2276 else
2277 {
2278 const size_t N2 = N/2;
2279
2280 size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2281 Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2282
2283 size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2284 Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2285
2286 RecursiveMultiply(R2, T2, A1, B1, N2);
2287 RecursiveMultiply(T0, T2, R0, R1, N2);
2288 RecursiveMultiply(R0, T2, A0, B0, N2);
2289
2290 // now T[01] holds (A1-A0)*(B0-B1), R[01] holds A0*B0, R[23] holds A1*B1
2291
2292 int c2 = Add(R2, R2, R1, N2);
2293 int c3 = c2;
2294 c2 += Add(R1, R2, R0, N2);
2295 c3 += Add(R2, R2, R3, N2);
2296
2297 if (AN2 == BN2)
2298 c3 -= Subtract(R1, R1, T0, N);
2299 else
2300 c3 += Add(R1, R1, T0, N);
2301
2302 c3 += Increment(R2, N2, c2);
2303 CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2304 Increment(R3, N2, c3);
2305 }
2306}
2307
2308// R[2*N] - result = A*A
2309// T[2*N] - temporary work space
2310// A[N] --- number to be squared
2311
2312void RecursiveSquare(word *R, word *T, const word *A, size_t N)
2313{
2314 CRYPTOPP_ASSERT(N && N%2==0);
2315
2316 if (N <= s_recursionLimit)
2317 s_pSqu[N/4](R, A);
2318 else
2319 {
2320 const size_t N2 = N/2;
2321
2322 RecursiveSquare(R0, T2, A0, N2);
2323 RecursiveSquare(R2, T2, A1, N2);
2324 RecursiveMultiply(T0, T2, A0, A1, N2);
2325
2326 int carry = Add(R1, R1, T0, N);
2327 carry += Add(R1, R1, T0, N);
2328 Increment(R3, N2, carry);
2329 }
2330}
2331
2332// R[N] - bottom half of A*B
2333// T[3*N/2] - temporary work space
2334// A[N] - multiplier
2335// B[N] - multiplicant
2336
2337void RecursiveMultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2338{
2339 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2340
2341 if (N <= s_recursionLimit)
2342 s_pBot[N/4](R, A, B);
2343 else
2344 {
2345 const size_t N2 = N/2;
2346
2347 RecursiveMultiply(R, T, A0, B0, N2);
2348 RecursiveMultiplyBottom(T0, T1, A1, B0, N2);
2349 Add(R1, R1, T0, N2);
2350 RecursiveMultiplyBottom(T0, T1, A0, B1, N2);
2351 Add(R1, R1, T0, N2);
2352 }
2353}
2354
2355// R[N] --- upper half of A*B
2356// T[2*N] - temporary work space
2357// L[N] --- lower half of A*B
2358// A[N] --- multiplier
2359// B[N] --- multiplicant
2360
2361void MultiplyTop(word *R, word *T, const word *L, const word *A, const word *B, size_t N)
2362{
2363 CRYPTOPP_ASSERT(N>=2 && N%2==0);
2364
2365 if (N <= s_recursionLimit)
2366 s_pTop[N/4](R, A, B, L[N-1]);
2367 else
2368 {
2369 const size_t N2 = N/2;
2370
2371 size_t AN2 = Compare(A0, A1, N2) > 0 ? 0 : N2;
2372 Subtract(R0, A + AN2, A + (N2 ^ AN2), N2);
2373
2374 size_t BN2 = Compare(B0, B1, N2) > 0 ? 0 : N2;
2375 Subtract(R1, B + BN2, B + (N2 ^ BN2), N2);
2376
2377 RecursiveMultiply(T0, T2, R0, R1, N2);
2378 RecursiveMultiply(R0, T2, A1, B1, N2);
2379
2380 // now T[01] holds (A1-A0)*(B0-B1) = A1*B0+A0*B1-A1*B1-A0*B0, R[01] holds A1*B1
2381
2382 int t, c3;
2383 int c2 = Subtract(T2, L+N2, L, N2);
2384
2385 if (AN2 == BN2)
2386 {
2387 c2 -= Add(T2, T2, T0, N2);
2388 t = (Compare(T2, R0, N2) == -1);
2389 c3 = t - Subtract(T2, T2, T1, N2);
2390 }
2391 else
2392 {
2393 c2 += Subtract(T2, T2, T0, N2);
2394 t = (Compare(T2, R0, N2) == -1);
2395 c3 = t + Add(T2, T2, T1, N2);
2396 }
2397
2398 c2 += t;
2399 if (c2 >= 0)
2400 c3 += Increment(T2, N2, c2);
2401 else
2402 c3 -= Decrement(T2, N2, -c2);
2403 c3 += Add(R0, T2, R1, N2);
2404
2405 CRYPTOPP_ASSERT (c3 >= 0 && c3 <= 2);
2406 Increment(R1, N2, c3);
2407 }
2408}
2409
2410inline void Multiply(word *R, word *T, const word *A, const word *B, size_t N)
2411{
2412 RecursiveMultiply(R, T, A, B, N);
2413}
2414
2415inline void Square(word *R, word *T, const word *A, size_t N)
2416{
2417 RecursiveSquare(R, T, A, N);
2418}
2419
2420inline void MultiplyBottom(word *R, word *T, const word *A, const word *B, size_t N)
2421{
2422 RecursiveMultiplyBottom(R, T, A, B, N);
2423}
2424
2425// R[NA+NB] - result = A*B
2426// T[NA+NB] - temporary work space
2427// A[NA] ---- multiplier
2428// B[NB] ---- multiplicant
2429
2430void AsymmetricMultiply(word *R, word *T, const word *A, size_t NA, const word *B, size_t NB)
2431{
2432 if (NA == NB)
2433 {
2434 // Profiling guided the flow below.
2435 if (A != B)
2436 Multiply(R, T, A, B, NA);
2437 else
2438 Square(R, T, A, NA);
2439
2440 return;
2441 }
2442
2443 if (NA > NB)
2444 {
2445 std::swap(A, B);
2446 std::swap(NA, NB);
2447 }
2448
2449 CRYPTOPP_ASSERT(NB % NA == 0);
2450
2451 if (NA==2 && !A[1])
2452 {
2453 // Profiling guided the flow below.
2454 switch (A[0])
2455 {
2456 default:
2457 R[NB] = LinearMultiply(R, B, A[0], NB);
2458 R[NB+1] = 0;
2459 return;
2460 case 0:
2461 SetWords(R, 0, NB+2);
2462 return;
2463 case 1:
2464 CopyWords(R, B, NB);
2465 R[NB] = R[NB+1] = 0;
2466 return;
2467 }
2468 }
2469
2470 size_t i;
2471 if ((NB/NA)%2 == 0)
2472 {
2473 Multiply(R, T, A, B, NA);
2474 CopyWords(T+2*NA, R+NA, NA);
2475
2476 for (i=2*NA; i<NB; i+=2*NA)
2477 Multiply(T+NA+i, T, A, B+i, NA);
2478 for (i=NA; i<NB; i+=2*NA)
2479 Multiply(R+i, T, A, B+i, NA);
2480 }
2481 else
2482 {
2483 for (i=0; i<NB; i+=2*NA)
2484 Multiply(R+i, T, A, B+i, NA);
2485 for (i=NA; i<NB; i+=2*NA)
2486 Multiply(T+NA+i, T, A, B+i, NA);
2487 }
2488
2489 if (Add(R+NA, R+NA, T+2*NA, NB-NA))
2490 Increment(R+NB, NA);
2491}
2492
2493// R[N] ----- result = A inverse mod 2**(WORD_BITS*N)
2494// T[3*N/2] - temporary work space
2495// A[N] ----- an odd number as input
2496
2497void RecursiveInverseModPower2(word *R, word *T, const word *A, size_t N)
2498{
2499 // Profiling guided the flow below.
2500 if (N!=2)
2501 {
2502 const size_t N2 = N/2;
2503 RecursiveInverseModPower2(R0, T0, A0, N2);
2504 T0[0] = 1;
2505 SetWords(T0+1, 0, N2-1);
2506 MultiplyTop(R1, T1, T0, R0, A0, N2);
2507 MultiplyBottom(T0, T1, R0, A1, N2);
2508 Add(T0, R1, T0, N2);
2509 TwosComplement(T0, N2);
2510 MultiplyBottom(R1, T1, R0, T0, N2);
2511 }
2512 else
2513 {
2514 T[0] = AtomicInverseModPower2(A[0]);
2515 T[1] = 0;
2516 s_pBot[0](T+2, T, A);
2517 TwosComplement(T+2, 2);
2518 Increment(T+2, 2, 2);
2519 s_pBot[0](R, T, T+2);
2520 }
2521}
2522
2523// R[N] --- result = X/(2**(WORD_BITS*N)) mod M
2524// T[3*N] - temporary work space
2525// X[2*N] - number to be reduced
2526// M[N] --- modulus
2527// U[N] --- multiplicative inverse of M mod 2**(WORD_BITS*N)
2528
2529void MontgomeryReduce(word *R, word *T, word *X, const word *M, const word *U, size_t N)
2530{
2531#if 1
2532 MultiplyBottom(R, T, X, U, N);
2533 MultiplyTop(T, T+N, X, R, M, N);
2534 word borrow = Subtract(T, X+N, T, N);
2535 // defend against timing attack by doing this Add even when not needed
2536 word carry = Add(T+N, T, M, N);
2537 CRYPTOPP_ASSERT(carry | !borrow);
2538 CRYPTOPP_UNUSED(carry), CRYPTOPP_UNUSED(borrow);
2539 CopyWords(R, T + ((0-borrow) & N), N);
2540#elif 0
2541 const word u = 0-U[0];
2542 Declare2Words(p)
2543 for (size_t i=0; i<N; i++)
2544 {
2545 const word t = u * X[i];
2546 word c = 0;
2547 for (size_t j=0; j<N; j+=2)
2548 {
2549 MultiplyWords(p, t, M[j]);
2550 Acc2WordsBy1(p, X[i+j]);
2551 Acc2WordsBy1(p, c);
2552 X[i+j] = LowWord(p);
2553 c = HighWord(p);
2554 MultiplyWords(p, t, M[j+1]);
2555 Acc2WordsBy1(p, X[i+j+1]);
2556 Acc2WordsBy1(p, c);
2557 X[i+j+1] = LowWord(p);
2558 c = HighWord(p);
2559 }
2560
2561 if (Increment(X+N+i, N-i, c))
2562 while (!Subtract(X+N, X+N, M, N)) {}
2563 }
2564
2565 memcpy(R, X+N, N*WORD_SIZE);
2566#else
2567 __m64 u = _mm_cvtsi32_si64(0-U[0]), p;
2568 for (size_t i=0; i<N; i++)
2569 {
2570 __m64 t = _mm_cvtsi32_si64(X[i]);
2571 t = _mm_mul_su32(t, u);
2572 __m64 c = _mm_setzero_si64();
2573 for (size_t j=0; j<N; j+=2)
2574 {
2575 p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j]));
2576 p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j]));
2577 c = _mm_add_si64(c, p);
2578 X[i+j] = _mm_cvtsi64_si32(c);
2579 c = _mm_srli_si64(c, 32);
2580 p = _mm_mul_su32(t, _mm_cvtsi32_si64(M[j+1]));
2581 p = _mm_add_si64(p, _mm_cvtsi32_si64(X[i+j+1]));
2582 c = _mm_add_si64(c, p);
2583 X[i+j+1] = _mm_cvtsi64_si32(c);
2584 c = _mm_srli_si64(c, 32);
2585 }
2586
2587 if (Increment(X+N+i, N-i, _mm_cvtsi64_si32(c)))
2588 while (!Subtract(X+N, X+N, M, N)) {}
2589 }
2590
2591 memcpy(R, X+N, N*WORD_SIZE);
2592 _mm_empty();
2593#endif
2594}
2595
2596// R[N] --- result = X/(2**(WORD_BITS*N/2)) mod M
2597// T[2*N] - temporary work space
2598// X[2*N] - number to be reduced
2599// M[N] --- modulus
2600// U[N/2] - multiplicative inverse of M mod 2**(WORD_BITS*N/2)
2601// V[N] --- 2**(WORD_BITS*3*N/2) mod M
2602
2603void HalfMontgomeryReduce(word *R, word *T, const word *X, const word *M, const word *U, const word *V, size_t N)
2604{
2605 CRYPTOPP_ASSERT(N%2==0 && N>=4);
2606
2607#define M0 M
2608#define M1 (M+N2)
2609#define V0 V
2610#define V1 (V+N2)
2611
2612#define X0 X
2613#define X1 (X+N2)
2614#define X2 (X+N)
2615#define X3 (X+N+N2)
2616
2617 const size_t N2 = N/2;
2618 Multiply(T0, T2, V0, X3, N2);
2619 int c2 = Add(T0, T0, X0, N);
2620 MultiplyBottom(T3, T2, T0, U, N2);
2621 MultiplyTop(T2, R, T0, T3, M0, N2);
2622 c2 -= Subtract(T2, T1, T2, N2);
2623 Multiply(T0, R, T3, M1, N2);
2624 c2 -= Subtract(T0, T2, T0, N2);
2625 int c3 = -(int)Subtract(T1, X2, T1, N2);
2626 Multiply(R0, T2, V1, X3, N2);
2627 c3 += Add(R, R, T, N);
2628
2629 if (c2>0)
2630 c3 += Increment(R1, N2);
2631 else if (c2<0)
2632 c3 -= Decrement(R1, N2, -c2);
2633
2634 CRYPTOPP_ASSERT(c3>=-1 && c3<=1);
2635 if (c3>0)
2636 Subtract(R, R, M, N);
2637 else if (c3<0)
2638 Add(R, R, M, N);
2639
2640#undef M0
2641#undef M1
2642#undef V0
2643#undef V1
2644
2645#undef X0
2646#undef X1
2647#undef X2
2648#undef X3
2649}
2650
2651#undef A0
2652#undef A1
2653#undef B0
2654#undef B1
2655
2656#undef T0
2657#undef T1
2658#undef T2
2659#undef T3
2660
2661#undef R0
2662#undef R1
2663#undef R2
2664#undef R3
2665
2666/*
2667// do a 3 word by 2 word divide, returns quotient and leaves remainder in A
2668static word SubatomicDivide(word *A, word B0, word B1)
2669{
2670 // Assert {A[2],A[1]} < {B1,B0}, so quotient can fit in a word
2671 CRYPTOPP_ASSERT(A[2] < B1 || (A[2]==B1 && A[1] < B0));
2672
2673 // estimate the quotient: do a 2 word by 1 word divide
2674 word Q;
2675 if (B1+1 == 0)
2676 Q = A[2];
2677 else
2678 Q = DWord(A[1], A[2]).DividedBy(B1+1);
2679
2680 // now subtract Q*B from A
2681 DWord p = DWord::Multiply(B0, Q);
2682 DWord u = (DWord) A[0] - p.GetLowHalf();
2683 A[0] = u.GetLowHalf();
2684 u = (DWord) A[1] - p.GetHighHalf() - u.GetHighHalfAsBorrow() - DWord::Multiply(B1, Q);
2685 A[1] = u.GetLowHalf();
2686 A[2] += u.GetHighHalf();
2687
2688 // Q <= actual quotient, so fix it
2689 while (A[2] || A[1] > B1 || (A[1]==B1 && A[0]>=B0))
2690 {
2691 u = (DWord) A[0] - B0;
2692 A[0] = u.GetLowHalf();
2693 u = (DWord) A[1] - B1 - u.GetHighHalfAsBorrow();
2694 A[1] = u.GetLowHalf();
2695 A[2] += u.GetHighHalf();
2696 Q++;
2697 CRYPTOPP_ASSERT(Q); // shouldn't overflow
2698 }
2699
2700 return Q;
2701}
2702
2703// do a 4 word by 2 word divide, returns 2 word quotient in Q0 and Q1
2704static inline void AtomicDivide(word *Q, const word *A, const word *B)
2705{
2706 if (!B[0] && !B[1]) // if divisor is 0, we assume divisor==2**(2*WORD_BITS)
2707 {
2708 Q[0] = A[2];
2709 Q[1] = A[3];
2710 }
2711 else
2712 {
2713 word T[4];
2714 T[0] = A[0]; T[1] = A[1]; T[2] = A[2]; T[3] = A[3];
2715 Q[1] = SubatomicDivide(T+1, B[0], B[1]);
2716 Q[0] = SubatomicDivide(T, B[0], B[1]);
2717
2718#if defined(CRYPTOPP_DEBUG)
2719 // multiply quotient and divisor and add remainder, make sure it equals dividend
2720 CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2721 word P[4];
2722 LowLevel::Multiply2(P, Q, B);
2723 Add(P, P, T, 4);
2724 CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2725#endif
2726 }
2727}
2728*/
2729
2730static inline void AtomicDivide(word *Q, const word *A, const word *B)
2731{
2732 word T[4];
2733 DWord q = DivideFourWordsByTwo<word, DWord>(T, DWord(A[0], A[1]), DWord(A[2], A[3]), DWord(B[0], B[1]));
2734 Q[0] = q.GetLowHalf();
2735 Q[1] = q.GetHighHalf();
2736
2737#if defined(CRYPTOPP_DEBUG)
2738 if (B[0] || B[1])
2739 {
2740 // multiply quotient and divisor and add remainder, make sure it equals dividend
2741 CRYPTOPP_ASSERT(!T[2] && !T[3] && (T[1] < B[1] || (T[1]==B[1] && T[0]<B[0])));
2742 word P[4];
2743 s_pMul[0](P, Q, B);
2744 Add(P, P, T, 4);
2745 CRYPTOPP_ASSERT(memcmp(P, A, 4*WORD_SIZE)==0);
2746 }
2747#endif
2748}
2749
2750// for use by Divide(), corrects the underestimated quotient {Q1,Q0}
2751static void CorrectQuotientEstimate(word *R, word *T, word *Q, const word *B, size_t N)
2752{
2753 CRYPTOPP_ASSERT(N && N%2==0);
2754
2755 AsymmetricMultiply(T, T+N+2, Q, 2, B, N);
2756
2757 word borrow = Subtract(R, R, T, N+2);
2758 CRYPTOPP_ASSERT(!borrow && !R[N+1]);
2759 CRYPTOPP_UNUSED(borrow);
2760
2761 while (R[N] || Compare(R, B, N) >= 0)
2762 {
2763 R[N] -= Subtract(R, R, B, N);
2764 Q[1] += (++Q[0]==0);
2765 CRYPTOPP_ASSERT(Q[0] || Q[1]); // no overflow
2766 }
2767}
2768
2769// R[NB] -------- remainder = A%B
2770// Q[NA-NB+2] --- quotient = A/B
2771// T[NA+3*(NB+2)] - temp work space
2772// A[NA] -------- dividend
2773// B[NB] -------- divisor
2774
2775void Divide(word *R, word *Q, word *T, const word *A, size_t NA, const word *B, size_t NB)
2776{
2777 CRYPTOPP_ASSERT(NA && NB && NA%2==0 && NB%2==0);
2778 CRYPTOPP_ASSERT(B[NB-1] || B[NB-2]);
2779 CRYPTOPP_ASSERT(NB <= NA);
2780
2781 // set up temporary work space
2782 word *const TA=T;
2783 word *const TB=T+NA+2;
2784 word *const TP=T+NA+2+NB;
2785
2786 // copy B into TB and normalize it so that TB has highest bit set to 1
2787 unsigned shiftWords = (B[NB-1]==0);
2788 TB[0] = TB[NB-1] = 0;
2789 CopyWords(TB+shiftWords, B, NB-shiftWords);
2790 unsigned shiftBits = WORD_BITS - BitPrecision(TB[NB-1]);
2791 CRYPTOPP_ASSERT(shiftBits < WORD_BITS);
2792 ShiftWordsLeftByBits(TB, NB, shiftBits);
2793
2794 // copy A into TA and normalize it
2795 TA[0] = TA[NA] = TA[NA+1] = 0;
2796 CopyWords(TA+shiftWords, A, NA);
2797 ShiftWordsLeftByBits(TA, NA+2, shiftBits);
2798
2799 if (TA[NA+1]==0 && TA[NA] <= 1)
2800 {
2801 Q[NA-NB+1] = Q[NA-NB] = 0;
2802 while (TA[NA] || Compare(TA+NA-NB, TB, NB) >= 0)
2803 {
2804 TA[NA] -= Subtract(TA+NA-NB, TA+NA-NB, TB, NB);
2805 ++Q[NA-NB];
2806 }
2807 }
2808 else
2809 {
2810 NA+=2;
2811 CRYPTOPP_ASSERT(Compare(TA+NA-NB, TB, NB) < 0);
2812 }
2813
2814 word BT[2];
2815 BT[0] = TB[NB-2] + 1;
2816 BT[1] = TB[NB-1] + (BT[0]==0);
2817
2818 // start reducing TA mod TB, 2 words at a time
2819 for (size_t i=NA-2; i>=NB; i-=2)
2820 {
2821 AtomicDivide(Q+i-NB, TA+i-2, BT);
2822 CorrectQuotientEstimate(TA+i-NB, TP, Q+i-NB, TB, NB);
2823 }
2824
2825 // copy TA into R, and denormalize it
2826 CopyWords(R, TA+shiftWords, NB);
2827 ShiftWordsRightByBits(R, NB, shiftBits);
2828}
2829
2830static inline size_t EvenWordCount(const word *X, size_t N)
2831{
2832 while (N && X[N-2]==0 && X[N-1]==0)
2833 N-=2;
2834 return N;
2835}
2836
2837// return k
2838// R[N] --- result = A^(-1) * 2^k mod M
2839// T[4*N] - temporary work space
2840// A[NA] -- number to take inverse of
2841// M[N] --- modulus
2842
2843unsigned int AlmostInverse(word *R, word *T, const word *A, size_t NA, const word *M, size_t N)
2844{
2845 CRYPTOPP_ASSERT(NA<=N && N && N%2==0);
2846
2847 word *b = T;
2848 word *c = T+N;
2849 word *f = T+2*N;
2850 word *g = T+3*N;
2851 size_t bcLen=2, fgLen=EvenWordCount(M, N);
2852 unsigned int k=0;
2853 bool s=false;
2854
2855 SetWords(T, 0, 3*N);
2856 b[0]=1;
2857 CopyWords(f, A, NA);
2858 CopyWords(g, M, N);
2859
2860 while (1)
2861 {
2862 word t=f[0];
2863 while (!t)
2864 {
2865 if (EvenWordCount(f, fgLen)==0)
2866 {
2867 SetWords(R, 0, N);
2868 return 0;
2869 }
2870
2871 ShiftWordsRightByWords(f, fgLen, 1);
2872 bcLen += 2 * (c[bcLen-1] != 0);
2873 CRYPTOPP_ASSERT(bcLen <= N);
2874 ShiftWordsLeftByWords(c, bcLen, 1);
2875 k+=WORD_BITS;
2876 t=f[0];
2877 }
2878
2879 // t must be non-0; otherwise undefined.
2880 unsigned int i = TrailingZeros(t);
2881 t >>= i;
2882 k += i;
2883
2884 if (t==1 && f[1]==0 && EvenWordCount(f+2, fgLen-2)==0)
2885 {
2886 if (s)
2887 Subtract(R, M, b, N);
2888 else
2889 CopyWords(R, b, N);
2890 return k;
2891 }
2892
2893 ShiftWordsRightByBits(f, fgLen, i);
2894 t = ShiftWordsLeftByBits(c, bcLen, i);
2895 c[bcLen] += t;
2896 bcLen += 2 * (t!=0);
2897 CRYPTOPP_ASSERT(bcLen <= N);
2898
2899 bool swap = Compare(f, g, fgLen)==-1;
2902 s ^= swap;
2903
2904 fgLen -= 2 * !(f[fgLen-2] | f[fgLen-1]);
2905
2906 Subtract(f, f, g, fgLen);
2907 t = Add(b, b, c, bcLen);
2908 b[bcLen] += t;
2909 bcLen += 2*t;
2910 CRYPTOPP_ASSERT(bcLen <= N);
2911 }
2912}
2913
2914// R[N] - result = A/(2^k) mod M
2915// A[N] - input
2916// M[N] - modulus
2917
2918void DivideByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2919{
2920 CopyWords(R, A, N);
2921
2922 while (k--)
2923 {
2924 if (R[0]%2==0)
2925 ShiftWordsRightByBits(R, N, 1);
2926 else
2927 {
2928 word carry = Add(R, R, M, N);
2929 ShiftWordsRightByBits(R, N, 1);
2930 R[N-1] += carry<<(WORD_BITS-1);
2931 }
2932 }
2933}
2934
2935// R[N] - result = A*(2^k) mod M
2936// A[N] - input
2937// M[N] - modulus
2938
2939void MultiplyByPower2Mod(word *R, const word *A, size_t k, const word *M, size_t N)
2940{
2941 CopyWords(R, A, N);
2942
2943 while (k--)
2944 if (ShiftWordsLeftByBits(R, N, 1) || Compare(R, M, N)>=0)
2945 Subtract(R, R, M, N);
2946}
2947
2948// ******************************************************************
2949
2950static const unsigned int RoundupSizeTable[] = {2, 2, 2, 4, 4, 8, 8, 8, 8};
2951
2952static inline size_t RoundupSize(size_t n)
2953{
2954 if (n<=8)
2955 return RoundupSizeTable[n];
2956 else if (n<=16)
2957 return 16;
2958 else if (n<=32)
2959 return 32;
2960 else if (n<=64)
2961 return 64;
2962 else
2963 return size_t(1) << BitPrecision(n-1);
2964}
2965
2967 : reg(2), sign(POSITIVE)
2968{
2969 reg[0] = reg[1] = 0;
2970}
2971
2972Integer::Integer(const Integer& t)
2973 : reg(RoundupSize(t.WordCount())), sign(t.sign)
2974{
2975 CopyWords(reg, t.reg, reg.size());
2976}
2977
2978Integer::Integer(Sign s, lword value)
2979 : reg(2), sign(s)
2980{
2981 reg[0] = word(value);
2982 reg[1] = word(SafeRightShift<WORD_BITS>(value));
2983}
2984
2985Integer::Integer(signed long value)
2986 : reg(2)
2987{
2988 if (value >= 0)
2989 sign = POSITIVE;
2990 else
2991 {
2992 sign = NEGATIVE;
2993 value = -value;
2994 }
2995 reg[0] = word(value);
2996 reg[1] = word(SafeRightShift<WORD_BITS>((unsigned long)value));
2997}
2998
2999Integer::Integer(Sign s, word high, word low)
3000 : reg(2), sign(s)
3001{
3002 reg[0] = low;
3003 reg[1] = high;
3004}
3005
3007{
3008 if (ByteCount() > sizeof(long))
3009 return false;
3010
3011 unsigned long value = (unsigned long)reg[0];
3012 value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3013
3014 if (sign==POSITIVE)
3015 return (signed long)value >= 0;
3016 else
3017 return -(signed long)value < 0;
3018}
3019
3020signed long Integer::ConvertToLong() const
3021{
3023
3024 unsigned long value = (unsigned long)reg[0];
3025 value += SafeLeftShift<WORD_BITS, unsigned long>((unsigned long)reg[1]);
3026 return sign==POSITIVE ? value : -(signed long)value;
3027}
3028
3029Integer::Integer(BufferedTransformation &encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3030{
3032
3033 if (o != LITTLE_ENDIAN_ORDER)
3034 {
3035 Decode(encodedInteger, byteCount, s);
3036 }
3037 else
3038 {
3039 SecByteBlock block(byteCount);
3040 encodedInteger.Get(block, block.size());
3041 std::reverse(block.begin(), block.begin()+block.size());
3042
3043 Decode(block.begin(), block.size(), s);
3044 }
3045}
3046
3047Integer::Integer(const byte *encodedInteger, size_t byteCount, Signedness s, ByteOrder o)
3048{
3049 CRYPTOPP_ASSERT(encodedInteger && byteCount); // NULL buffer
3051
3052 if (o != LITTLE_ENDIAN_ORDER)
3053 {
3054 Decode(encodedInteger, byteCount, s);
3055 }
3056 else
3057 {
3058 SecByteBlock block(byteCount);
3059#if (_MSC_VER >= 1500)
3060 std::reverse_copy(encodedInteger, encodedInteger+byteCount,
3061 stdext::make_checked_array_iterator(block.begin(), block.size()));
3062#else
3063 std::reverse_copy(encodedInteger, encodedInteger+byteCount, block.begin());
3064#endif
3065 Decode(block.begin(), block.size(), s);
3066 return;
3067 }
3068}
3069
3071{
3072 // Make explicit call to avoid virtual-dispatch findings in ctor
3074}
3075
3076Integer::Integer(RandomNumberGenerator &rng, size_t bitcount)
3077{
3078 Randomize(rng, bitcount);
3079}
3080
3081Integer::Integer(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3082{
3083 if (!Randomize(rng, min, max, rnType, equiv, mod))
3085}
3086
3087Integer Integer::Power2(size_t e)
3088{
3089 Integer r((word)0, BitsToWords(e+1));
3090 r.SetBit(e);
3091 return r;
3092}
3093
3094bool Integer::operator!() const
3095{
3096 return IsNegative() ? false : (reg[0]==0 && WordCount()==0);
3097}
3098
3100{
3101 if (this != &t)
3102 {
3103 if (reg.size() != t.reg.size() || t.reg[t.reg.size()/2] == 0)
3104 reg.New(RoundupSize(t.WordCount()));
3105 CopyWords(reg, t.reg, reg.size());
3106 sign = t.sign;
3107 }
3108 return *this;
3109}
3110
3111bool Integer::GetBit(size_t n) const
3112{
3113 // Profiling guided the flow below.
3114 if (n/WORD_BITS < reg.size())
3115 return bool((reg[n/WORD_BITS] >> (n % WORD_BITS)) & 1);
3116 else
3117 return 0;
3118}
3119
3120void Integer::SetBit(size_t n, bool value)
3121{
3122 if (value)
3123 {
3124 reg.CleanGrow(RoundupSize(BitsToWords(n+1)));
3125 reg[n/WORD_BITS] |= (word(1) << (n%WORD_BITS));
3126 }
3127 else
3128 {
3129 if (n/WORD_BITS < reg.size())
3130 reg[n/WORD_BITS] &= ~(word(1) << (n%WORD_BITS));
3131 }
3132}
3133
3134byte Integer::GetByte(size_t n) const
3135{
3136 // Profiling guided the flow below.
3137 if (n/WORD_SIZE < reg.size())
3138 return byte(reg[n/WORD_SIZE] >> ((n%WORD_SIZE)*8));
3139 else
3140 return 0;
3141}
3142
3143void Integer::SetByte(size_t n, byte value)
3144{
3145 reg.CleanGrow(RoundupSize(BytesToWords(n+1)));
3146 reg[n/WORD_SIZE] &= ~(word(0xff) << 8*(n%WORD_SIZE));
3147 reg[n/WORD_SIZE] |= (word(value) << 8*(n%WORD_SIZE));
3148}
3149
3150lword Integer::GetBits(size_t i, size_t n) const
3151{
3152 lword v = 0;
3153 CRYPTOPP_ASSERT(n <= sizeof(v)*8);
3154 for (unsigned int j=0; j<n; j++)
3155 v |= lword(GetBit(i+j)) << j;
3156 return v;
3157}
3158
3160{
3161 Integer result(*this);
3162 result.Negate();
3163 return result;
3164}
3165
3167{
3168 Integer result(*this);
3169 result.sign = POSITIVE;
3170 return result;
3171}
3172
3173void Integer::swap(Integer &a)
3174{
3175 reg.swap(a.reg);
3176 std::swap(sign, a.sign);
3177}
3178
3179Integer::Integer(word value, size_t length)
3180 : reg(RoundupSize(length)), sign(POSITIVE)
3181{
3182 reg[0] = value;
3183 SetWords(reg+1, 0, reg.size()-1);
3184}
3185
3186template <class T>
3187static Integer StringToInteger(const T *str, ByteOrder order)
3188{
3190
3191 int radix, sign = 1;
3192 // GCC workaround
3193 // std::char_traits<wchar_t>::length() not defined in GCC 3.2 and STLport 4.5.3
3194 unsigned int length;
3195 for (length = 0; str[length] != 0; length++) {}
3196
3197 Integer v;
3198
3199 if (length == 0)
3200 return Integer::Zero();
3201
3202 // 'str' is of length 1 or more
3203 switch (str[length-1])
3204 {
3205 case 'h':
3206 case 'H':
3207 radix=16;
3208 break;
3209 case 'o':
3210 case 'O':
3211 radix=8;
3212 break;
3213 case 'b':
3214 case 'B':
3215 radix=2;
3216 break;
3217 default:
3218 radix=10;
3219 }
3220
3221 // 'str' is of length 1 or more
3222 if (str[0] == '-')
3223 {
3224 sign = -1;
3225 str += 1, length -= 1;
3226 }
3227
3228 // Recognize common prefixes for hexadecimal, octal and decimal.
3229 // Microsoft's MASM also recognizes 0t for octal, but not here.
3230 if (length > 2 && str[0] == '0')
3231 {
3232 if (str[1] == 'x' || str[1] == 'X')
3233 {
3234 radix = 16;
3235 str += 2, length -= 2;
3236 }
3237 else if (str[1] == 'n' || str[1] == 'N')
3238 {
3239 radix = 10;
3240 str += 2, length -= 2;
3241 }
3242 else if (str[1] == 'o' || str[1] == 'O')
3243 {
3244 radix = 8;
3245 str += 2, length -= 2;
3246 }
3247 }
3248
3249 if (order == BIG_ENDIAN_ORDER)
3250 {
3251 for (unsigned int i=0; i<length; i++)
3252 {
3253 int digit, ch = static_cast<int>(str[i]);
3254
3255 // Profiling guided the flow below.
3256 if (ch >= '0' && ch <= '9')
3257 digit = ch - '0';
3258 else if (ch >= 'a' && ch <= 'f')
3259 digit = ch - 'a' + 10;
3260 else if (ch >= 'A' && ch <= 'F')
3261 digit = ch - 'A' + 10;
3262 else
3263 digit = radix;
3264
3265 if (digit < radix)
3266 {
3267 v *= radix;
3268 v += digit;
3269 }
3270 }
3271 }
3272 else if (radix == 16 && order == LITTLE_ENDIAN_ORDER)
3273 {
3274 // Nibble high, low and count
3275 unsigned int nh = 0, nl = 0, nc = 0;
3276 Integer position(Integer::One());
3277
3278 for (unsigned int i=0; i<length; i++)
3279 {
3280 int digit, ch = static_cast<int>(str[i]);
3281
3282 if (ch >= '0' && ch <= '9')
3283 digit = ch - '0';
3284 else if (ch >= 'a' && ch <= 'f')
3285 digit = ch - 'a' + 10;
3286 else if (ch >= 'A' && ch <= 'F')
3287 digit = ch - 'A' + 10;
3288 else
3289 digit = radix;
3290
3291 if (digit < radix)
3292 {
3293 if (nc++ == 0)
3294 nh = digit;
3295 else
3296 nl = digit;
3297
3298 if (nc == 2)
3299 {
3300 v += position * (nh << 4 | nl);
3301 nc = 0, position <<= 8;
3302 }
3303 }
3304 }
3305
3306 if (nc == 1)
3307 v += nh * position;
3308 }
3309 else // LITTLE_ENDIAN_ORDER && radix != 16
3310 {
3311 for (int i=static_cast<int>(length)-1; i>=0; i--)
3312 {
3313 int digit, ch = static_cast<int>(str[i]);
3314
3315 if (ch >= '0' && ch <= '9')
3316 digit = ch - '0';
3317 else if (ch >= 'a' && ch <= 'f')
3318 digit = ch - 'a' + 10;
3319 else if (ch >= 'A' && ch <= 'F')
3320 digit = ch - 'A' + 10;
3321 else
3322 digit = radix;
3323
3324 if (digit < radix)
3325 {
3326 v *= radix;
3327 v += digit;
3328 }
3329 }
3330 }
3331
3332 if (sign == -1)
3333 v.Negate();
3334
3335 return v;
3336}
3337
3338Integer::Integer(const char *str, ByteOrder order)
3339 : reg(2), sign(POSITIVE)
3340{
3341 *this = StringToInteger(str,order);
3342}
3343
3344Integer::Integer(const wchar_t *str, ByteOrder order)
3345 : reg(2), sign(POSITIVE)
3346{
3347 *this = StringToInteger(str,order);
3348}
3349
3350unsigned int Integer::WordCount() const
3351{
3352 return (unsigned int)CountWords(reg, reg.size());
3353}
3354
3355unsigned int Integer::ByteCount() const
3356{
3357 unsigned wordCount = WordCount();
3358 if (wordCount)
3359 return (wordCount-1)*WORD_SIZE + BytePrecision(reg[wordCount-1]);
3360 else
3361 return 0;
3362}
3363
3364unsigned int Integer::BitCount() const
3365{
3366 unsigned wordCount = WordCount();
3367 if (wordCount)
3368 return (wordCount-1)*WORD_BITS + BitPrecision(reg[wordCount-1]);
3369 else
3370 return 0;
3371}
3372
3373void Integer::Decode(const byte *input, size_t inputLen, Signedness s)
3374{
3375 CRYPTOPP_ASSERT(input && inputLen); // NULL buffer
3376 StringStore store(input, inputLen);
3377 Decode(store, inputLen, s);
3378}
3379
3380void Integer::Decode(BufferedTransformation &bt, size_t inputLen, Signedness s)
3381{
3382 CRYPTOPP_ASSERT(bt.MaxRetrievable() >= inputLen);
3383 if (bt.MaxRetrievable() < inputLen)
3384 throw InvalidArgument("Integer: input length is too small");
3385
3386 byte b;
3387 bt.Peek(b);
3388 sign = ((s==SIGNED) && (b & 0x80)) ? NEGATIVE : POSITIVE;
3389
3390 while (inputLen>0 && (sign==POSITIVE ? b==0 : b==0xff))
3391 {
3392 bt.Skip(1);
3393 inputLen--;
3394 bt.Peek(b);
3395 }
3396
3397 reg.CleanNew(RoundupSize(BytesToWords(inputLen)));
3398 for (size_t i=inputLen; i > 0; i--)
3399 {
3400 (void)bt.Get(b);
3401 reg[(i-1)/WORD_SIZE] |= word(b) << ((i-1)%WORD_SIZE)*8;
3402 }
3403
3404 if (sign == NEGATIVE)
3405 {
3406 for (size_t i=inputLen; i<reg.size()*WORD_SIZE; i++)
3407 reg[i/WORD_SIZE] |= word(0xff) << (i%WORD_SIZE)*8;
3408 TwosComplement(reg, reg.size());
3409 }
3410}
3411
3412size_t Integer::MinEncodedSize(Signedness signedness) const
3413{
3414 // Profiling guided the flow below.
3415 unsigned int outputLen = STDMAX(1U, ByteCount());
3416 const bool pre = (signedness == UNSIGNED);
3417 if (!pre && NotNegative() && (GetByte(outputLen-1) & 0x80))
3418 outputLen++;
3419 if (pre)
3420 return outputLen;
3421 if (IsNegative() && *this < -Power2(outputLen*8-1))
3422 outputLen++;
3423 return outputLen;
3424}
3425
3426// PKCS12_PBKDF and other classes use undersized buffers
3427void Integer::Encode(byte *output, size_t outputLen, Signedness signedness) const
3428{
3429 CRYPTOPP_ASSERT(output && outputLen); // NULL buffer
3430 ArraySink sink(output, outputLen);
3431 Encode(sink, outputLen, signedness);
3432}
3433
3434void Integer::Encode(BufferedTransformation &bt, size_t outputLen, Signedness signedness) const
3435{
3436 if (signedness == UNSIGNED || NotNegative())
3437 {
3438 for (size_t i=outputLen; i > 0; i--)
3439 bt.Put(GetByte(i-1));
3440 }
3441 else
3442 {
3443 // take two's complement of *this
3444 Integer temp = Integer::Power2(8*STDMAX((size_t)ByteCount(), outputLen)) + *this;
3445 temp.Encode(bt, outputLen, UNSIGNED);
3446 }
3447}
3448
3450{
3451 DERGeneralEncoder enc(bt, INTEGER);
3453 enc.MessageEnd();
3454}
3455
3456void Integer::BERDecode(const byte *input, size_t len)
3457{
3458 CRYPTOPP_ASSERT(input && len); // NULL buffer
3459 StringStore store(input, len);
3460 BERDecode(store);
3461}
3462
3464{
3465 BERGeneralDecoder dec(bt, INTEGER);
3466 if (!dec.IsDefiniteLength() || dec.MaxRetrievable() < dec.RemainingLength())
3468 Decode(dec, (size_t)dec.RemainingLength(), SIGNED);
3469 dec.MessageEnd();
3470}
3471
3472void Integer::DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
3473{
3475 Encode(enc, length);
3476 enc.MessageEnd();
3477}
3478
3480{
3482 if (!dec.IsDefiniteLength() || dec.RemainingLength() != length)
3484 Decode(dec, length);
3485 dec.MessageEnd();
3486}
3487
3488size_t Integer::OpenPGPEncode(byte *output, size_t bufferSize) const
3489{
3490 CRYPTOPP_ASSERT(output && bufferSize); // NULL buffer
3491 CRYPTOPP_ASSERT(bufferSize >= 2+ByteCount()); // Undersized buffer
3492 ArraySink sink(output, bufferSize);
3493 return OpenPGPEncode(sink);
3494}
3495
3497{
3498 word16 bitCount = word16(BitCount());
3499 bt.PutWord16(bitCount);
3500 size_t byteCount = BitsToBytes(bitCount);
3501 Encode(bt, byteCount);
3502 return 2 + byteCount;
3503}
3504
3505void Integer::OpenPGPDecode(const byte *input, size_t len)
3506{
3507 CRYPTOPP_ASSERT(input && len); // NULL buffer
3508 StringStore store(input, len);
3509 OpenPGPDecode(store);
3510}
3511
3513{
3514 word16 bitCount;
3515 if (bt.GetWord16(bitCount) != 2 || bt.MaxRetrievable() < BitsToBytes(bitCount))
3516 throw OpenPGPDecodeErr();
3517 Decode(bt, BitsToBytes(bitCount));
3518}
3519
3520void Integer::Randomize(RandomNumberGenerator &rng, size_t nbits)
3521{
3522 const size_t nbytes = nbits/8 + 1;
3523 SecByteBlock buf(nbytes);
3524 rng.GenerateBlock(buf, nbytes);
3525 if (nbytes)
3526 buf[0] = (byte)Crop(buf[0], nbits % 8);
3527 Decode(buf, nbytes, UNSIGNED);
3528}
3529
3530void Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max)
3531{
3532 if (min > max)
3533 throw InvalidArgument("Integer: Min must be no greater than Max");
3534
3535 Integer range = max - min;
3536 const unsigned int nbits = range.BitCount();
3537
3538 do
3539 {
3540 Randomize(rng, nbits);
3541 }
3542 while (*this > range);
3543
3544 *this += min;
3545}
3546
3547bool Integer::Randomize(RandomNumberGenerator &rng, const Integer &min, const Integer &max, RandomNumberType rnType, const Integer &equiv, const Integer &mod)
3548{
3549 return GenerateRandomNoThrow(rng, MakeParameters("Min", min)("Max", max)
3550 ("RandomNumberType", rnType)("EquivalentTo", equiv)("Mod", mod));
3551}
3552
3553class KDF2_RNG : public RandomNumberGenerator
3554{
3555public:
3556 KDF2_RNG(const byte *seed, size_t seedSize)
3557 : m_counter(0), m_counterAndSeed(ClampSize(seedSize) + 4)
3558 {
3559 memcpy(m_counterAndSeed + 4, seed, ClampSize(seedSize));
3560 }
3561
3562 void GenerateBlock(byte *output, size_t size)
3563 {
3564 CRYPTOPP_ASSERT(output && size); // NULL buffer
3565 PutWord(false, BIG_ENDIAN_ORDER, m_counterAndSeed, m_counter);
3566 ++m_counter;
3567 P1363_KDF2<SHA1>::DeriveKey(output, size, m_counterAndSeed, m_counterAndSeed.size(), NULLPTR, 0);
3568 }
3569
3570 // UBsan finding, -Wstringop-overflow
3571 inline size_t ClampSize(size_t req) const
3572 {
3573 // Clamp at 16 MB
3574 if (req > 16U*1024*1024)
3575 return 16U*1024*1024;
3576 return req;
3577 }
3578
3579private:
3580 word32 m_counter;
3581 SecByteBlock m_counterAndSeed;
3582};
3583
3585{
3586 Integer min = params.GetValueWithDefault("Min", Integer::Zero());
3587 Integer max;
3588 if (!params.GetValue("Max", max))
3589 {
3590 int bitLength;
3591 if (params.GetIntValue("BitLength", bitLength))
3592 max = Integer::Power2(bitLength);
3593 else
3594 throw InvalidArgument("Integer: missing Max argument");
3595 }
3596 if (min > max)
3597 throw InvalidArgument("Integer: Min must be no greater than Max");
3598
3599 Integer equiv = params.GetValueWithDefault("EquivalentTo", Integer::Zero());
3600 Integer mod = params.GetValueWithDefault("Mod", Integer::One());
3601
3602 if (equiv.IsNegative() || equiv >= mod)
3603 throw InvalidArgument("Integer: invalid EquivalentTo and/or Mod argument");
3604
3605 Integer::RandomNumberType rnType = params.GetValueWithDefault("RandomNumberType", Integer::ANY);
3606
3607 member_ptr<KDF2_RNG> kdf2Rng;
3609 if (params.GetValue(Name::Seed(), seed))
3610 {
3611 ByteQueue bq;
3612 DERSequenceEncoder seq(bq);
3613 min.DEREncode(seq);
3614 max.DEREncode(seq);
3615 equiv.DEREncode(seq);
3616 mod.DEREncode(seq);
3617 DEREncodeUnsigned(seq, rnType);
3618 DEREncodeOctetString(seq, seed.begin(), seed.size());
3619 seq.MessageEnd();
3620
3621 SecByteBlock finalSeed((size_t)bq.MaxRetrievable());
3622 bq.Get(finalSeed, finalSeed.size());
3623 kdf2Rng.reset(new KDF2_RNG(finalSeed.begin(), finalSeed.size()));
3624 }
3625 RandomNumberGenerator &rng = kdf2Rng.get() ? (RandomNumberGenerator &)*kdf2Rng : i_rng;
3626
3627 switch (rnType)
3628 {
3629 case ANY:
3630 if (mod == One())
3631 Randomize(rng, min, max);
3632 else
3633 {
3634 Integer min1 = min + (equiv-min)%mod;
3635 if (max < min1)
3636 return false;
3637 Randomize(rng, Zero(), (max - min1) / mod);
3638 *this *= mod;
3639 *this += min1;
3640 }
3641 return true;
3642
3643 case PRIME:
3644 {
3645 const PrimeSelector *pSelector = params.GetValueWithDefault(Name::PointerToPrimeSelector(), (const PrimeSelector *)NULLPTR);
3646
3647 int i;
3648 i = 0;
3649 while (1)
3650 {
3651 if (++i==16)
3652 {
3653 // check if there are any suitable primes in [min, max]
3654 Integer first = min;
3655 if (FirstPrime(first, max, equiv, mod, pSelector))
3656 {
3657 // if there is only one suitable prime, we're done
3658 *this = first;
3659 if (!FirstPrime(first, max, equiv, mod, pSelector))
3660 return true;
3661 }
3662 else
3663 return false;
3664 }
3665
3666 Randomize(rng, min, max);
3667 if (FirstPrime(*this, STDMIN(*this+mod*PrimeSearchInterval(max), max), equiv, mod, pSelector))
3668 return true;
3669 }
3670 }
3671
3672 default:
3673 throw InvalidArgument("Integer: invalid RandomNumberType argument");
3674 }
3675}
3676
3677std::istream& operator>>(std::istream& in, Integer &a)
3678{
3679 char c;
3680 unsigned int length = 0;
3681 SecBlock<char> str(length + 16);
3682
3683 std::ws(in);
3684
3685 do
3686 {
3687 in.read(&c, 1);
3688 str[length++] = c;
3689 if (length >= str.size())
3690 str.Grow(length + 16);
3691 }
3692 while (in && (c=='-' || c=='x' || (c>='0' && c<='9') || (c>='a' && c<='f') || (c>='A' && c<='F') || c=='h' || c=='H' || c=='o' || c=='O' || c==',' || c=='.'));
3693
3694 if (in.gcount())
3695 in.putback(c);
3696 str[length-1] = '\0';
3697 a = Integer(str);
3698
3699 return in;
3700}
3701
3702// Ensure base 10 is default
3703inline int FlagToBase(long f) {
3704 return f == std::ios::hex ? 16 : (f == std::ios::oct ? 8 : 10);
3705}
3706
3707inline char FlagToSuffix(long f) {
3708 return f == std::ios::hex ? 'h' : (f == std::ios::oct ? 'o' : '.');
3709}
3710
3711// Ensure base 10 is default
3712std::ostream& operator<<(std::ostream& out, const Integer &a)
3713{
3714 // Get relevant conversion specifications from ostream.
3715 const long f = out.flags() & std::ios::basefield;
3716 const int base = FlagToBase(f);
3717 const char suffix = FlagToSuffix(f);
3718
3719 Integer temp1=a, temp2;
3720 if (a.IsNegative())
3721 {
3722 out << '-';
3723 temp1.Negate();
3724 }
3725
3726 if (!a)
3727 out << '0';
3728
3729 static const char upper[]="0123456789ABCDEF";
3730 static const char lower[]="0123456789abcdef";
3731
3732 const char* vec = (out.flags() & std::ios::uppercase) ? upper : lower;
3733 unsigned int i=0;
3734 SecBlock<char> s(a.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
3735
3736 while (!!temp1)
3737 {
3738 word digit;
3739 Integer::Divide(digit, temp2, temp1, base);
3740 s[i++]=vec[digit];
3741 temp1.swap(temp2);
3742 }
3743
3744 while (i--)
3745 {
3746 out << s[i];
3747 }
3748
3749#ifdef CRYPTOPP_USE_STD_SHOWBASE
3750 if (out.flags() & std::ios_base::showbase)
3751 out << suffix;
3752
3753 return out;
3754#else
3755 return out << suffix;
3756#endif
3757}
3758
3760{
3761 if (NotNegative())
3762 {
3763 if (Increment(reg, reg.size()))
3764 {
3765 reg.CleanGrow(2*reg.size());
3766 reg[reg.size()/2]=1;
3767 }
3768 }
3769 else
3770 {
3771 word borrow = Decrement(reg, reg.size());
3772 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3773
3774 if (WordCount()==0)
3775 *this = Zero();
3776 }
3777 return *this;
3778}
3779
3781{
3782 if (IsNegative())
3783 {
3784 if (Increment(reg, reg.size()))
3785 {
3786 reg.CleanGrow(2*reg.size());
3787 reg[reg.size()/2]=1;
3788 }
3789 }
3790 else
3791 {
3792 if (Decrement(reg, reg.size()))
3793 *this = -One();
3794 }
3795 return *this;
3796}
3797
3798// This is a bit operation. We set sign to POSITIVE, so there's no need to
3799// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3800Integer Integer::And(const Integer& t) const
3801{
3802 // Grow due to https://github.com/weidai11/cryptopp/issues/1072
3803 // The temporary Integer 'result' may have fewer blocks than
3804 // 'this' or 't', if leading 0-blocks are trimmed in copy ctor.
3805
3806 if (this == &t)
3807 {
3808 return AbsoluteValue();
3809 }
3810 else if (reg.size() >= t.reg.size())
3811 {
3812 IntegerSecBlock temp(t.reg.size());
3813 // AndWords(temp, reg, t.reg, t.reg.size());
3814 for (size_t i=0; i<t.reg.size(); ++i)
3815 temp[i] = reg[i] & t.reg[i];
3816
3817 Integer result;
3818 std::swap(result.reg, temp);
3819
3820 return result;
3821 }
3822 else // reg.size() < t.reg.size()
3823 {
3824 IntegerSecBlock temp(reg.size());
3825 // AndWords(temp, reg, t.reg, reg.size());
3826 for (size_t i=0; i<reg.size(); ++i)
3827 temp[i] = reg[i] & t.reg[i];
3828
3829 Integer result;
3830 std::swap(result.reg, temp);
3831
3832 return result;
3833 }
3834}
3835
3836// This is a bit operation. We set sign to POSITIVE, so there's no need to
3837// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3838Integer Integer::Or(const Integer& t) const
3839{
3840 // Grow due to https://github.com/weidai11/cryptopp/issues/1072
3841 // The temporary Integer 'result' may have fewer blocks than
3842 // 'this' or 't', if leading 0-blocks are trimmed in copy ctor.
3843
3844 if (this == &t)
3845 {
3846 return AbsoluteValue();
3847 }
3848 else if (reg.size() >= t.reg.size())
3849 {
3850 IntegerSecBlock temp(reg, reg.size());
3851 // OrWords(temp, t.reg, t.reg.size());
3852 for (size_t i=0; i<t.reg.size(); ++i)
3853 temp[i] |= t.reg[i];
3854
3855 Integer result;
3856 std::swap(result.reg, temp);
3857
3858 return result;
3859 }
3860 else // reg.size() < t.reg.size()
3861 {
3862 IntegerSecBlock temp(t.reg, t.reg.size());
3863 // OrWords(temp, reg, reg.size());
3864 for (size_t i=0; i<reg.size(); ++i)
3865 temp[i] |= reg[i];
3866
3867 Integer result;
3868 std::swap(result.reg, temp);
3869
3870 return result;
3871 }
3872}
3873
3874// This is a bit operation. We set sign to POSITIVE, so there's no need to
3875// worry about negative zero. Also see http://stackoverflow.com/q/11644362.
3876Integer Integer::Xor(const Integer& t) const
3877{
3878 // Grow due to https://github.com/weidai11/cryptopp/issues/1072
3879 // The temporary Integer 'result' may have fewer blocks than
3880 // 'this' or 't', if leading 0-blocks are trimmed in copy ctor.
3881
3882 if (this == &t)
3883 {
3884 return Integer::Zero();
3885 }
3886 else if (reg.size() >= t.reg.size())
3887 {
3888 IntegerSecBlock temp(reg, reg.size());
3889 // OrWords(temp, t.reg, t.reg.size());
3890 for (size_t i=0; i<t.reg.size(); ++i)
3891 temp[i] ^= t.reg[i];
3892
3893 Integer result;
3894 std::swap(result.reg, temp);
3895
3896 return result;
3897 }
3898 else // reg.size() < t.reg.size()
3899 {
3900 IntegerSecBlock temp(t.reg, t.reg.size());
3901 // OrWords(temp, reg, reg.size());
3902 for (size_t i=0; i<reg.size(); ++i)
3903 temp[i] ^= reg[i];
3904
3905 Integer result;
3906 std::swap(result.reg, temp);
3907
3908 return result;
3909 }
3910}
3911
3912void PositiveAdd(Integer &sum, const Integer &a, const Integer& b)
3913{
3914 // Profiling guided the flow below.
3915 int carry; const bool pre = (a.reg.size() == b.reg.size());
3916 if (!pre && a.reg.size() > b.reg.size())
3917 {
3918 carry = Add(sum.reg, a.reg, b.reg, b.reg.size());
3919 CopyWords(sum.reg+b.reg.size(), a.reg+b.reg.size(), a.reg.size()-b.reg.size());
3920 carry = Increment(sum.reg+b.reg.size(), a.reg.size()-b.reg.size(), carry);
3921 }
3922 else if (pre)
3923 {
3924 carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3925 }
3926 else
3927 {
3928 carry = Add(sum.reg, a.reg, b.reg, a.reg.size());
3929 CopyWords(sum.reg+a.reg.size(), b.reg+a.reg.size(), b.reg.size()-a.reg.size());
3930 carry = Increment(sum.reg+a.reg.size(), b.reg.size()-a.reg.size(), carry);
3931 }
3932
3933 if (carry)
3934 {
3935 sum.reg.CleanGrow(2*sum.reg.size());
3936 sum.reg[sum.reg.size()/2] = 1;
3937 }
3938 sum.sign = Integer::POSITIVE;
3939}
3940
3941void PositiveSubtract(Integer &diff, const Integer &a, const Integer& b)
3942{
3943 unsigned aSize = a.WordCount();
3944 aSize += aSize%2;
3945 unsigned bSize = b.WordCount();
3946 bSize += bSize%2;
3947
3948 // Profiling guided the flow below.
3949 if (aSize > bSize)
3950 {
3951 word borrow = Subtract(diff.reg, a.reg, b.reg, bSize);
3952 CopyWords(diff.reg+bSize, a.reg+bSize, aSize-bSize);
3953 borrow = Decrement(diff.reg+bSize, aSize-bSize, borrow);
3954 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3955 diff.sign = Integer::POSITIVE;
3956 }
3957 else if (aSize == bSize)
3958 {
3959 if (Compare(a.reg, b.reg, aSize) >= 0)
3960 {
3961 Subtract(diff.reg, a.reg, b.reg, aSize);
3962 diff.sign = Integer::POSITIVE;
3963 }
3964 else
3965 {
3966 Subtract(diff.reg, b.reg, a.reg, aSize);
3967 diff.sign = Integer::NEGATIVE;
3968 }
3969 }
3970 else
3971 {
3972 word borrow = Subtract(diff.reg, b.reg, a.reg, aSize);
3973 CopyWords(diff.reg+aSize, b.reg+aSize, bSize-aSize);
3974 borrow = Decrement(diff.reg+aSize, bSize-aSize, borrow);
3975 CRYPTOPP_ASSERT(!borrow); CRYPTOPP_UNUSED(borrow);
3976 diff.sign = Integer::NEGATIVE;
3977 }
3978}
3979
3980// MSVC .NET 2003 workaround
3981template <class T> inline const T& STDMAX2(const T& a, const T& b)
3982{
3983 return a < b ? b : a;
3984}
3985
3986Integer Integer::Plus(const Integer& b) const
3987{
3988 Integer sum((word)0, STDMAX2(reg.size(), b.reg.size()));
3989 if (NotNegative())
3990 {
3991 if (b.NotNegative())
3992 PositiveAdd(sum, *this, b);
3993 else
3994 PositiveSubtract(sum, *this, b);
3995 }
3996 else
3997 {
3998 if (b.NotNegative())
3999 PositiveSubtract(sum, b, *this);
4000 else
4001 {
4002 PositiveAdd(sum, *this, b);
4003 sum.sign = Integer::NEGATIVE;
4004 }
4005 }
4006 return sum;
4007}
4008
4010{
4011 reg.CleanGrow(t.reg.size());
4012 if (NotNegative())
4013 {
4014 if (t.NotNegative())
4015 PositiveAdd(*this, *this, t);
4016 else
4017 PositiveSubtract(*this, *this, t);
4018 }
4019 else
4020 {
4021 if (t.NotNegative())
4022 PositiveSubtract(*this, t, *this);
4023 else
4024 {
4025 PositiveAdd(*this, *this, t);
4026 sign = Integer::NEGATIVE;
4027 }
4028 }
4029 return *this;
4030}
4031
4032Integer Integer::Minus(const Integer& b) const
4033{
4034 Integer diff((word)0, STDMAX2(reg.size(), b.reg.size()));
4035 if (NotNegative())
4036 {
4037 if (b.NotNegative())
4038 PositiveSubtract(diff, *this, b);
4039 else
4040 PositiveAdd(diff, *this, b);
4041 }
4042 else
4043 {
4044 if (b.NotNegative())
4045 {
4046 PositiveAdd(diff, *this, b);
4047 diff.sign = Integer::NEGATIVE;
4048 }
4049 else
4050 PositiveSubtract(diff, b, *this);
4051 }
4052 return diff;
4053}
4054
4056{
4057 reg.CleanGrow(t.reg.size());
4058 if (NotNegative())
4059 {
4060 if (t.NotNegative())
4061 PositiveSubtract(*this, *this, t);
4062 else
4063 PositiveAdd(*this, *this, t);
4064 }
4065 else
4066 {
4067 if (t.NotNegative())
4068 {
4069 PositiveAdd(*this, *this, t);
4070 sign = Integer::NEGATIVE;
4071 }
4072 else
4073 PositiveSubtract(*this, t, *this);
4074 }
4075 return *this;
4076}
4077
4079{
4080 const size_t wordCount = WordCount();
4081 const size_t shiftWords = n / WORD_BITS;
4082 const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4083
4084 reg.CleanGrow(RoundupSize(wordCount+BitsToWords(n)));
4085 ShiftWordsLeftByWords(reg, wordCount + shiftWords, shiftWords);
4086 ShiftWordsLeftByBits(reg+shiftWords, wordCount+BitsToWords(shiftBits), shiftBits);
4087 return *this;
4088}
4089
4091{
4092 const size_t wordCount = WordCount();
4093 const size_t shiftWords = n / WORD_BITS;
4094 const unsigned int shiftBits = (unsigned int)(n % WORD_BITS);
4095
4096 ShiftWordsRightByWords(reg, wordCount, shiftWords);
4097 if (wordCount > shiftWords)
4098 ShiftWordsRightByBits(reg, wordCount-shiftWords, shiftBits);
4099 if (IsNegative() && WordCount()==0) // avoid -0
4100 *this = Zero();
4101 return *this;
4102}
4103
4105{
4106 if (this != &t)
4107 {
4108 const size_t size = STDMIN(reg.size(), t.reg.size());
4109 reg.resize(size);
4110 AndWords(reg, t.reg, size);
4111 }
4112 sign = POSITIVE;
4113 return *this;
4114}
4115
4117{
4118 if (this != &t)
4119 {
4120 if (reg.size() >= t.reg.size())
4121 {
4122 OrWords(reg, t.reg, t.reg.size());
4123 }
4124 else // reg.size() < t.reg.size()
4125 {
4126 const size_t head = reg.size();
4127 const size_t tail = t.reg.size() - reg.size();
4128 reg.resize(head+tail);
4129 OrWords(reg, t.reg, head);
4130 CopyWords(reg+head,t.reg+head,tail);
4131 }
4132 }
4133 sign = POSITIVE;
4134 return *this;
4135}
4136
4138{
4139 if (this == &t)
4140 {
4141 *this = Zero();
4142 }
4143 else
4144 {
4145 if (reg.size() >= t.reg.size())
4146 {
4147 XorWords(reg, t.reg, t.reg.size());
4148 }
4149 else // reg.size() < t.reg.size()
4150 {
4151 const size_t head = reg.size();
4152 const size_t tail = t.reg.size() - reg.size();
4153 reg.resize(head+tail);
4154 XorWords(reg, t.reg, head);
4155 CopyWords(reg+head,t.reg+head,tail);
4156 }
4157 }
4158 sign = POSITIVE;
4159 return *this;
4160}
4161
4162void PositiveMultiply(Integer &product, const Integer &a, const Integer &b)
4163{
4164 size_t aSize = RoundupSize(a.WordCount());
4165 size_t bSize = RoundupSize(b.WordCount());
4166
4167 product.reg.CleanNew(RoundupSize(aSize+bSize));
4168 product.sign = Integer::POSITIVE;
4169
4170 IntegerSecBlock workspace(aSize + bSize);
4171 AsymmetricMultiply(product.reg, workspace, a.reg, aSize, b.reg, bSize);
4172}
4173
4174void Multiply(Integer &product, const Integer &a, const Integer &b)
4175{
4176 PositiveMultiply(product, a, b);
4177
4178 if (a.NotNegative() != b.NotNegative())
4179 product.Negate();
4180}
4181
4182Integer Integer::Times(const Integer &b) const
4183{
4184 Integer product;
4185 Multiply(product, *this, b);
4186 return product;
4187}
4188
4189/*
4190void PositiveDivide(Integer &remainder, Integer &quotient,
4191 const Integer &dividend, const Integer &divisor)
4192{
4193 remainder.reg.CleanNew(divisor.reg.size());
4194 remainder.sign = Integer::POSITIVE;
4195 quotient.reg.New(0);
4196 quotient.sign = Integer::POSITIVE;
4197 unsigned i=dividend.BitCount();
4198 while (i--)
4199 {
4200 word overflow = ShiftWordsLeftByBits(remainder.reg, remainder.reg.size(), 1);
4201 remainder.reg[0] |= dividend[i];
4202 if (overflow || remainder >= divisor)
4203 {
4204 Subtract(remainder.reg, remainder.reg, divisor.reg, remainder.reg.size());
4205 quotient.SetBit(i);
4206 }
4207 }
4208}
4209*/
4210
4211void PositiveDivide(Integer &remainder, Integer &quotient,
4212 const Integer &a, const Integer &b)
4213{
4214 unsigned aSize = a.WordCount();
4215 unsigned bSize = b.WordCount();
4216
4217 if (!bSize)
4218 throw Integer::DivideByZero();
4219
4220 if (aSize < bSize)
4221 {
4222 remainder = a;
4223 remainder.sign = Integer::POSITIVE;
4224 quotient = Integer::Zero();
4225 return;
4226 }
4227
4228 aSize += aSize%2; // round up to next even number
4229 bSize += bSize%2;
4230
4231 remainder.reg.CleanNew(RoundupSize(bSize));
4232 remainder.sign = Integer::POSITIVE;
4233 quotient.reg.CleanNew(RoundupSize(aSize-bSize+2));
4234 quotient.sign = Integer::POSITIVE;
4235
4236 IntegerSecBlock T(aSize+3*(bSize+2));
4237 Divide(remainder.reg, quotient.reg, T, a.reg, aSize, b.reg, bSize);
4238}
4239
4240void Integer::Divide(Integer &remainder, Integer &quotient, const Integer &dividend, const Integer &divisor)
4241{
4242 PositiveDivide(remainder, quotient, dividend, divisor);
4243
4244 if (dividend.IsNegative())
4245 {
4246 quotient.Negate();
4247 if (remainder.NotZero())
4248 {
4249 --quotient;
4250 remainder = divisor.AbsoluteValue() - remainder;
4251 }
4252 }
4253
4254 if (divisor.IsNegative())
4255 quotient.Negate();
4256}
4257
4258void Integer::DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
4259{
4260 q = a;
4261 q >>= n;
4262
4263 const size_t wordCount = BitsToWords(n);
4264 if (wordCount <= a.WordCount())
4265 {
4266 r.reg.resize(RoundupSize(wordCount));
4267 CopyWords(r.reg, a.reg, wordCount);
4268 SetWords(r.reg+wordCount, 0, r.reg.size()-wordCount);
4269 if (n % WORD_BITS != 0)
4270 r.reg[wordCount-1] %= (word(1) << (n % WORD_BITS));
4271 }
4272 else
4273 {
4274 r.reg.resize(RoundupSize(a.WordCount()));
4275 CopyWords(r.reg, a.reg, r.reg.size());
4276 }
4277 r.sign = POSITIVE;
4278
4279 if (a.IsNegative() && r.NotZero())
4280 {
4281 --q;
4282 r = Power2(n) - r;
4283 }
4284}
4285
4286Integer Integer::DividedBy(const Integer &b) const
4287{
4288 Integer remainder, quotient;
4289 Integer::Divide(remainder, quotient, *this, b);
4290 return quotient;
4291}
4292
4293Integer Integer::Modulo(const Integer &b) const
4294{
4295 Integer remainder, quotient;
4296 Integer::Divide(remainder, quotient, *this, b);
4297 return remainder;
4298}
4299
4300void Integer::Divide(word &remainder, Integer &quotient, const Integer &dividend, word divisor)
4301{
4302 if (!divisor)
4303 throw Integer::DivideByZero();
4304
4305 // IsPowerOf2 uses BMI on x86 if available. There is a small
4306 // but measurable improvement during decryption and signing.
4307 if (IsPowerOf2(divisor))
4308 {
4309 quotient = dividend >> (BitPrecision(divisor)-1);
4310 remainder = dividend.reg[0] & (divisor-1);
4311 return;
4312 }
4313
4314 unsigned int i = dividend.WordCount();
4315 quotient.reg.CleanNew(RoundupSize(i));
4316 remainder = 0;
4317 while (i--)
4318 {
4319 quotient.reg[i] = DWord(dividend.reg[i], remainder) / divisor;
4320 remainder = DWord(dividend.reg[i], remainder) % divisor;
4321 }
4322
4323 if (dividend.NotNegative())
4324 quotient.sign = POSITIVE;
4325 else
4326 {
4327 quotient.sign = NEGATIVE;
4328 if (remainder)
4329 {
4330 --quotient;
4331 remainder = divisor - remainder;
4332 }
4333 }
4334}
4335
4337{
4338 word remainder;
4339 Integer quotient;
4340 Integer::Divide(remainder, quotient, *this, b);
4341 return quotient;
4342}
4343
4344word Integer::Modulo(word divisor) const
4345{
4346 if (!divisor)
4347 throw Integer::DivideByZero();
4348
4349 word remainder;
4350
4351 // Profiling guided the flow below.
4352 if ((divisor & (divisor-1)) != 0) // divisor is not a power of 2
4353 {
4354 // Profiling guided the flow below.
4355 unsigned int i = WordCount();
4356 if (divisor > 5)
4357 {
4358 remainder = 0;
4359 while (i--)
4360 remainder = DWord(reg[i], remainder) % divisor;
4361 }
4362 else
4363 {
4364 DWord sum(0, 0);
4365 while (i--)
4366 sum += reg[i];
4367 remainder = sum % divisor;
4368 }
4369 }
4370 else // divisor is a power of 2
4371 {
4372 remainder = reg[0] & (divisor-1);
4373 }
4374
4375 if (IsNegative() && remainder)
4376 remainder = divisor - remainder;
4377
4378 return remainder;
4379}
4380
4381void Integer::Negate()
4382{
4383 if (!!(*this)) // don't flip sign if *this==0
4384 sign = Sign(1-sign);
4385}
4386
4387int Integer::PositiveCompare(const Integer& t) const
4388{
4389 // Profiling guided the flow below.
4390 const unsigned size = WordCount(), tSize = t.WordCount();
4391 if (size != tSize)
4392 return size > tSize ? 1 : -1;
4393 else
4394 return CryptoPP::Compare(reg, t.reg, size);
4395}
4396
4397int Integer::Compare(const Integer& t) const
4398{
4399 if (NotNegative())
4400 {
4401 if (t.NotNegative())
4402 return PositiveCompare(t);
4403 else
4404 return 1;
4405 }
4406 else
4407 {
4408 if (t.NotNegative())
4409 return -1;
4410 else
4411 return -PositiveCompare(t);
4412 }
4413}
4414
4416{
4417 if (!IsPositive())
4418 return Zero();
4419
4420 // overestimate square root
4421 Integer x, y = Power2((BitCount()+1)/2);
4422 CRYPTOPP_ASSERT(y*y >= *this);
4423
4424 do
4425 {
4426 x = y;
4427 y = (x + *this/x) >> 1;
4428 } while (y<x);
4429
4430 return x;
4431}
4432
4433bool Integer::IsSquare() const
4434{
4435 Integer r = SquareRoot();
4436 return *this == r.Squared();
4437}
4438
4439bool Integer::IsUnit() const
4440{
4441 return (WordCount() == 1) && (reg[0] == 1);
4442}
4443
4445{
4446 return IsUnit() ? *this : Zero();
4447}
4448
4449Integer a_times_b_mod_c(const Integer &x, const Integer& y, const Integer& m)
4450{
4452 if (m.IsZero())
4453 throw Integer::DivideByZero();
4454
4455 return x*y%m;
4456}
4457
4458Integer a_exp_b_mod_c(const Integer &x, const Integer& e, const Integer& m)
4459{
4461 if (m.IsZero())
4462 throw Integer::DivideByZero();
4463
4464 ModularArithmetic mr(m);
4465 return mr.Exponentiate(x, e);
4466}
4467
4468Integer Integer::Gcd(const Integer &a, const Integer &b)
4469{
4470 return EuclideanDomainOf<Integer>().Gcd(a, b);
4471}
4472
4473Integer Integer::InverseMod(const Integer &m) const
4474{
4477
4478 if (IsNegative())
4479 return Modulo(m).InverseModNext(m);
4480
4481 // http://github.com/weidai11/cryptopp/issues/602
4482 if (*this >= m)
4483 return Modulo(m).InverseModNext(m);
4484
4485 return InverseModNext(m);
4486}
4487
4488Integer Integer::InverseModNext(const Integer &m) const
4489{
4492
4493 if (m.IsEven())
4494 {
4495 if (!m || IsEven())
4496 return Zero(); // no inverse
4497 if (*this == One())
4498 return One();
4499
4500 Integer u = m.Modulo(*this).InverseModNext(*this);
4501 return !u ? Zero() : (m*(*this-u)+1)/(*this);
4502 }
4503
4504 // AlmostInverse requires a 4x workspace
4505 IntegerSecBlock T(m.reg.size() * 4);
4506 Integer r((word)0, m.reg.size());
4507 unsigned k = AlmostInverse(r.reg, T, reg, reg.size(), m.reg, m.reg.size());
4508 DivideByPower2Mod(r.reg, r.reg, k, m.reg, m.reg.size());
4509 return r;
4510}
4511
4512word Integer::InverseMod(word mod) const
4513{
4514 CRYPTOPP_ASSERT(mod != 0);
4515
4516 word g0 = mod, g1 = *this % mod;
4517 word v0 = 0, v1 = 1;
4518 word y;
4519
4520 while (g1)
4521 {
4522 if (g1 == 1)
4523 return v1;
4524 y = g0 / g1;
4525 g0 = g0 % g1;
4526 v0 += y * v1;
4527
4528 if (!g0)
4529 break;
4530 if (g0 == 1)
4531 return mod-v0;
4532 y = g1 / g0;
4533 g1 = g1 % g0;
4534 v1 += y * v0;
4535 }
4536 return 0;
4537}
4538
4539// ********************************************************
4540
4542{
4543 BERSequenceDecoder seq(bt);
4544 OID oid(seq);
4545 if (oid != ASN1::prime_field())
4547 m_modulus.BERDecode(seq);
4548 seq.MessageEnd();
4549 m_result.reg.resize(m_modulus.reg.size());
4550}
4551
4553{
4554 DERSequenceEncoder seq(bt);
4555 ASN1::prime_field().DEREncode(seq);
4556 m_modulus.DEREncode(seq);
4557 seq.MessageEnd();
4558}
4559
4560void ModularArithmetic::DEREncodeElement(BufferedTransformation &out, const Element &a) const
4561{
4562 a.DEREncodeAsOctetString(out, MaxElementByteLength());
4563}
4564
4566{
4567 a.BERDecodeAsOctetString(in, MaxElementByteLength());
4568}
4569
4570const Integer& ModularArithmetic::Half(const Integer &a) const
4571{
4572 if (a.reg.size()==m_modulus.reg.size())
4573 {
4574 CryptoPP::DivideByPower2Mod(m_result.reg.begin(), a.reg, 1, m_modulus.reg, a.reg.size());
4575 return m_result;
4576 }
4577 else
4578 return m_result1 = (a.IsEven() ? (a >> 1) : ((a+m_modulus) >> 1));
4579}
4580
4581const Integer& ModularArithmetic::Add(const Integer &a, const Integer &b) const
4582{
4583 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4584 {
4585 if (CryptoPP::Add(m_result.reg.begin(), a.reg, b.reg, a.reg.size())
4586 || Compare(m_result.reg, m_modulus.reg, a.reg.size()) >= 0)
4587 {
4588 CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4589 }
4590 return m_result;
4591 }
4592 else
4593 {
4594 m_result1 = a+b;
4595 if (m_result1 >= m_modulus)
4596 m_result1 -= m_modulus;
4597 return m_result1;
4598 }
4599}
4600
4602{
4603 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4604 {
4605 if (CryptoPP::Add(a.reg, a.reg, b.reg, a.reg.size())
4606 || Compare(a.reg, m_modulus.reg, a.reg.size()) >= 0)
4607 {
4608 CryptoPP::Subtract(a.reg, a.reg, m_modulus.reg, a.reg.size());
4609 }
4610 }
4611 else
4612 {
4613 a+=b;
4614 if (a>=m_modulus)
4615 a-=m_modulus;
4616 }
4617
4618 return a;
4619}
4620
4621const Integer& ModularArithmetic::Subtract(const Integer &a, const Integer &b) const
4622{
4623 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4624 {
4625 if (CryptoPP::Subtract(m_result.reg.begin(), a.reg, b.reg, a.reg.size()))
4626 CryptoPP::Add(m_result.reg.begin(), m_result.reg, m_modulus.reg, a.reg.size());
4627 return m_result;
4628 }
4629 else
4630 {
4631 m_result1 = a-b;
4632 if (m_result1.IsNegative())
4633 m_result1 += m_modulus;
4634 return m_result1;
4635 }
4636}
4637
4639{
4640 if (a.reg.size()==m_modulus.reg.size() && b.reg.size()==m_modulus.reg.size())
4641 {
4642 if (CryptoPP::Subtract(a.reg, a.reg, b.reg, a.reg.size()))
4643 CryptoPP::Add(a.reg, a.reg, m_modulus.reg, a.reg.size());
4644 }
4645 else
4646 {
4647 a-=b;
4648 if (a.IsNegative())
4649 a+=m_modulus;
4650 }
4651
4652 return a;
4653}
4654
4655const Integer& ModularArithmetic::Inverse(const Integer &a) const
4656{
4657 if (!a)
4658 return a;
4659
4660 CopyWords(m_result.reg.begin(), m_modulus.reg, m_modulus.reg.size());
4661 if (CryptoPP::Subtract(m_result.reg.begin(), m_result.reg, a.reg, a.reg.size()))
4662 Decrement(m_result.reg.begin()+a.reg.size(), m_modulus.reg.size()-a.reg.size());
4663
4664 return m_result;
4665}
4666
4667Integer ModularArithmetic::CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
4668{
4669 if (m_modulus.IsOdd())
4670 {
4671 MontgomeryRepresentation dr(m_modulus);
4672 return dr.ConvertOut(dr.CascadeExponentiate(dr.ConvertIn(x), e1, dr.ConvertIn(y), e2));
4673 }
4674 else
4675 return AbstractRing<Integer>::CascadeExponentiate(x, e1, y, e2);
4676}
4677
4678void ModularArithmetic::SimultaneousExponentiate(Integer *results, const Integer &base, const Integer *exponents, unsigned int exponentsCount) const
4679{
4680 if (m_modulus.IsOdd())
4681 {
4682 MontgomeryRepresentation dr(m_modulus);
4683 dr.SimultaneousExponentiate(results, dr.ConvertIn(base), exponents, exponentsCount);
4684 for (unsigned int i=0; i<exponentsCount; i++)
4685 results[i] = dr.ConvertOut(results[i]);
4686 }
4687 else
4688 AbstractRing<Integer>::SimultaneousExponentiate(results, base, exponents, exponentsCount);
4689}
4690
4691MontgomeryRepresentation::MontgomeryRepresentation(const Integer &m) // modulus must be odd
4692 : ModularArithmetic(m),
4693 m_u((word)0, m_modulus.reg.size()),
4694 m_workspace(5*m_modulus.reg.size())
4695{
4696 if (!m_modulus.IsOdd())
4697 throw InvalidArgument("MontgomeryRepresentation: Montgomery representation requires an odd modulus");
4698
4699 RecursiveInverseModPower2(m_u.reg, m_workspace, m_modulus.reg, m_modulus.reg.size());
4700}
4701
4702const Integer& MontgomeryRepresentation::Multiply(const Integer &a, const Integer &b) const
4703{
4704 word *const T = m_workspace.begin();
4705 word *const R = m_result.reg.begin();
4706 const size_t N = m_modulus.reg.size();
4707 CRYPTOPP_ASSERT(a.reg.size()<=N && b.reg.size()<=N);
4708
4709 AsymmetricMultiply(T, T+2*N, a.reg, a.reg.size(), b.reg, b.reg.size());
4710 SetWords(T+a.reg.size()+b.reg.size(), 0, 2*N-a.reg.size()-b.reg.size());
4711 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4712 return m_result;
4713}
4714
4716{
4717 word *const T = m_workspace.begin();
4718 word *const R = m_result.reg.begin();
4719 const size_t N = m_modulus.reg.size();
4720 CRYPTOPP_ASSERT(a.reg.size()<=N);
4721
4722 CryptoPP::Square(T, T+2*N, a.reg, a.reg.size());
4723 SetWords(T+2*a.reg.size(), 0, 2*N-2*a.reg.size());
4724 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4725 return m_result;
4726}
4727
4729{
4730 word *const T = m_workspace.begin();
4731 word *const R = m_result.reg.begin();
4732 const size_t N = m_modulus.reg.size();
4733 CRYPTOPP_ASSERT(a.reg.size()<=N);
4734
4735 CopyWords(T, a.reg, a.reg.size());
4736 SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4737 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4738 return m_result;
4739}
4740
4742{
4743// return (EuclideanMultiplicativeInverse(a, modulus)<<(2*WORD_BITS*modulus.reg.size()))%modulus;
4744 word *const T = m_workspace.begin();
4745 word *const R = m_result.reg.begin();
4746 const size_t N = m_modulus.reg.size();
4747 CRYPTOPP_ASSERT(a.reg.size()<=N);
4748
4749 CopyWords(T, a.reg, a.reg.size());
4750 SetWords(T+a.reg.size(), 0, 2*N-a.reg.size());
4751 MontgomeryReduce(R, T+2*N, T, m_modulus.reg, m_u.reg, N);
4752 unsigned k = AlmostInverse(R, T, R, N, m_modulus.reg, N);
4753
4754// cout << "k=" << k << " N*32=" << 32*N << endl;
4755
4756 if (k>N*WORD_BITS)
4757 DivideByPower2Mod(R, R, k-N*WORD_BITS, m_modulus.reg, N);
4758 else
4759 MultiplyByPower2Mod(R, R, N*WORD_BITS-k, m_modulus.reg, N);
4760
4761 return m_result;
4762}
4763
4764// Specialization declared in misc.h to allow us to print integers
4765// with additional control options, like arbitrary bases and uppercase.
4766template <> CRYPTOPP_DLL
4767std::string IntToString<Integer>(Integer value, unsigned int base)
4768{
4769 // Hack... set the high bit for uppercase. Set the next bit fo a suffix.
4770 static const unsigned int BIT_32 = (1U << 31);
4771 const bool UPPER = !!(base & BIT_32);
4772 static const unsigned int BIT_31 = (1U << 30);
4773 const bool BASE = !!(base & BIT_31);
4774
4775 const char CH = UPPER ? 'A' : 'a';
4776 base &= ~(BIT_32|BIT_31);
4777 CRYPTOPP_ASSERT(base >= 2 && base <= 32);
4778
4779 if (value == 0)
4780 return "0";
4781
4782 bool negative = false, zero = false;
4783 if (value.IsNegative())
4784 {
4785 negative = true;
4786 value.Negate();
4787 }
4788
4789 if (!value)
4790 zero = true;
4791
4792 SecBlock<char> s(value.BitCount() / (SaturatingSubtract1(BitPrecision(base),1U)) + 1);
4793 Integer temp;
4794
4795 unsigned int i=0;
4796 while (!!value)
4797 {
4798 word digit;
4799 Integer::Divide(digit, temp, value, word(base));
4800 s[i++]=char((digit < 10 ? '0' : (CH - 10)) + digit);
4801 value.swap(temp);
4802 }
4803
4804 std::string result;
4805 result.reserve(i+2);
4806
4807 if (negative)
4808 result += '-';
4809
4810 if (zero)
4811 result += '0';
4812
4813 while (i--)
4814 result += s[i];
4815
4816 if (BASE)
4817 {
4818 if (base == 10)
4819 result += '.';
4820 else if (base == 16)
4821 result += 'h';
4822 else if (base == 8)
4823 result += 'o';
4824 else if (base == 2)
4825 result += 'b';
4826 }
4827
4828 return result;
4829}
4830
4831// Specialization declared in misc.h to avoid Coverity findings.
4832template <> CRYPTOPP_DLL
4833std::string IntToString<word64>(word64 value, unsigned int base)
4834{
4835 // Hack... set the high bit for uppercase.
4836 static const unsigned int HIGH_BIT = (1U << 31);
4837 const char CH = !!(base & HIGH_BIT) ? 'A' : 'a';
4838 base &= ~HIGH_BIT;
4839
4840 CRYPTOPP_ASSERT(base >= 2);
4841 if (value == 0)
4842 return "0";
4843
4844 std::string result;
4845 while (value > 0)
4846 {
4847 word64 digit = value % base;
4848 result = char((digit < 10 ? '0' : (CH - 10)) + digit) + result;
4849 value /= base;
4850 }
4851 return result;
4852}
4853
4854#ifndef CRYPTOPP_NO_ASSIGN_TO_INTEGER
4855// Allow the linker to discard Integer code if not needed.
4856// Also see http://github.com/weidai11/cryptopp/issues/389.
4857bool AssignIntToInteger(const std::type_info &valueType, void *pInteger, const void *pInt)
4858{
4859 if (valueType != typeid(Integer))
4860 return false;
4861 *reinterpret_cast<Integer *>(pInteger) = *reinterpret_cast<const int *>(pInt);
4862 return true;
4863}
4864#endif // CRYPTOPP_NO_ASSIGN_TO_INTEGER
4865
4866// *************************** C++ Static Initialization ***************************
4867
4868class InitInteger
4869{
4870public:
4871 InitInteger()
4872 {
4873 SetFunctionPointers();
4874 }
4875};
4876
4877// This is not really needed because each Integer can dynamically initialize
4878// itself, but we take a peephole optimization and initialize the class once
4879// if init priorities are available. Dynamic initialization will be used if
4880// init priorities are not available.
4881
4882#if defined(HAVE_GCC_INIT_PRIORITY)
4883 const InitInteger s_init __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 10))) = InitInteger();
4884 const Integer g_zero __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 11))) = Integer(0L);
4885 const Integer g_one __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 12))) = Integer(1L);
4886 const Integer g_two __attribute__ ((init_priority (CRYPTOPP_INIT_PRIORITY + 13))) = Integer(2L);
4887#elif defined(HAVE_MSC_INIT_PRIORITY)
4888 #pragma warning(disable: 4075)
4889 #pragma init_seg(".CRT$XCU")
4890 const InitInteger s_init;
4891 const Integer g_zero(0L);
4892 const Integer g_one(1L);
4893 const Integer g_two(2L);
4894 #pragma warning(default: 4075)
4895#elif HAVE_XLC_INIT_PRIORITY
4896 // XLC needs constant, not a define
4897 #pragma priority(280)
4898 const InitInteger s_init;
4899 const Integer g_zero(0L);
4900 const Integer g_one(1L);
4901 const Integer g_two(2L);
4902#else
4903 const InitInteger s_init;
4904#endif
4905
4906// ***************** Library code ********************
4907
4908const Integer &Integer::Zero()
4909{
4910#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4911 return g_zero;
4912#elif defined(CRYPTOPP_CXX11_STATIC_INIT)
4913 static const Integer s_zero(0L);
4914 return s_zero;
4915#else // Potential memory leak. Avoid if possible.
4916 return Singleton<Integer, NewInteger<0L> >().Ref();
4917#endif
4918}
4919
4920const Integer &Integer::One()
4921{
4922#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4923 return g_one;
4924#elif defined(CRYPTOPP_CXX11_STATIC_INIT)
4925 static const Integer s_one(1L);
4926 return s_one;
4927#else // Potential memory leak. Avoid if possible.
4928 return Singleton<Integer, NewInteger<1L> >().Ref();
4929#endif
4930}
4931
4932const Integer &Integer::Two()
4933{
4934#if defined(HAVE_GCC_INIT_PRIORITY) || defined(HAVE_MSC_INIT_PRIORITY) || defined(HAVE_XLC_INIT_PRIORITY)
4935 return g_two;
4936#elif defined(CRYPTOPP_CXX11_STATIC_INIT)
4937 static const Integer s_two(2L);
4938 return s_two;
4939#else // Potential memory leak. Avoid if possible.
4940 return Singleton<Integer, NewInteger<2L> >().Ref();
4941#endif
4942}
4943
4944NAMESPACE_END
4945
4946#endif // CRYPTOPP_IMPORTS
#define MAYBE_UNCONST_CAST(T, x)
SunCC workaround.
Definition: adv_simd.h:595
#define MAYBE_CONST
SunCC workaround.
Definition: adv_simd.h:590
Classes for working with NameValuePairs.
AlgorithmParameters MakeParameters(const char *name, const T &value, bool throwIfNotUsed=true)
Create an object that implements NameValuePairs.
Definition: algparam.h:508
Classes and functions for working with ANS.1 objects.
std::ostream & operator<<(std::ostream &out, const OID &oid)
Print a OID value.
Definition: asn.h:939
OID operator+(const OID &lhs, unsigned long rhs)
Append a value to an OID.
CRYPTOPP_DLL size_t DEREncodeOctetString(BufferedTransformation &bt, const byte *str, size_t strLen)
DER encode octet string.
size_t DEREncodeUnsigned(BufferedTransformation &out, T w, byte asnTag=INTEGER)
DER Encode unsigned value.
Definition: asn.h:820
@ OCTET_STRING
ASN.1 Octet string.
Definition: asn.h:38
@ INTEGER
ASN.1 Integer.
Definition: asn.h:34
void BERDecodeError()
Raises a BERDecodeErr.
Definition: asn.h:104
virtual const Element & Gcd(const Element &a, const Element &b) const
Calculates the greatest common denominator in the ring.
Definition: algebra.cpp:56
virtual void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the Ring.
Definition: algebra.cpp:334
virtual Element CascadeExponentiate(const Element &x, const Integer &e1, const Element &y, const Integer &e2) const
TODO.
Definition: algebra.cpp:323
Copy input to a memory buffer.
Definition: filters.h:1200
BER General Decoder.
Definition: asn.h:380
BER Sequence Decoder.
Definition: asn.h:525
Interface for buffered transformations.
Definition: cryptlib.h:1652
virtual size_t Get(byte &outByte)
Retrieve a 8-bit byte.
size_t GetWord16(word16 &value, ByteOrder order=BIG_ENDIAN_ORDER)
Retrieve a 16-bit word.
virtual size_t Peek(byte &outByte) const
Peek a 8-bit byte.
size_t PutWord16(word16 value, ByteOrder order=BIG_ENDIAN_ORDER, bool blocking=true)
Input a 16-bit word for processing.
size_t Put(byte inByte, bool blocking=true)
Input a byte for processing.
Definition: cryptlib.h:1673
virtual lword Skip(lword skipMax=LWORD_MAX)
Discard skipMax bytes from the output buffer.
virtual lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Data structure used to store byte strings.
Definition: queue.h:23
size_t Get(byte &outByte)
Retrieve a 8-bit byte.
lword MaxRetrievable() const
Provides the number of bytes ready for retrieval.
Definition: queue.h:40
Used to pass byte array input as part of a NameValuePairs object.
Definition: algparam.h:25
const byte * begin() const
Pointer to the first byte in the memory block.
Definition: algparam.h:84
size_t size() const
Length of the memory block.
Definition: algparam.h:88
DER General Encoder.
Definition: asn.h:491
DER Sequence Encoder.
Definition: asn.h:557
Euclidean domain.
Definition: algebra.h:316
Exception thrown when division by 0 is encountered.
Definition: integer.h:56
Exception thrown when a random number cannot be found that satisfies the condition.
Definition: integer.h:64
Multiple precision integer with arithmetic operations.
Definition: integer.h:50
static void Divide(Integer &r, Integer &q, const Integer &a, const Integer &d)
Extended Division.
void DEREncode(BufferedTransformation &bt) const
Encode in DER format.
Integer & operator>>=(size_t n)
Right-shift Assignment.
Integer & operator&=(const Integer &t)
Bitwise AND Assignment.
bool GetBit(size_t i) const
Provides the i-th bit of the Integer.
void SetByte(size_t n, byte value)
Set the n-th byte to value.
static void DivideByPowerOf2(Integer &r, Integer &q, const Integer &a, unsigned int n)
Extended Division.
bool IsPositive() const
Determines if the Integer is positive.
Definition: integer.h:347
Integer Minus(const Integer &b) const
Subtraction.
signed long ConvertToLong() const
Convert the Integer to Long.
Integer operator-() const
Subtraction.
void SetBit(size_t n, bool value=1)
Set the n-th bit to value.
Integer & operator+=(const Integer &t)
Addition Assignment.
Integer And(const Integer &t) const
Bitwise AND.
bool IsSquare() const
Determine whether this integer is a perfect square.
Integer Plus(const Integer &b) const
Addition.
Integer DividedBy(const Integer &b) const
Division.
Integer & operator++()
Pre-increment.
void DEREncodeAsOctetString(BufferedTransformation &bt, size_t length) const
Encode absolute value as big-endian octet string.
void OpenPGPDecode(const byte *input, size_t inputLen)
Decode from OpenPGP format.
bool NotZero() const
Determines if the Integer is non-0.
Definition: integer.h:338
Integer Times(const Integer &b) const
Multiplication.
void BERDecodeAsOctetString(BufferedTransformation &bt, size_t length)
Decode nonnegative value from big-endian octet string.
Integer & operator--()
Pre-decrement.
byte GetByte(size_t i) const
Provides the i-th byte of the Integer.
static const Integer & Zero()
Integer representing 0.
void Randomize(RandomNumberGenerator &rng, size_t bitCount)
Set this Integer to random integer.
bool IsConvertableToLong() const
Determines if the Integer is convertable to Long.
Integer Or(const Integer &t) const
Bitwise OR.
lword GetBits(size_t i, size_t n) const
Provides the low order bits of the Integer.
static Integer Power2(size_t e)
Exponentiates to a power of 2.
Integer Squared() const
Multiply this integer by itself.
Definition: integer.h:633
Integer()
Creates the zero integer.
void BERDecode(const byte *input, size_t inputLen)
Decode from BER format.
size_t MinEncodedSize(Signedness sign=UNSIGNED) const
Minimum number of bytes to encode this integer.
unsigned int BitCount() const
Determines the number of bits required to represent the Integer.
Integer & operator|=(const Integer &t)
Bitwise OR Assignment.
void Negate()
Reverse the Sign of the Integer.
bool NotNegative() const
Determines if the Integer is non-negative.
Definition: integer.h:344
unsigned int WordCount() const
Determines the number of words required to represent the Integer.
Integer & operator^=(const Integer &t)
Bitwise XOR Assignment.
Integer & operator=(const Integer &t)
Assignment.
Integer & operator-=(const Integer &t)
Subtraction Assignment.
RandomNumberType
Properties of a random integer.
Definition: integer.h:91
@ ANY
a number with no special properties
Definition: integer.h:93
@ PRIME
a number which is probabilistically prime
Definition: integer.h:95
bool operator!() const
Negation.
Integer AbsoluteValue() const
Retrieve the absolute value of this integer.
@ SIGNED
a signed value
Definition: integer.h:87
@ UNSIGNED
an unsigned value
Definition: integer.h:85
Integer Xor(const Integer &t) const
Bitwise XOR.
int Compare(const Integer &a) const
Perform signed comparison.
Integer Modulo(const Integer &b) const
Remainder.
size_t OpenPGPEncode(byte *output, size_t bufferSize) const
Encode absolute value in OpenPGP format.
void swap(Integer &a)
Swaps this Integer with another Integer.
static const Integer & Two()
Integer representing 2.
bool IsZero() const
Determines if the Integer is 0.
Definition: integer.h:335
Integer MultiplicativeInverse() const
Calculate multiplicative inverse.
bool GenerateRandomNoThrow(RandomNumberGenerator &rng, const NameValuePairs &params=g_nullNameValuePairs)
Generate a random number.
bool IsNegative() const
Determines if the Integer is negative.
Definition: integer.h:341
void Decode(const byte *input, size_t inputLen, Signedness sign=UNSIGNED)
Decode from big-endian byte array.
Integer & operator<<=(size_t n)
Left-shift Assignment.
Sign
Used internally to represent the integer.
Definition: integer.h:73
@ NEGATIVE
the value is negative
Definition: integer.h:77
@ POSITIVE
the value is positive or 0
Definition: integer.h:75
unsigned int ByteCount() const
Determines the number of bytes required to represent the Integer.
bool IsUnit() const
Determine if 1 or -1.
bool IsOdd() const
Determines if the Integer is odd parity.
Definition: integer.h:356
static Integer Gcd(const Integer &a, const Integer &n)
Calculate greatest common divisor.
void Encode(byte *output, size_t outputLen, Signedness sign=UNSIGNED) const
Encode in big-endian format.
Integer InverseMod(const Integer &n) const
Calculate multiplicative inverse.
Integer SquareRoot() const
Extract square root.
static const Integer & One()
Integer representing 1.
bool IsEven() const
Determines if the Integer is even parity.
Definition: integer.h:353
An invalid argument was detected.
Definition: cryptlib.h:203
Ring of congruence classes modulo n.
Definition: modarith.h:44
Integer & Reduce(Integer &a, const Integer &b) const
TODO.
ModularArithmetic(const Integer &modulus=Integer::One())
Construct a ModularArithmetic.
Definition: modarith.h:54
const Integer & Half(const Integer &a) const
Divides an element by 2.
const Integer & Inverse(const Integer &a) const
Inverts the element in the ring.
void BERDecodeElement(BufferedTransformation &in, Element &a) const
Decodes element in DER format.
unsigned int MaxElementByteLength() const
Provides the maximum byte size of an element in the ring.
Definition: modarith.h:248
void DEREncodeElement(BufferedTransformation &out, const Element &a) const
Encodes element in DER format.
void SimultaneousExponentiate(Element *results, const Element &base, const Integer *exponents, unsigned int exponentsCount) const
Exponentiates a base to multiple exponents in the ring.
Integer CascadeExponentiate(const Integer &x, const Integer &e1, const Integer &y, const Integer &e2) const
TODO.
Integer & Accumulate(Integer &a, const Integer &b) const
TODO.
const Integer & Subtract(const Integer &a, const Integer &b) const
Subtracts elements in the ring.
const Integer & Add(const Integer &a, const Integer &b) const
Adds elements in the ring.
void DEREncode(BufferedTransformation &bt) const
Encodes in DER format.
Performs modular arithmetic in Montgomery representation for increased speed.
Definition: modarith.h:296
Integer ConvertOut(const Integer &a) const
Reduces an element in the congruence class.
const Integer & Square(const Integer &a) const
Square an element in the ring.
const Integer & Multiply(const Integer &a, const Integer &b) const
Multiplies elements in the ring.
MontgomeryRepresentation(const Integer &modulus)
Construct a MontgomeryRepresentation.
const Integer & MultiplicativeInverse(const Integer &a) const
Calculate the multiplicative inverse of an element in the ring.
Interface for retrieving values given their names.
Definition: cryptlib.h:322
T GetValueWithDefault(const char *name, T defaultValue) const
Get a named value.
Definition: cryptlib.h:392
bool GetValue(const char *name, T &value) const
Get a named value.
Definition: cryptlib.h:379
CRYPTOPP_DLL bool GetIntValue(const char *name, int &value) const
Get a named value with type int.
Definition: cryptlib.h:415
Object Identifier.
Definition: asn.h:265
static void DeriveKey(byte *output, size_t outputLength, const byte *input, size_t inputLength, const byte *derivationParams, size_t derivationParamsLength)
P1363 key derivation function.
Definition: pubkey.h:760
Application callback to signal suitability of a cabdidate prime.
Definition: nbtheory.h:114
Interface for random number generators.
Definition: cryptlib.h:1435
virtual void GenerateBlock(byte *output, size_t size)
Generate random array of bytes.
Secure memory block with allocator and cleanup.
Definition: secblock.h:731
iterator begin()
Provides an iterator pointing to the first element in the memory block.
Definition: secblock.h:836
void CleanNew(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:1143
void swap(SecBlock< T, A > &b)
Swap contents with another SecBlock.
Definition: secblock.h:1209
void CleanGrow(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1179
void New(size_type newSize)
Change size without preserving contents.
Definition: secblock.h:1126
size_type size() const
Provides the count of elements in the SecBlock.
Definition: secblock.h:867
void resize(size_type newSize)
Change size and preserve contents.
Definition: secblock.h:1198
SecBlock<byte> typedef.
Definition: secblock.h:1226
Restricts the instantiation of a class to one static object without locks.
Definition: misc.h:307
Square block cipher.
Definition: square.h:25
String-based implementation of Store interface.
Definition: filters.h:1259
Pointer that overloads operator ->
Definition: smartptr.h:38
Library configuration file.
#define CRYPTOPP_TABLE
Override for internal linkage.
Definition: config_dll.h:111
unsigned char byte
8-bit unsigned datatype
Definition: config_int.h:56
word64 word
Full word used for multiprecision integer arithmetic.
Definition: config_int.h:182
const unsigned int WORD_BITS
Size of a platform word in bits.
Definition: config_int.h:249
unsigned int word32
32-bit unsigned datatype
Definition: config_int.h:62
unsigned short word16
16-bit unsigned datatype
Definition: config_int.h:59
word128 dword
Double word used for multiprecision integer arithmetic.
Definition: config_int.h:193
unsigned long long word64
64-bit unsigned datatype
Definition: config_int.h:91
word32 hword
Half word used for multiprecision integer arithmetic.
Definition: config_int.h:174
const unsigned int WORD_SIZE
Size of a platform word in bytes.
Definition: config_int.h:245
word64 lword
Large word type.
Definition: config_int.h:158
Functions for CPU features and intrinsics.
ByteOrder
Provides the byte ordering.
Definition: cryptlib.h:143
@ LITTLE_ENDIAN_ORDER
byte order is little-endian
Definition: cryptlib.h:145
@ BIG_ENDIAN_ORDER
byte order is big-endian
Definition: cryptlib.h:147
Implementation of BufferedTransformation's attachment interface.
Multiple precision integer with arithmetic operations.
inline ::Integer operator-(const ::Integer &a, const ::Integer &b)
Subtraction.
Definition: integer.h:771
Utility functions for the Crypto++ library.
CRYPTOPP_DLL std::string IntToString< Integer >(Integer value, unsigned int base)
Converts an Integer to a string.
const T & STDMAX(const T &a, const T &b)
Replacement function for std::max.
Definition: misc.h:666
unsigned int BitPrecision(const T &value)
Returns the number of bits required for a value.
Definition: misc.h:842
unsigned int BytePrecision(const T &value)
Returns the number of 8-bit bytes or octets required for a value.
Definition: misc.h:819
size_t BitsToWords(size_t bitCount)
Returns the number of words required for the specified number of bits.
Definition: misc.h:958
unsigned int TrailingZeros(word32 v)
Determines the number of trailing 0-bits in a value.
Definition: misc.h:867
void ConditionalSwapPointers(bool c, T &a, T &b)
Performs a branch-less swap of pointers a and b if condition c is true.
Definition: misc.h:1358
T Crop(T value, size_t bits)
Truncates the value to the specified number of bits.
Definition: misc.h:926
bool IsPowerOf2(const T &value)
Tests whether a value is a power of 2.
Definition: misc.h:1010
const T & STDMIN(const T &a, const T &b)
Replacement function for std::min.
Definition: misc.h:655
size_t BitsToBytes(size_t bitCount)
Returns the number of 8-bit bytes or octets required for the specified number of bits.
Definition: misc.h:938
#define MEMORY_BARRIER
A memory barrier.
Definition: misc.h:270
size_t BytesToWords(size_t byteCount)
Returns the number of words required for the specified number of bytes.
Definition: misc.h:948
CRYPTOPP_DLL std::string IntToString< word64 >(word64 value, unsigned int base)
Converts an unsigned value to a string.
void PutWord(bool assumeAligned, ByteOrder order, byte *block, T value, const byte *xorBlock=NULL)
Access a block of memory.
Definition: misc.h:2739
T1 SaturatingSubtract1(const T1 &a, const T2 &b)
Performs a saturating subtract clamped at 1.
Definition: misc.h:1110
Class file for performing modular arithmetic.
Crypto++ library namespace.
const char * PointerToPrimeSelector()
const PrimeSelector *
Definition: argnames.h:42
const char * Seed()
ConstByteArrayParameter.
Definition: argnames.h:19
Classes and functions for number theoretic operations.
CRYPTOPP_DLL bool FirstPrime(Integer &p, const Integer &max, const Integer &equiv, const Integer &mod, const PrimeSelector *pSelector)
Finds a random prime of special form.
ASN.1 object identifiers for algorithms and schemes.
Precompiled header file.
This file contains helper classes/functions for implementing public key algorithms.
Classes and functions for secure memory allocations.
void swap(::SecBlock< T, A > &a, ::SecBlock< T, A > &b)
Swap two SecBlocks.
Definition: secblock.h:1289
Classes for SHA-1 and SHA-2 family of message digests.
Classes for automatic resource management.
Common C++ header files.
#define CRYPTOPP_ASSERT(exp)
Debugging and diagnostic assertion.
Definition: trap.h:68
Support functions for word operations.
void ShiftWordsRightByWords(word *r, size_t n, size_t shiftWords)
Right shift word array.
Definition: words.h:212
void XorWords(word *r, const word *a, const word *b, size_t n)
XOR word arrays.
Definition: words.h:66
void SetWords(word *r, word a, size_t n)
Set the value of words.
Definition: words.h:35
void OrWords(word *r, const word *a, const word *b, size_t n)
OR word arrays.
Definition: words.h:120
word ShiftWordsLeftByBits(word *r, size_t n, unsigned int shiftBits)
Left shift word array.
Definition: words.h:149
void ShiftWordsLeftByWords(word *r, size_t n, size_t shiftWords)
Left shift word array.
Definition: words.h:194
size_t CountWords(const word *x, size_t n)
Count the number of words.
Definition: words.h:21
word ShiftWordsRightByBits(word *r, size_t n, unsigned int shiftBits)
Right shift word array.
Definition: words.h:172
void CopyWords(word *r, const word *a, size_t n)
Copy word array.
Definition: words.h:48
void AndWords(word *r, const word *a, const word *b, size_t n)
AND word arrays.
Definition: words.h:93