Papers updated in last 31 days (296 results)
Invertible Bloom Lookup Tables with Less Memory and Randomness
In this work we study Invertible Bloom Lookup Tables (IBLTs) with small failure probabilities.
IBLTs are highly versatile data structures that have found applications in set reconciliation protocols, error-correcting codes, and even the design of advanced cryptographic primitives.
For storing $n$ elements and ensuring correctness with probability at least $1 - \delta$, existing IBLT constructions require $\Omega(n(\frac{\log(1/\delta)}{\log(n)}+1))$ space and they crucially rely on fully random hash functions.
We present new constructions of IBLTs that are simultaneously more space efficient and require less randomness.
For storing $n$ elements with a failure probability of at most $\delta$, our data structure only requires $\mathcal{O}\left(n + \log(1/\delta)\log\log(1/\delta)\right)$ space and $\mathcal{O}\left(\log(\log(n)/\delta)\right)$-wise independent hash functions.
As a key technical ingredient we show that hashing $n$ keys with any $k$-wise independent hash function $h:U \to [Cn]$ for some sufficiently large constant $C$ guarantees with probability $1 - 2^{-\Omega(k)}$ that at least $n/2$ keys will have a unique hash value.
Proving this is non-trivial as $k$ approaches $n$.
We believe that the techniques used to prove this statement may be of independent interest.
We apply our new IBLTs to the encrypted compression problem, recently studied by Fleischhacker, Larsen, Simkin (Eurocrypt 2023).
We extend their approach to work for a more general class of encryption schemes and using our new IBLT we achieve an asymptotically better compression rate.
Adaptively Secure Attribute-Based Encryption from Witness Encryption
Attribute-based encryption (ABE) enables fine-grained control over which ciphertexts various users can decrypt. A master authority can create secret keys $sk_f$ with different functions (circuits) $f$ for different users. Anybody can encrypt a message under some attribute $x$ so that only recipients with a key $sk_f$ for a function such that $f(x)=1$ will be able to decrypt. There are a number of different approaches toward achieving selectively secure ABE, where the adversary has to decide on the challenge attribute $x$ ahead of time before seeing any keys, including constructions via bilinear maps (for NC1 circuits), learning with errors, or witness encryption. However, when it comes adaptively secure ABE, the problem seems to be much more challenging and we only know of two potential approaches: via the ``dual systems'' methodology from bilinear maps, or via indistinguishability obfuscation. In this work, we give a new approach that constructs adaptively secure ABE from witness encryption (along with statistically sound NIZKs and one-way functions). While witness encryption is a strong assumption, it appears to be fundamentally weaker than indistinguishability obfuscation. Moreover, we have candidate constructions of witness encryption from some assumptions (e.g., evasive LWE) from which we do not know how to construct indistinguishability obfuscation, giving us adaptive ABE from these assumptions as a corollary of our work.
Compute, but Verify: Efficient Multiparty Computation over Authenticated Inputs
Traditional notions of secure multiparty computation (MPC) allow mutually distrusting parties to jointly compute a function over their private inputs, but typically do not specify how these inputs are chosen. Motivated by real-world applications where corrupt inputs could adversely impact privacy and operational legitimacy, we consider a notion of authenticated MPC where the inputs are authenticated, e.g., signed using a digital signature by some certification authority. We propose a generic and efficient compiler that transforms any linear secret sharing based honest-majority MPC protocol into one with input authentication.
Our compiler incurs significantly lower computational costs and competitive communication overheads when compared to the best existing solutions, while entirely avoiding the (potentially expensive) protocol-specific techniques and pre-processing requirements that are inherent to these solutions. For $n$-party honest majority MPC protocols with abort security where each party has $\ell$ inputs, our compiler incurs $O(n\log \ell)$ communication overall and a computational overhead of $O(\ell)$ group exponentiations per party (the corresponding overheads for the most efficient existing solution are $O(n^2)$ and $O(\ell n)$). Finally, for a corruption threshold $t<n/3$, our compiler preserves the stronger identifiable abort security of the underlying MPC protocol. No existing solution for authenticated MPC achieves this regardless of the corruption threshold.
Along the way, we make several technical contributions that are of independent interest. This includes the notion of distributed proofs of knowledge and concrete realizations of the same for several relations of interest, such as proving knowledge of many popularly used digital signature schemes, and proving knowledge of opening of a Pedersen commitment.
Don’t Use It Twice! Solving Relaxed Linear Code Equivalence Problems
The Linear Code Equivalence (LCE) Problem has received increased attention in recent years due to its applicability in constructing efficient digital signatures. Notably, the LESS signature scheme based on LCE is under consideration for the NIST post-quantum standardization process, along with the MEDS signature scheme that relies on an extension of LCE to the rank metric, namely the Matrix Code Equivalence (MCE) Problem. Building upon these developments, a family of signatures with additional properties, including linkable ring, group, and threshold signatures, has been proposed. These novel constructions introduce relaxed versions of LCE (and MCE), wherein multiple samples share the same secret equivalence. Despite their significance, these variations have often lacked a thorough security analysis, being assumed to be as challenging as their original counterparts. Addressing this gap, our work delves into the sample complexity of LCE and MCE --- precisely, the sufficient number of samples required for efficient recovery of the shared secret equivalence. Our findings reveal, for instance, that one should not use the same secret twice in the LCE setting since this enables a polynomial time (and memory) algorithm to retrieve the secret. Consequently, our results unveil the insecurity of two advanced signatures based on variants of the LCE Problem.
On Active Attack Detection in Messaging with Immediate Decryption
The widely used Signal protocol provides protection against state exposure attacks through forward security (protecting past messages) and post-compromise security (for restoring security). It supports immediate decryption, allowing messages to be re-ordered or dropped at the protocol level without affecting correctness. In this work, we consider strong active attack detection for secure messaging with immediate decryption, where parties are able to immediately detect active attacks under certain conditions. We first consider in-band active attack detection, where participants who have been actively compromised but are still able to send a single message to their partner can detect the compromise. We propose two complementary notions to capture security, and present a compiler that provides security with respect to both notions. Our notions generalise existing work (RECOVER security) which only supported in-order messaging. We also study the related out-of-band attack detection problem by considering communication over out-of-band, authenticated channels and propose analogous security notions. We prove that one of our two notions in each setting imposes a linear communication overhead in the number of sent messages and security parameter using an information-theoretic argument. This implies that each message must information-theoretically contain all previous messages and that our construction, that essentially attaches the entire message history to every new message, is asymptotically optimal. We then explore ways to bypass this lower bound and highlight the feasibility of practical active attack detection compatible with immediate decryption.
Keeping Up with the KEMs: Stronger Security Notions for KEMs and automated analysis of KEM-based protocols
Key Encapsulation Mechanisms (KEMs) are a critical building block for hybrid encryption and modern security protocols, notably in the post-quantum setting. Given the asymmetric public key of a recipient, the primitive establishes a shared secret key between sender and recipient. In recent years, a large number of abstract designs and concrete implementations of KEMs have been proposed, e.g., in the context of the NIST process for post-quantum primitives.
In this work, we (i) establish stronger security notions for KEMs, and (ii) develop a symbolic analysis method to analyze security protocols that use KEMs. First, we generalize existing security notions for KEMs in the computational setting, introduce several stronger security notions, and prove their relations. Our new properties formalize in which sense outputs of the KEM uniquely determine, i.e., bind, other values. Our new binding properties can be used, e.g., to prove the absence of attacks that were not captured by prior security notions, such as re-encapsulation attacks.
Second, we develop a family of fine-grained symbolic models that correspond to our hierarchy of computational security notions, and are suitable for the automated analysis of KEM-based security protocols. We encode our models as a library in the framework of the Tamarin prover. Given a KEM-based protocol, our approach can automatically derive the minimal binding properties required from the KEM; or, if also given a concrete KEM, can analyze if the protocols meets its security goals. In case studies, Tamarin automatically discovers, e.g., that the key exchange protocol proposed in the original Kyber paper requires stronger properties from the KEM than were proven in the paper.
Lifting approach against the SNOVA scheme
In 2022, Wang et al. proposed the multivariate signature scheme SNOVA as a UOV variant over the non-commutative ring of $\ell \times \ell $ matrices over $\mathbb{F}_q$.
This scheme has small public key and signature size and is a first round candidate of NIST PQC additional digital signature project.
Recently, Ikematsu and Akiyama, and Li and Ding show that the core matrices of SNOVA with $v$ vinegar-variables and $o$ oil-variables are regarded as the representation matrices of UOV with $\ell v$ vinegar-variables and $\ell o$ oil-variables over $\mathbb{F}_q$, and thus we can apply existing key recovery attacks as a plain UOV.
In this paper, we propose a method that reduces SNOVA to smaller UOV with $v$ vinegar-variables and $o$ oil-variables over $\mathbb{F}_{q^\ell }$. As a result, we show that the previous first round parameter sets at $\ell = 2$ do not meet the NIST PQC security levels. We also confirm that the present parameter sets are secure from existing key recovery attacks with our approach.
Impossibility Results for Post-Compromise Security in Real-World Communication Systems
Modern secure communication systems, such as iMessage, WhatsApp, and Signal include intricate mechanisms that aim to achieve very strong security properties. These mechanisms typically involve continuously merging in new fresh secrets into the keying material, which is used to encrypt messages during communications. In the literature, these mechanisms have been proven to achieve forms of Post Compromise Security (PCS): the ability to provide communication security even if the full state of a party was compromised some time in the past. However, recent work has shown these proofs do not transfer to the end-user level, possibly because of usability concerns. This has raised the question of whether end-users can actually obtain PCS or not, and under which conditions.
Here we show and formally prove that communication systems that need to be resilient against certain types of state loss (which can occur in practice) fundamentally cannot achieve full PCS for end-users. Whereas previous work showed that the Signal messenger did not achieve this with its current session-management layer, we isolate the exact conditions that cause this failure, and why this cannot be simply solved in communication systems by implementing a different session-management layer or an entirely different protocol. Moreover, we clarify the trade-off of the maximum number of sessions between two users (40 in Signal) in terms of failure-resilience versus security.
Our results have direct consequences for the design of future secure communication systems, and could motivate either the simplification of redundant mechanisms, or the improvement of session-management designs to provide better security trade-offs with respect to state loss/failure tolerance.
Opening the Blackbox: Collision Attacks on Round-Reduced Tip5, Tip4, Tip4' and Monolith
A new design strategy for ZK-friendly hash functions has emerged since the proposal of $\mathsf{Reinforced Concrete}$ at CCS 2022, which is based on the hybrid use of two types of nonlinear transforms: the composition of some small-scale lookup tables (e.g., 7-bit or 8-bit permutations) and simple power maps over $\mathbb{F}_p$. Following such a design strategy, some new ZK-friendly hash functions have been recently proposed, e.g., $\mathsf{Tip5}$, $\mathsf{Tip4}$, $\mathsf{Tip4}'$ and the $\mathsf{Monolith}$ family. All these hash functions have a small number of rounds, i.e., $5$ rounds for $\mathsf{Tip5}$, $\mathsf{Tip4}$, and $\mathsf{Tip4}'$, and $6$ rounds for $\mathsf{Monolith}$ (recently published at ToSC 2024/3). Using the composition of some small-scale lookup tables to build a large-scale permutation over $\mathbb{F}_p$ - which we call S-box - is a main feature in such designs, which can somehow enhance the resistance against the Gröbner basis attack because this large-scale permutation will correspond to a complex and high-degree polynomial representation over $\mathbb{F}_p$.
As the first technical contribution, we propose a novel and efficient algorithm to study the differential property of this S-box and to find a conforming input pair for a randomly given input and output difference. For comparison, a trivial method based on the use of the differential distribution table (DDT) for solving this problem will require time complexity $\mathcal{O}(p^2)$.
For the second contribution, we also propose new frameworks to devise efficient collision attacks on such hash functions. Based on the differential properties of these S-boxes and the new attack frameworks, we propose the first collision attacks on $3$-round $\mathsf{Tip5}$, $\mathsf{Tip4}$, and $\mathsf{Tip4}'$, as well as $2$-round $\mathsf{Monolith}$-$31$ and $\mathsf{Monolith}$-$64$, where the $2$-round attacks on $\mathsf{Monolith}$ are practical. In the semi-free-start (SFS) collision attack setting, we achieve practical SFS collision attacks on $3$-round $\mathsf{Tip5}$, $\mathsf{Tip4}$, and $\mathsf{Tip4}'$. Moreover, the SFS collision attacks can reach up to $4$-round $\mathsf{Tip4}$ and $3$-round $\mathsf{Monolith}$-$64$. As far as we know, this is the first third-party cryptanalysis of these hash functions, which improves the initial analysis given by the designers.
Another Walk for Monchi
Monchi is a new protocol aimed at privacy-preserving biometric identification. It begins with scores computation in the encrypted domain thanks to homomorphic encryption and ends with comparisons of these scores to a given threshold with function secret sharing. We here study the integration in that context of scores computation techniques recently introduced by Bassit et al. that eliminate homomorphic multiplications by replacing them by lookup tables. First, we extend this lookup tables biometric recognition solution by adding the use of function secret sharing for the final comparison of scores. Then, we introduce a two-party computation of the scores with lookup tables which fits nicely together with the function secret sharing scores comparison. Our solutions accommodate well with the flight boarding use case introduced by Monchi.
Symmetric Twin Column Parity Mixers and their Applications
The circulant twin column parity mixer (TCPM) is a type of mixing layer for the round function of cryptographic permutations designed by Hirch et al. at CRYPTO 2023. It has a bitwise differential branch number of 12 and a bitwise linear branch number of 4, which makes it competitive in applications where differential security is required. Hirch et al. gave a concrete instantiation of a permutation using such a mixing layer, named Gaston, and showed the best 3-round differential and linear trails of Gaston have much higher weights than those of ASCON. In this paper, we first prove why the TCPM has linear branch number 4 and then show that Gaston's linear behavior is worse than ASCON for more than 3 rounds. Motivated by these facts, we aim to enhance the linear security of the TCPM. We show that adding a specific set of row cyclic shifts to the TCPM can make its differential and linear branch numbers both 12. Notably, by setting a special relationship between the row shift parameters of the modified TCPM, we obtain a special kind of mixlayer called the symmetric circulant twin column parity mixer. The symmetric TCPM has a unique design property that its differential and linear branch histograms are the same, which makes the parameter selection process and the security analysis convenient. Using the symmetric TCPM, we present two new 320-bit cryptographic permutations, namely (1) Gaston-S where we replace the mixing layer in Gaston with the symmetric TCPM and (2) SBD which uses a low-latency degree-4 S-box as the non-linear layer and the symmetric TCPM as the mixing layer. We evaluate the security of these permutations considering differential, linear and algebraic analysis, and then provide the performance comparison with Gaston in both hardware and software. Our results indicate that Gaston-S and SBD are competitive with Gaston in both security and performance.
Scutum: Temporal Verification for Cross-Rollup Bridges via Goal-Driven Reduction
Scalability remains a key challenge for blockchain adoption. Rollups—especially zero-knowledge (ZK) and optimistic rollups—address this by processing transactions off-chain while maintaining Ethereum’s security, thus reducing gas fees and improving speeds. Cross-rollup bridges like Orbiter Finance enable seamless asset transfers across various Layer 2 (L2) rollups and between L2 and Layer 1 (L1) chains. However, the increasing reliance on these bridges raises significant security concerns, as evidenced by major hacks like those of Poly Network and Nomad Bridge, resulting in losses of hundreds of millions of dollars. Traditional security analysis methods such as static analysis and fuzzing are inadequate for cross-rollup bridges due to their complex designs involving multiple entities, smart contracts, and zero-knowledge circuits. These systems require reasoning about temporal sequences of events across different entities, which exceeds the capabilities of conventional analyzers.
In this paper, we introduce a scalable verifier to systematically assess the security of cross-rollup bridges. Our approach features a comprehensive multi-model framework that captures both individual behaviors and complex interactions using temporal properties. To enhance scalability, we approximate temporal safety verification through reachability analysis of a graph representation of the contracts, leveraging advanced program analysis techniques. Additionally, we incorporate a conflict-driven refinement loop to eliminate false positives and improve precision. Our evaluation on mainstream cross-rollup bridges, including Orbiter Finance, uncovered multiple zero-day vulnerabilities, demonstrating the practical utility of our method. The tool also exhibited favorable runtime performance, enabling efficient analysis suitable for real-time or near-real-time applications.
IO-Optimized Design-Time Configurable Negacyclic Seven-Step NTT Architecture for FHE Applications
Uncategorized
Uncategorized
FHE enables computations on encrypted data, making it essential for privacy-preserving applications. However, it involves computationally demanding tasks, such as polynomial multiplication, while NTT is the state-of-the-art solution to perform this task. Most FHE schemes operate over the negacyclic ring of polynomials. We introduce a novel formulation of the hierarchical Four-Step NTT approach for the negacyclic ring, eliminating the need for pre- and post-processing steps found in the existing methods. To accelerate NTT operations, the FPGAs offer flexible and powerful computing platforms. We propose an FPGA-based, parametric and fully pipelined architecture that implements the improved Seven-Step NTT algorithm (which builds upon the four-step). Our design supports a wide range of parameters, including ring sizes up to $2^{16}$ and modulus sizes up to $64$-bit. We focus on achieving configurable throughput, as constrained by the bandwidth of HBM bandwidth, and aim to maximize throughput through an IO parametric design on the Alveo U280 FPGA. The implementation results demonstrate a reduction in the area-time-product by $2.08\times$ and a speed-up of $10.32\times$ for a ring size of $2^{16}$ and a 32-bit width compared to the current state-of-the-art designs.
Relaxed Functional Bootstrapping: A New Perspective on BGV and BFV Bootstrapping
BGV and BFV are among the most widely used fully homomorphic encryption (FHE) schemes, supporting evaluations over a finite field. To evaluate a circuit with arbitrary depth, bootstrapping is needed. However, despite the recent progress, bootstrapping of BGV/BFV still remains relatively impractical, compared to other FHE schemes.
In this work, we inspect the BGV/BFV bootstrapping procedure from a different angle. We provide a generalized bootstrapping definition that relaxes the correctness requirement of regular bootstrapping, allowing constructions that support only certain kinds of circuits with arbitrary depth. In addition, our definition captures a form of functional bootstrapping. In other words, the output encrypts a function evaluation of the input instead of the input itself.
Under this new definition, we provide a bootstrapping procedure supporting different types of functions. Our construction is 1-2 orders of magnitude faster than the state-of-the-art BGV/BFV bootstrapping algorithms, depending on the evaluated function.
Of independent interest, we show that our technique can be used to improve the batched FHEW/TFHE bootstrapping construction introduced by Liu and Wang (Asiacrypt 2023). Our optimization provides a speed-up of 6x in latency and 3x in throughput for batched binary gate bootstrapping and a plaintext-space-dependent speed-up for batched functional bootstrapping with plaintext space smaller than $\mathbb{Z}_{512}$.
Key Exchange in the Post-Snowden Era: Universally Composable Subversion-Resilient PAKE
Password-Authenticated Key Exchange (PAKE) allows two parties to establish a common high-entropy secret from a possibly low-entropy pre-shared secret such as a password. In this work, we provide the first PAKE protocol with subversion resilience in the framework of universal composability (UC), where the latter roughly means that UC security still holds even if one of the two parties is malicious and the honest party's code has been subverted (in an undetectable manner).
We achieve this result by sanitizing the PAKE protocol from oblivious transfer (OT) due to Canetti et al. (PKC'12) via cryptographic reverse firewalls in the UC framework (Chakraborty et al., EUROCRYPT'22). This requires new techniques, which help us uncover new cryptographic primitives with sanitation-friendly properties along the way (such as OT, dual-mode cryptosystems, and signature schemes).
As an additional contribution, we delve deeper in the backbone of communication required in the subversion-resilient UC framework, extending it to the unauthenticated setting, in line with the work of Barak et al. (CRYPTO'05).
Design and Implementation of a Fast, Platform-Adaptive, AIS-20/31 Compliant PLL-Based True Random Number Generator on a Zynq 7020 SoC FPGA
Phase-locked loops (PLLs) integrated within field-programmable gate arrays (FPGAs) or System-on-Chip FPGAs (SoCs) represent a promising approach for generating random numbers. Their widespread deployment, isolated functionality within these devices, and robust entropy, as demonstrated in prior studies, position PLL-based true random number generators (PLL-TRNGs) as highly viable solutions for this purpose. This study explicitly examines PLL-TRNG implementations using the ZC702 Rev1.1 Evaluation Board featuring the Zynq 7020 SoC from Xilinx, utilizing a configuration involving three such boards for experimental validation. Parameters governing the PLL-TRNG are optimized using a backtracking algorithm. Additionally, a novel methodology is proposed to enhance the rate of random data bit generation while preserving entropy characteristics. Performance metrics are rigorously evaluated against the criteria set by the German Federal Office for Information Security (BSI) AIS-20/31 Tests, accompanied by detailed descriptions of the implementation process. Furthermore, the suitability of our PLL-TRNG designs, attributed to their low resource utilization, is demonstrated.
Dense and smooth lattices in any genus
The Lattice Isomorphism Problem (LIP) was recently introduced as a new hardness assumption for post-quantum cryptography.
The strongest known efficiently computable invariant for LIP is the genus of a lattice.
To instantiate LIP-based schemes one often requires the existence of a lattice that (1) lies in some fixed genus, and (2) has some good geometric properties such as a high packing density or small smoothness parameter.
In this work we show that such lattices exist. In particular, building upon classical results by Siegel (1935), we show that essentially any genus contains a lattice with a close to optimal packing density, smoothing parameter and covering radius.
We present both how to efficiently compute concrete existence bounds for any genus, and asymptotically tight bounds under weak conditions on the genus.
Deletions and Dishonesty: Probabilistic Data Structures in Adversarial Settings
Probabilistic data structures (PDS) are compact representations of high-volume data that provide approximate answers to queries about the data. They are commonplace in today's computing systems, finding use in databases, networking and more. While PDS are designed to perform well under benign inputs, they are frequently used in applications where inputs may be adversarially chosen. This may lead to a violation of their expected behaviour, for example an increase in false positive rate.
In this work, we focus on PDS that handle approximate membership queries (AMQ). We consider adversarial users with the capability of making adaptive insertions, deletions and membership queries to AMQ-PDS, and analyse the performance of AMQ-PDS under such adversarial inputs.
We argue that deletions significantly empower adversaries, presenting a challenge to enforcing honest behaviour when compared to insertion-only AMQ-PDS. To address this, we introduce a new concept of an honest setting for AMQ-PDS with deletions. By leveraging simulation-based security definitions, we then quantify how much harm can be caused by adversarial users to the functionality of AMQ-PDS. Our resulting bounds only require calculating the maximal false positive probability and insertion failure probability achievable in our novel honest setting.
We apply our results to Cuckoo filters and Counting filters. We show how to protect these AMQ-PDS at low cost, by replacing or composing the hash functions with keyed pseudorandom functions in their construction. This strategy involves establishing practical bounds for the probabilities mentioned above. Using our new techniques, we demonstrate that achieving security against adversarial users making both insertions and deletions remains practical.
High Speed High Assurance implementations of Multivariate Quadratic based Signatures
In this poster, we present a Jasmin implementation of Mayo2, a multivariate quadratic(MQ) based signature scheme. Mayo overcomes the disadvantage of the Unbalanced oil and vinegar(UOV) scheme by whipping the UOV map to produce public keys of sizes comparable to ML-DSA. Our Jasmin implementation of Mayo2 takes 930 μs for keygen, 3206 μs for sign, 480 μs for verify based on the average of 1,00,000 runs of the implementation on a 2.25GHz x86 64 processor with 256 GB RAM. To this end, we have a multivariate quadratic based signature implementation that is amenable for verification of constant-time, correctness, proof of equivalence properties using Easycrypt. Subsequently, the results of this endeavor can be extended for other MQ based schemes including UOV.
Stealth Software Trojan: Amplifying Hidden RF Side-Channels with Ultra High SNR and Data-Rate
Interconnected devices enhance daily life but introduce security
vulnerabilities, new technologies enable malicious activities
such as information theft. This article combines radio frequency (RF) side-channel attacks with software Trojans to create a hard-to-detect, stealthy method for extracting kilobytes of secret information per millisecond over record distances with a single measurement in the RF spectrum. The technique exploits Trojan-induced electrical disturbances in RF components originating from peripherals, buses, memories and CPUs to achieve high SNR data leakage schemes. Experimental results show negligible acquisition time and stealth. The research introduces optimized modulation, demodulation schemes, and specialized synchronization symbols to minimize error rates and maximize data rates. It highlights the need for advanced detection and defense mechanisms to ensure the security and privacy of interconnected devices.
NewtonPIR: Communication Efficient Single-Server PIR
Private information retrieval (PIR) is a key component of many privacy-preserving systems. Although numerous PIR protocols have been proposed, designing a PIR scheme with communication overhead independent of the database size $N$ and computational cost practical for real-world applications remains a challenge. In this paper, we propose the NewtonPIR protocol, a communication efficient single-server PIR scheme. NewtonPIR can directly generate query values for the entire index without splitting the index and sending multiple query ciphertexts. Specifically, NewtonPIR achieves communication overhead that is 7.5$\times$ better than the state-of-the-art PIR protocol and 35.9$\sim$75$\times$ better than the other protocols. In experiments, when the database size and entry size increase, the communication overhead of NewtonPIR remains stable. By utilizing the single-ciphertext fully homomorphic encryption (FHE) scheme and the simple Newton interpolation polynomial, along with precomputing coefficients in the offline phase, we reduce the computational overhead of NewtonPIR from hours in previous schemes to seconds. To the best of our knowledge, NewtonPIR is the first protocol to achieve communication cost independent of $N$ along with computational overhead comparable to ring learning with errors (RLWE)-based PIR schemes. Additionally, we extend and introduce a private set intersection (PSI) protocol that balances computational and communication overhead more effectively.
Inject Less, Recover More: Unlocking the Potential of Document Recovery in Injection Attacks Against SSE
Searchable symmetric encryption has been vulnerable to inference attacks that rely on uniqueness in leakage patterns. However, many keywords in datasets lack distinctive leakage patterns, limiting the effectiveness of such attacks. The file injection attacks, initially proposed by Cash et al. (CCS 2015), have shown impressive performance with 100% accuracy and no prior knowledge requirement. Nevertheless, this attack fails to recover queries with underlying keywords not present in the injected files. To address these limitations, our research introduces a novel attack strategy called LEAP-Hierarchical Fusion Attack (LHFA) that combines the strengths of both file injection attacks and inference attacks. Before initiating keyword injection, we introduce a new approach for inert/active keyword selection. In the phase of selecting injected keywords, we focus on keywords without unique leakage patterns and recover them, leveraging their presence for document recovery. Our goal is to achieve an amplified effect in query recovery. We demonstrate a minimum query recovery rate of 1.3 queries per injected keyword with a 10% data leakage of a real-life dataset, and initiate further research to overcome challenges associated with non-distinctive keywords.
Similar Data is Powerful: Enhancing Inference Attacks on SSE with Volume Leakages
Searchable symmetric encryption (SSE) schemes provide users with the ability to perform keyword searches on encrypted databases without the need for decryption. While this functionality is advantageous, it introduces the potential for inadvertent information disclosure, thereby exposing SSE systems to various types of attacks. In this work, we introduce a new inference attack aimed at enhancing the query recovery accuracy of RefScore (presented at USENIX 2021). The proposed approach capitalizes on both similar data knowledge and an additional volume leakage as auxiliary information, facilitating the extraction of keyword matches from leaked data. Empirical evaluations conducted on multiple real-world datasets demonstrate a notable enhancement in query recovery accuracy, up to 19.5%. We also analyze the performance of the proposed attack in the presence of diverse countermeasures.
Generalized Impossible Differential Attacks on Block Ciphers: Application to SKINNY and ForkSKINNY
Impossible differential cryptanalysis is a crucial cryptanalytical method for symmetric ciphers. Given an impossible differential, the key recovery attack typically proceeds in two steps: generating pairs of data and then identifying wrong keys using the guess-and-filtering method. At CRYPTO 2023, Boura \etal first proposed a new key recovery technique - the differential meet-in-the-middle attack, which recovers the key in a meet-in-the-middle manner. Inspired by this technique, we incorporate the meet-in-the-middle technique into impossible cryptanalysis and propose a generic impossible differential meet-in-the-middle attack (\idma) framework. We apply \idma to block ciphers \skinny, \skinnye-v2, and \forkskinny and achieve remarkably efficient attacks. We improve the impossible differential attack on \skinny-$n$-$3n$ by 2 rounds in the single-tweakey setting and 1 round in the related-tweakey setting. For \skinnye-v2, the impossible differential attacks now can cover 2 more rounds in the related-tweakey setting and the first 23/24/25-round attacks in the single-tweakey model are given. For \forkskinny-$n$-$3n$, we improve the attacks by 2 rounds in the limited setting specified by the designers and 1 round in relaxed settings.
These results confirm that the meet-in-the-middle technique can result in more efficient key recovery, reaching beyond what traditional methods can achieve on certain ciphers.
Secure Encryption and Key Exchange using Arbiter PUF
Uncategorized
Uncategorized
This paper introduces a novel approach to enhancing cryp-
tographic security. It proposes the use of one-time message sharing com-
bined with Physically Unclonable Functions (PUF) to securely exchange
keys and generate an S-subbyte-box for encryption. This innovative tech-
nique aims to elevate the security standards of cryptographic applica-
tions.
Towards Optimal Garbled Circuits in the Standard Model
State-of-the-art garbling schemes for boolean circuits roughly consist of two families, i.e., ideal model garbling that combines linear operations and ideal blockciphers (aiming at maximizing performance), and PRF-based garbling that insists on using theoretically sound assumptions. In the linear garbling framework introduced by Zahur, Rosulek, and Evans (Eurocrypt 2015), it was established that garbling an AND gate requires at least $2(\kappa +1)$ bits of ciphertext, with $\kappa$ as the security parameter. Recent contributions from Lei Fan et al. and Chunghun Baek et al. have provided a detailed model showing that, under the free-XOR setting, which relies on a non-standard assumption, garbling an AND gate requires at least $1.5\kappa + O(1)$ bits. In contrast, regarding PRF-based garbling, the general model and efficiency bounds remain open questions.
In this paper, we present a comprehensive model for PRF-based garbled circuits and establish both the communication and computation lower bound. Specifically, we demonstrate that garbling an AND gate requires at least $2\kappa + 2$ bits communication and 6 PRF calls, while an XOR gate requires a minimum of $\kappa$ bits communication and 4 PRF calls. Notably, the state-of-the-art garbling scheme (GLNP scheme) under the PRF assumption, introduced by Shay, Yehuda, Ariel, and Benny (JOC 2018), uses $2\kappa + 4$ bits and 8 PRF calls for an AND gate, which exceeds our established lower bound. We finally introduce an optimal garbling scheme showing that our communication and computation lower bounds are tight.
On Efficient Computations of Koblitz Curves over Prime Fields
The family of Koblitz curves $E_b: y^2=x^3+b/\mathbb{F}_p$ over primes fields has close connections to the ring $\mathbb{Z}[\omega]$ of Eisenstein integers. Utilizing nice facts from the theory of cubic residues, this paper derives an efficient formula for a (complex) scalar multiplication by $\tau=1-\omega$. This enables us to develop a window $\tau$-NAF method for Koblitz curves over prime fields. This probably is the first window $\tau$-NAF method to be designed for curves over fields with large characteristic. Besides its theoretical interest, a higher performance is also achieved due to the facts that (1) the operation $\tau^2$ can be done more efficiently that makes the average cost of $\tau$ to be close to $2.5\mathbf{S}+3\mathbf{M}$ ( $\mathbf{S}$ and $\mathbf{M}$ denote the costs for field squaring and multiplication, respectively); (2) the pre-computation for the window $\tau$-NAF method is surprisingly simple in that only one-third of the coefficients need to be processed. The overall improvement over the best current method is more than $11\%$. The paper also suggests a simplified modular reduction for Eisenstein integers where the division operations are eliminated. The efficient formula of $\tau P$ can be further used to speed up the computation of $3P$, compared to $10\mathbf{S}+5\mathbf{M}$ , our new formula just costs $4\mathbf{S}+6\mathbf{M}$. As a main ingredient for double base chain method for scalar multiplication, the $3P$ formula will contribute to a greater efficiency.
Exploring the Six Worlds of Gröbner Basis Cryptanalysis: Application to Anemoi
Gröbner basis cryptanalysis of hash functions and ciphers, and their underlying permutations, has seen renewed interest recently. Anemoi (Crypto'23) is a permutation-based hash function that is efficient for a variety of arithmetizations used in zero-knowledge proofs. In this paper, exploring both theoretical bounds as well as experimental validation, we present new complexity estimates for Gröbner basis attacks on the Anemoi permutation over prime fields.
We cast our findings in what we call the six worlds of Gröbner basis cryptanalysis. As an example, keeping the same security arguments of the design, we conclude that at least 41 instead of 37 rounds would need to be used for 256-bit security, whereby our suggestion does not yet include a security margin.
Another Lattice Attack Against an RSA-like Cryptosystem
Let $N=pq$ be the product of two balanced prime numbers $p$ and $q$. In 2015, Roman'kov introduced an interesting RSA-like cryptosystem that, unlike the classical RSA key equation $ed - k (p-1)(q-1) = 1$, uses the key equation $ed - k r = 1$, where $r | p-1$ and is a large prime number. In this paper, we study if small private key attacks based on lattices can be applied to Roman'kov's cryptosystem. More precisely, we argue that such attacks do not appear to be applicable to this scheme without substantial adaptations.
Revisiting Leakage-Resilient MACs and Succinctly-Committing AEAD: More Applications of Pseudo-Random Injections
Pseudo-Random Injections (PRIs) have been used in several applications in symmetric-key cryptography, such as in the idealization of Authenticated Encryption with Associated Data (AEAD) schemes, building robust AEAD, and, recently, in converting a committing AEAD scheme into a succinctly committing AEAD scheme. In Crypto 2024, Bellare and Hoang showed that if an AEAD scheme is already committing, it can be transformed into a succinctly committing scheme by encrypting part of the plaintext using a PRI. In this paper, we revisit the applications of PRIs in building Message Authentication Codes (MACs) and AEAD schemes. First, we look at some of the properties and definitions of PRIs, such as collision resistance and unforgeability when used as a MAC with small plaintext space, under different leakage models. Next, we show how they can be combined with collision-resistant hash functions to build a MAC for long plaintexts, offering flexible security depending on how the PRI and equality check are implemented. If both the PRI and equality check are leak-free, the MAC provides almost optimal security, but the security only degrades a little if the equality check is only leakage-resilient (rather than leak-free). If the equality check has unbounded leakage, the security drops to a baseline security, rather than being completely insecure. Next, we show how to use PRIs to build a succinctly committing online AEAD scheme dubbed as scoAEfrom scratch that achieves succinct CMT-4 security, privacy, and Ciphertext Integrity with Misuse and Leakage (CIML) security. Last but not least, we show how to build a succinct nonce Misuse-Resistant (MRAE) AEAD scheme, dubbed as scMRAE. The construction combines the SIV paradigm with PRI-based encryption (e.g. the Encode-then-Encipher (EtE) framework).
OPL4GPT: An Application Space Exploration of Optimal Programming Language for Hardware Design by LLM
Despite the emergence of Large Language Models (LLMs) as potential tools for automating hardware design, the optimal programming language to describe hardware functions remains unknown. Prior works extensively explored optimizing Verilog-based HDL design, which often overlooked the potential capabilities of alternative programming languages for hardware designs. This paper investigates the efficacy of C++ and Verilog as input languages in extensive application space exploration, tasking an LLM to generate implementations for various System-on-chip functional blocks. We proposed an automated Optimal Programming Language (OPL) framework that leverages OpenAI's GPT-4o LLM to translate natural language specifications into hardware descriptions using both high-level and low-level programming paradigms. The OPL4GPT demonstration initially employs a novel prompt engineering approach that decomposes design specifications into manageable submodules, presented to the LLM to generate code in both C++ and Verilog. A closed-loop feedback mechanism automatically incorporates error logs from the LLM's outputs, encompassing both syntax and functionality. Finally, functionally correct outputs are synthesized using either RTL (Register-Transfer Level) for Verilog or High-Level Synthesis for C++ to assess area, power, and performance. Our findings illuminate the strengths and weaknesses of each language across various application domains, empowering hardware designers to select the most effective approach.
Light Clients for Lazy Blockchains
Lazy blockchains decouple consensus from transaction verification and execution to increase throughput. Although they can contain invalid transactions (e.g., double spends) as a result, these can easily be filtered out by full nodes that check if there have been previous conflicting transactions. However, creating light (SPV) clients that do not see the whole transaction history becomes a challenge: A record of a transaction on the chain does not necessarily entail transaction confirmation. In this paper, we devise a protocol that enables the creation of efficient light clients for lazy blockchains. The number of interaction rounds and the communication complexity of our protocol are logarithmic in the blockchain execution time. Our construction is based on a bisection game that traverses the Merkle tree containing the ledger of all - valid or invalid - transactions. We prove that our proof system is succinct, complete and sound, and empirically demonstrate the feasibility of our scheme.
An Open Source Ecosystem for Implementation Security Testing
Implementation-security vulnerabilities such as the
power-based side-channel leakage and fault-injection sensitivity
of a secure chip are hard to verify because of the sophistication
of the measurement setup, as well as the need to generalize the
adversary into a test procedure. While the literature has proposed
a wide range of vulnerability metrics to test the correctness of a
secure implementation, it is still up to the subject-matter expert to
map these concepts into a working and reliable test procedure.
Recently, we investigated the benefits of using an open-source
implementation security testing environment called Chipwhisperer.
The open-source and low-cost nature of the Chipwhisperer
hardware and software has resulted in the adoption of thousands
of testing kits throughout academia and industry, turning the
testkit into a baseline for implementation security testing. We
investigate the use cases for the Chipwhisperer hardware and
software, and we evaluate the feasibility of an open-source
ecosystem for implementation security testing. In addition to the
open-source hardware and firmware, an ecosystem also considers
broader community benefits such as re-usability, sustainability,
and governance.
Discrete Gaussians Modulo Sub-Lattices: New Leftover Hash Lemmas for Discrete Gaussians
The Leftover Hash Lemma (LHL) is a powerful tool for extracting randomness from an entropic distribution, with numerous applications in cryptography. LHLs for discrete Gaussians have been explored in both integer settings by Gentry et al. (GPV, STOC'08) and algebraic ring settings by Lyubashevsky et al. (LPR, Eurocrypt'13). However, the existing LHLs for discrete Gaussians have two main limitations: they require the Gaussian parameter to be larger than certain smoothing parameters, and they cannot handle cases where fixed and arbitrary information is leaked.
In this work, we present new LHLs for discrete Gaussians in both integer and ring settings. Our results show that the Gaussian parameter can be improved by a factor of $\omega(\sqrt{\log\lambda})$ and $O(\sqrt{N})$ compared to the regularity lemmas of GPV and LPR, respectively, under similar parameter choices such as the dimension and ring. Furthermore, our new LHLs can be applied to leaked discrete Gaussians, and the result can be used to establish asymptotic hardness of the extended MLWE assumptions, addressing an open question in recent works by Lyubashevsky et al. (LNP, Crypto'22). Our central techniques involve new fine-grained analyses of the min-entropy in discrete Gaussians modulo sublattices and should be of interest.
Trustworthy Approaches to RSA: Efficient Exploitation Strategies Based on Common Modulus
With the increasing integration of crowd computing, new vulnerabilities emerge in widely used cryptographic systems like the RSA cryptosystem, whose security is based on the factoring problem. It is strongly advised to avoid using the same modulus to produce two pairs of public-private keys, as the cryptosystem would be rendered vulnerable to common modulus attacks. Such attacks can take two forms: one that aims to factorize the common modulus based on one key pair and the other that aims to decrypt certain ciphertexts generated by two public keys if the keys are co-prime. This paper introduces a new type of common modulus attack on the RSA cryptosystem. In our proposed attack, given one public-private key pair, an attacker can obtain the private key corresponding to a given public key in RSA decryption. This allows the adversary to decrypt any ciphertext generated using this public key. It is worth noting that the proposed attack can be used in the CRT model of RSA. In addition, we propose a parallelizable factoring algorithm with an order equivalent to a cyclic attack in the worst-case scenario.
Finding Practical Parameters for Isogeny-based Cryptography
Isogeny-based schemes often come with special requirements on the field of definition of the involved elliptic curves. For instance, the efficiency of SQIsign, a promising candidate in the NIST signature standardisation process, requires a large power of two and a large smooth integer $T$ to divide $p^2-1$ for its prime parameter $p$.
We present two new methods that combine previous techniques for finding suitable primes: sieve-and-boost and XGCD-and-boost. We use these methods to find primes for the NIST submission of SQIsign. Furthermore, we show that our methods are flexible and can be adapted to find suitable parameters for other isogeny-based schemes such as AprèsSQI or POKE. For all three schemes, the parameters we present offer the best performance among all parameters proposed in the literature.
ZK-SNARKs for Ballot Validity: A Feasibility Study
Electronic voting (e-voting) systems have become more prevalent in recent years, but security concerns have also increased, especially regarding the privacy and verifiability of votes. As an essential ingredient for constructing secure e-voting systems, designers often employ zero-knowledge proofs (ZKPs), allowing voters to prove their votes are valid without revealing them. Invalid votes can then be discarded to protect verifiability without compromising the privacy of valid votes.
General purpose zero-knowledge proofs (GPZKPs) such as ZK-SNARKs can be used to prove arbitrary statements, including ballot validity. While a specialized ZKP that is constructed only for a specific election type/voting method, ballot format, and encryption/commitment scheme can be more efficient than a GPZKP, the flexibility offered by GPZKPs would allow for quickly constructing e-voting systems for new voting methods and new ballot formats. So far, however, the viability of GPZKPs for showing ballot validity for various ballot formats, in particular, whether and in how far they are practical for voters to compute, has only recently been investigated for ballots that are computed as Pedersen vector commitments in an ACM CCS 2022 paper by Huber et al.
Here, we continue this line of research by performing a feasibility study of GPZKPs for the more common case of ballots encrypted via Exponential ElGamal encryption. Specifically, building on the work by Huber et al., we describe how the Groth16 ZK-SNARK can be instantiated to show ballot validity for arbitrary election types and ballot formats encrypted via Exponential ElGamal. As our main contribution, we implement, benchmark, and compare several such instances for a wide range of voting methods and ballot formats. Our benchmarks not only establish a basis for protocol designers to make an educated choice for or against such a GPZKP, but also show that GPZKPs are actually viable for showing ballot validity in voting systems using Exponential ElGamal.
On the Insecurity of Bloom Filter-Based Private Set Intersections
Private set intersections are cryptographic protocols that compute the intersection of multiple parties' private sets without revealing elements that are not in the intersection. These protocols become less efficient when the number of parties grows, or the size of the sets increases. For this reason, many protocols are based on Bloom filters, which speed up the protocol by approximating the intersections, introducing false positives with a small but non-negligible probability. These false positives are caused by hash collisions in the hash functions that parties use to encode their sets as Bloom filters. In this work, we show that these false positives are more than an inaccuracy: an adversary in the augmented semi-honest model can use them to learn information about elements that are not in the intersection. First, we show that existing security proofs for Bloom filter-based private set intersections are flawed. Second, we show that even in the most optimistic setting, Bloom filter-based private set intersections cannot securely realize an approximate private set intersection unless the parameters are so large that false positives only occur with negligible probability. Third, we propose a practical attack that allows a party to learn if an element is contained in a victim's private set, showing that the problem with Bloom filters is not just theoretical. We conclude that the efficiency gain of using Bloom filters as an approximation in existing protocols vanishes when accounting for this security problem. We propose three mitigations besides choosing larger parameters: One can use oblivious pseudo-random functions instead of hash functions to reduce the success rate of our attack significantly, or replace them with password-based key derivation functions to significantly slow down attackers. A third option is to let a third party authorize the input sets before proceeding with the protocol.
Fast Multiplication and the PLWE-RLWE Equivalence for an Infinite Family of Cyclotomic Subextensions
We prove the equivalence between the Ring Learning With Errors (RLWE) and the Polynomial Learning With Errors (PLWE) problems for the maximal totally real subfield of the $2^r 3^s$-th cyclotomic field for $r \geq 3$ and $s \geq 1$. Moreover, we describe a fast algorithm for computing the product of two elements in the ring of integers of these subfields. This multiplication algorithm has quasilinear complexity in the dimension of the field, as it makes use of the fast Discrete Cosine Transform (DCT). Our approach assumes that the two input polynomials are given in a basis of Chebyshev-like polynomials, in contrast to the customary power basis. To validate this assumption, we prove that the change of basis from the power basis to the Chebyshev-like basis can be computed with $\mathcal{O}(n \log n)$ arithmetic operations, where $n$ is the problem dimension. Finally, we provide a heuristic and theoretical comparison of the vulnerability to some attacks for the $p$-th cyclotomic field versus the maximal totally real subextension of the $4p$-th cyclotomic field for a reasonable set of parameters of cryptographic size.
A Tool for Fast and Secure LWE Parameter Selection: the FHE case
The field of fully homomorphic encryption (FHE) has seen many theoretical and computational advances in recent years, bringing the technology closer to practicality than ever before. For this reason, practitioners in related fields, such as machine learning, are increasingly interested in using FHE to provide privacy to their applications.
Despite this progress, selecting secure and efficient parameters for FHE remains a complex and challenging task due to the intricate interdependencies between parameters. In this work, we address this issue by providing a rigorous theoretical foundation for parameter selection for any LWE-based schemes, with a specific focus on FHE. Our approach starts with an in-depth analysis of lattice attacks on the LWE problem, deriving precise expressions for the most effective ones. Building on this, we introduce closed-form formulas that establish the relationships among the LWE parameters.
In addition, we introduce a numerical method to enable the accurate selection of any configurable parameter to meet a desired security level.
Finally, we use our results to build a practical and efficient tool for researchers and practitioners deploying FHE in real-world applications, ensuring that our approach is both rigorous and accessible.
NTRU-based Bootstrapping for MK-FHEs without using Overstretched Parameters
Uncategorized
Uncategorized
Recent attacks on NTRU lattices given by Ducas and van Woerden (ASIACRYPT 2021) showed that for moduli $q$ larger than the so-called fatigue point $n^{2.484+o(1)}$, the security of NTRU is noticeably less than that of (ring)-LWE. Unlike
NTRU-based PKE with $q$ typically lying in the secure regime of NTRU lattices (i.e., $q<n^{2.484+o(1)}$), the security of existing NTRU-based multi-key FHEs (MK-FHEs) requiring $q=O(n^k)$ for $k$ keys could be significantly affected by those attacks.
In this paper, we first propose a (matrix) NTRU-based MK-FHE
for super-constant number $k$ of keys without using overstretched NTRU parameters. Our scheme is essentially a combination of two components following the two-layer framework of TFHE/FHEW:
- a simple first-layer matrix NTRU-based encryption that naturally supports multi-key NAND operations
with moduli $q=O(k\cdot n^{1.5})$ only linear in the number $k$ of keys;
-and a crucial second-layer NTRU-based encryption that supports an efficient hybrid product between a single-key ciphertext and a multi-key ciphertext for gate bootstrapping.
Then, by replacing the first-layer with a more efficient LWE-based multi-key encryption, we obtain an improved MK-FHE scheme with better performance. We also employ a light key-switching technique to reduce the key-switching key size from the previous $O(n^2)$ bits to $O(n)$ bits.
A proof-of-concept implementation shows that our two MK-FHE schemes outperform the state-of-the-art TFHE-like MK-FHE schemes in both computation efficiency and bootstrapping key size. Concretely, for $k=8$ at the same 100-bit security level, our improved MK-FHE scheme can bootstrap a ciphertext in {0.54s} on a laptop and only has a bootstrapping key of size {13.89}MB,which are respectively 2.2 times faster and 7.4 times smaller than the MK-FHE scheme (which relies on a second-layer encryption from the ring-LWE assumption) due to Chen, Chillotti and Song (ASIACRYPT 2019).
On Threshold Signatures from MPC-in-the-Head
We investigate the feasibility of constructing threshold signature schemes from the MPC-in-the-head paradigm. Our work addresses the significant challenge posed by recent impossibility results (Doerner et al., Crypto’24), which establish inherent barriers to efficient thresholdization of such schemes without compromising their security or significantly increasing the signature size.
- We introduce a general methodology to adapt any MPC-in-the-head signature into a threshold-friendly scheme, ensuring that the dependency on the number of users $n$ grows as $\lambda^2n + O(1)$. This represents a substantial improvement over the naive concatenation of independent signatures.
- We present a threshold signature scheme on top of the scheme of (Carozza, Couteau and Joux, EUROCRYPT’23). Our security analysis introduces the notion of Corruptible Existential Unforgeability under Chosen Message Attacks (CEUF-CMA), which formalizes resilience against adversarial control over parts of the randomness.
Our results provide a new perspective on the trade-offs between efficiency and security in threshold settings, opening pathways for future improvements in post-quantum threshold cryptography.
Shardora: Towards Scaling Blockchain Sharding via Unleashing Parallelism
Sharding emerges as a promising solution to enhance blockchain scalability. However, it faces two critical limitations during shard reconfiguration: (1) the TPS-Degradation issue, arising from ledger synchronization conflicts during transaction processing, and (2) the Zero-TPS issue, caused by disruptions in transaction processing due to key negotiation. To this end, we propose Shardora, a blockchain sharding system for scaling blockchain by unleashing parallelism. In Shardora, we implement two essential mechanisms: (1) A parallelized dual committee framework with a reputation mechanism to mitigate the TPS-Degradation issue while ensuring system security. (2) A parallelized key pre-negotiation mechanism with a secret-reuse strategy to avoid the Zero-TPS issue while maintaining a continuously high TPS. We prove that Shardora offers theory-guaranteed security. We implement a prototype of Shardora and deploy it on Alibaba Cloud. Experimental results demonstrate that Shardora addresses the limitations by significantly reducing the overhead of both ledger synchronization and key negotiation, which outperforms state-of-the-art sharding schemes by at least 90%. In addition, Shardora shows its superior performance in terms of throughput and latency, achieving a peak throughput of 8300 TPS on a single shard with 600 nodes under LAN conditions. The code of Shardora is publicly available on GitHub.
PerfOMR: Oblivious Message Retrieval with Reduced Communication and Computation
Anonymous message delivery, as in privacy-preserving blockchain and private messaging applications, needs to protect recipient metadata: eavesdroppers should not be able to link messages to their recipients. This raises the question: how can untrusted servers assist in delivering the pertinent messages to each recipient, without learning which messages are addressed to whom?
Recent work constructed Oblivious Message Retrieval (OMR) protocols that outsource the message detection and retrieval in a privacy-preserving way, using homomorphic encryption. Their construction exhibits significant costs in computation per message scanned (${\sim}0.1$ second), as well as in the size of the associated messages (${\sim}1$kB overhead) and public keys (${\sim}132$kB).
This work constructs more efficient OMR schemes, by replacing the LWE-based clue encryption of prior works with a Ring-LWE variant, and utilizing the resulting flexibility to improve several components of the scheme. We thus devise, analyze, and benchmark two protocols:
The first protocol focuses on improving the detector runtime, using a new retrieval circuit that can be homomorphically evaluated $13.8$x faster than the prior work.
The second protocol focuses on reducing the communication costs, by designing a different homomorphic decryption circuit that allows the parameter of the Ring-LWE encryption to be set such that the public key size is about $235$x smaller than the prior work, and the message size is roughly $1.6$x smaller. The runtime of this second construction is ${\sim}40.0$ms per message, still more than $2.5$x faster than prior works.
A non-comparison oblivious sort and its application to private k-NN
This paper introduces a novel adaptation of counting sort that enables sorting of encrypted data using Fully Homomorphic Encryption (FHE). Our approach represents the first known sorting algorithm for encrypted data that does not rely on comparisons. The implementation leverages some basic operations on TFHE's Look-Up-Tables (LUT). We have integrated these operations into RevoLUT, a comprehensive open-source library built upon tfhe-rs, which can be of independent interest for oblivious algorithms. We demonstrate the effectiveness of our Blind Counting Sort algorithm by developing a top-$k$ selection algorithm and applying it to privacy-preserving $k$-Nearest Neighbors classification. This proves to be approximately 5x faster than current state-of-the-art methods.
A notion on S-boxes for a partial resistance to some integral attacks
In two recent papers, we introduced and studied the notion of $k$th-order sum-freedom of a vectorial function $F:\mathbb F_2^n\to \mathbb F_2^m$. This notion generalizes that of almost perfect nonlinearity (which corresponds to $k=2$) and has some relation with the resistance to integral attacks of those block ciphers using $F$ as a substitution box (S-box), by preventing the propagation of the division property of $k$-dimensional affine spaces. In the present paper, we show that this notion, which is rarely satisfied by vectorial functions, can be weakened while retaining the property that the S-boxes do not propagate the division property of $k$-dimensional affine spaces. This leads us to the property that we name $k$th-order $t$-degree-sum-freedom, whose strength decreases when $t$ increases, and which coincides with $k$th-order sum-freedom when $t=1$. The condition for $k$th-order $t$-degree-sum-freedom is that, for every $k$-dimensional affine space $A$, there exists a non-negative integer $j$ of 2-weight at most $t$ such that $\sum_{x\in A}(F(x))^j\neq 0$. We show, for a general $k$th-order $t$-degree-sum-free function $F$, that $t$ can always be taken smaller than or equal to $\min(k,m)$ under some reasonable condition on $F$, and that it is larger than or equal to $\frac k{\deg(F)}$, where $\deg(F)$ is the algebraic degree of $F$. We study examples for $k=2$ (case in which $t=1$ corresponds to APNness) showing that finding $j$ of 2-weight 2 can be challenging, and we begin the study of power functions, and in particular, of the multiplicative inverse function (used as S-box in the AES), for which we extend to $k$th-order $t$-degree-sum-freedom the result that it is $k$th-order sum-free if and only if it is $(n-k)$th-order sum-free. We begin the study of the cases of $k\in \{2,3,n-3,n-2,n-1,n\}$.
Preservation of Speculative Constant-Time by Compilation
Compilers often weaken or even discard software-based countermeasures commonly used to protect programs against side-channel attacks; worse, they may also introduce vulnerabilities that attackers can exploit. The solution to this problem is to develop compilers that preserve such countermeasures. Prior work establishes that (a mildly modified version of) the CompCert and Jasmin formally verified compilers preserve constant-time, an information flow policy that ensures that programs are protected against timing side-channel attacks. However, nothing is known about preservation of speculative constant-time, a strengthening of the constant-time policy that ensures that programs are protected against Spectre-v1 attacks. We first show that preservation of speculative constant-time fails in practice by providing examples of secure programs whose compilation is not speculative constant-time using GCC (GCC -O0 and GCC -O1) and Jasmin. Then, we define a proof-of-concept compiler that distills some of the critical passes of the Jasmin compiler and use the Coq proof assistant to prove that it preserves speculative constant-time. Finally, we patch the Jasmin speculative constant-time type checker and demonstrate that all cryptographic implementations written in Jasmin can be fixed with minimal impact.
Extended Diffie-Hellman Encryption for Secure and Efficient Real-Time Beacon Notifications
Every computing paradigm involving communication requires new security protocols employing cryptography. For example, the Internet gave rise to TLS/SSL, and Mobile Computing gave rise to End to End Encryption protocols. In this paper, we address an emerging IoT paradigm involving beacons attached to things and security protocols associated with this new configuration.
Specifically, we address the ``beacon notification problem,'' a critical IoT paradigm aims at providing secure and efficient real-time notifications from beacons to their owners. Since the beacon notification problem has not yet been formally defined, we begin by inspecting natural requirements based on the operational setting and establishing correctness, security, and privacy definitions through the use of cryptographic games.
To resolve this problem, we propose a novel cryptographic tool we call XDHIES, which is a considerable extension of available Diffie-Hellman encryption schemes. We then show a new notification protocol built upon XDHIES and we prove that this cryptographic protocol is secure and private and successfully meets all the above problem's requirements.
WHIR: Reed–Solomon Proximity Testing with Super-Fast Verification
We introduce WHIR, a new IOP of proximity that offers small query complexity and exceptionally fast verification time. The WHIR verifier typically runs in a few hundred microseconds, whereas other verifiers in the literature require several milliseconds (if not much more). This significantly improves the state of the art in verifier time for hash-based SNARGs (and beyond).
Crucially, WHIR is an IOP of proximity for constrained Reed–Solomon codes, which can express a rich class of queries to multilinear polynomials and to univariate polynomials. In particular, WHIR serves as a direct replacement for protocols like FRI, STIR, BaseFold, and others. Leveraging the rich queries supported by WHIR and a new compiler for multilinear polynomial IOPs, we obtain a highly efficient SNARG for generalized R1CS.
As a comparison point, our techniques also yield state-of-the-art constructions of hash-based (non-interactive) polynomial commitment schemes for both univariate and multivariate polynomials (since sumcheck queries naturally express polynomial evaluations). For example, if we use WHIR to construct a polynomial commitment scheme for degree 222, with 100 bits of security, then the time to commit and open is 1.2 seconds, the sender communicates 63 KiB to the receiver, and the opening verification time is 360 microseconds.
A Comprehensive Survey on Hardware-Software co-Protection against Invasive, Non-Invasive and Interactive Security Threats
In the face of escalating security threats in modern
computing systems, there is an urgent need for comprehensive
defense mechanisms that can effectively mitigate invasive, noninvasive and interactive security vulnerabilities in hardware
and software domains. Individually, hardware and software
weaknesses and probable remedies have been practiced but
protecting a combined system has not yet been discussed in
detail. This survey paper provides a comprehensive overview of
the emerging field of Hardware-Software co-Protection against
Invasive and Non-Invasive Security Threats. We systematically
review state-of-the-art research and developments in hardware
and software security techniques, focusing on their integration to
create synergistic defense mechanisms. The survey covers a wide
range of security threats, including physical attacks, side-channel
attacks, and malware exploits, and explores the diverse strategies
employed to counter them. Our survey meticulously examines the
landscape of security vulnerabilities, encompassing both physical
and software-based attack vectors, and explores the intricate
interplay between hardware and software defenses in mitigating
these threats. Furthermore, we discuss the challenges and opportunities associated with Hardware-Software co-Protection and
identify future research directions to advance the field. Through
this survey, we aim to provide researchers, practitioners, and
policymakers with valuable insights into the latest advancements
and best practices for defending against complex security threats
in modern computing environments.
Sailfish: Towards Improving the Latency of DAG-based BFT
Directed Acyclic Graph (DAG) based BFT protocols balance consensus efforts across different parties and maintain high throughput even when some designated parties fail. However, existing DAG-based BFT protocols exhibit long latency to commit decisions, primarily because they have a \emph{leader} every 2 or more ``rounds''. Recent works, such as Shoal (FC'23) and Mysticeti, have deemed supporting a leader vertex in each round particularly difficult, if not impossible. Consequently, even under honest leaders, these protocols require high latency (or communication complexity) to commit the proposal submitted by the leader (leader vertex) and additional latency to commit other proposals (non-leader vertices).
In this work, we present Sailfish, the first DAG-based BFT that supports a leader vertex in each round. Under honest leaders, Sailfish maintains a commit latency of one reliable broadcast (RBC) round plus $1\delta$ to commit the leader vertex (where $\delta$ is the actual transmission latency of a message) and only an additional RBC round to commit non-leader vertices. We also extend Sailfish to Multi-leader Sailfish, which facilitates multiple leaders within a single round and commits all leader vertices in a round with a latency of one RBC round plus $1\delta$. Our experimental evaluation demonstrates that our protocols introduce significantly lower latency overhead compared to existing DAG-based protocols, with similar throughput.
Inflation-Tracking Proof-of-Work Crypto-Currencies
We show that Bitcoin and other existing egalitarian crypto-currencies are unstable as store-of-value as they fail to track inflation of local currencies closely, and the price dynamic is purely driven by speculation. In the case of Bitcoin, we show that instead of price being based on cost of mining Bitcoin, it is the cost of mining that rapidly converges to the current price of Bitcoin. Based on rational expectations equilibrium, we argue that if the coins awarded during mining are increased in proportion to increase in difficulty of the underlying cryptographic puzzle, then the price of the coin is likely to track inflation of local currencies closely over medium to long term. However, since Moore's law as well as targeted hardware design can lead to computational cost deflation, we suggest a hyper-geometric tapering, instead of a geometric tapering, of the mining award over time. This also handles bootstrapping interest in the crypto-currency.
Shifting our knowledge of MQ-Sign security
Unbalanced Oil and Vinegar (UOV) is one of the oldest, simplest, and most studied ad-hoc multivariate signature schemes. UOV signature schemes are attractive because they have very small signatures and fast verification. On the downside, they have large public and secret keys. As a result, variations of the traditional UOV scheme are usually developed with the goal to reduce the key sizes. Seven variants of UOV were submitted to the additional call for digital signatures by NIST, prior to which, a variant named MQ-Sign was submitted to the (South) Korean post-quantum cryptography competition (KpqC). MQ-Sign is currently competing in the second round of KpqC with two variants. One of the variants corresponds to the classic description of UOV with certain implementation and parameter choices. In the other variant, called MQ-Sign-LR, a part of the central map is constructed from row shifts of a single matrix. This design makes for smaller secret keys, and in the case where the equivalent keys optimization is used, it also leads to smaller public keys. However, we show in this work that the polynomial systems arising from an algebraic attack have a specific structure that can be exploited. Specifically, we are able to find preimages for $d$-periodic targets under the public map with a probability of $63\%$ for all security levels. The complexity of finding these preimages, as well as the fraction of $d$-periodic target increases with $d$ and hence provides a trade-off. We show that for all security levels one can choose $d=\frac{v}{2}$, for $v$ the number of vinegar variables, and reduce the security claim. Our experiments show practical running times for lower $d$ ranging from 0.06 seconds to 32 hours.
Succinctly Verifiable Computation over Additively-Homomorphically Encrypted Data with Applications to Privacy-Preserving Blueprints
With additively homomorphic encryption (AHE), one can compute, from input ciphertexts $\mathsf{Enc}(x_1),\ldots,\mathsf{Enc}(x_n)$, and additional inputs $y_1,\ldots,y_k$, a ciphertext $c_\textit{f}=\mathsf{Enc}(f(x_1,\ldots,x_n,y_1,\ldots, y_k))$ for any polynomial $f$ in which each monomial has total degree at most $1$ in the $x$-variables (but can be arbitrary in the $y$-variables). For AHE that satisfies a set of natural requirements, we give a non-interactive zero-knowledge proof system (in the random-oracle model) for showing that a ciphertext $c_\textit{f}$ is the result of homomorphically evaluating $f$ on ciphertexts $c_1,\ldots,c_n$ and private inputs $y_1,\ldots,y_k$ that correspond to commitments $C_1,\ldots,C_k$. Our proofs are $\textit{succinct}$, i.e., their size is independent of the number of ciphertexts $n$, and is instead $O(k\log d)$ where $k$ is the number of private inputs, and $d$ is the maximum degree of any variable in $f$.
We give two ways of instantiating this framework: with ElGamal-based encryption (under the DDH assumption) and with a variant of the Camenisch-Shoup cryptosystem (under the DCR assumption). Both yield proof systems where computing and verifying the proof takes a comparable amount of time to homomorphically evaluating $f$.
Next, we show that our framework yields a dramatically improved privacy-preserving blueprint (PPB) system. Introduced by Kohlweiss, Lysyanskaya, and Nguyen (Eurocrypt'23), an $f$-PPB system allows an auditor with secret input $x$ to create a public encoding $\sf pk$ of the function $f(x,\cdot)$ that reveals nothing about $x$.
Yet, it allows a user to compute an encoding, or escrow $Z$, of the value $f(x,y)$ on input the user's private data $y$ corresponding to a commitment $C_y$; $Z$ will verifiably correspond to the commitment $C_y$. The auditor will be able to recover $f(x,y)$ from $Z$, but will learn no other information about $y$. For example, if $f$ is the watchlist function where $f(x,y)$ outputs $y$ only in the event that $y$ is on the list $x$, then an $f$-PPB allows the auditor to trace watchlisted users in an otherwise anonymous system.
Using our succinct zero-knowledge proof system for additively homomorphic computation we achieve the following results: (1) We provide efficient schemes for a bigger class of functions $f$; for example, we show how to realize $f$ that would allow the auditor to trace e-cash transactions of a criminal suspect which was previously not efficient. (2) For the watchlist and related functions, we reduce the size of the escrow $Z$ from linear in the size of the auditor's input $x$, to logarithmic.
Additionally, we define and satisfy a stronger notion of security for $f$-PPBs, where a malicious auditor cannot frame a user in a transaction in which the user was not involved in.
Efficient Modular Multiplication Hardware for Number Theoretic Transform on FPGA
In this paper, we present a comprehensive analysis of various modular multiplication methods for Number Theoretic Transform (NTT) on FPGA. NTT is a critical and time-intensive component of Fully Homomorphic Encryption (FHE) applications while modular multiplication consumes a significant portion of the design resources in an NTT implementation. We study the existing modular reduction approaches from the literature, and implement particular methods on FPGA. Specifically Word-Level Montgomery (WLM)) for NTT friendly primes [1] and K2RED [2]. For improvements, we explore the trade-offs between the number of available primes in special forms and hardware cost of the reduction methods. We develop a DSP multiplication-optimized version of WLM, which we call WLM-Mixed. We also introduce a subclass of Proth primes, referred to as Proth-l primes, characterized by a low and fixed signed Hamming Weight. This special class of primes allows us to design multiplication-free shift-add versions of K2RED and naive Montgomery reduction [3], referred to as K2RED-Shift and Montgomery-Shift. We provide in-depth evaluations of these five reduction methods in an NTT architecture on FPGA. Our results indicate that WLM-Mixed is highly resource-efficient, utilizing only 3 DSP multiplications for 64-bit coefficient moduli. On the other hand, K2RED-Shift and Montgomery-Shift offer DSP-free alternatives, which can be beneficial in specific scenarios
Chosen-Prefix Collisions on AES-like Hashing
Chosen-prefix collision (CPC) attack was first presented by Stevens, Lenstra and de Weger on MD5 at Eurocrypt 2007. A CPC attack finds a collision for any two chosen prefixes, which is a stronger variant of collision attack. CPCs are naturally harder to construct but have larger practical impact than (identical-prefix) collisions, as seen from the series of previous works on MD5 by Stevens et al. and SHA-1 by Leurent and Peyrin. Despite its significance, the resistance of CPC attacks has not been studied on AES-like hashing.
In this work, we explore CPC attacks on AES-like hashing following the framework practiced on MD5 and SHA-1. Instead of the message modification technique developed for MD-SHA family, we opt for related-key rebound attack to construct collisions for AES-like hashing in view of its effectiveness. We also note that the CPC attack framework can be exploited to convert a specific class of one-block free-start collisions into two-block collisions, which sheds light on the importance of free-start collisions. As a result, we present the first CPC attacks on reduced Whirlpool, Saturnin-hash and AES-MMO/MP in classic and quantum settings, and extend the collision attack on Saturnin-hash from 5 to 6 rounds in the classic setting. As an independent contribution, we improve the memoryless algorithm of solving 3-round inbound phase by Hosoyamada and Sasaki at Eurocrpyt 2020, which leads to improved quantum attacks on Whirlpool. Notably, we find the first 6-round memoryless quantum collision attack on Whirlpool better than generic CNS collision finding algorithm when exponential-size qRAM is not available but exponential-size classic memory is available.
Authenticated private information retrieval
This paper introduces protocols for authenticated private information retrieval. These schemes enable a client to fetch a record from a remote database server such that (a) the server does not learn which record the client reads, and (b) the client either obtains the "authentic" record or detects server misbehavior and safely aborts. Both properties are crucial for many applications. Standard private-information-retrieval schemes either do not ensure this form of output authenticity, or they require multiple database replicas with an honest majority. In contrast, we offer multi-server schemes that protect security as long as at least one server is honest. Moreover, if the client can obtain a short digest of the database out of band, then our schemes require only a single server. Performing an authenticated private PGP-public-key lookup on an OpenPGP key server's database of 3.5 million keys (3 GiB), using two non-colluding servers, takes under 1.2 core-seconds of computation, essentially matching the time taken by unauthenticated private information retrieval. Our authenticated single-server schemes are 30-100$\times$ more costly than state-of-the-art unauthenticated single-server schemes, though they achieve incomparably stronger integrity properties.
Differential MITM attacks on SLIM and LBCIoT
SLIM and LBCIoT are lightweight block ciphers proposed for IoT applications. We present differential meet-in-the-middle attacks on these ciphers and discuss several implementation variants and possible improvements of these attacks. Experimental validation also shows some results that may be of independent interest in the cryptanalysis of other ciphers. Namely, the problems with low-probability differentials and the questionable accuracy of standard complexity estimates of using filters.
Partially Non-Interactive Two-Round Lattice-Based Threshold Signatures
This paper gives the first lattice-based two-round threshold signature based on lattice assumptions for which the first message is independent of the message being signed without relying on fully-homomorphic encryption, and our construction supports arbitrary thresholds.
Our construction provides a careful instantiation of a generic threshold signature construction by Tessaro and Zhu (EUROCRYPT ’23) based on specific linear hash functions, which in turns can be seen as a generalization of the FROST scheme by Komlo and Goldberg (SAC ’20). Our reduction techniques are new in the context of lattice-based cryptography. Also, our scheme does not use any heavy tools, such as NIZKs or homomorphic trapdoor commitments.
Verifying Jolt zkVM Lookup Semantics
Lookups are a popular way to express repeated constraints in state-of-the art SNARKs. This is especially the case for zero-knowledge virtual machines (zkVMs), which produce succinct proofs of correct execution for programs expressed as bytecode according to a specific instruction set architecture (ISA). The Jolt zkVM (Arun, Setty & Thaler, Eurocrypt 2024) for RISC-V ISA employs Lasso (Setty, Thaler & Wahby, Eurocrypt 2024), an efficient lookup argument for massive structured tables, to prove correct execution of instructions. Internally, Lasso performs multiple lookups into smaller subtables, then combines the results.
We present an approach to formally verify Lasso-style lookup arguments against the semantics of instruction set architectures. We demonstrate our approach by formalizing and verifying all Jolt 32-bit instructions corresponding to the RISC-V base instruction set (RV32I) using the ACL2 theorem proving system. Our formal ACL2 model has undergone extensive validation against the Rust implementation of Jolt. Due to ACL2's bit-blasting, rewriting, and developer-friendly features, our formalization is highly automated.
Through formalization, we also discovered optimizations to the Jolt codebase, leading to improved efficiency without impacting correctness or soundness. In particular, we removed one unnecessary lookup each for four instructions, and reduced the sizes of three subtables by 87.5\%.
Doubly Efficient Batched Private Information Retrieval
Private information retrieval (PIR) allows a client to read data from a server, without revealing which information they are interested in. A PIR is doubly efficient if the server runtime is, after a one-time pre-processing, sublinear in the database size. A recent breakthrough result from Lin, Mook, and Wichs [STOC’23] proposed the first-doubly efficient PIR with (online) server computation poly-logarithmic in the size of the database, assuming the hardness of the standard Ring-LWE problem.
In this work, we consider the problem of doubly efficient batched PIR (DEBPIR), where the client wishes to download multiple entries. This problem arises naturally in many practical applications of PIR, or when the database contains large entries. Our main result is a construction of DEBPIR where the amortized communication and server computation overhead is $\tilde{O}(1)$, from the Ring-LWE problem. This represents an exponential improvement compared with known constructions, and it is optimal up to poly-logarithmic factors in the security parameter. Interestingly, the server’s online operations are entirely combinatorial and all algebraic computations are done in the pre-processing or delegated to the client.
Do Not Disturb a Sleeping Falcon: Floating-Point Error Sensitivity of the Falcon Sampler and Its Consequences
Falcon is one of the three postquantum signature schemes already selected by NIST for standardization. It is the most compact among them, and offers excellent efficiency and security. However, it is based on a complex algorithm for lattice discrete Gaussian sampling which presents a number of implementation challenges. In particular, it relies on (possibly emulated) floating-point arithmetic, which is often regarded as a cause for concern, and has been leveraged in, e.g., side-channel analysis. The extent to which Falcon's use of floating point arithmetic can cause security issues has yet to be thoroughly explored in the literature.
In this paper, we contribute to filling this gap by identifying a way in which Falcon's lattice discrete Gaussian sampler, due to specific design choices, is singularly sensitive to floating-point errors. In the presence of small floating-point discrepancies (which can occur in various ways, including the use of the two almost but not quite equivalent signing procedures ``dynamic'' and ``tree'' exposed by the Falcon API), we find that, when called twice on the same input, the Falcon sampler has a small but significant chance (on the order of once in a few thousand calls) of outputting two different lattice points with a very structured difference, that immediately reveals the secret key. This is in contrast to other lattice Gaussian sampling algorithms like Peikert's sampler and Prest's hybrid sampler, that are stable with respect to small floating-point errors.
Correctly generated Falcon signatures include a salt that should in principle prevent the sampler to ever be called on the same input twice. In that sense, our observation has little impact on the security of Falcon signatures per se (beyond echoing warnings about the dangers of repeated randomness). On the other hand, it is critical for derandomized variants of Falcon, which have been proposed for use in numerous settings. One can mention in particular identity-based encryption, SNARK-friendly signatures, and sublinear signature aggregation. For all these settings, small floating point discrepancies have a chance of resulting in full private key exposure, even when using the slower, integer-based emulated floating-point arithmetic of Falcon's
reference implementation.
Improved PIR Schemes using Matching Vectors and Derivatives
In this paper, we construct new t-server Private Information Retrieval (PIR) schemes with communication complexity subpolynomial in the previously best known, for all but finitely many t. Our results are
based on combining derivatives (in the spirit of Woodruff-Yekhanin) with the Matching Vector based PIRs of Yekhanin and Efremenko. Previously such a combination was achieved in an ingenious way by Dvir and Gopi, using polynomials and derivatives over certain exotic rings, en route to their fundamental result giving the first 2-server PIR with subpolynomial communication.
Our improved PIRs are based on two ingredients:
• We develop a new and direct approach to combine derivatives with Matching Vector based PIRs.
This approach is much simpler than that of Dvir-Gopi: it works over the same field as the original PIRs, and only uses elementary properties of polynomials and derivatives.
• A key subproblem that arises in the above approach is a higher-order polynomial interpolation problem. We show how “sparse S-decoding polynomials”, a powerful tool from the original constructions of Matching Vector PIRs, can be used to solve this higher-order polynomial interpolation problem using surprisingly few higer-order evaluations.
Using the known sparse S-decoding polynomials in combination with our
ideas leads to our improved PIRs. Notably, we get a 3-server PIR scheme with communication $2^{O^\sim( (\log n)^{1/3}) }$, improving upon the previously best known communication of $2^{O^\sim( \sqrt{\log n})}$ due to Efremenko.
Ceno: Non-uniform, Segment and Parallel Zero-knowledge Virtual Machine
In this paper, we explore a novel Zero-knowledge Virtual Machine (zkVM) framework leveraging succinct, non-interactive zero-knowledge proofs for verifiable computation over any code. Our approach divides the proof of program execution into two stages. In the first stage, the process breaks down program execution into segments, identifying and grouping identical sections. These segments are then proved through data-parallel circuits that allow for varying amounts of duplication. In the subsequent stage, the verifier examines these segment proofs, reconstructing the program's control and data flow based on the segments' duplication number and the original program. The second stage can be further attested by a uniform recursive proof.
We propose two specific designs of this concept, where segmentation and parallelization occur at two levels: opcode and basic block. Both designs try to minimize the control flow that affects the circuit size and support dynamic copy numbers, ensuring that computational costs directly correlate with the actual code executed (i.e., you only pay as much as you use). In our second design, in particular, by proposing an innovative data-flow reconstruction technique in the second stage, we can drastically cut down on the stack operations even compared to the original program execution. Note that the two designs are complementary rather than mutually exclusive. Integrating both approaches in the same zkVM could unlock more significant potential to accommodate various program patterns.
We present an asymmetric GKR scheme to implement our designs, pairing a non-uniform prover and a uniform verifier to generate proofs for dynamic-length data-parallel circuits. The use of a GKR prover also significantly reduces the size of the commitment. GKR allows us to commit only the circuit's input and output, whereas in Plonkish-based solutions, the prover needs to commit to all the witnesses.
Age-aware Fairness in Blockchain Transaction Ordering for Reducing Tail Latency
In blockchain networks, transaction latency is crucial for determining the quality of service (QoS). The latency of a transaction is measured as the time between its issuance and its inclusion in a block in the chain. A block proposer often prioritizes transactions with higher fees or transactions from accounts it is associated with, to minimize their latencies. To maintain fairness among transactions, a block proposer is expected to select the included transactions randomly. The random selection might cause some transactions to experience high latency following the variance in the time a transaction waits until it is selected. We suggest an alternative, age-aware approach towards fairness so that transaction priority is increased upon observing a large waiting time. We explain that a challenge with this approach is that the age of a transaction is not absolute due to transaction propagation. Moreover, a node might present its transactions as older to obtain priority. We describe a new technique to enforce a fair block selection while prioritizing transactions that observed high latency. The technique is based on various declaration schemes in which a node declares its pending transactions, providing the ability to validate transaction age. By evaluating the solutions on Ethereum data and synthetic data of various scenarios, we demonstrate the advantages of the approach under realistic conditions and understand its potential impact to maintain fairness and reduce tail latency.
Push-Button Verification for BitVM Implementations
Bitcoin, while being the most prominent blockchain with the largest market capitalization, suffers from scalability and throughput limitations that impede the development of ecosystem projects like Bitcoin Decentralized Finance (BTCFi). Recent advancements in BitVM propose a promising Layer 2 (L2) solution to enhance Bitcoin's scalability by enabling complex computations off-chain with on-chain verification. However, Bitcoin's constrained programming environment—characterized by its non-Turing-complete Script language lacking loops and recursion, and strict block size limits—makes developing complex applications labor-intensive, error-prone, and necessitates manual partitioning of scripts. Under this complex programming model, subtle mistakes could lead to irreversible damage in a trustless environment like Bitcoin. Ensuring the correctness and security of such programs becomes paramount.
To address these challenges, we introduce the first formal verifier for BitVM implementations. Our approach involves designing a register-based, higher-level domain-specific language (DSL) that abstracts away complex stack operations, allowing developers to reason about program correctness more effectively while preserving the semantics of the low-level program. We present a formal computational model capturing the semantics of BitVM execution and Bitcoin script, providing a foundation for rigorous verification. To efficiently handle large programs and complex constraints arising from unrolled computations that simulate loops, we summarize repetitive "loop-style" computations using loop invariant predicates in our DSL. We leverage a counterexample-guided inductive synthesis (CEGIS) procedure to lift low-level Bitcoin script into our DSL, facilitating efficient verification without sacrificing accuracy. Evaluated on 78 benchmarks from BitVM implementations, our tool successfully verifies 83% of cases within 12.55 seconds on average and identified one previously unknown vulnerability, demonstrating its effectiveness in enhancing the security and reliability of BitVM.
Fine-Grained Non-Interactive Key-Exchange without Idealized Assumptions, and Lower Bounds
In this paper, we study multi-party non-interactive key exchange (NIKE) in the fine-grained setting. More precisely, we propose three multi-party NIKE schemes in three computation models, namely, the bounded parallel-time, bounded time, and bounded storage models. Their security is based on a very mild assumption (e.g., NC1 ⊊ ⊕L/poly) or even without any complexity assumption. This improves the recent work of Afshar, Couteau, Mahmoody, and Sadeghi (EUROCRYPT 2023) that requires idealized assumptions, such as random oracles or generic groups.
Additionally, we show that all our constructions satisfy a natural desirable property that we refer to as extendability, and we give generic transformations from extendable multi-party NIKE to multi-party identity-based NIKEs in the fine-grained settings.
Furthermore, we generalize the lower bound on users’ storage consumption in the bounded storage model by Dziembowski and Maurer (Eurocrypt 2004) to encompass any multi-party NIKE with extendability. This new lower bound suggests that the users’ storage consumption of our multi-party NIKE in the bounded storage model is optimal.
A Fault Analysis on SNOVA
SNOVA is a post-quantum cryptographic signature scheme known for its efficiency and compact key sizes, making it a second-round candidate in the NIST post-quantum cryptography standardization process. This paper presents a comprehensive fault analysis of SNOVA, focusing on both permanent and transient faults during signature generation. We introduce several fault injection strategies that exploit SNOVA's structure to recover partial or complete secret keys with limited faulty signatures. Our analysis reveals that as few as $22$ to $68$ faulty signatures, depending on the security level, can suffice for key recovery. We propose a novel fault-assisted reconciliation attack, demonstrating its effectiveness in extracting the secret key space via solving a quadratic polynomial system. Simulations show transient faults in key signature generation steps can significantly compromise SNOVA’s security. To address these vulnerabilities, we propose a lightweight countermeasure to reduce the success of fault attacks without adding significant overhead. Our results highlight the importance of fault-resistant mechanisms in post-quantum cryptographic schemes like SNOVA to ensure robustness.
Single Trace Side-Channel Attack on the MPC-in-the-Head Framework
In this paper, we present the first single trace side-channel attack that targets the MPC-in-the-Head (MPCitH) framework based on threshold secret sharing, also known as Threshold Computation in the Head (TCitH) in its original version. This MPCitH framework can be found in 5 of the 14 digital signatures schemes in the recent second round of the National Institute of Standards and Technology (NIST) call for digital signatures. In this work, we start by highlighting a side-channel vulnerability of the TCitH framework and show an exploitation of it on the SDitH algorithm, which is part of this NIST call. Specifically, we exploit the leakage of a multiplication function in the Galois field to make predictions about intermediate values, and we use the structure of the algorithm to combine information efficiently. This allows us to build an attack that is both the first Soft Analytical Side-Channel Attack (SASCA) targeting the MPCitH framework, as well as the first attack on SDitH. More specifically, we build a SASCA based on Belief Propagation (BP) on the evaluation of polynomials in the signature using the threshold variant structure to reconstruct the secret key. We perform simulated attacks under the Hamming Weight (HW) leakage model, enabling us to evaluate the resistance of the scheme against SASCA. We then perform our attacks in a real case scenario, more specifically on the STM32F407, and recover the secret key for all the security levels. We end this paper by discussing the various shuffling countermeasures we could use to mitigate our attacks.
Efficient Verifiable Differential Privacy with Input Authenticity in the Local and Shuffle Model
Local differential privacy (LDP) enables the efficient release of aggregate statistics without having to trust the central server (aggregator), as in the central model of differential privacy, and simultaneously protects a client's sensitive data. The shuffle model with LDP provides an additional layer of privacy, by disconnecting the link between clients and the aggregator. However, LDP has been shown to be vulnerable to malicious clients who can perform both input and output manipulation attacks, i.e., before and after applying the LDP mechanism, to skew the aggregator's results. In this work, we show how to prevent malicious clients from compromising LDP schemes. Our only realistic assumption is that the initial raw input is authenticated; the rest of the processing pipeline, e.g., formatting the input and applying the LDP mechanism, may be under adversarial control. We give several real-world examples where this assumption is justified. Our proposed schemes for verifiable LDP (VLDP), prevent both input and output manipulation attacks against generic LDP mechanisms, requiring only one-time interaction between client and server, unlike existing alternatives [37, 43]. Most importantly, we are the first to provide an efficient scheme for VLDP in the shuffle model. We describe, and prove security of, two schemes for VLDP in the local model, and one in the shuffle model. We show that all schemes are highly practical, with client run times of less than 2 seconds, and server run times of 5-7 milliseconds per client.
Compactly Committing Authenticated Encryption Using Encryptment and Tweakable Block Cipher
Facebook introduced message franking to enable users to report abusive content verifiably in end-to-end encrypted messaging. Grubbs et al. formalized the underlying primitive called compactly committing authenticated encryption with associated data (ccAEAD) and presented schemes with provable security. Dodis et al. proposed a core building block called encryptment and presented a generic construction of ccAEAD with encryptment and standard AEAD. This paper first proposes to use a tweakable block cipher instead of AEAD for the generic construction of Dodis et al. In the security analysis of the proposed construction, its ciphertext integrity is shown to require a new but feasible assumption on the ciphertext integrity of encryptment. Then, this paper formalizes remotely keyed ccAEAD (RK ccAEAD) and shows that the proposed construction works as RK ccAEAD. Finally, the confidentiality of the proposed construction as RK ccAEAD is shown to require a new variant of confidentiality for encryptment. The problem of remotely keyed encryption was posed by Blaze in 1996. It is now related to the problem of designing a cryptographic scheme using a trusted module and/or with leakage resiliency.
A Formal Treatment of Envelope Encryption
Envelope encryption is a method to encrypt data with two distinct keys in its basic form. Data is first encrypted with a data-encryption key, and then the data-encryption key is encrypted with a key-encryption key. Despite its deployment in major cloud services, as far as we know, envelope encryption has not received any formal treatment. To address this issue, we first formalize the syntax and security requirements of envelope encryption in the symmetric-key setting. Then, we show that it can be constructed by combining encryptment and authenticated encryption with associated data (AEAD). Encryptment is one-time AEAD satisfying that a small part of a ciphertext works as a commitment to the corresponding secret key, message, and associated data. Finally, we show that the security of the generic construction is reduced to the security of the underlying encryptment and AEAD.
THOR: Secure Transformer Inference with Homomorphic Encryption
As language models are increasingly deployed in cloud environments, privacy concerns have become a significant issue. To address this, we design THOR, a secure inference framework for transformer models on encrypted data. Specifically, we first propose new fast matrix multiplication algorithms based on diagonal-major order encoding and extend them to parallel matrix computation through the compact ciphertext packing technique. Second, we design efficient protocols for secure computations of four non-linear functions such as softmax, LayerNorm, GELU, and Tanh, by integrating advanced underlying approximation methods with tailored optimizations. Our matrix multiplication algorithms reduce the number of key-switching operations in the linear layers of the attention block in the BERT-base model by up to 14.5x, compared to the state-of-the-art HE-based secure inference protocol (Park et al., Preprint). Combined with cryptographic optimizations, our experimental results demonstrate that THOR provides secure inference for the BERT-base model with a latency of 10.43 minutes on a single GPU, while maintaining comparable inference accuracy on the MRPC dataset.
Cryptography Experiments In Lean 4: SHA-3 Implementation
In this paper we explain how we implemented the Secure Hash Algorithm-3 (SHA-3) family of functions in Lean 4, a functional programming language and theorem prover. We describe how we used several Lean facilities including type classes, dependent types, macros, and formal verification, and then refined the design to provide a simple one-shot and streaming API for hashing, and Extendable-output functions (XOFs), to reduce potential for misuse by users, and formally prove properties about the implementation.
A Closer Look at the Belief Propagation Algorithm in Side-Channel-Assisted Chosen-Ciphertext Attacks
The implementation security of post-quantum cryptography (PQC) algorithms has emerged as a critical concern with the PQC standardization process reaching its end. In a side-channel-assisted chosen-ciphertext attack, the attacker builds linear inequalities on secret key components and uses the belief propagation (BP) algorithm to solve. The number of inequalities leverages the query complexity of the attack, so the fewer the better. In this paper, we use the PQC standard algorithm CRYSTALS-Kyber as a study case to construct bilateral inequalities on key variables with substantially narrower intervals using a side-channel-assisted oracle. For Kyber512, Kyber768, and Kyber 1024, the average Shannon entropy carried by such inequality is improved from the previous 0.6094, 0.4734, and 0.8544 to 0.6418, 0.4777, and 1.2007. The number of such inequalities required to recover the key utilizing the BP algorithm for Kyber512 and Kyber1024 is reduced by 5.32% and 40.53% in theory and experimentally the reduction is even better. The query complexity is reduced by 43%, 37%, and 48% for Kyber512, 768, and 1024 assuming reasonably perfect reliability.
Furthermore, we introduce a strategy aimed at further refining the interval of inequalities. Diving into the BP algorithm, we discover a measure metric named JSD (Jensen-Shannon distance)-metric that can gauge the tightness of an inequality. We then develop a machine learning-based strategy to utilize the JSD-metrics to contract boundaries of inequalities even with fewer inequalities given, thus improving the entropy carried by the system of linear inequalities. This contraction strategy is at the algorithmic level and has the potential to be employed in all attacks endeavoring to establish a system of inequalities concerning key variables.
Free-XOR Gate Bootstrapping
The FHEW-like gate bootstrapping framework operates in a 2-bit plaintext space, where logic gates such as NAND, XOR, and AND are implemented by adding two ciphertexts and extracting the most significant bit. However, each gate operation requires bootstrapping with a primary cost of one blind rotation, which is expensive, when processing circuit operations for applications. We propose a novel Free-XOR gate bootstrapping framework based on a single-bit plaintext space, in which the XOR operation is realized by simply adding two ciphertexts, resulting in an almost free computational cost. To form a minimal complete set for logical operations, we design an algorithm for the AND gate within this framework. The AND gate cost of our Free-XOR gate bootstrapping involves two blind rotations. However, by utilizing a single-bit plaintext space to enhance noise tolerance and swapping some operations of the bootstrapping process, we can adopt a more compact parameter setting, which in turn accelerates the speed of blind rotation. We propose an instantiation of the NTRU-based AND gate operation, which requires two blind rotations. Despite the additional rotation, the overall computational cost is marginally lower than the state-of-the-art gate bootstrapping scheme LLW+ [TCHES24], which utilizes only a single blind rotation. In addition, our approach achieves a significant reduction in key size, reducing it to 3.3 times the size of LLW+ [TCHES24].
How to Redact the Bitcoin Backbone Protocol
We explain how to extend the Bitcoin backbone model of Garay et al. (Eurocrypt, 2015) to accommodate for redactable blockchains. Our extension captures fluid blockchain-based databases (with mutability requirements) and compliance with existing legislation, such as the GDPR right to be forgotten, or the need to erase offending data from nodes’ databases that would otherwise provoke legal shutdowns. Our redactable backbone protocol retains the essential properties of blockchains. Leveraging zero-knowledge proofs, old data can be erased without requiring trusted third parties or heuristics about past chain validation. Our solution can be implemented on Bitcoin immediately without hard-forks, and it is scalable. It allows the redaction of data from UTXOs or unconfirmed transactions that have not yet flooded the network, while guaranteeing invariance of the Bitcoin state. Thus, offending data does not need to persist in the system, not even temporarily.
A Practical Protocol for Quantum Oblivious Transfer from One-Way Functions
We present a new simulation-secure quantum oblivious transfer (QOT) protocol based on one-way functions in the plain model. With a focus on practical implementation, our protocol surpasses prior works in efficiency, promising feasible experimental realization. We address potential experimental errors and their correction, offering analytical expressions to facilitate the analysis of the required quantum resources. Technically, we achieve simulation security for QOT through an equivocal and relaxed-extractable quantum bit commitment.
Practical Zero-Knowledge PIOP for Public Key and Ciphertext Generation in (Multi-Group) Homomorphic Encryption
Homomorphic encryption (HE) is a foundational technology in privacy-enhancing cryptography, enabling non-interactive computation over encrypted data. Recently, generalized HE primitives designed for multi-party applications, such as multi-group HE (MGHE), have gained significant research interest.
While constructing secure multi-party protocols from (MG)HE in the semi-honest model is straightforward, zero-knowledge techniques are essential for ensuring security against malicious adversaries.
In this work, we design practical proof systems for MGHE to guarantee the well-formedness of public keys and ciphertexts. Specifically, we develop and optimize a polynomial interactive oracle proof (PIOP) for MGHE, which can be compiled into zk-SNARKs using a polynomial commitment scheme (PCS).
We compile our PIOP using a lattice-based PCS, and our implementation achieves a 5.5x reduction in proof size, a 70x speed-up in proof generation, and a 343x improvement in verification time compared to the previous state-of-the-art construction, PELTA (ACM CCS 2023). Additionally, our PIOPs are modular, enabling the use of alternative PCSs to optimize other aspects, such as further reducing proof sizes.
On the vector subspaces of $\mathbb{F}_{2^n}$ over which the multiplicative inverse function sums to zero
We study the behavior of the multiplicative inverse function (which plays an important role in cryptography and in the study of finite fields), with respect to a recently introduced generalization of almost perfect nonlinearity (APNness), called $k$th-order sum-freedom, that extends a classic characterization of APN functions, and has also some relationship with integral attacks. This generalization corresponds to the fact that a vectorial function $F:\mathbb F_2^n\mapsto \mathbb F_2^m$ sums to a nonzero value over every $k$-dimensional affine subspace of $\mathbb F_2^n$, for some $k\leq n$ (APNness corresponds to $k=2$). The sum of the values of the inverse function $x\in \mathbb F_{2^n}\mapsto x^{2^n-2}\in \mathbb F_{2^n}$ over any affine subspace $A$ of $\mathbb{F}_{2^n}$ not containing 0 ({\em i.e.} being not a vector space) has been addressed, thanks to a simple expression of such sum, which shows that it never vanishes. We study in the present paper the case of vector (i.e. linear) subspaces, which is much less simple to handle. The sum depends on a coefficient in subspace polynomials.
We study for which values of $k$ the multiplicative inverse function can sum to nonzero values over all $k$-dimensional vector subspaces. We show that, for every $k$ not co-prime with $n$, it sums to zero over at least one $k$-dimensional $\mathbb{F}_2$-subspace of $\mathbb{F}_{2^n}$. We study the behavior of the inverse function over direct sums of vector spaces and we deduce that the property of the inverse function to be $k$th-order sum-free happens for $k$ if and only if it happens for $n-k$. We derive several other results and we show that the set of values $k$ such that the inverse function is not $k$th-order sum-free is stable when adding two values of $k$ whose product is smaller than $n$ (and when subtracting two values under some conditions). We clarify the case of dimension at most 4 (equivalently, of co-dimension at most 4) and this allows to address, for every $n$, all small enough values of $k$ of the form $3a+4b$.
Tighter Security for Group Key Agreement in the Random Oracle Model
The Messaging Layer Security (MLS) protocol, recently standardized in RFC 9420, aims to provide efficient asynchronous group key establishment with strong security guarantees. The main component of MLS, which is the source of its important efficiency and security properties, is a protocol called TreeKEM. Given that a major vision for the MLS protocol is for it to become the new standard for messaging applications like WhatsApp, Facebook Messenger, Signal, etc., it has the potential to be used by a huge number of users. Thus, it is important to better understand the security of MLS and hence also of TreeKEM. In a previous work by Klein et. al, TreeKEM was proven adaptively secure in the Random Oracle Model (ROM) with a polynomial loss in security by proving a result about the security of an arbitrary IND-CPA secure public-key encryption scheme in a public-key version of the Generalized Selective Decryption (GSD) security game.
In this work, we prove a tighter bound for the security of TreeKEM. We follow the approach in the aforementioned work and first introduce a modified version of the public-key GSD game better suited for analyzing TreeKEM. We then provide a simple and detailed proof of security for a specific encryption scheme, the DHIES scheme (currently the only standardized scheme in MLS), in this game in the ROM and achieve a tighter bound compared to the result from Klein et. al. We also define and describe the syntax and security of TreeKEM-like schemes and state a result linking the security of TreeKEM with security in our GSD game in the ROM.
On the Black-Box Complexity of Private-Key Inner-Product Functional Encryption
We initiate the study of the black-box complexity of private-key functional encryption (FE). Of central importance in the private-key setting is the inner-product functionality, which is currently only known from assumptions that imply public-key encryption, such as Decisional Diffie-Hellman or Learning-with-Errors. As our main result, we rule out black-box constructions of private-key inner-product FE from random oracles. This implies a black-box separation between private-key inner-product FE from all symmetric-key primitives implied by random oracles (e.g., symmetric-key encryption and collision-resistant hash functions).
Proving lower bounds for private-key functional encryption schemes introduces challenges that were absent in prior works. In particular, the combinatorial techniques developed by prior works for proving black-box lower bounds are only useful in the public-key setting and predicate encryption settings, which all fail for the private-key FE case. Our work develops novel combinatorial techniques based on Fourier analysis to overcome these barriers. We expect these techniques to be widely useful in future research in this area.
On the Relationship between Public Key Primitives via Indifferentiability
Recently, Masny and Rindal [MR19] formalized a notion called Endemic Oblivious Transfer (EOT), and they proposed a generic transformation from Non-Interactive Key Exchange (NIKE) to EOT with standalone security in the random oracle (RO) model. However, from the model level, the relationship between idealized NIKE and idealized EOT and the relationship between idealized elementary public key primitives have been rarely researched.
In this work, we investigate the relationship between ideal NIKE and ideal one-round EOT, as well as the relationship between ideal public key encryption (PKE) and ideal two-round Oblivious Transfer (OT), in the indifferentiability framework proposed by Maurer et al.(MRH04). Our results are threefold: Firstly, we model ideal PKE without public key validity test, ideal one-round EOT and ideal two-round OT in the indifferentiability framework. Secondly, we show that ideal NIKE and ideal one-round EOT are equivalent, and ideal PKE without public key validity test are equivalent to ideal two-round OT. Thirdly, we show a separation between ideal two-round OT and ideal one-round EOT, which implies a separation between ideal PKE and ideal NIKE.
Randomness Bounds for Private Simultaneous Messages and Conditional Disclosure of Secrets
In cryptography, the private simultaneous messages (PSM) and conditional disclosure of secrets (CDS) are closely related fundamental primitives. We consider $k$-party PSM and CDS protocols for a function $f$ with a common random string, where each party $P_i$ generates a message and sends it to a referee $P_0$. We consider bounds for the optimal length $\rho$ of the common random string among $k$ parties (or, {\it randomness complexity}) in PSM and CDS protocols with perfect and statistical privacy through combinatorial and entropic arguments. ($i$) We provide general connections from the optimal total length $\lambda$ of the messages (or, {\it communication complexity}) to the randomness complexity $\rho$. ($ii$) We also prove randomness lower bounds in PSM and CDS protocols for general functions. ($iii$) We further prove randomness lower bounds for several important explicit functions. They contain the following results: For PSM protocols with perfect privacy, we prove $\lambda-1 \le \rho$ as the general connection. From the general lower bound, we prove $\rho\ge 2^{(k-1)n}-1$ for a general function $f:(\{0,1\}^n)^k\rightarrow\{0,1\}$ under universal reconstruction, in which $P_0$ is independent of $f$. This implies that the Feige-Killian-Naor PSM protocol for a general function [Proc.~STOC '94, pp.554--563] is optimal with respect to randomness complexity. We also provide a randomness lower bound $\rho> kn-2$ for a generalized inner product function. This implies the optimality of the $2$-party PSM protocol for the inner-product function of Liu, Vaikuntanathan, and Wee [Proc.~CRYPTO 2017, pp.758--790]. For CDS protocols with perfect privacy, we show $\rho\ge\lambda-\sigma$ as the general connection by similar argument to those for PSM protocols, where $\sigma$ is the length of secrets. We also obtain randomness lower bounds $\rho\ge (k-1)\sigma$ for XOR, AND, and generalized inner product functions. These imply the optimality of Applebaum and Arkis's $k$-party CDS protocol for a general function [Proc. TCC 2018, pp.317--344] up to a constant factor in a large $k$.
Impossible Boomerang Attacks Revisited: Applications to Deoxys-BC, Joltik-BC and SKINNY
The impossible boomerang (IB) attack was first introduced by Lu in his doctoral thesis and subsequently published at DCC in 2011. The IB attack is a variant of the impossible differential (ID) attack by incorporating the idea of the boomerang attack. In this paper, we revisit the IB attack, and introduce the incompatibility of two characteristics in boomerang to the construction of an IB distinguisher. With our methodology, all the constructions of IB distinguisher are represented in a unified manner. Moreover, we show that the related-(twea)key IB distinguishers possess more freedom than the ones of ID so that it can cover more rounds.
We also propose a new tool based on Mixed-Integer Quadratically-Constrained Programming (MIQCP) to search for IB attacks. To illustrate the power of the IB attack, we mount attacks against three tweakable block ciphers: Deoxys-BC, Joltik-BC and SKINNY. For Deoxys-BC, we propose a related-tweakey IB attack on 14-round Deoxys-BC-384, which improves the best previous related-tweakey ID attack by 2 rounds, and we improve the data complexity of the best previous related-tweakey ID attack on 10-round Deoxys-BC-256. For Joltik-BC, we propose the best attacks against 10-round Joltik-BC-128 and 14-round Joltik-BC-192 with related-tweakey IB attack. For SKINNY-n-3n, we propose a 27-round related-tweakey IB attack, which improves both the time and the memory complexities of the best previous ID attack. We also propose the first related-tweakey IB attack on 28 round SKINNY-n-3n, which improves the previous best ID attack by one round.
Unbounded Leakage-Resilient Encryption and Signatures
Given the devastating security compromises caused by side-channel attacks on existing classical systems, can we store our private data encoded as a quantum state so that they can be kept private in the face of arbitrary side-channel attacks?
The unclonable nature of quantum information allows us to build various quantum protection schemes for cryptographic information such as secret keys. Examples of quantum protection notions include copy-protection, secure leasing, and finally, unbounded leakage-resilience, which was recently introduced by Çakan, Goyal, Liu-Zhang and Ribeiro (TCC'24). Çakan et al show that secrets of various cryptographic schemes (such as cryptographic keys or secret shares) can be protected by storing them as quantum states so that they satisfy LOCC (local operation and classical communication) leakage-resilience: the scheme can tolerate any unbounded amount of adaptive leakage over unbounded rounds. As a special case (dubbed $1$-round leakage), this also means that those quantum states cannot be converted to classical strings (without completely losing their functionality).
In this work, we continue the study of unbounded/LOCC leakage-resilience and consider several new primitive. In more details, we build ciphertexts, signatures and non-interactive zero-knowledge proofs with unbounded leakage-resilience. We show the following results.
- Assuming the existence of a classical $X \in \{\text{secret-key encryption}, \text{public-key encryption}\}$ scheme, we construct an $X$ scheme with LOCC leakage-resilient ciphertexts. This guarantees that an adversary who obtains LOCC-leakage on ciphertexts cannot learn anything about their contents, even if they obtain the secret key later on.
- Assuming the existence of a classical signature scheme and indistinguishability obfuscation (iO), we construct a signature scheme with LOCC leakage-resilient signatures. This guarantees that an adversary who obtains LOCC-leakage on various signatures cannot produce any valid signatures at all other than the ones it obtained honestly!
- Assuming the existence of one-way functions and indistinguishability obfuscation (iO), we construct a NIZK proof system with LOCC leakage-resilient proofs. This guarantees that an adversary who obtains LOCC-leakage on a NIZK proof of an hard instance cannot produce a valid proof!
mUOV: Masking the Unbalanced Oil and Vinegar Digital Sigital Signature Scheme at First- and Higher-Order
The National Institute for Standards and Technology (NIST) initiated a standardization procedure for additional digital signatures and recently announced round-2 candidates for the PQ additional digital signature schemes. The multivariate digital signature scheme Unbalanced Oil and Vinegar (UOV) is one of the oldest post-quantum schemes and has been selected by NIST for Round 2. Although UOV is mathematically secure, several side-channel attacks (SCA) have been shown on the UOV or UOV-based digital signatures. We carefully analyze the sensitivity of variables and operations in the UOV scheme from the side-channel perspective and show which require protection.
To mitigate implementation-based SCA, we integrate a provably secure arbitrary-order masking technique with the key generation and signature generation algorithms of UOV. We propose efficient techniques for the masked dot-product and matrix-vector operations, which are both critical in multivariate DS schemes. We also implemented and demonstrate the practical feasibility of our masking algorithms for UOV-Ip on the ARM Cortex-M4 microcontroller. Our first-order masked UOV implementations have $2.7\times$ and $3.6\times$ performance overhead compared to the unmasked scheme for key generation and signature generation algorithms. Our first-order masked UOV implementations use $1.3\times$ and $1.9\times$ stack memory rather than the unmasked version of the key generation and signature generation algorithms.
Somewhat Homomorphic Encryption from Linear Homomorphism and Sparse LPN
We construct somewhat homomorphic encryption schemes from the learning sparse parities with noise (sparse LPN) problem, along with an assumption that implies linearly homomorphic encryption (e.g., the decisional Diffie-Hellman or decisional composite residuosity assumptions). Our resulting schemes support an a-priori bounded number of homomorphic operations: $O(\log \lambda/\log \log \lambda)$ multiplications followed by poly($\lambda$) additions, where $\lambda \in \mathbb{N}$ is a security parameter. These schemes have compact ciphertexts: after homomorphic evaluation, the bit-length of each ciphertext is a fixed polynomial in the security parameter $\lambda$, independent of the number of homomorphic operations applied to it. This gives the first somewhat homomorphic encryption schemes that can evaluate the class of bounded-degree polynomials with a bounded number of monomials without relying on lattice assumptions or bilinear maps.
Much like in the Gentry-Sahai-Waters fully homomorphic encryption scheme, ciphertexts in our scheme are matrices, homomorphic addition is matrix addition, and homomorphic multiplication is matrix multiplication. Moreover, when encrypting many messages at once and performing many homomorphic evaluations at once, the bit-length of ciphertexts in some of our schemes (before and after homomorphic evaluation) can be arbitrarily close to the bit-length of the plaintexts. The main limitation of our schemes is that they require a large evaluation key, whose size scales with the complexity of the homomorphic computation performed, though this key can be re-used across any polynomial number of encryptions and evaluations.
Distributed Differential Privacy via Shuffling vs Aggregation: a Curious Study
How to achieve distributed differential privacy (DP) without a trusted central party is of great interest in both theory and practice. Recently, the shuffle model has attracted much attention. Unlike the local DP model in which the users send randomized data directly to the data collector/analyzer, in the shuffle model an intermediate untrusted shuffler is introduced to randomly permute the data, which have already been randomized by the users, before they reach the analyzer. The most appealing aspect is that while shuffling does not explicitly add more noise to the data, it can make privacy better. The privacy amplification effect in consequence means the users need to add less noise to the data than in the local DP model, but can achieve the same level of differential privacy. Thus, protocols in the shuffle model can provide better accuracy than those in the local DP model. What looks interesting to us is that the architecture of the shuffle model is similar to private aggregation, which has been studied for more than a decade. In private aggregation, locally randomized user data are aggregated by an intermediate untrusted aggregator. Thus, our question is whether aggregation also exhibits some sort of privacy amplification effect? And if so, how good is this ``aggregation model'' in comparison with the shuffle model. We conducted the first comparative study between the two, covering privacy amplification, functionalities, protocol accuracy, and practicality. The results as yet suggest that the new shuffle model does not have obvious advantages over the old aggregation model. On the contrary, protocols in the aggregation model outperform those in the shuffle model, sometimes significantly, in many aspects.
The Learning Stabilizers with Noise problem
Random classical codes have good error correcting properties, and yet they are notoriously hard to decode in practice. Despite many decades of extensive study, the fastest known algorithms still run in exponential time. The Learning Parity with Noise (LPN) problem, which can be seen as the task of decoding a random linear code in the presence of noise, has thus emerged as a prominent hardness assumption with numerous applications in both cryptography and learning theory.
Is there a natural quantum analog of the LPN problem? In this work, we introduce the Learning Stabilizers with Noise (LSN) problem, the task of decoding a random stabilizer code in the presence of local depolarizing noise. We give both polynomial-time and exponential-time quantum algorithms for solving LSN in various depolarizing noise regimes, ranging from extremely low noise, to low constant noise rates, and even higher noise rates up to a threshold. Next, we provide concrete evidence that LSN is hard. First, we show that LSN includes LPN as a special case, which suggests that it is at least as hard as its classical counterpart. Second, we prove a worst-case to average-case reduction for variants of LSN. We then ask: what is the computational complexity of solving LSN? Because the task features quantum inputs, its complexity cannot be characterized by traditional complexity classes. Instead, we show that the LSN problem lies in a recently introduced (distributional and oracle) unitary synthesis class. Finally, we identify several applications of our LSN assumption, ranging from the construction of quantum bit commitment schemes to the computational limitations of learning from quantum data.
Masking Gaussian Elimination at Arbitrary Order, with Application to Multivariate- and Code-Based PQC
Digital signature schemes based on multivariate- and code-based hard problems are promising alternatives for lattice-based signature schemes, due to their smaller signature size. Hence, several candidates in the ongoing additional standardization for quantum secure digital signature (DS) schemes by the National Institute of Standards and Technology (NIST) rely on such alternate hard problems. Gaussian Elimination (GE) is a critical component in the signing procedure of these schemes. In this paper, we provide a masking scheme for GE with back substitution to defend against first- and higher-order attacks. To the best of our knowledge, this work is the first to analyze and propose masking techniques for multivariate- or code-based DS algorithms.
We propose a masked algorithm for transforming a system of linear equations into row-echelon form. This is realized by introducing techniques for efficiently making leading (pivot) elements one while avoiding costly conversions between Boolean and multiplicative masking at all orders. We also propose a technique for efficient masked back substitution, which eventually enables a secure unmasking of the public output. We evaluate the overhead of our countermeasure for several post-quantum candidates and their different security levels at first-, second-, and third-order, including UOV, MAYO, SNOVA, QR-UOV, and MQ-Sign. Notably, the operational cost of first-, second-, and third-order masked GE is 2.3$\times$ higher, and the randomness cost is 1.2$\times$ higher in MAYO compared to UOV for security levels III and V. In contrast, these costs are similar in UOV and MAYO for one version of level I. We also show detailed performance results for masked GE implementations for all three security versions of UOV on the Arm Cortex-M4 and compare them with unmasked results. Our first-order implementations targeting UOV parameters have overheads of factor 6.5$\times$, 5.9$\times$, and 5.7$\times$ compared to the unprotected implementation for NIST security level I, III, and V.
Multi-Holder Anonymous Credentials from BBS Signatures
The eIDAS 2.0 regulation aims to develop interoperable digital identities for European citizens, and it has recently become law. One of its requirements is that credentials be unlinkable. Anonymous credentials (AC) allow holders to prove statements about their identity in a way that does not require to reveal their identity and does not enable linking different usages of the same credential. As a result, they are likely to become the technology that provides digital identity for Europeans.
Any digital credential system, including anonymous credentials, needs to be secured against identity theft and fraud. In this work, we introduce the notion of a multi-holder anonymous credential scheme that allows issuing shares of credentials to different authentication factors (or ``holders''). To present the credential, the user's authentication factors jointly run a threshold presentation protocol. Our definition of security requires that the scheme provide unforgeability: the adversary cannot succeed in presenting a credential with identity attributes that do not correspond to an identity for which the adversary controls at least $t$ shares; this is true even if the adversary can obtain credentials of its choice and cause concurrent executions of the presentation protocol. Further, our definition requires that the presentation protocol provide security with identifiable abort. Finally, presentations generated by all honest holders must be unlinkable and must not reveal the user's secret identity attributes even to an adversary that controls some of the user's authentication factors.
We design and prove the (concurrent) security of a multi-holder version of the BBS anonymous credential scheme. In our construction, each holder is issued a secret share of a BBS credential.
Using these shares, the holders jointly compute a credential presentation that is identical to (and therefore compatible with) the traditional, single-holder variant (due to Tessaro and Zhu, Eurocrypt'23) of a BBS credential presentation.
$\mathsf{Cirrus}$: Performant and Accountable Distributed SNARK
As Succinct Non-interactive Arguments of Knowledge (SNARKs) gain traction for large-scale applications, distributed proof generation is a promising technique to horizontally scale up the performance. In such protocols, the workload to generate SNARK proofs is distributed among a set of workers, potentially with the help of a coordinator. Desiderata include linear worker time (in the size of their sub-tasks), low coordination overhead, low communication complexity, and accountability (the coordinator can identify malicious workers). State-of-the-art schemes, however, do not achieve these properties.
In this paper, we introduced $\mathsf{Cirrus}$, the first accountable distributed proof generation protocol with linear computation complexity for all parties. $\mathsf{Cirrus}$ is based on HyperPlonk (EUROCRYPT'23) and therefore supports a universal trusted setup.
$\mathsf{Cirrus}$ is horizontally scalable: proving statements about a circuit of size $O(MT)$ takes $O(T)$ time with $M$ workers. The per-machine communication cost of $\mathsf{Cirrus}$ is low, which is only logarithmic in the size of each sub-circuit. $\mathsf{Cirrus}$ is also accountable, and the verification overhead of the coordinator is efficient. We further devised a load balancing technique to make the workload of the coordinator independent of the size of each sub-circuit.
We implemented an end-to-end prototype of $\mathsf{Cirrus}$ and evaluated its performance on modestly powerful machines. Our results confirm the horizontal scalability of $\mathsf{Cirrus}$, and the proof generation time for circuits with $2^{25}$ gates is roughly $40$s using $32$ $8$-core machines. We also compared $\mathsf{Cirrus}$ with Hekaton (CCS'24), and $\mathsf{Cirrus}$ is faster when proving PLONK-friendly circuits such as Pedersen hash.
ColliderScript: Covenants in Bitcoin via 160-bit hash collisions
We introduce a method for enforcing covenants on Bitcoin outputs without requiring any changes to Bitcoin by designing a hash collision based equivalence check which bridges Bitcoin's limited Big Script to Bitcoin's Small Script. This allows us evaluate the signature of the spending transaction (available only to Big Script) in Small Script. As Small Script enables arbitrary computations, we can introspect into the spending transaction and enforce covenants on it.
Our approach leverages finding collisions in the $160$-bit hash functions: SHA-1 and RIPEMD-160. By the birthday bound this should cost $\sim2^{80}$ work. Each spend of our covenant costs $\sim2^{86}$ hash queries and $\sim2^{56}$ bytes of space. For security, we rely on an assumption regarding the hardness of finding a $3$-way collision (with short random inputs) in $160$-bit hash functions, arguing that if the assumption holds, breaking covenant enforcement requires $\sim2^{110}$ hash queries. To put this in perspective, the work to spend our covenant is $\sim33$ hours of the Bitcoin mining network, whereas breaking our covenant requires $\sim 450,000$ years of the Bitcoin mining network.
We believe there are multiple directions of future work that can significantly improve these numbers.
Evaluating covenants and our equivalence check requires performing many operations in Small Script, which must take no more than $4$ megabytes in total size, as Bitcoin does not allow transactions greater than $4$ megabytes. We only provide rough estimates of the transaction size because, as of this writing, no Small Script implementations of the hash functions required, SHA-1 and RIPEMD-160, have been written.
Amigo: Secure Group Mesh Messaging in Realistic Protest Settings
In large-scale protests, a repressive government will often disable the Internet to thwart communication between protesters. Smartphone mesh networks, which route messages over short range, possibly ephemeral, radio connections between nearby phones, allow protesters to communicate without relying on centralized Internet infrastructure. Unfortunately, prior work on mesh networks does not efficiently support cryptographically secure group messaging (a crucial requirement for protests); prior networks were also evaluated in unrealistically benign network environments which fail to accurately capture the link churn and physical spectrum contention found in realistic protest environments. In this paper, we introduce Amigo, an anonymous mesh messaging system which supports group communication through continuous key agreement, and forwards messages using a novel routing protocol designed to handle the challenges of ad-hoc routing scenarios. Our extensive simulations reveal the poor scalability of prior approaches, the benefits of Amigo's protest-specific optimizations, and the challenges that still must be solved to scale secure mesh networks to protests with thousands of participants.
Field-Agnostic SNARKs from Expand-Accumulate Codes
Efficient realizations of succinct non-interactive arguments of knowledge (SNARKs) have gained popularity due to their practical applications in various domains. Among existing schemes, those based on error-correcting codes are of particular interest because of their good concrete efficiency, transparent setup, and plausible post-quantum security. However, many existing code-based SNARKs suffer from the
disadvantage that they only work over specific finite fields.
In this work, we construct a code-based SNARK that does not rely on any specific underlying field; i.e., it is field-agnostic. Our construction follows the framework of Brakedown (CRYPTO '23) and builds a polynomial commitment scheme (and hence a SNARK) based on recently introduced expand-accumulate codes. Our work generalizes these codes to arbitrary finite fields; our main technical contribution is showing that, with high probability, these codes have constant rate and constant relative distance (crucial properties for building efficient SNARKs), solving an open problem from prior work.
As a result of our work we obtain a SNARK where, for a statement of size $M$ , the prover time is $O(M \log M )$ and the proof size is $O(\sqrt{M} )$. We demonstrate the concrete efficiency of our scheme empirically via experiments. Proving ECDSA verification on the secp256k1 curve requires only 0.23s for proof generation, 2 orders of magnitude faster than SNARKs that are not field-agnostic. Compared to the original Brakedown result (which is also field-agnostic), we obtain proofs that are 1.9–2.8$\times$ smaller due to the good concrete distance of our underlying error-correcting code, while introducing only a small overhead of 1.2$\times$ in the prover time.
Towards Practical Oblivious Map
Oblivious map (OMAP) is an important component in encrypted databases, utilized to safeguard against the server inferring sensitive information about client's encrypted key-value stores based on access patterns. Despite its widespread usage and importance, existing OMAP solutions face practical challenges, including the need for a large number of interaction rounds between the client and server, as well as the substantial communication bandwidth requirements. For example, the state-of-the-art protocol named OMIX++ in VLDB 2024 still requires $O(\log{n})$ interaction rounds and $O(\log^2{n})$ communication bandwidth per access, where $n$ denotes the total number of key-value pairs stored.
In this work, we introduce more practical and efficient OMAP constructions. Consistent with all prior OMAPs, our constructions also adapt only the tree-based Oblivious RAM (ORAM) and oblivious data structures (ODS) to achieve OMAP for enhanced practicality. In complexity, our approach needs $O(\log{n}/\log{\log{n})+O(\log{\lambda})}$ interaction rounds and $O(\log^2{n}/\log{\log{n}})+O(\log{\lambda}\log{n})$ communication bandwidth per data access where $\lambda$ is the security parameter. This new complexity results from our two main contributions. First, unlike prior works that rely solely on search trees, we design a novel framework for OMAP that combines hash table with search trees. Second, we propose a more efficient tree-based ORAM named DAORAM, which is of significant independent interest. This newly developed ORAM noticeably accelerates our constructions as it supports obliviously accessing hash tables much more efficiently. We implement both our proposed constructions and prior methods to experimentally demonstrate that our constructions substantially outperform prior methods in terms of efficiency.
Constructions of self-orthogonal codes and LCD codes from functions over finite fields
The construction of self-orthogonal codes from functions over finite fields has been widely studied in the literature. In this paper, we construct new families of self-orthogonal linear codes with few weights from trace functions and weakly regular plateaued functions over the finite fields of odd characteristics. We determine all parameters of the constructed self-orthogonal codes and their dual codes. Moreover, we employ the constructed $p$-ary self-orthogonal codes to construct $p$-ary LCD codes.
A Closer Look at Falcon
Falcon is a winner of NIST's six-year post-quantum cryptography standardisation competition. Based on the celebrated full-domain-hash framework of Gentry, Peikert and Vaikuntanathan (GPV) (STOC'08), Falcon leverages NTRU lattices to achieve the most compact signatures among lattice-based schemes.
Its security hinges on a Rényi divergence-based argument for Gaussian samplers, a core element of the scheme. However, the GPV proof, which uses statistical distance to argue closeness of distributions, fails when applied naively to Falcon due to parameter choices resulting in statistical distances as large as $2^{-34}$. Additional implementation-driven deviations from the GPV framework further invalidate the original proof, leaving Falcon without a security proof despite its selection for standardisation.
This work takes a closer look at Falcon and demonstrates that introducing a few minor, conservative modifications allows for the first formal proof of the scheme in the random oracle model. At the heart of our analysis lies an adaptation of the GPV framework to work with the Rényi divergence, along with an optimised method for parameter selection under this measure. Furthermore, we obtain a provable version of the GPV framework over NTRU rings. Both these tools may be of independent interest.
Unfortunately, our analysis shows that despite our modification of Falcon-512 and Falcon-1024 we do not achieve strong unforgeability for either scheme. For plain unforgeability we are able to show that our modifications to Falcon-512 barely satisfy the claimed 120-bit security target and for Falcon-1024 we confirm the claimed security level. As such we recommend revisiting falcon and its parameters.
Concretely Efficient Input Transformation Based Zero-Knowledge Argument System for Arbitrary Circuits
We introduce a new transparent zero-knowledge argument system based on the direct computation concept described in this paper. Our protocol converts input parameters into a format that any circuit can process directly. Once the circuit output can be computed using transformed inputs, the output of the polynomial a circuit generates can also be correctly computed by the verifier, allowing us to reduce the polynomial evaluation cost significantly.
In the default setting, the prover runtime cost for group exponentiation operations is only the square root of the degree ($\sqrt{p_d}$) of the polynomial the circuit generates. Furthermore, leveraging the ``merging through addition" and ``bootstrapping with breakers" techniques, the size of the polynomial our protocol generates can be much smaller than the total number of multiplicative operations. Our benchmark result shows our approach can significantly improve both prover runtime and verifier runtime performance over state-of-the-art by almost or over one order of magnitude while keeping the communication cost comparable with that of the state-of-the-art.
Our approach also allows our protocol to be made memory-efficient while being non-interactive. The theoretical memory cost of our protocol is $O(b)$, where $b = \sqrt{p_d}$ in the default setting. Setting the bootstrapping parameter ($b$) aggressively will result in better prover runtime performance at the expense of the higher communication cost.