DATE 2024 Accepted Papers
Congratulations to the authors and co-authors of the following regular papers and extended abstracts on their acceptance at DATE 2024! We look forward to meeting you in Valencia at DATE 2024 from 25 to 27 March 2024.
This is an early notification of acceptance! Further information will be sent to the authors by Tuesday, 14 November 2023 AoE at the latest.
We consider you and your co-authors committed to presenting the paper at the conference in Valencia. We reserve the right to remove the paper from the proceedings if none of the authors/co-authors registers and presents the paper at the conference.
Regular Papers
7 | Accelerating Machine Learning-Based Memristor Compact Modeling Using Sparse Gaussian Process |
9 | PIMSYN: Synthesizing Processing-in-memory CNN Accelerators |
21 | Algorithm-hardware co-design for Energy-Efficient A/D conversion in ReRAM-based accelerators |
22 | AutoWS: Automate Weights Streaming in Layer-wise Pipelined DNN Accelerators |
24 | ViTA: A Highly Efficient Dataflow and Architecture for Vision Transformers |
27 | Efficient Design of a Hyperdimensional Processing Unit for Multi-Layer Cognition |
31 | tSS-BO: Scalable Bayesian Optimization for Analog Circuit Sizing via Truncated Subspace Sampling |
33 | CPF: A Cross-Layer Prefetching Framework for High-Density Flash-based Storage |
34 | BESWAC: Boosting Exact Synthesis via Wiser SAT Solver Call |
37 | Computational and Storage Efficient Quadratic Neurons for Deep Neural Networks |
38 | OISA: Architecting an Optical In-Sensor Accelerator for Efficient Visual Computing |
45 | CRISP: Hybrid Structured Sparsity for Class-aware Model Pruning |
53 | Three Sidekicks to Support Spectre Countermeasures |
55 | A Modular Branch Predictor Performance Analysis Framework for Fast Design Space Exploration |
57 | DRAM-Locker: A General-Purpose DRAM Protection Mechanism against Adversarial DNN Weight Attacks |
61 | LaVA: An Effective Layer Variation Aware Bad Block Management for 3D CT NAND Flash |
62 | CTRL-B: Back-End-Of-Line Configuration Pathfinding using Cross-Technology Transferable Reinforcement Learning |
64 | Adaptive DRAM Cache Division for Computational Solid-state Drives |
68 | PACE: A Piece-Wise Approximate and Configurable Floating-Point Divider for Energy-Efficient Computing |
71 | ASCEND: Accurate yet Efficient End-to-End Stochastic Computing Acceleration of Vision Transformer |
72 | Programmable EM Sensor Array for Golden-Model Free Run-time Trojan Detection and Localization |
80 | High-Performance Data Mapping for BNNs on PCM-based Integrated Photonics |
84 | Optimal Fixed Priority Scheduling in Multi-Stage Multi-Resource Distributed Real-Time Systems |
89 | GPACE: An Energy-Efficient PQ-based GCN Accelerator with Redundancy Reduction |
95 | Deductive Formal Verification of Synthesizable, Transaction-level Hardware Designs Using Coq |
101 | Standard Cells Do Matter: Uncovering Hidden Connections for High-Quality Macro Placement |
103 | PhotonNTT: Energy-efficient Parallel Photonic Number Theoretic Transform Accelerator |
105 | IMCE: An In-Memory Computing and Encrypting Hardware Architecture for Robust Edge Security |
108 | ECM: Improving IoT Throughput with Energy-Aware Connection Management |
109 | An Autonomic Resource Allocating SSD |
113 | CBTune: Contextual Bandit Tuning for Logic Synthesis |
115 | Self-Learning and Transfer across Topologies of Constraints for Analog / Mixed-Signal Circuit Layout Synthesis |
116 | Formal Verification of Booth Radix-8 and Radix-16 Multipliers |
122 | EvilCS: An Evaluation of Information Leakage through Context Switching on Security Enclaves |
124 | MACO: Exploring GEMM Acceleration on a Loosely-Coupled Multi-core Processor |
135 | A Deep-learning-based Statistical Timing Prediction Method for Sub-16nm Technologies |
140 | X-PIM: Fast Modeling and Validation Framework for Mixed-Signal Processing-in-Memory Using Compressed Equivalent Model in SystemVerilog |
142 | HW-SW Optimization of DNNs for Privacy-preserving People Counting on Low-resolution Infrared Arrays |
154 | Statistical Profiling of Micro-Architectural Traces and Machine Learning For Spectre Detection: A Systematic Evaluation |
155 | Unveiling the Black-Box: Leveraging Explainable AI for FPGA Design Space Optimization |
160 | Efficient Fast Additive Homomorphic Encryption Cryptoprocessor for Privacy-preserving Federated Learning Aggregation |
161 | ViT-ToGo: Vision Transformer Accelerator with Grouped Token Pruning |
164 | Heterogeneous Static Timing Analysis with Advanced Delay Calculator |
168 | ROLDEF: RObust Layered DEFense for Intrusion Detection Against Adversarial Attacks |
175 | ISDC: Feedback-guided Iterative SDC Scheduling for High-level Synthesis |
183 | AttBind: Memory-efficient Acceleration for Long-range Attention using Vector-derived Symbolic Binding |
184 | Resource-efficient Heterogenous Federated Continual Learning on Edge |
189 | An Efficient Asynchronous Circuits Design Flow with Backward Delay Propagation Constraint |
196 | FusionArch: A Fusion-Based Accelerator for Point-Based Point Cloud Neural Networks |
204 | LLM-based Processor Verification: A Case Study for Neuromorphic Processor |
206 | DyPIM: Dynamic-inference-enabled Processing-In-Memory Accelerator |
217 | Cuper: Customized Dataflow and Perceptual Decoding for Sparse Matrix-Vector Multiplication on HBM-equipped FPGAs |
218 | Towards Reliable and Energy-Efficient RRAM based Discrete Fourier Transform Accelerator |
223 | Communication-Efficient Model Parallelism for Distributed In-situ Transformer Inference |
229 | CRONuS: Circuit Rapid Optimization with Neural Simulator |
243 | A Transistor Level Relational Semantics for Electrical Rule Checking by SMT Solving |
245 | COMET: A Cross-Layer Optimized Optical Phase Change Main Memory Architecture |
251 | FlexForge: Efficient Reconfigurable Cloud Acceleration via Peripheral Resource Disaggregation |
255 | On Gate Flip Errors in Computing-In-Memory |
256 | PathDriver-Wash: A Path-Driven Wash Optimization Method for Continuous-Flow Lab-on-a-Chip Systems |
258 | NVCA: A Computationally Efficient Neural Video Compression Accelerator Based on a Sparse CNN-Transformer Hybrid Network |
260 | CALLOC: Curriculum Adversarial Learning for Secure and Robust Indoor Localization |
268 | Discovering Efficient Fused Layer Configurations for Executing Multi-Workloads on Multi-core NPUs |
274 | HyQA: Hybrid Near-Data Processing Platform for Embedding based Question Answering System |
277 | DIAPASON: Differentiable Allocation, Partitioning and Fusion of Neural Networks for Distributed Inference |
278 | TSA-TICER: A Two-Stage TICER Acceleration Framework for Model Order Reduction |
284 | EcoFlex-HDP: High-Speed and Low-Power and Programmable Hyperdimensional-Computing Platform with CPU Co-processing |
293 | ESC-NTT: An Elastic, Seamless and Compact Architecture for Multi-Parameter NTT Acceleration |
300 | Multi-Level Analysis of GPU Utilization in ML Training Workloads |
304 | Electrostatics-Based Analytical Global Placement for Timing Optimization |
308 | A3PIM: An Automated, Analytic and Accurate Processing-in-Memory Offloader |
314 | NOVA: NoC-based Vector Unit for Mapping Attention Layers on a CNN Accelerator |
326 | A Semi-Tensor Product based Circuit Simulation for SAT-sweeping |
329 | A Golden-Free Formal Method for Trojan Detection in Non-Interfering Accelerators |
331 | Optimizing Ciphertext Management for Faster Fully Homomorphic Encryption Computation |
334 | A Concealable RRAM Physically Unclonable Function Compatible with In-Memory Computing |
345 | Decentralized Federated Learning in Partially Connected Networks with Non-IID Data |
353 | Class-Aware Pruning for Efficient Neural Networks |
360 | Trace-enabled timing model synthesis for ROS2-based autonomous applications |
365 | Adaptive ODE Solvers for Timed Data Flow Models in SystemC-AMS |
368 | Sava: A Spatial- and Value-Aware Accelerator for Point Cloud Transformer |
376 | Sparrow: Flexible Memory Deduplication in Android Systems with Similar-Page Awareness |
377 | MX: Enhancing RISC-V's Vector ISA for Ultra-Low Overhead, Energy-Efficient Matrix Multiplication |
381 | Near-Memory Parallel Indexing and Coalescing: Enabling Highly Efficient Indirect Access for SpMV |
382 | IndexMAC: A Custom RISC-V Vector Instruction to Accelerate Structured-Sparse Matrix Multiplications |
384 | A Stochastic Rounding-Enabled Low-Precision Floating-Point MAC for DNN Training |
387 | ONE SA: Enabling Nonlinear Operations in Systolic Arrays For Efficient and Flexible Neural Network Inference |
388 | DiMO-Sparse: Differentiable Modeling and Optimization of Sparse CNN Dataflow and Hardware Architecture |
392 | A RISC-V "V" VP: Unlocking Vector Processing for Evaluation at the System Level |
395 | Real-Time Multi-Person Identification and Tracking via HPE and IMU Data Fusion |
399 | ARCTIC: Agile and Robust Compute-In-Memory Compiler with Parameterized INT/FP Precision and Built-In Self Test |
401 | PELS: A Lightweight and Flexible Peripheral Event Linking System for Ultra-Low Power IoT Processors |
402 | An Isotropic Shift-Pointwise Network for Crossbar-Efficient Neural Network Design |
406 | CLSA-CIM: A Cross-Layer Scheduling Approach for Computing-in-Memory Architectures |
410 | CASCO: Cascaded Co-Optimization for Holistic Neural Network Acceleration |
417 | Quantum State Preparation Using an Exact CNOT Synthesis Formulation |
418 | Model-Driven Feature Engineering for Data-Driven Battery SOH Model |
420 | Dynamic Reconfigurable Security Cells based on Emerging Devices Integrable in FDSOI Technology |
432 | FARe: Fault-Aware GNN Training on ReRAM-based PIM Accelerators |
434 | A Novel March Test Algorithm for Testing 8T SRAM-based IMC Architectures |
438 | Analog Transistor Placement Optimization Considering Non-Linear Spatial Variation |
439 | DIAC: Design Exploration of Intermittent-Aware Computing Realizing Batteryless Systems |
442 | Learning Assisted Post-Manufacture Testing and Tuning of RRAM-Based DNNs for Yield Recovery |
444 | A Data-Driven Analog Circuit Synthesizer with Automatic Topology Selection and Sizing |
451 | Attention-Based EDA Tool Parameter Explorer: From Hybrid Parameters to Multi-QoR metrics |
462 | BusyMap, an Efficient Data Structure to Observe Interconnect Contention in SystemC TLM-2.0 |
467 | A Parallel Tempering Processing Architecture with Multi-Spin Update for Fully-Connected Ising Models |
468 | Gradient Boosting-accelerated Evolution for Multiple-Fault Diagnosis |
469 | DeepFrack: A Comprehensive Framework for Layer Fusion, Face Tiling, and Efficient Mapping in DNN Hardware Accelerators |
475 | Towards Scalable GPU System with Silicon Photonic Chiplet |
476 | PA-2SBF: Pattern-Adaptive Two-Stage Bloom Filter for Run-time Memory Diagnostic Data Compression in Automotive SoCs |
477 | Hardware-Assisted Control-Flow Integrity Enhancement for IoT Devices |
479 | CafeHD: A Charge-Domain FeFET-Based Compute-in-Memory Hyperdimensional Encoder with Hypervector Merging |
485 | Approximation Algorithm for Noisy Quantum Circuit Simulation |
488 | PURSE: Property Ordering Using Runtime Statistics for Efficient Multi-Property Verification |
490 | SpecHD: Hyperdimensional Computing Framework for FPGA-based Mass Spectrometry Clustering |
491 | Anonymous: Hierarchical Analog and Mixed Signal Routing Considering Versatile Routing Scenarios |
493 | AXI-REALM: A Lightweight and Modular Interconnect Extension for Traffic Regulation and Monitoring of Heterogeneous Real-Time SoCs |
494 | Cache Bandwidth Contention Leaks Secrets |
500 | OplixNet: Towards Area-Efficient Optical Split-Complex Networks with Real-to-Complex Data Assignment and Knowledge Distillation |
503 | WideSA: A High Array Utilization Mapping Scheme for Uniform Recurrences on ACAP |
504 | Fast IR-Drop Prediction of Analog Circuits Using Recurrent Synchronized GCN and Y-Net Model |
513 | Design Automation for Organs-on-Chip |
515 | S-LGCN: A Software–Hardware Co-design for Accelerating LightGCN |
518 | Standard Cell Layout Generator Amenable to Design Technology Co-Optimization in Advanced Process Nodes |
521 | BOXGB: Design Parameter Optimization with Systematic Integration of Bayesian Optimization and XGBoost |
532 | TroScan: Enhancing On-Chip Delivery Resilience to Physical Attack through Frequency-Triggered Key Generation |
543 | A Graph-learning-driven Prediction Method for Combined Electromigration and Thermomigration Stress on Multi-Segment Interconnects |
551 | RVCE-FAL: A RISC-V Vector-Scalar Custom Extension for Faster FALCON Digital Signature |
553 | LoADM: Load-aware Directory Migration Policy in Distributed File Systems |
554 | ARTmine: Automatic Association Rule Mining with Temporal Behavior for Hardware Verification |
559 | Parallel Multi-objective Bayesian Optimization Framework for CGRA Microarchitecture |
561 | DAISM: Digital Approximate In-SRAM Multiplier-based Accelerator for DNN Training and Inference |
563 | AFPR-CIM: An Analog-Domain Floating-Point RRAM-based Compute-In-Memory Architecture with Dynamic Range Adaptive FP-ADC |
565 | Miracle: Multi-Action Reinforcement Learning-Based Chip Floorplanning Reasoner |
581 | Towards High-throughput Neural Network Inference with Computational BRAM on Nonvolatile FPGAs |
590 | Parallel Grobner Basis Rewriting and Memory Optimization for Efficient Multiplier Verification |
591 | MATAR: Multi-Quantization-Aware Training for Accurate and Fast Hardware Retargeting |
599 | MSH: A Multi-Stage HiZ-Aware Homotopy Framework for Nonlinear DC Analysis |
600 | Embedding Hardware Approximations in Discrete Genetic-based Training for Printed MLPs |
602 | BORE: Energy-Efficient Banded Vector Similarity Search with Optimized Range Encoding for Memory-Augmented Neural Network |
603 | Circuits Physics Constrained Predictor of Static IR Drop with Limited Data |
604 | Efficient Spectral-Aware Power Supply Noise Analysis for Low-Power Design Verification |
608 | Bitstream Fault Injection Attacks on CRYSTALS Kyber Implementations on FPGAs |
609 | A Multi-bit Near-RRAM based Computing Macro with Highly Computing Parallelism for CNN Application |
615 | SpecScope: Automating Discovery of Exploitable Spectre Gadgets on Black-box Microarchitectures |
622 | CycPUF: Cyclic Physical Unclonable Function |
632 | PP-HDC: A Privacy-Preserving Inference Framework For Hyperdimensional Computing |
638 | PoLM: Point Cloud and Large Pre-trained Model Catch Mixed-type Wafer Defect Pattern Recognition |
640 | Device-Aware Diagnosis for RRAM Unique Defects |
641 | Guided Fault Injection Strategy for Rapid Critical Bit Detection in Radiation-Prone SRAM-FPGA |
645 | SELCC: Enhancing MLC Reliability and Endurance with Single Cell Error Correction Codes |
650 | High Throughput Hardware Accelerated CoreSight Trace Decoding |
653 | Alleviating Barren Plateaus in Parameterized Quantum Machine Learning Circuits: Investigating Advanced Parameter Initialization Strategies |
659 | Scalable Logic Rewriting Using Don't Cares |
662 | Bit-Trimmer: Ineffectual Bit-operation Removal for CIM Architecture |
664 | Efficient Exploration of Cyber-Physical System Architectures Using Contracts and Subgraph Isomorphism |
669 | LRSCwait: Enabling Scalable and Efficient Synchronization in Manycore Systems through Polling-Free and Retry-Free Operation |
671 | MultimodalHD: Federated Learning Over Heterogeneous Sensor Modalities using Hyperdimensional Computing |
690 | SCGen: A Versatile Generator Framework for Agile Design of Stochastic Circuits |
691 | SuperFlow: A Fully-Customized RTL-to-GDS Design Automation Flow for Adiabatic Quantum-Flux-Parametron Superconducting Circuits |
693 | Fault-Tolerant Cyclic Queuing and Forwarding in Time-Sensitive Networking |
705 | Para-ZNS: Improving Small-zone ZNS SSDs Parallelism through Dynamic Zone Mapping |
714 | uHD: Unary Processing for Lightweight and Dynamic Hyperdimensional Computing |
718 | Hierarchical Source-to-Post-Route QoR Prediction in High-Level Synthesis with GNNs |
732 | Modeling Attack Tests and Security Enhancement of the Sub-threshold Voltage Divider Array PUF |
734 | IOMMU Deferred Invalidation Vulnerability: Exploit and Defense |
736 | Reinforcement Learning-Based Optimization of Back-side Power Delivery Networks in VLSI Design for IR-drop Reduction |
742 | Accelerating Chaining in Genomic Analysis Using RISC-V Custom Instructions |
751 | Flush+earlyReload: Covert Channels Attack on Shared LLC Using MSHR Merging |
753 | BlockAMC: Scalable In-Memory Analog Matrix Computing for Solving Linear Systems |
754 | Reliable Interval Prediction of Minimum Operating Voltage Based on On-chip Monitors via Conformalized Quantile Regression |
756 | Formal Verification of Secure Boot Process |
757 | Towards Efficient Reconfiguration through Lightweight Input Inversion for MLC NVFPGAs |
761 | JPlace: A Clock-Aware Length-Matching Placement for Rapid Single-Flux-Quantum Circuits |
764 | KRATT: QBF-Assisted Removal and Structural Analysis Attack Against Logic Locking |
772 | Pipette: Automatic Fine-grained Large Language Model Training Configurator for Real-World Clusters |
776 | Analog Printed Spiking Neuromorphic Circuit |
778 | SEA: Sign-Separated Accumulation Scheme for Resource-Efficient DNN Accelerators |
780 | STAR: Sum-Together/Apart Reconfigurable Multipliers for Precision-Scalable ML Workloads |
793 | A Read Latency Variation Aware Independent Read Scheme for QLC SSDs |
795 | Low Power and Temperature-Resilient Compute-In-Memory Based on Subthreshold-FeFET |
800 | Optimizing Imperfectly-Nested Loop Mapping on CGRAs via Polyhedral-Guided Flattening |
804 | Reconfigurable Frequency Multipliers Based on Complementary Ferroelectric Transistors |
805 | Dynamic Realization of Multiple Control Toffoli Gate |
810 | FeReX: A Reconfigurable Design of Multi-bit Ferroelectric Compute-in-Memory for Nearest Neighbor Search |
811 | 12 mJ per Class On-Device Online Few-Shot Class-Incremental Learning |
823 | Performance Analysis and Optimizations of Matrix Multiplications on ARMv8 Processors |
825 | FMTT: Fused Multi-head Transformer with Tensor-compression for 3D Point Clouds Detection on Edge Devices |
829 | An Agile Deploying Approach for Large-Scale Workloads on CGRA-CPU Architecture |
834 | Technology/Algorithm Co-design for Reliable Energy-efficient NVM-based Hyperdimensional Computing under Voltage Scaling |
840 | Improvement of Mixed Track-Height Standard-Cell Placement |
845 | Memory Scraping Attack on Xilinx FPGAs: Private Data Extraction from Terminated Processes |
852 | Fast Parameter Optimization of Delayed Feedback Reservoir with Backpropagation and Gradient Descent |
866 | Decoupled Access-Execute enabled DVFS for tinyML deployments on STM32 MCUs |
882 | Selfie5: an autonomous, self-contained verification approach for high-throughput random testing of programmable processors |
883 | A FeFET-based Time-Domain Associative Memory for Multi-bit Similarity Computation |
899 | Complete and Efficient Verification for a RISC-V Processor using Formal Verification |
903 | SenseDSE: Sensitivity-based Performance Evaluation for Design Space Exploration of Microarchitecture |
923 | ISPT-Net: A Novel Transient Backward-stepping Reduction Policy by Irregular Sequential Prediction Transformer |
932 | An Efficient Hypergraph Partitioner under Inter-Block Interconnection Constraints |
935 | Scalable Sequential Optimization Under Observability Don't Cares |
940 | Enhancing Side-Channel Attacks through X-Ray-Induced Leakage Amplification |
944 | Aloha-HE: A Low-Area Hardware Accelerator for Client-Side Operations in Homomorphic Encryption |
950 | On-FPGA Spiking Neural Networks for Integrated Near-Sensor ECG Analysis |
951 | EMAClave: An Efficient Memory Authentication for RISCV Enclaves |
959 | Tiny-VBF: Resource-Efficient Vision Transformer based Lightweight Beamformer for Ultrasound Single-Angle Plane Wave Imaging |
966 | TT-SNN: Tensor Train Decomposition for Efficient Spiking Neural Network Training |
970 | On-sensor Printed Machine Learning Classification via Bespoke ADC and Decision Tree Co-Design |
973 | HygHD: Hyperdimensional Hypergraph Learning |
975 | Testing Algorithms for Hard to Detect Thermal Crosstalk Induced Write Disturb Faults in Phase Change Memories |
984 | REDCAP: Reconfigurable RFET-based Circuits Against Power Side-Channel Attacks |
988 | Derailed: Arbitrarily Controlling DNN Outputs with Targeted Fault Injection Attacks |
1014 | XiNet-pose: Extremely lightweight pose detection for microcontrollers |
1037 | Enhancing Reliability of Neural Networks at the Edge: Inverted Normalization with Stochastic Affine Transformations |
1040 | Learning Circuit Placement Techniques through Reinforcement Learning with Adaptive Rewards |
1048 | RL-TPG: Automated Pre-Silicon Security Verification through Reinforcement Learning-Based Test Pattern Generation |
1056 | Prime+Reset: Introducing A Novel Cross-World Covert-Channel Through Comprehensive Security Analysis on ARM TrustZone |
1067 | A Compiler Phase to Optimally Split GPU Wavefronts for Safety-Critical Systems |
1068 | Uncertainty-Aware Hardware Trojan Detection Using Multimodal Deep Learning |
1088 | Automated Optimization of Deep Neural Networks: Dynamic Bit-Width and Layer-Width Selection via Cluster-Based Parzen Estimation |
1102 | IoT-GRAF: IoT Graph Learning-based Anomaly and Intrusion Detection through Multi-modal Data Fusion |
1137 | H3DFact: Heterogeneous 3D Integrated CIM for Factorization with Holographic Perceptual Representations |
1142 | Detecting Backdoor Attacks in Black-Box Neural Networks through Hardware Performance Counters |
1149 | Multi-Agent Reinforcement Learning for Thermally-Restricted Performance Optimization in Manycores |
1151 | sLET for distributed aerospace landing system |
1157 | MABFuzz: Multi-Armed Bandit Algorithms for Fuzzing Processors |
1158 | Value-Driven Mixed-Precision Quantization for Patch-Based Inference on Microcontrollers |
1178 | PIMLC: Logic Compiler for Bit-serial Based PIM |
1184 | MATADOR: Automated System-on-Chip Tsetlin Machine Design Generation for Edge Applications |
1191 | Shared Cache Analysis under Preemptive Scheduling |
1192 | A Mapping of Triangular Block Interleavers to DRAM for Optical Satellite Communication |
1194 | Full-Stack Optimization for CAM-Only DNN Inference |
1197 | TreeRNG: Binary Tree Random Number Generator for Efficient Probabilistic AI Hardware Design |
1199 | VACSEM: Verifying Average Errors in Approximate Circuits Using Simulation-Enhanced Model Counting |
1212 | Synthesizing Hardware-Software Leakage Contracts for RISC-V Open-Source Processors |
1214 | Beyond Random Inputs: A Novel ML-Based Hardware Fuzzing |
1217 | Compact Powers-of-Two: An Efficient Non-Uniform Quantization for Deep Neural Networks |
1221 | A Deep-Learning Technique to Locate Cryptographic Operations in Side-Channel Traces |
1226 | TitanCFI: Toward Enforcing Control-Flow Integrity in the Root-of-Trust |
1231 | Can Machine Learn Pipeline Leakage? |
1238 | Shared Data Kills Real-Time Cache Analysis. How to Resurrect It? |
1249 | A configurable approximate multiplier for CNNs using partial product speculation |
1275 | A Hardware Accelerated Autoencoder for RF Communication using Short-Time-Fourier-Transform Assisted Convolutional Neural Network |
Extended Abstracts
2 | Para-Pipe: Exploiting Parallelism and Pipelining of ML Computational Graphs on SoCs |
4 | Viper: Utilizing Hierarchical Program Structure to Accelerate Multi-core Simulation |
13 | FLInt: Exploiting Floating Point Enabled Integer Arithmetic for Efficient Random Forest Inference |
32 | Fully Adaptive and Memory-Efficient Heterogeneous Framework for Gate-Level Fault Simulation |
35 | FAMS: A Framework of Storage-centric Mapping for DNNs on Systolic Array Accelerators |
36 | LESS: Low-power Energy-efficient Subgraph Isomorphism on FPGA |
91 | A Floating Memristor Emulator with Inverse Frequency Characteristic |
106 | In-field Detection of Small Delay Defects and Runtime Degradation using On-Chip Sensors |
147 | Towards SEU Fault Propagation Prediction with Spatio-temporal Graph Convolutional Networks |
153 | Meta: A Memory-Efficient Tri-State Polynomial Multiplication Accelerator Using 2D Coupled-BFUs |
158 | Breaking XOR Arbiter PUFs without Reliability Information |
178 | BoolGebra: Attributed Graph-learning for Boolean Algebraic Manipulation |
197 | Trans-Net: Knowledge-Transferring Analog Circuit Optimizer with a Netlist-Based Circuit Representation |
201 | Early Dot-Product Termination for Computation Reduction in DLMs Supporting Non-ReLU Activation Functions |
208 | OC-DLRM: Minimizing the I/O Traffic of DLRM between Main Memory and OCSSD |
220 | Dynamic Per-Flow Queues for TSN switch |
279 | Microprocessor Design Space Exploration via Space Partitioning and Bayesian Optimization |
283 | AsymSAT: Accelerating SAT Solving with Asymmetric Graph-based Model Prediction |
311 | RTSA: An RRAM-TCAM based In-Memory-Search Accelerator for Sub-100 μs Collision Detection |
315 | Atomic Defect-Aware Physical Design of Silicon Dangling Bond Logic on the H-Si(100)-2x1 Surface |
338 | Demonstrating Post-Quantum Remote Attestation for RISC-V Devices |
351 | ScanCamouflage: Obfuscating Scan Chains with Camouflaged Sequential and Logic Gates |
354 | Block Floating-Point Approximate Computing Through Exponent-Guided Precision Adjustment |
356 | Methods for Generating Response Ranges for Run-time Requirement Enforcement of Non-Functional Properties on MPSoCs |
366 | High-Performance Feature Extraction for GPU-accelerated ORB-SLAMx |
389 | Towards Highly-Accurate Early-Stage Hardware-Software Partitioning |
413 | Zero-shot Classification using Hyperdimensional Computing |
416 | Securing ISW Masking Scheme Against Glitches |
423 | OTFGEncoder-HDC: Hardware-efficient Encoding Techniques for Low-overhead FPGA Mapping of Hyperdimensional Computing |
435 | SGPRS: Seamless GPU Partitioning Real-Time Scheduler for Periodic Deep Learning Workloads |
478 | DNA-based Similar Image Retrieval via Triplet Network-driven Encoder |
482 | WaveFormer: Enabling Long Sequence Transformers for Extreme Edge Devices |
496 | Fast Estimation for Electromigration Nucleation Time Based on Random Activation Energy Model |
499 | I²SR: Immediate Interrupt Service Routine on RISC-V MCU to Control mmWave RF Transceivers |
501 | RLPlanner: Reinforcement Learning based Floorplanning for Chiplets with Fast Thermal Analysis |
508 | A Computing-in-Memory Pipeline Design for Diversely Connected Neural Networks Based on Nonvolatile Memory |
514 | High-Efficiency FPGA-Based Approximate Multipliers with LUT Sharing and Carry Switching |
528 | MicroNAS: Zero-Shot Neural Architecture Search for MCUs |
542 | ReTAP: Processing-in-ReRAM Bitap Approximate String Matching Accelerator for Genomic Analysis |
546 | PRD: AVX-512-based Parallelization of Resemblance Detection for Post-Deduplication Delta Compression |
571 | An Endeavor to Industrialize Hardware Fuzzing: Automating NoC Verification in UVM |
572 | PatternS: An Intelligent Hybrid Memory Scheduler Driven by Page Pattern Recognition |
575 | Parasitic Circus: On the Feasibility of Golden-Free PCB Verification |
619 | Lightweight Instrumentation for Accurate Performance Monitoring in RTOSes |
621 | Circumventing Restrictions in commercial High-Level Synthesis Tools |
624 | Out-of-Distribution Detection Using Power-Side Channels for Improving Functional Safety of Neural Network FPGA Accelerators |
630 | SEAL: Sensing Efficient Active Learning on Wearables through Context-awareness |
668 | A Novel Multi-objective Optimization Framework for Analog Circuit Customization |
673 | VeriBug: An Attention-based Dynamic Analysis Framework for Bug-Localization in Hardware Designs |
701 | Harnessing ML Privacy by Design Through Crossbar Array Non-idealities |
703 | Training Better CNN Models for 3-D Capacitance Extraction with Neural Architecture Search |
708 | Search-in-Memory (SiM): Conducting Data-Bound Computations on Flash Memory Chip for Enhanced Efficiency |
716 | Exploring Forward-Only Encoder Training for Accurate, Efficient Hyperdimensional Computing |
741 | DeepSeq: Deep Sequential Circuit Learning |
752 | Accelerating DNNs using Weight Clustering on RISC-V Custom Functional Units |
782 | A Hybrid Approach to Reverse Engineering on Combinational Circuits |
785 | Lightweight and predictable memory virtualization on medium-end microcontrollers |
838 | HDCircuit: Brain-inspired Hyperdimensional Computing for Circuit Recognition |
848 | CLAST: Cross-Layer Approximate High-Level Synthesis with Configurable Approximate Three-Operand Adders |
862 | A Framework for Designing Gaussian Belief Propagation Accelerators for use in SLAM Problems |
881 | ESOMICS: ML-Based Timing Behaviour Analysis For Efficient Mixed-Criticality System Design |
893 | A DTCO Framework for 3D NAND Flash Readout |
902 | DACO: Pursuing Ultra-low Power Consumption via DNN-Adaptive CPU-GPU CO-optimization on Mobile Devices |
955 | Shuttling for Scalable Trapped-Ion Quantum Computers |
1005 | Automated Hardware Security Countermeasure Integration inside High-Level Synthesis |
1036 | Deep Quasi-Periodic Priors: Signal Separation in Wearable Systems with Limited Data |
1077 | AdaP-CIM: Compute-in-Memory Based Neural Network Accelerator using Adaptive Posit |
1089 | Extending SSD Lifetime via Balancing Layer Endurance in 3D NAND Flash Memory |
1094 | An Efficient Logic Operation Scheduler for Minimizing Memory Footprint of In-Memory SIMD Computation |
1139 | Learning to Floorplan like Human Experts via Reinforcement Learning |
1156 | FHE-CGRA: Enable Efficient Acceleration of Fully Homomorphic Encryption on CGRAs |
1248 | Unleashing the Power of T1-cells in SFQ Arithmetic Circuits |