Data-Free Quantization of Neural Receivers: When 4-Bit Succeeds, Why 6-Bit Matters for 6G
Abstract
As wireless systems evolve toward 6G, Artificial Intelligence (AI) and Deep Learning (DL) are poised to revolutionize physical layer processing, offering superior performance over model-based methods in metrics such as throughput and Block Error Rate (BLER). However, deploying DL-based neural receivers in resource-constrained environments requires balancing performance with hardware constraints like inference latency, energy consumption, and computational overhead. This work investigates Post-Training Quantization (PTQ) applied to a Single-Input-Multiple-Output (SIMO) neural receiver architecture that processes frequency-domain baseband samples to output Log-Likelihood Ratios (LLRs) for error-control decoding, using a data-free approach that requires no retraining data. Using symmetric per-channel PTQ to reduce the float32 weight precision to low bit-widths (int8, int6, and int4), we evaluate its impact on radio performance across diverse 3GPP channel models (Line-of-Sight (LoS) and Non-LoS (NLoS)) and mobility scenarios. Experimental results demonstrate that in NLoS environments, int8 and int6 quantization achieve near-float32 BLER, with gains of up to 4.9 dB over baseline Least-Squares (LS) estimation in high-mobility conditions. int4 quantization offers strong robustness, exceeding traditional receiver performance by 1.7–2.6 dB in LoS scenarios across various mobility conditions while yielding an 8× smaller model. This work investigates the lower bit-width limits of PTQ applied to two neural receivers, trained and tested across diverse channel models and mobility conditions. These findings are important for hardware-software co-design for AI-native 6G air interfaces, highlighting low-precision quantization as a key enabler for efficient edge, sensing, and cloud radio deployments.
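To make the quantization scheme concrete, the sketch below illustrates data-free symmetric per-channel PTQ in NumPy: each output channel gets one scale derived from its maximum absolute weight (no calibration data needed), and weights are rounded to a signed b-bit grid. This is a minimal illustration of the general technique named in the abstract, not the paper's actual implementation; the function names, tensor shape, and bit-widths are assumptions for the example.

```python
import numpy as np

def quantize_per_channel(weights, bits):
    """Symmetric per-channel PTQ: one scale per output channel (axis 0).
    Illustrative sketch, not the paper's implementation."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8, 31 for int6, 7 for int4
    # Data-free scale: max absolute weight per channel, no calibration set.
    scale = np.abs(weights).reshape(weights.shape[0], -1).max(axis=1) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard against all-zero channels
    shape = (-1,) + (1,) * (weights.ndim - 1)  # broadcast scale over channel dims
    q = np.clip(np.round(weights / scale.reshape(shape)), -qmax, qmax)
    return q.astype(np.int8), scale  # int4/int6 values also fit in int8 storage

def dequantize(q, scale):
    """Recover approximate float32 weights from integer codes and scales."""
    shape = (-1,) + (1,) * (q.ndim - 1)
    return q.astype(np.float32) * scale.reshape(shape)

# Example: a conv-style weight tensor (out_channels, in_channels, k, k)
w = np.random.randn(16, 8, 3, 3).astype(np.float32)
for bits in (8, 6, 4):
    q, s = quantize_per_channel(w, bits)
    err = np.abs(w - dequantize(q, s)).max()
    print(f"int{bits}: max abs reconstruction error = {err:.4f}")
```

Because the rounding error per weight is bounded by half the channel scale, lower bit-widths (larger scales) trade reconstruction accuracy for memory, which is the float32-vs-int4 trade-off the abstract quantifies in BLER terms.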