Kalman Filtering Attention for User Behavior Modeling in CTR Prediction
Hu Liu, Jing LU, Xiwei Zhao, Sulong Xu, Hao Peng, Yutong Liu, Zehua Zhang, Jian Li, Junsheng Jin, Yongjun Bao, Weipeng Yan
Spotlight presentation: Orals & Spotlights Track 02: COVID/Health/Bio Applications
on 2020-12-07T19:30:00-08:00 - 2020-12-07T19:40:00-08:00
on 2020-12-07T19:30:00-08:00 - 2020-12-07T19:40:00-08:00
Poster Session 1 (more posters)
on 2020-12-07T21:00:00-08:00 - 2020-12-07T23:00:00-08:00
GatherTown: Deep learning ( Town E1 - Spot D2 )
on 2020-12-07T21:00:00-08:00 - 2020-12-07T23:00:00-08:00
GatherTown: Deep learning ( Town E1 - Spot D2 )
Join GatherTown
Only iff poster is crowded, join Zoom . Authors have to start the Zoom call from their Profile page / Presentation History.
Only iff poster is crowded, join Zoom . Authors have to start the Zoom call from their Profile page / Presentation History.
Toggle Abstract Paper (in Proceedings / .pdf)
Abstract: Click-through rate (CTR) prediction is one of the fundamental tasks for e-commerce search engines. As search becomes more personalized, it is necessary to capture the user interest from rich behavior data. Existing user behavior modeling algorithms develop different attention mechanisms to emphasize query-relevant behaviors and suppress irrelevant ones. Despite being extensively studied, these attentions still suffer from two limitations. First, conventional attentions mostly limit the attention field only to a single user's behaviors, which is not suitable in e-commerce where users often hunt for new demands that are irrelevant to any historical behaviors. Second, these attentions are usually biased towards frequent behaviors, which is unreasonable since high frequency does not necessarily indicate great importance. To tackle the two limitations, we propose a novel attention mechanism, termed Kalman Filtering Attention (KFAtt), that considers the weighted pooling in attention as a maximum a posteriori (MAP) estimation. By incorporating a priori, KFAtt resorts to global statistics when few user behaviors are relevant. Moreover, a frequency capping mechanism is incorporated to correct the bias towards frequent behaviors. Offline experiments on both benchmark and a 10 billion scale real production dataset, together with an Online A/B test, show that KFAtt outperforms all compared state-of-the-arts. KFAtt has been deployed in the ranking system of JD.com, one of the largest B2C e-commerce websites in China, serving the main traffic of hundreds of millions of active users.