The overall architecture of our proposed approach → retrieve-then-augment

Mix the sequence embeddings output by the t-th layer of the transformer used for sequence encoding
Doing so aims at the following effects:
we operate on more abstract sequence representations (instead of item embeddings as in the original mixup)
the model can further adjust its parameters according to the incorporated external information
interpolating the hidden states at an intermediate layer enriches the sequential semantics.
In some prior work: decoding from an interpolation of two hidden vectors → generates a new sentence with the mixed meaning of the two original sentences.
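
A minimal PyTorch sketch of this hidden-state mixup, assuming a stack of standard transformer encoder layers; the names `MixupSequenceEncoder`, `mixup_hidden_states`, and `mix_layer` are hypothetical illustrations, not from the paper:

```python
import torch
import torch.nn as nn

def mixup_hidden_states(h_target, h_retrieved, alpha=0.2):
    # Sample the mixing coefficient lambda ~ Beta(alpha, alpha), as in mixup.
    lam = torch.distributions.Beta(alpha, alpha).sample()
    # Convex combination of the two (batch, seq_len, dim) hidden-state tensors.
    return lam * h_target + (1.0 - lam) * h_retrieved

class MixupSequenceEncoder(nn.Module):
    def __init__(self, dim=64, n_layers=4, n_heads=4, mix_layer=1):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(dim, n_heads, batch_first=True)
            for _ in range(n_layers)
        )
        self.mix_layer = mix_layer  # the t-th layer whose outputs get mixed

    def forward(self, x, x_retrieved):
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i <= self.mix_layer:
                x_retrieved = layer(x_retrieved)
            if i == self.mix_layer:
                # Inject the retrieved sequence's semantics by interpolating
                # intermediate hidden states rather than raw item embeddings.
                x = mixup_hidden_states(x, x_retrieved)
        return x

# Usage: mix a target user sequence with a retrieved similar sequence.
enc = MixupSequenceEncoder()
target = torch.randn(8, 20, 64)     # embedded target sequence
retrieved = torch.randn(8, 20, 64)  # embedded retrieved sequence
out = enc(target, retrieved)        # (8, 20, 64), enriched representation
```

After the mixing layer, only the interpolated states flow through the remaining layers, so the later layers can adjust to the incorporated external information, matching the second effect above.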