Edge intelligence requires fast access to distributed data samples generated by edge devices. The challenge is to acquire massive data samples over limited radio resources for training machine learning models at the edge server. In this article, we propose a new communication-efficient edge intelligence scheme in which the most useful data samples are selected to train the model. Here, the usefulness, or value, of data samples is measured by data diversity, which is defined as the difference between data samples. We derive a closed-form expression of data diversity that combines data informativeness and channel quality. A joint data-and-channel diversity-aware multiuser scheduling algorithm is then proposed. We find that noise is useful for enhancing data diversity under some conditions.
Submitted 14 Jan 2021 to Information Theory [cs.IT]. Published 15 Jan 2021.