Project: Predicting user selection for interactive labeling

Description

Data labeling, a fundamental task in supervised machine learning, refers to the annotation of data with representative labels. In contrast to active learning (AL), where a model chooses which instances to query, interactive labeling relies on users' knowledge and pattern identification ability to select meaningful instances to label [1]. Previous work shows that this user-centered instance selection can alleviate the cold-start problem and the difficulty of defining a query strategy in AL. It can also speed up the labeling process, because users can identify similar instances and label them all at once. User strategies can be defined to describe how users choose instances to label based on the visualization [2].

The goal of this project is to predict user strategies and recommend instances to label, so as to facilitate the interactive labeling process. To achieve this goal, a visual interactive labeling interface is required, for example, the one shown in the figure. Algorithms or machine learning models can then be used to learn and predict user selections [3][4]. We would also like to conduct a follow-up evaluation of the final solution. We therefore expect you to have programming fundamentals and basic knowledge of machine learning and visualization.
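As a rough illustration of what "learning and predicting user selection" could look like at its simplest, the sketch below fits a small logistic regression on which instances a user has already picked and then ranks the remaining instances. Everything in it is an assumption made for illustration: the two features (local density and distance to the data centroid), the synthetic 2D points, the simulated user who prefers dense areas, and the function names `instance_features`, `fit_logistic` and `recommend`. It is not the method of [3] or [4].

```python
import numpy as np


def instance_features(points, k=5):
    """Two illustrative features per instance: local density and centroid distance."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    # Column 0 of each sorted row is the zero distance to the point itself, so skip it.
    knn_mean = np.sort(dists, axis=1)[:, 1:k + 1].mean(axis=1)
    density = 1.0 / (knn_mean + 1e-9)
    centroid_dist = np.linalg.norm(points - points.mean(axis=0), axis=1)
    return np.column_stack([density, centroid_dist])


def fit_logistic(X, y, lr=0.1, steps=2000):
    """Plain gradient-descent logistic regression; returns weights with the bias first."""
    Xb = np.column_stack([np.ones(len(X)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w


def recommend(X, w, already_selected, top_n=3):
    """Return the indices of the top_n not-yet-selected instances by predicted score."""
    Xb = np.column_stack([np.ones(len(X)), X])
    scores = 1.0 / (1.0 + np.exp(-Xb @ w))
    scores[already_selected] = -np.inf  # never recommend an instance twice
    return np.argsort(-scores)[:top_n]


# Synthetic stand-in for a 2D projection shown in a labeling interface (assumption).
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(0.0, 0.5, (30, 2)), rng.normal(3.0, 1.0, (30, 2))])

features = instance_features(points)
X = (features - features.mean(axis=0)) / features.std(axis=0)  # standardize

# Simulated interaction log (assumption): the user picked the 10 densest points.
selected = np.zeros(len(points), dtype=bool)
selected[np.argsort(-features[:, 0])[:10]] = True

w = fit_logistic(X, selected.astype(float))
recommended = recommend(X, w, selected)
print("Recommended instance indices:", recommended)
```

In a real interface the selection log would come from recorded user interactions rather than a simulated rule, and the hand-picked features would likely be replaced by ones derived from the visualization the user actually sees.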


References

[1] J. Bernard, M. Hutter, M. Zeppelzauer, D. Fellner and M. Sedlmair, "Comparing Visual-Interactive Labeling with Active Learning: An Experimental Study," in IEEE Transactions on Visualization and Computer Graphics, vol. 24, no. 1, pp. 298-308, Jan. 2018, doi: 10.1109/TVCG.2017.2744818. 

[2] J. Bernard, M. Zeppelzauer, M. Lehmann, M. Müller and M. Sedlmair, "Towards User-Centered Active Learning Algorithms," in Computer Graphics Forum, vol. 37, pp. 121-132, 2018, https://doi.org/10.1111/cgf.13406

[3] C. Fan and H. Hauser, "Fast and Accurate CNN-based Brushing in Scatterplots," in Computer Graphics Forum, vol. 37, pp. 111-120, 2018, https://doi.org/10.1111/cgf.13405

[4] K. Gadhave, J. Görtler, Z. Cutler, et al., "Predicting intent behind selections in scatterplot visualizations," in Information Visualization, vol. 20, no. 4, pp. 207-228, 2021, doi: 10.1177/14738716211038604.


Details
Student: Aishvarya Viswanathan
Supervisor: Anna Vilanova
Secondary supervisor: Linhao Meng
Link: Thesis