**1. Introduction**

This article compiles lessons learned by the authors from more than a decade of telepresence-related research. The article is a review in nature, but owing to the number of topics covered, the presentation is not tutorial throughout; it relies on the reader's prior knowledge and/or interest in learning more, e.g. from the given references. Based on the reviewed status and enablers, the authors derive and specify a practical and affordable 3D telepresence solution based on screen displays. In this solution, efficient 3D capture and low-bitrate streaming is an important enabler for both communication and XR functionalities. Essentially the same technical solutions can also be used for remote support applications in industry, e.g. 3D monitoring, maintenance, control, analysis, and augmentation.

The outline of the paper is as follows. In Chapter 2, we introduce the 3D telepresence topic, describe the main factors of spatial faithfulness, and give a few examples of existing approaches. Several of the references are to our own patent publications, which have not been published as papers. The chapter focuses on the requirements and challenges of supporting 3D geometries and perception as they occur in real-world, face-to-face encounters.

In the future, glasses-type displays will likely be the best way to support immersion, mobility, and freedom of viewpoint. Today, however, glasses still lack many important properties and have many defects that limit perceived quality, time of use, and user acceptance. At the same time, screen displays are the most common way of supporting visual interaction in teleconferencing solutions. Correspondingly, we wanted to find out whether a simple screen-based telepresence solution could support 3D perception and XR functionalities with improved naturalness, quality, and usability. The answer seems to be positive, and in Chapter 3 we give a draft specification for such a system.

Efficient 3D capture and streaming is an important enabler for both communication and XR functionalities. Accordingly, Chapter 3 also describes an implementation applying existing coding methods, together with some simulation results. Despite our restriction to screen-based solutions, we also discuss the possibilities and future of glasses-based approaches. In Chapter 4, we describe ways of enhancing the 3D perception and XR functionalities of the basic solution. Future improvements may also include supporting natural eye focus through accommodative displays. Finally, Chapter 5 summarizes our findings.
