**A. Appendix. Parameter perturbation analysis**

To theoretically compare our proposed variational scaling fusion approach, DRoGSuRe, to DMSC, we proceed by way of a first-order perturbation analysis on the parameter set $\mathbf{W}^i$ of each technique, $i = 1, 2$. This will in turn impact the associated affinity matrix $\mathbf{A}^i$, which, as we elaborate later, directly impacts the subspace clustering procedure that is central to the inference following the fusion procedure.

Adopting the original formulation for the first approach, the persistently differential scaling approach in which $T$ modalities are jointly exploited, results in $\mathbf{X}^1(t) = [\mathbf{x}_1^1(t)\ \mathbf{x}_2^1(t)\ \dots\ \mathbf{x}_n^1(t)]$, where $\mathbf{x}_k^1(t) \in \mathbb{R}^m$, $t = 1, 2, \dots, T$, represents the $k$th observation. The second approach effectively uses only one subspace structure of the fused modalities, $\mathbf{X}^2(t) = [\mathbf{x}_1^2\ \mathbf{x}_2^2\ \dots\ \mathbf{x}_n^2]$.

A first order perturbation on the data may be due to noise or to a degradation of a given sensor, and results in a perturbation of the UoS parameters,

$$
\tilde{\mathbf{W}}_1^1 = \mathbf{W}_1^1 + \boldsymbol{\delta}^1 \tag{A1}
$$

For the first method, each modality will have an associated subspace cluster parameter set $\{\mathbf{W}_t^1\}_{t=1,\dots,T}$, with $\mathbf{W}_t^1 \in \mathbb{R}^{n \times n}$. The overall perturbed parameter set for DRoGSuRe can then be written as,

$$
\tilde{\mathbf{W}}_{\text{tot}}^1 = \tilde{\mathbf{W}}_1^1 + \mathbf{W}_2^1 + \dots + \mathbf{W}_T^1 \tag{A2}
$$

where the unperturbed overall sparse coefficient matrix is written as $\mathbf{W}_{\text{tot}}^1 = \mathbf{W}_1^1 + \mathbf{W}_2^1 + \dots + \mathbf{W}_T^1$. A similar development follows for method 2, with the difference that the contributing modalities are fused a priori.

Proof. We first write the affinity matrix associated with DRoGSuRe as,

$$
\tilde{\mathbf{A}}^1 = \tilde{\mathbf{W}}_{\text{tot}}^1 + \left(\tilde{\mathbf{W}}_{\text{tot}}^1\right)^T \tag{A3}
$$

$$\tilde{\mathbf{A}}^1 = \tilde{\mathbf{W}}_1^1 + \mathbf{W}_2^1 + \dots + \mathbf{W}_T^1 + \left(\tilde{\mathbf{W}}_1^1 + \mathbf{W}_2^1 + \dots + \mathbf{W}_T^1\right)^T \tag{A4}$$

where the superscript *T* denotes transpose. This is equivalent to,

$$
\tilde{\mathbf{A}}^1 = \tilde{\mathbf{A}}_1^1 + \sum_{i=2}^T \mathbf{A}_i^1 \tag{A5}
$$

where $0 \le \tilde{\mathbf{A}}_1^1(i, j) \le 1 + \delta^1$. The unperturbed collective affinity matrix $\mathbf{A}^1$ can be similarly written as $\mathbf{A}^1 = \sum_{i=1}^{T} \mathbf{A}_i^1$, with the unity constraint on each entry of all matrices. We may also write the magnitude of the difference as,

$$\left| \mathbf{A}^{\mathbf{1}} - \mathbf{\tilde{A}}^{\mathbf{1}} \right| = \boldsymbol{\delta}^{\mathbf{1}} + \left( \boldsymbol{\delta}^{\mathbf{1}} \right)^{T} \tag{A6}$$
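As a quick numerical illustration of (A3) to (A6), the following sketch builds $T$ coefficient matrices, perturbs only the first one, and checks that the affinity difference equals $\boldsymbol{\delta}^1 + (\boldsymbol{\delta}^1)^T$. The matrix sizes, the random nonnegative entries, and the helper name `affinity` are assumptions chosen for illustration; they do not come from the experiments.

```python
import numpy as np

def affinity(w_tot):
    """Symmetrized affinity A = W_tot + W_tot^T, as in (A3)."""
    return w_tot + w_tot.T

rng = np.random.default_rng(0)
n, T = 5, 3  # illustrative sizes (assumed)

# One nonnegative coefficient matrix per modality (random stand-ins).
W = [rng.uniform(0.0, 0.1, size=(n, n)) for _ in range(T)]
# Nonnegative perturbation applied to modality 1 only, as in (A1).
delta1 = rng.uniform(0.0, 0.01, size=(n, n))

W_tot = sum(W)                            # unperturbed sum over modalities
W_tot_pert = (W[0] + delta1) + sum(W[1:])  # perturbed sum, as in (A2)

A = affinity(W_tot)
A_pert = affinity(W_tot_pert)

# Entrywise magnitude of the difference, compared with delta1 + delta1^T (A6).
diff = np.abs(A - A_pert)
matches = np.allclose(diff, delta1 + delta1.T)
print(matches)
```

Only the perturbed modality contributes to the difference; the remaining $T - 1$ terms cancel exactly.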

Letting $\Delta = \boldsymbol{\delta}^1 + (\boldsymbol{\delta}^1)^T \in \mathbb{R}^{n \times n}$, and defining $\epsilon = \max_{i,j} [\Delta]_{i,j}$, we can write,

$$\left\|\mathbf{A}^1 - \tilde{\mathbf{A}}^1\right\|_{F} \leq n\epsilon \tag{A7}$$

Given the Δ matrix individual entry bounds, we conclude that,

$$0 \le \epsilon \le \frac{1}{T} \tag{A8}$$
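The Frobenius bound in (A7) follows from the fact that an $n \times n$ matrix has $n^2$ entries, each of magnitude at most the largest one. A minimal sketch of this check is given below; the function name `frobenius_vs_max_entry` and the random test perturbation are assumptions made for illustration.

```python
import numpy as np

def frobenius_vs_max_entry(delta):
    """Return (||Delta||_F, n * max|Delta_ij|) for Delta = delta + delta^T."""
    Delta = delta + delta.T
    n = Delta.shape[0]
    eps = np.max(np.abs(Delta))  # largest entry magnitude
    return np.linalg.norm(Delta, "fro"), n * eps

rng = np.random.default_rng(1)
delta = rng.uniform(0.0, 0.05, size=(6, 6))  # assumed perturbation
fro, bound = frobenius_vs_max_entry(delta)
print(fro <= bound)

# The bound is attained when all entries of Delta are equal.
fro_eq, bound_eq = frobenius_vs_max_entry(np.ones((3, 3)))
```

The second call shows that the bound is tight: with all entries equal, both sides coincide.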

DMSC assumes a single sparse coefficient matrix $\mathbf{W}$ for all data modalities, which is equivalent to a single subspace structure of the fused modalities, $\mathbf{X}^2(t) = [\mathbf{x}_1^2\ \dots\ \mathbf{x}_n^2]$. The UoS parameters will therefore be perturbed by $\delta^2$ as follows,

$$
\tilde{\mathbf{W}}^2 = \mathbf{W}^2 + \delta^2 \tag{A9}
$$

The affinity matrix associated with DMSC can be written as $\tilde{\mathbf{A}}^2 = \tilde{\mathbf{W}}^2 + (\tilde{\mathbf{W}}^2)^T$, which is equivalent to,

$$\tilde{\mathbf{A}}^2 = \mathbf{W}^2 + \delta^2 + \left(\mathbf{W}^2\right)^T + \left(\delta^2\right)^T \tag{A10}$$

Similarly, the unperturbed affinity matrix will be as follows,

$$\mathbf{A}^2 = \mathbf{W}^2 + \left(\mathbf{W}^2\right)^T \tag{A11}$$

From Eqs. (A10) and (A11), the magnitude of the difference can be written as follows,

$$|\mathbf{A}^2 - \tilde{\mathbf{A}}^2| = \delta^2 + \left(\delta^2\right)^T \tag{A12}$$

Letting $\gamma = \delta^2 + (\delta^2)^T \in \mathbb{R}^{n \times n}$, i.e., $|\mathbf{A}^2 - \tilde{\mathbf{A}}^2| = \gamma$, and defining $\Psi = \max_{i,j} [\gamma]_{i,j}$, we can write $\|\mathbf{A}^2 - \tilde{\mathbf{A}}^2\|_F \le n\Psi$. Given the individual entry bounds of the matrix $\gamma$, we conclude that $0 \le \Psi \le 1$. If we perturb only one modality, then, knowing that $0 \le \mathbf{A}(i, j) \le 1$, the error can lie anywhere in $0 \le \Psi \le 1$, which entails either creating a fake relation between two data points or erasing an existing relation. $\epsilon$ and $\Psi$ are random variables that do not have to follow a specific distribution; in any case, however, $E[\epsilon^2] \ll E[\Psi^2]$, and therefore $\mathrm{SNR}_{\mathrm{DRoGSuRe}} \gg \mathrm{SNR}_{\mathrm{DMSC}}$.

In light of the above two bounds, and of the results of [42], the spectral clustering depends on the respective projection operators $P_{\mathbf{W}^1}$ and $\tilde{P}_{\tilde{\mathbf{W}}^1}$ onto the vector subspaces spanned by the principal eigenvectors of $\mathbf{W}_{\text{tot}}^1$ and $\tilde{\mathbf{W}}_{\text{tot}}^1$, whose difference may be bounded as,

$$\left\| P_{\mathbf{W}^1} - \tilde{P}_{\tilde{\mathbf{W}}^1} \right\|_F \le \frac{\sqrt{2}}{\alpha^1} \left\| \mathbf{A}^1 - \tilde{\mathbf{A}}^1 \right\|_F \tag{A13}$$

where $\alpha^1$ is the spectral gap between the $k$th and $(k+1)$st eigenvalues of $\mathbf{A}^1$, $\alpha^1 = |\lambda_k^1 - \lambda_{k+1}^1|$. Similarly, for DMSC, the bound on the projection operators is,

$$\left\| P_{\mathbf{W}^2} - \tilde{P}_{\tilde{\mathbf{W}}^2} \right\|_F \le \frac{\sqrt{2}}{\alpha^2} \left\| \mathbf{A}^2 - \tilde{\mathbf{A}}^2 \right\|_F \tag{A14}$$

where $\alpha^2 = |\lambda_k^2 - \lambda_{k+1}^2|$. Since $\mathbf{W}_1^1, \mathbf{W}_2^1, \dots, \mathbf{W}_T^1$ commute, and provided they are diagonalizable, they share the same eigenvectors. As a result, $\mathbf{W}_1^1 + \mathbf{W}_2^1 + \dots + \mathbf{W}_T^1$ has the same eigenvectors, and each of its eigenvalues is the sum of the corresponding eigenvalues of $\mathbf{W}_1^1, \mathbf{W}_2^1, \dots, \mathbf{W}_T^1$. Therefore, $\lambda_k^1 \gg \lambda_k^2$. From all the above, we conclude that the smaller perturbation error of DRoGSuRe leads to better clustering and hence preserves performance, which yields the improvement by the factor $T$ noted in the proposition and shown in the two perturbation developments.
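The two ingredients of this argument can be checked numerically on a small instance. The sketch below is an illustration under assumed values, not a proof. It constructs a symmetric matrix with a prescribed spectrum, applies a small symmetric perturbation, and compares the two sides of a bound of the form (A13). It then verifies that matrices sharing an eigenbasis commute and that the eigenvalues of their sum add. The spectrum, the perturbation scale, and the helper name `top_k_projector` are all hypothetical.

```python
import numpy as np

def top_k_projector(a, k):
    """Orthogonal projector onto the span of the k leading eigenvectors of symmetric a."""
    _, vecs = np.linalg.eigh(a)  # eigenvalues are returned in ascending order
    v = vecs[:, -k:]             # eigenvectors of the k largest eigenvalues
    return v @ v.T

rng = np.random.default_rng(2)
n, k = 6, 3

# Symmetric matrix with a known spectrum, so the gap is explicit (assumed values).
q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigvals = np.array([5.0, 4.0, 3.0, 1.0, 0.5, 0.0])
A = q @ np.diag(eigvals) @ q.T
alpha = eigvals[k - 1] - eigvals[k]  # gap between kth and (k+1)st eigenvalue

# Small symmetric perturbation of the affinity matrix.
e = 1e-3 * rng.standard_normal((n, n))
E = e + e.T

lhs = np.linalg.norm(top_k_projector(A, k) - top_k_projector(A + E, k), "fro")
rhs = np.sqrt(2) / alpha * np.linalg.norm(E, "fro")
print(lhs <= rhs)

# Matrices sharing the eigenbasis q commute, and eigenvalues of their sum add.
d1, d2 = rng.uniform(0, 1, n), rng.uniform(0, 1, n)
W1 = q @ np.diag(d1) @ q.T
W2 = q @ np.diag(d2) @ q.T
commute = np.allclose(W1 @ W2, W2 @ W1)
additive = np.allclose(q.T @ (W1 + W2) @ q, np.diag(d1 + d2))
print(commute, additive)
```

A larger spectral gap `alpha` tightens the right-hand side, which is the mechanism invoked above: summing $T$ coefficient matrices with a shared eigenbasis enlarges the eigenvalues, and with them the gap.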
