
*Deep Learning Enabled Nanophotonics DOI: http://dx.doi.org/10.5772/intechopen.93289*


*Advances and Applications in Deep Learning*

local structure. As a branch of machine learning, deep learning has received much attention worldwide because it can efficiently process and analyse very large datasets. It has already found great success in computer vision and speech recognition. Recently, researchers have applied it to quantum optics, material design and the optimisation of nanophotonic devices owing to its outstanding capability of finding optimal solutions from enormous amounts of data. At the same time, its computational cost is much lower than that of other inverse design methods [2, 3]. Several types of neural network, including deep neural networks, generative neural networks and convolutional neural networks, are frequently used to retrieve the optimal parameters of irregular structures from limited sets of data and in a shorter time when many structure parameters are involved in the optimisation. This book chapter is organised as follows. In Section 2, we discuss the inverse design enabled by deep learning for four different topics: multilayer structures, plasmonic metasurfaces, dielectric metasurfaces and chiral metamaterials (see **Figure 1b**). In Section 3, we review the recent progress on all-optical neural networks. Finally, concluding remarks and an outlook are presented in Section 4.

#### **Figure 1.**

*(a) Inverse design methods in nanophotonics. (b) Application of deep learning in nanophotonics.*

## **2. Optimisation of nanophotonics design by deep learning**

Recently, deep learning using artificial neural networks has emerged as a revolutionary and powerful methodology in the field of nanophotonics. Applying deep learning algorithms to nanophotonic inverse design can introduce remarkable design flexibility that is very challenging, or even impossible, to achieve with conventional optimisation approaches [1]. In this section, we provide a brief review of the implementation of deep learning to solve nanophotonic inverse design problems.

### **2.1 Design of multilayer nanostructures by deep learning**

Multilayer nanostructures can exhibit unique optical properties, including field enhancements and distributions and special transmission/reflection spectra, based on the interference of the different modes supported by the different layers of the nanostructures. Machine learning has emerged as an increasingly promising tool for the inverse design of photonic nanostructures. It enables effective inverse design by simultaneously considering various inter-linked parameters, such as geometric parameters and material types (unlike conventional approaches, which optimise only one or two parameters at a time).

A recent work by Peurifoy et al. used a deep neural network (DNN) to relate the geometry of SiO2/TiO2 multilayer spherical core-shell nanoparticles to their light-scattering properties (**Figure 2a**) [4]. The transfer matrix method was used to solve the scattering analytically for 50,000 different combinations of shell thicknesses, which served as the examples for training, validation and testing. The forward model was a fully connected feed-forward network with four hidden layers. The inputs were the thicknesses of the shells of the nanoparticle, and the outputs were the corresponding scattering cross-section spectra. During learning, the output of the network was compared with the target response to provide a loss function against which the weights were trained and updated. After this forward training, the weights were fixed, the inputs were set as trainable variables and the output was fixed to the desired spectrum. The network was then run backwards, iterating the inputs until they gave a geometry that produces the target spectrum. As can be seen from **Figure 2a**, for an arbitrarily given spectrum (blue curve), the DNN can successfully predict the thickness of each shell of a nanoparticle that generates a scattering spectrum similar to the desired one, with some minor deviations.
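The "run the network backwards" step can be illustrated with a minimal sketch in plain Python. It is not the model of Ref. [4]: the frozen forward model here is a hypothetical single tanh layer with hand-picked weights `W` and `B`, the three-point "spectrum" and the two design parameters are placeholders, and the gradient is written out by hand. The sketch only shows the principle that the weights stay fixed while gradient descent updates the inputs.

```python
import math

# Hypothetical frozen forward model: spectrum = tanh(W x + B).
# In the cited work this role is played by the trained four-hidden-layer DNN.
W = [[0.8, -0.3], [0.2, 0.9], [-0.5, 0.4]]
B = [0.1, -0.2, 0.05]

def forward(x):
    """Map design parameters x to a toy three-point spectrum."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W, B)]

def loss(x, target):
    """Mean squared error between the predicted and the target spectrum."""
    y = forward(x)
    return sum((yi - ti) ** 2 for yi, ti in zip(y, target)) / len(y)

def grad_wrt_input(x, target):
    """Gradient of the loss with respect to the inputs; W and B are untouched."""
    y = forward(x)
    n = len(y)
    # dL/dz_i = (2/n) * (y_i - t_i) * (1 - y_i^2), then chain through W.
    dz = [2.0 / n * (yi - ti) * (1.0 - yi * yi) for yi, ti in zip(y, target)]
    return [sum(dz[i] * W[i][j] for i in range(n)) for j in range(len(x))]

def inverse_design(target, x0, lr=0.1, steps=2000):
    """Gradient descent on the inputs while the model weights stay fixed."""
    x = list(x0)
    for _ in range(steps):
        g = grad_wrt_input(x, target)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

target_spectrum = forward([0.6, -0.4])   # spectrum of a known toy design
x_start = [0.0, 0.0]                     # initial guess for the design
x_found = inverse_design(target_spectrum, x_start)
print("loss before:", loss(x_start, target_spectrum))
print("loss after: ", loss(x_found, target_spectrum))
print("recovered design:", x_found)
```

In this toy setting the target spectrum is reachable by construction, so the recovered inputs approach the design that generated it. With a real trained network the result would depend on the starting point and on how well the forward model generalises.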

#### **Figure 2.**

*Application of DL for multilayer nanostructure design: (a) Using a DNN to retrieve the layer thicknesses of a multilayer particle based on its scattering spectrum. Inset: network architecture. (b) Left: geometry of three-layered core-shell nanoparticles with changeable materials and thicknesses. Right: network architecture. (c) Left: multilayer thin films of SiO2 and Si3N4. Right: the architecture of the tandem network composed of an inverse design network and a forward modelling network. (d) Left: evolution of the training cost of the network. Right: performance of the network using a Gaussian-shaped spectrum.*

A further improvement of this approach is to take into account different material combinations for the core-shell nanoparticles. So et al. considered the simultaneous inverse design of materials and structural parameters using a deep learning network (**Figure 2b**) [5]. Here, the network maps the extinction spectra of the electric dipole (ED) and magnetic dipole (MD) to the core-shell nanoparticle design, including the material information and the shell thicknesses. The DL model consists of two networks: a design network that learns the mapping from optical properties to design parameters, and a spectrum network that learns the mapping from design parameters to optical properties. To adapt the network to the different types of data (materials and thicknesses), the loss function is a weighted average of the structural and material losses,

$$l_{\mathrm{design}} = \rho\, l_{\mathrm{structure}} + (1 - \rho)\, l_{\mathrm{material}},$$

with $\rho$ the weight of the structural error, which is treated as a hyper-parameter to be adjusted during the training process. The structural loss $l_{\mathrm{structure}}$ was evaluated by the mean squared error,

$$l_{\mathrm{MSE}}(x, y) = \frac{1}{N} \sum_{n=1}^{N} (x_n - y_n)^2,$$

while the material loss $l_{\mathrm{material}}$ was evaluated by the binary cross-entropy with logits loss,

$$l_{\mathrm{BCE}}(x, y) = -\left[ y \log \sigma(x) + (1 - y) \log\left(1 - \sigma(x)\right) \right],$$

with $x$ the network output, $y$ the target and $\sigma(x)$ the sigmoid function. After training, the network demonstrated a strong ability to realise inverse design for different types of problems, including spectrally tuning the electric or magnetic resonances or overlapping them. This potentially facilitates the inverse design of nanostructures with specific functions, such as zero backward scattering (first Kerker condition) or near-zero forward scattering (second Kerker condition) [6, 7].
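The weighted design loss can be written out in a few lines of plain Python as a sanity check on the formulas. This is a minimal sketch, not the implementation of Ref. [5]: the function names, the averaging of the cross-entropy over all material entries and the example values are assumptions made for illustration. The cross-entropy uses the standard numerically stable rearrangement, which is algebraically identical to the expression with the sigmoid.

```python
import math

def mse(outputs, targets):
    """Mean squared error, used here for the structural (thickness) part."""
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)

def bce_with_logits(logits, targets):
    """Binary cross-entropy on raw logits, averaged over the entries."""
    total = 0.0
    for x, y in zip(logits, targets):
        # Same value as -[y*log(sigmoid(x)) + (1-y)*log(1-sigmoid(x))],
        # rearranged so that exp() never overflows.
        total += max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
    return total / len(logits)

def design_loss(thick_out, thick_tgt, mat_logits, mat_tgt, rho=0.5):
    """l_design = rho * l_structure + (1 - rho) * l_material."""
    l_structure = mse(thick_out, thick_tgt)
    l_material = bce_with_logits(mat_logits, mat_tgt)
    return rho * l_structure + (1.0 - rho) * l_material

# Hypothetical example: two predicted thicknesses and one material logit.
example = design_loss([1.0, 2.0], [1.0, 4.0], [0.0], [1.0], rho=0.5)
print("design loss:", example)
```

Setting `rho` close to 1 makes the training focus on the thicknesses, while values close to 0 emphasise the choice of material, which is why the text treats it as a hyper-parameter.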

A similar approach has also been used to design the optical transmission spectra of multilayer thin films (**Figure 2c, d**) [8]. Here, Liu et al. combined forward modelling and inverse design in a tandem network (TN) architecture to overcome the data inconsistency that originates from the non-uniqueness of inverse scattering problems, i.e., the same optical response can correspond to different designs. This non-uniqueness of the response-to-design mapping causes conflicting examples within the training set and may prevent the neural network from converging. The TN architecture consists of an inverse-design network connected to a forward-model network. The forward network learns the mapping from the structural parameters to the optical responses and is trained separately first. It is then placed after the inverse-design network, and its weights remain fixed while the inverse-design network is trained. The inverse-design network learns a mapping from the optical responses to the structural parameters. After training, such a network can predict the geometry of a device much faster than design loops based on conventional electromagnetic solvers. As shown in the left panel of **Figure 2d**, the learning curve of this tandem neural network demonstrates fast convergence during training. The structures designed by the network match the desired transmission spectra with high fidelity.
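The effect of the tandem arrangement can be seen in a deliberately small toy problem, sketched below in plain Python. Everything in it is an assumption made for illustration: the frozen forward model is the analytic function f(d) = d², so that the designs +d and −d give the same response, and the inverse model has a single trainable parameter `a` in g(r) = a·√r. Fitting the inverse model directly to the conflicting design labels averages them out, whereas training it through the frozen forward model compares responses rather than designs and settles on one valid branch.

```python
import math

def forward_model(d):
    """Frozen forward model (toy stand-in): design d -> response d^2."""
    return d * d

# Training responses; each is produced by two designs, +sqrt(r) and -sqrt(r).
responses = [0.1 * k for k in range(1, 11)]

def direct_inverse_fit():
    """Least-squares fit d ~ c * sqrt(r) on the conflicting (r, d) pairs."""
    num = den = 0.0
    for r in responses:
        s = math.sqrt(r)
        for d in (s, -s):            # both designs are valid labels for r
            num += d * s
            den += s * s
    return num / den

def tandem_fit(a=0.3, lr=0.1, steps=300):
    """Train the inverse model g(r) = a*sqrt(r) through the frozen forward model."""
    for _ in range(steps):
        grad = 0.0
        for r in responses:
            d = a * math.sqrt(r)             # inverse model proposes a design
            err = forward_model(d) - r       # loss is measured on the response
            # d(err^2)/da = 2*err * f'(d) * dd/da = 2*err * (2*d) * sqrt(r)
            grad += 2.0 * err * 2.0 * d * math.sqrt(r)
        a -= lr * grad / len(responses)
    return a

c_direct = direct_inverse_fit()
a_tandem = tandem_fit()
print("direct fit coefficient:", c_direct)
print("tandem fit coefficient:", a_tandem)
print("response of tandem design for r = 0.5:",
      forward_model(a_tandem * math.sqrt(0.5)))
```

The direct fit collapses towards a coefficient of zero, i.e. a design that reproduces none of the targets, while the tandem fit converges to a coefficient of one, i.e. a design whose response matches the requested one.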

### **2.2 Design of plasmonic metasurfaces by deep learning**

Plasmonic metasurfaces have become building blocks of the meta-optics field. They allow the wavefront of an electromagnetic wave to be manipulated at will. In this section, we summarise the current status of applying deep learning to the inverse design of plasmonic metasurfaces.

#### **Figure 3.**

*Application of DL for plasmonic metasurfaces inverse design: (a) architecture of the DNN composed of xxx. (b) Demonstration of the inverse design of "H"-shaped gold metasurfaces. (c) The architecture of a proposed GAN model composed of a generator, a simulator, and a critic. (d) Transmission spectra of the original (left) and generated (right) patterns from the proposed GAN approach.*

In recent years, with the burgeoning field of metasurfaces, deep learning has emerged as a powerful tool for the efficient inverse design of different types of plasmonic metasurfaces for applications including spectral control and near-field design [9–11]. In 2018, Malkiel et al. introduced a bidirectional DNN model that can realise both the design and the characterisation of plasmonic metasurfaces [12]. The network consists of two standard DNNs: a geometry-predicting network (GPN) that solves the inverse design problem and a spectrum-predicting network (SPN) that predicts the spectra of plasmonic metasurfaces made of "H"-shaped gold nanostructures. They showed that when the two networks are combined and optimised together, they co-adapt to each other, which is more effective than training them separately, as shown in **Figure 3a**. The training data for the GPN consist of three groups: the desired spectra for the *x*-polarised and *y*-polarised pumps, and the material properties. Each group is fed into a separate DNN, three in parallel, whose outputs are then joined in the fully connected joint layers. This architecture takes the different nature of the input groups into account and thus improves the performance of the network for nanophotonic design. The geometry predicted by the GPN is then fed into the SPN, which returns the predicted transmission spectra as outputs, and backpropagation is used to optimise both networks. The networks show excellent agreement between measurements, predictions and simulations, as demonstrated by the two examples in **Figure 3b**, where the network is used for the inverse design of "H"-shaped gold metasurfaces for target spectra.
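The parallel-branch input stage described for the GPN can be sketched as a forward pass in plain Python. This is only a schematic under stated assumptions: the layer sizes (four-point spectra, two material values, three hidden units per branch, five geometry outputs), the ReLU activations and the random initialisation are hypothetical and are not taken from Ref. [12]. The point is the data flow: three inputs of different nature are processed separately and then concatenated before the joint layers.

```python
import random

def dense(x, weights, biases):
    """Fully connected layer followed by a ReLU activation."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]

def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
# Hypothetical sizes, chosen only to keep the example small.
branch_x = init_layer(rng, 4, 3)   # branch for the x-polarised spectrum
branch_y = init_layer(rng, 4, 3)   # branch for the y-polarised spectrum
branch_m = init_layer(rng, 2, 3)   # branch for the material properties
joint = init_layer(rng, 9, 5)      # joint layer acting on the merged features

def gpn_forward(spec_x, spec_y, material):
    """Process each input group in its own branch, then merge and map to geometry."""
    hx = dense(spec_x, *branch_x)
    hy = dense(spec_y, *branch_y)
    hm = dense(material, *branch_m)
    merged = hx + hy + hm          # list concatenation: 3 + 3 + 3 features
    return dense(merged, *joint)

geometry = gpn_forward([0.2, 0.5, 0.9, 0.4], [0.1, 0.3, 0.8, 0.6], [1.0, 0.5])
print("number of geometry outputs:", len(geometry))
```

In the bidirectional scheme of the text, the output of such a network would be passed on to the SPN and the spectral error backpropagated through both networks. That training loop is omitted here.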

As the structural complexity grows, generating the training data sets takes an enormous amount of time. Furthermore, the need for more degrees of freedom in metasurface patterns makes the problem increasingly challenging for conventional neural networks. To address this issue, a generative adversarial network (GAN) has recently been employed for metasurface design [13]. A GAN places two neural networks (a generator and a critic) in competition with each other until they reach an optimum, as shown in **Figure 3c**. Here, the simulator was first pretrained using 6500 full-wave finite element method (FEM) simulations of metasurfaces with different shapes. After training, the simulator was used to approximate the transmission spectra of any input pattern, replacing full-wave FEM simulations. This significantly reduced the amount of simulated data required by the network. The generator produces metasurface patterns in response to a given input spectrum T, and these patterns are fed into the simulator to obtain the approximated spectrum T′. The critic compares the original geometric data corresponding to T with the patterns generated by the generator and guides the generator to produce patterns that share common features with the geometric input data. **Figure 3d** gives one example demonstrating the excellent performance of this network in identifying a structure that produces the target spectrum with only minor deviations.

### **2.3 Design of dielectric metasurface by deep learning**

Dielectric metasurfaces have attracted extensive interest in recent years. Analogous to metallic nanostructures supporting plasmonic resonances, high-index dielectric nanostructures support multipole electric and magnetic resonances (also called Mie resonances), which enable 2π phase coverage with ease. Besides, the intrinsic material loss of high-index semiconductors is much lower than that of noble metals. These two properties make it possible to develop high-performance photonic devices based on dielectric metasurfaces. Although dielectric metasurfaces built from regular elements perform much better than plasmonic metasurfaces, they still do not reach the optimal efficiency. To further improve the performance of dielectric metasurfaces, inverse design approaches, including adjoint-based topology optimisation and genetic algorithms, have been widely used. These iterative optimisation methods have led to highly efficient devices with irregular patterns that are usually beyond human intuition. However, they rely on extremely heavy computation, which makes them hard to apply to sophisticated devices with a very high-dimensional design space. The recently developed deep learning approach, which is based on artificial neural networks, is viewed as a promising solution for handling massive data while reducing the computational cost. It has already found great success in computer vision and natural language processing. Recently, researchers have transferred deep learning to the inverse design of nanophotonic devices. To date, the neural networks most frequently used in the design of dielectric metasurfaces are DNNs, GANs and convolutional neural networks (CNNs). In the following, we illustrate them one by one and discuss their strengths and drawbacks.

DNNs with fully connected layers have been demonstrated as a versatile and efficient way of engineering a high-Q resonance with desired characteristics, including linewidth, amplitude and spectral location [14]. The structure considered here consists of two identical silicon nanobars on a substrate, as shown in **Figure 4b**. The width and length of the nanobars are denoted W and L, respectively, while the centre-to-centre distance between the nanobars is denoted 2x0. To reduce the structural complexity, the period of the unit cell and the thickness of the silicon bars are fixed at p = 900 nm and t = 150 nm, respectively. Previous studies have demonstrated that such an array supports a Fano resonance induced by a quasi-bound state in the continuum. Since there are three parameters to be tuned, it is very challenging to find the desired structure parameters by brute-force search when the spectral response is predefined. A DNN can address this problem in a much shorter time. A total of 25,000 training examples were randomly generated with rigorous coupled-wave analysis (RCWA). It is worth noting that training a network to map structure parameters to the reflection/transmission spectrum is straightforward, because one set of structure parameters produces exactly one spectrum. The objective, however, is to find the structure parameters for a desired spectral response. It can be challenging to retrieve the required parameters with a single neural network, because the non-uniqueness issue arises. In other words, different designs may produce the same far-field electromagnetic response, because the optical resonance is mainly governed by the volume of the structure and depends only weakly on its shape. To solve this one-to-many issue, a tandem neural network consisting of an inverse-design network and a forward-model network is proposed, as shown in **Figure 4a**. More specifically, the forward network is trained first to learn the mapping from structure parameters to the optical response. After the forward network has been trained, the inverse-design network is trained while the weights and biases of the forward network are kept fixed. Once the full training process is completed, the structure parameters for a predefined optical spectrum can be retrieved in several milliseconds. To test the validity of the tandem network, **Figure 4c**–**e** compares the predefined and predicted spectra of Fano resonances with different wavelengths, linewidths and amplitudes. Excellent agreement is found between the two, indicating the effectiveness of the deep learning approach in the inverse design of nanophotonic structures. Note that only the amplitude of the transmission spectrum is considered here. In many applications of dielectric metasurfaces (e.g., metalenses), both amplitude and phase should be considered in order to shape the wavefront of the electromagnetic wave. An optical resonance is always accompanied by a π phase shift, which may make training on phase spectra difficult, because the output parameters (i.e., phase or amplitude) should preferably be differentiable. Instead of using phase and amplitude, researchers adopt both the real and imaginary parts of the reflection/transmission spectrum as the output of the training data.

#### **Figure 4.**

*(a) The architecture of the tandem network, which consists of an inverse-design network followed by the pretrained forward-model network. (b) Schematic drawing of the unit cell made of two identical silicon nanobars. Inverse design of metasurfaces supporting Fano-profile spectra: (c) λ0 = 1450 nm and 1500 nm, Δλ = 15 nm, q = 0.8. (d) λ0 = 1500 nm, Δλ = 10 nm, q = 0.3 and q = 0.5. (e) λ0 = 1500 nm, Δλ = 5 nm and Δλ = 15 nm, q = 0.7. (f) Schematic of the conditional GLOnet for metagrating design. (g) Optimised efficiency of the metagrating from the conditional GLOnet.*

Moreover, because of the large mismatch between the dimensions of the input and the output, a revised neural network was applied. The first standard linear layer of the network was replaced with a bilinear tensor layer that can correlate two entity vectors in multiple dimensions. The training results indicated that the modified neural network converges faster than the standard one. This is because the input parameters are interdependent. Taking an array of dielectric nanodisks as an example, the structure is fully described by four parameters: the refractive index of the material, the radius and height of the disk, and the gap between disks. As mentioned previously, the optical resonance is mainly determined by the refractive index and the volume of the structure. In other words, the spectral response is governed by the permittivity (ε = n²) and the volume (V = πr²h). Therefore, the multiplication of two entities by a bilinear tensor can better describe the nonlinearity and thus facilitate the training process. However, it is worth pointing out that deep neural networks have some limitations. First, the design solution retrieved by deep learning must fall within the bounds of the training data set. Second, the approach only works for structures defined by a few simple parameters. When more parameters are involved, tens or hundreds of thousands of training examples are required to guarantee the prediction accuracy. As a consequence, generating such a large amount of data may take a long time and incur a high computational cost. Moreover, it is challenging to train a DNN for dielectric metasurfaces with free-form geometry.
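To make the role of the bilinear tensor layer more concrete, the sketch below implements one common form of such a layer in plain Python: each output combines a bilinear term e1ᵀW[k]e2 with an ordinary linear term and a bias. The exact layer used in the cited work may differ, and the grouping of the nanodisk parameters into the two entity vectors `e1 = [n, r]` and `e2 = [n, h]` is a hypothetical choice made only for this illustration. The example shows that a single tensor slice can pick out products such as n·n or r·h, which a purely linear first layer cannot represent.

```python
import math

def bilinear_term(e1, e2, Wk):
    """Scalar e1^T Wk e2 for one slice Wk of the tensor."""
    return sum(e1[i] * Wk[i][j] * e2[j]
               for i in range(len(e1)) for j in range(len(e2)))

def bilinear_tensor_layer(e1, e2, W, V, b):
    """out_k = tanh(e1^T W[k] e2 + V[k] . [e1; e2] + b[k]) for every slice k."""
    concat = e1 + e2
    out = []
    for Wk, Vk, bk in zip(W, V, b):
        linear = sum(v * c for v, c in zip(Vk, concat))
        out.append(math.tanh(bilinear_term(e1, e2, Wk) + linear + bk))
    return out

# Hypothetical entity vectors: e1 = [n, r], e2 = [n, h].
n, r, h = 2.0, 3.0, 5.0
e1, e2 = [n, r], [n, h]

W = [
    [[1.0, 0.0], [0.0, 0.0]],   # slice 0 selects n * n, i.e. the permittivity
    [[0.0, 0.0], [0.0, 1.0]],   # slice 1 selects r * h
]
V = [[0.0] * 4, [0.0] * 4]      # linear part switched off for clarity
b = [0.0, 0.0]

print("n^2 :", bilinear_term(e1, e2, W[0]))
print("r*h :", bilinear_term(e1, e2, W[1]))
print("layer output:", bilinear_tensor_layer(e1, e2, W, V, b))
```

In a trained network the slices of `W` are learned rather than hand-set. The example only indicates why products of input parameters, which dominate the resonance behaviour described above, are directly accessible to this type of layer.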

## *Deep Learning Enabled Nanophotonics DOI: http://dx.doi.org/10.5772/intechopen.93289*

*Advances and Applications in Deep Learning*

unique strengths and drawbacks.

resonance (also called as Mie resonance), which enable 2π phase coverage without ease. Besides, the intrinsic material loss is much lower for high index semiconductor than the counterpart of noble metals. These two unique properties make it possible to develop high-performance photonic devices based on dielectric metasurface. Although dielectric metasurfaces with such regular elements have much better performance compared to the plasmonic metasurfaces, they still do not reach the optimal one with the best efficiency. In order to further improve the performance of dielectric metasurface, inverse design approaches, including adjoint-based topology optimisation and genetic algorithms, have been widely used. The iterative optimisation methods lead to the findings of devices with high efficiency with irregular patterns which are usually beyond human intuition. However, these methods rely on extremely heavy computation, making them hard to apply to sophisticated devices featured by a very high dimensional design space. The recently developed deep learning approach, which is based on artificial neural networks, is viewed as the perfect solution of dealing massive data while reducing the computation cost. It has already found great success in computer vision and natural language processing. Recently, researchers have transferred deep learning to the inverse design of nanophotonic devices. Up to date, most frequently used neural networks in the design of dielectric metasurfaces are DNN, GAN, and convolution neural networks (CNN) In the following, we will illustrate them one by one and also discuss their

A DNN with fully connected layers has been demonstrated as a versatile and efficient way of engineering a high-Q resonance with desired characteristics, including linewidth, amplitude, and spectral location [14]. The structure considered here is a pair of identical silicon nanobars sitting on a substrate, as shown in **Figure 4b**. The width and length of the nanobars are denoted as W and L, respectively, while the centre-to-centre distance between the nanobars is denoted as 2x₀. To reduce the structural complexity, the period of the unit cell and the thickness of the silicon bars are fixed at p = 900 nm and t = 150 nm, respectively. Previous studies have demonstrated that

**Figure 4.**

*(a) The architecture of the tandem network, which consists of an inverse-design model network followed by the pretrained forward model network. (b) Schematic drawing of the unit cell made of two identical silicon nanobars. Inverse design of metasurfaces supporting Fano-profile spectra: (c) λ0 = 1450 nm and 1500 nm, Δλ = 15 nm, q = 0.8; (d) λ0 = 1500 nm, Δλ = 10 nm, q = 0.3 and q = 0.5; (e) λ0 = 1500 nm, Δλ = 5 nm and Δλ = 15 nm, q = 0.7. (f) Schematic of the conditional GLOnet for metagrating design. (g) Optimised efficiency of metagrating from the conditional GLOnet.*

such an array structure supports a Fano resonance induced by a quasi-bound state in the continuum. Since there are three parameters to be tuned, it is very challenging to find the desired structure parameters by brute-force searching one by one when the spectral response is predefined. A DNN can address this issue in a much shorter time. 25,000 sets of training data are randomly generated with rigorous coupled-wave analysis (RCWA). It is worth noting that training the network that maps structure parameters to the reflection/transmission spectrum is straightforward, because each set of structure parameters produces exactly one spectrum. The objective, however, is to search for the structure parameters that give a desired spectral response. Finding the required parameters with a single feed-forward network alone can be challenging because the non-uniqueness issue arises. In other words, different designs may produce the same far-field electromagnetic response, because the optical resonance is mainly governed by the volume of the structure and depends only weakly on its shape. To solve this one-to-many issue, a tandem neural network consisting of an inverse-design model network and a forward model network is proposed, as shown in **Figure 4a**. More specifically, the forward network is trained first to learn the mapping from structure parameters to the optical response. After the forward network has been trained, the inverse-design model network is trained while the weights and biases of the forward network are kept fixed. Once the full training process is completed, one can retrieve the structure parameters for a predefined optical spectrum in several milliseconds. To test the validity of the tandem network, **Figure 4c**–**e** compares the predefined and predicted spectra of Fano resonances with different wavelengths, linewidths and amplitudes. Excellent agreement is found between the two, indicating the effectiveness of the deep learning approach in the inverse design of nanophotonics. Note that only the amplitude of the transmission spectrum is considered here. In many applications of dielectric metasurfaces (e.g., metalenses), both amplitude and phase should be considered to shape the wavefront of the electromagnetic wave. However, an optical resonance is always accompanied by a π phase shift, which may make training on phase spectra difficult, because the output parameters (i.e., phase or amplitude) should preferably be smooth and differentiable. Instead of using phase and amplitude, researchers therefore adopt both the real and imaginary parts of the reflection/transmission spectrum as the output of the training data.
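The tandem training scheme described above can be illustrated with a deliberately small toy problem. The sketch below is not the network from [14]: it assumes an analytic function `forward_model` as a stand-in for the pretrained, frozen forward network, and a hypothetical three-parameter `inverse_model`. The response depends only on the product of the two design parameters, which mimics the volume-dominated, non-unique behaviour discussed in the text.

```python
# Toy tandem scheme (all names and the analytic forward model are assumptions).

def forward_model(w, l):
    """Stand-in for the frozen forward network: response depends only on w * l."""
    return w * l

def inverse_model(params, s):
    """Hypothetical inverse model mapping a target response s to a design (w, l)."""
    p0, p1, p2 = params
    return p0 + p1 * s, p2

def tandem_loss(params, targets):
    """Mean squared error between each target and forward(inverse(target))."""
    errs = [(forward_model(*inverse_model(params, s)) - s) ** 2 for s in targets]
    return sum(errs) / len(errs)

def train_inverse(targets, steps=5000, lr=0.05):
    """Gradient descent on the inverse parameters only; the forward model is untouched."""
    p0, p1, p2 = 0.5, 0.5, 0.5
    n = len(targets)
    for _ in range(steps):
        g0 = g1 = g2 = 0.0
        for s in targets:
            w, l = inverse_model((p0, p1, p2), s)
            r = forward_model(w, l) - s   # residual measured in response space
            g0 += 2 * r * l / n           # chain rule through the forward model
            g1 += 2 * r * l * s / n
            g2 += 2 * r * w / n
        p0, p1, p2 = p0 - lr * g0, p1 - lr * g1, p2 - lr * g2
    return p0, p1, p2

targets = [0.5, 0.75, 1.0, 1.25, 1.5]
params = train_inverse(targets)
final_loss = tandem_loss(params, targets)

# Non-uniqueness: two different designs give the same response ...
same_response = forward_model(2.0, 0.5) == forward_model(0.5, 2.0)
# ... but their average, which naive regression onto designs drifts toward, does not.
avg_response = forward_model(1.25, 1.25)
```

Because the loss is evaluated on the response and not on the design, any design that reproduces the target is accepted, so conflicting training pairs no longer pull the inverse model toward an invalid average.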

Moreover, because of the huge mismatch between the dimensions of the input and the output, a revised neural network was applied. The first standard linear layer of the network was replaced with a bilinear tensor layer that can correlate two entity vectors in multiple dimensions. Training results indicated that the modified neural network converges faster than the standard linear neural network. This is because the input parameters are interdependent. Taking an array of dielectric nanodisks as an example, the structure is fully described by four parameters: the refractive index of the material, the radius and height of the disk, and the gap between disks. As mentioned previously, the optical resonance is mainly determined by the refractive index and volume of the structure. In other words, the spectral response is governed by the permittivity (ε = n²) and the volume (V = πr²h). Therefore, the multiplication of two entities by the bilinear tensor can better describe the nonlinearity and thus facilitate the training process. However, it is worth pointing out that deep neural networks have some limitations. First, the design solution retrieved from deep learning must fall within the boundary of the training data set. Second, the approach only works for structures defined by a few simple parameters. When more parameters are involved, tens or even hundreds of thousands of training samples are required to guarantee the prediction accuracy. Consequently, generating such a large amount of data may take a long time and incur a high computational cost. Moreover, it will be challenging to train a DNN for dielectric metasurfaces with free-form geometry.
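To make the role of the bilinear term concrete, the following minimal sketch assumes the layer has the form commonly used in neural tensor networks, out[k] = f(e1ᵀ W[k] e2 + V[k]·[e1; e2] + b[k]). The text does not give the exact formula, so this form, the function name `bilinear_tensor_layer`, the grouping of inputs into two entities and the numerical values are all illustrative assumptions.

```python
import math

def bilinear_tensor_layer(e1, e2, W, V, b, activation=math.tanh):
    """Assumed form: out[k] = activation(e1^T W[k] e2 + V[k] . [e1; e2] + b[k])."""
    concat = list(e1) + list(e2)
    out = []
    for W_k, V_k, b_k in zip(W, V, b):
        # Bilinear term: couples every component of e1 with every component of e2.
        bilinear = sum(e1[i] * W_k[i][j] * e2[j]
                       for i in range(len(e1))
                       for j in range(len(e2)))
        # Ordinary linear term acting on the concatenated entities.
        linear = sum(v * x for v, x in zip(V_k, concat))
        out.append(activation(bilinear + linear + b_k))
    return out

# Illustrative values only: refractive index, disk radius and height.
n, r, h = 3.5, 0.2, 0.15
eps = n ** 2                  # permittivity
vol = math.pi * r ** 2 * h    # disk volume

# With a single slice W = [[1]] and no linear term or bias, the layer
# reproduces the product eps * vol exactly (identity activation for clarity).
product_only = bilinear_tensor_layer(
    [eps], [vol], W=[[[1.0]]], V=[[0.0, 0.0]], b=[0.0],
    activation=lambda x: x)
```

A purely linear layer would need several stacked nonlinear layers to approximate such a product, whereas here one weight suffices, which is one plausible reading of why the modified network converges faster.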

GANs have been found to overcome the above limitations effectively. GANs were originally proposed in computer vision and are capable of creating artificial images that cannot be distinguished from true images even by computers [15]. GANs have been successfully applied to the design of subwavelength-scale metallic nanostructures and multifunctional dielectric metasurfaces [13, 16]. The operating principle of a GAN in metasurface design is as follows. The unit cell of the metasurface is divided into an N × N pixel image (e.g., N = 32 or 64), while the thickness of the structure and the period of the unit cell are fixed. A GAN contains two neural networks: a generator and a discriminator. The generator tries to create images that cannot be differentiated from real images. In contrast, the discriminator is trained to distinguish images produced by the generator from the real image set. The competition between these two networks leads to the creation of artificial images that cannot be distinguished from real ones. In fact, topology optimisation and deep learning do not always have to work alone; they can be combined to build a new generative network. Such a generative network has been proposed to optimise the efficiency of metagratings at large angles across a broad wavelength range, taking advantage of both GANs and adjoint-based topology optimisation [17]. Although a GAN requires fewer training sets, the training data may need to be optimised first, which demands more computational resources. More recently, global topology optimisation networks (GLOnets) were proposed by Jiang et al. from Stanford [18, 19]. They incorporate adjoint-based optimisation into generative neural networks. Unlike the DNN and GAN methods, GLOnets do not require training data to be pre-calculated with an electromagnetic solver. Instead, they adopt generator networks followed by an adjoint-based topology optimiser, allowing the physical relationship between the geometric parameters of the device and the electromagnetic response to be learned directly, as shown in **Figure 4f**. Such a global optimiser not only reduces the computation time but also further improves the efficiency of metagratings at large angles compared with the topology optimisation method (see **Figure 4g**).
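The competition between generator and discriminator described above can be written down as a pair of loss functions. The sketch below uses the standard binary cross-entropy formulation with the non-saturating generator loss. This is an assumption for illustration: the cited works may use other objectives, and GLOnets in particular replace the discriminator with a physics-based optimiser. The inputs `d_real` and `d_fake` stand for hypothetical discriminator outputs (probabilities of being real) on real and generated unit-cell images.

```python
import math

def bce(p, label):
    """Binary cross-entropy for one predicted probability p and a 0/1 label."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)   # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) -> 1 and D(fake) -> 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating form: the generator wants D(fake) -> 1."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# A confident discriminator: low discriminator loss, high generator loss.
confident_d = discriminator_loss([0.9, 0.95], [0.1, 0.05])
confident_g = generator_loss([0.1, 0.05])

# A fully fooled discriminator outputs 0.5 everywhere.
fooled_d = discriminator_loss([0.5, 0.5], [0.5, 0.5])
fooled_g = generator_loss([0.5, 0.5])
```

When the discriminator can no longer tell the two image sets apart, its loss settles at 2 ln 2 and the generator loss at ln 2, which corresponds to the state in which generated patterns are indistinguishable from the training patterns.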
