While buoyancy drives natural convection, the performance of a natural convection system is also affected by other physical phenomena and by factors such as geometry, boundary conditions, and fluid behavior. Data-driven metamodels of real-world natural convection systems require datasets that cover the entire feature space. Moreover, some unforeseen features may become active under a different process or after a system is redesigned. Among other limiting factors, generating a new full set of simulations or experiments is time-consuming. Our methodology, which uses transfer learning (TL) with deep neural networks (DNNs), can flexibly adapt to an expanding feature space when a natural convection system becomes more complicated and needs to be described more precisely.
We considered a benchmark problem in square enclosures described by the Rayleigh and Prandtl numbers. We carried out a mesh refinement study across the input space to find the appropriate grid size for accurate simulations. We applied a TL approach using DNNs to this problem and demonstrated the capability of our approach to incorporate additional input features. We built a metamodel to predict the Nusselt number, Nu, as a function of the Rayleigh number, Ra (the only input variable), for an air-filled enclosure. We successfully applied a DNN and transferred its learning from an air-filled enclosure to an enclosure with an arbitrary fluid (i.e., the Prandtl number, Pr, was added as a new input). This TL strategy is versatile and can handle straightforward metamodeling problems for many different engineering systems.
We aimed to extract a metamodel from a physical model that numerically predicts the natural convection characteristics of a square enclosure filled with a Newtonian fluid. This problem is governed by two parameters: Ra and Pr. We consider Ra of up to 10⁸ and Pr greater than 0.05; however, lower Pr values were also considered provided that the ratio Ra/Pr is less than 10⁸.
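For concreteness, the admissible input domain can be encoded as a simple predicate. The following Python sketch mirrors the constraints stated above; the helper name `in_domain` is ours, not part of the original model.

```python
def in_domain(ra: float, pr: float) -> bool:
    """Return True if (Ra, Pr) falls inside the modeled domain:
    Ra up to 1e8 with Pr > 0.05, or lower Pr provided Ra/Pr < 1e8."""
    if pr > 0.05:
        return ra <= 1e8
    return ra / pr < 1e8  # low-Pr cases admitted only while Ra/Pr stays below 1e8

print(in_domain(1e8, 0.71))  # True:  air-filled enclosure at the upper Ra limit
print(in_domain(5e5, 0.01))  # True:  Ra/Pr = 5e7 < 1e8
print(in_domain(2e6, 0.01))  # False: Ra/Pr = 2e8
```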
The average Nu was evaluated for different grid systems at different Pr (at the highest Ra), as summarized in Table 1. The tested grids are all uniform and structured, except that a boundary mesh is applied in which the cells adjacent to the walls are split in half to account for the higher gradients near the walls. In comparison to the results obtained with an 800×800 grid system, the highest errors for the 200×200 and 400×400 grid systems were about 1 and 0.5 percent, respectively (for the most stringent cases). A 400×400 grid system was thus shown to provide accurate results for Nu even in the most stringent cases. Using a single logical processor on a 2.6 GHz Intel Core i7-3720QM CPU, an average computational time of about 4,850 seconds (as high as about 13,000 seconds for low Pr) was spent obtaining the numerical solutions on a 400×400 grid system. A 200×200 grid system (with an average simulation time of 1,300 seconds) was reliably used for Ra of up to 10⁷ with errors of less than 0.5%.
Table 1. Average Nu for different grid systems and Pr at the highest Ra (relative error with respect to the 800×800 grid in parentheses).

| Grid system | Pr = 0.1 | Pr = 1 | Pr = 10 |
| --- | --- | --- | --- |
| 25×25 | 28.94 (14.8%) | 34.65 (12.4%) | 35.64 (12.1%) |
| 50×50 | 26.78 (6.27%) | 33.63 (9.12%) | 35.07 (10.3%) |
| 100×100 | 24.79 (1.65%) | 31.05 (0.74%) | 32.19 (1.27%) |
| 200×200 | 25.09 (0.43%) | 31.07 (0.82%) | 32.13 (1.06%) |
| 400×400 | 25.14 (0.26%) | 30.93 (0.34%) | 31.95 (0.51%) |
| 800×800 | 25.20 | 30.82 | 31.79 |
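As a sanity check, the bracketed percentages in Table 1 are relative deviations from the 800×800 reference solution; the short NumPy sketch below reproduces them (to within the rounding of the tabulated Nu values).

```python
import numpy as np

# Average Nu from Table 1 (rows: grids from 25x25 to 400x400; columns: Pr = 0.1, 1, 10)
nu = np.array([
    [28.94, 34.65, 35.64],   # 25x25
    [26.78, 33.63, 35.07],   # 50x50
    [24.79, 31.05, 32.19],   # 100x100
    [25.09, 31.07, 32.13],   # 200x200
    [25.14, 30.93, 31.95],   # 400x400
])
nu_ref = np.array([25.20, 30.82, 31.79])  # 800x800 reference values

rel_err = np.abs(nu - nu_ref) / nu_ref * 100.0  # percent error vs. finest grid
print(np.round(rel_err, 2))  # agrees with Table 1 up to rounding of the Nu entries
```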
We initially generated a metamodel that predicted the variation of Nu with Ra for an air-filled enclosure (Pr = 0.71). We trained an ANN, illustrated in Fig. 1 (block I), using 30 data points (with Ra ranging from 1 to 2×10⁸), without a validation dataset during training (with the goal of lowering simulation costs). The results of testing our Ra-Nu metamodel on 10 test data points are presented in Table 2 under Step 1.
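A minimal Keras sketch of block I follows. The layer widths and activations here are our assumptions (Fig. 1 defines the actual architecture), and we log-scale Ra since it spans eight decades.

```python
import numpy as np
from tensorflow.keras import layers, models

# Block I: Ra -> Nu metamodel. Layer sizes below are illustrative assumptions.
ra_in = layers.Input(shape=(1,), name="log10_Ra")  # feed log10(Ra): inputs span 8 decades
x = layers.Dense(16, activation="tanh")(ra_in)
x = layers.Dense(16, activation="tanh")(x)
nu_out = layers.Dense(1, name="Nu")(x)
block1 = models.Model(ra_in, nu_out, name="block_I")
block1.compile(optimizer="adam", loss="mse")

# 30 training points with Ra from 1 to 2e8; no validation split, to keep simulation cost low
# block1.fit(np.log10(ra_train), nu_train, epochs=2000, verbose=0)
```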

We extended our metamodel to also consider the effects of Pr. We gave the Pr branch the same structure as the Ra branch (the right branch in Fig. 1) and merged the outputs of the Pr branch and block I using a Multiply layer (Keras API). We added a one-node layer after the Multiply layer to adjust the multiplication coefficient. We froze the layers of block I and trained the new hidden layers using the data points from Step 1 as well as a new dataset of constant-Ra simulation points (24 training points at a fixed Ra = 10⁵ with Pr ranging from 10⁻³ to 10⁵). We trained the ANN using 54 data points and validated it using 20 data points. We tested our ANN metamodel on a test dataset of 100 simulations performed on a 400×400 grid system. The test results for our ANN model are presented in Table 2 under Step 2.
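Continuing the sketch above, Step 2 can be expressed in a few lines: freeze block I, mirror its structure in a Pr branch, merge the two with a Multiply layer, and append the one-node correction layer (branch widths are again assumed).

```python
# Block II: transfer learning step. Freeze block I, add a Pr branch, merge with Multiply.
block1.trainable = False  # keep the pretrained Ra -> Nu mapping fixed

pr_in = layers.Input(shape=(1,), name="log10_Pr")
y = layers.Dense(16, activation="tanh")(pr_in)  # Pr branch mirrors the Ra branch
y = layers.Dense(16, activation="tanh")(y)
pr_out = layers.Dense(1)(y)

merged = layers.Multiply()([block1.output, pr_out])
nu_corr = layers.Dense(1, name="Nu_corrected")(merged)  # one-node coefficient adjustment
block2 = models.Model([block1.input, pr_in], nu_corr, name="block_II")
block2.compile(optimizer="adam", loss="mse")

# Train on the Step-1 points plus the fixed-Ra = 1e5, variable-Pr dataset
# block2.fit([log_ra, log_pr], nu, validation_data=([log_ra_v, log_pr_v], nu_v), ...)
```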
Table 2. Training data (number of simulations on the 200² and 400² grids), ANN parameters, test error, and model cost for each step.

| Step No. | Training data: 200² | Training data: 400² | Trainable params | Non-trainable params | MSE (/10⁻⁵) | MAE (/10⁻³) | RE** (%) | ST (hr) | NNT (hr) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1: Block I | 23 | 7 | 377 | − | 0.01 | 0.30 | 0.07±0.04 | 19.2* | 0.2 |
| 2: Block II | 42 | 32 | 379 | 377 | 22.2 | 9.35 | 2.12±2.55 | 42.7 | 0.8 |
| 3: Block III | 64 | 53 | 337 | 754 | 2.39 | 3.08 | 0.71±0.89 | 75.7 | 3.6 |
To enhance accuracy, one can remove the Multiply layer from block II and replace it with a Concatenate layer followed by two additional hidden layers (block III in Fig. 1). We froze the two branches in Fig. 1 at the weight values from Step 2 (block II). We trained the added hidden layers in block III using new simulation points at fixed Ra = 10⁸ or fixed Pr = 0.05, on top of the previous dataset. As presented in Table 2 (Step 3), our results on the test dataset improved remarkably after this procedure. The scatter plot of the different training datasets employed in this approach is presented in Fig. 2a, and the test error contour is shown in Fig. 2b (for Step 3). Further improvements in model accuracy can be achieved by retraining the Step-3 DNN with more data. For example, using 31 additional data points (shown by grey symbols in Fig. 2a) decreased the test loss by 51% (RE = 0.45±0.66) at an increased simulation cost of 28%.
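A sketch of this Step-3 modification, continuing from the code above: both trained branches are frozen at their Step-2 weights, the Multiply merge is replaced by Concatenate, and two new hidden layers (widths assumed) are trained.

```python
# Block III: swap the Multiply merge for Concatenate + two new trainable hidden layers.
for layer in block2.layers:
    layer.trainable = False  # freeze both branches at their Step-2 weights

z = layers.Concatenate()([block1.output, pr_out])  # reuse the frozen branch outputs
z = layers.Dense(16, activation="tanh")(z)  # new trainable hidden layers
z = layers.Dense(16, activation="tanh")(z)
nu_out3 = layers.Dense(1)(z)
block3 = models.Model([block1.input, pr_in], nu_out3, name="block_III")
block3.compile(optimizer="adam", loss="mse")
# Retrain on the previous data plus new points at fixed Ra = 1e8 or fixed Pr = 0.05
```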
