
Wickham, Hadley, Romain François, Lionel Henry, and Kirill Müller (2022). dplyr: A Grammar of Data Manipulation. R package version 1.0.10. url: https://CRAN.R-project.org/package=dplyr.

Wüthrich, Mario V. (Jan. 2019). “From Generalized Linear Models to Neural Networks, and Back”. In: SSRN Electronic Journal 1. issn: 1556-5068. doi: 10.2139/ssrn.3491790.

Appendix

Appendix A maidrr algorithms

In this appendix, two of the key algorithms for the maidrr method are given. The first algorithm focuses on generating a suitable surrogate model if penalty parameters are given, and the second algorithm shows how to find optimal penalty parameters.

A.1 maidrr surrogate model algorithm

In this section, a copy of the maidrr algorithm as seen in (Henckaerts, Antonio, and Côté, 2020) is presented. Note that here we assume optimal penalty values $\lambda_{\text{main}}$, $\lambda_{\text{intr}}$ are already found.

Algorithm 2 maidrr surrogate algorithm

Input: data, $\lambda_{\text{main}}$, $\lambda_{\text{intr}}$, $k$, $h$
// Main effect loop
for $j = 1$ to $p$ do
    Calculate $\widehat{PD}(x_{\{j\}})$ for all unique values of variable $X_j$ in the data.
    Apply DP algorithm to group values of $X_j$ using $k^*_{\{j\}} = \arg\min_{k_{\{j\}} \in \{1, \dots, k\}}$ Eq. (3.5) for $\lambda = \lambda_{\text{main}}$
    Define $X^c_j$ as the grouped version of $X_j$ with $k^*_{\{j\}}$ groups
end for
Feature selection: $F_{\text{eat}} = \{ j \mid k^*_{\{j\}} > 1 \}$
// Interaction effect loop
Interaction selection: $I = \{ (l, m) \mid l, m \in F_{\text{eat}} \text{ and } H^2_{\{l,m\}} \geq h \}$
for $(a, b)$ in $I$ do
    Calculate $\widehat{PD}(x_{\{a,b\}})$ for all unique combinations of variables $X_a$ and $X_b$ in the data.
    Apply DP algorithm to group interactions of $(X_{\{a,b\}})$ using $k^*_{\{a,b\}} = \arg\min_{k_{\{a,b\}} \in \{1, \dots, k\}}$ Eq. (3.5) for $\lambda = \lambda_{\text{intr}}$
    Define $X^c_{a:b}$ as the grouped version of interaction variable $X_{a:b}$ with $k^*_{\{a,b\}}$ groups
end for
Interaction selection: $I_{\text{eat}} = I \setminus \{ (l, m) \mid k^*_{\{l,m\}} = 1 \}$
Fit GLM to response using features $X^c_j$ for $j \in F_{\text{eat}}$ and interactions $X^c_{a:b}$ for $(a, b) \in I_{\text{eat}}$
Output: Surrogate GLM
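The grouping step inside the main effect loop of Algorithm 2 can be illustrated with a minimal Python sketch. It is not the maidrr implementation, and it rests on several assumptions that are not taken from this appendix: Eq. (3.5) is not reproduced here, so the sketch uses a stand-in penalised objective (mean squared grouping error plus $\lambda \log_{10} k$); all partial dependence values are weighted equally; and the values are sorted before grouping, so groups are contiguous in PD value. The function names `dp_partition` and `choose_groups` are hypothetical.

```python
import math


def dp_partition(xs, k):
    """Split sorted values xs into exactly k contiguous groups with minimal
    within-group sum of squared errors (SSE), using dynamic programming.
    Returns (sse, list of (start, end) index pairs)."""
    n = len(xs)
    # Prefix sums give the SSE of any segment xs[i:j] in constant time.
    s1, s2 = [0.0] * (n + 1), [0.0] * (n + 1)
    for idx, x in enumerate(xs):
        s1[idx + 1] = s1[idx] + x
        s2[idx + 1] = s2[idx] + x * x

    def cost(i, j):
        s = s1[j] - s1[i]
        return (s2[j] - s2[i]) - s * s / (j - i)

    # best[g][j]: minimal SSE when the first j values form g groups.
    best = [[math.inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(g, n + 1):
            for i in range(g - 1, j):
                c = best[g - 1][i] + cost(i, j)
                if c < best[g][j]:
                    best[g][j], back[g][j] = c, i

    # Walk the back-pointers to recover the group boundaries.
    cuts, j = [], n
    for g in range(k, 0, -1):
        i = back[g][j]
        cuts.append((i, j))
        j = i
    cuts.reverse()
    return best[k][n], cuts


def choose_groups(pd_values, k_max, lam):
    """Pick the number of groups k* in {1, ..., k_max} that minimises an
    ASSUMED penalised objective (a stand-in for Eq. (3.5)):
    mean squared grouping error + lam * log10(k)."""
    xs = sorted(pd_values)
    best_k, best_obj, best_cuts = 1, math.inf, [(0, len(xs))]
    for k in range(1, min(k_max, len(xs)) + 1):
        sse, cuts = dp_partition(xs, k)
        obj = sse / len(xs) + lam * math.log10(k)
        if obj < best_obj:
            best_k, best_obj, best_cuts = k, obj, cuts
    return best_k, [xs[i:j] for i, j in best_cuts]


if __name__ == "__main__":
    pd_hat = [0.10, 0.11, 0.12, 0.50, 0.52, 0.90]  # made-up PD values
    print(choose_groups(pd_hat, k_max=6, lam=0.01))
    print(choose_groups(pd_hat, k_max=6, lam=1.0))
```

Under these assumptions, a small penalty keeps several groups, while a large penalty collapses the variable into a single group, which is exactly the case that the feature selection step $k^*_{\{j\}} > 1$ filters out.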

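A.2 maidrr penalty tuning algorithm

In this section, an algorithm for maidrr penalty tuning is presented. This algorithm is not explicitly stated in (Henckaerts, Antonio, and Côté, 2020) but is implemented in the corresponding R package maidrr (Henckaerts, 2020). Denote $\vec{\lambda}_{\text{main}}$ and $\vec{\lambda}_{\text{intr}}$ as grids of potential values for the corresponding penalty parameters.

Algorithm 3 maidrr penalty tuning algorithm

Input: data, $\vec{\lambda}_{\text{main}}$, $\vec{\lambda}_{\text{intr}}$, $k$, $h$, $k_{\text{fold}}$
Split data randomly into $k_{\text{fold}}$ parts.
// Main penalty tuning
for $\lambda$ in $\vec{\lambda}_{\text{main}}$ do
    for $i = 1$ to $k_{\text{fold}}$ do
        Use part $i$ of the data as validation split and the rest as training split
        Run main effect loop of maidrr surrogate Algorithm 2 with $\lambda_{\text{main}} = \lambda$
        Fit a surrogate GLM using variables $X^c_j$ for $j \in F_{\text{eat}}$ on the training split
        Calculate the validation split loss $\text{Val}_i$
        Save $\text{Val}_i$ corresponding to penalty $\lambda$ and fold $i$
    end for
end for
For every $\lambda \in \vec{\lambda}_{\text{main}}$, calculate $\text{cverr}_{\lambda} = \frac{1}{k_{\text{fold}}} \sum_{i=1}^{k_{\text{fold}}} \text{Val}_i$, where $\text{Val}_i$ are the losses saved for that $\lambda$.
Select $\lambda^*_{\text{main}} = \arg\min_{\lambda \in \vec{\lambda}_{\text{main}}} \text{cverr}_{\lambda}$.
// Interaction penalty tuning
Use $F^*_{\text{eat}}$ and groupings $k^*_{\{j\}}$, $j = 1, \dots, p$, belonging to $\lambda^*_{\text{main}}$
for $\lambda$ in $\vec{\lambda}_{\text{intr}}$ do
    for $i = 1$ to $k_{\text{fold}}$ do
        Use part $i$ of the data as validation split and the rest as training split
        Run interaction effect loop of maidrr surrogate Algorithm 2 with $\lambda_{\text{intr}} = \lambda$
        Fit a surrogate GLM using variables $X^c_j$ for $j \in F^*_{\text{eat}}$ and interactions $X^c_{a:b}$ for $(a, b) \in I_{\text{eat}}$ on the training split
        Calculate the validation split loss $\text{Val}_i$
        Save $\text{Val}_i$ corresponding to penalty $\lambda$ and fold $i$
    end for
end for
For every $\lambda \in \vec{\lambda}_{\text{intr}}$, calculate $\text{cverr}_{\lambda} = \frac{1}{k_{\text{fold}}} \sum_{i=1}^{k_{\text{fold}}} \text{Val}_i$, where $\text{Val}_i$ are the losses saved for that $\lambda$.
Select $\lambda^*_{\text{intr}} = \arg\min_{\lambda \in \vec{\lambda}_{\text{intr}}} \text{cverr}_{\lambda}$.
Fit GLM to response using features $X^c_j$ for $j \in F^*_{\text{eat}}$ and interactions $X^c_{a:b}$ for $(a, b) \in I^*_{\text{eat}}$ corresponding to penalties $\lambda^*_{\text{main}}$ and $\lambda^*_{\text{intr}}$, using all of the data.
Output: Surrogate GLM with tuned values of penalties $\lambda^*_{\text{main}}$ and $\lambda^*_{\text{intr}}$

Non-exclusive licence to reproduce thesis and make thesis public

I, Artur Tuttar,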

1. herewith grant the University of Tartu a free permit (non-exclusive licence) to reproduce, for the purpose of preservation, including for adding to the DSpace digital archives until the expiry of the term of copyright, Extending generalized linear models in insurance with machine learning techniques, supervised by Meelis Käärik.
2. I grant the University of Tartu a permit to make the thesis specified in point 1 available to the public via the web environment of the University of Tartu, including via the DSpace digital archives, under the Creative Commons licence CC BY NC ND 4.0, which allows, by giving appropriate credit to the author, to reproduce, distribute the work and communicate it to the public, and prohibits the creation of derivative works and any commercial use of the work until the expiry of the term of copyright.
3. I am aware of the fact that the author retains the rights specified in points 1 and 2.
4. I confirm that granting the non-exclusive licence does not infringe other persons’ intellectual property rights or rights arising from the personal data protection legislation.

Artur Tuttar
16.05.2023
