A Design-of-Experiments-Based Approach for Efficient Estimation of Bimodal Gaussian Mixture Weights

A Design-of-Experiments-Based Approach for Efficient Estimation of Bimodal Gaussian Mixture Weights / Leal, G.S., Bessegato, L.F., Xavier, Y.S.M., Melgani, F., Balestrassi, P.P.. - In: IEEE ACCESS. - ISSN 2169-3536. - 13:(2025), pp. 168322-168334. [10.1109/ACCESS.2025.3614023]

Leal G. S.;Bessegato L. F.;Xavier Y. S. M.;Melgani F.;Balestrassi P. P.

2025-01-01

Abstract

Normal mixture models are widely used to represent data arising from latent subpopulations. We propose a Design-of-Experiments (DOE) and Response Surface Methodology (RSM) framework to estimate the weights of a bimodal Gaussian mixture when the component families are known. The procedure is non-iterative: rather than alternating Expectation-Maximization (EM) steps, it performs a two-stage method (fit a quadratic response surface to the sample log-likelihood over the weight simplex, then solve one constrained optimization), followed by a final Maximum Likelihood re-estimation of the means and variances. This yields a predictable runtime (driven by design size) and reduced sensitivity to initialization. The pipeline 1) uses k-medians to obtain preliminary component parameters and 99% confidence intervals (CIs) for the component proportions; 2) builds a simplex-lattice mixture design within those CI bounds; 3) fits a quadratic response surface to the log-likelihood; and 4) optimizes this surface under sum-to-one constraints. We validate the method in 27 Monte Carlo scenarios (n = 100, 500, 1000; low/medium/high differentiation; and three weight settings). Under medium and high separation, it attains likelihoods comparable to EM while achieving more favorable BIC in multiple scenarios and indistinguishable AIC in many, whereas EM is preferable under low separation. Two real data sets, Old Faithful (Waiting variable) and Photovoltaic Energy (Production variable), further confirm applicability, with lower AIC/BIC in Old Faithful and lower BIC in PV; clustering agreement is high (κ ≈ 0.99 to 1.00). Overall, DOE-RSM offers a simple, interpretable, and often more parsimonious non-iterative alternative for mixture-weight estimation.
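The four pipeline steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the CI formula, the number of design points, the exact surface parameterization, or the optimizer, so the sketch assumes a Wald interval for the proportion, seven evenly spaced design points, a Scheffé-type quadratic mixture model, and a dense grid search inside the CI bounds. The function names (`normal_pdf`, `k_medians_1d`, `doe_rsm_weight`) are hypothetical, the small k-medians helper is only a stand-in for the preliminary clustering step, and the final Maximum Likelihood re-estimation of means and variances is left out.

```python
import numpy as np


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z * z) / (sigma * np.sqrt(2.0 * np.pi))


def k_medians_1d(x, n_iter=20):
    """Tiny 1-D two-cluster k-medians (stand-in); returns a mask for cluster 1."""
    c = np.percentile(x, [25, 75])  # assumed starting centres
    for _ in range(n_iter):
        in_first = np.abs(x - c[0]) <= np.abs(x - c[1])
        c = np.array([np.median(x[in_first]), np.median(x[~in_first])])
    return in_first


def doe_rsm_weight(x, n_design=7, z=2.576):
    """Estimate w1, the weight of the lower component, from a fitted surface."""
    in_first = k_medians_1d(x)
    n = x.size

    # Step 1: preliminary parameters and a 99% CI for the proportion.
    mu = np.array([x[in_first].mean(), x[~in_first].mean()])
    sd = np.array([x[in_first].std(ddof=1), x[~in_first].std(ddof=1)])
    p = in_first.mean()
    half = z * np.sqrt(p * (1.0 - p) / n)  # Wald interval: an assumption
    lo, hi = max(p - half, 1e-3), min(p + half, 1.0 - 1e-3)

    # Step 2: design points for w1 inside the CI bounds, with w2 = 1 - w1.
    w1 = np.linspace(lo, hi, n_design)
    f1 = normal_pdf(x, mu[0], sd[0])
    f2 = normal_pdf(x, mu[1], sd[1])
    loglik = np.array([np.log(w * f1 + (1.0 - w) * f2).sum() for w in w1])

    # Step 3: quadratic mixture model y = b1*w1 + b2*w2 + b12*w1*w2.
    X = np.column_stack([w1, 1.0 - w1, w1 * (1.0 - w1)])
    beta, *_ = np.linalg.lstsq(X, loglik, rcond=None)

    # Step 4: maximise the fitted surface on [lo, hi]; sum-to-one is built in.
    grid = np.linspace(lo, hi, 2001)
    surface = (beta[0] * grid + beta[1] * (1.0 - grid)
               + beta[2] * grid * (1.0 - grid))
    return grid[np.argmax(surface)], (lo, hi)


# Usage on synthetic data with a true weight of 0.3 for the lower component.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(6.0, 1.0, 700)])
w1_hat, (lo, hi) = doe_rsm_weight(x)
print(f"w1 estimate: {w1_hat:.3f}, search interval: [{lo:.3f}, {hi:.3f}]")
```

With only two components the weight simplex is one-dimensional, so the constrained optimization reduces to a search over a single interval; the cost is governed by the number of design points rather than by a convergence criterion, which is the sense in which the abstract describes the runtime as predictable.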

Date of publication: 2025
Journal title: IEEE ACCESS
DOI: https://dx.doi.org/10.1109/ACCESS.2025.3614023
Reference SSD (valid from 09/05/2024): Sector IINF-03/A - Telecommunications
Scopus identifier: 2-s2.0-105017389344
WOS identifier: WOS:001587270100048
All authors: Leal, G. S.; Bessegato, L. F.; Xavier, Y. S. M.; Melgani, F.; Balestrassi, P. P.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11572/470966

Warning: the data displayed have not been validated by the university.
