Persistence probability is a statistical index for detecting one or more communities embedded in a network. However, despite its straightforward definition (the probability that a random walk remains in a group of nodes after a time period), it is seldom used, possibly because of the difficulty of developing an efficient algorithm to calculate it. Proposed here is a new mathematical programming model to find the community with the largest persistence probability. The model is formulated as integer fractional programming and then is reduced to mixed-integer linear programming with appropriate variable substitutions. Nevertheless, the exact problem is solved in a reasonable time only for small networks, so heuristic procedures are developed to approximate the optimal solution for large networks. First, a randomized greedy-ascent method is elaborated, taking advantage of a particular data structure to generate feasible solutions quickly. After analyzing the greedy output and determining where the optimal solution is eventually located, improvement procedures are implemented based on a local exchange but applying different long-term diversification principles, and based on tree neighborhood search and random restart. When applied to simulated graphs to determine the reliability and effectiveness of the proposed method, it accurately reproduces the clustering characteristics found in real networks. Finally, it is applied to two real networks, and comparing the results to those from two well-known community-detection procedures confirms that this index complements the other methods well.

On finding the community with maximum persistence probability / Avellone, Alessandro; Benati, Stefano; Grassi, Rosanna; Rizzini, Giorgio. - In: 4OR. - ISSN 1619-4500. - 2024, 22:(2024), pp. 435-463. [10.1007/s10288-023-00559-z]

On finding the community with maximum persistence probability

Benati, Stefano
Secondo
;
2024-01-01

Abstract

Persistence probability is a statistical index for detecting one or more communities embedded in a network. However, despite its straightforward definition (the probability that a random walk remains in a group of nodes after a time period), it is seldom used, possibly because of the difficulty of developing an efficient algorithm to calculate it. Proposed here is a new mathematical programming model to find the community with the largest persistence probability. The model is formulated as integer fractional programming and then is reduced to mixed-integer linear programming with appropriate variable substitutions. Nevertheless, the exact problem is solved in a reasonable time only for small networks, so heuristic procedures are developed to approximate the optimal solution for large networks. First, a randomized greedy-ascent method is elaborated, taking advantage of a particular data structure to generate feasible solutions quickly. After analyzing the greedy output and determining where the optimal solution is eventually located, improvement procedures are implemented based on a local exchange but applying different long-term diversification principles, and based on tree neighborhood search and random restart. When applied to simulated graphs to determine the reliability and effectiveness of the proposed method, it accurately reproduces the clustering characteristics found in real networks. Finally, it is applied to two real networks, and comparing the results to those from two well-known community-detection procedures confirms that this index complements the other methods well.
2024
4OR
Avellone, Alessandro; Benati, Stefano; Grassi, Rosanna; Rizzini, Giorgio
On finding the community with maximum persistence probability / Avellone, Alessandro; Benati, Stefano; Grassi, Rosanna; Rizzini, Giorgio. - In: 4OR. - ISSN 1619-4500. - 2024, 22:(2024), pp. 435-463. [10.1007/s10288-023-00559-z]
File in questo prodotto:
File Dimensione Formato  
Manuscript_with_authors_details.pdf

accesso aperto

Tipologia: Pre-print non referato (Non-refereed preprint)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 1.12 MB
Formato Adobe PDF
1.12 MB Adobe PDF Visualizza/Apri
s10288-023-00559-z.pdf

Solo gestori archivio

Tipologia: Versione editoriale (Publisher’s layout)
Licenza: Tutti i diritti riservati (All rights reserved)
Dimensione 955.04 kB
Formato Adobe PDF
955.04 kB Adobe PDF   Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11572/434917
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
  • OpenAlex ND
social impact