Ongoing Math Research Program

Title: The Genealogies in Expanding Population

Advisor: Louis Fan

Introduction: Stochastic analysis and computer simulations can be used to understand various aspects of randomly growing clusters in an expanding population. One of the main applications is in genealogies, where they help to study the asymptotic growth shape and the heterogeneity of cancer tumors. We will apply Markov models, including branching processes and stepping stone models, to explain the expansion and mutation of tumor cells in a simulated 2D setting. With a statistic called the directionality index, we should be able to determine the direction of a tumor's growth and also locate the origin of the tumor. For more details on this ongoing work, please contact me.
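As a rough illustration of the branching-process idea mentioned above, here is a minimal Python sketch of a Galton-Watson process in which each daughter cell may acquire a new mutation. It is a generic textbook-style model, not the model used in this project: the function name `simulate_branching`, the offspring distribution, and the mutation rule are all assumptions made for the example, and it has no spatial (2D) structure.

```python
import random


def simulate_branching(generations, offspring_probs, mutation_rate, seed=0):
    """Simulate a Galton-Watson branching process with mutations.

    Hypothetical illustration only (not the project's model).
    offspring_probs[i] is the probability that a cell leaves i daughters.
    Each daughter inherits its parent's lineage label, or, with probability
    mutation_rate, receives a brand-new label.
    Returns a list with one list of lineage labels per generation.
    """
    rng = random.Random(seed)
    child_counts = list(range(len(offspring_probs)))
    next_label = 1
    population = [0]  # a single founder cell with label 0
    history = [population]
    for _gen in range(generations):
        new_population = []
        for label in population:
            n_children = rng.choices(child_counts, weights=offspring_probs)[0]
            for _child in range(n_children):
                if rng.random() < mutation_rate:
                    new_population.append(next_label)  # new mutant lineage
                    next_label += 1
                else:
                    new_population.append(label)  # inherits parent's lineage
        population = new_population
        history.append(population)
        if not population:  # the process went extinct
            break
    return history


if __name__ == "__main__":
    history = simulate_branching(6, [0.2, 0.3, 0.5], mutation_rate=0.1, seed=1)
    for gen, cells in enumerate(history):
        print(gen, len(cells), len(set(cells)))
```

Counting the distinct labels in each generation gives a crude picture of heterogeneity in the simulated population.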

Key Words: Genealogies, Directionality Index, Origin, Stochastic Differential Equation, Simulation


Xiaoyi Yang and her thesis advisor Professor Karl Rohe

My thesis advisor, Professor Karl Rohe (on the right), and I (on the left) at the poster session for the honors thesis

Statistics Honor Thesis

(In preparation for publication)
The effect of different initializations on network clustering performance using balanced label propagation

Authors: Xiaoyi Yang, Karl Rohe, and Norbert Binkiewicz

Abstract: Social networks are among the largest data sets of modern times. Due to the constraints of hardware and the demand for fast data access, it is beneficial to store the network information in several smaller, densely connected groups. Moreover, in order to make full use of computer resources, the sizes of the clusters should be similar. One algorithm that may be able to achieve both goals is balanced label propagation. However, this algorithm must be initialized with balanced clusters, so the quality of the initialization largely affects the quality of the final clustering. We are interested in the effect of different initializations on different social networks with varying cluster sizes.

In this study, we test three initializations: (1) random, (2) spectral clustering, and (3) an ego-network approach. In experiments performed on three different social networks, when the balance constraint is ignored, spectral clustering always outperforms the ego-network and random initializations. However, as we increase the focus on balance, the performance of spectral clustering becomes highly unstable, largely because it sometimes produces highly unbalanced clusterings. The ego-network approach, on the other hand, performs better than random initialization in every situation we tested.
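To make the trade-off between clustering quality and balance concrete, the following Python sketch builds a random balanced initialization and scores any labeling with two simple measures. The function names and both measures (the fraction of edges kept within a cluster, and the largest cluster size relative to the ideal size) are assumptions chosen for illustration; they are not necessarily the criteria used in the thesis.

```python
import random
from collections import Counter


def random_balanced_init(nodes, k, seed=0):
    """Assign nodes to k clusters at random with sizes differing by at most 1."""
    rng = random.Random(seed)
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    # Dealing shuffled nodes round-robin keeps the cluster sizes balanced.
    return {node: i % k for i, node in enumerate(shuffled)}


def edge_locality(edges, labels):
    """Fraction of edges whose two endpoints share a cluster (illustrative)."""
    if not edges:
        return 0.0
    within = sum(1 for u, v in edges if labels[u] == labels[v])
    return within / len(edges)


def balance_ratio(labels, k):
    """Largest cluster size divided by the ideal size n / k (1.0 = balanced)."""
    sizes = Counter(labels.values())
    largest = max(sizes.get(c, 0) for c in range(k))
    return largest * k / len(labels)


if __name__ == "__main__":
    # Toy graph: two triangles joined by a single edge.
    edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
    labels = random_balanced_init(range(6), k=2, seed=42)
    print(edge_locality(edges, labels), balance_ratio(labels, 2))
```

The same two functions can score the output of any other initialization, which allows a side-by-side comparison on one graph.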

Key Words: Social networks, Spectral clustering, Ego-network, Balanced label propagation

Note: The full content of the thesis and the poster are available on request.