HARSHVARDHAN SINGH

Data Scientist | The University of Arizona

Welcome!

I’m Harsh, a Data Scientist with experience in transformer-based models (e.g., BERTopic), topic modeling techniques (e.g., Structural Topic Models, Dynamic Topic Models, Labeled LDA), natural language processing tasks (e.g., key phrase extraction using spaCy and YAKE), and machine learning model training using methods such as weak supervision. I have a strong foundation in statistical modeling, graph networks, data preprocessing, and implementing ETL processes. I also have experience interpreting regression models using key statistics such as beta coefficients, standard errors, t-values, and p-values.

I’m a staff member at the University of Arizona, where I’m part of a US Army-funded research team that applies data science methods to study how nations leverage scientific enterprises to project influence on a global scale. This involves analyzing patterns of research agenda mimicry, tracking the international mobility of scientists into positions of influence, and employing agent-based models to simulate policy diffusion mechanisms.

My focus includes mapping the research landscape in fields such as Astronomy and Astrophysics by analyzing the research output of countries and their influence on topics generated using BERTopic and Structural Topic Models (STMs). I investigate the key topics in the field, how prevalent they are, and which countries have historically influenced the research output of other countries. I use Shannon entropy to measure diversity in research focus (i.e., how specialized or broad a country’s research is), which helps distinguish generalist countries that cover a wide range of topics from specialist countries that focus on fewer areas. I also rely on statistical measures such as the Herfindahl-Hirschman Index (HHI) to quantify topic concentration and its weighted impact, and I use cosine similarity to assess the relationship between research in current and preceding time periods. By deriving influence ratios from term-level similarities, I provide insights into how prior research shapes current topic prevalence and trends.
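As a rough illustration of the measures mentioned above, here is a minimal Python sketch. It is not the project's code. The function names and the toy topic counts are made up for this example, and it assumes each country is represented by a vector of per-topic publication counts.

```python
import math


def normalize(counts):
    """Turn raw per-topic counts into shares that sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return [c / total for c in counts]


def shannon_entropy(shares):
    """Shannon entropy in bits; higher means a broader (generalist) focus."""
    return -sum(p * math.log2(p) for p in shares if p > 0)


def hhi(shares):
    """Herfindahl-Hirschman Index; closer to 1 means more concentrated."""
    return sum(p * p for p in shares)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


# Hypothetical topic counts for two countries over four topics.
generalist = normalize([25, 25, 25, 25])
specialist = normalize([97, 1, 1, 1])

print("generalist entropy:", shannon_entropy(generalist))
print("specialist entropy:", shannon_entropy(specialist))
print("generalist HHI:", hhi(generalist))
print("specialist HHI:", hhi(specialist))

# Hypothetical topic shares for one country in two consecutive periods.
previous_period = [0.5, 0.3, 0.2, 0.0]
current_period = [0.4, 0.4, 0.1, 0.1]
print("period similarity:", cosine_similarity(previous_period, current_period))
```

In this toy setup the generalist country has the higher entropy and the lower HHI, which is the pattern the two measures are meant to separate.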

Prior to my current position, I received my Master’s degree (M.S.) in Data Science from The University of Arizona, where I was a Graduate Research Assistant at The Global Knowledge Lab and Observatory, and my Bachelor’s degree (B.Tech.) in Computer Science and Engineering from SRM University, India.

Please scroll down to view my projects and learn more about my work!

Let’s Connect!

I’m always eager to collaborate, learn, and take on new challenges. If you’re looking for a detail-oriented professional with a passion for data science, let’s connect! I’d love to explore how my skills and experience can align with your organization’s goals.