The overall research objective is to combine big data analytics with econometric techniques and social psychology user studies to enhance our fundamental understanding of how early-stage high-tech startups operate, and how we might model (and optimize) their behavior and decision processes. Of particular interest is the use of social media to influence the likelihood that a startup achieves the factors mentioned above: for example, how social media engagement might positively (or negatively) affect a startup’s ability to raise capital and sell products to early adopters. While other researchers have begun to analyze this phenomenon using small datasets or controlled user studies, our work will leverage big data technology to carry out data-driven studies at a massive scale, employing modern machine learning algorithms to derive new insights.

The diagram above shows our data collection and analytics architecture. A number of high-performance parallel crawlers gather social media inputs from Facebook, Twitter, CrunchBase, and AngelList. The crawled data is stored in HDFS. We formulate different social, behavioral, and economic theories. Parallel statistical and machine learning queries can also be programmed directly in Spark to analyze the crawled data. Our platform also allows for external plug-ins, such as external community detection libraries. In the future, we plan to provide familiar interfaces to social scientists, so that they can directly validate theories using computational platforms such as R, Matlab, and SPSS. A translation layer will map the theories to Spark queries for execution.
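The crawl, store, and query flow described above can be illustrated with a small standard-library sketch. It is not the project's implementation: the `crawl` stub, the record fields (`startup`, `engagement`), and the placeholder values are hypothetical, a local temporary directory stands in for HDFS, and a plain Python group-by stands in for a Spark query.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from statistics import mean

# Sources named in the text; everything else below is an illustrative assumption.
SOURCES = ["Facebook", "Twitter", "CrunchBase", "AngelList"]


def crawl(source: str) -> list[dict]:
    """Hypothetical crawler stub: returns placeholder records, no network I/O."""
    return [
        {"source": source, "startup": f"startup_{i}", "engagement": i}
        for i in range(3)
    ]


def store(records: list[dict], out_dir: Path) -> Path:
    """Write one JSON-lines file per source (a local directory stands in for HDFS)."""
    path = out_dir / f"{records[0]['source']}.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return path


def mean_engagement_by_startup(out_dir: Path) -> dict[str, float]:
    """Group-by and average over stored records (stand-in for a Spark query)."""
    groups: dict[str, list[int]] = {}
    for path in sorted(out_dir.glob("*.jsonl")):
        with path.open(encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                groups.setdefault(record["startup"], []).append(record["engagement"])
    return {startup: mean(values) for startup, values in groups.items()}


def run_pipeline() -> dict[str, float]:
    """Crawl all sources in parallel, store the results, then run the query."""
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
            for records in pool.map(crawl, SOURCES):
                store(records, out_dir)
        return mean_engagement_by_startup(out_dir)


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping storage between the crawl and query stages means the analysis step only depends on the stored record layout, so a different query engine or an external plug-in could read the same files without changes to the crawlers.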


  • Predicting Startup Crowdfunding Success through Longitudinal Social Engagement Analysis
    Qizhen Zhang, Tengyuan Ye, Meryem Essaidi, Shivani Agarwal, Vincent Liu and Boon Thau Loo
    International Conference on Information and Knowledge Management (CIKM), Nov 2017.
  • Collection, Exploration and Analysis of Crowdfunding Social Networks
    Miao Cheng, Anand Sriramulu, Sudarshan Muralidhar, Boon Thau Loo, Laura Huang and Po-Ling Loh.
    3rd International Workshop on Exploratory Search in Databases and the Web (ExploreDB), co-located with SIGMOD, June 2016.


Shivani Agarwal
Laura Huang
Vincent Liu
Po-Ling Loh (UW-Madison)
Boon Thau Loo


Miao Cheng
Meryem Essaidi
Sudarshan Muralidhar
Anand Sriramulu
Tengyuan Ye
Qizhen Zhang