Columbia University Joint CS/EE Networking Seminar Series


Modeling and Data Mining of Academic Community and Research Trends

Dr. Dah Ming Chiu

Department of Information Engineering at the Chinese University of Hong Kong

Monday, September 28th, 11 AM

Computer Science Building Conference Room



Abstract: Academic publication data is routinely used in academic assessment and university rankings. By digging deeper into the available data, in particular by considering the social aspect of research collaboration, we can learn more about important trends in academic research. For example, there has been considerable paper and citation inflation in recent years; another significant trend is that research is increasingly interdisciplinary. We can explain these trends from data, rather than only qualitatively. We do standard data mining, but also try to develop models to explain what is going on. In particular, we have developed a population model of the academic community (based on a branching process) that helps us understand the structure and population size changes in a community, as well as its activity and productivity growth trends. Most of our study is focused on the last 50 years of the computer science community, since we are more familiar with it. We also tested our approach on the physics community using a standard (APS) dataset.

Bio: Dr. Dah Ming Chiu received his first degree from Imperial College London and his Ph.D. degree from Harvard University. He worked in industry for several high-tech companies of his time: Bell Labs, DEC and Sun Microsystems Labs. He returned to academia in 2002 to become a professor in the Department of Information Engineering at the Chinese University of Hong Kong. He served as department chairman from 2009 to 2015. He served as an associate editor for IEEE/ACM Transactions on Networking from 2006 to 2011, and served on the TPC for many computer networking conferences. He was the general co-chair of ACM SIGCOMM 2013, held in Hong Kong with record attendance. His main research areas include computer networking, especially network resource allocation, content distribution and network economics. His recent research interest is shifting towards data-driven analysis applied to a broader set of issues, of which the subject of this talk is an example.