This desire/need to access and analyze data is different from traditional Computer Science. The focus is not on devising new generic models, algorithms, or data structures. Instead, non-computer scientists need to apply, integrate, and customize already-developed methods within their own domains. Nor is it a sub-discipline of any one of the existing non-CS disciplines, because the challenges and approaches in using computing and data are common across these domains.
So “this” (whatever it is!) seems to me a discipline by itself with its own BS, MS, and Ph.D. degrees. Here’s what’s led me to this conclusion.
At USC, we started by offering the obvious choice, the MS degree, in areas such as CyberSecurity and Data-Informatics (see Informatics Program). It was clear that what MS Data-Informatics students needed wouldn’t be provided by the current CS courses. For example, our CS database course is designed to teach the underlying concepts of designing a data management system. Data-Informatics students, however, must be trained in just one course in multiple data-management methods (e.g., relational databases, key-value stores, map-reduce, cloud) and learn how to apply them in support of real-world, data-intensive applications.
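To give a flavor of what "multiple data-management methods in one course" might look like, here is a minimal sketch (not taken from the actual USC curriculum) that answers the same question, how many students are enrolled in each course, in three styles: relational, key-value, and map-reduce. The toy data and all function names are hypothetical, and the key-value store and map-reduce pipeline are simulated in memory rather than run on a real system.

```python
import sqlite3
from collections import defaultdict

# Hypothetical toy data: (student, course) enrollment pairs.
ENROLLMENTS = [
    ("ana", "databases"),
    ("ben", "databases"),
    ("ana", "security"),
]


def count_relational(rows):
    """Relational style: load rows into a table and aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE enrollment (student TEXT, course TEXT)")
    conn.executemany("INSERT INTO enrollment VALUES (?, ?)", rows)
    cursor = conn.execute(
        "SELECT course, COUNT(*) FROM enrollment GROUP BY course"
    )
    result = dict(cursor.fetchall())
    conn.close()
    return result


def count_key_value(rows):
    """Key-value style: one counter per key, updated with get/put.

    A plain dict stands in for a real key-value store here.
    """
    store = {}
    for _student, course in rows:
        store[course] = store.get(course, 0) + 1
    return store


def count_map_reduce(rows):
    """Map-reduce style: map to (key, 1), group by key, reduce by summing."""
    mapped = [(course, 1) for _student, course in rows]  # map phase
    shuffled = defaultdict(list)                         # shuffle phase
    for key, value in mapped:
        shuffled[key].append(value)
    return {key: sum(values) for key, values in shuffled.items()}  # reduce
```

All three functions return the same per-course counts for `ENROLLMENTS`. The difference lies in how the work is expressed, and knowing which form suits a given application is the kind of skill such a course would aim to teach.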
We are now expanding the MS program by developing joint Master’s degrees across schools. One example is the Master in Communication-Informatics, offered in collaboration with the Annenberg School for Communication and Journalism. In addition, we quickly learned that a similar curriculum could be put together as a BA, which would be a second degree alongside an undergraduate degree in a science (e.g., biology, physics).
My last thought is this: I am sure that, as CS faculty members, we have all sat on a CS Ph.D. defense (or perhaps an Electrical or even Industrial Engineering defense) where the candidate did not develop a new CS concept, but instead applied, customized, and integrated current concepts for a pretty sophisticated application. We passed the candidate knowing the effort was worthy of a Ph.D. But maybe we also puzzled over what the CS contribution was. I suspect our non-CS colleagues have had similar experiences when their Ph.D. students did the same, instead of advancing their own field of science.
Now, if “this” is indeed a new discipline, what should it be called?
The first time I heard a version of this argument was from a very dear mentor, Dr. Jim Gray, who called it eScience: basically, understanding scientific phenomena through data, as opposed to observation, models, or simulations. However, that term was never picked up by academia.
Meanwhile, some universities in the US started the field of Informatics to capture the need of science disciplines for computation. This term did get picked up, especially as a suffix for science fields, such as bioinformatics. However, it may not encompass the new data needs of businesses, arts, etc. Also, in Europe, Informatics has been used to refer to the whole field of CS. Moreover, in the US, as observed by another dear mentor, Prof. Ramesh Jain, CS colleagues looked at Informatics as a lesser field, much as mathematicians looked at CS as a lesser field in the 1960s.
In recent years, the idea of a new discipline has taken wing again with new supporters, due to the BigData revolution (see for example Prof. Mike Franklin’s talk at the IMSC retreat 2015). The industry name for it is Data Science, and this time it has had a better reception from academia and CS colleagues. However, I think the emphasis on “data” doesn’t capture the broader focus of this discipline; for example, it doesn’t capture cybersecurity, which is a challenge across many disciplines that use computing.
Well, I don't know which term will prevail. Or perhaps a new branding will emerge. What I do know is that the new discipline is coming and I am excited to be contributing to its formation.
"by Cyrus Shahabi, March 2015"