The New Industrial Revolution: Computing in Pole Position
We are living in a world that is witnessing an incredible pace of innovation, where computing research is playing the central role. Ambitions are not just high from a technical perspective, but also from a business impact perspective. This talk will outline the excitement in the industry that surrounds these eventful times, and the exhilarating journey that researchers are pursuing at Microsoft.
"Big Data" means different things to different people. To me, it means one of four totally different problems:
Big volumes of data, but "small" analytics. The traditional data warehouse vendors support SQL analytics on very large volumes of data. In this talk, I will make a few comments on where I see this market going.
Big analytics on big volumes of data. By big analytics, I mean data clustering, regressions, machine learning, and other much more complex analytics on very large amounts of data. I will explain the various approaches to integrating complex analytics into DBMSs, and discuss which ones seem more promising. In addition, I will explore why Hadoop, in its current form, will not be a player in this market.
Big velocity. By this I mean being able to absorb and process a firehose of incoming data for applications like electronic trading. In this market, the traditional SQL vendors are non-starters. I will discuss alternatives including complex event processing (CEP), NoSQL, and NewSQL systems. I will also make a few comments about the "internet of things".
Big diversity. Many enterprises are faced with integrating a larger and larger number of data sources with diverse data (spreadsheets, web sources, XML, traditional DBMSs). The traditional ETL products do not appear to be up to the challenges of this new world, and I will talk about an alternative way to go.
A Mathematical View of Computer Systems
Mathematics provides what I believe to be the simplest and most powerful way to describe computer systems.
The Pit and the Pendulum
Since the elegant foundations of transaction processing were established in the mid-1970s with the notion of serializability and the codification of the ACID (Atomicity, Consistency, Isolation, Durability) paradigm, performance has not been considered one of ACID's strong suits, especially for distributed data stores. Indeed, the NoSQL/BASE movement of the last decade was born out of frustration with the limited scalability of traditional ACID solutions, only to become itself a source of frustration once the challenges of programming applications in this new paradigm began to sink in. But how fundamental is the dichotomy between performance and ease of programming?
In this talk, I will share the insights I have gained in trying to unlock the performance potential of the ACID transactional paradigm without sacrificing the generality and ease of programming that define it.
Learning from Rational (*) Behavior
Machine learning is increasingly becoming a technology that directly interacts with human users. In fact, much of the Big Data we collect today consists of the decisions that people make when they use the systems we build. This is already evident in search engines, recommender systems, and electronic commerce, while other applications are likely to follow in the near future (e.g., autonomous robots, smart homes, games). In this talk, I argue that learning from human interactions requires learning algorithms that explicitly account for human behavior and motivation. Towards this goal, the talk explores how integrating microeconomic models of human behavior into the learning process leads to new learning models that no longer misrepresent the user as a "labeling subroutine". This motivates an interesting area for theoretical, algorithmic, and applied machine learning research with connections to rational choice theory, econometrics, and behavioral economics.
(*) Restrictions apply. Some modeling required.
The Computer Simulation of People
The computer simulation of people is a grand challenge problem with potentially profound impact across many disciplines. In the field of computer graphics, it is especially relevant to the animation of realistic human characters for a variety of applications in the interactive computer game and motion picture industries. I will give an overview of our progress on realistic human modeling and simulation, whose scope spans the biomechanical, behavioral, and social levels. In particular, I will review the state of the art in musculoskeletal physics-based simulation and neuromuscular control of the human body, as well as the artificial life approach to multi-human simulation, which yields 3D virtual worlds populated by lifelike autonomous pedestrians with some proper social etiquette. Finally, I will discuss the profound scientific and computational challenges that remain in comprehensively simulating humans as individuals and in collectives.
Age of AI: Artificial Intelligence, Agglomerative Intelligence, Adaptive Intelligence, and Ambient Intelligence
Building a computer that can intelligently carry out tasks that require "human-level intelligence" has been a goal for computer scientists since the 1950s. Since that time, tasks that were considered to exhibit human-level intelligence, such as understanding natural language, carrying out conversational speech, playing master-level chess, and interpreting images, have all seen a great deal of improvement, and many have seen real-world applications. In this talk, I will provide an overview of the progress of artificial intelligence, highlight some recent milestone results in image recognition and natural language understanding, and predict the progress of AI in research and application over the next 5-10 years.