A new whitepaper that Microsoft researchers are set to present at a conference next month sheds more light on Microsoft's back-end cloud infrastructure.

The paper,
entitled “SCOPE: Easy and Efficient Parallel Processing of Massive Data Sets,” details a new declarative scripting language that is optimized for storing and analyzing massive data sets (like search logs and click streams) that are key to cloud-scale service architectures. SCOPE, or Structured Computations Optimized for Parallel Execution,
is the name of the language.

According to the paper, which Microsoft is set to present at the VLDB 2008 conference in late August, SCOPE doesn't require explicit parallelism, but it is “amenable to efficient parallel execution” across large clusters. SCOPE is like SQL, but with C# extensions, the paper says.

I found the new whitepaper via a blog link from Greg Linden, an employee of Microsoft's Live Labs. Linden blogged:

“Scope is similar to Yahoo's Pig,
which is a higher level language on top of Hadoop, or Google's Sawzall, which is a higher level language on top of MapReduce. But, where Pig focuses on and advocates a more imperative programming style, Scope looks much more like SQL.”

Reading through the paper, I noticed an explanation of how SCOPE fits in with Cosmos, Microsoft's back-end storage layer that currently powers Live Search and other Microsoft services. The SCOPE whitepaper sheds more light on what Cosmos is and how it works. From the paper:

“Microsoft has developed a distributed computing platform, called Cosmos, for storing and analyzing massive data sets. Cosmos is designed to run on large clusters consisting of thousands of commodity servers. Disk storage is distributed with each server having one or more direct-attached disks.”

(A loosely-coupled aside: I wonder if Pat Helland's decision to move to the SQL team at Microsoft has any connection to all of this. Helland's expertise is in big-picture strategy around transactional and parallel processing, as well as service-oriented architectures.)

Increasingly,
all of Microsoft's future strategies and products finally seem to be converging. More teams are thinking about parallel/distributed/multicore computing, with the experimental Windows successor code-named Midori being just the most recent of many examples. More Microsoft products are seemingly being designed with modeling in mind from the get-go.

Maybe Chief Software Architect Ray Ozzie's campaign to “drive alignment” across the various Microsoft product groups is finally taking root…. Or maybe it's simply that cloud computing, to be truly scalable,
must be built to work across increasingly large networks of distributed systems. Or maybe it's a little of both….