Channel: Data Platform

Announcing the General Availability of Azure Machine Learning


This blog post is authored by Joseph Sirosh, Corporate Vice President of Information Management & Machine Learning at Microsoft.

We built Azure Machine Learning to democratize machine learning. We wanted to eliminate the heavy lifting involved in building and deploying machine learning technology and make it accessible to everybody. Supporting open source innovation and enabling breakthrough learning capabilities with big data were important. So were supporting community-driven development and the ability for developers to easily create and monetize cloud-hosted APIs and applications. Most importantly, we wanted our customers to easily leverage future advancements in data science. 

And now that future is taking shape. Today, at Strata + Hadoop World, we are announcing the general availability release of Azure Machine Learning, a fully-managed, fully-supported service in the cloud. No software to download, no servers to manage – all you need to start doing data science is a browser and internet connectivity. This release is packed with game-changing innovations – here are a few highlights:

  • Creating web services is now a lot easier. We completely revamped the process for creating web services. It is now far more intuitive to take a data science workflow and create an analytics web service from it – it takes only minutes. We even provide a ready-to-use Excel client into which you can plug in your own data to easily test your web service.

  • You can train/retrain through APIs. Also new in this release is programmatic access to refresh Azure Machine Learning models with new data. This capability lets you retrain a model periodically, for instance when new data becomes available. It also allows consumers of a model that you created to retrain the model with their own data. For example, now you can create an API in the marketplace for your customer, and provide support for the customer to update the model with fresh data.

  • Python is now supported, and so is R. Now you can use the Anaconda distribution of Python along with its rich ecosystem of libraries such as numpy, scipy, pandas, scikit-learn, etc. directly in Azure Machine Learning Studio. Python developers can easily build sophisticated analytics experiments and create web services in the cloud with a few clicks. You can do the same with R, and even build experiments/web services that compose Python, R and Microsoft’s machine learning algorithms in a single workflow. This is a boon for innovators who seek to leverage the rich open source libraries of these two ecosystems in application building.
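For instance, a script dropped into the Studio's Execute Python Script module follows a simple convention: Studio passes in up to two pandas DataFrames and expects a tuple containing a DataFrame back. A minimal sketch (the derived column here is purely illustrative):

```python
import pandas as pd

# Sketch of a script for the "Execute Python Script" module in Azure ML
# Studio: Studio hands the module up to two pandas DataFrames and expects
# a tuple containing a DataFrame back.
def azureml_main(dataframe1=None, dataframe2=None):
    df = dataframe1.copy()
    # Example transformation: add a derived feature column summing
    # the numeric columns of each row.
    df["total"] = df.select_dtypes(include=["number"]).sum(axis=1)
    return df,
```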

  • “Big Learning” is now possible. Azure Machine Learning now supports Learning with Counts, a revolutionary feature transformation capability that allows efficient classification and regression with terabyte-sized data sets. This new capability uses parallel MapReduce in Azure HDInsight to efficiently create reduced feature representations from big data. Using the transformed features and appropriate sampling, one can learn highly accurate predictive models with state-of-the-art algorithms such as neural networks and boosted decision trees.
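The idea behind count-based featurization can be sketched in a few lines of Python. This is an illustrative toy, not Azure ML's implementation: each high-cardinality categorical value is replaced by its label-conditional counts plus a smoothed log-odds, compact numeric features that a learner such as boosted decision trees can then consume.

```python
import math
from collections import defaultdict

# Toy sketch of count-based featurization (the idea behind Learning
# with Counts). Counts are built in one pass over the data; at transform
# time a categorical value maps to [negative count, positive count,
# smoothed log-odds].
def fit_counts(values, labels):
    counts = defaultdict(lambda: [0, 0])  # value -> [negatives, positives]
    for v, y in zip(values, labels):      # y is a 0/1 label
        counts[v][y] += 1
    return counts

def count_features(value, counts, prior=1.0):
    neg, pos = counts.get(value, [0, 0])
    log_odds = math.log((pos + prior) / (neg + prior))
    return [neg, pos, log_odds]
```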

  • You can use finished web services on the Azure Store. We now have a set of web service applications available on the Azure Store for common machine learning applications. These include Recommendations, Anomaly Detection and Text Analytics. Any web site, phone app, or SaaS application can integrate these capabilities with a few lines of code. These are examples of powerful applications that data scientists can now create and publish to the Azure Machine Learning marketplace, and thereby participate in the emerging data science economy.
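Consuming such a web service amounts to a single authenticated HTTP POST. The sketch below assembles a request-response payload in Python; the endpoint URL, API key and column names are placeholders for values your own service defines.

```python
import json

# Hypothetical placeholders -- the service issues the real endpoint URL
# and API key per web service.
SCORING_URL = ("https://example.azureml.net/workspaces/<ws>"
               "/services/<id>/execute?api-version=2.0")
API_KEY = "<your-api-key>"

def build_scoring_request(column_names, rows):
    """Assemble the JSON body and headers for one scoring call."""
    payload = {
        "Inputs": {
            "input1": {"ColumnNames": column_names, "Values": rows}
        },
        "GlobalParameters": {},
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + API_KEY,
    }
    return json.dumps(payload), headers
```

POST the body to the scoring URL with any HTTP client; the response carries the predicted values.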

  • We added a new community gallery. This release includes a community-driven gallery that lets you discover and use interesting experiments authored by others. You can ask questions or post comments about experiments in the gallery or publish your own. You can share links to interesting experiments via social channels such as LinkedIn and Twitter. The gallery is a great way for users to get started with Azure Machine Learning and learn from others in the community.

But that’s not all. To ease the path for cloud-based data science, we have created a step-by-step guide for the Data Science journey from raw data to a consumable web service. We also added the ability to use great tools such as IPython Notebook and Python Tools for Visual Studio along with Azure Machine Learning. And there are new capabilities for data reading and transformation, a module for SQLite support, and new learning algorithms such as Quantile Regression. With the integration of these diverse capabilities, Azure Machine Learning is now the most comprehensive data science and machine learning service available.

Our customers continue to apply Azure Machine Learning in interesting business scenarios. For example, eSmart Systems of Norway is pioneering smart grid management using our tools. A traditional smart grid includes multiple data silos, including SCADA networks, building automation systems and substation meters. In this environment, it can be difficult to forecast consumption and prevent bottlenecks or outages. For a utility company, upgrading its entire infrastructure would be costly. Even when upgrades are made, e.g. new smart sensors or meters, data gets collected but is not readily accessible. eSmart Systems uses the Azure cloud platform to integrate and analyze usage data and create forecasts. Azure Machine Learning is the "brains" of the solution, running the data models for predictive analytics. The analytics are used to predict capacity problems and automatically control load in individual buildings. 

Sigurd Setelev, Chief Strategy Officer of eSmart Systems, says:

“For what we’re doing at eSmart, we needed a cloud solution because of the sheer volume of data being collected; if we were to do it on premise we’d need a lot of storage. We also do a lot of data crunching using Hadoop, which also requires a lot of infrastructure. What we really like about Azure Machine Learning, and Azure in general, is that everything we do is through services available in Azure and we don’t need to monitor virtual machines.”

Mendeley is another innovative customer. One of the biggest repositories of scientific research content in the world, Mendeley provides a global platform and social network to foster discovery and community collaboration. To improve the user experience, Mendeley was looking to anticipate the behavior of new users in their initial adoption and engagement phase. Within two weeks of implementing Azure Machine Learning, developers were able to create a predictive model that was 30 percent more accurate than an earlier model that had taken them months to develop on their own. Not only is Mendeley able to iterate and deploy models three to five times faster, they can pinpoint their users’ needs with much greater confidence.

“Azure Machine Learning allowed us to build a better model than our previous solution in a third of the time, reducing lead time from model evaluation to deployment down to zero as it’s automated,” says Mendeley CTO Fernando Fanton. “The beauty of Azure Machine Learning is that it’s open, allowing easy integration via widely adopted technologies such as REST and Hive.”

Hundreds of Microsoft partners including Booz Allen Hamilton, Cognizant Technology Solutions, Dell and Infosys are using Azure Machine Learning to build innovative advanced analytics solutions for customers. Several of them offer Azure Machine Learning-based learning services to global data science communities. We will share more information about our partner organizations’ adoption of Azure Machine Learning in an upcoming blog post.

Also, as we announced earlier today, Informatica has joined our ecosystem of partners. The Informatica Cloud service allows customers to pull data from a variety of on-premises systems and the cloud – including from SaaS applications such as Salesforce.com, Workday, Marketo and more – into Azure Blob storage. Once the data is in Azure, it is readily accessible for processing and analytics using Azure Machine Learning. Learn more about Informatica Cloud and the Informatica Azure Blob connector here.

For those of you who have not yet experienced Azure Machine Learning first-hand, I encourage you to check out our offering today – it is free and easy for new users to get started, and no credit cards or Azure subscriptions are needed.

We believe Azure Machine Learning is a game changer. No other advanced analytics service comes close to the scope, openness and breadth of the offering, or the ability to leverage the cloud for easy application development and deployment. Together with other Azure big data services such as HDInsight, stream analytics offerings such as Azure Stream Analytics, data pipeline orchestration services such as Azure Data Factory, and business intelligence services such as Power BI, Azure Machine Learning enables businesses to wring value out of every byte of data that they store and process. The future is bright for a world optimized with data, insights and intelligence.

Joseph
Follow me on Twitter

PS: If you are attending Strata this week, tune in to my talk on Cloud Machine Learning to learn about how businesses benefit when advanced analytics and the cloud come together.


Microsoft Releases Azure HDInsight on Linux and Machine Learning GA at Strata Conference


Today at Strata + Hadoop World, Microsoft is announcing the public preview of Azure HDInsight running on Linux, the general availability of Storm on HDInsight, the general availability of Azure Machine Learning, and the availability of Informatica technology on Azure. These new services are part of our continued investment in a broad portfolio of solutions to unlock insights from data.

Head over to the announcement blogs to read more about these exciting developments.

Strong Partner Momentum Around Microsoft Advanced Analytics


This post is authored by Garth Fort, General Manager for Enterprise Partners in the Cloud & Enterprise Marketing team at Microsoft.

At Strata + Hadoop World yesterday, we announced the general availability of Microsoft Azure Machine Learning (Azure ML), a fully-managed cloud service offering powerful advanced analytics capabilities.

On the one hand, Azure ML, with its free version, rich templates and step-by-step guides, makes it a breeze for new users, armed with just a browser and net connection, to get started on data science. On the other hand, Azure ML also offers battle-tested machine learning algorithms and sophisticated tools and capabilities that allow enterprise-grade analytics applications to be created and deployed with relative ease.

Since we launched the public preview of Azure ML in June 2014, thousands of analytics professionals – working at hundreds of Microsoft partner organizations – have adopted Azure ML as a core part of their offering. Yesterday’s announcement further strengthens the partner momentum around Azure ML and Microsoft advanced analytics.

Some of our partners who have embraced Azure ML share their first-hand experiences through the video below:


 

We are energized by our partners’ adoption and validation of our product – some of their experiences are captured below:

  • Cognizant Technology Solutions, a leading Microsoft partner in the big data space, has built a center of excellence around Azure ML and certified associates with hands-on experience are working extensively with the technology. Karthik Krishnamurthy, Global Head, Enterprise Information Management, says:

     “Moving from data to value becomes very important, and that’s what Azure ML is really good at doing”.

  • Infosys, a global IT services giant, has seen their analytics associates rapidly embrace Azure ML. Issac Mathew, a Principal in Infosys’ Analytics Practice says:

    “Something fantastic that I’ve seen in Azure ML studio is the ability to use R scripts. What that means for the existing team is they don’t need to relearn a whole new language. So if you’re an expert in R, you can pretty much use at least 90 percent of the core that you were using earlier as it is directly plugged into Azure ML. So that’s something amazing.”

  • Leading Independent Software Vendors (ISVs) too are embracing Azure ML to leverage the power of advanced analytics in the cloud. For instance, Dell is incorporating Azure ML into their Statistica predictive analytics solution. John Thompson, General Manager of Dell’s Advanced Analytics Group says:

    “When we saw Azure Machine Learning, the first demo really blew us away. The benefits that we’re seeing from Azure Machine Learning or our customers are seeing from Azure Machine Learning is that we’re starting to have the cloud economics come to play in predictive analytics”.  

  • “Born-in-analytics” startup Data Science Dojo offers learning services on Azure ML to customers across the globe and is helping build a global data science talent pool. CEO Raja Iqbal says:

    “We leverage the seamless integration of Azure ML with rest of the Azure ecosystem to train data scientists and data engineers who can handle the proverbial 3 Vs (Volume, Velocity and Variety) of big data.” 

  • Booz Allen Hamilton's Lead Technologist for Data Science, Sean Weppner, had this to say:

    “Azure’s machine learning is a great opportunity for us as Booz Allen, because it enables our teams to connect that much more directly. It enables the management to be able to hop in and see what their engineering teams are doing, visualizing things at every single layer of the analytics in a way that’s much easier than tools that we’ve seen in the past or used in the past”.

To learn more about Azure ML, including a free trial, just click here.

To partner with Microsoft, visit the Microsoft Partner Network where you can enroll, manage your membership and claim partnership benefits.

Garth

Channel 9 Implements The Azure Machine Learning Recommendations API


Reposted from Channel 9

ML Blog Team

New Advanced Publish Options to Specify Object Types to Exclude or Not Drop


Our team has been hard at work these past few months, and we're thrilled to announce that the most frequently requested improvement to SQL Server Data Tools will soon be here. We plan to release an update of SQL Server Data Tools in the coming weeks, and in that update you'll find new options that provide greater flexibility and control over database publishing.

We've re-organized the Advanced Publish Settings dialog and added new options on the Drop and Ignore tabs. On the Drop tab, selecting Drop objects in target but not in source now enables additional options for selecting object types, like Users, that will not be dropped. The need for this frequently arises with TSQL user objects, as they are often jointly managed by the developer and the database administrator. Until now, selecting Drop objects in target but not in source always caused any target-only user objects to be dropped during publish, but the new Do not drop users option leaves the choice up to you.

Do not drop users can be combined with do not drop permissions and do not drop role membership to ensure that settings for user objects that exist only in the target database are not modified.

Note that the new "do not drop" options are just concerned with target-only objects. Selecting Do not drop users, for example, will not prevent users that are defined in your database project from being created or altered – it only prevents users that aren't defined from being dropped. Sometimes, though, you just want a certain object type to be completely ignored when publishing a database, so in addition to the "do not drop" options, we've added options for excluding object types from publish.

On the Ignore tab, the new options in the Excluded Object Types section allow you to choose to prevent certain object types from being published. Excluded object types aren't evaluated during publish and new, modified, or deleted objects of an excluded type aren't created, changed or dropped in the target database. (Note, though, that it is sometimes necessary to modify an excluded object when one of its dependencies has been changed or removed.)

We've also modified SqlPackage.exe and the Data-Tier Application Framework (DacFX) to accept these new options.

We'd like to thank the user community for the clear and consistent feedback you have provided about the importance of these options, and we encourage you to tell us what we should be doing next by:

  • Submitting feedback and voting at Connect
  • Posting on our Forum
  • Submitting a comment below

Microsoft's Bing prediction engine correctly predicts all six top Oscars 2015 winners


As reported by The Verge, the Microsoft Bing prediction engine – which has had great success in the past at predicting the World Cup, English soccer results and NFL games – successfully predicted the best picture, best director, best actor, best actress, supporting actor and actress out of the top awards for the 2015 Oscars.

In fact, Bing successfully predicted 84 percent of the 24 Oscar 2015 results. The prediction model was managed by Microsoft researcher David Rothschild, who, in the past, correctly predicted 21 of 24 Oscar winners in 2014 and 19 of 24 winners in 2013. In comparison, Vegas odds from the Wynn casino weren’t nearly as accurate – the Wynn predicted best picture, best actress, best actor, best supporting actress, best supporting actor, and best director, but got only four of the six correct. Microsoft predicted all six accurately.

  

ML Blog Team

 

 

Free Webinar: An Overview of the New Capabilities in Azure Machine Learning


We are pleased to bring you a free webinar exploring the many new capabilities introduced in the recently released GA version of Azure ML. The session will be full of examples and demos, with time for Q&A as well. The webinar takes place tomorrow, Tuesday March 3rd, at 10 AM Pacific time, and will be hosted by Hai Ning, Principal Program Manager on the Azure ML team.

As you may already know, Azure ML offers a great experience for data scientists of all skill levels and requires nothing but a browser and net connectivity.

  • You can create simple data flow graphs and ML experiments via simple drag and drop.

  • The Azure ML Studio includes best-in-class algorithms from Microsoft businesses like Xbox and Bing, R and Python packages and a gallery of sample experiments which make it easy to get started.

  • You can operationalize an ML model into web services within seconds.

Register here to attend this webinar.

 

ML Blog Team

SQL Server Data Tools and Data-Tier Application Framework Update for February 2015


The SQL Server Data Tools team is pleased to announce that an update for SQL Server Data Tools in Visual Studio and the Data-Tier Application Framework (DACFX) is now available.

Get it here:

SQL Server Data Tools: https://msdn.microsoft.com/data/hh297027

Data-Tier Application Framework (DACFX): http://www.microsoft.com/en-us/download/details.aspx?id=45886

What’s New?

Support for the latest Azure SQL Database V12 features

Many features were added to Azure SQL Database in V12, and now the developer tools in Visual Studio support V12.

Improved Cross-Platform Schema Comparison

Previously, schema compare required you to select the "allow incompatible platforms" option when comparing a source to a target that supports fewer objects, as when the source is SQL Server 2014 and the target is SQL Server 2008. Now you can compare anything without selecting that option. If any compatibility issues exist, you'll be notified when you attempt to update the target. Note, though, that incompatible DML, such as procedure bodies, won't be identified. Look for that feature in a future release.

New advanced publish options

We've added new options to increase your control over publishing, including the ability to not drop users. For more details, click here.

Bug fixes to customer-reported issues

This release includes fixes for the following issues:

Contact Us

If you have any questions or feedback, please visit our forum or Microsoft Connect page.  We look forward to hearing from you.


IoT to Help the Stanford Linear Accelerator Beam Keep Going for the Next 50 Years


Re-posted from the Microsoft News Center

Over its 50-year history, research conducted at the SLAC (Stanford Linear Accelerator Center) has led to the discovery of matter’s fundamental building blocks and unveiled insights into the origin of the universe and the nature of dark energy. The work at SLAC has also helped scientists discover new drugs, new materials and new ways to produce clean energy. Six scientists at the center have been awarded Nobel Prizes.


The SLAC National Accelerator Laboratory, nestled in the hills west of Stanford University, is the longest and straightest building in the world – a two-mile-long particle accelerator built in the 1960s.

After decades running the facility, SLAC’s laboratory setup is finely tuned, but unexpected outages do occur. To ensure continued operation of this critical scientific resource, SLAC is exploring the Internet of Things (IoT). The team is working on a future plan to take data from all intelligent sensors that monitor the SLAC systems and feed the data into the cloud where it can be processed, analyzed and delivered back to engineers who can then take action before potential failures occur.

SLAC is working with Microsoft Open Technologies, Azure Event Hubs and Azure Machine Learning to integrate the facility’s diverse array of sensor formats into the cloud and perform real-time analysis on that data to obtain actionable insights.

Essentially, the SLAC team is using IoT to “listen to the machine”, and teaching the machine to communicate back with engineers, to report on its symptoms well ahead of any problem. It’s all in the name of science, and keeping the SLAC particle accelerator beam going for the next 50 years.

Learn more about the SLAC story at this link or by clicking the image below.


Illustration of an experiment at SLAC to reveal how a protein from photosynthetic bacteria changes shape in response to light. (Photo courtesy SLAC.)

 

ML Blog Team

Microsoft Enhances Data Platform with Availability of Fully-Managed NoSQL and Search Services


 Guest post by Tiffany Wissner, Senior Director, Data Platform

As part of our commitment to delivering a world-class data platform for our customers, I am  excited to announce the general availability of Azure Search and Azure DocumentDB to support the search and unstructured data  needs of today’s modern cloud applications.

  • Generally available today, Azure Search is a fully managed search service that enables developers to easily add search capabilities to web and mobile applications. 
  • We are also announcing the April 8 general availability of Azure DocumentDB, our NoSQL document database-as-a-service.

These new data services extend our investments in a broad portfolio of solutions to unlock insights from data. Azure is at the center of our strategy, offering customers scale, simplicity and great economics. And we continue to make it easier for customers to work with data of any type and size – using the tools, languages and frameworks they want – in a trusted cloud environment.

Azure Search now generally available

Azure Search helps developers build sophisticated search experiences into web and mobile applications. It reduces the friction and complexity of implementing full-text search and helps developers differentiate their applications through powerful features not available with other search packages. As an example, we are adding enhanced multi-language support for more than 50 languages built on our many years of natural language processing experience from products like Microsoft Office and Bing. With general availability, Azure Search now offers customers the ability to more easily load data from Azure DocumentDB, Azure SQL Database, and SQL Server running in Azure VMs into Azure Search using new indexers. Plus, a .NET software development kit (SDK) is now available to make working with Azure Search a more familiar experience.
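To give a flavor of the service, a full-text query is a single HTTP POST against the REST API. The sketch below builds such a request in Python; the service name, index name and key are placeholders for values your own service provides, and the api-version shown is the one current at general availability.

```python
import json

# Hypothetical service endpoint; your Azure Search service exposes its
# own URL of the form https://<service>.search.windows.net.
SERVICE_URL = "https://myservice.search.windows.net"

def build_search_request(index, text, top=10, api_version="2015-02-28"):
    """Build URL, headers and body for a full-text Search Documents call."""
    url = "{0}/indexes/{1}/docs/search?api-version={2}".format(
        SERVICE_URL, index, api_version)
    body = json.dumps({"search": text, "top": top})
    headers = {"Content-Type": "application/json",
               "api-key": "<query-key>"}  # issued in the Azure portal
    return url, headers, body
```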

JLL, a professional services and investment management company that specializes in commercial real estate, is currently using Azure Search to enable search at scale, something that was previously difficult to achieve.

“It has always been our plan to have our entire web listings platform in the cloud, but the only component that stopped us from doing that was search. Now, with the Azure Search service, we can realize our goal,” said Sridhar Potineni, director of Innovation at JLL. 

Gjirafa, a full-text web search engine and news aggregator specialized in the Albanian language, is also using Azure Search to take on web search giants. Gjirafa is able to use Azure Search to prioritize results that directly tie to their business model.

“Using Azure Search to pre-process the language, we can determine the exact meaning of a phrase before returning the search results. This, along with our local data, means we can serve the Albanian market better than the big search engines,” said Mergim Cahani, founder and CEO of Gjirafa.

Azure DocumentDB generally available next month

 Azure DocumentDB offers rich query and transactional processing over a schema-free JavaScript Object Notation (JSON) data model, which helps enable rapid development and high performance for cloud-born data and applications. Offered in units that scale to meet application performance and storage needs, Azure DocumentDB allows customers to quickly build, grow, and scale cloud applications. The global reach of Azure datacenters ensures that data can scale with the global growth of the application.
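To illustrate what schema-free means in practice, the Python sketch below stores documents of different shapes side by side and emulates the semantics of a simple DocumentDB equality-filter query. It is illustrative only – plain Python standing in for the service, not the DocumentDB SDK.

```python
# Schema-free JSON documents: each record can carry a different shape,
# which DocumentDB stores and queries without a predefined schema.
docs = [
    {"id": "1", "type": "telemetry", "deviceId": "d-42", "temp": 21.5},
    {"id": "2", "type": "config", "app": "news", "flags": {"beta": True}},
    {"id": "3", "type": "telemetry", "deviceId": "d-42", "temp": 22.0},
]

# In DocumentDB's SQL grammar the call below would read roughly:
#   SELECT * FROM c WHERE c.type = "telemetry" AND c.deviceId = "d-42"
def query(documents, **predicates):
    """Plain-Python stand-in for an equality-filter query."""
    return [d for d in documents
            if all(d.get(k) == v for k, v in predicates.items())]

telemetry = query(docs, type="telemetry", deviceId="d-42")
```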

New at general availability are flexible performance levels within our standard tier which allow fine control over throughput and cost for data depending on application needs. Azure DocumentDB will be available in three standard performance levels: S1, S2, and S3. Collections of data within a DocumentDB database can be assigned to different performance levels allowing you to purchase only the performance you need. In preview, DocumentDB has been used for a wide variety of scenarios including telemetry and logging data, event and workflow data, device and app configuration data, and user-generated content.

Customers such as Telenor and News Republic are currently using Azure DocumentDB to store and query schema-free event and app configuration data. Telenor, based in Fornebu, Norway, is a mobile operator with subscribers in 27 nations. To attract customers to their service, the company used Azure DocumentDB to get a promotion up and running quickly and track user sign ups.

“With Azure DocumentDB, we didn’t have to say ‘no’ to the business, and we weren’t a bottleneck to launching the promotion—in fact, we came in ahead of schedule,” said Andreas Helland, Mobility architect at Telenor.

News Republic, a free mobile app that aggregates news and delivers it to more than 1 million users in 15 countries, uses Azure DocumentDB to make its app more interactive and create more user-focused features.

“Many people read the news passively, but we have built personalization and interactivity into our app with Azure DocumentDB. This is definitely a great way to get more people using the app and keep existing users interested,” said Marc Tonnes, database administrator for News Republic.

Microsoft data services

The value provided by our data services multiplies when customers use them together. For example, Xomni uses both Azure DocumentDB and Azure Search to create omni-channel marketing experiences for top-tier retailers like Gamestop. Likewise, we are committed to making it easy for customers to connect these services to others within the data platform through tools like the Search Indexer (to connect with SQL Database, Azure DocumentDB, and SQL Server in an Azure VM) and the recently announced Azure DocumentDB Hadoop Connector.  By making it simpler to connect data sources across Azure, we want to make it easier to implement big data and Internet-of-Things solutions.

Azure data services provide unparalleled choice for businesses, developers and IT pros with a variety of managed services from Microsoft and our partners that work together seamlessly and connect to our customers’ data platform investments – from relational data to non-relational data, structured data to unstructured data, constant and evolving data models. I encourage you to try out our new and expanded Azure data services and let us know what you think.

Machine Learning and Advanced Analytics for Non-Data Scientists


You might be a data scientist, but your boss probably is not.

IT decision-makers need to have a good grasp of advanced analytics technologies like Hadoop and machine learning but the time and energy investment can seem intense. And it can be discouraging to them if your current technology platform is oriented towards OLTP, data warehousing and BI. Who wants to think about starting over with all new tech?

One fast and painless way for non-data scientists to get the background they need is to attend Microsoft Ignite in May. Ignite is a gathering of tech leaders, IT professionals and Microsoft partners who are there to teach, learn, and spark big new ideas.

Ignite is also where IT decision-makers can attend a session called Advanced Analytics: Navigating Your Way There, led by Gigaom Research Director for Big Data and Analytics, Andrew Brust, a SQL Server MVP and Microsoft BI expert. He’ll demo Microsoft's advanced analytics technologies and explain their strategic value and compelling economics. Andrew will also walk through a competitive analysis at this session.

Best of all, he’ll demonstrate how you can fit these new technologies into environments currently based on SQL Server, Analysis Services, and similar technologies.

You’ll find a lot more happening at Ignite, including in-depth sessions for data scientists. You can stay updated via email by subscribing here. If you haven’t signed up already, register for Ignite here.

SQL Automated Backup and Patching Support in Existing SQL Server Virtual Machines


In January, we released the SQL Automated Backup and SQL Automated Patching services for SQL Server Virtual Machines in Azure. These services automate the processes of backing up and patching your SQL Server VMs. In that release, you were able to configure these services in the Azure Preview Portal when provisioning a new SQL Server 2014 Enterprise or Standard VM. You could also configure these services in an existing Virtual Machine via PowerShell commandlets.

We have now expanded the experience so you can configure these services in an existing SQL Server VM in the Azure Preview Portal. Whether you have already enabled these services or not inside your Virtual Machine, you can go to that VM in the Azure Preview Portal and either update your configuration or create a new configuration for each service. You will find both services under the Configuration label, shown in Figure 1 below.

Figure 1. Configuration label has both services

Try these services out in the Azure Portal, and check out the documentation for further details.

Updated SQL Server PHP Driver Now Available


As part of SQL Server’s ongoing interoperability program, we are pleased to announce an updated Microsoft SQL Server driver for PHP.  The new driver, which supports PHP 5.6, is now available!

This driver allows developers who use the PHP scripting language to access Microsoft SQL Server and Microsoft Azure SQL Database, and to take advantage of new features implemented in ODBC. The new version works with Microsoft ODBC Driver 11 or higher.

You can download the PHP driver here.  We invite you to explore the rest of what the Microsoft Data Platform has to offer via a trial evaluation of Microsoft SQL Server 2014, or by trying the new preview of Microsoft Azure SQL Database.

Get More Out of the Hybrid Cloud at Ignite

Hybrid cloud solutions offer the best of both worlds: you get the flexible power of the cloud combined with the tight control of localized datacenters. SQL Server offers tons of out-of-the-box hybrid capabilities, and Microsoft Principal PM Nosheen Syed will show you how to make the most of them in his Ignite session "Microsoft SQL Server to Microsoft Azure Virtual Machines: Hybrid Story."

You'll learn all about:

  • Using Managed Backup to take charge of data storage sensibly
  • Boosting IT efficiency with Azure Replica Wizard
  • Easy "lift and shift" migration of on-site SQL Server workloads to Azure Virtual Machines
  • Backup to Block Blob, which is as easy to use as it is difficult to say

There’s plenty more happening at Ignite. Get email updates by subscribing here. And if you haven’t already, register for Ignite here.

Convolutional Neural Nets in Net#

This blog post is authored by Alexey Kamenev, Software Engineer at Microsoft.

After introducing Net# in the previous post, we continue with our overview of the language and examples of convolutional neural nets or convnets.

Convnets have become very popular in recent years as they consistently produce great results on hard problems in computer vision, automatic speech recognition and various natural language processing tasks. In most such problems, the features have some geometric relationship, like pixels in an image or samples in an audio stream. An excellent introduction to convnets can be found here:

https://www.coursera.org/course/neuralnets (Lecture 5)
http://deeplearning.net/tutorial/lenet.html

Before we start discussing convnets, let’s introduce one definition that is important to understand when working with Net#. In a neural net structure, each trainable layer (a hidden or an output layer) has one or more connection bundles. A connection bundle consists of a source layer and a specification of the connections from that source layer. All the connections in a given bundle share the same source layer and the same destination layer. In Net#, a connection bundle is considered to belong to the bundle's destination layer. Net# supports various kinds of bundles like fully connected, convolutional, pooling and so on. A layer might have multiple bundles which connect it to different source layers.

For example:

// Two input layers.
input Picture [28, 28];
input Metadata [100];

hidden H1 [200] from Picture all;

// H2 connected both to H1 and Metadata layers using different connection types
hidden H2 [200] {
  from H1 all;                          // This is a fully-connected bundle.
  from Metadata where (s, d) => s < 50; // This is a filtered (sparse) bundle.
}

output Result [10] softmax from H2 all;

Layer H2 has two bundles: one fully connected, the other filtered. In such a scenario, defining layer H2 as just “fully connected” or “filtered” would not fully reflect the actual configuration of the layer, which is why we chose the term “bundle” to describe connection types.

Now let’s talk about convnets. As in the previous post, we will use the MNIST dataset. Let us start with a very simple convnet from the “Neural Network: Basic Convolution” sample. Sign up for our free trial to run this sample.

const { T = true; F = false; }

input Picture [28, 28];

hidden C1 [5, 12, 12]
  from Picture convolve {
    InputShape  = [28, 28]; // required
    KernelShape = [ 5,  5]; // required
    Stride      = [ 2,  2]; // optional, default is 1 in all dimensions
    MapCount    = 5;        // optional, default is 1
  }

// Only part of the net shown, see the sample for complete net structure

Hidden layer C1 has a convolutional bundle of the following configuration:

  • InputShape defines the shape of the source layer for the purpose of applying convolution. The product of the dimensions of InputShape must be equal to the product of the dimensions of source layer output but the number of dimensions does not have to match. For example, Picture layer can be declared as:

    input Picture[784];

  • Receptive field is 5 by 5 pixels (KernelShape).

  • Stride is 2 pixels in each dimension. In general, smaller strides will produce better results but take more time to train. Larger strides will allow the net to train faster but may produce worse results.

  • The number of output feature maps is 5, which is also the number of sets of weights (kernels, also known as filters).

  • No automatic padding.

  • Weights will be shared in both dimensions. That means there will be:

    MapCount * (KernelShape[0] * KernelShape[1] + 1) = 130 total weights in this bundle (the  “+ 1” term is for the bias which is a separate weight for each kernel).

Note that the C1 layer can be represented as 3-dimensional layer as it has five 2D-feature maps, each feature map of size 12 by 12. When automatic padding is disabled, the number of outputs in a particular dimension can be calculated using the following simple formula:
O = (I - K) / S + 1, where O is output, I – input, K – kernel and S – stride sizes.

For the layer C1:
O = (28 - 5) / 2 + 1 = 12 (using integer arithmetic)
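The output-size arithmetic above is easy to sanity-check. The snippet below is a small helper of our own (not part of Net#) that applies the O = (I - K) / S + 1 formula with integer arithmetic:

```python
def conv_output_size(input_size, kernel_size, stride):
    """Output size per dimension for a convolution without padding,
    using integer arithmetic: O = (I - K) // S + 1."""
    return (input_size - kernel_size) // stride + 1

# Layer C1: 28x28 input, 5x5 kernel, stride 2 -> 12x12 feature maps
print(conv_output_size(28, 5, 2))  # 12
# Layer C2 (below): 12x12 input, 5x5 kernel, stride 2 -> 4x4 feature maps
print(conv_output_size(12, 5, 2))  # 4
```

The second call matches the spatial dimensions of layer C2, described next.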

Layer C2 is the next hidden layer with convolutional bundle that is connected to C1. It has a similar declaration except one new attribute, Sharing:

hidden C2 [50, 4, 4]
  from C1 convolve {
    InputShape  = [ 5, 12, 12];
    KernelShape = [ 1,  5,  5];
    Stride      = [ 1,  2,  2];
    Sharing     = [ F,  T,  T]; // optional, default is true in all dimensions
    MapCount    = 10;
  }

Note that weight sharing is disabled in the Z dimension; let’s take a closer look at what happens in such a case. Our kernel is essentially 2D (the KernelShape Z dimension is 1); however, the input is 3D, with an InputShape Z dimension of 5. In other words, we have 5 input feature maps, each 12 by 12. Suppose we finished applying the kernel to the first input feature map and want to move to the second. If sharing in the Z dimension were true, we would use the same kernel weights for the second input map as we used for the first. However, sharing is false, which means a different set of weights is used to apply the kernel to the second input map. That means we have more weights and larger model capacity (which may or may not be a good thing, as convnets are usually quite tricky to train).

To see the difference, here is the layer C2 weight count with disabled Z sharing:
(KernelShape[1] * KernelShape[2] + 1) * InputShape[0] * MapCount = (5*5+1)*5*10 = 1300

In case of enabled Z weight sharing, we get:
(KernelShape[1] * KernelShape[2] + 1) * MapCount = (5*5+1)*10 = 260
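These weight counts can be verified with a short calculation. The helper below is an illustrative sketch of our own (the function name is not part of Net#); the "+ 1" term is the per-kernel bias:

```python
def conv_weight_count(kernel_weights, map_count, input_maps=1, share_z=True):
    """Total trainable weights in a convolutional bundle.
    With Z sharing, one kernel (plus bias) per output map;
    without it, a separate kernel per (input map, output map) pair."""
    per_kernel = kernel_weights + 1  # "+ 1" is the bias
    if share_z:
        return per_kernel * map_count
    return per_kernel * input_maps * map_count

# Layer C1: 5x5 kernel, 5 maps -> 130 weights
print(conv_weight_count(5 * 5, 5))                                 # 130
# Layer C2 with Z sharing disabled: 5 input maps, 10 output maps
print(conv_weight_count(5 * 5, 10, input_maps=5, share_z=False))   # 1300
# Layer C2 with Z sharing enabled
print(conv_weight_count(5 * 5, 10, input_maps=5, share_z=True))    # 260
```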

If you run the sample it should finish in about 5 minutes providing an accuracy of 98.41% (or 1.59% error) which is a further improvement over the fully-connected net from the previous Net# blog post (98.1% accuracy, 1.9% error).

Another important layer type in many convnets is a pooling layer. A few pooling kinds are used in convnets – the most popular are max and average pooling. In Net#, a pooling bundle uses the same syntax as a convolutional bundle thus allowing the same level of flexibility in defining pooling topology. Note that the pooling bundles are non-trainable, that is, they don’t have weights that are updated during the training process as pooling just applies a fixed mathematical operation. To demonstrate the usage of a pooling bundle, we’ll take a convnet from the “Neural Network: Convolution and Pooling Deep Net” sample. This net has several hidden layers with both convolutional and pooling bundles and uses parameterization to simplify the calculation of input and output dimensions.

hidden P1 [C1Maps, P1OutH, P1OutW]
  from C1 max pool {
    InputShape  = [C1Maps, C1OutH, C1OutW];
    KernelShape = [1, P1KernH, P1KernW];
    Stride      = [1, P1StrideH, P1StrideW];
  }

Using the same convolutional syntax to define pooling allows us to create various pooling configurations, for example, maxout layers (which are essentially a 3D max pooling layer plus a linear activation function in a convolutional layer) and many others.

If you run the sample it should finish in about 25 minutes providing an accuracy of 98.89% (or 1.1% error) which is a further improvement over the basic convnet described previously.

One more feature of a Net# convolutional bundle is padding which can be either automatic (the Net# compiler decides how to pad) or precise (the user provides exact padding sizes). In the case of automatic padding (Padding optional attribute, default is false), the input feature maps will be automatically padded with zeroes in the specified dimension(s):

hidden C1 [16, 24, 24] rlinear
  from Image convolve {
    InputShape  = [3, 24, 24];
    KernelShape = [3, 5, 5];
    Stride      = [1, 1, 1];
    Padding     = [false, true, true];
    MapCount    = 16;
  }

hidden P1 [16, 12, 12]
  from C1 max pool {
    InputShape  = [16, 24, 24];
    KernelShape = [1,  3,  3];
    Stride      = [1,  2,  2];
    Padding     = [false, true, true];
  }

Also note a few other Net# features shown in this example:

  • C1 layer uses a rectified linear unit (ReLU) activation function - rlinear. See the Net# guide for the complete list of supported activation functions.

  • The pooling bundle in the P1 layer uses overlapped pooling: the kernel size is 3 in both X and Y dimensions while the stride is 2 in both, so applications of the pooling kernel will overlap.

When padding is enabled, a slightly different formula is used to calculate output dimensions:
O = (I - 1) / S + 1

For the layer P1:
O = (24 - 1) / 2 + 1 = 12 (using integer arithmetic)
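Both output-size formulas can be sanity-checked together. The helper below is our own illustrative sketch (not part of Net#), selecting the padded or unpadded formula:

```python
def output_size(input_size, kernel_size, stride, padded=False):
    """Output size per dimension: O = (I - 1) // S + 1 with automatic
    padding, or O = (I - K) // S + 1 without it (integer arithmetic)."""
    if padded:
        return (input_size - 1) // stride + 1
    return (input_size - kernel_size) // stride + 1

# Layer P1: 24x24 input, padding enabled, stride 2 -> 12x12
print(output_size(24, 3, 2, padded=True))   # 12
# Layer C1 from the basic sample: no padding, 5x5 kernel, stride 2
print(output_size(28, 5, 2, padded=False))  # 12
```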

While training a complex neural network it is important to monitor the progress and adjust the training parameters or network topology in case the training does not converge or converges poorly. To see the progress of neural net training in Azure ML, run an experiment and, when the Train Model module starts running, click on the “View output log” link in the Properties pane. You should see something like this:

[Training log output showing MeanErr values per iteration]

MeanErr is the mean training error and should generally decrease during the training process (it is OK for it to fluctuate a bit from time to time). If it is not decreasing, try changing the training parameters (learning rate, initial weights diameter or momentum), the network architecture, or both.

Try playing with these convnets and see what happens if you change:

  1. Kernel size (for example to [3, 3] or [8, 8]).

  2. Stride size.

  3. Feature map count (MapCount).

  4. Training parameters like learning rate, initial weights diameter, momentum and number of iterations.

A guide to Net# is also available in case you want to get an overview of the most important features of Net#.

Do not hesitate to ask questions or share your thoughts – we value your opinion!

Alexey


EF6.1.3 RTM Available

Today we are pleased to announce the availability of EF6.1.3. This patch release contains only high priority bug fixes.

 

What’s in EF6.1.3?

EF6.1.3 contains only fixes to high-priority issues that were reported on the 6.1.2 release. The fixes include:

 

Where do I get EF6.1.3?

The runtime is available on NuGet. Follow the instructions on our Get It page for installing the latest version of Entity Framework runtime.

The tooling is available on the Microsoft Download Center. You only need to install the tooling if you want to create models using the EF Designer, or generate a Code First model from an existing database.

 

What’s next?

In addition to working on the next major version of EF (Entity Framework 7), we’re also working on another update to EF6. We’ve already made a series of changes and accepted some community contributions into the code base for this next release. We don’t have a specific timeline for the release just yet.

Connected Cows? Azure Machine Learning in an Unlikely Place

Repost from a session delivered by Joseph Sirosh at Strata + Hadoop World in San Jose in February.

A surprising conversation about a farmer’s dilemma, a professor’s ingenuity and how the cloud, advanced data analytics and devices all came together to fundamentally re-imagine an age old way of doing business.

This video is just over 8 minutes long.

 

ML Blog Team

Early investigation into supporting the OData libraries in ASP.NET 5/MVC 6

Dear OData lovers,

The OData team has started investigating support for the OData .NET V4 libraries in ASP.NET 5/MVC 6. The work is currently going well and is tracked by the following two GitHub issues:

  • Support ODataLib/EdmLib/Microsoft.Spatial on ASP.NET 5/ASP.NET Core 5

https://github.com/OData/odata.net/issues/97

  • Port ASP.NET Web API OData to ASP.NET 5/MVC 6

https://github.com/OData/WebApi/issues/229 (initiated by one of our enthusiastic users @PinpointTownes on GitHub)

It's been an interesting journey: it's a lot of fun testing how our libraries work with the new framework, absorbing its new design philosophy, and venting about strange incompatibilities.

If you're interested in joining, just comment on the issues. We look forward to seeing you there.

Best,

The OData team

5 lucrative tech careers to pursue in 2015

Re-post from a recent article featured on

Looking to make the biggest bucks in the fastest growing industry? There are two crystal clear - albeit unsurprising - commonalities among the top 10 highest paying jobs in the tech industry.

If you want to cash in the biggest checks, you have to:

Step 1: Become a killer programmer or big data expert.
Step 2: Move to the West Coast. 

Folks who know how to handle, parse and analyze an overwhelming amount of data will get the fattest paychecks in 2015 — three of the five highest paid jobs are centered on big data.

 

Read the original article here.

ML Blog Team

SQL Server 2014 is Certified for SAP Applications On-Premises and in the Cloud

As of March 11, 2015, SAP has certified support for SAP NetWeaver-based applications on Microsoft SQL Server 2014.  Now you can run even more of your Tier-1, mission-critical workloads on SQL Server.  And the ability to run SAP on Microsoft Azure means that it can be accomplished with a low total cost of ownership (TCO).

SQL Server 2014 provides the higher scale, availability, and breakthrough performance needed for your most demanding SAP workloads. The updatable in-memory ColumnStore will deliver blazing fast query performance for your SAP Business Warehouse (BW).  SQL Server AlwaysOn availability groups help with the reliability and availability requirements of SAP systems by enabling multiple, readable secondaries that can be used for failover and read workloads like reporting and backup.

With SAP’s certification, you can also run SAP in Microsoft Azure Virtual Machines with SQL Server 2014. Azure enables SAP customers to reduce TCO by leveraging Microsoft infrastructure as system needs grow, rather than investing in additional servers and storage.  With Microsoft Azure, customers can leverage development and test environments in the cloud that can be spun up and scaled out as needed.  SQL Server 2014 also introduced Disaster Recovery to Azure using an asynchronous AlwaysOn secondary, which can make Azure a part of your SAP disaster recovery plan.

With the certification, customers can now adopt SQL Server 2014 for mission-critical SAP workloads, and we look forward to telling you their stories soon. Here are some customers who are taking advantage of SAP on SQL Server today:

  • Quanta Computer Boosts Performance of Its SAP ERP System with In-Memory Technology
  • Zespri International Prunes Costs, Defends Business from Disasters by Running SAP in the Cloud
  • Saudi Electric Company Increases Query Times by 75 Percent, Can Respond Faster to Customers
  • Mitsui & Co. Deploys High-Availability and Disaster-Recovery Solution After Earthquake

Many companies are already betting their mission critical apps on SQL Server 2014. To read about Microsoft’s leader position for business critical operational database management and data warehouse workloads, read Gartner's Magic Quadrant for Operational Database Management Systems and Magic Quadrant for Data Warehouse Database Management Systems Report.  For more information about how customers are already using SQL Server 2014 for mission critical applications, read these case studies:

For more about the powerful combination of Microsoft and SAP, visit http://www.microsoft.com/SAP.  To get started with SQL Server 2014, click here.
