BeyeNETWORK Spotlights focus on news, events and products in the business intelligence ecosystem that are poised to have a significant impact on the industry as a whole; on the enterprises that rely on business intelligence, analytics, performance management, data warehousing and/or data governance products to understand and act on the vital information that can be gleaned from their data; or on the providers of these mission-critical products. Presented as Q&A-style articles, these interviews conducted by the BeyeNETWORK present the behind-the-scenes view that you won’t read in press releases.
This BeyeNETWORK Spotlight features Ron Powell's interview with Tapan Patel, Global Product Marketing Manager – Predictive Analytics and Data Mining at SAS. Ron and Tapan discuss how big data is changing the analytics landscape.
Tapan, SAS continues to invest heavily in high-performance analytics to provide customers with faster analysis of data and more accurate insights – allowing people or processes to make use of analytical insights. Why are organizations still struggling with making the right decisions and making a broader impact?
Tapan Patel:
When looking for answers to complex questions and trying to gain an edge in today's marketplace, organizations must look at the problem not only from a technological perspective but also from a process and people perspective. We have to stop thinking of analytics as just a tool or product and instead treat it as a component of the overall business process. To improve decision making, more focus needs to be on asking the right business questions, bringing more ideas into your decision-making process and running different scenarios on your data. To make a broader impact, IT needs to get involved early on, thinking about the infrastructure necessary to enable high-performance analytics and working hand in hand with the analytic team to integrate models into business processes.
That makes a lot of sense because planning is critical to succeeding with analytics. Where do organizations need to focus in the next five years when it comes to predictive analytics and data mining?
Tapan Patel:
More organizations are relying on unstructured data sources in combination with structured data sources. That combination provides more insights and improves predictive power for decision making, so you can expect text analytics to grow in importance.
Business users are seeking analytics to shape their decisions, so it's imperative that analytics be pushed to a variety of decision makers at different levels. In order to increase adoption, tools and interfaces must allow data mining results to be quickly consumed or adopted at many levels in the organization. The usability of predictive analytics tools is growing in importance. SAS provides tools and processes that are suitable not just for analytics experts but also for business analysts and domain experts in different business units. These users are knowledgeable about the business issues at hand but less knowledgeable about the data mining process. SAS empowers them with predictive analytics tools that help them easily interpret results.
Let’s not forget that culture and communication are important issues. How do different business units collaborate to extract more value from analytics for their organizations? Finding and retaining the right analytic talent and encouraging the use of a common set of analytics best practices across business units will tremendously help organizations.
During the years ahead, we will see a greater need for skilled staff at different levels in organizations. Organizations need people with a blend of analytical, business and communication skills to foster the right analytical culture. People with all of these skills are the analytics champions facilitating broader adoption of analytics today.
Pushing analytics, especially to a larger audience, and embedding analytics is very strategic. How does big data alter the predictive analytics and data mining life cycle with regard to sampling, visualization, model development, and model deployment?
Tapan Patel:
Big data challenges have put a spotlight on analytics, especially on the predictive analytics and data mining life cycle.
SAS has always offered capabilities to do representative sampling, and now with high-performance analytics, analyzing the entire data set is practical in many cases. Organizations are looking at big data not just from a statistical and data mining perspective but also from a cost/benefit perspective. When there is a potential to increase accuracy, using the entire data set makes good sense.
When data is continually growing, visualization tools used for data discovery need to adapt in order to quickly reveal trends and patterns. Visualizing data over time requires tools that allow analysts to look at attributes from multiple viewpoints as quickly as possible. Easily showing the visual relationships within the data can help the analyst quickly identify key variables affecting outcomes.
In the model development stage, organizations need the capability to analyze large volumes of data without being restricted to simple modeling techniques. Using sophisticated modeling techniques can greatly improve outcomes. Also, more model iterations bring the model closer to optimum, which results in highly accurate insights. As part of model refinement, how can you quickly add new or additional variables to reflect the current state of the market? That ability is important to ensure that you continue to get high-impact results.
Big data – however you define it – isn’t getting smaller, and having a scalable and reliable analytics infrastructure to support the entire modeling life cycle is critical.
Could you describe the functionality that SAS offers for predictive analytics?
Tapan Patel:
Our customers use predictive analytics to uncover patterns, opportunities and insights to drive proactive, evidence-based decisions. SAS offers a wide range of capabilities across every step in the predictive analytics life cycle.
Whether in a global corporation or a midmarket company, every predictive analytics life cycle starts with defining the business problem.
Then, data preparation is where the majority of time is spent for any data mining exercise. The purpose is to put the data in a form in which the data mining question can be asked and to make it easier for the analytic techniques to answer that question. Every change in the data – cleansing, transformation or augmentation of data – means a change occurs in the problem space that the analytics has to explore. SAS provides a variety of tools to access data, manipulate data, profile the data, visualize the data, and explore and discover the data across a variety of sources.
The next key step is the discovery, exploration and understanding of the complex relationships within the data set. SAS offers a variety of tools and techniques to assist in discovery and learning from a diverse set of data, keeping in mind that users have different skill sets. We help them quickly identify the valuable elements and the key variables hidden in the data.
SAS offers a variety of data mining and statistical techniques for the model development stage. We offer many analytical and industry-specific techniques to solve a variety of problems. The goal is to enable modelers to quickly compare, validate and share the insights they have gained from this stage.
The next step is deployment – integrating the champion model into business processes or delivering the insights to people who are going to take actions. The model deployment step closes the gap between insights and action.
Finally, performance monitoring of models is critically important in deciding what to do with the model, given the changes happening in the market or the customer base. Should the model be updated? Should it be retired and a brand new model created? That's where the life cycle begins again, or iterates again from a process standpoint.
Tapan, today enterprises of all sizes are struggling with how they can get insights and value from big data. What are you hearing from your customers and prospective customers regarding this issue?
Tapan Patel:
Organizations need help managing big data from a data management and information management perspective and quickly extracting value from big data by using analytics. In the age of big data, the role of data management – properly accessing, storing, cleansing, transforming, and then governing the data – is becoming more important. Just finding the relevant set of data is also gaining more importance for customers.
What criteria identify a customer for whom high-performance analytics is a good fit?
Tapan Patel:
Ron, that’s an important question. From a technical perspective, as data volumes grow, customers don’t want to change the kind of analytics they're doing. They don't want to be limited to using simple analytical techniques or required to dumb things down just because there's a lot of data to crunch. They are asking questions such as: How quickly can I add/remove variables or test multiple modeling techniques to get results? What if I could do iterations not just seven times but 50 times in a given timeframe? And they are learning the benefits of getting insights in minutes or seconds rather than hours or days.
From a business perspective, we hear things such as: We have complex problems we ignored because of infrastructure constraints, but we don’t want to ignore them now because solving them could differentiate us in the marketplace. With rapid time-to-insights, more questions can be asked and more ideas can be brought into the decision-making process to provide multiple scenarios, helping decision makers make accurate, proactive decisions.
Tapan, SAS has made some announcements recently with EMC Greenplum and with Teradata talking about dedicated high-performance analytic appliances. What new developments can we expect in 2012 from SAS?
Tapan Patel:
SAS is quick to reengineer its software to take advantage of hardware advancements. In-memory analytics will continue to be a focus area for us. Last year, we launched SAS® High-Performance Analytics, an in-memory predictive analytics offering that uses hardware from database partners, Teradata or EMC Greenplum. In March 2012 we launched SAS Visual Analytics, which allows customers to explore and visualize big data using in-memory capabilities and publish reports to the Web and mobile devices. In the year ahead, SAS will continue rolling out industry-specific solutions using in-memory analytics to meet customer needs.
Another area where we are focusing our efforts is Hadoop. Some of our customers have decided to use the Hadoop framework for storing big data. We provide data access to Hadoop, enable MapReduce programming, scripting and the execution of HDFS commands from within the SAS environment, and offer interactive tools for Hadoop data.
Tapan, I think we are just at the very early stages of big data and, obviously, SAS is playing a major role in this area going forward. Thank you for sharing with our readers how SAS continues to help companies achieve success with big data analytics.
SOURCE: High Performance Analytics and Big Data: A Q&A Spotlight with Tapan Patel of SAS