by Jeffrey Walker
| Feb 10, 2014
In our journey so far we’ve seen that predictive analytics takes a considerable amount of preparation before you get to really do any kind of analysis. Not only does the business problem need to be identified but the supporting data needs to be assessed to determine if there’s enough history to prove your case. These processes are never linear, especially the first time through. We saw as well that the data preparation stage is particularly “busy” as there are a large number of quality checks that need to be undertaken in order to ensure the integrity of your outcomes.
So finally, the moment has come when the analyst can actually start modeling the data for insights, observations, and clues that will prove (or disprove) the value of your business objective.
Applying a Predictive Model to Your Data
Predictive modeling is defined as “the process by which a model is created or chosen to try to best predict the probability of an outcome.” It mathematically represents various relationships in historical data in order to make predictions about the likelihood of future events.
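To make “mathematically represents relationships” a little more concrete, here is a minimal sketch of one common form, a logistic model, which turns a weighted sum of inputs into a probability. The weights and the input names (`purchases_last_year`, `months_since_last_order`) are invented for illustration; a real model would learn its weights from historical data.

```python
import math

def predict_probability(features, weights, intercept):
    """Logistic model: squash a weighted sum of inputs into a value between 0 and 1."""
    z = intercept + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights, made up for illustration only.
weights = {"purchases_last_year": 0.4, "months_since_last_order": -0.3}
customer = {"purchases_last_year": 5, "months_since_last_order": 2}
probability = predict_probability(customer, weights, intercept=-1.0)
```

In this sketch, a positive weight pushes the predicted likelihood up and a negative weight pulls it down.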
Since the predictive modeling process can get overwhelming very fast, let’s break it down into some concrete, actionable items:
- Identify your staffing and software resources: Will you have to train someone in-house or bring in an outside consultant? What predictive modeling software does your organization already have, and what will it need to purchase? What are the costs involved, and how will that software integrate into your IT infrastructure?
- Identify your observation and outcome periods: The observation period is when a given behavior is observed and characterized. The outcome period is the time when the desired outcome is observed. If you select 24 months of customer data, you might designate the first 12 months as the observation period and the next 12 as the outcome period.
- Identify your dependent variable: Analysts commonly refer to the model’s outcome as the dependent or outcome variable. This variable relates directly to the outcome or behavior you’re trying to predict. For example, if you want to identify customer loyalty in your outcome period (sometimes called the performance window), how should loyalty be defined? Should the variable be a certain number of purchases or a certain amount of dollars spent?
- Identify your key variables: During this step analysts determine which independent or explanatory variables truly relate to the desired outcome. This may require extra levels of data preprocessing or cleansing to improve the predictive power of the variables.
- Identify your method: This stage can be daunting as there are so many approaches, methods, and tools out there. Some of the most popular predictive modeling techniques used by data scientists today are neural networks (NNs), support vector machines (SVMs), decision trees, clustering, association rules, scorecards, and linear and logistic regression. The purpose of the predictive model is to “best” summarize all the potential variables and attributes into a single solution that can easily be implemented within your organization. The choice will depend on the software that you’re using, the experience of your staff, time constraints, and the outcomes that you’re trying to predict. If time allows, the most reliable path is to develop several different models using various techniques and then compare them to find the “best” one.
- Build and test your model: Once the independent variables are identified and the method is chosen, the analyst will construct an initial predictive model. This process is iterative and will involve rounds of prototyping, testing, and refining to get the model right. The data is usually divided into a training set, which is used to build the model, and a testing set, which is held back and used to validate it. Because the model has not seen this testing or “validation sample” before, its results there will show whether the model is robust or needs to be further tweaked.
- Implement your model: If you’ve tested various modeling techniques, now it’s time to choose the one that best predicts the intended outcome. You will now implement the desired model in order to meet your business objectives. For example, if the requirement is to send a mailing to the top 10,000 likely responders, the predictive model will need to be transformed into a “scoring” algorithm, where the scores indicate the likelihood of an outcome (e.g. Member A is 90% likely to respond, while Member B is only 20% likely).
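The observation/outcome split, the training/testing split, and the final scoring step from the list above can be sketched in a few lines of plain Python. Everything here is an assumption for illustration: the record layout `(customer_id, month, amount)`, the helper names, and the 70/30 split ratio do not come from any particular tool.

```python
import random

def build_dataset(transactions, observation_months=12):
    """Turn (customer_id, month, amount) records into one row per customer.

    Months 1..observation_months feed the input feature (total spend);
    later months only decide the outcome label (bought again or not).
    """
    spend, bought_later = {}, {}
    for customer_id, month, amount in transactions:
        spend.setdefault(customer_id, 0.0)
        bought_later.setdefault(customer_id, False)
        if month <= observation_months:
            spend[customer_id] += amount
        else:
            bought_later[customer_id] = True
    return [(cid, spend[cid], bought_later[cid]) for cid in sorted(spend)]

def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle reproducibly, then hold back a share of rows for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def top_responders(scores, n):
    """Rank customers by predicted likelihood and keep the n highest."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [customer_id for customer_id, _ in ranked[:n]]

# Hypothetical data: month numbers run from 1 to 24.
transactions = [
    ("A", 3, 50.0), ("A", 15, 20.0),
    ("B", 7, 10.0),
    ("C", 20, 5.0),
    ("D", 2, 30.0), ("D", 11, 15.0),
]
rows = build_dataset(transactions)
train_rows, test_rows = train_test_split(rows)

# Scores would come from the chosen model; these echo the example above.
mailing_list = top_responders({"Member A": 0.90, "Member B": 0.20}, n=1)
```

Keeping the outcome months out of the input features is the point of the first function: the model should only learn from what was known before the outcome happened.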
If you are a small business leader who is new to the world of predictive analytics and modeling, one of the most important points is not to get overwhelmed. The best advice, especially for the novice, is to make ready use of the growing number of predictive analytics tools on the market today. These can provide great value by revealing customer behaviors and trends and by showing who is likely to respond to offers and who is likely to drop out of the sales funnel.
In the next and final segment we’ll discuss the importance of the deployment stage before concluding with a review of the main takeaways for your predictive analytics strategy.