Data Dictionary: Variables and Analytical Learning

Analytics can be overwhelming or confusing without the proper context. So this month, I’m going to focus on some of the statistics… uh, math… jargon to help you understand the various types of analytics you might use in your fundraising.

Variables

A variable is the brick that our wall of analytics is built from. A variable is something that we can measure (age, salary, donation amount, etc.), and each one plays one of two roles: independent or dependent. The independent variable is typically regarded as the ‘cause’ variable. The dependent variable is the effect. Here is one example:

We know that age, years on file and frequency of giving are strongly correlated with a donor’s affinity for leaving a legacy gift to a nonprofit.

In this example, age, years on file and frequency are the independent variables (they are independent measurements), and planned giving prospect classification is the dependent variable (it is dependent on the values of the independent variables).

Types of Variables

Variables are classified based on their values as continuous or categorical. Continuous variables are quantitative, meaning that they would make sense in an arithmetic operation (dollars, age, frequency). Categorical variables are qualitative; they categorize the data. There are two types here: nominal variables have two or more categories with no intrinsic order (gender, religion, marital status, etc.), and ordinal variables have two or more categories that can be ranked (wealth grade, reward tiers).

In the above example, age, years, and frequency are all continuous variables (they make sense as part of an arithmetic equation), and the classification on whether a constituent is a viable planned giving prospect (yes or no) is a nominal categorical variable.
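
If it helps to see the three variable types side by side, here is a minimal sketch in Python. The donor records, field names and wealth grades are all made up for illustration, and the sketch assumes that "A" is the top wealth grade.

```python
from collections import Counter
from statistics import mean

# Hypothetical donor records; every value here is made up for illustration.
donors = [
    {"age": 72, "years_on_file": 15, "marital_status": "married", "wealth_grade": "A"},
    {"age": 45, "years_on_file": 3, "marital_status": "single", "wealth_grade": "C"},
    {"age": 63, "years_on_file": 10, "marital_status": "married", "wealth_grade": "B"},
]

# Continuous: arithmetic is meaningful, so an average makes sense.
average_age = mean(d["age"] for d in donors)

# Nominal: there is no order, so the most we can do is count each category.
status_counts = Counter(d["marital_status"] for d in donors)

# Ordinal: the categories can be ranked, so sorting is meaningful.
# Assumption: "A" is the highest wealth grade and "C" the lowest.
grade_rank = {"A": 1, "B": 2, "C": 3}
by_wealth = sorted(donors, key=lambda d: grade_rank[d["wealth_grade"]])
```

Averaging marital status would be meaningless, and that is really the whole point of knowing which type of variable you are holding.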

Types of Learning

Analytical processes use two main types of learning: supervised and unsupervised. The primary use of supervised learning is to predict or explain an outcome based on known data. Two common methods of supervised learning are prediction models and classification models. A prediction model predicts a value (e.g., how much revenue an extra investment in my direct mail appeal will yield), while a classification model assigns a label (e.g., likely or unlikely to contribute to a particular campaign).

Supervised learning looks at what has happened to predict what will happen. A few examples of supervised learning follow:

  • Predicting whether a donor will or will not donate to a particular campaign
  • Estimating the revenue generated from an additional number of appeals sent
  • Determining the probability of a donor becoming a sustaining donor or major giver
  • Classifying acquisition targets as responders or non-responders
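
To make the revenue example from the list above concrete, here is a minimal sketch of a prediction model: a straight line fitted by ordinary least squares. The mailing and revenue figures are invented for illustration, and a single straight line is only one of many possible prediction models.

```python
# Hypothetical history, made up for illustration:
# appeals mailed (in thousands) and the revenue each mailing brought in.
appeals_mailed = [10, 20, 30, 40]
revenue = [5000, 9000, 13000, 17000]


def fit_line(xs, ys):
    """Ordinary least squares for one independent variable."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "What has happened" (the history) gives us the line...
slope, intercept = fit_line(appeals_mailed, revenue)

# ...and the line estimates "what will happen" if we mail 50 thousand appeals.
estimated_revenue = intercept + slope * 50
```

Here the number of appeals is the independent variable and revenue is the dependent variable; the fitted slope is the estimated revenue from each additional thousand appeals.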

The other type is unsupervised learning. This type is also known as descriptive analytics. The most common form of this is direct marketing segmentation. It attempts to find patterns in the data that are not tied to a particular outcome. Two examples are affinity analysis and cluster analysis.
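
As a rough illustration of cluster analysis, the sketch below splits donors into two segments using nothing but their average gift size. It is a stripped-down version of the k-means technique with two clusters; the gift amounts are made up, and the function name two_means is my own, not a library call.

```python
def two_means(values, max_rounds=100):
    """Split numbers into two groups by repeated re-centering (k-means, k = 2)."""
    if len(set(values)) < 2:
        raise ValueError("need at least two distinct values to form two groups")
    centers = [min(values), max(values)]  # simple, deterministic starting guess
    groups = [[], []]
    for _ in range(max_rounds):
        # Step 1: put each value in the group whose center is nearest.
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Step 2: move each center to the average of its group.
        new_centers = [sum(g) / len(g) for g in groups]
        if new_centers == centers:  # nothing moved, so we are done
            break
        centers = new_centers
    return centers, groups


# Hypothetical average gift amounts in dollars, made up for illustration.
average_gifts = [10, 15, 20, 250, 300, 275]
centers, segments = two_means(average_gifts)
```

Notice that we never told the code which donors were "small" or "large" givers. There is no outcome to predict; the segments come out of the pattern in the data itself.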

If our objective in the legacy giving example is to classify a constituent as likely or unlikely to leave a legacy gift to the nonprofit, we would probably use the supervised learning technique of a classification model (we want to classify them as likely or unlikely). We would use data from prior constituents who have and have not made a legacy contribution to determine the weight to give age, years on file, and frequency, and build the model from there.
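
For the curious, here is one way that weighting step might look in code. This is a minimal sketch using logistic regression, which is just one of several classification techniques and not necessarily the one your analyst would pick. The six historical constituents are invented for illustration, and the function name fit_legacy_model is hypothetical.

```python
import math
from statistics import mean, pstdev

# Hypothetical history, made up for illustration:
# (age, years on file, gifts per year, made a legacy gift? 1 = yes, 0 = no)
history = [
    (78, 20, 6, 1), (70, 15, 4, 1), (66, 12, 5, 1),
    (45, 3, 1, 0), (38, 2, 1, 0), (52, 5, 2, 0),
]


def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_legacy_model(rows, learning_rate=0.1, rounds=2000):
    """Learn one weight per independent variable by gradient descent."""
    features = [row[:3] for row in rows]
    labels = [row[3] for row in rows]

    # Put age, years and frequency on a comparable scale before weighting them.
    columns = list(zip(*features))
    means = [mean(c) for c in columns]
    spreads = [pstdev(c) for c in columns]
    scaled = [[(v - m) / s for v, m, s in zip(f, means, spreads)] for f in features]

    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    n = len(rows)
    for _ in range(rounds):
        grad_w = [0.0, 0.0, 0.0]
        grad_b = 0.0
        for x, y in zip(scaled, labels):
            # How far off is the current guess for this constituent?
            error = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x))) - y
            grad_b += error
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
        # Nudge the weights in the direction that shrinks the error.
        bias -= learning_rate * grad_b / n
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]

    def probability(age, years_on_file, gifts_per_year):
        new = (age, years_on_file, gifts_per_year)
        x = [(v - m) / s for v, m, s in zip(new, means, spreads)]
        return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))

    return weights, probability


weights, probability = fit_legacy_model(history)

# Classify two new (hypothetical) constituents as likely or unlikely.
older_loyal_donor = "likely" if probability(80, 22, 6) >= 0.5 else "unlikely"
newer_young_donor = "likely" if probability(30, 1, 1) >= 0.5 else "unlikely"
```

The 0.5 cutoff is an assumption on my part. In practice you might set the bar higher or lower depending on how many prospects your planned giving team can realistically follow up with.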

Jargon, Smargon

You now know all about variables and types of analytical learning. File this away for the next time you are on “Who Wants to be a Millionaire?” or impress your colleagues with your multi-syllabic description of ‘age’ as a ‘continuous variable to be used in the supervised learning technique of classification modeling’.  Cheers.