Population Lifecycle Tracking

Using the Python data science toolkit to track trends in lifecycles of any group

Many processes in a population follow a lifecycle: a process with a beginning, a middle and an end that population members transition through. Examples include customer, biological, and equipment lifecycles. An equipment lifecycle, for instance, starts with the purchase, continues through operation and ends with the disposal of the equipment.

During the course of a lifecycle there are metrics that measure the state of the population members. For customer lifecycles it could be the number of orders; for equipment it could be temperature. These metrics can vary in a practically unlimited number of patterns. While it can be interesting to examine the journeys of individual population members, the most powerful insights come from seeing the patterns of movement in the population as a whole.

Discovering patterns among the lifecycles of all of the members of the population can be challenging. One challenge is that the cycles start and end at different times and have different durations. Therefore, comparing members of the population at a particular point of time isn’t helpful because members will be at different stages in their lifecycle. A system is needed to align all of the lifecycles so that they can be compared in a uniform way.

This story looks at a method of comparing the metrics of population members across their lifecycle regardless of start times, end times and durations. This example illustrates tracking customer activity on a car review website. Customers come to the website to research cars. They are active for a period during the buying process then disappear once they have made a purchase. Although this example is focused on customer activity, the same methods can be used to analyze the members of any population.

Each of these customers has an activity value for each day during their lifecycle. The patterns are quite erratic due to random variation so it is difficult to see any patterns across all of the lifecycles. The first step in comparing the lifecycles is to align the start times and end times.

The start times and end times are aligned using standardization. Each lifecycle will start at time 0 and end at time 1. Then customers can be compared at corresponding times in their lifecycles. A single standardization function can do this regardless of whether the granularity of the data is days, hours, minutes or seconds.
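As a minimal sketch of this standardization, the function below rescales each customer's timestamps to the range 0 to 1 with pandas. The function name, the column names (`customer_id`, `timestamp`, `activity`, `lifecycle_time`) and the sample rows are assumptions made for illustration, not the original code or data.

```python
import pandas as pd


def standardize_lifecycle(df, id_col="customer_id", time_col="timestamp"):
    """Add a `lifecycle_time` column that runs from 0 to 1 for each member."""
    out = df.copy()
    t = pd.to_datetime(out[time_col])
    grouped = t.groupby(out[id_col])
    start = grouped.transform("min")  # first observation per member
    end = grouped.transform("max")    # last observation per member
    # Work in seconds so that any granularity reduces to a simple ratio.
    elapsed = (t - start).dt.total_seconds()
    span = (end - start).dt.total_seconds()
    # A member with a single observation has span 0 and ends up as NaN.
    out["lifecycle_time"] = elapsed / span
    return out


# Hypothetical activity log: customer "a" is recorded daily, "b" hourly.
activity = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b", "b"],
    "timestamp": [
        "2021-01-01 00:00", "2021-01-02 00:00", "2021-01-03 00:00",
        "2021-03-05 08:00", "2021-03-05 09:00", "2021-03-05 12:00",
    ],
    "activity": [1, 4, 2, 3, 5, 0],
})
standardized = standardize_lifecycle(activity)
print(standardized[["customer_id", "lifecycle_time", "activity"]])
```

Because the elapsed time is divided by each customer's own span, a two-day lifecycle and a four-hour lifecycle both land on the same 0 to 1 scale.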

The lifecycles are now all synchronized and have the same length which makes them easier to compare. Now that the lifecycle timelines are standardized, let’s look at how to generate a function that can accurately approximate the changes of activity over time.

In order to compare lifecycles, the activity value needs to be compared at any point along the timeline. One way to do this is to create a function that will closely approximate the activity values for an individual customer. The activity values are highly variable and a single linear or quadratic function won’t be able to fit the data closely. Cubic splines can be used in order to generate a function flexible enough to match the original data.

Splines approximate the shape of even highly variable data sets. They divide the x-axis into multiple intervals in which they fit a polynomial. Since each polynomial has to fit only a small chunk of the entire range it can match the data closely. Here is a graph that shows two splines fit to a data set.

The yellow line represents a linear spline. This will fit each region of data with a straight line. The linear spline exactly fits the data here, but won’t do a great job of interpolating the values between real data points. For example, the linear spline is flat for the region between 7 and 8, but the true data values are likely above this flat region.

The dashed green line represents a cubic spline. Cubic splines are fit with polynomials of degree 3 so they can more closely model the curves of complex functions. The cubic spline in the graph shows a much smoother approximation of the data.
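The difference between the two splines can be reproduced with a short sketch. It assumes NumPy's `np.interp` for the linear spline and SciPy's `CubicSpline` for the cubic one. The data points are made up for illustration (with equal values at 7 and 8 so the linear spline is flat there) and are not the values from the graph.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical data points, not the values from the original graph.
x = np.arange(10, dtype=float)
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0, 8.0, 6.0])

# Dense grid on which both splines are evaluated (step of 0.1).
x_fine = np.linspace(x.min(), x.max(), 91)

# Linear spline: a straight segment between each pair of neighbouring points.
y_linear = np.interp(x_fine, x, y)

# Cubic spline: one degree-3 polynomial per interval, joined smoothly.
cubic = CubicSpline(x, y)
y_cubic = cubic(x_fine)

# Both pass through the data points, but they differ between them.
midpoint = 7.5
linear_mid = float(np.interp(midpoint, x, y))  # flat segment between 7 and 8
cubic_mid = float(cubic(midpoint))
print(f"linear at {midpoint}: {linear_mid:.3f}, cubic at {midpoint}: {cubic_mid:.3f}")
```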

With that simple sanity check complete, the next step is to see how a cubic spline fits a real customer’s activity. The activity for the customer below has already been standardized to the timeline 0 to 1. A cubic spline was fit using 20 interpolated points. The spline fits the activity values closely even though only a small number of interpolated points was used. A higher number of interpolated points would fit the activity values even more smoothly.

Spline fitted to activity lifecycle

Once a spline is fitted for a single customer, it is a simple matter to expand this to all customers. Then for each customer in our population, we have a spline that can interpolate values.
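One way to extend the fit to the whole population is sketched below. It assumes each customer's timeline is already standardized to 0 to 1 and strictly increasing, and it reads "20 interpolated points" as evaluating each spline on a shared grid of 20 evenly spaced points, which is an interpretation rather than something the text confirms. The function name, the `lifecycles` dictionary layout and the generated data are hypothetical.

```python
import numpy as np
from scipy.interpolate import CubicSpline


def interpolate_population(lifecycles, n_points=20):
    """Evaluate one cubic spline per member on a shared 0-to-1 grid.

    `lifecycles` maps a member id to (times, values), where times are
    already standardized to [0, 1] and strictly increasing.
    """
    grid = np.linspace(0.0, 1.0, n_points)
    ids = sorted(lifecycles)
    curves = np.empty((len(ids), n_points))
    for row, member_id in enumerate(ids):
        times, values = lifecycles[member_id]
        spline = CubicSpline(times, values)  # passes through every observation
        curves[row] = spline(grid)           # sample at the shared grid points
    return ids, grid, curves


# Hypothetical customers with different numbers of observations.
rng = np.random.default_rng(0)
lifecycles = {}
for member_id, n_obs in [("a", 30), ("b", 45), ("c", 12)]:
    times = np.linspace(0.0, 1.0, n_obs)
    values = 10 * np.sin(np.pi * times) + rng.normal(0, 1, n_obs)
    lifecycles[member_id] = (times, values)

ids, grid, curves = interpolate_population(lifecycles)
print(curves.shape)  # one row per customer, one column per grid point
```

Putting every customer on the same grid yields a plain two-dimensional array, which makes column-wise statistics across the population straightforward.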

Now we’re ready to examine how the population as a whole moves through its lifecycle. We want to find the most common path that a typical account takes. One way to do this is to take the quartile values at each interpolated point along the timeline. This tracks how the upper-quartile, median and lower-quartile members of the customer population are moving. By analyzing the movements of the quartiles of the population, the randomness of individual customers is smoothed out and we can see how the whole population behaves.
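The quartile step can be sketched with `np.percentile`, taken column by column over a matrix that has one row per customer and one column per point on the standardized timeline. The function name and the small matrix below are hypothetical and chosen so the result is easy to check by hand.

```python
import numpy as np


def population_quartiles(curves):
    """Return the 25th, 50th and 75th percentile at each timeline point."""
    # axis=0 computes the percentiles across customers, per column.
    q25, q50, q75 = np.percentile(curves, [25, 50, 75], axis=0)
    return q25, q50, q75


# Hypothetical matrix: 5 customers (rows) by 3 timeline points (columns).
curves = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 2.0, 1.0],
    [2.0, 4.0, 2.0],
    [3.0, 6.0, 3.0],
    [4.0, 8.0, 4.0],
])
q25, q50, q75 = population_quartiles(curves)
print(q25, q50, q75)
```

Plotting `q25`, `q50` and `q75` against the timeline gives the three quartile lines for the population.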

This chart shows the movement of the quartiles in the generated activity data. The 50th percentile line shows that the typical account increases its activity until halfway through its lifecycle and then gradually decreases its activity down to zero in the second half of the lifecycle. The 25th and 75th percentile show the same patterns with lower and higher peaks respectively.

Knowing typical activity patterns helps us serve our customers on the car research site. We can customize their experiences based on their activity levels on the site. While their activity is increasing they are still exploring options and we can guide them to lesser-known models that they may be interested in. Once their activity starts to decrease, they are narrowing down their options and looking in detail at a few models. At this point in their journey we could help them with tips on how to find and negotiate the best deal.

Tracking a population’s movement through a lifecycle is applicable in many domains beyond just tracking customer activity. It can be applied in any area where population members go through a cycle with offset beginnings and ends and variable duration.
