TV Labs Performance Score
Motivation
The TV Labs Performance Score (TVPS) evaluates the performance of smart TV apps through the lens of a customer. To determine this score, we compare relevant metrics with industry benchmarks and adjust for devices that naturally perform slower. The result is a single numerical value that quickly tells us how well an app is performing.
How the Performance Score is Calculated
The Performance Score provides a quick look at how well smart TV apps are functioning. In a nutshell, this is how we calculate the TV Labs Performance Score:
Market Share: Every type of TV (different brands or models, for example) has its own level of popularity. Some are more common in homes than others. We begin by determining how widespread each TV model is compared to the rest. Scores from more common TVs are weighted more heavily because those TVs represent a larger share of the market.
Aggregated Scores: For every TV, we measure common metrics, such as how quickly an app starts, and compare each result to the entire set of measurements taken within the last 30 days for the same make and model. Based on this comparison, we assign points. More critical metrics carry a higher point value. We then total these points.
Diminishing Returns and the Law of Little Gains: Imagine an already fast app becoming a tad quicker; the change might be barely noticeable. A slower app that improves significantly, however, is easy to notice. We adjust the scores to account for this difference.
Final Score: After all the evaluations and tweaks, we convert the total points into a percentage score out of 100. This final number offers a clear indication of the TV app's overall performance.
In-Depth Description
If you want to learn more, the following is a more detailed description of these calculations:
Normalized Market Share:
For each device, the market share is calculated as a fraction of the total market share of all devices.
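The normalization described above can be sketched in a few lines of Python. This is only an illustration: the function name, the device names, and the share figures are hypothetical and do not come from TV Labs.

```python
def normalize_market_share(raw_shares):
    """Scale raw market shares so they sum to 1.0 across all devices."""
    total = sum(raw_shares.values())
    if total <= 0:
        raise ValueError("total market share must be positive")
    return {device: share / total for device, share in raw_shares.items()}


# Hypothetical raw shares for three made-up devices (any consistent unit works).
raw = {"device_a": 30.0, "device_b": 15.0, "device_c": 5.0}
normalized = normalize_market_share(raw)

for device, share in normalized.items():
    print(f"{device}: {share:.2f}")
```

Because each share is divided by the total, the normalized values always sum to 1, so they can be used directly as per-device weights.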
Individual Measurements:
To determine a metric's value, we repeatedly run the same test between 10 and 50 times. The value is the median of all these runs. This way we can adjust for both outliers and small variations between tests. Tests usually run in the early morning.
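To see why the median is a reasonable choice here, consider the following minimal sketch. The function name and the sample timings (in milliseconds) are made up for illustration; one run is deliberately an outlier.

```python
from statistics import mean, median


def metric_value(run_measurements):
    """Return the median of repeated test runs for one metric on one device."""
    if not run_measurements:
        raise ValueError("at least one measurement is required")
    return median(run_measurements)


# Ten hypothetical app start times in milliseconds; 4800 is an outlier.
runs = [1200, 1180, 1250, 1210, 1190, 4800, 1230, 1205, 1195, 1220]

print("median:", metric_value(runs))
print("mean:  ", mean(runs))
```

In this sample the single slow run pulls the mean well above the typical run, while the median stays close to it.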
Aggregated Performance Score:
Each metric's value is first normalized with respect to a baseline.
After normalization, the metrics are weighted.[1]
The aggregated score for a device is a sum of the weighted metric scores, which is then multiplied by the device's normalized market share.
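The weighting and aggregation steps above can be sketched as follows. The sketch starts from metric scores that are already normalized, because the exact normalization against the baseline is not spelled out here. The function name and the sample scores are hypothetical; the weights of 0.5 for `video_start_time` and `app_start_time` are the ones given in the footnote.

```python
def device_score(normalized_metrics, weights, market_share):
    """Weighted sum of normalized metric scores, scaled by market share."""
    weighted_sum = sum(
        weights[name] * score for name, score in normalized_metrics.items()
    )
    return weighted_sum * market_share


# Weights as stated in the footnote: both metrics count equally.
WEIGHTS = {"video_start_time": 0.5, "app_start_time": 0.5}

# Hypothetical normalized scores (assumed to lie between 0 and 1) for one device.
scores = {"video_start_time": 0.8, "app_start_time": 0.6}

# Hypothetical normalized market share of 0.6 for this device.
contribution = device_score(scores, WEIGHTS, market_share=0.6)
print(f"{contribution:.2f}")
```

Here the weighted sum is 0.5 × 0.8 + 0.5 × 0.6 = 0.7, and scaling by the market share of 0.6 gives a contribution of 0.42.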
Diminishing Returns:
The aggregated score is adjusted to account for diminishing returns: as performance metrics improve, the reward for further improvement shrinks. This is done with a formula known as a gamma curve.
By default, the gamma value is set to 2.0.
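As an illustration of the diminishing-returns adjustment, the sketch below assumes the conventional power-law form of a gamma curve, `score ** (1 / gamma)`, applied to a score between 0 and 1. That form and the function name are assumptions made for this example; only the default gamma of 2.0 is taken from the text.

```python
def apply_gamma(score, gamma=2.0):
    """Apply an assumed power-law gamma curve to a score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return score ** (1.0 / gamma)


# The same raw improvement of 0.1 is worth more at the low end of the scale.
low_gain = apply_gamma(0.3) - apply_gamma(0.2)
high_gain = apply_gamma(0.9) - apply_gamma(0.8)

print(f"gain from 0.2 to 0.3: {low_gain:.3f}")
print(f"gain from 0.8 to 0.9: {high_gain:.3f}")
```

With gamma above 1 the curve is concave, so an app that is already scoring well gains less from the same raw improvement than an app that starts from a low score.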
Final Performance Score:
The adjusted score is then converted into a percentage for a clearer representation.
Putting it all together, the TVPS (TV Labs Performance Score) is calculated from the following quantities, where $d$ refers to devices and $m$ to metrics:
- $S_d$: Normalized market share for device $d$.
- $B_{d,m}$: Baseline for device $d$ and metric $m$. The baseline is defined as the 90th percentile of measurements taken during the last 30 days.
- $M_{d,m}$: Metric value for device $d$ and metric $m$, calculated as the median of all the performance measurements taken on a specific day.
- $\mathrm{TVPS}$: Performance Score in %.
- $\gamma$: Correction factor for diminishing returns, which is set to 2.0.
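The steps described in this section can be combined into a single expression. The following is one reading that is consistent with those steps, not necessarily the exact published formula. It assumes that the gamma curve has the conventional power-law form with exponent $1/\gamma$, that it is applied to the score summed over all devices, and that normalization produces a score $n_{d,m}$ between 0 and 1 computed from the metric value and the baseline for device $d$ and metric $m$ (the exact normalization is not specified here). The names $w_m$ (weight of metric $m$) and $n_{d,m}$ are chosen for this sketch, and $S_d$ stands for the normalized market share of device $d$.

```latex
\mathrm{TVPS} = 100 \times \left( \sum_{d} S_d \sum_{m} w_m \, n_{d,m} \right)^{1/\gamma},
\qquad \sum_{d} S_d = 1, \quad \gamma = 2.0
```

Under these assumptions, an app whose normalized scores are all 1 on every device reaches a TVPS of 100, because the market shares sum to 1 and the two current weights of 0.5 also sum to 1.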
Why this Approach?
- Customizability: By using predefined weights and baselines for each device, we can prioritize specific metrics and adjust for unique device characteristics.
- Market Share Importance: Metrics from devices with a larger market share will have a more significant impact on the overall score, ensuring market relevance.
- Diminishing Returns: This emphasizes significant improvements in poorly performing metrics and avoids excessive rewards for minor enhancements in already optimal metrics.
- Relative Performance: Metrics are interpreted relative to typical device performance (via the baseline), giving users a contextual view rather than just absolute values.
[1] For now, only the following two metrics are used, each with a weight of 0.5 (indicating equal importance): video_start_time and app_start_time. Future versions of the TVPS might include other metrics and weights.