1. Sign up for a Timewise Analytics account, to do this simply login with your Google Account here and make a note of your subscription key.
2. Install the Timewise Analytics Google Sheets™ Addon from Google Marketplace here
3. In Google Sheets™, click on:
Extensions->Timewise Analytics->Enter Subscription Key
in the menu
and enter your subscription key
Installation and Setup demo
A timeseries handle is a reference to timeseries data containing a list of dates and values. They look like this:
{twHandle_22029477793476544593}
The advantage of timeseries handles is that timeseries data can be manipulated and controlled while referencing only a single cell in a spreadsheet.
For example n a very simple case, timeseries handles can be added, subtracted, multiplied together and in doing so a new timeseries handle will be created.
twDescribe
will output the contents of the handle to the spreadsheet.
Timeseries handles can be charted using the menus under the Extensions menu in Google Sheets™, supported charts are:
A collapsed timeseries handle is a reference to a collapsed timeseries. They look like this:
{twCHandle_52029477793476554593}
Collapsed timeseries are created when the period of a timeseries is changed. For example, if you have a daily timeseries of sales data you may want to summarise this data monthly.
This can be done using the twCollapse
function which will return a twCHandle
containing the summarized data.
twDescribeCollapsedHandle
will output the contents of the handle to the spreadsheet.
This handle can be used to chart a Box Plot displaying a summary of the median, min, max and inter quartile range.
A histogram handle is a reference to a histogram. They look like this:
{twHistHandle_7214105611774397050}
This handle is created using the twHistogram
function.
twDescribeHistogram
will output the contents of the handle to the spreadsheet.
This handle can be used to chart a Histogram showing the distribution of values in the timeseries.
A PCA handle is a reference to a Principle Component Analysis. They look like this:
{twPCAHandle_7214105611774397050}
This handle is created using the twPCA
function.
twDescribePCA
will output the contents of the handle to the spreadsheet.
This handle can be used to chart the first two eigen vectors captured by the principle component analysis.
Handles demo
One of the most useful applications of timeseries is to be able to quickly create, manipulate and then chart data using only single cells. Timewise Analytics provides a menu to allow many types of charts to be instantly generated by simply clicking on one or more timeseries handles and clicking on the appropriate chart function in the menu.
The following charts are available
Charting demo
plots timeseries as a line chart
Timeseries demo
plots the difference between two timeseries as a line chart
Spread demo
plots the points of two timeseries as a scatter plot with the line of best fit and displays the correlation and R2
Regression demo
plots the min, max, median and inter-quantile range as a box plot chart based on the selected collapsed timeseries handle
Box Plot demo
plots the bins of a histogram as a bar chart from the selected histogram handle
Histogram demo
plots the most significant 2 eigen vectors as a scatter chart based on the selected PCA handle
PCA demo
General Functions demo
creates a timeseries from the passed range of dates and values
twCreate demo
returns the time series data as a range
twDescribe demo
returns the dates from the timeseries
twDates demo
returns the values from the timeseries
twValues demo
returns the last value from the timeseries
twLast demo
returns the last value from the timeseries to the spreadsheet
twFirst demo
returns the value from the timeseries based on the passed index
twValue demo
returns the number of points in the timeseries
twCount demo
returns a lagged version of the timeseries
In time series analysis, a lag refers to the time between two observations in a time series. More specifically, the lag refers to the number of time intervals between two consecutive observations, usually measured in units of time such as hours, days, or months.
Lag is an important concept in time series analysis because it can affect the autocorrelation of the data, which is the degree to which the values in the time series are correlated with their own past values at different time lags.
For example, if we have a daily time series of stock prices, we can compute the autocorrelation between each day's price and the price on the previous day (lag 1), the price two days ago (lag 2), and so on. The autocorrelation at each lag can help us to identify any patterns or trends in the data.
Lags are also important in time series forecasting, as they can be used to model the relationships between the current value of the time series and its past values at different lags. Lagged values can be included as explanatory variables in regression models or used to create time series models, such as autoregressive integrated moving average (ARIMA) models.
twLag demo
returns a leading version of the timeseries
In time series analysis, a lead refers to the time between two observations in a time series in the future direction. More specifically, the lead refers to the number of time intervals between two consecutive observations, usually measured in units of time such as hours, days, or months, in the future direction.
For example, if we have a monthly time series of sales data, we can compute the lead between each month's sales and the sales in the following month (lead 1), the sales two months ahead (lead 2), and so on. The leads can help us to analyze the future trends in the data and forecast future values of the time series.
Leads are important in time series forecasting, as they can be used to model the relationships between the current value of the time series and its future values at different leads. Lead values can be included as explanatory variables in regression models or used to create time series models, such as Vector Autoregressive (VAR) models.
In summary, lags refer to the past direction, and leads refer to the future direction in time series analysis. Both lags and leads can be used to model the temporal relationships between variables and forecast future values.
twLead demo
aligns the dates and points of multiple timeseries removing any points not present in all timeseries
twAlign demo
removes spikes from the timeseries using the passed window and standard deviations
In time series analysis, despiking refers to the process of removing outliers or spurious data points from a time series dataset that can distort the analysis or modeling of the data.
Outliers can arise due to measurement errors, data entry errors, or other anomalies that are not representative of the true underlying process. These outliers can lead to incorrect conclusions or predictions if they are not properly identified and removed.
twDespike demo
expands the points of a timeseries to align with the points of another using roll-forward
twExpand demo
returns a timeseries containing the 1 step changes from the passed timeseries
twChange demo
Math demo
adds two timeseries or adds a constant to a timeseries
twAdd demo
subtract two timeseries
twSubtract demo
multiplies two timeseries or multplies a timeseries with a constant
twMultiply demo
divides two timeseries or divides a timeseries by a constant
twDivide demo
returns the absolute values of a timeseries
twAbsolute demo
Stats demo
calculate the mean of a timeseries
In statistics, the mean is a measure of central tendency that represents the average value of a dataset. It is also known as the arithmetic mean or simply the average.
The mean is calculated by summing all the values in a dataset and then dividing the sum by the number of values in the dataset. Mathematically, the formula for the mean is:
The mean is a commonly used measure of central tendency because it takes into account all the values in the dataset. However, it can be affected by extreme values or outliers, which may not be representative of the typical values in the dataset. In such cases, other measures of central tendency such as the median or mode may be more appropriate.
twMean demo
calculate the min of a timeseries
twMin demo
calculate the max of a timeseries
twMax demo
returns the index of the minimum value in the timeseries
twMinIndex demo
returns the index of the maximum value in the timeseries
twMaxIndex demo
returns the date of the minimum value in the timeseries
twMinDate demo
returns the date of the maximum value in the timeseries
twMaxDate demo
calculate the sum of a timeseries
twSum demo
calculate the median of a timeseries
In statistics, the median is a measure of central tendency that represents the value in a dataset that separates the lower half from the upper half of the values.
More precisely, the median is the middle value of a sorted dataset, or the average of the two middle values if the dataset has an even number of values.
For example, consider the following dataset of 7 values: {4, 5, 2, 8, 6, 1, 9}. To find the median, we first sort the values in ascending order: {1, 2, 4, 5, 6, 8, 9}. The median is the middle value, which in this case is 5.
The median is a useful measure of central tendency when the dataset contains extreme values or outliers that may skew the mean (average). In contrast to the mean, the median is not affected by extreme values and provides a more robust measure of central tendency in such cases.
The median is also commonly used in non-parametric statistical tests, such as the Mann-Whitney U test and the Wilcoxon signed-rank test, which do not assume a normal distribution of the data.
twMedian demo
calculate the variance of a timeseries
In statistics, variance is a measure of how spread out a set of data is around its mean (average) value. Specifically, it measures the average of the squared differences of each data point from the mean.
The variance tells us how much the data is dispersed from the mean value. If the variance is large, it means that the data is more spread out, whereas if the variance is small, it means that the data is more tightly clustered around the mean.
Variance is an important concept in many statistical applications, including hypothesis testing, confidence intervals, and regression analysis. The square root of the variance is the standard deviation, which is another commonly used measure of variability.
twVariance demo
calculate the standard deviation of a timeseries
In statistics, the standard deviation is a measure of the spread or dispersion of a set of data values from their mean (average) value. It is a widely used measure of variability or diversity in data.
The standard deviation is defined as the square root of the variance, which is the average of the squared differences between each data point and the mean. Mathematically, the standard deviation can be expressed as:
The standard deviation tells us how much the data is spread out from the mean value. If the standard deviation is large, it means that the data is more spread out, whereas if the standard deviation is small, it means that the data is more tightly clustered around the mean.
The standard deviation is commonly used in many statistical calculations, including the calculation of confidence intervals, hypothesis testing, and the computation of z-scores and t-scores in statistical inference.
twStandardDeviation demo
calculate the skew of a timeseries
Skewness is a measure of the asymmetry of a probability distribution. It measures the degree to which the values in a distribution are skewed to one side or the other of the mean.
A distribution that is symmetric around the mean has zero skewness. Positive skewness occurs when the tail of the distribution is longer on the positive side than on the negative side, while negative skewness occurs when the tail is longer on the negative side than on the positive side.
Skewness is typically measured using the third central moment of a distribution, which quantifies how much the data deviate from the mean. The most commonly used measure of skewness is the sample skewness, which is a standardized measure that is based on the sample mean, sample standard deviation, and sample size.
twSkew demo
calculate the kurtosis of a timeseries
Kurtosis is a statistical measure that describes the shape of a probability distribution. Specifically, it measures the "peakedness" and "tail-heaviness" of a distribution relative to a normal distribution.
A distribution with positive kurtosis has a higher peak and heavier tails than a normal distribution, while a distribution with negative kurtosis has a lower peak and lighter tails. A distribution with zero kurtosis is said to be mesokurtic, meaning it has the same peak and tail behavior as a normal distribution.
The formula for kurtosis involves the fourth central moment of a distribution, which measures the spread of the data around the mean. The most commonly used measure of kurtosis is the excess kurtosis, which subtracts 3 from the kurtosis of a normal distribution so that a normal distribution has a kurtosis of zero.
twKurtosis demo
calculate the autocorrelation of a timeseries with lag 1
Autocorrelation, also known as serial correlation, is a statistical measure that indicates the degree to which the values of a time series are correlated with their own past values at different time lags.
In other words, autocorrelation measures the similarity between a given value in a time series and the previous values in the series. A high degree of autocorrelation indicates that the time series is not random and that the values in the series are related to each other over time.
Autocorrelation is often measured using a correlogram or autocorrelation function (ACF) plot, which shows the correlation between a time series and its own lagged values. The ACF plot displays the autocorrelation coefficient (or correlation coefficient) on the vertical axis and the time lag on the horizontal axis.
Autocorrelation is important in time series analysis and forecasting because it can affect the accuracy of predictions. If a time series exhibits strong autocorrelation, it means that the past values of the series can be used to predict future values, and it may be necessary to include lagged values of the series in a forecasting model. Conversely, if a time series has low or no autocorrelation, it may be more difficult to predict future values based on past values.
twAutocorrelation demo
calculate the covariance of two timeseries
Covariance is a measure of how two variables are related to each other. Specifically, it measures the degree to which two variables move together, or co-vary, over time or across observations.
A positive covariance indicates that when one variable is above its mean value, the other variable tends to be above its mean value as well. A negative covariance indicates that when one variable is above its mean value, the other variable tends to be below its mean value. A covariance of zero indicates that the two variables are not related.
Covariance is often used in finance and portfolio management to measure the relationship between the returns of different assets. It is also used in regression analysis to estimate the coefficients of a linear regression model. However, covariance has some limitations, and its magnitude depends on the scales of the two variables being compared, which can make it difficult to interpret. Therefore, the standardized version of covariance, called correlation, is often preferred.
twCovariance demo
calculate the correlation of two timeseries
Correlation is a statistical measure that indicates the strength and direction of the relationship between two variables. Specifically, it measures how closely the values of two variables are related to each other.
Correlation is often measured using Pearson's correlation coefficient, which is a standardized measure that ranges from -1 to +1. A correlation coefficient of +1 indicates a perfect positive correlation, meaning that when one variable increases, the other variable increases in proportion. A correlation coefficient of -1 indicates a perfect negative correlation, meaning that when one variable increases, the other variable decreases in proportion. A correlation coefficient of 0 indicates no correlation, meaning that the variables are not related.
Correlation is widely used in many fields, including finance, social sciences, and engineering, to analyze the relationships between variables and to make predictions about future outcomes. However, it is important to note that correlation does not necessarily imply causation, and other factors may be influencing the relationship between the variables.
twCorrelation demo
calculate the best-fit linear regression coefficients of two timeseries
Linear regression is a statistical technique used to model the relationship between a dependent variable (also known as the response variable) and one or more independent variables (also known as explanatory variables or predictors) using a straight line.
The goal of linear regression is to estimate the values of the coefficients β0, β1, β2, ..., βn that minimize the sum of the squared errors between the predicted values of the dependent variable and the actual values of the dependent variable.
Linear regression is commonly used in many fields, including finance, economics, marketing, and social sciences, to analyze the relationships between variables and to make predictions about future outcomes.
twRegression demo
perform principle component analysis on the range of timeseries
Principal Component Analysis (PCA) is a statistical technique used to reduce the dimensionality of a dataset while retaining as much of the variability in the data as possible. It is a commonly used technique in data science, machine learning, and statistics.
PCA works by transforming a dataset into a new set of variables, known as principal components, that are uncorrelated with each other and capture the maximum amount of variation in the original dataset. The first principal component captures the most variation, the second principal component captures the second most variation, and so on.
PCA is particularly useful for datasets with a large number of variables, as it can help to identify the most important variables and reduce the complexity of the data. It is also useful for visualizing high-dimensional data by projecting it onto a lower-dimensional space.
twPCA demo
collapses a timeseries to the passed frequency, eg. Daily data to monthly data, returns the mean, median, min, max and inter quantile range for the target frequency
twCollapse demo
returns a handle to a histogram of a timeseries. The number of buckets are decided using the Freedman–Diaconis rule
A histogram is a graphical representation of the distribution of a dataset, typically used to show the frequency of values in a given range of values.
A histogram consists of a set of rectangular bins, each representing a range of values, and the height of each bin represents the number of data points in that range. The bins are usually arranged side by side, with no gaps between them, to emphasize the continuity of the data.
The Freedman-Diaconis rule is a method for determining the number and width of the bins in a histogram. It is based on the interquartile range (IQR), which is the difference between the 75th percentile (Q3) and the 25th percentile (Q1) of the dataset.
The rule suggests that the bin width should be set to:
bin width = 2 * IQR / (n^(1/3))
where n is the sample size. The number of bins can then be determined by dividing the range of the data by the bin width and rounding to the nearest integer.
The Freedman-Diaconis rule is a useful way to determine the appropriate bin size for a histogram, as it takes into account the spread of the data and is less sensitive to outliers than other methods, such as Sturges' rule. However, it is important to note that the rule is only a guideline and may not always be optimal for all datasets.
twHistogram demo
Moving Stats demo
calculate a moving average of a timeseries
A moving average is a statistical technique used to smooth out fluctuations in a time series data by calculating the average of a rolling window of values. It is a commonly used method in finance, economics, and engineering.
The moving average is calculated by taking the average of a specified number of consecutive data points in a time series, and then moving the window by one data point and calculating the average again. This process is repeated for all data points in the time series.
For example, consider a time series of 12 monthly sales data: {10, 8, 12, 14, 16, 20, 22, 18, 16, 12, 10, 8}. To calculate a 3-month moving average, we would take the average of the first three data points (10, 8, 12), then the second three data points (8, 12, 14), and so on, until we have calculated the moving average for all 12 data points.
The moving average can help to smooth out the fluctuations in the data and highlight any trends or patterns that may be present. It can also be useful in forecasting future values of the time series by extrapolating the trend observed in the moving average.
twMovingAverage demo
calculate a moving min of a timeseries
twMovingMin demo
calculate a moving max of a timeseries
twMovingMax demo
calculate a moving median of a timeseries
twMovingMedian demo
calculate a moving variance of a timeseries
twMovingVariance demo
calculate a moving standard deviation of a timeseries
twMovingStandardDeviation demo