Friday, February 24, 2012

How to calculate cumulative rolling product on a 5 day window.

Here's what the data looks like:
REC# RETURN DESIRED RESULT
-- -- --
1 1.000859
2 0.999705
3 1.000471
4 0.994672
5 1.001945 -0.24%
6 0.999718 -0.35%
7 0.997262 -0.59%
8 0.997613 -0.88%
9 1.000932 -0.25%
10 0.999584 -0.49%
To calculate the DESIRED RESULT on REC# 5, the following calculation
is needed: (notice 5 days of returns are multiplied together)
((1.000859 * 0.999705 * 1.000471 * 0.994672 * 1.001945) - 1) * 100 = -0.24
In order to calculate the DESIRED RESULT for REC# 6 I use the same
calculation only using RETURN values from REC#'s 2 through 6.
DESIRED RESULT on REC# 7 uses RETURN values from REC#'s 3 through 7
It has to be a 5 day window of values that is why REC#'s 1-4 have no
DESIRED RESULT.
I'm trying to figure out how to do this without using a cursor....I
hate cursors.
Any ideas I've been thinking about if for a couple hours now and
haven't found a good solution yet.On Wed, 19 Sep 2007 19:45:46 -0000, Izzy wrote:
>Here's what the data looks like:
>REC# RETURN DESIRED RESULT
>-- -- --
> 1 1.000859
> 2 0.999705
> 3 1.000471
> 4 0.994672
> 5 1.001945 -0.24%
> 6 0.999718 -0.35%
> 7 0.997262 -0.59%
> 8 0.997613 -0.88%
> 9 1.000932 -0.25%
> 10 0.999584 -0.49%
>
>To calculate the DESIRED RESULT on REC# 5, the following calculation
>is needed: (notice 5 days of returns are multiplied together)
>((1.000859 * 0.999705 * 1.000471 * 0.994672 * 1.001945) - 1) * 100 =>-0.24
>
>In order to calculate the DESIRED RESULT for REC# 6 I use the same
>calculation only using RETURN values from REC#'s 2 through 6.
>
>DESIRED RESULT on REC# 7 uses RETURN values from REC#'s 3 through 7
>
>It has to be a 5 day window of values that is why REC#'s 1-4 have no
>DESIRED RESULT.
>
>I'm trying to figure out how to do this without using a cursor....I
>hate cursors.
>
>Any ideas I've been thinking about if for a couple hours now and
>haven't found a good solution yet.
Hi Izzy,
If the REC# column is guaranteed to hold consecutive numbers (i.e. no
gaps), you can use
SELECT a.RecNo, a."Return",
CASE
WHEN COUNT(b."Return") = 5
THEN (POWER(10.0, SUM(LOG10(b."Return"))) - 1) * 100
ELSE NULL
END AS Result
FROM YourTable AS a
INNER JOIN YourTable AS b
ON b.RecNo BETWEEN a.RecNo - 4 AND a.RecNo
GROUP BY a.RecNo, a."Return";
Note that I'm using POWER(10.0,SUM(LOG10(Value))) to mimic the aggregate
product function that SQL Server lacks. This technique works if all
values are >0, as in your sample. If values =0 or <0 are allwoed as
well, you'll have to use a more complicated expression to get the
aggregate product (Google for SQL Server product aggregate Itzik Ben-Gan
and you'll find it).
If RecNo values can have gaps, *and* you are on SQL Server 2005, you can
generate consecutive rownumbers by using ROW_NUMBER(). If you're still
on SQL Server 2000, you'll need some real nasty work - if you want me to
help with that, you'll have to post CREATE TABLE and SELECT statements
so that I can set up a test environment by using copy and paste.
--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis|||I suggest you read the thread:
microsoft.public.sqlserver.programming
Sunday, July 01, 2007 7:45 AM
TSQl-Accmulations
http://tinyurl.com/yvdqbc
Look especially at these 2 solutions in the thread (which assume S2005):
Steve Dassin
Monday, July 02, 2007 10:18 PM
Itzik Ben-Gan
Tuesday, July 03, 2007 5:16 AM
(But I think Itzik Ben-Gan also shows possible S2000 solutions).
In your case (given S2005) the row definition is:
ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
ie, exec SqlWindowCustomers @.from=-4,@.to=0
You can use Hugo's product aggregate to get what you want.
Easy solutions to problems like this is what Itzik Ben-Gan argued for in:
Itzik Ben-Gan and Sujata Mehta
OVER Clause and Ordered Calculations
http://www.insidetsql.com/OVER_Clause_and_Ordered_Calculations.doc
which follow the sql standard of an sql window that's quite simple and
powerful, especially for moving/running sum problems.
And here is the MS response regarding Katami (S2008):
OVER clause enhancement request - ORDER BY for aggregates
OVER clause enhancement request - ROWS and RANGE window sub-clauses
https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=254392
'This feature unfortunately did not fit into our schedule for SQL Server
2008.'
If Itzik Ben-Gan has so little pull what does that say for the rest
of us! Just how do they decide on what features to implement?
Who the hell is in charge there. Is anyone in charge? :-)
In this case I think it's fair to say the Connect(ion) is broken :-)
For more on this subject see:
http://beyondsql.blogspot.com/2007/07/dataphor-example-of-rapid-application.html
best,
www.beyondsql.blogspot.com|||On Sep 19, 4:46 pm, Hugo Kornelis
<h...@.perFact.REMOVETHIS.info.INVALID> wrote:
> On Wed, 19 Sep 2007 19:45:46 -0000, Izzy wrote:
> >Here's what the data looks like:
> >REC# RETURN DESIRED RESULT
> >-- -- --
> > 1 1.000859
> > 2 0.999705
> > 3 1.000471
> > 4 0.994672
> > 5 1.001945 -0.24%
> > 6 0.999718 -0.35%
> > 7 0.997262 -0.59%
> > 8 0.997613 -0.88%
> > 9 1.000932 -0.25%
> > 10 0.999584 -0.49%
> >To calculate the DESIRED RESULT on REC# 5, the following calculation
> >is needed: (notice 5 days of returns are multiplied together)
> >((1.000859 * 0.999705 * 1.000471 * 0.994672 * 1.001945) - 1) * 100 => >-0.24
> >In order to calculate the DESIRED RESULT for REC# 6 I use the same
> >calculation only using RETURN values from REC#'s 2 through 6.
> >DESIRED RESULT on REC# 7 uses RETURN values from REC#'s 3 through 7
> >It has to be a 5 day window of values that is why REC#'s 1-4 have no
> >DESIRED RESULT.
> >I'm trying to figure out how to do this without using a cursor....I
> >hate cursors.
> >Any ideas I've been thinking about if for a couple hours now and
> >haven't found a good solution yet.
> Hi Izzy,
> If the REC# column is guaranteed to hold consecutive numbers (i.e. no
> gaps), you can use
> SELECT a.RecNo, a."Return",
> CASE
> WHEN COUNT(b."Return") = 5
> THEN (POWER(10.0, SUM(LOG10(b."Return"))) - 1) * 100
> ELSE NULL
> END AS Result
> FROM YourTable AS a
> INNER JOIN YourTable AS b
> ON b.RecNo BETWEEN a.RecNo - 4 AND a.RecNo
> GROUP BY a.RecNo, a."Return";
> Note that I'm using POWER(10.0,SUM(LOG10(Value))) to mimic the aggregate
> product function that SQL Server lacks. This technique works if all
> values are >0, as in your sample. If values =0 or <0 are allwoed as
> well, you'll have to use a more complicated expression to get the
> aggregate product (Google for SQL Server product aggregate Itzik Ben-Gan
> and you'll find it).
> If RecNo values can have gaps, *and* you are on SQL Server 2005, you can
> generate consecutive rownumbers by using ROW_NUMBER(). If you're still
> on SQL Server 2000, you'll need some real nasty work - if you want me to
> help with that, you'll have to post CREATE TABLE and SELECT statements
> so that I can set up a test environment by using copy and paste.
> --
> Hugo Kornelis, SQL Server MVP
> My SQL Server blog:http://sqlblog.com/blogs/hugo_kornelis- Hide quoted text -
> - Show quoted text -
Thanks Hugo I think we are on the right path......but
When I run your query it returns only 0.0 for all rows.
RecNo is consecutive without gaps.
Return column is of type FLOAT.
I'm using SQL Server 2005
To set up a test:
CREATE TABLE #HIST_RETURNS
(
DAYNUM INT,
TOTAL_RETURN FLOAT
);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (1, 1.000859);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (2, 0.999705);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (3, 1.000471);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (4, 0.994672);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (5, 1.001945);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (6, 0.999718);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (7, 0.997262);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (8, 0.997613);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (9, 1.000932);
INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
VALUES (10, 0.999584);
SELECT A.DAYNUM, A.TOTAL_RETURN,
CASE
WHEN COUNT(B.TOTAL_RETURN) = 5
THEN (POWER(10.0, SUM(LOG10(B.TOTAL_RETURN))) - 1) * 100
ELSE NULL
END AS RESULT
FROM #HIST_RETURNS AS A
JOIN #HIST_RETURNS AS B ON B.DAYNUM BETWEEN A.DAYNUM - 4 AND
A.DAYNUM
GROUP BY A.DAYNUM, A.TOTAL_RETURN
ORDER BY A.DAYNUM;
DROP TABLE #HIST_RETURNS;
Thanks for the response.|||On Sep 20, 8:05 am, Izzy <israel.rich...@.gmail.com> wrote:
> On Sep 19, 4:46 pm, Hugo Kornelis
>
>
> <h...@.perFact.REMOVETHIS.info.INVALID> wrote:
> > On Wed, 19 Sep 2007 19:45:46 -0000, Izzy wrote:
> > >Here's what the data looks like:
> > >REC# RETURN DESIRED RESULT
> > >-- -- --
> > > 1 1.000859
> > > 2 0.999705
> > > 3 1.000471
> > > 4 0.994672
> > > 5 1.001945 -0.24%
> > > 6 0.999718 -0.35%
> > > 7 0.997262 -0.59%
> > > 8 0.997613 -0.88%
> > > 9 1.000932 -0.25%
> > > 10 0.999584 -0.49%
> > >To calculate the DESIRED RESULT on REC# 5, the following calculation
> > >is needed: (notice 5 days of returns are multiplied together)
> > >((1.000859 * 0.999705 * 1.000471 * 0.994672 * 1.001945) - 1) * 100 => > >-0.24
> > >In order to calculate the DESIRED RESULT for REC# 6 I use the same
> > >calculation only using RETURN values from REC#'s 2 through 6.
> > >DESIRED RESULT on REC# 7 uses RETURN values from REC#'s 3 through 7
> > >It has to be a 5 day window of values that is why REC#'s 1-4 have no
> > >DESIRED RESULT.
> > >I'm trying to figure out how to do this without using a cursor....I
> > >hate cursors.
> > >Any ideas I've been thinking about if for a couple hours now and
> > >haven't found a good solution yet.
> > Hi Izzy,
> > If the REC# column is guaranteed to hold consecutive numbers (i.e. no
> > gaps), you can use
> > SELECT a.RecNo, a."Return",
> > CASE
> > WHEN COUNT(b."Return") = 5
> > THEN (POWER(10.0, SUM(LOG10(b."Return"))) - 1) * 100
> > ELSE NULL
> > END AS Result
> > FROM YourTable AS a
> > INNER JOIN YourTable AS b
> > ON b.RecNo BETWEEN a.RecNo - 4 AND a.RecNo
> > GROUP BY a.RecNo, a."Return";
> > Note that I'm using POWER(10.0,SUM(LOG10(Value))) to mimic the aggregate
> > product function that SQL Server lacks. This technique works if all
> > values are >0, as in your sample. If values =0 or <0 are allwoed as
> > well, you'll have to use a more complicated expression to get the
> > aggregate product (Google for SQL Server product aggregate Itzik Ben-Gan
> > and you'll find it).
> > If RecNo values can have gaps, *and* you are on SQL Server 2005, you can
> > generate consecutive rownumbers by using ROW_NUMBER(). If you're still
> > on SQL Server 2000, you'll need some real nasty work - if you want me to
> > help with that, you'll have to post CREATE TABLE and SELECT statements
> > so that I can set up a test environment by using copy and paste.
> > --
> > Hugo Kornelis, SQL Server MVP
> > My SQL Server blog:http://sqlblog.com/blogs/hugo_kornelis-Hide quoted text -
> > - Show quoted text -
> Thanks Hugo I think we are on the right path......but
> When I run your query it returns only 0.0 for all rows.
> RecNo is consecutive without gaps.
> Return column is of type FLOAT.
> I'm using SQL Server 2005
> To set up a test:
> CREATE TABLE #HIST_RETURNS
> (
> DAYNUM INT,
> TOTAL_RETURN FLOAT
> );
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (1, 1.000859);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (2, 0.999705);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (3, 1.000471);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (4, 0.994672);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (5, 1.001945);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (6, 0.999718);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (7, 0.997262);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (8, 0.997613);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (9, 1.000932);
> INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> VALUES (10, 0.999584);
> SELECT A.DAYNUM, A.TOTAL_RETURN,
> CASE
> WHEN COUNT(B.TOTAL_RETURN) = 5
> THEN (POWER(10.0, SUM(LOG10(B.TOTAL_RETURN))) - 1) * 100
> ELSE NULL
> END AS RESULT
> FROM #HIST_RETURNS AS A
> JOIN #HIST_RETURNS AS B ON B.DAYNUM BETWEEN A.DAYNUM - 4 AND
> A.DAYNUM
> GROUP BY A.DAYNUM, A.TOTAL_RETURN
> ORDER BY A.DAYNUM;
> DROP TABLE #HIST_RETURNS;
> Thanks for the response.- Hide quoted text -
> - Show quoted text -
I think I figured it out, by adding more zero's to the "10.0" and
making it "10.00000".
SELECT A.DAYNUM, A.TOTAL_RETURN,
CASE
WHEN COUNT(B.TOTAL_RETURN) = 5
THEN (POWER(10.000000, SUM(LOG10(B.TOTAL_RETURN))) - 1) * 100
ELSE NULL
END AS RESULT
FROM #HIST_RETURNS AS A
JOIN #HIST_RETURNS AS B ON B.DAYNUM BETWEEN A.DAYNUM - 4 AND
A.DAYNUM
GROUP BY A.DAYNUM, A.TOTAL_RETURN
ORDER BY A.DAYNUM;|||On Sep 20, 8:17 am, Izzy <israel.rich...@.gmail.com> wrote:
> On Sep 20, 8:05 am, Izzy <israel.rich...@.gmail.com> wrote:
>
>
> > On Sep 19, 4:46 pm, Hugo Kornelis
> > <h...@.perFact.REMOVETHIS.info.INVALID> wrote:
> > > On Wed, 19 Sep 2007 19:45:46 -0000, Izzy wrote:
> > > >Here's what the data looks like:
> > > >REC# RETURN DESIRED RESULT
> > > >-- -- --
> > > > 1 1.000859
> > > > 2 0.999705
> > > > 3 1.000471
> > > > 4 0.994672
> > > > 5 1.001945 -0.24%
> > > > 6 0.999718 -0.35%
> > > > 7 0.997262 -0.59%
> > > > 8 0.997613 -0.88%
> > > > 9 1.000932 -0.25%
> > > > 10 0.999584 -0.49%
> > > >To calculate the DESIRED RESULT on REC# 5, the following calculation
> > > >is needed: (notice 5 days of returns are multiplied together)
> > > >((1.000859 * 0.999705 * 1.000471 * 0.994672 * 1.001945) - 1) * 100 => > > >-0.24
> > > >In order to calculate the DESIRED RESULT for REC# 6 I use the same
> > > >calculation only using RETURN values from REC#'s 2 through 6.
> > > >DESIRED RESULT on REC# 7 uses RETURN values from REC#'s 3 through 7
> > > >It has to be a 5 day window of values that is why REC#'s 1-4 have no
> > > >DESIRED RESULT.
> > > >I'm trying to figure out how to do this without using a cursor....I
> > > >hate cursors.
> > > >Any ideas I've been thinking about if for a couple hours now and
> > > >haven't found a good solution yet.
> > > Hi Izzy,
> > > If the REC# column is guaranteed to hold consecutive numbers (i.e. no
> > > gaps), you can use
> > > SELECT a.RecNo, a."Return",
> > > CASE
> > > WHEN COUNT(b."Return") = 5
> > > THEN (POWER(10.0, SUM(LOG10(b."Return"))) - 1) * 100
> > > ELSE NULL
> > > END AS Result
> > > FROM YourTable AS a
> > > INNER JOIN YourTable AS b
> > > ON b.RecNo BETWEEN a.RecNo - 4 AND a.RecNo
> > > GROUP BY a.RecNo, a."Return";
> > > Note that I'm using POWER(10.0,SUM(LOG10(Value))) to mimic the aggregate
> > > product function that SQL Server lacks. This technique works if all
> > > values are >0, as in your sample. If values =0 or <0 are allwoed as
> > > well, you'll have to use a more complicated expression to get the
> > > aggregate product (Google for SQL Server product aggregate Itzik Ben-Gan
> > > and you'll find it).
> > > If RecNo values can have gaps, *and* you are on SQL Server 2005, you can
> > > generate consecutive rownumbers by using ROW_NUMBER(). If you're still
> > > on SQL Server 2000, you'll need some real nasty work - if you want me to
> > > help with that, you'll have to post CREATE TABLE and SELECT statements
> > > so that I can set up a test environment by using copy and paste.
> > > --
> > > Hugo Kornelis, SQL Server MVP
> > > My SQL Server blog:http://sqlblog.com/blogs/hugo_kornelis-Hidequoted text -
> > > - Show quoted text -
> > Thanks Hugo I think we are on the right path......but
> > When I run your query it returns only 0.0 for all rows.
> > RecNo is consecutive without gaps.
> > Return column is of type FLOAT.
> > I'm using SQL Server 2005
> > To set up a test:
> > CREATE TABLE #HIST_RETURNS
> > (
> > DAYNUM INT,
> > TOTAL_RETURN FLOAT
> > );
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (1, 1.000859);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (2, 0.999705);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (3, 1.000471);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (4, 0.994672);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (5, 1.001945);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (6, 0.999718);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (7, 0.997262);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (8, 0.997613);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (9, 1.000932);
> > INSERT INTO #HIST_RETURNS (DAYNUM, TOTAL_RETURN)
> > VALUES (10, 0.999584);
> > SELECT A.DAYNUM, A.TOTAL_RETURN,
> > CASE
> > WHEN COUNT(B.TOTAL_RETURN) = 5
> > THEN (POWER(10.0, SUM(LOG10(B.TOTAL_RETURN))) - 1) * 100
> > ELSE NULL
> > END AS RESULT
> > FROM #HIST_RETURNS AS A
> > JOIN #HIST_RETURNS AS B ON B.DAYNUM BETWEEN A.DAYNUM - 4 AND
> > A.DAYNUM
> > GROUP BY A.DAYNUM, A.TOTAL_RETURN
> > ORDER BY A.DAYNUM;
> > DROP TABLE #HIST_RETURNS;
> > Thanks for the response.- Hide quoted text -
> > - Show quoted text -
> I think I figured it out, by adding more zero's to the "10.0" and
> making it "10.00000".
> SELECT A.DAYNUM, A.TOTAL_RETURN,
> CASE
> WHEN COUNT(B.TOTAL_RETURN) = 5
> THEN (POWER(10.000000, SUM(LOG10(B.TOTAL_RETURN))) - 1) * 100
> ELSE NULL
> END AS RESULT
> FROM #HIST_RETURNS AS A
> JOIN #HIST_RETURNS AS B ON B.DAYNUM BETWEEN A.DAYNUM - 4 AND
> A.DAYNUM
> GROUP BY A.DAYNUM, A.TOTAL_RETURN
> ORDER BY A.DAYNUM;- Hide quoted text -
> - Show quoted text -
Thanks a bunch for that post Hugo it was exactly what I needed.|||On 20 Sep 2007 06:05:35 -0700, Izzy wrote:
(snip)
>Thanks Hugo I think we are on the right path......but
>When I run your query it returns only 0.0 for all rows.
(...)
>To set up a test:
Hi Izzy,
Thanks for providing a repro. Interesting find!
I've seen your other post with the workaround, but I think I've found a
better way - replace 10.0 (or 10.00000) with 10e0. This results in more
precision in the results from the query:
SELECT A.DAYNUM, A.TOTAL_RETURN,
CASE
WHEN COUNT(B.TOTAL_RETURN) = 5
THEN (POWER(10e0, SUM(LOG10(B.TOTAL_RETURN))) - 1) * 100
ELSE NULL
END AS RESULT
FROM #HIST_RETURNS AS A
JOIN #HIST_RETURNS AS B
ON B.DAYNUM BETWEEN A.DAYNUM - 4 AND A.DAYNUM
GROUP BY A.DAYNUM, A.TOTAL_RETURN
ORDER BY A.DAYNUM;
My guess at the cause is that 10.0 is interpreted by the query engine as
DECIMAL(3,1). The other values are then converted from FLOAT to this
same datatype, causing you to lose quite a bit of precision. Your
workaround of 10.00000 reduced the precision loss, since you now use
DECIMAL(8,6). The 10e0 in my solution is considered to be FLOAT, so
there is no conversion required at all.
Another solution would be to explicitly write CAST(10 AS float)
--
Hugo Kornelis, SQL Server MVP
My SQL Server blog: http://sqlblog.com/blogs/hugo_kornelis

No comments:

Post a Comment