Monday, 14 February 2011

Build a pivot table from more than one Excel Worksheet

As you may know, all versions of excel before 2007 have a limit of 65536 rows in a single worksheet. If you have very large data sets that you need to generate into a pivot table this can be a very limiting factor. Lets say that you needed to analyse products over 20 weeks. To uniquely identify a product and week you would need to use one line per product per week. 65500 / 20 = 3275 products which is not really that many.

In order to get round this, you would normally have to set the source of the pivot table data to be something other than a worksheet and use Microsoft Query to get the data from a database or another workbook. If you do not have Microsoft Query installed there is a way to generate a single pivot table from multiple worksheets using VBA. A shout out goes to Jon von der Heyden on the mrExcel message boards for providing me with this script.

http://www.mrexcel.com/forum/showthread.php?t=528875

Public Sub ConsolidateAndPivot()
Dim rngCell As Range, strRange As String, rngData As Range
Dim pvc As PivotCache, pvt As PivotTable
Dim arrSheets As Variant, wks As Worksheet
Dim recData As ADOR.Recordset

Set recData = New ADOR.Recordset
arrSheets = Array("Sheet1", "Sheet2", "Sheet3")
strRange = "A:E"
Set pvt = Sheets(1).PivotTables(1)

For Each wks In Sheets(arrSheets)
Set rngData = Intersect(wks.Range(strRange), wks.UsedRange)
If wks.Name = arrSheets(0) Then
For Each rngCell In rngData.Resize(1)
With recData
With .Fields
Select Case TypeName(rngCell.Offset(1).Value)
Case "String"
.Append rngCell.Text, adVarChar, 255
Case "Date"
.Append rngCell.Text, adDate
Case "Double", "Currency"
.Append rngCell.Text, adDouble
End Select
End With
End With
Next rngCell
End If
If recData.State = 0 Then recData.Open
Set rngData = Intersect(rngData.Offset(1), wks.UsedRange)
For Each rngCell In rngData
With recData
If rngData(0, rngCell.Column).Text = .Fields(0).Name Then .AddNew
.Fields(rngData(0, rngCell.Column).Text).Value = rngCell.Value
.Update
End With
Next rngCell
Next wks

With Sheets.Add(Sheets(1))
Set pvc = ThisWorkbook.PivotCaches.Add(SourceType:=xlExternal)
Set pvc.Recordset = recData
With pvc.CreatePivotTable(TableDestination:=.Range("A1"))
pvt.CacheIndex = .CacheIndex
End With
Application.DisplayAlerts = False
.Delete
Application.DisplayAlerts = True
End With

recData.Close
Set recData = Nothing
End Sub


This builds a pivot cache from data in the worksheets named in the arrSheets array, using the columns defined in strRange. It will use the data to populate a pivot table already defined in Sheet(1). If you dont already have a pivot table or you want to add it to a new worksheet, you will need to create the sheet and table as part of the script.

WARNING! This may be slow and should only be used as a last resort. It is much more efficient to get the data from a database or text file or another workbook if possible.

Wednesday, 2 February 2011

using WSSI and OTB in a mail order environment

High street retailers set great store on the basis os their WSSI reports. WSSI stands for Weekly Sales & Stock Intake (pronounced Whizzy). OTB stands for Open to Buy (the amount of stock that can be bought)

On the face of it, it's a fiarly simple calculation:
Opening Stock + Receipts - Sales = Closing Stock.

The plan is that closing stock should be enough to cover X trading periods (days, weeks or months). If it is not enough then you have an Open to Buy position - the buyers are permitted to order more stock. This is fine in a bricks an mortar environment because if it is not in stock then it is normally a lost sale. In Mail Order, provided the delay is not too long, the sale will go into backorder (sometimes known as To-follow) and will be sent when stock becomes available. The issue with this is that the longer the item is on back order, the more of the potential sales will be cancelled. Mail order also has a high returns rate compared to high street and this needs to be factored in when considering how much stock is required for order. To add to the cmoplications, returns are delayed by a time period and so need to be correlated to sales from a previous period. If the item was out of stock then there will be no returns for the period + X.

Factor that in with the fact that the product may appear in a number of different catalogues and that most catalogues are front loaded on their demand (i.e. a high proportion of the sales will occur in the first few weeks - typically 60-80% in 1st 4 weeks then remainder in last 16-20 weeks) and the peaks and troughs in demand can swing wildly during a season.


I have built a spreadsheet in Excel 2003 that factors in returns, demand decay on back orders and stock in overlapping catalogues. If you think it might be useful to you, drop me a line and we can discuss it.

How to avoid: Vlookup returns zero if the target cell is blank

As much as I love VLOOKUP() is has one annoying habit of returning 0 if there is nothing in the target cell. This is particularly important if you need to validate between 0 and blank (for example you are building a forecast model that will take either the forecasted value or actual value - the actual value could be zero if there was no activity or blank if you have not yet reached that period).

The solution to this is complicated and so I will break it down in to its constituent parts.

1) In order to determine whether the target cell contains a numeric value we will use COUNTBLANK(). The problem with this is that COUNTBLANK() requires a cell reference or range (as opposed to COUNT() which just requires an array of values, which may or may not be cell references)

2) In order to get a cell reference we will use the CELL("address",[target cell]) syntax. This will return the reference in the format [sheet!][$ColID$Rownumber]. The problem is how to get the [target cell]. You can't use VLOOKUP() for this because it will only return the cell contents. instead we will use OFFSET(). You can't use the cell reference directly so we will wrap it in an INDIRECT().

3) OFFSET() will give us the cell n rows and m columns from the reference cell. We will know the reference cell because it will be the start of the VLOOKUP() range, but we don't know how far down the range it will be because the source value will be different for each search. To get round this we can use the MATCH() function.

4) MATCH() takes a source value and returns the number of cells in the range that had to be searched to find a match (position in the range array). If we limit the range to the 1st column in the VLOOKUP() then this will give us the row offset from our start point.

5) now to put it all together. Lets assume that the value we are looking for is in Sheet1!$B$1 and the value we want to return is in Sheet2, columnD. First we test to see if the cell is empty and if it is then we insert "", otherwise we insert the value of the cell (we can use VLOOKUP() for this as the syntax will be shorter)

=IF
(COUNTBLANK
(INDIRECT
(CELL
("address",OFFSET
(Sheet2!$B$1,MATCH(Sheet1!$B$1,Sheet2!$A:$A,0
)
,4
)
)
)=1
,""
,VLOOKUP
($B$1,sheet2!A:D,4,0),FALSE
)
)


In summary, this formula will check if the target cell is empty or not a number and if so return nothing ("") otherwise it will return the contents of the cell.