Organising the Provision
What do I mean by ‘Organising the Provision’? As our data can be complex, we need to look at how and when we provide the data. I would like to consider the following:
- Live reporting
- Scheduled reporting from the Live database
- Data Store
- Data Warehouse
- Other systems
This means asking the database for data and then displaying the results - a simple configuration, but has a down side. Each person requesting the report enforces the system to do all the processing each time the report is generated. All the queries and filters use the system’s memory and processor time. This use of resources means that the system may be starved of resources for short periods of time causing main software to “hiccup”.
Scheduled Reporting from Live
This is a specific time when the report is generated and ether emailed or stored on a shared file/web location. Alternatively, if a report is generated, then a cached version is created which lasts for a specific time, and any more requests get a copy of the original rather than making a Live query. Scheduling can mean that processing is kept to a minimum and so the host system has less impact.
My concept of a Data Store is a repository of data collected from various places. The data has had much of the recombining (de-normalising) completed whilst copying the original data. It is easier, and much quicker to create and process reports. The processing of data is scheduled, usually overnight. In general calculations, the filters are not applied to the data store and are completed by the reporting tool.
A data warehouse, like a data store, is a processed repository for data. Whilst the data store is designed to simplify the data from the original databases, the data warehouse is designed with ‘reports’ in mind. If you want to report on attendance by department, you can create tables that hold the departments and the number of marks of each type for each class, for each day.
SSIS: SQL Server Integration Services
This was designed for reporting using pivot tables. It is a highly efficient method of delivering calculated data. As virtually all the calculations are pre-processed, producing good quality accurate reports is very fast, although a large data storage facility is required.
There are so many other methods of provisioning data for reporting. I will just mention our integration engine - System Integration Manager (SIM). SIM has the facility to manage the processing of data, such as executing a store procedure on a schedule. This data can then be pushed to a place where the results can be reported on - for instance a nightly export of students’ attendance rates by student/week/module to an excel spreadsheet every night.