This chapter describes step-by-step how to use the Cubes. You will learn:
The tutorial contains examples for both: standard tool use and Python use. You don’t need to know Python to follow this tutorial.
The example data used are IBRD Balance Sheet taken from The World Bank. Backend used for the examples is sql.browser.
Create a tutorial directory and download this example file.
Start with imports:
>>> from sqlalchemy import create_engine
>>> from cubes.tutorial.sql import create_table_from_csv
Note
Cubes comes with tutorial helper methods in cubes.tutorial. It is advised not to use them in production, they are provided just to simplify learner’s life.
Prepare the data using the tutorial helpers. This will create a table and populate it with contents of the CSV file:
>>> engine = create_engine('sqlite:///data.sqlite')
... create_table_from_csv(engine,
... "data.csv",
... table_name="irbd_balance",
... fields=[
... ("category", "string"),
... ("category_label", "string"),
... ("subcategory", "string"),
... ("subcategory_label", "string"),
... ("line_item", "string"),
... ("year", "integer"),
... ("amount", "integer")],
... create_id=True
... )
...
Everything in Cubes happens in an analytical workspace. It contains cubes, maintains connections to the data stores (with cube data), provides connection to external cubes and more.
The workspace properties are specified in a configuration file slicer.ini (default name). First thing we have to do is to specify a data store – database where are the cube’s data:
[datastore]
type: sql
url: sqlite:///data.sqlite
In Python it would be:
from cubes import Workspace
workspace = Workspace()
workspace.register_default_store("sql", url="sqlite:///data.sqlite")
Or alternatively, you can use the slicer.ini file in Python as well:
workspace = Workspace(config="slicer.ini")
Download the example model and save it as model.json.
In the slicer.ini file specify the model:
[workspace]
model: model.json
For more information how to add more models to the workspace see the configuration documentation.
Equivalent in Python is:
>>> workspace.load_model("model.json")
You might call load_model()<cubes.Workspace.load_model() with as many models as you need. Only limitation is that the public cubes and public dimensions should have unique names.
Browser is an object that does the actual aggregations and other data queries for a cube. To obtain one:
>>> browser = workspace.browser("ibrd_balance")
Compute the aggregate. Measure fields of aggregation result have aggregation suffix. Also a total record count within the cell is included as record_count.
>>> result = browser.aggregate()
>>> result.summary["record_count"]
62
>>> result.summary["amount_sum"]
1116860
Now try some drill-down by year dimension:
>>> result = browser.aggregate(cell, drilldown=["year"])
>>> for record in result.drilldown:
... print record
{u'record_count': 31, u'amount_sum': 550840, u'year': 2009}
{u'record_count': 31, u'amount_sum': 566020, u'year': 2010}
Drill-dow by item category:
>>> result = browser.aggregate(cell, drilldown=["item"])
>>> for record in result.drilldown:
... print record
{u'item.category': u'a', u'item.category_label': u'Assets', u'record_count': 32, u'amount_sum': 558430}
{u'item.category': u'e', u'item.category_label': u'Equity', u'record_count': 8, u'amount_sum': 77592}
{u'item.category': u'l', u'item.category_label': u'Liabilities', u'record_count': 22, u'amount_sum': 480838}
>>> cube = workspace.cube("ibrd_balance")
cell defines context of interest - part of the cube we are looking at. We start with whole cube: