Data Streams¶

Overview¶

Data streams provide interface for common way of reading from and writing to various structured data sources. With streams you can easily read CSV file and merge it with Excel spreadsheet or Google spreadsheet, then perform cleansing and write it to a relational database table or create a report.

Example of data streams.

The streams can be compared to file-like stream objects where structured data is being passed instead of bytes. There are two ways how to look at structured data: as a set of lists of values or as a set of key-value pairs (set of dictionaries). Some sources provide one or the other way of looking at the data, some processing is better with the list form, another might be better with the dictionary form. Brewery allows you to use the form which is most suitable for you.

Example of streaming data as sequence of rows - tuples of values.

Example of streaming data as sequence of records - dictionaries with key-value pairs.

Metadata¶

At any time you are able to retrieve stream metadata: list of fields being streamed. For more information see documentation for brewery.ds.Field where you can find list of field attributes, data storage types, analytical types and more.

Data Sources¶

Data source	Description	Dataset reference
csv	Comma separated values (CSV) file/URI resource	file path, file-like object, URL
xls	MS Excel spreadsheet	file path, URL
gdoc	Google Spreadsheet	spreadsheet key or name
rdb	Relational database table (using DB API 2 interface)	connection + table name
mongodb	MongoDB database collection	connection + table name
yamldir	Directory containing yaml files - one file per record	directory
jsondir	Directory containing json files - one file per record	directory

Data Targets¶

Data target	Description
csv	Comma separated values (CSV) file/URI resource
rdb	Relational database table
mongodb	MongoDB database collection
yamldir	Directory containing yaml files - one file per record
jsondir	Directory containing json files - one file per record
html	HTML file or a string target

relational database table non-relational database collection

See also

Class brewery.ds.DataTarget

Data target documentation

API¶

Data sources should implement:

initialize() - delayed initialisation: get fields if they are not set, open file stream, ...
rows() - returns iterable with value tuples
records() - returns iterable with dictionaries of key-value pairs

Should provide property fields, optionally might provide assignment of this property.

Data targets should implement:

initialize() - create dataset if required, open file stream, open db connection, ...
append(object) - appends object as row or record depending whether it is a dictionary or a list

Data Streams¶

Overview¶

Metadata¶

Data Sources¶

Data Targets¶

API¶

Table Of Contents

Previous topic

Next topic

This Page

Navigation

Data Streams¶

Overview¶

Metadata¶

Data Sources¶

Data Targets¶

API¶

Table Of Contents

Previous topic

Next topic

This Page

Quick search

Navigation