Statistical Data¶
Wazimap presents statistical data about geographies, such as the number of English speakers in a city. You need to tell Wazimap what data is available and for which geographies.
Wazimap stores statistics in what it calls data tables. There are two different types of data tables: Field Tables and Simple Tables.
- Field Tables
- Field Tables are more flexible than Simple Tables. They allow Wazimap to store combinations, such as the number of people in
an area by both age and gender. Most census data is best stored using Field Tables. Field Tables almost always have
only one column that contains a number, the
total
column. - Simple Tables
- Simple Tables look a lot like spreadsheet tables. Each column is a different statistic about a place. They are
easy to think about and work with but are limited in their flexibility. Simple Tables often have many columns
with numbers, in addition to a
total
column.
Both types of tables have some metadata linked to them, such as an id, year and a description of the population that it covers.
Datasets and Releases¶
In Wazimap, a dataset is a collection of related data tables, such as a national census. A dataset can be updated with new releases every few years. Not all data data tables will always be updated in every release, so Wazimap lets you link data tables to releases individually.
Sometimes a release has a different name to the original dataset. For example, South Africa conducts a full census every decade, but releases a community survey in between each full census. A community survey is a statistical sampling and is not a full census, so it would be incorrect to call them both “census”. The results of the community survey are very similar to the census and are directly comparable. We consider census and community surveys to be different releases of the same dataset.
Important
You must add at least one dataset and one release before you can add any data tables. See below for details on how to do this.
Create a Dataset and Release¶
- Go to the Django admin section at http://localhost:8000/admin and log in.
- Under Wazimap, click the Add button alongside Datasets.
- Give your dataset a name.
- Under Releases, fill in the name and the year of your first release. For example, you could use
Census
and2017
. - Click Save.
Configuring Tables¶
Datasets, releases and data tables are configured through the Django admin interface, at http://localhost:8000/admin.
Once you have told Wazimap about your tables, it’ll ensure that they exist in the database. You can then import the raw data from CSV.
Field Tables¶
A Field Table is a logical collection of fields and values that describe numeric facts about a geography, along with some extra metadata about the table such as a name.
A field
is generally an attribute of a place or a person in that place, such as language
or gender
. A field has corresponding keys
such as English
or Female
. Fields and their keys describe a collection of people that match those attributes, such as all the English-speaking females in a province. The value
associated with a collection of fields and keys is the number of people with that attribute.
For example, here is a Field Table with two fields, language
and gender
:
language | gender | total |
---|---|---|
English | Male | 298 |
English | Female | 312 |
French | Male | 128 |
French | Female | 779 |
Most census Field Tables describe a partitioning of the population: the population is broken into groups (such as by language or gender) and every person is counted exactly once. If we added up all the values for all key combinations, we’d get the total population. That’s useful because it means we can express the value for each combination of keys as a percentage of the total.
Each Field Table is stored in a physical PostgreSQL database table. Each entry in a Field Table is linked to a geography, since a row is a statistic about a place, and so each row has the geography level, code and version associated with it.
geo_level | geo_code | geo_version | language | gender | total |
---|---|---|---|---|---|
province | WC | English | Male | 283 | |
province | WC | English | Female | 199 | |
province | WC | French | Male | 324 | |
province | WC | French | Female | 287 | |
province | GT | English | Male | 298 | |
province | GT | English | Female | 312 | |
province | GT | French | Male | 128 | |
province | GT | French | Female | 779 |
Adding a Field Table¶
First, ensure that you have created at least one dataset and release.
- Go to the Django admin section at http://localhost:8000/admin and log in.
- Under Wazimap, click the Add button alongside Field tables.
- Choose the dataset the table belongs to.
- Name the Universe this table describes, such as
Population
,Households
orYouth aged 14 to 25
. - Provide a comma-separated list of the Fields in your table.
- Leave the Description blank and it will be generated for you.
- Click Save.
Now import the data into the table. The easiest way of doing this is to look at the database to understand the columns in your new table, shape your data accordingly, and import it using psql’s CSV import support.
Simple Tables¶
A Simple Table looks a lot like a spreadsheet. It contains statistics for many places, one geography per row. Each column has a name and the cell values are the numerical statistics for that row’s geography. Each Simple Table is stored in a physical PostgreSQL database table.
For example, here is a Simple Table with two columns, votes_cast
and registered_voters
.
geo_level | geo_code | geo_version | votes_cast | registered_voters |
---|---|---|---|---|
province | WC | 829 | 1024 | |
province | GT | 773 | 990 |
You can see that in contrast with a Field Table, a Simple Table can have multiple statistics per geography.
A Simple Tables usually has a column which represents a total value, usually (but not always) called total
.
It is used to calculate percentages for other columns in the table. In the
example above, the registered_voters
column is the total column,
because we can express votes_cast
as a percentage of the registered voters
in each province.
Wazimap uses this to allow the user to switch between absolute values and percentages when viewing data for the table. You can also tell Wazimap that a table doesn’t have a total column, in which case it always shows absolute values.
Note
If your table has a total column, it’s important that all the statistics in it are related. If it doesn’t make sense to express a column as a percentage, put it in another table that doesn’t have a total column.
Adding a Simple Table¶
First, ensure that you have created at least one dataset and release.
- Go to the Django admin section at http://localhost:8000/admin and log in.
- Under Wazimap, click the Add button alongside Simple tables.
- Give your table a descriptive name.
- Choose the dataset the table belongs to.
- Name the Universe this table describes, such as
Population
,Households
orYouth aged 14 to 25
. - Add a Description of your table.
- Click Save.
Now import the data into the table. The easiest way of doing this is to look at the database to understand the columns in your new table, shape your data accordingly, and import it using psql’s CSV import support.