Draft: This is a work-in-progress by the Open Data Collaboration group at ODI Leeds.
Open Innovations

Business rates: data format

The Open Data Collaboration group at Open Innovations (Open Innovations, Bradford Council, Calderdale Council, and Leeds Council) started a project to visualise business rates data. Although each of the three councils publish open data about their business rates, they each have different field titles and don't always include the same fields. This project aimed to standardise the way the data was shared to make it more usable and to get value (within a council and externally) from the datasets.

File format

The file should be saved as a CSV file. If you are starting from an Excel file you can choose to export the file as CSV. Make sure to only export the table of data rather than any extra lines of explanatory notes or totals. Excel's export option tends to do a bad job with dates. So we created the CSV Cleaner tool to help tidy them up. You might also want to try ODI HQ's CSV Lint tool to check your CSV file is valid. See our Open Data tips for more. We've also made a validator to check your CSV file against.

Validate your business rates file

Headings and data formats

To make a tool that works across multiple councils, we have to first agree a common standard for sharing the data. This means standard column headings and requiring some columns to exist. However, we are practical and know that some fields will be hard for some councils to include and other councils will want to include more fields where they have them. The result was the following suggestion for common heading titles:

  1. Property reference number e.g. A89123
    A string representing the unique ID used inside the council. The format of this will vary from council to council. It is possibly useful internally within a council to match to other records.
  2. BA reference number e.g. 2053386634713
    This is the unique property code from the VOA; at the moment it is only included in the Leeds data and may be hard for some councils so isn't mandatory. If this matches the "Building reference number" included in the Energy Performance Certificate data it would allow us to join the two datasets together to get extra insight.
  3. RatepayerRequired e.g. Local Business Ltd
    If the rate payer is a named individual, you may wish to replace the name with REDACTED or INDIVIDUAL to protect their privacy.
  4. Company number e.g. 09417733
    A string representing the unique ID used to identify the company and verify the fact that it is an entity registered with Companies House.
  5. Charity number e.g. 202918
    A string representing the unique ID used to identify the charity and verify the fact that it is an entity registered with the Charity Comission.
  6. AddressRequired e.g. 3rd floor Munro House, Duke Street, Leeds
  7. PostcodeRequired e.g. LS9 8AG
  8. LatitudeRequired e.g. 53.79682
    Units: degrees. It only needs to be given to 5 or 6 decimal places. This may be calculated from postcode centroids but if more accurate data exists that would be better. If you only have postcodes, you could use the National Statistics Postcode Lookup (Latest) Centroids dataset to convert from postcode to latitude and longitude. It is released under the Open Government Licence.
  9. LongitudeRequired e.g. -1.53445
    Units: degrees. It only needs to be given to 5 or 6 decimal places. This may be calculated from postcode centroids but if more accurate data exists that would be better. If you only have postcodes, you could use the National Statistics Postcode Lookup (Latest) Centroids dataset to convert from postcode to latitude and longitude. It is released under the Open Government Licence.
  10. OccupiedRequired e.g. Y
    A simple flag with Y for occupied and either blank or N to indicate empty (to help reduce file size)
  11. Liability start dateRequired e.g. 2019-04-01
    Should be in YYYY-MM-DD format as recommended by GDS. This format reduces ambiguity in parsing dates and gives flexibility with time precision.
  12. Empty fromRequired e.g. 2019-04-01
    Should be in YYYY-MM-DD format as recommended by GDS. This format reduces ambiguity in parsing dates and gives flexibility with time precision.
  13. Rateable valueRequired e.g. 13250
    Units: GBP. Should be in GBP and just as a decimal number without commas or currency symbols.
  14. VOA codeRequired e.g. CO
    These are the Primary Description Code values.
  15. VOA description e.g. Offices and Premises
    These are the Default Description values.
  16. Exemptions
  17. Exemptions start date e.g. 2019-04-05
    Should be in YYYY-MM-DD format as recommended by GDS. This format reduces ambiguity in parsing dates and gives flexibility with time precision.
  18. Relief types
  19. Relief total e.g. 13250
    Units: GBP. Should be in GBP and just as a decimal number without commas or currency symbols.
  20. Relief mandatory e.g. 6541.65
    Units: GBP. Should be in GBP and just as a decimal number without commas or currency symbols.
  21. Relief discretionary e.g. 341.89
    Units: GBP. Should be in GBP and just as a decimal number without commas or currency symbols.

Obviously, it can be hard to export data in the right formats. To help we've created a CSVCleaner tool that should help reformat dates, clean-up currency, remove trailing spaces, and fix some column headings. You just drop a CSV file into it.

URLs

Business Rates data is updated fairly regularly. You don't want developers using out-of-date versions of your data. Help them stay up-to-date by using reliable, persistent, URLs to access your data; as the founder of the web says "Cool URIs don't change". We suggest the following:

These can also be set up as redirects to send people to the CSV file if it is already hosted elsewhere e.g. AWS/a council data store/other. You could use a similar URL structure for Brownfield Sites and many other common datasets.