Current languages for data analysis, such as R, Python, and MATLAB, provide no protection for the sensitive personal data that researchers commonly encounter. Data is often available in its raw form and can be freely copied. Private creates a buffer between the data and the analyst, which allows companies and institutions to provide access to data without compromising the privacy of the people from whom it was collected. Private:
- Couples the data and the analysis so that data is not copied from one place to another
- Checks all results before release to ensure they are not sensitive to the inclusion or exclusion of any given individual
- Uses the familiar syntax of Python augmented with probabilistic constructs imported from BUGS/JAGS/Stan
- Automatically parallelises at the variable level across a multicore machine or over a cluster
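
To make the idea of checking results before release more concrete, here is a minimal sketch in plain Python of a leave-one-out sensitivity check. It is not Private's actual release mechanism, which this document does not describe. The function names (`leave_one_out_spread`, `is_releasable`), the threshold value, and the sample data are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def leave_one_out_spread(values, statistic=mean):
    """Largest absolute change in the statistic when any single record is dropped."""
    full = statistic(values)
    return max(
        abs(full - statistic(values[:i] + values[i + 1:]))
        for i in range(len(values))
    )


def is_releasable(values, threshold, statistic=mean):
    """Hypothetical release rule: the result may be shown only if no single
    individual shifts it by more than `threshold`."""
    if len(values) < 2:
        # With fewer than two records the result exposes an individual directly.
        return False
    return leave_one_out_spread(values, statistic) <= threshold


# Illustrative data only: one value per individual.
typical = [50.0, 52.0, 49.0, 51.0, 48.0]
with_outlier = typical + [500.0]

print(is_releasable(typical, threshold=1.0))       # True
print(is_releasable(with_outlier, threshold=1.0))  # False
```

In this sketch the mean of the first sample moves by at most 0.5 when any one person is removed, so it passes the check. Adding a single extreme individual makes the mean depend heavily on that person, so the result is withheld.
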
You can try out a running instance of Private at private.mall-lab.com. Several tutorials are available at www.simondennis.blog. These include an introduction to the interface and the language, and a demonstration of how to use Private to process sensitive data.
Private is open source (GNU GPLv3) and is available at github.com/complex-human-data-hub/Private/. Instructions for building and deploying it with Docker are available in the GitHub repository.