This file was created from the following Jupyter-notebook: docs/pipes.ipynb
Interactive version: Binder badge

Pipeline-based processing in pypillometry

[1]:
import sys
sys.path.insert(0,"..")
import pypillometry as pp

pypillometry implements a pipeline-like approach where each operation executed on a PupilData-object returns a copy of the (modified) object. This enables the “chaining” of commands as follows:

[2]:
d=pp.PupilData.from_file("../data/test.pd")\
    .blinks_detect()\
    .blinks_merge()\
    .lowpass_filter(3)\
    .downsample(50)

This command loads a data-file (test.pd), detects blinks in the signal, merges short, successive blinks together, applies a 3 Hz low-pass filter and downsamples the signal to 50 Hz. The final result of this processing-pipeline is stored in object d.
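Because each operation returns a new (copied) PupilData object, the same pipeline can equivalently be written step by step, keeping every intermediate result around for inspection. A minimal sketch using the same calls as above:

# non-chained version: each call returns a new PupilData object,
# leaving the object it was called on unchanged
raw      = pp.PupilData.from_file("../data/test.pd")
blinks   = raw.blinks_detect()       # detect blinks
merged   = blinks.blinks_merge()     # merge short, successive blinks
filtered = merged.lowpass_filter(3)  # 3 Hz low-pass filter
d        = filtered.downsample(50)   # downsample to 50 Hz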

In the chained command above, we put each operation on a separate line for better readability. For that to work, we need to tell Python that the statement continues on the next line, which we achieve by putting a backslash \ at the end of each (non-final) line.
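An alternative (plain Python, not specific to pypillometry) is to wrap the whole expression in parentheses, which makes the trailing backslashes unnecessary:

d = (pp.PupilData.from_file("../data/test.pd")
       .blinks_detect()
       .blinks_merge()
       .lowpass_filter(3)
       .downsample(50))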

We can get a useful summary of the dataset and the operations applied to it by simply printing it:

[3]:
print(d)
PupilData(test_ro_ka_si_hu_re_vu_vi_be, 331.3KiB):
 n                 : 6001
 nmiss             : 117.2
 perc_miss         : 1.9530078320279955
 nevents           : 56
 nblinks           : 24
 ninterpolated     : 0.0
 blinks_per_min    : 11.998000333277787
 fs                : 50
 duration_minutes  : 2.0003333333333333
 start_min         : 4.00015
 end_min           : 6.0
 baseline_estimated: False
 response_estimated: False
 History:
 *
 └ reset_time()
  └ blinks_detect()
   └ sub_slice(4,6,units=min)
    └ drop_original()
     └ blinks_detect()
      └ blinks_merge()
       └ lowpass_filter(3)
        └ downsample(50)

We see that the sampling rate, the number of datapoints and more are automatically printed along with the history of all operations applied to the dataset. This information can also be retrieved separately, in a form useful for further processing, using the function summary(), which returns the information in the form of a dict:

[4]:
d.summary()
[4]:
{'name': 'test_ro_ka_si_hu_re_vu_vi_be',
 'n': 6001,
 'nmiss': 117.2,
 'perc_miss': 1.9530078320279955,
 'nevents': 56,
 'nblinks': 24,
 'ninterpolated': 0.0,
 'blinks_per_min': 11.998000333277787,
 'fs': 50,
 'duration_minutes': 2.0003333333333333,
 'start_min': 4.00015,
 'end_min': 6.0,
 'baseline_estimated': False,
 'response_estimated': False}
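Since the return value is an ordinary dict, individual quantities can be picked out programmatically, for example:

s = d.summary()
print(f"{s['nblinks']} blinks, {s['perc_miss']:.1f}% missing data")
# e.g. "24 blinks, 2.0% missing data" for the summary shown above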

The history is internally stored in PupilData’s history member and can be applied to another object for convenience. That way, a pipeline can be developed on a single dataset and later be transferred to a whole folder of other (similar) datasets.

As an example, we create several “fake” datasets representing data from several subjects (each with 10 trials):

[5]:
nsubj=10 # number of subjects
data={k:pp.create_fake_pupildata(ntrials=10, fs=500) for k in range(1,nsubj+1)}

The dict data now contains ten PupilData datasets. We will now use the data from the first subject to create a pipeline of processing operations:

[6]:
template=data[1].lowpass_filter(5).downsample(100)
template.print_history()
* fake_bomitime_ni_fu
└ lowpass_filter(5)
 └ downsample(100)

We have stored the result of these operations in a new dataset template, which also contains a record of them. We can now easily apply the identical operations to all datasets using the apply_history() function:

[7]:
preproc_data={k:template.apply_history(d) for k,d in data.items()}
preproc_data[5].print_history()
* fake_kowelale_wu_ni
└ lowpass_filter(5)
 └ downsample(100)
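The fake datasets above stand in for real recordings. For an actual folder of stored datasets, the same template can be replayed on every file in the same way; a minimal sketch (the file pattern "../data/*.pd" is only an assumed example):

from glob import glob

preproc_data = {}
for fname in glob("../data/*.pd"):                      # hypothetical folder of stored datasets
    raw = pp.PupilData.from_file(fname)                 # load one dataset
    preproc_data[fname] = template.apply_history(raw)   # replay the recorded pipeline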